Hi,
Since Impala is a code recipe, its creator expects the output datasets to already exist, created by other means (possibly other calls to the public API). It's still quite simple to write your own Impala recipe creator that can create the output dataset as well:
from dataikuapi.dss.recipe import SingleOutputRecipeCreator

class ImpalaRecipeCreator(SingleOutputRecipeCreator):
    def __init__(self, name, project):
        SingleOutputRecipeCreator.__init__(self, 'impala', name, project)
And use it like this (prj being your project handle):
r = ImpalaRecipeCreator('test', prj).with_input(inputDatasetName).with_new_output(outputDatasetName, 'hdfs_managed', format_option_id='PARQUET_HIVE').build()
The newly created recipe comes with the default code snippet, which is a "select * from ..." query. To change the SQL query, you can then get and set the recipe's definition.
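As a rough sketch of that last step, assuming r is the recipe handle returned by build() above and that your version of the client exposes get_definition_and_payload() / set_definition_and_payload() on it (the helper name set_recipe_sql is made up for this example):

```python
def set_recipe_sql(recipe, sql):
    """Replace the code payload of a code recipe with `sql`.

    `recipe` is assumed to be a dataikuapi recipe handle, e.g. the object
    returned by the creator's build() call. The helper name is hypothetical.
    """
    # Fetch the current definition together with its payload (the code)
    definition = recipe.get_definition_and_payload()
    # Swap the default "select * from ..." snippet for the new query
    definition.set_payload(sql)
    # Save the modified definition back to DSS
    recipe.set_definition_and_payload(definition)
    return definition
```

You would then call something like set_recipe_sql(r, "SELECT col_a, col_b FROM my_input"), where the query text is only a placeholder. Depending on your client version there may be a newer settings-based way to do the same thing, so it's worth checking the API reference for the version you run.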
Regards,
Frederic