Hi,
I am new to the Dataiku API. I have tried some simple examples, such as creating and deleting a dataset. I also found one of your examples that creates a recipe (a grouping recipe here) and sets its inputs and outputs:
from dataikuapi import GroupingRecipeCreator
# `project` is a project handle obtained from the API client
builder = GroupingRecipeCreator('test_group', project)
builder = builder.with_input("input_dataset_name")
builder = builder.with_new_output("output_dataset_name", "hdfs_managed", format_option_id="PARQUET_HIVE")
builder = builder.with_group_key("quantity") # the recipe is created with one grouping key
recipe = builder.build()
So the builder object helps to create a recipe. But is there any way to run a recipe through the API? Or is it only possible to run it manually in Dataiku?
Hi Povilas,
Running a flow of datasets, recipes and models in Dataiku revolves around the concepts of Jobs and Scenarios. In the API, you do not run a Recipe directly; instead you build its output, using either a Job or a Scenario.
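To make the "build the output" idea concrete, here is a minimal sketch of starting a Job on a recipe's output dataset and polling until it finishes. It assumes `project` is a project handle from the API client, that the output dataset is not partitioned (hence "NP"), and that the job status is exposed under `baseStatus.state`; the helper names `make_build_definition` and `build_dataset` are mine, not part of the API.

```python
import time


def make_build_definition(dataset_name, job_type="NON_RECURSIVE_FORCED_BUILD"):
    """Return a job definition that builds one non-partitioned dataset."""
    return {
        "type": job_type,
        "outputs": [
            # "NP" marks a non-partitioned dataset (assumption for this sketch)
            {"id": dataset_name, "type": "DATASET", "partition": "NP"}
        ],
    }


def build_dataset(project, dataset_name, poll_seconds=2):
    """Start a job building `dataset_name` and return its final state."""
    job = project.start_job(make_build_definition(dataset_name))
    while True:
        # Assumed layout of the status payload: {"baseStatus": {"state": ...}}
        state = job.get_status()["baseStatus"]["state"]
        if state in ("DONE", "FAILED", "ABORTED"):
            return state
        time.sleep(poll_seconds)
```

With the recipe from the question, calling `build_dataset(project, "output_dataset_name")` would run the grouping recipe as part of building its output.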
If you plan on using different elements of the API to create a Dataiku project, test it and automate it, I would advise:
1. Creating the datasets, recipes and models using https://doc.dataiku.com/dss/latest/publicapi/client-python/datasets.html, https://doc.dataiku.com/dss/latest/publicapi/client-python/recipes.html and https://doc.dataiku.com/dss/latest/publicapi/client-python/ml.html
2. Build/train some datasets/models by launching Jobs that build the output(s) of the recipe: https://doc.dataiku.com/dss/latest/publicapi/client-python/jobs.html
3. Create a scenario to automate the update of datasets and models: https://doc.dataiku.com/dss/latest/publicapi/client-python/scenarios.html
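For step 3, once a scenario exists in the project, triggering it from the API could look roughly like the sketch below. It assumes `project` is a project handle from the API client and that the scenario was already created (for example in the interface); the scenario id passed in is a placeholder, and the helper name `run_scenario` is mine.

```python
def run_scenario(project, scenario_id):
    """Trigger an existing scenario and block until the run completes."""
    # Look up the scenario by its id within the project
    scenario = project.get_scenario(scenario_id)
    # Start a run and wait for it to finish; returns a handle on the run
    return scenario.run_and_wait()
```

The returned handle can then be inspected to see how the run went; check the scenarios page linked above for the exact fields available in your version.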
In general, it may be faster to use the interface to initialize a "template project", including scenarios. Then copy this template several times with some programmatic changes using the API.
Hope it helps,
Alex