0 votes

I have a dataset, say Monday_passengers. I have cleaned this data and generated a new dataset called 'passengers'.

If I get data every day with the same schema, I want to apply the same cleaning steps and append the result to the 'passengers' dataset.

How can I do that?

1 Answer

+3 votes

If you are directly connected to a database that receives new data every day, DSS will apply the cleaning steps of your recipes to the full dataset, so the new data will be taken into account and propagated to the end of the workflow. You can also go into a recipe and, in the Input/Output section, check "Append instead of overwrite" (more info on building datasets).
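To make the append behaviour concrete, here is a minimal plain-Python sketch of the idea. It is not DSS code: the `clean` function, the `append_batch` helper, and the `name`/`count` columns are hypothetical stand-ins for your own recipe steps and schema.

```python
def clean(rows):
    """Hypothetical cleaning step: trim names, drop rows without a count."""
    cleaned = []
    for row in rows:
        if row.get("count") in (None, ""):
            continue  # skip rows with a missing passenger count
        cleaned.append({"name": row["name"].strip(), "count": int(row["count"])})
    return cleaned


def append_batch(output, new_rows):
    """Append mode: keep what is already in the output and add the cleaned batch."""
    output.extend(clean(new_rows))
    return output


passengers = []  # stands in for the 'passengers' dataset
monday = [{"name": " Ann ", "count": "2"}, {"name": "Bob", "count": ""}]
tuesday = [{"name": "Cy", "count": "1"}]

append_batch(passengers, monday)
append_batch(passengers, tuesday)
print(passengers)
```

Note that in this sketch, running the same day twice would add its rows twice, which is something to keep in mind with any append-style setup.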

If you are using flat files, you can drag and drop the new daily files into the Upload section, and they will be stacked as long as they have the same structure.
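As a rough illustration of what "stacked if they have the same structure" means, the sketch below concatenates CSV files only when their headers match. It uses only the Python standard library; the `stack_csv` helper, file names, and column names are assumptions for the example, not part of DSS.

```python
import csv
import io


def stack_csv(files):
    """Concatenate CSV files, refusing any file whose header differs from the first."""
    header = None
    rows = []
    for name, handle in files:
        reader = csv.DictReader(handle)
        if header is None:
            header = reader.fieldnames
        elif reader.fieldnames != header:
            raise ValueError(f"{name}: columns {reader.fieldnames} do not match {header}")
        rows.extend(reader)
    return header, rows


# In-memory stand-ins for two daily uploads (hypothetical contents).
monday = io.StringIO("name,count\nAnn,2\n")
tuesday = io.StringIO("name,count\nCy,1\n")

header, stacked = stack_csv([("monday.csv", monday), ("tuesday.csv", tuesday)])
print(header, len(stacked))
```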

If you are working with larger data, the right way would be to use the partitioning engine of DSS (and/or of your underlying database). More information is available in the DSS documentation on partitions.
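The idea behind day-based partitioning can be sketched in plain Python as follows. This is only a conceptual model, not the DSS partitioning API: the dictionary keyed by day, the `build_partition` helper, the dates, and the columns are all hypothetical.

```python
def clean(rows):
    """Hypothetical cleaning step: drop rows with no passenger count."""
    return [r for r in rows if r["count"] is not None]


def build_partition(partitions, day, raw_rows):
    """Rebuild a single day's partition, leaving all other days untouched."""
    partitions[day] = clean(raw_rows)
    return partitions


passengers = {}  # one entry per day, standing in for a partitioned dataset
build_partition(passengers, "2018-01-01", [{"name": "Ann", "count": 2}])
build_partition(passengers, "2018-01-02", [{"name": "Cy", "count": None},
                                           {"name": "Di", "count": 3}])
# Re-running a day replaces that partition instead of duplicating its rows.
build_partition(passengers, "2018-01-02", [{"name": "Di", "count": 3}])

total = sum(len(rows) for rows in passengers.values())
print(total)
```

Compared with a plain append, each day is processed on its own, so only the new day's data has to be cleaned, and rebuilding a day does not create duplicates.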

Hope this helps


©Dataiku 2012-2018