Hi Team,

I have a dataset with 10,000 rows, and it has a month column with values ranging from JAN to DEC. I want to split it by month.

I can do this with the visual recipe "Split", but I would have to create 12 different datasets for it.

My question: if I want DSS to create the datasets automatically from the distinct values of the month column, how can I do this?




1 Answer


This is a good use case for partitioning: https://doc.dataiku.com/dss/latest/partitions/index.html

Instead of creating 12 datasets with a Split recipe, you can use a Sync recipe with a partitioned output dataset. In the Settings > Connection menu of the output dataset, configure partitioning on the column containing your month. Try the discrete partition type if your months are encoded like "1" to "12", or the time range partition type if they are encoded as dates ("YYYY-MM").

In the settings of your Sync recipe, make sure you check "Redispatch partitioning according to input columns". You will then be able to build the partitions you select.
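
To make the idea concrete, here is a rough plain-Python sketch of what redispatching amounts to conceptually: each row is routed to a partition named after the value in its month column. This is not the Dataiku API and not how DSS implements it; the function name `redispatch_by_column` and the sample rows are made up for illustration.

```python
from collections import defaultdict


def redispatch_by_column(rows, column):
    """Group rows into partitions keyed by the distinct values of `column`.

    Hypothetical helper for illustration only, not part of Dataiku DSS.
    """
    partitions = defaultdict(list)
    for row in rows:
        # The partition identifier is simply the value found in the column.
        partitions[row[column]].append(row)
    return dict(partitions)


# Made-up sample data standing in for the 10,000-row dataset.
rows = [
    {"month": "JAN", "amount": 10},
    {"month": "FEB", "amount": 20},
    {"month": "JAN", "amount": 30},
    {"month": "DEC", "amount": 40},
]

partitions = redispatch_by_column(rows, "month")
for month, part in partitions.items():
    print(month, len(part))
```

One output dataset holds all the partitions, so there is no need to maintain 12 separate datasets or to know the distinct month values in advance.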

©Dataiku 2012-2018 - Privacy Policy