Hello all,

I have a dataset that contains many rows per event; each event's ID is stored in the "event_id" column. The dataset of course contains many events.

Is there an easier way to split this dataset by event than manually defining every output dataset in the split visual recipe? There are hundreds of events, so doing it by hand would be painful, or at least time-consuming.

I am using DSS 2.2.2.

Thanks in advance!

1 Answer

Hi Alex,

There is not really a better way to do this than the split recipe. If you want one dataset per event, you have to create those datasets one way or another. You could perhaps create them with the DSS API, but that is still not ideal.
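To give an idea of what a script would have to manage, here is a minimal sketch in plain Python that lists the distinct event IDs and derives one output dataset name per event. The helper `output_dataset_names`, the `event_` prefix and the sample rows are hypothetical, and the sketch deliberately does not call the DSS API: the call that actually creates a dataset depends on your DSS version and is left out.

```python
import re


def output_dataset_names(rows, id_column="event_id", prefix="event_"):
    """Map each distinct event ID to a sanitized dataset name (hypothetical naming scheme)."""
    names = {}
    for row in rows:
        event_id = str(row[id_column])
        if event_id not in names:
            # Replace every character that is not a letter, digit or underscore.
            names[event_id] = prefix + re.sub(r"[^A-Za-z0-9_]", "_", event_id)
    return names


# Sample rows standing in for the real dataset (made up for illustration).
rows = [
    {"event_id": "A-1", "value": 10},
    {"event_id": "B 2", "value": 20},
    {"event_id": "A-1", "value": 30},
]

names = output_dataset_names(rows)
for event_id, dataset_name in names.items():
    print(event_id, "->", dataset_name)
```

With hundreds of events this produces hundreds of names, and each one would need its own dataset and its own output in the flow, which is why I would not recommend this route.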

The best option would be to change your strategy: keep a single dataset and partition it on the event_id column. To learn more, you can read "Working with partitions" and "Repartitioning a non-partitioned dataset".
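To illustrate the idea behind partitioning on event_id, here is a small sketch in plain Python that writes the rows of one table into one folder per event_id value. This is only a conceptual picture, not how DSS stores partitions internally: the function `write_partitioned`, the `data.csv` file name and the sample rows are assumptions made for the example.

```python
import csv
import os
import tempfile
from collections import defaultdict


def write_partitioned(rows, root, key="event_id"):
    """Write one CSV file per distinct value of `key`, each in its own folder."""
    groups = defaultdict(list)
    for row in rows:
        groups[str(row[key])].append(row)

    paths = {}
    for value, group in groups.items():
        folder = os.path.join(root, value)
        os.makedirs(folder, exist_ok=True)
        path = os.path.join(folder, "data.csv")
        with open(path, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(group[0].keys()))
            writer.writeheader()
            writer.writerows(group)
        paths[value] = path
    return paths


# Sample rows standing in for the real dataset (made up for illustration).
rows = [
    {"event_id": "e1", "value": 10},
    {"event_id": "e2", "value": 20},
    {"event_id": "e1", "value": 30},
]

root = tempfile.mkdtemp()
paths = write_partitioned(rows, root)
for value, path in sorted(paths.items()):
    print(value, "->", path)
```

The point is that the data stays one logical dataset, and each event can still be read or rebuilt on its own, instead of living in hundreds of separate datasets.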

I hope that helps,
Jeremy
