Hello to all Dataiku users,
I am writing because I have a problem and I hope you can help me (a diagram of the problem is attached).
I'm looking for a way to keep a history of some datasets (the green ones), stored as CSV or Excel on HDFS. Specifically, at each run I would like to keep a copy of these datasets in HDFS, for example in a subfolder.
To explain: at each run of the flow zone, I would like the intermediate dataset and the final dataset to be stored in a subfolder in HDFS.
The objective is to be able to compare the different versions from each run (because I may modify things in my recipes).
I don't know if I am being very clear. Thanks for your help.
Take a look at the following Dataiku features and see if they can be of help to you.
This thread seems to go over some of this: https://community.dataiku.com/t5/Using-Dataiku/how-can-select-the-append-mode-in-a-dataset/td-p/3367
I know that I've been able to use these two features to achieve something like what I think you want to do.
There is another discussion about doing something like this through python.
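For what it's worth, here is a minimal sketch of the general idea in plain Python, not taken from that discussion: each run writes its copy of a dataset into its own timestamped subfolder, so earlier runs are never overwritten. The function name `snapshot_rows`, its parameters, and the folder layout are all hypothetical, and a local directory stands in for HDFS. In a Dataiku Python recipe the rows would come from the dataset and the target would be a folder on HDFS; that wiring is not shown here.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path


def snapshot_rows(rows, header, history_root, dataset_name, run_id=None):
    """Write rows to <history_root>/<run_id>/<dataset_name>.csv and return the path.

    Hypothetical helper: the layout (one subfolder per run) is an assumption,
    and a local directory is used in place of HDFS.
    """
    if run_id is None:
        # Default run identifier: a UTC timestamp such as 20240131T093000Z.
        run_id = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")

    # One subfolder per run, created on demand.
    run_dir = Path(history_root) / run_id
    run_dir.mkdir(parents=True, exist_ok=True)

    target = run_dir / f"{dataset_name}.csv"
    with target.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)   # column names first
        writer.writerows(rows)    # then the data rows
    return target
```

Calling it once for the intermediate dataset and once for the final dataset with the same `run_id` keeps both copies of a run side by side, which should make it easier to compare one run against another.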
Dear tgb417,
Thank you very much, I'm looking into it.