0 votes

Can't figure out how to do this. Have tried two approaches, neither worked.

1. Approach 1, the pandas way:

data.to_csv('data.csv')

This does not throw an error, but I don't see the dataset anywhere in my flow...
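A likely explanation, offered as an assumption rather than something confirmed in this thread: to_csv with a relative path writes an ordinary file into the working directory of the notebook's Python process on the DSS server, and the Flow only lists datasets that have been declared in the project. The sketch below uses only the standard library to show where a relative path such as 'data.csv' would end up; the helper name resolved_target is hypothetical.

```python
import os


def resolved_target(path):
    """Return the absolute location that a (possibly relative) path points to."""
    # Relative paths are resolved against the current working directory
    # of the running Python process, not against any DSS project or Flow.
    return os.path.abspath(path)


# Shows where a call like data.to_csv('data.csv') would place the file.
print(resolved_target("data.csv"))
```

Running this in the notebook shows the directory to look in for the CSV file. It does not make the file a dataset in the Flow.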

2. Approach 2, the Dataiku way:

# Recipe outputs
recommenderdata_tosave = dataiku.Dataset("recommenderdata_tosave")
recommenderdata_tosave.write_with_schema(data)


In the notebook, I get this error: 

Exception: None: dataset does not exist: PROJECT.recommenderdata_tosave

This code works in a Python recipe in the Flow, but not in the notebook for some reason.

Any help would be appreciated.


1 Answer

+1 vote

"write_with_schema" does not create the dataset; it only fills a dataset that already exists. You first need to declare the dataset in your Flow. The best way is to create it as a "managed" dataset, so that DSS handles all the connection details: in the Flow, click "+ Dataset" > "Internal" > "Managed dataset". You then only need to enter a name and select where the dataset should be stored. After that, you can use it in the notebook.

Thanks for your quick and helpful answer. I confirm that this works!

To anyone who comes after me, this is what I did:

1. Create a managed dataset as explained above (I named it: data_for_recommender_managed)

2. Save the dataframe to it from the notebook with the following code:

import dataiku

# Recipe outputs
data_for_recommender_managed = dataiku.Dataset("data_for_recommender_managed")
# the dataframe in memory in the notebook is called: data_for_recommender
data_for_recommender_managed.write_with_schema(data_for_recommender)
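To confirm that the write actually reached the dataset, one option is to read it back right away and compare row counts. This is a minimal sketch, assuming the managed dataset already exists in the Flow and that reading immediately after writing is acceptable on your connection. The helper name save_and_check is hypothetical, and the dataiku import is deferred so the function can be defined outside DSS.

```python
def save_and_check(df, dataset_name):
    """Write df to an existing DSS dataset, then read it back and return the row count."""
    # Deferred import: the dataiku package is only available inside DSS.
    import dataiku

    # The dataset must already be declared in the Flow (e.g. as a managed dataset).
    dataset = dataiku.Dataset(dataset_name)
    dataset.write_with_schema(df)

    # Read the stored data back as a pandas DataFrame and count its rows.
    return len(dataset.get_dataframe())


# Example call from the notebook (names taken from the steps above):
# rows = save_and_check(data_for_recommender, "data_for_recommender_managed")
# assert rows == len(data_for_recommender)
```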