
I am writing a Python notebook within a project, but I get an error when trying to write a pandas DataFrame to a Dataiku dataset.

The error is the following:

Exception: An error occurred during dataset write (6oh2RgiLQn): NullPointerException: null

The error occurs even though I am using the following code:

import dataiku
import pandas as pd

result = pd.DataFrame(my_array_to_export)
dss_runs = dataiku.Dataset("my_export")
dss_runs.write_with_schema(result)

EDIT: here is the configuration of the my_export dataset created in the project: 

 

Does a dataset named "my_export" already exist in the flow? It needs to exist in the flow before it can be written to. Dataiku does not create the dataset for you.
Yes, it does. I created a dataset in the managed file system with the exact same name. I have edited the initial question to show you a screenshot of the configured dataset. Thank you for your help.
Interesting. I just tried it with a Jupyter notebook and did not have any problem with the managed folder. I would check that it has the advanced parameters, but I believe the main problem is that it does not even find the folder. I guess your dataset is the first one to exist in the flow, isn't it?
Edit: I have just tried it in a new project and it worked fine. I only had to import a first dataset, create the managed folder, and write the code in the Jupyter notebook.
(images: https://imgur.com/a/chS0N)
Another workaround I have used is to create a code recipe instead of a dataset, and write your code in the recipe.
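A minimal sketch of what such a recipe body could look like, assuming the recipe declares "my_export" as its output. The helper names build_result and run_recipe are hypothetical, and the sample rows stand in for my_array_to_export. The dataiku import is kept inside the function because the package is only available inside DSS.

```python
import pandas as pd


def build_result(rows):
    """Turn a list of rows into a DataFrame with explicit string column names."""
    n_cols = len(rows[0]) if rows else 0
    return pd.DataFrame(rows, columns=[f"col_{i}" for i in range(n_cols)])


def run_recipe(rows, output_name="my_export"):
    """Recipe body (hypothetical helper); only runs inside DSS."""
    import dataiku  # lazy import: the package exists only in the DSS environment
    output = dataiku.Dataset(output_name)  # assumed to be the recipe's declared output
    output.write_with_schema(build_result(rows))


# Stand-in data; in the real recipe this would be my_array_to_export.
result = build_result([[1, "a"], [2, "b"]])
# Inside DSS: run_recipe([[1, "a"], [2, "b"]])
```

Building the DataFrame in a separate function keeps that part runnable outside DSS, so it can be checked before the write step is involved.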
Thanks a lot for your very detailed help. I had to recreate the project from scratch and follow your steps to make it work.
I still don't really know why it was not working in the current project, but I will take some time later to understand.

1 Answer

Hi,

As explained by larispardo, this normally works without any issue if the dataframe that you write with the .write_with_schema() method is valid and if the dataset exists in the flow. You could inspect the result dataframe using:

from dataiku import pandasutils as pdu

pdu.audit(result)

Could you try restarting your notebook kernel and relaunching the code blocks involved?
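As a rough complement to the audit, here is a sketch of a manual check before writing. A DataFrame built from a bare array gets integer column labels (0, 1, ...); whether that contributes to this particular error is an assumption, not a confirmed cause. The helpers prepare_for_dss, describe_frame and write_to_dss are hypothetical names, and the sample data stands in for my_array_to_export.

```python
import pandas as pd


def prepare_for_dss(df):
    """Hypothetical helper: copy df with string column names and a fresh index."""
    out = df.copy()
    out.columns = [c if isinstance(c, str) else f"col_{c}" for c in out.columns]
    return out.reset_index(drop=True)


def describe_frame(df):
    """Per-column dtype and null count, for a quick manual inspection."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "nulls": df.isna().sum(),
    })


def write_to_dss(df, dataset_name):
    """Write df to an existing DSS dataset; only runs inside DSS."""
    import dataiku  # lazy import: the package exists only in the DSS environment
    dataiku.Dataset(dataset_name).write_with_schema(df)


raw = pd.DataFrame([[1, 2.5], [3, None]])  # stand-in for my_array_to_export
result = prepare_for_dss(raw)
report = describe_frame(result)
# Inside DSS: write_to_dss(result, "my_export")
```

Printing report shows at a glance which columns contain nulls or unexpected dtypes before the write is attempted.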

Cheers,

Alex
I had to recreate the project from scratch and follow larispardo's steps to make it work. pdu.audit is very useful, thanks for your help!