Encoding disappears after using dkuWriteDataset()
The encoding of character columns is lost after using the dkuWriteDataset() function in an R recipe. My data frame contains non-Latin letters, which appear to be correctly encoded (UTF-8) after importing the dataset with dkuReadDataset(). However, the encoding is lost when I write the resulting data frame to the next dataset in the project flow: the letters are displayed as question marks ("?????"). How can I preserve the encoding when using dkuWriteDataset()?
Thank you in advance.
Hi, would you be able to send us a sample of your data so that we can try to reproduce the problem on our side? Also, what version of Dataiku are you using? You can check that in the software by clicking on the top bar > ? > About.
Hi, were you able to solve your issue?
We are using version 4.1.1. The R recipe seems to work fine if the original data comes from a .csv file, so I can work around the problem that way. However, we normally extract data with a SQL recipe and store it in our internal database (instead of the filesystem). We then use an R recipe to transform the data, and it returns "?????" in the next dataset. For example, Russian letters are returned as "?????".
Also, if I use Dataiku filter/transform recipes, the resulting dataset looks fine and returns Russian letters as expected. So I think it must be something related to the R recipe.
Data could look something like this:
Hi, Thanks for the feedback. It sounds like an R-specific encoding issue. What is the original dataset stored as? How is it produced?
The original data is stored as a Dataiku dataset, which was extracted using a SQL recipe.
Would you be able to send us an actual sample of the data as a file? For instance, you can export the input dataset right before the R recipe. You can send it to alexandre.combessie -at- dataiku.com
I have sent you a dataset.
Hi Vaidas, I am not able to reproduce your issue based on the data you have sent me. Is this specific to your SQL server? Can you reproduce it if you write this output to a local filesystem?
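For anyone narrowing down a similar problem, here is a minimal diagnostic sketch in R. The thread does not establish the cause, so this only assumes that the locale of the R session or the declared encoding of the strings might be involved. `report_encoding` and `to_utf8` are hypothetical helpers written for this illustration (they are not part of the dataiku package), `sample_df` is a made-up stand-in for the recipe input, and the dataset names in the commented lines are placeholders.

```r
# Hypothetical helper: tabulate the declared encoding of each character column.
# Factor columns are skipped; convert them with as.character() first if needed.
report_encoding <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  lapply(df[is_chr], function(col) table(Encoding(col)))
}

# Hypothetical helper: convert every character column to UTF-8.
to_utf8 <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  if (any(is_chr)) {
    df[is_chr] <- lapply(df[is_chr], enc2utf8)
  }
  df
}

# 1. Check whether the R session itself runs in a UTF-8 locale.
print(Sys.getlocale("LC_CTYPE"))
print(l10n_info()[["UTF-8"]])

# 2. Stand-in for the recipe input; \u escapes keep this script ASCII-only.
sample_df <- data.frame(
  id = 1:2,
  name = c("\u041f\u0440\u0438\u0432\u0435\u0442", "abc"),
  stringsAsFactors = FALSE
)

# 3. Inspect the declared encodings, then normalise to UTF-8 before writing.
print(report_encoding(sample_df))
clean_df <- to_utf8(sample_df)

# Inside an R recipe the same steps would wrap the real datasets
# (placeholder names, assumed here for illustration):
# df <- dkuReadDataset("input_dataset")
# print(report_encoding(df))
# dkuWriteDataset(to_utf8(df), "output_dataset")
```

If the locale check reports a non-UTF-8 session, that is a plausible place to look first, but whether it explains the "?????" output in this particular setup is not confirmed by the thread.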
© Dataiku 2012-2018