Hi,

The encoding of character columns is lost after using the dkuWriteDataset() function in an R recipe. My data frame contains non-Latin letters, which appear to be correctly encoded (UTF-8) after importing the dataset with dkuReadDataset(). However, the encoding is lost when I write the resulting data frame to the next dataset in the project flow: the letters are displayed as question marks ("?????"). How can I preserve the encoding when using dkuWriteDataset()?

Thank you in advance.
Hi, would you be able to send us a sample of your data so that we can try to reproduce the problem on our side? Also, what version of Dataiku are you using? You can check that in the software by clicking on the top bar > ? > About.
Hi, were you able to solve your issue?
Hi,

We are using version 4.1.1. It seems that the R recipe works fine if the original data comes from a .csv file, so I can work around the problem that way. However, we normally extract data using a SQL recipe and store it in our internal database (instead of the filesystem). Then we use an R recipe to transform the data, and it returns "?????" in the next dataset. For example, Russian letters are returned as "?????".

Also, if I use Dataiku filter/transform recipes, the resulting dataset looks fine and returns Russian letters as expected. So I think it must be something related to the R recipe.

Data could look something like this:
gender,city
F,Москва
M,Новосибирск
M,Екатеринбург
Hi, thanks for the feedback. It sounds like an R-specific encoding issue. What is the original dataset stored as? How is it produced?
The original data is stored as a Dataiku dataset, which was extracted using a SQL recipe.
Would you be able to send us an actual sample of the data as a file? For instance, you can export the input dataset right before the R recipe. You can send it to alexandre.combessie -at- dataiku.com
I have sent you a dataset.
Hi Vaidas, I am not able to reproduce your issue based on the data you sent me. Is this specific to your SQL server? Can you reproduce it if you write the output to a local filesystem?
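
To help narrow this down, it may be worth inspecting the locale and the declared encoding of the character columns inside the R recipe, just before the write. The sketch below uses base R only. The data frame is a stand-in for what dkuReadDataset() returns, and the helper name check_utf8 is hypothetical. That the R session's locale is the cause is an assumption, not something confirmed in this thread.

```r
# Stand-in for the data frame returned by dkuReadDataset(). The city values
# (Moscow and Novosibirsk in Cyrillic) use \u escapes, so they are marked as
# UTF-8 regardless of the encoding of the script file.
df <- data.frame(
  gender = c("F", "M"),
  city = c("\u041c\u043e\u0441\u043a\u0432\u0430",
           "\u041d\u043e\u0432\u043e\u0441\u0438\u0431\u0438\u0440\u0441\u043a"),
  stringsAsFactors = FALSE
)

# Locale of the R session. A "C" or other non-UTF-8 locale is a frequent
# source of "?" characters when strings are converted on output.
print(Sys.getlocale("LC_CTYPE"))

# Hypothetical helper: for each character column, report the declared
# encoding marks and whether every value is valid UTF-8.
check_utf8 <- function(data) {
  is_chr <- vapply(data, is.character, logical(1))
  lapply(data[is_chr], function(col) {
    list(declared = unique(Encoding(col)), valid = all(validUTF8(col)))
  })
}

report <- check_utf8(df)
print(report)

# Convert all character columns to UTF-8 before passing df on to
# dkuWriteDataset() (the write itself is not shown here).
chr_cols <- vapply(df, is.character, logical(1))
df[chr_cols] <- lapply(df[chr_cols], enc2utf8)
```

If the report shows valid UTF-8 in the recipe but the output dataset still contains "?????", the conversion is more likely happening on the way to the database than inside the data frame itself.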

