+1 vote

I have to deal with a CSV file encoded in UTF-16 because it contains specific characters (Asian characters).

DSS detects the wrong charset (iso-8859-15) instead of UTF-16. As a result, the first column name is invalid, probably because the BOM of the CSV is interpreted as part of the first column name.
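To illustrate the suspected mechanism, here is a minimal Python sketch. It does not use DSS at all; the sample header and row are made up for the example. It only shows what happens when UTF-16 LE bytes that start with a BOM are decoded as iso-8859-15.

```python
import codecs

# Hypothetical sample (not the real file): a header and one row,
# encoded as little-endian UTF-16 with a leading BOM.
text = "name,city\r\n山田,東京\r\n"
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# ISO-8859-15 maps every byte to some character, so decoding never fails.
# The BOM bytes 0xFF 0xFE turn into visible characters, and each ASCII
# letter is followed by a NUL character.
wrong = raw.decode("iso-8859-15")
wrong_first_column = wrong.split(",")[0]

# The "utf-16" codec reads the BOM, picks the byte order and drops the BOM.
right = raw.decode("utf-16")
right_first_column = right.split(",")[0]

print(repr(wrong_first_column))
print(repr(right_first_column))
```

With the wrong charset the first header cell starts with the two decoded BOM bytes and is interleaved with NUL characters, which is consistent with an invalid first column name.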

Fortunately, I can edit the charset manually and it works. But in several "automatic" cases I will not be there to edit it :)

Is there a way to correct that?

To reproduce the issue, you can download this file: http://www.filedropper.com/romain

NOTE: The Unix "file" command correctly detects the file as "Little-endian UTF-16 Unicode text, with CRLF line terminators".
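For the unattended cases, one possible workaround is to convert the file to UTF-8 before it reaches DSS. The sketch below is only an assumption about how that could be done outside DSS: the helper names `sniff_bom_encoding` and `transcode_to_utf8` are hypothetical, and it relies solely on the BOM, so a file without a BOM falls back to the encoding passed as `fallback`.

```python
import codecs

def sniff_bom_encoding(head: bytes):
    """Return a Python codec name if `head` starts with a known BOM, else None."""
    # UTF-32 LE is tested before UTF-16 LE because its BOM (FF FE 00 00)
    # begins with the UTF-16 LE BOM (FF FE).
    candidates = [
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in candidates:
        if head.startswith(bom):
            return name
    return None

def transcode_to_utf8(src_path, dst_path, fallback="utf-8"):
    """Rewrite src_path as BOM-less UTF-8 in dst_path; return the source codec used."""
    with open(src_path, "rb") as f:
        head = f.read(4)
    encoding = sniff_bom_encoding(head) or fallback
    # newline="" keeps the original line terminators (CRLF here) untouched.
    with open(src_path, "r", encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
    return encoding
```

A call such as `transcode_to_utf8("input.csv", "input_utf8.csv")` (both paths are placeholders) could run as a pre-processing step. One known limitation of this sketch: a UTF-16 LE file whose first character after the BOM is NUL would be mistaken for UTF-32.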

1 Answer

0 votes
Hello Dataiku team,

This issue is not resolved. Is there any chance it will be fixed in the next minor release?

Hi, Would you be able to send us a sample of this file so we can try to reproduce? You can add it to this thread using a link from a file transfer service such as WeTransfer, or send it to me at alexandre dot combessie at dataiku dot com. Cheers, Alex

Hi Alex,

Thanks for your help. I sent you a sample file by email.

Thanks, I have been able to reproduce this issue. I have reported this to our R&D team.

This issue will be fixed in the forthcoming 5.1.3 release.

© Dataiku 2012-2018 - Privacy Policy