
Hello, I am trying to replicate the churn prediction case from the teachable.dataiku website, and I receive the following error:

[2017/04/26-19:02:26.285] [Exec-38] [INFO] [dku.utils]  - /home/dataiku/dss/pyenv/lib/python2.7/site-packages/unidecode/__init__.py:46: RuntimeWarning: Argument <type 'str'> is not an unicode object. Passing an encoded string will likely have unexpected results.
[2017/04/26-19:02:26.285] [Exec-38] [INFO] [dku.utils]  -   _warn_if_not_unicode(string)
[2017/04/26-19:02:26.340] [Exec-38] [INFO] [dku.utils]  - Traceback (most recent call last):
[2017/04/26-19:02:26.365] [Exec-38] [INFO] [dku.utils]  -   File "/home/dataiku/dss/lib/python/vw_transformer.py", line 99, in <module>
[2017/04/26-19:02:26.365] [Exec-38] [INFO] [dku.utils]  -     sys.stdout.write(vw_record + "\n")
[2017/04/26-19:02:26.365] [Exec-38] [INFO] [dku.utils]  - IOError: [Errno 32] Broken pipe
[2017/04/26-19:02:26.457] [Thread-23] [ERROR] [dku.flow.shell]  - Error while sending input to script
java.io.IOException: Broken pipe
	at java.io.FileOutputStream.writeBytes(Native Method)
	at java.io.FileOutputStream.write(FileOutputStream.java:326)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
	at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
	at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
	at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
	at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207)
	at java.io.BufferedWriter.flushBuffer(BufferedWriter.java:129)
	at java.io.BufferedWriter.write(BufferedWriter.java:230)
	at java.io.Writer.write(Writer.java:157)
	at java.io.Writer.append(Writer.java:227)
	at com.dataiku.dip.output.CSVOutputFormatter.appendExcelStyle(CSVOutputFormatter.java:109)
	at com.dataiku.dip.output.CSVOutputFormatter.appendFieldToLine(CSVOutputFormatter.java:198)
	at com.dataiku.dip.output.CSVOutputFormatter.format(CSVOutputFormatter.java:183)
	at com.dataiku.dip.output.StringOutputFormatter.format(StringOutputFormatter.java:33)
	at com.dataiku.dip.output.OutputStreamOutputWriter.emitRow(OutputStreamOutputWriter.java:32)
	at com.dataiku.dip.input.formats.csv.CSVFormatExtractor.doExtractStream(CSVFormatExtractor.java:366)
	at com.dataiku.dip.input.formats.csv.CSVFormatExtractor.doExtractStream(CSVFormatExtractor.java:161)
	at com.dataiku.dip.input.formats.ArchiveCapableFormatExtractor.run(ArchiveCapableFormatExtractor.java:135)
	at com.dataiku.dip.datasets.AbstractSingleThreadPusher.pushSplits(AbstractSingleThreadPusher.java:176)
	at com.dataiku.dip.datasets.UniversalSingleThreadPusher.push(UniversalSingleThreadPusher.java:226)
	at com.dataiku.dip.datasets.UniversalSingleThreadPusher.push(UniversalSingleThreadPusher.java:64)
	at com.dataiku.dip.recipes.code.shell.ShellScriptRecipeRunner$PipeInThread.run(ShellScriptRecipeRunner.java:220)
[2017/04/26-19:02:26.459] [Thread-23] [INFO] [dku.flow.shell]  - Closing the script input
[2017/04/26-19:02:26.463] [FRT-35-FlowRunnable] [INFO] [dku.flow.activity] - Run thread failed for activity compute_PG83dheF_NP
com.dataiku.dip.exceptions.ProcessDiedException: The shell process failed (exit code: 127). More info might be available in the logs.

The job seems to reach my Python code, but the script does not send any data back. I might be wrong.

Any help will be greatly appreciated.


1 Answer

Hi - did you set something in the "Pipe in" or "Pipe out" dropdown menus? Both need to be set to "--nothing--", as the Python script takes care of reading the input data directly.
Yes, I have tried every combination, with and without something set in "Pipe in" and "Pipe out", and I always receive the same error 127. I put a database in "Pipe in" and changed the value so that it does not go through the Python script, and received the same error. Here is the log:
[2017/04/26-21:13:03.509] [Exec-37] [INFO] [dku.utils]  -     State    Account_Length    Area_Code    Phone    Intl_Plan    VMail_Plan    VMail_Message    Day_Mins    Day_Calls    Day_Charge    Eve_Mins    Eve_Calls    Eve_Charge    Night_Mins    Night_Calls    Night_Charge    Intl_Mins    Intl_Calls    Intl_Charge    CustServ_Calls    Churn    splitter
[2017/04/26-21:13:03.510] [Exec-37] [INFO] [dku.utils]  -                        ^
[2017/04/26-21:13:03.510] [Exec-37] [INFO] [dku.utils]  - SyntaxError: invalid syntax
[2017/04/26-21:13:03.544] [Exec-37] [INFO] [dku.utils]  - /home/dataiku/dss/jobs/CHURNPREDICTION/Build_model_vw_2017-04-26T21-13-01.235/compute_PG83dheF_NP/shelljUDVJ4ANXQzQ/script.sh: line 36: --dataset=train: command not found
[2017/04/26-21:13:03.546] [Thread-23] [ERROR] [dku.flow.shell]  - Error while sending input to script
java.io.IOException: Broken pipe
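For context on the log above: exit code 127 is the status a POSIX shell returns when it cannot find a command, which matches the "command not found" line in script.sh. The "Broken pipe" is then most likely a consequence of the script having already exited while DSS was still sending it input. A minimal sketch of the exit code behaviour, assuming a POSIX `sh` is available; the command name is hypothetical and only stands in for a missing executable:

```python
import subprocess


def shell_exit_code(command):
    """Run `command` through sh and return the shell's exit status."""
    result = subprocess.run(
        ["sh", "-c", command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


# Hypothetical command name, chosen so that it does not exist on the machine.
missing = shell_exit_code("no_such_command_hypothetical_xyz --dataset=train")
print("exit code for a missing command:", missing)
```

On a POSIX shell this reports 127, the same code that DSS shows in the ProcessDiedException.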
Also, could you please run "vw --version" in a terminal on the server hosting DSS and post the output?
OK, I am not sure I understood. I searched for the command and did not find it, so I checked my versions, and they are the latest of version 8. I also tried to run my Python code in a terminal, but the dataiku package is not in my Python folder, so I can only use it through Dataiku, not from the terminal.
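If it is easier than using a terminal, the same check can be sketched in Python. This is only an illustration, assuming Python 3 (for `shutil.which`) and assuming it runs as the same user and with the same PATH as DSS; the helper name `vw_version` is made up for this example:

```python
import shutil
import subprocess


def vw_version(executable="vw"):
    """Return the output of `<executable> --version`, or None if the
    executable is not found on the PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        timeout=10,
    )
    return result.stdout.strip()


version = vw_version()
if version is None:
    print("vw was not found on the PATH")
else:
    print("vw --version reports:", version)
```

If this prints that vw was not found, the shell recipe would not be able to find it either, which would be consistent with exit code 127.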