Hey,

I am trying to run TensorBoard from the visual analysis of my Keras model.

Here are all the packages installed in my conda env:

  1. scikit-learn>=0.19.1,<0.20
  2. statsmodels>=0.8,<0.9
  3. jinja2>=2.10,<2.11
  4. tensorflow
  5. flask
  6. tensorboard
  7. keras
  8. xgboost
  9. h5py=2.7.1
  10. pillow=5.1.0
  11. matplotlib
  12. gensim
  13. nltk
  14. seaborn
  15. bokeh

When I try to switch to TensorBoard, I get the following exception:

Backend died before start, caused by: CustomPythonKernelPythonException: Failed to run webapp backend
HTTP code: 500, type: com.dataiku.dip.webapps.backend.WebAppBackendRunner$BackendStartFailedException

Has anyone succeeded in running TensorBoard on Dataiku?

 

Cheers,

Matthew

Hi, you would need to look for the detailed error in the run/backend.log. Please don't hesitate to post the error here.
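For anyone following along, here is a minimal sketch of one way to pull the relevant part out of a large backend.log. The helper name `find_error_blocks` and the `after` parameter are made up for illustration; the sample lines are taken from the log posted in this thread. It simply collects each line containing "ERROR" or the start of a Python traceback, plus a few lines that follow it.

```python
def find_error_blocks(lines, after=10):
    """Return a list of blocks: each matching line plus the `after` lines following it."""
    blocks = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if "ERROR" in line or "Traceback (most recent call last)" in line:
            blocks.append(lines[i:i + after + 1])
            i += after + 1  # skip past the block so blocks do not overlap
        else:
            i += 1
    return blocks


# Sample lines copied from the crash log in this thread.
sample = [
    "2018-10-28 19:03:23,054 INFO Starting Webapp backend",
    "2018-10-28 19:03:23,903 ERROR Backend main loop failed",
    "Traceback (most recent call last):",
    "TypeError: a bytes-like object is required, not 'str'",
]

for block in find_error_blocks(sample, after=2):
    print("\n".join(block))
```

To run it on a real file, read the log with `open(path).read().splitlines()` and pass the result in; the exact location of run/backend.log depends on your DSS data directory.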
I have posted my logs, but my comment has been awaiting verification since yesterday evening.
[2018/10/28-19:03:24.101] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.kernels]  - Getting kernel tail
[2018/10/28-19:03:24.101] [FT-MainLoopThread-DUFAYJXo-1976] [WARN] [dku.webapps.backends]  - Recorded crash log: {
  "totalLines": 21,
  "lines": [
    "2018-10-28 19:03:23,054 INFO Starting Webapp backend",
    "2018-10-28 19:03:23,054 INFO Connecting to parent at port 37309",
    "2018-10-28 19:03:23,055 INFO Connected to parent at port 37309",
    "2018-10-28 19:03:23,055 INFO Webapp backend connected to DSS",
    "2018-10-28 19:03:23,055 INFO Starting backend for web app: TAP.TENSORBOARD_TAP-Ux3XEjx0-ah0dA8v8-s20",
    "/var/dataiku/code-envs/python/ML_3_6/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:923: DeprecationWarning: builtin type EagerTensor has no __module__ attribute",
    "  EagerTensor \u003d c_api.TFE_Py_InitEagerTensor(_EagerTensorBase)",
    "/var/dataiku/code-envs/python/ML_3_6/lib/python3.6/site-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()",
    "  return _inspect.getargspec(target)",
    "/var/dataiku/code-envs/python/ML_3_6/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 \u003d\u003d np.dtype(float).type`.",
    "  from ._conv import register_converters as _register_converters",
    "Using TensorFlow backend.",
    "2018-10-28 19:03:23,903 ERROR Backend main loop failed",
    "Traceback (most recent call last):",
    "  File \"/home/ubuntu/dataiku-dss-5.0.2/python/dataiku/webapps/backend.py\", line 37, in serve",
    "    exec(command[\"code\"], globals(), globals()) # in globals so that flask can find them",
    "  File \"\u003cstring\u003e\", line 85, in \u003cmodule\u003e",
    "  File \"\u003cstring\u003e\", line 78, in __get_tb_app",
    "  File \"\u003cstring\u003e\", line 62, in __get_custom_assets_zip_provider",
    "  File \"\u003cstring\u003e\", line 37, in customize_tb_page",
    "TypeError: a bytes-like object is required, not \u0027str\u0027"
  ],
  "status": [
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    1,
    3,
    1,
    1,
    1,
    1,
    1,
    1,
    3
  ],
  "maxLevel": 3
}
[2018/10/28-19:03:24.101] [FT-MainLoopThread-DUFAYJXo-1976] [WARN] [dku.futures]  - Future thread failed
com.dataiku.dip.webapps.backend.WebAppBackendRunner$BackendStartFailedException: Backend died before start
    at com.dataiku.dip.webapps.standard.backend.StandardWebAppBackendRunner.run(StandardWebAppBackendRunner.java:108)
    at com.dataiku.dip.webapps.backend.WebAppBackend$MainLoopThread.compute(WebAppBackend.java:116)
    at com.dataiku.dip.webapps.backend.WebAppBackend$MainLoopThread.compute(WebAppBackend.java:87)
    at com.dataiku.dip.futures.SimpleFutureThread.execute(SimpleFutureThread.java:35)
    at com.dataiku.dip.futures.FutureThread.run(FutureThread.java:105)
Caused by: com.dataiku.dip.io.CustomPythonKernelPythonException: Failed to run webapp backend
    at com.dataiku.dip.webapps.standard.backend.StandardWebAppBackendRunner.run(StandardWebAppBackendRunner.java:101)
    ... 4 more
[2018/10/28-19:03:24.345] [FT-StartWaitThread-h1UjuHah-1977] [WARN] [dku.futures]  - Future thread failed
com.dataiku.dip.webapps.backend.WebAppBackendRunner$BackendStartFailedException: Backend died before start
    at com.dataiku.dip.webapps.standard.backend.StandardWebAppBackendRunner.run(StandardWebAppBackendRunner.java:108)
    at com.dataiku.dip.webapps.backend.WebAppBackend$MainLoopThread.compute(WebAppBackend.java:116)
    at com.dataiku.dip.webapps.backend.WebAppBackend$MainLoopThread.compute(WebAppBackend.java:87)
    at com.dataiku.dip.futures.SimpleFutureThread.execute(SimpleFutureThread.java:35)
    at com.dataiku.dip.futures.FutureThread.run(FutureThread.java:105)
Caused by: com.dataiku.dip.io.CustomPythonKernelPythonException: Failed to run webapp backend
    at com.dataiku.dip.webapps.standard.backend.StandardWebAppBackendRunner.run(StandardWebAppBackendRunner.java:101)
    ... 4 more
[2018/10/28-19:03:25.820] [qtp688005825-1945] [DEBUG] [dku.tracing]  - [ct: 0] Start call: /api/admin/logs/get-content [GET] user=admin [name=backend.log]


Cheers,

Matthew
Hi Clément,

thanks for your quick response.

The log file contains the following information:

[2018/10/28-19:03:20.439] [qtp688005825-1956] [DEBUG] [dku.tracing]  - [ct: 0] Start call: /api/webapps/webapp-start-tensorboard [POST] user=admin [projectKey=TAP analysisId=Ux3XEjx0 taskId=ah0dA8v8 sessionId=s20]
[2018/10/28-19:03:20.441] [qtp688005825-1956] [INFO] [dku.webapps.proxy]  - [ct: 2] Init webapp with template: tensorboard
[2018/10/28-19:03:20.441] [qtp688005825-1956] [INFO] [dku.webapps.backends.manager]  - Setting running code for TENSORBOARD_TAP-Ux3XEjx0-ah0dA8v8-s20
[2018/10/28-19:03:20.441] [qtp688005825-1956] [INFO] [dku.webapps.backends]  - [ct: 2] Stopping backend for webapp TAP.TENSORBOARD_TAP-Ux3XEjx0-ah0dA8v8-s20
[2018/10/28-19:03:20.441] [qtp688005825-1956] [INFO] [dku.webapps.backends]  - [ct: 2] No 'runAs' field. Use current auth context
[2018/10/28-19:03:20.441] [qtp688005825-1956] [INFO] [dku.webapps.backends]  - [ct: 2] Starting webapp backend with authCtx: <AC:user:admin>
[2018/10/28-19:03:20.442] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.webapps.backends]  - [ct: 0] Start backend (re)start MainLoop
[2018/10/28-19:03:20.442] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.webapps.backends]  - [ct: 0] Build runner of type PROXY
[2018/10/28-19:03:20.442] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.webapps.standard]  - [ct: 0] Running web app python backend
[2018/10/28-19:03:20.443] [qtp688005825-1956] [DEBUG] [dku.tracing]  - [ct: 4] Done call: /api/webapps/webapp-start-tensorboard [POST] time=4ms user=admin [projectKey=TAP analysisId=Ux3XEjx0 taskId=ah0dA8v8 sessionId=s20]
[2018/10/28-19:03:20.443] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dip.tickets]  - [ct: 1] Creating API ticket for webapp_backend:TAP.TENSORBOARD_TAP-Ux3XEjx0-ah0dA8v8-s20 on behalf of admin id=webapp_backend:TAP.TENSORBOARD_TAP-Ux3XEjx0-ah0dA8v8-s20_FypLRZKfyxXF
[2018/10/28-19:03:20.443] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.block.link]  - Started a socket on port 37309
[2018/10/28-19:03:20.444] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dip.venv.selector]  - Select in project with {"useBuiltinEnv":false,"preventOverride":false,"envName":"ML_3_6"}
[2018/10/28-19:03:20.444] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.code.envs.resolution]  - [ct: 2] Executing Python activity in env: ML_3_6
[2018/10/28-19:03:20.444] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.python.single_command.kernel]  - Starting Python process for kernel  python-single-command-kernel
[2018/10/28-19:03:20.444] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.security.process]  - Starting process (regular)
[2018/10/28-19:03:20.445] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.security.process]  - Process started with pid=17478
[2018/10/28-19:03:20.445] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.processes.cgroups]  - [ct: 3] Will use cgroups []
[2018/10/28-19:03:20.445] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.processes.cgroups]  - [ct: 3] Applying rules to used cgroups: []
[2018/10/28-19:03:20.483] [qtp688005825-1956] [DEBUG] [dku.tracing]  - [ct: 1] Start call: /api/futures/get-update [GET] user=admin [futureId=h1UjuHah]
[2018/10/28-19:03:20.483] [qtp688005825-1956] [DEBUG] [dku.tracing]  - [ct: 1] Done call: /api/futures/get-update [GET] time=1ms user=admin [futureId=h1UjuHah]
[2018/10/28-19:03:21.026] [qtp688005825-1963] [DEBUG] [dku.tracing]  - [ct: 1] Start call: /api/futures/get-update [GET] user=admin [futureId=h1UjuHah]
[2018/10/28-19:03:21.026] [qtp688005825-1963] [DEBUG] [dku.tracing]  - [ct: 1] Done call: /api/futures/get-update [GET] time=1ms user=admin [futureId=h1UjuHah]
[2018/10/28-19:03:21.573] [qtp688005825-1945] [DEBUG] [dku.tracing]  - [ct: 1] Start call: /api/futures/get-update [GET] user=admin [futureId=h1UjuHah]
[2018/10/28-19:03:21.573] [qtp688005825-1945] [DEBUG] [dku.tracing]  - [ct: 1] Done call: /api/futures/get-update [GET] time=1ms user=admin [futureId=h1UjuHah]
[2018/10/28-19:03:23.054] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  - 2018-10-28 19:03:23,054 INFO Starting Webapp backend
[2018/10/28-19:03:23.054] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  - 2018-10-28 19:03:23,054 INFO Connecting to parent at port 37309
[2018/10/28-19:03:23.055] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.link.secret_protected]  - Connected to kernel
[2018/10/28-19:03:23.055] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  - 2018-10-28 19:03:23,055 INFO Connected to parent at port 37309
[2018/10/28-19:03:23.055] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  - 2018-10-28 19:03:23,055 INFO Webapp backend connected to DSS
[2018/10/28-19:03:23.055] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  - 2018-10-28 19:03:23,055 INFO Starting backend for web app: TAP.TENSORBOARD_TAP-Ux3XEjx0-ah0dA8v8-s20
[2018/10/28-19:03:23.192] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  - /var/dataiku/code-envs/python/ML_3_6/lib/python3.6/site-packages/tensorflow/python/framework/ops.py:923: DeprecationWarning: builtin type EagerTensor has no __module__ attribute
[2018/10/28-19:03:23.192] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  -   EagerTensor = c_api.TFE_Py_InitEagerTensor(_EagerTensorBase)
[2018/10/28-19:03:23.193] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  - /var/dataiku/code-envs/python/ML_3_6/lib/python3.6/site-packages/tensorflow/python/util/tf_inspect.py:75: DeprecationWarning: inspect.getargspec() is deprecated, use inspect.signature() or inspect.getfullargspec()
[2018/10/28-19:03:23.193] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  -   return _inspect.getargspec(target)
[2018/10/28-19:03:23.505] [qtp688005825-1956] [DEBUG] [dku.tracing]  - [ct: 1] Start call: /api/futures/get-update [GET] user=admin [futureId=h1UjuHah]
[2018/10/28-19:03:23.505] [qtp688005825-1956] [DEBUG] [dku.tracing]  - [ct: 1] Done call: /api/futures/get-update [GET] time=1ms user=admin [futureId=h1UjuHah]
[2018/10/28-19:03:23.508] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  - /var/dataiku/code-envs/python/ML_3_6/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
[2018/10/28-19:03:23.508] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  -   from ._conv import register_converters as _register_converters
[2018/10/28-19:03:23.839] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  - Using TensorFlow backend.
[2018/10/28-19:03:23.903] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  - 2018-10-28 19:03:23,903 ERROR Backend main loop failed
[2018/10/28-19:03:23.904] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  - Traceback (most recent call last):
[2018/10/28-19:03:23.904] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  -   File "/home/ubuntu/dataiku-dss-5.0.2/python/dataiku/webapps/backend.py", line 37, in serve
[2018/10/28-19:03:23.904] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  -     exec(command["code"], globals(), globals()) # in globals so that flask can find them
[2018/10/28-19:03:23.904] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  -   File "<string>", line 85, in <module>
[2018/10/28-19:03:23.904] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  -   File "<string>", line 78, in __get_tb_app
[2018/10/28-19:03:23.904] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  -   File "<string>", line 62, in __get_custom_assets_zip_provider
[2018/10/28-19:03:23.904] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  -   File "<string>", line 37, in customize_tb_page
[2018/10/28-19:03:23.904] [KNL-python-single-command-kernel-err-1983] [INFO] [dku.utils]  - TypeError: a bytes-like object is required, not 'str'
[2018/10/28-19:03:23.904] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.webapps.standard]  - [ct: 3462] Backend sends {"type":"ERROR"}
[2018/10/28-19:03:24.099] [KNL-python-single-command-kernel-monitor-1981] [INFO] [dku.kernels]  - Process done with code 0
[2018/10/28-19:03:24.099] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.kernels]  - Getting kernel tail
[2018/10/28-19:03:24.100] [FT-MainLoopThread-DUFAYJXo-1976] [ERROR] [dku.webapps.standard]  - Failure during kernel run (wasStarted=false)
com.dataiku.dip.io.CustomPythonKernelPythonException: Failed to run webapp backend
    at com.dataiku.dip.webapps.standard.backend.StandardWebAppBackendRunner.run(StandardWebAppBackendRunner.java:101)
    at com.dataiku.dip.webapps.backend.WebAppBackend$MainLoopThread.compute(WebAppBackend.java:116)
    at com.dataiku.dip.webapps.backend.WebAppBackend$MainLoopThread.compute(WebAppBackend.java:87)
    at com.dataiku.dip.futures.SimpleFutureThread.execute(SimpleFutureThread.java:35)
    at com.dataiku.dip.futures.FutureThread.run(FutureThread.java:105)
[2018/10/28-19:03:24.100] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dku.kernels]  - Getting kernel tail
[2018/10/28-19:03:24.100] [FT-MainLoopThread-DUFAYJXo-1976] [INFO] [dip.tickets]  - [ct: 3658] Destroying API ticket for webapp_backend:TAP.TENSORBOARD_TAP-Ux3XEjx0-ah0dA8v8-s20 on behalf of admin
[2018/10/28-19:03:24.100] [FT-MainLoopThread-DUFAYJXo-1976] [ERROR] [dku.webapps.backends]  - Backend start failed, aborting restart loop
com.dataiku.dip.webapps.backend.WebAppBackendRunner$BackendStartFailedException: Backend died before start
    at com.dataiku.dip.webapps.standard.backend.StandardWebAppBackendRunner.run(StandardWebAppBackendRunner.java:108)
    at com.dataiku.dip.webapps.backend.WebAppBackend$MainLoopThread.compute(WebAppBackend.java:116)
    at com.dataiku.dip.webapps.backend.WebAppBackend$MainLoopThread.compute(WebAppBackend.java:87)
    at com.dataiku.dip.futures.SimpleFutureThread.execute(SimpleFutureThread.java:35)
    at com.dataiku.dip.futures.FutureThread.run(FutureThread.java:105)
Caused by: com.dataiku.dip.io.CustomPythonKernelPythonException: Failed to run webapp backend
    at com.dataiku.dip.webapps.standard.backend.StandardWebAppBackendRunner.run(StandardWebAppBackendRunner.java:101)
    ... 4 more
Hi Matt, I approved your comment (we have a sensitive spam checker, sorry about that). Could you please also add the code of the image preprocessing step you used? You can find it in the Features Handling section (see https://www.dataiku.com/learn/guide/visual/machine-learning/deep-learning-images.html).
Hi Alex,
A spam filter definitely makes sense :D
Here is the feature handling step:

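from dataiku.doctor.deep_learning.preprocessing import TokenizerProcessor
from dataiku.doctor.deep_learning.shared_variables import set_variable

# Tokenize the text feature: keep the 10000 most frequent words, pad/truncate to 32 tokens
processor = TokenizerProcessor(num_words=10000, max_len=32)

# Share the processor so it can be retrieved elsewhere (e.g. in the model architecture code)
set_variable("tokenizer_processor", processor)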
Hi Matt,
Unfortunately, I am not able to reproduce your issue. I have successfully created a simple binary classification example (predicting polarity) using one text feature and training on 1 GPU. Is that similar to your setup? My feature preprocessing looks like this:

from dataiku.doctor.deep_learning.preprocessing import TokenizerProcessor
processor = TokenizerProcessor(num_words=10000, max_len=32)

Are you able to reproduce the problem with this preprocessing step?

Do I understand correctly that only TensorBoard is failing, and the model itself does finish training?

1 Answer

Hi Matt,

TensorBoard doesn't work for me on Python 3.6 either. Creating a Python 2.7 environment and using that instead seemed to work for me.
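The `TypeError: a bytes-like object is required, not 'str'` in the log above is the usual symptom of Python 3 refusing to mix `bytes` and `str`, which Python 2.7 lets through. That would be consistent with the 2.7 environment working. The source of `customize_tb_page` is not shown in this thread, so the following is only a sketch of that class of error and of one way to guard against it. `inject_banner` and the sample page are hypothetical and are not Dataiku code.

```python
def inject_banner(page, banner):
    """Insert `banner` before </body>; `page` and `banner` may each be str or bytes."""
    # Bring `banner` to the same type as `page` before combining them.
    if isinstance(page, bytes) and isinstance(banner, str):
        banner = banner.encode("utf-8")
    elif isinstance(page, str) and isinstance(banner, bytes):
        banner = banner.decode("utf-8")
    marker = b"</body>" if isinstance(page, bytes) else "</body>"
    return page.replace(marker, banner + marker)


# Hypothetical page content, e.g. as returned when reading a file in binary mode.
page = b"<html><body>TensorBoard</body></html>"

# On Python 3, calling a bytes method with str arguments raises a TypeError.
try:
    page.replace("</body>", "<p>DSS</p></body>")
except TypeError as err:
    print("TypeError:", err)

# Normalizing the types first avoids the problem.
print(inject_banner(page, "<p>DSS</p>"))
```

If you cannot change the code that raises the error (as is the case here, since it is shipped with DSS), switching the code environment to Python 2.7 as described above is the practical workaround.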