container is running beyond physical memory

UserBird
Dataiker
The logs report the following error: [11:06:04] [INFO] [dku.proc] - Container [pid=69410,containerID=container_e36_1497242348080_3059_01_000062] is running beyond physical memory limits. Current usage: 2.7 GB of 2 GB physical memory used; 4.5 GB of 4.2 GB virtual memory used. Killing container.

Any idea?
6 Replies
jereze
Community Manager

It seems that you don't have enough RAM. How many GB of RAM do you have on the server? Or is it a virtual machine?



But it can also be normal: the message is logged as INFO, not as ERROR.



From the documentation:



A minimum of 8 GB of RAM is required. More RAM can be required if you intend to load large datasets in memory (for example in the IPython notebook component). 16 GB of RAM are highly recommended.

Jeremy, Product Manager at Dataiku
brunoperez
Level 1
Thanks for your response.
It's a physical machine with 128 GB of RAM!
So I'm surprised by this INFO message.
Any idea?
jereze
Community Manager
Do you have any crash or error in DSS? It could just be normal.
Jeremy, Product Manager at Dataiku
brunoperez
Level 1
It's an error; see the log extract below:
[2017/06/14-12:38:49.678] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - Task with the most failures(4):
[2017/06/14-12:38:49.679] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - -----
[2017/06/14-12:38:49.679] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - Task ID:
[2017/06/14-12:38:49.679] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - task_1497242348080_2137_m_001445
[2017/06/14-12:38:49.679] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP -
[2017/06/14-12:38:49.679] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - URL:
[2017/06/14-12:38:49.680] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1497242348080_2137&tipid=task_1497242348080_2137_m_001445
[2017/06/14-12:38:49.680] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - -----
[2017/06/14-12:38:49.680] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - Diagnostic Messages for this Task:
[2017/06/14-12:38:49.680] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - Container [pid=5312,containerID=container_e36_1497242348080_2137_01_000949] is running beyond physical memory limits. Current usage: 2.8 GB of 2 GB physical memory used; 4.6 GB of 4.2 GB virtual memory used. Killing container.
[2017/06/14-12:38:49.680] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - Dump of the process-tree for container_e36_1497242348080_2137_01_000949 :
[2017/06/14-12:38:49.680] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
[2017/06/14-12:38:49.681] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - |- 5312 5310 5312 5312 (bash) 0 0 108703744 267 /bin/bash -c /usr/java/default/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Djava.net.preferIPv4Stack=true -Xmx1600000000 -Djava.io.tmpdir=/u07/hadoop/yarn/nm/usercache/dataiku/appcache/application_1497242348080_2137/container_e36_1497242348080_2137_01_000949/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1497242348080_2137/container_e36_1497242348080_2137_01_000949 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.116.35.19 24085 attempt_1497242348080_2137_m_001445_3 39582418600885 1>/var/log/hadoop-yarn/container/application_1497242348080_2137/container_e36_1497242348080_2137_01_000949/stdout 2>/var/log/hadoop-yarn/container/application_1497242348080_2137/container_e36_1497242348080_2137_01_000949/stderr
[2017/06/14-12:38:49.681] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - |- 5337 5312 5312 5312 (java) 14934 617 4845563904 732116 /usr/java/default/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Djava.net.preferIPv4Stack=true -Xmx1600000000 -Djava.io.tmpdir=/u07/hadoop/yarn/nm/usercache/dataiku/appcache/application_1497242348080_2137/container_e36_1497242348080_2137_01_000949/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1497242348080_2137/container_e36_1497242348080_2137_01_000949 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.116.35.19 24085 attempt_1497242348080_2137_m_001445_3 39582418600885
[2017/06/14-12:38:49.681] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP -
[2017/06/14-12:38:49.681] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - Container killed on request. Exit code is 143
[2017/06/14-12:38:49.681] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - Container exited with a non-zero exit code 143
[2017/06/14-12:38:49.681] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP -
[2017/06/14-12:38:49.682] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP -
[2017/06/14-12:38:50.993] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
[2017/06/14-12:38:50.993] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - MapReduce Jobs Launched:
[2017/06/14-12:38:50.994] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - Stage-Stage-1: Map: 1770 Cumulative CPU: 626347.87 sec HDFS Read: 438711897594 HDFS Write: 430789011622 FAIL
[2017/06/14-12:38:50.994] [Exec-75] [INFO] [dku.proc] act.compute_journee_postale_traitements_meca_tae_NP - Total MapReduce CPU Time Spent: 7 days 5 hours 59 minutes 7 seconds 870 msec
[2017/06/14-12:38:51.374] [FRT-73-FlowRunnable] [DEBUG] [dku.hadoop] act.compute_journee_postale_traitements_meca_tae_NP - Initializing Hadoop FS with context UGI: dataiku (auth:SIMPLE) (login: dataiku (auth:SIMPLE))
[2017/06/14-12:38:51.391] [FRT-73-FlowRunnable] [INFO] [dku.flow.activity] act.compute_journee_postale_traitements_meca_tae_NP - Run thread failed for activity compute_journee_postale_traitements_meca_tae_NP
java.lang.Exception: Failed to execute Hive script, please check job logs
at com.dataiku.dip.recipes.code.hive.AbstractHiveRecipeRunner.startAndWaitHive(AbstractHiveRecipeRunner.java:207)
at com.dataiku.dip.recipes.code.hive.HiveRecipeRunner.run(HiveRecipeRunner.java:139)
at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:352)
[2017/06/14-12:38:51.488] [ActivityExecutor-53] [INFO] [dku.flow.activity] running compute_journee_postale_traitements_meca_tae_NP - activity is finished
[2017/06/14-12:38:51.489] [ActivityExecutor-53] [ERROR] [dku.flow.activity] running compute_journee_postale_traitements_meca_tae_NP - Activity failed
java.lang.Exception: Failed to execute Hive script, please check job logs
at com.dataiku.dip.recipes.code.hive.AbstractHiveRecipeRunner.startAndWaitHive(AbstractHiveRecipeRunner.java:207)
at com.dataiku.dip.recipes.code.hive.HiveRecipeRunner.run(HiveRecipeRunner.java:139)
at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:352)
[2017/06/14-12:38:51.489] [ActivityExecutor-53] [INFO] [dku.flow.activity] running compute_journee_postale_traitements_meca_tae_NP - Executing default post-activity lifecycle hook
[2017/06/14-12:38:51.494] [ActivityExecutor-53] [DEBUG] [dku.datasets.hdfs] running compute_journee_postale_traitements_meca_tae_NP - Built HDFS dataset handler dataset=DISCOVERER_RETURNS.journee_postale_traitements_meca_tae connection=connection_philippe configuredRoot=/data/discovery_layer/poc_philippe/DISCOVERER_RETURNS.journee_postale_traitements_meca_tae effectiveRoot=/data/discovery_layer/poc_philippe/DISCOVERER_RETURNS.journee_postale_traitements_meca_tae
[2017/06/14-12:38:51.494] [ActivityExecutor-53] [INFO] [dku.flow.activity] running compute_journee_postale_traitements_meca_tae_NP - Removing samples for DISCOVERER_RETURNS.journee_postale_traitements_meca_tae
[2017/06/14-12:38:51.496] [ActivityExecutor-53] [INFO] [dku.flow.activity] running compute_journee_postale_traitements_meca_tae_NP - Done post-activity tasks
jereze
Community Manager
Ok, indeed, there is an error with Hive ("Failed to execute Hive script"). I guess you are a customer of Dataiku, so you can contact the support team and they will help you.
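For what it's worth, the log shows each map task limited to a 2 GB YARN container (with a 1.6 GB JVM heap, `-Xmx1600000000`) while actual usage reached 2.8 GB, so YARN killed the container (exit code 143 is a SIGTERM kill). A common workaround, sketched below with purely illustrative values (confirm the right sizes with support and your Hadoop admin), is to raise the per-task container memory and JVM heap in the Hive recipe's settings:

```sql
-- Illustrative values only: raise the YARN container size for each map task
-- and keep the JVM heap around 80% of it, leaving headroom for off-heap memory.
SET mapreduce.map.memory.mb=4096;
SET mapreduce.map.java.opts=-Xmx3276m;
-- If reduce tasks are also killed, raise their limits the same way:
SET mapreduce.reduce.memory.mb=4096;
SET mapreduce.reduce.java.opts=-Xmx3276m;
```

The gap between the heap (`-Xmx`) and the container size matters: the overage here is off-heap usage (native buffers, JVM overhead), which the heap setting alone does not cap.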
Jeremy, Product Manager at Dataiku
brunoperez
Level 1
Thanks a lot for your time.
Yes, I'm a customer of Dataiku.
I will follow up with the support team.
