
Hello again, 

I'm having trouble using the Spark engine for shakers (prepare recipes). When I try to remove a column (or run any other operation) with the Spark engine, I get the following error:


 

[2019/05/09-19:55:30.876] [null-err-60] [INFO] [dku.utils]  - WARNING: An illegal reflective access operation has occurred
[2019/05/09-19:55:30.876] [null-err-60] [INFO] [dku.utils]  - WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/usr/local/spark/jars/spark-unsafe_2.12-2.4.2.jar) to method java.nio.Bits.unaligned()
[2019/05/09-19:55:30.876] [null-err-60] [INFO] [dku.utils]  - WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
[2019/05/09-19:55:30.877] [null-err-60] [INFO] [dku.utils]  - WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
[2019/05/09-19:55:30.877] [null-err-60] [INFO] [dku.utils]  - WARNING: All illegal access operations will be denied in a future release
[2019/05/09-19:55:30.894] [null-err-60] [INFO] [dku.utils]  - 2019-05-09 19:55:30,894 WARN spark.SparkConf: The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
[2019/05/09-19:55:31.166] [null-err-60] [INFO] [dku.utils]  - 2019-05-09 19:55:31,166 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[2019/05/09-19:55:31.691] [null-err-60] [INFO] [dku.utils]  - Exception in thread "main" java.lang.NoClassDefFoundError: scala/App$class
[2019/05/09-19:55:31.691] [null-err-60] [INFO] [dku.utils]  - 	at com.dataiku.dip.shaker.sparkimpl.ShakerSparkEntryPoint$.<init>(ShakerSparkEntryPoint.scala:15)
[2019/05/09-19:55:31.691] [null-err-60] [INFO] [dku.utils]  - 	at com.dataiku.dip.shaker.sparkimpl.ShakerSparkEntryPoint$.<clinit>(ShakerSparkEntryPoint.scala)
[2019/05/09-19:55:31.692] [null-err-60] [INFO] [dku.utils]  - 	at com.dataiku.dip.shaker.sparkimpl.ShakerSparkEntryPoint.main(ShakerSparkEntryPoint.scala)
[2019/05/09-19:55:31.692] [null-err-60] [INFO] [dku.utils]  - 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2019/05/09-19:55:31.692] [null-err-60] [INFO] [dku.utils]  - 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[2019/05/09-19:55:31.692] [null-err-60] [INFO] [dku.utils]  - 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2019/05/09-19:55:31.693] [null-err-60] [INFO] [dku.utils]  - 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[2019/05/09-19:55:31.693] [null-err-60] [INFO] [dku.utils]  - 	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
[2019/05/09-19:55:31.693] [null-err-60] [INFO] [dku.utils]  - 	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
[2019/05/09-19:55:31.693] [null-err-60] [INFO] [dku.utils]  - 	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
[2019/05/09-19:55:31.693] [null-err-60] [INFO] [dku.utils]  - 	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
[2019/05/09-19:55:31.694] [null-err-60] [INFO] [dku.utils]  - 	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
[2019/05/09-19:55:31.694] [null-err-60] [INFO] [dku.utils]  - 	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
[2019/05/09-19:55:31.694] [null-err-60] [INFO] [dku.utils]  - 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
[2019/05/09-19:55:31.694] [null-err-60] [INFO] [dku.utils]  - 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
[2019/05/09-19:55:31.694] [null-err-60] [INFO] [dku.utils]  - Caused by: java.lang.ClassNotFoundException: scala.App$class
[2019/05/09-19:55:31.695] [null-err-60] [INFO] [dku.utils]  - 	at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:471)
[2019/05/09-19:55:31.695] [null-err-60] [INFO] [dku.utils]  - 	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:588)
[2019/05/09-19:55:31.695] [null-err-60] [INFO] [dku.utils]  - 	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
[2019/05/09-19:55:31.696] [null-err-60] [INFO] [dku.utils]  - 	... 15 more
[2019/05/09-19:55:31.699] [null-err-60] [INFO] [dku.utils]  - 2019-05-09 19:55:31,699 INFO util.ShutdownHookManager: Shutdown hook called
[2019/05/09-19:55:31.701] [null-err-60] [INFO] [dku.utils]  - 2019-05-09 19:55:31,701 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-84cb5134-3b95-4aa2-bd5d-489c63c0abcf
[2019/05/09-19:55:31.738] [FRT-57-FlowRunnable] [INFO] [dku.flow.activity] - Run thread failed for activity compute_RC_2006_Parquet_prepared_NP
com.dataiku.dip.exceptions.ProcessDiedException: The Spark process failed (exit code: 1). More info might be available in the logs.
	at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.throwSubprocessError(AbstractCodeBasedActivityRunner.java:195)
	at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.handleExecutionResult(AbstractCodeBasedActivityRunner.java:168)
	at com.dataiku.dip.dataflow.exec.AbstractCodeBasedActivityRunner.execute(AbstractCodeBasedActivityRunner.java:102)
	at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.runUsingSparkSubmit(AbstractSparkBasedRecipeRunner.java:288)
	at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.doRunSpark(AbstractSparkBasedRecipeRunner.java:116)
	at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.runSpark(AbstractSparkBasedRecipeRunner.java:93)
	at com.dataiku.dip.dataflow.exec.AbstractSparkBasedRecipeRunner.runSpark(AbstractSparkBasedRecipeRunner.java:81)
	at com.dataiku.dip.recipes.shaker.ShakerSparkRecipeRunner.run(ShakerSparkRecipeRunner.java:51)
	at com.dataiku.dip.dataflow.jobrunner.ActivityRunner$FlowRunnableThread.run(ActivityRunner.java:366)
[2019/05/09-19:55:31.902] [ActivityExecutor-46] [INFO] [dku.flow.activity] running compute_RC_2006_Parquet_prepared_NP - activity is finished

I'm assuming it's a classpath issue? I've been trying to wrap my head around this for the past 2 hours, so any input would be appreciated.

Thanks in advance, 


1 Answer

Best answer

 

Hello,

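This error seems to indicate that your version of Spark was built with Scala 2.12, while the DSS Spark code was compiled against Scala 2.11 (Scala minor versions are not binary-compatible with each other). The missing class `scala.App$class` exists in Scala 2.11 but not in 2.12, and the jar name in your log (`spark-unsafe_2.12-2.4.2.jar`) points to a Scala 2.12 build.
Spark 2.4.2 was released built with Scala 2.12 by default, unlike the other Spark 2.x releases (including 2.4.3 and presumably future 2.4.x), which are built with Scala 2.11.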

DSS supports Spark 2 up to 2.3.x. Spark 2.4 should work when it is built with Scala 2.11, but not with Scala 2.12. So you can try one of the following:

  • use Spark 2.4.2, but a build made with Scala 2.11
  • use Spark 2.4.3 (the default build uses Scala 2.11)
  • use another version of Spark
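
To check which Scala version a given Spark install was built with, you can look at the jar names in its `jars` directory. Below is a minimal Python sketch, assuming the usual `<artifact>_<scala version>-<spark version>.jar` naming seen in the log above. The function names `scala_binary_version` and `scala_versions_in` are made up for illustration and are not part of Spark or DSS.

```python
import os
import re

# Assumed naming convention, e.g. spark-unsafe_2.12-2.4.2.jar:
# <artifact>_<scala binary version>-<spark version>.jar
JAR_NAME = re.compile(
    r"^(?P<artifact>[\w.-]+?)_(?P<scala>\d+\.\d+)-(?P<spark>\d+(?:\.\d+)+)\.jar$"
)


def scala_binary_version(jar_path):
    """Return (scala_version, spark_version) parsed from a jar name, or None."""
    match = JAR_NAME.match(os.path.basename(jar_path))
    if match is None:
        return None  # not a Scala-versioned jar (or an unexpected name)
    return match.group("scala"), match.group("spark")


def scala_versions_in(jars_dir):
    """Collect the Scala versions found among spark-* jars in a directory."""
    versions = set()
    for name in os.listdir(jars_dir):
        parsed = scala_binary_version(name)
        if name.startswith("spark-") and parsed is not None:
            versions.add(parsed[0])
    return versions


# Jar path taken from the log in the question.
print(scala_binary_version("/usr/local/spark/jars/spark-unsafe_2.12-2.4.2.jar"))
```

For the jar in the question this gives `('2.12', '2.4.2')`. Running `scala_versions_in("/usr/local/spark/jars")` on a build that should work with DSS would be expected to return only `2.11`.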

Hope it helps

Thanks Adrien, I will try Spark 2.4.3.