
Hello, 

Spark won't see hdfs:/// and just looks for file:/// when I'm trying to process an HDFS managed dataset. I followed the how-to at:

https://www.dataiku.com/learn/guide/spark/tips-and-troubleshooting.html

However, I couldn't figure out what to edit. Here is my env-spark.sh in DATA_DIR/bin/:

```
export DKU_SPARK_ENABLED=true
export DKU_SPARK_HOME='/usr/local/spark'
export DKU_SPARK_VERSION='2.4.2'
export PYSPARK_DRIVER_PYTHON="$DKUPYTHONBIN"
export DKU_PYSPARK_PYTHONPATH='/usr/local/spark/python:/usr/local/spark/python/lib/py4j-0.10.7-src.zip'
if [ -n "$DKURBIN" ]; then
  export SPARKR_DRIVER_R="$DKURBIN"
fi

```
My Hadoop is located at /usr/local/hadoop and Spark is located at /usr/local/spark.

Can you please help me? Thanks in advance. 


1 Answer

Best answer
Solved it by adding the Hadoop install and config paths to env-spark.sh:

```
# Point at the Hadoop install and, crucially, its config directory, so
# Spark can read core-site.xml / hdfs-site.xml and resolve hdfs:/// paths
export HADOOP_INSTALL='/usr/local/hadoop'
export HADOOP_CONF_DIR='/usr/local/hadoop/etc/hadoop'
export DKU_SPARK_ENABLED=true
export DKU_SPARK_HOME='/usr/local/spark'
export DKU_SPARK_VERSION='2.4.2'
export PYSPARK_DRIVER_PYTHON="$DKUPYTHONBIN"
export DKU_PYSPARK_PYTHONPATH='/usr/local/spark/python:/usr/local/spark/python/lib/py4j-0.10.7-src.zip'
if [ -n "$DKURBIN" ]; then
  export SPARKR_DRIVER_R="$DKURBIN"
fi

```
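For anyone who lands here later: the line doing the real work is HADOOP_CONF_DIR. Spark resolves hdfs:/// (and unqualified) paths through fs.defaultFS in core-site.xml, and it can only find that file once HADOOP_CONF_DIR points at the Hadoop config directory; without it, everything falls back to the local file:/// filesystem. Here's a quick sanity check, as a sketch assuming the /usr/local/hadoop layout from the question (your namenode URI will differ):

```
# Make the Hadoop config visible, same as in env-spark.sh above
export HADOOP_CONF_DIR='/usr/local/hadoop/etc/hadoop'

# Print the default filesystem URI Spark will inherit; it should be
# the cluster address (e.g. hdfs://<namenode>:8020), not file:///
/usr/local/hadoop/bin/hdfs getconf -confKey fs.defaultFS
```

If that still prints file:///, the config directory doesn't contain the cluster's core-site.xml, and Spark will keep reading local paths no matter what else env-spark.sh says.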
Thanks anyway, my dudes.