Hello,
Spark won't see hdfs:/// and just looks for file:/// when I'm trying to process an HDFS-managed dataset. I followed the how-to link at:
https://www.dataiku.com/learn/guide/spark/tips-and-troubleshooting.html
However, I couldn't figure out what to edit. Here is my env-spark.sh in DATA_DIR/bin/:
```
export DKU_SPARK_ENABLED=true
export DKU_SPARK_HOME='/usr/local/spark'
export DKU_SPARK_VERSION='2.4.2'
export PYSPARK_DRIVER_PYTHON="$DKUPYTHONBIN"
export DKU_PYSPARK_PYTHONPATH='/usr/local/spark/python:/usr/local/spark/python/lib/py4j-0.10.7-src.zip'
if [ -n "$DKURBIN" ]; then
    export SPARKR_DRIVER_R="$DKURBIN"
fi
```
My Hadoop is located at /usr/local/hadoop and Spark at /usr/local/spark.
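Is it possible that something like this is missing from my env-spark.sh? As I understand it, Spark resolves hdfs:/// URIs through the Hadoop configuration (fs.defaultFS in core-site.xml), which it locates via HADOOP_CONF_DIR; without that, it falls back to file:///. A sketch of what I think the export would look like, assuming the standard Hadoop directory layout under my install (the exact path is my guess):

```shell
# Hypothetical addition to env-spark.sh -- path assumes the standard layout
# under /usr/local/hadoop. Spark reads fs.defaultFS from core-site.xml in
# this directory to resolve hdfs:/// URIs; without it, it defaults to file:///.
export HADOOP_CONF_DIR='/usr/local/hadoop/etc/hadoop'
```

I'm not sure whether this belongs in env-spark.sh or somewhere else in DATA_DIR/bin/, so any pointers would be appreciated.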
Can you please help me? Thanks in advance.