I have VirtualBox up and running and have downloaded the Spark file "spark-2.4.3-bin-hadoop2.7.tgz".

I'm trying to install this and then Sparkling Water, but I'm really struggling with how to do it.


Whenever I try any commands, it says "command not found".


I have VERY limited knowledge of working at a Linux command prompt. What are my next steps in order to install Spark on my virtual machine?


Thanks for looking


To be very honest, this is almost impossible without some knowledge of the Linux command line.

Also, your Spark and/or H2O will not actually be distributed, which significantly limits the benefit they bring compared to simple in-memory machine learning. What is it that you want to do, more precisely?


Hi Clement!

Thanks for taking your time to reply.

I don't mind learning about Linux and how to do this, and I've posted some questions on a Linux forum to get me started.

I basically have the free version and I am using it on a small to medium-sized data set in memory.

I just wanted to use the Naive Bayes algorithm and have it integrated into the DSS workflow.
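
For a data set that fits in memory, Naive Bayes itself is small enough to sketch in plain Python, which may help when judging whether Spark is needed at all. The following is only a minimal illustration, assuming categorical features and Laplace smoothing; the class name CategoricalNaiveBayes and the toy weather data are hypothetical, and none of this is the DSS, Spark, or Sparkling Water API.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Tiny in-memory Naive Bayes for categorical features (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.n_features = len(rows[0])
        # value_counts[label][feature_index][value] -> number of occurrences
        self.value_counts = defaultdict(
            lambda: [Counter() for _ in range(self.n_features)]
        )
        # Distinct values seen per feature, needed for the smoothing denominator
        self.vocab = [set() for _ in range(self.n_features)]
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[label][i][value] += 1
                self.vocab[i].add(value)
        return self

    def predict(self, row):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Start from the log prior, then add one log likelihood per feature
            score = math.log(count / total)
            for i, value in enumerate(row):
                seen = self.value_counts[label][i][value]
                denom = count + self.alpha * len(self.vocab[i])
                score += math.log((seen + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: (outlook, temperature) -> play outside?
rows = [
    ("sunny", "hot"),
    ("sunny", "mild"),
    ("rainy", "mild"),
    ("rainy", "cool"),
    ("overcast", "hot"),
]
labels = ["no", "no", "yes", "yes", "yes"]

model = CategoricalNaiveBayes().fit(rows, labels)
print(model.predict(("rainy", "cool")))
print(model.predict(("sunny", "hot")))
```

Working in log space avoids numeric underflow when many features are multiplied together, and the smoothing term keeps a value never seen with a given class from forcing its probability to zero.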
