Hi,

I'm a newbie, so please bear with me.

*I'm using a Windows 10 machine. I installed Spark here:*
C:\spark-2.1.0-bin-hadoop2.7\spark-2.1.0-bin-hadoop2.7

*I also installed H2O Sparkling Water here:*
C:\sparkling-water-2.1.1

*I run these commands at the command line to launch a Jupyter notebook for PySpark:*
cd C:\spark-2.1.0-bin-hadoop2.7\spark-2.1.0-bin-hadoop2.7
bin\pyspark --executor-cores 2

*I then run this code inside the Jupyter notebook to start Spark:*
import os
import sys

# Raw string so the backslashes in the Windows path are not treated as escapes.
spark_home = r"C:\spark-2.1.0-bin-hadoop2.7\spark-2.1.0-bin-hadoop2.7"

if not spark_home:
    raise ValueError('SPARK_HOME environment variable is not set')

# Make the pyspark package and its bundled py4j importable.
sys.path.insert(0, os.path.join(spark_home, 'python'))
sys.path.insert(0, os.path.join(spark_home, 'python', 'lib', 'py4j-0.10.4-src.zip'))

# Run PySpark's shell bootstrap, which creates the `spark` and `sc` objects.
exec(open(os.path.join(spark_home, 'python', 'pyspark', 'shell.py')).read())
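
In case it helps anyone reproduce this, here is a minimal sketch of the same path setup written as a small helper. The function names `spark_python_paths` and `add_spark_to_sys_path` are hypothetical (made up for illustration). The sketch assumes the usual Spark layout, with a `python` directory and a `py4j-*-src.zip` under `python\lib`. It only prepares `sys.path`; it does not start Spark or H2O.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the entries PySpark needs on sys.path.

    Assumption: <spark_home>/python and <spark_home>/python/lib/py4j-*-src.zip
    exist, as in a standard Spark binary distribution.
    """
    python_dir = os.path.join(spark_home, "python")
    lib_dir = os.path.join(python_dir, "lib")
    # Match whichever py4j version is bundled instead of hard-coding 0.10.4.
    py4j_zips = sorted(glob.glob(os.path.join(lib_dir, "py4j-*-src.zip")))
    if not py4j_zips:
        raise FileNotFoundError("no py4j-*-src.zip found under " + lib_dir)
    return [python_dir, py4j_zips[-1]]


def add_spark_to_sys_path(spark_home):
    """Prepend the PySpark entries to sys.path, skipping any already present."""
    # Insert in reverse so the final order matches spark_python_paths().
    for path in reversed(spark_python_paths(spark_home)):
        if path not in sys.path:
            sys.path.insert(0, path)
```

With my install this would be called as `add_spark_to_sys_path(r"C:\spark-2.1.0-bin-hadoop2.7\spark-2.1.0-bin-hadoop2.7")` before running `shell.py`.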


*My question is: how can I get H2O running in the same Jupyter notebook?*
I spent hours searching Google for the answer but couldn't figure out how
to do it.

Many thanks in advance!
Zeming
