The logs are self-explanatory.
They say "java.io.IOException: Incomplete HDFS URI, no host:
hdfs:/user/hduser/share/lib/spark-assembly.jar".
You need to specify the host in the HDFS URL above.
It should look something like the following:
hdfs://<namenode-host>:8020/user/hduser/share/lib/spark-assembly.jar
(replace <namenode-host> with the hostname of your NameNode)
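
If it helps, here is a minimal sketch of how one might check a URI for a missing host before handing it to Spark. It uses only Python's standard urllib.parse; the function name has_hdfs_host and the host namenode.example.com are hypothetical and chosen for illustration, not taken from your setup.

```python
from urllib.parse import urlparse


def has_hdfs_host(uri: str) -> bool:
    """Return True if uri uses the hdfs scheme and names a non-empty host."""
    parsed = urlparse(uri)
    # hostname is None when the authority part is absent ("hdfs:/path")
    # or when it contains only a port ("hdfs://:8020/path").
    return parsed.scheme == "hdfs" and bool(parsed.hostname)


# The URI from the log has no authority part, so no host is found.
print(has_hdfs_host("hdfs:/user/hduser/share/lib/spark-assembly.jar"))

# namenode.example.com is a hypothetical host, used only for illustration.
print(has_hdfs_host(
    "hdfs://namenode.example.com:8020/user/hduser/share/lib/spark-assembly.jar"
))
```

The first call prints False and the second prints True. This only checks the shape of the URI; it does not verify that the host is reachable or that the port is correct.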
-
We are running a batch job with the following specifications:
• Building RandomForest with config: maxbins=100, depth=19, num of trees = 20
• Multiple runs with different input data sizes: 2.8 GB, 10 million records
• We are running the Spark application on YARN in cluster mode, with 3