Have you tried looking at the Spark UI to see how long the load from
HDFS takes?
The Spark UI runs on port 4040 by default. However, you can set a different
port in spark-submit:
    ${SPARK_HOME}/bin/spark-submit \
    ...
    --conf "spark.ui.port="
and access it at hostname:port.
HTH
Dr Mich Talebzadeh
Hello,
I want to ask if there is any way to measure the HDFS data loading time at
the start of my program. I tried adding an action, e.g. count(), after the
val data = sc.textFile() call, but I noticed that my program takes more time
to finish than it did before I added the count call. Is there any other way
to do it?
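
A likely reason for the extra time: sc.textFile() is lazy, so the added count() forces a full read of the file, and unless the RDD is cached, every later action reads it from HDFS again. Below is a minimal sketch of timing that first read, not a definitive method. The object name LoadTiming, the helper timed, and the input path are hypothetical, and the measured time covers the read plus the count and caching overhead, so it is closer to an upper bound on the load time.

```scala
import org.apache.spark.sql.SparkSession

object LoadTiming {
  // Hypothetical helper: runs `block`, prints the elapsed wall-clock time,
  // and returns the block's result unchanged.
  def timed[T](label: String)(block: => T): T = {
    val start = System.nanoTime()
    val result = block
    val elapsedMs = (System.nanoTime() - start) / 1e6
    println(f"$label took $elapsedMs%.1f ms")
    result
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("LoadTiming").getOrCreate()
    val sc = spark.sparkContext

    // Placeholder path (an assumption): pass the real HDFS location as the first argument.
    val path = args.headOption.getOrElse("hdfs:///path/to/input")

    // textFile is lazy: nothing is read here. cache() only marks the RDD
    // so that the first action keeps its partitions in memory.
    val data = sc.textFile(path).cache()

    // The first action triggers the actual read from HDFS and fills the cache.
    val lineCount = timed("load + count")(data.count())
    println(s"$lineCount lines")

    // Later actions on `data` reuse the cached partitions (where they fit in
    // memory) instead of going back to HDFS.

    spark.stop()
  }
}
```

The same stage also appears in the Spark UI with its duration, which gives a second reading to compare against the printed figure.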