Hi,

I have been playing around with Spark for a couple of days. I am using spark-1.0.1-bin-hadoop1 and the Java API. The main idea of the implementation is to run Hive queries on Spark, and I used JavaHiveContext to achieve this (as per the examples).
I have two questions.

1. How can I get the execution time of a Spark job? Does Spark provide monitoring facilities in the form of an API?

2. As a rough measure, I obtained the execution time by wrapping the JavaHiveContext.hql() call with System.nanoTime(), as follows:

    long start, end;
    JavaHiveContext hiveCtx;
    JavaSchemaRDD hiveResult;

    start = System.nanoTime();
    hiveResult = hiveCtx.hql(query);
    end = System.nanoTime();
    System.out.println(end - start);

But the result I got is drastically different from the execution times recorded in the Spark UI. Can you please explain this disparity?

Look forward to hearing from you.

rgds

--
Niranda Perera
Software Engineer, WSO2 Inc.
Mobile: +94-71-554-8430
Twitter: @n1r44 <https://twitter.com/N1R44>
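P.S. In case it helps, here is a fuller sketch of the kind of timing code I have in mind. The SparkConf/JavaSparkContext setup and the count() call are my own assumptions about what is needed to force the query to actually execute before the end timestamp is taken (query is just a placeholder for my HiveQL string):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.api.java.JavaSchemaRDD;
    import org.apache.spark.sql.hive.api.java.JavaHiveContext;

    public class HiveQueryTiming {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("HiveQueryTiming");
            JavaSparkContext sc = new JavaSparkContext(conf);
            JavaHiveContext hiveCtx = new JavaHiveContext(sc);

            String query = args[0]; // the HiveQL statement to time

            long start = System.nanoTime();
            JavaSchemaRDD hiveResult = hiveCtx.hql(query);
            // Assumption: call an action such as count() so the query is
            // actually executed before the end timestamp is taken.
            long rows = hiveResult.count();
            long end = System.nanoTime();

            System.out.println("Returned " + rows + " rows in "
                    + (end - start) / 1000000 + " ms");

            sc.stop();
        }
    }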