Hi,

I have been playing around with Spark for a couple of days. I am using spark-1.0.1-bin-hadoop1 and the Java API. The main idea of the implementation is to run Hive queries on Spark, and I used JavaHiveContext to achieve this (as per the examples).
I have two questions.

1. How can I get the execution time of a Spark job? Does Spark provide monitoring facilities in the form of an API?

2. As a rough measure, I obtained the execution time by wrapping the JavaHiveContext.hql() call with System.nanoTime(), as follows:

    long start, end;
    JavaHiveContext hiveCtx;
    JavaSchemaRDD hiveResult;

    start = System.nanoTime();
    hiveResult = hiveCtx.hql(query);
    end = System.nanoTime();
    System.out.println(end - start);

But the result I got is drastically different from the execution times recorded in the Spark UI. Can you please explain this disparity?

Look forward to hearing from you.

rgds

--
Niranda Perera
Software Engineer, WSO2 Inc.
Mobile: +94-71-554-8430
Twitter: @n1r44 <https://twitter.com/N1R44>
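P.S. In case it helps, here is a fuller sketch of the kind of timing code I have in mind. The SparkConf/JavaSparkContext setup and the count() call are my own assumptions about what is needed to force the query to actually execute before the end timestamp is taken (query is just a placeholder for my HiveQL string):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.api.java.JavaSchemaRDD;
    import org.apache.spark.sql.hive.api.java.JavaHiveContext;

    public class HiveQueryTiming {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("HiveQueryTiming");
            JavaSparkContext sc = new JavaSparkContext(conf);
            JavaHiveContext hiveCtx = new JavaHiveContext(sc);

            String query = args[0]; // the HiveQL statement to time

            long start = System.nanoTime();
            JavaSchemaRDD hiveResult = hiveCtx.hql(query);
            // Assumption: call an action such as count() so the query is
            // actually executed before the end timestamp is taken.
            long rows = hiveResult.count();
            long end = System.nanoTime();

            System.out.println("Returned " + rows + " rows in "
                    + (end - start) / 1000000 + " ms");

            sc.stop();
        }
    }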