Reactive Spark: generalizing streaming API from micro-batches to event-based

2016-07-09 Thread Ahmed Mahran
Hi, I'd like to present an idea about generalizing the legacy streaming API a bit. The streaming API assumes a fixed-interval micro-batch model: streaming data are allocated to a batch, and jobs are submitted, once every fixed amount of time (aka batchDuration). This model could be extended

Spark application Runtime Measurement

2016-07-09 Thread Fei Hu
Dear all, I have a question about how to measure the runtime of a Spark application. Here is an example: - On the Spark UI: the total duration is 2.0 minutes = 120 seconds, as follows [image: Screen Shot 2016-07-09 at 11.45.44 PM.png] - However, when I check the jobs launched by