a
lot of processing overhead. I couldn't figure out exactly why, but it
seemed to have something to do with foreachRDD only being executed on the
driver.
On Thu, Aug 20, 2015 at 1:39 PM, Iulian Dragoș wrote:
> On Thu, Aug 20, 2015 at 6:58 PM, Justin Grimes wrote:
>
We are aggregating real-time logs of events and want to use 30-minute
windows. However, since the computation doesn't start until 30 minutes have
passed, a ton of data builds up that processing could already have
started on. When it comes time to actually process the data, there is too
mu