Re: GraphX - ConnectedComponents (Pregel) - longer and longer interval between jobs

Thomas Gerber Fri, 26 Jun 2015 09:15:28 -0700

Note that this problem is probably NOT caused directly by GraphX, but
GraphX reveals it because as you go further down the iterations, you get
further and further away of a shuffle you can rely on.


On Thu, Jun 25, 2015 at 7:43 PM, Thomas Gerber <[email protected]>
wrote:

> Hello,
>
> We run GraphX ConnectedComponents, and we notice that there is a time gap
> that becomes larger and larger during Jobs, that is not accounted for.
>
> In the screenshot attached, you will notice that each job only takes
> around 2 1/2min. At first, the next job/iteration starts immediately after
> the previous one. But as we go through iterations, there is a gap (time
> where job N+1 starts - time where job N finishes) that grows, reaching
> ultimately 6 minutes around the 30th iteration .
>
> I suspect it has to do with DAG computation on the driver, as evidenced by
> the very large (and getting larger at every iteration) of pending stages
> that are ultimately skipped.
>
> So,
> 1. is there anything obvious we can do to make that "gap" between
> iterations shorter?
> 2. would dividing the number of partitions in the input RDD per 2 divide
> the gap by 2 as well?
>
> I ask because 3 min gap on average for a job length of 2 1/2 min => we are
> "wasting" 50% of CPU time on the Executors.
>
> Thanks!
> Thomas
>

Re: GraphX - ConnectedComponents (Pregel) - longer and longer interval between jobs

Reply via email to