It looks like https://issues.apache.org/jira/browse/SPARK-13346
is the same issue. It seems that for some people persist() doesn't work,
and they have to convert to RDDs and back.

On Fri, Apr 14, 2017 at 1:39 PM, Everett Anderson <ever...@nuna.com> wrote:

> Hi,
>
> We keep hitting a situation on Spark 2.0.2 (haven't tested later versions
> yet) where the driver spins forever, seemingly in query plan optimization,
> for moderate queries such as the union of a few (~5) other DataFrames.
>
> We can see the driver spinning with one core in the nioEventLoopGroup-2-2
> thread in a deep stack trace like the attached.
>
> Throwing in a MEMORY_AND_DISK persist() so the query plan is collapsed
> works around this, but it's a little surprising how often we encounter
> the problem, forcing us to manage persisting/unpersisting tables and
> potentially suffer unnecessary disk I/O.
>
> I've looked through JIRA but don't see open issues about this -- I might
> just not have found them.
>
> Has anyone else encountered this?
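
For anyone who wants to try it, here's a minimal sketch of both
workarounds mentioned above (persist() to collapse the plan, and the
RDD round-trip to truncate the lineage). The helper name and the `dfs`
input are hypothetical, just for illustration:

    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.storage.StorageLevel

    // Hypothetical helper: dfs is the handful of DataFrames being unioned.
    def unionWithCollapsedPlan(spark: SparkSession,
                               dfs: Seq[DataFrame]): DataFrame = {
      val unioned = dfs.reduce(_ union _)

      // Workaround 1: persist() so downstream queries see a collapsed
      // plan rather than re-optimizing the full union lineage.
      val persisted = unioned.persist(StorageLevel.MEMORY_AND_DISK)

      // Workaround 2 (if persist() alone doesn't help): round-trip
      // through an RDD, which drops the logical plan entirely.
      spark.createDataFrame(persisted.rdd, persisted.schema)
    }

Note that persist() is lazy, so the plan only actually collapses once
the data is materialized by an action (e.g. a count()).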