> *From:* Teng Qiu [mailto:teng...@gmail.com]
> *Sent:* Tuesday, February 16, 2016 12:11 PM
> *To:* Dukek, Dillon
> *Cc:* user@spark.apache.org
> *Subject:* Re: Spark SQL step with many tasks takes a long time to begin
> processing
>
>
>
I believe this is a known issue when using Spark/Hive with files on S3. The
long delay on the driver side is caused by partition listing and split
computation, and it is more of a Hive issue: since you are using the
Thrift server, the SQL queries run in a HiveContext.
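
To get a rough feel for why the driver can stall before any task starts, here is a back-of-envelope sketch. It assumes one S3 LIST request per partition directory, issued in fixed-size waves. The function name, the partition count, and the per-call latency are all made up for illustration, not measured on any real cluster.

```python
def estimated_listing_seconds(num_partitions, seconds_per_list_call, parallelism=1):
    """Rough driver-side delay if each partition directory needs one LIST call.

    All inputs are hypothetical; this is a model, not a measurement.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    # Partitions are listed in waves of `parallelism` calls at a time.
    waves = -(-num_partitions // parallelism)  # ceiling division
    return waves * seconds_per_list_call


# Hypothetical table: 10,000 partitions, 0.1 s per LIST round trip.
sequential = estimated_listing_seconds(10_000, 0.1)
parallel = estimated_listing_seconds(10_000, 0.1, parallelism=16)
print(f"sequential: {sequential:.0f} s, 16-way parallel: {parallel:.1f} s")
```

Under these assumptions the delay grows linearly with the number of partitions, which is why pruning partitions in the query or reducing the partition count may help more than adding executors.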
qubole made some optimizat