Hi,

You can call coalesce(N), where N is the number of partitions you want
the data reduced to, on the RDD (or DataFrame) after loading it.
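
As a minimal sketch (Spark 1.x Scala API; the table name and the target
partition count of 100 are assumptions for illustration):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object CoalesceExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("CoalesceExample"))
    val hiveContext = new HiveContext(sc)

    // Reading ~10K partitioned ORC files yields ~10K input partitions.
    // Table name is hypothetical.
    val df = hiveContext.sql("SELECT * FROM my_partitioned_table")

    // coalesce(100) narrows the result to 100 partitions without a full
    // shuffle, so downstream stages run ~100 tasks instead of ~10K.
    val coalesced = df.coalesce(100)
    println(coalesced.rdd.partitions.length)
  }
}
```

Note that coalesce only affects the stages after it; the initial scan of
the 10K files still schedules one task per input split.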

HTH,
Deng

On Wed, Oct 7, 2015 at 6:34 PM, patcharee <patcharee.thong...@uni.no> wrote:

> Hi,
>
> I run a SQL query over about 10,000 partitioned ORC files. Because of the
> partition schema, the files can no longer be merged (to reduce the total
> number).
>
> From this command, hiveContext.sql(sqlText), 10K tasks were created, one
> per file. Is it possible to use fewer tasks? How can I force Spark SQL to
> use fewer tasks?
>
> BR,
> Patcharee
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
