Have a look at this doc:

http://blog.cloudera.com/blog/2014/05/apache-spark-resource-management-and-yarn-app-models/

HTH

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 18 April 2016 at 20:43, Dogtail L <spark.ru...@gmail.com> wrote:

> Hi,
>
> When launching a job in Spark, I have great trouble deciding the number of
> tasks. Some say it is best to create one task per HDFS block, i.e., make
> sure each task processes 128 MB of input data; others suggest that the
> number of tasks should be twice the total number of cores available to
> the job. I have also seen the suggestion to launch small tasks in Spark,
> i.e., make sure each task lasts around 100 ms.
>
> I am quite confused by all these suggestions. Is there any general rule
> for deciding the number of tasks in Spark? Many thanks!
>
> Best
>
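
For what it's worth, here is a minimal Scala sketch (not a definitive recipe) of
where those task-count knobs live, assuming a word-count style RDD job and a
hypothetical HDFS path; the concrete numbers (48, 96) are placeholders to tune
to your own cluster. The number of tasks in a stage equals the number of
partitions of the RDD it runs over: input partitions follow the HDFS block
splits (roughly one ~128 MB block per task), while shuffle stages take an
explicit partition count, which is where the "2x total cores" and "~100 ms per
task" rules of thumb are usually applied.

import org.apache.spark.{SparkConf, SparkContext}

object TaskCountSketch {
  def main(args: Array[String]): Unit = {
    // Default parallelism is used by shuffles when no explicit count is given;
    // a common starting point is 2-3x the total cores allocated to the job.
    val conf = new SparkConf()
      .setAppName("task-count-sketch")
      .set("spark.default.parallelism", "48")

    val sc = new SparkContext(conf)

    // Input partitions follow the Hadoop input splits: roughly one ~128 MB HDFS
    // block -> one partition -> one task. minPartitions is only a hint that can
    // raise the split count; it does not merge blocks.
    val lines = sc.textFile("hdfs:///data/input.txt", minPartitions = 48)

    // Shuffle stages accept an explicit partition count; resize here if tasks
    // turn out too heavy or too tiny (the ~100 ms-per-task guideline).
    val counts = lines
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1L))
      .reduceByKey(_ + _, 96)

    println(s"input partitions   = ${lines.partitions.length}")
    println(s"shuffle partitions = ${counts.partitions.length}")

    sc.stop()
  }
}

The same idea applies to an existing RDD: repartition(n), or coalesce(n) to
shrink without a full shuffle, changes how many tasks the next stage runs.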
