Darin, I think the number of cores in your cluster is a hard limit on how many concurrent tasks you can execute at one time. If you want more parallelism, I think you just need more cores in your cluster--that is, bigger nodes, or more nodes.

Daniel, have you been able to get around this limit?

Nick

On Fri, Aug 1, 2014 at 11:49 AM, Daniel Siegmann <daniel.siegm...@velos.io> wrote:

> Sorry, but I haven't used Spark on EC2 and I'm not sure what the problem could be. Hopefully someone else will be able to help. The only thing I could suggest is to try setting both the worker instances and the number of cores (assuming spark-ec2 has such a parameter).
>
> On Thu, Jul 31, 2014 at 3:03 PM, Darin McBeath <ddmcbe...@yahoo.com> wrote:
>
>> Ok, I set the number of Spark worker instances to 2 (below is my startup command). But this essentially had the effect of increasing my number of workers from 3 to 6 (which was good), while also reducing my number of cores per worker from 8 to 4 (which was not so good). In the end, I would still only be able to process 24 partitions in parallel. I'm starting a standalone cluster using the Spark-provided EC2 scripts. I tried setting the env variable SPARK_WORKER_CORES in spark_ec2.py, but this had no effect, so it's not clear whether SPARK_WORKER_CORES can even be set through the EC2 scripts. Anyway, I'm not sure there is anything else I can try, but I at least wanted to document what I did try and the net effect. I'm open to any suggestions/advice.
>>
>> ./spark-ec2 -k *key* -i key.pem --hadoop-major-version=2 launch -s 3 -t m3.2xlarge -w 3600 --spot-price=.08 -z us-east-1e --worker-instances=2 *my-cluster*
>>
>> ------------------------------
>> *From:* Daniel Siegmann <daniel.siegm...@velos.io>
>> *To:* Darin McBeath <ddmcbe...@yahoo.com>
>> *Cc:* Daniel Siegmann <daniel.siegm...@velos.io>; "user@spark.apache.org" <user@spark.apache.org>
>> *Sent:* Thursday, July 31, 2014 10:04 AM
>> *Subject:* Re: Number of partitions and Number of concurrent tasks
>>
>> I haven't configured this myself. I'd start by setting SPARK_WORKER_CORES to a higher value, since that's a bit simpler than adding more workers. This defaults to "all available cores" according to the documentation, so I'm not sure whether you can actually set it higher. If not, you can get around this by adding more worker instances; I believe simply setting SPARK_WORKER_INSTANCES to 2 would be sufficient.
>>
>> I don't think you *have* to set the cores if you have more workers - it will default to 8 cores per worker (in your case). But maybe 16 cores per node will be too many; you'll have to test. Keep in mind that more workers also means more memory and such, so you may need to tweak some other settings downward in this case.
>>
>> On a side note: I've read that some people found performance was better with more workers that each had less memory, instead of a single worker with tons of memory, because it cut down on garbage collection time. But I can't speak to that myself.
>>
>> In any case, if you increase the number of cores available in your cluster (whether per worker, by adding more workers per node, or of course by adding more nodes), you should see more tasks running concurrently. Whether this will actually be *faster* probably depends mainly on whether the CPUs in your nodes were really being fully utilized with the current number of cores.
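
For reference, a minimal sketch of how these worker settings might look in conf/spark-env.sh on the worker nodes of a standalone cluster. The values below are illustrative for an 8-core node, not recommendations:

    # conf/spark-env.sh on each worker node (illustrative values only)
    # Run two worker processes per node instead of the default of one.
    export SPARK_WORKER_INSTANCES=2
    # Give each worker 4 cores, so the two workers together use the node's 8 cores.
    export SPARK_WORKER_CORES=4
    # Cap memory per worker so that both workers fit comfortably on the node.
    export SPARK_WORKER_MEMORY=10g

Note that with two workers of 4 cores each, the node still exposes 8 cores in total, so this split by itself does not raise the number of concurrent tasks; the cluster-wide core count is what matters.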
>> On Wed, Jul 30, 2014 at 8:30 PM, Darin McBeath <ddmcbe...@yahoo.com> wrote:
>>
>> Thanks.
>>
>> So to make sure I understand: since I'm using a standalone cluster, I would set SPARK_WORKER_INSTANCES to something like 2 (instead of the default value of 1). Is that correct? But it also sounds like I need to explicitly set a value for SPARK_WORKER_CORES (based on what the documentation states). What would I want that value to be, based on my configuration below? Or would I leave that alone?
>>
>> ------------------------------
>> *From:* Daniel Siegmann <daniel.siegm...@velos.io>
>> *To:* user@spark.apache.org; Darin McBeath <ddmcbe...@yahoo.com>
>> *Sent:* Wednesday, July 30, 2014 5:58 PM
>> *Subject:* Re: Number of partitions and Number of concurrent tasks
>>
>> This is correct behavior. Each "core" can execute exactly one task at a time, with each task corresponding to a partition. If your cluster only has 24 cores, you can only run at most 24 tasks at once.
>>
>> You could run multiple workers per node to get more executors. That would give you more cores in the cluster. But however many cores you have, each core will run only one task at a time.
>>
>> On Wed, Jul 30, 2014 at 3:56 PM, Darin McBeath <ddmcbe...@yahoo.com> wrote:
>>
>> I have a cluster with 3 nodes (each with 8 cores) using Spark 1.0.1.
>>
>> I have an RDD<String> which I've repartitioned so it has 100 partitions (hoping to increase the parallelism).
>>
>> When I do a transformation (such as a filter) on this RDD, I can't seem to get more than 24 tasks (my total number of cores across the 3 nodes) going at one point in time. By tasks, I mean the number of tasks that appear under the Application UI. I tried explicitly setting spark.default.parallelism to 48 (hoping I would get 48 tasks running concurrently) and verified this in the Application UI for the running application, but this had no effect. Perhaps this is ignored for a 'filter' and the default is the total number of cores available.
>>
>> I'm fairly new to Spark, so maybe I'm just missing or misunderstanding something fundamental. Any help would be appreciated.
>>
>> Thanks.
>>
>> Darin.
>
> --
> Daniel Siegmann, Software Developer
> Velos
> Accelerating Machine Learning
>
> 440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
> E: daniel.siegm...@velos.io W: www.velos.io
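
To illustrate the point discussed above, here is a minimal sketch in Scala (the input path, predicate, and names are made up for illustration): with 3 nodes of 8 cores each, the cluster has 24 cores, so an RDD repartitioned to 100 partitions still has at most 24 of its tasks running at any moment, and the remaining tasks queue until a core frees up.

    import org.apache.spark.{SparkConf, SparkContext}

    object PartitionsVsCores {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("partitions-vs-cores")
        val sc = new SparkContext(conf)

        // Hypothetical input, repartitioned to 100 partitions as in the question.
        val lines = sc.textFile("hdfs:///some/input/path").repartition(100)

        // A filter produces one task per partition: 100 tasks for this stage.
        val filtered = lines.filter(_.contains("foo"))

        // The action triggers the stage. With 24 cores in the cluster, the
        // scheduler runs at most 24 of the 100 tasks at a time; the rest wait
        // for a free core. More partitions do not mean more simultaneous tasks.
        println("partitions: " + filtered.partitions.size)
        println("matching lines: " + filtered.count())

        sc.stop()
      }
    }

To actually run more than 24 tasks at once, the cluster needs more than 24 cores, whether from bigger nodes, more nodes, or more cores exposed per node, as discussed in the replies above.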