Darin, I think the number of cores in your cluster is a hard limit on how many concurrent tasks you can execute at one time. If you want more parallelism, I think you just need more cores in your cluster--that is, bigger nodes, or more nodes.

Daniel, have you been able to get around this limit?

Nick

On Fri, Aug 1, 2014 at 11:49 AM, Daniel Siegmann <daniel.siegm...@velos.io> wrote:

> Sorry, but I haven't used Spark on EC2 and I'm not sure what the problem could be. Hopefully someone else will be able to help. The only thing I could suggest is to try setting both the worker instances and the number of cores (assuming spark-ec2 has such a parameter).
>
> On Thu, Jul 31, 2014 at 3:03 PM, Darin McBeath <ddmcbe...@yahoo.com> wrote:
>
>> Ok, I set the number of Spark worker instances to 2 (below is my startup command). But this essentially had the effect of increasing my number of workers from 3 to 6 (which was good), while also reducing my number of cores per worker from 8 to 4 (which was not so good). In the end, I would still only be able to process 24 partitions in parallel. I'm starting a standalone cluster using the Spark-provided EC2 scripts. I tried setting the env variable SPARK_WORKER_CORES in spark_ec2.py, but this had no effect, so it's not clear whether SPARK_WORKER_CORES can even be set through the EC2 scripts. Anyway, I'm not sure there is anything else I can try, but I at least wanted to document what I did try and the net effect. I'm open to any suggestions/advice.
>>
>> ./spark-ec2 -k *key* -i key.pem --hadoop-major-version=2 launch -s 3 -t m3.2xlarge -w 3600 --spot-price=.08 -z us-east-1e --worker-instances=2 *my-cluster*
>>
>> ------------------------------
>> *From:* Daniel Siegmann <daniel.siegm...@velos.io>
>> *To:* Darin McBeath <ddmcbe...@yahoo.com>
>> *Cc:* Daniel Siegmann <daniel.siegm...@velos.io>; "user@spark.apache.org" <user@spark.apache.org>
>> *Sent:* Thursday, July 31, 2014 10:04 AM
>> *Subject:* Re: Number of partitions and Number of concurrent tasks
>>
>> I haven't configured this myself. I'd start by setting SPARK_WORKER_CORES to a higher value, since that's a bit simpler than adding more workers. This defaults to "all available cores" according to the documentation, so I'm not sure whether you can actually set it higher. If not, you can get around this by adding more worker instances; I believe simply setting SPARK_WORKER_INSTANCES to 2 would be sufficient.
>>
>> I don't think you *have* to set the cores if you have more workers - it will default to 8 cores per worker (in your case). But maybe 16 cores per node will be too many; you'll have to test. Keep in mind that more workers also means more memory and such, so you may need to tweak some other settings downward in this case.
>>
>> On a side note: I've read that some people found performance was better with more workers that each had less memory, instead of a single worker with tons of memory, because it cut down on garbage collection time. But I can't speak to that myself.
>>
>> In any case, if you increase the number of cores available in your cluster (whether per worker, by adding more workers per node, or of course by adding more nodes), you should see more tasks running concurrently. Whether this will actually be *faster* probably depends mainly on whether the CPUs in your nodes were really being fully utilized with the current number of cores.
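
For reference, a minimal sketch of how these worker settings might look in conf/spark-env.sh on the worker nodes of a standalone cluster. The values below are illustrative for an 8-core node, not recommendations:

    # conf/spark-env.sh on each worker node (illustrative values only)
    # Run two worker processes per node instead of the default of one.
    export SPARK_WORKER_INSTANCES=2
    # Give each worker 4 cores, so the two workers together use the node's 8 cores.
    export SPARK_WORKER_CORES=4
    # Cap memory per worker so that both workers fit comfortably on the node.
    export SPARK_WORKER_MEMORY=10g

Note that with two workers of 4 cores each, the node still exposes 8 cores in total, so this split by itself does not raise the number of concurrent tasks; the cluster-wide core count is what matters.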
>> On Wed, Jul 30, 2014 at 8:30 PM, Darin McBeath <ddmcbe...@yahoo.com> wrote:
>>
>> Thanks.
>>
>> So to make sure I understand: since I'm using a standalone cluster, I would set SPARK_WORKER_INSTANCES to something like 2 (instead of the default value of 1). Is that correct? But it also sounds like I need to explicitly set a value for SPARK_WORKER_CORES (based on what the documentation states). What would I want that value to be, based on my configuration below? Or would I leave that alone?
>>
>> ------------------------------
>> *From:* Daniel Siegmann <daniel.siegm...@velos.io>
>> *To:* user@spark.apache.org; Darin McBeath <ddmcbe...@yahoo.com>
>> *Sent:* Wednesday, July 30, 2014 5:58 PM
>> *Subject:* Re: Number of partitions and Number of concurrent tasks
>>
>> This is correct behavior. Each "core" can execute exactly one task at a time, with each task corresponding to a partition. If your cluster only has 24 cores, you can only run at most 24 tasks at once.
>>
>> You could run multiple workers per node to get more executors. That would give you more cores in the cluster. But however many cores you have, each core will run only one task at a time.
>>
>> On Wed, Jul 30, 2014 at 3:56 PM, Darin McBeath <ddmcbe...@yahoo.com> wrote:
>>
>> I have a cluster with 3 nodes (each with 8 cores) using Spark 1.0.1.
>>
>> I have an RDD<String> which I've repartitioned so it has 100 partitions (hoping to increase the parallelism).
>>
>> When I do a transformation (such as a filter) on this RDD, I can't seem to get more than 24 tasks (my total number of cores across the 3 nodes) going at one point in time. By tasks, I mean the number of tasks that appear under the Application UI. I tried explicitly setting spark.default.parallelism to 48 (hoping I would get 48 tasks running concurrently) and verified this in the Application UI for the running application, but this had no effect. Perhaps this is ignored for a 'filter' and the default is the total number of cores available.
>>
>> I'm fairly new to Spark, so maybe I'm just missing or misunderstanding something fundamental. Any help would be appreciated.
>>
>> Thanks.
>>
>> Darin.
>
> --
> Daniel Siegmann, Software Developer
> Velos
> Accelerating Machine Learning
>
> 440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
> E: daniel.siegm...@velos.io W: www.velos.io
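
To illustrate the point discussed above, here is a minimal sketch in Scala (the input path, predicate, and names are made up for illustration): with 3 nodes of 8 cores each, the cluster has 24 cores, so an RDD repartitioned to 100 partitions still has at most 24 of its tasks running at any moment, and the remaining tasks queue until a core frees up.

    import org.apache.spark.{SparkConf, SparkContext}

    object PartitionsVsCores {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("partitions-vs-cores")
        val sc = new SparkContext(conf)

        // Hypothetical input, repartitioned to 100 partitions as in the question.
        val lines = sc.textFile("hdfs:///some/input/path").repartition(100)

        // A filter produces one task per partition: 100 tasks for this stage.
        val filtered = lines.filter(_.contains("foo"))

        // The action triggers the stage. With 24 cores in the cluster, the
        // scheduler runs at most 24 of the 100 tasks at a time; the rest wait
        // for a free core. More partitions do not mean more simultaneous tasks.
        println("partitions: " + filtered.partitions.size)
        println("matching lines: " + filtered.count())

        sc.stop()
      }
    }

To actually run more than 24 tasks at once, the cluster needs more than 24 cores, whether from bigger nodes, more nodes, or more cores exposed per node, as discussed in the replies above.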