To address any skepticism about whether CPU utilization is a good performance
metric for this workload, I did a couple of comparison runs of an example job
to show the difference in a more universal metric (stage/job time) between
coarse- and fine-grained mode on Mesos.

The workload is identical in both runs: pulling tgz archives from S3, parsing
JSON lines from the files, and ultimately creating documents to index into
Solr. The tasks do not insert into Solr, so there is no network side effect
from the map tasks. The runs are on the exact same hardware in EC2
(m2.4xlarge, with 68GB of RAM and 45G executor memory) and the exact same
JVM, and the results don't depend on the order the jobs run in: I get the
same numbers whether I run the coarse-grained or the fine-grained job first.
No other frameworks/tasks are running on the Mesos cluster during the test,
and I see the same results whether it's a 3-node cluster or a 200-node
cluster.
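
For concreteness, the submissions look roughly like the sketch below. The
master URL, class, jar, and S3 path are placeholders rather than the real
job; the only thing that changes between the two runs is the
spark.mesos.coarse flag.

  # fine-grained run (spark.mesos.coarse defaults to false on 1.5.x)
  spark-submit \
    --master mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos \
    --executor-memory 45g \
    --class com.example.SolrIndexJob \
    example-job.jar s3://example-bucket/archives/

  # coarse-grained run -- identical except for this one flag
  spark-submit \
    --master mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos \
    --executor-memory 45g \
    --conf "spark.mesos.coarse=true" \
    --class com.example.SolrIndexJob \
    example-job.jar s3://example-bucket/archives/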

With the CMS collector, the map stage takes roughly 2.9h in fine-grained mode
and 3.4h in coarse-grained mode (about 17% longer). Both modes start out
performing similarly, so the gap in total execution time widens as the job
size grows; put another way, the difference is much smaller for jobs/stages
under an hour. When I submit this job for a much larger dataset that takes 5+
hours, the difference in total stage time climbs toward roughly 20-30% longer
execution time.

With the G1 collector, the map stage takes roughly 2.2h in fine-grained mode
and 2.7h in coarse-grained mode (about 23% longer). Again, the fine- and
coarse-grained runs are on the exact same machines with the exact same
dataset, changing only spark.mesos.coarse between true and false.
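
If it helps, the collector is switched per run via executor JVM options along
these lines (a sketch, not the exact flags from our config; any additional
tuning flags are omitted):

  # CMS runs
  --conf "spark.executor.extraJavaOptions=-XX:+UseConcMarkSweepGC -XX:+UseParNewGC"

  # G1 runs
  --conf "spark.executor.extraJavaOptions=-XX:+UseG1GC"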

Let me know if there's anything else I can provide here.

Thanks,
-Adam


On Mon, Nov 23, 2015 at 11:27 AM, Adam McElwee <a...@mcelwee.me> wrote:

>
>
> On Mon, Nov 23, 2015 at 7:36 AM, Iulian Dragoș <iulian.dra...@typesafe.com
> > wrote:
>
>>
>>
>> On Sat, Nov 21, 2015 at 3:37 AM, Adam McElwee <a...@mcelwee.me> wrote:
>>
>>> I've used fine-grained mode on our mesos spark clusters until this week,
>>> mostly because it was the default. I started trying coarse-grained because
>>> of the recent chatter on the mailing list about wanting to move the mesos
>>> execution path to coarse-grained only. The odd thing is, coarse-grained vs
>>> fine-grained seems to yield drastically different cluster utilization
>>> metrics for any of our jobs that I've tried out this week.
>>>
>>> If this is best as a new thread, please let me know, and I'll try not to
>>> derail this conversation. Otherwise, details below:
>>>
>>
>> I think it's ok to discuss it here.
>>
>>
>>> We monitor our spark clusters with ganglia, and historically, we
>>> maintain at least 90% cpu utilization across the cluster. Making a single
>>> configuration change to use coarse-grained execution instead of
>>> fine-grained consistently yields a cpu utilization pattern that starts
>>> around 90% at the beginning of the job, and then it slowly decreases over
>>> the next 1-1.5 hours to level out around 65% cpu utilization on the
>>> cluster. Does anyone have a clue why I'd be seeing such a negative effect
>>> of switching to coarse-grained mode? GC activity is comparable in both
>>> cases. I've tried 1.5.2, as well as the 1.6.0 preview tag that's on github.
>>>
>>
>> I'm not very familiar with Ganglia, and how it computes utilization. But
>> one thing comes to mind: did you enable dynamic allocation
>> <https://spark.apache.org/docs/latest/running-on-mesos.html#dynamic-resource-allocation-with-mesos>
>> on coarse-grained mode?
>>
>
> Dynamic allocation is definitely not enabled. The only delta between runs
> is adding --conf "spark.mesos.coarse=true" to the job submission. Ganglia is
> just pulling stats from the procfs, and I've never seen it report bad
> results. If I sample any of the 100-200 nodes in the cluster, dstat
> reflects the same average cpu that I'm seeing reflected in ganglia.
>
>>
>> iulian
>>
>
>