That'd be great, thanks Adam!

On Tue, Jan 19, 2016 at 5:41 PM, Adam McElwee <a...@mcelwee.me> wrote:
> Sorry, I never got a chance to circle back with the master logs for this. I definitely can't share the job code, since it's used to build a pretty core dataset for my company, but let me see if I can pull some logs together in the next couple of days.
>
> On Tue, Jan 19, 2016 at 10:08 AM, Iulian Dragoș <iulian.dra...@typesafe.com> wrote:
>
>> It would be good to get to the bottom of this.
>>
>> Adam, could you share the Spark app that you're using to test this?
>>
>> iulian
>>
>> On Mon, Nov 30, 2015 at 10:10 PM, Timothy Chen <tnac...@gmail.com> wrote:
>>
>>> Hi Adam,
>>>
>>> Thanks for the graphs and the tests, definitely interested to dig a bit deeper to find out what could be the cause of this.
>>>
>>> Do you have the spark driver logs for both runs?
>>>
>>> Tim
>>>
>>> On Mon, Nov 30, 2015 at 9:06 AM, Adam McElwee <a...@mcelwee.me> wrote:
>>>
>>>> To eliminate any skepticism around whether cpu is a good performance metric for this workload, I did a couple of comparison runs of an example job to demonstrate a more universal change in performance metrics (stage/job time) between coarse-grained and fine-grained mode on mesos.
>>>>
>>>> The workload is identical here: pulling tgz archives from s3, parsing json lines from the files, and ultimately creating documents to index into solr. The tasks are not inserting into solr (just to let you know that there's no network side-effect of the map task). The runs are on the same exact hardware in ec2 (m2.4xlarge, with 68GB of ram and 45G executor memory) and the exact same jvm, and the results don't depend on the order of running the jobs, meaning I get the same results whether I run the coarse-grained or the fine-grained job first. No other frameworks/tasks are running on the mesos cluster during the test. I see the same results whether it's a 3-node cluster or a 200-node cluster.
>>>>
>>>> With the CMS collector, the map stage takes roughly 2.9h in fine-grained mode and 3.4h in coarse-grained mode. Because both modes initially start out performing similarly, the total execution time gap widens as the job size grows. To put that another way, the difference is much smaller for jobs/stages under 1 hour. When I submit this job for a much larger dataset that takes 5+ hours, the difference in total stage time moves closer to roughly 20-30% longer execution time.
>>>>
>>>> With the G1 collector, the map stage takes roughly 2.2h in fine-grained mode and 2.7h in coarse-grained mode. Again, the fine-grained and coarse-grained tests run on the exact same machines, with the exact same dataset, changing only spark.mesos.coarse between true and false.
>>>>
>>>> Let me know if there's anything else I can provide here.
>>>>
>>>> Thanks,
>>>> -Adam
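
To make the benchmark setup above concrete, here is a minimal sketch of the only configuration delta being compared: flipping spark.mesos.coarse, with the collector chosen per run through the executor JVM options (how the collector was actually set isn't stated in the thread). The app name, object name, and master URL below are placeholders, not values taken from the thread.

    import org.apache.spark.{SparkConf, SparkContext}

    object CoarseVsFineBenchmark {
      def main(args: Array[String]): Unit = {
        // Only spark.mesos.coarse differs between the two runs; the collector
        // is selected per experiment via the executor JVM options.
        val conf = new SparkConf()
          .setAppName("coarse-vs-fine-benchmark")                    // placeholder name
          .setMaster("mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos") // placeholder master URL
          .set("spark.executor.memory", "45g")                        // 45G executor heap, as in the tests above
          .set("spark.mesos.coarse", "true")                          // flip to "false" for the fine-grained runs
          .set("spark.executor.extraJavaOptions", "-XX:+UseG1GC")     // or "-XX:+UseConcMarkSweepGC" for the CMS runs

        val sc = new SparkContext(conf)
        try {
          // identical job body for both runs goes here
        } finally {
          sc.stop()
        }
      }
    }

Dynamic allocation is left unset in both configurations, matching the note further down the thread that it is not enabled.
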
>>>> On Mon, Nov 23, 2015 at 11:27 AM, Adam McElwee <a...@mcelwee.me> wrote:
>>>>
>>>>> On Mon, Nov 23, 2015 at 7:36 AM, Iulian Dragoș <iulian.dra...@typesafe.com> wrote:
>>>>>
>>>>>> On Sat, Nov 21, 2015 at 3:37 AM, Adam McElwee <a...@mcelwee.me> wrote:
>>>>>>
>>>>>>> I've used fine-grained mode on our mesos spark clusters until this week, mostly because it was the default. I started trying coarse-grained because of the recent chatter on the mailing list about wanting to move the mesos execution path to coarse-grained only. The odd thing is, coarse-grained vs fine-grained seems to yield drastically different cluster utilization metrics for any of our jobs that I've tried out this week.
>>>>>>>
>>>>>>> If this is best as a new thread, please let me know, and I'll try not to derail this conversation. Otherwise, details below:
>>>>>>
>>>>>> I think it's ok to discuss it here.
>>>>>>
>>>>>>> We monitor our spark clusters with ganglia, and historically we maintain at least 90% cpu utilization across the cluster. Making a single configuration change to use coarse-grained execution instead of fine-grained consistently yields a cpu utilization pattern that starts around 90% at the beginning of the job and then slowly decreases over the next 1-1.5 hours to level out around 65% cpu utilization on the cluster. Does anyone have a clue why I'd be seeing such a negative effect of switching to coarse-grained mode? GC activity is comparable in both cases. I've tried 1.5.2, as well as the 1.6.0 preview tag that's on github.
>>>>>>
>>>>>> I'm not very familiar with Ganglia, and how it computes utilization. But one thing comes to mind: did you enable dynamic allocation in coarse-grained mode?
>>>>>
>>>>> Dynamic allocation is definitely not enabled. The only delta between runs is adding --conf "spark.mesos.coarse=true" to the job submission. Ganglia is just pulling stats from procfs, and I've never seen it report bad results. If I sample any of the 100-200 nodes in the cluster, dstat reflects the same average cpu that I'm seeing reflected in ganglia.
>>>>>
>>>>>> iulian

--
Iulian Dragos

------
Reactive Apps on the JVM
www.typesafe.com
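
Since the actual job can't be shared, the following is only a generic sketch of the workload shape described in the thread: pull tgz archives from S3, parse JSON lines out of them, and build documents without actually indexing into Solr. The bucket path, the use of commons-compress for untarring, and the stand-in document-building step are all assumptions for illustration, not details from the original job.

    import java.util.zip.GZIPInputStream

    import scala.io.Source

    import org.apache.commons.compress.archivers.tar.TarArchiveInputStream
    import org.apache.spark.{SparkConf, SparkContext}

    object TgzJsonLinesSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("tgz-json-lines-sketch"))

        // Each input file is a .tgz archive, so read it as a binary stream and
        // untar/gunzip it inside the map task. The S3 path is a placeholder.
        val jsonLines = sc.binaryFiles("s3n://some-bucket/archives/*.tgz")
          .flatMap { case (_, portableStream) =>
            val tar = new TarArchiveInputStream(new GZIPInputStream(portableStream.open()))
            Iterator
              .continually(tar.getNextTarEntry)
              .takeWhile(_ != null)
              .filterNot(_.isDirectory)
              // Reading from `tar` yields only the current entry's bytes.
              .flatMap(_ => Source.fromInputStream(tar, "UTF-8").getLines())
              .toList // materialize before the underlying stream goes away
          }

        // Stand-in for the real JSON parsing / document building; the benchmark
        // above stops short of actually indexing into Solr.
        val docs = jsonLines.map(line => Map("raw" -> line))
        println(s"built ${docs.count()} documents")

        sc.stop()
      }
    }

binaryFiles is used here because tar archives aren't line-splittable, so each archive is handled by a single map task, which fits the map-heavy stage profile discussed above.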