It would be good to get to the bottom of this. Adam, could you share the Spark app that you're using to test this?
iulian

On Mon, Nov 30, 2015 at 10:10 PM, Timothy Chen <tnac...@gmail.com> wrote:

> Hi Adam,
>
> Thanks for the graphs and the tests, definitely interested to dig a
> bit deeper to find out what could be the cause of this.
>
> Do you have the spark driver logs for both runs?
>
> Tim
>
> On Mon, Nov 30, 2015 at 9:06 AM, Adam McElwee <a...@mcelwee.me> wrote:
> > To eliminate any skepticism around whether cpu is a good performance
> > metric for this workload, I did a couple comparison runs of an example
> > job to demonstrate a more universal change in performance metrics
> > (stage/job time) between coarse and fine-grained mode on mesos.
> >
> > The workload is identical here - pulling tgz archives from s3, parsing
> > json lines from the files and ultimately creating documents to index
> > into solr. The tasks are not inserting into solr (just to let you know
> > that there's no network side-effect of the map task). The runs are on
> > the same exact hardware in ec2 (m2.4xlarge, with 68GB of ram and 45G
> > executor memory), exact same jvm and it's not dependent on order of
> > running the jobs, meaning I get the same results whether I run the
> > coarse first or whether I run the fine-grained first. No other
> > frameworks/tasks are running on the mesos cluster during the test. I
> > see the same results whether it's a 3-node cluster, or whether it's a
> > 200-node cluster.
> >
> > With the CMS collector in fine-grained mode, the map stage takes
> > roughly 2.9h, and coarse-grained mode takes 3.4h. Because both modes
> > initially start out performing similarly, the total execution time gap
> > widens as the job size grows. To put that another way, the difference
> > is much smaller for jobs/stages < 1 hour. When I submit this job for a
> > much larger dataset that takes 5+ hours, the difference in total stage
> > time moves closer and closer to roughly 20-30% longer execution time.
> >
> > With the G1 collector in fine-grained mode, the map stage takes
> > roughly 2.2h, and coarse-grained mode takes 2.7h. Again, the fine and
> > coarse-grained execution tests are on the exact same machines, exact
> > same dataset, and only changing spark.mesos.coarse to true/false.
> >
> > Let me know if there's anything else I can provide here.
> >
> > Thanks,
> > -Adam
> >
> > On Mon, Nov 23, 2015 at 11:27 AM, Adam McElwee <a...@mcelwee.me> wrote:
> >>
> >> On Mon, Nov 23, 2015 at 7:36 AM, Iulian Dragoș
> >> <iulian.dra...@typesafe.com> wrote:
> >>>
> >>> On Sat, Nov 21, 2015 at 3:37 AM, Adam McElwee <a...@mcelwee.me> wrote:
> >>>>
> >>>> I've used fine-grained mode on our mesos spark clusters until this
> >>>> week, mostly because it was the default. I started trying
> >>>> coarse-grained because of the recent chatter on the mailing list
> >>>> about wanting to move the mesos execution path to coarse-grained
> >>>> only. The odd thing is, coarse-grained vs fine-grained seems to
> >>>> yield drastically different cluster utilization metrics for any of
> >>>> our jobs that I've tried out this week.
> >>>>
> >>>> If this is best as a new thread, please let me know, and I'll try
> >>>> not to derail this conversation. Otherwise, details below:
> >>>
> >>> I think it's ok to discuss it here.
> >>>
> >>>> We monitor our spark clusters with ganglia, and historically, we
> >>>> maintain at least 90% cpu utilization across the cluster. Making a
> >>>> single configuration change to use coarse-grained execution instead
> >>>> of fine-grained consistently yields a cpu utilization pattern that
> >>>> starts around 90% at the beginning of the job, and then it slowly
> >>>> decreases over the next 1-1.5 hours to level out around 65% cpu
> >>>> utilization on the cluster. Does anyone have a clue why I'd be
> >>>> seeing such a negative effect of switching to coarse-grained mode?
> >>>> GC activity is comparable in both cases. I've tried 1.5.2, as well
> >>>> as the 1.6.0 preview tag that's on github.
> >>>
> >>> I'm not very familiar with Ganglia, and how it computes utilization.
> >>> But one thing comes to mind: did you enable dynamic allocation on
> >>> coarse-grained mode?
> >>
> >> Dynamic allocation is definitely not enabled. The only delta between
> >> runs is adding --conf "spark.mesos.coarse=true" to the job submission.
> >> Ganglia is just pulling stats from the procfs, and I've never seen it
> >> report bad results. If I sample any of the 100-200 nodes in the
> >> cluster, dstat reflects the same average cpu that I'm seeing
> >> reflected in ganglia.
> >>>
> >>> iulian

--
Iulian Dragos
------
Reactive Apps on the JVM
www.typesafe.com
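
For readers following the numbers in this thread, here is a rough sketch of the arithmetic behind the reported gap, using only the map-stage times Adam quotes (2.9h vs 3.4h with CMS, 2.2h vs 2.7h with G1). The names RUNS, relative_slowdown and mode_conf_args are made up for illustration and are not part of Spark; the only real settings assumed are the spark-submit --conf flag and the spark.mesos.coarse key mentioned above.

```python
# Map-stage durations in hours, copied from the figures reported in the thread.
RUNS = {
    "CMS": {"fine": 2.9, "coarse": 3.4},
    "G1": {"fine": 2.2, "coarse": 2.7},
}


def relative_slowdown(fine_hours, coarse_hours):
    """Fraction by which the coarse-grained run is longer than the fine-grained run."""
    return (coarse_hours - fine_hours) / fine_hours


def mode_conf_args(coarse):
    """Hypothetical helper: spark-submit arguments for the one setting changed between runs."""
    return ["--conf", "spark.mesos.coarse=" + ("true" if coarse else "false")]


# Slowdown of coarse-grained relative to fine-grained, per garbage collector.
SLOWDOWNS = {gc: relative_slowdown(t["fine"], t["coarse"]) for gc, t in RUNS.items()}

if __name__ == "__main__":
    for gc, slowdown in SLOWDOWNS.items():
        print(f"{gc}: coarse-grained map stage is {slowdown:.1%} longer")
    print("coarse run adds:", " ".join(mode_conf_args(True)))
```

On these figures the coarse-grained map stage comes out roughly 17% longer with CMS and roughly 23% longer with G1, which is in line with the 20-30% described for the larger 5+ hour jobs.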