Re: SequenceFile and object reuse

2015-11-18 Thread Ryan Williams
Hey Jeff, in addition to what Sandy said, there are two more reasons that this might not be as bad as it seems; I may be incorrect in my understanding though. First, the "additional step" you're referring to is not likely to be adding any overhead; the "extra map" is really just materializing the

Spree: a live-updating web UI for Spark

2015-07-27 Thread Ryan Williams
Probably relevant to people on this list: on Friday I released a clone of the Spark web UI built using Meteor so that everything updates in real-time, saving you from endlessly refreshing the page while jobs are running :) It can also serve as the UI for running as well as

Re: Re: spark 1.3.1 jars in repo1.maven.org

2015-06-02 Thread Ryan Williams
$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:149)" > > > Best Regards, > Shixiong Zhu > > 2015-06-03 0:08 GMT+08:00 Ryan Williams: > >> I think this is causing issues upgrading ADAM >> <https://github.com/bigdatagenomic

Re: Re: spark 1.3.1 jars in repo1.maven.org

2015-06-02 Thread Ryan Williams
I think this is causing issues upgrading ADAM to Spark 1.3.1 (cf. adam#690); attempting to build against Hadoop 1.0.4 yields errors like: 2015-06-02 15:57:44 ERROR Executor:96 - Exce

Re: bitten by spark.yarn.executor.memoryOverhead

2015-03-02 Thread Ryan Williams
For reference, the initial version of #3525 (still open) made this fraction a configurable value, but consensus went against that being desirable so I removed it and marked SPARK-4665 as "won't fix". My

Monitoring Spark with Graphite and Grafana

2015-02-26 Thread Ryan Williams
If anyone is curious to try exporting Spark metrics to Graphite, I just published a post about my experience doing that, building dashboards in Grafana, and using them to monitor Spark jobs: http://www.hammerlab.org/2015/02/27/monitoring-spark-with-graphite-and-grafana/ Code

Re: Data Loss - Spark streaming

2014-12-16 Thread Ryan Williams
TD's portion seems to start at 27:24: http://youtu.be/jcJq3ZalXD8?t=27m24s On Tue Dec 16 2014 at 7:13:43 AM Gerard Maas wrote: > Hi Jeniba, > > The second part of this meetup recording has a very good answer to your > question. TD explains the current behavior and the on-going work in Spark > S

FileNotFoundException in appcache shuffle files

2014-10-28 Thread Ryan Williams
My job is failing with the following error: 14/10/29 02:59:14 WARN scheduler.TaskSetManager: Lost task 1543.0 in stage 3.0 (TID 6266, demeter-csmau08-19.demeter.hpc.mssm.edu): java.io.FileNotFoundException: /data/05/dfs/dn/yarn/nm/usercache/willir31/appcache/application_1413512480649_0108/spark-lo

Re: scalac crash when compiling DataTypeConversions.scala

2014-10-26 Thread Ryan Williams
encountering this issue. > Typically you would have changed one or more of the profiles/options - > which leads to this occurring. > > 2014-10-22 22:00 GMT-07:00 Ryan Williams: > > I started building Spark / running Spark tests this weekend and on maybe >> 5-10 occasions have

scalac crash when compiling DataTypeConversions.scala

2014-10-22 Thread Ryan Williams
I started building Spark / running Spark tests this weekend and on maybe 5-10 occasions have run into a compiler crash while compiling DataTypeConversions.scala. Here <https://gist.github.com/ryan-williams/7673d7da928570907f4d> is a full gist of an innocuous test command (mvn test -D