Re: shapeless in spark 2.1.0

2016-12-29 Thread Ryan Williams
Another option would presumably be for someone to make a release of breeze with old-shapeless shaded... unless shapeless classes are exposed in breeze's public API, in which case you'd have to copy the relevant shapeless classes into breeze and then publish that? On Thu, Dec 29, 2016, 1:05 PM Sean O
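A minimal sketch of the application-side alternative (not discussed in the thread): relocate the app's own, newer shapeless with maven-shade-plugin so it can't clash with the 2.0.0 that breeze pulls in. Plugin version and shaded package name are illustrative only.

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.2.4</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals><goal>shade</goal></goals>
          <configuration>
            <relocations>
              <!-- Rewrite the app's shapeless classes (and references to them)
                   into a private package, leaving breeze's shapeless 2.0.0 alone -->
              <relocation>
                <pattern>shapeless</pattern>
                <shadedPattern>myapp.shaded.shapeless</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>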

Re: shapeless in spark 2.1.0

2016-12-29 Thread Ryan Williams
`mvn dependency:tree -Dverbose -Dincludes=:shapeless_2.11` shows:

    [INFO] \- org.apache.spark:spark-mllib_2.11:jar:2.1.0:provided
    [INFO]    \- org.scalanlp:breeze_2.11:jar:0.12:provided
    [INFO]       \- com.chuusai:shapeless_2.11:jar:2.0.0:provided

On Thu, Dec 29, 2016 at 12:11 PM Herman van Hövell

Re: spark-core "compile"-scope transitive-dependency on scalatest

2016-12-15 Thread Ryan Williams
> > I'll re-open that bug, if you want to send a PR. (I think it's just a > matter of making the scalatest dependency "provided" in spark-tags, if > I remember the discussion.) > > On Thu, Dec 15, 2016 at 4:15 PM, Ryan Williams > wrote: > > spark-core
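Roughly what that suggested fix might look like in spark-tags's pom.xml; a sketch of the idea in the quoted message, not the actual patch:

    <dependency>
      <groupId>org.scalatest</groupId>
      <artifactId>scalatest_${scala.binary.version}</artifactId>
      <!-- provided scope keeps scalatest off the compile-scope classpath
           that downstream users of spark-tags (and spark-core) inherit -->
      <scope>provided</scope>
    </dependency>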

spark-core "compile"-scope transitive-dependency on scalatest

2016-12-15 Thread Ryan Williams
spark-core depends on spark-tags (compile scope) which depends on scalatest (compile scope), so spark-core leaks test-deps into downstream libraries' "compile"-scope classpath. The cause is that spark-core has logical "test->test" and "compile->compile" dependencies on spark-tags, but spark-tags p
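Until that is fixed upstream, a downstream project can exclude the leaked test dependency itself; a sketch (the version shown is illustrative):

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.0.2</version>
      <exclusions>
        <!-- keep scalatest (pulled in transitively via spark-tags)
             off this project's compile-scope classpath -->
        <exclusion>
          <groupId>org.scalatest</groupId>
          <artifactId>scalatest_2.11</artifactId>
        </exclusion>
      </exclusions>
    </dependency>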

Re: Compatibility of 1.6 spark.eventLog with a 2.0 History Server

2016-09-15 Thread Ryan Williams
What is meant by: """ (This is because clicking the refresh button in browser, updates the UI with latest events, where-as in the 1.6 code base, this does not happen) """ Hasn't refreshing the page updated all the information in the UI through the 1.x line?

Re: Setting YARN executors' JAVA_HOME

2016-08-18 Thread Ryan Williams
apache.org/docs/latest/configuration.html>. > > The page addresses what you need. You can look for > spark.executorEnv.[EnvironmentVariableName] > and set your java home as > spark.executorEnv.JAVA_HOME= > > Regards, > Dhruve > > On Thu, Aug 18, 2016 at 12:49 PM,
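A sketch of how those settings are typically passed; paths are placeholders, and per the original question (next in this list), SPARK_YARN_USER_ENV was what actually worked in that case:

    spark-submit \
      --master yarn \
      --conf spark.executorEnv.JAVA_HOME=/path/to/java8 \
      --conf spark.yarn.appMasterEnv.JAVA_HOME=/path/to/java8 \
      ...
    # executorEnv sets the executors' environment; appMasterEnv covers the
    # YARN ApplicationMaster (and the driver, in cluster mode)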

Setting YARN executors' JAVA_HOME

2016-08-18 Thread Ryan Williams
I need to tell YARN a JAVA_HOME to use when spawning containers (to run a Java 8 app on Java 7 YARN). The only way I've found that works is setting SPARK_YARN_USER_ENV="JAVA_HOME=/path/to/java8". The code

Re: Latency due to driver fetching sizes of output statuses

2016-01-23 Thread Ryan Williams
ps://issues.apache.org/jira/browse/SPARK-10193 > https://github.com/apache/spark/pull/8427 > > On Sat, Jan 23, 2016 at 1:40 PM, Ryan Williams < > ryan.blake.willi...@gmail.com> wrote: > >> I have a recursive algorithm that performs a few jobs on successively >> smaller

Latency due to driver fetching sizes of output statuses

2016-01-23 Thread Ryan Williams
me it computes it, no executors have joined or left the cluster. In this gist <https://gist.github.com/ryan-williams/445ef8736a688bd78edb#file-job-108> you can see two jobs stalling for almost a minute each between "Starting job:" and "Got job"; with larger input datasets my RDD

Re: Off-heap storage and dynamic allocation

2015-11-03 Thread Ryan Williams
fwiw, I think that having cached RDD partitions prevents executors from being removed under dynamic allocation by default; see SPARK-8958. The "spark.dynamicAllocation.cachedExecutorIdleTimeout" config
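A sketch of the relevant settings; values are illustrative. By default cachedExecutorIdleTimeout is effectively infinite, which is why executors holding cached blocks are never reclaimed:

    spark-submit \
      --conf spark.dynamicAllocation.enabled=true \
      --conf spark.shuffle.service.enabled=true \
      --conf spark.dynamicAllocation.executorIdleTimeout=60s \
      --conf spark.dynamicAllocation.cachedExecutorIdleTimeout=30min \
      ...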

Re: Live UI

2015-10-12 Thread Ryan Williams
Yea, definitely check out Spree! It functions as a "live" UI, history server, and archival storage of event log data. There are pros and cons to building something like it in Spark trunk (and running it in the Spark driver, presumably) that I've spent a lot of ti

Re: An alternate UI for Spark.

2015-09-13 Thread Ryan Williams
You can check out Spree for one data point about how this can be done; it is a near-clone of the Spark web UI that updates in real-time. It uses JsonRelay, a SparkListener that sends events as JSON over the networ
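For reference, a custom SparkListener like that is typically wired in with spark.extraListeners; a sketch, where the class name is a placeholder rather than JsonRelay's actual one, and the listener's jar must also be on the driver classpath:

    spark-submit \
      --conf spark.extraListeners=com.example.JsonRelayListener \
      ...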

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-11 Thread Ryan Williams
park/spark-parent_2.10/1.5.0/ >> >> >> >> On Fri, Sep 11, 2015 at 10:21 AM, Ryan Williams < >> ryan.blake.willi...@gmail.com> wrote: >> >>> Any idea why 1.5.0 is not in Maven central yet >>> <http://search.maven.org/#search%7Cga%7C1%

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-11 Thread Ryan Williams
Any idea why 1.5.0 is not in Maven central yet? Is that a separate release process? On Wed, Sep 9, 2015 at 12:40 PM andy petrella wrote: > You can try it out really quickly by "building" a Spark Notebook from > http://spark-

"Spree": Live-updating web UI for Spark

2015-07-27 Thread Ryan Williams
Hi dev@spark, I wanted to quickly ping about Spree, a live-updating web UI for Spark that I released on Friday (along with some supporting infrastructure), and mention a couple things that came up while I worked on it

Re: Resource usage of a spark application

2015-05-21 Thread Ryan Williams
graphite has seen metrics from. Let me know, here or in issues on the repo, if you have any issues with that or if it doesn't make sense! > > 2015-05-19 21:43 GMT+02:00 Ryan Williams : > >> Hi Peter, a few months ago I was using MetricsSystem to export to >> Graphit

Re: Resource usage of a spark application

2015-05-19 Thread Ryan Williams
Hi Peter, a few months ago I was using MetricsSystem to export to Graphite and then view in Grafana; relevant scripts and some instructions are here if you want to take a look. On Sun, May 17, 2015 at 8:48 AM Peter Prettenhofer < peter.prett
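The Graphite piece boils down to a few lines in conf/metrics.properties; a sketch, with host, port, and period as placeholders:

    # send metrics from all sources to a Graphite/Carbon endpoint
    *.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
    *.sink.graphite.host=graphite.example.com
    *.sink.graphite.port=2003
    *.sink.graphite.period=10
    *.sink.graphite.unit=seconds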

Monitoring Spark with Graphite and Grafana

2015-02-26 Thread Ryan Williams
If anyone is curious to try exporting Spark metrics to Graphite, I just published a post about my experience doing that, building dashboards in Grafana, and using them to monitor Spark jobs: http://www.hammerlab.org/2015/02/27/monitoring-spark-with-graphite-and-grafana/ Code

Re: Building Spark with Pants

2015-02-16 Thread Ryan Williams
I worked on Pants at Foursquare for a while and when coming up to speed on Spark was interested in the possibility of building it with Pants, particularly because allowing developers to share/reuse each others' compilation artifacts seems like it would be a boon to productivity; that was/is Pants'

Present/Future of monitoring spark jobs, "MetricsSystem" vs. Web UI, etc.

2015-01-09 Thread Ryan Williams
I've long wished the web UI gave me a better sense of how the metrics it reports are changing over time, so I was intrigued to stumble across the MetricsSystem

Re: zinc invocation examples

2014-12-05 Thread Ryan Williams
fwiw I've been using `zinc -scala-home $SCALA_HOME -nailed -start` which:
- starts a nailgun server as well,
- uses my installed scala 2.{10,11}, as opposed to zinc's default 2.9.2: "If no options are passed to locate a version of Scala then Scala 2.9.2
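A sketch of the resulting loop, assuming the build's scala-maven-plugin is configured to use a running zinc server (module and flags are just examples):

    # one-time: start zinc against the locally-installed Scala, with a nailgun server
    zinc -scala-home $SCALA_HOME -nailed -start
    # afterwards: fast incremental builds of a single module and its upstream deps
    mvn -DskipTests -pl core -am compile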

Re: Spurious test failures, testing best practices

2014-12-04 Thread Ryan Williams
5:49:58 PM Marcelo Vanzin wrote: > On Tue, Dec 2, 2014 at 4:40 PM, Ryan Williams > wrote: > >> But you only need to compile the others once. > > > > once... every time I rebase off master, or am obliged to `mvn clean` by > some > > other build-correctness bug

Re: Spurious test failures, testing best practices

2014-12-02 Thread Ryan Williams
On Tue Dec 02 2014 at 4:46:20 PM Marcelo Vanzin wrote: > On Tue, Dec 2, 2014 at 3:39 PM, Ryan Williams > wrote: > > Marcelo: by my count, there are 19 maven modules in the codebase. I am > > typically only concerned with "core" (and therefore its two dependen

Re: Spurious test failures, testing best practices

2014-12-02 Thread Ryan Williams
stuff I am not using, which both `mvn package` and `mvn install` on the parent project do. On Tue Dec 02 2014 at 3:45:48 PM Marcelo Vanzin wrote: > On Tue, Dec 2, 2014 at 2:40 PM, Ryan Williams > wrote: > > Following on Mark's Maven examples, here is another related issue I
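A sketch of the narrower invocation being asked for here: build one module plus only the modules it depends on, skipping everything else.

    # -pl selects the module; -am ("also make") adds its upstream dependencies
    mvn install -DskipTests -pl core -am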

Re: Spurious test failures, testing best practices

2014-12-02 Thread Ryan Williams
twork/core. Here's <https://gist.github.com/ryan-williams/1711189e7d0af558738d> a sample full output from running `mvn install -X -U -DskipTests -pl network/shuffle` from such a state (the -U was to get around a previous failure based on having cached a failed lookup of network-common-1.3.0-S

Re: Spurious test failures, testing best practices

2014-11-30 Thread Ryan Williams
pshot and publish docs there. > > - Patrick > > On Sun, Nov 30, 2014 at 6:15 PM, Patrick Wendell > wrote: > > Hey Ryan, > > > > The existing JIRA also covers publishing nightly docs: > > https://issues.apache.org/jira/browse/SPARK-1517 > > > > -

Re: Spurious test failures, testing best practices

2014-11-30 Thread Ryan Williams
up on the names of all Spark integration tests), spurious failures still abound, there's no good way to run only the things that a given change actually could have broken, etc. Anyway, hopefully zinc brings me to the world of ~minute iteration times that have been reported on this thread. On

Re: Spurious test failures, testing best practices

2014-11-30 Thread Ryan Williams
tests including e.g. filesystem interactions to try and reduce >> variance across environments. However, that seems difficult. >> >> As the number of developers of Spark increases, it's definitely a good >> idea for us to invest in developer infrastructure including th

Re: Spurious test failures, testing best practices

2014-11-30 Thread Ryan Williams
t integration tests, the whole test process > will take an hour, but most of the developers I know leave that to Jenkins > and only run individual tests locally before submitting a patch. > > Matei > > > > On Nov 30, 2014, at 2:39 PM, Ryan Williams < > ryan.blake.willi...@
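For the "run individual tests locally" part, a sketch using the scalatest-maven-plugin's wildcardSuites parameter, assuming core's dependencies are already installed locally (the subject of this thread); the suite name is just an example, and -Dtest=none skips the surefire (Java) tests:

    mvn test -pl core -DwildcardSuites=org.apache.spark.rdd.RDDSuite -Dtest=none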

Spurious test failures, testing best practices

2014-11-30 Thread Ryan Williams
isdom about how to build/test in a sane way by trawling mailing list archives seems suboptimal. Thanks for reading, looking forward to hearing your ideas! -Ryan P.S. Is "best practice" for emailing this list to not incorporate any HTML in the body? It seems like all of the archives I'