Re: [VOTE] Release Apache Spark 2.0.2 (RC1)

2016-10-28 Thread Shixiong(Ryan) Zhu
-1. The history server is broken because of some refactoring work in Structured Streaming: https://issues.apache.org/jira/browse/SPARK-18143

Re: Spark has a compile dependency on scalatest

2016-10-28 Thread Marcelo Vanzin
Hmm. Yes, that makes sense. Spark's root pom does not affect your application's pom, in which case it will pick compile over test if there are conflicting dependencies. Perhaps spark-tags should override it to provided instead of compile...
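A minimal sketch of what that override could look like in spark-tags/pom.xml (the version property name is an assumption, not copied from the actual Spark pom):

  <dependency>
    <groupId>org.scalatest</groupId>
    <artifactId>scalatest_${scala.binary.version}</artifactId>
    <scope>provided</scope>
  </dependency>

With provided scope, scalatest would still be on the compile classpath when building spark-tags itself, but it would not be propagated transitively into downstream applications.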

Re: Spark has a compile dependency on scalatest

2016-10-28 Thread Shixiong(Ryan) Zhu
This is my test pom:

  <project>
    <modelVersion>4.0.0</modelVersion>
    <groupId>bar</groupId>
    <artifactId>foo</artifactId>
    <version>1.0</version>
    <dependencies>
      <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>2.0.1</version>
      </dependency>
    </dependencies>
  </project>

scalatest is in the compile scope:

  [INFO] bar:foo:jar:1.0
  [INFO] \- org.apache.spark:spark-core_2.10:jar:2.0.1:compile
  [INFO]    +- org.apache.avro:avro-mapred:jar:hadoop2:1.7.7:compile
  [INFO]

Re: Spark has a compile dependency on scalatest

2016-10-28 Thread Marcelo Vanzin
The root pom declares scalatest explicitly with test scope. It's added by default to all sub-modules, so every one should get it in test scope unless the module explicitly overrides that, like the tags module does. If you look at the "blessed" dependency list in dev/deps, there's no scalatest.
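Roughly what that shared declaration looks like in the root pom's <dependencies> section (a sketch; 2.2.6 is the version visible in Sean's tree output below, and the property name is an assumption):

  <dependency>
    <groupId>org.scalatest</groupId>
    <artifactId>scalatest_${scala.binary.version}</artifactId>
    <version>2.2.6</version>
    <scope>test</scope>
  </dependency>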

Re: Spark has a compile dependency on scalatest

2016-10-28 Thread Sean Owen
Yes, but scalatest doesn't end up in compile scope, says Maven? ...

  [INFO] +- org.apache.spark:spark-tags_2.11:jar:2.1.0-SNAPSHOT:compile
  [INFO] |  +- (org.scalatest:scalatest_2.11:jar:2.2.6:test - scope managed from compile; omitted for duplicate)
  [INFO] |  \- (org.spark-project.spark:unused:j

Re: [VOTE] Release Apache Spark 2.0.2 (RC1)

2016-10-28 Thread Weiqing Yang
+1 (non binding)

Environment: CentOS Linux release 7.0.1406 / openjdk version "1.8.0_111" / R version 3.3.1

  ./build/mvn -Pyarn -Phadoop-2.7 -Pkinesis-asl -Phive -Phive-thriftserver -Dpyspark -Dsparkr -DskipTests clean package
  ./build/mvn -Pyarn -Phadoop-2.7 -Pkinesis-asl -Phive -Phive-thriftse

Re: Spark has a compile dependency on scalatest

2016-10-28 Thread Shixiong(Ryan) Zhu
You can just exclude scalatest from Spark.
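Something like this in the downstream pom, for the 2.10 build discussed in this thread (a sketch, not tested):

  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>2.0.1</version>
    <exclusions>
      <exclusion>
        <groupId>org.scalatest</groupId>
        <artifactId>scalatest_2.10</artifactId>
      </exclusion>
    </exclusions>
  </dependency>

The exclusion cuts the transitive edge coming in via spark-tags, so your project's own scalatest version (or none at all) wins.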

Re: Spark has a compile dependency on scalatest

2016-10-28 Thread Shixiong(Ryan) Zhu
spark-tags is in the compile scope of spark-core...

Re: Spark has a compile dependency on scalatest

2016-10-28 Thread Jeremy Smith
spark-core depends on spark-launcher (compile)
spark-launcher depends on spark-tags (compile)
spark-tags depends on scalatest (compile)

To be honest I'm not all that familiar with the project structure - should I just exclude spark-launcher if I'm not using it?

Re: Spark has a compile dependency on scalatest

2016-10-28 Thread Sean Owen
It's required because the tags module uses it to define annotations for tests. I don't see it in compile scope for anything but the tags module, which is then in test scope for other modules. What are you seeing that makes you say it's in compile scope?
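One way to check, from a project that depends on spark-core (the includes filter of maven-dependency-plugin's tree goal; the pattern shown is just a suggestion):

  mvn dependency:tree -Dincludes=org.scalatest

The output marks each occurrence with its resolved scope, as in the trees pasted elsewhere in this thread.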

Spark has a compile dependency on scalatest

2016-10-28 Thread Jeremy Smith
Hey everybody,

Just a heads up that currently Spark 2.0.1 has a compile dependency on Scalatest 2.2.6. It comes from spark-core's dependency on spark-launcher, which has a transitive dependency on spark-tags, which has a compile dependency on Scalatest. This makes it impossible to use any other version of Scalatest

Re: [VOTE] Release Apache Spark 2.0.2 (RC1)

2016-10-28 Thread Ryan Blue
+1 (non-binding)

Checksums and build are fine. The tarball matches the release tag except that .gitignore is missing. It would be nice if the tarball were created using git archive so that the commit ref is present, but otherwise everything looks fine.
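For reference, the sort of invocation Ryan means (the tag and output names here are assumptions based on the RC being voted on):

  git archive --format=tar.gz --prefix=spark-2.0.2/ -o spark-2.0.2.tgz v2.0.2-rc1

git archive records the commit id in the tar's global pax header, which is what makes the ref recoverable from the tarball afterwards.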

Re: Straw poll: dropping support for things like Scala 2.10

2016-10-28 Thread Koert Kuipers
that's correct in my experience: we have found a scala update to be straightforward and basically somewhat invisible to ops, but a java upgrade a pain because it is managed and "certified" by ops.

Re: MemoryStore reporting wrong free memory in spark 1.6.2

2016-10-28 Thread Sushrut Ikhar
Found the reporting bug in 1.6.2: Utils.bytesToString(maxMemory - blocksMemoryUsed) should've been used, as it is in 2.0.1:
https://github.com/apache/spark/blob/v2.0.1/core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala#L154
https://github.com/apache/spark/blob/v1.6.2/core/src/main/scala/org/apach
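In other words, the free figure printed by the MemoryStore log line should be derived like this (a sketch following the names in the linked 2.0.1 file; the surrounding log message is paraphrased, not quoted):

  // free memory for the log line = unified pool cap minus what cached blocks occupy
  logInfo(s"Block $blockId stored as values in memory " +
    s"(estimated size ${Utils.bytesToString(size)}, " +
    s"free ${Utils.bytesToString(maxMemory - blocksMemoryUsed)})")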

MemoryStore reporting wrong free memory in spark 1.6.2

2016-10-28 Thread Sushrut Ikhar
Hi,

I am seeing a wrong computation of the available storage memory, which is leading to executor failures. I have allocated 8g of memory with the params:

spark.memory.fraction=0.7
spark.memory.storageFraction=0.4

As expected, I was able to see 5.2 GB of storage memory in the UI. However, as per the memory store logs I am
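For reference, roughly how that 5.2 GB comes about under the 1.6 unified memory manager (a simplification of UnifiedMemoryManager.getMaxMemory; the UI's "Storage Memory" shows this whole unified region, since storage can borrow from execution):

  // Sketch assuming Spark 1.6's UnifiedMemoryManager; the 300 MB reserve
  // is Spark's fixed RESERVED_SYSTEM_MEMORY_BYTES.
  val systemMemory = Runtime.getRuntime.maxMemory  // reports less than -Xmx8g in practice
  val reservedMemory = 300L * 1024 * 1024
  val usableMemory = systemMemory - reservedMemory
  val maxMemory = (usableMemory * 0.7).toLong      // spark.memory.fraction = 0.7, ≈ 5.2 GB

Runtime.maxMemory typically reports somewhat less than the configured -Xmx, which is why the UI shows ~5.2 GB rather than (8192 - 300) MB * 0.7 ≈ 5.4 GB.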

Re: Straw poll: dropping support for things like Scala 2.10

2016-10-28 Thread Steve Loughran
Twitter just led the release of Hadoop 2.6.5 precisely because they wanted to keep a Java 6 cluster up: the bigger your cluster, the less of a rush to upgrade. HDP? I believe we install & prefer (openjdk) Java 8, but the Hadoop branch-2 line is intended to build/run on Java 7 too.

Re: Straw poll: dropping support for things like Scala 2.10

2016-10-28 Thread Chris Fregly
i seem to remember a large spark user (tencent, i believe) chiming in late during these discussions 6-12 months ago and squashing any sort of deprecation given the massive effort that would be required to upgrade their environment. i just want to make sure these convos take into consideration la

Re: Straw poll: dropping support for things like Scala 2.10

2016-10-28 Thread Sean Owen
If the subtext is vendors, then I'd have a look at what recent distros look like. I'll write about CDH as a representative example, but I think other distros are naturally similar. CDH has been on Java 8, Hadoop 2.6, Python 2.7 for almost two years (CDH 5.3 / Dec 2014). Granted, this depends on in

Re: Straw poll: dropping support for things like Scala 2.10

2016-10-28 Thread Matei Zaharia
BTW maybe one key point that isn't obvious is that with YARN and Mesos, the version of Spark used can be solely up to the developer who writes an app, not to the cluster administrator. So even in very conservative orgs, developers can download a new version of Spark, run it, and demonstrate valu

Re: Straw poll: dropping support for things like Scala 2.10

2016-10-28 Thread Matei Zaharia
Deprecating them is fine (and I know they're already deprecated), the question is just whether to remove them. For example, what exactly is the downside of having Python 2.6 or Java 7 right now? If it's high, then we can remove them, but I just haven't seen a ton of details. It also sounded like