Re: [VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-08 Thread Patrick Wendell
+1 On Wed, Jul 8, 2015 at 10:55 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.4.1! > > This release fixes a handful of known issues in Spark 1.4.0, listed here: > http://s.apache.org/spark-1.4.1 > > The tag to be voted on is v1.4.1-rc4

Spark and Haskell support

2015-07-08 Thread Vasili I. Galchin
Hello, 1) I have been rereading the kind email responses to my Spark queries. Thanks. 2) I have also been reading "R" code: 1) RDD.R 2) DataFrame.R 3) All following APIs => https://cwiki.apache.org/confluence/display/SPARK/Spark+Internals 4) Python ... http

[VOTE] Release Apache Spark 1.4.1 (RC4)

2015-07-08 Thread Patrick Wendell
Please vote on releasing the following candidate as Apache Spark version 1.4.1! This release fixes a handful of known issues in Spark 1.4.0, listed here: http://s.apache.org/spark-1.4.1 The tag to be voted on is v1.4.1-rc4 (commit dbaa5c2): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=co

[RESULT] [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-08 Thread Patrick Wendell
This vote is cancelled in favor of RC4. - Patrick On Tue, Jul 7, 2015 at 12:06 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.4.1! > > This release fixes a handful of known issues in Spark 1.4.0, listed here: > http://s.apache.org/spark

Re: What steps to take to work on [Spark-8899] issue?

2015-07-08 Thread Chandrashekhar Kotekar
OK. Thanks, I will choose some other issue then. Regards, Chandrash3khar Kotekar Mobile - +91 8600011455 On Thu, Jul 9, 2015 at 12:21 AM, Holden Karau wrote: > Not exactly but it means someone has come up with what they think a > solution to the problem is and that they've submitted some code

Why are all spark deps not shaded to avoid dependency hell?

2015-07-08 Thread ankits
I frequently encounter problems building Spark as a dependency in java projects because of version conflicts with other dependencies. Usually there will be two different versions of a library and we'll see an AbstractMethodError or invalid signature etc. So far, I've seen it happen with jackson, s

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-08 Thread Patrick Wendell
Hey All, The issue that Josh pointed out is not just a test failure, it's an issue with an important bug fix that was not correctly back-ported into the 1.4 branch. Unfortunately the overall state of the 1.4 branch tests on Jenkins was not in great shape so this was missed earlier on. Given that

Code movements from Driver to Workers

2015-07-08 Thread Eugene Morozov
Hi, I have a question regarding code movements. It’s not clear how exactly my code is moved onto Worker nodes to be executed. My assumption was that when a jar file is submitted through spark-submit, Spark copies this jar file to Worker nodes and adds it to their classpath. My exper

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-08 Thread Josh Rosen
I've filed https://issues.apache.org/jira/browse/SPARK-8903 to fix the DataFrameStatSuite test failure. The problem turned out to be caused by a mistake made while resolving a merge-conflict when backporting that patch to branch-1.4. I've submitted https://github.com/apache/spark/pull/7295 to fix

Re: What steps to take to work on [Spark-8899] issue?

2015-07-08 Thread Holden Karau
Not exactly but it means someone has come up with what they think a solution to the problem is and that they've submitted some code for consideration/review. On Wednesday, July 8, 2015, Chandrashekhar Kotekar < shekhar.kote...@gmail.com> wrote: > Maybe it is stupid question but 'pull request post

Re: What steps to take to work on [Spark-8899] issue?

2015-07-08 Thread Chandrashekhar Kotekar
Maybe it is a stupid question, but 'pull request posted to it' means this bug is already fixed? Regards, Chandrash3khar Kotekar Mobile - +91 8600011455 On Thu, Jul 9, 2015 at 12:14 AM, Michael Armbrust wrote: > There is a lot of info here: > https://cwiki.apache.org/confluence/display/SPARK/Contr

Re: What steps to take to work on [Spark-8899] issue?

2015-07-08 Thread Michael Armbrust
There is a lot of info here: https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark In this particular case I'd start by looking at the JIRA (which already has a pull request posted to it). On Wed, Jul 8, 2015 at 11:40 AM, Chandrashekhar Kotekar < shekhar.kote...@gmail.com> wrote

What steps to take to work on [Spark-8899] issue?

2015-07-08 Thread Chandrashekhar Kotekar
Hi, Although I have 7+ years of experience in Java development, I am new to open source contribution. To understand which steps one needs to take to work on an issue and upload the changes, I have decided to work on this [Spark-8899] issue, which is marked as 'trivial'. So far I have done followi

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-08 Thread Sean Owen
I see, but shouldn't this test not be run when Hive isn't in the build? On Wed, Jul 8, 2015 at 7:13 PM, Andrew Or wrote: > @Sean You actually need to run HiveSparkSubmitSuite with `-Phive` and > `-Phive-thriftserver`. The MissingRequirementsError is just complaining that > it can't find the right

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-08 Thread Andrew Or
@Sean You actually need to run HiveSparkSubmitSuite with `-Phive` and `-Phive-thriftserver`. The MissingRequirementsError is just complaining that it can't find the right classes. The other one (DataFrameStatSuite) is a little more concerning. 2015-07-08 10:43 GMT-07:00 Pradeep Bashyal : > Hi Shi

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-08 Thread Pradeep Bashyal
Hi Shivaram, I created a Jira issue for the documentation error. https://issues.apache.org/jira/browse/SPARK-8901 Thanks Pradeep On Wed, Jul 8, 2015 at 11:40 AM, Shivaram Venkataraman < shiva...@eecs.berkeley.edu> wrote: > Hi Pradeep > > Thanks for the catch -- Let's open a JIRA and PR for it.

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-08 Thread Patrick Wendell
Yeah - we can fix the docs separately from the release. - Patrick On Wed, Jul 8, 2015 at 10:03 AM, Mark Hamstra wrote: > HiveSparkSubmitSuite is fine for me, but I do see the same issue with > DataFrameStatSuite -- OSX 10.10.4, java > > 1.7.0_75, -Phive -Phive-thriftserver -Phadoop-2.4 -Pyarn >

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-08 Thread Mark Hamstra
HiveSparkSubmitSuite is fine for me, but I do see the same issue with DataFrameStatSuite -- OSX 10.10.4, java 1.7.0_75, -Phive -Phive-thriftserver -Phadoop-2.4 -Pyarn On Wed, Jul 8, 2015 at 4:18 AM, Sean Owen wrote: > The POM issue is resolved and the build succeeds. The license and sigs > stil

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-08 Thread Shivaram Venkataraman
Hi Pradeep, Thanks for the catch -- let's open a JIRA and PR for it. I don't think documentation changes affect the release, though Patrick can confirm that. Thanks Shivaram On Wed, Jul 8, 2015 at 9:35 AM, Pradeep Bashyal wrote: > Here's one thing I ran into: > > The SparkR documentation example

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-08 Thread Sean Owen
Although that should be fixed if it's incorrect, it's not something that would nearly block a release. The question here is whether this artifact can be released as 1.4.1, or whether it has a blocking regression from 1.4.0. On Wed, Jul 8, 2015 at 5:35 PM, Pradeep Bashyal wrote: > Here's one thing

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-08 Thread Pradeep Bashyal
Here's one thing I ran into: The SparkR documentation example in http://people.apache.org/~pwendell/spark-releases/latest/sparkr.html is incorrect. sc <- sparkR.init(packages="com.databricks:spark-csv_2.11:1.0.3") should be sc <- sparkR.init(sparkPackages="com.databricks:spark-csv_2.11:

Re: Spark job hangs when History server events are written to hdfs

2015-07-08 Thread Pankaj Arora
I will reproduce this and get the datanode logs, but I remember there was some exception in the datanode logs. Also, this is reproducible if you restart HDFS in between, and it doesn’t recover after HDFS comes back again. Shouldn’t there be a way to recover from these types of errors? Thanks and Reg

Re: [VOTE] Release Apache Spark 1.4.1 (RC3)

2015-07-08 Thread Sean Owen
The POM issue is resolved and the build succeeds. The license and sigs still work. The tests pass for me with "-Pyarn -Phadoop-2.6", with the following two exceptions. Is anyone else seeing these? This is consistent on Ubuntu 14 with Java 7/8: DataFrameStatSuite: ... - special crosstab elements (.

Re: Spark job hangs when History server events are written to hdfs

2015-07-08 Thread Archit Thakur
As such, we do not open any files ourselves. EventLoggingListener opens the file to write down the events in JSON format for the history server. But it uses the same writer (PrintWriter object) and eventually the same output stream (which boils down to DFSOutputStream for us). It seems DFSOutputStream

Re: spark - redshift !!!

2015-07-08 Thread spark user
Hi, I am looking for how to load data into Redshift. Thanks. On Wednesday, July 8, 2015 12:47 AM, shahab wrote: Hi, I did some experiments with loading data from S3 into Spark. I loaded data from S3 using sc.textFile(). Have a look at the following code snippet: val csv = sc.textFile(

Re: Spark job hangs when History server events are written to hdfs

2015-07-08 Thread Akhil Das
Can you look in the datanode logs and see what's going on? Most likely, you are hitting the ulimit on open file handles. Thanks Best Regards On Wed, Jul 8, 2015 at 10:55 AM, Pankaj Arora wrote: > Hi, > > I am running long running application over yarn using spark and I am > facing issues while
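
For anyone following up on the open-file-handle suggestion above, here is a minimal sketch of how to read the limit that applies to a process on a Unix-like system. It is not taken from the thread: the helper names `open_file_limits` and `describe` are made up for illustration, and it only reports the limit for the Python process that runs it, so it would need to be run as the same user and in the same environment as the datanode or Spark process to be meaningful.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    # RLIMIT_NOFILE is the per-process cap on open file descriptors,
    # the same value that `ulimit -n` reports in a shell.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


def describe(limit):
    """Render a limit value, treating RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft limit on open files:", describe(soft))
    print("hard limit on open files:", describe(hard))
```

If the soft limit turns out to be low compared with the number of files and sockets the process keeps open, that would be consistent with the ulimit theory, though the datanode logs are still the more direct evidence.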