Re: java.lang.NoSuchMethodError while saving a random forest model Spark version 1.5

2015-12-17 Thread Yanbo Liang
Spark 1.5 officially uses Parquet 1.7.0, but Spark 1.3 uses Parquet 1.6.0. It's better to check which version of Parquet is used in your environment. 2015-12-17 10:26 GMT+08:00 Joseph Bradley : > This method is tested in the Spark 1.5 unit tests, so I'd guess it's a > problem with the Parquet depen
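One way to check which Parquet version is actually on the classpath is to read the Implementation-Version from the jar manifest of a Parquet class. This is an illustrative sketch, not from the thread; the Parquet class names in the comments are assumptions (Parquet 1.6 shipped classes under `parquet.*`, while 1.7 moved them under `org.apache.parquet.*`):

```scala
// Hedged sketch: look up which version of a library is on the classpath by
// reading the jar manifest of one of its classes. Returns None if the class
// is absent or the manifest carries no version.
def versionOf(className: String): Option[String] =
  try {
    val cls = Class.forName(className)
    Option(cls.getPackage).flatMap(p => Option(p.getImplementationVersion))
  } catch { case _: ClassNotFoundException => None }

// Hypothetical usage on a Spark classpath:
//   versionOf("org.apache.parquet.hadoop.ParquetOutputFormat")  // Parquet 1.7+
//   versionOf("parquet.hadoop.ParquetOutputFormat")             // Parquet 1.6
```

If both lookups return None, Parquet is not on the classpath at all, which points to a different problem than a version mismatch.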

Re: How do we convert a Dataset includes timestamp columns to RDD?

2015-12-17 Thread Kousuke Saruta
Hi Yu, I found it's because DateTimeUtils passed to StaticInvoke is not serializable. I think it's a potential bug that StaticInvoke can receive non-Serializable objects. I opened a PR about this issue. https://github.com/apache/spark/pull/10357 - Kousuke On 2015/12/17 16:35, Yu Ishikawa wr

Re: Update to Spark Mesos docs possibly? LIBPROCESS_IP needs to be set for client mode

2015-12-17 Thread Iulian Dragoș
On Wed, Dec 16, 2015 at 5:42 PM, Aaron wrote: > Wrt to the PR, sure, let me update the documentation; I'll send it out > shortly. My fork is on GitHub... is the PR from there ok? > Absolutely. Have a look at https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark if you haven't done s

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-17 Thread Iulian Dragoș
-0 (non-binding) Unfortunately the Mesos cluster regression is still there (see my comment for explanations). I'm not voting to delay the release any longer though. We tested (and passed) Mesos in: - client mode - fine/coarse-grained

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-17 Thread Kousuke Saruta
+1 On 2015/12/17 6:32, Michael Armbrust wrote: Please vote on releasing the following candidate as Apache Spark version 1.6.0! The vote is open until Saturday, December 19, 2015 at 18:00 UTC and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache

implicit ClassTag in KafkaUtils

2015-12-17 Thread Hao Ren
Hi, I am reading the Spark Streaming Kafka code. In the org.apache.spark.streaming.kafka.KafkaUtils file, the function "createDirectStream" takes a key class, value class, etc. to create a ClassTag. However, they are all implicit. I don't understand why they are implicit. In fact, I cannot find any other ov

Re: How do we convert a Dataset includes timestamp columns to RDD?

2015-12-17 Thread Yu Ishikawa
Hi Kosuke, Thank you for the PR. I think we should fix this bug before releasing Spark 1.6 ASAP. I'm looking forward to merging it. Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/How-do-we-convert-a-Dataset-includes-

Re: implicit ClassTag in KafkaUtils

2015-12-17 Thread Saisai Shao
Actually this is a Scala feature. createDirectStream requires implicit values, expressed as context bounds. Java has no equivalent, so here the Java Class is converted to a ClassTag and made available as an implicit value, which is then used by createDirectStream. Thanks Saisai On Th
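The point about context bounds can be illustrated with a small sketch (hypothetical methods, not the actual KafkaUtils code): a context bound `[T: ClassTag]` desugars to an implicit parameter, and a Java-facing overload can build the ClassTag explicitly from a `Class[T]`.

```scala
import scala.reflect.ClassTag

// A context bound [T: ClassTag] asks the compiler to supply runtime class
// information that type erasure would otherwise discard.
def makeArray[T: ClassTag](elems: T*): Array[T] = Array(elems: _*)

// The context bound is sugar for an explicit implicit parameter list:
def makeArrayDesugared[T](elems: T*)(implicit ct: ClassTag[T]): Array[T] =
  Array(elems: _*)

// Java callers cannot trigger implicit resolution, so a Java-facing API
// typically accepts a Class[T] and converts it into a ClassTag itself:
def classTagFromJava[T](cls: Class[T]): ClassTag[T] = ClassTag(cls)
```

At Scala call sites the compiler fills in the ClassTag automatically; the Java-friendly wrappers do the equivalent conversion by hand before delegating to the Scala API.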

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-17 Thread Timothy O
+1 On Thursday, December 17, 2015 8:22 AM, Kousuke Saruta wrote: +1 On 2015/12/17 6:32, Michael Armbrust wrote: Please vote on releasing the following candidate as Apache Spark version 1.6.0! The vote is open until Saturday, December 19, 2015 at 18:00 UTC and passes if a m

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-17 Thread syepes
-1 (YARN Cluster deployment mode not working) I have just tested 1.6 (d509194b) on our HDP 2.3 platform and the cluster mode does not seem to work. It looks like some parameters are not being passed correctly. This example works correctly with 1.5. # spark-submit --master yarn --deploy-mode cluster -

Re: implicit ClassTag in KafkaUtils

2015-12-17 Thread Hao Ren
Thank you for your quick answer. It helped me to find an implicit conversion for JavaInputDStream which takes an implicit ClassTag. Cheers. On Thu, Dec 17, 2015 at 3:11 PM, Saisai Shao wrote: > Actually this is a Scala problem. createDirectStream actually requires > implicit values, which is impli

Re: does spark really support label expr like && or || ?

2015-12-17 Thread Ted Yu
I consulted with a YARN developer; the notation presented in Allen's email is not supported yet. Only a single node label should be specified. Cheers On Wed, Dec 16, 2015 at 6:40 PM, Allen Zhang wrote: > more details commands: > > 2. yarn rmadmin -replaceLabelsOnNode spark-dev:54321,foo; > yarn r

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-17 Thread Andrew Or
@syepes I just ran Spark 1.6 (881f254) on YARN with Hadoop 2.4.0. I was able to run a simple application in cluster mode successfully. Can you verify whether the org.apache.spark.yarn.ApplicationMaster class exists in your assembly jar? jar -tf assembly.jar | grep ApplicationMaster -Andrew 20

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-17 Thread Sebastian YEPES FERNANDEZ
@Andrew Thanks for the reply, did you run this in a Hortonworks or Cloudera cluster? I suspect the issue is coming from the extraJavaOptions as these are necessary in HDP, the strange thing is that with exactly the same settings 1.5 works. # jar -tf spark-assembly-1.6.0-SNAPSHOT-hadoop2.7.1.jar |

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-17 Thread Andrew Or
That seems like an HDP-specific issue. I did a quick search on "spark bad substitution" and all the results have to do with people failing to run in YARN cluster mode on HDP. Here is a workaround

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-17 Thread Michael Gummelt
The fix for the Mesos cluster regression has introduced another Mesos cluster bug. Namely, the MesosClusterDispatcher crashes when trying to write to ZK: https://issues.apache.org/jira/browse/SPARK-12413 I have a tentative fix here: https://github.com/apache/spark/pull/10366 On Thu, Dec 17, 2015

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-17 Thread Vinay Shukla
Agree with Andrew, we shouldn't block the release for this. This issue won't be there in the Spark distribution from Hortonworks since we set the HDP version. If you want to use Apache Spark with HDP you can modify mapred-site.xml to replace the hdp.version property with the right value for your

Re: JIRA: Wrong dates from imported JIRAs

2015-12-17 Thread Lars Francke
Okay thanks guys, that's two -1s and that's fair enough. I'll leave it at that. On Thu, Dec 17, 2015 at 1:39 AM, Josh Rosen wrote: > Personally, I'd rather avoid the risk of breaking things during the > reimport. In my experience we've had a lot of unforeseen problems with JIRA > import/export a

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-17 Thread Vinay Shukla
One correction, the better way is to just create a file called java-opts in .../spark/conf with the following config value in it -Dhdp.version=. One way to get the HDP version is to run the one-liner below on a node of your HDP cluster. hdp-select status hadoop-client | sed 's/hadoop-client - \(
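As a sketch of the workaround described above (the exact sed pattern and paths are assumptions, since the original one-liner is truncated in the archive):

```shell
# Hypothetical helper: extract the version from a "hdp-select status" line,
# which on an HDP node looks roughly like "hadoop-client - 2.3.4.0-3485".
extract_hdp_version() {
  echo "$1" | sed 's/hadoop-client - \(.*\)/\1/'
}

# On a real HDP cluster node you would then write conf/java-opts
# (path assumed) along the lines of:
#   echo "-Dhdp.version=$(extract_hdp_version "$(hdp-select status hadoop-client)")" \
#     > /usr/hdp/current/spark-client/conf/java-opts
```

Spark picks up java-opts from its conf directory, so the -Dhdp.version system property is then set for every submission without editing mapred-site.xml.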

Re: does spark really support label expr like && or || ?

2015-12-17 Thread Allen Zhang
^_^ , Thanks Ted. At 2015-12-18 03:38:46, "Ted Yu" wrote: I consulted with YARN developer, the notion presented in Allen's email is not supported yet. Only single node label should be specified. Cheers On Wed, Dec 16, 2015 at 6:40 PM, Allen Zhang wrote: more details commands: 2.

Re: [VOTE] Release Apache Spark 1.6.0 (RC3)

2015-12-17 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 29:32 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib (iPython 4.0) 2.0 Spark version is 1.6.0 2.1. statistics (min,max,mean,Pearson,Spearman) OK 2.2. Linear/Ridge/Lasso Regression OK 2.3