Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-12 Thread Debasish Das
I figured out the issue: the driver memory was at 512 MB, and for our datasets the following code needed more memory...

// Materialize usersOut and productsOut.
usersOut.count()
productsOut.count()

Thanks.
Deb
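For reference, the driver heap can be raised with spark-submit's --driver-memory flag or the spark.driver.memory setting in spark-defaults.conf. Below is a minimal, hypothetical Scala sketch of the materialization pattern the snippet above refers to; the RDD names and factor types are placeholders, not the actual ALS internals.

import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

// Persist the output factor RDDs and force them with count() so the full job
// runs (and any memory problem surfaces) right here rather than lazily at the
// first downstream action.
def materialize(usersOut: RDD[(Int, Array[Double])],
                productsOut: RDD[(Int, Array[Double])]): Unit = {
  usersOut.persist(StorageLevel.MEMORY_AND_DISK)
  productsOut.persist(StorageLevel.MEMORY_AND_DISK)
  usersOut.count()
  productsOut.count()
}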

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-09 Thread Debasish Das
Actually, nope, it did not work fine... With multiple ALS iterations, I am getting the same error (with or without my mllib changes):

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 206 in stage 52.0 failed 4 times, most recent failure: Lost task ...
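For context, a minimal sketch of the kind of multi-iteration ALS run being discussed, using the public MLlib API; the input path, parsing, and parameter values are hypothetical placeholders.

import org.apache.spark.SparkContext
import org.apache.spark.mllib.recommendation.{ALS, Rating}

def runAls(sc: SparkContext, path: String) = {
  // Parse "user,product,rating" lines into MLlib Rating objects.
  val ratings = sc.textFile(path).map { line =>
    val Array(user, product, rating) = line.split(',')
    Rating(user.toInt, product.toInt, rating.toDouble)
  }
  // Arguments are rank, iterations, lambda; the values here are placeholders.
  ALS.train(ratings, 20, 20, 0.1)
}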

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-09 Thread Debasish Das
Including mllib inside the assembly worked fine... If I deploy only the core and send mllib as --jars, then this problem shows up... Xiangrui, could you please comment on whether this is a bug or expected behavior? I will create a JIRA if this needs to be tracked...
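A rough build.sbt sketch of the setup that worked (assumes sbt with the sbt-assembly plugin; versions, Scala suffix, and the exclusion are illustrative): spark-core stays provided so the cluster deployment supplies it, while the snapshot mllib is bundled into the application assembly rather than shipped through --jars.

// build.sbt (illustrative)
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "1.0.1" % "provided",
  ("org.apache.spark" %% "spark-mllib" % "1.1.0-SNAPSHOT")
    .exclude("org.apache.spark", "spark-core_2.10") // keep a second copy of core out of the assembly
)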

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-09 Thread Matt Forbes
I was having this same problem earlier this week and had to include my changes in the assembly.

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-09 Thread Debasish Das
I validated that I can reproduce this problem with master as well (without adding any of my mllib changes)... I separated the mllib jar from the assembly, deployed the assembly, and then supplied the mllib jar via the --jars option to spark-submit... I get this error:

14/08/09 12:49:32 INFO DAGScheduler: Failed ...

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-09 Thread Debasish Das
Hi Xiangrui, Based on your suggestion I moved both core and mllib to 1.1.0-SNAPSHOT... I am still getting a class cast exception:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 249 in stage 52.0 failed 4 times, most recent failure: Lost task 249.3 ...
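As a general note (not a diagnosis of this particular failure), a ClassCastException between classes with the same fully qualified name often means the class was loaded twice by different class loaders, e.g. once from the assembly and once from the jars shipped with the cluster. A small, hypothetical diagnostic sketch:

// Print which loader defined each object's class; if the loaders differ,
// a cast between the two "identical" types will fail.
def reportLoaders(a: AnyRef, b: AnyRef): Unit = {
  println(s"${a.getClass.getName} loaded by ${a.getClass.getClassLoader}")
  println(s"${b.getClass.getName} loaded by ${b.getClass.getClassLoader}")
  println(s"same loader: ${a.getClass.getClassLoader eq b.getClass.getClassLoader}")
}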

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-06 Thread Debasish Das
I did not play with Hadoop settings... everything is compiled with 2.3.0-cdh5.0.2 for me... I did try to bump the version number of HBase from 0.94 to 0.96 or 0.98, but there was no profile for CDH in the pom... but that's unrelated to this!

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-06 Thread DB Tsai
One related question: is the mllib jar independent of the Hadoop version (i.e., it doesn't use the Hadoop API directly)? Can I use an mllib jar compiled against one version of Hadoop with another version of Hadoop?

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-06 Thread Debasish Das
OK... let me look into it a bit more. Most likely I will deploy Spark v1.1 and then use the mllib 1.1-SNAPSHOT jar with it, so that we follow your guideline of not running a newer Spark component on an older version of Spark core... That should solve this issue, unless it is related to Java versions...

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-06 Thread Xiangrui Meng
One thing I'd like to clarify is that we do not support running a newer version of a Spark component on top of an older version of Spark core. I don't remember any code change in MLlib that requires Spark v1.1, but I might have missed some PRs. There were changes to CoGroup, which may be relevant: https://gi...

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-06 Thread Debasish Das
Hi Xiangrui, Maintaining another file will be a pain later, so I deployed Spark 1.0.1 without mllib, and my application jar bundles mllib 1.1.0-SNAPSHOT along with the code changes for quadratic optimization... Later the plan is to patch the snapshot mllib with the deployed stable mllib...

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-05 Thread Debasish Das
Hi Xiangrui, I used your idea and kept a cherry-picked version of ALS.scala in my application, calling it ALSQp.scala... This is an OK workaround for now, until a version lands in master, for example... For the bug with userClassPathFirst, it looks like Koert already found this issue in the following JIRA...
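A sketch of what such a renamed copy can look like; the package, object name, and signature below are hypothetical, and the real ALSQp.scala would carry the full cherry-picked ALS code plus the quadratic-optimization changes.

package com.example.recommendation // hypothetical application package

import org.apache.spark.rdd.RDD
import org.apache.spark.mllib.recommendation.Rating

// Living under its own name and package, this copy can never collide with
// the ALS class shipped in the cluster's mllib jar.
object ALSQp {
  def train(ratings: RDD[Rating], rank: Int, iterations: Int, lambda: Double) =
    ??? // the cherry-picked, patched ALS implementation would go here
}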

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-05 Thread Xiangrui Meng
If you cannot change the Spark jar deployed on the cluster, an easy solution would be renaming ALS in your jar. If userClassPathFirst doesn't work, could you create a JIRA and attach the log? Thanks! -Xiangrui

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-05 Thread Debasish Das
I created the assembly file, but it still wants to pick up the mllib from the cluster:

jar tf ./target/ml-0.0.1-SNAPSHOT-jar-with-dependencies.jar | grep QuadraticMinimizer
org/apache/spark/mllib/optimization/QuadraticMinimizer$$anon$1.class

/Users/v606014/dist-1.0.1/bin/spark-submit --master spark:...
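One hypothetical way to confirm at runtime which jar a class was actually picked from is to ask the class for its code source; printing this from the driver and from inside a task makes it obvious whether the assembly copy or the cluster copy won.

// Prints the jar the class was loaded from (getCodeSource can be null for
// bootstrap-loaded classes, but not for classes from application jars).
val clazz = Class.forName("org.apache.spark.mllib.optimization.QuadraticMinimizer")
println(clazz.getProtectionDomain.getCodeSource.getLocation)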

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-02 Thread Xiangrui Meng
Yes, that should work. spark-mllib-1.1.0 should be compatible with spark-core-1.0.1.

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-02 Thread Debasish Das
Let me try it... Will this be fixed if I generate an assembly file with the mllib-1.1.0-SNAPSHOT jar and other dependencies along with the rest of the application code?

Re: Using mllib-1.1.0-SNAPSHOT on Spark 1.0.1

2014-08-02 Thread Xiangrui Meng
You can try enabling "spark.files.userClassPathFirst". But I'm not sure whether it could solve your problem. -Xiangrui

On Sat, Aug 2, 2014 at 10:13 AM, Debasish Das wrote:
> Hi,
> I have deployed spark stable 1.0.1 on the cluster but I have new code that I added in mllib-1.1.0-SNAPSHOT.
> I...
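For reference, a minimal sketch of enabling the flag from application code (it can equally be set in spark-defaults.conf); the app name and the rest of the configuration are placeholders, and whether the flag actually resolves the conflict is exactly what is in question here.

import org.apache.spark.{SparkConf, SparkContext}

// Ask executors to prefer classes from the user-supplied jars over the ones
// bundled with the Spark deployment (experimental in Spark 1.x).
val conf = new SparkConf()
  .setAppName("mllib-snapshot-test")
  .set("spark.files.userClassPathFirst", "true")
val sc = new SparkContext(conf)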