Re: How the scala style checker works?

2014-03-19 Thread Nirmal Allugari
Hey Nan, the line contains exactly 100 chars and the cursor will be at the 101st char, which is why the IDE indicates so. *Thanks,* *Nirmal Reddy.* On Thu, Mar 20, 2014 at 10:49 AM, Nan Zhu wrote: > Hi, all > > I'm just curious about the working mechanism of scala style checker > > When I work o
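For context, a line-length rule like the one being discussed just counts characters per line and flags anything over the limit; a line of exactly 100 chars passes, a 101st char trips it. A minimal illustrative sketch (not the actual scalastyle implementation):

```scala
// Sketch of a 100-char line-length check: a line of exactly 100 chars is
// allowed; the 101st char triggers a violation, matching what the IDE shows.
object LineLengthCheck {
  val MaxLineLength = 100

  // Returns the 1-based line numbers of lines exceeding the limit.
  def violations(source: String): Seq[Int] =
    source.split("\n").zipWithIndex.collect {
      case (line, idx) if line.length > MaxLineLength => idx + 1
    }.toSeq

  def main(args: Array[String]): Unit = {
    val ok  = "x" * 100 // exactly 100 chars: no violation
    val bad = "x" * 101 // 101 chars: flagged
    assert(violations(ok).isEmpty)
    assert(violations(bad) == Seq(1))
    println("100-char line passes, 101-char line is flagged")
  }
}
```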

How the scala style checker works?

2014-03-19 Thread Nan Zhu
Hi, all I’m just curious about the working mechanism of the scala style checker. When I work on a PR, I found that the following line contains 101 chars, violating the 100-char limit https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L51

Re: Spark 0.9.1 release

2014-03-19 Thread Mridul Muralidharan
If 1.0 is just around the corner, then it is fair enough to push to that, thanks for clarifying! Regards, Mridul On Wed, Mar 19, 2014 at 6:12 PM, Tathagata Das wrote: > I agree that the garbage collection > PR would make things very > convenient in a lot

Re: Spark 0.9.1 release

2014-03-19 Thread Tathagata Das
I agree that the garbage collection PR would make things very convenient in a lot of use cases. However, there are two broad reasons why it is hard for that PR to get into 0.9.1. 1. The PR still needs some amount of work and quite a lot of testing. While we

Re: Spark 0.9.1 release

2014-03-19 Thread Mridul Muralidharan
Would be great if the garbage collection PR is also committed - if not the whole thing, at least the part to unpersist broadcast variables explicitly would be great. Currently we are running with a custom impl which does something similar, and I would like to move to the standard distribution for that.
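The explicit broadcast unpersist being asked for would look roughly like the sketch below. This is a sketch against a later API than the 0.9 branch under discussion (`Broadcast.unpersist` did not exist in 0.9, which is the point of the request), so it is illustrative rather than runnable on that release:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: assumes a Spark version that exposes Broadcast.unpersist,
// the capability this thread is asking to have merged.
object BroadcastUnpersistSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("bc-demo").setMaster("local[2]"))
    val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))

    val total = sc.parallelize(Seq("a", "b", "a"))
      .map(k => lookup.value.getOrElse(k, 0))
      .sum()

    // Explicitly drop the broadcast blocks from executors once they are no
    // longer needed, instead of waiting for driver-side GC to clean them up.
    lookup.unpersist()
    sc.stop()
  }
}
```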

Spark 0.9.1 release

2014-03-19 Thread Tathagata Das
Hello everyone, Since the release of Spark 0.9, we have received a number of important bug fixes and we would like to make a bug-fix release of Spark 0.9.1. We are going to cut a release candidate soon and we would love it if people test it out. We have backported several bug fixes into the 0.9 a

Re: ALS solve.solvePositive

2014-03-19 Thread Xiangrui Meng
They have been merged into the master branch. However, the improvements are for implicit ALS computation. I don't think they can speed up normal ALS computation. Could you share more details about the variable projection? JIRAs: https://spark-project.atlassian.net/browse/SPARK-1266 https://spark-

Re: Announcing the official Spark Job Server repo

2014-03-19 Thread Christopher Nguyen
+1, Evan et al. -- Christopher T. Nguyen Co-founder & CEO, Adatao linkedin.com/in/ctnguyen On Tue, Mar 18, 2014 at 1:51 PM, Evan Chan wrote: > Dear Spark developers, > > Ooyala is happy to announce that we have pushed our official, Spark > 0.9.0 / Scala 2.10-compatible, jo

Re: repositories for spark jars

2014-03-19 Thread Evan Chan
The alternative is for Spark to not explicitly include hadoop_client, perhaps only as "provided", and provide a facility to insert the hadoop client jars of your choice at packaging time. Unfortunately, hadoop_client pulls in a ton of other deps, so it's not as simple as copying one extra jar int

Re: Announcing the official Spark Job Server repo

2014-03-19 Thread Evan Chan
https://spark-project.atlassian.net/browse/SPARK-1283 On Wed, Mar 19, 2014 at 10:59 AM, Gerard Maas wrote: > this is cool +1 > > > On Wed, Mar 19, 2014 at 6:54 PM, Patrick Wendell wrote: > >> Evan - yep definitely open a JIRA. It would be nice to have a contrib >> repo set-up for the 1.0 release

Re: ALS solve.solvePositive

2014-03-19 Thread Debasish Das
Nope...with the cleaner dataset I am not noticing issues with the dposv and this dataset is even bigger...20M users and 1M products...I don't think anything other than Cholesky will get us the efficiency we need... For my use case we also need to see the effectiveness of positive factors and
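For reference, the positive-definite solve discussed here (what jblas' `Solve.solvePositive` and LAPACK's `dposv` do) amounts to a Cholesky factorization A = L·Lᵀ followed by two triangular solves. A minimal pure-Scala sketch on a small symmetric positive-definite system (illustrative, no BLAS):

```scala
object CholeskySolveSketch {
  // Cholesky factorization A = L * L^T for a symmetric positive-definite A.
  def cholesky(a: Array[Array[Double]]): Array[Array[Double]] = {
    val n = a.length
    val l = Array.ofDim[Double](n, n)
    for (i <- 0 until n; j <- 0 to i) {
      val s = (0 until j).map(k => l(i)(k) * l(j)(k)).sum
      l(i)(j) = if (i == j) math.sqrt(a(i)(i) - s) else (a(i)(j) - s) / l(j)(j)
    }
    l
  }

  // Solve A x = b: forward-substitute L y = b, then back-substitute L^T x = y.
  def solvePositive(a: Array[Array[Double]], b: Array[Double]): Array[Double] = {
    val n = b.length
    val l = cholesky(a)
    val y = new Array[Double](n)
    for (i <- 0 until n)
      y(i) = (b(i) - (0 until i).map(k => l(i)(k) * y(k)).sum) / l(i)(i)
    val x = new Array[Double](n)
    for (i <- n - 1 to 0 by -1)
      x(i) = (y(i) - (i + 1 until n).map(k => l(k)(i) * x(k)).sum) / l(i)(i)
    x
  }

  def main(args: Array[String]): Unit = {
    // SPD system [[4,2],[2,3]] x = [10,8] has solution x = [1.75, 1.5]
    val x = solvePositive(Array(Array(4.0, 2.0), Array(2.0, 3.0)), Array(10.0, 8.0))
    assert(math.abs(x(0) - 1.75) < 1e-9 && math.abs(x(1) - 1.5) < 1e-9)
    println(x.mkString(", "))
  }
}
```

Cholesky only exists when the normal-equation matrix is positive definite, which is why dposv can fail on degenerate data and why a cleaner dataset avoids the issue.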

Wrong input split mapping? I am reading a set of files from s3 and writing output to the same account in a different folder. My input split mappings seem to be wrong somehow. It appends base-maps or p

2014-03-19 Thread Usman Ghani
14/03/19 19:11:37 INFO Executor: Serialized size of result for 678 is 1423
14/03/19 19:11:37 INFO Executor: Sending result for 678 directly to driver
14/03/19 19:11:37 INFO Executor: Finished task ID 678
14/03/19 19:11:37 INFO NativeS3FileSystem: Opening key 'test_data/jws/video_logs2/video_logs2_00

Re: [PySpark]: reading arbitrary Hadoop InputFormats

2014-03-19 Thread Nick Pentreath
Ok - I'll work something up and reopen a PR against the new spark mirror. The API itself mirrors the newHadoopFile etc methods, so that should be quite stable once finalised. It's the "wrapper" stuff of how to serialize custom classes and read them in Python that is the potential tricky par
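For context, the Scala-side API that the proposed PySpark wrapper mirrors looks like the sketch below (the InputFormat and key/value classes here are just one common choice, not the PR itself):

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.SparkContext

// Sketch of the Scala newAPIHadoopFile call being mirrored in PySpark.
// The "tricky part" the thread mentions is converting Writable key/value
// classes into something Python can deserialize.
object NewHadoopFileSketch {
  def read(sc: SparkContext, path: String) =
    sc.newAPIHadoopFile[LongWritable, Text, TextInputFormat](path)
      .map { case (offset, line) => (offset.get, line.toString) } // Writables -> plain JVM types
}
```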

Re: [PySpark]: reading arbitrary Hadoop InputFormats

2014-03-19 Thread Matei Zaharia
Hey Nick, no worries if this can’t be done in time. It’s probably better to test it thoroughly. If you do have something partially working though, the main concern will be the API, i.e. whether it’s an API we want to support indefinitely. It would be bad to add this and then make major changes t

Re: Announcing the official Spark Job Server repo

2014-03-19 Thread Gerard Maas
this is cool +1 On Wed, Mar 19, 2014 at 6:54 PM, Patrick Wendell wrote: > Evan - yep definitely open a JIRA. It would be nice to have a contrib > repo set-up for the 1.0 release. > > On Tue, Mar 18, 2014 at 11:28 PM, Evan Chan wrote: > > Matei, > > > > Maybe it's time to explore the spark-cont

Re: Announcing the official Spark Job Server repo

2014-03-19 Thread Patrick Wendell
Evan - yep definitely open a JIRA. It would be nice to have a contrib repo set-up for the 1.0 release. On Tue, Mar 18, 2014 at 11:28 PM, Evan Chan wrote: > Matei, > > Maybe it's time to explore the spark-contrib idea again? Should I > start a JIRA ticket? > > -Evan > > > On Tue, Mar 18, 2014 at

Re: ALS solve.solvePositive

2014-03-19 Thread Xiangrui Meng
Another question: do you have negative or out-of-range user or product ids? -Xiangrui On Tue, Mar 11, 2014 at 8:00 PM, Debasish Das wrote: > Nope..I did not test implicit feedback yet...will get into more detailed > debug and generate the testcase hopefully next week... > On Mar 11, 2014 7:02

[Exception]:Could not obtain block

2014-03-19 Thread mohit.goyal
I am getting the below error while running a scala application with an input file present in hdfs. Exception in thread "main" org.apache.spark.SparkException: Job aborted: Task 1.0:41 failed 4 times (most recent failure: Exception failure: java.io.IOException: Could not obtain block: blk_528925763174330039