Hi. As part of my attempt to port PySpark to Python 3, I've
re-applied, with modifications, Josh's old commit for using Dill with
PySpark (as Dill already supports Python 3). Alas, I ran into an odd
problem that I could use some help with.
Josh's old commit:
https://github.com/JoshRosen/incubator
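To make the mechanism concrete: PySpark has to ship Python functions from
the driver to worker processes as bytes, which is where the serializer
comes in. Here's a minimal sketch (my own toy names, not PySpark's actual
integration code) of Dill round-tripping a closure the way that shipping
requires:

```python
import dill

def make_adder(n):
    # A closure over `n`: the kind of object a shipped task function
    # often is, and one the standard pickle module rejects because it
    # can't pickle nested functions by reference.
    def add(x):
        return x + n
    return add

payload = dill.dumps(make_adder(10))  # driver side: function -> bytes
restored = dill.loads(payload)        # worker side: bytes -> function
assert restored(5) == 15
```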
I've created an issue for this but if anyone has any advice, please let me
know.
Basically, on about 10 GBs of data, saveAsTextFile() to HDFS hangs on two
remaining tasks (out of 320). Those tasks seem to be waiting on data from
another task on another node. Eventually (about 2 hours later) they t
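For concreteness, the job has roughly this shape (the paths, app name, and
transformation below are hypothetical stand-ins, not the actual job):

```python
from pyspark import SparkContext

sc = SparkContext(appName="save-as-text-repro")  # hypothetical app name
lines = sc.textFile("hdfs:///data/input")        # ~10 GB input, hypothetical path
result = lines.map(lambda line: line.upper())    # stand-in transformation
# This is the write that hangs on the last 2 of 320 tasks:
result.saveAsTextFile("hdfs:///data/output")     # hypothetical output path
```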
Hi all,
Could any admin assign this issue
https://issues.apache.org/jira/browse/SPARK-2126 to me?
I have started working on it.
Thanks,
--
Nan Zhu
I'll make a comment on the JIRA - thanks for reporting this, let's get
to the bottom of it.
On Thu, Jun 19, 2014 at 11:19 AM, Surendranauth Hiraman
wrote:
> I've created an issue for this but if anyone has any advice, please let me
> know.
>
> Basically, on about 10 GBs of data, saveAsTextFile()
Thanks for helping with the Dill integration; I had some early attempts
of my own, but had to set them aside when I got busy with other work.
Just to bring everyone up to speed on the context:
There are some objects that PySpark’s `cloudpickle` library doesn’t serialize
properly, such as ope
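As a rough illustration of the class of problem (using the standard
`pickle` module as a stand-in; cloudpickle's exact failure modes aren't
reproduced here), some callables that `pickle` rejects outright round-trip
fine through `dill`:

```python
import pickle
import dill

double = lambda x: x * 2

# Standard pickle serializes functions by reference (module + name), so a
# lambda, which has no usable name to look up, fails to pickle.
try:
    pickle.dumps(double)
except pickle.PicklingError as exc:
    print("pickle failed:", exc)

# dill serializes the function's code object by value, so this works.
blob = dill.dumps(double)
assert dill.loads(blob)(21) == 42
```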
While trying to execute a job using spark-submit, I discovered a
scala.MatchError at runtime: a DriverStateChanged.FAILED message was sent
to an actor, and the match statement handling it did not take that value
into account.
When I inspected the DriverStateChange.scala file I discovered that it