
Re: PySpark issue with sortByKey: "IndexError: list index out of range"

2014-11-13 Thread Davies Liu


Re: PySpark issue with sortByKey: "IndexError: list index out of range"

2014-11-13 Thread santon

Thanks for the thoughts. I've been testing on Spark 1.1 and haven't seen the IndexError yet. I've run into some other errors ("too many open files"), but these issues seem to have been discussed already. The dataset, by the way, was about 40 GB and 188 million lines; I'm running a sort on 3 worker

Re: PySpark issue with sortByKey: "IndexError: list index out of range"

2014-11-09 Thread santon

Sorry for the delay. I'll try to add some more details on Monday. Unfortunately, I don't have a script to reproduce the error. Actually, it seemed to be more about the data set than the script. The same code on different data sets led to different results; only larger data sets on the order of 40

Re: PySpark issue with sortByKey: "IndexError: list index out of range"

2014-11-07 Thread Davies Liu

Could you tell us how large the data set is? It will help us debug this issue. On Thu, Nov 6, 2014 at 10:39 AM, skane wrote: > I don't have any insight into this bug, but on Spark version 1.0.0 I ran into > the same bug running the 'sort.py' example. On a smaller data set, it worked > fine. On a

Re: PySpark issue with sortByKey: "IndexError: list index out of range"

2014-11-06 Thread Davies Liu

It should be fixed in 1.1+. Could you share a script to reproduce it? On Thu, Nov 6, 2014 at 10:39 AM, skane wrote: > I don't have any insight into this bug, but on Spark version 1.0.0 I ran into > the same bug running the 'sort.py' example. On a smaller data set, it worked > fine. On a larger da

Re: PySpark issue with sortByKey: "IndexError: list index out of range"

2014-11-06 Thread skane

I don't have any insight into this bug, but on Spark version 1.0.0 I ran into the same bug running the 'sort.py' example. On a smaller data set, it worked fine. On a larger data set I got this error: Traceback (most recent call last): File "/home/skane/spark/examples/src/main/python/sort.py", li

6 matches
