Thanks for the thoughts. I've been testing on Spark 1.1 and haven't seen
the IndexError yet. I've run into some other errors ("too many open
files"), but these issues seem to have been discussed already. The dataset,
by the way, was about 40 GB and 188 million lines; I'm running a sort on 3
worker
Sorry for the delay. I'll try to add some more details on Monday.
Unfortunately, I don't have a script to reproduce the error. Actually, it
seemed to be more about the data set than the script. The same code on
different data sets led to different results; only larger data sets on the
order of 40
Could you tell us how large the data set is? It will help us debug this issue.
On Thu, Nov 6, 2014 at 10:39 AM, skane wrote:
> I don't have any insight into this bug, but on Spark version 1.0.0 I ran into
> the same bug running the 'sort.py' example. On a smaller data set, it worked
> fine. On a
It should be fixed in 1.1+.
Do you have a script to reproduce it?
On Thu, Nov 6, 2014 at 10:39 AM, skane wrote:
> I don't have any insight into this bug, but on Spark version 1.0.0 I ran into
> the same bug running the 'sort.py' example. On a smaller data set, it worked
> fine. On a larger da
I don't have any insight into this bug, but on Spark version 1.0.0 I ran into
the same bug running the 'sort.py' example. On a smaller data set, it worked
fine. On a larger data set I got this error:
Traceback (most recent call last):
File "/home/skane/spark/examples/src/main/python/sort.py", li