To fix the problem, consider increasing the number of partitions for your job. Showing a code snippet would help us understand your use case better.
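If it helps, here is a minimal sketch of what I mean, assuming your data is an RDD; the input path, partition count, and process_partition function below are placeholders to adapt to your job:

    # Placeholder input; substitute your own source.
    rdd = sc.textFile("hdfs:///path/to/input")

    # Spreading the data over more partitions keeps each serialized
    # chunk sent to the Python workers well under the 2 GB framing limit.
    rdd = rdd.repartition(400)  # tune the count for your data size

    # process_partition stands in for your own per-partition logic.
    result = rdd.mapPartitions(process_partition)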
Cheers

On Thu, Oct 8, 2015 at 1:39 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> See the comment of FramedSerializer() in serializers.py:
>
>     Serializer that writes objects as a stream of (length, data) pairs,
>     where C{length} is a 32-bit integer and data is C{length} bytes.
>
> Hence the limit on the size of an object.
>
> On Thu, Oct 8, 2015 at 12:56 PM, XIANDI <zxd_ci...@hotmail.com> wrote:
>> File "/home/hadoop/spark/python/pyspark/worker.py", line 101, in main
>>     process()
>> File "/home/hadoop/spark/python/pyspark/worker.py", line 96, in process
>>     serializer.dump_stream(func(split_index, iterator), outfile)
>> File "/home/hadoop/spark/python/pyspark/serializers.py", line 126, in dump_stream
>>     self._write_with_length(obj, stream)
>> File "/home/hadoop/spark/python/pyspark/serializers.py", line 140, in _write_with_length
>>     raise ValueError("can not serialize object larger than 2G")
>> ValueError: can not serialize object larger than 2G
>>
>> Does anyone know how this happens?
>>
>> Thanks!
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/ValueError-can-not-serialize-object-larger-than-2G-tp24984.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
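For what it's worth, the 2G figure falls out of the signed 32-bit length field Ted quotes above. A simplified illustration of the framing, not the actual serializers.py code:

    import struct

    def write_with_length(obj_bytes, stream):
        # Each object is framed as (length, data); the length is packed
        # as a signed 32-bit big-endian integer, so any payload of
        # 2**31 bytes (~2 GB) or more cannot be represented.
        if len(obj_bytes) > 0x7FFFFFFF:
            raise ValueError("can not serialize object larger than 2G")
        stream.write(struct.pack("!i", len(obj_bytes)))
        stream.write(obj_bytes)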