Please check the specific error lines of the text file.
Chances are that a few columns are not properly delimited in specific rows.
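A quick way to look for such rows is to count the fields on each line. This is only a rough sketch in plain Python: the helper name find_bad_rows, the comma delimiter, and the sample data are assumptions for illustration, so adjust them to the actual file.

```python
import csv
import io


def find_bad_rows(lines, delimiter=",", expected_fields=None):
    """Return (record_number, field_count) for records with an unexpected field count."""
    reader = csv.reader(lines, delimiter=delimiter)
    bad = []
    for record_number, fields in enumerate(reader, start=1):
        if expected_fields is None:
            # Assumption: the first record (e.g. the header) has the right width.
            expected_fields = len(fields)
        if len(fields) != expected_fields:
            bad.append((record_number, len(fields)))
    return bad


# Hypothetical sample standing in for the real text file.
sample = io.StringIO("id,name,age\n1,alice,30\n2,bob\n3,carol,41,extra\n")
bad_rows = find_bad_rows(sample)
print(bad_rows)
```

For a real file, pass an open file handle instead of the StringIO sample, and set the delimiter to whatever the file actually uses.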
Regards
Prakash
On Fri, Sep 7, 2018, 3:41 AM dimitris plakas wrote:
> Hello everyone, I am new to PySpark and I am facing an issue. Let me
> explain what exactly is the
It reports a serialization error. Could there be a column value that is not
being parsed as an int in one of rows 31-60? The relevant Python code in
serializers.py that throws the error is:
def read_int(stream):
    length = stream.read(4)
    if not length:
        raise EOFError
    return
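To see when a function of this shape raises, here is a small standalone sketch of a length-prefixed read. The name read_int_sketch and the "!i" (4-byte big-endian int) format are my assumptions for illustration, not a copy of serializers.py. It suggests the EOFError appears when the stream has no more bytes to read, rather than when a value fails to parse as an int.

```python
import io
import struct


def read_int_sketch(stream):
    """Read one 4-byte integer from the stream, raising EOFError if it is exhausted."""
    data = stream.read(4)
    if not data:
        # Nothing left to read: the other side stopped sending.
        raise EOFError
    # Assumption: integers are framed as 4-byte big-endian values.
    return struct.unpack("!i", data)[0]


# A stream holding exactly one framed integer.
framed = io.BytesIO(struct.pack("!i", 42))
first = read_int_sketch(framed)

try:
    read_int_sketch(framed)  # second read: the stream is now empty
    hit_eof = False
except EOFError:
    hit_eof = True
```

If that reading is right, the error may mean the Python worker stopped producing output, so the real exception could be further up in the executor logs.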
Can you isolate the row that is causing the problem? That is, try
show(31), then show(32), and so on up to show(60).
Perhaps this will help you to understand the problem.
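Another option is to run the mapping function over the suspect rows one by one outside Spark and record which ones fail. A minimal sketch, assuming the rows can be collected locally (for example with take(60) on the original dataframe, keeping the last 30); find_failing_rows, the sample rows, and the body of custom_function below are hypothetical stand-ins.

```python
def find_failing_rows(rows, func, start=0):
    """Apply func to each row and collect (index, row, error) for rows that raise."""
    failures = []
    for index, row in enumerate(rows, start=start):
        try:
            func(row)
        except Exception as exc:
            failures.append((index, row, repr(exc)))
    return failures


def custom_function(row):
    # Hypothetical stand-in for the real mapping function.
    return (row[0], int(row[1]))


# Hypothetical stand-in for the locally collected rows 31 onwards.
rows = [("a", "1"), ("b", "2"), ("c", "x3"), ("d", "4")]
failures = find_failing_rows(rows, custom_function, start=31)
for index, row, error in failures:
    print(index, row, error)
```

The start argument only sets the reported numbering, so that the indices line up with the row positions in the original data.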
regards,
Apostolos
On 07/09/2018 01:11 AM, dimitris plakas wrote:
Hello everyone, I am new to PySpark and I am facing an issue. Let me
expl
Hello everyone, I am new to PySpark and I am facing an issue. Let me
explain what exactly is the problem.
I have a dataframe and I apply a map() function to it:
dataframe2 = dataframe1.rdd.map(custom_function())
dataframe = sqlContext.createDataFrame(dataframe2)
When I have
dataframe.show(30,True