It says serialization error - could there be a column value that is not getting parsed as an int in one of the rows 31-60? The relevant Python code in serializers.py that is throwing the error is:
    def read_int(stream):
        length = stream.read(4)
        if not length:
            raise EOFError
        return struct.unpack("!i", length)[0]

Thanks,
Sonal
Nube Technologies <http://www.nubetech.co>
<http://in.linkedin.com/in/sonalgoyal>


On Fri, Sep 7, 2018 at 12:22 PM, Apostolos N. Papadopoulos
<papad...@csd.auth.gr> wrote:

> Can you isolate the row that is causing the problem? I mean, start using
> show(31) up to show(60).
>
> Perhaps this will help you to understand the problem.
>
> regards,
>
> Apostolos
>
>
> On 07/09/2018 01:11 AM, dimitris plakas wrote:
>
> Hello everyone, I am new to PySpark and I am facing an issue. Let me
> explain what exactly the problem is.
>
> I have a dataframe and I apply a map() function on it:
>
>     dataframe2 = dataframe1.rdd.map(custom_function())
>     dataframe = sqlContext.createDataFrame(dataframe2)
>
> When I run dataframe.show(30, True) it shows the result, but when I use
> dataframe.show(60, True) I get the error. The error is in the attachment
> Pyspark_Error.txt.
>
> Could you please explain to me what this error is and how to get past it?
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>
> --
> Apostolos N. Papadopoulos, Associate Professor
> Department of Informatics
> Aristotle University of Thessaloniki
> Thessaloniki, GREECE
> tel: ++0030312310991918
> email: papad...@csd.auth.gr
> twitter: @papadopoulos_ap
> web: http://datalab.csd.auth.gr/~apostol
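
A possible way to act on both points above (isolating rows 31-60, and checking whether some column value fails to parse as an int) is sketched below. It is only a rough illustration: the names dataframe1 and custom_function are taken from the original mail, and it assumes custom_function returns one tuple of column values per record; adjust as needed.

    # Rough sketch, untested against the attached error.
    # Pull the first 60 mapped records onto the driver with take(), so a
    # failing record surfaces on its own instead of inside show()/createDataFrame.
    # Note: this passes the function itself; if custom_function() in the original
    # mail is a factory that returns the mapping function, keep the parentheses.
    mapped = dataframe1.rdd.map(custom_function)
    records = mapped.take(60)

    # Inspect rows 31-60 and flag any value that is not an int.
    for row_number, record in enumerate(records[30:], start=31):
        for value in record:
            if not isinstance(value, int):
                print(row_number, repr(value), type(value))

If take(60) itself raises the same EOFError, that already narrows the problem to the map step rather than to createDataFrame.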