We probably should have the alias. Is this still a problem on the master branch?
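
Until an alias lands, a small lookup table on the caller's side works around it. A minimal sketch (parse_dtype and _NAME_TO_TYPE are illustrative names, not an existing pyspark API; only the simple atomic names that DataFrame.dtypes returns are covered):

    from pyspark.sql.types import (
        StringType, IntegerType, LongType, FloatType, DoubleType,
        BooleanType, TimestampType, DateType,
    )

    # Hypothetical helper: map the simple dtype name strings that
    # DataFrame.dtypes returns to DataType instances, with 'bigint'
    # aliased to LongType.
    _NAME_TO_TYPE = {
        'string': StringType(),
        'int': IntegerType(),
        'bigint': LongType(),      # the missing alias
        'float': FloatType(),
        'double': DoubleType(),
        'boolean': BooleanType(),
        'timestamp': TimestampType(),
        'date': DateType(),
    }

    def parse_dtype(name):
        # Complex types such as 'array<string>' would still need
        # real parsing; this only handles atomic type names.
        try:
            return _NAME_TO_TYPE[name]
        except KeyError:
            raise ValueError('Could not parse datatype: %s' % name)
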
On Wed, Mar 16, 2016 at 9:40 AM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:

> Running the following:
>
>> # fix schema for gaid, which should not be Double
>> from pyspark.sql.types import *
>>
>> customSchema = StructType()
>> for (col, typ) in tsp_orig.dtypes:
>>     if col == 'Agility_GAID':
>>         typ = 'string'
>>     customSchema.add(col, typ, True)
>
> I'm getting:
>
> ValueError: Could not parse datatype: bigint
>
> It looks like pyspark.sql.types doesn't know anything about bigint.
> Should it be aliased to LongType in pyspark.sql.types?
>
> Thanks
>
>
> On Wed, Mar 16, 2016 at 10:18 AM, Ruslan Dautkhanov <dautkha...@gmail.com>
> wrote:
>
>> Hello,
>>
>> Looking at
>> https://spark.apache.org/docs/1.5.1/api/python/_modules/pyspark/sql/types.html
>> I can't wrap my head around how to convert string data type names into
>> actual pyspark.sql.types data types.
>>
>> Does pyspark.sql.types have an interface that returns StringType() for
>> "string", IntegerType() for "integer", etc.? If it doesn't, it would be
>> great to have such a mapping function.
>>
>> Thank you.
>>
>> ps. I have a data frame and use its dtypes to loop through all columns
>> to fix a few columns' data types, as a workaround for SPARK-13866.
>>
>> --
>> Ruslan Dautkhanov
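
With a helper like the sketch above, the loop from the quoted message can avoid the string parser entirely by passing DataType instances to StructType.add (again a sketch; tsp_orig is the poster's DataFrame):

    from pyspark.sql.types import StructType

    customSchema = StructType()
    for (col, typ) in tsp_orig.dtypes:
        if col == 'Agility_GAID':
            typ = 'string'
        # Pass a DataType instance rather than a name string, so the
        # 'bigint' entries never reach the failing parser.
        customSchema.add(col, parse_dtype(typ), True)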