Hi Dirceu,

Thanks, you're right! That did work.
But now I'm facing an even bigger problem: since I don't have access to change the underlying data, I just want to apply a schema over something that was written via sparkContext.newAPIHadoopRDD. Basically I am reading in an RDD[JsonObject] and would like to convert it into a DataFrame to which I pass the schema.

What's the best way to do this? I doubt removing all the quotes in the JSON is the best solution, is it?

Regards
Sam

On Sat, Feb 4, 2017 at 2:13 PM, Dirceu Semighini Filho < dirceu.semigh...@gmail.com> wrote:

> Hi Sam
> Remove the " from the number and it will work
>
> On Feb 4, 2017 11:46 AM, "Sam Elamin" <hussam.ela...@gmail.com> wrote:
>
>> Hi All
>>
>> I would like to specify a schema when reading from JSON, but when trying
>> to map a number to a Double it fails; I tried FloatType and IntType with no
>> joy!
>>
>> When inferring the schema, customerid is set to String, and I would like
>> to cast it as Double, so df1 is corrupted while df2 shows the values.
>>
>> Also, FYI, I need this to be generic, as I would like to apply it to any
>> JSON; I specified the schema below as an example of the issue I am facing:
>>
>> import org.apache.spark.sql.types.{BinaryType, StringType, StructField,
>> DoubleType, FloatType, StructType, LongType, DecimalType}
>> val testSchema = StructType(Array(StructField("customerid", DoubleType)))
>> val df1 = spark.read.schema(testSchema).json(sc.parallelize(Array("""{"customerid":"535137"}""")))
>> val df2 = spark.read.json(sc.parallelize(Array("""{"customerid":"535137"}""")))
>> df1.show(1)
>> df2.show(1)
>>
>> Any help would be appreciated; I am sure I am missing something obvious,
>> but for the life of me I can't tell what it is!
>>
>> Kind Regards
>> Sam
>>
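PS: one workaround I've been sketching, in case it helps anyone following the thread. Since the number arrives quoted, the idea is to read the field as a StringType first (which matches the data as written) and cast the column to Double afterwards, rather than forcing DoubleType in the read schema. For the RDD[JsonObject] case, the assumption here is that each JsonObject can be serialized back to its JSON text (e.g. via toString) so Spark can parse the resulting RDD[String]; `jsonRdd` below is a hypothetical name for that RDD, not something from my actual job:

```scala
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Declare the field as StringType, matching the quoted value in the JSON.
val stringSchema = StructType(Array(StructField("customerid", StringType)))

// For an RDD[JsonObject] from newAPIHadoopRDD, serialize each record back
// to its JSON text first (assumes toString yields valid JSON):
// val jsonStrings = jsonRdd.map(_.toString)
val jsonStrings = sc.parallelize(Array("""{"customerid":"535137"}"""))

val df = spark.read.schema(stringSchema).json(jsonStrings)

// Cast the string column to Double after the read; Spark will parse
// "535137" into 535137.0 rather than nulling the row out.
val casted = df.withColumn("customerid", col("customerid").cast("double"))
casted.printSchema()
casted.show(1)
```

This isn't fully generic yet (you'd still need to know which columns to cast), but it avoids rewriting the underlying data.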