Hi, I ran into a pretty weird issue with to_avro and from_avro: from_avro does not parse the data in a struct column correctly. Please see the simple, self-contained example below. I am using Spark 2.4, and I am not sure whether I missed something.
This is how I start the spark-shell on my Mac:

./bin/spark-shell --packages org.apache.spark:spark-avro_2.11:2.4.0

import org.apache.spark.sql.types._
import org.apache.spark.sql.avro._
import org.apache.spark.sql.functions._

spark.version

val df = Seq((1, "John Doe", 30), (2, "Mary Jane", 25)).toDF("id", "name", "age")

val dfStruct = df.withColumn("value", struct("name","age"))

dfStruct.show
dfStruct.printSchema

val dfKV = dfStruct.select(to_avro('id).as("key"), to_avro('value).as("value"))

val expectedSchema = StructType(Seq(StructField("name", StringType, false),StructField("age", IntegerType, false)))

val avroTypeStruct = SchemaConverters.toAvroType(expectedSchema).toString

val avroTypeStr = s"""
  |{
  |  "type": "int",
  |  "name": "key"
  |}
""".stripMargin

dfKV.select(from_avro('key, avroTypeStr)).show

// output
+-------------------+
|from_avro(key, int)|
+-------------------+
|                  1|
|                  2|
+-------------------+

dfKV.select(from_avro('value, avroTypeStruct)).show

// output
+---------------------------------------------+
|from_avro(value, struct<name:string,age:int>)|
+---------------------------------------------+
|                                        [, 9]|
|                                        [, 9]|
+---------------------------------------------+

Please help and thanks in advance.