Hi folks, I need to "append" two dataframes. I was hoping to use unionAll, but it seems that this operation treats the underlying dataframes as a sequence of columns rather than as a map from column name to column.
In particular, my problem is that the columns in the two DFs are not in the same order. Notice that my customer_id somehow comes out as a string in the result. This is Spark 1.4.1:

    case class Test(epoch: Long, browser: String, customer_id: Int, uri: String)
    val test = Test(1234L, "firefox", 999, "http://foobar")

    case class Test1(customer_id: Int, uri: String, browser: String, epoch: Long)
    val test1 = Test1(888, "http://foobar", "ie", 12343)

    val df = sc.parallelize(Seq(test)).toDF
    val df1 = sc.parallelize(Seq(test1)).toDF

    df.unionAll(df1)
    //res2: org.apache.spark.sql.DataFrame = [epoch: bigint, browser: string, customer_id: string, uri: string]

Is unionAll the wrong operation? Are there any special incantations? Or any advice on how to otherwise get this to succeed?
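One workaround that might do the job is to reorder the second dataframe's columns by name before the union. The sketch below assumes both dataframes have exactly the same column names with compatible types. The helper name `unionAllByName`, the object name, and the local master setting are made up for illustration and are not part of Spark. Since unionAll seems to pair columns by position, selecting df1's columns in df's order first should line up customer_id with customer_id, so it should stay an integer column.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

// Same case classes as in the question; kept top-level so toDF can derive the schema.
case class Test(epoch: Long, browser: String, customer_id: Int, uri: String)
case class Test1(customer_id: Int, uri: String, browser: String, epoch: Long)

object UnionAllByNameSketch {
  // Hypothetical helper (not a Spark API): select `right`'s columns in the
  // order of `left`'s column names, then do the usual positional unionAll.
  // Assumes every column name in `left` also exists in `right`.
  def unionAllByName(left: DataFrame, right: DataFrame): DataFrame =
    left.unionAll(right.select(left.columns.map(name => right(name)): _*))

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch can run standalone.
    val conf = new SparkConf().setAppName("union-all-by-name").setMaster("local[1]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val df = sc.parallelize(Seq(Test(1234L, "firefox", 999, "http://foobar"))).toDF()
    val df1 = sc.parallelize(Seq(Test1(888, "http://foobar", "ie", 12343L))).toDF()

    val unioned = unionAllByName(df, df1)
    unioned.printSchema() // inspect whether customer_id kept its integer type
    unioned.show()

    sc.stop()
  }
}
```

In the spark-shell, where sc and the implicits are already available, the same idea reduces to `df.unionAll(df1.select(df.columns.map(df1(_)): _*))`. If the two dataframes do not share the same set of column names, the select will fail, so that case would need separate handling.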