Maybe something like

    // Start from the first DataFrame: emptyDataFrame has no columns,
    // so a union with it fails on the column-count check.
    var finalDF = dfs.head
    for (df <- dfs.tail) {
      finalDF = finalDF.union(df)
    }

where dfs is a non-empty Seq of DataFrames.

From: Cesar <ces...@gmail.com>
Date: Thursday, April 5, 2018 at 2:17 PM
To: user <user@spark.apache.org>
Subject: Union of multiple data frames

The following code works for small n, but not for large n (>20):

    val dfUnion = Seq(df1,df2,df3,...dfn).reduce(_ union _)
    dfUnion.show()

By not working, I mean that Spark takes a lot of time to create the execution plan.

Is there a more optimal way to perform a union of multiple data frames?

Thanks
--
Cesar Flores
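
On the planning-time question in the quoted message: both reduce(_ union _) and a sequential loop build a left-deep chain of unions, so the plan nesting grows with n. One thing that might be worth trying is to combine the frames pairwise, so the nesting grows roughly with log2(n) instead. The sketch below is only an illustration of that idea: balancedReduce is a hypothetical helper name (not a Spark API), it is written generically so it runs without Spark, and whether it actually shortens planning will depend on the Spark version and the size of each input plan.

```scala
// Hypothetical helper: reduce a sequence pairwise, level by level,
// so the result is a balanced tree of `combine` calls rather than a chain.
def balancedReduce[A](items: Seq[A])(combine: (A, A) => A): A = {
  require(items.nonEmpty, "balancedReduce needs at least one element")
  var level = items.toVector
  while (level.size > 1) {
    // Combine neighbours; a trailing odd element is carried up unchanged.
    level = level.grouped(2).map(pair => pair.reduce(combine)).toVector
  }
  level.head
}

// Assumed usage with Spark, where dfs is a non-empty Seq[DataFrame]
// whose elements all share the same schema:
//   val dfUnion = balancedReduce(dfs)(_ union _)
//   dfUnion.show()
```

The order of the inputs is preserved, so the rows come out in the same sequence of frames as with the original reduce.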