Seems like it's worth filing a bug against withColumn.

On Wed, Nov 21, 2018, 6:25 PM Colin Williams <colin.williams.seat...@gmail.com> wrote:
> Hello,
>
> I'm currently trying to update the schema for a dataframe with nested
> columns. I would either like to update the schema itself or cast the
> column without having to explicitly select all the columns just to
> cast one.
>
> With regard to updating the schema, it looks like I would probably need
> to write a more complex map over the schema to find the StructFields I
> want to update and update them. I haven't found any examples of this,
> but it seems like there should be a simpler way to do it.
>
> With regard to changing the column on the dataframe itself, using e.g.
>
> val newDF = df.withColumn("existing.top.level.FIELD_NAME",
>   df.col("existing.top.level.FIELD_NAME").cast(LongType))
>
> I end up with a new column named "existing.top.level.FIELD_NAME" at
> the root level instead of the nested column being updated to the new
> type. Has anybody worked out how to update a nested column's datatype,
> and also how to update the column type in the nested schema StructType?
> Are there any easy ways to do this, or is there a reason it is not
> trivial?
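
In the meantime, one possible workaround is to rewrite the schema recursively and then cast each top-level column to its rewritten type, which avoids listing the nested fields by hand. The following is only a minimal sketch, not a tested fix: the names NestedCast, retype and castNested are made up for illustration, it assumes every step of the path is a struct (no arrays or maps in between), and it assumes top-level column names contain no dots.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, struct}
import org.apache.spark.sql.types._

// Hypothetical helper object, not part of Spark.
object NestedCast {
  // Return a copy of `schema` in which the field reached by `path`
  // has type `newType`. Fields that are not on the path are unchanged.
  def retype(schema: StructType, path: List[String], newType: DataType): StructType =
    StructType(schema.fields.map { f =>
      (path, f.dataType) match {
        // Last path element: this is the field to retype.
        case (name :: Nil, _) if name == f.name =>
          f.copy(dataType = newType)
        // Intermediate path element: descend into the nested struct.
        case (name :: rest, inner: StructType) if name == f.name =>
          f.copy(dataType = retype(inner, rest, newType))
        case _ => f
      }
    })

  // Cast every top-level column to its type in the rewritten schema.
  // Casting a struct to a struct with the same field names casts field by field.
  def castNested(df: DataFrame, path: String, newType: DataType): DataFrame = {
    val target = retype(df.schema, path.split('.').toList, newType)
    val columns = target.fields.map { f =>
      col(f.name).cast(f.dataType).as(f.name, f.metadata)
    }
    df.select(columns: _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("nested-cast")
      .getOrCreate()
    import spark.implicits._

    // Build existing.top.level.{FIELD_NAME, other} from a flat dataframe.
    val level = struct($"FIELD_NAME", $"other").as("level")
    val top = struct(level).as("top")
    val df = Seq((1, "a")).toDF("FIELD_NAME", "other")
      .select(struct(top).as("existing"))

    castNested(df, "existing.top.level.FIELD_NAME", LongType).printSchema()
    spark.stop()
  }
}
```

The retype function on its own covers the schema half of the question, since it only works on StructType values and needs no dataframe. The castNested function then applies the rewritten schema, so only the path and the target type have to be spelled out.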