Hi,

I've been using the immutable Metadata within the StructType of a
DataFrame/Dataset to track application-level column lineage.

However, since it's immutable, the only way I've found to modify it is a
full round trip:

   1. Convert DataFrame/Dataset to Row RDD
   2. Create new, modified Metadata per column from the old
   3. Create a new StructType with the modified metadata
   4. Convert the Row RDD + StructType schema to a DataFrame/Dataset
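
For concreteness, the round trip above can be sketched roughly as follows in
Scala. This is only an illustration: the helper name withLineage and the
"lineage" metadata key are made up for the example, and it assumes Spark 2.x
or later (SparkSession) running in local mode.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{MetadataBuilder, StructType}

object MetadataRoundTrip {
  // Hypothetical helper: adds a "lineage" entry to every column's metadata.
  def withLineage(df: DataFrame, lineage: String): DataFrame = {
    // Steps 2 and 3: build new Metadata from the old, then a new StructType.
    val newSchema = StructType(df.schema.fields.map { field =>
      val metadata = new MetadataBuilder()
        .withMetadata(field.metadata)   // keep the existing entries
        .putString("lineage", lineage)  // hypothetical key, for illustration
        .build()
      field.copy(metadata = metadata)
    })
    // Steps 1 and 4: go through the Row RDD and back with the new schema.
    df.sparkSession.createDataFrame(df.rdd, newSchema)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("metadata-round-trip")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1, "a"), (2, "b")).toDF("id", "name")
    val tagged = withLineage(df, "source=demo")

    // Print each column's metadata as JSON to confirm the change.
    tagged.schema.fields.foreach(f => println(s"${f.name}: ${f.metadata.json}"))
    spark.stop()
  }
}
```

The data is passed through unchanged; only the schema attached to it differs.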

It looks like the conversion to and from an RDD might involve real work,
even though in this case only the schema changes and the data itself isn't
modified at all.

Is there a better way to do this?

Thanks!
