johanl-db commented on PR #54488:
URL: https://github.com/apache/spark/pull/54488#issuecomment-3989338708

   > d tests like:
   > 
   > * type evolution
   > * 2 level structs
   > * non-partitioned table
   > * constraints
   > 
   > Also do we run the same tests with the Dataframe API? (I think we only test with SQL?)
   
   I meant to reply earlier to this:
   
   - 2 level structs: I've added tests for struct evolution nested inside another struct, inside a map key/value, and inside an array.
   - type evolution: I'm planning a follow-up to properly support type evolution - it'll need to be more granular than today, where we blindly attempt to apply any type change even if the table doesn't support it. I've added a basic test here, but proper coverage will come in that follow-up.
   - non-partitioned tables: for REPLACE WHERE, the only reason I'm using partitioned tables is that the InMemoryCatalog doesn't support partial overwrite on non-partitioned tables. It didn't seem necessary to implement this: schema evolution doesn't apply to partition columns (you can't add a new partition column, and partition columns can't be nested types, so struct evolution doesn't apply either).
   - Constraints: not sure exactly what you had in mind there
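
   As a hedged sketch of the REPLACE WHERE case above (the catalog, table, and column names are made up for illustration, not taken from the PR's test suite), the partitioned-table scenario looks roughly like this:

   ```scala
   // Sketch only: assumes a v2 catalog registered as "testcat" that
   // supports partial overwrite on partitioned tables, as the
   // InMemoryCatalog does. Names (testcat.t, s, p) are hypothetical.
   spark.sql("CREATE TABLE testcat.t (id INT, s STRUCT<a: INT>, p INT) PARTITIONED BY (p)")

   // REPLACE WHERE overwrites only rows matching the predicate; the
   // source query carries an extra nested field s.b, which is what
   // exercises struct evolution on the target table.
   spark.sql("""
     INSERT INTO testcat.t REPLACE WHERE p = 1
     SELECT 1 AS id, named_struct('a', 1, 'b', 2) AS s, 1 AS p
   """)
   ```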
   
   For the dataframe API: Spark doesn't actually provide a way to enable schema evolution via the dataframe API, so I've left that out for now. Adding it would require more discussion: Delta (and Iceberg) do it via a writer option `mergeSchema`, but Spark doesn't really say anything about that.
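
   For reference, the writer option mentioned above looks like this in the DataFrame API - note this is Delta's own option, not a Spark core API:

   ```scala
   // Hedged example: "mergeSchema" is a Delta Lake writer option.
   // Spark itself defines no equivalent, which is why schema
   // evolution is only exposed through SQL here.
   df.write
     .format("delta")
     .option("mergeSchema", "true")   // opt in to schema evolution
     .mode("append")
     .saveAsTable("target_table")     // hypothetical table name
   ```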


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
