Hello devs,

I was measuring performance on struct columns between the V1 and V2 datasources and found that, although the Iceberg Reader implements `SupportsPushDownRequiredColumns`, it doesn't seem to prune nested column projections. I want to be able to prune on nested fields. Does the V2 datasource API have a provision that lets Iceberg decide this? As far as I can tell, the `SupportsPushDownRequiredColumns` mix-in hands over the entire struct field even when only a sub-field is requested.
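To make the ask concrete, here is a minimal sketch of the pruning behaviour I have in mind. It is plain Java and does not use any Spark or Iceberg API: the class `NestedPruneSketch`, the `prune` and `exampleSchema` helpers, and the map-based schema model (field name mapped to either a type name or a nested map) are all hypothetical and exist only for illustration. It assumes the reader could somehow be given the requested dotted paths, which is exactly the information I don't see in `pruneColumns` today.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedPruneSketch {

    /**
     * Returns a new schema holding only the requested dotted paths.
     * A struct is modeled as an ordered map of field name to type, where a
     * type is either a String ("double") or a nested map (a struct).
     * This model is an assumption for illustration, not Spark's StructType.
     */
    public static Map<String, Object> prune(Map<String, Object> schema, List<String> paths) {
        Map<String, Object> pruned = new LinkedHashMap<>();
        for (String path : paths) {
            copyPath(schema, pruned, path.split("\\."), 0);
        }
        return pruned;
    }

    @SuppressWarnings("unchecked")
    private static void copyPath(Map<String, Object> source, Map<String, Object> target,
                                 String[] parts, int index) {
        String name = parts[index];
        Object type = source.get(name);
        if (type == null) {
            throw new IllegalArgumentException("Unknown field: " + String.join(".", parts));
        }
        if (index == parts.length - 1) {
            // End of the path: keep the field as-is (a whole struct if it is one).
            target.put(name, type);
            return;
        }
        if (!(type instanceof Map)) {
            throw new IllegalArgumentException(name + " is not a struct in " + String.join(".", parts));
        }
        Object existing = target.get(name);
        if (existing == type) {
            return; // the whole struct was already requested, nothing to narrow
        }
        Map<String, Object> child;
        if (existing instanceof Map) {
            child = (Map<String, Object>) existing; // merge with earlier sub-field requests
        } else {
            child = new LinkedHashMap<>();
            target.put(name, child);
        }
        copyPath((Map<String, Object>) type, child, parts, index + 1);
    }

    /** The `location` struct from the log output in this mail. */
    public static Map<String, Object> exampleSchema() {
        Map<String, Object> location = new LinkedHashMap<>();
        location.put("lat", "double");
        location.put("lon", "double");
        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("location", location);
        return schema;
    }

    public static void main(String[] args) {
        // Requesting only location.lat should drop location.lon.
        System.out.println(prune(exampleSchema(), Arrays.asList("location.lat")));
        // prints {location={lat=double}}
    }
}
```

With `location.lat` as the only requested path, the pruned schema keeps `location` but drops `lon`, which is the shape I would like the reader to receive.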
Here's an illustration of what I'm seeing:

    scala> spark.sql("select location.lat from iceberg_people_struct").show()
    +-------+
    |    lat|
    +-------+
    |   null|
    |101.123|
    |175.926|
    +-------+

The pruning call receives the entire struct instead of just `location.lat`:

    public void pruneColumns(StructType newRequestedSchema)

    19/08/30 16:25:38 WARN Reader: => Prune columns : {
      "type" : "struct",
      "fields" : [ {
        "name" : "location",
        "type" : {
          "type" : "struct",
          "fields" : [ {
            "name" : "lat",
            "type" : "double",
            "nullable" : true,
            "metadata" : { }
          }, {
            "name" : "lon",
            "type" : "double",
            "nullable" : true,
            "metadata" : { }
          } ]
        },
        "nullable" : true,
        "metadata" : { }
      } ]
    }

Is there information available in the IcebergSource (or that I could add) that can be used to prune down to the exact sub-field here? What's a good way to approach this? For dense/wide struct fields this affects performance significantly.

Sample gist: https://gist.github.com/prodeezy/001cf155ff0675be7d307e9f842e1dac

Thanks and regards,
-Gautam.