PDGGK opened a new pull request, #37516:
URL: https://github.com/apache/beam/pull/37516

   ## Summary
   
   This PR enables support for nested column paths (dot notation) in the 
`keep`/`drop` configuration for Iceberg IO column pruning.
   
   **Problem:**
   Users cannot specify nested column paths such as `keep: ["data.name", 
"data.value"]`; validation fails with:
   ```
   Invalid source configuration: 'keep' specifies unknown field(s): [data.name]
   ```
   
   **Solution:**
   Replace the top-level-only column validation with `Schema.findField()`, which 
natively resolves dot-notation paths to nested fields in Iceberg.
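
As a rough illustration of what dot-notation validation involves, the sketch below walks each dotted path through a nested schema and collects the paths that do not resolve. It is a stdlib-only stand-in under stated assumptions: `NestedPathSketch`, `resolves`, and `unknownFields` are hypothetical names, the schema is modeled as nested maps, and none of this is the actual `IcebergScanConfig` code, which delegates the lookup to Iceberg's `Schema.findField()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for nested-path validation; not Beam's or Iceberg's code. */
final class NestedPathSketch {

  /**
   * Returns true if the dotted path names a field in the schema.
   * Assumption: a struct is a Map of field name to either a nested Map (struct)
   * or any other object (a leaf type, e.g. the string "long").
   */
  static boolean resolves(Map<String, Object> schema, String path) {
    Object current = schema;
    for (String part : path.split("\\.")) {
      if (!(current instanceof Map)) {
        return false; // cannot descend into a leaf field
      }
      Map<?, ?> struct = (Map<?, ?>) current;
      if (!struct.containsKey(part)) {
        return false; // no such field at this nesting level
      }
      current = struct.get(part);
    }
    return true;
  }

  /** Collects the requested paths that do not resolve, in request order. */
  static List<String> unknownFields(Map<String, Object> schema, List<String> requested) {
    List<String> unknown = new ArrayList<>();
    for (String path : requested) {
      if (!resolves(schema, path)) {
        unknown.add(path);
      }
    }
    return unknown;
  }
}
```

With a schema that has a top-level `id` and a struct `data` holding `name` and `value`, `unknownFields` reports nothing for `data.name` or `id`, while a path such as `data.missing` is returned as unknown. A top-level-only check, by contrast, would reject `data.name` because no top-level column carries that literal name.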
   
   **Changes:**
   - `IcebergScanConfig.java`: Updated the `validate()` method to use `findField()` 
instead of iterating over top-level columns only
   - `IcebergIOReadTest.java`: Added test case for nested column path validation
   
   Fixes #37486
   
   ## Test plan
   
   - [x] Added unit test `testNestedColumnPruningValidation` that verifies 
nested paths are accepted
   - [ ] Existing tests should continue to pass (top-level column validation 
still works)
   
   ---
   
   🤖 Generated with [Claude Code](https://claude.ai/code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
