Hi everyone,

I've been working on #3225 to extend field-major processing to nested struct fields in Comet. My implementation focuses on separating validity extraction from child field processing to improve cache locality and reduce type-dispatch overhead.
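To make the idea concrete, here is a minimal, std-only sketch of what I mean by field-major processing for a struct column. It is not the code in the PR: `Row`, `StructColumns`, and `field_major` are hypothetical names, and `Option<Row>` is only a stand-in for the real row format. The point is the shape of the loops: validity is extracted in one pass, then each child field is filled in its own tight loop.

```rust
/// Hypothetical row-oriented input: a struct value with two child fields.
/// (Illustrative stand-in only; the real kernels read a binary row format.)
#[derive(Clone, Copy, Debug)]
struct Row {
    a: i32,
    b: f64,
}

/// Hypothetical column-oriented output for one struct column:
/// struct-level validity plus one value buffer per child field.
#[derive(Debug)]
struct StructColumns {
    validity: Vec<bool>,
    a: Vec<i32>,
    b: Vec<f64>,
}

/// Field-major conversion: one pass for validity, then one pass per child.
fn field_major(rows: &[Option<Row>]) -> StructColumns {
    // Pass 1: extract struct-level validity once, independent of the children.
    let validity: Vec<bool> = rows.iter().map(|r| r.is_some()).collect();

    // Pass 2: one loop per child field. The element type is fixed inside each
    // loop, so there is no per-row type dispatch. Null rows get a placeholder
    // value that readers ignore because the validity bit is false.
    let a: Vec<i32> = rows
        .iter()
        .map(|r| r.as_ref().map_or(0, |row| row.a))
        .collect();
    let b: Vec<f64> = rows
        .iter()
        .map(|r| r.as_ref().map_or(0.0, |row| row.b))
        .collect();

    StructColumns { validity, a, b }
}
```

Calling `field_major(&[Some(Row { a: 1, b: 1.5 }), None])` yields two validity entries and two entries per child buffer, with a placeholder written for the null row.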
As I finalize this PR, I have a question about the broader roadmap for complex types. Now that we have a recursive strategy for struct fields, what are the community's thoughts on applying a similar vectorized approach to List and Map types? Specifically, what is the preferred pattern for handling variable-length offsets in these collection types while staying on the optimized field-major traversal path in the shuffle kernels?

I am a student contributor preparing for GSoC 2026 and would love to align my current work with the long-term architectural goals for complex type optimization in DataFusion Comet.

Best regards,
Vignesh
GitHub: vigneshsiva11
