gauravkhatri05 commented on PR #3272: URL: https://github.com/apache/parquet-java/pull/3272#issuecomment-3252363120
> Thanks for raising it @gauravkhatri05. +1 on the dev list since we are discussing adding partial support for an unsupported feature. In the meanwhile let me check if there is anything else that can be done to unblock this usecase. (formally supporting recursive schemas in parquet has likely already been discussed, will check if anything can be done)

Thanks @gszadovszky & @ArnavBalyan for the response.

While thinking further, I realized a limitation in my current approach:

* If the self-reference is immediate, max-depth works as expected: the same schema is traversed recursively until the depth is exhausted.
* But if the self-reference occurs deeper in the path, *every* child schema along that path also consumes part of the max-depth before we even reach the recursive schema.

For example, with max-depth = 10:

* Input Avro schema: `A -> B -> C -> D -> A`
* Output schema today: `A -> B -> C -> D -> A -> B -> C -> D -> A -> B`
* Expected behavior: `A -> B -> C -> D` repeated 10 times (instead of prematurely exhausting depth).

To improve this, I think we need to track **seen schemas within the same traversal path**. That way, only encountering the *same schema in the same path* decrements the depth count. This is essentially a tree traversal problem where we need to track nodes along the current branch.

What do you think about this refinement?

As you suggested, I'll also bring this up on the dev list so we can get broader community feedback. Could you please share the correct DL address / process to raise this discussion?
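A minimal sketch of the path-aware idea, to make the proposal concrete. It is not the code in this PR and does not use the Avro or Parquet APIs: `Node`, `expand`, and `walk` are hypothetical names, and counting how many times each schema name already appears on the current branch is just one possible reading of "only the same schema in the same path decrements the depth count".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PathAwareDepthSketch {

  /** Hypothetical stand-in for a named record schema; not an Avro or Parquet type. */
  public static final class Node {
    public final String name;
    public final List<Node> children = new ArrayList<>();

    public Node(String name) {
      this.name = name;
    }
  }

  /** Returns the schema names visited in pre-order while expanding recursive references. */
  public static List<String> expand(Node root, int maxDepth) {
    List<String> out = new ArrayList<>();
    walk(root, maxDepth, new HashMap<>(), out);
    return out;
  }

  private static void walk(Node node, int maxDepth, Map<String, Integer> onPath, List<String> out) {
    // How often this schema already occurs on the current root-to-node branch.
    int seen = onPath.getOrDefault(node.name, 0);
    if (seen >= maxDepth) {
      return; // this schema has used up its own depth budget on this branch
    }
    onPath.put(node.name, seen + 1);
    out.add(node.name);
    for (Node child : node.children) {
      walk(child, maxDepth, onPath, out);
    }
    // Backtrack so sibling branches do not inherit this branch's count.
    onPath.put(node.name, seen);
  }

  public static void main(String[] args) {
    // Build the cycle A -> B -> C -> D -> A from the example above.
    Node a = new Node("A");
    Node b = new Node("B");
    Node c = new Node("C");
    Node d = new Node("D");
    a.children.add(b);
    b.children.add(c);
    c.children.add(d);
    d.children.add(a);

    List<String> out = expand(a, 10);
    System.out.println(out.size()); // 40, i.e. A -> B -> C -> D repeated 10 times
    System.out.println(String.join(" -> ", out.subList(0, 8)));
  }
}
```

Because the count is kept per schema and restored on the way back up, intermediate schemas such as `B`, `C`, and `D` no longer eat into the budget for `A`, and two sibling fields that reference the same recursive schema are each expanded with the full depth.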
