gauravkhatri05 commented on PR #3272:
URL: https://github.com/apache/parquet-java/pull/3272#issuecomment-3246338395
Hello Arnav,
Apologies for the delayed response, and thanks for the update.
I see that the patch introduces an IdentityHashMap check in
convertField(...):
if (seenSchemas.containsKey(schema)) {
throw new UnsupportedOperationException(
"Recursive Avro schemas are not supported by parquet-avro: " +
schema.getFullName());
}
While this prevents infinite loops by failing early, it doesn’t actually
handle recursion gracefully — it simply rejects recursive schemas faster. So
with the current patch, recursive Avro schemas are still not supported. In a
cycle like A -> B -> C -> D -> A, the code would still fail once it loops back
to A, which feels almost the same as having no patch.
In my view, a better approach is to introduce a configurable max-depth limit
to control recursion rather than throwing an exception. We’ve applied this
technique in our AvroSchemaConverter patch, and it has been working reliably in
production. With a default depth of 10, we see stable behavior and good
performance (though reducing it below 7 causes issues).
If you’d like, I can also share our patched implementation for reference.
Thanks 😊
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]