pyckle commented on code in PR #2927:
URL: https://github.com/apache/parquet-java/pull/2927#discussion_r2258316076
##########
parquet-avro/src/test/resources/test-bad-compressed-size.parquet:
##########

Review Comment:
   Hey, thanks for taking a look. I encountered this issue while working on an early version of a new Parquet driver built on top of parquet-java (https://github.com/Earnix/parquetforge).

   I'm happy to put this generated corrupted file into the parquet-testing repo, but that leaves a few questions:

   * Should this go in the bad data folder? It is indeed invalid Parquet, but parquet-java attempts to read it.
   * This flow is very difficult to unit test without the corrupted file. Can (or should) this be merged without an automated test?

   As such, I think the sensible options are either to merge the fix without a test or to remove this code entirely. Removing the code seems to make more sense: its only purpose is to allow reading these corrupted files, and it has been broken for a while, so nobody seems to depend on it.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
