pyckle commented on code in PR #2927:
URL: https://github.com/apache/parquet-java/pull/2927#discussion_r2258316076


##########
parquet-avro/src/test/resources/test-bad-compressed-size.parquet:
##########


Review Comment:
   Hey, thanks for taking a look. I encountered this issue while working on an 
early version of a new parquet driver built on top of parquet-java 
(https://github.com/Earnix/parquetforge).
   I'm happy to put this generated corrupted file into the parquet-testing 
repo. But that leaves a few questions:
   
   * Should this go in the bad data folder? It is indeed invalid parquet, but 
parquet-java attempts to read it.
   * This flow is very difficult to unit test without the corrupted file. Can 
(or should) this be merged without an automated test?
   
   As such, I think the sensible options are either to merge the fix without a 
test or to remove this code entirely. Removing the code seems to make more 
sense: its only purpose is to allow reading these corrupted files, and it has 
been broken for a while, so nobody seems to depend on it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


