Hi Paul,

>> Are there any plans to address this?
>
> Not until you mentioned it, but I just now installed a patch for this;
> please see the end of this message.

Awesome! Many thanks.

> Can you please help out by supplying some test cases?

I can certainly provide one, currently part of a not-quite-final patch at

  https://issues.apache.org/jira/browse/HADOOP-6835
  http://issues.apache.org/jira/secure/attachment/12448469/HADOOP-6835.v5.trunk-hadoop-common.patch

I've copied it here:

  http://gregroelofs.com/test/testCompressThenConcat.txt.gz

This was hand-built, but I've verified that zlib > 1.2.1.2 reads it
correctly--that is, using the regular zlib inflateInit2() API, not the
gz* one, which ignores the CRC but otherwise also handles it OK.
(Versions prior to 1.2.1.2 forgot to compute the CRC on the trailing
NULLs in the filename and comment fields.)

I don't recall if I've verified it yet with Sun's JDK--I've made myself
a note to do so sometime this week. (They're not exactly swift on
gzip-related fixes in any case. ;-)
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4691425)

> Are there
> examples on the net of gzip files that gzip 1.4 won't decompress, due
> to this problem?

Not to my knowledge; we've got a bit of a chicken-and-egg problem there,
insofar as most people avoid generating gzip'd files that can't be
decoded with standard gzip. Neither the JDK nor zlib minigzip provides
a mechanism to generate arbitrary header fields, AFAIK. Possibly
something like 7-Zip does, but I suspect not.

> If not, can you please generate some? As things
> stand, I feel that I haven't tested it in any real-world way. Thanks.

I'll try to do so, yes. We're putting together a more extensive test
plan for the Hadoop patch, and the ideal suite would include all
possible header combos (with/without extra field, filename, comment,
CRC). I'm not sure I'll have time--we're approaching an internal code
freeze shortly--but I'll do what I can.

> Here's the patch. I'll add a NEWS entry shortly.

Thank you!
I'll also test this at work this week--it will make my own testing easier.

Greg
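
For reference, here is a rough sketch of how test files covering all the header combos (with/without extra field, filename, comment, header CRC) could be hand-built. It is only an illustration in Python that follows the RFC 1952 layout, not the code used to produce the Hadoop test file. The names build_gzip_member and all_header_combos, the "zz" extra-subfield ID, and the sample filename and comment are all made up for the example.

```python
import itertools
import struct
import zlib

# FLG bits from RFC 1952
FHCRC, FEXTRA, FNAME, FCOMMENT = 0x02, 0x04, 0x08, 0x10


def build_gzip_member(payload, extra=None, name=None, comment=None, hcrc=False):
    """Hand-build one gzip member with the requested optional header fields."""
    flags = 0
    optional = b""
    if extra is not None:
        flags |= FEXTRA
        optional += struct.pack("<H", len(extra)) + extra  # XLEN, then the bytes
    if name is not None:
        flags |= FNAME
        optional += name.encode("latin-1") + b"\x00"  # NUL-terminated
    if comment is not None:
        flags |= FCOMMENT
        optional += comment.encode("latin-1") + b"\x00"  # NUL-terminated
    if hcrc:
        flags |= FHCRC

    # Fixed part: ID1 ID2 CM FLG MTIME XFL OS (MTIME=0, OS=255 "unknown")
    header = struct.pack("<BBBBIBB", 0x1F, 0x8B, 8, flags, 0, 0, 255) + optional
    if hcrc:
        # CRC16 is the low 16 bits of the CRC-32 over every header byte so
        # far, including the terminating NULs of the name and comment.
        header += struct.pack("<H", zlib.crc32(header) & 0xFFFF)

    # Raw deflate stream (negative wbits means no zlib/gzip wrapper).
    deflater = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    body = deflater.compress(payload) + deflater.flush()

    # Trailer: CRC-32 and length (mod 2**32) of the uncompressed data.
    trailer = struct.pack("<II", zlib.crc32(payload) & 0xFFFFFFFF,
                          len(payload) & 0xFFFFFFFF)
    return header + body + trailer


def all_header_combos(payload):
    """Yield (flags_used, member_bytes) for all 16 with/without combinations."""
    extra = b"zz" + struct.pack("<H", 4) + b"test"  # hypothetical subfield
    for use in itertools.product((False, True), repeat=4):
        use_extra, use_name, use_comment, use_hcrc = use
        member = build_gzip_member(
            payload,
            extra=extra if use_extra else None,
            name="sample.txt" if use_name else None,
            comment="hand-built test member" if use_comment else None,
            hcrc=use_hcrc,
        )
        yield use, member
```

Each member can be checked with zlib.decompress(member, 31), which goes through zlib's regular inflate path in gzip mode rather than the gz* functions. Since members are self-contained, concatenating several of them should give a multi-member file along the lines of the compress-then-concat case.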