On Mon, 16 Sep 2024 11:42:46 GMT, Lance Andersen <lan...@openjdk.org> wrote:
>> I'm curious why the combined header length validation is being placed so >> late. In general I would assume failing fast is better? >> >> Also, since the combined header length clause applies to "any directory >> record", it also applies to LOC? >> >> So why is this happening late in `writeCEN`, as opposed to early in >> `putNextEntry`? >> >> Edit: I'm aware that moving it to `putNextEntry` means you probably need to >> repeat it in writeCEN, just in case the `ZipEntry` was updated meanwhile. >> >> Some comments inline. > >> I'm curious why the combined header length validation is being placed so >> late. In general I would assume failing fast is better? >> >> Also, since the combined header length clause applies to "any directory >> record", it also applies to LOC? >> >> So why is this happening late in `writeCEN`, as opposed to early in >> `putNextEntry`? >> >> Edit: I'm aware that moving it to `putNextEntry` means you probably need to >> repeat it in writeCEN, just in case the `ZipEntry` was updated meanwhile. >> >> Some comments inline. > > As this is really a corner case at best, I decided to keep the changes to a > minimum and the validation in writeCEN given that is where the encoded > comment bytes are obtained and written out. > @LanceAndersen > > I had a look at your update of the `CenSizeTooLarge` test. > > This test uses artificially large extra fields to produce a CEN size > exceeding the implementation limit. Since this otherwise would create a lot > of IO and a huge file, the test writes a sparse file until the last CEN > record is written, after which real bytes are written out for the "ZIP64 End > of Central Directory" and "End of Central Directory" records are written. > > The current extra field size of 0xFFFF is too large to pass the combined > length validation introduced by this PR. Because of this, the field size is > reduced to account for the CENHDR length, the length of the entry name and > the length of the comment (which is added to the last entry only) > > Since the comment is only added to the last entry (as a hint to > SparseOutputStream), the CEN_HEADER_SIZE no longer reflects the actual size > of the CEN headers written. While the test still passes (by increasing > NUM_ENTRIES as needed), this makes the test somewhat confusing for future > maintainers. > > Could we perhaps skip the comment altogether, and instead use equal-sized CEN > records for all entries? Instead of using a comment as a signal to > `SparseOutputStream` we could use a separate extra byte array on the last > entry, as suggested in this diff on top of your PR: Thanks for the suggestions to the test. They look fine. I have done a run across our internal Mach5 systems and no issues(as expected). PR has been updated with these changes ------------- PR Comment: https://git.openjdk.org/jdk/pull/21003#issuecomment-2367883851