On Wed, 10 Jan 2024 17:02:05 GMT, Eirik Bjørsnøs <eir...@openjdk.org> wrote:
>> Sounds like a CSR is needed for the behavioral change proposed here. > >> Sounds like a CSR is needed for the behavioral change proposed here. > > Thanks for flagging this @jddarcy > > I'm personally not 100% convinced a CSR is warranted in this case, but I'm of > course happy to extend the following into a CSR if you or other reviewer > think that's the best way forward: > > Let's review the intended and unintended behavioral changes in this PR: > > The _intended_ behavioral change is to extend the set of ZIP entries parsable > by `ZipInputStream` to include "small Zip64 entries". Such entries meet the > following criteria: > > 1. They are clearly marked as using the Zip64 format, meaning that the LOC's > 'compressed size' and 'uncompressed size' fields are set to the "Zip64 magic > value" 0xFFFFFFFF and that the LOC's extra field includes a valid _Zip64 > Extended Information Field_. > 2. They use _streaming mode_, meaning that the 'general purpose bit flag' 3 > is set, that the Zip64 'Original Size' and 'Compressed Size' are both set to > zero and that file data is followed by a 'Data Descriptor' containing the > actual size values. > 3. The Data Descriptor represents the 'compressed size' and 'uncompressed' > size fields using 8 byte fields, instead of the regular 4 byte fields. > > These entires are currently not parsable by `ZipInputStream`. Trying to parse > them fails with an exception similar to the following: > > > java.util.zip.ZipException: invalid entry size (expected 0 but got 6 bytes) > at > java.base/java.util.zip.ZipInputStream.readEnd(ZipInputStream.java:616) > > > The _unintended_ behavioral change in this PR is that ZIP entries marked as > Zip64, but still having a Data Descriptor using 4 bytes to represent the size > fields will now become unparsable. Such entries meet critera 1 and 2 above, > but not 3. > > While I have not performed a corpus search, I believe such entries are rare, > if they exist at all: > > - They would be in clear violation of the `APPNOTE.txt` spec, which in > 4.3.9.2 says _ZIP64 format MAY be used regardless of the size of a file. > When extracting, if the zip64 extended information extra field is present for > the file the compressed and uncompressed sizes will be 8 byte values_ > - The `zipdetails` tool fails to parse such a file, instead reading the two > 4-byte numbers into one 8-byte 'compressed size' > - The file cannot be parsed by external APIs similar to `ZipInputStream`, > such as the Python `stream-unzip` module. > > To avoid unintended behavioral changes in the implementation, care has been > taken to make the Zip64 extra field validation code robust ... > @eirbjo, either of the intended or unintended effects of the PR on their own > would merit a CSR and this is a case where the inclusive-OR applies so a CSR > is definitely needed; thanks. Thanks Joe! A CSR has been proposed and is ready for the first round of feedback: https://bugs.openjdk.org/browse/JDK-8323583 ------------- PR Comment: https://git.openjdk.org/jdk/pull/12524#issuecomment-1887559379