On Thu, 12 Sep 2024 11:28:20 GMT, Eirik Bjørsnøs <eir...@openjdk.org> wrote:

>> Please review this PR which speeds up `ZipFile.getZipEntry` by removing 
>> slash-checking logic which is already taking place during lookup in 
>> `ZipFile.Source.getEntryPos`. 
>> 
>> `ZipFile.Source.getEntryPos` includes logic to match a lookup for "name" 
>> against a directory entry "name/" (with a trailing slash). However, only the 
>> CEN position is currently returned, so `ZipFile.getZipEntry` needs to 
>> re-read the name from the CEN and determine if a trailing slash needs to be 
>> appended to the name of the returned `ZipEntry`.
>> 
>> By letting `ZipFile.Source.getEntryPos` return the resolved name along with 
>> the CEN position (in a new record `EntryPos`), `ZipFile.getZipEntry` can now 
>> instead use the already resolved name. 
>> 
>> This results in a nice ~18% speedup in the `ZipFileGetEntry.getEntryHit` 
>> micro:
>> 
>> Baseline:
>> 
>> 
>> Benchmark                    (size)  Mode  Cnt   Score   Error  Units
>> ZipFileGetEntry.getEntryHit     512  avgt   15  63.713 ? 2.645  ns/op
>> ZipFileGetEntry.getEntryHit    1024  avgt   15  67.405 ? 1.474  ns/op
>> 
>> 
>> PR:
>> 
>> 
>> Benchmark                    (size)  Mode  Cnt   Score   Error  Units
>> ZipFileGetEntry.getEntryHit     512  avgt   15  52.027 ? 2.669  ns/op
>> ZipFileGetEntry.getEntryHit    1024  avgt   15  55.211 ? 1.169  ns/op
>> 
>> The changes in this PR makes `UTF8ZipCoder.compare` the only caller of 
>> `ZipCoder.hasTrailingSlash`, so this method is made private and the 
>> implementation in the base class retired.
>> 
>> This purely a cleanup and optimization PR, no functional tests are changed 
>> or added.
>
> Eirik Bjørsnøs has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add comment where getEntry calls Source::getEntryPos

For the benefit of future maintainers, I added a code comment where 
`ZipFile::getEntry` calls `Source::getEntryPos`, pointing out that the resolved 
name from the CEN may differ with a trailing slash.

This reminds me that `getEntryPos` is maybe not a great name. Perhaps something 
like `lookupEntry` would be more honest about the non-trivial logic involved. 
But perhaps that is for another PR.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20939#issuecomment-2346038907

Reply via email to