On Tue, 10 Sep 2024 18:35:52 GMT, Eirik Bjørsnøs <eir...@openjdk.org> wrote:

> Please review this PR which speeds up `ZipFile.getZipEntry` by removing 
> slash-checking logic which is already taking place during lookup in 
> `ZipFile.Source.getEntryPos`. 
> 
> `ZipFile.Source.getEntryPos` includes logic to match a lookup for "name" 
> against a directory entry "name/" (with a trailing slash). However, only the 
> CEN position is currently returned, so `ZipFile.getZipEntry` needs to re-read 
> the name from the CEN and determine if a trailing slash needs to be appended 
> to the name of the returned `ZipEntry`.
> 
> By letting `ZipFile.Source.getEntryPos` return the resolved name along with 
> the CEN position (in a new record `EntryPos`), `ZipFile.getZipEntry` can now 
> instead use the already resolved name. 
> 
> Since `ZipFile.getZipEntry` now has the name, CEN header fields can now be 
> read in bulk, separate from the allocation of the `ZipEntry`. This reordering 
> is unlocked by the other changes in this PR and can alone explain a lot of 
> the performance gains, probably because of better cache use.
> 
> This results in a nice ~18% speedup in the `ZipFileGetEntry.getEntryHit` 
> micro:
> 
> Baseline:
> 
> 
> Benchmark                    (size)  Mode  Cnt   Score   Error  Units
> ZipFileGetEntry.getEntryHit     512  avgt   15  63.713 ± 2.645  ns/op
> ZipFileGetEntry.getEntryHit    1024  avgt   15  67.405 ± 1.474  ns/op
> 
> 
> PR:
> 
> 
> Benchmark                    (size)  Mode  Cnt   Score   Error  Units
> ZipFileGetEntry.getEntryHit     512  avgt   15  52.027 ± 2.669  ns/op
> ZipFileGetEntry.getEntryHit    1024  avgt   15  55.211 ± 1.169  ns/op
> 
> The changes in this PR make `UTF8ZipCoder.compare` the only caller of 
> `ZipCoder.hasTrailingSlash`, so this method is made private and the 
> implementation in the base class is retired.
> 
> This is purely a cleanup and optimization PR; no functional tests are changed 
> or added.
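
For readers skimming the archive, the idea of returning the resolved name together with the position can be sketched roughly as follows. This is a toy illustration only, not the code from the changeset: the class `EntryPosSketch`, the `Map`-based index, and the `add` helper are hypothetical stand-ins for the real hash table in `ZipFile.Source`, and only the record name `EntryPos` is taken from the description above.

```java
import java.util.HashMap;
import java.util.Map;

public class EntryPosSketch {
    // Resolved entry name plus its position; the field names are assumptions.
    record EntryPos(String name, int pos) {}

    // Hypothetical index from entry name to position (not the real CEN table).
    private final Map<String, Integer> index = new HashMap<>();

    void add(String name, int pos) {
        index.put(name, pos);
    }

    // Looks up "name"; if absent, also tries the directory form "name/".
    // The caller gets the resolved name back and need not re-check the slash.
    EntryPos getEntryPos(String name) {
        Integer pos = index.get(name);
        if (pos != null) {
            return new EntryPos(name, pos);
        }
        if (!name.endsWith("/")) {
            String dirName = name + "/";
            pos = index.get(dirName);
            if (pos != null) {
                return new EntryPos(dirName, pos);
            }
        }
        return null; // no matching entry
    }

    public static void main(String[] args) {
        EntryPosSketch source = new EntryPosSketch();
        source.add("META-INF/", 0);
        source.add("META-INF/MANIFEST.MF", 46);
        // A lookup for "META-INF" resolves to the directory entry "META-INF/".
        System.out.println(source.getEntryPos("META-INF"));
        System.out.println(source.getEntryPos("META-INF/MANIFEST.MF"));
        System.out.println(source.getEntryPos("missing"));
    }
}
```

In this sketch the caller receives the trailing-slash decision as part of the lookup result, which is the duplication the PR description says it removes from `ZipFile.getZipEntry`.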

This pull request has now been integrated.

Changeset: 7f1dae12
Author:    Eirik Bjørsnøs <eir...@openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/7f1dae12e5e24d204a70cf610a8c482996556931
Stats:     79 lines in 2 files changed: 16 ins; 45 del; 18 mod

8339874: Avoid duplicate checking of trailing slash in ZipFile.getZipEntry

Reviewed-by: lancea, redestad

-------------

PR: https://git.openjdk.org/jdk/pull/20939
