On Thu, 17 Oct 2024 22:32:12 GMT, Claes Redestad <redes...@openjdk.org> wrote:
>>> I guess performance leans on there being only one or a small number of >>> versions per entry. >> >> I checked Spring Petclinic: >> >> * 109 JAR dependencies >> * 15 are multi-release enabled. >> * The 15 MR-JARs have 9347 entries, of which 21 are versioned >> * 9 of 15 MR JARs use the feature exclusively to hide `module-info.class` >> from Java 8. >> * 3 of the 6 remaining JAR files version a single class file >> * The remaining JARs mostly version on the 9 release. >> * The maximum number of versioned entries in a file is 5. >> * Zero entries have more than a single version. >> >>> Which I think is a fair assumption. Still would be good to have a stress >>> test with many entries having many versions and make sure we don't regress >>> such cases too much, no? >> >> Such a test would be somewhat speculative. If I add 5 versions of each entry >> in `ZipFileOpen`, then yes, this PR sees a regression over mainline: >> >> Baseline: >> >> Benchmark (size) Mode Cnt Score Error >> Units >> ZipFileOpen.openCloseZipFile 1024 avgt 15 718234.380 ? 29454.645 >> ns/op >> >> >> PR: >> >> Benchmark (size) Mode Cnt Score Error >> Units >> ZipFileOpen.openCloseZipFile 1024 avgt 15 1284944.601 ? 74919.522 >> ns/op >> >> >> While `JarFileGetEntry` improves over baseline: >> >> Baseline: >> >> Benchmark (mr) (size) Mode Cnt Score >> Error Units >> JarFileGetEntry.getEntryHit false 1024 avgt 15 68.984 ? >> 2.791 ns/op >> JarFileGetEntry.getEntryHit true 1024 avgt 15 324.763 ? >> 9.690 ns/op >> JarFileGetEntry.getEntryHitUncached false 1024 avgt 15 95.884 ? >> 1.404 ns/op >> JarFileGetEntry.getEntryHitUncached true 1024 avgt 15 333.491 ? >> 3.576 ns/op >> JarFileGetEntry.getEntryMiss false 1024 avgt 15 37.358 ? >> 2.215 ns/op >> JarFileGetEntry.getEntryMiss true 1024 avgt 15 650.467 ? >> 5.116 ns/op >> JarFileGetEntry.getEntryMissUncached false 1024 avgt 15 61.059 ? >> 0.382 ns/op >> JarFileGetEntry.getEntryMissUncached true 1024 avgt 15 678.457 ? >> 4.390 ns/op >> >> >> PR: >> >> >> Benchmark (mr) (size) Mode Cnt Score >> Error Units >> JarFileGetEntry.getEntryHit false 1024 avgt 15 61.333 ? >> 3.665 ns/op >> JarFileGetEntry.getEntryHit true 1024 avgt 15 153.987 ? >> 8.889 ns/op >> JarFileGetEntry.getEntryHitUncached false 1024 avgt 15 90.008 ? ... > > I poked around with an alternative approach that uses `java.util.BitSet`. > This also allows us to use the map we build up as-is, avoids sorting et.c. > Performance on `JarFileGetEntry` (#21489 version) is the same in my testing, > but performance on `ZipFileOpen` with your 5 versions of each entry stress > test should be much improved (might beat the baseline). Can you share your > local changes to that micro in a gist or e-mail so I can verify? With this change to `ZipFileOpen`: diff --git a/test/micro/org/openjdk/bench/java/util/zip/ZipFileOpen.java b/test/micro/org/openjdk/bench/java/util/zip/ZipFileOpen.java index 4f6ae6373ec..0e22cf7f0ab 100644 --- a/test/micro/org/openjdk/bench/java/util/zip/ZipFileOpen.java +++ b/test/micro/org/openjdk/bench/java/util/zip/ZipFileOpen.java @@ -73,6 +73,11 @@ public void beforeRun() throws IOException { ename += entryPrefixes[i % entryPrefixes.length] + "-" + i; zos.putNextEntry(new ZipEntry(ename)); + zos.putNextEntry(new ZipEntry("META-INF/versions/9/" + ename)); + zos.putNextEntry(new ZipEntry("META-INF/versions/11/" + ename)); + zos.putNextEntry(new ZipEntry("META-INF/versions/17/" + ename)); + zos.putNextEntry(new ZipEntry("META-INF/versions/21/" + ename)); + zos.putNextEntry(new ZipEntry("META-INF/versions/24/" + ename)); } } zipFile = tempFile; On that modified `ZipFileOpen` micro, https://github.com/eirbjo/jdk/pull/1 improves over #21489 but still regress compared to the baseline: Benchmark (size) Mode Cnt Score Error Units ZipFileOpen.openCloseZipFile 1024 avgt 15 677386,387 ± 15761,989 ns/op # baseline ZipFileOpen.openCloseZipFile 1024 avgt 15 1121127,629 ± 16716,218 ns/op # pull/21489, 0.6x ZipFileOpen.openCloseZipFile 1024 avgt 15 968501,630 ± 28761,067 ns/op # eirbjo/jdk#1, 0.7x I suspect remaining regression is mostly from the added `String` creation/processing in `initCEN`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/21489#discussion_r1806179709