On Mon, 14 Oct 2024 19:50:45 GMT, Eirik Bjørsnøs <eir...@openjdk.org> wrote:

>> Please review this PR which speeds up `JarFile::getEntry` lookup
>> significantly for multi-release JAR files.
>>
>> The changes in this PR are motivated by the following insights:
>>
>> * `META-INF/versions/` is sparsely populated.
>> * Most entries are not versioned.
>> * The number of unique versions for each versioned entry is small.
>> * Many JAR files are 'accidentally' multi-release; they use the feature to
>> hide `module-info.class` from Java 8.
>>
>> Instead of performing one lookup for every version identified in the JAR,
>> this PR narrows the version search down to only the number of versions found
>> for the entry being looked up, which will most often be zero. This speeds up
>> lookup for non-versioned entries, and provides a more targeted search for
>> versioned entries.
>>
>> An alternative approach could be to normalize the hash code to use the
>> non-versioned name such that versioned and non-versioned names would be
>> resolved in the same lookup. This was quickly abandoned since the code
>> changes were intrusive and mixed too many JAR-specific concerns into
>> `ZipFile`.
>>
>> Testing: The existing `JarFileGetEntry` benchmark is updated to optionally
>> test a multi-release JAR file with one versioned entry for
>> `module-info.class` plus two other versioned class files for two distinct
>> versions. Performance results in [first comment](#issuecomment-2410901754).
>>
>> Running `ZipFileOpen` on a multi-release JAR did not show a significant
>> difference between this PR and mainline.
>>
>> The JAR and ZIP tests are run locally. GHA results green. The `noreg-perf`
>> label is added in JBS.
>
> Eirik Bjørsnøs has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Use Arrays.sort instead of TreeSet

Marking this PR as draft while investigating alternatives as proposed by @cl4es

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21489#issuecomment-2421937705