On Fri, 18 Oct 2024 16:02:26 GMT, Eirik Bjørsnøs <eir...@openjdk.org> wrote:
>> src/java.base/share/classes/java/util/zip/ZipFile.java line 1796: >> >>> 1794: if (metaVersions == null) >>> 1795: metaVersions = new HashMap<>(); >>> 1796: >>> metaVersions.computeIfAbsent(hashCode, _ -> new BitSet()).set(version); >> >> Does this hashCode ever collide? Tried reading ZipCoder and can't rule that >> out. If yes, we need a fallback mechanism if this `computeIfAbsent` >> encountered an existing entry. > > Yes it may collide. This is an overapproximation of the version set where a > hash collision just results in an extra lookup being performed for a version > entry which does not exist for that name. See the comment on the > `metaVersions` field declaration. > > This is a trade-off against the initial alternative proposed in this PR which > was to map using the entry name string. That would minimize lookups, but it > would also require allocating the String name and storing those strings in > the map. So we decided a few extra lookups on hash collisions was an okay > trade-off for faster parsing and less memory used. > > In fact, if we a use constant hash code like '1', such that _all_ hash codes > collide, we get the behavior of the current mainline, where a search for any > name will result in searches being performed for each version discovered in > the ZIP. In principle the `metaVersions` map acts as a filter to reduce entry lookups, so it's ok to give false positives but not false negatives. Other alternatives such as mapping from entry length to a `BitSet` would also be semantically fine, would carry less of a footprint and per-versioned-entry-`initCEN` overhead, but might degrade to more false positives as the number of versioned entries in a jar file grows. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/21489#discussion_r1812437330