On Fri, 18 Oct 2024 15:43:18 GMT, Chen Liang <li...@openjdk.org> wrote:

>> Eirik Bjørsnøs has updated the pull request incrementally with four 
>> additional commits since the last revision:
>> 
>>  - Map versions by entry name hashcode instead of by entry name. This avoids 
>> String allocation and storage
>>  - Merge pull request #1 from cl4es/bitset_versions
>>    
>>    Use BitSet to streamline construction
>>  - Fix traversal, traverse backwards to pick latest applicable version
>>  - Use BitSet to streamline construction
>
> src/java.base/share/classes/java/util/zip/ZipFile.java line 1796:
> 
>> 1794:                                 if (metaVersions == null)
>> 1795:                                     metaVersions = new HashMap<>();
>> 1796:                                 metaVersions.computeIfAbsent(hashCode, 
>> _ -> new BitSet()).set(version);
> 
> Does this hashCode ever collide? Tried reading ZipCoder and can't rule that 
> out. If yes, we need a fallback mechanism if this `computeIfAbsent` 
> encountered an existing entry.

Yes, this is an overapproximation, so a hash collision may result in an extra 
lookup being performed for a versioned entry which does not exist for that 
version. See the comment on the `metaVersions` field declaration.

This is a trade-off against the initial alternative proposed in this PR which 
was to map using the entry name string. That would minimize lookups, but it 
would also require allocating the String name and storing those strings in the 
map. So we decided a few extra lookups on hash collisions was an okay trade-off 
for faster parsing and less memory used.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21489#discussion_r1806726866

Reply via email to