On Wed, 18 Jan 2023 12:43:31 GMT, fabioromano1 <d...@openjdk.org> wrote:

>> The enanchment is useful for applications that make heavy use of BitSet 
>> objects as sets of integers, and therefore they need to make a lot of calls 
>> to cardinality() method, which actually require linear time in the number of 
>> words in use by the bit set.
>> This optimization reduces the cost of calling cardinality() to constant 
>> time, as it simply returns the value of the field, and it also try to make 
>> as little effort as possible to update the field, when needed.
>> 
>> Moreover, it has been implemented a new method for testing wheter a bit set 
>> includes another bit set (i.e., the set of true bits of the parameter is a 
>> subset of the true bits of the instance).
>
> fabioromano1 has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Added author and reverse the cicle order in includes(BitSet)

As one dealing with bitsets very often: In my opinion, adding this to bitset 
adds too much overhead on the hot methods like set/clear methods. Especially 
loops that populate a BitSet with values, I am not sure if the whole loop can 
be vectorized by Hotspot after this patch.

>From my experience with Apache Lucene it has been shown, that you use bitsets 
>often to track existence/not existence in very hot loops. Calling 
>cardinality() is done only after you have build the bitset and before you use 
>it in a read-only way. 

I know there are other usages patterns, but also code using BitSet as a compact 
`Set<Integer>` does not need cardinality information.

I also have a comment about the code: There is an assert-only method called 
`checkInvariants()` that is executed after all bulk operations. This one should 
also check the cardinality fits the current words. This is especially important 
after those crazy methods that intersect bitsets and have special handling of 
first/last word.

-------------

PR: https://git.openjdk.org/jdk/pull/11837

Reply via email to