On Wed, 18 Jan 2023 12:43:31 GMT, fabioromano1 <d...@openjdk.org> wrote:
>> The enanchment is useful for applications that make heavy use of BitSet >> objects as sets of integers, and therefore they need to make a lot of calls >> to cardinality() method, which actually require linear time in the number of >> words in use by the bit set. >> This optimization reduces the cost of calling cardinality() to constant >> time, as it simply returns the value of the field, and it also try to make >> as little effort as possible to update the field, when needed. >> >> Moreover, it has been implemented a new method for testing wheter a bit set >> includes another bit set (i.e., the set of true bits of the parameter is a >> subset of the true bits of the instance). > > fabioromano1 has updated the pull request incrementally with one additional > commit since the last revision: > > Added author and reverse the cicle order in includes(BitSet) I'll pile on: This optimization doesn't buy much in today's world, where most machines execute `bitCount` in one cycle. It saves a trivial loop. Over very large bitsets that saves something, but most bitsets are likely to be medium to small (across all use cases). And for large bitsets, if there is a problem with cardinality testing, the user should provide for it separately. Perhaps as a class which contains a bitset, plus the extra cached information. You might also cache the index of the first and last words containing a set bit. There's no end to what one might cache: It's data, and you can derive information from it. That's a job for users, not for the library. Also, as folks have noted, adding a speedup in one place pushes slowdowns in other places. Worst of all, there are new race conditions in the "improved" data structure. I don't recommend this change. If the data structure were immutable, I might think otherwise (as String caches its hashcode), but the race conditions really spoil it for me. ------------- PR: https://git.openjdk.org/jdk/pull/11837