I was going to modify the BloomFilter interface as discussed on this thread.
However before that I have made some changes to the current code to tidy it up. PR is [1]. The changes target the following: 1. Test coverage. Coverage is now close to 100%. The AbstractBloomFilterTest was modified to test merge and cardinality with a filter that was not the filter under test. The HasherBloomFilter cannot achieve 100%. This is because I’ve added assert statements and JaCoCo doesn’t ignore them. The asserts are there because the code can assume that a non-empty StaticHasher contains at least 1 bit index. If we really can assume this then the asserts can be removed. 2. toString I’ve removed BitSetBloomFilter toString(). A very big filter could cause an overflow of a char[] used to represent a string. 3. Improve the Shape class This is a big change, mainly for all the extra javadoc. The input desired probability should be tested for NaN so I added that. All the exception messages are the same in all the constructors and they are documented. Made Shape final. There are computations in there that do not use the getter methods to get the Shape values. So the class hasn’t been designed for extension. If we make it final then the rest of the code can be happy that a Shape will function as expected, i.e. a user cannot provide their own shape. Q. Is this desirable or not? If not then I can change it and we properly update it to be designed for extension. 4. Fix Shape.hashCode Calling Objects.hash(HashFunctionIdentity, …) would not work as the HashFunctionIdentity is an interface with no hashCode implementation. So we have to generate the hash using the same fields that are used to test equality of HashFunctionIdentity objects. This is done in the HashFunctionValidator. Q. Why does the shape precompute the hashCode? A shape is unlikely to be used as a key in a Map and so caching the hash seems like extra work for something that won’t be used. I suggest we drop the cache and compute the hashCode dynamically. Alex [1] https://github.com/apache/commons-collections/pull/140 <https://github.com/apache/commons-collections/pull/140>