Have you considered using Stable Bloom Filters [1]? I think they do what you
want without a lot of the overhead your proposed solution carries. You may also
want to look at Commons-Collections v4.5 [2] (currently a snapshot) for
efficient Bloom filter code. I have a Stable Bloom filter implementation based
on commons-collections somewhere.
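
As a rough illustration of how a Stable Bloom Filter works, here is a minimal
sketch. It is not the commons-collections based implementation mentioned above
and it does not use the commons-collections API. The class name, the
constructor parameters, and the hashing scheme are all assumptions made for
illustration. The idea follows [1]: each cell is a small counter; for every
element, check its k hashed cells, decrement p randomly chosen cells, then set
the k hashed cells to the maximum value, so stale entries decay over time.

```java
import java.util.Random;

/** Illustrative Stable Bloom Filter sketch; every name here is hypothetical. */
final class StableBloomFilterSketch {
    private final byte[] cells;  // one small counter per cell, 0 means "empty"
    private final int max;       // value a cell is set to when an element hits it
    private final int k;         // number of hashed cells per element
    private final int p;         // number of random cells decremented per element
    private final Random random;

    StableBloomFilterSketch(int numCells, int max, int k, int p, long seed) {
        if (numCells <= 0 || max < 1 || max > Byte.MAX_VALUE || k < 1 || p < 0) {
            throw new IllegalArgumentException("invalid filter parameters");
        }
        this.cells = new byte[numCells];
        this.max = max;
        this.k = k;
        this.p = p;
        this.random = new Random(seed);
    }

    /** Returns true if the item was probably seen recently, then records it. */
    boolean addAndCheck(Object item) {
        // Assumed hashing scheme: double hashing derived from hashCode().
        int h1 = mix(item.hashCode());
        int h2 = mix(h1) | 1;

        // Step 1: the item counts as a duplicate only if all k cells are non-zero.
        boolean seen = true;
        int[] idx = new int[k];
        for (int i = 0; i < k; i++) {
            idx[i] = Math.floorMod(h1 + i * h2, cells.length);
            if (cells[idx[i]] == 0) {
                seen = false;
            }
        }

        // Step 2: decay. Decrement p randomly chosen cells, never below zero.
        for (int i = 0; i < p; i++) {
            int j = random.nextInt(cells.length);
            if (cells[j] > 0) {
                cells[j]--;
            }
        }

        // Step 3: set the item's k cells to the maximum value.
        for (int j : idx) {
            cells[j] = (byte) max;
        }
        return seen;
    }

    /** Integer bit mixer used to spread hashCode() values. */
    private static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}
```

Because cells are continually decremented, the filter forgets old elements
instead of filling up, at the cost of allowing false negatives as well as
false positives. Suitable values for the cell count, max, k, and p depend on
the stream; [1] discusses how to choose them.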

[1] Deng, Fan; Rafiei, Davood (2006), "Approximately Detecting Duplicates for
Streaming Data using Stable Bloom Filters", Proceedings of the ACM SIGMOD
Conference, pp. 25–36

[2] https://github.com/apache/commons-collections/tree/master/src/main/java/org/apache/commons/collections4/bloomfilter
