Hello.

On Mon, 7 Oct 2019 at 19:42, Claude Warren <cla...@xenei.com> wrote:
>
> As noted earlier, I am preparing a contribution of Bloom Filter classes to
> the collections module.  As part of this submission there are several
> methods that operate on BitSets and are used in Bloom Filter
> manipulation and analysis.  My question is: should these be contributed as
> Bloom Filter specific methods, or would it be better to submit a BitSet
> function library?

What do you mean?
What would be the alternative?  How would usage change (from a
user perspective)?  Would it improve the design (e.g. by increasing
the "separation of concerns")?

Thanks,
Gilles

>
> The methods in question are:
> hammingDistance() = cardinality( A xor B )
> jaccardDistance() = 1 - jaccardSimilarity()
> jaccardSimilarity() = cardinality( A and B ) / cardinality( A or B )
> cosineDistance() = 1 - cosineSimilarity()
> cosineSimilarity() = cardinality( A and B ) / (Sqrt( cardinality( A ) ) *
> Sqrt( cardinality( B )))
> estimatedLog = estimated log2 of the BitSet if considered a large unsigned
> int.
>
> Opinions requested.
>
> Claude
> --
> I like: Like Like - The likeliest place on the web
> <http://like-like.xenei.com>
> LinkedIn: http://www.linkedin.com/in/claudewarren
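
For illustration only, here is a minimal sketch of what the "BitSet function
library" option might look like as static methods on java.util.BitSet. The
class name BitSetFunctions and all method signatures are hypothetical and are
not taken from the proposed contribution. Jaccard similarity is computed with
the standard intersection-over-union definition, and the return values for
empty inputs are assumptions made here. estimatedLog is left out because the
message does not define it precisely.

```java
import java.util.BitSet;

// Hypothetical utility class; name and signatures are illustrative only.
final class BitSetFunctions {
    private BitSetFunctions() {}

    // Number of bits set in (a AND b); inputs are not modified.
    private static int andCardinality(BitSet a, BitSet b) {
        BitSet t = (BitSet) a.clone();
        t.and(b);
        return t.cardinality();
    }

    // Number of bits set in (a OR b); inputs are not modified.
    private static int orCardinality(BitSet a, BitSet b) {
        BitSet t = (BitSet) a.clone();
        t.or(b);
        return t.cardinality();
    }

    // Hamming distance: cardinality(a XOR b).
    static int hammingDistance(BitSet a, BitSet b) {
        BitSet t = (BitSet) a.clone();
        t.xor(b);
        return t.cardinality();
    }

    // Jaccard similarity: cardinality(a AND b) / cardinality(a OR b).
    // Assumption: two empty sets are treated as identical (similarity 1.0).
    static double jaccardSimilarity(BitSet a, BitSet b) {
        int union = orCardinality(a, b);
        return union == 0 ? 1.0 : (double) andCardinality(a, b) / union;
    }

    static double jaccardDistance(BitSet a, BitSet b) {
        return 1.0 - jaccardSimilarity(a, b);
    }

    // Cosine similarity: cardinality(a AND b) / (sqrt(|a|) * sqrt(|b|)).
    // Assumption: returns 0.0 when either set is empty.
    static double cosineSimilarity(BitSet a, BitSet b) {
        int ca = a.cardinality();
        int cb = b.cardinality();
        if (ca == 0 || cb == 0) {
            return 0.0;
        }
        return andCardinality(a, b) / (Math.sqrt(ca) * Math.sqrt(cb));
    }

    static double cosineDistance(BitSet a, BitSet b) {
        return 1.0 - cosineSimilarity(a, b);
    }
}
```

With this shape, a caller would write something like
BitSetFunctions.hammingDistance(x, y) on any two BitSets, whether or not they
come from a Bloom filter. The Bloom Filter specific alternative would expose
the same operations as methods on the filter class itself.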

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
