2014-12-14 21:08 GMT+01:00 Benedikt Ritter <brit...@apache.org>:
>
> Hi,
>
> currently the wording in commons text is a bit confusing. We have the
> three terms:
>
> - distance
> - similarity
> - metric
>
> Distance and similarity seem to be just opposites of the same thing. A
> great distance indicates a small similarity between two character
> sequences. Metric feels like it's something more general, but I'm not sure.
>
> I think we should consider renaming everything to distance, since the
> implemented algorithms all end in *Distance. So we would change the package
> name from o.a.c.text.similarity to o.a.c.text.distance and the interface
> from StringMetric to StringDistance.
>

Looking at the code again, it seems like the algorithms all really return a
similarity score and not a distance. For example, the FuzzyDistance Javadoc
states: "A higher score indicates a higher similarity". If this is the case,
maybe it makes more sense to rename everything to Similarity?
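
To make the difference concrete, here is a rough sketch of the two
conventions. This is not the Commons Text code: the class name and
commonPrefixLength are made up for illustration, and levenshtein is just the
textbook edit distance. With a distance, 0 means "identical" and larger
values mean "less alike"; with a similarity score, larger values mean "more
alike".

```java
public final class DistanceVsSimilarity {

    // Textbook Levenshtein edit distance (illustrative, not the library code).
    // 0 means the inputs are identical; larger values mean they are less alike.
    static int levenshtein(CharSequence a, CharSequence b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical similarity score: length of the shared prefix.
    // Larger values mean the inputs are more alike.
    static int commonPrefixLength(CharSequence a, CharSequence b) {
        int max = Math.min(a.length(), b.length());
        int i = 0;
        while (i < max && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        // Distance: identical inputs give the smallest value.
        System.out.println(levenshtein("kitten", "kitten"));
        System.out.println(levenshtein("kitten", "sitting"));
        // Similarity: identical inputs give the largest value.
        System.out.println(commonPrefixLength("kitten", "kitten"));
        System.out.println(commonPrefixLength("kitten", "kitchen"));
    }
}
```

An algorithm whose Javadoc says "higher score means more similar" fits the
second convention, which is why a *Distance name reads oddly for it.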


>
> WDYT?
>
> Benedikt
>
> --
> http://people.apache.org/~britter/
> http://www.systemoutprintln.de/
> http://twitter.com/BenediktRitter
> http://github.com/britter
>


