2014-12-14 21:08 GMT+01:00 Benedikt Ritter <brit...@apache.org>:
>
> Hi,
>
> currently the wording in commons text is a bit confusing. We have the
> three terms:
>
> - distance
> - similarity
> - metric
>
> Distance and similarity seem to be just opposites of the same thing. A
> great distance indicates a small similarity between two character
> sequences. Metric feels like it's something more general, but I'm not sure.
>
> I think we should consider renaming everything to distance, since the
> implemented algorithms all end on *Distance. So we would change the package
> name from o.a.c.text.similarity to o.a.c.text.distance and the interface
> from StringMetric to StringDistance.

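To make the distance/similarity distinction concrete, here is a rough
sketch in plain Java. The class and method names are hypothetical and are
not part of the Commons Text API; the normalization used for the
similarity score is just one possible choice, assumed for illustration.
A distance is 0 for identical inputs and grows as they diverge, while a
similarity score is highest for identical inputs.

```java
// Hypothetical illustration only; none of these names exist in Commons Text.
final class DistanceVsSimilarity {

    /** Edit distance: 0 means identical, larger means more different. */
    static int levenshtein(CharSequence a, CharSequence b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j chars of b costs j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i chars of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity in [0, 1]: 1 means identical, smaller means more different. */
    static double similarity(CharSequence a, CharSequence b) {
        int longest = Math.max(a.length(), b.length());
        // Assumed normalization: divide the distance by the longer length.
        return longest == 0 ? 1.0 : 1.0 - (double) levenshtein(a, b) / longest;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));
        System.out.println(similarity("kitten", "sitting"));
    }
}
```

With this sketch, identical strings give a distance of 0 and a similarity
of 1.0, so the two values move in opposite directions, which is why the
name should match whichever of the two a method actually returns.
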
Looking at the code again, it seems that the algorithms all actually return
a similarity score rather than a distance. For example, the FuzzyDistance
JavaDoc states: "A higher score indicates a higher similarity". If this is
the case, maybe it makes more sense to rename everything to Similarity?

>
> WDYT?
>
> Benedikt
>

-- 
http://people.apache.org/~britter/
http://www.systemoutprintln.de/
http://twitter.com/BenediktRitter
http://github.com/britter