2014/1/17 Julien Aymé <julien.a...@gmail.com>

> More on Benedikt's idea:
>
> <quote>
> What I want to avoid is something like:
>
> LevenshteinDistance algo = new LevenshteinDistance();
> double dist = algo.getDistance(str1, str2);
> </quote>
>
> If the algorithm is stateless, we can provide a public static final
> LevenshteinDistance INSTANCE.
>
Yes, that's a good idea.

> In that case, the code would become:
> double dist = LevenshteinDistance.INSTANCE.getDistance(str1, str2);
>
> This would be OO, while allowing other algorithms to be added later
> (these may or may not be stateless, and the code would go in their own
> classes).
> WDYT?
>

I'd like to keep the implementation classes hidden from user code. I'd
probably do something like:

public interface StringDistance {
    double calculate(String str1, String str2);
}

final class LevenshteinDistance implements StringDistance {
    public double calculate(String str1, String str2) {
        // placeholder: the actual algorithm would go here
        throw new UnsupportedOperationException("not implemented yet");
    }
}

// more implementations like JaroWinkler

public final class StringDistances {
    // public, so that user code can reach the constant
    public static final StringDistance LEVENSHTEIN = new LevenshteinDistance();
    // more implementations like JaroWinkler
}

User code would look like:

double distance = StringDistances.LEVENSHTEIN.calculate(str1, str2);

Or am I making this too complicated? I'm not sure...

>
> Julien
>
> 2014/1/17 Benedikt Ritter <brit...@apache.org>:
> > 2014/1/15 Oliver Heger <oliver.he...@oliver-heger.de>
> >
> >>
> >>
> >> On 15.01.2014 15:05, Benedikt Ritter wrote:
> >> > 2014/1/15 Gary Gregory <garydgreg...@gmail.com>
> >> >
> >> >> On Wed, Jan 15, 2014 at 8:06 AM, Benedikt Ritter <brit...@apache.org>
> >> >> wrote:
> >> >>
> >> >>> Hi Gary,
> >> >>>
> >> >>> 2014/1/15 Gary Gregory <garydgreg...@gmail.com>
> >> >>>
> >> >>>> On Wed, Jan 15, 2014 at 7:00 AM, Benedikt Ritter <brit...@apache.org>
> >> >>>> wrote:
> >> >>>>
> >> >>>>> Hi all,
> >> >>>>>
> >> >>>>> we currently have StringUtils.getLevenshteinDistance. LANG-944 [1] is
> >> >>>>> about introducing a new string algorithm called Jaro Winkler
> >> >>>>> Distance [2]. Since StringUtils already does a lot of things, I'm
> >> >>>>> wondering if it may make sense to introduce a new class that serves
> >> >>>>> as a host for more string algorithms to come.
> >> >>>>> It would look something like:
> >> >>>>>
> >> >>>>> StringAlgorithms.levenshteinDistance(str1, str2);
> >> >>>>> StringAlgorithms.jaroWinklerDistance(str1, str2);
> >> >>>>>
> >> >>>>> We would deprecate StringUtils.getLevenshteinDistance and delegate to
> >> >>>>> the new class. It could be removed from StringUtils in the next major
> >> >>>>> release.
> >> >>>>>
> >> >>>>> Thoughts?
> >> >>>>>
> >> >>>>
> >> >>>> Yuck!
> >> >>>>
> >> >>>> I'd rather have one class per algo, which reminds me that [codec] might
> >> >>>> be a better place for things like this that 'encode' strings into
> >> >>>> something else.
> >> >>>>
> >> >>>
> >> >>> Both methods return a double value modeling some kind of score. They do
> >> >>> not encode. Maybe StringAlgorithms is the wrong name? How about
> >> >>> StringScore or something like that?
> >> >>>
> >> >>
> >> >> Still wrong IMO and not OO. A single class will become another
> >> >> dumping-ground/kitchen-sink like StringUtils. I would not want to see one
> >> >> algo be a one method one liner impl and another algo be a complex 20
> >> >> method job. I guess we could organize algos using nested classes like
> >> >> StringFoo.BarAlgo but that's not ideal. All algo classes in a new pkg is
> >> >> another way to go.
> >> >>
> >> >
> >> > We already have o.a.c.lang3.text, maybe this would fit?
> >> >
> >> > What I want to avoid is something like:
> >> >
> >> > LevenshteinDistance algo = new LevenshteinDistance();
> >> > double dist = algo.getDistance(str1, str2);
> >> >
> >> > If those algorithms don't have a state, it doesn't make sense to force
> >> > creation of an object. I like the idea of internal classes.
> >>
> >> IIUC, both algorithms do the same thing - calculating the difference (or
> >> similarity) of two strings - using different methods.
> >>
> >> So another option would be to extract a common interface
> >> (StringDifferenceMetric?) and provide the algorithms as concrete
> >> implementations.
> >>
> >
> > This is a possible, but very specific (= tied to distance measuring)
> > approach. I think it is a good idea to create very specific utilities
> > instead of generic ones like StringUtils, that can do a variety of things.
> >
> >
> >>
> >> A concrete use case could be a query engine which allows customizing its
> >> string matching algorithm.
> >>
> >
> > Is this really a use case? It sounds very constructed to me. Have you ever
> > thought "I'd like to query on Google, but I'd like suggestions to be
> > matched using Levenshtein Distance algorithm"?
> >
> >
> >>
> >> If you want to avoid instantiating algorithm classes with no state, we
> >> could have an enum with constants representing the available algorithms.
> >>
> >
> > I still favor specific methods over an additional parameter.
> >
> >
> >>
> >> Oliver
> >>
> >> >
> >> >>
> >> >> Gary
> >> >>
> >> >>>
> >> >>>>
> >> >>>> Gary
> >> >>>>
> >> >>>>
> >> >>>>> Benedikt
> >> >>>>>
> >> >>>>> [1] https://issues.apache.org/jira/i#browse/LANG-944
> >> >>>>> [2] http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance
> >> >>>>>
> >> >>>>> --
> >> >>>>> http://people.apache.org/~britter/
> >> >>>>> http://www.systemoutprintln.de/
> >> >>>>> http://twitter.com/BenediktRitter
> >> >>>>> http://github.com/britter
> >> >>>>>
> >> >>>>
> >> >>>> --
> >> >>>> E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
> >> >>>> Java Persistence with Hibernate, Second Edition
> >> >>>> <http://www.manning.com/bauer3/>
> >> >>>> JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
> >> >>>> Spring Batch in Action <http://www.manning.com/templier/>
> >> >>>> Blog: http://garygregory.wordpress.com
> >> >>>> Home: http://garygregory.com/
> >> >>>> Tweet!
> >> >>>> http://twitter.com/GaryGregory
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> >> For additional commands, e-mail: dev-h...@commons.apache.org
> >>

--
http://people.apache.org/~britter/
http://www.systemoutprintln.de/
http://twitter.com/BenediktRitter
http://github.com/britter
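
P.S. To make the proposal above a bit more concrete, here is a minimal, compilable sketch of the StringDistance / StringDistances idea. Everything in it is an assumption for illustration, not existing [lang] code: the Levenshtein body is a plain two-row dynamic-programming version written for this example, the null handling is just one possible choice, and StringDistanceDemo is a hypothetical driver class. The types are package-private only so that the whole sketch fits into a single file; in a real library StringDistance and StringDistances would be public and live in their own files.

```java
// Sketch only: the public-facing contract from the proposal.
interface StringDistance {
    double calculate(String str1, String str2);
}

// Hidden implementation; user code never refers to this class directly.
final class LevenshteinDistance implements StringDistance {
    @Override
    public double calculate(String str1, String str2) {
        // Assumption: reject nulls; the thread does not discuss null handling.
        if (str1 == null || str2 == null) {
            throw new IllegalArgumentException("Strings must not be null");
        }
        // previous[j] holds the distance between the first i-1 chars of str1
        // and the first j chars of str2; current[j] is the row being built.
        int[] previous = new int[str2.length() + 1];
        int[] current = new int[str2.length() + 1];
        for (int j = 0; j <= str2.length(); j++) {
            previous[j] = j; // j insertions turn "" into a prefix of length j
        }
        for (int i = 1; i <= str1.length(); i++) {
            current[0] = i; // i deletions turn a prefix of length i into ""
            for (int j = 1; j <= str2.length(); j++) {
                int cost = str1.charAt(i - 1) == str2.charAt(j - 1) ? 0 : 1;
                int insertion = current[j - 1] + 1;
                int deletion = previous[j] + 1;
                int substitution = previous[j - 1] + cost;
                current[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Reuse the two arrays instead of allocating a new row each time.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[str2.length()];
    }
}

// Holder for the stateless singletons; the only entry point for user code.
final class StringDistances {
    public static final StringDistance LEVENSHTEIN = new LevenshteinDistance();
    // more implementations like JaroWinkler would be added here

    private StringDistances() {
        // no instances
    }
}

// Hypothetical driver, only here to show the call site.
final class StringDistanceDemo {
    public static void main(String[] args) {
        double distance = StringDistances.LEVENSHTEIN.calculate("kitten", "sitting");
        System.out.println(distance); // prints 3.0
    }
}
```

The point of the holder class is that callers depend only on the StringDistance interface and the constant, so the implementation class can stay package-private and be replaced without breaking user code.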