2014/1/17 Julien Aymé <julien.a...@gmail.com>

> More on Benedikt's idea:
>
> <quote>
> What I want to avoid is something like:
>
> LevenshteinDistance algo = new LevenshteinDistance();
> double dist = algo.getDistance(str1, str2);
> </quote>
>
> If the algorithm is stateless, we can provide a public static final
> LevenshteinDistance INSTANCE.
>

Yes, that's a good idea.


> In that case, the code would become:
> double dist = LevenshteinDistance.INSTANCE.getDistance(str1, str2);
>
> This would be OO, while allowing other algorithms to be added later
> (these may or may not be stateless, and the code would go in their own
> classes).
> WDYT?
>

I'd like to keep the implementation classes hidden from user code. I'd
probably do something like:

public interface StringDistance {

   double calculate(String str1, String str2);

}

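final class LevenshteinDistance implements StringDistance {

   public double calculate(String str1, String str2) {
      // actual Levenshtein computation goes here
      throw new UnsupportedOperationException("not implemented in this sketch");
   }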

}

// more implementations like JaroWinkler

public final class StringDistances {

   // public, so that user code can reach it via StringDistances.LEVENSHTEIN
   public static final StringDistance LEVENSHTEIN = new LevenshteinDistance();

   // more implementations like JaroWinkler

}

User code would look like:

double distance = StringDistances.LEVENSHTEIN.calculate(str1, str2);
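
To make that concrete, here is a rough, self-contained sketch of how the pieces could fit together. The Levenshtein body is a plain two-row dynamic-programming version that I'm assuming for illustration, not the code currently in StringUtils, and StringDistancesDemo is just a hypothetical driver. I've dropped the public modifiers on the top-level types only so that everything fits in one file; in a real layout StringDistance and StringDistances would be public and LevenshteinDistance would stay package-private.

```java
// Sketch only: in a real library each top-level type would live in its own
// file, with StringDistance and StringDistances declared public.
interface StringDistance {

    /** Returns a distance score for the two strings; 0 means they are equal. */
    double calculate(String str1, String str2);
}

// Package-private, so user code cannot reference the implementation directly.
final class LevenshteinDistance implements StringDistance {

    @Override
    public double calculate(String str1, String str2) {
        if (str1 == null || str2 == null) {
            throw new IllegalArgumentException("Strings must not be null");
        }
        // Two-row dynamic programming: previous[j] holds the distance between
        // the first i-1 chars of str1 and the first j chars of str2.
        int[] previous = new int[str2.length() + 1];
        int[] current = new int[str2.length() + 1];
        for (int j = 0; j <= str2.length(); j++) {
            previous[j] = j; // j insertions turn "" into a j-char prefix
        }
        for (int i = 1; i <= str1.length(); i++) {
            current[0] = i; // i deletions turn an i-char prefix into ""
            for (int j = 1; j <= str2.length(); j++) {
                int cost = str1.charAt(i - 1) == str2.charAt(j - 1) ? 0 : 1;
                current[j] = Math.min(
                        Math.min(current[j - 1] + 1, previous[j] + 1),
                        previous[j - 1] + cost);
            }
            // Reuse the arrays: the row just filled becomes "previous".
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[str2.length()];
    }
}

final class StringDistances {

    // Stateless, so a single shared instance is enough.
    public static final StringDistance LEVENSHTEIN = new LevenshteinDistance();

    private StringDistances() {
        // constants holder, not meant to be instantiated
    }
}

// Hypothetical driver, only here to show the call site.
class StringDistancesDemo {
    public static void main(String[] args) {
        double distance = StringDistances.LEVENSHTEIN.calculate("kitten", "sitting");
        System.out.println(distance); // prints 3.0
    }
}
```

The caller only ever sees the StringDistance type and the constant, so further algorithms (JaroWinkler, for example) could be added as new constants without touching user code.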

Or am I making this too complicated? I'm not sure...


>
> Julien
>
> 2014/1/17 Benedikt Ritter <brit...@apache.org>:
> > 2014/1/15 Oliver Heger <oliver.he...@oliver-heger.de>
> >
> >>
> >>
> >> On 15.01.2014 15:05, Benedikt Ritter wrote:
> >> > 2014/1/15 Gary Gregory <garydgreg...@gmail.com>
> >> >
> >> >> On Wed, Jan 15, 2014 at 8:06 AM, Benedikt Ritter <brit...@apache.org> wrote:
> >> >>
> >> >>> Hi Gary,
> >> >>>
> >> >>> 2014/1/15 Gary Gregory <garydgreg...@gmail.com>
> >> >>>
> >> >>>> On Wed, Jan 15, 2014 at 7:00 AM, Benedikt Ritter <brit...@apache.org> wrote:
> >> >>>>
> >> >>>>> Hi all,
> >> >>>>>
> >> >>>>> we currently have StringUtils.getLevenshteinDistance. LANG-944 [1] is
> >> >>>>> about introducing a new string algorithm called Jaro Winkler Distance
> >> >>>>> [2]. Since StringUtils already does a lot of things, I'm wondering if it
> >> >>>>> may make sense to introduce a new class that serves as a host for more
> >> >>>>> string algorithms to come. It would look something like:
> >> >>>>>
> >> >>>>> StringAlgorithms.levenshteinDistance(str1, str2);
> >> >>>>> StringAlgorithms.jaroWinklerDistance(str1, str2);
> >> >>>>>
> >> >>>>> We would deprecate StringUtils.getLevenshteinDistance and delegate to
> >> >>>>> the new class. It could be removed from StringUtils in the next major
> >> >>>>> release.
> >> >>>>>
> >> >>>>> Thoughts?
> >> >>>>>
> >> >>>>
> >> >>>> Yuck!
> >> >>>>
> >> >>>> I'd rather have one class per algo, which reminds me that [codec] might
> >> >>>> be a better place for things like this that 'encode' strings into
> >> >>>> something else.
> >> >>>>
> >> >>>
> >> >>> Both methods return a double value modeling some kind of score. They do
> >> >>> not encode. Maybe StringAlgorithms is the wrong name? How about
> >> >>> StringScore or something like that?
> >> >>>
> >> >>
> >> >> Still wrong IMO and not OO. A single class will become another
> >> >> dumping-ground/kitchen-sink like StringUtils. I would not want to see one
> >> >> algo be a one method one liner impl and another algo be a complex 20
> >> >> method job. I guess we could organize algos using nested classes like
> >> >> StringFoo.BarAlgo but that's not ideal. All algo classes in a new pkg is
> >> >> another way to go.
> >> >>
> >> >
> >> > We already have o.a.c.lang3.text, maybe this would fit?
> >> >
> >> > What I want to avoid is something like:
> >> >
> >> > LevenshteinDistance algo = new LevenshteinDistance();
> >> > double dist = algo.getDistance(str1, str2);
> >> >
> >> > If those algorithms don't have a state, it doesn't make sense to force
> >> > creation of an object. I like the idea of internal classes.
> >>
> >> IIUC, both algorithms do the same thing - calculating the difference (or
> >> similarity) of two strings - using different methods.
> >>
> >> So another option would be to extract a common interface
> >> (StringDifferenceMetric?) and provide the algorithms as concrete
> >> implementations.
> >>
> >
> > This is a possible, but very specific (= tied to distance measuring)
> > approach. I think it is a good idea to create very specific utilities
> > instead of generic ones like StringUtils, that can do a variety of things.
> >
> >
> >>
> >> A concrete use case could be a query engine which allows customizing its
> >> string matching algorithm.
> >>
> >
> > Is this really a use case? It sounds very constructed to me. Have you ever
> > thought "I'd like to query on google, but I'd like suggestions to be
> > matched using Levenshtein Distance algorithm"?
> >
> >
> >>
> >> If you want to avoid instantiating algorithm classes with no state, we
> >> could have an enum with constants representing the available algorithms.
> >>
> >
> > I still favor specific methods over an additional parameter.
> >
> >
> >>
> >> Oliver
> >>
> >> >
> >> >
> >> >>
> >> >> Gary
> >> >>
> >> >>
> >> >>>
> >> >>>
> >> >>>>
> >> >>>> Gary
> >> >>>>
> >> >>>>
> >> >>>>> Benedikt
> >> >>>>>
> >> >>>>> [1] https://issues.apache.org/jira/i#browse/LANG-944
> >> >>>>> [2] http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance
> >> >>>>>
> >> >>>>> --
> >> >>>>> http://people.apache.org/~britter/
> >> >>>>> http://www.systemoutprintln.de/
> >> >>>>> http://twitter.com/BenediktRitter
> >> >>>>> http://github.com/britter
> >> >>>>>
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> --
> >> >>>> E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
> >> >>>> Java Persistence with Hibernate, Second Edition<
> >> >>>> http://www.manning.com/bauer3/>
> >> >>>> JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
> >> >>>> Spring Batch in Action <http://www.manning.com/templier/>
> >> >>>> Blog: http://garygregory.wordpress.com
> >> >>>> Home: http://garygregory.com/
> >> >>>> Tweet! http://twitter.com/GaryGregory
> >> >>>>
> >> >>>
> >> >>>
> >> >>>
> >> >>>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> >> For additional commands, e-mail: dev-h...@commons.apache.org
> >>
> >>
> >
> >
>
>
>


-- 
http://people.apache.org/~britter/
http://www.systemoutprintln.de/
http://twitter.com/BenediktRitter
http://github.com/britter
