Googling for "java string similarity" throws up some stuff you might find useful.
-- Ian. On Wed, Sep 3, 2008 at 11:58 PM, Thiago Moreira <[EMAIL PROTECTED]> wrote: > > Well, the similar definition that I'm looking for is the number 2, maybe > the number 3, but to start the number 2 is enough. If you guys think that is > not a Lucene problem what else tool can I use to implement this > requirement?? > > Thanks > ________________________________ > Thiago Moreira > Software Engineer > [EMAIL PROTECTED] > Liferay, Inc. > Enterprise. Open Source. For Life. > > > N. Hira wrote: > > I don't know how much of this is a Lucene problem, but -- as I'm sure you > will inevitably hear from others on the list -- it depends on what your > definition of "similar" is. > > By similar, do you mean: > 1. Identical, except for variations in case (upper/lower) > 2. Allow 1., but also allow prefixes/suffixes (e.g., "FW: " or "... > (summary") > 3. Allow 1., 2. and permit some new terms ... how many? > 4. Allow all of the above and allow some changes to terms using stemming > (E.g., "Google releases Chrome" is similar to "Google announces the release > of its new Chrome web browser") > .... > > I'm sure you see where this is going. So ... how do you define similar? > > Good luck! > > -h > ---------------------------------------------------------------------- > Hira, N.R. > Cognocys, Inc. > > On 03-Sep-2008, at 2:52 PM, Thiago Moreira wrote: > > > Hey all, > > I want to know how much two Strings are similar! The thing is: I'm > processing an email box and I want to group all messages that have the > subject similar, makes sense?? I looked on the documentation but I didn't > find how to accomplish this. It's not necessary add the messages or the > subjects on some kind of index. I'm using 2.3.2 version of Lucene. > > Anyone has some idea? > > Thanks in advance. > -- > Thiago Moreira > Software Engineer > [EMAIL PROTECTED] > Liferay, Inc. > Enterprise. Open Source. For Life. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]