Re: Search query problem

2010-01-09 Thread Ahmet Arslan
> Is there another stemmer we can use that is perhaps not as aggressive as the Porter Stemmer? "KStem is an alternative to Porter for developers looking for a less aggressive stemmer. It was written by Bob Krovetz, ported to Lucene by Sergio Guzman-Lara (UMASS Amherst)." [1] [1]http://wiki.ap

Re: Search query problem

2010-01-09 Thread Shashi Kant
Couldn't you just modify the PorterStemmer class for your requirements? (We did, and gave it a list of words and phrases to ignore, specific to our needs.) On Sat, Jan 9, 2010 at 4:00 AM, Jamie wrote: > Hi All > > Is there another stemmer we can use that is perhaps not as aggressive as the > Porter Stem
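
A minimal sketch of the ignore-list idea, written in Python rather than Java to keep it short. `toy_stem` is a deliberately crude stand-in for an aggressive stemmer (it is not the Porter algorithm), and `PROTECTED` and `stem_with_protection` are hypothetical names, not Lucene classes.

```python
def toy_stem(token):
    """Crude stand-in for an aggressive stemmer (NOT the Porter algorithm)."""
    t = token.lower()
    if t.endswith("'s"):
        t = t[:-2]  # drop the possessive, as a tokenizer filter might
    for suffix in ("ing", "er", "es", "e", "s"):
        # Strip the first matching suffix, keeping at least 3 characters.
        if t.endswith(suffix) and len(t) - len(suffix) >= 3:
            return t[:-len(suffix)]
    return t


# Hypothetical ignore list: terms that must reach the index unstemmed.
PROTECTED = {"lowe's"}


def stem_with_protection(token, protected=PROTECTED):
    """Return the token lowercased but unstemmed if it is on the ignore list."""
    lowered = token.lower()
    if lowered in protected:
        return lowered
    return toy_stem(token)
```

The same function would have to be applied at both index time and query time, otherwise the protected form in the query would not match what was indexed.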

Re: Search query problem

2010-01-09 Thread Jamie
Hi All Is there another stemmer we can use that is perhaps not as aggressive as the Porter Stemmer? I.e., the stemming could remove "ing" and "er" endings, but should not do something as significant as converting "Lowe's" to "Low". Thanks Jamie Will Murnane wrote: On Fri, Jan 8, 2010 at 16:27, Jamie wrote:

Re: Search query problem

2010-01-08 Thread Will Murnane
On Fri, Jan 8, 2010 at 16:27, Jamie wrote: > Hi Ian / Will > > Thanks. Surely, the Porter Stemmer should not stem proper nouns. I.e., it > could check the capitalization of the first letter of a word and whether or > not the word is the start of a sentence. If so, it could choose not to apply any > ste

Re: Search query problem

2010-01-08 Thread Jamie
Hi Ian / Will Thanks. Surely, the Porter Stemmer should not stem proper nouns. I.e., it could check the capitalization of the first letter of a word and whether or not the word is the start of a sentence. If so, it could choose not to apply any stemming. Or am I completely out of whack? Jamie I
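
A rough sketch of the capitalization heuristic described above, in Python. It assumes the original casing and sentence punctuation are still available when stemming happens, which is often not the case: analyzer chains typically lowercase tokens before the stemmer sees them, and users tend to type queries in lowercase. `stem_unless_proper` and `toy_stem` are hypothetical names, and `toy_stem` is a crude stand-in, not the Porter algorithm.

```python
import re


def toy_stem(token):
    """Crude stand-in for an aggressive stemmer (NOT the Porter algorithm)."""
    t = token.lower()
    if t.endswith("'s"):
        t = t[:-2]
    for suffix in ("ing", "er", "es", "e", "s"):
        if t.endswith(suffix) and len(t) - len(suffix) >= 3:
            return t[:-len(suffix)]
    return t


def stem_unless_proper(text, stem=toy_stem):
    """Stem every word except capitalized words that do not start a sentence."""
    out = []
    sentence_start = True
    for tok in re.findall(r"[A-Za-z']+|[.!?]", text):
        if tok in ".!?":
            sentence_start = True  # the next word begins a new sentence
            continue
        if tok[0].isupper() and not sentence_start:
            out.append(tok)  # looks like a proper noun: leave it untouched
        else:
            out.append(stem(tok))
        sentence_start = False
    return out
```

Note that a proper noun at the start of a sentence is still stemmed here, so the heuristic is incomplete even when casing is preserved.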

Re: Search query problem

2010-01-08 Thread Ian Lea
Looks like PorterStemFilter converts "Lowe's" to "low". Not very surprising. Options include:
. Drop the stemming
. Index stemmed and non-stemmed variants and search both, maybe boosting the non-stemmed variant.
If you really want exact matches only, you may also/instead want untokenized fields
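
The second option can be illustrated with a tiny in-memory index, sketched in Python. This is not Lucene code: `build_index`, `search`, and `exact_boost` are hypothetical names, the scoring is a simple sum rather than Lucene's scoring, and `toy_stem` is a crude stand-in for the Porter stemmer.

```python
from collections import defaultdict


def toy_stem(token):
    """Crude stand-in for an aggressive stemmer (NOT the Porter algorithm)."""
    t = token.lower()
    if t.endswith("'s"):
        t = t[:-2]
    for suffix in ("ing", "er", "es", "e", "s"):
        if t.endswith(suffix) and len(t) - len(suffix) >= 3:
            return t[:-len(suffix)]
    return t


def build_index(docs):
    """Index every token twice: lowercased as-is, and stemmed."""
    exact, stemmed = defaultdict(set), defaultdict(set)
    for doc_id, text in docs.items():
        for tok in text.split():
            tok = tok.strip(".,!?")
            exact[tok.lower()].add(doc_id)
            stemmed[toy_stem(tok)].add(doc_id)
    return exact, stemmed


def search(term, exact, stemmed, exact_boost=2.0):
    """Search both variants; exact hits earn an extra boost."""
    scores = defaultdict(float)
    for doc_id in exact.get(term.lower(), ()):
        scores[doc_id] += exact_boost
    for doc_id in stemmed.get(toy_stem(term), ()):
        scores[doc_id] += 1.0
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


docs = {1: "Shopping at Lowe's today", 2: "Prices are low", 3: "A lower bid"}
exact, stemmed = build_index(docs)
hits = search("Lowe's", exact, stemmed)
```

With this layout the document containing the literal "Lowe's" ranks first, while the "low" and "lower" documents still match through the stemmed variant. Searching only the `exact` mapping corresponds to the exact-match-only case.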

Re: Search query problem

2010-01-08 Thread Will Murnane
On Fri, Jan 8, 2010 at 15:01, Jamie wrote: > Hi There > > We are trying to search for the exact word "Lowe's" across a large set of > indexed data. Our results include everything with "low" in it. Thus, we are > receiving a much larger data set than we expected. The data is indexed > using the an