Re: Plural Stemming

2005-04-02 Thread markharw00d
>> Stemming doesn't have to produce intelligible words
True, yes this should be fine for general search requirements. However, the code presented does make some attempt to produce intelligible words, e.g. parties=party, unlike the Porter stemmer's parties=parti. Does this make it a "lemmatizer"? This is a f

Re: Plural Stemming

2005-04-02 Thread Andrzej Bialecki
mark harwood wrote: Just ran this method on 4500 words ending in "s" in my index and the results look good, but I'm tempted to remove this line: !word.endsWith("ses") ) With it removed I saw 3 oddities: moses=mose, gases=gase, viruses=viruse, but I got 100+ extra stems that were OK: Stemming d

Re: Plural Stemming

2005-04-02 Thread mark harwood
Just ran this method on 4500 words ending in "s" in my index and the results look good, but I'm tempted to remove this line: !word.endsWith("ses") ) With it removed I saw 3 oddities: moses=mose, gases=gase, viruses=viruse, but I got 100+ extra stems that were OK: accesses=access addresses=ad
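The method being discussed is not reproduced in this archive excerpt, so what follows is only a hypothetical Java sketch of the trade-off described above. The class name `PluralSketch`, the method `stripPlural`, the `skipSes` flag and every suffix rule are assumptions made for illustration; only the `word.endsWith("ses")` guard and the example pairs come from the thread.

```java
// Hypothetical sketch only: NOT the method posted to the list.
final class PluralSketch {
    private PluralSketch() {}

    /**
     * Strips a simple English plural ending.
     * skipSes mirrors the guard from the thread: when true, words ending
     * in "ses" are returned unchanged.
     */
    static String stripPlural(String word, boolean skipSes) {
        // Leave very short words and words not ending in "s" alone.
        if (word.length() < 4 || !word.endsWith("s")) {
            return word;
        }
        // The guard under discussion.
        if (skipSes && word.endsWith("ses")) {
            return word;
        }
        // parties -> party (assumed rule; words like "movies" would be mis-stemmed)
        if (word.endsWith("ies")) {
            return word.substring(0, word.length() - 3) + "y";
        }
        // accesses -> access (assumed rule: drop "es" after a double "s")
        if (word.endsWith("sses")) {
            return word.substring(0, word.length() - 2);
        }
        // access stays access
        if (word.endsWith("ss")) {
            return word;
        }
        // Default: drop the trailing "s" (moses -> mose, gases -> gase).
        return word.substring(0, word.length() - 1);
    }
}
```

Under these assumed rules, `stripPlural("accesses", true)` leaves the word untouched while `stripPlural("accesses", false)` yields `access`. The cost of dropping the guard is that `moses`, `gases` and `viruses` become `mose`, `gase` and `viruse`, which matches the oddities reported above.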

Re: Plural Stemming

2005-04-01 Thread Erik Hatcher
On Apr 1, 2005, at 7:03 PM, Chris Hostetter wrote:
: > > Are there any Lucene extensions that can do simple stemming, i.e. just
: > > for plurals? Or is the only stemming package available Snowball?
LIA has a case study of jGuru which uses a very specific, home grown utility method called "stripE

Re: Plural Stemming

2005-04-01 Thread Chris Hostetter
: > > Are there any Lucene extensions that can do simple stemming, i.e. just
: > > for plurals? Or is the only stemming package available Snowball?
LIA has a case study of jGuru which uses a very specific, home grown utility method called "stripEnglishPlural" ... since it's in the case study chap

Re: Plural Stemming

2005-04-01 Thread Miles Barr
On Fri, 2005-04-01 at 19:24 +0200, Andrzej Bialecki wrote:
> Miles Barr wrote:
> > Are there any Lucene extensions that can do simple stemming, i.e. just
> > for plurals? Or is the only stemming package available Snowball?
>
> For which language? Stemming is always language-specific...
>
> If for

Re: Plural Stemming

2005-04-01 Thread Andrzej Bialecki
Miles Barr wrote:
Are there any Lucene extensions that can do simple stemming, i.e. just for plurals? Or is the only stemming package available Snowball?
For which language? Stemming is always language-specific... If for English, then there is also a built-in PorterStemmer. If you know what you do

Plural Stemming

2005-04-01 Thread Miles Barr
Are there any Lucene extensions that can do simple stemming, i.e. just for plurals? Or is the only stemming package available Snowball? Cheers -- Miles Barr, Runtime Collective Ltd.