>>Stemming doesn't have to produce intelligible words
True, yes, this should be fine for general search requirements.
However, the code presented does make some attempt to produce
intelligible words, e.g. parties=party, unlike the Porter stemmer's parties=parti.
Does this make it a "lemmatizer"?
This is a f
mark harwood wrote:
Just ran this method on 4500 words ending in "s" in my
index and results look good but I'm tempted to remove
this line:
!word.endsWith("ses") )
With it removed I saw 3 oddities moses=mose gases=gase
viruses=viruse but I got 100+ extra stems that were
OK:
Stemming d
Just ran this method on 4500 words ending in "s" in my
index and results look good but I'm tempted to remove
this line:
!word.endsWith("ses") )
With it removed I saw 3 oddities moses=mose gases=gase
viruses=viruse but I got 100+ extra stems that were
OK:
accesses=access
addresses=ad
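
For anyone following along, here is a minimal sketch of the trade-off around the "ses" check. It is not the method under discussion: the class name, the stripPlural signature, the guardSes flag, the minimum length, and the rest of the rule set are all assumptions made for illustration. It only shows how dropping the check turns gases into gase while letting words like houses stem to house.

```java
/** Hypothetical plural stripper for illustration; NOT the code from the thread. */
final class PluralStripper {

    // Assumed cutoff: words shorter than this are returned unchanged.
    private static final int MIN_LENGTH = 4;

    /**
     * Strips a simple English plural ending.
     * guardSes = true keeps the !word.endsWith("ses") check; false removes it.
     */
    static String stripPlural(String word, boolean guardSes) {
        if (word.length() < MIN_LENGTH) {
            return word;
        }
        if (word.endsWith("sses")) {
            // accesses -> access (assumed rule, applied regardless of the flag)
            return word.substring(0, word.length() - 2);
        }
        if (word.endsWith("ies")) {
            // parties -> party
            return word.substring(0, word.length() - 3) + "y";
        }
        if (word.endsWith("s")
                && !word.endsWith("ss")
                && !word.endsWith("us")
                && !word.endsWith("is")
                && !(guardSes && word.endsWith("ses"))) {
            // Plain trailing "s": drop it.
            return word.substring(0, word.length() - 1);
        }
        return word;
    }

    public static void main(String[] args) {
        String[] samples = {"parties", "gases", "moses", "viruses", "houses"};
        for (String w : samples) {
            System.out.println(w + ": guard on=" + stripPlural(w, true)
                    + ", guard off=" + stripPlural(w, false));
        }
    }
}
```

With the guard on, every word ending in "ses" is left alone, so the three oddities disappear but so do the good stems. A small exception list for words such as gases, moses and viruses might be a middle ground, though that is a design choice and not something this sketch implements.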
On Apr 1, 2005, at 7:03 PM, Chris Hostetter wrote:
: > > Are there any Lucene extensions that can do simple stemming, i.e. just
: > > for plurals? Or is the only stemming package available Snowball?
LIA has a case study of jGuru which uses a very specific, home grown
utility method called "stripE
: > > Are there any Lucene extensions that can do simple stemming, i.e. just
: > > for plurals? Or is the only stemming package available Snowball?
LIA has a case study of jGuru which uses a very specific, home grown
utility method called "stripEnglishPlural" ... since it's in the case
study chap
On Fri, 2005-04-01 at 19:24 +0200, Andrzej Bialecki wrote:
> Miles Barr wrote:
> > Are there any Lucene extensions that can do simple stemming, i.e. just
> > for plurals? Or is the only stemming package available Snowball?
>
> For which language? Stemming is always language-specific...
>
> If for
Miles Barr wrote:
Are there any Lucene extensions that can do simple stemming, i.e. just
for plurals? Or is the only stemming package available Snowball?
For which language? Stemming is always language-specific...
If for English, then there is also a built-in PorterStemmer. If you know
what you do
Are there any Lucene extensions that can do simple stemming, i.e. just
for plurals? Or is the only stemming package available Snowball?
Cheers
--
Miles Barr <[EMAIL PROTECTED]>
Runtime Collective Ltd.