Porter is a little outdated I've found KStem much better
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem

You'll still need a good protected word list, but KStem is just a little
nicer

On Fri, Jan 16, 2009 at 6:20 PM, David Woodward <d...@loc.gov> wrote:

> Hi.
>
> Any good protwords.txt out there?
>
> In a fairly standard solr analyzer chain, we use the English Porter
> analyzer like so:
>
> <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
>
> For most purposes the porter does just fine, but occasionally words come
> along that really don't work out to well, e.g.,
>
> "maine" is stemmed to "main" - clearly goofing up precision about "Maine"
> without doing much good for variants of "main".
>
> So - I have an entry for my protwords.txt. What else should go in there?
>
> Thanks for your ideas,
>
> Dave Woodward
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>

Reply via email to