Hi All

Is there another stemmer we can use that is perhaps not as aggressive as the Porter Stemmer? That is, the stemming could remove suffixes such as "ing" and "er", but not do something as significant as converting "Lowe's" to "Low".

Thanks

Jamie

Will Murnane wrote:
On Fri, Jan 8, 2010 at 16:27, Jamie <ja...@stimulussoft.com> wrote:
Hi Ian / Will

Thanks. Surely the Porter Stemmer should not stem proper nouns, i.e. it
could check the capitalization of the first letter of a word and whether or
not the word is the start of a sentence, and on that basis choose not to
apply any stemming. Or am I completely out of whack?
Look again: you're downcasing the terms before the Porter filter ever
sees them (which is, AIUI, necessary).  You might do well to combine
the tokenizing and downcasing step with some heuristic to find proper
nouns and not downcase or stem them.
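
A minimal sketch of that idea, in plain Java with no Lucene dependency. Everything here is an assumption for illustration: the class name ProperNounAwareNormalizer, the lightStem helper (which strips only a trailing "ing" or "er" and is not the Porter algorithm), and the heuristic itself, which treats a capitalized word that does not start a sentence as a proper noun. In Lucene the same logic would presumably live in a custom tokenizer or TokenFilter placed ahead of the stemming filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ProperNounAwareNormalizer {

    // Hypothetical light stemmer: strips only a trailing "ing" or "er",
    // and only when at least three characters would remain.
    public static String lightStem(String term) {
        for (String suffix : new String[] {"ing", "er"}) {
            if (term.endsWith(suffix) && term.length() - suffix.length() >= 3) {
                return term.substring(0, term.length() - suffix.length());
            }
        }
        return term;
    }

    // Splits on whitespace. A capitalized token that is not the first word
    // of a sentence is assumed to be a proper noun and is kept untouched;
    // every other token is downcased and passed to lightStem.
    public static List<String> normalize(String text) {
        List<String> terms = new ArrayList<>();
        boolean sentenceStart = true;
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue;
            }
            // Trim surrounding punctuation but keep inner apostrophes.
            String token = raw.replaceAll("^[^\\p{L}\\p{N}]+|[^\\p{L}\\p{N}]+$", "");
            // A token ending in . ! or ? (plus optional closing quotes or
            // brackets) is taken to end the sentence.
            boolean endsSentence = raw.matches(".*[.!?][\"')\\]]*");
            if (!token.isEmpty()) {
                boolean capitalized = Character.isUpperCase(token.charAt(0));
                if (capitalized && !sentenceStart) {
                    terms.add(token); // assumed proper noun: no downcasing, no stemming
                } else {
                    terms.add(lightStem(token.toLowerCase(Locale.ROOT)));
                }
                sentenceStart = endsSentence;
            } else if (endsSentence) {
                sentenceStart = true;
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(normalize(
            "We went shopping at Lowe's yesterday. Shopping is tiring."));
    }
}
```

With this input "Lowe's" is left as-is, while "shopping" and "Shopping" (at the start of a sentence) are both downcased and stemmed. The heuristic has obvious gaps: a proper noun that opens a sentence is still downcased and stemmed, and abbreviations ending in a period are mistaken for sentence ends.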

Will

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



--
Stimulus Software - MailArchiva
Email Archiving And Compliance
USA Tel: +1-713-343-8824 ext 100
UK Tel: +44-20-80991035 ext 100
Email:  ja...@stimulussoft.com
Web: http://www.mailarchiva.com
To receive MailArchiva Enterprise Edition product announcements, send a message to: 
<mailarchiva-enterprise-edition-subscr...@stimulussoft.com>

