Couldn't you just modify the PorterStemmer class to suit your requirements?
(We did, and gave it a list of ignore words and phrases specific to
our needs.)
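
A rough sketch of the ignore-list idea, in case it helps. This is not our
actual code and it does not touch Lucene's PorterStemmer; the class name
ProtectedWordStemmer and the toyStem method are made up for illustration,
and toyStem is only a stand-in for a real stemmer. The point is just that
words on the ignore list bypass the stemmer entirely.

```java
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

/** Hypothetical wrapper: skips stemming for words on an ignore list. */
final class ProtectedWordStemmer {
    private final Set<String> ignoreWords;       // lower-cased words to leave alone
    private final UnaryOperator<String> stemmer; // any stemming function

    ProtectedWordStemmer(Set<String> ignoreWords, UnaryOperator<String> stemmer) {
        this.ignoreWords = ignoreWords;
        this.stemmer = stemmer;
    }

    /** Lower-cases the token, then stems it unless it is on the ignore list. */
    String stem(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        return ignoreWords.contains(lower) ? lower : stemmer.apply(lower);
    }

    /** Toy stand-in for a real stemmer (assumption: NOT the Porter algorithm). */
    static String toyStem(String word) {
        // Drop a possessive "'s" first.
        String w = word.endsWith("'s") ? word.substring(0, word.length() - 2) : word;
        // Strip "ing" from longer words.
        if (w.endsWith("ing") && w.length() > 5) {
            return w.substring(0, w.length() - 3);
        }
        // Strip a trailing "e" from words longer than three letters.
        if (w.endsWith("e") && w.length() > 3) {
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    public static void main(String[] args) {
        ProtectedWordStemmer s =
            new ProtectedWordStemmer(Set.of("lowe's"), ProtectedWordStemmer::toyStem);
        System.out.println(s.stem("Lowe's"));   // on the ignore list: prints lowe's
        System.out.println(s.stem("Walking"));  // stemmed as usual: prints walk
        System.out.println(toyStem("lowe's"));  // without the list: prints low
    }
}
```

In a real analyzer chain you would pass your actual stemmer in place of
toyStem and load the ignore list from wherever you keep it.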

On Sat, Jan 9, 2010 at 4:00 AM, Jamie <ja...@stimulussoft.com> wrote:
> Hi All
>
> Is there another stemmer we can use that is perhaps not as aggressive as the
> Porter Stemmer? That is, the stemming could remove -ing and -er endings, but
> not do something so significant as to convert "Lowe's" to "Low".
>
> Thanks
>
> Jamie
>
> Will Murnane wrote:
>>
>> On Fri, Jan 8, 2010 at 16:27, Jamie <ja...@stimulussoft.com> wrote:
>>
>>>
>>> Hi Ian / Will
>>>
>>> Thanks. Surely the Porter Stemmer should not stem proper nouns, i.e. it
>>> could check the capitalization of the first letter of a word and whether
>>> or not the word is at the start of a sentence. If so, it could choose not
>>> to apply any stemming. Or am I completely out of whack?
>>>
>>
>> Look again: you're downcasing the terms before the Porter filter ever
>> sees them (which is, AIUI, necessary).  You might do well to combine
>> the tokenizing and downcasing step with some heuristic to find proper
>> nouns and not downcase or stem them.
>>
>> Will
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
>
>
> --
> Stimulus Software - MailArchiva
> Email Archiving And Compliance
> USA Tel: +1-713-343-8824 ext 100
> UK Tel: +44-20-80991035 ext 100
> Email:  ja...@stimulussoft.com
> Web: http://www.mailarchiva.com
> To receive MailArchiva Enterprise Edition product announcements, send a
> message to: <mailarchiva-enterprise-edition-subscr...@stimulussoft.com>
>
>
