Re: How to get the un-stemed word

Marvin Humphrey Fri, 08 Jul 2005 14:07:17 -0700


On Jul 8, 2005, at 8:44 AM, mark harwood wrote:

> You can get the unstemmed word by re-analysing the
> (hopefully stored somewhere) text.
> Look at the tokens emitted from the TokenStream and
> when you get to the one that matches the stemmed form
> you can use the token offset info to retrieve the
> unstemmed form from the original text.

Wouldn't that fall down if you had two distinct terms which produce the same string when stemmed?
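
To make the concern concrete, here is a minimal sketch of the re-analysis idea in Python rather than against the real Lucene API. Everything in it is an assumption made for illustration: toy_stem is a hypothetical stand-in for a real stemmer, and analyze only mimics a token stream that reports start and end offsets. It shows that one stemmed form can map back to several different surface forms, so a lookup probably has to collect every match instead of stopping at the first one.

```python
import re


def toy_stem(word):
    """Hypothetical stemmer: lowercase, then strip one of a few suffixes."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text):
    """Yield (stem, start, end) per word, like tokens carrying offsets."""
    for match in re.finditer(r"[A-Za-z]+", text):
        yield toy_stem(match.group()), match.start(), match.end()


def unstemmed_forms(text, stem):
    """Return each distinct original form whose stem equals `stem`."""
    seen = []
    for token_stem, start, end in analyze(text):
        surface = text[start:end]  # slice the stored text by offset
        if token_stem == stem and surface not in seen:
            seen.append(surface)
    return seen


text = "He walked home. She walks daily. Walking helps."
forms = unstemmed_forms(text, "walk")
print(forms)
```

Here the single stem "walk" comes back as three different originals, so the stem alone does not identify which one was meant. Some extra information, such as the position of the hit within the document, would be needed to pick a single form.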


Best,

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


