Hi,

I'm using StandardAnalyzer during indexing and I have noticed that it splits hyphenated words in two, dropping the hyphen. This is messing up some of my search results. I would like to keep using StandardAnalyzer because it's very good on the whole; however, I would like to add an extra term in these cases. I am fine doing everything except figuring out when StandardTokenizer has split a hyphenated word: all I get are the individual tokens, each of type ALPHANUM. Can anyone think of a way I can do this without having to dive into StandardTokenizer?

I have looked at the source for StandardTokenizer and I really really really don't want to have to go there :/
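
To make the goal concrete, here is a rough sketch of the kind of check I have in mind. It is not real Lucene code: Tok is a hypothetical stand-in for a token, and I am assuming each token carries start/end offsets into the original text and that I still have that text around. If two neighbouring tokens are separated by exactly one character and that character is a hyphen, the joined word is emitted as the extra term.

```java
import java.util.ArrayList;
import java.util.List;

public class HyphenJoinSketch {

    /** Hypothetical stand-in for an analyzer token: term text plus offsets into the original text. */
    static final class Tok {
        final String term;
        final int start; // inclusive offset of the first character
        final int end;   // exclusive offset, one past the last character

        Tok(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Returns the hyphenated words that appear to have been split:
     * adjacent tokens whose gap in the original text is a single '-'.
     */
    static List<String> extraTerms(String text, List<Tok> tokens) {
        List<String> extras = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            Tok a = tokens.get(i);
            Tok b = tokens.get(i + 1);
            boolean singleCharGap = (b.start - a.end) == 1;
            if (singleCharGap && text.charAt(a.end) == '-') {
                // Taken from the original text, so the original case is kept.
                extras.add(text.substring(a.start, b.end));
            }
        }
        return extras;
    }

    public static void main(String[] args) {
        String text = "send e-mail today";
        List<Tok> tokens = new ArrayList<>();
        tokens.add(new Tok("send", 0, 4));
        tokens.add(new Tok("e", 5, 6));
        tokens.add(new Tok("mail", 7, 11));
        tokens.add(new Tok("today", 12, 17));
        System.out.println(extraTerms(text, tokens)); // prints [e-mail]
    }
}
```

This only joins pairs, so a word with two hyphens would yield two overlapping extra terms rather than one, and it depends on the offsets being trustworthy.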

Cheers
Rob
