Hi,
I'm using StandardAnalyzer during indexing, and I have noticed that it
splits hyphenated words in two and drops the hyphen. This is messing up
some of my search results. I would like to keep using StandardAnalyzer
because it's very good on the whole, but I would like to add an extra
term in these cases. I am fine doing everything except figuring out when
StandardTokenizer has split a hyphenated word: all I get is the
individual tokens, each with type ALPHANUM. Can anyone think of a way to
do this without having to dive into StandardTokenizer?
I have looked at the source for StandardTokenizer and I really really
really don't want to have to go there :/
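
One direction that might avoid touching StandardTokenizer is to compare
the offsets of adjacent tokens against the original text: if two
consecutive tokens are separated by exactly one character and that
character is a hyphen, they were probably one hyphenated word. The
sketch below is plain Java with no Lucene dependency. The Tok class and
the tokenize() helper are hypothetical stand-ins for the analyzer output
(assuming start and end character offsets are available for each token),
and using the original hyphenated form as the extra term is also an
assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class HyphenJoinSketch {

    /** Hypothetical stand-in for an analyzer token: term text plus character offsets. */
    public static final class Tok {
        final String text;
        final int start; // inclusive offset into the original text
        final int end;   // exclusive offset into the original text

        Tok(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** Crude stand-in tokenizer (NOT StandardTokenizer): splits on any non-alphanumeric character. */
    public static List<Tok> tokenize(String input) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            if (!Character.isLetterOrDigit(input.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) {
                i++;
            }
            out.add(new Tok(input.substring(start, i), start, i));
        }
        return out;
    }

    /** For each adjacent token pair separated by exactly one '-', returns the original hyphenated text. */
    public static List<String> hyphenatedTerms(String input, List<Tok> toks) {
        List<String> extra = new ArrayList<>();
        for (int i = 0; i + 1 < toks.size(); i++) {
            Tok a = toks.get(i);
            Tok b = toks.get(i + 1);
            // Exactly one character between the two tokens, and it is a hyphen.
            if (b.start - a.end == 1 && input.charAt(a.end) == '-') {
                extra.add(input.substring(a.start, b.end));
            }
        }
        return extra;
    }

    public static void main(String[] args) {
        String text = "a well-known e-mail client";
        // prints [well-known, e-mail]
        System.out.println(hyphenatedTerms(text, tokenize(text)));
    }
}
```

Note that this only looks at pairs, so a word with several hyphens would
yield one extra term per adjacent pair rather than the whole word.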
Cheers
Rob