It is my understanding that the StandardAnalyzer will remove underscores,
so "some_word" would be indexed as "some" and "word".
I want to keep the underscores, so I was thinking of switching to an
Analyzer that uses the WhitespaceTokenizer, LowerCaseFilter, and StopFilter.
What other tokenizing magic will I lose by changing away from the
StandardAnalyzer?
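
To make the question concrete, here is a rough sketch of what I expect that
chain to do to a piece of text. It is plain Java and does not call Lucene at
all: the class name, the analyze() method, and the stop list are made up for
illustration, and the real WhitespaceTokenizer, LowerCaseFilter, and
StopFilter may differ in details.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java stand-in for the proposed chain; this is NOT the Lucene API.
class WhitespaceChainSketch {
    // Hypothetical stop list, chosen only for this illustration.
    static final Set<String> STOP_WORDS = Set.of("a", "an", "and", "of", "the");

    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        // Step 1: split on runs of whitespace only, so "_" stays inside a token.
        for (String raw : text.trim().split("\\s+")) {
            // Step 2: lowercase each token.
            String token = raw.toLowerCase(Locale.ROOT);
            // Step 3: drop empty strings and stop words.
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }
}
```

With this sketch, "some_word" comes through as a single token, but so does
anything else that is not separated by whitespace, such as a word with a
trailing comma.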
Thanks,
Dan
--
****************************
Daniel Armbrust
Biomedical Informatics
Mayo Clinic Rochester
daniel.armbrust(at)mayo.edu
http://informatics.mayo.edu/