Extending WhitespaceTokenizer would be a good place to start. Then
create a new Analyzer that exactly mirrors your current Analyzer, except
that it uses your new tokenizer instead of WhitespaceTokenizer. (This
assumes your current Analyzer already uses WhitespaceTokenizer, which it
likely does.)
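
For what it's worth, here is a rough sketch of the splitting rule itself
in plain Java. It deliberately does not touch the Lucene API, and the
names PunctuationSplitter and tokenize are made up for illustration. The
same per-character logic would have to live inside your tokenizer
subclass. If I remember right, WhitespaceTokenizer only decides which
characters belong to a token, so changing that test alone would probably
drop ? and ! rather than emit them as tokens.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of the splitting rule only; it does not use Lucene.
// PunctuationSplitter and tokenize are hypothetical names.
final class PunctuationSplitter {

    // Characters that should become tokens of their own.
    private static boolean isStandalone(char c) {
        return c == '?' || c == '!';
    }

    // Splits on whitespace and emits each ? or ! as a separate token.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace ends the current token and is discarded.
                flush(current, tokens);
            } else if (isStandalone(c)) {
                // Punctuation ends the current token and is kept as its own token.
                flush(current, tokens);
                tokens.add(String.valueOf(c));
            } else {
                current.append(c);
            }
        }
        flush(current, tokens);
        return tokens;
    }

    // Adds the pending token, if any, to the list and resets the buffer.
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }
}
```

Calling PunctuationSplitter.tokenize("how are you?") gives the four
tokens from your example: how, are, you and ?.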
OBender wrote:
Hi All,
I need to make the ? and ! characters separate tokens, e.g. to split [how
are you?] into 4 tokens: [how], [are], [you] and [?]. What would be the best
way to do this?
Thanks
--
Matthew Hall
Software Engineer
Mouse Genome Informatics
mh...@informatics.jax.org
(207) 288-6012