Hmmm, would it work for your case to use synonyms? If you set expand=false
and have this in your synonyms file:

quick brown => quickbrown

it might do what you want.

Best
Erick

On Sun, Aug 21, 2011 at 3:53 PM, Xiyang Chen <settingh...@gmail.com> wrote:
> Hi,
>
> I have a dictionary of multi-word phrases and I'd like to analyze documents
> such that anything that appears in the dictionary will be treated as one
> single token.
> For example, if the dictionary contains "brown fox", then the sentence
> The quick brown fox jumps over the lazy dog.
>
> will be tokenized as (with stopwords stripped):
> quick | brown fox | jumps | lazy | dog
>
> What is the best way to achieve this?
>
> Thanks,
> Xiyang
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
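
For what it's worth, here is a minimal plain-Java sketch of the behavior asked for in the question: a greedy longest-match pass that merges any dictionary phrase into a single token. It is only an illustration of the idea, not Lucene's API and not necessarily how a synonym filter works internally. The names PhraseMerger, merge and maxPhraseLen are made up for this example, and it assumes the input has already been lowercased and had stopwords removed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Lucene.
class PhraseMerger {

    // Greedy longest-match: at each position, try the longest candidate
    // phrase first and emit it as one token if the dictionary contains it.
    static List<String> merge(List<String> tokens, Set<String> phrases, int maxPhraseLen) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int matched = 0;
            int longest = Math.min(maxPhraseLen, tokens.size() - i);
            for (int len = longest; len >= 2; len--) {
                String candidate = String.join(" ", tokens.subList(i, i + len));
                if (phrases.contains(candidate)) {
                    out.add(candidate);
                    matched = len;
                    break;
                }
            }
            if (matched == 0) {
                // No dictionary phrase starts here; pass the token through.
                out.add(tokens.get(i));
                matched = 1;
            }
            i += matched;
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> phrases = new HashSet<>(Arrays.asList("brown fox"));
        List<String> tokens = Arrays.asList("quick", "brown", "fox", "jumps", "lazy", "dog");
        // Prints: quick | brown fox | jumps | lazy | dog
        System.out.println(String.join(" | ", merge(tokens, phrases, 2)));
    }
}
```

In a real analysis chain the same logic would presumably live in a token filter (or be delegated to a synonym mapping) so that positions and offsets stay consistent; the sketch ignores both.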