Hello, I figured out how to solve this. I just added one more entry to the stop types, so that TypeTokenFilter drops email tokens as well: stopTypes.add("<EMAIL>");
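
For anyone wondering why that one line is enough: the tokenizer tags each token with a type string, and the type filter drops every token whose type is in the set. Below is a minimal plain-Java sketch of that idea. It does not use Lucene at all. The class TypeFilterSketch, the Token holder, and the regex rules are hypothetical stand-ins made up for illustration, and they are far cruder than what UAX29URLEmailTokenizer really does. Only the type strings "<EMAIL>", "<URL>" and "<NUM>" come from this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TypeFilterSketch {

    // Hypothetical stand-in for a typed token (Lucene keeps the type in a token attribute).
    public static final class Token {
        final String text;
        final String type;

        Token(String text, String type) {
            this.text = text;
            this.type = type;
        }
    }

    // Crude whitespace tokenizer that assigns a type to each word with simple regexes.
    // Assumption: these patterns only approximate real email/URL detection.
    public static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        for (String word : input.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only for blank input
            }
            String type;
            if (word.matches("[^@\\s]+@[^@\\s]+\\.[A-Za-z]{2,}")) {
                type = "<EMAIL>";
            } else if (word.matches("https?://\\S+")) {
                type = "<URL>";
            } else if (word.matches("\\d+")) {
                type = "<NUM>";
            } else {
                type = "<ALPHANUM>";
            }
            tokens.add(new Token(word, type));
        }
        return tokens;
    }

    // Keeps the text of every token whose type is NOT in stopTypes.
    public static List<String> dropTypes(List<Token> tokens, Set<String> stopTypes) {
        List<String> kept = new ArrayList<>();
        for (Token t : tokens) {
            if (!stopTypes.contains(t.type)) {
                kept.add(t.text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stopTypes = new HashSet<>(Arrays.asList("<URL>", "<NUM>", "<EMAIL>"));
        List<Token> tokens = tokenize("mail bob@example.com or see http://example.com page 42");
        System.out.println(dropTypes(tokens, stopTypes)); // prints [mail, or, see, page]
    }
}
```

Without "<EMAIL>" in the set, bob@example.com would pass through this sketch untouched, which mirrors the behaviour described in the original question.
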

On Wed, Jun 12, 2013 at 8:39 PM, Gucko Gucko <gucko.gu...@googlemail.com> wrote:
> Hello all,
>
> is there a filter I can use to remove emails from a TokenStream?
>
> So far I'm using this to remove numbers and URLs, and I would like to remove
> emails too:
>
> Tokenizer tokenizer = new UAX29URLEmailTokenizer(Version.LUCENE_43,
>         new StringReader(text));
> Set<String> stopTypes = new HashSet<String>();
> stopTypes.add("<URL>");
> stopTypes.add("<NUM>");
> TokenStream stream = new TypeTokenFilter(true, tokenizer, stopTypes);
> stream = new StandardFilter(Version.LUCENE_43, stream);
> stream = new LowerCaseFilter(Version.LUCENE_43, stream);
>
> Thanks a million!
>
> Best