Looking For Tokenizer With Custom Delimeter

2018-01-08 Thread Armins Stepanjans
Hi, I am looking for a tokenizer where I could specify the delimiters by which words are tokenized. For example, if I choose the delimiters ' ' and '_', the string "foo__bar doo" would be tokenized into: "foo", "", "bar", "doo". (The analyzer could further filter empty tokens, since h…
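In Lucene itself, one common approach is to subclass `org.apache.lucene.analysis.util.CharTokenizer` and override `isTokenChar(int)` to return `false` for ' ' and '_' (note that `CharTokenizer` drops zero-length tokens, so the empty token between the two underscores would not be emitted). The splitting behaviour being asked for can be sketched in plain Java; the `tokenize` helper below is hypothetical, not a Lucene API:

```java
import java.util.ArrayList;
import java.util.List;

public class DelimiterTokenizerSketch {
    // Hypothetical helper: splits text on any of the given delimiter chars,
    // keeping empty tokens (e.g. between the two '_' in "foo__bar").
    static List<String> tokenize(String text, char... delimiters) {
        String delims = new String(delimiters);
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (delims.indexOf(c) >= 0) {
                tokens.add(current.toString()); // a delimiter ends the current token
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        tokens.add(current.toString()); // final token after the last delimiter
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("foo__bar doo", ' ', '_'));
        // prints [foo, , bar, doo]
    }
}
```

An analyzer that should discard the empty tokens could simply filter out zero-length strings from this result, which matches what `CharTokenizer` does by default.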

Re: Looking For Tokenizer With Custom Delimeter

2018-01-08 Thread Armins Stepanjans
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -Original Message-
> From: Armins Stepanjans [mailto:armins.bagr...@gmail.com]
> Sent: Monday, January 8, 2018 2:09 PM
> To: java-user@lucene.apache.org

RE: Looking For Tokenizer With Custom Delimeter

2018-01-08 Thread Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -Original Message-
> From: Armins Stepanjans [mailto:armins.bagr...@gmail.com]
> Sent: Monday, January 8, 2018 2:09 PM
> To: java-user@lucene.apache.org
> Subject: Re: Looking For Tokenizer With Custom Deli…

Re: Looking For Tokenizer With Custom Delimeter

2018-01-08 Thread Armins Stepanjans
> > From: Armins Stepanjans [mailto:armins.bagr...@gmail.com]
> > Sent: Monday, January 8, 2018 11:30 AM
> > To: java-user@lucene.apache.org
> > Subject: Looking For Tokenizer With Custom Delimeter
> >
> > Hi,
> >
> > I am looking for a tokenizer, where I could specify a

RE: Looking For Tokenizer With Custom Delimeter

2018-01-08 Thread Uwe Schindler
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -Original Message-
> From: Armins Stepanjans [mailto:armins.bagr...@gmail.com]
> Sent: Monday, January 8, 2018 11:30 AM
> To: java-user@lucene.apache.org
> Subject: Looking For Tokenizer With Custom Delimeter
>
> Hi,
>
> I am looking
