Take a look at the other filters; there are a ton of them. PatternReplaceFilterFactory is one possibility.
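
As a rough illustration of the two points in this thread (punctuation left on tokens by whitespace splitting, and the plus sign surviving URL encoding), here is a small Python sketch. It is not Solr code: it only mimics what a WhitespaceTokenizerFactory followed by a PatternReplaceFilterFactory (configured through its `pattern` and `replacement` attributes) would roughly do. The regex and the function names are assumptions chosen for the example, so adjust the pattern to whatever punctuation you actually need to drop.

```python
import re
from urllib.parse import quote, unquote_plus

# Assumed pattern (illustrative only): strip trailing punctuation, keep "+".
TRAILING_PUNCT = re.compile(r"[.,;:!?]+$")


def whitespace_tokenize(text):
    """Split on whitespace only, the way a whitespace tokenizer does."""
    return text.split()


def strip_trailing_punct(tokens):
    """Mimic a pattern-replace filter: remove trailing punctuation per token."""
    return [TRAILING_PUNCT.sub("", token) for token in tokens]


raw = whitespace_tokenize("My dog has fleas.")
print(raw)                        # ['My', 'dog', 'has', 'fleas.']
print(strip_trailing_punct(raw))  # ['My', 'dog', 'has', 'fleas']

# Plus signs inside a token are untouched by the assumed pattern.
print(strip_trailing_punct(whitespace_tokenize("C++ and Google+ rock!")))
# ['C++', 'and', 'Google+', 'rock']

# URL encoding: a literal "+" has to travel as %2B ...
print(quote("C++", safe=""))      # C%2B%2B
# ... because an unencoded "+" in a query string is decoded as a space.
print(repr(unquote_plus("C++")))  # 'C  '
```

Even when the plus sign arrives intact as %2B, the query parser may still treat it as an operator, so it may additionally need escaping in the query string itself.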

Best,
Erick

On Fri, Aug 4, 2017 at 11:01 AM, [email protected] <[email protected]> wrote:
> Hey,
>
> that is a good point. What is the best way to do the filtering? About the plus
> in the request: we URL-encode the whole request.
>
> Thanks
> David
>
>> On 04.08.2017 at 17:34, Erick Erickson <[email protected]> wrote:
>>
>> Glad to hear it. Two things:
>>
>> 1> you might have to do some additional filtering when using
>> WhitespaceTokenizer. It, well, splits on whitespace so things like
>> punctuation will come through as part of the token. So "My dog has
>> fleas." (note the period after fleas) would have the period included
>> in the token "fleas.".
>>
>> 2> getting the plus sign through URL encoding and the parser may be
>> fun, you may have to escape it to keep it from being interpreted as an
>> operator....
>>
>> Best,
>> Erick
>>
>> On Fri, Aug 4, 2017 at 5:55 AM, [email protected]
>> <[email protected]> wrote:
>>> Hey, thanks.
>>>
>>> Yeah, I found a way.
>>> I used my own fieldtype for these fields. In it I'm using the
>>> WhitespaceTokenizerFactory for query and index, and now everything is as
>>> it should be.
>>>
>>> :-)
>>>
>>> Thanks
>>>
>>> David
>>>
>>> -----Original Message-----
>>> From: Shawn Heisey [mailto:[email protected]]
>>> Sent: Friday, August 4, 2017 14:53
>>> To: [email protected]
>>> Subject: Re: AW: plus sign in request / looking for + in title
>>>
>>>> On 8/4/2017 2:15 AM, [email protected] wrote:
>>>> So how can I prevent e.g. the ST (StandardTokenizer) from removing the plus
>>>> sign? Any suggestions?
>>>
>>> You can't. The standard tokenizer really isn't configurable at all.
>>>
>>> You'd need to change your analysis chain (tokenizer and filters) to produce
>>> the results you want.
>>>
>>> Thanks,
>>> Shawn
