Glad to hear it. Two things:

1> you might have to do some additional filtering when using
WhitespaceTokenizer. It, well, splits on whitespace so things like
punctuation will come through as part of the token. So "My dog has
fleas." (note the period after fleas) would have the period included
in the token "fleas.".

2> getting the plus sign through URL encoding and the query parser may
be fun; you may have to escape it to keep it from being interpreted as
an operator.
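
To make both points concrete, here is a rough sketch in plain Python. It
is not Solr code: whitespace_tokens only imitates splitting on
whitespace, and strip_edge_punctuation, escape_plus and encode_param are
hypothetical helper names made up for this illustration. It assumes a
Lucene-style query parser where a backslash escapes the plus sign, and
it relies on the fact that a bare "+" in a URL query string is decoded
as a space.

```python
from urllib.parse import quote, unquote_plus


def whitespace_tokens(text):
    # Imitates whitespace-only tokenization: punctuation stays attached.
    return text.split()


def strip_edge_punctuation(token):
    # Hypothetical extra filtering step: trim common punctuation from the
    # token edges while leaving '+' untouched.
    return token.strip(".,;:!?\"'()")


def escape_plus(term):
    # Backslash-escape '+' so a Lucene-style parser reads it literally
    # instead of as the "required" operator (assumption: such a parser).
    return term.replace("+", "\\+")


def encode_param(value):
    # Percent-encode everything, so '+' becomes %2B and '\' becomes %5C.
    return quote(value, safe="")


tokens = whitespace_tokens("My dog has fleas.")
print(tokens)                                       # period stays on "fleas."
print([strip_edge_punctuation(t) for t in tokens])  # period trimmed
print(repr(unquote_plus("C++")))                    # unencoded '+' turns into spaces
print(encode_param(escape_plus("C++")))             # escaped, then URL-encoded
```

The order matters in the last line: escape for the parser first, then
URL-encode the whole parameter value, so the backslashes survive the
trip as well.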

Best,
Erick

On Fri, Aug 4, 2017 at 5:55 AM, [email protected]
<[email protected]> wrote:
> Hey, thanks.
>
> Yeah, I found a way.
> I used my own fieldtype for these fields. In it I'm using the
> WhitespaceTokenizerFactory for query and index, and now everything is as it
> should be.
>
> :-)
>
> Thanks
>
> David
>
> -----Original Message-----
> From: Shawn Heisey [mailto:[email protected]]
> Sent: Friday, August 4, 2017 14:53
> To: [email protected]
> Subject: Re: AW: plus sign in request / looking for + in title
>
> On 8/4/2017 2:15 AM, [email protected] wrote:
>> So how can I prevent e.g. the ST (StandardTokenizer) from removing the plus
>> sign? Any suggestions?
>
> You can't.  The standard tokenizer really isn't configurable at all.
>
> You'd need to change your analysis chain (tokenizer and filters) to produce 
> the results you want.
>
> Thanks,
> Shawn
>
