RE: Dubious stuff spotted in LowerCaseFilter

2015-10-22 Thread Uwe Schindler
> What is the meaning of "the Unicode Policeman"?
Robert Muir :-)
Uwe
> Thanks,
> Ahmet
>
> On Thursday, October 22, 2015 2:59 PM, Uwe Schindler wrote:
> > Hi,
> > >> Setting aside the fact that Character.toLowerCase is already
> > >> dubious in some locales (e.g. Turkish),
> >

Re: Dubious stuff spotted in LowerCaseFilter

2015-10-22 Thread Ahmet Arslan
Hi Uwe,
What is the meaning of "the Unicode Policeman"?
Thanks,
Ahmet
On Thursday, October 22, 2015 2:59 PM, Uwe Schindler wrote:
Hi,
> >> Setting aside the fact that Character.toLowerCase is already dubious
> >> in some locales (e.g. Turkish),
> >
> > This is not true. Character.toLower

RE: Dubious stuff spotted in LowerCaseFilter

2015-10-22 Thread Uwe Schindler
Hi,
> >> Setting aside the fact that Character.toLowerCase is already dubious
> >> in some locales (e.g. Turkish),
> >
> > This is not true. Character.toLowerCase() works locale-independent.
> > It is only String.toLowerCase that works using default locale.
So you mean the opposite. You wanted t
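The distinction drawn in this message can be checked with a short Java sketch. The class name LowerCaseLocaleDemo, the helper perCodePointLower and the sample word "TITLE" are made up for illustration and do not come from the thread. The sketch assumes only the standard JDK behaviour: Character.toLowerCase(int) takes no locale, while String.toLowerCase(Locale) applies the rules of the locale it is given (the no-argument form uses the default locale).

```java
import java.util.Locale;

public class LowerCaseLocaleDemo {
    // Turkish locale, picked because the thread names Turkish as the problem case.
    static final Locale TURKISH = Locale.forLanguageTag("tr");

    // Hypothetical helper: lower-cases code point by code point.
    // Character.toLowerCase(int) has no locale parameter at all.
    static String perCodePointLower(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        s.codePoints().forEach(cp -> sb.appendCodePoint(Character.toLowerCase(cp)));
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "TITLE";
        // Per-character lowering yields a plain 'i' whatever the JVM locale is.
        System.out.println(perCodePointLower(input).equals("title"));            // true
        // String.toLowerCase with the root locale agrees with it.
        System.out.println(input.toLowerCase(Locale.ROOT).equals("title"));      // true
        // With Turkish rules, 'I' maps to dotless 'ı' (U+0131) instead.
        System.out.println(input.toLowerCase(TURKISH).equals("t\u0131tle"));     // true
    }
}
```

The results are compared against escaped literals rather than printed directly, so the check does not depend on the console encoding.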

Re: Dubious stuff spotted in LowerCaseFilter

2015-10-22 Thread Dawid Weiss
> LowerCaseFilter will not handle that. So whereas it is "safe" for > English hard-coded strings, it isn't safe for all fields you might > index in general. This filter is a "safe" fallback that works identically regardless of the locale you have on your computer (or on the server). This, I believ

Re: Dubious stuff spotted in LowerCaseFilter

2015-10-22 Thread Trejkaz
On Thu, Oct 22, 2015 at 7:05 PM, Uwe Schindler wrote:
> Hi,
>
>> Setting aside the fact that Character.toLowerCase is already dubious in some
>> locales (e.g. Turkish),
>
> This is not true. Character.toLowerCase() works locale-independent.
> It is only String.toLowerCase that works using default

Re: Dubious stuff spotted in LowerCaseFilter

2015-10-22 Thread Dawid Weiss
locale. >> >> Uwe >> >> - >> Uwe Schindler >> H.-H.-Meier-Allee 63, D-28213 Bremen >> http://www.thetaphi.de >> eMail: u...@thetaphi.de >> >> >> > -Original Message- >> > From: Trejkaz [mailto:tr

Re: Dubious stuff spotted in LowerCaseFilter

2015-10-22 Thread Dawid Weiss
er-Allee 63, D-28213 Bremen > http://www.thetaphi.de > eMail: u...@thetaphi.de > > > > -Original Message- > > From: Trejkaz [mailto:trej...@trypticon.org] > > Sent: Thursday, October 22, 2015 7:15 AM > > To: Lucene Users Mailing List > >

RE: Dubious stuff spotted in LowerCaseFilter

2015-10-22 Thread Uwe Schindler
63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Trejkaz [mailto:trej...@trypticon.org] > Sent: Thursday, October 22, 2015 7:15 AM > To: Lucene Users Mailing List > Subject: Dubious stuff spotted in LowerCaseFilter > > Hi all

Dubious stuff spotted in LowerCaseFilter

2015-10-21 Thread Trejkaz
Hi all.
LowerCaseFilter uses CharacterUtils.toLowerCase to perform its work.
The latter method looks like this:
    public final void toLowerCase(final char[] buffer, final int offset, final int limit) {
      assert buffer.length >= limit;
      assert offset <=0 && offset <= buffer.length;
      for (int i = o
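As quoted, the second assertion can only pass when offset is zero or negative, which is easy to confirm in isolation. The sketch below is not the Lucene implementation: the class ToLowerCaseSketch, the helpers quotedCheck and rangeCheck, and the loop body are all assumptions made for illustration, since the quoted method is cut off before its loop. It only contrasts the condition as written with one plausible range check (0 <= offset <= limit <= buffer.length).

```java
public class ToLowerCaseSketch {
    // The second assertion's condition, copied as it appears in the message.
    static boolean quotedCheck(char[] buffer, int offset) {
        return offset <= 0 && offset <= buffer.length;
    }

    // Hypothetical range check; an assumption about what was intended.
    static boolean rangeCheck(char[] buffer, int offset, int limit) {
        return 0 <= offset && offset <= limit && limit <= buffer.length;
    }

    // Illustrative in-place lowering over [offset, limit); NOT the Lucene code.
    // Assumes lower-casing never changes how many chars a code point occupies.
    static void toLowerCase(char[] buffer, int offset, int limit) {
        if (!rangeCheck(buffer, offset, limit)) {
            throw new IndexOutOfBoundsException("offset=" + offset + ", limit=" + limit);
        }
        for (int i = offset; i < limit; ) {
            int cp = Character.codePointAt(buffer, i, limit);
            // toChars writes the lowered code point at i and returns its char count.
            i += Character.toChars(Character.toLowerCase(cp), buffer, i);
        }
    }

    public static void main(String[] args) {
        char[] buf = "xABC".toCharArray();
        System.out.println(quotedCheck(buf, 1));   // false: a valid offset of 1 is rejected
        System.out.println(rangeCheck(buf, 1, 3)); // true
        toLowerCase(buf, 1, 3);
        System.out.println(new String(buf));       // xabC
    }
}
```

With assertions enabled (java -ea), a condition like quotedCheck would trip on any call with a positive offset; with assertions disabled it is never evaluated.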