In the Turkish alphabet, the lowercase of I is not i. It is ı (LATIN SMALL 
LETTER DOTLESS I, U+0131). LowerCaseFilter, which uses 
Character.toLowerCase(), gets exactly that one character wrong for Turkish 
text. 

http://java.sun.com/javase/6/docs/api/java/lang/String.html#toLowerCase()
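
To make the difference concrete, here is a minimal sketch that compares the 
locale-insensitive Character.toLowerCase() with the locale-sensitive 
String.toLowerCase(Locale) described at the link above. The class and method 
names are made up for illustration and are not part of Lucene.

```java
import java.util.Locale;

public class TurkishLowerCaseDemo {

    // Locale-insensitive, per-character lowercasing.
    static char lowerDefault(char c) {
        return Character.toLowerCase(c);
    }

    // Locale-sensitive lowercasing with the Turkish locale.
    static String lowerTurkish(String s) {
        return s.toLowerCase(Locale.forLanguageTag("tr"));
    }

    public static void main(String[] args) {
        // Print code points instead of glyphs so the result does not
        // depend on the console encoding.
        System.out.printf("default: U+%04X%n", (int) lowerDefault('I'));
        System.out.printf("turkish: U+%04X%n", (int) lowerTurkish("I").charAt(0));
        // prints:
        // default: U+0069
        // turkish: U+0131
    }
}
```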

I am not sure whether it is worth adding a new TokenFilter for the Turkish 
language, but I see that GreekLowerCaseFilter and RussianLowerCaseFilter 
already exist. It would be nice to have a TurkishLowerCaseFilter in Lucene 
as well.

The wiki recommends asking the Lucene committers for permission before 
opening an issue. If you want, I can provide a patch; it is just a one-line 
change to the original LowerCaseFilter. 
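
Roughly, the change I have in mind looks like the sketch below. It is plain 
Java rather than an actual Lucene TokenFilter, and the class, method, and 
constant names are hypothetical. It assumes the filter lowercases a term 
buffer one char at a time and does not deal with supplementary characters.

```java
public class TurkishLowerCaseSketch {

    // LATIN SMALL LETTER DOTLESS I
    private static final char DOTLESS_I = '\u0131';

    // Lowercases the first `length` chars of `buffer` in place.
    // Hypothetical helper: it only mirrors the per-character loop that a
    // lowercasing filter would run over its term buffer.
    static void lowerCaseInPlace(char[] buffer, int length) {
        for (int i = 0; i < length; i++) {
            char c = buffer[i];
            // The one special case: capital I becomes dotless ı in Turkish.
            // Every other character goes through Character.toLowerCase().
            buffer[i] = (c == 'I') ? DOTLESS_I : Character.toLowerCase(c);
        }
    }

    // Convenience wrapper for trying the loop on a String.
    static String lowerCase(String s) {
        char[] buffer = s.toCharArray();
        lowerCaseInPlace(buffer, buffer.length);
        return new String(buffer);
    }

    public static void main(String[] args) {
        // Print code points so the output is independent of console encoding.
        for (char c : lowerCase("IRMAK").toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

A real filter would apply the same special case inside its own token loop.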

Thank you for your consideration.

Ahmet
