On Mon, Nov 30, 2009 at 8:08 PM, Robert Muir <rcm...@gmail.com> wrote: >> I am not sure if it is worth to add a new TokenFilter for Turkish language. >> I see there exist GreekLowerCaseFilter and RussianLowerCaseFilter. It would >> be nice to see TurkishLowerCaseFilter in Lucene. >> >> >> > just to clarify, GreekLowerCaseFilter really shouldn't exist either. The > final sigma problem it has (where there are two lowercase forms depending > upon position in word), this is also solved with unicode case folding or > collation. This is a perfect example of how lowercase is the wrong operation > for search. > > and RussianLowerCaseFilter is deprecated now, it does the exact same thing > as LowerCaseFilter. btw. we should fix supplementary chars in there too even if it is deprecated.
> > -- > Robert Muir > rcm...@gmail.com > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org