Re: Turkic I and re

Oktay Safak Fri, 16 Sep 2011 06:21:19 -0700

Well, I'm a self-taught Turkish Python coder. I was first bitten by this in Python 2.3 and asked the group about it then. You can find the correspondence by googling "unicode bug in turkish characters?" <http://mail.python.org/pipermail/python-list/2003-April/809637.html>. There are a couple of posts with that subject line, but they don't come out in one thread, so check them all.

Anyway, to summarize, Martin v. Löwis answered my question back then and suggested some locale settings, but the problem actually seems to be with Windows. His method worked on Linux, but not on Windows. He said that Python delegates uppercasing of strings to the operating system on Windows, and Windows is broken there. What he said was:

"Python delegates the toupper call to the operating system. It does not,
in itself, include a database of lower/upper case conversions for all
locales.

So there is nothing we can do about that; ask Microsoft."

So, of course, I ended up not asking Microsoft, but using [iİ] in re patterns
and writing a few utility functions (upper_tr(), lower_tr(), title_tr(), etc.)
to use in my own projects. Of course, when foreign-language names and the like
are mixed into a Turkish text, this approach blows up, and I can't see a way to
avoid that when it's the case.
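
For anyone who wants to try the same workaround, here is a rough sketch of what such helpers might look like. It assumes Python 3 (where str is Unicode), not the Python 2.3 setup discussed above. Only the function names come from this post; the bodies, the translation tables and the TR_ISTANBUL pattern are hypothetical illustrations, not the original code.

```python
import re

# Hypothetical translation tables: map the Turkish i-letters explicitly
# before falling back on the generic str.upper()/str.lower().
_TR_TO_UPPER = str.maketrans({"i": "İ", "ı": "I"})
_TR_TO_LOWER = str.maketrans({"I": "ı", "İ": "i"})


def upper_tr(text):
    """Uppercase text using the Turkish i/İ and ı/I pairs."""
    return text.translate(_TR_TO_UPPER).upper()


def lower_tr(text):
    """Lowercase text using the Turkish İ/i and I/ı pairs."""
    return text.translate(_TR_TO_LOWER).lower()


def title_tr(text):
    """Capitalize each space-separated word with the Turkish rules."""
    return " ".join(upper_tr(w[:1]) + lower_tr(w[1:]) for w in text.split(" "))


# The [iİ] trick in re: list both candidate letters explicitly instead of
# relying on case-insensitive matching.
TR_ISTANBUL = re.compile("[iİ]stanbul")
```

Note that these helpers treat every "i" as Turkish, so upper_tr("wiki") gives "WİKİ". That is exactly the mixed-language problem mentioned above.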

I hope this helps,


Oktay Safak



Thomas Rachel wrote:

On 15.09.2011 at 15:16, Alan Plum wrote:

The Turkish 'I' is a peculiarity that will probably haunt us programmers
until hell freezes over.

That's why it would have been nice if the Unicode guys had defined "both Turkish i-s" at separate codepoints.


Then one could have the three pairs
I, i ("normal")
I (other one), ı

and

İ, i (the other one).

But alas, they haven't.


Thomas

-- 
http://mail.python.org/mailman/listinfo/python-list
