[issue4610] Unicode case mappings are incorrect

2013-06-24 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: On 24.06.2013 00:52, Alexander Belopolsky wrote: > > Alexander Belopolsky added the comment: > > There has been a relatively recent discussion of case mappings under #12753 > (msg144836). > > I personally agree with Martin: str.upper/lower should remain t

[issue4610] Unicode case mappings are incorrect

2013-06-23 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: It looks like at least the OP issue has been fixed in #12736: >>> 'ß'.upper() 'SS' -- resolution: -> out of date status: open -> closed superseder: -> Request for python casemapping functions to use full not simple casemaps per Unicode's recomm
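The behaviour reported in this comment can be checked directly. A minimal sketch, assuming a Python 3.3 or later interpreter (where str.upper() applies full case mappings); the variable names are illustrative only:

```python
# Assumes Python 3.3+, where str.upper() uses full (not 1-to-1) case mappings.
sharp_s = "\u00df"           # LATIN SMALL LETTER SHARP S
upper = sharp_s.upper()      # the full mapping expands to two characters
print(upper, len(upper))

round_trip = upper.lower()   # lowercasing does not restore the original character
print(round_trip)
```

Note that the result is longer than the input, which is exactly what a strict 1-to-1 mapping cannot express.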

[issue4610] Unicode case mappings are incorrect

2013-06-23 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: There has been a relatively recent discussion of case mappings under #12753 (msg144836). I personally agree with Martin: str.upper/lower should remain the way it is - a simplistic 1-to-1 mapping using UnicodeData.txt fields. More sophisticated case map

[issue4610] Unicode case mappings are incorrect

2010-12-06 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: >> .swapcase() is just ...err... dumb^h^h^h^h questionably useful. > I agree with the rest of you that Python would be better-off > without swapcase(). As long as str.upper/lower are based only on UnicodeData.txt 1-to-1 mappings, existence of str.swapc

[issue4610] Unicode case mappings are incorrect

2009-10-14 Thread Raymond Hettinger
Raymond Hettinger added the comment: > .swapcase() is just ...err... dumb^h^h^h^h questionably useful. FWIW, it appears that the original use case (as an Emacs macro) was to correct blocks of text where touch typists had accidentally left the Caps Lock key turned on: tHE qUICK bROWN fOX jUMPE
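The Caps Lock use case described here is easy to reproduce. A small sketch, assuming Python 3.3+ for the second half, which also illustrates why swapcase() sits awkwardly next to case mappings that are not 1-to-1:

```python
# Repairing text typed with Caps Lock left on.
text = "tHE qUICK bROWN fOX"
fixed = text.swapcase()
print(fixed)

# Assumes Python 3.3+ (full case mappings): swapcase() is not its own inverse
# once a mapping expands one character into two.
once = "\u00df".swapcase()   # sharp s is lowercase, so it is uppercased
twice = once.swapcase()      # the two capitals are lowercased separately
print(once, twice)
```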

[issue4610] Unicode case mappings are incorrect

2009-10-14 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Jeff Senn wrote: > However .capitalize() is a bit weird; and I'm not sure it isn't > incorrectly implemented now: > > It UPPERCASES the first character, rather than TITLECASING, which is > probably wrong in the very few cases where it makes a difference:
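The distinction between uppercasing and titlecasing only shows up for a handful of characters, such as the Latin digraph letters. A brief sketch using U+01C6 as the example (the choice of character is an illustration, not taken from the thread):

```python
import unicodedata

dz = "\u01c6"        # LATIN SMALL LETTER DZ WITH CARON
upper = dz.upper()   # uppercase form: both parts of the digraph capitalised
title = dz.title()   # titlecase form: only the first part capitalised

for label, ch in (("lower", dz), ("upper", upper), ("title", title)):
    print(label, f"U+{ord(ch):04X}", unicodedata.name(ch))
```

For almost every other cased character the two mappings coincide, which is why the difference is easy to overlook.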

[issue4610] Unicode case mappings are incorrect

2009-10-14 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Jeff Senn wrote: > > Jeff Senn added the comment: > > Yikes! I just noticed that u''.title() is really broken! > > It doesn't really pay attention to word breaks -- > only characters that "have case". > Therefore when there are (caseless) > combining

[issue4610] Unicode case mappings are incorrect

2009-10-14 Thread Jeff Senn
Jeff Senn added the comment: Yikes! I just noticed that u''.title() is really broken! It doesn't really pay attention to word breaks -- only characters that "have case". Therefore when there are (caseless) combining characters in a word it's really broken e.g. >>> u'n\u0303on\u0303e'.title
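One way to sidestep the problem described here is to pick word starts by whitespace rather than by whether the previous character has case. A minimal sketch; `title_words` is a hypothetical helper, not part of the standard library, and it deliberately ignores punctuation-based word breaks:

```python
import re

def title_words(text):
    """Hypothetical helper: capitalise the first character of each
    whitespace-delimited word and lowercase the rest, so caseless
    combining marks inside a word never start a new 'word'."""
    def fix(match):
        word = match.group(0)
        return word[:1].upper() + word[1:].lower()
    return re.sub(r"\S+", fix, text)

word = "n\u0303on\u0303e"        # 'n' + combining tilde, twice
print(title_words(word))
print(title_words("hello  world"))
```

A fuller solution would follow the Unicode word-boundary rules, which this sketch does not attempt.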

[issue4610] Unicode case mappings are incorrect

2009-10-14 Thread Jeff Senn
Jeff Senn added the comment: > Feel free to upload it here. I'm fairly skeptical that it is > possible to implement casing "correctly" in a locale-independent > way. Ok. I will try to find time to complete it enough to be readable. Unicode (see sec 3.13) specifies the casing of unicode strings

[issue4610] Unicode case mappings are incorrect

2009-10-13 Thread Martin v. Löwis
Martin v. Löwis added the comment: > I have a half finished implementation of this, in case anyone > is interested. Feel free to upload it here. I'm fairly skeptical that it is possible to implement casing "correctly" in a locale-independent way.

[issue4610] Unicode case mappings are incorrect

2009-10-13 Thread Jeff Senn
Jeff Senn added the comment: Has there been any action on this? A PEP? I disagree that using ICU is a good way to simply get proper unicode casing. (A heavy hammer for a small task...) I agree locales are a different issue (and would prefer optional arguments to the unicode object casing methods

[issue4610] Unicode case mappings are incorrect

2008-12-20 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: On 2008-12-20 17:19, Alex Stapleton wrote: > Alex Stapleton added the comment: > > I am trying to get a PEP together for this. Does anyone have any thoughts > on how to handle comparison between unicode strings in a locale aware > situation? Some though

[issue4610] Unicode case mappings are incorrect

2008-12-20 Thread Martin v. Löwis
Martin v. Löwis added the comment: > I am trying to get a PEP together for this. Does anyone have any thoughts > on how to handle comparison between unicode strings in a locale aware > situation? Implementation-wise, or specification-wise? Implementation-wise, you can either try to use the C
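On the implementation side of locale-aware comparison, one route that already exists in the standard library is the `locale` module, which wraps the platform's collation functions. The sketch below is offered as an illustration of that route and is not necessarily the option this comment goes on to describe; `locale_sorted` is a hypothetical helper, and the "C" locale is used as the default only because it is always available:

```python
import locale

def locale_sorted(words, locale_name="C"):
    """Hypothetical helper: sort words with the platform's collation rules
    for the given locale, restoring the previous setting afterwards."""
    previous = locale.setlocale(locale.LC_COLLATE)   # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # strxfrm turns each string into a key that compares in locale order
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)

print(locale_sorted(["cherry", "Apple", "banana"]))
```

Passing another locale name only works if that locale is installed on the system, and the setting is process-wide, which is one reason an explicit argument to the comparison is attractive.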

[issue4610] Unicode case mappings are incorrect

2008-12-20 Thread Ezio Melotti
Changes by Ezio Melotti: -- nosy: +ezio.melotti

[issue4610] Unicode case mappings are incorrect

2008-12-20 Thread Alex Stapleton
Alex Stapleton added the comment: I am trying to get a PEP together for this. Does anyone have any thoughts on how to handle comparison between unicode strings in a locale aware situation? Should __lt__ and __gt__ be specified as ignoring locale? In which case do we need to add a new method

[issue4610] Unicode case mappings are incorrect

2008-12-10 Thread Alex Stapleton
Alex Stapleton added the comment: I agree with loewis that ICU is probably the best way to get this functionality into Python. lemburg, yes it seems like extending those methods would be required at the very least. We would probably also need to support ICU's collators as w

[issue4610] Unicode case mappings are incorrect

2008-12-10 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: Python uses the Unicode database for the mapping and this only contains 1-1 mappings. The special cases (mostly 1-2 mappings) are not included. It would be nice to have them available as well, but I guess we'd have to write them in code r
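To get a feel for how few special cases are involved, one can scan a range of code points for characters whose uppercase form is longer than one character. A rough sketch, assuming Python 3.3+ (which includes the special cases); the function name and the scanned range are arbitrary choices for illustration:

```python
def expanding_uppercase(start=0x00, stop=0x250):
    """Map each character in [start, stop) whose uppercase form has more
    than one character to that uppercase form (assumes Python 3.3+)."""
    return {
        ch: ch.upper()
        for ch in map(chr, range(start, stop))
        if len(ch.upper()) > 1
    }

expansions = expanding_uppercase()
for ch, up in expansions.items():
    print(f"U+{ord(ch):04X} {ch!r} -> {up!r}")
```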

[issue4610] Unicode case mappings are incorrect

2008-12-09 Thread Martin v. Löwis
Martin v. Löwis added the comment: I have known this problem for years, and decided not to act; I don't consider it an important problem. Implementing it properly is complicated by the fact that some of the case mappings are conditional on the locale. If you consider it impor
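Turkish is the usual example of a locale-conditional mapping: dotted and dotless i are distinct letters, so the language-neutral mapping gives the wrong result. The sketch below illustrates the idea with two hypothetical helpers, `upper_tr` and `lower_tr`; they are not part of Python and cover only the i/I pairs:

```python
def upper_tr(text):
    """Hypothetical helper: uppercase using the Turkish rule i -> U+0130."""
    # Dotless U+0131 already uppercases to plain 'I' under the default mapping.
    return text.replace("i", "\u0130").upper()

def lower_tr(text):
    """Hypothetical helper: lowercase using the Turkish rule I -> U+0131."""
    # Handle dotted capital I first so it becomes a plain 'i'.
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()

print("istanbul".upper())    # default, locale-independent mapping
print(upper_tr("istanbul"))  # Turkish-aware mapping
print(lower_tr("ISTANBUL"))
```

An explicit function (or an optional argument, as suggested elsewhere in the thread) keeps the choice of rules visible at the call site instead of depending on process-wide locale state.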

[issue4610] Unicode case mappings are incorrect

2008-12-09 Thread Alex Stapleton
New submission from Alex Stapleton: Following a discussion on reddit it seems that the unicode case conversion algorithms are not being followed. $ python3.0 Python 3.0rc1 (r30rc1:66499, Oct 10 2008, 02:33:36) [GCC 4.0.1 (Apple Inc. build 5488)] on darwin Type "help", "copyr