On 08/29/2009 01:43 PM, Vlastimil Brom wrote:
> 2009/8/29 <ru...@yahoo.com>:
>> On 08/28/2009 02:12 AM, "Martin v. Löwis" wrote:
>>
>> So far, it seems not and that unichr/ord
>> is a poster child for "purity beats practicality".
>> --
>> http://mail.python.org/mailman/listinfo/python-list
>>
>
> As Mark Tolonen pointed out earlier in this thread, in Python 3 the
> practicality apparently beat purity in this aspect:
>
> Python 3.1.1 (r311:74483, Aug 17 2009, 17:02:12) [MSC v.1500 32 bit
> (Intel)] on win32
> Type "copyright", "credits" or "license()" for more information.
> >>> goth_urus_1 = '\U0001033f'
> >>> list(goth_urus_1)
> ['\ud800', '\udf3f']
> >>> len(goth_urus_1)
> 2
> >>> ord(goth_urus_1)
> 66367
> >>> goth_urus_2 = chr(66367)
> >>> len(goth_urus_2)
> 2
> >>> import unicodedata
> >>> unicodedata.name(goth_urus_1)
> 'GOTHIC LETTER URUS'
> >>> goth_urus_3 = unicodedata.lookup("GOTHIC LETTER URUS")
> >>> goth_urus_4 = "\N{GOTHIC LETTER URUS}"
> >>> goth_urus_1 == goth_urus_2 == goth_urus_3 == goth_urus_4
> True
> >>>

Yes, that certainly seems like much more sensible behavior.

> As for the behaviour in python 2.x, it's probably good enough, that
> the surrogates aren't prohibited and the eventually needed behaviour
> can be easily added via custom functions.

Yes, I agree that, given that the current behavior is well documented
and, further, is fixed in python 3, it can't be changed. I would
pick a nit though with "can be easily added via custom functions."
I don't think that is a good criterion for rejecting functionality
from the library, because it is not sufficient; there are many
functions in the library that fail that test. I think the
criterion should be more like a ratio:

  (how often needed) / (ease of writing)

[where "ease" is not just the line count but also the obviousness
to someone who is not a python expert yet.]

And I would also dispute that the generalized unichr/ord functions
are "easily" added. When I ran into the TypeError in ord(), I
thought "surrogate pairs" were something used in sex therapy. :-)
It took a lot of reading and research before I was able to write
a generalized ord() function.
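
For readers who hit the same TypeError, here is a minimal sketch of what a generalized ord()/unichr() pair might look like. It is not the code referred to above, and the names wide_ord and wide_chr are made up for illustration. It is written for Python 3 and models the narrow-build behavior by working on explicit surrogate strings such as '\ud800\udf3f'; on Python 3.3 and later the built-in ord() and chr() already handle the full range, so this is only useful for seeing the arithmetic.

```python
# Hypothetical helpers (names are illustrative, not from any library).
# They apply the standard UTF-16 surrogate-pair arithmetic by hand.

def wide_ord(s):
    """Return the code point of one character or one surrogate pair."""
    if len(s) == 1:
        return ord(s)
    if len(s) == 2:
        hi, lo = ord(s[0]), ord(s[1])
        # A valid pair is a high surrogate followed by a low surrogate.
        if 0xD800 <= hi <= 0xDBFF and 0xDC00 <= lo <= 0xDFFF:
            return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)
    raise TypeError("wide_ord() expected a character or a surrogate pair")


def wide_chr(cp):
    """Return a one-unit string, or a surrogate pair for cp > 0xFFFF."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("wide_chr() arg not in range(0x110000)")
    if cp < 0x10000:
        return chr(cp)
    offset = cp - 0x10000
    # Top 10 bits go into the high surrogate, low 10 bits into the low one.
    return chr(0xD800 + (offset >> 10)) + chr(0xDC00 + (offset & 0x3FF))
```

With these definitions, wide_ord('\ud800\udf3f') gives 66367 (GOTHIC LETTER URUS), and wide_chr(66367) gives back the two-unit string '\ud800\udf3f', matching the list() output in the quoted session.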