Feature Requests item #1706460, was opened at 2007-04-24 12:47
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1706460&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Unicode
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: vbr (vlbrom)
Assigned to: Nobody/Anonymous (nobody)
Summary: access to unicodedata (via codepoints or 2-char surrogates)

Initial Comment:
Currently, most functions of the unicodedata module require a unichr - a 
unicode string of length 1 - as a parameter. For most uses that is fine, but 
when working with characters outside the BMP (code points above FFFF) on a 
narrow python build, it would be quite handy to access the properties of 
these characters simply by codepoint or ordinal, since unichr(x) only works 
for x <= FFFF on a narrow build and the other unicode planes are therefore 
inaccessible this way.

I believe the unicode database is already indexed internally by some 
numerical values such as codepoints, or is that not the case?

With this improvement, the whole database would be effectively accessible on 
narrow python builds too, where it isn't possible to pass a one-character 
string for codepoints above FFFF (even if the explicit limitation of unichr 
is bypassed, e.g. by creating a unicode literal u'\Uxxxxxxxx', the resulting 
string consists of a surrogate pair and obviously has length 2).
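
To illustrate why such a string has length 2 on a narrow build, here is a 
minimal sketch of the standard UTF-16 surrogate arithmetic. The function name 
to_surrogate_pair is made up for this example and is not part of unicodedata 
or any other module.

```python
def to_surrogate_pair(codepoint):
    """Split a code point above FFFF into its two UTF-16 surrogates.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = codepoint - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)        # upper 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)       # lower 10 bits -> low surrogate
    return high, low
```

For example, the codepoint 1D11E splits into the pair D834, DD1E, which is 
what a narrow build stores for u'\U0001D11E'.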

Alternatively, the respective functions could also accept a two-character 
string, provided that this sequence can be correctly interpreted as a 
surrogate-pair representation of a valid unicode codepoint.
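
A rough sketch of what accepting either a codepoint or a surrogate pair might 
look like is given below. It is only an emulation under stated assumptions: 
it assumes an interpreter where chr() covers the full range up to 10FFFF 
(Python 3, or unichr on a wide build), and the names codepoint_from_text and 
name_of are hypothetical, not existing unicodedata functions.

```python
import unicodedata


def codepoint_from_text(text):
    """Return the code point for a 1-char string or a 2-char surrogate pair.

    Hypothetical helper for illustration only.
    """
    if len(text) == 1:
        return ord(text)
    if len(text) == 2:
        high, low = ord(text[0]), ord(text[1])
        # A valid pair is a high surrogate followed by a low surrogate.
        if 0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF:
            return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
    raise ValueError("expected one character or a valid surrogate pair")


def name_of(char_or_codepoint):
    """Look up a character name from an integer, a character, or a pair."""
    if isinstance(char_or_codepoint, int):
        codepoint = char_or_codepoint
    else:
        codepoint = codepoint_from_text(char_or_codepoint)
    # Assumes chr() accepts code points above FFFF on this interpreter.
    return unicodedata.name(chr(codepoint))
```

With these helpers, name_of(0x1D11E) and name_of(u'\ud834\udd1e') both 
resolve to the same database entry, without a separate copy of the data.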

Currently such behaviour (e.g. codepoint access) can be emulated with custom 
datasets derived from the unicode database, but I believe that it should be 
possible to access the already present data somehow (also on narrow builds), 
rather than having to duplicate it.



----------------------------------------------------------------------
