Jon Ribbens <[EMAIL PROTECTED]> wrote:
>> and will also break unit tests.
>
> Er, so change the unit tests at the same time?

It is generally a principle of Python that new releases maintain backward compatibility. An incompatible change such as the one proposed here would probably break many tests for a large number of people.

If the change were seen as a good thing, then a backwards-compatible change (e.g. introducing a function with a different name) might be considered, but if so it should address the whole issue: the current lack of support for encodings is IMHO a far bigger problem than whether or not a quote mark is escaped.

> Why does it need to? cgi.escape is (or should be) dealing with
> character strings, not byte sequences. I must admit,
> internationalisation is not my forte, so if there's something
> I'm missing here I'd love to hear about it.

If I have a unicode string such as u'\u201d' (right double quote), then I want that encoded in my html as '&#8221;' (or &rdquo;, but the numeric form is better). For many purposes I could just encode it in the encoding to be used for the page, typically latin1 or utf8, but sometimes that isn't possible, e.g. if you don't know the encoding at the point when you produce the string, or if there is no translation for the character in the desired encoding. The character reference will work whatever encoding is used for the page.

There should be a one-stop shop where I can take my unicode text and convert it into something I can safely insert into a generated html page; at present I need to call both cgi.escape and s.encode to get the desired effect.
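
For what it's worth, here is a rough sketch of the kind of one-stop helper I mean. It is written for a current Python 3, where html.escape is the counterpart of cgi.escape; the name escape_for_html is made up for illustration and is not an existing library function. It does the two steps in order: escape the markup characters, then replace anything outside ASCII with a numeric character reference so the result is safe whatever encoding the page ends up using.

```python
import html


def escape_for_html(text, quote=True):
    """Hypothetical helper: make text safe to insert into an HTML page.

    Step 1: escape &, < and > (and quote marks when quote is true).
    Step 2: replace every non-ASCII character with a numeric character
    reference, using the standard 'xmlcharrefreplace' error handler.
    """
    escaped = html.escape(text, quote=quote)
    # Encoding to ASCII with xmlcharrefreplace yields bytes such as
    # b'&#8221;'; decode again so the caller gets a text string back.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(escape_for_html("\u201d"))        # &#8221;
print(escape_for_html('a < b & "c"'))   # a &lt; b &amp; &quot;c&quot;
```

Because the ampersand is escaped before the character references are generated, the references themselves are not escaped a second time.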