In article <[EMAIL PROTECTED]>, Duncan Booth wrote:
> I guess you've never seen anyone write tests which retrieve some generated
> html and compare it against the expected value. If the page contains any
> unescaped quotes then this change would break it.

You're right - I've never seen anyone do such a thing. It sounds like a
highly dubious and very fragile sort of test to me, of very limited use.

> I'm talking about encoding certain characters as entity references. It
> doesn't matter whether its the character ampersand or right double quote,
> they both want to be converted to entities. Same operation.

This is that muddled thinking I was talking about. They are *not* the
same operation. You want to escape "<", for example, because it must
always be escaped to prevent it being treated as an HTML control
character. This has nothing to do with character encodings.

You might sometimes want to escape "right double quote" because it may
or may not be available in the character encoding you are using to
output to the browser. Yes, this might sometimes seem a bit similar to
the "<" escaping described above, because one of the ways you could
avoid the character encoding issue would be to use numeric entities,
but it is actually a completely separate issue and is none of the
business of cgi.escape.

By your argument, cgi.escape should in fact escape *every single*
character as a numeric entity, and even that wouldn't work properly
since "&", "#", ";" and the digits might not be in their usual
positions in the output encoding.

> Right now the only way the Python library gives me to do the entity
> escaping properly has a side effect of encoding the string. I should
> be able to do the escaping without having to encode the string at
> the same time.

I'm getting lost here - the opposite of what you say above is true.
cgi.escape does the escaping properly (modulo failing to escape quotes)
without encoding.
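
To make the distinction concrete, here is a rough sketch of the two steps kept separate. It assumes a current Python 3, where html.escape stands in for cgi.escape (cgi.escape was removed in Python 3.8); the sample string and the variable names are made up for illustration. The first step is markup escaping and goes from text to text. The second step is character encoding and goes from text to bytes, and only there does the right double quote need any special treatment.

```python
import html

# Sample text (hypothetical): markup-significant characters plus U+201D,
# the right double quotation mark.
text = 'Tom & "Jerry" <b> \u201d'

# Step 1: markup escaping. str in, str out, no character encoding involved.
# quote=True also escapes the ASCII quote characters, the part cgi.escape
# skipped unless asked.
escaped = html.escape(text, quote=True)

# Step 2: character encoding. str in, bytes out. What happens to U+201D
# depends only on the target encoding, not on the escaping step.
as_utf8 = escaped.encode("utf-8")
as_ascii = escaped.encode("ascii", "xmlcharrefreplace")

print(escaped)
print(as_utf8)
print(as_ascii)
```

With UTF-8 the right double quote goes out as ordinary encoded bytes. With ASCII the xmlcharrefreplace error handler turns it into a numeric character reference. In both cases the escaping of "&", "<", ">" and the ASCII quotes is identical, which is the point: the escaping function does not need to know about the output encoding.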