richard wrote:
> Leon wrote:
> > example:
> > s = ' ' ---> &nbsp;
>
> That's technically not HTML encoding, that's replacing a perfectly valid
> space character with a *non-breaking* space character.

How can you tell?

    s = ' '  # non-breaking space
    s = ' '  # normal space
    s = ' '  # em-space

But you might want to do something like:

    import htmlentitydefs

    def escapechar(s):
        # s is a single unicode character; the result is plain ASCII
        n = ord(s)
        if n < 128:
            # ASCII passes through unchanged
            return s.encode('ascii')
        elif n in htmlentitydefs.codepoint2name:
            # named entity, e.g. &nbsp;
            return '&%s;' % htmlentitydefs.codepoint2name[n]
        else:
            # no named entity: fall back to a numeric reference
            return '&#%d;' % n

This requires unicode strings: in a byte string, a unicode encoding such as
UTF-8 can spread one character over several bytes, and ord() only accepts a
single character.

Demonstration:

    >>> escapechar(u'ò')
    '&ograve;'
    >>> escapechar(u'ş')
    '&#351;'
    >>> escapechar(u's')
    's'

yours,
Gerrit Holl.

-- 
Weather in Lulea / Kallax, Sweden 13/12 10:20:
-15.0°C wind 0.9 m/s NNW (34 m above NAP)
-- 
In the councils of government, we must guard against the acquisition of
unwarranted influence, whether sought or unsought, by the
military-industrial complex. The potential for the disastrous rise of
misplaced power exists and will persist.
-Dwight David Eisenhower, January 17, 1961
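
A minimal sketch of the same idea applied to a whole string rather than one character. It assumes Python 3, where the htmlentitydefs module is available as html.entities. The names escape_char and escape_text are hypothetical, chosen only for this illustration. Like escapechar, it passes all ASCII through unchanged, so markup characters such as & and < are not escaped.

```python
from html.entities import codepoint2name  # Python 3 name for htmlentitydefs


def escape_char(ch):
    """Escape one character, following the same three cases as escapechar."""
    n = ord(ch)
    if n < 128:
        # Plain ASCII passes through unchanged (this includes & and <).
        return ch
    if n in codepoint2name:
        # Named entity, e.g. U+00A0 becomes &nbsp;
        return '&%s;' % codepoint2name[n]
    # No named entity known: use a decimal numeric reference.
    return '&#%d;' % n


def escape_text(text):
    """Hypothetical helper: apply escape_char to every character of a string."""
    return ''.join(escape_char(ch) for ch in text)


# The three look-alike spaces, written as escapes so they can be told apart.
for label, ch in [('non-breaking space', '\u00a0'),
                  ('normal space', ' '),
                  ('em-space', '\u2003')]:
    print('%-20s %r' % (label, escape_char(ch)))
```

Run this way, the three spaces that look identical on screen come out as three different strings, which is one way to tell them apart.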