On 10/9/07, Just Another Victim of the Ambient Morality <[EMAIL PROTECTED]> wrote:
>
> "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
> >
> > Without code, that's hard to determine. But you are aware of e.g.
> >
> >     handle_entityref(name)
> >     handle_charref(ref)
> >
> > ?
>
> Actually, I am not aware of these methods but I will certainly look into
> them!
> I was hoping that the issue would be known or simple before I committed
> to posting code, something that is, to my chagrin, not easily done with my
> news client...

For example, here's something simple/simplistic you can do to handle
character and entity references:

    from htmlentitydefs import name2codepoint

    ...

    def handle_charref(self, ref):
        # ref is the text between "&#" and ";", e.g. "233" or "xE9"
        try:
            if ref.lower().startswith('x'):
                # Hexadecimal reference; the prefix may be "x" or "X"
                char = unichr(int(ref[1:], 16))
            else:
                char = unichr(int(ref))
        except (TypeError, ValueError, OverflowError):
            # Malformed or out-of-range reference: fall back to a space
            char = ' '
        # Do something with char

    def handle_entityref(self, ref):
        # ref is the entity name between "&" and ";", e.g. "eacute"
        try:
            char = unichr(name2codepoint[ref])
        except (KeyError, ValueError):
            # Unknown entity name: fall back to a space
            char = ' '
        # Do something with char

A.
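
For anyone reading this on Python 3, here is a rough, self-contained sketch of the same idea. It assumes the standard `html.parser` and `html.entities` modules; the class name `TextCollector`, its `chunks` list and its `text()` method are made up for illustration and are not part of any library. Note that Python 3's `HTMLParser` decodes references itself by default, so the two handlers are only called when `convert_charrefs=False` is passed.

```python
from html.entities import name2codepoint
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical example class: gathers text and decodes references by hand."""

    def __init__(self):
        # With convert_charrefs=False the parser calls handle_charref and
        # handle_entityref instead of decoding the references on its own.
        super().__init__(convert_charrefs=False)
        self.chunks = []

    def handle_data(self, data):
        # Plain text between tags and references
        self.chunks.append(data)

    def handle_charref(self, ref):
        # ref is the text between "&#" and ";", e.g. "232" or "xE9"
        try:
            if ref.startswith(('x', 'X')):
                char = chr(int(ref[1:], 16))
            else:
                char = chr(int(ref))
        except (ValueError, OverflowError):
            # Malformed or out-of-range reference: fall back to a space
            char = ' '
        self.chunks.append(char)

    def handle_entityref(self, ref):
        # ref is the entity name between "&" and ";", e.g. "eacute"
        try:
            char = chr(name2codepoint[ref])
        except (KeyError, ValueError):
            # Unknown entity name: fall back to a space
            char = ' '
        self.chunks.append(char)

    def text(self):
        return ''.join(self.chunks)


parser = TextCollector()
parser.feed('<p>caf&eacute; &amp; cr&#232;me &#xE9;&bogus;</p>')
parser.close()
print(parser.text())
```

Appending to a list and joining at the end is just one way to "do something with char"; if all you need is the decoded text, leaving `convert_charrefs` at its default is probably simpler.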