On 10/9/07, Just Another Victim of the Ambient Morality
<[EMAIL PROTECTED]> wrote:
>
> "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
> >
> > Without code, that's hard to determine. But you are aware of e.g.
> >
> > handle_entityref(name)
> > handle_charref(ref)
> >
> > ?
>
>     Actually, I am not aware of these methods but I will certainly look into
> them!
>     I was hoping that the issue would be known or simple before I committed
> to posting code, something that is, to my chagrin, not easily done with my
> news client...

For example, here's something simple/simplistic you can do to handle
character and entity references:

from htmlentitydefs import name2codepoint

...

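    def handle_charref(self, ref):
        # ref is the text between '&#' and ';', e.g. '169' or 'xA9'
        try:
            if ref[:1] in ('x', 'X'):  # hex form; the prefix may be upper case
                char = unichr(int(ref[1:], 16))
            else:
                char = unichr(int(ref))
        except (TypeError, ValueError, OverflowError):
            char = ' '  # malformed or out-of-range reference
        # Do something with char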

    def handle_entityref(self, ref):
        try:
            char = unichr(name2codepoint[ref])
        except (KeyError, ValueError):
            char = ' '
        # Do something with char
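
For anyone reading this on Python 3, the two handlers above can be put into a complete parser roughly as follows. This is only a sketch under a few assumptions: the modules are `html.parser` and `html.entities` rather than the Python 2 names, `chr` replaces `unichr`, and `convert_charrefs=False` is passed so that the parser calls the handlers instead of decoding references itself. The class name `TextCollector` and its `text()` helper are made up for illustration.

```python
from html.entities import name2codepoint
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical subclass: collects text and decodes references by hand."""

    def __init__(self):
        # With convert_charrefs=True (the Python 3 default) the parser
        # decodes references itself and never calls the two handlers below.
        super().__init__(convert_charrefs=False)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

    def handle_charref(self, ref):
        # ref is e.g. '169' or 'xA9' (no leading '&#', no trailing ';')
        try:
            if ref[:1] in ('x', 'X'):
                char = chr(int(ref[1:], 16))
            else:
                char = chr(int(ref))
        except (ValueError, OverflowError):
            char = ' '  # malformed or out-of-range reference
        self.chunks.append(char)

    def handle_entityref(self, ref):
        # ref is the entity name, e.g. 'eacute'
        try:
            char = chr(name2codepoint[ref])
        except (KeyError, ValueError):
            char = ' '  # unknown entity name
        self.chunks.append(char)

    def text(self):
        return ''.join(self.chunks)


parser = TextCollector()
parser.feed('caf&eacute; &#169;&#xA9;[&bogus;]')
parser.close()
print(parser.text())
```

Here "do something with char" simply means appending it to a list next to the ordinary text, so known references come out decoded and unknown ones become a single space.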


A.
-- 
http://mail.python.org/mailman/listinfo/python-list
