[EMAIL PROTECTED] wrote:
> Hi, I've found lots of material on the net about Unicode/HTML
> conversions, but I'm still having many problems converting Unicode
> characters to HTML entities. Is there any available function to solve
> this issue?
> As an example I would like to do this kind of conversion:
> \uc3B4 => &ocirc;
> for all available HTML entities.
I don't know how you generate your HTML, but ElementTree and lxml both have
good HTML parsers, so you can parse the document and let them serialise the
result with a "US-ASCII" encoding. They will then write numeric character
references for everything that's not ASCII.
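>>> from lxml import etree
>>> # my_html_data: a string holding your HTML document
>>> root = etree.HTML(my_html_data)
>>> # current lxml releases take the encoding as a keyword argument
>>> html_7_bit = etree.tostring(root, encoding="us-ascii")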
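
If you only have a plain text string rather than a whole HTML document, the
standard library alone may be enough. The following is a minimal sketch,
assuming Python 3. The "xmlcharrefreplace" error handler and the
html.entities.codepoint2name table are both part of the standard library;
to_named_entities is a hypothetical helper name made up for this example.

```python
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace each non-ASCII character with a named HTML entity if one
    exists, otherwise with a decimal numeric character reference.

    Hypothetical helper for illustration; not part of any library.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII passes through unchanged (including &, < and >).
            out.append(ch)
        elif cp in codepoint2name:
            out.append("&%s;" % codepoint2name[cp])
        else:
            out.append("&#%d;" % cp)
    return "".join(out)


# Numeric references only, using the built-in codec error handler.
numeric = "h\u00f4tel".encode("ascii", "xmlcharrefreplace").decode("ascii")

# Named entities where the HTML 4 table has one.
named = to_named_entities("h\u00f4tel")

print(numeric)
print(named)
```

Note that neither variant escapes "&", "<" or ">", so if the input is text
rather than markup you would probably want to run it through html.escape()
first.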
Stefan