Hi all,

I need to parse feeds and post the data to SOLR. I want special
(non-ASCII Unicode) characters to be posted as numeric character references.

For example:
’ --> &#8217; (for which the HTML equivalent is &rsquo;)
I used BeautifulSoup, which seems to allow conversion from "&#xxxx;"
(numeric values) to Unicode characters, as follows:

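from BeautifulSoup import BeautifulStoneSoup

# strdesc holds the raw description text taken from the feed
hdes = str(BeautifulStoneSoup(strdesc,
           convertEntities=BeautifulStoneSoup.HTML_ENTITIES))
xdesc = str(BeautifulStoneSoup(hdes,
            convertEntities=BeautifulStoneSoup.XML_ENTITIES))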

But I want the reverse: the numeric representation of Unicode characters.
I also want to convert HTML named entities like &rsquo; to their numeric
equivalent, &#8217;.
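
To make the desired result concrete, the conversion I am after could be
sketched roughly as below. This is only an illustration and not a tested
solution for SOLR: it assumes Python 3 and uses the standard library
(html.unescape, xml.sax.saxutils.escape and the "xmlcharrefreplace" codec
error handler) instead of BeautifulSoup. The function name to_numeric_refs is
made up for this example.

```python
import html
from xml.sax.saxutils import escape


def to_numeric_refs(text):
    """Return ASCII-only text with non-ASCII characters written as &#NNNN;.

    Hypothetical helper for illustration; the name is not from any library.
    """
    # Decode named and numeric references (e.g. &rsquo;, &#8217;) to characters.
    decoded = html.unescape(text)
    # Re-escape &, < and > so the result can still be embedded in XML.
    escaped = escape(decoded)
    # Replace every remaining non-ASCII character with a decimal reference.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    print(to_numeric_refs("It&rsquo;s \u2019quoted\u2019 &amp; done"))
```

With this sketch, both a literal ’ and the named entity &rsquo; should come
out as &#8217;, while &amp; stays escaped.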

Thanks in advance.

*Note:*
The reason for this requirement is that I need a standard way to post to
SOLR to avoid errors.
-- 
Yours,
S.Selvam
--
http://mail.python.org/mailman/listinfo/python-list
