On Jun 4, 6:31 am, "js " <[EMAIL PROTECTED]> wrote:
> Hi list.
>
> If I'm not mistaken, in python, there's no standard library to convert
> html entities, like &amp; or &gt; into their applicable characters.
>
> htmlentitydefs provides maps that helps this conversion,
> but it's not a function so you have to write your own function
> make use of htmlentitydefs, probably using regex or something.
>
> To me this seemed odd because python is known as
> 'Batteries Included' language.
>
> So my questions are
> 1. Why doesn't python have/need entity encoding/decoding?
> 2. Is there any idiom to do entity encode/decode in python?
>
> Thank you in advance.

I think this is the standard idiom:

>>> import xml.sax.saxutils as saxutils
>>> saxutils.escape("&")
'&amp;'
>>> saxutils.unescape("&gt;")
'>'
>>> saxutils.unescape("A bunch of text with entities: &amp; &gt; &lt;")
'A bunch of text with entities: & > <'

By default only &amp;, &lt; and &gt; are handled. Notice there is an
optional parameter (a dict) that can be used to define additional
entities as well.

Matt

-- 
http://mail.python.org/mailman/listinfo/python-list
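
For what it's worth, here is a minimal sketch of how that optional dict might be used. The two mappings and the helper names decode_entities/encode_entities are made up for illustration and are not part of saxutils. Note that the dict runs in opposite directions for the two functions: unescape() wants entity -> character, escape() wants character -> entity.

```python
from xml.sax.saxutils import escape, unescape

# Illustrative extra mappings (assumptions, not from the original post).
# For unescape(), keys are entity strings and values are their replacements.
EXTRA_UNESCAPE = {"&quot;": '"', "&apos;": "'"}
# For escape(), the mapping runs the other way: character -> entity.
EXTRA_ESCAPE = {'"': "&quot;", "'": "&apos;"}

def decode_entities(text):
    """Hypothetical helper: unescape &amp; &lt; &gt; plus the extras above."""
    return unescape(text, EXTRA_UNESCAPE)

def encode_entities(text):
    """Hypothetical helper: escape & < > plus the extras above."""
    return escape(text, EXTRA_ESCAPE)

print(decode_entities("&quot;Tom &amp; Jerry&quot; &lt;3"))
print(encode_entities('"Tom & Jerry" <3'))
```

To cover more of the named HTML entities, a larger dict could presumably be built from the maps in htmlentitydefs that the original question mentions, but that is left out here.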