Re: Convert from unicode chars to HTML entities

2007-02-08 Thread Roberto Bonvallet
Steven D'Aprano <[EMAIL PROTECTED]> wrote: > I have a string containing Latin-1 characters: > > s = u"© and many more..." > > I want to convert it to HTML entities: > > result => > "&copy; and many more..." [...] > Is there a "batteries included" solution that doesn't involve > reinventing the wheel?

Re: Convert from unicode chars to HTML entities

2007-01-29 Thread Martin v. Löwis
Steven D'Aprano wrote: > A few issues: > > (1) It doesn't seem to be reversible: > '&#169; and many more...'.decode('latin-1') > u'&#169; and many more...' > > What should I do instead? For reverse processing, you need to parse it with an SGML/XML parser. > (2) Are XML entities guaranteed to be t
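For readers on a current Python, here is a minimal sketch of the reverse direction. It is not the SGML/XML-parser route suggested above: it assumes Python 3.4 or later, where the standard library's `html.unescape` resolves named, decimal, and hexadecimal character references. That function did not exist when this thread was written.

```python
import html

# Assumption: Python 3.4+, where html.unescape() is available.
# Each form of the reference decodes to the same copyright sign.
from_decimal = html.unescape("&#169; and many more...")
from_hex = html.unescape("&#xA9; and many more...")
from_named = html.unescape("&copy; and many more...")

print(from_decimal)
```

Plain `.decode('latin-1')` cannot do this, because the character reference is ordinary ASCII text as far as the codec is concerned.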

Re: Convert from unicode chars to HTML entities

2007-01-28 Thread Leif K-Brooks
Steven D'Aprano wrote: > A few issues: > > (1) It doesn't seem to be reversible: > '&#169; and many more...'.decode('latin-1') > u'&#169; and many more...' > > What should I do instead? Unfortunately, there's nothing in the standard library that can do that, as far as I know. You'll have to write y

Re: Convert from unicode chars to HTML entities

2007-01-28 Thread Steven D'Aprano
On Sun, 28 Jan 2007 23:41:19 -0500, Leif K-Brooks wrote: > >>> s = u"© and many more..." > >>> s.encode('ascii', 'xmlcharrefreplace') > '&#169; and many more...' Wow. That's short and to the point. I like it. A few issues: (1) It doesn't seem to be reversible: >>> '&#169; and many more...'.decode('lat
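The `xmlcharrefreplace` approach quoted above still works on Python 3, with one difference worth noting. The sketch below assumes Python 3, where `str.encode` returns `bytes`, so the result is decoded back to `str` before use; the variable names are illustrative only.

```python
# Assumption: Python 3, where text is str and encode() yields bytes.
s = "\u00a9 and many more..."

# Characters outside ASCII are replaced by decimal character references.
escaped = s.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(escaped)
```

Note that this only produces numeric references, never named ones such as `&copy;`, and it leaves ASCII characters like `&` and `<` untouched.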

Re: Convert from unicode chars to HTML entities

2007-01-28 Thread Leif K-Brooks
Steven D'Aprano wrote: > I have a string containing Latin-1 characters: > > s = u"© and many more..." > > I want to convert it to HTML entities: > > result => > "&copy; and many more..." > > Decimal/hex escapes would be acceptable: > "&#169; and many more..." > "&#xA9; and many more..." >>> s = u"© and many

Re: Convert from unicode chars to HTML entities

2007-01-28 Thread Gabriel Genellina
On Mon, 29 Jan 2007 00:05:24 -0300, Steven D'Aprano <[EMAIL PROTECTED]> wrote: > I have a string containing Latin-1 characters: > > s = u"© and many more..." > > I want to convert it to HTML entities: > > result => > "&copy; and many more..." > Module htmlentitydefs contains the tables you're loo
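As a rough illustration of the tables mentioned above: in Python 3 the `htmlentitydefs` module was renamed `html.entities`, and the sketch below assumes that newer name. It exposes the same two dictionaries, `codepoint2name` and `name2codepoint`.

```python
# Assumption: Python 3, where htmlentitydefs is available as html.entities.
from html.entities import codepoint2name, name2codepoint

# Look up the entity name for U+00A9 (the copyright sign), and back again.
copy_name = codepoint2name[0xA9]
copy_codepoint = name2codepoint["copy"]

print(copy_name, copy_codepoint)
```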

Re: Convert from unicode chars to HTML entities

2007-01-28 Thread Adonis Vargas
Adonis Vargas wrote: [...] > > It's *very* ugly, but I'm pretty sure you can make it look prettier. > > import htmlentitydefs as entity > > s = u"© and many more..." > t = "" > for i in s: > if ord(i) in entity.codepoint2name: > name = entity.codepoint2name.get(ord(i)) > entity
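The loop quoted above is cut off in this listing, so the following is not a reconstruction of it. It is a separate sketch of the same idea, assuming Python 3 and the `html.entities` module name; the function `to_named_entities` is a hypothetical helper, and the fallback to decimal references for characters without a named entity is my own addition.

```python
# Assumption: Python 3, where htmlentitydefs is available as html.entities.
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace characters with named HTML entities where one is defined.

    Hypothetical helper: non-ASCII characters without a named entity fall
    back to decimal character references; other ASCII passes through.
    """
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp in codepoint2name:
            # Covers markup characters too, e.g. '&' becomes '&amp;'.
            parts.append("&%s;" % codepoint2name[cp])
        elif cp > 127:
            parts.append("&#%d;" % cp)
        else:
            parts.append(ch)
    return "".join(parts)


result = to_named_entities("\u00a9 and many more...")
print(result)
```

Because the table includes `&`, `<`, `>` and `"`, this helper also escapes markup characters, which may or may not be wanted depending on whether the input already contains HTML.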

Re: Convert from unicode chars to HTML entities

2007-01-28 Thread Adonis Vargas
Steven D'Aprano wrote: > I have a string containing Latin-1 characters: > > s = u"© and many more..." > > I want to convert it to HTML entities: > > result => > "&copy; and many more..." > > Decimal/hex escapes would be acceptable: > "&#169; and many more..." > "&#xA9; and many more..." > > I can look up tab

Convert from unicode chars to HTML entities

2007-01-28 Thread Steven D'Aprano
I have a string containing Latin-1 characters: s = u"© and many more..." I want to convert it to HTML entities: result => "&copy; and many more..." Decimal/hex escapes would be acceptable: "&#169; and many more..." "&#xA9; and many more..." I can look up tables of HTML entities on the web (they're a dime a d