I'm working on an app that's processing Usenet messages. I'm making a connection to my NNTP feed and grabbing the headers for the groups I'm interested in, saving the info to disk, and doing some post-processing. I'm finding a few bizarre characters and I'm not sure how to handle them pythonically.
One of the lines I'm having this problem with contains:

137050 Cleo and I have an anouncement! "Mlle. =?iso-8859-1?Q?Ana=EFs?=" <[EMAIL PROTECTED]> Sun, 21 Nov 2004 16:21:50 -0500 <[EMAIL PROTECTED]> 4478 69 Xref: sn-us rec.pets.cats.community:137050

The interesting part is the string that reads "=?iso-8859-1?Q?Ana=EFs?=". An HTML rendering of what this string should look like would be "Anaïs".

What I'm doing now is a brute-force substitution from the version in the file to the HTML version. That's ugly. What's a better way to translate that string? Or is my problem that I'm grabbing the headers from the NNTP server incorrectly?
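For what it's worth, the "=?iso-8859-1?Q?...?=" form looks like an RFC 2047 encoded-word, which is how non-ASCII text is normally carried in message headers, so the server is probably handing the header over exactly as it was posted. A minimal sketch of decoding it with the standard library's email.header.decode_header follows. It assumes Python 3; the helper names decode_mime_words and to_html are made up for illustration, and the Latin-1 fallback for unknown charsets is only a guess at a reasonable default.

```python
from email.header import decode_header
from html import escape


def decode_mime_words(raw):
    """Decode RFC 2047 encoded-words in a header string to plain text."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            try:
                chunk = chunk.decode(charset or "ascii", errors="replace")
            except LookupError:
                # Unknown charset label: fall back to Latin-1 (an assumption).
                chunk = chunk.decode("latin-1")
        parts.append(chunk)
    # Join without separators so the text around the encoded-word is kept as-is.
    return "".join(parts)


def to_html(text):
    """Escape markup characters and turn non-ASCII into numeric entities."""
    escaped = escape(text)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    sender = '"Mlle. =?iso-8859-1?Q?Ana=EFs?=" <[EMAIL PROTECTED]>'
    decoded = decode_mime_words(sender)
    print(decoded)
    print(to_html(decoded))
```

The decoding is applied to the stored header text, so the files saved from the NNTP feed do not need to change; only the post-processing step does. If the HTML output is served as UTF-8, the numeric-entity step in to_html can be dropped and html.escape alone should be enough.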