On Mon, 06 Dec 2010 18:12:33 +0100, Peter Otten wrote:
> By the way:
>
> >>> print quopri.decodestring("=E4=F6=FC").decode("iso-8859-1")
> äöü
> >>> print r"\xe4\xf6\xfc".decode("string-escape").decode("iso-8859-1")
> äöü
Ah - better than a regex. Thanks!
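
For anyone trying this on Python 3, where the "string-escape" codec no longer exists, here is a rough sketch of the same two decodes. The helper names decode_qp and decode_backslash_x are made up for illustration, and the iso-8859-1 assumption is carried over from the example above.

```python
import quopri


def decode_qp(raw: bytes) -> str:
    """Decode the "=hh" (quoted-printable) form, assuming iso-8859-1 bytes."""
    return quopri.decodestring(raw).decode("iso-8859-1")


def decode_backslash_x(raw: bytes) -> str:
    """Decode the literal four-character "\\xhh" form.

    "unicode_escape" maps \\xhh to the code point U+00hh, which coincides
    with iso-8859-1 for values 0x80-0xFF (an assumption about the data).
    """
    return raw.decode("unicode_escape")


# Both calls should yield the same three characters (a-umlaut, o-umlaut, u-umlaut).
same = decode_qp(b"=E4=F6=FC") == decode_backslash_x(rb"\xe4\xf6\xfc")
```

Note that "unicode_escape" also interprets other escapes such as \n and \t, which may or may not be wanted for a given file.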
--
http://mail.python.org/mailman/listin
Dan M wrote:
> I'm getting bogged down with backslash escaping.
>
> I have some text files containing characters with the 8th bit set. These
> characters are encoded one of two ways: either "=hh" or "\xhh", where "h"
> represents a hex digit, and "\x" is a literal backslash followed by a
> lower-
On Mon, 06 Dec 2010 09:44:39 -0600, Dan M wrote:
> That's what I had initially assumed was the case, but looking at the
> data files with a hex editor showed me that I do indeed have
> four-character sequences. That's what makes this such an interesting
> task!
Sorry, I misunderstood the first ti
On Mon, 06 Dec 2010 16:34:56 +0100, Alain Ketterlin wrote:
> Dan M writes:
>
>> I took a look at http://docs.python.org/howto/regex.html, especially
>> the section titled "The Backslash Plague". I started out trying:
>
> import re
> r = re.compile('x([0-9a-fA-F]{2})')
> a = "This \x
On Mon, 06 Dec 2010 10:29:41 -0500, Mel wrote:
> What you're missing is that string `a` doesn't actually contain
> four-character sequences like '\', 'x', 'a', 'a'. It contains single
> characters that you encode in string literals as '\xaa' and so on. You
> might do better with
>
> p1 = r'([
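
The distinction between a one-character escape in a string literal and a real four-character sequence can be checked directly. A small sketch (the variable names are just for illustration):

```python
# In a normal literal the compiler processes the escape: one character.
one_char = "\xaa"

# In a raw literal nothing is processed: backslash, 'x', 'a', 'a'.
four_chars = r"\xaa"

print(len(one_char))     # 1
print(len(four_chars))   # 4
print(list(four_chars))  # ['\\', 'x', 'a', 'a']
```

Text read from a file is never escape-processed, so a file that a hex editor shows as backslash, x, a, a behaves like the raw literal here.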
Dan M writes:
> I took a look at http://docs.python.org/howto/regex.html, especially the
> section titled "The Backslash Plague". I started out trying:
import re
r = re.compile('x([0-9a-fA-F]{2})')
a = "This \xef file \xef has \x20 a bunch \xa0 of \xb0 crap \xc0
The backs
I'm getting bogged down with backslash escaping.
I have some text files containing characters with the 8th bit set. These
characters are encoded one of two ways: either "=hh" or "\xhh", where "h"
represents a hex digit, and "\x" is a literal backslash followed by a
lower-case x.
Catching the f
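
To make the two encodings concrete, here is a minimal regex sketch that handles both "=hh" and the literal "\xhh" form in one pass. It is written for Python 3; the names HEX_ESCAPE and decode_hex_escapes are hypothetical, and it assumes each hex pair is an ISO-8859-1 code point, which the description above does not actually state.

```python
import re

# Either "=" or a literal backslash followed by "x", then two hex digits.
# The raw string keeps the regex-level "\\" (one literal backslash) readable.
HEX_ESCAPE = re.compile(r"(?:=|\\x)([0-9a-fA-F]{2})")


def decode_hex_escapes(text: str) -> str:
    """Replace each "=hh" or "\\xhh" sequence with the character U+00hh."""
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


sample = r"caf=E9 and na\xefve"
decoded = decode_hex_escapes(sample)
```

One caveat with this approach: an ordinary "=" that happens to be followed by two hex digits (as in "a=10") would be converted as well, so it is only safe if that pattern never occurs as plain text in the files.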