Johannes Bauer wrote:
Hello group,
with the following program:
#!/usr/bin/python3
import gzip
x = gzip.open("testdatei", "wb")
x.write("ä")
x.close()
I get a broken .gzip file when decompressing:
$ cat testdatei |gunzip
ä
gzip: stdin: invalid compressed data--length error
As it only happens with UTF-8 characters, I suppose the gzip module
writes a length of 1 in the gzip file header (one character "ä"), but
then actually writes 2 characters (0xc3 0xa4).
UTF-8 is not unicode. Even if the source-encoding above is UTF-8, I'm
not sure what is used to encode the unicode-string when it's written.
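To make the character/byte distinction concrete, here is a small sketch. The variable names are only illustrative, and the string is written with an escape so the result does not depend on the source file's encoding:

```python
# One Unicode character versus its UTF-8 byte representation.
s = "\u00e4"              # the character "ä" as a Python 3 str
data = s.encode("utf-8")  # explicit encoding to bytes

print(len(s))     # number of characters: 1
print(len(data))  # number of bytes: 2
print(data)       # b'\xc3\xa4'
```

A length counted in characters and a length counted in bytes differ for any non-ASCII text, which would be consistent with the mismatch described above.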
Is there a solution?
What about writing a bytestring by explicitly encoding the string to
UTF-8 first?
x.write("ä".encode("utf-8"))
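A fuller version of that idea, as a rough sketch rather than a tested fix for the original setup: it writes the encoded bytes and reads them back to check the round trip. The temporary directory and the variable names are assumptions added for illustration only.

```python
import gzip
import os
import tempfile

text = "\u00e4"  # the character "ä"

# Hypothetical output location, so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "testdatei.gz")

# Binary mode expects bytes, so encode explicitly before writing.
with gzip.open(path, "wb") as f:
    f.write(text.encode("utf-8"))

# Read the bytes back and decode them with the same encoding.
with gzip.open(path, "rb") as f:
    roundtrip = f.read().decode("utf-8")

print(roundtrip == text)
```

If the installed Python is 3.3 or later, gzip.open should also accept a text mode such as "wt" together with an encoding argument, which does the encoding step for you.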
Diez