New submission from Tomas Tomecek: I have a tarball (generated by docker-1.10 via `docker export`) and am trying to extract it with python 2.7 tarfile:
```
import tarfile

with tarfile.open(name=tarball_path) as tar_fd:
    tar_fd.extractall(path=path)
```

Output from a pytest run:

```
/usr/lib64/python2.7/tarfile.py:2072: in extractall
    for tarinfo in members:
/usr/lib64/python2.7/tarfile.py:2507: in next
    tarinfo = self.tarfile.next()
/usr/lib64/python2.7/tarfile.py:2355: in next
    tarinfo = self.tarinfo.fromtarfile(self)
/usr/lib64/python2.7/tarfile.py:1254: in fromtarfile
    return obj._proc_member(tarfile)
/usr/lib64/python2.7/tarfile.py:1276: in _proc_member
    return self._proc_pax(tarfile)
/usr/lib64/python2.7/tarfile.py:1406: in _proc_pax
    value = value.decode("utf8")
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

input = '\x01\x00\x00\x02\xc0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', errors = 'strict'

    def decode(input, errors='strict'):
>       return codecs.utf_8_decode(input, errors, True)
E       UnicodeDecodeError: 'utf8' codec can't decode byte 0xc0 in position 4: invalid start byte

/usr/lib64/python2.7/encodings/utf_8.py:16: UnicodeDecodeError
```

Since I know nothing about tars, I have no idea whether this is a bug or whether there is a proper solution or workaround. When using GNU tar, I'm able to list and extract the tarball.

----------
components: Unicode
messages: 263237
nosy: Tomas Tomecek, ezio.melotti, haypo
priority: normal
severity: normal
status: open
title: tarfile: accessing (listing and extracting) tarball fails with UnicodeDecodeError
versions: Python 2.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26740>
_______________________________________
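For reference, the failing decode step from the traceback can be reproduced without the tarball. The sketch below is only an illustration of why strict UTF-8 decoding rejects this value. It is not a fix for `tarfile`, and the names `value`, `error`, and `lenient` are made up for the example. It assumes the bytes shown as `input` in the traceback are the complete value being decoded.

```python
# Bytes copied from the `input = ...` line of the traceback.
value = b"\x01\x00\x00\x02\xc0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"

try:
    # Same strict call that the traceback shows in _proc_pax: value.decode("utf8")
    value.decode("utf8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# For comparison only: a lenient error handler replaces the undecodable
# byte with U+FFFD instead of raising.
lenient = value.decode("utf8", "replace")

if error is not None:
    print("strict decode failed at byte offset %d" % error.start)
```

This suggests the value is binary data rather than UTF-8 text, which is why a strict decode fails while GNU tar, which presumably does not decode the value this way, can still process the archive.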