New submission from Sridhar Ratnakumar <sridh...@activestate.com>:

tarfile.getmembers has become extremely slow on Windows. The slowdown was introduced in r85916, committed by Lars Gustaebel on Oct 29, 2010 to "add read support for all missing variants of the GNU sparse extensions".
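The access pattern at issue is a tar archive read through a file object that is itself a member of another tar archive, so every read of the inner archive passes through tarfile._FileInFile. Below is a minimal, self-contained sketch of that pattern. It builds a small nested archive in memory instead of using a real package. The member names and payloads are made up for illustration, and the helper names build_nested_archive and list_inner_members are hypothetical. It only shows the call sequence and is not expected to reproduce the hang on its own.

```python
import io
import tarfile
import time


def build_nested_archive():
    """Return the bytes of a .tar.gz holding one member, data.tar.gz (itself a .tar.gz)."""
    # Inner archive: a few tiny made-up files.
    inner_buf = io.BytesIO()
    with tarfile.open(fileobj=inner_buf, mode="w:gz") as inner:
        for i in range(3):
            payload = ("file %d\n" % i).encode("ascii")
            info = tarfile.TarInfo(name="pkg/file%d.txt" % i)
            info.size = len(payload)
            inner.addfile(info, io.BytesIO(payload))
    inner_bytes = inner_buf.getvalue()

    # Outer archive: wraps the inner archive as a single member.
    outer_buf = io.BytesIO()
    with tarfile.open(fileobj=outer_buf, mode="w:gz") as outer:
        info = tarfile.TarInfo(name="data.tar.gz")
        info.size = len(inner_bytes)
        outer.addfile(info, io.BytesIO(inner_bytes))
    return outer_buf.getvalue()


def list_inner_members(outer_bytes):
    """Open data.tar.gz straight from the outer archive and list its members."""
    with tarfile.open(fileobj=io.BytesIO(outer_bytes), mode="r:gz") as outer:
        # extractfile() returns a file object backed by tarfile._FileInFile.
        inner_file = outer.extractfile("data.tar.gz")
        with tarfile.open(fileobj=inner_file, mode="r:gz") as inner:
            return [member.name for member in inner.getmembers()]


start = time.time()
names = list_inner_members(build_nested_archive())
elapsed = time.time() - start
print(names, "%.3fs" % elapsed)
```

Timing the same call against a real nested package on an affected build would show whether the time is spent inside getmembers().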
To reproduce, use this "tgz" file:

  http://pypm-free.activestate.com/3.2/win32-x86/pool/a/as/as.mklruntime-1.2_win32-x86_3.2_1.pypm

It contains another tgz file called "data.tar.gz". Run `.getmembers()` on data.tar.gz. ... This invokes tarfile._FileInFile.read(...), which seems to be the cause of the slowness (or rather a hang).

I had to work around this issue by monkey-patching the above `read` function to revert the change:

+if sys.version_info[:2] >= (3,2):
+    import tarfile
+    class _FileInFileNoSparse(tarfile._FileInFile):
+        def read(self, size):
+            if size is None:
+                size = self.size - self.position
+            else:
+                size = min(size, self.size - self.position)
+            self.fileobj.seek(self.offset + self.position)
+            self.position += size
+            return self.fileobj.read(size)
+    tarfile._FileInFile = _FileInFileNoSparse
+    LOG.info('Monkey patching `tarfile.py` to disable part of r85916 (py3k)')
+

We caught this bug while testing ActiveState PyPM on Python 3.2:

  http://bugs.activestate.com/show_bug.cgi?id=89376#c3

If you want the easiest way to reproduce this, I can send you (in private) an internal build of ActivePython-3.2 containing PyPM. Running "pypm install numpy" (with breakpoints in tarfile.py) is all that is required to reproduce.

----------
components: Library (Lib), Windows
messages: 128685
nosy: lars.gustaebel, srid
priority: normal
severity: normal
status: open
title: 3.2: tarfile.getmembers causes 100% cpu usage on Windows
type: resource usage
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11224>
_______________________________________