On Jan 9, 2:16 am, Steven D'Aprano <st...@remove-this-cybersource.com.au> wrote:
> On Thu, 08 Jan 2009 16:47:39 -0800, webcomm wrote:
> > The error...
> ...
> > BadZipfile: File is not a zip file
>
> > When I look at data.zip in Windows, it appears to be a valid zip file.
> > I am able to uncompress it in Windows XP, and can also uncompress it
> > with 7-Zip. It looks like zipfile is not able to read a "table of
> > contents" in the zip file. That's not a concept I'm familiar with.
>
> No, ZipFile can read table of contents:
>
>     Help on method printdir in module zipfile:
>
>     printdir(self) unbound zipfile.ZipFile method
>         Print a table of contents for the zip file.
>
> In my experience, zip files originating from Windows sometimes have
> garbage at the end of the file. WinZip just ignores the garbage, but
> other tools sometimes don't -- if I recall correctly, Linux unzip
> successfully unzips the file but then complains that the file was
> corrupt. It's possible that you're running into a similar problem.
The zipfile format is kind of brain dead: you can't tell where the end of the file is supposed to be by looking at the header. If the end of file hasn't been reached yet, there could be more data. To make matters worse, somehow zip files came to have text comments simply appended to the end of them. (Probably this was for the benefit of people who would cat them to the terminal.)

Anyway, if you see something that doesn't adhere to the zipfile format, you don't have any foolproof way to know whether the file is corrupted or it's just an appended comment. Most zipfile readers use a heuristic to distinguish the two. Python's zipfile module just assumes the file is corrupted.

The following post from a while back gives a solution that tries to snip the comment off so that the zipfile module can handle it. It might help you out.

http://groups.google.com/group/comp.lang.python/msg/c2008e48368c6543

Carl Banks
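For illustration, here is a rough sketch of the "snip it off" idea. It is not the code from the linked post. The helper name strip_trailing_data is made up, and it assumes the last "PK\x05\x06" marker in the file really is the end-of-central-directory record, which is itself only a heuristic. Everything after the fixed 22-byte record is dropped, including any genuine archive comment.

```python
import io
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # marks the end-of-central-directory record
EOCD_SIZE = 22                  # size of the fixed part of that record


def strip_trailing_data(data):
    """Return the archive bytes cut off right after the end record.

    Hypothetical helper: assumes the last occurrence of the signature
    is the real end record, which trailing garbage could in principle fake.
    """
    start = data.rfind(EOCD_SIGNATURE)
    if start < 0 or len(data) - start < EOCD_SIZE:
        raise ValueError("no end-of-central-directory record found")
    # Keep the first 20 bytes of the record and zero the 2-byte
    # comment-length field, so the result declares no comment at all.
    return data[:start + 20] + b"\x00\x00"


if __name__ == "__main__":
    # Build a small archive in memory, then simulate junk at the end.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hi")
    dirty = buf.getvalue() + b"trailing garbage\r\n"

    cleaned = strip_trailing_data(dirty)
    with zipfile.ZipFile(io.BytesIO(cleaned)) as zf:
        zf.printdir()
```

For a file on disk, the same function could be applied to the bytes read with open("data.zip", "rb"), and the result passed to zipfile.ZipFile through io.BytesIO as in the demo.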