I'm preparing to distribute a Windows XP Python program and some ancillary files, and I wanted to put everything in a .ZIP archive. It proved to be inordinately difficult and I thought I would post my solution here. Is there a better one?
Suppose you have a set of files in a directory c:\a\b and some additional files in c:\a\b\subdir. Using a Python script, you would like to make a Windows-readable archive (.zip) that preserves this directory structure, and where the root directory of the archive is c:\a\b. In other words, all the files from c:\a\b appear in the archive without a path prefix, and all the files in c:\a\b\subdir have a path prefix of \subdir. This looks like it should be easy with module zipfile and the handy function os.walk. Create a zip file, call os.walk, and add the files to the archive like so: import os import zipfile z = zipfile.ZipFile(r"c:\a\b\myzip.zip",mode="w",compression=zipfile.ZIP_DEFLATED) for dirpath,dirs,files in os.walk(r"c:\a\b"): for a_file in files: a_path = os.path.join(dirpath,a_file) z.write(a_path) # Change, see below z.close() This creates an archive that can be read by WinZip or by another Python script that uses zipfile. But when you try to view it with the Windows compressed folder viewer it will appear empty. If you try to extract the files anyway (because you know they are really there), you get a Windows Security Warning and XP refuses to decompress the folder - XP is apparently afraid it might be bird flu or something. If you change the line marked #Change to "z.write(a_path,file)", explicitly naming each file, now the compressed folder viewer will show all the files in the archive. XP will not treat it like a virus and it will extract the files. However, the archive does not contain a subdirectory; all the files are in a single directory. Some experimentation suggests that Windows does not like any filename in the archive that begins with either a drive designator like c:, or has a path containing a leading slash like "\a\b\afile.txt". Relative paths like "subdir\afile.txt" are okay, and cause the desired behavior when the archive is extracted, e.g., a new directory subdir is created and afile.txt is placed in it. Since the method ZipFile.write needs a valid pathname for each file, the correct solution to the original problem entails messing around with the OS's current working directory. Position the CWD in the desired base directory of the archive, add the files to the archive using their relative pathnames, and put the CWD back where it was when you started: import os import zipfile z = zipfile.ZipFile(r"c:\a\b\myzip.zip",mode="w",compression=zipfile.ZIP_DEFLATED) cwd = os.getcwd() os.chdir(base_dir) try: for dirpath,dirs,files in os.walk(''): # This starts the walk at the CWD for a_file in files: a_path = os.path.join(dirpath,a_file) z.write(a_path,a_path) # Can the second argument be omitted? z.close() finally: os.chdir(cwd) This produces an archive that can be extracted by Windows XP using its built-in capability, by WinZip, or by another Python script. Now that I have the solution it seems to make sense, but it wasn't at all obvious when I started. Paul Cornelius -- http://mail.python.org/mailman/listinfo/python-list