Wolfgang wrote:
> Hi Simon,
>
> I did not know that library! I'm still new to python and I still have
> problems to find the right commands.

Welcome. :) Python comes with "batteries included". I'm always finding cool new modules myself, and I've been using it for years. In fact, I didn't notice the bz2 module until about a week ago.

Browse the standard library docs for fun:

http://docs.python.org/lib/lib.html

There's all kinds of cool stuff in there. Whenever you say to yourself, "Hmm, somebody must have had this problem before," reach for the standard library. The solution's likely already in there.

> But I suppose this library is mainly for partially
> compressing/decompressing of files. How can I use that library to
> compress/decompress full files without reading them into memory? And
> what about performance?

Read the docs. There seems to be an API for (de)compressing both "streams" of data and whole files.

I don't know about performance, as I've never tried to use the module before, but I would bet that it's good. It almost certainly uses the same bzip2 library as the bzip2 program itself, and it avoids the overhead of creating a new process for each file. But if you're in doubt (and performance really matters for this application), test and measure it.

I think your script could be rewritten as follows with good speed and memory performance, but I haven't tested it (and the output filepaths may not be what you want...):

    import os
    import bz2

    dir_ = r"g:\messtech"

    for root, dirs, files in os.walk(dir_):
        for file_ in files:
            f = os.path.join(root, file_)
            # Append the suffix directly; os.path.join would insert a
            # path separator before '.bz2'.
            bzf = f + '.bz2'
            F = open(f, 'rb')  # binary mode, so the bytes are copied unchanged
            BZF = bz2.BZ2File(bzf, 'w')
            try:
                for line in F:
                    BZF.write(line)
            finally:
                F.close()
                BZF.close()

Also, note that I changed 'dir' and 'file' to 'dir_' and 'file_'. Both dir and file are python built-ins, so you shouldn't reuse those names for your variables.

Peace,
~Simon
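
For the "streams" side of the bz2 API mentioned above, here is a minimal sketch, assuming Python 3 and the standard library's bz2.BZ2Compressor class. The helper names compress_stream and compress_file and the 64 KiB chunk size are made up for illustration and do not come from the thread. It reads fixed-size chunks rather than lines, so memory use should stay bounded even for binary files that contain no newlines.

```python
import bz2


def compress_stream(chunks):
    """Yield bzip2-compressed bytes for an iterable of byte chunks.

    Hypothetical helper: only one input chunk is held in memory at a time.
    """
    compressor = bz2.BZ2Compressor()
    for chunk in chunks:
        data = compressor.compress(chunk)
        if data:  # the compressor buffers internally and may return b""
            yield data
    # Emit whatever is still buffered and finish the bzip2 stream.
    yield compressor.flush()


def compress_file(src_path, dst_path, chunk_size=64 * 1024):
    """Compress src_path into dst_path, reading chunk_size bytes at a time.

    The chunk size is an arbitrary assumption; tune it by measuring.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # iter() with a sentinel keeps calling src.read() until it returns b"".
        chunks = iter(lambda: src.read(chunk_size), b"")
        for data in compress_stream(chunks):
            dst.write(data)
```

Something like compress_file(f, f + '.bz2') could stand in for the read/write loop of a per-file script. Whether it is actually faster than writing through bz2.BZ2File is something to measure rather than assume.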