On May 30, 1:41 pm, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > what i want to know is which one is faster (if there is any difference
> > in speed) since i'm working with very large files. of course, if there
> > is any other way to write data to a file, i'd love to hear about it
>
> You should look at the mmap-module.

Yes, memory mappings can be more efficient than files accessed using file descriptors. But mmap does not take an offset parameter, and is therefore not suited for working with large files. For example, you only have a virtual memory space of 4 GiB on a 32 bit system, so there is no way mmap can access the last 4 GiB of an 8 GiB file on a 32 bit system. If mmap took an offset parameter, this would not be a problem.

However, numpy has a properly working memory mapped array class, numpy.memmap. It can be used for fast file access. Numpy also has a wide range of datatypes that are efficient for working with binary data (e.g. a uint8 type for bytes), and a record array for working with structured binary data. This makes numpy very attractive when working with binary data files.

Get the latest numpy here: www.scipy.org.

Let us say you want to memory map a 24 bit RGB image of 640 x 480 pixels, located at an offset of 4096 bytes into the file 'myfile.dat'. Here is how numpy could do it:

    import numpy

    byte = numpy.uint8
    desc = numpy.dtype({'names': ['r', 'g', 'b'], 'formats': [byte, byte, byte]})
    mm = numpy.memmap('myfile.dat', dtype=desc, offset=4096,
                      shape=(480, 640), order='C')
    red = mm['r']
    green = mm['g']
    blue = mm['b']

Now you can access the RGB values simply by slicing the arrays red, green, and blue. To set the R value of every other horizontal line to 0, you could simply write

    red[::2, :] = 0

As always when working with memory mapped files, the changes are not committed before the memory mapping is synchronized with the file system. Thus, call mm.sync() (named mm.flush() in newer numpy releases) when you want the actual write process to start.
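If you want to try this without a real image file, here is a self-contained sketch of the same idea. The temporary file and its contents (a zeroed 4096 byte header followed by all-255 pixel data) are made up purely for the test, and the constant names are mine. I spell out mode='r+' to make clear that the file must already exist and be large enough, and I use flush(), which is what the synchronization method is called in the numpy releases I know of today.

```python
import os
import tempfile

import numpy as np

HEADER_BYTES = 4096          # offset of the pixel data, as in the example above
HEIGHT, WIDTH = 480, 640     # image size, as in the example above

# One record per pixel: three unsigned bytes named r, g, b.
rgb = np.dtype({'names': ['r', 'g', 'b'], 'formats': [np.uint8] * 3})

# Hypothetical test file: a zeroed header followed by all-255 pixel data.
path = os.path.join(tempfile.mkdtemp(), 'myfile.dat')
with open(path, 'wb') as f:
    f.write(b'\x00' * HEADER_BYTES)
    f.write(b'\xff' * (HEIGHT * WIDTH * rgb.itemsize))

# mode='r+' opens the existing file for both reading and writing.
mm = np.memmap(path, dtype=rgb, mode='r+', offset=HEADER_BYTES,
               shape=(HEIGHT, WIDTH), order='C')

mm['r'][::2, :] = 0   # zero the red channel on every other row
mm.flush()            # ask for the changes to be written back to the file
del mm                # drop the reference so the mapping can be released

# Read the file back with ordinary file I/O to check what was written.
with open(path, 'rb') as f:
    f.seek(HEADER_BYTES)
    raw = np.frombuffer(f.read(), dtype=np.uint8)
pixels = raw.reshape(HEIGHT, WIDTH, 3)   # last axis is (r, g, b)
```

Only the part of the file after the offset is mapped, so the header is left alone, and the same offset argument is what lets you map a window of a file that is too large to map in one piece.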
The memory mapping will be closed when it is garbage collected (typically when the reference count falls to zero) or when you call mm.close().