On Feb 18, 12:35 pm, Carl Banks <pavlovevide...@gmail.com> wrote:
> On Feb 18, 10:48 am, Lionel <lionel.ke...@gmail.com> wrote:
>
> > Thanks Carl, I like your solution. Am I correct in my understanding
> > that memory is allocated at the slicing step in your example, i.e. when
> > "reshaped_data" is sliced using "interesting_data = reshaped_data[:,
> > 50:100]"? In other words, given a huge (say 1 GB) file, a memmap object
> > is constructed that memmaps the entire file. Some relatively small
> > amount of memory is allocated for the memmap operation, but the bulk
> > memory allocation occurs when I generate my final numpy sub-array by
> > slicing, and this accounts for the memory efficiency of using memmap?
>
> No, what accounts for the memory efficiency is that there is no bulk
> allocation at all. The ndarray you have points to the memory that's
> in the mmap. There is no copying of data or separate array allocation.
>
> Also, it's not any more memory efficient to use the offset parameter
> with numpy.memmap than it is to memmap the whole file and take a
> slice.
>
> Carl Banks

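To check that I follow, here is a minimal sketch of what you describe. The file name, dtype, and shape are made up for illustration (a small temporary file stands in for my real data); the variable names are the ones from your example. If I have this right, the slice is only a view onto the mapping, and a separate array is allocated only when I copy explicitly.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the large data file (kept small here).
path = os.path.join(tempfile.mkdtemp(), "data.bin")
np.arange(200 * 300, dtype=np.float64).tofile(path)

# Map the whole file; this does not read the file contents into an array.
reshaped_data = np.memmap(path, dtype=np.float64, mode="r", shape=(200, 300))

# Slicing gives a view onto the same mapped memory, not a copy.
interesting_data = reshaped_data[:, 50:100]
shares = np.shares_memory(interesting_data, reshaped_data)

# Only an explicit copy allocates a separate in-memory ndarray.
in_ram = np.array(interesting_data)
copied = not np.shares_memory(in_ram, reshaped_data)

print(shares, copied)
```
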
Does this mean that every time I iterate through an ndarray that is sourced from a memmap, the data is read from the disk? Is the sliced array at no time wholly resident in memory? What are the performance implications of this?