Hi, I'm using $subjects combination successfully in a project for creating and iterating over huge binary files (> 5 GB) with impressive performance, while resource usage stays pretty low, all in plain Python 3 code. Nice!

Environment: (Python 3.4.5, Linux 4.8.14, openSUSE/x86_64, NFS4 and XFS filesystems)

The idea is: map a ctypes structure onto the file at a certain offset, act on the structure, and release the mapping. The latter is necessary for keeping the mmap file properly resizable and closable (due to the nature of mmaps and Python's posix implementation thereof). Hence, a context manager serves us well (in theory).

Here's a code excerpt:

    class cstructmap:
        def __init__(self, cstruct, mm, offset = 0):
            self._cstruct = cstruct
            self._mm = mm
            self._offset = offset
            self._csinst = None

        def __enter__(self):
            # resize the mmap (and backing file), if structure exceeds mmap size
            # mmap size must be aligned to mmap.PAGESIZE
            cssize = ctypes.sizeof(self._cstruct)
            if self._offset + cssize > self._mm.size():
                newsize = align(self._offset + cssize, mmap.PAGESIZE)
                self._mm.resize(newsize)
            self._csinst = self._cstruct.from_buffer(self._mm, self._offset)
            return self._csinst

        def __exit__(self, exc_type, exc_value, exc_traceback):
            # free all references into mmap
            del self._csinst
            self._csinst = None

    def work():
        with cstructmap(ItemHeader, self._mm, self._offset) as ih:
            ih.identifier = ItemHeader.Identifier
            ih.length = ItemHeaderSize + datasize
        blktype = ctypes.c_char * datasize
        with cstructmap(blktype, self._mm, self._offset) as blk:
            blk.raw = data

In practice, this results in:

    Traceback (most recent call last):
      File "ctypes_mmap_ctx.py", line 146, in <module>
        mf.add_data(data)
      File "ctypes_mmap_ctx.py", line 113, in add_data
        with cstructmap(blktype, self._mm, self._offset) as blk:
      File "ctypes_mmap_ctx.py", line 42, in __enter__
        self._mm.resize(newsize)
    BufferError: mmap can't resize with extant buffers exported.

The issue: when creating a mapping via the context manager, we bind a local variable (with .. as ..) that keeps existing in the local scope, even after the managed context was left. This keeps a reference on the ctypes mapped area alive, even if we try everything to destroy it in __exit__.
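
The effect can be reproduced in isolation with a sketch like the following. The Header structure and the try_resize() and demo() helpers are made up for illustration and are not part of the gist; the sketch assumes CPython (reference counting) on Linux, where mmap.resize() is available.

```python
import ctypes
import mmap
import tempfile


class Header(ctypes.Structure):
    # Hypothetical stand-in for the ItemHeader structure of the real script.
    _fields_ = [("identifier", ctypes.c_uint32),
                ("length", ctypes.c_uint32)]


class cstructmap:
    # Stripped-down variant of the context manager above (no resizing).
    def __init__(self, cstruct, mm, offset=0):
        self._cstruct = cstruct
        self._mm = mm
        self._offset = offset
        self._csinst = None

    def __enter__(self):
        self._csinst = self._cstruct.from_buffer(self._mm, self._offset)
        return self._csinst

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # Drops only the manager's own reference, not the caller's name.
        self._csinst = None


def try_resize(mm, newsize):
    # Return True if the resize went through, False on BufferError.
    try:
        mm.resize(newsize)
        return True
    except BufferError:
        return False


def demo():
    with tempfile.TemporaryFile() as f:
        f.truncate(mmap.PAGESIZE)  # mmap refuses to map an empty file
        mm = mmap.mmap(f.fileno(), mmap.PAGESIZE)
        with cstructmap(Header, mm) as hdr:
            hdr.identifier = 0x1234
        # The with block is over, but hdr is still bound in this scope.
        blocked = not try_resize(mm, 2 * mmap.PAGESIZE)
        del hdr  # drop the last reference by hand
        released = try_resize(mm, 2 * mmap.PAGESIZE)
        mm.close()
        return blocked, released


if __name__ == "__main__":
    print(demo())
```

Under those assumptions, I would expect the first resize to be refused while hdr is still bound, and the second one to succeed once the name is deleted.
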
We have to del the with variable manually.

Now, I want to get rid of the ugly and error prone del statements. What is needed is a ctypes operation that actively removes the mapping and that could be added to the __exit__ part of the context manager.

Full working code example:
https://gist.github.com/frispete/97c27e24a0aae1bcaf1375e2e463d239

The script creates a memory mapped file in the current directory named "mapfile". When started without arguments, it copies itself into this file until 10 * mmap.PAGESIZE growth is reached (or it errors out before that). If you change NOPROB to True, it will actively destroy the context manager variables, and should work as advertised.

Any ideas are much appreciated.

Thanks in advance,
Pete
-- 
https://mail.python.org/mailman/listinfo/python-list