On 2010-10-23 01:50:53 -0700, Peter Otten said:
TomF wrote:
I have a program that manipulates lots of very large indices, which I
implement as bit vectors (via the bitarray module). These are too
large to keep all of them in memory so I have to come up with a way to
cache and load them from disk as necessary. I've been reading about
weak references and it looks like they may be what I want.
My idea is to use a WeakValueDictionary to hold references to these
bitarrays, so Python can decide when to garbage collect them. I then
keep a key-value database of them (via bsddb) on disk and load them
when necessary. The basic idea for accessing one of these indexes is:
_idx_to_bitvector_dict = weakref.WeakValueDictionary()
In a well written script that cache will be almost empty. You should compare
the weakref approach against a least-recently-used caching strategy. In
newer Pythons you can use collections.OrderedDict to implement an LRU cache
or use the functools.lru_cache decorator.
I don't know what your first sentence means, but thanks for pointers to
the LRU stuff. Maintaining my own LRU cache might be a better way to
go. At least I'll have more control.
Thanks,
-Tom
--
http://mail.python.org/mailman/listinfo/python-list