On 2010-10-23 01:50:53 -0700, Peter Otten said:

TomF wrote:

I have a program that manipulates lots of very large indices, which I
implement as bit vectors (via the bitarray module).   These are too
large to keep all of them in memory so I have to come up with a way to
cache and load them from disk as necessary.  I've been reading about
weak references and it looks like they may be what I want.

My idea is to use a WeakValueDictionary to hold references to these
bitarrays, so Python can decide when to garbage collect them.  I then
keep a key-value database of them (via bsddb) on disk and load them
when necessary.  The basic idea for accessing one of these indexes is:

_idx_to_bitvector_dict = weakref.WeakValueDictionary()

In a well written script that cache will be almost empty. You should compare
the weakref approach against a least-recently-used caching strategy. In
newer Pythons you can use collections.OrderedDict to implement an LRU cache
or use the functools.lru_cache decorator.

I don't know what your first sentence means, but thanks for pointers to the LRU stuff. Maintaining my own LRU cache might be a better way to go. At least I'll have more control.

Thanks,
-Tom

--
http://mail.python.org/mailman/listinfo/python-list

Reply via email to