I coded a caching system using BerkeleyDB::Hash as the backend. It was
working fine until the database file became fairly large (850M).
At some point the performance degraded and the web server process accessing the database started hanging. Someone suggested locking issues as the cause of the hangs, but the database still hung even when accessed from a single script with no other processes touching it.
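
For reference, the setup is roughly a tied hash along these lines (the file path and keys here are placeholders, not my exact code):

  use strict;
  use BerkeleyDB;

  # Rough sketch of a BerkeleyDB::Hash cache backend; the
  # filename is a placeholder, not the real deployment path.
  my %cache;
  tie %cache, 'BerkeleyDB::Hash',
      -Filename => '/var/cache/app.db',
      -Flags    => DB_CREATE
      or die "cannot open cache db: $BerkeleyDB::Error";

  $cache{'some_key'} = 'some_value';   # store
  my $hit = $cache{'some_key'};        # fetch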

I am sure someone has done a similar thing before and would be very interested to hear any success/failure stories. I'm starting to wonder whether I would be better off just using an RDBMS table (2 columns: key, value) as the cache backend to avoid these types of issues.
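
Something along these lines is what I have in mind (the DSN, credentials, and table layout are placeholders, not a working design):

  use strict;
  use DBI;

  # Hypothetical two-column cache table:
  #   CREATE TABLE cache (k VARCHAR(255) PRIMARY KEY, v BLOB);
  my $dbh = DBI->connect('dbi:mysql:cachedb', 'user', 'pass',
                         { RaiseError => 1 });

  sub cache_set {
      my ($key, $value) = @_;
      # REPLACE is MySQL-specific; other RDBMSs need an upsert
      $dbh->do('REPLACE INTO cache (k, v) VALUES (?, ?)',
               undef, $key, $value);
  }

  sub cache_get {
      my ($key) = @_;
      my ($value) = $dbh->selectrow_array(
          'SELECT v FROM cache WHERE k = ?', undef, $key);
      return $value;
  }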

There are quite a few options for caching in mod_perl. I describe a couple at the start of my caching module...

---
http://search.cpan.org/~robm/Cache-FastMmap-1.09/FastMmap.pm

DESCRIPTION

In multi-process environments (eg mod_perl, forking daemons, etc), it's common to want to cache information, but have that cache shared between processes. Many solutions already exist, and may suit your situation better:

MLDBM::Sync - acts as a database, data is not automatically expired, slow
IPC::MM - hash implementation is broken, data is not automatically expired, slow
Cache::FileCache - lots of features, slow
Cache::SharedMemoryCache - lots of features, VERY slow. Uses IPC::ShareLite which freeze/thaws ALL data at each read/write
DBI - use your favourite RDBMS. can perform well, need a DB server running. very global. socket connection latency
Cache::Mmap - similar to this module, in pure perl. slows down with larger pages
BerkeleyDB - very fast (data ends up mostly in shared memory cache) but acts as a database overall, so data is not automatically expired
---
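
For what it's worth, the interface looks roughly like this (the share_file path and sizes are just example values, not tuned recommendations):

  use Cache::FastMmap;

  # Example values only; tune cache_size/expire_time to taste.
  my $fc = Cache::FastMmap->new(
      share_file  => '/tmp/app.fastmmap',
      cache_size  => '64m',
      expire_time => '10m',   # old entries expire automatically
  );

  $fc->set('key', { some => 'data' });   # values are Storable-frozen
  my $value = $fc->get('key');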

The main things I'd say are:

1. What version of bdb are you using? I've found 4.0-4.2 to be fairly unstable. The late 3's (eg >=3.3) and more recent 4's (eg >=4.3) seem better.
2. Try running db_verify on your database to see if it picks up any problems/corruption.
3. Consider switching to something else, even if it only supports a smaller size. Do you really need 850M of cached data?
4. Also not mentioned above, but look at memcached. It seems to be well designed for LARGE, global caches (a rough Perl sketch follows below).
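
On point 4, talking to memcached from Perl looks roughly like this via Cache::Memcached (the server address is a placeholder for wherever your memcached runs):

  use Cache::Memcached;

  # Placeholder server address; point this at your memcached.
  my $memd = Cache::Memcached->new({
      servers => ['127.0.0.1:11211'],
  });

  $memd->set('key', 'value', 600);   # expire after 600 seconds
  my $value = $memd->get('key');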

Rob
