At 6:09 PM -0400 6/24/05, Arshavir Grigorian wrote:
Hello list,

I coded a caching system using BerkeleyDB::Hash as the backend. It was working fine until the database file became fairly large (850M). At some point the performance degraded, and the web server processes accessing the database started hanging. Someone suggested locking issues as the cause of the hangups, but the db hung even when accessed from a single script with no other processes touching it.
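For reference, a cache like the one described is typically built by tying a hash directly to the file. A minimal sketch (the path is hypothetical) might look like:

```perl
use strict;
use warnings;
use BerkeleyDB;

# Tie a Perl hash to an on-disk BDB hash file (path is hypothetical).
# With no shared environment, concurrent httpd writers get no locking at all.
tie my %cache, 'BerkeleyDB::Hash',
    -Filename => '/var/cache/app/cache.db',
    -Flags    => DB_CREATE
    or die "cannot open cache.db: $BerkeleyDB::Error";

$cache{some_key} = 'some_value';   # each store goes straight to the file
```

Used this way from many Apache children at once, nothing coordinates the writers, which is exactly the situation that invites hangs and corruption.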

Having used some pretty large (though not quite 850 MB) BDB files, I can tell you my experience: unless you are using the fully transactional model and have lots of disk space to throw at it, I'd now recommend against using BDB for anything that is updated from an httpd process.

The reason has to do with corruption. Even when using the Concurrent Data Store (CDS) model, I found that I was spending a huge amount of time writing code to detect all the possible ways in which DB files can become corrupt when accessed directly by httpd.
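For completeness, CDS only provides its single-writer/multiple-reader locking when every process opens the same shared environment. A sketch of that setup (the environment directory is an assumption):

```perl
use strict;
use warnings;
use BerkeleyDB;

# Concurrent Data Store: locking works only if ALL processes open the
# SAME environment directory (the path here is hypothetical).
my $env = BerkeleyDB::Env->new(
    -Home  => '/var/cache/bdb',
    -Flags => DB_CREATE | DB_INIT_CDB | DB_INIT_MPOOL,
) or die "cannot open environment: $BerkeleyDB::Error";

tie my %cache, 'BerkeleyDB::Hash',
    -Filename => 'cache.db',
    -Env      => $env,
    -Flags    => DB_CREATE
    or die "cannot open cache.db: $BerkeleyDB::Error";
```

Even then, an httpd child killed mid-write can leave stale locks or a half-written page behind, which is the failure mode described above.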

After using BDB for several years, I recently re-coded everything to use replicated MySQL databases. Not only is the Perl code much smaller now, it actually runs more quickly, thanks to better indexing and the ability to push calculations into SQL.
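The replacement amounts to routing every read and write through the MySQL daemon via DBI. A minimal sketch, where the DSN, credentials, and table schema are all hypothetical:

```perl
use strict;
use warnings;
use DBI;

# All access goes through the mysqld daemon; no data file is touched
# directly by httpd. DSN, credentials, and schema are placeholders.
my $dbh = DBI->connect(
    'dbi:mysql:database=cache;host=localhost',
    'cache_user', 'secret',
    { RaiseError => 1, AutoCommit => 1 },
);

sub cache_get {
    my ($key) = @_;
    my ($value) = $dbh->selectrow_array(
        'SELECT v FROM cache WHERE k = ?', undef, $key);
    return $value;
}

sub cache_set {
    my ($key, $value) = @_;
    $dbh->do('REPLACE INTO cache (k, v) VALUES (?, ?)',
             undef, $key, $value);
}
```

A dropped httpd child here costs at most one aborted statement; the daemon's own files stay consistent.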

I have no doubt that BDB works very well for some things, but in my opinion an httpd process--with many concurrent threads that can be dropped unexpectedly at pretty much any time--is not one of them. (MySQL actually offers BDB as one of its optional storage engines, but there the corruption problem is avoided by having a single, central DB daemon do all of the reads and writes to the files.)

--
Dan Wilga                                         [EMAIL PROTECTED]
Web Administrator                             http://www.mtholyoke.edu
Mount Holyoke College                                Tel: 413-538-3027
South Hadley, MA  01075            "Who left the cake out in the rain?"
