Hi all,

I posted this to the Apache httpd users list but got no reply there, so I'm posting here in the hope that someone else who uses mod_perl with mod_cache in a reverse-proxy setup might have insight.

I am using Apache 2.2.9 (built from source) on Debian Lenny to run a fairly large community LAMP (Perl, MySQL) site. I use Apache's proxy and cache modules to improve site performance - I have a front-end proxy build and a back-end mod_perl build, both currently on the same server. I have been using this setup successfully for years, but for most of that time it was Apache 1.3 with mod_accel and mod_deflate from Igor Sysoev. Since moving to Apache 2.2, I am using the stock caching.

The cache and front-end proxy help to serve images without bogging down the heavy mod_perl processes, while also, obviously, caching the mod_perl content. The site gets around 100,000 or more page requests per day. The cache size limit is set to 1000 MB, with htcacheclean running in daemon mode at a 60-minute interval (though, looking at the performance charts, it seems to be running constantly).
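For reference, the relevant bits of my front-end config look roughly like this (from memory - paths and the back-end port here are illustrative, not copied verbatim from the server):

```apache
# Front-end proxy + disk cache (Apache 2.2, mod_disk_cache)
CacheRoot        /var/cache/apache2/mod_disk_cache
CacheEnable      disk /
CacheDirLevels   3
CacheDirLength   1

# Everything not cached is proxied to the back-end mod_perl build
ProxyPass        / http://127.0.0.1:8080/
ProxyPassReverse / http://127.0.0.1:8080/
```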

I am finding that the cache directories mod_cache builds are very large and take a long time to traverse under ext2. There is currently about 10 GB under the cache according to du, and it took 162 minutes just to tell me that. Basically, htcacheclean is not keeping up. I'm using three levels of directory. htcacheclean also takes a long time to process this if I run it nightly from cron: I see a huge spike in iowait on the server for the duration, and it takes upward of 3 hours to complete. If I run htcacheclean in daemon mode with the -n (nice) option, it doesn't seem to be able to keep up; the cache just creeps up in size. If I take off the nice option, it uses a lot more resources, to the point where I'm concerned it'll impact server performance by monopolising the disks.
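The two invocations I've tried look roughly like this (the cache path is illustrative; the 1000M limit and 60-minute interval match what I described above):

```
# Daemon mode with the nice option (current setup - can't keep up):
htcacheclean -d60 -n -t -p/var/cache/apache2/mod_disk_cache -l1000M

# The nightly cron alternative (3+ hours of heavy iowait):
# 0 3 * * * htcacheclean -t -p/var/cache/apache2/mod_disk_cache -l1000M
```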

So what I'm observing is that at least part of the problem appears to be that the directory structure is just very, very big and wide, and takes a long time to traverse even for basic system tools like du.

This leads to my main question, which is this: would a different filesystem, perhaps reiserfs, be better for this type of cache? I have never used reiser before, but by reputation it seems to be designed for handling many small files efficiently. I wonder if it would be any easier for my system to traverse the directories and maintain the cache if it were under reiser rather than ext.

If not that, then are there other filesystems which make it very efficient to traverse wide directory structures?

I have a quad-core server (AMD Opteron 265) with four 10k SCSI drives set up in RAID0 (yeah, I know it's risky, but everything is backed up immediately via MySQL replication, and I need the space and performance).

Thanks!

Neil
