Hi all,
I posted this to the Apache httpd users list, but got no reply there, so I'm
posting here in the hopes that someone else who uses mod_perl with
mod_cache in a reverse proxy setup might have insight.
I am using Apache 2.2.9 (built from source) on Debian Lenny to run a
fairly large community LAMP (Perl, MySQL) site. I use Apache's proxy and
cache modules to improve site performance - I have a front-end proxy
build and a back-end mod_perl build, both on the same server currently.
I have been using this setup for years successfully, but most of that
time was using Apache 1.3, with mod_accel and mod_deflate from Igor
Sysoev. Since moving to Apache 2.2, I am using the stock caching.
The cache and front-end proxy help to serve images without bogging down
the heavy mod_perl processes, while also obviously caching the mod_perl
content. The site gets around 100,000 page requests or more per day. The
cache is set to 1000MB, with htcacheclean running in daemon mode,
interval 60 minutes (but looking at the performance charts, it seems to
be running constantly).
I am finding that the cache directories that mod_cache builds are very
large, and take a long time to traverse under ext2. There is currently
about 10 GB under the cache according to du, and it took 162 minutes
just to tell me that. Basically, htcacheclean is not keeping up. I'm
using three levels of directories. Running htcacheclean nightly from
cron also takes a long time: I see a huge spike in iowait on the server,
and it takes upward of 3 hours to complete. If I run htcacheclean in
daemon mode with the -n (nice) option, it can't keep up and the cache
just creeps up in size. If I take off the nice option, it uses a lot
more resources, to the point where I'm concerned it will hurt server
performance by monopolising the disks.
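
As a rough illustration of how wide a three-level hashed cache tree can
get, here is a small sketch. Only the three levels come from my setup;
the 64-character alphabet and the two-character directory names are
assumptions made for the arithmetic, so treat the numbers as an upper
bound under those assumptions rather than a description of my actual
cache.

```python
# Upper bound on directory fan-out for a hashed cache layout.
# ALPHABET_SIZE and DIR_LENGTH are assumptions, not values from my config.
ALPHABET_SIZE = 64  # assumed: distinct characters per position in a directory name
DIR_LENGTH = 2      # assumed: characters per directory name
DIR_LEVELS = 3      # from the post: three levels of directories


def max_dirs_per_level(alphabet_size: int, dir_length: int) -> int:
    """Largest number of distinct subdirectories one directory can hold."""
    return alphabet_size ** dir_length


def max_leaf_dirs(alphabet_size: int, dir_length: int, levels: int) -> int:
    """Largest number of distinct directories at the deepest level."""
    return max_dirs_per_level(alphabet_size, dir_length) ** levels


if __name__ == "__main__":
    print("max subdirectories per directory:",
          max_dirs_per_level(ALPHABET_SIZE, DIR_LENGTH))
    print("max leaf directories:",
          max_leaf_dirs(ALPHABET_SIZE, DIR_LENGTH, DIR_LEVELS))
```

The point is only that the possible leaf directories vastly outnumber the
cached objects, so in practice almost every object ends up in its own
chain of directories.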
So what I'm observing is that at least part of the problem appears to be
that the directory structure is just very, very big and wide and takes a
long time to traverse, even for basic system functions like du.
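
To make "traverse" concrete, this is roughly the kind of walk I mean: a
minimal sketch that counts directories and files and sums file sizes
under a given root. The function name and the command-line handling are
made up for illustration. Note that it sums apparent file sizes, which
is not the same thing as the block usage that du reports.

```python
import os
import sys


def summarize_tree(root):
    """Return (directories, files, total_bytes) for everything under root."""
    dirs = files = total_bytes = 0
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            entries = list(os.scandir(current))
        except OSError:
            # The directory may vanish or be unreadable while the cache is live.
            continue
        for entry in entries:
            try:
                if entry.is_dir(follow_symlinks=False):
                    dirs += 1
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    files += 1
                    total_bytes += entry.stat(follow_symlinks=False).st_size
            except OSError:
                # The entry was removed between listing and stat; skip it.
                continue
    return dirs, files, total_bytes


if __name__ == "__main__":
    # The root is taken from the command line; "." is just a placeholder default.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    d, f, b = summarize_tree(root)
    print(f"{d} directories, {f} files, {b / 2**20:.1f} MiB (apparent size)")
```

Timing a run of this against the cache root gives a comparison point for
du that is independent of htcacheclean.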
This leads to my main question, which is this: Would a different
filesystem, perhaps reiserfs, be better for this type of cache? I have
never used reiser before, but from reputation it seems to be designed
for handling many small files efficiently. I wonder if it would be any
easier for my system to traverse the directory and maintain the cache if
it were under reiserfs rather than ext2.
If not that, then are there other filesystems which make it very
efficient to traverse wide directory structures?
I have a quad core server (AMD Opteron 265), with four 10k SCSI drives
set up in RAID0 (yeah I know it's risky, but everything is backed up
immediately via mysql replication, and I need the space and performance).
Thanks!
Neil
- Best filesystem type for mod_cache in reverse proxy? Neil Gunton