I've been experiencing serious performance problems, apparently I/O-related, on a dual-Xeon mail server running kernel 2.4.22.
The server has a SuperMicro Super P4DP6 motherboard with dual 2.4 GHz Xeon processors and 4 GB of RAM. One of the two onboard Adaptec 7899P SCSI chipsets controls the disk that holds the OS, and a QLogic QLA2200 fibre channel PCI card connects an external array. A two-gigabyte partition on the disk is devoted to swap. The server is running Debian 3.0 with a hand-compiled 2.4.22 kernel.

Once all of the RAM is in use as cache, sudden spikes in activity seem to cause the machine to grind to a halt for anywhere between thirty seconds and twenty minutes. I'm not sure how much of this results from a bunch of SMTP, POP3, and MySQL processes piling up and waiting for the disk, but I can trigger the problem at will by running the "find" command on the root filesystem or the external array.

Swap usage never goes beyond a couple of megabytes. When "ps" finally finishes running while the machine is in this state, it just shows a lot of processes in the sleep state and a lot of zombies that are exiting. The problem goes away, at least for a while, if I kill all the MTA-related processes and then restart them. I don't see any relevant messages in syslog. The server isn't getting hammered by spammers or anything like that when it becomes unresponsive.

I've tried both the kernel's qlogicfc driver and QLogic's qla2200 driver, but neither has any clear advantage over the other. Heavy usage of either the SCSI disk or the array can trigger the problem anyway, so my suspicion is that the bottleneck is at a higher level in the kernel. For the SCSI disk I am using the aic7xxx driver.

I have High Memory Support, HIGHMEM I/O Support, SMP, and ACPI enabled, and I see the same behavior with HIGHMEM I/O and ACPI turned off. I can put the .config file online as well, if that would help.

Does anyone have any ideas?

Thanks,
Dan
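
P.S. Since "ps" itself takes so long during a stall, here is a minimal sketch of how the process states could be tallied straight from /proc instead. It is only an illustration, not something the results above are based on: the function names are made up, and it assumes a current Python 3 rather than whatever ships with Debian 3.0. The point of the tally is to separate "D" (uninterruptible sleep, usually waiting on I/O) from ordinary "S" sleepers and "Z" zombies.

```python
import glob
from collections import Counter


def state_of(stat_text):
    """Return the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split after the LAST ')' instead of on whitespace.
    rest = stat_text[stat_text.rfind(")") + 1:].split()
    return rest[0] if rest else "?"


def tally_states(stat_texts):
    """Count how many processes are in each state (S, D, Z, R, ...)."""
    return Counter(state_of(text) for text in stat_texts)


def read_proc_stats():
    """Read every /proc/<pid>/stat that is still readable."""
    texts = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as handle:
                texts.append(handle.read())
        except OSError:
            continue  # the process exited between listing and reading
    return texts


if __name__ == "__main__":
    for state, count in sorted(tally_states(read_proc_stats()).items()):
        print(state, count)
```

If most of the stuck processes show up as "D" rather than "S" while the box is unresponsive, that would at least suggest they are blocked in the kernel on I/O rather than simply idle.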