System: 4-way Opteron, generic Debian Sarge AMD64
RAID controller: LSI Logic MegaRAID 320-1, 64MB cache
RAID config: Three 146GB 15K SCSI/320 disks, RAID-5
Kernel: 2.6.14 SMP, includes megaraid driver

The above system is incredibly fast under almost all conditions, except when writing very large files (say, 100s of MB, or even GB). When writing such files, the system effectively locks up for many seconds: typically, for as long as it takes to finish writing/flushing the file to disk.

This lock-up affects all other processes: local text editor sessions hang, workstations with /home NFS-mounted hang, and the web server stops serving. (I guess all the affected processes are those which are contending for disk write access, actually.)

In particular, the workstations which have /home NFS-mounted experience a *workstation* hang (if trying to write) during the *server* disk flush, which is very frustrating. Given that a 'write' may simply involve updating a web browser history stored in /home, this is an extremely serious problem.

Example while the system is idle, out of work hours: while creating a 1GB file (copying an existing file, already cached in RAM), 'vmstat 3' shows the following:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd    free   buff   cache   si   so    bi    bo   in    cs us sy  id wa
 0  0      0 4280268  55888 3366484    0    0     0     0  260    49  0  0 100  0
 1  1      0 3777432  56364 3847652    0    0     0  6419  319    55  0 23  77  0
 1  4      0 3258408  56852 4342476    0    0     0 10243  407    45  0 25  75  0
 0  3      0 3070296  57028 4520868    0    0     0  8856  403    99  0  9  75 16
 0  4      0 3068152  57044 4520852    0    0     5  9561  417   153  0  1  72 27
 0  3      0 3069316  57044 4520852    0    0     0 10240  429   144  0  0  75 25
 0  3      0 3069356  57044 4520852    0    0     0 10219  411    85  0  0  75 25
 0  3      0 3069368  57044 4520852    0    0     0  8876  391    78  0  0  75 25
[...]
 0  2      0 3077856  57044 4520852    0    0     0  9557  409    44  0  0  75 25
 0  2      0 3077856  57044 4520852    0    0     0  8875  384    41  0  0  75 25
 0  1      0 3097748  57044 4520852    0    0     0  7704  421    42  0  2  73 25
 0  0      0 3100096  57048 4520848    0    0     0    56  259    20  0  0  99  1
 0  0      0 3100112  57052 4520844    0    0     0   552  362    32  0  0  97  3
 0  0      0 3100112  57052 4520844    0    0     0     0  270    63  0  0 100  0
 0  0      0 3100384  57052 4520844    0    0     0     5  260    39  0  0 100  0

I see that the 'bo' column ("blocks written to block device") kicks in, and it takes approximately two minutes to finish flushing this file to disk. That makes a disk write rate of less than 10MB/sec, which strikes me as very slow.

I also see that the CPU IO-Wait column ('wa') shows 25% while this is happening: this corresponds to one of our four CPUs, meaning that CPU is waiting for the file to flush to disk, presumably.

Once the flush finishes, the disk and CPU state returns to idle.

I have already tried:

- a couple of different kernels: the stock Sarge kernel 2.6.8-11-amd64-k8-smp, and a custom-compiled 2.6.14 kernel. I configured the custom kernel to use the pre-emptible features designed for desktop use, in the hope that the other interactive processes would benefit from this. The choice of kernel doesn't seem to affect the behaviour I describe above.

Should I expect this kind of performance when writing large files? If not, then what can be done to improve this kind of write performance?

The RAID controller is currently set to "write-through". I understand that, in theory, better write performance may be obtained by using "write-back", although I don't see how that would help for files that are many times larger than the RAID controller cache (64MB vs. files of 100s of MB). I understand the potential data-loss implications of using write-back.

Thoughts/comments on changing to "write-back" in these circumstances? Any other suggestions or reports of similar experiences?

Cheers,
Dave.

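P.S. As a rough cross-check of the "less than 10MB/sec" figure, the 'bo' values from the vmstat rows shown above can be averaged, as in the minimal sketch below. It assumes that vmstat reports 'bo' in 1 KiB blocks per second (I believe this is the case on Linux, but treat it as an assumption), and it only uses the rows actually quoted, not those elided with [...]. The function names are just for illustration.

```python
# Rough estimate of sustained write throughput from the 'vmstat 3' rows above.
# ASSUMPTION: 'bo' is reported in 1 KiB blocks per second.
BLOCK_BYTES = 1024

# 'bo' values from the quoted rows while the flush was in progress
# (rows elided with [...] are not included).
bo_samples = [6419, 10243, 8856, 9561, 10240, 10219, 8876, 9557, 8875, 7704]


def mean_write_rate_mb_s(samples, block_bytes=BLOCK_BYTES):
    """Average write rate in MB/s (1 MB = 10**6 bytes) over the samples."""
    mean_blocks_per_s = sum(samples) / len(samples)
    return mean_blocks_per_s * block_bytes / 1e6


def seconds_to_flush(file_bytes, rate_mb_s):
    """Time needed to write file_bytes at the given sustained rate."""
    return file_bytes / (rate_mb_s * 1e6)


rate = mean_write_rate_mb_s(bo_samples)
print(f"mean write rate: {rate:.1f} MB/s")
print(f"time to flush 1 GiB at that rate: {seconds_to_flush(2**30, rate):.0f} s")
```

Under that assumption the result is in line with the roughly two minutes observed for the 1GB file.
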
--
Dave Ewart
[EMAIL PROTECTED]
Computing Manager, Cancer Epidemiology Unit
Cancer Research UK / Oxford University
PGP: CC70 1883 BD92 E665 B840 118B 6E94 2CFD 694D E370
Get key from http://www.ceu.ox.ac.uk/~davee/davee-ceu-ox-ac-uk.asc
N 51.7518, W 1.2016