The vi we were doing was on a 2-line file. If you just vi a new file, add
one line, and exit, it would take 15 minutes in fdsync. On the
recommendation of a workaround we set zfs:zil_disable=1; after the reboot
the fdsync is now < 0.1 seconds. I have no idea whether it was this setting
or the fact that we went through a reboot. Whatever the root cause, we are
now back to a well-behaved file system.

thanks
sean
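For reference, a minimal sketch of how this workaround is commonly
applied, assuming the tunable was set persistently in /etc/system (the
message above does not say exactly how it was set). The change only takes
effect after a reboot, and it trades away ZIL-backed synchronous write
semantics until the line is removed again:

    * /etc/system: disable the ZFS intent log (workaround only; this
    * assumes the tunable was applied here rather than live via mdb)
    set zfs:zil_disable = 1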
Roch wrote:

15 minutes to do an fdsync is way outside the slowdown usually seen. The
footprint for 6413510 is that when a huge amount of data is being written
non-synchronously and an fsync comes in for the same filesystem, all the
non-synchronous data is also forced out synchronously. So is there a lot
of data being written during the vi?

vi will write the whole file in 4K chunks and fsync it (based on a single
experiment). So for a large-file vi, on quit, we have lots of data to sync
in and of itself. But because of 6413510 we potentially have to sync lots
of other data written by other applications as well.

Now take a Niagara with lots of available CPUs and lots of free memory
(32GB maybe?) running some 'tar x' in parallel. A huge chunk of the 32GB
can end up dirty; too much, I'd say, because of the lack of throttling:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6429205
6429205 each zpool needs to monitor it's throughput and throttle heavy writers

Then vi, on quit, fsyncs, and all of the pending data must sync. So we
have extra data to sync because of:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6413510
6413510 zfs: writing to ZFS filesystem slows down fsync() on other files in the same FS

Furthermore, we can be slowed by this:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6440499
6440499 zil should avoid txg_wait_synced() and use dmu_sync() to issue parallel IOs...

Note: 6440499 is now fixed in the gate.

And finally, all this data goes to a single disk; worse, a slice of a
disk. Since it's just a slice, ZFS can't enable the write cache. Then, if
there is no tag queue (is there?), we will handle everything one I/O at a
time. If it's a SATA drive we have other issues... I think we've hit it
all here.

So can this lead to a 15-minute fsync? I can't swear to it (actually, I
won't be convinced myself before I convince you), but we do have things to
chew on already. Do I recall that this is about a 1GB file in vi?
:wq-uitting out of a 1GB vi session on a 50MB/sec disk will take 20 sec
when everything hums and there is no other traffic involved. With no write
cache / no tag queue, maybe 10x more.

-r
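To make the save path concrete, here is a minimal C sketch of the write
pattern Roch describes above: the whole buffer written out in 4K chunks,
followed by a single fsync(). The chunk size and the trailing fsync() come
from his single-experiment observation, not from vi's source, so treat
save_buffer() as illustrative only.

    #include <sys/types.h>
    #include <fcntl.h>
    #include <unistd.h>

    /*
     * Illustrative only: write a buffer the way the vi save was observed
     * to behave (4 KB write()s, then one fsync()).  With 6413510 unfixed,
     * that fsync() also drags out unrelated dirty data in the same ZFS
     * filesystem, which is where the minutes can go.
     */
    static int
    save_buffer(const char *path, const char *buf, size_t len)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        size_t off;

        if (fd < 0)
            return (-1);
        for (off = 0; off < len; off += 4096) {
            size_t chunk = (len - off < 4096) ? len - off : 4096;
            if (write(fd, buf + off, chunk) != (ssize_t)chunk) {
                (void) close(fd);
                return (-1);
            }
        }
        if (fsync(fd) < 0) {    /* the slow part in this thread */
            (void) close(fd);
            return (-1);
        }
        return (close(fd));
    }

At 50 MB/sec the fsync() works out to roughly 20 seconds per gigabyte of
dirty data in the well-behaved case, matching the estimate above; the bugs
listed earlier explain how it can get much worse.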