The file we were editing in vi was a two-line file. If you just vi a new file, add one line, and exit, it would take 15 minutes in fdsync. Following a recommended workaround we set, in /etc/system:
set zfs:zil_disable=1

After the reboot, fdsync now takes < 0.1 seconds. I have no idea whether it was this setting or the reboot itself that helped. Whatever the root cause, we are now back to a well-behaved file system.
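For what it's worth, zil_disable can also be read and flipped on a live kernel with mdb, which would have separated the setting's effect from the reboot's. A sketch, using standard mdb -k usage (with the usual caveat that disabling the ZIL gives up synchronous-write guarantees on power loss):

	# read the current value on the running kernel
	echo 'zil_disable/D' | mdb -k

	# set it to 1 without a reboot (-w opens the kernel writable)
	echo 'zil_disable/W 1' | mdb -kw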

thanks
sean


Roch wrote:
  15 minutes to do an fdsync is way outside the slowdown usually seen.
  The footprint for 6413510 is that when a huge amount of data is being
  written non-synchronously and an fsync comes in for the same
  filesystem, all the non-synchronous data is also forced out
  synchronously. So is there a lot of data being written during the vi?

vi will write the whole file (in 4K chunks) and fsync it
(based on a single experiment).
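That's easy to double-check with truss; a sketch, with a made-up file name:

	# log vi's write and sync calls to a file, then count the writes
	truss -t write,fsync,fdsync -o /tmp/vi.truss vi /tmp/example.txt
	grep -c '^write' /tmp/vi.truss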

So for a large-file vi, on quit we have lots of data to sync in and of
itself. But because of 6413510 we potentially have to sync lots of
other data written by other applications.

Now take a Niagara with lots of available CPUs and lots
of free memory (32GB maybe?) running some 'tar x' in
parallel. A huge chunk of the 32GB can end up as dirty.
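Something like the following is all it takes to get there (paths and counts are made up):

	# hypothetical load: parallel extractions all dirtying the same pool
	for i in 1 2 3 4; do
		( cd /tank/build$i && tar xf /var/tmp/bigtree.tar ) &
	done
	wait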

I'd say too much of it ends up dirty, for lack of write throttling:

	http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6429205
	6429205 each zpool needs to monitor its throughput and throttle heavy writers

Then vi quits (:wq) and fsyncs, and all of the pending data must
sync. So we have extra data to sync because of:

	http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6413510
	6413510 zfs: writing to ZFS filesystem slows down fsync() on other files in the same FS
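That footprint should be straightforward to reproduce; a sketch with hypothetical paths and sizes (ex is the same binary as vi, so quitting it should show the same write+fsync pattern):

	# stream async data in the background, then time a vi-style write+fsync
	dd if=/dev/zero of=/tank/fs/bigfile bs=1024k count=4096 &
	ptime ex '+wq' /tank/fs/two-line-file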

Furthermore, we can be slowed by this:

	http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6440499
	6440499 zil should avoid txg_wait_synced() and use dmu_sync() to issue parallel IOs...

Note: 6440499 is now fixed in the gate.

And finally, all this data goes to a single disk. Worse, a slice of a
disk. Since it's just a slice, ZFS can't enable the write cache. Then,
if there is no tag queue (is there?), we will handle everything one
I/O at a time. If it's a SATA drive we have other issues...
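The write-cache state is easy to check from format's expert mode; a sketch, with a made-up device name:

	# inspect the drive's write cache (interactive format(1M) session)
	format -e c0t0d0
	#   format> cache
	#   cache> write_cache
	#   write_cache> display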

I think we've hit it all here. So can this lead to a 15-minute fsync?
I can't swear to it; actually, I won't be convinced myself before I
convince you, but we do have things to chew on already.


Do I recall that this is about a 1 GB file in vi?
:wq-uitting out of a 1 GB vi session on a 50 MB/sec disk will take
20 sec when everything hums and there is no other traffic involved.
With no write cache / no tag queue, maybe 10X more.
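Back-of-the-envelope, assuming vi really does issue 4K writes and each
one has to commit individually: 1 GB / 4 KB = 262,144 I/Os, and at a
(hypothetical) ~5 ms of disk latency per I/O that's roughly 1,300 sec,
i.e. north of 20 minutes. So a 15-minute fdsync is at least in the
right ballpark for the no-write-cache, no-tag-queue case.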

-r


--
Sean Meighan
Mgr ITSM Engineering

Sun Microsystems, Inc.
US
Phone x32329 / +1 408 850-9537
Mobile 303-520-2024
Fax 408 850-9537
Email [EMAIL PROTECTED]
