On Mon, 3 Jun 2013, Jeremy Chadwick wrote:

1. There is no such thing as 9.1-CURRENT.  Either you meant 9.1-STABLE
(what should be called stable/9) or -CURRENT (what should be called
head).

I wrote:
The oldest kernel I have that shows the syndrome is -

    FreeBSD aukward.bogons 9.1-STABLE FreeBSD 9.1-STABLE #59 r250498:
    Sat May 11 00:03:15 MDT 2013
    toor@aukward.bogons:/usr/obj/usr/src/sys/GENERIC  amd64

See above.  You're right, I shouldn't post after a 07:00 dentist's
appointment while my spouse is worrying me about the insurance adjuster's
report on the car damage :(.  Hey, I'm very fallible.  I'll try harder.

2. Is there some reason you excluded details of your ZFS setup?
"zpool status" would be a good start.

Thanks for the useful hint as to what info you need to diagnose.

One of the machines ran a 5-drive raidz1 pool (Mnemosyne).

Another was a 2-drive gmirror, in the simplest possible gpart/gmirror
setup (Mnemosyne-sub-1).

The third is a 2-drive ZFS raid-1, again set up in the simplest possible
gpart/ZFS manner (Aukward).

The fourth is a conceptually identical 2-drive ZFS raid-1, but swapping
to a zvol (Griffon).

If you look at the FreeBSD wiki and at freebsdwiki.net, the pages covering
a bootable ZFS root (gptzfsboot) and a bootable gmirror are -

          https://wiki.freebsd.org/RootOnZFS
          http://www.freebsdwiki.net/index.php/RAID1,_Software,_How_to_setup

Well, I just followed those cookbook-style (modulo device and pool
names), as sketched below.  I didn't see any reason to be creative; I
build for reliability, not performance.
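
For the record, the recipe boils down to roughly this - a from-memory
sketch of the RootOnZFS steps, where ada0/ada1, the labels, and the pool
name are placeholders rather than copies from the real boxes:

    # GPT layout: tiny boot partition plus one big freebsd-zfs partition
    gpart create -s gpt ada0
    gpart add -t freebsd-boot -s 128k ada0
    gpart add -t freebsd-zfs -l disk0 ada0
    gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
    # ...repeat for ada1 with label disk1...

    # mirrored root pool on the GPT labels, marked bootable
    zpool create pool0 mirror gpt/disk0 gpt/disk1
    zpool set bootfs=pool0 pool0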

Aukward is gpart/zfs raid-1 box #1:

    aukward:/u0/rwa > ls -l /dev/gpt
    total 0
    crw-r-----  1 root  operator  0x91 Jun  3 10:18 vol0
    crw-r-----  1 root  operator  0x8e Jun  3 10:18 vol1

    aukward:/u0/rwa > zpool list -v
    NAME           SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
    ult_root       111G   108G  2.53G    97%  1.00x  ONLINE  -
      mirror       111G   108G  2.53G         -
        gpt/vol0      -      -      -         -
        gpt/vol1      -      -      -         -

    aukward:/u0/rwa > zpool status
      pool: ult_root
     state: ONLINE
      scan: scrub repaired 0 in 1h13m with 0 errors on Sun May  5 04:29:30 2013
    config:

            NAME          STATE     READ WRITE CKSUM
            ult_root      ONLINE       0     0     0
              mirror-0    ONLINE       0     0     0
                gpt/vol0  ONLINE       0     0     0
                gpt/vol1  ONLINE       0     0     0

    errors: No known data errors

(Yes, that machine has no swap.  It has NEVER had swap; it has 16 GB of
RAM and uses maybe 10% of it at max load.  It has been running 9.x since
prerelease days, FWIW.  The ARC is throttled to 2 GB; zfs-stats says I
never get near using even that.  It's just the box that drives the radios,
a ham radio hobby machine.)
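
The throttle is nothing exotic - just the stock loader tunable, something
like this in /boot/loader.conf (value spelled out in bytes for 2 GB):

    vfs.zfs.arc_max="2147483648"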

Griffon is also gpart/zfs raid-1 -

    griffon:/u0/rwa > uname -a
        FreeBSD griffon.cs.athabascau.ca 9.1-STABLE FreeBSD 9.1-STABLE #25 r251062M:
        Tue May 28 10:39:13 MDT 2013
        t...@griffon.cs.athabascau.ca:/usr/obj/usr/src/sys/GENERIC
        amd64

    griffon:/u0/rwa > ls -l /dev/gpt
    total 0
    crw-r-----  1 root  operator  0x7b Jun  3 08:38 disk0
    crw-r-----  1 root  operator  0x80 Jun  3 08:38 disk1
    crw-r-----  1 root  operator  0x79 Jun  3 08:38 swap0
    crw-r-----  1 root  operator  0x7e Jun  3 08:38 swap1

and the pool is fat and happy -

    griffon:/u0/rwa > zpool status -v
      pool: pool0
     state: ONLINE
      scan: none requested
    config:

            NAME           STATE     READ WRITE CKSUM
            pool0          ONLINE       0     0     0
              mirror-0     ONLINE       0     0     0
                gpt/disk0  ONLINE       0     0     0
                gpt/disk1  ONLINE       0     0     0

    errors: No known data errors

Note that swap is through a ZFS zvol:

    griffon:/u0/rwa > cat /etc/fstab
    # Device        Mountpoint      FStype  Options         Dump    Pass#
    #
    #
    /dev/zvol/pool0/swap none       swap    sw              0       0

    pool0           /               zfs     rw              0       0
    pool0/tmp       /tmp            zfs     rw              0       0
    pool0/var       /var            zfs     rw              0       0
    pool0/usr       /usr            zfs     rw              0       0
    pool0/u0        /u0             zfs     rw              0       0

    /dev/cd0        /cdrom          cd9660  ro,noauto       0       0
    /dev/ada2s1d    /mnt0           ufs     rw,noauto       0       0
    /dev/da0s1      /u0/rwa/camera  msdosfs rw,noauto       0       0
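
For completeness: the swap zvol behind that first fstab line would have
been created with something like the command below - the 8G size here is
illustrative, not the actual number from the box:

    zfs create -V 8G pool0/swap    # shows up as /dev/zvol/pool0/swap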

The machine has 32 GB of RAM and never swaps.  It runs VirtualBox loads,
anything from one to forty VMs (little OpenBSD images).  Load is always light.

As for the 5-drive raidz1 box (Mnemosyne), I first replaced the ZFS pool
with a simple gpart/gmirror; the gmirrored drives are known good.  That
*also* ran like mud.  Then I downgraded to 8.4-STABLE with a GENERIC
kernel, and it's just fine now, thanks.
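
That gmirror was nothing fancier than the usual whole-disk recipe from
the second wiki link above - roughly, with ada0/ada1 as stand-in device
names:

    gmirror load
    gmirror label -v -b round-robin gm0 /dev/ada0 /dev/ada1
    echo 'geom_mirror_load="YES"' >> /boot/loader.conf
    # then partition and newfs /dev/mirror/gm0 as usual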

I have the five raidz1 disks that were pulled sitting in a second 4-core
server chassis on my desk, and they fail in that machine in the same way
the production box died.  I'm 150 km away and the power went down over the
weekend at the remote site, so I'll have to wait until tomorrow to send
you those details.

For now, think cut-and-paste from the FreeBSD wiki, nothing clever,
everything as simple as possible.  Film at 11.

3. Do any of your filesystems/pools have ZFS compression enabled, or
have in the past?

No; disk is too cheap to bother with that.

4. Do any of your filesystems/pools have ZFS dedup enabled, or have in
the past?

No; disk is too cheap to bother with that.

5. Does the problem go away after a reboot?

It goes away for a few minutes, and then comes back on little cat feet.
Gradual slowdown.

6. Can you provide smartctl -x output for both ada0 and ada1?  You will
need to install ports/sysutils/smartmontools for this.  The reason I'm
asking for this is there may be one of your disks which is causing I/O
transactions to stall for the entire pool (i.e. "single point of
annoyance").

Been down that path - good call.  Mnemosyne (the raidz1) checked clean as
a whistle.  (Later) Griffon checks out clean, too - both -x and -a.
Aukward might have an iffy device; I will schedule some self-tests and
post everything, all neatly tabulated.
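
The plan for those self-tests is just the stock smartmontools invocations,
per drive (ada0 here is a stand-in):

    smartctl -t long /dev/ada0      # start a long offline self-test
    smartctl -l selftest /dev/ada0  # read back the result when it's done
    smartctl -x /dev/ada0           # extended output for the write-up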

I've already fought a bad disk, and also just-slightly-iffy cables, in a
ZFS context, and that time was nothing like this one.

7. Can you remove ZFS from the picture entirely (use UFS only) and
re-test?  My guess is that this is ZFS behaviour, particularly the ARC
being flushed to disk, and your disks are old/slow.  (Meaning: you have
16GB RAM + 4 core CPU but with very old disks).

Already did that.  A gmirror 9.1 (Mnemosyne-sub-1) box slowly choked
and died just like the ZFS instance did.  An 8.4-STABLE back-rev
without hardware changes was the fix.

Also: I noticed that when I mounted the 9.1 raidz pool from an 8.4 flash
fixit disk, everything ran quickly and stably.  I copied about 635 GB
worth of ~3 GB .pcap files out of the raidz onto a SCSI UFS filesystem,
and the ZFS disks were all about 75 to 80% busy for the ~8000 seconds the
copy was running.  No slowdowns, no stalls.
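(For scale, 635 GB in ~8000 seconds works out to roughly 80 MB/s sustained
off the pool.)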

BTW, I'd like to thank you for your kind interest, and please forgive
my poor reporting skills - I'm at home, work is 150 km away, the phone
keeps ringing, there are a lot of boxes, I'm sleep deprived, whine &
snivel, grumble & moan ;)

regards,
Ross
--
Ross Alexander, (780) 675-6823 desk / (780) 689-0749 cell, r...@athabascau.ca

        "Always do right. This will gratify some people,
         and astound the rest."  -- Samuel Clemens
