On December 13, 2018 9:14:46 PM UTC, Lou Picciano <loupicci...@comcast.net> 
wrote:
>The plot thickens, I’m afraid. Since last post, I’ve replaced the
>drive, and throughput remains molasses-in-January slow… the resilver
>period indicated below is already more than 24 hours:
>
>  scan: resilver in progress since Wed Dec 12 15:13:11 2018
>    26.9G scanned out of 1.36T at 345K/s, (scan is slow, no estimated time)
>    26.9G resilvered, 1.93% done
>config:
>
>        NAME                STATE     READ WRITE CKSUM
>        rpool               DEGRADED     0     0     0
>          mirror-0          DEGRADED     0     0     0
>            replacing-0     DEGRADED     0     0     0
>              c2t0d0s0/old  OFFLINE      0     0     0
>              c2t0d0s0      ONLINE       0     0     0
>            c2t1d0s0        ONLINE       0     0     0
>
>I have added boot blocks during the resilver, but used the bootadm
>install-bootloader approach. It reports having run this (not GRUB; aren't
>we on the Forth-based loader now?):
>
>"/usr/sbin/installboot -F -m -f //boot/pmbr //boot/gptzfsboot
>/dev/rdsk/c2t0d0s0"
>"/usr/sbin/installboot -F -m -f //boot/pmbr //boot/gptzfsboot
>/dev/rdsk/c2t1d0s0"
>
>Will double-check this when the resilver completes. Could be a long time…
>The machine has not been rebooted yet at all.
>
>Bob, yes: This is a 4K sector drive. Bad? What’s the impact?
>
>Slowness at boot: Yes, immediately. Well before scrub or any other
>process had had a chance to grab hold.
>
>What’s next? Could it be as simple as a cable? These cables haven’t
>been perturbed in… a long time.
>
>Can’t do anything safely now until this silvering is completed,
>correct?
>
>Wow. A mess.
>
>Lou Picciano
>
>> On Dec 11, 2018, at 7:56 PM, jason matthews <ja...@broken.net> wrote:
>> 
>> 
>> On 12/11/18 10:14 AM, John D Groenveld wrote:
>>> And when its replaced, I believe the OP will need to installboot(1M)
>>> the new drive.
>>> Correct me if I'm wrong, but Illumos ZFS doesn't magically put
>>> the boot code with zpool replace.
>> 
>> man installgrub
>> 
>> installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/<whatever>
>> 
>> 
>> should take like one second and can be done before, after, or during
>> the resilver.

To me it also looks like very fragmented I/O (e.g. the pool was nearly full at some 
point and data got written in many scattered bits), so when the scrub/resilver tries 
to read the data of a TXG, and/or read a large file like the miniroot, it has to do 
a lot of seeks for small reads (remember: roughly 200 ops/s max for spinning rust), 
leading to high busy-ness and low bandwidth.
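
For example, watching the disks with plain iostat while this goes on should show 
the seek-bound pattern (the thresholds below are just my rule of thumb, not hard 
limits):

  # extended per-device stats, refreshed every 5 seconds
  iostat -xn 5
  # seek-bound symptom: %b near 100 and asvc_t high, while kr/s + kw/s stay low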

To an extent, hardware slowness of the drives can be evaluated with a large 
sequential dd read across a long stroke of the disk - that should run in the tens 
of MB/s range (the numbers depend on hardware age and other factors), while highly 
random I/O on my pools was often under 1-5 MB/s back when I was worried about it.
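
For example, reading a couple of GB off the raw device of the healthy mirror side 
(the device name is taken from the zpool status above; adjust as needed, this is 
only a rough sketch):

  # timed sequential read of ~2 GB in 1 MB chunks from the raw disk device
  time dd if=/dev/rdsk/c2t1d0s0 of=/dev/null bs=1024k count=2048

Keep in mind this competes with the ongoing resilver for the same spindles, so the 
result will look pessimistic until that completes.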

Also, look at the number of datasets, particularly auto-snapshots. If some tools 
read those in repeatedly (e.g. the zfs-auto-snapshot or znapzend services looping 
to manage the snaps), that alone can mean a lot of continuous I/O. The metadata can 
be a lot to read (especially if it does not all fit into ARC or L2ARC) and parse; 
my systems were quite busy with roughly 100k snapshots (about 1000 datasets with 
100 snaps each), taking a couple of minutes just to read the list of snapshots, and 
a comparable while for 'beadm list' before andyf's recent fixes.
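
A quick way to gauge the scale (plain zfs list; the counts and the time taken are 
the interesting part):

  # how many datasets and snapshots there are, and how long listing them takes
  zfs list -H -t filesystem,volume -o name | wc -l
  time zfs list -H -t snapshot -o name | wc -l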

The DTrace Toolkit IIRC has some scripts to build histograms of I/O sizes, so you 
could rule out or confirm lots of scattered small I/Os as the problem.
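
If the toolkit is not at hand, a generic one-liner on the io provider gives a 
similar picture (this is not one of the toolkit scripts, just the usual idiom):

  # histogram of physical I/O sizes per device; a pile-up in the small buckets
  # points at scattered reads rather than large sequential ones
  dtrace -n 'io:::start { @[args[1]->dev_statname] = quantize(args[0]->b_bcount); }'
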
If the small-I/O pattern is there, and you can point a finger at read-time 
fragmentation, you might have to migrate the data away from this pool onto a new 
one, without parallelizing the copy routines, so that the on-disk bits end up laid 
out sequentially for the same files and/or TXG commits. This would also let you 
migrate the pool to 4K-sector disks, if that is part of the problem today.

Also, the terabytes range is a bit too much for a root pool. Given its criticality 
for the system and some of its limitations (device-ID change hell, caching, etc.), 
consider moving data that is not the root filesystem away to another pool, even if 
that pool lives in partitions on the same disks, or better yet on other disks. This 
would keep rpool's scope smaller and its changes less frequent, making the system 
more reliably bootable and better able to cope with issues on non-critical pools 
after the full OS is up. Maybe my write-ups on split-root setups and on wrapping 
zpool imports as SMF services can help here, or at least give some more ideas and 
details to look at.
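
As a rough sketch of that idea (the pool, disk and dataset names below are made up 
for illustration; copying serially is what keeps the new layout sequential):

  # create a separate data pool and move non-root data onto it, one tree at a time
  zpool create datapool c3t0d0
  zfs snapshot -r rpool/export@migrate
  zfs send -R rpool/export@migrate | zfs receive -u datapool/export

Once the copy is verified, the old datasets can be destroyed and the new ones 
mounted in their place.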

Good luck,
Jim Klimov

--
Typos courtesy of K-9 Mail on my Android

_______________________________________________
openindiana-discuss mailing list
openindiana-discuss@openindiana.org
https://openindiana.org/mailman/listinfo/openindiana-discuss
