This is your offending device:
$ pfexec smartctl -a -d sat,12 /dev/rdsk/c2t0d0s0 | grep Raw_Read
  1 Raw_Read_Error_Rate     0x000b   094   094   016    Pre-fail  Always       -       1376259
Try removing this disk.
The boot manager is in your BIOS. It currently points to one of your
rpool disks. Go into the boot manager, pick the other disk, and see
how it boots then. You can either set this up as a one-time boot or
change the setting so it is persistent.
Life should be better with the sick disk removed.
j.
On 12/11/18 8:16 AM, Lou Picciano wrote:
I have now (finally) managed to get perhaps the key bit of reporting from
smartctl - does this seem adequately diagnostic?:
(I am fully satisfied to replace the drive; I just want to be sure I’ve run to
ground any potential root causes.)
$ pfexec smartctl -a -d sat,12 /dev/rdsk/c2t0d0s0 | grep Raw_Read
  1 Raw_Read_Error_Rate     0x000b   094   094   016    Pre-fail  Always       -       1376259
$ pfexec smartctl -a -d sat,12 /dev/rdsk/c2t1d0s0 | grep Raw_Read
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
Above seems consistent with all the read errors I see at boot.
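To compare the two mirror halves without reading the full report each time, the raw column can be pulled out with a small filter. This is only a sketch: the function name smart_raw_value is made up for illustration, and it assumes the usual ten-column attribute table that smartctl prints, with the raw value in column 10. Keep in mind that the meaning of the raw value is vendor-specific, so the comparison between two identical drives is more telling than the absolute number.

```shell
# Print the RAW_VALUE column (field 10) of one named attribute from
# `smartctl -A` style output read on stdin.
# smart_raw_value is a hypothetical helper name, not part of smartmontools.
smart_raw_value() {
    awk -v attr="$1" '$2 == attr { print $10 }'
}

# Example use (device names taken from this thread; needs smartctl and
# sufficient privileges, so it is left commented out here):
#   for d in c2t0d0s0 c2t1d0s0; do
#       printf '%s ' "$d"
#       pfexec smartctl -A -d sat,12 "/dev/rdsk/$d" | smart_raw_value Raw_Read_Error_Rate
#   done
```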
What happens if you go into the boot manager and manually select a boot disk?
If the problem is with a single drive, then the other drive should boot
normally right? Try booting from both drives select each one manually.
That’s also interesting. With the hundreds of read errors at boot up, the boot
manager is never even (visibly) presented. I guess I could try this again from
a boot from USB image...
you can speed up the scrub with:
echo zfs_scrub_delay/W0x0 |mdb -kw
echo zfs_scan_min_time_ms/W0x0 |mdb -kw
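Both tunables are written into the running kernel through mdb -kw, so the change does not survive a reboot. It may still be worth noting the current values first so they can be put back afterwards. A rough sketch, assuming both variables are 32-bit integers on your kernel; the helper names mdb_read_expr and mdb_write_expr are invented here and only build the expression text:

```shell
# Build the mdb expression that prints a 32-bit kernel variable in decimal.
# (hypothetical helper; it only formats text and does not call mdb)
mdb_read_expr() {
    printf '%s/D\n' "$1"
}

# Build the mdb expression that writes a 32-bit value to a kernel variable.
# The value is given in decimal and emitted in hex, as in the commands above.
mdb_write_expr() {
    printf '%s/W0x%x\n' "$1" "$2"
}

# Example use on a live system (needs privileges, so commented out here):
#   mdb_read_expr zfs_scrub_delay | pfexec mdb -k        # show current value
#   mdb_write_expr zfs_scrub_delay 0 | pfexec mdb -kw    # set it to 0
```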
Good commands for reference. I was unaware of these! But, even with scrub
canceled for the moment, am still seeing virtually continuous drive controller
traffic.
You also wanted to see:
$ iostat -nMxC 5
                    extended device statistics
    r/s    w/s   Mr/s   Mw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  962.3    0.0   11.3 15.7  0.2   16.3    0.2   5  23 c2
    0.0  398.4    0.0    4.3  7.1  0.1   17.9    0.2  83   6 c2t0d0
    0.0  415.2    0.0    4.2  8.6  0.1   20.6    0.2  87   9 c2t1d0
    0.0   40.2    0.0    0.7  0.0  0.0    0.0    0.4   0   2 c2t2d0
    0.0   40.4    0.0    0.7  0.0  0.0    0.0    1.1   0   4 c2t3d0
    0.0   34.4    0.0    0.7  0.0  0.0    0.0    0.3   0   1 c2t4d0
    0.0   33.6    0.0    0.7  0.0  0.0    0.0    0.3   0   1 c2t5d0
Again, I assume the symmetry in findings between t0 and t1 is due to their
mirrored status… But doesn’t seem to help in differentiating offending device.
(For comparison, t2-t5 are the data pool.) There is essentially zero ‘user’
activity on either data or root pools...
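When watching several intervals of this output, it can help to filter for the devices that spend most of their time with requests queued. The sketch below assumes the eleven-column layout shown above (%w in column 9, device name in column 11); busy_waiters is a made-up name and the threshold of 50 is an arbitrary starting point, not a recommended value.

```shell
# Read `iostat -xn` style rows on stdin and print "device %w" for every
# row whose %w column (field 9) is at or above the given threshold.
# busy_waiters is a hypothetical helper; the default threshold is 50.
busy_waiters() {
    awk -v min="${1:-50}" '
        # Skip the banner and header lines: they either have a different
        # field count or a non-numeric value in the %w column.
        NF == 11 && $9 ~ /^[0-9.]+$/ && $9 + 0 >= min { print $11, $9 }
    '
}

# Example use (commented out; runs until interrupted):
#   iostat -nMxC 5 | busy_waiters 50
```

Fed the sample above, this prints only c2t0d0 and c2t1d0, which matches the observation that both halves of the mirror look alike here; it narrows things to the root pool but does not by itself say which of the two disks is at fault.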
_______________________________________________
openindiana-discuss mailing list
openindiana-discuss@openindiana.org
https://openindiana.org/mailman/listinfo/openindiana-discuss