[Bug 211013] Write error to UFS filesystem with softupdates panics machine

bugzilla-noreply Mon, 11 Jul 2016 10:57:05 -0700

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211013


            Bug ID: 211013
           Summary: Write error to UFS filesystem with softupdates panics
                    machine
           Product: Base System
           Version: 11.0-BETA1
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: k...@denninger.net

The machine in question had a UFS filesystem mounted with softupdates
enabled (on an SD card; I was updating a system that runs FreeBSD on a
Raspberry Pi2 by plugging the card into a different machine) and the card
took an unrecoverable I/O write error.

The result was a kernel panic.  This is apparently considered expected
behavior at present when softupdates are turned on for the filesystem, because
the filesystem may now be corrupt and there is no way to be sure with the
machine running; hence the choice to panic() when this situation occurs.

But it appears that the choice to panic() is too broad and unnecessary in that
in many cases a less-severe action is effective while not exposing the system
to the risk of unknown filesystem corruption.

Yes, if there are working-set pages on that volume and it is corrupt, the
system is no longer stable (this is especially true if the system is *running*
from that volume.)  It is also true that with a solid-state device of some
kind the impact of a write error may cross a filesystem boundary, so it is
insufficient to invalidate only the filesystem (on an SSD or similar device
the read/erase/write cycle for a data rewrite may involve many megabytes of
data, which may not be entirely local to the mounted filesystem if there is
more than one on the physical volume.)

HOWEVER, forcibly detaching the volume in question instead of calling panic()
*should* be effective in preventing a corrupted filesystem from propagating.
While this will still lead to a panic if executing RSS (or in-use page file
space) resides on that volume, in the case where the device holds only data
the detach will *not* panic the machine.
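
The outcome I expect from a forced detach can be modeled in a few lines of
userland C.  This is only a sketch of the reasoning above, not kernel code:
struct volume_use, enum detach_outcome and forced_detach_outcome() are
hypothetical names made up for illustration and do not exist in the FreeBSD
source tree.

```c
#include <stdbool.h>

/* Hypothetical summary of what currently lives on the failed volume. */
struct volume_use {
        bool holds_swap;          /* in-use page file space is on it */
        bool holds_running_text;  /* executing (resident) programs come from it */
};

/* What the report expects to happen after a forced detach. */
enum detach_outcome {
        OUTCOME_SYSTEM_SURVIVES,  /* only data was lost; machine keeps running */
        OUTCOME_PANIC             /* paging or executing pages are now gone */
};

/*
 * Model of the proposal: always detach the volume on an unrecoverable
 * write error.  A panic then follows only when the kernel still needs
 * pages that were backed by that volume.
 */
enum detach_outcome
forced_detach_outcome(const struct volume_use *use)
{
        if (use->holds_swap || use->holds_running_text)
                return (OUTCOME_PANIC);
        return (OUTCOME_SYSTEM_SURVIVES);
}
```

For the SD card in this report both fields would be false, so the model
predicts the machine stays up and only the mount is lost.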

This appears to be a situation where a less-severe "remedy" for a failed I/O is
certainly called for.

The following backtrace was captured from the panic itself:

root@Dbms2:/var/crash # kgdb /boot/kernel/kernel vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: initiate_write_inodeblock_ufs2: already started
cpuid = 14
KDB: stack backtrace:
#0 0xffffffff80b1f357 at kdb_backtrace+0x67
#1 0xffffffff80ad6ec2 at vpanic+0x182
#2 0xffffffff80ad6d33 at panic+0x43
#3 0xffffffff80dc16ad at softdep_disk_io_initiation+0x159d
#4 0xffffffff80de61eb at ffs_geom_strategy+0x13b
#5 0xffffffff80b872f7 at bufwrite+0x267
#6 0xffffffff80b8ac6a at vfs_bio_awrite+0x3ca
#7 0xffffffff80b96b77 at vop_stdfsync+0x277
#8 0xffffffff80983766 at devfs_fsync+0x26
#9 0xffffffff81101f7d at VOP_FSYNC_APV+0x8d
#10 0xffffffff80baf1ae at sched_sync+0x3be
#11 0xffffffff80a8dcb5 at fork_exit+0x85
#12 0xffffffff80f7f85e at fork_trampoline+0xe
Uptime: 27m9s


(kgdb) where
#0  doadump (textdump=<value optimized out>) at pcpu.h:221
#1  0xffffffff80ad6949 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:366
#2  0xffffffff80ad6efb in vpanic (fmt=<value optimized out>,
    ap=<value optimized out>) at /usr/src/sys/kern/kern_shutdown.c:759
#3  0xffffffff80ad6d33 in panic (fmt=0x0)
    at /usr/src/sys/kern/kern_shutdown.c:690
#4  0xffffffff80dc16ad in softdep_disk_io_initiation (bp=<value optimized out>)
    at /usr/src/sys/ufs/ffs/ffs_softdep.c:10301
#5  0xffffffff80de61eb in ffs_geom_strategy (bo=<value optimized out>,
    bp=<value optimized out>) at buf.h:412
#6  0xffffffff80b872f7 in bufwrite (bp=0xfffffe02e8629b30) at buf.h:405
#7  0xffffffff80b8ac6a in vfs_bio_awrite (bp=<value optimized out>)
    at buf.h:393
#8  0xffffffff80b96b77 in vop_stdfsync (ap=0xfffffe034f481b68)
    at /usr/src/sys/kern/vfs_default.c:692
#9  0xffffffff80983766 in devfs_fsync (ap=0xfffffe034f481b68)
    at /usr/src/sys/fs/devfs/devfs_vnops.c:702
#10 0xffffffff81101f7d in VOP_FSYNC_APV (vop=<value optimized out>,
    a=<value optimized out>) at vnode_if.c:1331
#11 0xffffffff80baf1ae in sched_sync () at vnode_if.h:549
#12 0xffffffff80a8dcb5 in fork_exit (callout=0xffffffff80baedf0 <sched_sync>,
    arg=0x0, frame=0xfffffe034f481c00) at /usr/src/sys/kern/kern_fork.c:1038
#13 0xffffffff80f7f85e in fork_trampoline ()
    at /usr/src/sys/amd64/amd64/exception.S:611
#14 0x0000000000000000 in ?? ()
(kgdb)

FreeBSD 11.0-BETA1 #0 r302439: Fri Jul  8 14:37:27 CDT 2016    
k...@dbms2.denninger.net:/usr/obj/usr/src/sys/GENERIC

The offending code line:

static void
initiate_write_inodeblock_ufs2(inodedep, bp)
        struct inodedep *inodedep;
        struct buf *bp;                 /* The inode block */
{
        struct allocdirect *adp, *lastadp;
        struct ufs2_dinode *dp;
        struct ufs2_dinode *sip;
        struct inoref *inoref;
        struct ufsmount *ump;
        struct fs *fs;
        ufs_lbn_t i;
#ifdef INVARIANTS
        ufs_lbn_t prevlbn = 0;
#endif
        int deplist;

        if (inodedep->id_state & IOSTARTED)
                panic("initiate_write_inodeblock_ufs2: already started");
        inodedep->id_state |= IOSTARTED;



-- End capture
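
The guard quoted above fires whenever a write is initiated on an inodedep
whose IOSTARTED bit is still set.  The following userland sketch models just
that check.  It assumes the bit is cleared only by a completion step; whether
the failed write in this report skipped that step is my assumption and is not
something the capture shows.  All names prefixed with model_ or MODEL_ are
hypothetical, and the flag value is illustrative only.

```c
#include <stdbool.h>

#define MODEL_IOSTARTED 0x1u    /* illustrative value, not the kernel's */

/* Hypothetical stand-in for the id_state field of struct inodedep. */
struct model_inodedep {
        unsigned int id_state;
};

/*
 * Mirrors the quoted check.  Returns false at the point where the
 * kernel function would call panic("... already started").
 */
bool
model_initiate_write(struct model_inodedep *dep)
{
        if (dep->id_state & MODEL_IOSTARTED)
                return (false);
        dep->id_state |= MODEL_IOSTARTED;
        return (true);
}

/* Assumed completion step: clears the bit so a later write may start. */
void
model_write_complete(struct model_inodedep *dep)
{
        dep->id_state &= ~MODEL_IOSTARTED;
}
```

In this model an initiate/complete/initiate sequence succeeds, while two
initiations with no completion in between hit the guard on the second call.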

-- 
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"
