10.3 update 5 + Inetd problem
Hello everybody, we're having a curious issue with a machine we recently upgraded to 10.3-RELEASE-p5. The machine is a fairly "dumb" MX running just sendmail, delivering mail locally to an LMTP instance (dbmail-lmtp). For various reasons we don't run a daemonized LMTPd; instead we use inetd to spawn it on demand.

After we upgraded the machine from 10.2 to 10.3, the lmtpd process started randomly getting stuck forever at 100% CPU usage in the "uwait" state. On a twin machine with the same setup that has not yet been upgraded (10.1-RELEASE-p3), with the same ports versions and an identical configuration, the problem never happens.

Given that I'm not very familiar with GDB, DTrace and similar tools, what can I do to diagnose the problem further?

Thank you very much.

--
Andrea Brancatelli
Schema31 S.p.A.
Responsabile IT
ROMA - BO - FI - PA
ITALY
Tel: +39.06.98.358.472
Cell: +39.331.2488468
Fax: +39.055.71.880.466
Società del Gruppo SC31 ITALIA

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
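A few stock FreeBSD tools can narrow down a stuck process like this without needing GDB or DTrace. The following is a generic sketch (the PID shown is a placeholder, not from the report above):

```
# Find the stuck lmtpd and see its process state and wait channel
ps -axo pid,state,wchan,command | grep lmtpd

# Dump the kernel-side stack of every thread in the process
procstat -kk 1234

# List all threads and their states; "uwait" normally means the thread
# is blocked in _umtx_op(2) on a userland mutex/condvar, so 100% CPU
# usually points at a second thread spinning against it
procstat -t 1234

# Attach and watch which syscalls (if any) the process is making
truss -p 1234
```

Comparing the `procstat -kk` output between the 10.3 machine and the still-working 10.1 twin would show whether the wait channel differs between releases.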
Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)
Ok.. to reply to my own message: using KTR and debugging printfs I have found the culprit.. but I am still at a loss as to 'why', or what the appropriate fix is.

Let's go back to the panic (simplified):

#0 0x8043f160 at kdb_backtrace+0x60
#1 0x80401454 at vpanic+0x124
#2 0x804014e3 at panic+0x43
#3 0x8060719a at softdep_deallocate_dependencies+0x6a
#4 0x80499cc1 at brelse+0x151
#5 0x804979b1 at bufwrite+0x81
#6 0x80623c80 at ffs_write+0x4b0
#7 0x806ce9a4 at VOP_WRITE_APV+0x1c4
#8 0x806639e3 at vnode_pager_generic_putpages+0x293
#9 0x806d2102 at VOP_PUTPAGES_APV+0x142
#10 0x80661cc1 at vnode_pager_putpages+0x91
#11 0x806588e6 at vm_pageout_flush+0x116
#12 0x806517e2 at vm_object_page_collect_flush+0x1c2
#13 0x80651519 at vm_object_page_clean+0x179
#14 0x80651102 at vm_object_terminate+0xa2
#15 0x806621a5 at vnode_destroy_vobject+0x85
#16 0x8062a52f at ufs_reclaim+0x1f
#17 0x806d0782 at VOP_RECLAIM_APV+0x142

Via KTR logging I determined that the dangling dependency was on a freshly allocated buf, *after* vinvalbuf in vgonel() (so in VOP_RECLAIM itself), called by the vnode LRU cleanup process. I further noticed that it was a new buf that recycled a bp (unimportant, except that it let me narrow down my logging to something manageable). From there I get this stack trace (simplified):

#0 0x8043f160 at kdb_backtrace+0x60
#1 0x8049c98e at getnewbuf+0x4be
#2 0x804996a0 at getblk+0x830
#3 0x805fb207 at ffs_balloc_ufs2+0x1327
#4 0x80623b0b at ffs_write+0x33b
#5 0x806ce9a4 at VOP_WRITE_APV+0x1c4
#6 0x806639e3 at vnode_pager_generic_putpages+0x293
#7 0x806d2102 at VOP_PUTPAGES_APV+0x142
#8 0x80661cc1 at vnode_pager_putpages+0x91
#9 0x806588e6 at vm_pageout_flush+0x116
#10 0x806517e2 at vm_object_page_collect_flush+0x1c2
#11 0x80651519 at vm_object_page_clean+0x179
#12 0x80651102 at vm_object_terminate+0xa2
#13 0x806621a5 at vnode_destroy_vobject+0x85
#14 0x8062a52f at ufs_reclaim+0x1f
#15 0x806d0782 at VOP_RECLAIM_APV+0x142
#16 0x804b6c6e at vgonel+0x2ee
#17 0x804ba6f5 at vnlru_proc+0x4b5

addr2line on the ffs_balloc_ufs2 frame gives /usr/src/sys/ufs/ffs/ffs_balloc.c:778:

        bp = getblk(vp, lbn, nsize, 0, 0, gbflags);
        bp->b_blkno = fsbtodb(fs, newb);
        if (flags & BA_CLRBUF)
                vfs_bio_clrbuf(bp);
        if (DOINGSOFTDEP(vp))
                softdep_setup_allocdirect(ip, lbn, newb, 0,
                    nsize, 0, bp);

Boom: a freshly allocated buffer with a dependency, and nothing in VOP_RECLAIM handles this. This is after vinvalbuf has been called; VOP_RECLAIM expects that everything is flushed to disk and is just about releasing structures (that is my read of the code).

Now, perhaps this is a good assumption? The question then is how this buffer is hanging out there, surviving a vinvalbuf. I will note that the test case that finds this runs and terminates *minutes* before... it's not just hanging out there in a race; it's surviving background sync, fsync, etc... wtf? Also, I *can* unmount the FS without an error, so that codepath is either ignoring this buffer, or it's forcing a sync in a way that doesn't panic?

Anyone have next steps? I am making progress here, but it's really slow going; this is probably the most complex portion of the kernel, and some pointers would be helpful.

On Sat, Jul 2, 2016 at 2:31 PM, David Cross wrote:
> Ok, I have been trying to trace this down for awhile.. I know quite a bit
> about it.. but there's a lot I don't know, or I would have a patch. I have
> been trying to solve this on my own, but bringing in some outside
> assistance will let me move on with my life.
>
> First up: The stacktrace (from a debugging kernel, with coredump):
>
> #0 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:298
> #1 0x8071018a in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:486
> #2 0x80710afc in vpanic (
>     fmt=0x80c7a325 "softdep_deallocate_dependencies: dangling deps
>     b_ioflags: %d, b_bufsize: %ld, b_flags: %d, bo_flag: %d",
>     ap=0xfe023ae5cf40)
>     at /usr/src/sys/kern/kern_shutdown.c:889
> #3 0x807108c0 in panic (
>     fmt=0x80c7a325 "softdep_deallocate_dependencies: dangling deps
>     b_ioflags: %d, b_bufsize: %ld, b_flags: %d, bo_flag: %d")
>     at /usr/src/sys/kern/kern_shutdown.c:818
> #4 0x80a7c841 in softdep_deallocate_dependencies (
>     bp=0xfe01f030e148) at /usr/src/sys/ufs/ffs/ffs_softdep.c:14099
> #5 0x807f793f in buf_deallocate (bp=0xfe01f030e148) at buf.h:428
> #6 0x807f59c9 in brelse (bp=0xfe01f030e148)
>     at /usr/src/sys/kern/vfs_bio.c:1599
> #7 0x
Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)
On Wed, Jul 06, 2016 at 10:51:28AM -0400, David Cross wrote:
> Ok.. to reply to my own message, I using ktr and debugging printfs I have
> found the culprit.. but I am still at a loss to 'why', or what the
> appropriate fix is.
> [...]
> Now, perhaps this is a good assumption? the question then is how is this
> buffer hanging out there surviving a vinvalbuf. I will note that my
> test-case that finds this runs and terminates *minutes* before... its not
> just hanging out there in a race, its surviving background sync, fsync,
> etc... wtf? Also, I *can* unmount the FS without an error, so that
> codepath is either ignoring this buffer, or its forcing a sync in a way
> that doesn't panic?

The most typical cause of buffer dependencies not being flushed is a buffer write error. At the least, you could provide a printout of the buffer to confirm or reject this assumption. Were there any kernel messages right before the panic?

Just in case: did you fsck the volume before using it, after the previous panic?

> Anyone have next steps? I am making progress here, but its really slow
> going, this is probably the most complex portion of the kernel and some
> pointers would be helpful.
> [...]
Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)
No kernel messages beforehand (if there were, I would have written this off a long time ago). And as of right now, this is probably the most fsck-ed filesystem on the planet!.. I have an 'image' that I am working from that is ggate-mounted, so I can access it in a bhyve VM to ease debugging, so I am not crashing my real machine (with the real filesystem) all the time.

One of my initial guesses was that this was a CG allocation error, but a dumpfs seems to show plenty of blocks in the CG to meet this need.

Quick note on the test case: I haven't totally isolated it yet, but the minimal reproduction is a 'ctl_cyrusdb -r', which runs a bdb5 recovery op. A ktrace on that shows that it unlinks 3 files, opens them, lseeks them, writes a block, and then mmaps them (but leaves them open). At process termination it munmaps, and then closes. I have tried to write a shorter reproduction that opens, seeks, mmaps (with the same flags), writes the mmaped memory, munmaps, closes and exits, but this has been insufficient to reproduce the issue. There is likely some specific pattern in the bdb5 code tickling this; behind the mmap-ed interface it is all opaque, and the bdb5 code is pretty complex itself.

On Wed, Jul 6, 2016 at 11:18 AM, Konstantin Belousov wrote:
> [...]
Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)
Oh, whoops; how do I print out the buffer?

On Wed, Jul 6, 2016 at 11:30 AM, David Cross wrote:
> No kernel messages before (if there were I would have written this off a
> long time ago);
> [...]
Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)
On Wed, Jul 06, 2016 at 12:02:00PM -0400, David Cross wrote:
> Oh, whoops; how do I printout the buffer?

In kgdb, p/x *(struct buf *)address
Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)
(kgdb) up 5
#5 0x804aafa1 in brelse (bp=0xfe00f77457d0) at buf.h:428
428 (*bioops.io_deallocate)(bp);
Current language: auto; currently minimal
(kgdb) p/x *(struct buf *)0xfe00f77457d0
$1 = {b_bufobj = 0xf80002e88480, b_bcount = 0x4000, b_caller1 = 0x0,
  b_data = 0xfe00f857b000, b_error = 0x0, b_iocmd = 0x0, b_ioflags = 0x0,
  b_iooffset = 0x0, b_resid = 0x0, b_iodone = 0x0, b_blkno = 0x115d6400,
  b_offset = 0x0, b_bobufs = {tqe_next = 0x0, tqe_prev = 0xf80002e884d0},
  b_vflags = 0x0, b_freelist = {tqe_next = 0xfe00f7745a28,
    tqe_prev = 0x80c2afc0}, b_qindex = 0x0, b_flags = 0x20402800,
  b_xflags = 0x2, b_lock = {lock_object = {lo_name = 0x8075030b,
      lo_flags = 0x673, lo_data = 0x0, lo_witness = 0xfe602f00},
    lk_lock = 0xf800022e8000, lk_exslpfail = 0x0, lk_timo = 0x0,
    lk_pri = 0x60}, b_bufsize = 0x4000, b_runningbufspace = 0x0,
  b_kvabase = 0xfe00f857b000, b_kvaalloc = 0x0, b_kvasize = 0x4000,
  b_lblkno = 0x0, b_vp = 0xf80002e883b0, b_dirtyoff = 0x0,
  b_dirtyend = 0x0, b_rcred = 0x0, b_wcred = 0x0, b_saveaddr = 0x0,
  b_pager = {pg_reqpage = 0x0}, b_cluster = {cluster_head = {tqh_first = 0x0,
      tqh_last = 0x0}, cluster_entry = {tqe_next = 0x0, tqe_prev = 0x0}},
  b_pages = {0xf800b99b30b0, 0xf800b99b3118, 0xf800b99b3180,
    0xf800b99b31e8, 0x0 }, b_npages = 0x4,
  b_dep = {lh_first = 0xf800023d8c00}, b_fsprivate1 = 0x0,
  b_fsprivate2 = 0x0, b_fsprivate3 = 0x0, b_pin_count = 0x0}

This is the freshly allocated buf that causes the panic; is this what is needed? I "know" which vnode will cause the panic on vnlru cleanup, but I don't know how to walk the memory list without a 'hook'.. as in, I can set up the kernel in a state that I know will panic when the vnode is cleaned up, and I can force a panic 'early' (kill -9 1), and then I could get that vnode.. if I could get the vnode list to walk.

On Wed, Jul 6, 2016 at 1:37 PM, Konstantin Belousov wrote:
> On Wed, Jul 06, 2016 at 12:02:00PM -0400, David Cross wrote:
> > Oh, whoops; how do I printout the buffer?
>
> In kgdb, p/x *(struct buf *)address
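On walking the vnode list from kgdb: one approach (a sketch, untested against this exact kernel; field names taken from 10.x sys/mount.h and sys/vnode.h, and the mount address is a placeholder obtained e.g. from the global mountlist) is to start from the mountpoint's mnt_nvnodelist and follow the TAILQ links with kgdb convenience variables:

```
(kgdb) set $mp = (struct mount *)<address of the mount point>
(kgdb) set $vp = $mp->mnt_nvnodelist.tqh_first
(kgdb) while $vp != 0
 >print $vp
 >print $vp->v_type
 >set $vp = $vp->v_nvnodelist.tqe_next
 >end
```

Printing $vp->v_data for each UFS vnode would then let you pick out the inode you expect to panic during vnlru cleanup.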
Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)
On Wed, Jul 06, 2016 at 02:21:20PM -0400, David Cross wrote:
> (kgdb) p/x *(struct buf *)0xfe00f77457d0
> [...]
> This is the freshly allocated buf that causes the panic; is this what is
> needed? I "know" which vnode will cause the panic on vnlru cleanup, but I
> don't know how to walk the memory list without a 'hook'.
> [...]

Was the state printed after the panic occurred?

What is strange is that the buffer was not even tried for i/o, AFAIS. Apart from the empty b_error/b_iocmd, the b_lblkno is zero, which means that the buffer was never allocated on the disk. The b_blkno looks strangely high.

Can you print *(bp->b_vp)? If it is a UFS vnode, do p *(struct inode *)(bp->b_vp->v_data). I am esp. interested in the vnode size.

Can you reproduce the problem on HEAD?