10.3 update 5 + Inetd problem

2016-07-06 Thread Andrea Brancatelli
Hello everybody, 

we're having a curious issue with a machine we recently upgraded to
10.3-RELEASE-p5. 

The machine is a pretty "stupid" MX with just sendmail running, delivering
mail locally to an LMTP instance (dbmail lmtp). For various reasons we don't
run a daemonized LMTPd; instead we use inetd to spawn it on demand.

After we upgraded the machine from 10.2 to 10.3, the lmtpd process all of a
sudden started randomly getting stuck forever at 100% CPU usage in the
"uwait" state.

On a twin machine that has not yet been upgraded (10.1-RELEASE-p3), with the
same ports versions and an otherwise identical setup, the problem never
happens.

Given that I'm not very familiar with GDB, DTrace and the like, what can I do
to diagnose the problem further?

Thank you very much.

-- 

Andrea Brancatelli
Schema31 S.p.a.
IT Manager

ROMA - BO - FI - PA 
ITALY
Tel: +39.06.98.358.472
Cell: +39.331.2488468
Fax: +39.055.71.880.466
A company of the SC31 ITALIA group

Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)

2016-07-06 Thread David Cross
OK, to reply to my own message: using KTR and debugging printfs I have found
the culprit, but I am still at a loss as to why, or what the appropriate fix
is.

Let's go back to the panic (simplified):

#0 0x8043f160 at kdb_backtrace+0x60
#1 0x80401454 at vpanic+0x124
#2 0x804014e3 at panic+0x43
#3 0x8060719a at softdep_deallocate_dependencies+0x6a
#4 0x80499cc1 at brelse+0x151
#5 0x804979b1 at bufwrite+0x81
#6 0x80623c80 at ffs_write+0x4b0
#7 0x806ce9a4 at VOP_WRITE_APV+0x1c4
#8 0x806639e3 at vnode_pager_generic_putpages+0x293
#9 0x806d2102 at VOP_PUTPAGES_APV+0x142
#10 0x80661cc1 at vnode_pager_putpages+0x91
#11 0x806588e6 at vm_pageout_flush+0x116
#12 0x806517e2 at vm_object_page_collect_flush+0x1c2
#13 0x80651519 at vm_object_page_clean+0x179
#14 0x80651102 at vm_object_terminate+0xa2
#15 0x806621a5 at vnode_destroy_vobject+0x85
#16 0x8062a52f at ufs_reclaim+0x1f
#17 0x806d0782 at VOP_RECLAIM_APV+0x142

Via KTR logging I determined that the dangling dependency was on a freshly
allocated buf, *after* vinvalbuf in vgonel() (so in VOP_RECLAIM itself),
called by the vnode LRU cleanup process.  I further noticed that it was in a
newbuf that recycled a bp (unimportant, except that it let me narrow my
logging down to something manageable).  From there I get this stack trace
(simplified):

#0 0x8043f160 at kdb_backtrace+0x60
#1 0x8049c98e at getnewbuf+0x4be
#2 0x804996a0 at getblk+0x830
#3 0x805fb207 at ffs_balloc_ufs2+0x1327
#4 0x80623b0b at ffs_write+0x33b
#5 0x806ce9a4 at VOP_WRITE_APV+0x1c4
#6 0x806639e3 at vnode_pager_generic_putpages+0x293
#7 0x806d2102 at VOP_PUTPAGES_APV+0x142
#8 0x80661cc1 at vnode_pager_putpages+0x91
#9 0x806588e6 at vm_pageout_flush+0x116
#10 0x806517e2 at vm_object_page_collect_flush+0x1c2
#11 0x80651519 at vm_object_page_clean+0x179
#12 0x80651102 at vm_object_terminate+0xa2
#13 0x806621a5 at vnode_destroy_vobject+0x85
#14 0x8062a52f at ufs_reclaim+0x1f
#15 0x806d0782 at VOP_RECLAIM_APV+0x142
#16 0x804b6c6e at vgonel+0x2ee
#17 0x804ba6f5 at vnlru_proc+0x4b5

addr2line on the ffs_balloc_ufs2 gives:
/usr/src/sys/ufs/ffs/ffs_balloc.c:778:

bp = getblk(vp, lbn, nsize, 0, 0, gbflags);
bp->b_blkno = fsbtodb(fs, newb);
if (flags & BA_CLRBUF)
        vfs_bio_clrbuf(bp);
if (DOINGSOFTDEP(vp))
        softdep_setup_allocdirect(ip, lbn, newb, 0,
            nsize, 0, bp);


Boom: a freshly allocated buffer with a dependency.  Nothing in VOP_RECLAIM
handles this; it runs after vinvalbuf has been called, and it expects that
everything has already been flushed to disk and that it is just releasing
structures (that is my reading of the code).
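
To make the KTR step concrete, here is a minimal sketch of the kind of
instrumentation involved (not the actual instrumentation used here; the trace
class, message text and placement are only illustrative), dropped near the
softdep_setup_allocdirect() call quoted above -- see ktr(4)/ktr(9) for the
real details:

/*
 * Needs <sys/ktr.h> and a kernel built with options along the lines of:
 *   options KTR
 *   options KTR_ENTRIES=262144
 *   options KTR_COMPILE=(KTR_BUF)
 *   options KTR_MASK=(KTR_BUF)
 * The recorded entries can then be read with ktrdump(8), or with
 * "show ktr" from DDB after the panic.
 */
CTR2(KTR_BUF, "allocdirect dep added: bp %p lbn %ld", bp, (long)lbn);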

Now, perhaps that is a reasonable assumption?  The question then is how this
buffer is hanging out there, surviving a vinvalbuf.  I will note that the
test case that triggers this runs and terminates *minutes* beforehand... it's
not just hanging out there in a race; it's surviving background sync, fsync,
etc.  Also, I *can* unmount the FS without an error, so that code path is
either ignoring this buffer or forcing a sync in a way that doesn't panic?

Does anyone have suggestions for next steps?  I am making progress here, but
it's really slow going; this is probably the most complex part of the kernel,
and some pointers would be helpful.

On Sat, Jul 2, 2016 at 2:31 PM, David Cross  wrote:

> Ok, I have been trying to track this down for a while.  I know quite a bit
> about it, but there's a lot I don't know, or I would have a patch.  I have
> been trying to solve this on my own, but bringing in some outside
> assistance will let me move on with my life.
>
> First up: the stack trace (from a debugging kernel, with a coredump):
>
> #0  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:298
> #1  0x8071018a in kern_reboot (howto=260)
> at /usr/src/sys/kern/kern_shutdown.c:486
> #2  0x80710afc in vpanic (
> fmt=0x80c7a325 "softdep_deallocate_dependencies: dangling deps
> b_ioflags: %d, b_bufsize: %ld, b_flags: %d, bo_flag: %d",
> ap=0xfe023ae5cf40)
> at /usr/src/sys/kern/kern_shutdown.c:889
> #3  0x807108c0 in panic (
> fmt=0x80c7a325 "softdep_deallocate_dependencies: dangling deps
> b_ioflags: %d, b_bufsize: %ld, b_flags: %d, bo_flag: %d")
> at /usr/src/sys/kern/kern_shutdown.c:818
> #4  0x80a7c841 in softdep_deallocate_dependencies (
> bp=0xfe01f030e148) at /usr/src/sys/ufs/ffs/ffs_softdep.c:14099
> #5  0x807f793f in buf_deallocate (bp=0xfe01f030e148) at
> buf.h:428
> #6  0x807f59c9 in brelse (bp=0xfe01f030e148)
> at /usr/src/sys/kern/vfs_bio.c:1599
> #7  0x

Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)

2016-07-06 Thread Konstantin Belousov
On Wed, Jul 06, 2016 at 10:51:28AM -0400, David Cross wrote:
> OK, to reply to my own message: using KTR and debugging printfs I have
> found the culprit, but I am still at a loss as to why, or what the
> appropriate fix is.
> 
> Let's go back to the panic (simplified):
> 
> #0 0x8043f160 at kdb_backtrace+0x60
> #1 0x80401454 at vpanic+0x124
> #2 0x804014e3 at panic+0x43
> #3 0x8060719a at softdep_deallocate_dependencies+0x6a
> #4 0x80499cc1 at brelse+0x151
> #5 0x804979b1 at bufwrite+0x81
> #6 0x80623c80 at ffs_write+0x4b0
> #7 0x806ce9a4 at VOP_WRITE_APV+0x1c4
> #8 0x806639e3 at vnode_pager_generic_putpages+0x293
> #9 0x806d2102 at VOP_PUTPAGES_APV+0x142
> #10 0x80661cc1 at vnode_pager_putpages+0x91
> #11 0x806588e6 at vm_pageout_flush+0x116
> #12 0x806517e2 at vm_object_page_collect_flush+0x1c2
> #13 0x80651519 at vm_object_page_clean+0x179
> #14 0x80651102 at vm_object_terminate+0xa2
> #15 0x806621a5 at vnode_destroy_vobject+0x85
> #16 0x8062a52f at ufs_reclaim+0x1f
> #17 0x806d0782 at VOP_RECLAIM_APV+0x142
> 
> Via KTR logging I determined that the dangling dependency was on a freshly
> allocated buf, *after* vinvalbuf in vgonel() (so in VOP_RECLAIM itself),
> called by the vnode LRU cleanup process.  I further noticed that it was in
> a newbuf that recycled a bp (unimportant, except that it let me narrow my
> logging down to something manageable).  From there I get this stack trace
> (simplified):
> 
> #0 0x8043f160 at kdb_backtrace+0x60
> #1 0x8049c98e at getnewbuf+0x4be
> #2 0x804996a0 at getblk+0x830
> #3 0x805fb207 at ffs_balloc_ufs2+0x1327
> #4 0x80623b0b at ffs_write+0x33b
> #5 0x806ce9a4 at VOP_WRITE_APV+0x1c4
> #6 0x806639e3 at vnode_pager_generic_putpages+0x293
> #7 0x806d2102 at VOP_PUTPAGES_APV+0x142
> #8 0x80661cc1 at vnode_pager_putpages+0x91
> #9 0x806588e6 at vm_pageout_flush+0x116
> #10 0x806517e2 at vm_object_page_collect_flush+0x1c2
> #11 0x80651519 at vm_object_page_clean+0x179
> #12 0x80651102 at vm_object_terminate+0xa2
> #13 0x806621a5 at vnode_destroy_vobject+0x85
> #14 0x8062a52f at ufs_reclaim+0x1f
> #15 0x806d0782 at VOP_RECLAIM_APV+0x142
> #16 0x804b6c6e at vgonel+0x2ee
> #17 0x804ba6f5 at vnlru_proc+0x4b5
> 
> addr2line on the ffs_balloc_ufs2 gives:
> /usr/src/sys/ufs/ffs/ffs_balloc.c:778:
> 
> bp = getblk(vp, lbn, nsize, 0, 0, gbflags);
> bp->b_blkno = fsbtodb(fs, newb);
> if (flags & BA_CLRBUF)
> vfs_bio_clrbuf(bp);
> if (DOINGSOFTDEP(vp))
> softdep_setup_allocdirect(ip, lbn, newb, 0,
> nsize, 0, bp);
> 
> 
> Boom: a freshly allocated buffer with a dependency.  Nothing in VOP_RECLAIM
> handles this; it runs after vinvalbuf has been called, and it expects that
> everything has already been flushed to disk and that it is just releasing
> structures (that is my reading of the code).
> 
> Now, perhaps that is a reasonable assumption?  The question then is how
> this buffer is hanging out there, surviving a vinvalbuf.  I will note that
> the test case that triggers this runs and terminates *minutes*
> beforehand... it's not just hanging out there in a race; it's surviving
> background sync, fsync, etc.  Also, I *can* unmount the FS without an
> error, so that code path is either ignoring this buffer or forcing a sync
> in a way that doesn't panic?
The most typical cause of buffer dependencies not being flushed is a buffer
write error.  At the least, you could provide a printout of the buffer to
confirm or reject this assumption.

Were there any kernel messages right before the panic?  Just in case, did you
fsck the volume before using it, after the previous panic?

> 
> Does anyone have suggestions for next steps?  I am making progress here,
> but it's really slow going; this is probably the most complex part of the
> kernel, and some pointers would be helpful.
> 
> On Sat, Jul 2, 2016 at 2:31 PM, David Cross  wrote:
> 
> > Ok, I have been trying to track this down for a while.  I know quite a
> > bit about it, but there's a lot I don't know, or I would have a patch.  I
> > have been trying to solve this on my own, but bringing in some outside
> > assistance will let me move on with my life.
> >
> > First up: the stack trace (from a debugging kernel, with a coredump):
> >
> > #0  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:298
> > #1  0x8071018a in kern_reboot (howto=260)
> > at /usr/src/sys/kern/kern_shutdown.c:486
> > #2  0x80710afc in vpanic (
> > fmt=0x80c7a325 "softdep_deallocate_dependencies: dangling deps
> > b_ioflags: %d, b_bufsize: %ld, b_flags: %d, bo_flag: %d",
> > ap=0xfe023ae5cf40)
> > at /usr/src/sys/kern/k

Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)

2016-07-06 Thread David Cross
No kernel messages beforehand (if there were, I would have written this off a
long time ago).  And as of right now, this is probably the most fsck-ed
filesystem on the planet!  I have an 'image' that I am working from; it is
ggate-mounted so I can access it in a bhyve VM to ease debugging, and so I am
not crashing my real machine (with the real filesystem) all the time.

One of my initial guesses was that this was a CG (cylinder group) allocation
error, but dumpfs seems to show plenty of blocks in the CG to meet this need.

A quick note on the test case: I haven't totally isolated it yet, but the
minimal reproduction is 'ctl_cyrusdb -r', which runs a bdb5 recovery
operation.  A ktrace of that shows that it unlinks 3 files, opens them,
lseeks them, writes a block, and then mmaps them (but leaves them open).  At
process termination it munmaps and then closes them.  I have tried to write a
shorter reproduction that opens, seeks, mmaps (with the same flags), writes
to the mmapped memory, munmaps, closes and exits, but this has been
insufficient to reproduce the issue.  There is likely some specific pattern
in the bdb5 code tickling this; behind the mmapped interface it is all
opaque, and the bdb5 code is pretty complex itself.
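
For reference, a minimal sketch of the kind of standalone reproduction
attempt described above (not the actual test program; the file name, size and
fill pattern are made up, and as noted this syscall sequence by itself was
reported as insufficient to trigger the panic):

#include <sys/types.h>
#include <sys/mman.h>

#include <err.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
        const char *path = "repro.db";  /* hypothetical file name */
        const size_t len = 16384;       /* one 16K block, arbitrary */
        char *p;
        int fd;

        (void)unlink(path);             /* mimic the unlink seen in the ktrace */
        if ((fd = open(path, O_RDWR | O_CREAT, 0644)) == -1)
                err(1, "open");
        if (lseek(fd, (off_t)len - 1, SEEK_SET) == -1)
                err(1, "lseek");
        if (write(fd, "", 1) != 1)      /* extend the file by one byte */
                err(1, "write");
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
                err(1, "mmap");
        memset(p, 0xa5, len);           /* dirty the mapped pages */
        if (munmap(p, len) == -1)
                err(1, "munmap");
        close(fd);
        return (0);
}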


Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)

2016-07-06 Thread David Cross
Oh, whoops; how do I print out the buffer?


Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)

2016-07-06 Thread Konstantin Belousov
On Wed, Jul 06, 2016 at 12:02:00PM -0400, David Cross wrote:
> Oh, whoops; how do I print out the buffer?

In kgdb, p/x *(struct buf *)address


Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)

2016-07-06 Thread David Cross
(kgdb) up 5
#5  0x804aafa1 in brelse (bp=0xfe00f77457d0) at buf.h:428
428 (*bioops.io_deallocate)(bp);
Current language:  auto; currently minimal
(kgdb) p/x *(struct buf *)0xfe00f77457d0
$1 = {b_bufobj = 0xf80002e88480, b_bcount = 0x4000, b_caller1 = 0x0,
  b_data = 0xfe00f857b000, b_error = 0x0, b_iocmd = 0x0, b_ioflags = 0x0,
  b_iooffset = 0x0, b_resid = 0x0, b_iodone = 0x0, b_blkno = 0x115d6400,
  b_offset = 0x0, b_bobufs = {tqe_next = 0x0, tqe_prev = 0xf80002e884d0},
  b_vflags = 0x0, b_freelist = {tqe_next = 0xfe00f7745a28,
    tqe_prev = 0x80c2afc0}, b_qindex = 0x0, b_flags = 0x20402800,
  b_xflags = 0x2, b_lock = {lock_object = {lo_name = 0x8075030b,
      lo_flags = 0x673, lo_data = 0x0, lo_witness = 0xfe602f00},
    lk_lock = 0xf800022e8000, lk_exslpfail = 0x0, lk_timo = 0x0,
    lk_pri = 0x60}, b_bufsize = 0x4000, b_runningbufspace = 0x0,
  b_kvabase = 0xfe00f857b000, b_kvaalloc = 0x0, b_kvasize = 0x4000,
  b_lblkno = 0x0, b_vp = 0xf80002e883b0, b_dirtyoff = 0x0,
  b_dirtyend = 0x0, b_rcred = 0x0, b_wcred = 0x0, b_saveaddr = 0x0,
  b_pager = {pg_reqpage = 0x0}, b_cluster = {cluster_head = {tqh_first = 0x0,
      tqh_last = 0x0}, cluster_entry = {tqe_next = 0x0, tqe_prev = 0x0}},
  b_pages = {0xf800b99b30b0, 0xf800b99b3118, 0xf800b99b3180,
    0xf800b99b31e8, 0x0 }, b_npages = 0x4, b_dep = {
    lh_first = 0xf800023d8c00}, b_fsprivate1 = 0x0, b_fsprivate2 = 0x0,
  b_fsprivate3 = 0x0, b_pin_count = 0x0}


This is the freshly allocated buf that causes the panic; is this what is
needed?  I "know" which vnode will cause the panic on vnlru cleanup, but I
don't know how to walk the memory list without a 'hook'.  That is, I can set
up the kernel in a state that I know will panic when the vnode is cleaned up,
force a panic 'early' (kill -9 1), and then I could get at that vnode... if I
could get the vnode list to walk.
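
For context on "the vnode list": the vnodes of a mounted filesystem hang off
the mount point's mnt_nvnodelist, which in-kernel code walks with the
MNT_VNODE_FOREACH_ALL() iterator; in kgdb the same TAILQ can in principle be
followed by hand via mnt_nvnodelist.tqh_first and each vnode's
v_nmntvnodes.tqe_next.  A kernel-context sketch only (not a standalone
program; field and macro names are as I recall them from the 10.x sources and
worth double-checking against sys/mount.h and sys/vnode.h):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/mount.h>
#include <sys/vnode.h>

/*
 * Print every vnode hanging off a mount point.  MNT_VNODE_FOREACH_ALL()
 * hands back each vnode with its interlock held, so drop it here.
 */
static void
list_mount_vnodes(struct mount *mp)
{
        struct vnode *vp, *mvp;

        MNT_VNODE_FOREACH_ALL(vp, mp, mvp) {
                printf("vp %p type %d usecount %d\n",
                    vp, vp->v_type, vp->v_usecount);
                VI_UNLOCK(vp);
        }
}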




Re: Reproducible panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST)

2016-07-06 Thread Konstantin Belousov
On Wed, Jul 06, 2016 at 02:21:20PM -0400, David Cross wrote:
> (kgdb) up 5
> #5  0x804aafa1 in brelse (bp=0xfe00f77457d0) at buf.h:428
> 428 (*bioops.io_deallocate)(bp);
> Current language:  auto; currently minimal
> (kgdb) p/x *(struct buf *)0xfe00f77457d0
> $1 = {b_bufobj = 0xf80002e88480, b_bcount = 0x4000, b_caller1 = 0x0,
>   b_data = 0xfe00f857b000, b_error = 0x0, b_iocmd = 0x0, b_ioflags =
> 0x0,
>   b_iooffset = 0x0, b_resid = 0x0, b_iodone = 0x0, b_blkno = 0x115d6400,
>   b_offset = 0x0, b_bobufs = {tqe_next = 0x0, tqe_prev =
> 0xf80002e884d0},
>   b_vflags = 0x0, b_freelist = {tqe_next = 0xfe00f7745a28,
> tqe_prev = 0x80c2afc0}, b_qindex = 0x0, b_flags = 0x20402800,
>   b_xflags = 0x2, b_lock = {lock_object = {lo_name = 0x8075030b,
>   lo_flags = 0x673, lo_data = 0x0, lo_witness =
> 0xfe602f00},
> lk_lock = 0xf800022e8000, lk_exslpfail = 0x0, lk_timo = 0x0,
> lk_pri = 0x60}, b_bufsize = 0x4000, b_runningbufspace = 0x0,
>   b_kvabase = 0xfe00f857b000, b_kvaalloc = 0x0, b_kvasize = 0x4000,
>   b_lblkno = 0x0, b_vp = 0xf80002e883b0, b_dirtyoff = 0x0,
>   b_dirtyend = 0x0, b_rcred = 0x0, b_wcred = 0x0, b_saveaddr = 0x0, b_pager
> = {
> pg_reqpage = 0x0}, b_cluster = {cluster_head = {tqh_first = 0x0,
>   tqh_last = 0x0}, cluster_entry = {tqe_next = 0x0, tqe_prev = 0x0}},
>   b_pages = {0xf800b99b30b0, 0xf800b99b3118, 0xf800b99b3180,
> 0xf800b99b31e8, 0x0 }, b_npages = 0x4, b_dep = {
> lh_first = 0xf800023d8c00}, b_fsprivate1 = 0x0, b_fsprivate2 = 0x0,
>   b_fsprivate3 = 0x0, b_pin_count = 0x0}
> 
> 
> This is the freshly allocated buf that causes the panic; is this what is
> needed?  I "know" which vnode will cause the panic on vnlru cleanup, but I
> don't know how to walk the memory list without a 'hook'.  That is, I can
> set up the kernel in a state that I know will panic when the vnode is
> cleaned up, force a panic 'early' (kill -9 1), and then I could get at
> that vnode... if I could get the vnode list to walk.

Was the state printed after the panic occurred?  What is strange is that the
buffer was not even tried for I/O, as far as I can see.  Apart from the empty
b_error/b_iocmd, the b_lblkno is zero, which means that the buffer was never
allocated on the disk.

The b_blkno looks strangely high.  Can you print *(bp->b_vp)?  If it is a
UFS vnode, do p *(struct inode)(->v_data).  I am especially interested
in the vnode size.

Can you reproduce the problem on HEAD?