Adrian Penisoara wrote:
> Hi,
>
> As we are facing a heavy fragments attack (40-60 byte packets in a
> ~1000 pkts/sec flow) I see some sporadic panics. Kernel/world is
> 4.2-STABLE as of 18 Jan 2001 -- it's a production machine and I
> haven't yet had the chance for another update; if it's been fixed in
> the meantime I would be glad to hear it...
>
> I have attached a gdb trace and a snip of a tcpdump log. When I
> rebuilt the kernel with debug options it seemed to crash less often.
> I remember that at the time of this panic I had an ipfw rule to deny
> IP fragments.
This is one of those "odd" faults I've sometimes seen in -STABLE.
Thanks to the good debugging information you've provided, here is what
stands out:
#16 0xc014de98 in m_copym (m=0xc07e7c00, off0=0, len=40, wait=1)
at ../../kern/uipc_mbuf.c:621
621 n->m_pkthdr.len -= off0;
(kgdb) list
616 if (n == 0)
617 goto nospace;
618 if (copyhdr) {
619 M_COPY_PKTHDR(n, m);
620 if (len == M_COPYALL)
621 n->m_pkthdr.len -= off0; <-- fault happens here (XXX)
622 else
623 n->m_pkthdr.len = len;
624 copyhdr = 0;
625 }
(kgdb) print n
$1 = (struct mbuf *) 0x661c20
(kgdb) print *n
cannot read proc at 0
(kgdb) print m
$2 = (struct mbuf *) 0xc07e7c00
At the faulting line (XXX), the likely problem is that the mbuf
pointer n is bad, and the value the debugger prints for it does appear
to be bad. However, there are two things to note:
1. the fault virtual address displayed in the trap message:
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x89c0c800
[...]
is different from the value of n printed in your trace (0x89c0c800
looks bogus as well, although it does fall on a correct boundary).
2. Nothing bad happens in M_COPY_PKTHDR() which dereferences an
equivalent pointer.
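The three addresses discussed above can be sanity-checked with a small
helper like the one below. This is only a sketch under stated
assumptions: the kernel base (0xc0000000) and the mbuf size (256 bytes)
are the usual defaults for FreeBSD 4.x on i386 but are not taken from
the trace, and the function and macro names are made up for
illustration.

```c
#include <stdint.h>

/* ASSUMED defaults for FreeBSD 4.x on i386; verify against your kernel. */
#define KERNBASE_ASSUMED 0xc0000000u  /* start of kernel virtual addresses */
#define MSIZE_ASSUMED    256u         /* size of one mbuf, in bytes */

/* Nonzero if addr lies in the kernel part of the address space. */
static int in_kernel_space(uint32_t addr)
{
    return addr >= KERNBASE_ASSUMED;
}

/* Nonzero if addr falls on an mbuf-sized boundary. */
static int mbuf_aligned(uint32_t addr)
{
    return (addr % MSIZE_ASSUMED) == 0;
}

/* Hypothetical helper: a pointer must pass both checks to look sane. */
static int plausible_mbuf_ptr(uint32_t addr)
{
    return in_kernel_space(addr) && mbuf_aligned(addr);
}
```

Under those assumptions, m (0xc07e7c00) passes both checks, n (0x661c20)
fails both, and the fault address (0x89c0c800) is aligned but lies below
the kernel base.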
Something seriously evil is happening here and, unfortunately, I have
no idea what.
Does this only happen on this one machine? Or is it reproducible on
several different machines? I used to stress test -STABLE for mbuf
starvation and never stumbled upon one of these `spontaneous pointer
deaths' myself. I have seen other weird problems reported by other
people, but only in RELENG_3.
If you cannot reproduce it on any other machines, I would start
looking at possibly bad hardware... unless someone else sees something
I don't.
> If you need further data just ask, I'd be glad to help,
> Ady (@warpnet.ro)
Regards,
Bosko.