On Oct 17, 2012, at 8:58 AM, Guy Helmer <guy.hel...@gmail.com> wrote:

> On Oct 12, 2012, at 8:54 AM, Guy Helmer <guy.hel...@gmail.com> wrote:
> 
>> 
>> On Oct 10, 2012, at 1:37 PM, Alexander V. Chernikov <melif...@freebsd.org> 
>> wrote:
>> 
>>> On 10.10.2012 00:36, Guy Helmer wrote:
>>>> 
>>>> On Oct 8, 2012, at 8:09 AM, Guy Helmer <guy.hel...@gmail.com> wrote:
>>>> 
>>>>> I'm seeing a consistent new kernel panic in FreeBSD 8.3:
>>>>> I'm not seeing how bd_sbuf would be NULL here. Any ideas?
>>>> 
>>>> Since I've not had any replies, I hope nobody minds if I reply with more 
>>>> information.
>>>> 
>>>> This panic seems to be occasionally triggered now that my userland code 
>>>> is changing the packet filter a while after the bpf device has been opened 
>>>> and an initial packet filter was set (previously, my code did not change 
>>>> the filter after it was initially set).
>>>> 
>>>> I'm focusing on bpf_setf() since that seems to be the place that could be 
>>>> tickling a problem, and I see that bpf_setf() calls reset_d(d) to clear 
>>>> the hold buffer. I have manually verified that the BPFD lock is held 
>>>> during the call to reset_d(), and the lock is held every other place that 
>>>> the buffers are manipulated, so I haven't been able to find any place that 
>>>> seems vulnerable to losing one of the bpf buffers. Still searching, but 
>>>> any help would be appreciated.
>>> 
>>> Can you please check this code on -current?
>>> Locking changed quite significantly some time ago, so there is a good 
>>> chance that you can get rid of this panic (or discover a different one which 
>>> is really "new") :).
>> 
>> I'm not ready to run this app on current, so I have merged revs 229898, 
>> 233937, 233938, 233946, 235744, 235745, 235746, 235747, 236231, 236251, 
>> 236261, 236262, 236559, and 236806 to my 8.3 checkout to get code that 
>> should be virtually identical to current without the timestamp changes.
>> 
>> Unfortunately, I have only been able to trigger the panic in my test lab 
>> once -- so I'm not sure whether a lack of problems with the updated code 
>> will be indicative of likely success in the field where this has been 
>> triggered regularly at some sites…
>> 
>> Thanks,
>> Guy
>> 
> 
> 
> FWIW, I was able to trigger the panic with the original 8.3 code again in my 
> test lab. With these changes resulting from merging the revs mentioned above, 
> I have not seen any panics in my test lab setup in two days of load testing, 
> and AFAIK, packet capturing seems to be working fine.

Of course, the test system panicked with the same problem in catchpacket() an 
hour after I wrote this.

(kgdb) where
#0  doadump () at pcpu.h:224
#1  0xffffffff804c8280 in boot (howto=260) at ../../../kern/kern_shutdown.c:441
#2  0xffffffff804c8703 in panic (fmt=0x0) at ../../../kern/kern_shutdown.c:614
#3  0xffffffff8069ffad in trap_fatal (frame=0xffffffff809edbc0, eva=Variable 
"eva" is not available.
)
    at ../../../amd64/amd64/trap.c:825
#4  0xffffffff806a02e1 in trap_pfault (frame=0xffffff800014a8a0, usermode=0)
    at ../../../amd64/amd64/trap.c:741
#5  0xffffffff806a06bf in trap (frame=0xffffff800014a8a0)
    at ../../../amd64/amd64/trap.c:478
#6  0xffffffff80687cd4 in calltrap () at ../../../amd64/amd64/exception.S:228
#7  0xffffffff8069dc06 in bcopy () at ../../../amd64/amd64/support.S:124
#8  0xffffffff8056f69e in catchpacket (d=0xffffff005aaaf000, 
    pkt=0xffffff0001f46200 "", pktlen=522, snaplen=Variable "snaplen" is not 
available.
) at ../../../net/bpf.c:2240
#9  0xffffffff8056fc66 in bpf_mtap (bp=0xffffff0001be8c80, 
    m=0xffffff0001f46200) at ../../../net/bpf.c:2064
#10 0xffffffff80579c15 in ether_input (ifp=0xffffff0001b73800, 
    m=0xffffff0001f46200) at ../../../net/if_ethersubr.c:635
#11 0xffffffff802b694a in em_rxeof (rxr=0xffffff0001bca200, count=99, done=0x0)
    at ../../../dev/e1000/if_em.c:4404
#12 0xffffffff802b6db8 in em_handle_que (context=Variable "context" is not 
available.
)
    at ../../../dev/e1000/if_em.c:1494
#13 0xffffffff80506d85 in taskqueue_run_locked (queue=0xffffff0001be1580)
    at ../../../kern/subr_taskqueue.c:250
---Type <return> to continue, or q <return> to quit---q 
Quit
(kgdb) frame 8
#8  0xffffffff8056f69e in catchpacket (d=0xffffff005aaaf000, 
    pkt=0xffffff0001f46200 "", pktlen=522, snaplen=Variable "snaplen" is not 
available.
) at ../../../net/bpf.c:2240
warning: Source file is more recent than executable.

2240            bpf_append_bytes(d, d->bd_sbuf, curlen, &hdr, sizeof(hdr));
(kgdb) print *d
$1 = {bd_next = {le_next = 0xffffff0023fff400, le_prev = 0xffffff0001be8c90}, 
  bd_sbuf = 0x0, bd_hbuf = 0xffffff8000ffa000 "??~P", bd_fbuf = 0x0, 
  bd_slen = 0, bd_hlen = 2068, bd_bufsize = 8388608, 
  bd_bif = 0xffffff0001be8c80, bd_rtout = 1, bd_rfilter = 0xffffff0001e6f580, 
  bd_wfilter = 0x0, bd_bfilter = 0x0, bd_rcount = 7, bd_dcount = 0, 
  bd_promisc = 1 '\001', bd_state = 0 '\0', bd_immediate = 1 '\001', 
  bd_writer = 0 '\0', bd_hdrcmplt = 1, bd_direction = 1, bd_feedback = 0, 
  bd_async = 0, bd_sig = 23, bd_sigio = 0x0, bd_sel = {si_tdlist = {
      tqh_first = 0x0, tqh_last = 0x0}, si_note = {kl_list = {
        slh_first = 0x0}, kl_lock = 0xffffffff80497920 <knlist_mtx_lock>, 
      kl_unlock = 0xffffffff804978f0 <knlist_mtx_unlock>, 
      kl_assert_locked = 0xffffffff804945d0 <knlist_mtx_assert_locked>, 
      kl_assert_unlocked = 0xffffffff804945e0 <knlist_mtx_assert_unlocked>, 
      kl_lockarg = 0xffffff005aaaf0d8}, si_mtx = 0x0}, bd_lock = {
    lock_object = {lo_name = 0xffffff0001a5fce0 "bpf", lo_flags = 16973824, 
      lo_data = 0, lo_witness = 0x0}, mtx_lock = 18446742974226712768}, 
  bd_callout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, 
        tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, 
    c_lock = 0xffffff005aaaf0d8, c_flags = 0, c_cpu = 0}, bd_label = 0x0, 
  bd_fcount = 7, bd_pid = 89517, bd_locked = 0, bd_bufmode = 1, bd_wcount = 0, 
  bd_wfcount = 0, bd_wdcount = 0, bd_zcopy = 0, bd_compat32 = 0 '\0'}

Now, I am thinking the malloc() of the sbuf is failing but not sure how/why -- 
I thought malloc(size, M_BPF, M_WAITOK) should not fail?
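
For comparison, the pointer state in the dump (bd_sbuf and bd_fbuf both NULL, bd_hbuf still set) can also be reached without any allocation failure, at least in a toy model. The sketch below is a simplified userland model, not the code in bpf.c: struct model_d and the model_* functions are hypothetical names, and only the bd_* field names are taken from the dump. It assumes a three-buffer scheme where a rotation moves store to hold and free to store, and shows what happens if a rotation runs while no free buffer is available.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the three buffer slots. Only the bd_* field
 * names come from the dump; everything else is illustrative. */
struct model_d {
    char  *bd_sbuf;   /* store buffer: new packets are appended here */
    char  *bd_hbuf;   /* hold buffer: handed to the reader */
    char  *bd_fbuf;   /* free buffer: becomes the next store buffer */
    size_t bd_slen;   /* bytes used in the store buffer */
    size_t bd_hlen;   /* bytes used in the hold buffer */
};

/* Start with one store buffer, one free buffer, and no hold buffer. */
static void model_init(struct model_d *d, char *store, char *spare)
{
    assert(store != NULL && spare != NULL);
    d->bd_sbuf = store;
    d->bd_hbuf = NULL;
    d->bd_fbuf = spare;
    d->bd_slen = 0;
    d->bd_hlen = 0;
}

/* Unchecked rotation: store -> hold, free -> store. If bd_fbuf is NULL
 * on entry, bd_sbuf ends up NULL and the old hold buffer is lost. */
static void model_rotate(struct model_d *d)
{
    d->bd_hbuf = d->bd_sbuf;
    d->bd_hlen = d->bd_slen;
    d->bd_sbuf = d->bd_fbuf;
    d->bd_slen = 0;
    d->bd_fbuf = NULL;
}

/* Checked rotation: refuse to rotate when there is no free buffer.
 * Returns 0 on success, -1 if the rotation was skipped. */
static int model_rotate_checked(struct model_d *d)
{
    if (d->bd_fbuf == NULL)
        return -1;
    model_rotate(d);
    return 0;
}
```

In this model, calling model_rotate() twice in a row without the reader handing the hold buffer back leaves bd_sbuf and bd_fbuf NULL while bd_hbuf is still set. Whether anything like that can actually happen in the kernel with the BPFD lock held is exactly what I have not been able to establish.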

Guy
