RE: Kernel Panic on 9.0 and 9.1 with carp on BCE network interface

2012-09-10 Thread Jean-Luc Dupont
> Date: Fri, 7 Sep 2012 17:14:41 +0400
> From: gleb...@freebsd.org
> To: jl.dup...@outlook.com
> CC: freebsd-stable@FreeBSD.org
> Subject: Re: Kernel Panic on 9.0 and 9.1 with carp on BCE network interface
> 
> On Thu, Aug 30, 2012 at 02:39:10PM +, Jean-Luc Dupont wrote:
> J> Sorry, it seems that I didn't put the right backtrace :
> J> 
> J> #0  doadump (textdump=Variable "textdump" is not available.
> J> ) at /usr/src/sys/kern/kern_shutdown.c:271
> J> 271 dumpsys(&dumper);
> J> (kgdb) #0  doadump (textdump=Variable "textdump" is not available.
> J> ) at /usr/src/sys/kern/kern_shutdown.c:271
> J> #1  0x807fdf02 in kern_reboot (howto=260)
> J> at /usr/src/sys/kern/kern_shutdown.c:448
> J> #2  0x807fe3e3 in panic (fmt=0x104 )
> J> at /usr/src/sys/kern/kern_shutdown.c:636
> J> #3  0x80ad2700 in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
> J> )
> J> at /usr/src/sys/amd64/amd64/trap.c:857
> J> #4  0x80ad2a3d in trap_pfault (frame=0xff82e97a3500, usermode=0)
> J> at /usr/src/sys/amd64/amd64/trap.c:773
> J> #5  0x80ad305e in trap (frame=0xff82e97a3500)
> J> at /usr/src/sys/amd64/amd64/trap.c:456
> J> #6  0x80abd67f in calltrap ()
> J> at /usr/src/sys/amd64/amd64/exception.S:228
> J> #7  0x8085f597 in m_copym (m=0x0, off0=1500, len=1480, wait=1)
> J> at /usr/src/sys/kern/uipc_mbuf.c:542
> J> #8  0x8092f2c8 in ip_fragment (ip=0xfe00970e0580, 
> J> m_frag=0xff82e97a3728, mtu=Variable "mtu" is not available.
> J> ) at /usr/src/sys/netinet/ip_output.c:822
> J> #9  0x8092fc17 in ip_output (m=0xfe00970e0500, opt=Variable "opt" is not available.
> J> )
> J> at /usr/src/sys/netinet/ip_output.c:653
> J> #10 0x80928713 in ip_forward (m=0xfe00970e0500, srcrt=Variable "srcrt" is not available.
> J> )
> J> at /usr/src/sys/netinet/ip_input.c:1494
> J> #11 0x80929dc8 in ip_input (m=0xfe00970e0500)
> J> at /usr/src/sys/netinet/ip_input.c:702
> 
> I don't see that this is CARP related. Do you use any firewall: pf or ipfw?
> 
> Can you please run the session below in gdb against the core file in question:
> 
> gdb> fr 9
> gdb> p mtu
> gdb> fr 7
> gdb> p off
> gdb> fr 8
> gdb> p m0
> gdb> p *m0
> 
> -- 
> Totus tuus, Glebius.
> ___
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Hi,

Thank you very much for your reply. We are using IPFW with several VLANs and
several CARP interfaces on Intel igb and bce network cards in Dell PowerEdge
servers. Since we stopped using the bce cards and use only the igb cards (with
more VLANs per interface), we have not had any more panics.

Here is the debugger output you asked for:

(kgdb) fr 9
#9  0x8092fc17 in ip_output (m=0xfe00941c8300, opt=Variable "opt" is not available.
) at /usr/src/sys/netinet/ip_output.c:653
653 error = ip_fragment(ip, &m, mtu, ifp->if_hwassist, sw_csum);
(kgdb) p mtu
$1 = 1500
(kgdb) fr 7
#7  0x8085f597 in m_copym (m=0x0, off0=1500, len=1317, wait=1) at /usr/src/sys/kern/uipc_mbuf.c:542
542 if (off < m->m_len)
(kgdb) p off
$2 = 1233
(kgdb) fr 8
#8  0x8092f2c8 in ip_fragment (ip=0xfe00941c8380, m_frag=0xff834869e7f8, mtu=Variable "mtu" is not available.
) at /usr/src/sys/netinet/ip_output.c:822
822 m->m_next = m_copym(m0, off, len, M_DONTWAIT);
(kgdb) p m0
$3 = (struct mbuf *) 0xfe00941c8300
(kgdb) p *m0
$4 = {m_hdr = {mh_next = 0xfe0081d51800, mh_nextpkt = 0x0, mh_data = 
0xfe00941c8380 "E", mh_len = 40, 
mh_flags = 2, mh_type = 1, pad = "\000\000\000\000\000"}, M_dat = {MH = 
{MH_pkthdr = {
rcvif = 0xfe0003b53800, header = 0x0, len = 267, flowid = 0, 
csum_flags = 0, csum_data = 65535, 
tso_segsz = 0, PH_vt = {vt_vtag = 0, vt_nrecs = 0}, tags = {slh_first = 
0x0}}, MH_dat = {MH_ext = {
  ext_buf = 0x400092ae00400045 , ext_free = 0x16207, 
  ext_arg1 = 0x42011d, ext_arg2 = 0x601005e, ext_size = 
2660147200, 
  ref_cnt = 0x40f7e20b010045, ext_type = -843971023}, 
MH_databuf = 
"E\000@\000�\222\000@\ab\001\000\000\000\000\000\000\000\035\001��B\000\000\000\000\000^\000\001\006\000�\216\236�\200\b\000E\000\001\v��@\0001\006��H\025T�\n\n\vK\000\025��^h���\223R>\200\030\000r\213�\000\000\001\001\b\n�$*\200:��\a",
 '\0' }}, 
M_databuf = 
"\0008�\003\000���\000\000\000\000\000\000\000\000\v\001\000\000\000\000\000\000\000\000\000\000��",
 '\0' , 
"E\000@\000�\222\000@\ab\001\000\000\000\000\000\000\000\035\001��B\000\000\000\000\000^\000\001\006\000�\216\236�\200\b\000E\000\001\v��@\0001\006��H\025T�\n\n\vK\000\025��^h���\223R>\200\030\000r\213�\000\000\001\001\b\n�$*\200:��\a",
 '\0' }}
(kgdb) 
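
The numbers above look consistent with m_copym() walking off the end of the
mbuf chain: frame 7 was entered with off0=1500, the packet header printed in
frame 8 reports len = 267, and 1500 - 267 = 1233, which is the value of off at
the point where m is NULL. What follows is only a sketch of that walk, not the
kernel code. struct sketch_mbuf, chain_length() and seek_offset() are
hypothetical names, and the 40 + 227 byte split used in the note after the
block is an assumption (only mh_len = 40 and len = 267 appear in the dump).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for struct mbuf: link and data length only. */
struct sketch_mbuf {
    struct sketch_mbuf *next;  /* plays the role of m_next */
    int len;                   /* plays the role of m_len */
};

/* Total number of bytes actually present in the chain. */
static int chain_length(const struct sketch_mbuf *m)
{
    int total = 0;

    for (; m != NULL; m = m->next)
        total += m->len;
    return total;
}

/*
 * Walk the chain towards byte offset `off`, subtracting each mbuf's length
 * as the loop around uipc_mbuf.c:542 does. Unlike that loop, stop when the
 * chain ends instead of dereferencing a NULL pointer.
 *
 * Returns the mbuf that contains the offset, with *rem set to the offset
 * inside it, or NULL if the chain is shorter than `off`, with *rem set to
 * the part of the offset that was left over.
 */
static const struct sketch_mbuf *
seek_offset(const struct sketch_mbuf *m, int off, int *rem)
{
    assert(off >= 0 && rem != NULL);

    while (m != NULL) {
        if (off < m->len)
            break;
        off -= m->len;
        m = m->next;
    }
    *rem = off;
    return m;
}
```

With a chain of 40 + 227 bytes, seek_offset(head, 1500, &rem) returns NULL and
leaves rem at 1233, matching $2 above. Why ip_fragment() asked for offset 1500
of a 267-byte packet is not something this sketch answers.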

usb port issue in 9.1-Prerelease (Possibly Cam related)

2012-09-10 Thread Benjamin Close

Hi Folks,
I'm facing an intermittent hang with a USB port which seems to be CAM
related.


The events that happen are:

o USB modem (HUAWEI E220) plugged into PC

ugen3.2:  at usbus3
u3g0: <3G Modem> on usbus3
u3g0: Found 3 ports.
umass0:  on usbus3
umass0:  SCSI over Bulk-Only; quirks = 0x
umass0:6:0:-1: Attached to scbus6
umass1:  on usbus3
umass1:  SCSI over Bulk-Only; quirks = 0x
umass1:7:1:-1: Attached to scbus7
cd1 at umass-sim0 bus 0 scbus6 target 0 lun 0
cd1:  Removable CD-ROM SCSI-2 device
cd1: 1.000MB/s transfers
cd1: Attempt to query device size failed: NOT READY, Medium not present
da0 at umass-sim1 bus 1 scbus7 target 0 lun 0
da0:  Removable Direct Access SCSI-2 device
da0: 1.000MB/s transfers
da0: Attempt to query device size failed: NOT READY, Medium not present


o Time elapses: many packets are passed, and neither da0 nor cd1 is used.


o USB Modem drops off the bus
   (It does this occasionally as it resets itself)

o Causes USB bus to lose devices

ugen3.2:  at usbus3 (disconnected)
u3g0: at uhub3, port 1, addr 2 (disconnected)
(cd1:umass-sim0:0:0:0): lost device, 1 refs
(cd1:umass-sim0:0:0:0): removing device entry
(pass4:umass-sim0:0:0:0): passdevgonecb: devfs entry is gone
(da0:umass-sim1:1:0:0): lost device - 0 outstanding, 1 refs
(da0:umass-sim1:1:0:0): removing device entry
(pass5:umass-sim1:1:0:0): passdevgonecb: devfs entry is gone
umass0: at uhub3, port 1, addr 2 (disconnected)


At this point that particular USB port is effectively useless. Nothing
plugged into the port shows up as a device.


Running usbconfig hangs with:

  PID    TID COMM             TDNAME           KSTACK
48562 101874 usbconfig        -                mi_switch+0x186 sleepq_wait+0x42 _sx_xlock_hard+0x426 usbd_enum_lock+0xac usb_ref_device+0x21c usb_open+0xc7 devfs_open+0x197 vn_open_cred+0x2ff kern_openat+0x20a amd64_syscall+0x540 Xfast_syscall+0xf7


Controller is:

uhci0@pci0:0:26:0:  class=0x0c0300 card=0x02091028 chip=0x28348086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) USB UHCI Controller'
    class      = serial bus
    subclass   = USB
uhci1@pci0:0:26:1:  class=0x0c0300 card=0x02091028 chip=0x28358086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) USB UHCI Controller'
    class      = serial bus
    subclass   = USB
ehci0@pci0:0:26:7:  class=0x0c0320 card=0x02091028 chip=0x283a8086 rev=0x02 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82801H (ICH8 Family) USB2 EHCI Controller'
    class      = serial bus
    subclass   = USB

It does, however, seem related to CAM: looking at the various threads
for the USB hub, I find:


(kgdb) bt
#0  sched_switch (td=0xfe000265, newtd=0xfe000227f000, flags=Variable "flags" is not available.
) at /usr/src/sys/kern/sched_ule.c:1927
#1  0x808f34c6 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:485
#2  0x8092bfd2 in sleepq_wait (wchan=0xfe001ec2a900, pri=92) at /usr/src/sys/kern/subr_sleepqueue.c:623
#3  0x808f3c69 in _sleep (ident=0xfe001ec2a900, lock=0xfe00371e9210, priority=Variable "priority" is not available.
) at /usr/src/sys/kern/kern_synch.c:250
#4  0x802bea02 in cam_sim_free (sim=0xfe001ec2a900, free_devq=1) at /usr/src/sys/cam/cam_sim.c:112
#5  0x8074f8ba in umass_detach (dev=Variable "dev" is not available.
) at /usr/src/sys/dev/usb/storage/umass.c:2183
#6  0x8091a054 in device_detach (dev=0xfe001ec2e900) at device_if.h:214
#7  0x8075c458 in usb_detach_device (udev=0xfe0007ce8800, iface_index=32 ' ', flag=Variable "flag" is not available.
) at /usr/src/sys/dev/usb/usb_device.c:1065
#8  0x8075c5f4 in usb_unconfigure (udev=0xfe0007ce8800, flag=Variable "flag" is not available.
) at /usr/src/sys/dev/usb/usb_device.c:455
#9  0x8075c88e in usb_free_device (udev=0xfe0007ce8800, flag=Variable "flag" is not available.
) at /usr/src/sys/dev/usb/usb_device.c:2093
#10 0x80764e5e in uhub_explore (udev=0xfe0007353800) at /usr/src/sys/dev/usb/usb_hub.c:358
#11 0x8074f536 in usb_bus_explore (pm=Variable "pm" is not available.
) at /usr/src/sys/dev/usb/controller/usb_controller.c:359
#12 0x80769173 in usb_process (arg=Variable "arg" is not available.
) at /usr/src/sys/dev/usb/usb_process.c:170
#13 0x808bc2df in fork_exit (callout=0x807690a0, arg=0xff80007c0e88, frame=0xff804743cc40) at /usr/src/sys/kern/kern_fork.c:992
#14 0x80bc216e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:602



From cam_sim_free(struct cam_sim *sim, int free_devq):

(kgdb) l
107 {
108 int error;
109
110 sim->refcount--;
111 if (sim->refcount > 0) {
112 error = msleep(sim, sim->mtx, PRIBIO, "simfree", 0);
113 KASSERT(error == 0, ("invalid
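
The listing suggests that cam_sim_free() drops its own reference and then
sleeps, with no timeout (the last msleep() argument is 0), until the remaining
references go away. A minimal user-space sketch of that pattern with POSIX
threads follows, assuming the release path is what wakes the sleeper.
struct sketch_sim, sim_release() and sim_free_wait() are hypothetical names,
not the CAM API, and the sketch re-checks the count in a loop where the
listing uses a single if.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for struct cam_sim: a lock, a wait channel, a count. */
struct sketch_sim {
    pthread_mutex_t mtx;
    pthread_cond_t  cv;
    int             refcount;
};

/* Drop one reference; the last holder wakes anyone blocked in sim_free_wait(). */
static void sim_release(struct sketch_sim *sim)
{
    pthread_mutex_lock(&sim->mtx);
    assert(sim->refcount >= 1);
    if (--sim->refcount == 0)
        pthread_cond_broadcast(&sim->cv);
    pthread_mutex_unlock(&sim->mtx);
}

/*
 * Drop the caller's reference, then block until every other reference is
 * gone. There is no timeout, so a reference that is never released leaves
 * the caller asleep forever.
 */
static void sim_free_wait(struct sketch_sim *sim)
{
    pthread_mutex_lock(&sim->mtx);
    sim->refcount--;
    while (sim->refcount > 0)
        pthread_cond_wait(&sim->cv, &sim->mtx);
    pthread_mutex_unlock(&sim->mtx);
}
```

If some holder never calls sim_release(), sim_free_wait() never returns, which
is what the stack of the USB explore thread above resembles. Whether a leaked
reference is really the cause here is not established by the trace.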

Re: bsnmpd always died on HDD detach

2012-09-10 Thread Miroslav Lachman

Mikolaj Golub wrote:
> On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:
>> I am running bsnmpd with a basic snmpd.config (only community and location
>> changed).
>>
>> When there is a problem with an HDD and the disk disappears from the ATA
>> channel (e.g. the disk is physically removed), bsnmpd always dumps core:
>>
>> kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)
>>
>> I have seen this for a long time on all releases of the 7.x and 8.x
>> branches (i386 and amd64). I have not tested 9.x.
>>
>> Is it a known bug, or should I file a PR?
>
> Do you happen to run bsnmp-ucd too? If you do, then what version is it?
> In bsnmp-ucd-0.3.5 I introduced a bug that led to a bsnmpd crash on a
> disk detach. It has been fixed (thanks to Brian Somers) in 0.3.6.

No, I never installed bsnmp-ucd. We are using plain bsnmpd from base
without any modules.

It is used by MRTG only for network traffic, nothing else.

Miroslav Lachman


Re: bsnmpd always died on HDD detach

2012-09-10 Thread Mikolaj Golub
On Mon, Sep 10, 2012 at 04:46:15PM +0200, Miroslav Lachman wrote:
> Mikolaj Golub wrote:
> > On Sun, Sep 09, 2012 at 11:56:55PM +0200, Miroslav Lachman wrote:
> >> I am running bsnmpd with a basic snmpd.config (only community and location
> >> changed).
> >>
> >> When there is a problem with an HDD and the disk disappears from the ATA
> >> channel (e.g. the disk is physically removed), bsnmpd always dumps core:
> >>
> >> kernel: pid 1188 (bsnmpd), uid 0: exited on signal 11 (core dumped)
> >>
> >> I have seen this for a long time on all releases of the 7.x and 8.x
> >> branches (i386 and amd64). I have not tested 9.x.
> >>
> >> Is it a known bug, or should I file a PR?
> >
> > Do you happen to run bsnmp-ucd too? If you do, then what version is it?
> > In bsnmp-ucd-0.3.5 I introduced a bug that led to a bsnmpd crash on a
> > disk detach. It has been fixed (thanks to Brian Somers) in 0.3.6.
> 
> No, I never installed bsnmp-ucd. We are using plain bsnmpd from base
> without any modules.
> It is used by MRTG only for network traffic. Nothing else.

Then the backtrace might be useful.

gdb /usr/sbin/bsnmpd /path/to/bsnmpd.core
bt

-- 
Mikolaj Golub