Any chance you could try running ovs-vswitchd under valgrind?  It's
easy, and it often pinpoints this kind of problem immediately.
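
Incidentally, the bad values in your "print ofport->up" output look like
text rather than addresses.  Read as little-endian bytes, ofproto, netdev,
and pp.port_no decode to ASCII hex digits and dashes, and gdb already shows
pp.hw_addr as the string "36".  That would be consistent with (though it does
not prove) the struct's memory having been freed and reused for a string,
perhaps the tail of a UUID.  Here is a minimal Python sketch of the decoding.
The helper name decode_le is mine, and it assumes a little-endian x86-64 host
and that port_no is a 16-bit field laid out right after netdev, neither of
which the report states explicitly:

```python
def decode_le(value: int, size: int = 8) -> str:
    """Interpret an integer as `size` little-endian bytes and decode them as ASCII."""
    return value.to_bytes(size, "little").decode("ascii")


# Field values copied from the `print ofport->up` output in the report.
ofproto = 0x2d313962382d6434   # pointer, 8 bytes
netdev = 0x6464613131646164    # pointer, 8 bytes (the one that faults)
port_no = 25654                # assumed to be a 16-bit field, 2 bytes
hw_addr_text = "36"            # gdb already prints hw_addr as the string "36"

# Concatenate the fields in the order they appear in the struct dump.
recovered = (decode_le(ofproto) + decode_le(netdev)
             + decode_le(port_no, 2) + hw_addr_text)
print(recovered)
```

If that reading is right, valgrind should be able to report the invalid read
together with where the block was allocated and freed.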

On Tue, Sep 10, 2013 at 11:49:03AM -0400, Jing Su wrote:
> I updated to openvswitch 1.9.3, and am still getting segfaults.  Seems like
> there is some bad pointer math or memory corruption.  Brief gdb info below.
>  I don't know anything about the internals so can't really help unwind the
> stack and figure out what's wrong.  If it helps, I can provide a core and
> binary (they're kind of big so not attaching it here until requested).
> 
> In the gdb session below, the netdev ptr that's passed in is bad, causing
> the segfault, but probably not the source of the problem.
> 
> Program received signal SIGSEGV, Segmentation fault.
> netdev_get_etheraddr (netdev=0x6464613131646164, mac=0x18d33e6 "") at lib/netdev.c:495
> 495         return netdev_get_dev(netdev)->netdev_class->get_etheraddr(netdev, mac);
> (gdb) bt
> #0  netdev_get_etheraddr (netdev=0x6464613131646164, mac=0x18d33e6 "") at lib/netdev.c:495
> #1  0x0000000000417cc3 in send_bpdu_cb (pkt=0x18cfe40, port_num=13, ofproto_=0x184e6d0) at ofproto/ofproto-dpif.c:1378
> #2  0x0000000000477e95 in stp_send_bpdu (p=0x189f788, bpdu=0x7fff29d523d0, bpdu_size=35) at lib/stp.c:1347
> #3  0x00000000004782e0 in stp_transmit_config (p=0x189f788) at lib/stp.c:855
> #4  0x00000000004783c0 in stp_config_bpdu_generation (stp=0x189f1c0) at lib/stp.c:912
> #5  0x0000000000478c54 in stp_hello_timer_expiry (stp=0x189f1c0) at lib/stp.c:1154
> #6  stp_tick (stp=0x189f1c0, ms=<optimized out>) at lib/stp.c:306
> #7  0x0000000000420f6e in stp_run (ofproto=0x184e6d0) at ofproto/ofproto-dpif.c:1558
> #8  run (ofproto_=<optimized out>) at ofproto/ofproto-dpif.c:1006
> #9  0x0000000000411bdb in ofproto_run (p=0x184e6e0) at ofproto/ofproto.c:1076
> #10 0x000000000040bc85 in bridge_run () at vswitchd/bridge.c:2076
> #11 0x0000000000404a5c in main (argc=<optimized out>, argv=<optimized out>) at vswitchd/ovs-vswitchd.c:126
> (gdb) frame 0
> #0  netdev_get_etheraddr (netdev=0x6464613131646164, mac=0x18d33e6 "") at lib/netdev.c:495
> 495         return netdev_get_dev(netdev)->netdev_class->get_etheraddr(netdev, mac);
> (gdb) print *netdev
> Cannot access memory at address 0x6464613131646164
> (gdb) frame 1
> #1  0x0000000000417cc3 in send_bpdu_cb (pkt=0x18cfe40, port_num=13, ofproto_=0x184e6d0) at ofproto/ofproto-dpif.c:1378
> 1378            netdev_get_etheraddr(ofport->up.netdev, eth->eth_src);
> (gdb) print ofport->up
> $1 = {hmap_node = {hash = 26737488, next = 0x188eac0}, ofproto = 0x2d313962382d6434, netdev = 0x6464613131646164, pp = {port_no = 25654, hw_addr = "36\000\000\000",
>     name = "!\000\000\000\000\000\000\000p\374\227\001\000\000\000", config = 25545152, state = OFPUTIL_PS_STP_LISTEN, curr = 80, advertised = 0, supported = 48, peer = 0, curr_speed = 26737488,
>     max_speed = 0}, ofp_port = 2, change_seq = 0, mtu = 2}
> (gdb) frame 3
> #3  0x00000000004782e0 in stp_transmit_config (p=0x189f788) at lib/stp.c:855
> 855                 stp_send_bpdu(p, &config, sizeof config);
> (gdb) print config
> $4 = {header = {protocol_id = 0, protocol_version = 0 '\000', bpdu_type = 0 '\000'}, flags = 0 '\000', root_id = 4923969635662299264, root_path_cost = 0, bridge_id = 4923969635662299264,
>   port_id = 3712, message_age = 0, max_age = 20, hello_time = 2, forward_delay = 15}
> (gdb) print p
> $5 = (struct stp_port *) 0x189f788
> (gdb) print *p
> $6 = {stp = 0x189f1c0, aux = 0x185c8b0, port_id = 32782, state = STP_FORWARDING, path_cost = 100, designated_root = 9223502441917732164, designated_cost = 0, designated_bridge = 9223502441917732164,
>   designated_port = 32782, topology_change_ack = false, config_pending = false, change_detection_enabled = true, message_age_timer = {active = false, value = 0}, forward_delay_timer = {
>     active = false, value = 3870}, hold_timer = {active = false, value = 276}, tx_count = 1129, rx_count = 0, error_count = 0, state_changed = false}
> 
> Jing Su
> 
> Phone: +1-888-365-GRID (x710)
> Email: jin...@gridcentric.com
> 
> 
> 
> On Sun, Sep 8, 2013 at 1:55 PM, Ben Pfaff <b...@nicira.com> wrote:
> 
> > On Sat, Sep 07, 2013 at 10:13:57PM -0400, Jing Su wrote:
> > > I'm getting a problem with ovs-vswitchd segfaulting when I delete virtual
> > > machines on KVM.
> > >
> > >
> > > Linux  3.2.0-29-generic #46-Ubuntu SMP
> > > openvswitch-1.9.0 - compiled from source
> >
> > We just released 1.9.3, can you try that?
> >

> _______________________________________________
> discuss mailing list
> discuss@openvswitch.org
> http://openvswitch.org/mailman/listinfo/discuss
