Any chance you could try under valgrind? It's easy, and it often pinpoints the problem immediately.
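
One thing that stands out in the gdb output quoted below: the bad ofproto and netdev values look like printable ASCII rather than addresses. Here is a minimal sketch that decodes them. The helper name pointer_as_text is made up for illustration, and it assumes the host is x86-64 (little-endian) and that the two fields sit next to each other in memory, which the print output suggests but does not prove.

```python
# Assumption: the crashing host is x86-64 (little-endian), so the lowest
# byte of each pointer value is the first byte in memory.

def pointer_as_text(value: int, width: int = 8) -> str:
    """Return the bytes of a pointer-sized value, in memory order, as text.

    Non-printable bytes are shown as '.'.
    """
    raw = value.to_bytes(width, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Values copied from the `print ofport->up` output in the quoted gdb session.
ofproto = 0x2D313962382D6434
netdev = 0x6464613131646164

# gdb prints ofproto immediately before netdev, so show them in that order.
print(pointer_as_text(ofproto) + pointer_as_text(netdev))
# prints 4d-8b91-dad11add
```

If that reading is right, the bytes resemble a fragment of a UUID-style string, which would be consistent with the ofport memory having been freed and reused for string data. That is only a guess from one core; valgrind should be able to confirm or rule it out.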

On Tue, Sep 10, 2013 at 11:49:03AM -0400, Jing Su wrote:
> I updated to openvswitch 1.9.3, and am still getting segfaults. Seems like
> there is some bad pointer math or memory corruption. Brief gdb info below.
> I don't know anything about the internals so can't really help unwind the
> stack and figure out what's wrong. If it helps, I can provide a core and
> binary (they're kind of big so not attaching it here until requested).
>
> In the gdb session below, the netdev ptr that's passed in is bad, causing
> the segfault, but probably not the source of the problem.
>
> Program received signal SIGSEGV, Segmentation fault.
> netdev_get_etheraddr (netdev=0x6464613131646164, mac=0x18d33e6 "") at lib/netdev.c:495
> 495 return netdev_get_dev(netdev)->netdev_class->get_etheraddr(netdev, mac);
> (gdb) bt
> #0 netdev_get_etheraddr (netdev=0x6464613131646164, mac=0x18d33e6 "") at lib/netdev.c:495
> #1 0x0000000000417cc3 in send_bpdu_cb (pkt=0x18cfe40, port_num=13, ofproto_=0x184e6d0) at ofproto/ofproto-dpif.c:1378
> #2 0x0000000000477e95 in stp_send_bpdu (p=0x189f788, bpdu=0x7fff29d523d0, bpdu_size=35) at lib/stp.c:1347
> #3 0x00000000004782e0 in stp_transmit_config (p=0x189f788) at lib/stp.c:855
> #4 0x00000000004783c0 in stp_config_bpdu_generation (stp=0x189f1c0) at lib/stp.c:912
> #5 0x0000000000478c54 in stp_hello_timer_expiry (stp=0x189f1c0) at lib/stp.c:1154
> #6 stp_tick (stp=0x189f1c0, ms=<optimized out>) at lib/stp.c:306
> #7 0x0000000000420f6e in stp_run (ofproto=0x184e6d0) at ofproto/ofproto-dpif.c:1558
> #8 run (ofproto_=<optimized out>) at ofproto/ofproto-dpif.c:1006
> #9 0x0000000000411bdb in ofproto_run (p=0x184e6e0) at ofproto/ofproto.c:1076
> #10 0x000000000040bc85 in bridge_run () at vswitchd/bridge.c:2076
> #11 0x0000000000404a5c in main (argc=<optimized out>, argv=<optimized out>) at vswitchd/ovs-vswitchd.c:126
> (gdb) frame 0
> #0 netdev_get_etheraddr (netdev=0x6464613131646164, mac=0x18d33e6 "") at lib/netdev.c:495
> 495 return netdev_get_dev(netdev)->netdev_class->get_etheraddr(netdev, mac);
> (gdb) print *netdev
> Cannot access memory at address 0x6464613131646164
> (gdb) frame 1
> #1 0x0000000000417cc3 in send_bpdu_cb (pkt=0x18cfe40, port_num=13, ofproto_=0x184e6d0) at ofproto/ofproto-dpif.c:1378
> 1378 netdev_get_etheraddr(ofport->up.netdev, eth->eth_src);
> (gdb) print ofport->up
> $1 = {hmap_node = {hash = 26737488, next = 0x188eac0}, ofproto =
> 0x2d313962382d6434, netdev = 0x6464613131646164, pp = {port_no = 25654,
> hw_addr = "36\000\000\000",
> name = "!\000\000\000\000\000\000\000p\374\227\001\000\000\000", config
> = 25545152, state = OFPUTIL_PS_STP_LISTEN, curr = 80, advertised = 0,
> supported = 48, peer = 0, curr_speed = 26737488,
> max_speed = 0}, ofp_port = 2, change_seq = 0, mtu = 2}
> (gdb) frame 3
> #3 0x00000000004782e0 in stp_transmit_config (p=0x189f788) at lib/stp.c:855
> 855 stp_send_bpdu(p, &config, sizeof config);
> (gdb) print config
> $4 = {header = {protocol_id = 0, protocol_version = 0 '\000', bpdu_type = 0
> '\000'}, flags = 0 '\000', root_id = 4923969635662299264, root_path_cost =
> 0, bridge_id = 4923969635662299264,
> port_id = 3712, message_age = 0, max_age = 20, hello_time = 2,
> forward_delay = 15}
> (gdb) print p
> $5 = (struct stp_port *) 0x189f788
> (gdb) print *p
> $6 = {stp = 0x189f1c0, aux = 0x185c8b0, port_id = 32782, state =
> STP_FORWARDING, path_cost = 100, designated_root = 9223502441917732164,
> designated_cost = 0, designated_bridge = 9223502441917732164,
> designated_port = 32782, topology_change_ack = false, config_pending =
> false, change_detection_enabled = true, message_age_timer = {active =
> false, value = 0}, forward_delay_timer = {
> active = false, value = 3870}, hold_timer = {active = false, value =
> 276}, tx_count = 1129, rx_count = 0, error_count = 0, state_changed = false}
>
> Jing Su
>
> Phone: +1-888-365-GRID (x710)
> Email: jin...@gridcentric.com
>
> On Sun, Sep 8, 2013 at 1:55 PM, Ben Pfaff <b...@nicira.com> wrote:
>
> > On Sat, Sep 07, 2013 at 10:13:57PM -0400, Jing Su wrote:
> > > I'm getting a problem with ovs-vswitchd segfaulting when I delete virtual
> > > machines on KVM.
> > >
> > > Linux 3.2.0-29-generic #46-Ubuntu SMP
> > > openvswitch-1.9.0 - compiled from source
> >
> > We just released 1.9.3, can you try that?
> >
> _______________________________________________
> discuss mailing list
> discuss@openvswitch.org
> http://openvswitch.org/mailman/listinfo/discuss