I am able to recreate it. I suspect its some bug at the dpif_netlink layer.
I am trying to narrow it down further. Anything in particular I could look
for ?

Thanks.

(gdb) print *get
$3 = {
  key = 0x7fedec092190,
  key_len = 0x60,
  buffer = 0x7fee162a4b70,
  flow = 0x7fee162a4b20
}
(gdb) print *get->flow
$4 = {
  key = 0x7fedec092190,
  key_len = 0x60,
  mask = 0x7fedf4000ac0,
  mask_len = 0x458fc3,
  actions = 0x7fee162a5900,
  actions_len = 0x49e690,
  stats = {
    n_packets = 0x7fedf4000ac0,
    n_bytes = 0x6d506b38dcb0ac00,
    used = 0x7fedf4000ac0,
    tcp_flags = 0xe958
  }
}

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fee162a5700 (LWP 10592)]
format_odp_actions (ds=0x7fee162a2800, actions=0x7fee162a5900,
actions_len=4843152) at lib/odp-util.c:622
622                ds_put_format(ds, "%02x", ((const uint8_t *) a)[i]);
(gdb) bt
#0  format_odp_actions (ds=0x7fee162a2800, actions=0x7fee162a5900,
actions_len=4843152) at lib/odp-util.c:622
#1  0x0000000000456749 in log_flow_message (error=2, operation=0x506548
"flow_get", key=0x7fedec092190, key_len=96,
    mask=0x7fedf4000ac0, mask_len=4558787, stats=0x7fee162a4b50,
actions=0x7fee162a5900, actions_len=4843152, dpif=<optimized out>)
    at lib/dpif.c:1498
#2  0x0000000000456de8 in log_flow_get_message (error=2,
get=0x7fee162a29a8, dpif=0x24db8f0) at lib/dpif.c:1588
#3  dpif_operate (dpif=0x24db8f0, ops=0x7fee162a29f8, n_ops=1) at
lib/dpif.c:1158
#4  0x00000000004572ce in dpif_flow_get (dpif=<optimized out>,
key=<optimized out>, key_len=<optimized out>, buf=<optimized out>,
    flow=<optimized out>) at lib/dpif.c:852
#5  0x0000000000431514 in handle_missed_revalidation (ukey=0x7fedec092100,
revalidator=<optimized out>)
    at ofproto/ofproto-dpif-upcall.c:1609
#6  revalidator_sweep__ (revalidator=0x251abb0, purge=false) at
ofproto/ofproto-dpif-upcall.c:1636
#7  0x0000000000431da7 in revalidator_sweep (revalidator=0x251abb0) at
ofproto/ofproto-dpif-upcall.c:1657
#8  udpif_revalidator (arg=0x251abb0) at ofproto/ofproto-dpif-upcall.c:705
#9  0x000000000049e652 in ovsthread_wrapper (aux_=<optimized out>) at
lib/ovs-thread.c:338
#10 0x00007fee19791e9a in start_thread (arg=0x7fee162a5700) at
pthread_create.c:308
#11 0x00007fee18fba3fd in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#12 0x0000000000000000 in ?? ()

On Fri, Oct 17, 2014 at 4:17 PM, Ben Pfaff <b...@nicira.com> wrote:

> On Fri, Oct 17, 2014 at 04:07:25PM -0700, Madhu Challa wrote:
> > dpif_flow_get initializes the flow_get part of the union, down the stack
> > log_flow_message checks for actions || actions_len that could contain
> > garbage leading to the crash.
> >
> > saw the crash once when running stress tests. can be easily recreated
> > by running ovs-dpctl del-flows in a loop when traffic is going on
> >
> > Signed-off-by: Madhu Challa <cha...@noironetworks.com>
>
> The actions aren't in the dpif_op so I don't see how this would help.
> Can you explain?
>
> The actions are, instead, in the caller-provided dpif_flow.  I guess
> that the error is here in dpif_operate() where the code clears the
> flow only after trying to log uninitialized garbage from it:
>                     log_flow_get_message(dpif, get, error);
>
>                     if (error) {
>                         memset(get->flow, 0, sizeof *get->flow);
>                     }
>
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to