> From: "Lance Richardson" <lrich...@redhat.com> > To: dev@openvswitch.org > Sent: Wednesday, July 6, 2016 7:39:52 PM > Subject: [ovs-dev] [PATCH] netdev-dummy: fix crash with more than one passive > connection > > Investigation found that Some of the occasional failures in the > "ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS" test case are caused > by ovs-vswitchd crashing with SIGSEGV. It turns out that the > crash occurrs when the number of netdev-dummy passive connections > transitions from 1 to 2. When xrealloc() copies the array of > dummy_packet_stream structures from the original buffer to a > newly allocated one, the struct ovs_list txq member of the structure > becomes corrupt (e.g. if ovs_list_is_empty() would have returned > false before the copy, it will return true after the copy, which > will lead to a crash when the bogus packet buffer on the list is > dereferenced). > > Fix by taking a hint from David Wheeler and adding a level of > indirection. > > Signed-off-by: Lance Richardson <lrich...@redhat.com> > --- > lib/netdev-dummy.c | 17 ++++++++++------- > 1 file changed, 10 insertions(+), 7 deletions(-) > > diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c > index 24c107e..c99db2b 100644 > --- a/lib/netdev-dummy.c > +++ b/lib/netdev-dummy.c > @@ -68,7 +68,7 @@ enum dummy_netdev_conn_state { > > struct dummy_packet_pconn { > struct pstream *pstream; > - struct dummy_packet_stream *streams; > + struct dummy_packet_stream **streams; > size_t n_streams; > }; > > @@ -328,7 +328,8 @@ dummy_packet_conn_close(struct dummy_packet_conn *conn) > case PASSIVE: > pstream_close(pconn->pstream); > for (i = 0; i < pconn->n_streams; i++) { > - dummy_packet_stream_close(&pconn->streams[i]); > + dummy_packet_stream_close(pconn->streams[i]); > + free(pconn->streams[i]); > } > free(pconn->streams); > pconn->pstream = NULL; > @@ -446,8 +447,9 @@ dummy_pconn_run(struct netdev_dummy *dev) > > pconn->streams = xrealloc(pconn->streams, > ((pconn->n_streams + 1) > - * sizeof *s)); > - s = &pconn->streams[pconn->n_streams++]; > + * sizeof s)); > + s = xmalloc(sizeof *s); > + pconn->streams[pconn->n_streams++] = s; > dummy_packet_stream_init(s, new_stream); > } else if (error != EAGAIN) { > VLOG_WARN("%s: accept failed (%s)", > @@ -458,7 +460,7 @@ dummy_pconn_run(struct netdev_dummy *dev) > } > > for (i = 0; i < pconn->n_streams; i++) { > - struct dummy_packet_stream *s = &pconn->streams[i]; > + struct dummy_packet_stream *s = pconn->streams[i]; > > error = dummy_packet_stream_run(dev, s); > if (error) { > @@ -466,6 +468,7 @@ dummy_pconn_run(struct netdev_dummy *dev) > stream_get_name(s->stream), > ovs_retval_to_string(error)); > dummy_packet_stream_close(s); > + free(s); > pconn->streams[i] = pconn->streams[--pconn->n_streams]; > } > } > @@ -553,7 +556,7 @@ dummy_packet_conn_wait(struct dummy_packet_conn *conn) > case PASSIVE: > pstream_wait(conn->u.pconn.pstream); > for (i = 0; i < conn->u.pconn.n_streams; i++) { > - struct dummy_packet_stream *s = &conn->u.pconn.streams[i]; > + struct dummy_packet_stream *s = conn->u.pconn.streams[i]; > dummy_packet_stream_wait(s); > } > break; > @@ -578,7 +581,7 @@ dummy_packet_conn_send(struct dummy_packet_conn *conn, > switch (conn->type) { > case PASSIVE: > for (i = 0; i < conn->u.pconn.n_streams; i++) { > - struct dummy_packet_stream *s = &conn->u.pconn.streams[i]; > + struct dummy_packet_stream *s = conn->u.pconn.streams[i]; > > dummy_packet_stream_send(s, buffer, size); > pstream_wait(conn->u.pconn.pstream); > -- > 2.5.5 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev >
Here is a small script that reliably reproduces the crash in ovs-vswitchd. I don't have an explanation for why we have two connections to the same port in the "ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS" test case (this happens infrequently), perhaps it's something like the active connect from a peer ovs-vswitchd being interrupted and re-tried. #!/bin/bash set -ex PWD=$(pwd) export PATH=${PWD}/vswitchd:${PWD}/utilities:${PWD}/ovsdb:$PATH ovs_setenv() { export OVS_RUNDIR=${PWD}/$1 export OVS_LOGDIR=${PWD}/$1 export OVS_DBDIR=${PWD}/$1 export OVS_SYSCONFDIR=${PWD}/$1 export OVS_PKGDATADIR=${PWD}/$1 } for x in repro_main repro_hv1 repro_hv2; do mkdir -p $x rm -f $x/* ovs_setenv $x : > $OVS_RUNDIR/.conf.db.~lock~ ovsdb-tool create $OVS_RUNDIR/conf.db vswitchd/vswitch.ovsschema ovsdb-server --remote=punix:$OVS_RUNDIR/db.sock -vconsole:off --detach --no-chdir --pidfile --log-file ovs-vsctl --no-wait -- init ovs-vswitchd --enable-dummy=system -vvconn -vofproto_dpif -vunixctl -vconsole:off --detach --no-chdir --pidfile --log-file done ovs_setenv repro_main ovs-vsctl add-br foo \ -- add-port foo p1 \ -- set Interface p1 options:pstream="punix:$PWD/repro_main/p1.sock" ovs_setenv repro_hv1 ovs-vsctl add-br br1 \ -- add-port br1 p1 \ -- set Interface p1 options:stream="unix:$PWD/repro_main/p1.sock" ovs_setenv repro_hv2 ovs-vsctl add-br br1 \ -- add-port br1 p1 \ -- set Interface p1 options:stream="unix:$PWD/repro_main/p1.sock" _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev