Hi Ciara,
thank you for your comments. Appreciated!
Added some minor notes.
Regards,
Patrik
On 03/03/2016 04:56 PM, Loftus, Ciara wrote:
Hello,
Hi Patrik,
The majority of your concerns seem to relate to the DPDK code itself, so I would
suggest posting to the d...@dpdk.org mailing list to raise them,
especially questions 2 & 3. Either way, some comments/questions below.
Thanks,
Ciara
While doing robustness tests on OVS with the DPDK datapath, I'm seeing
vswitchd terminate due to a buffer overrun in DPDK's
fdset_event_dispatch().
The underlying root cause, I'm quite sure, lies in our modified version
of OpenStack, or in test cases where some vhostuser ports happen to be left
behind when VMs with more than 16 vNICs are deleted using "nova delete".
How does "nova delete" bring down VMs? Graceful shutdown or another way? Are
the ports explicitly deleted after the VM is deleted, with del-port?
Yes, I think it is a graceful shutdown done by "nova delete". The ports are
deleted explicitly, but not necessarily after the VM shutdown -- which gives
us the problem discussed here:
http://openvswitch.org/pipermail/discuss/2016-February/020271.html
Thus, currently, the test leads to an accumulation of vhostuser sockets, and
the termination is triggered when more than roughly 500 sockets are open at the
same time (accumulated sockets + sockets for 4 VMs x 24 vNICs/VM).
Neither OVS (2.4.0, with corrections from the 2.4 branch) nor DPDK (2.1.0)
properly protects against a situation where file descriptor numbers exceed
what can be held by the fd_set bit mask, as far as I can determine. The
question, then, is whether this is a bug or whether I'm missing something.
I'm well aware that this is an unlikely scenario but I do prefer graceful
handling if possible.
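For reference, here is a minimal standalone example (not OVS or DPDK code) of
what I mean by the fd_set bit mask being the hard limit: the mask holds exactly
FD_SETSIZE bits, so select()/FD_ISSET() cannot safely be used with higher
descriptor numbers.

#include <stdio.h>
#include <sys/select.h>

int main(void)
{
    fd_set set;

    FD_ZERO(&set);

    /* The mask holds exactly FD_SETSIZE bits (1024 with glibc), i.e.
     * sizeof(fd_set) == FD_SETSIZE / 8 == 128 bytes. */
    printf("FD_SETSIZE = %d, sizeof(fd_set) = %zu bytes\n",
           FD_SETSIZE, sizeof(set));

    /* FD_SET()/FD_ISSET() with an fd >= FD_SETSIZE reads or writes past
     * the end of 'set' -- the overrun seen in fdset_event_dispatch().
     * Left commented out on purpose:
     *
     *     FD_SET(FD_SETSIZE, &set);
     */
    return 0;
}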
1) should there be a limit to the number of vhostuser ports that can be
created in OVS or should the limit be dictated by DPDK only?
I think we first need to figure out what that limit is/if there is one and then
if it can be avoided.
I came to the conclusion that it is hard to figure out in OVS what a
reasonable limit would be. Instead, DPDK seems to be the right place for
such a check.
2) in rte_vhost_driver_register() there seems to be an implicit assumption
that the file descriptor number and the vserver_cnt are related, which,
in general, does not hold. Can this check be improved?
This would be a question for the DPDK mailing list.
Yes, will try to phrase a discussion topic there.
3) should fdset_event_dispatch() protect against events from file descriptors
greater than or equal to FD_SETSIZE (=__FD_SETSIZE=1024)?
Likewise
After careful consideration, I think checking at the "source" is better,
corresponding roughly to 2) above.
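To make that more concrete, something along these lines is what I have in
mind (a sketch only; the helper name is mine and the exact call site in
vhost-net-user.c would need to be worked out on the DPDK list):

#include <sys/select.h>

/* Sketch of a "check at the source": refuse any descriptor that cannot be
 * represented in an fd_set before it is handed over to the fd_man.c code.
 * The function name is mine; in rte_vhost_driver_register() this would run
 * right after the AF_UNIX listen socket for the path has been created. */
static int
vhost_listen_fd_usable(int fd)
{
    if (fd < 0) {
        return 0;
    }
    if (fd >= FD_SETSIZE) {
        /* Registration should then fail (log via RTE_LOG, close the fd,
         * return -1) instead of corrupting the fd_set bit mask later in
         * fdset_event_dispatch(). */
        return 0;
    }
    return 1;
}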
What is your opinion?
---
In short, what happens is roughly that a vhostuser port is registered in
(OVS, netdev-dpdk.c):
static int
netdev_dpdk_vhost_user_construct(struct netdev *netdev_)
{
    struct netdev_dpdk *netdev = netdev_dpdk_cast(netdev_);
    int err;

    ovs_mutex_lock(&dpdk_mutex);

    /* Take the name of the vhost-user port and append it to the location
     * where the socket is to be created, then register the socket.
     */
    snprintf(netdev->vhost_id, sizeof(netdev->vhost_id), "%s/%s",
             vhost_sock_dir, netdev_->name);
    err = rte_vhost_driver_register(netdev->vhost_id);
    ....
}
The call to rte_vhost_driver_register() ends up in (DPDK,
vhost-net-user.c):
int
rte_vhost_driver_register(const char *path)
{
    struct vhost_server *vserver;

    pthread_mutex_lock(&g_vhost_server.server_mutex);

    if (ops == NULL)
        ops = get_virtio_net_callbacks();

    if (g_vhost_server.vserver_cnt == MAX_VHOST_SERVER) {
        RTE_LOG(ERR, VHOST_CONFIG,
            "error: the number of servers reaches maximum\n");
        pthread_mutex_unlock(&g_vhost_server.server_mutex);
        return -1;
    }
    ...
}
Here a check is done against MAX_VHOST_SERVER (=1024), and the number of
vhostuser ports is not yet at the maximum. But the process can still have
more file descriptors open than it has vhostuser sockets, so the check on
the server count does not bound the descriptor numbers.
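To illustrate that the descriptor numbers and the server count are unrelated,
a small standalone example (not DPDK code; run with a raised descriptor limit,
e.g. "ulimit -n 4096", otherwise the open() calls fail before 1024 is reached):

#include <stdio.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/select.h>

int main(void)
{
    /* Simulate a process that already holds many unrelated descriptors
     * (files, other sockets, eventfds, ...). */
    for (int i = 0; i < 1100; i++) {
        open("/dev/null", O_RDONLY);
    }

    /* The very first "vhostuser"-style socket now gets a number well
     * above FD_SETSIZE, even though the server count would still be 1. */
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    printf("new socket fd = %d, FD_SETSIZE = %d\n", fd, FD_SETSIZE);
    return 0;
}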
Then we come to the actual failure, where an event on a file
descriptor (fd=1024) leads to a memory access outside of the
fd_set bit mask, at (DPDK, fd_man.c, FD_ISSET line marked with "---->"):
void
fdset_event_dispatch(struct fdset *pfdset)
{
    fd_set rfds, wfds;
    int i, maxfds;
    struct fdentry *pfdentry;
    int num = MAX_FDS;
    fd_cb rcb, wcb;
    void *dat;
    int fd;
    int remove1, remove2;
    int ret;

    if (pfdset == NULL)
        return;

    while (1) {
        struct timeval tv;
        tv.tv_sec = 1;
        tv.tv_usec = 0;
        FD_ZERO(&rfds);
        FD_ZERO(&wfds);
        pthread_mutex_lock(&pfdset->fd_mutex);

        maxfds = fdset_fill(&rfds, &wfds, pfdset);

        pthread_mutex_unlock(&pfdset->fd_mutex);

        /*
         * When select is blocked, other threads might unregister
         * listenfds from and register new listenfds into fdset.
         * When select returns, the entries for listenfds in the fdset
         * might have been updated. It is ok if there is unwanted call
         * for new listenfds.
         */
        ret = select(maxfds + 1, &rfds, &wfds, NULL, &tv);
        if (ret <= 0)
            continue;

        for (i = 0; i < num; i++) {
            remove1 = remove2 = 0;
            pthread_mutex_lock(&pfdset->fd_mutex);
            pfdentry = &pfdset->fd[i];
            fd = pfdentry->fd;
            rcb = pfdentry->rcb;
            wcb = pfdentry->wcb;
            dat = pfdentry->dat;
            pfdentry->busy = 1;
            pthread_mutex_unlock(&pfdset->fd_mutex);
---->       if (fd >= 0 && FD_ISSET(fd, &rfds) && rcb)
                rcb(fd, dat, &remove1);
            if (fd >= 0 && FD_ISSET(fd, &wfds) && wcb)
                wcb(fd, dat, &remove2);
            ...
        }
    }
}
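For completeness, the kind of guard question 3) asks about would amount to
something like this (a sketch only, not a proposed patch; the same check would
also be needed in fdset_fill(), which puts the same descriptors into rfds/wfds):

#include <sys/select.h>

/* Sketch of the guard 3) asks about: only descriptors that fit in an fd_set
 * may be passed to FD_SET()/FD_ISSET().  The helper name is mine; both
 * fdset_fill() and the dispatch loop above would use such a check and skip
 * (or better, refuse to register) anything that does not fit. */
static inline int
fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

In the loop above this would replace the plain "fd >= 0" tests on the marked
line and the one below it, ideally together with a log message for the
descriptors that get skipped.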
Regards,
Patrik
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss