Hi,

During troubleshooting sessions (OVS 2.4.1, DPDK 2.2.0) it was noticed that some guests trigger the SET_VRING_CALL message rather frequently. The rate ranges from a few times per minute up to 10 times per second.

From the DPDK log:

...
2016-08-01T19:58:39.829222+09:00 compute-0-6 ovs-vswitchd[140481]: VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL, 1
2016-08-01T19:58:39.829232+09:00 compute-0-6 ovs-vswitchd[140481]: VHOST_CONFIG: vring call idx:0 file:251
2016-08-01T19:58:39.829246+09:00 compute-0-6 ovs-vswitchd[140481]: VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL, 1
2016-08-01T19:58:39.829250+09:00 compute-0-6 ovs-vswitchd[140481]: VHOST_CONFIG: vring call idx:0 file:215
2016-08-01T19:58:40.778491+09:00 compute-0-6 ovs-vswitchd[140481]: VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL, 1
2016-08-01T19:58:40.778501+09:00 compute-0-6 ovs-vswitchd[140481]: VHOST_CONFIG: vring call idx:0 file:251
2016-08-01T19:58:40.778517+09:00 compute-0-6 ovs-vswitchd[140481]: VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL, 1
2016-08-01T19:58:40.778521+09:00 compute-0-6 ovs-vswitchd[140481]: VHOST_CONFIG: vring call idx:0 file:215
2016-08-01T19:58:41.813467+09:00 compute-0-6 ovs-vswitchd[140481]: VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL, 1
2016-08-01T19:58:41.813479+09:00 compute-0-6 ovs-vswitchd[140481]: VHOST_CONFIG: vring call idx:0 file:251
2016-08-01T19:58:41.813499+09:00 compute-0-6 ovs-vswitchd[140481]: VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL, 1
2016-08-01T19:58:41.813505+09:00 compute-0-6 ovs-vswitchd[140481]: VHOST_CONFIG: vring call idx:0 file:215
...

Note that the ", 1" at the end of the log entries is the file handle index added in a debug build of DPDK, not part of vanilla DPDK.

At high packet rates this can cause the kick to the guest to fail repeatedly while packets are being enqueued, because vq->callfd is not valid while it is being reconfigured. Sporadically this leads to the virtio ring becoming full. Once the ring is full, the enqueue functionality in DPDK stops kicking the guest. As the guest is interrupt driven and has not received all kicks, it will not empty the virtio ring.
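
To illustrate the failure mode described above, here is a minimal standalone sketch (not DPDK code). It assumes that during reconfiguration the old call descriptor has been closed and the new one is not yet in place, which is modelled here by setting the descriptor to -1. The names demo_vq, demo_kick() and demo_lost_kicks() are made up for this example.

```c
#include <errno.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for the vhost virtqueue; only the call fd matters. */
struct demo_vq {
    int callfd;
};

/* Kick the guest. Returns 0 on success, or a negative errno if the kick was lost. */
static int demo_kick(const struct demo_vq *vq)
{
    if (eventfd_write(vq->callfd, (eventfd_t)1) < 0)
        return -errno;
    return 0;
}

/* Returns the number of kicks that were lost, or a negative errno on setup failure. */
static int demo_lost_kicks(void)
{
    struct demo_vq vq = { .callfd = eventfd(0, EFD_NONBLOCK) };
    int lost = 0;

    if (vq.callfd < 0)
        return -errno;

    /* Valid descriptor: the kick is delivered. */
    if (demo_kick(&vq) < 0)
        lost++;

    /* Assumed reconfiguration window: old fd closed, new fd not yet installed. */
    close(vq.callfd);
    vq.callfd = -1;

    /* Invalid descriptor: eventfd_write() fails and the kick is silently lost
     * unless the caller checks the return value. */
    if (demo_kick(&vq) < 0)
        lost++;

    return lost;
}
```

The enqueue path quoted further down ignores the return value of eventfd_write(), so a failure of this kind would go unnoticed there.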

Possibly there is also some flaw in the guest virtio driver that contributes to this.

To "solve" this problem, the kick operation in virtio_dev_merge_rx() was moved out of the pkt_idx > 0 condition, so that the guest is kicked even when no packets were enqueued. A similar change was done in virtio_dev_rx().

Original vhost_rxtx.c, virtio_dev_merge_rx():

...
merge_rx_exit:
	if (likely(pkt_idx)) {
		/* flush used->idx update before we read avail->flags. */
		rte_mb();

		/* Kick the guest if necessary. */
		if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
			eventfd_write(vq->callfd, (eventfd_t)1);
	}

	return pkt_idx;
}
...

Questions

- Is it a valid operation to change the call/kick file descriptors (frequently) during device operation?
- For stability reasons it seems prudent to me to perform a kick even when the virtio ring is full. Since the code checks whether any packets were put on the ring before kicking, could it be that there is a penalty for kicking when the ring is full?
- Would there be other ways to protect against the call file descriptor changing frequently? This assumes that virtio device events in the guest will cause the occasional SET_VRING_CALL message as part of normal operation.

Any discussion on this topic will be appreciated.

Regards,
Patrik