Hello everyone,

In my own testing, I recently stumbled across an issue where I could get qemu 
to exit when sending traffic to my application. To do this, I simply needed to 
do the following:

1) Start my virtio interfaces
2) Send some traffic into/out of the interfaces
3) Stop the interfaces
4) Start the interfaces
5) Send some more traffic

At this point, I would lose connectivity to my VM.  Further investigation 
revealed qemu exiting with the following log:

        2016-09-01T15:45:32.119059Z qemu-kvm: Guest moved used index from 5 to 1
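
For context, here is a minimal sketch of the kind of sanity check that could produce a message like this. It assumes the host keeps its own copy of the last ring index it saw and compares it with the guest's current index using 16-bit wrapping arithmetic. The function names and the ring size of 256 are hypothetical and are not taken from qemu's source.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: number of entries the guest appears to have added
 * since the host last looked. Ring indices are free-running 16-bit
 * counters, so the subtraction wraps modulo 65536.
 */
static uint16_t ring_index_delta(uint16_t last_seen, uint16_t current)
{
        return (uint16_t)(current - last_seen);
}

/*
 * The host can accept at most ring_size new entries at once. A larger
 * delta means the index moved backwards, or the ring was reset without
 * the host knowing about it.
 */
static bool index_moved_sanely(uint16_t last_seen, uint16_t current,
                               uint16_t ring_size)
{
        return ring_index_delta(last_seen, current) <= ring_size;
}
```

With the values from the log, ring_index_delta(5, 1) wraps to 65532, which is far larger than any ring, so a check like this would fail. That would be consistent with the guest resetting its ring indices while the host still remembers the old ones.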

I found the following bug report against qemu, reported by a user of DPDK: 
https://bugs.launchpad.net/qemu/+bug/1558175

That thread seems to have stalled out, so I think we probably should deal with 
the problem within DPDK itself. Either way, later in the bug report chain, we 
see a link to this patch to DPDK: 
http://dpdk.org/browse/dpdk/commit/?id=9a0615af774648. The submitter of the bug 
report claims that this patch fixes the problem. Perhaps it does. However, it 
appears to introduce a new one: with the patch reverted, I cannot reproduce the 
crash described above. That leads me to believe the patch caused a regression.

To summarize the patch's changes, it basically changes the virtio_dev_stop 
function to flag the device as stopped, and stops the device when 
closing/uninitializing it. However, there is a seemingly unintended 
side-effect. 

In virtio_dev_start, we have the following block of code:

        /* On restart after stop do not touch queues */
        if (hw->started)
                return 0;

        /* Do final configuration before rx/tx engine starts */
        virtio_dev_rxtx_start(dev);

        ...

Prior to the patch, if an interface were stopped then started, without 
restarting the application, the queues would be left as-is, because hw->started 
would be set to 1. Now, calling stop sets hw->started to 0, which means the 
next call to start will "touch the queues". This is the unintended side-effect 
that causes the problem.
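
To make the side-effect concrete, here is a small standalone model of the start/stop logic described above. It is only a sketch: struct model_hw, the queue_inits counter, and the stop_clears_flag switch are hypothetical stand-ins, and I am assuming that start sets hw->started to 1 once the queues are set up.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct model_hw {
        uint8_t started;           /* mirrors hw->started */
        int stop_clears_flag;      /* 1 = post-patch behavior, 0 = pre-patch */
        unsigned int queue_inits;  /* times the queues were (re)initialized */
};

/* Stand-in for virtio_dev_rxtx_start(); only counts the calls. */
static void model_rxtx_start(struct model_hw *hw)
{
        hw->queue_inits++;
}

static int model_dev_start(struct model_hw *hw)
{
        /* On restart after stop do not touch queues */
        if (hw->started)
                return 0;

        /* Do final configuration before rx/tx engine starts */
        model_rxtx_start(hw);
        hw->started = 1;  /* assumption: set once the queues are ready */
        return 0;
}

static void model_dev_stop(struct model_hw *hw)
{
        /* Post-patch, stop flags the device as stopped. */
        if (hw->stop_clears_flag)
                hw->started = 0;
}
```

Running start, stop, start on this model initializes the queues once with stop_clears_flag set to 0 and twice with it set to 1. The second initialization is the one that the host side does not expect.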

I made a change locally to break the state of the device into two: started and 
opened. The device starts out neither started nor opened. If the device is 
accepting packets, it is started. If the device has set up its queues, it is 
opened. Stopping the device does not close the device. This allows me to change 
the check above to:

        if (hw->opened) {
                hw->started = 1;
                return 0;
        }

Then, if I stop and start the device, it does not reinitialize the queues. I 
have no problem. I can restart ports as much as I want, and the system keeps 
running. Traffic flows when they've restarted as well, which is always a plus.
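
As a rough illustration of the two-flag idea, here is a standalone model of the state handling. This is a sketch under assumptions, not the patch itself: struct split_hw, the function names, and the queue_inits counter are hypothetical, and I am assuming that closing the device is what clears opened.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct split_hw {
        uint8_t started;           /* device is accepting packets */
        uint8_t opened;            /* queues have been set up */
        unsigned int queue_inits;  /* times the queues were (re)initialized */
};

static int split_dev_start(struct split_hw *hw)
{
        /* On restart after stop do not touch queues */
        if (hw->opened) {
                hw->started = 1;
                return 0;
        }

        hw->queue_inits++;  /* stand-in for virtio_dev_rxtx_start() */
        hw->opened = 1;
        hw->started = 1;
        return 0;
}

/* Stopping only stops packet processing; the queues stay set up. */
static void split_dev_stop(struct split_hw *hw)
{
        hw->started = 0;
}

/* Assumption: closing is the only path that clears opened. */
static void split_dev_close(struct split_hw *hw)
{
        hw->started = 0;
        hw->opened = 0;
}
```

In this model, any number of stop/start cycles leaves queue_inits at 1. Only a close followed by a start sets the queues up again.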

Some background:
- I tested against DPDK 16.04 and DPDK 16.07.
- I'm using virtio NICs.
- CPU: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
- Host OS: CentOS Linux release 7.1.1503 (Core)
- Guest OS: CentOS Linux release 7.2.1511 (Core)
- Qemu-kvm version: 1.5.3-86.el7_1.6

I plan on submitting a patch to fix this tomorrow. Let me know if anyone has 
any thoughts about this, or a better way to fix it.

Thanks,

Kyle
