On 9 June 2015 at 09:04, Linhaifeng <haifeng.lin at huawei.com> wrote:
> On 2015/4/24 15:27, Luke Gorrie wrote:
> > You should be able to test it like this:
> >
> > 1. Boot two Linux kernel (e.g. 3.13) guests.
> > 2. Connect them via vhost switch.
> > 3. Run continuous traffic between them (e.g. iperf).
> >
> > I would expect that within a reasonable timeframe (< 1 hour) one of the
> > guests' network interfaces will hang indefinitely due to a missed
> > interrupt.
> >
> > You won't be able to reproduce this using DPDK guests because they are
> > not using the same interrupt suppression method.
>
> I think this patch can't resolve this problem. On the other hand we still
> would miss interrupt.
>

For what it is worth, we were able to reproduce the problem as described
above with older Snabb Switch releases, and we were also able to verify
that inserting a memory barrier fixes it.

This is the relevant commit in the snabbswitch repo for reference:
https://github.com/SnabbCo/snabbswitch/commit/c33cdd8704246887e11d7c353f773f7b488a47f2

In a nutshell, we added an MFENCE instruction after writing used->idx and
before checking VRING_F_NO_INTERRUPT.

I have not tested this case under DPDK myself, so I am not really certain
which memory barrier operations are sufficient/insufficient in that
context.

I hope that our experience is relevant/helpful though, and I am happy to
explain more if I have missed any important details.

Cheers,
-Luke
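
For reference, the ordering described above (write used->idx, full
barrier, then check the interrupt suppression flag) can be sketched in C
roughly as follows. This is only an illustration under stated assumptions,
not the code from the commit: struct ring_view and
publish_used_and_check_interrupt() are hypothetical names, the ring is
reduced to the two fields that matter here, and C11
atomic_thread_fence(memory_order_seq_cst) stands in for the explicit
MFENCE (on x86-64, compilers typically lower it to MFENCE or an equivalent
locked instruction).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag name as used in this mail; value 1 as in the virtio ring layout. */
#define VRING_F_NO_INTERRUPT 1

/* Hypothetical, trimmed-down view of the memory shared with the guest. */
struct ring_view {
    _Atomic uint16_t used_idx;    /* written by the host side (vswitch) */
    _Atomic uint16_t avail_flags; /* written by the guest driver */
};

/*
 * Publish a new used index, then decide whether the guest needs an
 * interrupt. Returns true if an interrupt should be sent.
 */
static bool publish_used_and_check_interrupt(struct ring_view *r,
                                             uint16_t new_used_idx)
{
    /* 1. Make the new used index visible to the guest. */
    atomic_store_explicit(&r->used_idx, new_used_idx, memory_order_release);

    /*
     * 2. Full barrier. Without it, x86 may perform the flag load below
     *    before the store above has left the store buffer, so the host
     *    can see "interrupts suppressed" while the guest, re-enabling
     *    interrupts at the same moment, still sees the old used index.
     */
    atomic_thread_fence(memory_order_seq_cst);

    /* 3. Only now read the guest's interrupt suppression flag. */
    uint16_t flags = atomic_load_explicit(&r->avail_flags,
                                          memory_order_relaxed);
    return (flags & VRING_F_NO_INTERRUPT) == 0;
}
```

The barrier only closes the race if the guest side has a matching one
between re-enabling interrupts and re-checking the used index.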