On 24/06/2015 09:47, "Traynor, Kevin" <kevin.tray...@intel.com> wrote:
>
> >> -----Original Message-----
> >> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Panu Matilainen
> >> Sent: Wednesday, June 24, 2015 9:33 AM
> >> To: Pravin Shelar; Jesse Gross
> >> Cc: dev@openvswitch.org; Flavio Leitner
> >> Subject: Re: [ovs-dev] [PATCH] dpif-netdev: Check for PKT_RX_RSS_HASH flag.
> >>
> >> On 06/24/2015 05:06 AM, Pravin Shelar wrote:
> >> > On Tue, Jun 23, 2015 at 2:51 PM, Jesse Gross <je...@nicira.com> wrote:
> >> >> On Mon, Jun 22, 2015 at 8:08 PM, Pravin Shelar <pshe...@nicira.com> wrote:
> >> >>> On Fri, Jun 19, 2015 at 11:24 AM, Daniele Di Proietto
> >> >>> <diproiet...@vmware.com> wrote:
> >> >>>>
> >> >>>> On 18/06/2015 23:57, "Traynor, Kevin" <kevin.tray...@intel.com> wrote:
> >> >>>>
> >> >>>>>
> >> >>>>>> -----Original Message-----
> >> >>>>>> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Daniele Di Proietto
> >> >>>>>> Sent: Tuesday, June 16, 2015 7:39 PM
> >> >>>>>> To: dev@openvswitch.org
> >> >>>>>> Subject: [ovs-dev] [PATCH] dpif-netdev: Check for PKT_RX_RSS_HASH flag.
> >> >>>>>>
> >> >>>>>> DPDK mbufs contain a valid RSS hash only if PKT_RX_RSS_HASH is
> >> >>>>>> set in 'ol_flags'. Otherwise the hash is garbage and doesn't
> >> >>>>>> relate to the packet.
> >> >>>>>>
> >> >>>>>> This fixes an issue with vhost, which, being a virtual NIC, doesn't
> >> >>>>>> compute the hash.
> >> >>>>>>
> >> >>>>>> Unfortunately the ixgbe vPMD doesn't set the PKT_RX_RSS_HASH, forcing
> >> >>>>>> OVS to compute a hash in software. This has a significant impact on
> >> >>>>>> performance (-30% throughput in a single flow setup) which can be
> >> >>>>>> mitigated if the CPU supports crc32c instructions.
> >> >>>>>
> >> >>>>> As per the other thread on this I'm a bit concerned about the performance
> >> >>>>> drop from this patch, so I did some testing of this and alternative/
> >> >>>>> complementary solutions.
> >> >>>>>
> >> >>>>> Here are the options I looked at and some comments:
> >> >>>>> 1. This patch in isolation: vhost drops about 15% for vhost-vhost and
> >> >>>>> phy-vhost-phy (because of sw hash) but there are also drops of ~25% for
> >> >>>>> phy-phy and ~15% for phy-ivshmem-phy.
> >> >>>>>
> >> >>>>> 2. Leave the code as is and let EMC misses happen for vhost rx pkts:
> >> >>>>> I measure this at ~35% drop if missed *every time* for vhost-vhost. We
> >> >>>>> see in testing that it can also never happen, but this is not realistic.
> >> >>>>> There should be no impact to other DPDK interfaces.
> >> >>>>>
> >> >>>>> 3. Add hash reset for packets from vhost: This is another way of forcing
> >> >>>>> the software hash for vhost rx and it is roughly equivalent in performance
> >> >>>>> to 1. for vhost-vhost (~15% drop), while there is no significant drop
> >> >>>>> for phy-vhost-phy. There should be no impact to other DPDK interfaces.
> >> >>>>>
> >> >>>>> 4. Apply this patch and turn off Rx Vectorisation. vhost-vhost will drop
> >> >>>>> ~15% as per 1. and there should be nothing significant for phy-vhost-phy.
> >> >>>>> We would lose the 10% gain that rx vectorisation gave us for phy-phy.
> >> >>>>> There should be no impact for dpdkr ports.
> >> >>>>>
> >> >>>>> In terms of not knowing whether the hw hash is valid or not if the flag is
> >> >>>>> not checked, I would have expected the pmd to return an error on config if
> >> >>>>> the hash wasn't supported, but I'm not sure that it does.
> >> >>>>> In the worst case where there was an incorrect hash, it would miss the EMC
> >> >>>>> which is about a 45% drop for phy-phy. I would think it's pretty safe that
> >> >>>>> if we configure it, the hash will be correct but I guess there is a
> >> >>>>> possibility it wouldn't be.
> >> >>>>>
> >> >>>>> Even if it is possible to get a smaller patch to fix the underlying issue
> >> >>>>> in DPDK, it would be in DPDK 2.1 at the earliest meaning the performance
> >> >>>>> would remain low until sometime in August. If it's DPDK 2.2, then it would
> >> >>>>> be sometime in December. This would mean any performance drops would be
> >> >>>>> present in OVS 2.4 and possibly OVS 2.5.
> >> >>>>>
> >> >>>>> Sorry :( but based on the performance drop with this patch in isolation it
> >> >>>>> would be a NAK from me. My preference would be 3 which gives best
> >> >>>>> performance, or 4 which is a bit lower for phy-phy but safer.
> >> >>>>>
> >> >>>>> Kevin.
> >> >>>>
> >> >>>> Thanks for all the testing. I guess it might make sense to stretch our
> >> >>>> interpretation of the API in this case, because it wouldn't affect
> >> >>>> correctness.
> >> >>>>
> >> >>>> Unless there is any other objection I'm fine with the 3rd approach.
> >> >>>>
> >> >>>
> >> >>> We can use 3rd approach to fix issue on branch 2.4. Then have patch to
> >> >>> check the PKT_RX_RSS_HASH flag on master.
> >> >>> By the time we release branch 2.5 we will have proper fix in DPDK and
> >> >>> performance will bounce back.
> >> >>
> >> >> I think this is probably a reasonable compromise. I think it's better
> >> >> to not keep a workaround in for an unbounded amount of time, otherwise
> >> >> we'll forget about it and it will come back to bite us in the future.
> >> >
> >> > ok, Once the DPDK fix is backported to DPDK 2.0, we can remove the
> >> > workaround.
> >>
> >> That's assuming there will be a DPDK 2.0.1 release, but I have seen no
> >> evidence of such plans in the DPDK camp.
>
> I don't expect there will be a DPDK 2.0.1 release either. I'm optimistic we
> can get a standalone patch to fix the issue in DPDK 2.1 which we will have
> at the end of July. We could then roll DPDK 2.1 support into OVS master (and
> presumably OVS 2.5).
>
> The issue is fixed as part of the unified packet api changes but that won't
> be available (by default) until DPDK 2.2, so obviously we would prefer not to
> have to wait until then.

I sent a patch to the list that implements the workaround.

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
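
For readers following the thread, the difference between checking the flag (the original patch, option 1) and resetting the hash on vhost receive (option 3, the workaround) can be sketched roughly as below. This is only an illustration and not the patch that was sent: the struct, the function names, the flag's bit position, the FNV-1a software hash and the "zero means no hash" convention are all assumptions made for the sketch, and none of it is taken from the OVS or DPDK sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the DPDK flag of the same name; the bit position is
 * illustrative and not taken from any DPDK header. */
#define SKETCH_PKT_RX_RSS_HASH (1ULL << 1)

/* Hypothetical, minimal packet: only the fields this sketch needs. */
struct sketch_pkt {
    uint64_t ol_flags;           /* offload flags set by the rx path */
    uint32_t rss_hash;           /* hash field; may hold garbage */
    uint32_t src_ip, dst_ip;     /* 5-tuple used by the software hash */
    uint16_t src_port, dst_port;
    uint8_t proto;
};

/* Fold one 32-bit value into an FNV-1a hash, byte by byte. */
static uint32_t
sketch_mix(uint32_t h, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

/* Placeholder software 5-tuple hash (FNV-1a); not the hash OVS uses. */
uint32_t
sketch_sw_hash(const struct sketch_pkt *pkt)
{
    uint32_t h = 2166136261u;

    assert(pkt != NULL);
    h = sketch_mix(h, pkt->src_ip);
    h = sketch_mix(h, pkt->dst_ip);
    h = sketch_mix(h, ((uint32_t) pkt->src_port << 16) | pkt->dst_port);
    h = sketch_mix(h, pkt->proto);
    return h;
}

/* Option 1: trust the hash field only when the flag says it is valid.
 * A driver that never sets the flag always takes the software path. */
uint32_t
sketch_hash_check_flag(const struct sketch_pkt *pkt)
{
    assert(pkt != NULL);
    if (pkt->ol_flags & SKETCH_PKT_RX_RSS_HASH) {
        return pkt->rss_hash;
    }
    return sketch_sw_hash(pkt);
}

/* Option 3, receive side: a vhost port clears the hash field so that
 * garbage is never mistaken for a real hash.  Treating zero as "no
 * hash" is an assumption of this sketch. */
void
sketch_vhost_rx_reset(struct sketch_pkt *pkt)
{
    assert(pkt != NULL);
    pkt->rss_hash = 0;
}

/* Option 3, lookup side: the flag is ignored; any non-zero hash is
 * used as is, so physical ports keep their hardware hash. */
uint32_t
sketch_hash_zero_unset(const struct sketch_pkt *pkt)
{
    assert(pkt != NULL);
    return pkt->rss_hash ? pkt->rss_hash : sketch_sw_hash(pkt);
}
```

In this sketch only packets that pass through sketch_vhost_rx_reset() pay for the software hash under option 3, which is the shape of the trade-off discussed above: physical ports are left alone, at the cost of not verifying that their hash is valid.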