Some more comments below; I also posted a revised patch: http://openvswitch.org/pipermail/dev/2015-April/053789.html
> > Thanks for fixing this, Mark!
> >
> > I did a quick phy2phy test and (as you state in the commit message)
> > the throughput appears to drop (only by ~1%; we can happily live with
> > that, if necessary).
> >
> > If the problem exists just with packets sent to ring devices, couldn't
> > we reset the RSS hash only if we're transmitting to ring devices?
> > Or maybe I didn't fully understand the issue.
>
> I could do that, but the dpdk eth ports and ring ports share a lot of
> the same code (they both call netdev_dpdk_send__()). It was overall less
> computationally expensive to put it here, as it doesn't have to execute
> another loop through all packets to zero the hash.
>
> Would it help if I benchmarked both approaches?

I averaged a few test runs; all results are relative to performance on master:

Zero hash in netdev_dpdk_ring_send():
  phy-phy: -0%
  vm-loopback (ivshmem): -3.7%

Zero hash in netdev_dpdk_send__():
  phy-phy: -0.2%
  vm-loopback (ivshmem): -2.7%

I will take your suggestion to move it to netdev_dpdk_ring_send(). It makes
the code easier to understand and it doesn't affect the phy performance at
all (although the difference between the two approaches is small enough to
be in the noise). The vm-loopback performance drops in both cases.

I also tried pipelining the loop with prefetches, but I couldn't improve
the performance. I think the write to that cache line is hurting performance
(particularly because it ping-pongs between cores).

> > Also a couple of comments inline.
> >
> > > +/* When using 'dpdkr' and sending to a DPDK ring, we want to ensure that
> > > + * the rss hash field is clear. This is because the same mbuf may be
> > > + * modified by the consumer of the ring and returned into the datapath
> > > + * without recalculating the RSS hash.
> > > + */
> > > +#define RESET_RSS_HASH(mbuf) ((mbuf)->hash.rss = 0)
> > > +
> >
> > Definitely not a big deal, but CodingStyle.md suggests using inline
> > functions instead of macros where possible. Perhaps you can use
> > dp_packet_set_dp_hash(pkt, 0).
>
> Yes, it would be better to use this. When I wrote this code originally,
> that function did not exist. It would be better to call this inline
> function.
>
> > >      if (OVS_UNLIKELY(size > dev->max_packet_len)) {
> > >          VLOG_WARN_RL(&rl, "Too big size %d max_packet_len %d",
> > >                       (int)size, dev->max_packet_len);
> > > @@ -988,6 +995,8 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet **pkts,
> > >              continue;
> > >          }
> > >
> > > +        RESET_RSS_HASH(&pkts[i]->mbuf);
> > > +
> >
> > pkts[i]->mbuf will not be transmitted here; mbufs[newcnt] will.
> > Is there any reason why we're resetting the RSS hash on pkts[i]->mbuf?
>
> Good catch! Actually, this is not needed at all: the mbuf is freshly
> allocated, so it will have a zero hash field. :)
>
> > >          mbufs[newcnt] = rte_pktmbuf_alloc(dev->dpdk_mp->mp);
> > >
> > >          if (!mbufs[newcnt]) {
> > > @@ -1064,6 +1073,7 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
> > >
> > >      for (i = 0; i < cnt; i++) {
> > >          int size = dp_packet_size(pkts[i]);
> > > +
> > >          if (OVS_UNLIKELY(size > dev->max_packet_len)) {
> > >              if (next_tx_idx != i) {
> > >                  dpdk_queue_pkts(dev, qid,
> > > @@ -1078,6 +1088,9 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
> > >              dropped++;
> > >              next_tx_idx = i + 1;
> > >          }
> > > +
> > > +        RESET_RSS_HASH(&pkts[i]->mbuf);
> > > +
> > >      }
> > >      if (next_tx_idx != cnt) {
> > >          dpdk_queue_pkts(dev, qid,
> > > --
> > > 1.9.3
> > >
> > > _______________________________________________
> > > dev mailing list
> > > dev@openvswitch.org
> > > http://openvswitch.org/mailman/listinfo/dev