Hi Jesse, Thank you for looking into the patch Please find my comments below,
Regards _Sugesh > -----Original Message----- > From: Jesse Gross [mailto:je...@kernel.org] > Sent: Monday, August 15, 2016 6:44 PM > To: Chandran, Sugesh <sugesh.chand...@intel.com> > Cc: ovs dev <dev@openvswitch.org> > Subject: Re: [PATCH RFC] netdev-dpdk: Rx checksum offloading feature on > DPDK physical ports. > > On Mon, Aug 8, 2016 at 3:03 AM, Sugesh Chandran > <sugesh.chand...@intel.com> wrote: > > To enable checksum offloading at rx side while adding a port, add the > > 'rx-checksum-offload' option to the 'ovs-vsctl add-port' command-line > > as below, > > > > 'ovs-vsctl add-port br0 dpdk0 -- \ > > set Interface dpdk0 type=dpdk options:rx-checksum-offload=true' > > Is there a reason not to enable this by default? It seems like it would be > beneficial to more users that way. > > I think we should also add some documentation for this new option. > [Sugesh] : I will add the documentation in the v2 release of patch. My concern over make it enabled by default is, the possible performance impact of being Vectorization OFF which is completely depends on the traffic pattern and packet size. Though I haven’t find any thing as such in my testing. This will avoid any surprises for the customer. Any comments? > > 2) Normally, OVS generates checksum for tunnel packets. It is more > > efficient to calculate the checksum at 'tunnel_push' operation itself > > as all the relevant fields are already present in the cache and also > > no additional validation overhead is involved as in checksum offloading. > > I don't quite understand this comment. This is more efficient compared to > what? Presumably, at least for large packets, there would still be some > benefit if the issue with losing vectorization wasn't present. [Sugesh] Sorry for the confusion. This is about generating checksum for tunnel packets at encap side(i.e tx checksum offload). What I meant here is generating checksum for tunnel packets at the time of 'tunnel_push' is more efficient than offloading into the NIC. In detail, At the time of packet encapsulation in 'tunnel_push' , the egress port for the tunnel packet is unknown. The tunnel packets are recirculated in OVS to decide the action, which can be either send to a phy port with tx checksum offload or send to phy port without tx offload, or even send to any other virtual/kernel port. So to offload the Tx checksum in NIC, *) Mark the packet for checksum offload at the time of tunnel_push *) At 'netdev_send' validate every packet for tx_checksum offload flag, *) If tx checksum offload-flag is set, Calculate the checksum in software or hardware based on NIC offload configuration. While implementing, we found that the validating every packet at netdev_send + vectorization OFF make performance worse than calculating checksum at the time of 'tunnel_push' because the cache is warm. The worst case is sending packet to a port with no tx checksum offload. It makes additional overhead of 1) Marking every packet with offload flag at the time of tunnel_push 2) validate every packet for offload flag at netdev_send, 3) Since the port doesn’t support offload, load all the fields into cache and calculate the checksum in software before sending out. Please let me know if anything is wrong / not clear? > > > diff --git a/lib/dp-packet.h b/lib/dp-packet.h index 7c1e637..f2e9d4f > > 100644 > > --- a/lib/dp-packet.h > > +++ b/lib/dp-packet.h > > @@ -38,6 +38,14 @@ enum OVS_PACKED_ENUM dp_packet_source { > > > > #define DP_PACKET_CONTEXT_SIZE 64 > > > > +enum hw_pkt_ol_flags { > > + OL_RX_IP_CKSUM_GOOD = (1 << 0), > > + OL_RX_L4_CKSUM_GOOD = (1 << 1), > > + OL_RX_IP_CKSUM_BAD = (1 << 2), > > + OL_RX_L4_CKSUM_BAD = (1 << 3), > > Is it necessary to redefine this fields in OVS instead of just using them from > the DPDK mbuf directly? This means that we have to do mapping and other > checks at runtime, which if I remember correctly resulted in a performance > hit when you posted numbers before. [Sugesh] : Yes, the performance improvement will be slightly less due to these new flags. The idea here is to generalize the hardware offloading , so that it can be extended for other hardware offloading as well. Something like exposing the hardware offloading to virtual ports/tx offloading as well in the future? Comments? Just to confirm, if we are not using the new flags, still we need to verify the source of packet like below, if (packet->source == DPBUF_DPDK) /* Do the checksum validation */ The problem with this approach is, In future, how a packet from virtual port will use the hardware offload ? How to enable the any offload at the tx side? > > I think we could also simplify things by not checking > NETDEV_RX_CHECKSUM_OFFLOAD for each packet. Presumably if checksum > offloading isn't configured on the interface then it wouldn't set checksum > good on the packet. [Sugesh] : Yes, the packet must have CHECKUM_UNKNOWN flag is set on Interface without checksum offloading. Again still we need to check for the packet source , before checking these flags. I can change the logic to avoid using these flags if its OK to limit the scope only for rx checksum offloading for now. > > > diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c index > > ce2582f..bad2c55 100644 > > --- a/lib/netdev-native-tnl.c > > +++ b/lib/netdev-native-tnl.c > > @@ -178,7 +179,10 @@ udp_extract_tnl_md(struct dp_packet *packet, > struct flow_tnl *tnl, > > return NULL; > > } > > > > - if (udp->udp_csum) { > > + if (packet->md.ol_flags & OL_RX_L4_CKSUM_GOOD) { > > + tnl->flags |= FLOW_TNL_F_CSUM; > > + } > > + else if (udp->udp_csum) { > > uint32_t csum; > > if (netdev_tnl_is_header_ipv6(dp_packet_data(packet))) { > > csum = packet_csum_pseudoheader6(dp_packet_l3(packet)); > > I think I would put the check for OL_RX_L4_CKSUM_GOOD inside the if (udp- > >udp_csum). It's more logical that way and also avoids any possible problems > with drivers that might mistakenly report checksum of 0 as a validated, > correct checksum. [Sugesh] Sure, will make the change in the v2 version of patch. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev