There is a typo in the subject line. This is the RFC patch, and the subject should be:
"[RFC PATCHv4] netdev-dpdk: Enable Rx checksum offloading feature on DPDK physical ports."
Sorry for missing it.

Regards
_Sugesh

> -----Original Message-----
> From: Chandran, Sugesh
> Sent: Wednesday, August 24, 2016 3:54 PM
> To: dev@openvswitch.org; je...@kernel.org
> Cc: Chandran, Sugesh <sugesh.chand...@intel.com>
> Subject: [PATCH] netdev-dpdk: Enable Rx checksum offloading feature on
> DPDK physical ports.
>
> Add Rx checksum offloading feature support on DPDK physical ports. By default,
> Rx checksum offloading is enabled if the NIC supports it. However, checksum
> offloading can be turned OFF either while adding a new DPDK physical port to
> OVS or at run time.
>
> Rx checksum offloading can be turned off by setting the parameter to
> 'false'. For example, to disable Rx checksum offloading when adding a port:
>
> 'ovs-vsctl add-port br0 dpdk0 -- \
> set Interface dpdk0 type=dpdk options:rx-checksum-offload=false'
>
> OR (to disable at run time, after the port has been added to OVS):
>
> 'ovs-vsctl set Interface dpdk0 options:rx-checksum-offload=false'
>
> Similarly, to turn ON Rx checksum offloading at run time:
>
> 'ovs-vsctl set Interface dpdk0 options:rx-checksum-offload=true'
>
> This is an RFC patch because the new checksum offload flags
> 'PKT_RX_L4_CKSUM_GOOD' and 'PKT_RX_IP_CKSUM_GOOD' will be available only in
> the DPDK 16.11 release. OVS must be compiled with DPDK 16.11 to use the
> checksum offloading feature.
>
> Tx checksum offloading support is not implemented for the following
> reasons.
>
> 1) Checksum offloading and vectorization are mutually exclusive in the DPDK
> poll mode driver. Vector packet processing is turned OFF when checksum
> offloading is enabled, which causes a significant performance drop on the
> Tx side.
>
> 2) Normally, OVS generates the checksum for tunnel packets in software at the
> 'tunnel push' operation, where the tunnel headers are created.
> However, enabling Tx checksum offloading involves:
>
> *) Marking every packet for Tx checksum offloading at 'tunnel_push' and
>    recirculating.
> *) At the time of xmit, validating the same flag and instructing the NIC to
>    do the checksum calculation. In case the NIC doesn't support Tx checksum
>    offloading, the checksum calculation has to be done in software before
>    sending out the packets.
>
> No significant performance improvement was noticed with Tx checksum
> offloading, due to the overhead of additional validations + non-vector packet
> processing. In some test scenarios, it introduces a performance drop too.
>
> Rx checksum offloading still offers an 8-9% improvement on VxLAN tunneling
> decapsulation even though the SSE vector Rx function is disabled in the DPDK
> poll mode driver.
>
> Signed-off-by: Sugesh Chandran <sugesh.chand...@intel.com>
>
> ---
> v4
> - Unconditionally clear the checksum flags once in the pop operation rather
>   than separately in the IP and UDP layers.
>
> v3
> - Reset the checksum offload flags in tunnel pop operation after the
>   validation.
> - Reconfigure the dpdk port with rx checksum offload only if the new
>   configuration is different from the current one.
>
> v2
> - Set Rx checksum enabled by default.
> - Modified commit message, explaining the tradeoff with tx checksum
>   offloading.
> - Use dpdk mbuf checksum offload flags instead of defining a new
>   metadata field in OVS dp_packet.
> - Validate the udp checksum mbuf flag only if the checksum is present in the
>   packet.
> - Doc update with Rx checksum offloading feature.
> ---
>  INSTALL.DPDK-ADVANCED.md | 18 ++++++++++++++++--
>  lib/dp-packet.h          | 29 +++++++++++++++++++++++++++++
>  lib/netdev-dpdk.c        | 46 ++++++++++++++++++++++++++++++++++++++++++++++
>  lib/netdev-native-tnl.c  | 38 +++++++++++++++++++++++---------------
>  vswitchd/vswitch.xml     | 13 +++++++++++++
>  5 files changed, 127 insertions(+), 17 deletions(-)
>
> diff --git a/INSTALL.DPDK-ADVANCED.md b/INSTALL.DPDK-ADVANCED.md
> index 857c805..6cc42d9 100755
> --- a/INSTALL.DPDK-ADVANCED.md
> +++ b/INSTALL.DPDK-ADVANCED.md
> @@ -14,7 +14,8 @@ OVS DPDK ADVANCED INSTALL GUIDE
>  9. [Flow Control](#fc)
>  10. [Pdump](#pdump)
>  11. [Jumbo Frames](#jumbo)
> -12. [Vsperf](#vsperf)
> +12. [Rx Checksum Offload](#rx_csum)
> +13. [Vsperf](#vsperf)
>
>  ## <a name="overview"></a> 1. Overview
>
> @@ -834,7 +835,20 @@ vhost ports:
>  ifconfig eth1 mtu 9000
>  ```
>
> -## <a name="vsperf"></a> 12. Vsperf
> +## <a name="rx_csum"></a> 12. Rx Checksum Offload
> +By default, DPDK physical ports are enabled with Rx checksum offload. Rx
> +checksum offload can be configured on a DPDK physical port either when adding
> +or at run time.
> +
> +e.g. To disable Rx checksum offload when adding a DPDK port dpdk0:
> +
> +`ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:rx-checksum-offload=false`
> +
> +e.g. To disable the Rx checksum offloading on a existing DPDK port dpdk0:
> +
> +`ovs-vsctl set Interface dpdk0 type=dpdk options:rx-checksum-offload=false`
> +
> +## <a name="vsperf"></a> 13. Vsperf
>
>  Vsperf project goal is to develop vSwitch test framework that can be used to
>  validate the suitability of different vSwitch implementations in a Telco deployment
> diff --git a/lib/dp-packet.h b/lib/dp-packet.h
> index 7c1e637..ee601d0 100644
> --- a/lib/dp-packet.h
> +++ b/lib/dp-packet.h
> @@ -592,6 +592,35 @@ dp_packet_rss_invalidate(struct dp_packet *p)
>  #endif
>  }
>
> +static inline bool
> +dp_packet_ip_checksum_valid(struct dp_packet *p)
> +{
> +#ifdef DPDK_NETDEV
> +    return p->mbuf.ol_flags & PKT_RX_IP_CKSUM_GOOD;
> +#else
> +    return 0;
> +#endif
> +}
> +
> +static inline bool
> +dp_packet_l4_checksum_valid(struct dp_packet *p)
> +{
> +#ifdef DPDK_NETDEV
> +    return p->mbuf.ol_flags & PKT_RX_L4_CKSUM_GOOD;
> +#else
> +    return 0;
> +#endif
> +}
> +
> +static inline void
> +reset_dp_packet_checksum_ol_flags(struct dp_packet *p)
> +{
> +#ifdef DPDK_NETDEV
> +    p->mbuf.ol_flags &= ~(PKT_RX_L4_CKSUM_GOOD | PKT_RX_L4_CKSUM_BAD |
> +                          PKT_RX_IP_CKSUM_GOOD | PKT_RX_IP_CKSUM_BAD);
> +#endif
> +}
> +
>  enum { NETDEV_MAX_BURST = 32 }; /* Maximum number packets in a batch. */
>
>  struct dp_packet_batch {
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 6d334db..46c4045 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -326,6 +326,10 @@ struct ingress_policer {
>      rte_spinlock_t policer_lock;
>  };
>
> +enum dpdk_hw_ol_features {
> +    NETDEV_RX_CHECKSUM_OFFLOAD = 1 << 0,
> +};
> +
>  struct netdev_dpdk {
>      struct netdev up;
>      int port_id;
> @@ -387,6 +391,10 @@ struct netdev_dpdk {
>
>      /* DPDK-ETH Flow control */
>      struct rte_eth_fc_conf fc_conf;
> +
> +    /* DPDK-ETH hardware offload features,
> +     * from the enum set 'dpdk_hw_ol_features' */
> +    uint32_t hw_ol_features;
>  };
>
>  struct netdev_rxq_dpdk {
> @@ -624,6 +632,8 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
>          conf.rxmode.jumbo_frame = 0;
>          conf.rxmode.max_rx_pkt_len = 0;
>      }
> +    conf.rxmode.hw_ip_checksum = (dev->hw_ol_features &
> +                                  NETDEV_RX_CHECKSUM_OFFLOAD) != 0;
>      /* A device may report more queues than it makes available (this has
>       * been observed for Intel xl710, which reserves some of them for
>       * SRIOV): rte_eth_*_queue_setup will fail if a queue is not
> @@ -684,6 +694,28 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
>  }
>
>  static void
> +dpdk_eth_checksum_offload_configure(struct netdev_dpdk *dev)
> +    OVS_REQUIRES(dev->mutex)
> +{
> +    struct rte_eth_dev_info info;
> +    bool rx_csum_ol_flag = false;
> +    uint32_t rx_chksm_offload_capa = DEV_RX_OFFLOAD_UDP_CKSUM |
> +                                     DEV_RX_OFFLOAD_TCP_CKSUM |
> +                                     DEV_RX_OFFLOAD_IPV4_CKSUM;
> +    rte_eth_dev_info_get(dev->port_id, &info);
> +    rx_csum_ol_flag = (dev->hw_ol_features & NETDEV_RX_CHECKSUM_OFFLOAD) != 0;
> +
> +    if (rx_csum_ol_flag &&
> +        (info.rx_offload_capa & rx_chksm_offload_capa) !=
> +         rx_chksm_offload_capa) {
> +        VLOG_WARN("Failed to enable Rx checksum offload on device %d",
> +                   dev->port_id);
> +        dev->hw_ol_features &= ~NETDEV_RX_CHECKSUM_OFFLOAD;
> +    }
> +    netdev_request_reconfigure(&dev->up);
> +}
> +
> +static void
>  dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) OVS_REQUIRES(dev->mutex)
>  {
>      if (rte_eth_dev_flow_ctrl_set(dev->port_id, &dev->fc_conf)) {
> @@ -838,6 +870,9 @@ netdev_dpdk_init(struct netdev *netdev, unsigned int port_no,
>
>      /* Initialize the flow control to NULL */
>      memset(&dev->fc_conf, 0, sizeof dev->fc_conf);
> +
> +    /* Initilize the hardware offload flags to 0 */
> +    dev->hw_ol_features = 0;
>      if (type == DPDK_DEV_ETH) {
>          err = dpdk_eth_dev_init(dev);
>          if (err) {
> @@ -1071,6 +1106,8 @@ static int
>  netdev_dpdk_set_config(struct netdev *netdev, const struct smap *args)
>  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +    bool rx_chksm_ofld;
> +    bool temp_flag;
>
>      ovs_mutex_lock(&dev->mutex);
>
> @@ -1090,6 +1127,15 @@ netdev_dpdk_set_config(struct netdev *netdev, const struct smap *args)
>
>      dpdk_eth_flow_ctrl_setup(dev);
>
> +    /* Rx checksum offload configuration */
> +    /* By default the Rx checksum offload is ON */
> +    rx_chksm_ofld = smap_get_bool(args, "rx-checksum-offload", true);
> +    temp_flag = (dev->hw_ol_features & NETDEV_RX_CHECKSUM_OFFLOAD)
> +                != 0;
> +    if (temp_flag != rx_chksm_ofld) {
> +        dev->hw_ol_features ^= NETDEV_RX_CHECKSUM_OFFLOAD;
> +        dpdk_eth_checksum_offload_configure(dev);
> +    }
>      ovs_mutex_unlock(&dev->mutex);
>
>      return 0;
> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
> index ce2582f..31a12d6 100644
> --- a/lib/netdev-native-tnl.c
> +++ b/lib/netdev-native-tnl.c
> @@ -85,9 +85,11 @@ netdev_tnl_ip_extract_tnl_md(struct dp_packet *packet, struct flow_tnl *tnl,
>
>          ovs_be32 ip_src, ip_dst;
>
> -        if (csum(ip, IP_IHL(ip->ip_ihl_ver) * 4)) {
> -            VLOG_WARN_RL(&err_rl, "ip packet has invalid checksum");
> -            return NULL;
> +        if(OVS_UNLIKELY(!dp_packet_ip_checksum_valid(packet))) {
> +            if (csum(ip, IP_IHL(ip->ip_ihl_ver) * 4)) {
> +                VLOG_WARN_RL(&err_rl, "ip packet has invalid checksum");
> +                return NULL;
> +            }
>          }
>
>          if (ntohs(ip->ip_tot_len) > l3_size) {
> @@ -179,20 +181,26 @@ udp_extract_tnl_md(struct dp_packet *packet, struct flow_tnl *tnl,
>      }
>
>      if (udp->udp_csum) {
> -        uint32_t csum;
> -        if (netdev_tnl_is_header_ipv6(dp_packet_data(packet))) {
> -            csum = packet_csum_pseudoheader6(dp_packet_l3(packet));
> -        } else {
> -            csum = packet_csum_pseudoheader(dp_packet_l3(packet));
> -        }
> -
> -        csum = csum_continue(csum, udp, dp_packet_size(packet) -
> -                             ((const unsigned char *)udp -
> -                              (const unsigned char *)dp_packet_l2(packet)));
> -        if (csum_finish(csum)) {
> -            return NULL;
> +        if(OVS_UNLIKELY(!dp_packet_l4_checksum_valid(packet))) {
> +            uint32_t csum;
> +            if (netdev_tnl_is_header_ipv6(dp_packet_data(packet))) {
> +                csum = packet_csum_pseudoheader6(dp_packet_l3(packet));
> +            } else {
> +                csum = packet_csum_pseudoheader(dp_packet_l3(packet));
> +            }
> +
> +            csum = csum_continue(csum, udp, dp_packet_size(packet) -
> +                                 ((const unsigned char *)udp -
> +                                  (const unsigned char *)dp_packet_l2(packet)));
> +            if (csum_finish(csum)) {
> +                return NULL;
> +            }
>          }
>          tnl->flags |= FLOW_TNL_F_CSUM;
> +
> +        /* Reset the checksum offload flags if present, to avoid wrong
> +         * interpretation in the further packet processing when recirculated.*/
> +        reset_dp_packet_checksum_ol_flags(packet);
>      }
>
>      tnl->tp_src = udp->udp_src;
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 69b5592..19d5a4b 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -3193,6 +3193,19 @@
>        </column>
>      </group>
>
> +    <group title="Rx Checksum Offload Configuration">
> +      <p>
> +        The checksum validation on the incoming packets are performed on NIC
> +        using Rx checksum offload feature. Implemented only for <code>dpdk
> +        </code>physical interfaces.
> +      </p>
> +
> +      <column name="options" key="rx-checksum-offload" type='{"type": "boolean"}'>
> +        Set to <code>false</code> to disble Rx checksum offloading on <code>
> +        dpdk</code>physical ports. By default, Rx checksum offload is enabled.
> +      </column>
> +    </group>
> +
>      <group title="Common Columns">
>        The overall purpose of these columns is described under <code>Common
>        Columns</code> at the beginning of this document.
> --
> 2.5.0
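
For anyone reading the netdev_dpdk_set_config() and dpdk_eth_checksum_offload_configure() hunks, the toggle-then-verify logic can be sketched in isolation roughly as below. This is only a standalone illustration, not OVS or DPDK code: 'struct port', 'port_set_rx_csum()', 'RX_CSUM_OFFLOAD' and the 'CAP_*' bits are hypothetical names with made-up values, and the return value stands in for the netdev_request_reconfigure() call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the feature bit and the NIC capability bits.
 * The values are illustrative only, not the DPDK definitions. */
enum { RX_CSUM_OFFLOAD = 1 << 0 };
enum { CAP_IPV4_CSUM = 1 << 0, CAP_UDP_CSUM = 1 << 1, CAP_TCP_CSUM = 1 << 2 };

struct port {
    uint32_t hw_ol_features;   /* Offload features currently requested. */
    uint32_t rx_offload_capa;  /* Capabilities reported by the (simulated) NIC. */
};

/* Apply the requested Rx checksum offload setting.
 * Returns true when the requested value differed from the current state,
 * i.e. when a port reconfigure would be requested; false when nothing was
 * done. */
static bool
port_set_rx_csum(struct port *p, bool want)
{
    const uint32_t need = CAP_IPV4_CSUM | CAP_UDP_CSUM | CAP_TCP_CSUM;
    bool have = (p->hw_ol_features & RX_CSUM_OFFLOAD) != 0;

    if (have == want) {
        return false;           /* No change, so no reconfigure. */
    }

    /* Flip the feature bit to match the request. */
    p->hw_ol_features ^= RX_CSUM_OFFLOAD;

    /* Enabling requires every needed capability; otherwise fall back to
     * software validation by clearing the bit again. */
    if (want && (p->rx_offload_capa & need) != need) {
        p->hw_ol_features &= ~(uint32_t) RX_CSUM_OFFLOAD;
    }
    return true;
}
```

Note that in this sketch, as in the hunks above, a request to enable the feature on a NIC that lacks one of the capabilities still counts as a change on every call, since the bit is cleared again after the check.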