Hi,

> -----Original Message-----
> From: Anoob Joseph <ano...@marvell.com>
> Sent: Wednesday, September 08, 2021 18:30
> To: Yigit, Ferruh <ferruh.yi...@intel.com>; Xu, Rosen <rosen...@intel.com>;
> Andrew Rybchenko <arybche...@solarflare.com>
> Cc: Nicolau, Radu <radu.nico...@intel.com>; Doherty, Declan
> <declan.dohe...@intel.com>; hemant.agra...@nxp.com;
> ma...@nvidia.com; Ananyev, Konstantin <konstantin.anan...@intel.com>;
> tho...@monjalon.net; Ankur Dwivedi <adwiv...@marvell.com>;
> andrew.rybche...@oktetlabs.ru; Akhil Goyal <gak...@marvell.com>;
> dev@dpdk.org
> Subject: RE: [EXT] Re: [PATCH] RFC: ethdev: add reassembly offload
> 
> Hi Ferruh, Rosen, Andrew,
> 
> Please see inline.
> 
> Thanks,
> Anoob
> 
> > Subject: [EXT] Re: [PATCH] RFC: ethdev: add reassembly offload
> >
> > External Email
> >
> > ----------------------------------------------------------------------
> > On 8/23/2021 11:02 AM, Akhil Goyal wrote:
> > > Reassembly is a costly operation if it is done in software, however,
> > > if it is offloaded to HW, it can considerably save application cycles.
> > > The operation becomes even more costlier if IP fragmants are
> > > encrypted.
> > >
> > > To resolve above two issues, a new offload
> > DEV_RX_OFFLOAD_REASSEMBLY
> > > is introduced in ethdev for devices which can attempt reassembly of
> > > packets in hardware.
> > > rte_eth_dev_info is added with the reassembly capabilities which a
> > > device can support.
> > > Now, if IP fragments are encrypted, reassembly can also be attempted
> > > while doing inline IPsec processing.
> > > This is controlled by a flag in rte_security_ipsec_sa_options to
> > > enable reassembly of encrypted IP fragments in the inline path.
> > >
> > > The resulting reassembled packet would be a typical segmented mbuf
> > > in case of success.
> > >
> > > And if reassembly of fragments is failed or is incomplete (if
> > > fragments do not come before the reass_timeout), the mbuf is updated
> > > with an ol_flag PKT_RX_REASSEMBLY_INCOMPLETE and mbuf is returned
> > as
> > > is. Now application may decide the fate of the packet to wait more
> > > for fragments to come or drop.
> > >
> > > Signed-off-by: Akhil Goyal <gak...@marvell.com>
> > > ---
> > >  lib/ethdev/rte_ethdev.c     |  1 +
> > >  lib/ethdev/rte_ethdev.h     | 18 +++++++++++++++++-
> > >  lib/mbuf/rte_mbuf_core.h    |  3 ++-
> > >  lib/security/rte_security.h | 10 ++++++++++
> > >  4 files changed, 30 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
> > > 9d95cd11e1..1ab3a093cf 100644
> > > --- a/lib/ethdev/rte_ethdev.c
> > > +++ b/lib/ethdev/rte_ethdev.c
> > > @@ -119,6 +119,7 @@ static const struct {
> > >   RTE_RX_OFFLOAD_BIT2STR(VLAN_FILTER),
> > >   RTE_RX_OFFLOAD_BIT2STR(VLAN_EXTEND),
> > >   RTE_RX_OFFLOAD_BIT2STR(JUMBO_FRAME),
> > > + RTE_RX_OFFLOAD_BIT2STR(REASSEMBLY),
> > >   RTE_RX_OFFLOAD_BIT2STR(SCATTER),
> > >   RTE_RX_OFFLOAD_BIT2STR(TIMESTAMP),
> > >   RTE_RX_OFFLOAD_BIT2STR(SECURITY),
> > > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index
> > > d2b27c351f..e89a4dc1eb 100644
> > > --- a/lib/ethdev/rte_ethdev.h
> > > +++ b/lib/ethdev/rte_ethdev.h
> > > @@ -1360,6 +1360,7 @@ struct rte_eth_conf {
> > >  #define DEV_RX_OFFLOAD_VLAN_FILTER       0x00000200
> > >  #define DEV_RX_OFFLOAD_VLAN_EXTEND       0x00000400
> > >  #define DEV_RX_OFFLOAD_JUMBO_FRAME       0x00000800
> > > +#define DEV_RX_OFFLOAD_REASSEMBLY        0x00001000
> >
> > previous '0x00001000' was 'DEV_RX_OFFLOAD_CRC_STRIP', it has been
> long
> > that offload has been removed, but not sure if it cause any problem to
> > re- use it.
> >
> > >  #define DEV_RX_OFFLOAD_SCATTER           0x00002000
> > >  /**
> > >   * Timestamp is set by the driver in
> > RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
> > > @@ -1477,6 +1478,20 @@ struct rte_eth_dev_portconf {
> > >   */
> > >  #define RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID
> >     (UINT16_MAX)
> > >
> > > +/**
> > > + * Reassembly capabilities that a device can support.
> > > + * The device which can support reassembly offload should set
> > > + * DEV_RX_OFFLOAD_REASSEMBLY
> > > + */
> > > +struct rte_eth_reass_capa {
> > > + /** Maximum time in ns that a fragment can wait for further
> > fragments */
> > > + uint64_t reass_timeout;
> > > + /** Maximum number of fragments that device can reassemble */
> > > + uint16_t max_frags;
> > > + /** Reserved for future capabilities */
> > > + uint16_t reserved[3];
> > > +};
> > > +
> >
> > I wonder if there is any other hardware around supports reassembly
> > offload, it would be good to get more feedback on the capabilities list.
> >
> > >  /**
> > >   * Ethernet device associated switch information
> > >   */
> > > @@ -1582,8 +1597,9 @@ struct rte_eth_dev_info {
> > >    * embedded managed interconnect/switch.
> > >    */
> > >   struct rte_eth_switch_info switch_info;
> > > + /* Reassembly capabilities of a device for reassembly offload */
> > > + struct rte_eth_reass_capa reass_capa;
> > >
> > > - uint64_t reserved_64s[2]; /**< Reserved for future fields */
> >
> > Reserved fields were added to be able to update the struct without
> > breaking the ABI, so that a critical change doesn't have to wait until
> > next ABI break release.
> > Since this is ABI break release, we can keep the reserved field and
> > add the new struct. Or this can be an opportunity to get rid of the reserved
> field.
> >
> > Personally I have no objection to get rid of the reserved field, but
> > better to agree on this explicitly.
> >
> > >   void *reserved_ptrs[2];   /**< Reserved for future fields */
> > >  };
> > >
> > > diff --git a/lib/mbuf/rte_mbuf_core.h b/lib/mbuf/rte_mbuf_core.h
> > > index
> > > bb38d7f581..cea25c87f7 100644
> > > --- a/lib/mbuf/rte_mbuf_core.h
> > > +++ b/lib/mbuf/rte_mbuf_core.h
> > > @@ -200,10 +200,11 @@ extern "C" {
> > >  #define PKT_RX_OUTER_L4_CKSUM_BAD        (1ULL << 21)
> > >  #define PKT_RX_OUTER_L4_CKSUM_GOOD       (1ULL << 22)
> > >  #define PKT_RX_OUTER_L4_CKSUM_INVALID    ((1ULL << 21) | (1ULL
> > << 22))
> > > +#define PKT_RX_REASSEMBLY_INCOMPLETE     (1ULL << 23)
> > >
> >
> > Similar comment with Andrew's, what is the expectation from
> > application if this flag exists? Can we drop it to simplify the logic in the
> application?
> 
> [Anoob] There can be few cases where hardware/NIC attempts inline
> reassembly but it fails to complete it
> 
> 1. Number of fragments is larger than what is supported by the hardware 2.
> Hardware reassembly resources are exhausted (due to limited reassembly
> contexts etc) 3. Reassembly errors such as overlapping fragments 4. Wait
> time exhausted (or reassembly timeout)
> 
> In such cases, application would be required to retrieve the original
> fragments so that it can attempt reassembly in software. The incomplete flag
> is useful for 2 purposes basically, 1. Application would need to retrieve the
> time the fragment has already spend in hardware reassembly so that
> software reassembly attempt can compensate for it. Otherwise, reassembly
> timeout across hardware + software will not be accurate 2. Retrieve original
> fragments. With this proposal, an incomplete reassembly would result in a
> chained mbuf but the segments need not be consecutive. To explain bit more,
> 
> Suppose we have a packet that is fragmented into 3 fragments, and fragment
> 3 & fragment 1 arrives in that order. Fragment 2 didn't arrive and hardware
> ultimately pushes it. In that case, application would be receiving a
> chained/segmented mbuf with fragment 1 & fragment 3 chained.
> 
> Now, this chained mbuf can't be treated like a regular chained mbuf. Each
> fragment would have its IP hdr and there are fragments missing in between.
> The only thing application is expected to do is, retrieve fragments, push it 
> to
> s/w reassembly.

What you mentioned is error identification. But actually a negotiation about 
max frame size is needed before datagrams tx/rx.
> >
> > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > >
> > > -#define PKT_FIRST_FREE (1ULL << 23)
> > > +#define PKT_FIRST_FREE (1ULL << 24)
> > >  #define PKT_LAST_FREE (1ULL << 40)
> > >
> > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > diff --git a/lib/security/rte_security.h
> > > b/lib/security/rte_security.h index 88d31de0a6..364eeb5cd4 100644
> > > --- a/lib/security/rte_security.h
> > > +++ b/lib/security/rte_security.h
> > > @@ -181,6 +181,16 @@ struct rte_security_ipsec_sa_options {
> > >    * * 0: Disable per session security statistics collection for this SA.
> > >    */
> > >   uint32_t stats : 1;
> > > +
> > > + /** Enable reassembly on incoming packets.
> > > +  *
> > > +  * * 1: Enable driver to try reassembly of encrypted IP packets for
> > > +  *      this SA, if supported by the driver. This feature will work
> > > +  *      only if rx_offload DEV_RX_OFFLOAD_REASSEMBLY is set in
> > > +  *      inline ethernet device.
> > > +  * * 0: Disable reassembly of packets (default).
> > > +  */
> > > + uint32_t reass_en : 1;
> > >  };
> > >
> > >  /** IPSec security association direction */
> > >

Reply via email to