On 03/08/2017 12:41 PM, Olivier Matz wrote:
Based on the discussions in [1] and in this thread, this patchset reorganizes
the mbuf.

The main changes are:
- reorder the structure to increase vector performance on some non-IA
   platforms.
- add a 64-bit timestamp field in the 1st cache line. This timestamp
   is not normalized, i.e. no unit or time reference is enforced. A
   library may be added to do this job in the future.
- m->next, m->nb_segs, and m->refcnt are always initialized for mbufs
   in the pool, avoiding the need to set m->next (located in the
   2nd cache line) in the Rx path for mono-segment packets.
- change port and nb_segs to 16 bits
- move seqn to the 2nd cache line
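
As a rough illustration of the "initialized while in the pool" point above,
here is a minimal sketch. It does not use the real struct rte_mbuf or any DPDK
API: struct toy_mbuf, toy_pool_put() and toy_rx_fill() are hypothetical names,
and the reset values (next = NULL, nb_segs = 1, refcnt = 1) are assumptions
made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for an mbuf. */
struct toy_mbuf {
	uint16_t refcnt;
	uint16_t nb_segs;
	uint16_t data_len;
	struct toy_mbuf *next; /* imagined to sit in the 2nd cache line */
};

/*
 * Free path: restore the invariant before the buffer goes back to the
 * pool, so every buffer taken from the pool already has these values.
 * The values chosen here are assumptions for this sketch.
 */
static void toy_pool_put(struct toy_mbuf *m)
{
	m->next = NULL;
	m->nb_segs = 1;
	m->refcnt = 1;
}

/*
 * Rx path for a mono-segment packet: it relies on the invariant and
 * writes only per-packet data, never touching next/nb_segs/refcnt.
 */
static void toy_rx_fill(struct toy_mbuf *m, uint16_t len)
{
	m->data_len = len;
}
```

The cost of the three writes moves from the Rx path to the free path, where
the fields are reset once per buffer rather than rewritten by every driver.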

Things discussed but not done in the patchset:
- move refcnt and nb_segs to the 2nd cache line: many drivers set
   them in the Rx path, so it could introduce a performance regression, or
   it would require changing all the drivers, which is not an easy task.
- remove the m->port field: too much impact on many examples and libraries,
   and some people highlighted they are using it.
- move m->next to the 1st cache line: there is not enough room, and having
   it set to NULL for unused mbufs should remove the need for it.
- merge seqn and timestamp together in a union: we could imagine use cases
   where both are activated. There is no flag indicating the presence of seqn,
   so it looks preferable to keep them separated for now.
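
To make the union concern above concrete, here is a small sketch. The two
layouts and the helper names are hypothetical and are not the real mbuf
definition; the 32-bit width chosen for seqn is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout A: seqn and timestamp share the same storage. */
struct toy_meta_union {
	union {
		uint64_t timestamp;
		uint32_t seqn; /* width assumed for this sketch */
	} u;
};

/* Hypothetical layout B: the two fields are kept separate. */
struct toy_meta_separate {
	uint64_t timestamp;
	uint32_t seqn;
};

/* Write both values into layout A, then read the timestamp back. */
static uint64_t toy_union_roundtrip(uint64_t ts, uint32_t seqn)
{
	struct toy_meta_union m;

	m.u.timestamp = ts;
	m.u.seqn = seqn; /* overwrites part of the timestamp bytes */
	return m.u.timestamp;
}

/* Write both values into layout B; neither disturbs the other. */
static struct toy_meta_separate toy_separate_fill(uint64_t ts, uint32_t seqn)
{
	struct toy_meta_separate m;

	m.timestamp = ts;
	m.seqn = seqn;
	return m;
}
```

With the union, a user that enables both features cannot read the timestamp
back intact, and without a flag for seqn there is no way to tell which member
was written last.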

I ran some basic performance tests (ixgbe) and saw no regression.
Other tests from NIC vendors are welcome.

Once this patchset is pushed, the Rx path of drivers could be optimized a bit,
by removing writes to m->next, m->nb_segs and m->refcnt. Patch 4/8 gives an
idea of what could be done.

[1] http://dpdk.org/ml/archives/dev/2016-October/049338.html

rfc->v1:
- fix reset of mbuf fields in case of indirect mbuf in rte_pktmbuf_prefree_seg()
- do not enforce a unit or time reference for m->timestamp
- reorganize fields to make vlan and outer vlan consecutive
- enhance documentation of m->refcnt and m->port to explain why they are 16 bits

Jerin Jacob (1):
   mbuf: make rearm data address naturally aligned

Olivier Matz (8):
   mbuf: make segment prefree function public
   mbuf: make raw free function public
   mbuf: set mbuf fields while in pool
   drivers/net: don't touch mbuf next or nb segs on Rx
   mbuf: use 2 bytes for port and nb segments
   mbuf: move sequence number in second cache line
   mbuf: add a timestamp field
   mbuf: reorder VLAN tci and buffer len fields

  app/test-pmd/csumonly.c                            |   4 +-
  drivers/net/ena/ena_ethdev.c                       |   2 +-
  drivers/net/enic/enic_rxtx.c                       |   2 +-
  drivers/net/fm10k/fm10k_rxtx.c                     |   6 +-
  drivers/net/fm10k/fm10k_rxtx_vec.c                 |   9 +-
  drivers/net/i40e/i40e_rxtx_vec_common.h            |   6 +-
  drivers/net/i40e/i40e_rxtx_vec_sse.c               |  11 +-
  drivers/net/ixgbe/ixgbe_rxtx.c                     |  10 +-
  drivers/net/ixgbe/ixgbe_rxtx_vec_common.h          |   6 +-
  drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c            |   9 --
  drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c             |   9 --
  drivers/net/mlx5/mlx5_rxtx.c                       |  11 +-
  drivers/net/mpipe/mpipe_tilegx.c                   |   3 +-
  drivers/net/null/rte_eth_null.c                    |   2 -
  drivers/net/virtio/virtio_rxtx.c                   |   4 -
  drivers/net/virtio/virtio_rxtx_simple.h            |   6 +-
  .../linuxapp/eal/include/exec-env/rte_kni_common.h |   5 +-
  lib/librte_mbuf/rte_mbuf.c                         |   4 +
  lib/librte_mbuf/rte_mbuf.h                         | 123 ++++++++++++++++-----
  19 files changed, 130 insertions(+), 102 deletions(-)


I see better performance with the patch series applied and the next=NULL
assignments removed from net/sfc (I am waiting for the series to be applied
before submitting the corresponding patches). So, for the series:

Acked-by: Andrew Rybchenko <arybche...@solarflare.com>
