Hi Ivan,

I hate to disagree with you :), but no, we do not break the scatter-gather 
feature. We actually implemented the Packet Framework IPv4 fragmentation and 
reassembly ports to validate exactly this.

Maybe the confusion comes from the RTE_MBUF_TO_BADDR macro. This macro only 
works (and it is only used) for direct mbufs, so probably the correct name for 
this macro should be RTE_MBUF_DIRECT_TO_BADDR, as it cannot work (and it is not 
used) for indirect mbufs.

I am describing the rationale behind meta-data design below, sorry for the long 
description that looks like a design document ...

Quick recap:
- mbuf->buf_addr points to the first address where a byte of data for the 
current segment _can_ be placed
- direct mbuf: the packet descriptor (mbuf) sits in the same mempool buffer 
with the packet itself, so mbuf->buf_addr = mbuf + 1;
- indirect mbuf: the packet descriptor is located in a different mempool buffer 
than the packet itself, so mbuf->buf_addr != mbuf + 1;
- Regardless of the mbuf type, it is not necessarily true that the first byte 
of data is located at mbuf->buf_addr, as we save a headroom (configurable, by 
default initially of CONFIG_RTE_PKTMBUF_HEADROOM = 128 bytes) at the start of 
the data buffer (mbuf->buf_addr) for prepending packet headers ( ... or, why 
not, meta-data!).  The location of the first real data byte is mbuf->pkt.data, 
which is variable, as opposed to mbuf->buf_addr, which does not change.

Packet meta-data rationale:
- I think we both agree that we need to be able to store packet meta-data in 
the packet buffer. The packet buffer regions where meta-data could be stored 
can only be the in the headroom or in the tailroom, which are both managed by 
the application and both go up and down during the packet lifetime.

- To me, packet-metadata is part of the packet descriptor: mbuf is the 
mandatory & standard part of the packet descriptor, while meta-data is the 
optional & non-standard (per application) part of the packet descriptor. 
Therefore, it makes sense to put the meta-data immediately after mbuf, but this 
is not mandatory.

- The zero-size field called meta-data is really just an offset: it points to 
the first buffer location where meta-data _can_ be placed. The reason for 
having this field is to provide a standard way to place meta-data in the packet 
buffer rather that hide it in the Packet Framework libraries and potentially 
conflict with other mbuf changes in the future. It flags that meta-data should 
be taken into consideration when any mbuf change is done in  the future.

- For direct mbufs, the same buffer space (the headroom) is shared between 
packet data (header prepending) and meta-data similar to how the stack and the 
heap manage the same memory. Of course, since it is managed by the application, 
it is the responsibility of the application to make sure the packet bytes and 
the meta-data do not step on one another, but this problem is not at all new, 
nor introduced by the meta-data field: even currently, the application has to 
make sure the headroom is dimensioned correctly, so that even in the worst case 
scenario (application-specific), the packet bytes do not run into the mbuf 
fields, right?

- For indirect mbufs, the packet meta-data is associated with the indirect mbuf 
rather than the direct mbuf (the one the indirect mbuf attaches to, as the 
direct mbuf contains a different packet that has different meta-data), so it 
makes sense that meta-data of the indirect mbuf is stored in the same buffer as 
the indirect mbuf (rather than the buffer of the attached direct mbuf). So this 
means that the buffer size used to store the indirect mbuf is sized 
appropriately (mbuf size, currently 64 bytes, plus the max size for additional 
meta-data). This is illustrated in the code of the IPv4 fragmentation port, 
where for every child packet (output fragment) we copy the parent meta-data in 
the child buffer (which stores an indirect buffer attached to the direct buffer 
of the input jumbo, plus now additional meta-data).

- We are also considering having a user_data field in the mbuf itself to point 
to meta-data in the same buffer or any other buffer. The Packet Framework 
functionally works with both approaches, but there is a performance problem 
associated with the mbuf->user_data approach that we are not addressing for 
this 1.7 release timeframe. The issue is the data dependency that is created, 
as in order to find out the location of the meta-data, we need to first read 
the mbuf  and then read meta-data from mbuf->user_data. The cost of the 
additional memory access is high, due to data dependency preventing efficient 
prefetch, therefore (at least for the time being) we need to use a compile-time 
known meta-data offset. The application can fill the meta-data on packet RX and 
then all additional CPU cores processing the packet can leave of the meta-data 
for a long time with no need to access the mbuf or the packet (efficiency). For 
the time being, if somebody needs more yet meta-data in yet another buffer, 
they can add (for their application) a user_data pointer as part of their 
application meta-data (instead of standard mbuf).

- As said, the mbuf->metadata points to the first location where meta-data 
_can_ be placed, if somebody needs a different offset, they can add it on top 
of the mbuf->metadata (e.g. by having a reserved field in their struct 
app_pkt_metadata). We have demonstrated the use of meta-data in the 
examples/ip_pipeline sample app (see struct app_pkt_metadata in "main.h").

Please let me know if this makes sense?

Regards,
Cristian

PS: This is where a white board would help a lot ...


-----Original Message-----
From: Ivan Boule [mailto:ivan.bo...@6wind.com] 
Sent: Wednesday, May 28, 2014 1:03 PM
To: Dumitrescu, Cristian; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 04/29] mbuf: added offset of packet meta-data in 
the packet buffer just after mbuf

Hi Cristian,

Currently, the DPDK framework does not make any assumption on the actual
layout of a mbuf.
More precisely, the DPDK does not impose any constraint on the actual
location of additional metadata, if any, or on the actual location and
size of its associated payload data buffer.
This is coherent with the fact that mbuf pools are not created by the
DPDK itself, but by network applications that are free to choose
whatever packet mbuf layout that fits their particular needs.

There is one exception to this basic DPDK rule: the mbuf cloning feature 
available through the RTE_MBUF_SCATTER_GATHER configuration option 
assumes that the payload data buffer of the mbuf immediately follows the 
rte_mbuf data structure (see the macros RTE_MBUF_FROM_BADDR, 
RTE_MBUF_TO_BADDR, RTE_MBUF_INDIRECT, and RTE_MBUF_DIRECT in the file 
lib/librte_mbuf/rte_mbuf.h).

The cloning feature prevents to build packet mbufs with their metadata 
located immediately after the rte_mbuf data structure, which is exactly 
what your patch introduces.

At least, a comment that clearly warns the user of this incompatibility
might be worth adding into both the code and your patch log.

Regards,
Ivan

On 05/27/2014 07:09 PM, Cristian Dumitrescu wrote:
> Added zero-size field (offset in data structure) to specify the beginning of 
> packet meta-data in the packet buffer just after the mbuf.
>
> The size of the packet meta-data is application specific and the packet 
> meta-data is managed by the application.
>
> The packet meta-data should always be accessed through the provided macros.
>
> This is used by the Packet Framework libraries (port, table, pipeline).
>
> There is absolutely no performance impact due to this mbuf field, as it does 
> not take any space in the mbuf structure (zero-size field).
>
> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu at intel.com>
> ---
>   lib/librte_mbuf/rte_mbuf.h |   17 +++++++++++++++++
>   1 files changed, 17 insertions(+), 0 deletions(-)
>
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 4a9ab41..bf09618 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -201,8 +201,25 @@ struct rte_mbuf {
>               struct rte_ctrlmbuf ctrl;
>               struct rte_pktmbuf pkt;
>       };
> +     
> +     union {
> +             uint8_t metadata[0];
> +             uint16_t metadata16[0];
> +             uint32_t metadata32[0];
> +             uint64_t metadata64[0];
> +     };
>   } __rte_cache_aligned;
>
> +#define RTE_MBUF_METADATA_UINT8(mbuf, offset)       (mbuf->metadata[offset])
> +#define RTE_MBUF_METADATA_UINT16(mbuf, offset)      
> (mbuf->metadata16[offset/sizeof(uint16_t)])
> +#define RTE_MBUF_METADATA_UINT32(mbuf, offset)      
> (mbuf->metadata32[offset/sizeof(uint32_t)])
> +#define RTE_MBUF_METADATA_UINT64(mbuf, offset)      
> (mbuf->metadata64[offset/sizeof(uint64_t)])
> +
> +#define RTE_MBUF_METADATA_UINT8_PTR(mbuf, offset)   (&mbuf->metadata[offset])
> +#define RTE_MBUF_METADATA_UINT16_PTR(mbuf, offset)  
> (&mbuf->metadata16[offset/sizeof(uint16_t)])
> +#define RTE_MBUF_METADATA_UINT32_PTR(mbuf, offset)  
> (&mbuf->metadata32[offset/sizeof(uint32_t)])
> +#define RTE_MBUF_METADATA_UINT64_PTR(mbuf, offset)  
> (&mbuf->metadata64[offset/sizeof(uint64_t)])
> +
>   /**
>    * Given the buf_addr returns the pointer to corresponding mbuf.
>    */
>


-- 
Ivan Boule
6WIND Development Engineer
--------------------------------------------------------------
Intel Shannon Limited
Registered in Ireland
Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
Registered Number: 308263
Business address: Dromore House, East Park, Shannon, Co. Clare

This e-mail and any attachments may contain confidential material for the sole 
use of the intended recipient(s). Any review or distribution by others is 
strictly prohibited. If you are not the intended recipient, please contact the 
sender and delete all copies.


Reply via email to