[dpdk-dev] [RFC PATCH 0/4] pktdev

Wiles, Keith Mon, 20 Apr 2015 13:19:23 +0000


From: Marc Sune <marc.sune at bisdn.de<mailto:marc.s...@bisdn.de>>
Date: Monday, April 20, 2015 at 1:51 AM
To: Keith Wiles <keith.wiles at intel.com<mailto:keith.wiles at intel.com>>, 
"dev at dpdk.org<mailto:dev at dpdk.org>" <dev at dpdk.org<mailto:dev at 
dpdk.org>>
Subject: Re: [dpdk-dev] [RFC PATCH 0/4] pktdev




On 17/04/15 21:50, Wiles, Keith wrote:

Hi Marc and Bruce,

Hi Keith, Bruce,


On 4/17/15, 1:49 PM, "Marc Sune" <marc.sune at bisdn.de><mailto:marc.sune at 
bisdn.de> wrote:



On 17/04/15 17:16, Bruce Richardson wrote:


Hi all,

to continue this discussion a bit more, here is my, slightly different,
slant
on what a pktdev abstraction may look like.

The primary objective I had in mind when drafting this is to provide the
minimal abstraction that can be *easily* used as a common device
abstraction for
existing (and future) device types to be passed to dataplane code. The
patchset
demonstrates this by defining a minimal interface for pktdev - since I
firmly
believe the interface should be as small as possible - and then showing
how that
common interface can be used to unify rings and ethdevs under a common
API for the
datapath. I believe any attempt to unify things much beyond this to the
control
plane or setup phase is not worth doing - at least not initially - as at
init time the code always needs to be aware of the underlying resource
type in
order to configure it properly for dataplane use.

The overall objective I look to achieve is illustrated by the final
patch in
the series, which is a sample app where the same code is used for all
cores,
irrespective of the underlying device type.

To get to that point, patch 1 defines the minimal API - just RX and TX.
The .c
file in the library is empty for simplicity, though I would see some
functionality moving there when/if it makes sense e.g. the callback
support
from ethdev, as is done in Keith's patchset.
Patch 2 then makes very minimal changes to ethdev to allow ethdevs to
be used
as pktdevs, and to make use of the pktdev functions when appropriate
Patch 3 was, for me, the key test for this implementation - how hard
was it to
make an rte_ring usable as a pktdev too. Two single-line functions for
RX/TX
and a separate "converter" function proved to be all that was necessary
here -
and I believe simpler solutions may be possible too, as the extra
structures
allocated on conversion could be merged into the rte_ring structure
itself and
initialized on ring creation if we prefer that option. It is
hoped/presumed that
wrapping other structures, such as KNI, may prove to be just as easily
done.
[Not attempted yet - left as an exercise for the reader :-)].

Now, in terms of pktdev vs ethdev, there is nothing in this proposal
that
cannot also be done using ethdev AFAIK. However, pktdev as outlined here
should make the process far easier than trying to create a full PMD for
something.
All NIC specific functions, including things like stop/start, are
stripped out,
as they don't make sense for an rte_ring or other software objects.
Also, the other thing this provides is that we can move away from just
using
port ids. Instead in the same way as we now reference
rings/mempools/KNIs etc
via pointer, we can do the same with ethernet ports as pktdevs on the
data path.
There was discussion previously on moving beyond 8-bit port ids. If we
look to
use ethdev as a common abstraction, I feel that change will soon have
to be made
causing a large amount of code churn.


Hi Richard,

First thank you both for taking the time to look at this. I did not not
reply to Keith because you Richard summarized most of my concerns.

I had a brief look to this second proposal. It is more aligned to what I
had in mind. But still I feel it is slightly too complicated. I don't
like much the necessary (in your approach) MACRO-like pkt_dev_data
struct. It is also slightly inconvenient that the user has to do:

+       struct rte_pkt_dev *in = rte_eth_get_dev(0);

+               struct rte_pkt_dev *out = rte_ring_get_dev(
+                               rte_ring_create(name, 4096, rte_socket_id(), 
0));



What about something like (~pseudo-code):

rte_pkt_dev_data.h:

   enum rte_pkt_dev_type{
        RTE_PKT_DEV_ETH,
        RTE_PKT_DEV_RING,
        RTE_PKT_DEV_KNI,
        //Keep adding as more PMDs are supported
   };


   //This struct may be redundant if there is nothing more
   struct rte_pkt_dev_data{
        enum rte_pkt_dev_type;
        //Placeholder, maybe we need more...
   };

   //Make RX/TX pktdev APIs more readable, but not really needed
   typedef void pkt_dev_t;

(In all PMDs and e.g. KNI and RINGs):

 struct rte_eth_dev {
    struct rte_pkt_dev_data pkt_dev;//
<++++++++++++++++++++++++++++++++++
    eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */
    eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function.
*/
    struct rte_eth_dev_data *data;  /**< Pointer to device data */
    /...

rte_pkt_dev.h:

      #include <rte_ethdev.h>
      //Include PMD (and non-PMD) TX/RX headers...

   static inline uint16_t
   rte_pkt_tx_burst(pkt_dev_t* dev, uint16_t queue_id,
             struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
   {
        switch (((struct rte_pkt_dev_data*)dev)->type){
            case  RTE_PKT_DEV_ETH:
                struct rte_eth_dev* eth_dev = (struct
rte_eth_dev*)pkt_dev;
                rte_pkt_tx_burst(eth_dev, queue_id, tx_pkts, nb_pkts);
                break;
            case RTE_PKT_DEV_RING:
                //...
        }
   }

   //...


I do not see the functional different from my proposal other then a bit
more search/replace, which does effect some of the PMD?s. The changes to
the PMD is just to use the macros as we now have a nested structure.

I think what I am proposing is different. See below.


The nesting and movement some structures and code to a common location is
to provide a better way to add other devices. Moving code into a common
set of structures and APIs should only improve the code as we get more
devices added to DPDK. The basic flow and design is the same.

In your proposal you are movig TX/RX queues as well as some shared stuff and, 
most importantly, you are using function pointers to execute the RX/TX routines.

What I was proposing is to try to add the minimum common shared state in order 
to properly demultiplex the RX/TX call and have a common set of abstract calls 
(the pkt_dev type). In a way, I was proposing to deliberately not have a shared 
struct rte_dev_data because I think the internals of the "pkt_dev" can be very 
different across devices (e.g. queues in kni vs eth port vs. crypto?). I treat 
the pkt_dev as a "black box" that conforms to TX/RX API, leaving the developer 
of that device to define its internal structures as it better suites the needs. 
I only use each of the specific device type TX/RX APIs (external to us, pkt_dev 
library) in rte_pkt_dev.h. This also simplifies the refactor required to 
eventually integrate the rte_pkt_dev library and builds it "on top" of the 
existing APIs.

The other important difference with both, Bruce and your approach, and mine is 
the use of function pointers for RX/TX. I don't use them, which makes the 
entire abstracted TX/RX (including the final RX/TX routines itself) functions 
be "inlinable".

Btw, I forgot to add something basic in the previous pseudo-code. The different 
types have to be conditionally compiled according to compiled-in DPDK libs:

rte_pkt_dev.h:

    #include <rte_config.h>

    //Eth devices
    #ifdef RTE_LIBRTE_ETHER
    #include <rte_ethdev.h>
    #endif

    //KNI
    #ifdef RTE_LIBRTE_KNI
    #include <rte_kni.h>
    #endif

    //...
    //Include PMD (and non-PMD) TX/RX headers...

   static inline uint16_t
   rte_pkt_tx_burst(pkt_dev_t* dev, uint16_t queue_id,
             struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
   {
        switch (((struct rte_pkt_dev_data*)dev)->type){
            #ifdef RTE_LIBRTE_ETHER
            case  RTE_PKT_DEV_ETH:
                struct rte_eth_dev* eth_dev = (struct rte_eth_dev*)pkt_dev;
                rte_pkt_tx_burst(eth_dev, queue_id, tx_pkts, nb_pkts);
                break;
            #endif

            #ifdef RTE_LIBRTE_KNI
            case RTE_PKT_DEV_KNI:
                //...
                break;
            #endif

            default:
                    //Corrupted type or unsupported (without compiled support)
                    //Ignore or fail(fatal error)?
                    break;
        }
   }

   //...

This adds a switch into the performance path, not a great solution IMO.

[dpdk-dev] [RFC PATCH 0/4] pktdev

Reply via email to