Re: [dpdk-dev] [dpdk-stable] [PATCH] mem: fix incorrect munmap in error unwind

2020-01-07 Thread David Marchand
On Mon, Jan 6, 2020 at 9:56 PM Stephen Hemminger
 wrote:
>
> The loop to unwind existing mmaps was only unmapping the
> first segment.
>
> Also, remove obvious redundant assignment.
>
> Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
> Cc: anatoly.bura...@intel.com
> Cc: sta...@dpdk.org
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/librte_eal/linux/eal/eal_memory.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/lib/librte_eal/linux/eal/eal_memory.c 
> b/lib/librte_eal/linux/eal/eal_memory.c
> index 43e4ffc757bd..cf5b2433614b 100644
> --- a/lib/librte_eal/linux/eal/eal_memory.c
> +++ b/lib/librte_eal/linux/eal/eal_memory.c
> @@ -1967,9 +1967,8 @@ eal_legacy_hugepage_attach(void)
> close(fd);
>  error:
> /* map all segments into memory to make sure we get the addrs */

This looks like a copy of the same loop from a few lines before in the
same function.
The comment can be removed.


> -   cur_seg = 0;
> for (cur_seg = 0; cur_seg < i; cur_seg++) {
> -   struct hugepage_file *hf = &hp[i];
> +   struct hugepage_file *hf = &hp[cur_seg];
> size_t map_sz = hf->size;
> void *map_addr = hf->final_va;
>

map_addr and map_sz are unnecessary.

We should be safe with dereferencing hp if i != 0, but still how about:

error:
if (hp != NULL && hp != MAP_FAILED) {
unsigned int cur_seg;

for (cur_seg = 0; cur_seg < i; cur_seg++)
munmap(hp[cur_seg].final_va, hp[cur_seg].size);
munmap(hp, size);
}
if (fd_hugepage >= 0)
close(fd_hugepage);
return -1;


-- 
David Marchand



[dpdk-dev] [PATCH] net/mlx5: fix incorrect pointer operation

2020-01-07 Thread Suanming Mou
The meter suffix flow item pointer restore is not correct to minus a
fixed value. It should minus the real offset it increases.

Set the value to the real offset the pointer increases to fix the issue.

Fixes: 9ea9b049a960 ("net/mlx5: split meter flow")
Cc: sta...@dpdk.org

Signed-off-by: Suanming Mou 
Tested-by: Tonghao Zhang 
---
 drivers/net/mlx5/mlx5_flow.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index cb9d265..52ffcb2 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -4037,7 +4037,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev 
*dev, int32_t priority,
sfx_items++;
}
sfx_items->type = RTE_FLOW_ITEM_TYPE_END;
-   sfx_items -= METER_SUFFIX_ITEM;
+   sfx_items -= sfx_port_id_item ? 2 : 1;
/* Setting the sfx group atrr. */
sfx_attr.group = sfx_attr.transfer ?
(MLX5_FLOW_TABLE_LEVEL_SUFFIX - 1) :
-- 
1.8.3.1



Re: [dpdk-dev] [PATCH v3 7/7] net/bnxt: fix to not overwrite error message

2020-01-07 Thread Ferruh Yigit
On 1/7/2020 12:37 AM, Ajit Khaparde wrote:
> In some cases when flow creation fails, we  overwrite the specific
> error message with a generic error message. This patch fixes it.
> 
> Fixes: d24610f7bfda ("net/bnxt: allow flow creation when RSS is enabled")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Ajit Khaparde 
> Reviewed-by: Lance Richardson 
> ---
>  drivers/net/bnxt/bnxt_flow.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/bnxt/bnxt_flow.c b/drivers/net/bnxt/bnxt_flow.c
> index 707aedcec..cde1fa41c 100644
> --- a/drivers/net/bnxt/bnxt_flow.c
> +++ b/drivers/net/bnxt/bnxt_flow.c
> @@ -1485,7 +1485,7 @@ bnxt_validate_and_parse_flow(struct rte_eth_dev *dev,
>   if (rxq && !vnic->rx_queue_cnt)
>   rxq->vnic = &bp->vnic_info[0];
>   }
> - return rc;
> + return -rte_errno;

The result will be same as far as I can see, you may get rid of the "rc =
-rte_errno;" assignments before "goto ret;" with this change if you want.

Also commit log seems describing the below change not this one ...

>  }
>  
>  static
> @@ -1815,7 +1815,7 @@ bnxt_flow_create(struct rte_eth_dev *dev,
>   rte_flow_error_set(error, 0,
>  RTE_FLOW_ERROR_TYPE_NONE, NULL,
>  "Flow with pattern exists, updating 
> destination queue");
> - else
> + else if (!rte_errno)
>   rte_flow_error_set(error, -ret,
>  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
>  "Failed to create flow.");
> 



Re: [dpdk-dev] C++ app does not execute DPDK constructors.

2020-01-07 Thread Bruce Richardson
On Mon, Jan 06, 2020 at 01:24:36PM -0800, Sachin Jain wrote:
> Hi
> 
> *What am I trying to do?*
> 
> I am trying to write Gtests for my application based on DPDK.
> 
> *Details about the app:*
> 
> The application compiles dpdk as static library. Its a pretty simple. For
> now it just runs a dummy Gtest and create a memory pool. The key is, its a
> C++ file. I see no issue if it were a C app.
> 
> *Issue I am facing:*
> 
> The call to rte_mempool_create fails() with "Err: No such file or
> directory".
> 
> *More debugging:*
> 
> The reason for the above failure is, the constructor for the ring driver is
> not getting called. Specifically below code in
> ~/driver/mempool/ring/rte_mempool_ring.c.
> 
> MEMPOOL_REGISTER_OPS(ops_mp_mc);
> MEMPOOL_REGISTER_OPS(ops_sp_sc);
> MEMPOOL_REGISTER_OPS(ops_mp_sc);
> MEMPOOL_REGISTER_OPS(ops_sp_mc);
> 
> So when I look at the C app which has these constructors called, I see the
> init_array size to be much larger than the the C++ app. Considering the two
> code are almost identical, I don't know what am I missing to have these
> constructors run. They are definitely getting compiled but are not called.
> 
> *What have I tried:*
> 
> 1. I tried the --whole-archive LD option but I dont think that should play
> a part here since the constructors does get called with C app.
> 
Generally to get expected behaviour of constructors with static libs, you
need to use both --whole-archive and --no-as-needed linker flags. Other
than that, I'm not sure what the issue might be. The same .a files are used
to link against the C and C++ objects?

/Bruce


Re: [dpdk-dev] [PATCH v7 02/17] lib/ring: apis to support configurable element size

2020-01-07 Thread Ananyev, Konstantin


> 
> > > > +
> > > > +static __rte_always_inline void
> > > > +enqueue_elems_128(struct rte_ring *r, uint32_t prod_head, const
> > > > +void *obj_table, uint32_t n) { unsigned int i; const uint32_t size
> > > > += r->size; uint32_t idx = prod_head & r->mask; __uint128_t *ring =
> > > > +(__uint128_t *)&r[1]; const __uint128_t *obj = (const __uint128_t
> > > > +*)obj_table; if (likely(idx + n < size)) { for (i = 0; i < (n &
> > > > +~0x1); i += 2, idx += 2) { ring[idx] = obj[i]; ring[idx + 1] =
> > > > +obj[i + 1];
> > >
> > >
> > > AFAIK, that implies 16B aligned obj_table...
> > > Would it always be the case?
> > I am not sure from the compiler perspective.
> > At least on Arm architecture, unaligned access (address that is accessed is 
> > not
> > aligned to the size of the data element being accessed) will result in 
> > faults or
> > require additional cycles. So, aligning on 16B should be fine.
> Further, I would be changing this to use 'rte_int128_t' as '__uint128_t' is 
> not defined on 32b systems.

What I am trying to say: with this code we imply new requirement for
elems in the ring: when sizeof(elem)==16 it's alignment also has to be at least 
16.
Which from my perspective is not ideal.
Note that for elem sizes > 16 (24, 32), there is no such constraint.

> 
> >
> > >
> > > > +}
> > > > +switch (n & 0x1) {
> > > > +case 1:
> > > > +ring[idx++] = obj[i++];
> > > > +}
> > > > +} else {
> > > > +for (i = 0; idx < size; i++, idx++) ring[idx] = obj[i];
> > > > +/* Start at the beginning */
> > > > +for (idx = 0; i < n; i++, idx++)
> > > > +ring[idx] = obj[i];
> > > > +}
> > > > +}
> > > > +
> > > > +/* the actual enqueue of elements on the ring.
> > > > + * Placed here since identical code needed in both
> > > > + * single and multi producer enqueue functions.
> > > > + */
> > > > +static __rte_always_inline void
> > > > +enqueue_elems(struct rte_ring *r, uint32_t prod_head, const void
> > > *obj_table,
> > > > +uint32_t esize, uint32_t num)
> > > > +{
> > > > +uint32_t idx, nr_idx, nr_num;
> > > > +
> > > > +/* 8B and 16B copies implemented individually to retain
> > > > + * the current performance.
> > > > + */
> > > > +if (esize == 8)
> > > > +enqueue_elems_64(r, prod_head, obj_table, num); else if (esize ==
> > > > +16) enqueue_elems_128(r, prod_head, obj_table, num); else {
> > > > +/* Normalize to uint32_t */
> > > > +uint32_t scale = esize / sizeof(uint32_t); nr_num = num * scale;
> > > > +idx = prod_head & r->mask; nr_idx = idx * scale;
> > > > +enqueue_elems_32(r, nr_idx, obj_table, nr_num); } }
> > > > +



Re: [dpdk-dev] [PATCH v6 3/4] net/ixgbe: cleanup Tx buffers

2020-01-07 Thread Di, ChenxuX
Hi

> -Original Message-
> From: Ananyev, Konstantin
> Sent: Monday, January 6, 2020 9:26 PM
> To: Di, ChenxuX ; dev@dpdk.org
> Cc: Yang, Qiming 
> Subject: RE: [dpdk-dev] [PATCH v6 3/4] net/ixgbe: cleanup Tx buffers
> 
> 
> 
> > > > > > + * tx_tail is the last sent packet on the sw_ring. Goto the
> > > > > > + end
> > > > > > + * of that packet (the last segment in the packet chain) and
> > > > > > + * then the next segment will be the start of the oldest
> > > > > > + segment
> > > > > > + * in the sw_ring.
> > > > >
> > > > > Not sure I understand the sentence above.
> > > > > tx_tail is the value of TDT HW register (most recently armed by SW 
> > > > > TD).
> > > > > last_id  is the index of last descriptor for multi-seg packet.
> > > > > next_id is just the index of next descriptor in HW TD ring.
> > > > > How do you conclude that it will be the ' oldest segment in the 
> > > > > sw_ring'?
> > > > >
> > > >
> > > > The tx_tail is the last sent packet on the sw_ring. While the
> > > > xmit_cleanup or Tx_free_bufs will be call when the nb_tx_free <
> > > tx_free_thresh .
> > > > So the sw_ring[tx_tail].next_id must be the begin of mbufs which
> > > > are not used or  Already freed . then begin the loop until the
> > > > mbuf is used and
> > > begin to free them.
> > > >
> > > >
> > > >
> > > > > Another question why do you need to write your own functions?
> > > > > Why can't you reuse existing ixgbe_xmit_cleanup() for
> > > > > full(offload) path and
> > > > > ixgbe_tx_free_bufs() for simple path?
> > > > > Yes,  ixgbe_xmit_cleanup() doesn't free mbufs, but at least it
> > > > > could be used to determine finished TX descriptors.
> > > > > Based on that you can you can free appropriate sw_ring[] entries.
> > > > >
> > > >
> > > > The reason why I don't reuse existing function is that they all
> > > > free several mbufs While the free_cnt of the API
> > > > rte_eth_tx_done_cleanup() is the
> > > number of packets.
> > > > It also need to be done that check which mbuffs are from the same 
> > > > packet.
> > >
> > > At first, I don't see anything bad if tx_done_cleanup() will free
> > > only some segments from the packet. As long as it is safe - there is no
> problem with that.
> > > I think rte_eth_tx_done_cleanup() operates on mbuf, not packet quantities.
> > > But in our case I think it doesn't matter, as ixgbe_xmit_cleanup()
> > > mark TXDs as free only when HW is done with all TXDs for that packet.
> > > As long as there is a way to reuse existing code and avoid
> > > duplication (without introducing any degradation) - we should use it.
> > > And I think there is a very good opportunity here to reuse existing
> > > ixgbe_xmit_cleanup() for tx_done_cleanup() implementation.
> > > Moreover because your code doesn’t follow
> > > ixgbe_xmit_pkts()/ixgbe_xmit_cleanup()
> > > logic and infrastructure, it introduces unnecessary scans over TXD
> > > ring, and in some cases doesn't work as expected:
> > >
> > > +while (1) {
> > > +tx_last = sw_ring[tx_id].last_id;
> > > +
> > > +if (sw_ring[tx_last].mbuf) {
> > > +if (txr[tx_last].wb.status &
> > > +IXGBE_TXD_STAT_DD) {
> > > ...
> > > +} else {
> > > +/*
> > > + * mbuf still in use, nothing left to
> > > + * free.
> > > + */
> > > +break;
> > >
> > > It is not correct to expect that IXGBE_TXD_STAT_DD will be set on
> > > last TXD for
> > > *every* packet.
> > > We set IXGBE_TXD_CMD_RS bit only on threshold packet last descriptor.
> > > Plus ixgbe_xmit_cleanup() can cleanup TXD wb.status.
> > >
> > > So I strongly recommend to reuse ixgbe_xmit_cleanup() here.
> > > It would be much less error prone and will help to avoid code duplication.
> > >
> > > Konstantin
> > >
> >
> > At first.
> > The function ixgbe_xmit_cleanup(struct ixgbe_tx_queue *txq) will cleanup
> TXD wb.status.
> > the number of status cleanuped is always txq->tx_rs_thresh.
> 
> Yes, and what's wrong with it?
> 
> >
> > The  API rte_eth_tx_done_cleanup() in rte_eth_dev.h show that @param
> > free_cnt
> >  *   Maximum number of packets to free. Use 0 to indicate all possible 
> > packets
> >  *   should be freed. Note that a packet may be using multiple mbufs.
> 
> I don't think it is a good approach, better would be to report number of freed
> mbufs, but ok, as it is a public API, we probably need to keep it as it is.
> 
> > a number must be set while ixgbe_xmit_cleanup and ixgbe_tx_free_bufs only
> have one parameter txq.
> 
> Yes, ixgbe_xmit_cleanup() cleans up at least txq->tx_rs_thresh TXDs.
> So if user requested more packets to be freed we can call ixgbe_xmit_cleanup()
> in a loop.
> 

That is a great idea and I discuss with my workmate about it today. there is 
also some
Question that we don’t confirm. 
Actually it can call ixgbe_xmit_cleanup() in a loop if user requested more 
packets,
How to deal with the MOD. For example:
The default tx_rs_thresh is 32.if the count of mbufs we need free is 50.
We can call ixgbe_xmit_cleanup() one time to free 32 mbufs.
  

Re: [dpdk-dev] [PATCH] net/mlx5: fix matcher metadata register c0 field setup

2020-01-07 Thread Raslan Darawsheh
Hi,

> -Original Message-
> From: Matan Azrad 
> Sent: Monday, January 6, 2020 4:50 PM
> To: Slava Ovsiienko ; dev@dpdk.org
> Cc: Raslan Darawsheh ; Ori Kam
> ; sta...@dpdk.org
> Subject: RE: [PATCH] net/mlx5: fix matcher metadata register c0 field setup
> 
> 
> 
> From: Viacheslav Ovsiienko
> > The metadata register c0 field in the matcher might be split into two
> > independent fields - the source vport index and META item value. These
> > fields have no permanent assigned bits, the configuration is queried
> > from the kernel drivers.
> >
> > MLX5_SET configures the specified 32-bit field as whole entity.
> > For metadata register c0 we should take into account the provided mask
> > in order to configure the specified subfield bits only.
> >
> > Fixes: acfcd5c52f94 ("net/mlx5: update meta register matcher set")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Viacheslav Ovsiienko 
> > ---
> >  drivers/net/mlx5/mlx5_flow_dv.c | 15 +--
> >  1 file changed, 13 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow_dv.c
> > b/drivers/net/mlx5/mlx5_flow_dv.c index 4c16281..893db3e 100644
> > --- a/drivers/net/mlx5/mlx5_flow_dv.c
> > +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> > @@ -5742,6 +5742,7 @@ struct field_modify_info modify_tcp[] = {
> > MLX5_ADDR_OF(fte_match_param, matcher,
> misc_parameters_2);
> > void *misc2_v =
> > MLX5_ADDR_OF(fte_match_param, key,
> > misc_parameters_2);
> > +   uint32_t temp;
> >
> > data &= mask;
> > switch (reg_type) {
> > @@ -5754,8 +5755,18 @@ struct field_modify_info modify_tcp[] = {
> > MLX5_SET(fte_match_set_misc2, misc2_v, metadata_reg_b,
> data);
> > break;
> > case REG_C_0:
> > -   MLX5_SET(fte_match_set_misc2, misc2_m,
> > metadata_reg_c_0, mask);
> > -   MLX5_SET(fte_match_set_misc2, misc2_v,
> > metadata_reg_c_0, data);
> > +   /*
> > +* The metadata register C0 field might be divided into
> > +* source vport index and META item value, we should set
> > +* this field accorfing to specified mask, not as whole one.
> > +*/
> 
> Typo - accorfing => according.
> 
> > +   temp = MLX5_GET(fte_match_set_misc2, misc2_m,
> > metadata_reg_c_0);
> > +   temp |= mask;
> > +   MLX5_SET(fte_match_set_misc2, misc2_m,
> > metadata_reg_c_0, temp);
> > +   temp = MLX5_GET(fte_match_set_misc2, misc2_v,
> > metadata_reg_c_0);
> > +   temp &= ~mask;
> > +   temp |= data;
> > +   MLX5_SET(fte_match_set_misc2, misc2_v,
> > metadata_reg_c_0, temp);
> > break;
> > case REG_C_1:
> > MLX5_SET(fte_match_set_misc2, misc2_m,
> metadata_reg_c_1, mask);
> 
> Raslan, please fix the typo in integration.
> Acked-by: Matan Azrad 
> 

Fixed typo issue and applied patch to next-net-mlx.
 
Kindest regards,
Raslan Darawsheh


[dpdk-dev] [PATCH 2/2 v2] examples/l3fwd: support multiple eth Rx queues in event mode

2020-01-07 Thread Nipun Gupta
This patch adds support of multiple Ethernet RX queues
for the event dev mode.

Signed-off-by: Nipun Gupta 
---

Changes in v2:
 - Update command line argument to '--event-eth-rxqs' instead of
   '--event-eth-queues'
 - Correctly printed the number of RX queues being created

 examples/l3fwd/l3fwd_event.c | 59 
 examples/l3fwd/l3fwd_event.h |  4 +++
 examples/l3fwd/main.c|  8 +++--
 3 files changed, 56 insertions(+), 15 deletions(-)

diff --git a/examples/l3fwd/l3fwd_event.c b/examples/l3fwd/l3fwd_event.c
index 9fea11bd9..1289bc9ab 100644
--- a/examples/l3fwd/l3fwd_event.c
+++ b/examples/l3fwd/l3fwd_event.c
@@ -40,12 +40,32 @@ parse_eventq_sync(const char *optarg)
evt_rsrc->sched_type = RTE_SCHED_TYPE_PARALLEL;
 }
 
+static void
+parse_event_eth_rx_queues(const char *eth_rx_queues)
+{
+   struct l3fwd_event_resources *evt_rsrc = l3fwd_get_eventdev_rsrc();
+   char *end = NULL;
+   uint8_t num_eth_rx_queues;
+
+   /* parse decimal string */
+   num_eth_rx_queues = strtoul(eth_rx_queues, &end, 10);
+   if ((eth_rx_queues[0] == '\0') || (end == NULL) || (*end != '\0'))
+   return;
+
+   if (num_eth_rx_queues == 0)
+   return;
+
+   evt_rsrc->eth_rx_queues = num_eth_rx_queues;
+}
+
 static void
 l3fwd_parse_eventdev_args(char **argv, int argc)
 {
const struct option eventdev_lgopts[] = {
{CMD_LINE_OPT_MODE, 1, 0, CMD_LINE_OPT_MODE_NUM},
{CMD_LINE_OPT_EVENTQ_SYNC, 1, 0, CMD_LINE_OPT_EVENTQ_SYNC_NUM},
+   {CMD_LINE_OPT_EVENT_ETH_RX_QUEUES, 1, 0,
+   CMD_LINE_OPT_EVENT_ETH_RX_QUEUES_NUM},
{NULL, 0, 0, 0}
};
char *prgname = argv[0];
@@ -64,6 +84,10 @@ l3fwd_parse_eventdev_args(char **argv, int argc)
parse_eventq_sync(optarg);
break;
 
+   case CMD_LINE_OPT_EVENT_ETH_RX_QUEUES_NUM:
+   parse_event_eth_rx_queues(optarg);
+   break;
+
default:
print_usage(prgname);
exit(1);
@@ -85,6 +109,7 @@ l3fwd_eth_dev_port_setup(struct rte_eth_conf *port_conf)
struct rte_eth_rxconf rxconf;
unsigned int nb_mbuf;
uint16_t port_id;
+   uint8_t eth_qid;
int32_t ret;
 
/* initialize all ports */
@@ -99,7 +124,8 @@ l3fwd_eth_dev_port_setup(struct rte_eth_conf *port_conf)
/* init port */
printf("Initializing port %d ... ", port_id);
fflush(stdout);
-   printf("Creating queues: nb_rxq=1 nb_txq=1...\n");
+   printf("Creating queues: nb_rxq=%d nb_txq=1...\n",
+  evt_rsrc->eth_rx_queues);
 
rte_eth_dev_info_get(port_id, &dev_info);
if (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_MBUF_FAST_FREE)
@@ -118,7 +144,8 @@ l3fwd_eth_dev_port_setup(struct rte_eth_conf *port_conf)
   local_port_conf.rx_adv_conf.rss_conf.rss_hf);
}
 
-   ret = rte_eth_dev_configure(port_id, 1, 1, &local_port_conf);
+   ret = rte_eth_dev_configure(port_id, evt_rsrc->eth_rx_queues,
+   1, &local_port_conf);
if (ret < 0)
rte_exit(EXIT_FAILURE,
 "Cannot configure device: err=%d, port=%d\n",
@@ -161,20 +188,26 @@ l3fwd_eth_dev_port_setup(struct rte_eth_conf *port_conf)
  8192u);
ret = init_mem(port_id, nb_mbuf);
}
-   /* init one Rx queue per port */
+   /* init Rx queues per port */
rxconf = dev_info.default_rxconf;
rxconf.offloads = local_port_conf.rxmode.offloads;
-   if (!evt_rsrc->per_port_pool)
-   ret = rte_eth_rx_queue_setup(port_id, 0, nb_rxd, 0,
-   &rxconf, evt_rsrc->pkt_pool[0][0]);
-   else
-   ret = rte_eth_rx_queue_setup(port_id, 0, nb_rxd, 0,
-   &rxconf,
+
+   for (eth_qid = 0; eth_qid < evt_rsrc->eth_rx_queues;
+eth_qid++) {
+   if (!evt_rsrc->per_port_pool)
+   ret = rte_eth_rx_queue_setup(port_id, eth_qid,
+   nb_rxd, 0, &rxconf,
+   evt_rsrc->pkt_pool[0][0]);
+   else
+   ret = rte_eth_rx_queue_setup(port_id, eth_qid,
+   nb_rxd, 0, &rxconf,
evt_rsrc->pkt_pool[port_id][0]);
-   if (ret < 0)
-   rte_exit(EXIT_FAILURE,
-"rte_eth_r

[dpdk-dev] [PATCH 1/2 v2] examples/l3fwd: set default schedule type as atomic

2020-01-07 Thread Nipun Gupta
Signed-off-by: Nipun Gupta 
---
 examples/l3fwd/l3fwd_event.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/examples/l3fwd/l3fwd_event.h b/examples/l3fwd/l3fwd_event.h
index fcc0ce51a..470aedc61 100644
--- a/examples/l3fwd/l3fwd_event.h
+++ b/examples/l3fwd/l3fwd_event.h
@@ -100,7 +100,11 @@ l3fwd_get_eventdev_rsrc(void)
mz = rte_memzone_reserve(name, sizeof(struct l3fwd_event_resources),
 0, 0);
if (mz != NULL) {
+   struct l3fwd_event_resources *rsrc = mz->addr;
+
memset(mz->addr, 0, sizeof(struct l3fwd_event_resources));
+   rsrc->sched_type = RTE_SCHED_TYPE_ATOMIC;
+
return mz->addr;
}
 
-- 
2.17.1



Re: [dpdk-dev] Inconsistent behavior of mempool with regards to hugepage allocation

2020-01-07 Thread Burakov, Anatoly

On 27-Dec-19 11:11 AM, Olivier Matz wrote:

Hi Bao-Long,

On Fri, Dec 27, 2019 at 06:05:57PM +0800, Bao-Long Tran wrote:

Hi Olivier,


On 27 Dec 2019, at 4:11 PM, Olivier Matz  wrote:

On Thu, Dec 26, 2019 at 04:45:24PM +0100, Olivier Matz wrote:

Hi Bao-Long,

On Mon, Dec 23, 2019 at 07:09:29PM +0800, Bao-Long Tran wrote:

Hi,

I'm not sure if this is a bug, but I've seen an inconsistency in the behavior
of DPDK with regards to hugepage allocation for rte_mempool. Basically, for the
same mempool size, the number of hugepages allocated changes from run to run.

Here's how I reproduce with DPDK 19.11. IOVA=pa (default)

1. Reserve 16x1G hugepages on socket 0
2. Replace examples/skeleton/basicfwd.c with the code below, build and run
make && ./build/basicfwd
3. At the same time, watch the number of hugepages allocated
"watch -n.1 ls /dev/hugepages"
4. Repeat step 2

If you can reproduce, you should see that for some runs, DPDK allocates 5
hugepages, other times it allocates 6. When it allocates 6, if you watch the
output from step 3., you will see that DPDK first  try to allocate 5 hugepages,
then unmap all 5, retry, and got 6.


I cannot reproduce in the same conditions than yours (with 16 hugepages
on socket 0), but I think I can see a similar issue:

If I reserve at least 6 hugepages, it seems reproducible (6 hugepages
are used). If I reserve 5 hugepages, it takes more time,
taking/releasing hugepages several times, and it finally succeeds with 5
hugepages.


My apology: I just checked again, I was using DPDK 19.05, not 19.11 or master.
Let me try to see if I can repro my issue with 19.11. Sorry for the confusion.

I also saw your patch to reduce wasted memory (eba11e). Seems like it resolves
the problem with the IOVA-contig constraint that I described in my first 
message.
I'll look into it to confirm.

If I cannot repro my issue (different number of hugepages) with 19.11, from our
side we can upgrade to 19.11 and that's all we need for now. But let me also try
to repro the issue you described (multiple attempts to allocate hugepages).


OK, thanks.

Anyway, I think there is an issue on 19.11. And it is is even worse with
2M hugepages. Let's say we reserve 500x 2M hugepages, and try to
allocate a mempool of 5G:

1/ mempool_populate tries to allocate in one virtually contiguous block,
which maps all 500 hugepages, then fail, unmapping them
2/ it tries to allocate the largest zone, which returns ~2MB.
3/ this zone is added to the mempool, and for that, it allocates a
mem_header struct, which triggers the mapping of a new page.
4/ Back to 1... until it fails after 3 mins

The memzone allocation of "largest available area" does not have the
same semantic depending on the memory model (pre-mapped hugepages or
not). When using dynamic hugepage mapping, it won't map any additional
hugepage.

To solve the issue, we could either change it to allocate all available
hugepages, or change mempool populate, by not using the "largest
available area" allocation, doing the search by ourself.


Yep, this is one of the things that is currently an unsolved problem in 
the allocator. I am not sure if any one behavior is "more correct" than 
the other, so i don't think allocating "all available" hugepages is more 
correct than not doing it.


Besides, there's no reliable way to get "biggest" chunk of memory, 
because while you might get *some* memory from 2M pages, there's no 
guarantee that the amount you may get from 1G pages isn't bigger. So, we 
either momentarily take over the entire users' memory and figure out 
what we need and what we don't, or we use the first available page size 
and hope that that's enough.


That said, there's an internal API to allocate "up to X" pages, so in 
principle, we could build this kind of infrastructure.










For our use case, it's important that DPDK allocate the same number of
hugepages on every run so we can get reproducable results.


One possibility is to use the --legacy-mem EAL option. It will try to
reserve all hugepages first.


Passing --socket-mem=5120,0 also does the job.




Studying the code, this seems to be the behavior of
rte_mempool_populate_default(). If I understand correctly, if the first try fail
to get 5 IOVA-contiguous pages, it retries, relaxing the IOVA-contiguous
condition, and eventually wound up with 6 hugepages.


No, I think you don't have the IOVA-contiguous constraint in your
case. This is what I see:

a- reserve 5 hugepages on socket 0, and start your patched basicfwd
b- it tries to allocate 2097151 objects of size 2304, pg_size = 1073741824
c- the total element size (with header) is 2304 + 64 = 2368
d- in rte_mempool_op_calc_mem_size_helper(), it calculates
   obj_per_page = 453438(453438 * 2368 = 1073741184)
   mem_size = 4966058495
e- it tries to allocate 4966058495 bytes, which is less than 5 x 1G, with:
   rte_memzone_reserve_aligned(name, size=4966058495, socket=0,
 mz_flags=RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY,
 al

[dpdk-dev] [PATCH] event/octeontx2: relax memory requirement for event timers

2020-01-07 Thread pbhagavatula
From: Pavan Nikhilesh 

Relax memory requirement for event timers when internal mempool used is
octeontx2 mempool.
Add debug log to print the memory used.

Signed-off-by: Pavan Nikhilesh 
---
 drivers/event/octeontx2/otx2_tim_evdev.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/event/octeontx2/otx2_tim_evdev.c 
b/drivers/event/octeontx2/otx2_tim_evdev.c
index b275c6922..cd0dcde24 100644
--- a/drivers/event/octeontx2/otx2_tim_evdev.c
+++ b/drivers/event/octeontx2/otx2_tim_evdev.c
@@ -327,7 +327,11 @@ otx2_tim_ring_create(struct rte_event_timer_adapter *adptr)
tim_optimze_bkt_param(tim_ring);
}
 
-   tim_ring->nb_chunks = tim_ring->nb_chunks * tim_ring->nb_bkts;
+   if (tim_ring->disable_npa)
+   tim_ring->nb_chunks = tim_ring->nb_chunks * tim_ring->nb_bkts;
+   else
+   tim_ring->nb_chunks = tim_ring->nb_chunks + tim_ring->nb_bkts;
+
/* Create buckets. */
tim_ring->bkt = rte_zmalloc("otx2_tim_bucket", (tim_ring->nb_bkts) *
sizeof(struct otx2_tim_bkt),
@@ -376,6 +380,11 @@ otx2_tim_ring_create(struct rte_event_timer_adapter *adptr)
 RTE_EVENT_TYPE_TIMER);
sso_xae_reconfigure(dev->event_dev);
 
+   otx2_tim_dbg("Total memory used %"PRIu64"MB\n",
+   (uint64_t)(((tim_ring->nb_chunks * tim_ring->chunk_sz)
+   + (tim_ring->nb_bkts * sizeof(struct otx2_tim_bkt))) /
+   BIT_ULL(20)));
+
return rc;
 
 chnk_mem_err:
-- 
2.17.1



Re: [dpdk-dev] [PATCH v1] net/axgbe: Add a HW quirk for register definitions

2020-01-07 Thread Ferruh Yigit
On 12/17/2019 6:44 AM, Sebastian, Selwin wrote:
> [AMD Official Use Only - Internal Distribution Only]
> 
> Hi Ferruh,
>   Current driver was developed for EPYC 3000 processors. New processors 
> V1000/R1000 is also using the same PCI id for axgbe but register definitions 
> for determining the window settings for indirect PCS access is changed. In 
> order to identify processor, we are adding a quirk.
> 15d0 is the pci id for V1000/R1000/Raven root complex( 
> https://pci-ids.ucw.cz/read/PC/1022 ). Hence read pci-id of root complex  to 
> determine which processor and set the registers accordingly. 
> 

Got it, it is better to add a define for 0x15d0 with an explanation, and for the
root complex device use a more descriptive variable name that 'pdev'.

But still it is not really good idea to access the pci device list, isn't there
any other way to differentiate the devices, sub-device id etc? And how does
linux driver manages this?

And is it guaranteed that root complex device always will be the first device?


> Thanks and Regards
> Selwin Sebastian
>  
> 
> -Original Message-
> From: Ferruh Yigit  
> Sent: Wednesday, December 11, 2019 5:12 PM
> To: Sebastian, Selwin ; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1] net/axgbe: Add a HW quirk for register 
> definitions
> 
> [CAUTION: External Email]
> 
> On 12/10/2019 3:29 PM, Selwin Sebastian wrote:
>> V1000/R1000 processors are using the same PCI ids for the network 
>> device but has altered register definitions for determining the window 
>> settings for the indirect PCS access.Add support to check for this 
>> hardware and if found use the new register values
> 
> How they are differentiated, subdevice ids?
> If so should we add subdevice fields check into DPDK?
> 
>>
>> Signed-off-by: Selwin Sebastian 
>> ---
>>  drivers/net/axgbe/axgbe_common.h |  2 ++  
>> drivers/net/axgbe/axgbe_ethdev.c | 18 +++---
>>  2 files changed, 17 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/axgbe/axgbe_common.h 
>> b/drivers/net/axgbe/axgbe_common.h
>> index 34f60f156..4a3fbac16 100644
>> --- a/drivers/net/axgbe/axgbe_common.h
>> +++ b/drivers/net/axgbe/axgbe_common.h
>> @@ -841,6 +841,8 @@
>>  #define PCS_V1_WINDOW_SELECT 0x03fc
>>  #define PCS_V2_WINDOW_DEF0x9060
>>  #define PCS_V2_WINDOW_SELECT 0x9064
>> +#define PCS_V2_RV_WINDOW_DEF 0x1060
>> +#define PCS_V2_RV_WINDOW_SELECT  0x1064
>>
>>  /* PCS register entry bit positions and sizes */
>>  #define PCS_V2_WINDOW_DEF_OFFSET_INDEX   6
>> diff --git a/drivers/net/axgbe/axgbe_ethdev.c 
>> b/drivers/net/axgbe/axgbe_ethdev.c
>> index d1f160e79..25e182b8d 100644
>> --- a/drivers/net/axgbe/axgbe_ethdev.c
>> +++ b/drivers/net/axgbe/axgbe_ethdev.c
>> @@ -31,6 +31,7 @@ static int  axgbe_dev_info_get(struct rte_eth_dev *dev,
>>  #define AMD_PCI_VENDOR_ID   0x1022
>>  #define AMD_PCI_AXGBE_DEVICE_V2A 0x1458  #define 
>> AMD_PCI_AXGBE_DEVICE_V2B 0x1459
>> +extern struct rte_pci_bus rte_pci_bus;
> 
> Not sure about accessing the bus device list from a PMD...
> 
>>
>>  int axgbe_logtype_init;
>>  int axgbe_logtype_driver;
>> @@ -585,6 +586,7 @@ eth_axgbe_dev_init(struct rte_eth_dev *eth_dev)
>>   struct rte_pci_device *pci_dev;
>>   uint32_t reg, mac_lo, mac_hi;
>>   int ret;
>> + struct rte_pci_device *pdev;
>>
>>   eth_dev->dev_ops = &axgbe_eth_dev_ops;
>>   eth_dev->rx_pkt_burst = &axgbe_recv_pkts; @@ -605,6 +607,17 @@ 
>> eth_axgbe_dev_init(struct rte_eth_dev *eth_dev)
>>   pci_dev = RTE_DEV_TO_PCI(eth_dev->device);
>>   pdata->pci_dev = pci_dev;
>>
>> + pdev = TAILQ_FIRST(&rte_pci_bus.device_list);
> 
> Can you please describe what this does? You are reading first pci device and 
> do you assume it is an axgbe device? And do you also assume there is single 
> axgbe device?
> 
> Why you are not simply using 'pci_dev' above?
> 
>> +
>> + if (pdev->id.vendor_id == AMD_PCI_VENDOR_ID &&
>> + pdev->id.device_id == 0x15d0) {
> 
> As far as I can see, '0x15d0' is not in the supported pci_id list, so why you 
> are checking it here? That devices shouldn't be probed at all ...
> 
>> + pdata->xpcs_window_def_reg = PCS_V2_RV_WINDOW_DEF;
>> + pdata->xpcs_window_sel_reg = PCS_V2_RV_WINDOW_SELECT;
>> + } else {
>> + pdata->xpcs_window_def_reg = PCS_V2_WINDOW_DEF;
>> + pdata->xpcs_window_sel_reg = PCS_V2_WINDOW_SELECT;
>> + }
>> +
>>   pdata->xgmac_regs =
>>   (void *)pci_dev->mem_resource[AXGBE_AXGMAC_BAR].addr;
>>   pdata->xprop_regs = (void *)((uint8_t *)pdata->xgmac_regs @@ 
>> -620,14 +633,13 @@ eth_axgbe_dev_init(struct rte_eth_dev *eth_dev)
>>   pdata->vdata = &axgbe_v2b;
>>
>>   /* Configure the PCS indirect addressing support */
>> - reg = XPCS32_IOREAD(pdata, PCS_V2_WINDOW_DEF);
>> + reg = XPCS32_IOREAD(pdata, pdata->xpcs_window_def_reg);
>>   pdata->xpcs_window = 

Re: [dpdk-dev] [PATCH v6 3/4] net/ixgbe: cleanup Tx buffers

2020-01-07 Thread Ananyev, Konstantin

Hi Chenxu,

> > > > > > > + * tx_tail is the last sent packet on the sw_ring. Goto the
> > > > > > > + end
> > > > > > > + * of that packet (the last segment in the packet chain) and
> > > > > > > + * then the next segment will be the start of the oldest
> > > > > > > + segment
> > > > > > > + * in the sw_ring.
> > > > > >
> > > > > > Not sure I understand the sentence above.
> > > > > > tx_tail is the value of TDT HW register (most recently armed by SW 
> > > > > > TD).
> > > > > > last_id  is the index of last descriptor for multi-seg packet.
> > > > > > next_id is just the index of next descriptor in HW TD ring.
> > > > > > How do you conclude that it will be the ' oldest segment in the 
> > > > > > sw_ring'?
> > > > > >
> > > > >
> > > > > The tx_tail is the last sent packet on the sw_ring. While the
> > > > > xmit_cleanup or Tx_free_bufs will be call when the nb_tx_free <
> > > > tx_free_thresh .
> > > > > So the sw_ring[tx_tail].next_id must be the begin of mbufs which
> > > > > are not used or  Already freed . then begin the loop until the
> > > > > mbuf is used and
> > > > begin to free them.
> > > > >
> > > > >
> > > > >
> > > > > > Another question why do you need to write your own functions?
> > > > > > Why can't you reuse existing ixgbe_xmit_cleanup() for
> > > > > > full(offload) path and
> > > > > > ixgbe_tx_free_bufs() for simple path?
> > > > > > Yes,  ixgbe_xmit_cleanup() doesn't free mbufs, but at least it
> > > > > > could be used to determine finished TX descriptors.
> > > > > > Based on that you can you can free appropriate sw_ring[] entries.
> > > > > >
> > > > >
> > > > > The reason why I don't reuse existing function is that they all
> > > > > free several mbufs While the free_cnt of the API
> > > > > rte_eth_tx_done_cleanup() is the
> > > > number of packets.
> > > > > It also need to be done that check which mbuffs are from the same 
> > > > > packet.
> > > >
> > > > At first, I don't see anything bad if tx_done_cleanup() will free
> > > > only some segments from the packet. As long as it is safe - there is no
> > problem with that.
> > > > I think rte_eth_tx_done_cleanup() operates on mbuf, not packet 
> > > > quantities.
> > > > But in our case I think it doesn't matter, as ixgbe_xmit_cleanup()
> > > > mark TXDs as free only when HW is done with all TXDs for that packet.
> > > > As long as there is a way to reuse existing code and avoid
> > > > duplication (without introducing any degradation) - we should use it.
> > > > And I think there is a very good opportunity here to reuse existing
> > > > ixgbe_xmit_cleanup() for tx_done_cleanup() implementation.
> > > > Moreover because your code doesn’t follow
> > > > ixgbe_xmit_pkts()/ixgbe_xmit_cleanup()
> > > > logic and infrastructure, it introduces unnecessary scans over TXD
> > > > ring, and in some cases doesn't work as expected:
> > > >
> > > > +while (1) {
> > > > +tx_last = sw_ring[tx_id].last_id;
> > > > +
> > > > +if (sw_ring[tx_last].mbuf) {
> > > > +if (txr[tx_last].wb.status &
> > > > +IXGBE_TXD_STAT_DD) {
> > > > ...
> > > > +} else {
> > > > +/*
> > > > + * mbuf still in use, nothing left to
> > > > + * free.
> > > > + */
> > > > +break;
> > > >
> > > > It is not correct to expect that IXGBE_TXD_STAT_DD will be set on
> > > > last TXD for
> > > > *every* packet.
> > > > We set IXGBE_TXD_CMD_RS bit only on threshold packet last descriptor.
> > > > Plus ixgbe_xmit_cleanup() can cleanup TXD wb.status.
> > > >
> > > > So I strongly recommend to reuse ixgbe_xmit_cleanup() here.
> > > > It would be much less error prone and will help to avoid code 
> > > > duplication.
> > > >
> > > > Konstantin
> > > >
> > >
> > > At first.
> > > The function ixgbe_xmit_cleanup(struct ixgbe_tx_queue *txq) will cleanup
> > TXD wb.status.
> > > the number of status cleanuped is always txq->tx_rs_thresh.
> >
> > Yes, and what's wrong with it?
> >
> > >
> > > The  API rte_eth_tx_done_cleanup() in rte_eth_dev.h show that @param
> > > free_cnt
> > >  *   Maximum number of packets to free. Use 0 to indicate all possible 
> > > packets
> > >  *   should be freed. Note that a packet may be using multiple mbufs.
> >
> > I don't think it is a good approach, better would be to report number of 
> > freed
> > mbufs, but ok, as it is a public API, we probably need to keep it as it is.
> >
> > > a number must be set while ixgbe_xmit_cleanup and ixgbe_tx_free_bufs only
> > have one parameter txq.
> >
> > Yes, ixgbe_xmit_cleanup() cleans up at least txq->tx_rs_thresh TXDs.
> > So if user requested more packets to be freed we can call 
> > ixgbe_xmit_cleanup()
> > in a loop.
> >
> 
> That is a great idea and I discuss with my workmate about it today. there is 
> also some
> Question that we don’t confirm.
> Actually it can call ixgbe_xmit_cleanup() in a loop if user requested more 
> packets,
> How to deal with the MOD. For example:
>   The default tx_rs_thresh is 32.if the count of mbufs we need free is 50.
>   We can call ixgbe_xmit_cleanup() one t

[dpdk-dev] [PATCH v4 0/6] OCTEON TX2 End Point Driver

2020-01-07 Thread Mahipal Challa
This patchset adds support for OCTEON TX2 end point mode of operation.
The driver implementation uses DPDK rawdevice sub-system.

v2:
* Updated memory barrior API's as per Gavin Hu suggestion.

v3:
* Fixed memory leak possibility issues.

v4:
* Improved error handling in selftest API.

Mahipal Challa (6):
  raw/octeontx2_ep: add build infra and device probe
  raw/octeontx2_ep: add device configuration
  raw/octeontx2_ep: add device uninitialization
  raw/octeontx2_ep: add enqueue operation
  raw/octeontx2_ep: add dequeue operation
  raw/octeontx2_ep: add driver self test

 MAINTAINERS|   5 +
 config/common_base |   5 +
 doc/guides/rawdevs/index.rst   |   1 +
 doc/guides/rawdevs/octeontx2_ep.rst|  89 +++
 drivers/common/octeontx2/hw/otx2_sdp.h | 184 +
 drivers/common/octeontx2/otx2_common.c |   9 +
 drivers/common/octeontx2/otx2_common.h |   4 +
 .../octeontx2/rte_common_octeontx2_version.map |   6 +
 drivers/raw/Makefile   |   1 +
 drivers/raw/meson.build|   1 +
 drivers/raw/octeontx2_ep/Makefile  |  44 ++
 drivers/raw/octeontx2_ep/meson.build   |   9 +
 drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c  | 844 +
 drivers/raw/octeontx2_ep/otx2_ep_enqdeq.h  |  52 ++
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.c  | 361 +
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.h  | 499 
 drivers/raw/octeontx2_ep/otx2_ep_test.c| 173 +
 drivers/raw/octeontx2_ep/otx2_ep_vf.c  | 476 
 drivers/raw/octeontx2_ep/otx2_ep_vf.h  |  10 +
 .../rte_rawdev_octeontx2_ep_version.map|   4 +
 mk/rte.app.mk  |   2 +
 21 files changed, 2779 insertions(+)
 create mode 100644 doc/guides/rawdevs/octeontx2_ep.rst
 create mode 100644 drivers/common/octeontx2/hw/otx2_sdp.h
 create mode 100644 drivers/raw/octeontx2_ep/Makefile
 create mode 100644 drivers/raw/octeontx2_ep/meson.build
 create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c
 create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_enqdeq.h
 create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
 create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_rawdev.h
 create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_test.c
 create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_vf.c
 create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_vf.h
 create mode 100644 drivers/raw/octeontx2_ep/rte_rawdev_octeontx2_ep_version.map

-- 
1.8.3.1



[dpdk-dev] [PATCH v4 2/6] raw/octeontx2_ep: add device configuration

2020-01-07 Thread Mahipal Challa
Register "dev_configure" API to configure/initialize the SDP
VF PCIe devices.

Signed-off-by: Mahipal Challa 
---
 doc/guides/rawdevs/octeontx2_ep.rst|  29 ++
 drivers/common/octeontx2/hw/otx2_sdp.h | 184 +
 drivers/common/octeontx2/otx2_common.c |   9 +
 drivers/common/octeontx2/otx2_common.h |   4 +
 .../octeontx2/rte_common_octeontx2_version.map |   6 +
 drivers/raw/octeontx2_ep/Makefile  |   3 +
 drivers/raw/octeontx2_ep/meson.build   |   4 +-
 drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c  | 294 ++
 drivers/raw/octeontx2_ep/otx2_ep_enqdeq.h  |  11 +
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.c  | 148 +++
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.h  | 434 -
 drivers/raw/octeontx2_ep/otx2_ep_vf.c  | 408 +++
 drivers/raw/octeontx2_ep/otx2_ep_vf.h  |  10 +
 13 files changed, 1542 insertions(+), 2 deletions(-)

diff --git a/doc/guides/rawdevs/octeontx2_ep.rst 
b/doc/guides/rawdevs/octeontx2_ep.rst
index 5f5ed01..2507fcf 100644
--- a/doc/guides/rawdevs/octeontx2_ep.rst
+++ b/doc/guides/rawdevs/octeontx2_ep.rst
@@ -39,3 +39,32 @@ entry `sriov_numvfs` for the corresponding PF driver.
 
 Once the required VFs are enabled, to be accessible from DPDK, VFs need to be
 bound to vfio-pci driver.
+
+Device Setup
+
+
+The OCTEON TX2 SDP End Point VF devices will need to be bound to a
+user-space IO driver for use. The script ``dpdk-devbind.py`` script
+included with DPDK can be used to view the state of the devices and to bind
+them to a suitable DPDK-supported kernel driver. When querying the status
+of the devices, they will appear under the category of "Misc (rawdev)
+devices", i.e. the command ``dpdk-devbind.py --status-dev misc`` can be
+used to see the state of those devices alone.
+
+Device Configuration
+
+
+Configuring SDP EP rawdev device is done using the ``rte_rawdev_configure()``
+API, which takes the mempool as parameter. PMD uses this pool to send/receive
+packets to/from the HW.
+
+The following code shows how the device is configured
+
+.. code-block:: c
+
+   struct sdp_rawdev_info config = {0};
+   struct rte_rawdev_info rdev_info = {.dev_private = &config};
+   config.enqdeq_mpool = (void *)rte_mempool_create(...);
+
+   rte_rawdev_configure(dev_id, (rte_rawdev_obj_t)&rdev_info);
+
diff --git a/drivers/common/octeontx2/hw/otx2_sdp.h 
b/drivers/common/octeontx2/hw/otx2_sdp.h
new file mode 100644
index 000..1e690f8
--- /dev/null
+++ b/drivers/common/octeontx2/hw/otx2_sdp.h
@@ -0,0 +1,184 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#ifndef __OTX2_SDP_HW_H_
+#define __OTX2_SDP_HW_H_
+
+/* SDP VF IOQs */
+#define SDP_MIN_RINGS_PER_VF(1)
+#define SDP_MAX_RINGS_PER_VF(8)
+
+/* SDP VF IQ configuration */
+#define SDP_VF_MAX_IQ_DESCRIPTORS   (512)
+#define SDP_VF_MIN_IQ_DESCRIPTORS   (128)
+
+#define SDP_VF_DB_MIN   (1)
+#define SDP_VF_DB_TIMEOUT   (1)
+#define SDP_VF_INTR_THRESHOLD   (0x)
+
+#define SDP_VF_64BYTE_INSTR (64)
+#define SDP_VF_32BYTE_INSTR (32)
+
+/* SDP VF OQ configuration */
+#define SDP_VF_MAX_OQ_DESCRIPTORS   (512)
+#define SDP_VF_MIN_OQ_DESCRIPTORS   (128)
+#define SDP_VF_OQ_BUF_SIZE  (2048)
+#define SDP_VF_OQ_REFIL_THRESHOLD   (16)
+
+#define SDP_VF_OQ_INFOPTR_MODE  (1)
+#define SDP_VF_OQ_BUFPTR_MODE   (0)
+
+#define SDP_VF_OQ_INTR_PKT  (1)
+#define SDP_VF_OQ_INTR_TIME (10)
+#define SDP_VF_CFG_IO_QUEUESSDP_MAX_RINGS_PER_VF
+
+/* Wait time in milliseconds for FLR */
+#define SDP_VF_PCI_FLR_WAIT (100)
+#define SDP_VF_BUSY_LOOP_COUNT  (1)
+
+#define SDP_VF_MAX_IO_QUEUESSDP_MAX_RINGS_PER_VF
+#define SDP_VF_MIN_IO_QUEUESSDP_MIN_RINGS_PER_VF
+
+/* SDP VF IOQs per rawdev */
+#define SDP_VF_MAX_IOQS_PER_RAWDEV  SDP_VF_MAX_IO_QUEUES
+#define SDP_VF_DEFAULT_IOQS_PER_RAWDEV  SDP_VF_MIN_IO_QUEUES
+
+/* SDP VF Register definitions */
+#define SDP_VF_RING_OFFSET(0x1ull << 17)
+
+/* SDP VF IQ Registers */
+#define SDP_VF_R_IN_CONTROL_START (0x1)
+#define SDP_VF_R_IN_ENABLE_START  (0x10010)
+#define SDP_VF_R_IN_INSTR_BADDR_START (0x10020)
+#define SDP_VF_R_IN_INSTR_RSIZE_START (0x10030)
+#define SDP_VF_R_IN_INSTR_DBELL_START (0x10040)
+#define SDP_VF_R_IN_CNTS_START(0x10050)
+#define SDP_VF_R_IN_INT_LEVELS_START  (0x10060)
+#define SDP_VF_R_IN_PKT_CNT_START (0x10080)
+#define SDP_VF_R_IN_BYTE_CNT_START(0x10090)
+
+#define SDP_VF_R_IN_CONTROL(ring)  \
+   (SDP_VF_R_IN_CONTROL_START + ((ring) * SDP_VF_RING_OFFSET))
+
+#define SDP_VF_R_IN_ENABLE(ring)   \
+   (SDP_VF_R_IN_ENABLE_START + ((ring) * SDP_VF_RING_OFFSET))
+
+#define SDP_VF_R_IN_INSTR_BADDR(ring)   \
+   (SDP_VF_R_IN_INSTR_BADDR_START 

[dpdk-dev] [PATCH v4 1/6] raw/octeontx2_ep: add build infra and device probe

2020-01-07 Thread Mahipal Challa
Add the OCTEON TX2 SDP EP device probe along with the
build infrastructure for Make and meson builds.

Signed-off-by: Mahipal Challa 
---
 MAINTAINERS|   5 +
 config/common_base |   5 +
 doc/guides/rawdevs/index.rst   |   1 +
 doc/guides/rawdevs/octeontx2_ep.rst|  41 +++
 drivers/raw/Makefile   |   1 +
 drivers/raw/meson.build|   1 +
 drivers/raw/octeontx2_ep/Makefile  |  40 +++
 drivers/raw/octeontx2_ep/meson.build   |   6 +
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.c  | 132 +
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.h  |  21 
 .../rte_rawdev_octeontx2_ep_version.map|   4 +
 mk/rte.app.mk  |   2 +
 12 files changed, 259 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4395d8d..24f1240 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1173,6 +1173,11 @@ M: Vamsi Attunuru 
 F: drivers/raw/octeontx2_dma/
 F: doc/guides/rawdevs/octeontx2_dma.rst
 
+Marvell OCTEON TX2 EP
+M: Mahipal Challa 
+F: drivers/raw/octeontx2_ep/
+F: doc/guides/rawdevs/octeontx2_ep.rst
+
 NTB
 M: Xiaoyun Li 
 M: Jingjing Wu 
diff --git a/config/common_base b/config/common_base
index 7dec7ed..8e7dad2 100644
--- a/config/common_base
+++ b/config/common_base
@@ -796,6 +796,11 @@ CONFIG_RTE_LIBRTE_PMD_IOAT_RAWDEV=y
 CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_DMA_RAWDEV=y
 
 #
+# Compile PMD for octeontx2 EP raw device
+#
+CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_EP_RAWDEV=y
+
+#
 # Compile PMD for NTB raw device
 #
 CONFIG_RTE_LIBRTE_PMD_NTB_RAWDEV=y
diff --git a/doc/guides/rawdevs/index.rst b/doc/guides/rawdevs/index.rst
index 22bc013..f64ec44 100644
--- a/doc/guides/rawdevs/index.rst
+++ b/doc/guides/rawdevs/index.rst
@@ -17,3 +17,4 @@ application through rawdev API.
 ioat
 ntb
 octeontx2_dma
+octeontx2_ep
diff --git a/doc/guides/rawdevs/octeontx2_ep.rst 
b/doc/guides/rawdevs/octeontx2_ep.rst
new file mode 100644
index 000..5f5ed01
--- /dev/null
+++ b/doc/guides/rawdevs/octeontx2_ep.rst
@@ -0,0 +1,41 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+Copyright(c) 2019 Marvell International Ltd.
+
+Marvell OCTEON TX2 End Point Rawdev Driver
+==
+
+OCTEON TX2 has an internal SDP unit which provides End Point mode of operation
+by exposing its IOQs to Host, IOQs are used for packet I/O between Host and
+OCTEON TX2. Each OCTEON TX2 SDP PF supports a max of 128 VFs and Each VF is
+associated with a set of IOQ pairs.
+
+Features
+
+
+This OCTEON TX2 End Point mode PMD supports
+
+#. Packet Input - Host to OCTEON TX2 with direct data instruction mode.
+
+#. Packet Output - OCTEON TX2 to Host with info pointer mode.
+
+Config File Options
+~~~
+
+The following options can be modified in the ``config`` file.
+
+- ``CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_EP_RAWDEV`` (default ``y``)
+
+  Toggle compilation of the ``lrte_pmd_octeontx2_ep`` driver.
+
+Initialization
+--
+
+The number of SDP VFs enabled, can be controlled by setting sysfs
+entry `sriov_numvfs` for the corresponding PF driver.
+
+.. code-block:: console
+
+ echo  > /sys/bus/pci/drivers/octeontx2-ep/\:04\:00.0/sriov_numvfs
+
+Once the required VFs are enabled, to be accessible from DPDK, VFs need to be
+bound to vfio-pci driver.
diff --git a/drivers/raw/Makefile b/drivers/raw/Makefile
index 0b6d13d..80b043e 100644
--- a/drivers/raw/Makefile
+++ b/drivers/raw/Makefile
@@ -13,5 +13,6 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV) += ifpga
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_IOAT_RAWDEV) += ioat
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_NTB_RAWDEV) += ntb
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_DMA_RAWDEV) += octeontx2_dma
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_EP_RAWDEV) += octeontx2_ep
 
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/raw/meson.build b/drivers/raw/meson.build
index d7037cd..bb57977 100644
--- a/drivers/raw/meson.build
+++ b/drivers/raw/meson.build
@@ -4,6 +4,7 @@
 drivers = ['dpaa2_cmdif', 'dpaa2_qdma',
'ifpga', 'ioat', 'ntb',
'octeontx2_dma',
+   'octeontx2_ep',
'skeleton']
 std_deps = ['rawdev']
 config_flag_fmt = 'RTE_LIBRTE_PMD_@0@_RAWDEV'
diff --git a/drivers/raw/octeontx2_ep/Makefile 
b/drivers/raw/octeontx2_ep/Makefile
new file mode 100644
index 000..8cec6bd
--- /dev/null
+++ b/drivers/raw/octeontx2_ep/Makefile
@@ -0,0 +1,40 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(C) 2019 Marvell International Ltd.
+#
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# Library name
+LIB = librte_rawdev_octeontx2_ep.a
+
+# Build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+CFLAGS += -I$(RTE_SDK)/drivers/common/octeontx2/
+CFLAGS += -I$(RTE_SDK)/drivers/raw/octeontx2_ep/
+
+LDLIBS += -lrte_eal
+LDLIBS += -lrte_rawdev
+LDLIBS += -lrte_bus_pci
+LDLIBS += -lrte_mempool
+LDLIBS += -lrte_com

[dpdk-dev] [PATCH v4 4/6] raw/octeontx2_ep: add enqueue operation

2020-01-07 Thread Mahipal Challa
Add rawdev enqueue operation for SDP VF devices.

Signed-off-by: Mahipal Challa 
---
 doc/guides/rawdevs/octeontx2_ep.rst   |   6 +
 drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c | 242 ++
 drivers/raw/octeontx2_ep/otx2_ep_enqdeq.h |  39 +
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.c |   1 +
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.h |  20 +++
 drivers/raw/octeontx2_ep/otx2_ep_vf.c |  24 +++
 6 files changed, 332 insertions(+)

diff --git a/doc/guides/rawdevs/octeontx2_ep.rst 
b/doc/guides/rawdevs/octeontx2_ep.rst
index 2507fcf..39a7c29 100644
--- a/doc/guides/rawdevs/octeontx2_ep.rst
+++ b/doc/guides/rawdevs/octeontx2_ep.rst
@@ -68,3 +68,9 @@ The following code shows how the device is configured
 
rte_rawdev_configure(dev_id, (rte_rawdev_obj_t)&rdev_info);
 
+Performing Data Transfer
+
+
+To perform data transfer using SDP VF EP rawdev devices use standard
+``rte_rawdev_enqueue_buffers()`` and ``rte_rawdev_dequeue_buffers()`` APIs.
+
diff --git a/drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c 
b/drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c
index 584b818..87ca3cd 100644
--- a/drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c
+++ b/drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c
@@ -403,3 +403,245 @@
return -ENOMEM;
 }
 
+static inline void
+sdp_iqreq_delete(struct sdp_device *sdpvf,
+   struct sdp_instr_queue *iq, uint32_t idx)
+{
+   uint32_t reqtype;
+   void *buf;
+
+   buf = iq->req_list[idx].buf;
+   reqtype = iq->req_list[idx].reqtype;
+
+   switch (reqtype) {
+   case SDP_REQTYPE_NORESP:
+   rte_mempool_put(sdpvf->enqdeq_mpool, buf);
+   otx2_sdp_dbg("IQ buffer freed at idx[%d]", idx);
+   break;
+
+   case SDP_REQTYPE_NORESP_GATHER:
+   case SDP_REQTYPE_NONE:
+   default:
+   otx2_info("This iqreq mode is not supported:%d", reqtype);
+
+   }
+
+   /* Reset the request list at this index */
+   iq->req_list[idx].buf = NULL;
+   iq->req_list[idx].reqtype = 0;
+}
+
+static inline void
+sdp_iqreq_add(struct sdp_instr_queue *iq, void *buf,
+   uint32_t reqtype)
+{
+   iq->req_list[iq->host_write_index].buf = buf;
+   iq->req_list[iq->host_write_index].reqtype = reqtype;
+
+   otx2_sdp_dbg("IQ buffer added at idx[%d]", iq->host_write_index);
+
+}
+
+static void
+sdp_flush_iq(struct sdp_device *sdpvf,
+   struct sdp_instr_queue *iq,
+   uint32_t pending_thresh __rte_unused)
+{
+   uint32_t instr_processed = 0;
+
+   rte_spinlock_lock(&iq->lock);
+
+   iq->otx_read_index = sdpvf->fn_list.update_iq_read_idx(iq);
+   while (iq->flush_index != iq->otx_read_index) {
+   /* Free the IQ data buffer to the pool */
+   sdp_iqreq_delete(sdpvf, iq, iq->flush_index);
+   iq->flush_index =
+   sdp_incr_index(iq->flush_index, 1, iq->nb_desc);
+
+   instr_processed++;
+   }
+
+   iq->stats.instr_processed = instr_processed;
+   rte_atomic64_sub(&iq->instr_pending, instr_processed);
+
+   rte_spinlock_unlock(&iq->lock);
+}
+
+static inline void
+sdp_ring_doorbell(struct sdp_device *sdpvf __rte_unused,
+   struct sdp_instr_queue *iq)
+{
+   otx2_write64(iq->fill_cnt, iq->doorbell_reg);
+
+   /* Make sure doorbell writes observed by HW */
+   rte_cio_wmb();
+   iq->fill_cnt = 0;
+
+}
+
+static inline int
+post_iqcmd(struct sdp_instr_queue *iq, uint8_t *iqcmd)
+{
+   uint8_t *iqptr, cmdsize;
+
+   /* This ensures that the read index does not wrap around to
+* the same position if queue gets full before OCTEON TX2 could
+* fetch any instr.
+*/
+   if (rte_atomic64_read(&iq->instr_pending) >=
+ (int32_t)(iq->nb_desc - 1)) {
+   otx2_err("IQ is full, pending:%ld",
+(long)rte_atomic64_read(&iq->instr_pending));
+
+   return SDP_IQ_SEND_FAILED;
+   }
+
+   /* Copy cmd into iq */
+   cmdsize = ((iq->iqcmd_64B) ? 64 : 32);
+   iqptr   = iq->base_addr + (cmdsize * iq->host_write_index);
+
+   rte_memcpy(iqptr, iqcmd, cmdsize);
+
+   otx2_sdp_dbg("IQ cmd posted @ index:%d", iq->host_write_index);
+
+   /* Increment the host write index */
+   iq->host_write_index =
+   sdp_incr_index(iq->host_write_index, 1, iq->nb_desc);
+
+   iq->fill_cnt++;
+
+   /* Flush the command into memory. We need to be sure the data
+* is in memory before indicating that the instruction is
+* pending.
+*/
+   rte_smp_wmb();
+   rte_atomic64_inc(&iq->instr_pending);
+
+   /* SDP_IQ_SEND_SUCCESS */
+   return 0;
+}
+
+
+static int
+sdp_send_data(struct sdp_device *sdpvf,
+ struct sdp_instr_queue *iq, void *cmd)
+{
+   uint32_t ret;
+
+   /* Lock this IQ command queue before posting instruction */
+   

[dpdk-dev] [PATCH v4 5/6] raw/octeontx2_ep: add dequeue operation

2020-01-07 Thread Mahipal Challa
Add rawdev dequeue operation for SDP VF devices.

Signed-off-by: Mahipal Challa 
---
 drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c | 197 ++
 drivers/raw/octeontx2_ep/otx2_ep_enqdeq.h |   2 +
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.c |   1 +
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.h |  18 ++-
 4 files changed, 217 insertions(+), 1 deletion(-)

diff --git a/drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c 
b/drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c
index 87ca3cd..b9287a5 100644
--- a/drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c
+++ b/drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c
@@ -260,6 +260,7 @@
rte_mempool_get(sdpvf->enqdeq_mpool, &buf);
if (buf == NULL) {
otx2_err("OQ buffer alloc failed");
+   droq->stats.rx_alloc_failure++;
/* sdp_droq_destroy_ring_buffers(droq);*/
return -ENOMEM;
}
@@ -645,3 +646,199 @@
return SDP_IQ_SEND_FAILED;
 }
 
+static uint32_t
+sdp_droq_refill(struct sdp_device *sdpvf, struct sdp_droq *droq)
+{
+   struct sdp_droq_desc *desc_ring;
+   uint32_t desc_refilled = 0;
+   void *buf = NULL;
+
+   desc_ring = droq->desc_ring;
+
+   while (droq->refill_count && (desc_refilled < droq->nb_desc)) {
+   /* If a valid buffer exists (happens if there is no dispatch),
+* reuse the buffer, else allocate.
+*/
+   if (droq->recv_buf_list[droq->refill_idx].buffer != NULL)
+   break;
+
+   rte_mempool_get(sdpvf->enqdeq_mpool, &buf);
+   /* If a buffer could not be allocated, no point in
+* continuing
+*/
+   if (buf == NULL) {
+   droq->stats.rx_alloc_failure++;
+   break;
+   }
+
+   droq->recv_buf_list[droq->refill_idx].buffer = buf;
+   desc_ring[droq->refill_idx].buffer_ptr = rte_mem_virt2iova(buf);
+
+   /* Reset any previous values in the length field. */
+   droq->info_list[droq->refill_idx].length = 0;
+
+   droq->refill_idx = sdp_incr_index(droq->refill_idx, 1,
+   droq->nb_desc);
+
+   desc_refilled++;
+   droq->refill_count--;
+
+   }
+
+   return desc_refilled;
+}
+
+static int
+sdp_droq_read_packet(struct sdp_device *sdpvf __rte_unused,
+struct sdp_droq *droq,
+struct sdp_droq_pkt *droq_pkt)
+{
+   struct sdp_droq_info *info;
+   uint32_t total_len = 0;
+   uint32_t pkt_len = 0;
+
+   info = &droq->info_list[droq->read_idx];
+   sdp_swap_8B_data((uint64_t *)&info->length, 1);
+   if (!info->length) {
+   otx2_err("OQ info_list->length[%ld]", (long)info->length);
+   goto oq_read_fail;
+   }
+
+   /* Deduce the actual data size */
+   info->length -= SDP_RH_SIZE;
+   total_len += (uint32_t)info->length;
+
+   otx2_sdp_dbg("OQ: pkt_len[%ld], buffer_size %d",
+   (long)info->length, droq->buffer_size);
+   if (info->length > droq->buffer_size) {
+   otx2_err("This mode is not supported: pkt_len > buffer_size");
+   goto oq_read_fail;
+   }
+
+   if (info->length <= droq->buffer_size) {
+   pkt_len = (uint32_t)info->length;
+   droq_pkt->data = droq->recv_buf_list[droq->read_idx].buffer;
+   droq_pkt->len  = pkt_len;
+
+   droq->recv_buf_list[droq->read_idx].buffer = NULL;
+   droq->read_idx = sdp_incr_index(droq->read_idx, 1,/* count */
+   droq->nb_desc /* max rd idx */);
+   droq->refill_count++;
+
+   }
+
+   info->length = 0;
+
+   return SDP_OQ_RECV_SUCCESS;
+
+oq_read_fail:
+   return SDP_OQ_RECV_FAILED;
+}
+
+static inline uint32_t
+sdp_check_droq_pkts(struct sdp_droq *droq, uint32_t burst_size)
+{
+   uint32_t min_pkts = 0;
+   uint32_t new_pkts;
+   uint32_t pkt_count;
+
+   /* Latest available OQ packets */
+   pkt_count = rte_read32(droq->pkts_sent_reg);
+
+   /* Newly arrived packets */
+   new_pkts = pkt_count - droq->last_pkt_count;
+   otx2_sdp_dbg("Recvd [%d] new OQ pkts", new_pkts);
+
+   min_pkts = (new_pkts > burst_size) ? burst_size : new_pkts;
+   if (min_pkts) {
+   rte_atomic64_add(&droq->pkts_pending, min_pkts);
+   /* Back up the aggregated packet count so far */
+   droq->last_pkt_count += min_pkts;
+   }
+
+   return min_pkts;
+}
+
+/* Check for response arrival from OCTEON TX2
+ * returns number of requests completed
+ */
+int
+sdp_rawdev_dequeue(struct rte_rawdev *rawdev,
+  struct rte_rawdev_buf **buffers, unsigned int count,
+  rte_rawdev_obj_t context __rte_unuse

[dpdk-dev] [PATCH v4 3/6] raw/octeontx2_ep: add device uninitialization

2020-01-07 Thread Mahipal Challa
Add rawdev close/uninitialize operation for SDP
VF devices.

Signed-off-by: Mahipal Challa 
---
 drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c | 111 ++
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.c |  78 +
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.h |   8 +++
 drivers/raw/octeontx2_ep/otx2_ep_vf.c |  44 
 4 files changed, 241 insertions(+)

diff --git a/drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c 
b/drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c
index 8857004..584b818 100644
--- a/drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c
+++ b/drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c
@@ -21,6 +21,59 @@
 #include "otx2_common.h"
 #include "otx2_ep_enqdeq.h"
 
+static void
+sdp_dmazone_free(const struct rte_memzone *mz)
+{
+   const struct rte_memzone *mz_tmp;
+   int ret = 0;
+
+   if (mz == NULL) {
+   otx2_err("Memzone %s : NULL", mz->name);
+   return;
+   }
+
+   mz_tmp = rte_memzone_lookup(mz->name);
+   if (mz_tmp == NULL) {
+   otx2_err("Memzone %s Not Found", mz->name);
+   return;
+   }
+
+   ret = rte_memzone_free(mz);
+   if (ret)
+   otx2_err("Memzone free failed : ret = %d", ret);
+
+}
+
+/* Free IQ resources */
+int
+sdp_delete_iqs(struct sdp_device *sdpvf, uint32_t iq_no)
+{
+   struct sdp_instr_queue *iq;
+
+   iq = sdpvf->instr_queue[iq_no];
+   if (iq == NULL) {
+   otx2_err("Invalid IQ[%d]\n", iq_no);
+   return -ENOMEM;
+   }
+
+   rte_free(iq->req_list);
+   iq->req_list = NULL;
+
+   if (iq->iq_mz) {
+   sdp_dmazone_free(iq->iq_mz);
+   iq->iq_mz = NULL;
+   }
+
+   rte_free(sdpvf->instr_queue[iq_no]);
+   sdpvf->instr_queue[iq_no] = NULL;
+
+   sdpvf->num_iqs--;
+
+   otx2_info("IQ[%d] is deleted", iq_no);
+
+   return 0;
+}
+
 /* IQ initialization */
 static int
 sdp_init_instr_queue(struct sdp_device *sdpvf, int iq_no)
@@ -126,6 +179,7 @@
return 0;
 
 delete_IQ:
+   sdp_delete_iqs(sdpvf, iq_no);
return -ENOMEM;
 }
 
@@ -139,6 +193,61 @@
rte_atomic64_set(&droq->pkts_pending, 0);
 }
 
+static void
+sdp_droq_destroy_ring_buffers(struct sdp_device *sdpvf,
+   struct sdp_droq *droq)
+{
+   uint32_t idx;
+
+   for (idx = 0; idx < droq->nb_desc; idx++) {
+   if (droq->recv_buf_list[idx].buffer) {
+   rte_mempool_put(sdpvf->enqdeq_mpool,
+   droq->recv_buf_list[idx].buffer);
+
+   droq->recv_buf_list[idx].buffer = NULL;
+   }
+   }
+
+   sdp_droq_reset_indices(droq);
+}
+
+/* Free OQs resources */
+int
+sdp_delete_oqs(struct sdp_device *sdpvf, uint32_t oq_no)
+{
+   struct sdp_droq *droq;
+
+   droq = sdpvf->droq[oq_no];
+   if (droq == NULL) {
+   otx2_err("Invalid droq[%d]", oq_no);
+   return -ENOMEM;
+   }
+
+   sdp_droq_destroy_ring_buffers(sdpvf, droq);
+   rte_free(droq->recv_buf_list);
+   droq->recv_buf_list = NULL;
+
+   if (droq->info_mz) {
+   sdp_dmazone_free(droq->info_mz);
+   droq->info_mz = NULL;
+   }
+
+   if (droq->desc_ring_mz) {
+   sdp_dmazone_free(droq->desc_ring_mz);
+   droq->desc_ring_mz = NULL;
+   }
+
+   memset(droq, 0, SDP_DROQ_SIZE);
+
+   rte_free(sdpvf->droq[oq_no]);
+   sdpvf->droq[oq_no] = NULL;
+
+   sdpvf->num_oqs--;
+
+   otx2_info("OQ[%d] is deleted", oq_no);
+   return 0;
+}
+
 static int
 sdp_droq_setup_ring_buffers(struct sdp_device *sdpvf,
struct sdp_droq *droq)
@@ -290,5 +399,7 @@
return 0;
 
 delete_OQ:
+   sdp_delete_oqs(sdpvf, oq_no);
return -ENOMEM;
 }
+
diff --git a/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c 
b/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
index 0c56609..3db5a74 100644
--- a/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
+++ b/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
@@ -63,6 +63,45 @@
 }
 
 static int
+sdp_vfdev_exit(struct rte_rawdev *rawdev)
+{
+   struct sdp_device *sdpvf;
+   uint32_t rawdev_queues, q;
+
+   otx2_info("%s:", __func__);
+
+   sdpvf = (struct sdp_device *)rawdev->dev_private;
+
+   sdpvf->fn_list.disable_io_queues(sdpvf);
+
+   rawdev_queues = sdpvf->num_oqs;
+   for (q = 0; q < rawdev_queues; q++) {
+   if (sdp_delete_oqs(sdpvf, q)) {
+   otx2_err("Failed to delete OQ:%d", q);
+   return -ENOMEM;
+   }
+   }
+   otx2_info("Num OQs:%d freed", sdpvf->num_oqs);
+
+   /* Free the oqbuf_pool */
+   rte_mempool_free(sdpvf->enqdeq_mpool);
+   sdpvf->enqdeq_mpool = NULL;
+
+   otx2_info("Enqdeq_mpool free done");
+
+   rawdev_queues = sdpvf->num_iqs;
+   for (q = 0; q < rawdev_queues; q++) {
+   if (sdp_delete_iqs(sdpvf, q)) {
+

[dpdk-dev] [PATCH v4 6/6] raw/octeontx2_ep: add driver self test

2020-01-07 Thread Mahipal Challa
Add rawdev's selftest feature in SDP VF driver, which
verifies the EP mode functionality test.

Signed-off-by: Mahipal Challa 
---
 doc/guides/rawdevs/octeontx2_ep.rst   |  13 +++
 drivers/raw/octeontx2_ep/Makefile |   1 +
 drivers/raw/octeontx2_ep/meson.build  |   1 +
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.c |   1 +
 drivers/raw/octeontx2_ep/otx2_ep_rawdev.h |   2 +
 drivers/raw/octeontx2_ep/otx2_ep_test.c   | 173 ++
 6 files changed, 191 insertions(+)

diff --git a/doc/guides/rawdevs/octeontx2_ep.rst 
b/doc/guides/rawdevs/octeontx2_ep.rst
index 39a7c29..bbcf530 100644
--- a/doc/guides/rawdevs/octeontx2_ep.rst
+++ b/doc/guides/rawdevs/octeontx2_ep.rst
@@ -74,3 +74,16 @@ Performing Data Transfer
 To perform data transfer using SDP VF EP rawdev devices use standard
 ``rte_rawdev_enqueue_buffers()`` and ``rte_rawdev_dequeue_buffers()`` APIs.
 
+Self test
+-
+
+On EAL initialization, SDP VF devices will be probed and populated into the
+raw devices. The rawdev ID of the device can be obtained using
+
+* Invoke ``rte_rawdev_get_dev_id("SDPEP:x")`` from the test application
+  where x is the VF device's bus id specified in "bus:device.func"(BDF)
+  format. Use this index for further rawdev function calls.
+
+* The driver's selftest rawdev API can be used to verify the SDP EP mode
+  functional tests which can send/receive the raw data packets to/from the
+  EP device.
diff --git a/drivers/raw/octeontx2_ep/Makefile 
b/drivers/raw/octeontx2_ep/Makefile
index 02853fb..44fdf89 100644
--- a/drivers/raw/octeontx2_ep/Makefile
+++ b/drivers/raw/octeontx2_ep/Makefile
@@ -37,6 +37,7 @@ LIBABIVER := 1
 #
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_EP_RAWDEV) += otx2_ep_rawdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_EP_RAWDEV) += otx2_ep_enqdeq.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_EP_RAWDEV) += otx2_ep_test.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_OCTEONTX2_EP_RAWDEV) += otx2_ep_vf.c
 
 
diff --git a/drivers/raw/octeontx2_ep/meson.build 
b/drivers/raw/octeontx2_ep/meson.build
index 99e6c6d..0e6338f 100644
--- a/drivers/raw/octeontx2_ep/meson.build
+++ b/drivers/raw/octeontx2_ep/meson.build
@@ -5,4 +5,5 @@
 deps += ['bus_pci', 'common_octeontx2', 'rawdev']
 sources = files('otx2_ep_rawdev.c',
'otx2_ep_enqdeq.c',
+   'otx2_ep_test.c',
'otx2_ep_vf.c')
diff --git a/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c 
b/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
index 7158b97..0778603 100644
--- a/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
+++ b/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
@@ -253,6 +253,7 @@
.dev_close  = sdp_rawdev_close,
.enqueue_bufs   = sdp_rawdev_enqueue,
.dequeue_bufs   = sdp_rawdev_dequeue,
+   .dev_selftest   = sdp_rawdev_selftest,
 };
 
 static int
diff --git a/drivers/raw/octeontx2_ep/otx2_ep_rawdev.h 
b/drivers/raw/octeontx2_ep/otx2_ep_rawdev.h
index a77cbab..dab2fb7 100644
--- a/drivers/raw/octeontx2_ep/otx2_ep_rawdev.h
+++ b/drivers/raw/octeontx2_ep/otx2_ep_rawdev.h
@@ -494,4 +494,6 @@ int sdp_rawdev_enqueue(struct rte_rawdev *dev, struct 
rte_rawdev_buf **buffers,
 int sdp_rawdev_dequeue(struct rte_rawdev *dev, struct rte_rawdev_buf **buffers,
   unsigned int count, rte_rawdev_obj_t context);
 
+int sdp_rawdev_selftest(uint16_t dev_id);
+
 #endif /* _OTX2_EP_RAWDEV_H_ */
diff --git a/drivers/raw/octeontx2_ep/otx2_ep_test.c 
b/drivers/raw/octeontx2_ep/otx2_ep_test.c
new file mode 100644
index 000..9746840
--- /dev/null
+++ b/drivers/raw/octeontx2_ep/otx2_ep_test.c
@@ -0,0 +1,173 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2019 Marvell International Ltd.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include "otx2_common.h"
+#include "otx2_ep_rawdev.h"
+
+#define SDP_IOQ_NUM_BUFS   (4 * 1024)
+#define SDP_IOQ_BUF_SIZE   (2 * 1024)
+
+#define SDP_TEST_PKT_FSZ   (0)
+#define SDP_TEST_PKT_SIZE  (1024)
+
+static int
+sdp_validate_data(struct sdp_droq_pkt *oq_pkt, uint8_t *iq_pkt,
+ uint32_t pkt_len)
+{
+   if (!oq_pkt)
+   return -EINVAL;
+
+   if (pkt_len != oq_pkt->len) {
+   otx2_err("Invalid packet length");
+   return -EINVAL;
+   }
+
+   if (memcmp(oq_pkt->data, iq_pkt, pkt_len) != 0) {
+   otx2_err("Data validation failed");
+   return -EINVAL;
+   }
+   otx2_sdp_dbg("Data validation successful");
+
+   return 0;
+}
+
+static void
+sdp_ioq_buffer_fill(uint8_t *addr, uint32_t len)
+{
+   uint32_t idx;
+
+   memset(addr, 0, len);
+
+   for (idx = 0; idx < len; idx++)
+   addr[idx] = idx;
+}
+
+static struct rte_mempool*
+sdp_ioq_mempool_create(void)
+{
+   struct rte_mempool *mpool;
+
+   mpool = rte_mempool_create("ioqbuf_pool",
+  SDP_IOQ_NUM_BUFS /*num elt*/,
+  SDP_IOQ_BUF

Re: [dpdk-dev] [PATCH] net/mlx5: fix incorrect pointer operation

2020-01-07 Thread Thomas Monjalon
Hi,

How this issue was seen? Is it related to the bug reported by Tonghao Zhang?
If yes, you may use "Reported-by:".

One comment about the title, please describe which area is fixed,
instead of "incorrect pointer operation" which is very generic.
The title should probably include the word "meter".
Then in the explanations below, please explain what is the impact of the bug.

Thanks

07/01/2020 09:56, Suanming Mou:
> The meter suffix flow item pointer restore is not correct to minus a
> fixed value. It should minus the real offset it increases.
> 
> Set the value to the real offset the pointer increases to fix the issue.
> 
> Fixes: 9ea9b049a960 ("net/mlx5: split meter flow")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Suanming Mou 
> Tested-by: Tonghao Zhang 





Re: [dpdk-dev] [PATCH 12/14] examples/ipsec-secgw: add driver outbound worker

2020-01-07 Thread Ananyev, Konstantin
> > > > > This patch adds the driver outbound worker thread for ipsec-secgw.
> > > > > In this mode the security session is a fixed one and sa update is
> > > > > not done.
> > > > >
> > > > > Signed-off-by: Ankur Dwivedi 
> > > > > Signed-off-by: Anoob Joseph 
> > > > > Signed-off-by: Lukasz Bartosik 
> > > > > ---
> > > > >  examples/ipsec-secgw/ipsec-secgw.c  | 12 +
> > > > >  examples/ipsec-secgw/ipsec.c|  9 
> > > > >  examples/ipsec-secgw/ipsec_worker.c | 90
> > > > > -
> > > > >  3 files changed, 110 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/examples/ipsec-secgw/ipsec-secgw.c
> > > > > b/examples/ipsec-secgw/ipsec-secgw.c
> > > > > index 2e7d4d8..76719f2 100644
> > > > > --- a/examples/ipsec-secgw/ipsec-secgw.c
> > > > > +++ b/examples/ipsec-secgw/ipsec-secgw.c
> > > > > @@ -2011,6 +2011,18 @@ cryptodevs_init(void)
> > > > >   i++;
> > > > >   }
> > > > >
> > > > > + /*
> > > > > +  * Set the queue pair to at least the number of ethernet
> > > > > +  * devices for inline outbound.
> > > > > +  */
> > > > > + qp = RTE_MAX(rte_eth_dev_count_avail(), qp);
> > > >
> > > >
> > > > Not sure, what for?
> > > > Why we can't process packets from several eth devs on the same
> > > > crypto-dev queue?
> > >
> > > [Anoob] This is because of a limitation in our hardware. In our
> > > hardware, it's the crypto queue pair which would be submitting to the
> > > ethernet queue for Tx. But in DPDK spec, the security processing is
> > > done by the ethernet PMD Tx routine alone. We manage to do this by sharing
> > the crypto queue internally. The crypto queues initialized during
> > crypto_configure() gets mapped to various ethernet ports. Because of this, 
> > we
> > need to have atleast as many crypto queues as the number of eth ports.
> >
> > Ok, but that breaks current behavior.
> > Right now in poll-mode it is possible to map traffic from N eth-devs to M 
> > crypto-
> > devs (N>= M, by using M lcores).
> > Would prefer to keep this functionality in place.
> 
> [Anoob] Understood. I don't think that functionality is broken. If the number 
> of qps available is lower than the number of eth devs,
> then only the ones available would be enabled. Inline protocol session for 
> the other eth devs would fail for us.
> 
> Currently, the app assumes that for one core, it needs only one qp (and for M 
> core, M qp). Is there any harm in enabling all qps
> available? If such a change can be done, that would also work for us.

Hmm, I suppose it could cause some problems with some corner-cases:
if we'll have crypto-dev with really big number of max_queues.
In that case it might require a lot of extra memory for 
cryptodev_configure/queue_pair_setup.
Probably the easiest way to deal with it:
- add req_queue_num parameter for cryptodevs_init()
   And then do: qp =RTE_MIN(max_nb_qps, RTE_MAX(req_queue_num, qp));
 - for poll mode we'll call cryptodevs_init(0), for your case it could be
   cryptodevs_init(rte_eth_dev_count_avail()).

Would it work for your case?

> >
> > >
> > > The above change is required because here we limit the number of
> > > crypto qps based on the number of cores etc. So when tried on single 
> > > core, the
> > qps get limited to 1, which causes session_create() to fail for all ports 
> > other than
> > the first one.
> > >
> > > >
> > > > > +
> > > > > + /*
> > > > > +  * The requested number of queues should never exceed
> > > > > +  * the max available
> > > > > +  */
> > > > > + qp = RTE_MIN(qp, max_nb_qps);
> > > > > +
> > > > >   if (qp == 0)
> > > > >   continue;
> > > > >
> > > > > diff --git a/examples/ipsec-secgw/ipsec.c
> > > > > b/examples/ipsec-secgw/ipsec.c index e529f68..9ff8a63 100644
> > > > > --- a/examples/ipsec-secgw/ipsec.c
> > > > > +++ b/examples/ipsec-secgw/ipsec.c
> > > > > @@ -141,6 +141,10 @@ create_lookaside_session(struct ipsec_ctx
> > > > *ipsec_ctx, struct ipsec_sa *sa,
> > > > >   return 0;
> > > > >  }
> > > > >
> > > > > +uint16_t sa_no;
> > > > > +#define MAX_FIXED_SESSIONS   10
> > > > > +struct rte_security_session
> > > > > +*sec_session_fixed[MAX_FIXED_SESSIONS];
> > > > > +
> > > > >  int
> > > > >  create_inline_session(struct socket_ctx *skt_ctx, struct ipsec_sa 
> > > > > *sa,
> > > > >   struct rte_ipsec_session *ips)
> > > > > @@ -401,6 +405,11 @@ create_inline_session(struct socket_ctx
> > > > > *skt_ctx, struct ipsec_sa *sa,
> > > > >
> > > > >   ips->security.ol_flags = sec_cap->ol_flags;
> > > > >   ips->security.ctx = sec_ctx;
> > > > > + if (sa_no < MAX_FIXED_SESSIONS) {
> > > > > + sec_session_fixed[sa_no] =
> > > > > + ipsec_get_primary_session(sa)-
> > > > >security.ses;
> > > > > +

Re: [dpdk-dev] [PATCH 04/14] examples/ipsec-secgw: add Rx adapter support

2020-01-07 Thread Ananyev, Konstantin



> > > Add Rx adapter support. The event helper init routine will initialize
> > > the Rx adapter according to the configuration. If Rx adapter config is
> > > not present it will generate a default config. It will check the
> > > available eth ports and event queues and map them 1:1. So one eth port
> > > will be connected to one event queue. This way event queue ID could be
> > > used to figure out the port on which a packet came in.
> > >
> > > Signed-off-by: Anoob Joseph 
> > > Signed-off-by: Lukasz Bartosik 
> > > ---
> > >  examples/ipsec-secgw/event_helper.c | 289
> > > +++-
> > >  examples/ipsec-secgw/event_helper.h |  29 
> > >  2 files changed, 317 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/examples/ipsec-secgw/event_helper.c
> > > b/examples/ipsec-secgw/event_helper.c
> > > index d0157f4..f0eca01 100644
> > > --- a/examples/ipsec-secgw/event_helper.c
> > > +++ b/examples/ipsec-secgw/event_helper.c
> > > @@ -4,10 +4,60 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  #include 
> > >
> > >  #include "event_helper.h"
> > >
> > > +static int
> > > +eh_get_enabled_cores(struct rte_bitmap *eth_core_mask) {
> > > + int i;
> > > + int count = 0;
> > > +
> > > + RTE_LCORE_FOREACH(i) {
> > > + /* Check if this core is enabled in core mask*/
> > > + if (rte_bitmap_get(eth_core_mask, i)) {
> > > + /* We have found enabled core */
> > > + count++;
> > > + }
> > > + }
> > > + return count;
> > > +}
> > > +
> > > +static inline unsigned int
> > > +eh_get_next_eth_core(struct eventmode_conf *em_conf) {
> > > + static unsigned int prev_core = -1;
> > > + unsigned int next_core;
> > > +
> > > + /*
> > > +  * Make sure we have at least one eth core running, else the following
> > > +  * logic would lead to an infinite loop.
> > > +  */
> > > + if (eh_get_enabled_cores(em_conf->eth_core_mask) == 0) {
> > > + EH_LOG_ERR("No enabled eth core found");
> > > + return RTE_MAX_LCORE;
> > > + }
> > > +
> > > +get_next_core:
> > > + /* Get the next core */
> > > + next_core = rte_get_next_lcore(prev_core, 0, 1);
> > > +
> > > + /* Check if we have reached max lcores */
> > > + if (next_core == RTE_MAX_LCORE)
> > > + return next_core;
> > > +
> > > + /* Update prev_core */
> > > + prev_core = next_core;
> > > +
> > > + /* Only some cores are marked as eth cores. Skip others */
> > > + if (!(rte_bitmap_get(em_conf->eth_core_mask, next_core)))
> > > + goto get_next_core;
> >
> > Are loops statements forbidden in C now? 😉
> > As a generic comment - too many (unnecessary) gotos in this patch series.
> > It is not uncommon to see 2-3 labels inside the function and bunch gotos to
> > them.
> > Would be good to rework the code a bit to get rid of them.
> 
> [Anoob] Sure. Will rework the code and see if the gotos can be minimized. In 
> this case, it seemed more straightforward to have goto
> instead of the loop. Will recheck anyway.

The code above looks like a classical do {..} while (...); example, no?

> 
> >
> > > +
> > > + return next_core;
> > > +}
> > > +
> > >  static inline unsigned int
> > >  eh_get_next_active_core(struct eventmode_conf *em_conf, unsigned int
> > > prev_core)  { @@ -154,6 +204,87 @@ eh_set_default_conf_link(struct
> > > eventmode_conf *em_conf)  }
> > >


Re: [dpdk-dev] [PATCH 09/14] examples/ipsec-secgw: add eventmode to ipsec-secgw

2020-01-07 Thread Ananyev, Konstantin



> > > > > Add eventmode support to ipsec-secgw. This uses event helper to
> > > > > setup and use the eventmode capabilities. Add driver inbound worker.
> > > > >
> > > > > Example command:
> > > > > ./ipsec-secgw -c 0x1 -w 0002:02:00.0,ipsec_in_max_spi=100 -w
> > > > > 0002:07:00.0  -w 0002:0e:00.0 -w 0002:10:00.1 -- -P -p 0x3 -u 0x1
> > > > > --config "(0,0,0),(1,0,0)" -f a-aes-gcm-msa.cfg --transfer-mode 1
> > > > > --schedule-type 2 --process-mode drv --process-dir in
> > > >
> > > > As  I can see new event mode is totally orthogonal to the existing poll 
> > > > mode.
> > > > Event mode has it is own data-path, and it doesn't reuse any part of
> > > > poll- mode data-path code.
> > > > Plus in event mode many poll-mode options:
> > > > libirary/legacy mode, fragment/reassemble, replay-window, ESN,
> > > > fall-back session, etc.
> > > > are simply ignored.
> > >
> > > [Anoob] The features are not supported with the initial version. But
> > > the features are equally applicable to eventmode and is planned for the 
> > > future.
> > Also, fragment/reassemble, replay-window, ESN, fall-back session etc are not
> > applicable for non-library mode.
> >
> > True, but in poll-mode library-mode support all functionality that 
> > legacy-mode
> > does, plus some extra.
> > Also I still hope that after perf-problems evaluation with NXP we will be 
> > able to
> > safely remove legacy poll-mode.
> >
> > >We can follow the
> > > same logic and allow for an extra arg (which is --transfer-mode).
> > >
> > > > Also as I can read the current code - right now these modes can't be
> > > > mixed and used together.
> > > > User has to use either only event based or poll mode API/devices.
> > >
> > > [Anoob] Same like how we cannot mix library and non-library modes.
> > >
> > > >
> > > > If so, then at least we need a check (and report with error exit)
> > > > for these mutually exclusive option variants.
> > >
> > > [Anoob] Will do that.
> >
> > Ok.
> >
> > > > Probably even better would be to generate two separate binaries Let say:
> > > > ipsec-secgw-event and ipsec-secgw-poll.
> > > > We can still keep the same parent directory, makefile, common src files 
> > > > etc.
> > > > for both.
> > >
> > > [Anoob] I would be inclined to not fork the current application. Do
> > > you see any issues if the same binary could run in both modes. The default
> > behavior would be poll mode (with existing behavior).
> >
> > My main concern here that there will be over-helming number of options (some
> > of which are mutually exclusive) in the same app.
> > So it will be really hard to maintain and use such app.
> > My thought was that it might be cleaner to have two different apps each 
> > with its
> > own set of options.
> >
> 
> [Anoob] Technically event mode would need only one extra argument. The one to 
> specify "scheduling type". The direction can be
> removed (discussed in another thread) and app-mode can be merged with 
> existing single_sa mode.
> 
> And we do want the event-mode to be supporting all features supported by poll 
> mode. Just that we will have to take it up gradually
> (because of the volume of code change).
> 
> Thomas had opposed the idea of forking example applications for event mode. I 
> also agree with him there. Event-mode just
> establishes an alternate way to receive and send packets. Entire IPsec 
> processing can be maintained common.

I didn't talk about forking.
I talked about something like that - keep all code in examples/ipsec-secgw
Probably move event/poll specific code into
examples/ipsec-secgw/poll, examples/ipsec-secgw/event.
Make changes in Makefile, meson.build to generate 2 binaries.
But ok, one extra event-mode specific option doesn't seem that much.
Let's try to keep unified binary and see how it goes.
Konstantin 


Re: [dpdk-dev] [PATCH v3 2/2] ci: add travis ci support for aarch64

2020-01-07 Thread Honnappa Nagarahalli


> >
> > > Add Travis compilation jobs for aarch64. gcc/clang compilations for
> > > static/shared libraries are added.
> > >
> > > Some limitations for current aarch64 Travis support:
> > > 1. Container is used. Huge page is not available due to security reason.
> > > 2. Missing kernel header package in Xenial distribution.
> > >
> > > Solutions to address the limitations:
> > > 1. Not to add unit test for now. And run tests with no-huge in future.
> > > 2. Use Bionic distribution for all aarch64 jobs.
> > >
> > > Signed-off-by: Ruifeng Wang 
> > > Reviewed-by: Gavin Hu 
> > > ---
> >
> > Can't we achieve the same thing by setting
> >
> > arch:
> >   - amd64
> >   - arm64
> >
> > in the build matrix?  Or will that also force the intel builds to use
> > the container infrastructure (in which case the no-huge support needs to be
> fixed)?
> 
> No, container infrastructure will not be imposed to intel builds.
> AFAIN, Travis infrastructure for a specific CPU arch is provided as is, and
> there is no config option to control.
> The problem with just adding 'arch' in build matrix is that RUN_TESTS on
> arm64 is not supported by now (Travis limitation). 'env' with RUN_TESTS will
> fail.
> >
> > One thing I wonder, isn't is possible to use qemu-user to do the amd64
> > unit tests?  Then do we really need some changes to do the native build?
> 
> Do you mean to use qemu-user to do unit tests for non-x86 arch?
> Changes will be needed as well to enable qemu-user to do unit test.
> Since Travis support multi CPU arch, I think native build and test is simpler
> and more natural.
Yes, prefer to run the tests natively as the infrastructure is available and 
will improve further from here.

> 
> > Does it buy us anything *today* given the cost of the hugepage restriction?
> > Will that ever be resolved (I didn't see so from the docs on travis)?
> 
> The hugepage issue has been reported to Travis. I think it will be resolved.
> But no set dates yet.
> >
> > >  .ci/linux-setup.sh | 11 +++
> > >  .travis.yml| 42 +-
> > >  2 files changed, 48 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/.ci/linux-setup.sh b/.ci/linux-setup.sh index
> > > dfb9d4a20..a92978037 100755
> > > --- a/.ci/linux-setup.sh
> > > +++ b/.ci/linux-setup.sh
> > > @@ -3,7 +3,10 @@
> > >  # need to install as 'root' since some of the unit tests won't run
> > > without it  sudo python3 -m pip install --upgrade meson
> > >
> > > -# setup hugepages
> > > -cat /proc/meminfo
> > > -sudo sh -c 'echo 1024 > /proc/sys/vm/nr_hugepages'
> > > -cat /proc/meminfo
> > > +# hugepage settings are skipped on aarch64 due to environment
> > > +limitation if [ "$TRAVIS_ARCH" != "aarch64" ]; then
> > > +# setup hugepages
> > > +cat /proc/meminfo
> > > +sudo sh -c 'echo 1024 > /proc/sys/vm/nr_hugepages'
> > > +cat /proc/meminfo
> > > +fi
> > > diff --git a/.travis.yml b/.travis.yml index 8f90d06f2..980c7605d
> > > 100644
> > > --- a/.travis.yml
> > > +++ b/.travis.yml
> > > @@ -115,6 +115,46 @@ matrix:
> > >apt:
> > >  packages:
> > >- *extra_packages
> > > -
> > > +  - env: DEF_LIB="static"
> > > +arch: arm64
> > > +compiler: gcc
> > > +dist: bionic
> > > +addons:
> > > +  apt:
> > > +packages:
> > > +  - *required_packages
> > > +  - env: DEF_LIB="shared"
> > > +arch: arm64
> > > +compiler: gcc
> > > +dist: bionic
> > > +addons:
> > > +  apt:
> > > +packages:
> > > +  - *required_packages
> > > +  - env: DEF_LIB="static"
> > > +arch: arm64
> > > +dist: bionic
> > > +compiler: clang
> > > +addons:
> > > +  apt:
> > > +packages:
> > > +  - *required_packages
> > > +  - env: DEF_LIB="shared"
> > > +arch: arm64
> > > +dist: bionic
> > > +compiler: clang
> > > +addons:
> > > +  apt:
> > > +packages:
> > > +  - *required_packages
> > > +  - env: DEF_LIB="shared" OPTS="-Denable_kmods=false" BUILD_DOCS=1
> > > +arch: arm64
> > > +compiler: gcc
> > > +dist: bionic
> > > +addons:
> > > +  apt:
> > > +packages:
> > > +  - *required_packages
> > > +  - *doc_packages
> > >
> > >  script: ./.ci/${TRAVIS_OS_NAME}-build.sh
> 



[dpdk-dev] [PATCH] examples/fips_validation: add AES XTS support

2020-01-07 Thread Archana Muniganti
From: Sucharitha Sarananaga 

AES XTS support is added to fips application. Parse test-vectors
from input files, populate AES XTS tests and prepare AES XTS
operations for fips validation.

Signed-off-by: Abed Kamaluddin 
Signed-off-by: Archana Muniganti 
Signed-off-by: Sucharitha Sarananaga 
---
 examples/fips_validation/Makefile  |   1 +
 examples/fips_validation/fips_validation.c |   6 ++
 examples/fips_validation/fips_validation.h |   4 +
 examples/fips_validation/fips_validation_xts.c | 119 +
 examples/fips_validation/main.c|  45 ++
 examples/fips_validation/meson.build   |   1 +
 6 files changed, 176 insertions(+)
 create mode 100644 examples/fips_validation/fips_validation_xts.c

diff --git a/examples/fips_validation/Makefile 
b/examples/fips_validation/Makefile
index 1385e8c..c207d11 100644
--- a/examples/fips_validation/Makefile
+++ b/examples/fips_validation/Makefile
@@ -14,6 +14,7 @@ SRCS-y += fips_validation_cmac.c
 SRCS-y += fips_validation_ccm.c
 SRCS-y += fips_validation_sha.c
 SRCS-y += fips_dev_self_test.c
+SRCS-y += fips_validation_xts.c
 SRCS-y += main.c
 
 # Build using pkg-config variables if possible
diff --git a/examples/fips_validation/fips_validation.c 
b/examples/fips_validation/fips_validation.c
index 07ffa62..ef24b72 100644
--- a/examples/fips_validation/fips_validation.c
+++ b/examples/fips_validation/fips_validation.c
@@ -150,6 +150,12 @@
ret = parse_test_sha_init();
if (ret < 0)
return ret;
+   } else if (strstr(info.vec[i], "XTS")) {
+   algo_parsed = 1;
+   info.algo = FIPS_TEST_ALGO_AES_XTS;
+   ret = parse_test_xts_init();
+   if (ret < 0)
+   return ret;
}
}
 
diff --git a/examples/fips_validation/fips_validation.h 
b/examples/fips_validation/fips_validation.h
index d487fb0..5aee955 100644
--- a/examples/fips_validation/fips_validation.h
+++ b/examples/fips_validation/fips_validation.h
@@ -31,6 +31,7 @@ enum fips_test_algorithms {
FIPS_TEST_ALGO_HMAC,
FIPS_TEST_ALGO_TDES,
FIPS_TEST_ALGO_SHA,
+   FIPS_TEST_ALGO_AES_XTS,
FIPS_TEST_ALGO_MAX
 };
 
@@ -223,6 +224,9 @@ struct fips_test_interim_info {
 parse_test_sha_init(void);
 
 int
+parse_test_xts_init(void);
+
+int
 parser_read_uint8_hex(uint8_t *value, const char *p);
 
 int
diff --git a/examples/fips_validation/fips_validation_xts.c 
b/examples/fips_validation/fips_validation_xts.c
new file mode 100644
index 000..5bb1966
--- /dev/null
+++ b/examples/fips_validation/fips_validation_xts.c
@@ -0,0 +1,119 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (C) 2020 Marvell International Ltd.
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+
+#include "fips_validation.h"
+
+#define MODE_STR   "XTS"
+#define ALGO_STR   "test data for "
+#define OP_STR "State"
+#define KEY_SIZE_STR   "Key Length : "
+
+#define COUNT_STR  "COUNT = "
+#define KEY_STR"Key = "
+#define IV_STR "i = "
+#define PT_STR "PT = "
+#define CT_STR "CT = "
+
+#define OP_ENC_STR "ENCRYPT"
+#define OP_DEC_STR "DECRYPT"
+
+static int
+parse_interim_xts_enc_dec(const char *key,
+   __rte_unused char *text,
+   __rte_unused struct fips_val *val)
+{
+   if (strcmp(key, OP_ENC_STR) == 0)
+   info.op = FIPS_TEST_ENC_AUTH_GEN;
+   else if (strcmp(key, OP_DEC_STR) == 0)
+   info.op = FIPS_TEST_DEC_AUTH_VERIF;
+   else
+   return -1;
+   return 0;
+}
+
+struct fips_test_callback xts_tests_vectors[] = {
+   {KEY_STR, parse_uint8_hex_str, &vec.cipher_auth.key},
+   {IV_STR, parse_uint8_hex_str, &vec.iv},
+   {PT_STR, parse_uint8_hex_str, &vec.pt},
+   {CT_STR, parse_uint8_hex_str, &vec.ct},
+   {NULL, NULL, NULL} /**< end pointer */
+};
+
+struct fips_test_callback xts_tests_interim_vectors[] = {
+   {OP_ENC_STR, parse_interim_xts_enc_dec, NULL},
+   {OP_DEC_STR, parse_interim_xts_enc_dec, NULL},
+   {NULL, NULL, NULL} /**< end pointer */
+};
+
+struct fips_test_callback xts_writeback_callbacks[] = {
+   /** First element is used to pass COUNT string */
+   {COUNT_STR, NULL, NULL},
+   {IV_STR, writeback_hex_str, &vec.iv},
+   {KEY_STR, writeback_hex_str, &vec.cipher_auth.key},
+   {PT_STR, writeback_hex_str, &vec.pt},
+   {CT_STR, writeback_hex_str, &vec.ct},
+   {NULL, NULL, NULL} /**< end pointer */
+};
+
+static int
+parse_test_xts_writeback(struct fips_val *val)
+{
+   if (i

[dpdk-dev] [PATCH] librte_ethdev: extend dpdk api led control to query capability

2020-01-07 Thread Laurent Hardy
In current led control API we have no way to know if a device is able
to handle on/off requests coming from the application.
Knowing if the device is led control capable could be useful to avoid
exchanges between application and kernel.
Using the on/off requests to flag if the device is led control capable
(based on the ENOSUP returned error) is not convenient as such request
can change the led state on device.

This patch adds a new function rte_eth_led_ctrl_capable() that will look
for led_off/on dev ops availability on the related pmd, to know if the
device is able to handle such led control requests (on/off).

Signed-off-by: Laurent Hardy 
---
 doc/guides/nics/features.rst |  5 +
 lib/librte_ethdev/rte_ethdev.c   | 12 
 lib/librte_ethdev/rte_ethdev.h   | 15 +++
 lib/librte_ethdev/rte_ethdev_core.h  |  4 
 lib/librte_ethdev/rte_ethdev_version.map |  1 +
 5 files changed, 37 insertions(+)

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index 8394a6595..012645dc5 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -732,6 +732,11 @@ registers and register size).
 LED
 ---
 
+Interrogates device to know if it is led control capable.
+
+* **[implements] eth_dev_ops**: ``dev_led_ctrl_capable``.
+* **[related]API**: ``rte_eth_led_ctrl_capable()``.
+
 Supports turning on/off a software controllable LED on a device.
 
 * **[implements] eth_dev_ops**: ``dev_led_on``, ``dev_led_off``.
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 6e9cb243e..b259b6b19 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -3612,6 +3612,18 @@ rte_eth_led_off(uint16_t port_id)
return eth_err(port_id, (*dev->dev_ops->dev_led_off)(dev));
 }
 
+int
+rte_eth_led_ctrl_capable(uint16_t port_id)
+{
+   struct rte_eth_dev *dev;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+   dev = &rte_eth_devices[port_id];
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_off, -ENOTSUP);
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_led_on, -ENOTSUP);
+   return 0;
+}
+
 /*
  * Returns index into MAC address array of addr. Use 00:00:00:00:00:00 to find
  * an empty spot.
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 18a9defc2..a5bacd643 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -3204,6 +3204,21 @@ int  rte_eth_led_on(uint16_t port_id);
  */
 int  rte_eth_led_off(uint16_t port_id);
 
+/**
+ * Interrogate the Ethernet device to know if it is led control capable.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
+ * that operation.
+ *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
+ */
+int  __rte_experimental
+rte_eth_led_ctrl_capable(uint16_t port_id);
+
 /**
  * Get current status of the Ethernet link flow control for Ethernet device
  *
diff --git a/lib/librte_ethdev/rte_ethdev_core.h 
b/lib/librte_ethdev/rte_ethdev_core.h
index 7bf97e24e..6cf2a5242 100644
--- a/lib/librte_ethdev/rte_ethdev_core.h
+++ b/lib/librte_ethdev/rte_ethdev_core.h
@@ -388,6 +388,9 @@ typedef int (*eth_dev_led_on_t)(struct rte_eth_dev *dev);
 typedef int (*eth_dev_led_off_t)(struct rte_eth_dev *dev);
 /**< @internal Turn off SW controllable LED on an Ethernet device */
 
+typedef int (*eth_dev_led_ctrl_capable_t)(struct rte_eth_dev *dev);
+/**< @internal Get led control capability on an Ethernet device */
+
 typedef void (*eth_mac_addr_remove_t)(struct rte_eth_dev *dev, uint32_t index);
 /**< @internal Remove MAC address from receive address register */
 
@@ -675,6 +678,7 @@ struct eth_dev_ops {
 
eth_dev_led_on_t   dev_led_on;/**< Turn on LED. */
eth_dev_led_off_t  dev_led_off;   /**< Turn off LED. */
+   eth_dev_led_ctrl_capable_t dev_led_ctrl_capable;  /**< Get led control 
capability. */
 
flow_ctrl_get_tflow_ctrl_get; /**< Get flow control. */
flow_ctrl_set_tflow_ctrl_set; /**< Setup flow control. */
diff --git a/lib/librte_ethdev/rte_ethdev_version.map 
b/lib/librte_ethdev/rte_ethdev_version.map
index a7dacf2cf..776f3b5d6 100644
--- a/lib/librte_ethdev/rte_ethdev_version.map
+++ b/lib/librte_ethdev/rte_ethdev_version.map
@@ -227,4 +227,5 @@ EXPERIMENTAL {
rte_flow_dynf_metadata_mask;
rte_flow_dynf_metadata_register;
rte_eth_dev_set_ptypes;
+   rte_eth_led_ctrl_capable;
 };
-- 
2.20.1



Re: [dpdk-dev] [PATCH v7 02/17] lib/ring: apis to support configurable element size

2020-01-07 Thread Honnappa Nagarahalli

> > > > > +
> > > > > +static __rte_always_inline void enqueue_elems_128(struct
> > > > > +rte_ring *r, uint32_t prod_head, const void *obj_table,
> > > > > +uint32_t n) { unsigned int i; const uint32_t size = r->size;
> > > > > +uint32_t idx = prod_head & r->mask; __uint128_t *ring =
> > > > > +(__uint128_t *)&r[1]; const __uint128_t *obj = (const
> > > > > +__uint128_t *)obj_table; if (likely(idx + n < size)) { for (i =
> > > > > +0; i < (n & ~0x1); i += 2, idx += 2) { ring[idx] = obj[i];
> > > > > +ring[idx + 1] = obj[i + 1];
> > > >
> > > >
> > > > AFAIK, that implies 16B aligned obj_table...
> > > > Would it always be the case?
> > > I am not sure from the compiler perspective.
> > > At least on Arm architecture, unaligned access (address that is
> > > accessed is not aligned to the size of the data element being
> > > accessed) will result in faults or require additional cycles. So, 
> > > aligning on
> 16B should be fine.
> > Further, I would be changing this to use 'rte_int128_t' as '__uint128_t' is
> not defined on 32b systems.
> 
> What I am trying to say: with this code we imply new requirement for elems
The only existing use case in DPDK for 16B is the event ring. The event ring 
already does similar kind of copy (using 'struct rte_event'). So, there is no 
change in expectations for event ring.
For future code, I think this expectation should be fine since it allows for 
optimal code.

> in the ring: when sizeof(elem)==16 it's alignment also has to be at least 16.
> Which from my perspective is not ideal.
Any reasoning?

> Note that for elem sizes > 16 (24, 32), there is no such constraint.
The rest of them need to be aligned on 4B boundary. However, this should not 
affect the existing code.
The code for 8B and 16B is kept as is to ensure the performance is not affected 
for the existing code.

> 
> >
> > >
> > > >
> > > > > +}
> > > > > +switch (n & 0x1) {
> > > > > +case 1:
> > > > > +ring[idx++] = obj[i++];
> > > > > +}
> > > > > +} else {
> > > > > +for (i = 0; idx < size; i++, idx++) ring[idx] = obj[i];
> > > > > +/* Start at the beginning */
> > > > > +for (idx = 0; i < n; i++, idx++) ring[idx] = obj[i]; } }
> > > > > +
> > > > > +/* the actual enqueue of elements on the ring.
> > > > > + * Placed here since identical code needed in both
> > > > > + * single and multi producer enqueue functions.
> > > > > + */
> > > > > +static __rte_always_inline void enqueue_elems(struct rte_ring
> > > > > +*r, uint32_t prod_head, const void
> > > > *obj_table,
> > > > > +uint32_t esize, uint32_t num)
> > > > > +{
> > > > > +uint32_t idx, nr_idx, nr_num;
> > > > > +
> > > > > +/* 8B and 16B copies implemented individually to retain
> > > > > + * the current performance.
> > > > > + */
> > > > > +if (esize == 8)
> > > > > +enqueue_elems_64(r, prod_head, obj_table, num); else if (esize
> > > > > +==
> > > > > +16) enqueue_elems_128(r, prod_head, obj_table, num); else {
> > > > > +/* Normalize to uint32_t */
> > > > > +uint32_t scale = esize / sizeof(uint32_t); nr_num = num *
> > > > > +scale; idx = prod_head & r->mask; nr_idx = idx * scale;
> > > > > +enqueue_elems_32(r, nr_idx, obj_table, nr_num); } }
> > > > > +



Re: [dpdk-dev] [PATCH v2 2/2] net/i40e: support FDIR for L2TPv3 over IP

2020-01-07 Thread Sexton, Rory
Hi Beilei,

I have resolved all comments below in v3 of patch.

Rory
 
> > Adding FDIR support for L2TPv3 over IP header matching and adding a 
> > new customized pctype for l2tpv3 over IP.
> > 
> > Signed-off-by: Rory Sexton 
> > Signed-off-by: Dariusz Jagus 
> > ---
>
> Version change is needed here.
>

Thanks for noting, will fix this. Just getting familiar with the mailing list 
process.

> > diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > index 73ef0b41d..5e6935829 100644
> > --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> > @@ -3954,6 +3954,10 @@ This section lists supported pattern items and 
> > their attributes, if any.
> > 
> >- ``proto_id {unsigned}``: PPP protocol identifier.
> > 
> > +- ``l2tpv3oip``: match L2TPv3 over IP header.
> > +
> > +  - ``session_id {unsigned}``: L2TPv3 over IP session identifier.
> > +
>
> It should not be a part of this patch, which is focus on i40e PMD.
>

Good point.
Moving this change to separate patch where testpmd changes are applied.

> > diff --git a/drivers/net/i40e/i40e_ethdev.h 
> > b/drivers/net/i40e/i40e_ethdev.h index 295ad593b..5b6d43556 100644
> > --- a/drivers/net/i40e/i40e_ethdev.h
> > +++ b/drivers/net/i40e/i40e_ethdev.h
> > @@ -508,24 +508,43 @@ struct i40e_raw_flow {
> > uint32_t length;
> >  };
> > 
> > +/* A structure used to define the input for L2TPv3 over IP flow */ 
> > +struct i40e_l2tpv3oip_flow {
> > +   uint32_t session_id; /* Session ID in big endian. */ };
>
> Is this structure needed?
>

This is unused so will remove it.


Re: [dpdk-dev] [PATCH v7 02/17] lib/ring: apis to support configurable element size

2020-01-07 Thread Ananyev, Konstantin


> 
> > > > > > +
> > > > > > +static __rte_always_inline void enqueue_elems_128(struct
> > > > > > +rte_ring *r, uint32_t prod_head, const void *obj_table,
> > > > > > +uint32_t n) { unsigned int i; const uint32_t size = r->size;
> > > > > > +uint32_t idx = prod_head & r->mask; __uint128_t *ring =
> > > > > > +(__uint128_t *)&r[1]; const __uint128_t *obj = (const
> > > > > > +__uint128_t *)obj_table; if (likely(idx + n < size)) { for (i =
> > > > > > +0; i < (n & ~0x1); i += 2, idx += 2) { ring[idx] = obj[i];
> > > > > > +ring[idx + 1] = obj[i + 1];
> > > > >
> > > > >
> > > > > AFAIK, that implies 16B aligned obj_table...
> > > > > Would it always be the case?
> > > > I am not sure from the compiler perspective.
> > > > At least on Arm architecture, unaligned access (address that is
> > > > accessed is not aligned to the size of the data element being
> > > > accessed) will result in faults or require additional cycles. So, 
> > > > aligning on
> > 16B should be fine.
> > > Further, I would be changing this to use 'rte_int128_t' as '__uint128_t' 
> > > is
> > not defined on 32b systems.
> >
> > What I am trying to say: with this code we imply new requirement for elems
> The only existing use case in DPDK for 16B is the event ring. The event ring 
> already does similar kind of copy (using 'struct rte_event').
> So, there is no change in expectations for event ring.
> For future code, I think this expectation should be fine since it allows for 
> optimal code.
> 
> > in the ring: when sizeof(elem)==16 it's alignment also has to be at least 
> > 16.
> > Which from my perspective is not ideal.
> Any reasoning?

New implicit requirement and inconsistency.
Code like that:

struct ring_elem {uint64_t a, b;};

struct ring_elem elem; 
rte_ring_dequeue_elem(ring, &elem, sizeof(elem));
 
might cause a crash.
While exactly the same code with:

struct ring_elem {uint64_t a, b, c;}; OR struct ring_elem {uint64_t a, b, c, 
d;};

will work ok.

> 
> > Note that for elem sizes > 16 (24, 32), there is no such constraint.
> The rest of them need to be aligned on 4B boundary. However, this should not 
> affect the existing code.
> The code for 8B and 16B is kept as is to ensure the performance is not 
> affected for the existing code.
> 
> >
> > >
> > > >
> > > > >
> > > > > > +}
> > > > > > +switch (n & 0x1) {
> > > > > > +case 1:
> > > > > > +ring[idx++] = obj[i++];
> > > > > > +}
> > > > > > +} else {
> > > > > > +for (i = 0; idx < size; i++, idx++) ring[idx] = obj[i];
> > > > > > +/* Start at the beginning */
> > > > > > +for (idx = 0; i < n; i++, idx++) ring[idx] = obj[i]; } }
> > > > > > +
> > > > > > +/* the actual enqueue of elements on the ring.
> > > > > > + * Placed here since identical code needed in both
> > > > > > + * single and multi producer enqueue functions.
> > > > > > + */
> > > > > > +static __rte_always_inline void enqueue_elems(struct rte_ring
> > > > > > +*r, uint32_t prod_head, const void
> > > > > *obj_table,
> > > > > > +uint32_t esize, uint32_t num)
> > > > > > +{
> > > > > > +uint32_t idx, nr_idx, nr_num;
> > > > > > +
> > > > > > +/* 8B and 16B copies implemented individually to retain
> > > > > > + * the current performance.
> > > > > > + */
> > > > > > +if (esize == 8)
> > > > > > +enqueue_elems_64(r, prod_head, obj_table, num); else if (esize
> > > > > > +==
> > > > > > +16) enqueue_elems_128(r, prod_head, obj_table, num); else {
> > > > > > +/* Normalize to uint32_t */
> > > > > > +uint32_t scale = esize / sizeof(uint32_t); nr_num = num *
> > > > > > +scale; idx = prod_head & r->mask; nr_idx = idx * scale;
> > > > > > +enqueue_elems_32(r, nr_idx, obj_table, nr_num); } }
> > > > > > +



[dpdk-dev] [PATCH v3 1/2] ethdev: add L2TPv3 over IP header to flow API

2020-01-07 Thread Rory Sexton
This patch adds the new flow item RTE_FLOW_ITEM_TYPE_L2TPV3OIP to
flow API to match a L2TPv3 over IP header. This patch supports only
L2TPv3 over IP header format which is different to L2TPv2/L2TPv3
over UDP. The difference in header formats between L2TPv3 over IP
and L2TP over UDP require a separate implementation for each.

Signed-off-by: Rory Sexton 
Signed-off-by: Dariusz Jagus 
---
v3 changes: adding testpmd docs updates
---
 app/test-pmd/cmdline_flow.c | 33 +
 doc/guides/prog_guide/rte_flow.rst  |  8 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +++
 lib/librte_ethdev/rte_flow.c|  1 +
 lib/librte_ethdev/rte_flow.h| 29 +-
 5 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 99dade7d8..72a792d93 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -213,6 +213,8 @@ enum index {
ITEM_TAG,
ITEM_TAG_DATA,
ITEM_TAG_INDEX,
+   ITEM_L2TPV3OIP,
+   ITEM_L2TPV3OIP_SESSION_ID,
 
/* Validate/create actions. */
ACTIONS,
@@ -746,6 +748,7 @@ static const enum index next_item[] = {
ITEM_PPPOE_PROTO_ID,
ITEM_HIGIG2,
ITEM_TAG,
+   ITEM_L2TPV3OIP,
END_SET,
ZERO,
 };
@@ -1030,6 +1033,12 @@ static const enum index item_tag[] = {
ZERO,
 };
 
+static const enum index item_l2tpv3oip[] = {
+   ITEM_L2TPV3OIP_SESSION_ID,
+   ITEM_NEXT,
+   ZERO,
+};
+
 static const enum index next_action[] = {
ACTION_END,
ACTION_VOID,
@@ -2593,6 +2602,22 @@ static const struct token token_list[] = {
 NEXT_ENTRY(ITEM_PARAM_IS)),
.args = ARGS(ARGS_ENTRY(struct rte_flow_item_tag, index)),
},
+   [ITEM_L2TPV3OIP] = {
+   .name = "l2tpv3oip",
+   .help = "match L2TPv3 over IP header",
+   .priv = PRIV_ITEM(L2TPV3OIP,
+ sizeof(struct rte_flow_item_l2tpv3oip)),
+   .next = NEXT(item_l2tpv3oip),
+   .call = parse_vc,
+   },
+   [ITEM_L2TPV3OIP_SESSION_ID] = {
+   .name = "session_id",
+   .help = "session identifier",
+   .next = NEXT(item_l2tpv3oip, NEXT_ENTRY(UNSIGNED), item_param),
+   .args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_l2tpv3oip,
+session_id)),
+   },
+
/* Validate/create actions. */
[ACTIONS] = {
.name = "actions",
@@ -6238,6 +6263,10 @@ flow_item_default_mask(const struct rte_flow_item *item)
break;
case RTE_FLOW_ITEM_TYPE_PPPOE_PROTO_ID:
mask = &rte_flow_item_pppoe_proto_id_mask;
+   break;
+   case RTE_FLOW_ITEM_TYPE_L2TPV3OIP:
+   mask = &rte_flow_item_l2tpv3oip_mask;
+   break;
default:
break;
}
@@ -6327,6 +6356,10 @@ cmd_set_raw_parsed(const struct buffer *in)
case RTE_FLOW_ITEM_TYPE_GENEVE:
size = sizeof(struct rte_flow_item_geneve);
break;
+   case RTE_FLOW_ITEM_TYPE_L2TPV3OIP:
+   size = sizeof(struct rte_flow_item_l2tpv3oip);
+   proto = 0x73;
+   break;
default:
printf("Error - Not supported item\n");
*total_size = 0;
diff --git a/doc/guides/prog_guide/rte_flow.rst 
b/doc/guides/prog_guide/rte_flow.rst
index a254c81ef..d4cef4621 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1336,6 +1336,14 @@ Broadcom switches.
 
 - Default ``mask`` matches classification and vlan.
 
+Item: ``L2TPV3OIP``
+^
+
+Matches a L2TPv3 over IP header.
+
+- ``session_id``: L2TPv3 over IP session identifier.
+- Default ``mask`` matches session_id only.
+
 
 Actions
 ~~~
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst 
b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 73ef0b41d..5e6935829 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3954,6 +3954,10 @@ This section lists supported pattern items and their 
attributes, if any.
 
   - ``proto_id {unsigned}``: PPP protocol identifier.
 
+- ``l2tpv3oip``: match L2TPv3 over IP header.
+
+  - ``session_id {unsigned}``: L2TPv3 over IP session identifier.
+
 Actions list
 
 
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index 87a3e8c4c..4d130be77 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -93,6 +93,7 @@ static const struct rte_flow_desc_data rte_flow_desc_item[] = 
{
MK_FLOW_ITEM(IGMP, sizeof(struct rte_flow_item_igmp)),
MK_FLOW_ITEM(AH, sizeof(struct rt

[dpdk-dev] [PATCH v3 2/2] net/i40e: support FDIR for L2TPv3 over IP

2020-01-07 Thread Rory Sexton
Adding FDIR support for L2TPv3 over IP header matching and adding
a new customized pctype for l2tpv3 over IP.

Signed-off-by: Rory Sexton 
Signed-off-by: Dariusz Jagus 
---
v3 changes:
* removing testpmd doc update as this patch focus only on i40e PMD
* remove unused structure from i40e_ethdev.h
---
 drivers/net/i40e/i40e_ethdev.c | 11 ++-
 drivers/net/i40e/i40e_ethdev.h | 43 +
 drivers/net/i40e/i40e_fdir.c   | 40 +---
 drivers/net/i40e/i40e_flow.c   | 57 ++
 4 files changed, 132 insertions(+), 19 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 5999c964b..80a46916c 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -12351,6 +12351,14 @@ i40e_update_customized_pctype(struct rte_eth_dev *dev, 
uint8_t *pkg,
new_pctype =
i40e_find_customized_pctype(pf,
  I40E_CUSTOMIZED_GTPU);
+   else if (!strcmp(name, "IPV4_L2TPV3"))
+   new_pctype =
+   i40e_find_customized_pctype(pf,
+   I40E_CUSTOMIZED_IPV4_L2TPV3);
+   else if (!strcmp(name, "IPV6_L2TPV3"))
+   new_pctype =
+   i40e_find_customized_pctype(pf,
+   I40E_CUSTOMIZED_IPV6_L2TPV3);
if (new_pctype) {
if (op == RTE_PMD_I40E_PKG_OP_WR_ADD) {
new_pctype->pctype = pctype_value;
@@ -12544,7 +12552,8 @@ i40e_update_customized_ptype(struct rte_eth_dev *dev, 
uint8_t *pkg,
RTE_PTYPE_TUNNEL_GRENAT;
in_tunnel = true;
} else if (!strncasecmp(name, "L2TPV2CTL", 9) ||
-  !strncasecmp(name, "L2TPV2", 6)) {
+  !strncasecmp(name, "L2TPV2", 6) ||
+  !strncasecmp(name, "L2TPV3", 6)) {
ptype_mapping[i].sw_ptype |=
RTE_PTYPE_TUNNEL_L2TP;
in_tunnel = true;
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index 295ad593b..bba2b83b4 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -508,24 +508,38 @@ struct i40e_raw_flow {
uint32_t length;
 };
 
+/* A structure used to define the input for L2TPv3 over IPv4 flow */
+struct i40e_ipv4_l2tpv3oip_flow {
+   struct rte_eth_ipv4_flow ip4;
+   uint32_t session_id; /* Session ID in big endian. */
+};
+
+/* A structure used to define the input for L2TPv3 over IPv6 flow */
+struct i40e_ipv6_l2tpv3oip_flow {
+   struct rte_eth_ipv6_flow ip6;
+   uint32_t session_id; /* Session ID in big endian. */
+};
+
 /*
  * A union contains the inputs for all types of flow
  * items in flows need to be in big endian
  */
 union i40e_fdir_flow {
-   struct rte_eth_l2_flow l2_flow;
-   struct rte_eth_udpv4_flow  udp4_flow;
-   struct rte_eth_tcpv4_flow  tcp4_flow;
-   struct rte_eth_sctpv4_flow sctp4_flow;
-   struct rte_eth_ipv4_flow   ip4_flow;
-   struct rte_eth_udpv6_flow  udp6_flow;
-   struct rte_eth_tcpv6_flow  tcp6_flow;
-   struct rte_eth_sctpv6_flow sctp6_flow;
-   struct rte_eth_ipv6_flow   ipv6_flow;
-   struct i40e_gtp_flow   gtp_flow;
-   struct i40e_gtp_ipv4_flow  gtp_ipv4_flow;
-   struct i40e_gtp_ipv6_flow  gtp_ipv6_flow;
-   struct i40e_raw_flow   raw_flow;
+   struct rte_eth_l2_flow  l2_flow;
+   struct rte_eth_udpv4_flow   udp4_flow;
+   struct rte_eth_tcpv4_flow   tcp4_flow;
+   struct rte_eth_sctpv4_flow  sctp4_flow;
+   struct rte_eth_ipv4_flowip4_flow;
+   struct rte_eth_udpv6_flow   udp6_flow;
+   struct rte_eth_tcpv6_flow   tcp6_flow;
+   struct rte_eth_sctpv6_flow  sctp6_flow;
+   struct rte_eth_ipv6_flowipv6_flow;
+   struct i40e_gtp_flowgtp_flow;
+   struct i40e_gtp_ipv4_flow   gtp_ipv4_flow;
+   struct i40e_gtp_ipv6_flow   gtp_ipv6_flow;
+   struct i40e_raw_flowraw_flow;
+   struct i40e_ipv4_l2tpv3oip_flow ip4_l2tpv3oip_flow;
+   struct i40e_ipv6_l2tpv3oip_flow ip6_l2tpv3oip_flow;
 };
 
 enum i40e_fdir_ip_type {
@@ -542,6 +556,7 @@ struct i40e_fdir_flow_ext {
uint16_t dst_id; /* VF ID, available when is_vf is 1*/
bool inner_ip;   /* If there is inner ip */
enum i40e_fdir_ip_type iip_type; /* ip type for inner ip */
+   enum i40e_fdir_ip_type oip_type; /* ip type for outer ip */
bool customized_pctype; /* If cu

[dpdk-dev] [PATCH] ethdev: rte_eth_dev_callback_unregister() fails with cb_arg == -1

2020-01-07 Thread Ricardo Roldan
The function was checking -1 against the callback data instead of
the given cb_arg parameter.

Signed-off-by: Ricardo Roldan 
---
 lib/librte_ethdev/rte_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 6e9cb243e..aec2d0f70 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -4039,7 +4039,7 @@ rte_eth_dev_callback_unregister(uint16_t port_id,
next = TAILQ_NEXT(cb, next);
 
if (cb->cb_fn != cb_fn || cb->event != event ||
-   (cb->cb_arg != (void *)-1 && cb->cb_arg != cb_arg))
+   (cb_arg != (void *)-1 && cb->cb_arg != cb_arg))
continue;
 
/*
-- 
2.23.0



Re: [dpdk-dev] [PATCH v7 03/17] test/ring: add functional tests for rte_ring_xxx_elem APIs

2020-01-07 Thread Ananyev, Konstantin
> > > Add basic infrastructure to test rte_ring_xxx_elem APIs. Add test
> > > cases for testing burst and bulk tests.
> > >
> > > Signed-off-by: Honnappa Nagarahalli 
> > > Reviewed-by: Gavin Hu 
> > > ---
> > >  app/test/test_ring.c | 466
> > > ---
> > >  app/test/test_ring.h | 203 +++
> > >  2 files changed, 419 insertions(+), 250 deletions(-)  create mode
> > > 100644 app/test/test_ring.h
> > >
> > > diff --git a/app/test/test_ring.c b/app/test/test_ring.c index
> > > aaf1e70ad..e7a8b468b 100644
> > > --- a/app/test/test_ring.c
> > > +++ b/app/test/test_ring.c
> > > @@ -23,11 +23,13 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  #include 
> > >  #include 
> > >  #include 
> > >
> > >  #include "test.h"
> > > +#include "test_ring.h"
> > >
> > >  /*
> > >   * Ring
> > > @@ -67,6 +69,50 @@ static rte_atomic32_t synchro;
> > >
> > >  #define  TEST_RING_FULL_EMTPY_ITER   8
> > >
> > > +static int esize[] = {-1, 4, 8, 16};
> > > +
> > > +static void
> > > +test_ring_mem_init(void *obj, unsigned int count, int esize) {
> > > + unsigned int i;
> > > +
> > > + /* Legacy queue APIs? */
> > > + if (esize == -1)
> > > + for (i = 0; i < count; i++)
> > > + ((void **)obj)[i] = (void *)(unsigned long)i;
> > > + else
> > > + for (i = 0; i < (count * esize / sizeof(uint32_t)); i++)
> > > + ((uint32_t *)obj)[i] = i;
> > > +}
> > > +
> > > +static void
> > > +test_ring_print_test_string(const char *istr, unsigned int api_type,
> > > +int esize) {
> > > + printf("\n%s: ", istr);
> > > +
> > > + if (esize == -1)
> > > + printf("legacy APIs: ");
> > > + else
> > > + printf("elem APIs: element size %dB ", esize);
> > > +
> > > + if (api_type == TEST_RING_IGNORE_API_TYPE)
> > > + return;
> > > +
> > > + if ((api_type & TEST_RING_N) == TEST_RING_N)
> > > + printf(": default enqueue/dequeue: ");
> > > + else if ((api_type & TEST_RING_S) == TEST_RING_S)
> > > + printf(": SP/SC: ");
> > > + else if ((api_type & TEST_RING_M) == TEST_RING_M)
> > > + printf(": MP/MC: ");
> > > +
> > > + if ((api_type & TEST_RING_SL) == TEST_RING_SL)
> > > + printf("single\n");
> > > + else if ((api_type & TEST_RING_BL) == TEST_RING_BL)
> > > + printf("bulk\n");
> > > + else if ((api_type & TEST_RING_BR) == TEST_RING_BR)
> > > + printf("burst\n");
> > > +}
> > > +
> > >  /*
> > >   * helper routine for test_ring_basic
> > >   */
> > > @@ -314,286 +360,203 @@ test_ring_basic(struct rte_ring *r)
> > >   return -1;
> > >  }
> > >
> > > +/*
> > > + * Burst and bulk operations with sp/sc, mp/mc and default (during
> > > +creation)  */
> > >  static int
> > > -test_ring_burst_basic(struct rte_ring *r)
> > > +test_ring_burst_bulk_tests(unsigned int api_type)
> > >  {
> > > + struct rte_ring *r;
> > >   void **src = NULL, **cur_src = NULL, **dst = NULL, **cur_dst = NULL;
> > >   int ret;
> > > - unsigned i;
> > > + unsigned int i, j;
> > > + unsigned int num_elems;
> > >
> > > - /* alloc dummy object pointers */
> > > - src = malloc(RING_SIZE*2*sizeof(void *));
> > > - if (src == NULL)
> > > - goto fail;
> > > -
> > > - for (i = 0; i < RING_SIZE*2 ; i++) {
> > > - src[i] = (void *)(unsigned long)i;
> > > - }
> > > - cur_src = src;
> > > + for (i = 0; i < RTE_DIM(esize); i++) {
> > > + test_ring_print_test_string("Test standard ring", api_type,
> > > + esize[i]);
> > >
> > > - /* alloc some room for copied objects */
> > > - dst = malloc(RING_SIZE*2*sizeof(void *));
> > > - if (dst == NULL)
> > > - goto fail;
> > > + /* Create the ring */
> > > + TEST_RING_CREATE("test_ring_burst_bulk_tests", esize[i],
> > > + RING_SIZE, SOCKET_ID_ANY, 0, r);
> > >
> > > - memset(dst, 0, RING_SIZE*2*sizeof(void *));
> > > - cur_dst = dst;
> > > -
> > > - printf("Test SP & SC basic functions \n");
> > > - printf("enqueue 1 obj\n");
> > > - ret = rte_ring_sp_enqueue_burst(r, cur_src, 1, NULL);
> > > - cur_src += 1;
> > > - if (ret != 1)
> > > - goto fail;
> > > -
> > > - printf("enqueue 2 objs\n");
> > > - ret = rte_ring_sp_enqueue_burst(r, cur_src, 2, NULL);
> > > - cur_src += 2;
> > > - if (ret != 2)
> > > - goto fail;
> > > -
> > > - printf("enqueue MAX_BULK objs\n");
> > > - ret = rte_ring_sp_enqueue_burst(r, cur_src, MAX_BULK, NULL);
> > > - cur_src += MAX_BULK;
> > > - if (ret != MAX_BULK)
> > > - goto fail;
> > > -
> > > - printf("dequeue 1 obj\n");
> > > - ret = rte_ring_sc_dequeue_burst(r, cur_dst, 1, NULL);
> > > - cur_dst += 1;
> > > - if (ret != 1)
> > > - goto fail;
> > > -
> > > - printf("dequeue 2 objs\n");
> > > - ret = rte_ring_sc_dequeue_burst(r, cur_dst, 2, NULL);
> > > - cur_dst += 2;
> > > - if (ret != 2)
> > > - goto fail;
> > > + /* alloc dummy object pointers */
> > > + src =

Re: [dpdk-dev] [PATCH v7 10/17] test/ring: modify single element enq/deq perf test cases

2020-01-07 Thread Ananyev, Konstantin


> > > Add test cases to test rte_ring_xxx_elem APIs for single element
> > > enqueue/dequeue test cases.
> > >
> > > Signed-off-by: Honnappa Nagarahalli 
> > > Reviewed-by: Gavin Hu 
> > > ---
> > >  app/test/test_ring_perf.c | 100
> > > ++
> > >  1 file changed, 80 insertions(+), 20 deletions(-)
> > >
> > > diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
> > > index 6c2aca483..5829718c1 100644
> > > --- a/app/test/test_ring_perf.c
> > > +++ b/app/test/test_ring_perf.c
> > > @@ -13,6 +13,7 @@
> > >  #include 
> > >
> > >  #include "test.h"
> > > +#include "test_ring.h"
> > >
> > >  /*
> > >   * Ring
> > > @@ -41,6 +42,35 @@ struct lcore_pair {
> > >
> > >  static volatile unsigned lcore_count = 0;
> > >
> > > +static void
> > > +test_ring_print_test_string(unsigned int api_type, int esize,
> > > + unsigned int bsz, double value)
> > > +{
> > > + if (esize == -1)
> > > + printf("legacy APIs");
> > > + else
> > > + printf("elem APIs: element size %dB", esize);
> > > +
> > > + if (api_type == TEST_RING_IGNORE_API_TYPE)
> > > + return;
> > > +
> > > + if ((api_type & TEST_RING_N) == TEST_RING_N)
> > > + printf(": default enqueue/dequeue: ");
> > > + else if ((api_type & TEST_RING_S) == TEST_RING_S)
> > > + printf(": SP/SC: ");
> > > + else if ((api_type & TEST_RING_M) == TEST_RING_M)
> > > + printf(": MP/MC: ");
> > > +
> > > + if ((api_type & TEST_RING_SL) == TEST_RING_SL)
> > > + printf("single: ");
> > > + else if ((api_type & TEST_RING_BL) == TEST_RING_BL)
> > > + printf("bulk (size: %u): ", bsz);
> > > + else if ((api_type & TEST_RING_BR) == TEST_RING_BR)
> > > + printf("burst (size: %u): ", bsz);
> > > +
> > > + printf("%.2F\n", value);
> > > +}
> > > +
> > >  / Functions to analyse our core mask to get cores for different
> > > tests ***/
> > >
> > >  static int
> > > @@ -335,32 +365,35 @@ run_on_all_cores(struct rte_ring *r)
> > >   * Test function that determines how long an enqueue + dequeue of a
> > single item
> > >   * takes on a single lcore. Result is for comparison with the bulk 
> > > enq+deq.
> > >   */
> > > -static void
> > > -test_single_enqueue_dequeue(struct rte_ring *r)
> > > +static int
> > > +test_single_enqueue_dequeue(struct rte_ring *r, const int esize,
> > > + const unsigned int api_type)
> > >  {
> > > - const unsigned iter_shift = 24;
> > > - const unsigned iterations = 1< > > - unsigned i = 0;
> > > + int ret;
> > > + const unsigned int iter_shift = 24;
> > > + const unsigned int iterations = 1 << iter_shift;
> > > + unsigned int i = 0;
> > >   void *burst = NULL;
> > >
> > > - const uint64_t sc_start = rte_rdtsc();
> > > - for (i = 0; i < iterations; i++) {
> > > - rte_ring_sp_enqueue(r, burst);
> > > - rte_ring_sc_dequeue(r, &burst);
> > > - }
> > > - const uint64_t sc_end = rte_rdtsc();
> > > + (void)ret;
> >
> > Here, and in few other places, looks redundant.
> The compiler throws an error since 'ret' is assigned a value, but it is not 
> used.

Probably one way to change  TEST_RING_ENQUEUE() from macro
to inline-function returning ret.  

> 
> >
> > > + /* alloc dummy object pointers */
> > > + burst = test_ring_calloc(1, esize);
> > > + if (burst == NULL)
> > > + return -1;
> > >
> > > - const uint64_t mc_start = rte_rdtsc();
> > > + const uint64_t start = rte_rdtsc();
> > >   for (i = 0; i < iterations; i++) {
> > > - rte_ring_mp_enqueue(r, burst);
> > > - rte_ring_mc_dequeue(r, &burst);
> > > + TEST_RING_ENQUEUE(r, burst, esize, 1, ret, api_type);
> > > + TEST_RING_DEQUEUE(r, burst, esize, 1, ret, api_type);
> > >   }
> > > - const uint64_t mc_end = rte_rdtsc();
> > > + const uint64_t end = rte_rdtsc();
> > > +
> > > + test_ring_print_test_string(api_type, esize, 1,
> > > + ((double)(end - start)) / iterations);
> > > +
> > > + rte_free(burst);
> > >
> > > - printf("SP/SC single enq/dequeue: %.2F\n",
> > > - ((double)(sc_end-sc_start)) / iterations);
> > > - printf("MP/MC single enq/dequeue: %.2F\n",
> > > - ((double)(mc_end-mc_start)) / iterations);
> > > + return 0;
> > >  }
> > >
> > >  /*


Re: [dpdk-dev] [PATCH] mem: fix incorrect munmap in error unwind

2020-01-07 Thread Burakov, Anatoly

On 06-Jan-20 8:55 PM, Stephen Hemminger wrote:

The loop to unwind existing mmaps was only unmapping the
first segment.

Also, remove obvious redundant assignment.

Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: anatoly.bura...@intel.com
Cc: sta...@dpdk.org
Signed-off-by: Stephen Hemminger 
---


Acked-by: Anatoly Burakov 

--
Thanks,
Anatoly


Re: [dpdk-dev] [PATCH v2 10/11] examples/l3fwd: add graceful teardown for eventdevice

2020-01-07 Thread Pavan Nikhilesh Bhagavatula


>-Original Message-
>From: Ananyev, Konstantin 
>Sent: Monday, January 6, 2020 5:43 PM
>To: Pavan Nikhilesh Bhagavatula ; Jerin
>Jacob Kollanukkaran ; Kovacevic, Marko
>; Ori Kam ;
>Richardson, Bruce ; Nicolau, Radu
>; Akhil Goyal ;
>Kantecki, Tomasz ; Sunil Kumar Kori
>
>Cc: dev@dpdk.org
>Subject: [EXT] RE: [dpdk-dev] [PATCH v2 10/11] examples/l3fwd: add
>graceful teardown for eventdevice
>
>External Email
>
>--
>
>> >> Add graceful teardown that addresses both event mode and poll
>> >mode.
>> >>
>> >> Signed-off-by: Pavan Nikhilesh 
>> >> ---
>> >>  examples/l3fwd/main.c | 49
>++-
>> >
>> >>  1 file changed, 34 insertions(+), 15 deletions(-)
>> >>
>> >> diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
>> >> index 0ae64dd41..68998f42c 100644
>> >> --- a/examples/l3fwd/main.c
>> >> +++ b/examples/l3fwd/main.c
>> >> @@ -920,7 +920,7 @@ main(int argc, char **argv)
>> >>   struct lcore_conf *qconf;
>> >>   struct rte_eth_dev_info dev_info;
>> >>   struct rte_eth_txconf *txconf;
>> >> - int ret;
>> >> + int i, ret;
>> >>   unsigned nb_ports;
>> >>   uint16_t queueid, portid;
>> >>   unsigned lcore_id;
>> >> @@ -1195,27 +1195,46 @@ main(int argc, char **argv)
>> >>   }
>> >>   }
>> >>
>> >> -
>> >>   check_all_ports_link_status(enabled_port_mask);
>> >>
>> >>   ret = 0;
>> >>   /* launch per-lcore init on every lcore */
>> >>   rte_eal_mp_remote_launch(l3fwd_lkp.main_loop, NULL,
>> >CALL_MASTER);
>> >> - RTE_LCORE_FOREACH_SLAVE(lcore_id) {
>> >> - if (rte_eal_wait_lcore(lcore_id) < 0) {
>> >> - ret = -1;
>> >> - break;
>> >> + if (evt_rsrc->enabled) {
>> >> + for (i = 0; i < evt_rsrc->rx_adptr.nb_rx_adptr; i++)
>> >> + rte_event_eth_rx_adapter_stop(
>> >> + evt_rsrc->rx_adptr.rx_adptr[i]);
>> >> + for (i = 0; i < evt_rsrc->tx_adptr.nb_tx_adptr; i++)
>> >> + rte_event_eth_tx_adapter_stop(
>> >> + evt_rsrc->tx_adptr.tx_adptr[i]);
>> >> +
>> >> + RTE_ETH_FOREACH_DEV(portid) {
>> >> + if ((enabled_port_mask & (1 << portid)) == 0)
>> >> + continue;
>> >> + rte_eth_dev_stop(portid);
>> >>   }
>> >> - }
>> >>
>> >> - /* stop ports */
>> >> - RTE_ETH_FOREACH_DEV(portid) {
>> >> - if ((enabled_port_mask & (1 << portid)) == 0)
>> >> - continue;
>> >> - printf("Closing port %d...", portid);
>> >> - rte_eth_dev_stop(portid);
>> >> - rte_eth_dev_close(portid);
>> >> - printf(" Done\n");
>> >
>> >Why to stop ports *before* making sure all lcores are stopped?
>> >Shouldn't that peace of code be identical for both poll and event
>mode?
>> >Something like:
>> >rte_eal_mp_wait_lcore();
>> >
>> >RTE_ETH_FOREACH_DEV(portid) {
>> >if ((enabled_port_mask & (1 << portid)) == 0)
>> >continue;
>> >rte_eth_dev_stop(portid);
>> >rte_eth_dev_close(portid);
>> >}
>> >?
>> >
>>
>> Event dev spec requires stopping producers before consumers else
>we might run into
>> deadlock in some cases.
>
>Ok... but for TX path wouldn't core be a producer?

Not in all cases, true in case of SW event device and implementation defined in 
HW event devices.

Also both the cases of Tx poll on force_quit in case Tx path fails so that 
cores can exit.

if (flags & L3FWD_EVENT_TX_ENQ) {
ev.queue_id = tx_q_id;
ev.op = RTE_EVENT_OP_FORWARD;
while (rte_event_enqueue_burst(event_d_id, event_p_id,
&ev, 1) && !force_quit)
;
}

if (flags & L3FWD_EVENT_TX_DIRECT) {
rte_event_eth_tx_adapter_txq_set(mbuf, 0);
while (!rte_event_eth_tx_adapter_enqueue(event_d_id,
event_p_id, &ev, 1, 0) &&
!force_quit)
;
}

>Also for that wouldn't rte_event_eth_(rx|tx)_adapter_stop(0 be
>enough?

In case of HW event devices the above might be nop as control lies with 
driver/net.
And since anyway we need to stop ethernet device we might as well do it here.

>I am not familiar with event-dev spec at all, so forgive for possibly dumb
>questions 😉

😊

>
>>
>> >> + rte_eal_mp_wait_lcore();
>> >> + RTE_ETH_FOREACH_DEV(portid) {
>> >> + if ((enabled_port_mask & (1 << portid)) == 0)
>> >> + continue;
>> >> + rte_eth_dev_close(portid);
>> >> + }
>> >> +
>> >> + rte_event_dev_stop(evt_rsrc->even

Re: [dpdk-dev] [PATCH v4 00/17] Introduces net/ionic PMD

2020-01-07 Thread Ferruh Yigit
On 12/19/2019 10:18 PM, Alfredo Cardigliano wrote:
> The patch series provides an initial version of a
> poll mode driver for Pensando network adapters.
> The driver name is ionic.
> 
> v4 Changes:
> --
> - Remove the static list of adapters
> - Disable compilation on unsupported platforms
> - Add BSD-3-Clause to ionic_if.h
> - Add a link to the supported adapters description pages
> - Fix compilation warnings
> - Other minor fixes
> 
> Signed-off-by: Alfredo Cardigliano 
> Reviewed-by: Shannon Nelson 
> 
> Alfredo Cardigliano (17):
>   net/ionic: add skeleton
>   net/ionic: add hardware structures definitions
>   net/ionic: add log
>   net/ionic: register and initialize the adapter
>   net/ionic: add port management commands
>   net/ionic: add basic lif support
>   net/ionic: add doorbells
>   net/ionic: add adminq support
>   net/ionic: add notifyq support
>   net/ionic: add basic port operations
>   net/ionic: add RX filters support
>   net/ionic: add Flow Control support
>   net/ionic: add RX and TX handling
>   net/ionic: add RSS support
>   net/ionic: add stats
>   net/ionic: add TX checksum support
>   net/ionic: read fw version

Hi Alfredo,

There are some build error on patch by patch build, I put comments in individual
patches, the expectation is after each patch the code still should be build
successfully, this is at least needed for git bisect support.

also "./devtools/check-git-log.sh" still complaining on a few commits, can you
please check them?

Thanks,
ferruh



Re: [dpdk-dev] [PATCH v2 01/11] examples/l3fwd: add framework for event device

2020-01-07 Thread Pavan Nikhilesh Bhagavatula
>> >> +struct l3fwd_event_resources {
>> >> + uint8_t sched_type;
>> >> + uint8_t enabled;
>> >> + uint8_t nb_args;
>> >> + char **args;
>> >> +};
>> >> +
>> >> +static inline struct l3fwd_event_resources *
>> >> +l3fwd_get_eventdev_rsrc(void)
>> >> +{
>> >> + static const char name[RTE_MEMZONE_NAMESIZE] =
>> >"l3fwd_event_rsrc";
>> >> + const struct rte_memzone *mz;
>> >> +
>> >> + mz = rte_memzone_lookup(name);
>> >> +
>> >> + if (mz != NULL)
>> >> + return mz->addr;
>> >> +
>> >> + mz = rte_memzone_reserve(name, sizeof(struct
>> >l3fwd_event_resources),
>> >> +  0, 0);
>> >> + if (mz != NULL) {
>> >> + memset(mz->addr, 0, sizeof(struct
>> >l3fwd_event_resources));
>> >> + return mz->addr;
>> >> + }
>> >> +
>> >> + rte_exit(EXIT_FAILURE, "Unable to allocate memory for
>> >eventdev cfg\n");
>> >> +
>> >> + return NULL;
>> >> +}
>> >
>> >Does this function really need to be inline?
>> >It wouldn't be fast anyway.
>> >Another question - do you really need memzone here?
>> >Wouldn't just rte_malloc() be enough?
>>
>> Will remove inline in next version.
>> rte_malloc would call for a global variable which I'm
>> trying to avoid.
>
>If you plan to move that function into .c file,
>you don't need a global var.  it could be static local one.
>
>> I don't think there is any harm in using
>> named memzone.
>
>I don't see any harm, though malloc+var will be faster I think.
>Though up to you - no strong opinion here.

Maybe we can cut down some init time. I will move it to .c next version. 
Thanks,
Pavan


Re: [dpdk-dev] [PATCH v4 01/17] net/ionic: add skeleton

2020-01-07 Thread Ferruh Yigit
On 12/19/2019 10:18 PM, Alfredo Cardigliano wrote:
> Add makefile and config file options to compile the Pensando ionic PMD.
> Add feature and version map file.
> Update maintainers file.
> 
> Signed-off-by: Alfredo Cardigliano 
> Reviewed-by: Shannon Nelson 

<...>

> +++ b/doc/guides/nics/ionic.rst
> @@ -0,0 +1,43 @@
> +..  SPDX-License-Identifier: (BSD-3-Clause OR GPL-2.0)
> +Copyright(c) 2018-2019 Pensando Systems, Inc. All rights reserved.
> +
> +IONIC Driver
> +
> +
> +The ionic driver provides support for Pensando server adapters. 
> +It currently supports the below models:
> +
> +- `Naples DSC-25
> +`_
> +- `Naples DSC-100
> +`_
> +
> +Please visit https://pensando.io for more information.
> +
> +Identifying the Adapter
> +---
> +
> +To find if one or more Pensando PCI Ethernet devices are installed
> +on the host, check for the PCI devices:
> +
> +   .. code-block:: console
> +
> +  lspci -d 1dd8:
> +  b5:00.0 Ethernet controller: Device 1dd8:1002
> +  b6:00.0 Ethernet controller: Device 1dd8:1002
> +
> +Pre-Installation Configuration
> +--
> +
> +The following options can be modified in the ``config`` file.
> +
> +- ``CONFIG_RTE_LIBRTE_IONIC_PMD`` (default ``y``)
> +
> +  Toggle compilation of ionic PMD.
> +
> +Building DPDK
> +-
> +
> +The ionic PMD driver supports UIO and VFIO, please refer to the
> +:ref:`DPDK documentation that comes with the DPDK suite `
> +for instructions on how to build DPDK.

Doc is causing following warning, can you please check it:
.../doc/guides/nics/ionic.rst:10: WARNING: Inline interpreted text or phrase
reference start-string without end-string.
.../doc/guides/nics/ionic.rst:11: WARNING: Bullet list ends without a blank
line; unexpected unindent.


Re: [dpdk-dev] [PATCH v4 04/17] net/ionic: register and initialize the adapter

2020-01-07 Thread Ferruh Yigit
On 12/19/2019 10:18 PM, Alfredo Cardigliano wrote:
> Register the Pensando ionic PMD (net_ionic) and define initial probe
> and remove callbacks with adapter initialization.
> 
> Signed-off-by: Alfredo Cardigliano 
> Reviewed-by: Shannon Nelson 

<...>

> +static int
> +eth_ionic_pci_remove(struct rte_pci_device *pci_dev)
> +{
> + return 0;
> +}

This function causing build error on patch by patch build, because of unused
'pci_dev' variable, it needs to be used as unused:

error: unused parameter 'pci_dev' [-Werror,-Wunused-parameter]

<...>

> +int32_t
> +ionic_set_mac_type(struct ionic_hw *hw)
> +{
> + int err = 0;
> +
> + IONIC_PRINT_CALL();
> +
> + if (hw->vendor_id != IONIC_PENSANDO_VENDOR_ID) {
> + IONIC_PRINT(ERR, "Unsupported vendor id: %" PRIx32 "",
> + hw->vendor_id);

Causing following build error on patch by patch build, need to add "#include
", since ionic_dev.c also fails in this patch, adding include to
ionic.h can fix both.

.../drivers/net/ionic/ionic_mac_api.c:38:47: error: expected ')'
IONIC_PRINT(ERR, "Unsupported vendor id: %" PRIx32 "",
^
.../drivers/net/ionic/ionic_mac_api.c:38:3: note: to match this '('
IONIC_PRINT(ERR, "Unsupported vendor id: %" PRIx32 "",
^
.../drivers/net/ionic/ionic_logs.h:12:49: note: expanded from macro 
'IONIC_PRINT'
#define IONIC_PRINT(level, fmt, args...) rte_log(RTE_LOG_ ## level, \
^



Re: [dpdk-dev] [PATCH v1 1/3] drivers: introduce vDPA class

2020-01-07 Thread Maxime Coquelin
Hi Matan,

On 12/25/19 4:19 PM, Matan Azrad wrote:
> The vDPA (vhost data path acceleration) drivers provide support for
> the vDPA operations introduced by the rte_vhost library.
> 
> Any driver which provides the vDPA operations should be moved\added to
> the vdpa class under drivers/vdpa/.
> 
> Create the general files for vDPA class in drivers and in documentation.
> 
> Signed-off-by: Matan Azrad 
> ---
>  doc/guides/index.rst  |  1 +
>  doc/guides/vdpadevs/index.rst | 13 +
>  drivers/Makefile  |  2 ++
>  drivers/meson.build   |  1 +
>  drivers/vdpa/Makefile |  8 
>  drivers/vdpa/meson.build  |  8 
>  6 files changed, 33 insertions(+)
>  create mode 100644 doc/guides/vdpadevs/index.rst
>  create mode 100644 drivers/vdpa/Makefile
>  create mode 100644 drivers/vdpa/meson.build
> 

Looks good to me. Just wondering if we need a dedicated maintainer for
this new class of devices?

Other than that:
Reviewed-by: Maxime Coquelin 

Thanks,
Maxime



Re: [dpdk-dev] [PATCH] net/ixgbe: fix blocking system events

2020-01-07 Thread Ananyev, Konstantin


Hi Tao,

> Thank you for your advice. Your advice is excellent. I have the same idea 
> with you to solve this problem.
> First.  Consider whether the fallback function of alamer can shorten the time.
> But it seems difficult for the following reasons(If you have a good idea, 
> please let me know):
>   a. The entry of the fallback function is 
> 'ixgbe_setup_mac_link_multispeed_fiber', It's a little more complicated
>   b. This callback function not only has multiple sleeps, but also may 
> have long sleep in other functions it calls
>   c. The function called by the callback function has different 
> implementations for different NIC
>   d. This callback function and other functions it calls are all from 
> ND's shared code. After modification, upgrading will cause
> trouble

Agree, I also don't see a way to overcome that problem
without reworking ixgbe/base code quite significantly.
Which obviously is not an option. 

> 
> So I use a separate thread to replace alarm, This thread is created using 
> wrapper function: rte_ctrl_thread_create().
> A mechanism is provided to create a thread when PF or VF is initialized for 
> the first time. This thread will select tasks from the task list
> in order to execute. If the task list is empty, the thread sleeps.
> Provides interface for adding and deleting tasks.
> The specific implementation is as follows(See patch for detailed 
> implementation):
>   a. Thread create function: ixgbe_task_thread_init, it will call 
> rte_ctrl_thread_create to create thread
>   b. Thread destroy function: ixgbe_task_thread_uninit
>   c. Add task function: ixgbe_add_task, it add a task and wake up the 
> sleeping thread
>   d. Cancel task function: ixgbe_cancel_task, it cancel a task 
> synchronously
> 
> The reason why a task queue is used to manage tasks is that there may be 
> concurrent conflicts, because alarm has no concurrent
> conflicts.

Sorry, I probably wasn't clear in my previous mail.
I realized what you did, but I think it is not a good practice
to add task management thread/code into specific PMD.
If we do need such code, then it probably has to be generic enough and  be in 
EAL.
Though I wonder do we really need it for that case?
Can't we just spawn a thread to execute ixgbe_dev_setup_link_alarm_handler?
I.E just :
rte_ctrl_thread_create(&hw->link_thread, , 
ixgbe_dev_setup_link_alarm_handler, dev);
instead of rte_eal_alarm_set()
And pthread_cancel(&hw->link_thread); pthread_join(hw->link_thread);
Instead of rte_alarm_cancel().

> 
> BR,
> Zhu, Tao
> 
> > -Original Message-
> > From: Ananyev, Konstantin
> > Sent: Monday, December 30, 2019 9:58 PM
> > To: Zhu, TaoX ; konstantin.anan...@intel.com
> > wenzhuo...@intel.com
> > Cc: dev@dpdk.org; sta...@dpdk.org
> > Subject: RE: [PATCH] net/ixgbe: fix blocking system events
> >
> > Hi,
> >
> > > IXGBE link status task use rte alarm thread in old implementation.
> > > Sometime ixgbe link status task takes up to 9 seconds. This will
> > > severely affect the rte-alarm-thread-dependent a task in the system,
> > > like  interrupt or hotplug event. So replace with a independent thread
> > > which has the same thread affinity settings as rte interrupt.
> >
> >
> > I don't think it is a good idea to add into PMD code that creates/manages
> > new threads.
> > I think ideal would be to rework link setup code, so it can take less time
> > (make it interruptable/repeatable).
> > If that is not possible, and control function is the only way, there is a 
> > wrapper
> > function: rte_ctrl_thread_create() that can be used here instead of creating
> > your own framework.
> >
> >
> > >
> > > Fixes: 0408f47b ("net/ixgbe: fix busy polling while fiber link
> > > update")
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Zhu Tao 
> > > ---
> > >  drivers/net/ixgbe/ixgbe_ethdev.c | 184
> > > +--
> > >  drivers/net/ixgbe/ixgbe_ethdev.h |  32 +++
> > >  2 files changed, 210 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > index 2c6fd0f..f0b387d 100644
> > > --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> > > +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> > > @@ -15,6 +15,7 @@
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >
> > >  #include 
> > >  #include 
> > > @@ -378,6 +379,9 @@ static int ixgbe_dev_udp_tunnel_port_del(struct
> > rte_eth_dev *dev,
> > >   struct rte_eth_udp_tunnel *udp_tunnel);  static int
> > > ixgbe_filter_restore(struct rte_eth_dev *dev);  static void
> > > ixgbe_l2_tunnel_conf(struct rte_eth_dev *dev);
> > > +static int ixgbe_task_thread_init(struct rte_eth_dev *dev); static
> > > +void ixgbe_task_thread_uninit(struct rte_eth_dev *dev);
> > > +
> > >
> > >  /*
> > >   * Define VF Stats MACRO for Non "cleared on read" register @@
> > > -1069,6 +1073,171 @@ struct rte_ixgbe_xstats_name_off {  }
> > >
> > >  /*
> > > + * Add a task to tas

Re: [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table

2020-01-07 Thread Maxime Coquelin



On 12/25/19 4:19 PM, Matan Azrad wrote:
> Add vDPA devices features table and explanation.
> 
> Any vDPA driver can add its own supported features by ading a new ini
> file to the features directory in doc/guides/vdpadevs/features.
> 
> Signed-off-by: Matan Azrad 
> ---
>  doc/guides/conf.py|  5 +++
>  doc/guides/vdpadevs/features/default.ini  | 55 ++
>  doc/guides/vdpadevs/features_overview.rst | 65 
> +++
>  doc/guides/vdpadevs/index.rst |  1 +
>  4 files changed, 126 insertions(+)
>  create mode 100644 doc/guides/vdpadevs/features/default.ini
>  create mode 100644 doc/guides/vdpadevs/features_overview.rst
> 
> diff --git a/doc/guides/conf.py b/doc/guides/conf.py
> index 0892c06..c368fa5 100644
> --- a/doc/guides/conf.py
> +++ b/doc/guides/conf.py
> @@ -401,6 +401,11 @@ def setup(app):
>  'Features',
>  'Features availability in compression drivers',
>  'Feature')
> +table_file = dirname(__file__) + '/vdpadevs/overview_feature_table.txt'
> +generate_overview_table(table_file, 1,
> +'Features',
> +'Features availability in vDPA drivers',
> +'Feature')
>  
>  if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
>  print('Upgrade sphinx to version >= 1.3.1 for '
> diff --git a/doc/guides/vdpadevs/features/default.ini 
> b/doc/guides/vdpadevs/features/default.ini
> new file mode 100644
> index 000..a3e0bc7
> --- /dev/null
> +++ b/doc/guides/vdpadevs/features/default.ini
> @@ -0,0 +1,55 @@
> +;
> +; Features of a default vDPA driver.
> +;
> +; This file defines the features that are valid for inclusion in
> +; the other driver files and also the order that they appear in
> +; the features table in the documentation. The feature description
> +; string should not exceed feature_str_len defined in conf.py.
> +;

I think some entries below could be removed for vDPA.

> +[Features]
> +csum =
> +guest csum   =
> +mac  =
> +gso  =
> +guest tso4   =
> +guest tso6   =
> +ecn  =
> +ufo  =
> +host tso4=
> +host tso6=
> +mrg rxbuf=
> +ctrl vq  =
> +ctrl rx  =
> +any layout   =
> +guest announce   =
> +mq   =
> +version 1=
> +log all  =
> +protocol features=
> +indirect desc=
> +event idx=
> +mtu  =
> +in_order =
> +IOMMU platform   =
> +packed   =
> +proto mq =
> +proto log shmfd  =
> +proto rarp   =
> +proto reply ack  =
> +proto slave req  =
> +proto crypto session =
> +proto host notifier  =
> +proto pagefault  =
> +Multiprocess aware   =
> +BSD nic_uio  =
> +Linux UIO=

E.g. UIO, which cannot be used since vDPA requires an IOMMU.

> +Linux VFIO   =
> +Other kdrv   =
> +ARMv7=
> +ARMv8=
> +Power8   =
> +x86-32   =
> +x86-64   =
> +Usage doc=
> +Design doc   =
> +Perf doc =
> \ No newline at end of file
> diff --git a/doc/guides/vdpadevs/features_overview.rst 
> b/doc/guides/vdpadevs/features_overview.rst
> new file mode 100644
> index 000..c7745b7
> --- /dev/null
> +++ b/doc/guides/vdpadevs/features_overview.rst
> @@ -0,0 +1,65 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +Copyright 2019 Mellanox Technologies, Ltd
> +
> +Overview of vDPA drivers features
> +=
> +
> +This section explains the supported features that are listed in the table 
> below.
> +
> +  * csum - Device can handle packets with partial checksum.
> +  * guest csum - Guest can handle packets with partial checksum.
> +  * mac - Device has given MAC address.
> +  * gso - Device can handle packets with any GSO type.
> +  * guest tso4 - Guest can receive TSOv4.
> +  * guest tso6 - Guest can receive TSOv6.
> +  * ecn - Device can receive TSO with ECN.
> +  * ufo - Device can receive UFO.
> +  * host tso4 - Device can receive TSOv4.
> +  * host tso6 - Device can receive TSOv6.
> +  * mrg rxbuf - Guest can merge receive buffers.
> +  * ctrl vq - Control channel is available.
> +  * ctrl rx - Control channel RX mode support.
> +  * any layout - Device can handle any descriptor layout.
> +  * guest announce - Guest can send gratuitous packets.
> +  * mq - Device supports Receive Flow Steering.
> +  * version 1 - v1.0 compliant.
> +  * log all - Device can log all write descriptors (live migration).
> +  * protocol features - Protocol features negotiation support.
> +  * indirect desc - Indirect buffer descriptors support.
> +  * event idx - Support for avail_idx and used_idx fields.
> +

[dpdk-dev] [PATCH] testpmd: call cleanup on exit

2020-01-07 Thread Stephen Hemminger
The rte_eal_cleanup code is not exercised by testpmd which
is the most used DPDK test tool. Add a call at end of program.

This helps exercise free and close paths which can
be checked with tools like valgrind.

Signed-off-by: Stephen Hemminger 
---
 app/test-pmd/testpmd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index b3746822366f..a94ad19c8f5a 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -3570,5 +3570,6 @@ main(int argc, char** argv)
return 1;
}
 
+   rte_eal_cleanup();
return 0;
 }
-- 
2.20.1



[dpdk-dev] Issues reported by valgrind

2020-01-07 Thread Stephen Hemminger
Could who ever maintains the CI system get valgrind integrated.
DPDK leaks a lot of memory and file descriptors but since tools
are not regularly run to test, no one notices.

Valgrind on x86 has issues in that by default DPDK builds with
CPU flags it doesn't support. Running it on QEMU/Arm works.

I ran testpmd and saw a lot of issues to be fixed.
Patches are outstanding for some of these already.

# valgrind --keep-debuginfo=yes --leak-check=yes --show-reachable=yes 
--num-callers=10 --track-fds=yes -- ./build/app/testpmd -n 4 -l 1-2 
--vdev=net_tap0 -- -i
==813== Memcheck, a memory error detector
==813== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==813== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==813== Command: ./build/app/testpmd -n 4 -l 1-2 --vdev=net_tap0 -- -i
==813== 
EAL: Detected 8 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: Probing VFIO support...
==813== Warning: noted but unhandled ioctl 0x3b64 with no size/direction hints.
==813==This could cause spurious value errors to appear.
==813==See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper 
wrapper.
==813== Warning: noted but unhandled ioctl 0x3b65 with no size/direction hints.
==813==This could cause spurious value errors to appear.
==813==See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper 
wrapper.
EAL: VFIO support initialized
==813== Warning: set address range perms: large range [0x10020, 
0x50040) (defined)
==813== Warning: set address range perms: large range [0x50040, 
0x90060) (defined)
==813== Warning: set address range perms: large range [0x90060, 
0xd0080) (defined)
==813== Warning: set address range perms: large range [0x100626d000, 
0x140646d000) (defined)
==813== Warning: set address range perms: large range [0x100626d000, 
0x140646d000) (noaccess)
==813== Warning: set address range perms: large range [0x100626d000, 
0x140646d000) (defined)
==813== Warning: set address range perms: large range [0x100626d000, 
0x140646d000) (noaccess)
==813== Warning: set address range perms: large range [0x100626d000, 
0x140646d000) (defined)
==813== Warning: set address range perms: large range [0x100626d000, 
0x140646d000) (noaccess)
==813== Warning: set address range perms: large range [0x100626d000, 
0x140646d000) (defined)
==813== Warning: set address range perms: large range [0x100626d000, 
0x140646d000) (noaccess)
==813== Warning: set address range perms: large range [0x100626d000, 
0x140646d000) (defined)
==813== Warning: set address range perms: large range [0x100626d000, 
0x140646d000) (noaccess)
==813== Warning: set address range perms: large range [0x100626d000, 
0x140646d000) (defined)
EAL: WARNING! Base virtual address hint (0xd01061000 != 0x100640) not 
respected!
EAL:This may cause issues with mapping memory into secondary processes
==813== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==813==at 0x495EC4C: sendmsg (sendmsg.c:28)
==813==by 0x766E6F: tap_nl_send (in /home/root/dpdk/build/app/testpmd)
==813==by 0x76748B: qdisc_add_multiq (in /home/root/dpdk/build/app/testpmd)
==813==by 0x76755B: qdisc_create_multiq (in 
/home/root/dpdk/build/app/testpmd)
==813==by 0x27F86B: eth_dev_tap_create (in 
/home/root/dpdk/build/app/testpmd)
==813==by 0x75FA6F: rte_pmd_tap_probe (in /home/root/dpdk/build/app/testpmd)
==813==by 0x20BE9F: vdev_probe_all_drivers.part.1 (in 
/home/root/dpdk/build/app/testpmd)
==813==by 0x4DEE7F: vdev_probe (in /home/root/dpdk/build/app/testpmd)
==813==by 0x4ADFE7: rte_bus_probe (in /home/root/dpdk/build/app/testpmd)
==813==by 0x499D1B: rte_eal_init (in /home/root/dpdk/build/app/testpmd)
==813==  Address 0x1fff000251 is on thread 1's stack
==813==  in frame #2, created by qdisc_add_multiq (???:)
==813== 
==813== Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
==813==at 0x495EC4C: sendmsg (sendmsg.c:28)
==813==by 0x766E6F: tap_nl_send (in /home/root/dpdk/build/app/testpmd)
==813==by 0x76751B: qdisc_add_ingress (in /home/root/dpdk/build/app/testpmd)
==813==by 0x7675EB: qdisc_create_ingress (in 
/home/root/dpdk/build/app/testpmd)
==813==by 0x27F8A3: eth_dev_tap_create (in 
/home/root/dpdk/build/app/testpmd)
==813==by 0x75FA6F: rte_pmd_tap_probe (in /home/root/dpdk/build/app/testpmd)
==813==by 0x20BE9F: vdev_probe_all_drivers.part.1 (in 
/home/root/dpdk/build/app/testpmd)
==813==by 0x4DEE7F: vdev_probe (in /home/root/dpdk/build/app/testpmd)
==813==by 0x4ADFE7: rte_bus_probe (in /home/root/dpdk/build/app/testpmd)
==813==by 0x499D1B: rte_eal_init (in /home/root/dpdk/build/app/testpmd)
==813==  Address 0x1fff000251 is on thread 1's stack
==813==  in frame #2, created by qdisc_add_ingress (???:)
==813== 
Interactive-mode selected
testpmd: create a new mbuf pool : n=155456, size=2176, 
socket=0
t

Re: [dpdk-dev] [PATCH] testpmd: call cleanup on exit

2020-01-07 Thread Jerin Jacob
On Tue, Jan 7, 2020 at 11:11 PM Stephen Hemminger
 wrote:
>
> The rte_eal_cleanup code is not exercised by testpmd which
> is the most used DPDK test tool. Add a call at end of program.
>
> This helps exercise free and close paths which can
> be checked with tools like valgrind.
>
> Signed-off-by: Stephen Hemminger 
> ---
>  app/test-pmd/testpmd.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index b3746822366f..a94ad19c8f5a 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -3570,5 +3570,6 @@ main(int argc, char** argv)
> return 1;
> }
>
> +   rte_eal_cleanup();
> return 0;

# How about changing to "return rte_eal_cleanup()" to provide OS the
correct stats?
# IMO, we need the same in other applications main() and signal handler too.




>  }
> --
> 2.20.1
>


Re: [dpdk-dev] [PATCH v1 3/3] drivers: move ifc driver to the vDPA class

2020-01-07 Thread Maxime Coquelin



On 12/25/19 4:19 PM, Matan Azrad wrote:
> A new vDPA class was recently introduced.
> 
> IFC driver implements the vDPA operations, hence it should be moved to
> the vDPA class.
> 
> Move it.
> 
> Signed-off-by: Matan Azrad 
> ---
>  MAINTAINERS  |6 +-
>  doc/guides/nics/features/ifcvf.ini   |8 -
>  doc/guides/nics/ifc.rst  |  106 ---
>  doc/guides/nics/index.rst|1 -
>  doc/guides/vdpadevs/features/ifcvf.ini   |8 +
>  doc/guides/vdpadevs/ifc.rst  |  106 +++
>  doc/guides/vdpadevs/index.rst|1 +
>  drivers/net/Makefile |3 -
>  drivers/net/ifc/Makefile |   34 -
>  drivers/net/ifc/base/ifcvf.c |  329 
>  drivers/net/ifc/base/ifcvf.h |  162 
>  drivers/net/ifc/base/ifcvf_osdep.h   |   52 --
>  drivers/net/ifc/ifcvf_vdpa.c | 1280 
> --
>  drivers/net/ifc/meson.build  |9 -
>  drivers/net/ifc/rte_pmd_ifc_version.map  |3 -
>  drivers/net/meson.build  |1 -
>  drivers/vdpa/Makefile|6 +
>  drivers/vdpa/ifc/Makefile|   34 +
>  drivers/vdpa/ifc/base/ifcvf.c|  329 
>  drivers/vdpa/ifc/base/ifcvf.h|  162 
>  drivers/vdpa/ifc/base/ifcvf_osdep.h  |   52 ++
>  drivers/vdpa/ifc/ifcvf_vdpa.c| 1280 
> ++
>  drivers/vdpa/ifc/meson.build |9 +
>  drivers/vdpa/ifc/rte_pmd_ifc_version.map |3 +
>  drivers/vdpa/meson.build |2 +-
>  25 files changed, 1994 insertions(+), 1992 deletions(-)
>  delete mode 100644 doc/guides/nics/features/ifcvf.ini
>  delete mode 100644 doc/guides/nics/ifc.rst
>  create mode 100644 doc/guides/vdpadevs/features/ifcvf.ini
>  create mode 100644 doc/guides/vdpadevs/ifc.rst
>  delete mode 100644 drivers/net/ifc/Makefile
>  delete mode 100644 drivers/net/ifc/base/ifcvf.c
>  delete mode 100644 drivers/net/ifc/base/ifcvf.h
>  delete mode 100644 drivers/net/ifc/base/ifcvf_osdep.h
>  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c
>  delete mode 100644 drivers/net/ifc/meson.build
>  delete mode 100644 drivers/net/ifc/rte_pmd_ifc_version.map
>  create mode 100644 drivers/vdpa/ifc/Makefile
>  create mode 100644 drivers/vdpa/ifc/base/ifcvf.c
>  create mode 100644 drivers/vdpa/ifc/base/ifcvf.h
>  create mode 100644 drivers/vdpa/ifc/base/ifcvf_osdep.h
>  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c
>  create mode 100644 drivers/vdpa/ifc/meson.build
>  create mode 100644 drivers/vdpa/ifc/rte_pmd_ifc_version.map
> 

...

> diff --git a/doc/guides/vdpadevs/features/ifcvf.ini 
> b/doc/guides/vdpadevs/features/ifcvf.ini
> new file mode 100644
> index 000..ef1fc47
> --- /dev/null
> +++ b/doc/guides/vdpadevs/features/ifcvf.ini
> @@ -0,0 +1,8 @@
> +;
> +; Supported features of the 'ifcvf' vDPA driver.
> +;
> +; Refer to default.ini for the full list of available PMD features.
> +;
> +[Features]
> +x86-32   = Y
> +x86-64   = Y

Xiao or someone knowing the IFC enough would need to file the feature
list in a separate patch.

Other than that:
Reviewed-by: Maxime Coquelin 

Thanks,
Maxime



[dpdk-dev] [PATCH v2] testpmd: call cleanup on exit

2020-01-07 Thread Stephen Hemminger
The rte_eal_cleanup code is not exercised by testpmd which
is the most used DPDK test tool. Add a call at end of program.

This helps exercise free and close paths which can
be checked with tools like valgrind.

Signed-off-by: Stephen Hemminger 
---
v2 - report errors

 app/test-pmd/testpmd.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index b3746822366f..2eec8afda1ec 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -3570,5 +3570,10 @@ main(int argc, char** argv)
return 1;
}
 
-   return 0;
+   ret = rte_eal_cleanup();
+   if (ret != 0)
+   rte_exit(EXIT_FAILURE,
+"EAL cleanup failed: %s\n", strerror(-ret));
+
+   return EXIT_SUCCESS;
 }
-- 
2.20.1



Re: [dpdk-dev] [PATCH v3 7/7] net/bnxt: fix to not overwrite error message

2020-01-07 Thread Ajit Khaparde
On Tue, Jan 7, 2020 at 1:18 AM Ferruh Yigit  wrote:

> On 1/7/2020 12:37 AM, Ajit Khaparde wrote:
> > In some cases when flow creation fails, we  overwrite the specific
> > error message with a generic error message. This patch fixes it.
> >
> > Fixes: d24610f7bfda ("net/bnxt: allow flow creation when RSS is enabled")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Ajit Khaparde 
> > Reviewed-by: Lance Richardson 
> > ---
> >  drivers/net/bnxt/bnxt_flow.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/bnxt/bnxt_flow.c b/drivers/net/bnxt/bnxt_flow.c
> > index 707aedcec..cde1fa41c 100644
> > --- a/drivers/net/bnxt/bnxt_flow.c
> > +++ b/drivers/net/bnxt/bnxt_flow.c
> > @@ -1485,7 +1485,7 @@ bnxt_validate_and_parse_flow(struct rte_eth_dev
> *dev,
> >   if (rxq && !vnic->rx_queue_cnt)
> >   rxq->vnic = &bp->vnic_info[0];
> >   }
> > - return rc;
> > + return -rte_errno;
>
> The result will be same as far as I can see, you may get rid of the "rc =
> -rte_errno;" assignments before "goto ret;" with this change if you want.
>
I had that change lined up for my next set. But I can send it early.


>
> Also commit log seems describing the below change not this one ...
>
> >  }
> >
> >  static
> > @@ -1815,7 +1815,7 @@ bnxt_flow_create(struct rte_eth_dev *dev,
> >   rte_flow_error_set(error, 0,
> >  RTE_FLOW_ERROR_TYPE_NONE, NULL,
> >  "Flow with pattern exists, updating
> destination queue");
> > - else
> > + else if (!rte_errno)
> >   rte_flow_error_set(error, -ret,
> >  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
> >  "Failed to create flow.");
> >
>
>


Re: [dpdk-dev] [PATCH v7 10/17] test/ring: modify single element enq/deq perf test cases

2020-01-07 Thread Honnappa Nagarahalli


> 
> > > > Add test cases to test rte_ring_xxx_elem APIs for single element
> > > > enqueue/dequeue test cases.
> > > >
> > > > Signed-off-by: Honnappa Nagarahalli 
> > > > Reviewed-by: Gavin Hu 
> > > > ---
> > > >  app/test/test_ring_perf.c | 100
> > > > ++
> > > >  1 file changed, 80 insertions(+), 20 deletions(-)
> > > >
> > > > diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
> > > > index 6c2aca483..5829718c1 100644
> > > > --- a/app/test/test_ring_perf.c
> > > > +++ b/app/test/test_ring_perf.c
> > > > @@ -13,6 +13,7 @@
> > > >  #include 
> > > >
> > > >  #include "test.h"
> > > > +#include "test_ring.h"
> > > >
> > > >  /*
> > > >   * Ring
> > > > @@ -41,6 +42,35 @@ struct lcore_pair {
> > > >
> > > >  static volatile unsigned lcore_count = 0;
> > > >
> > > > +static void
> > > > +test_ring_print_test_string(unsigned int api_type, int esize,
> > > > +   unsigned int bsz, double value)
> > > > +{
> > > > +   if (esize == -1)
> > > > +   printf("legacy APIs");
> > > > +   else
> > > > +   printf("elem APIs: element size %dB", esize);
> > > > +
> > > > +   if (api_type == TEST_RING_IGNORE_API_TYPE)
> > > > +   return;
> > > > +
> > > > +   if ((api_type & TEST_RING_N) == TEST_RING_N)
> > > > +   printf(": default enqueue/dequeue: ");
> > > > +   else if ((api_type & TEST_RING_S) == TEST_RING_S)
> > > > +   printf(": SP/SC: ");
> > > > +   else if ((api_type & TEST_RING_M) == TEST_RING_M)
> > > > +   printf(": MP/MC: ");
> > > > +
> > > > +   if ((api_type & TEST_RING_SL) == TEST_RING_SL)
> > > > +   printf("single: ");
> > > > +   else if ((api_type & TEST_RING_BL) == TEST_RING_BL)
> > > > +   printf("bulk (size: %u): ", bsz);
> > > > +   else if ((api_type & TEST_RING_BR) == TEST_RING_BR)
> > > > +   printf("burst (size: %u): ", bsz);
> > > > +
> > > > +   printf("%.2F\n", value);
> > > > +}
> > > > +
> > > >  / Functions to analyse our core mask to get cores for
> > > > different tests ***/
> > > >
> > > >  static int
> > > > @@ -335,32 +365,35 @@ run_on_all_cores(struct rte_ring *r)
> > > >   * Test function that determines how long an enqueue + dequeue of
> > > > a
> > > single item
> > > >   * takes on a single lcore. Result is for comparison with the bulk
> enq+deq.
> > > >   */
> > > > -static void
> > > > -test_single_enqueue_dequeue(struct rte_ring *r)
> > > > +static int
> > > > +test_single_enqueue_dequeue(struct rte_ring *r, const int esize,
> > > > +   const unsigned int api_type)
> > > >  {
> > > > -   const unsigned iter_shift = 24;
> > > > -   const unsigned iterations = 1< > > > -   unsigned i = 0;
> > > > +   int ret;
> > > > +   const unsigned int iter_shift = 24;
> > > > +   const unsigned int iterations = 1 << iter_shift;
> > > > +   unsigned int i = 0;
> > > > void *burst = NULL;
> > > >
> > > > -   const uint64_t sc_start = rte_rdtsc();
> > > > -   for (i = 0; i < iterations; i++) {
> > > > -   rte_ring_sp_enqueue(r, burst);
> > > > -   rte_ring_sc_dequeue(r, &burst);
> > > > -   }
> > > > -   const uint64_t sc_end = rte_rdtsc();
> > > > +   (void)ret;
> > >
> > > Here, and in few other places, looks redundant.
> > The compiler throws an error since 'ret' is assigned a value, but it is not
> used.
> 
> Probably one way to change  TEST_RING_ENQUEUE() from macro to inline-
> function returning ret.
> 
Yes, that is possible, will do.

> >
> > >



Re: [dpdk-dev] [PATCH v3 0/7] bnxt patchset

2020-01-07 Thread Ajit Khaparde
On Mon, Jan 6, 2020 at 4:37 PM Ajit Khaparde 
wrote:

> v2->v3:
>  - Rebased patches against latest dpdk-next-net
>  - sync commit logs
>

Patchset applied to dpdk-next-net-brcm. Thanks

>
> Ajit Khaparde (2):
>   net/bnxt: add support for flow mark action
>   net/bnxt: fix to not overwrite error message
>
> Santoshkumar Karanappa Rastapur (2):
>   net/bnxt: fix link failure during port toggle
>   net/bnxt: fix non matching flow hitting filter rule
>
> Somnath Kotur (3):
>   net/bnxt: fix to use first valid profile
>   net/bnxt: fix flow flush to sync with flow destroy
>   net/bnxt: fix to reuse an L2 filter
>
>  drivers/net/bnxt/bnxt.h|  15 ++-
>  drivers/net/bnxt/bnxt_cpr.c|   2 +-
>  drivers/net/bnxt/bnxt_ethdev.c |  34 -
>  drivers/net/bnxt/bnxt_filter.h |   7 +
>  drivers/net/bnxt/bnxt_flow.c   | 233 +
>  drivers/net/bnxt/bnxt_hwrm.c   |  66 --
>  drivers/net/bnxt/bnxt_hwrm.h   |   3 +
>  drivers/net/bnxt/bnxt_rxr.c|  41 +-
>  drivers/net/bnxt/bnxt_rxr.h|  11 ++
>  9 files changed, 277 insertions(+), 135 deletions(-)
>
> --
> 2.21.0 (Apple Git-122.2)
>
>


[dpdk-dev] [PATCH] net/af_xdp: use single-prod-and-cons ring

2020-01-07 Thread Xiao Wang
The ring is used only by af_xdp pmd itself, so no need to support
multi-producer and multi-consumer mode. This patch changes the ring
to single-producer and single-consumer mode, which could yield better
performance for addr enqueue and dequeue.

Signed-off-by: Xiao Wang 
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index d903e6c28..683e2a559 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -809,7 +809,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals 
*internals,
umem->buf_ring = rte_ring_create(ring_name,
 ETH_AF_XDP_NUM_BUFFERS,
 rte_socket_id(),
-0x0);
+RING_F_SP_ENQ | RING_F_SC_DEQ);
if (umem->buf_ring == NULL) {
AF_XDP_LOG(ERR, "Failed to create rte_ring\n");
goto err;
-- 
2.15.1



[dpdk-dev] [PATCH v4] net/iavf/base: change the base as driver common

2020-01-07 Thread Haiyue Wang
Change the iavf base code as driver common library, it is used by iavf
PMD now, and it can be used by other Intel SR-IOV PMDs in the future.

Signed-off-by: Haiyue Wang 
---
v4: keep the iavf spinlock & memmory function prototype in osdep.h

v3: update the commit message.

v2: update the commit message, and rename the iavf_main.c to iavf_impl.c

 drivers/common/Makefile   |  5 +
 drivers/common/iavf/Makefile  | 28 ++
 drivers/{net/iavf/base => common/iavf}/README |  1 +
 .../iavf/base => common/iavf}/iavf_adminq.c   |  0
 .../iavf/base => common/iavf}/iavf_adminq.h   |  0
 .../base => common/iavf}/iavf_adminq_cmd.h|  0
 .../iavf/base => common/iavf}/iavf_alloc.h|  0
 .../iavf/base => common/iavf}/iavf_common.c   |  0
 .../iavf/base => common/iavf}/iavf_devids.h   |  0
 drivers/common/iavf/iavf_impl.c   | 95 +++
 .../iavf/base => common/iavf}/iavf_osdep.h| 15 +--
 .../base => common/iavf}/iavf_prototype.h |  0
 .../iavf/base => common/iavf}/iavf_register.h |  0
 .../iavf/base => common/iavf}/iavf_status.h   |  0
 .../iavf/base => common/iavf}/iavf_type.h |  0
 drivers/common/iavf/meson.build   | 10 ++
 .../common/iavf/rte_common_iavf_version.map   | 12 +++
 .../{net/iavf/base => common/iavf}/virtchnl.h |  0
 drivers/common/meson.build|  2 +-
 drivers/net/iavf/Makefile | 23 +
 drivers/net/iavf/base/meson.build | 23 -
 drivers/net/iavf/iavf.h   |  6 +-
 drivers/net/iavf/iavf_ethdev.c| 83 
 drivers/net/iavf/iavf_rxtx.c  |  3 -
 drivers/net/iavf/iavf_rxtx_vec_avx2.c |  1 -
 drivers/net/iavf/iavf_rxtx_vec_sse.c  |  2 -
 drivers/net/iavf/iavf_vchnl.c |  5 -
 drivers/net/iavf/meson.build  |  4 +-
 mk/rte.app.mk |  4 +
 29 files changed, 173 insertions(+), 149 deletions(-)
 create mode 100644 drivers/common/iavf/Makefile
 rename drivers/{net/iavf/base => common/iavf}/README (96%)
 rename drivers/{net/iavf/base => common/iavf}/iavf_adminq.c (100%)
 rename drivers/{net/iavf/base => common/iavf}/iavf_adminq.h (100%)
 rename drivers/{net/iavf/base => common/iavf}/iavf_adminq_cmd.h (100%)
 rename drivers/{net/iavf/base => common/iavf}/iavf_alloc.h (100%)
 rename drivers/{net/iavf/base => common/iavf}/iavf_common.c (100%)
 rename drivers/{net/iavf/base => common/iavf}/iavf_devids.h (100%)
 create mode 100644 drivers/common/iavf/iavf_impl.c
 rename drivers/{net/iavf/base => common/iavf}/iavf_osdep.h (93%)
 rename drivers/{net/iavf/base => common/iavf}/iavf_prototype.h (100%)
 rename drivers/{net/iavf/base => common/iavf}/iavf_register.h (100%)
 rename drivers/{net/iavf/base => common/iavf}/iavf_status.h (100%)
 rename drivers/{net/iavf/base => common/iavf}/iavf_type.h (100%)
 create mode 100644 drivers/common/iavf/meson.build
 create mode 100644 drivers/common/iavf/rte_common_iavf_version.map
 rename drivers/{net/iavf/base => common/iavf}/virtchnl.h (100%)
 delete mode 100644 drivers/net/iavf/base/meson.build

diff --git a/drivers/common/Makefile b/drivers/common/Makefile
index 1ff033bba..3254c5274 100644
--- a/drivers/common/Makefile
+++ b/drivers/common/Makefile
@@ -30,4 +30,9 @@ ifeq ($(CONFIG_RTE_LIBRTE_COMMON_DPAAX),y)
 DIRS-y += dpaax
 endif
 
+IAVF-y := $(CONFIG_RTE_LIBRTE_IAVF_PMD)
+ifneq (,$(findstring y,$(IAVF-y)))
+DIRS-y += iavf
+endif
+
 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/drivers/common/iavf/Makefile b/drivers/common/iavf/Makefile
new file mode 100644
index 0..43383e376
--- /dev/null
+++ b/drivers/common/iavf/Makefile
@@ -0,0 +1,28 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2019 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+#
+# library name
+#
+LIB = librte_common_iavf.a
+
+CFLAGS += -DALLOW_EXPERIMENTAL_API
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+CFLAGS += -Wno-pointer-arith
+CFLAGS += -Wno-cast-qual
+
+EXPORT_MAP := rte_common_iavf_version.map
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-y += iavf_adminq.c
+SRCS-y += iavf_common.c
+SRCS-y += iavf_impl.c
+
+LDLIBS += -lrte_eal
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/iavf/base/README b/drivers/common/iavf/README
similarity index 96%
rename from drivers/net/iavf/base/README
rename to drivers/common/iavf/README
index e8c49c36f..b78e89bee 100644
--- a/drivers/net/iavf/base/README
+++ b/drivers/common/iavf/README
@@ -17,3 +17,4 @@ NOTE: The source code in this directory should not be 
modified apart from
 the following file(s):
 
 iavf_osdep.h
+iavf_impl.c
diff --git a/drivers/net/iavf/base/iavf_adminq.c 
b/drivers/common/iavf/iavf_adminq.c
similarity index 100%
rename from drivers/net/iavf/base/iavf_adminq.c
rename to drivers/common/iavf/iavf_adminq.c
diff --git a/drivers/net/iavf/base/iavf_adminq.h 
b/drivers/common/iavf/iavf_adminq.h
similarity index 100%
rename from driv

Re: [dpdk-dev] [PATCH v4] net/iavf/base: change the base as driver common

2020-01-07 Thread Zhang, Qi Z



> -Original Message-
> From: Wang, Haiyue 
> Sent: Wednesday, January 8, 2020 11:12 AM
> To: dev@dpdk.org; Ye, Xiaolong ; Zhang, Qi Z
> ; Yang, Qiming 
> Cc: Wang, Haiyue 
> Subject: [PATCH v4] net/iavf/base: change the base as driver common
> 
> Change the iavf base code as driver common library, it is used by iavf PMD 
> now,
> and it can be used by other Intel SR-IOV PMDs in the future.
> 
> Signed-off-by: Haiyue Wang 

Acked-by: Qi Zhang 



Re: [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table

2020-01-07 Thread Tiwei Bie
On Tue, Jan 07, 2020 at 06:39:36PM +0100, Maxime Coquelin wrote:
> On 12/25/19 4:19 PM, Matan Azrad wrote:
> > Add vDPA devices features table and explanation.
> > 
> > Any vDPA driver can add its own supported features by ading a new ini
> > file to the features directory in doc/guides/vdpadevs/features.
> > 
> > Signed-off-by: Matan Azrad 
> > ---
> >  doc/guides/conf.py|  5 +++
> >  doc/guides/vdpadevs/features/default.ini  | 55 ++
> >  doc/guides/vdpadevs/features_overview.rst | 65 
> > +++
> >  doc/guides/vdpadevs/index.rst |  1 +
> >  4 files changed, 126 insertions(+)
> >  create mode 100644 doc/guides/vdpadevs/features/default.ini
> >  create mode 100644 doc/guides/vdpadevs/features_overview.rst
> > 
> > diff --git a/doc/guides/conf.py b/doc/guides/conf.py
> > index 0892c06..c368fa5 100644
> > --- a/doc/guides/conf.py
> > +++ b/doc/guides/conf.py
> > @@ -401,6 +401,11 @@ def setup(app):
> >  'Features',
> >  'Features availability in compression drivers',
> >  'Feature')
> > +table_file = dirname(__file__) + '/vdpadevs/overview_feature_table.txt'
> > +generate_overview_table(table_file, 1,
> > +'Features',
> > +'Features availability in vDPA drivers',
> > +'Feature')
> >  
> >  if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
> >  print('Upgrade sphinx to version >= 1.3.1 for '
> > diff --git a/doc/guides/vdpadevs/features/default.ini 
> > b/doc/guides/vdpadevs/features/default.ini
> > new file mode 100644
> > index 000..a3e0bc7
> > --- /dev/null
> > +++ b/doc/guides/vdpadevs/features/default.ini
> > @@ -0,0 +1,55 @@
> > +;
> > +; Features of a default vDPA driver.
> > +;
> > +; This file defines the features that are valid for inclusion in
> > +; the other driver files and also the order that they appear in
> > +; the features table in the documentation. The feature description
> > +; string should not exceed feature_str_len defined in conf.py.
> > +;
> 
> I think some entries below could be removed for vDPA.

+1

> 
> > +[Features]
> > +csum =
> > +guest csum   =
> > +mac  =
> > +gso  =
> > +guest tso4   =
> > +guest tso6   =
> > +ecn  =
> > +ufo  =
> > +host tso4=
> > +host tso6=
> > +mrg rxbuf=
> > +ctrl vq  =
> > +ctrl rx  =
> > +any layout   =
> > +guest announce   =
> > +mq   =
> > +version 1=
> > +log all  =
> > +protocol features=

We may not need to list this. The proto * would imply it.

> > +indirect desc=
> > +event idx=
> > +mtu  =
> > +in_order =
> > +IOMMU platform   =
> > +packed   =
> > +proto mq =
> > +proto log shmfd  =
> > +proto rarp   =
> > +proto reply ack  =
> > +proto slave req  =

Ditto. This feature is to be used by other features.
Features like host notifier would imply it.

> > +proto crypto session =

We don't need to list this before we officially support
the crypto vDPA device.

> > +proto host notifier  =
> > +proto pagefault  =
> > +Multiprocess aware   =

There is no support for this in library currently.
To support it, we need to sync vhost fds and messages
among processes.

> > +BSD nic_uio  =
> > +Linux UIO=
> 
> E.g. UIO, which cannot be used since vDPA requires an IOMMU.
> 
> > +Linux VFIO   =
> > +Other kdrv   =
> > +ARMv7=
> > +ARMv8=
> > +Power8   =
> > +x86-32   =
> > +x86-64   =
> > +Usage doc=
> > +Design doc   =
> > +Perf doc =
> > \ No newline at end of file
> > diff --git a/doc/guides/vdpadevs/features_overview.rst 
> > b/doc/guides/vdpadevs/features_overview.rst
> > new file mode 100644
> > index 000..c7745b7
> > --- /dev/null
> > +++ b/doc/guides/vdpadevs/features_overview.rst
> > @@ -0,0 +1,65 @@
> > +..  SPDX-License-Identifier: BSD-3-Clause
> > +Copyright 2019 Mellanox Technologies, Ltd
> > +
> > +Overview of vDPA drivers features
> > +=
> > +
> > +This section explains the supported features that are listed in the table 
> > below.
> > +
> > +  * csum - Device can handle packets with partial checksum.
> > +  * guest csum - Guest can handle packets with partial checksum.
> > +  * mac - Device has given MAC address.
> > +  * gso - Device can handle packets with any GSO type.
> > +  * guest tso4 - Guest can receive TSOv4.
> > +  * guest tso6 - Guest can receive TSOv6.
> > +  * ecn - Device can receive TSO with ECN.
> > +  * ufo - Device can receive UFO.
> > +  * host tso4 - Dev

Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device drivers

2020-01-07 Thread Xu, Rosen
Hi Matan,

Did you think about OVS DPDK?
vDPA is a basic module for OVS, currently it will take some exception path 
packet processing
for OVS, so it still needs to integrate eth_dev.

Thanks,
Rosen

> -Original Message-
> From: dev  On Behalf Of Matan Azrad
> Sent: Tuesday, January 07, 2020 15:57
> To: Matan Azrad ; Maxime Coquelin
> ; Bie, Tiwei ; Wang,
> Zhihong ; Wang, Xiao W
> 
> Cc: Yigit, Ferruh ; dev@dpdk.org; Thomas Monjalon
> 
> Subject: Re: [dpdk-dev] [PATCH v1 0/3] Introduce new class for vDPA device
> drivers
> 
> Hi all
> 
> Any comments?
> 
> From: Matan Azrad
> > As discussed and as described in RFC "[RFC] net: new vdpa PMD for
> > Mellanox devices", new vDPA driver is going to be added for Mellanox
> > devices - vDPA
> > mlx5 and more.
> >
> > The only vDPA driver now is the IFC driver that is located in net directory.
> >
> > The IFC driver and the new vDPA mlx5 driver provide the vDPA ops
> > introduced in librte_vhost and not the eth-dev ops.
> > All the others drivers in net class provide the eth-dev ops.
> > The set of features is also different.
> >
> > Create a new class for vDPA drivers and move IFC to this class.
> > Later, all the new drivers that implement the vDPA ops will be added
> > to the vDPA class.
> >
> > Also, a vDPA device driver features list was added to vDPA documentation.
> >
> > Please review the features list and the series.
> >
> > Later on, I'm going to send the vDPA mlx5 driver.
> >
> > Thanks.
> >
> >
> > Matan Azrad (3):
> >   drivers: introduce vDPA class
> >   doc: add vDPA feature table
> >   drivers: move ifc driver to the vDPA class
> >
> >  MAINTAINERS   |6 +-
> >  doc/guides/conf.py|5 +
> >  doc/guides/index.rst  |1 +
> >  doc/guides/nics/features/ifcvf.ini|8 -
> >  doc/guides/nics/ifc.rst   |  106 ---
> >  doc/guides/nics/index.rst |1 -
> >  doc/guides/vdpadevs/features/default.ini  |   55 ++
> >  doc/guides/vdpadevs/features/ifcvf.ini|8 +
> >  doc/guides/vdpadevs/features_overview.rst |   65 ++
> >  doc/guides/vdpadevs/ifc.rst   |  106 +++
> >  doc/guides/vdpadevs/index.rst |   15 +
> >  drivers/Makefile  |2 +
> >  drivers/meson.build   |1 +
> >  drivers/net/Makefile  |3 -
> >  drivers/net/ifc/Makefile  |   34 -
> >  drivers/net/ifc/base/ifcvf.c  |  329 
> >  drivers/net/ifc/base/ifcvf.h  |  162 
> >  drivers/net/ifc/base/ifcvf_osdep.h|   52 --
> >  drivers/net/ifc/ifcvf_vdpa.c  | 1280 
> > -
> >  drivers/net/ifc/meson.build   |9 -
> >  drivers/net/ifc/rte_pmd_ifc_version.map   |3 -
> >  drivers/net/meson.build   |1 -
> >  drivers/vdpa/Makefile |   14 +
> >  drivers/vdpa/ifc/Makefile |   34 +
> >  drivers/vdpa/ifc/base/ifcvf.c |  329 
> >  drivers/vdpa/ifc/base/ifcvf.h |  162 
> >  drivers/vdpa/ifc/base/ifcvf_osdep.h   |   52 ++
> >  drivers/vdpa/ifc/ifcvf_vdpa.c | 1280
> > +
> >  drivers/vdpa/ifc/meson.build  |9 +
> >  drivers/vdpa/ifc/rte_pmd_ifc_version.map  |3 +
> >  drivers/vdpa/meson.build  |8 +
> >  31 files changed, 2152 insertions(+), 1991 deletions(-)  delete mode
> > 100644 doc/guides/nics/features/ifcvf.ini
> >  delete mode 100644 doc/guides/nics/ifc.rst  create mode 100644
> > doc/guides/vdpadevs/features/default.ini
> >  create mode 100644 doc/guides/vdpadevs/features/ifcvf.ini
> >  create mode 100644 doc/guides/vdpadevs/features_overview.rst
> >  create mode 100644 doc/guides/vdpadevs/ifc.rst  create mode 100644
> > doc/guides/vdpadevs/index.rst  delete mode 100644
> > drivers/net/ifc/Makefile  delete mode 100644
> > drivers/net/ifc/base/ifcvf.c delete mode 100644
> > drivers/net/ifc/base/ifcvf.h  delete mode 100644
> > drivers/net/ifc/base/ifcvf_osdep.h
> >  delete mode 100644 drivers/net/ifc/ifcvf_vdpa.c  delete mode 100644
> > drivers/net/ifc/meson.build  delete mode 100644
> > drivers/net/ifc/rte_pmd_ifc_version.map
> >  create mode 100644 drivers/vdpa/Makefile  create mode 100644
> > drivers/vdpa/ifc/Makefile  create mode 100644
> > drivers/vdpa/ifc/base/ifcvf.c create mode 100644
> > drivers/vdpa/ifc/base/ifcvf.h  create mode 100644
> > drivers/vdpa/ifc/base/ifcvf_osdep.h
> >  create mode 100644 drivers/vdpa/ifc/ifcvf_vdpa.c  create mode 100644
> > drivers/vdpa/ifc/meson.build  create mode 100644
> > drivers/vdpa/ifc/rte_pmd_ifc_version.map
> >  create mode 100644 drivers/vdpa/meson.build
> >
> > --
> > 1.8.3.1



Re: [dpdk-dev] [PATCH v7 02/17] lib/ring: apis to support configurable element size

2020-01-07 Thread Honnappa Nagarahalli

> > > > > > > +
> > > > > > > +static __rte_always_inline void enqueue_elems_128(struct
> > > > > > > +rte_ring *r, uint32_t prod_head, const void *obj_table,
> > > > > > > +uint32_t n) { unsigned int i; const uint32_t size =
> > > > > > > +r->size; uint32_t idx = prod_head & r->mask; __uint128_t
> > > > > > > +*ring = (__uint128_t *)&r[1]; const __uint128_t *obj =
> > > > > > > +(const __uint128_t *)obj_table; if (likely(idx + n < size))
> > > > > > > +{ for (i = 0; i < (n & ~0x1); i += 2, idx += 2) { ring[idx]
> > > > > > > += obj[i]; ring[idx + 1] = obj[i + 1];
> > > > > >
> > > > > >
> > > > > > AFAIK, that implies 16B aligned obj_table...
> > > > > > Would it always be the case?
> > > > > I am not sure from the compiler perspective.
> > > > > At least on Arm architecture, unaligned access (address that is
> > > > > accessed is not aligned to the size of the data element being
> > > > > accessed) will result in faults or require additional cycles.
> > > > > So, aligning on
> > > 16B should be fine.
> > > > Further, I would be changing this to use 'rte_int128_t' as
> > > > '__uint128_t' is
> > > not defined on 32b systems.
> > >
> > > What I am trying to say: with this code we imply new requirement for
> > > elems
> > The only existing use case in DPDK for 16B is the event ring. The event ring
> already does similar kind of copy (using 'struct rte_event').
> > So, there is no change in expectations for event ring.
> > For future code, I think this expectation should be fine since it allows for
> optimal code.
> >
> > > in the ring: when sizeof(elem)==16 it's alignment also has to be at least
> 16.
> > > Which from my perspective is not ideal.
> > Any reasoning?
> 
> New implicit requirement and inconsistency.
> Code like that:
> 
> struct ring_elem {uint64_t a, b;};
> 
> struct ring_elem elem;
> rte_ring_dequeue_elem(ring, &elem, sizeof(elem));
> 
> might cause a crash.
The alignment here is 8B. Assuming that instructions generated will require 16B 
alignment, it will result in a crash, if configured to generate exception.
But, these instructions are not atomic instructions. At least on aarch64, 
unaligned access will not result in an exception for non-atomic loads/stores. I 
believe it is the same behavior for x86 as well.

> While exactly the same code with:
> 
> struct ring_elem {uint64_t a, b, c;}; OR struct ring_elem {uint64_t a, b, c, 
> d;};
> 
> will work ok.
The alignment for these structures is still 8B. Are you saying this will work 
because these will be copied using pointer to uint32_t (whose alignment is 4B)?

> 
> >
> > > Note that for elem sizes > 16 (24, 32), there is no such constraint.
> > The rest of them need to be aligned on 4B boundary. However, this should
> not affect the existing code.
> > The code for 8B and 16B is kept as is to ensure the performance is not
> affected for the existing code.




[dpdk-dev] [PATCH 1/4] net/vhost: remove an unused member

2020-01-07 Thread oda
From: Itsuro Oda 

remove an unused member from pmd_internal.
---
 drivers/net/vhost/rte_eth_vhost.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/net/vhost/rte_eth_vhost.c 
b/drivers/net/vhost/rte_eth_vhost.c
index 46f01a7f4..d4e3485ce 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -95,7 +95,6 @@ struct vhost_queue {
 
 struct pmd_internal {
rte_atomic32_t dev_attached;
-   char *dev_name;
char *iface_name;
uint16_t max_queues;
int vid;
@@ -1008,7 +1007,6 @@ eth_dev_close(struct rte_eth_dev *dev)
for (i = 0; i < dev->data->nb_tx_queues; i++)
rte_free(dev->data->tx_queues[i]);
 
-   free(internal->dev_name);
free(internal->iface_name);
rte_free(internal);
 
@@ -1253,9 +1251,6 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char 
*iface_name,
 * - and point eth_dev structure to new eth_dev_data structure
 */
internal = eth_dev->data->dev_private;
-   internal->dev_name = strdup(name);
-   if (internal->dev_name == NULL)
-   goto error;
internal->iface_name = strdup(iface_name);
if (internal->iface_name == NULL)
goto error;
@@ -1305,10 +1300,8 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char 
*iface_name,
return data->port_id;
 
 error:
-   if (internal) {
+   if (internal)
free(internal->iface_name);
-   free(internal->dev_name);
-   }
rte_free(vring_state);
rte_eth_dev_release_port(eth_dev);
rte_free(list);
-- 
2.17.0



[dpdk-dev] [PATCH 2/4] net/vhost: allocate iface_name from heap

2020-01-07 Thread oda
From: Itsuro Oda 

allocate iface_name of pmd_internal from heap in order to
be able to refer from secondary processes.
---
 drivers/net/vhost/rte_eth_vhost.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vhost/rte_eth_vhost.c 
b/drivers/net/vhost/rte_eth_vhost.c
index d4e3485ce..1b07aad24 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1007,7 +1007,7 @@ eth_dev_close(struct rte_eth_dev *dev)
for (i = 0; i < dev->data->nb_tx_queues; i++)
rte_free(dev->data->tx_queues[i]);
 
-   free(internal->iface_name);
+   rte_free(internal->iface_name);
rte_free(internal);
 
dev->data->dev_private = NULL;
@@ -1251,9 +1251,11 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char 
*iface_name,
 * - and point eth_dev structure to new eth_dev_data structure
 */
internal = eth_dev->data->dev_private;
-   internal->iface_name = strdup(iface_name);
+   internal->iface_name = rte_malloc_socket(name, strlen(iface_name) + 1,
+0, numa_node);
if (internal->iface_name == NULL)
goto error;
+   strcpy(internal->iface_name, iface_name);
 
list->eth_dev = eth_dev;
pthread_mutex_lock(&internal_list_lock);
@@ -1301,7 +1303,7 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char 
*iface_name,
 
 error:
if (internal)
-   free(internal->iface_name);
+   rte_free(internal->iface_name);
rte_free(vring_state);
rte_eth_dev_release_port(eth_dev);
rte_free(list);
-- 
2.17.0



[dpdk-dev] [PATCH 0/4] make vhost PMD available for secondary processes

2020-01-07 Thread oda
From: Itsuro Oda 

vhost PMD has not been available for secondary processes since
DPDK v18.11.  (https://bugs.dpdk.org/show_bug.cgi?id=194)
(for a long term !)
This series of patches intend to make vhost PMD available for
secondary processes.
Because now setting vhost driver to communicate with a vhost-user
master (ex. Qemu) is accomplished by the probe function of the
primary process, only the primary process can be a vhost-user
slave.
With this patch, setting vhost driver is delayed at eth_dev
configuration in order to be able to set it from a secondary
process. Because (in the first place,) setting vhost driver is not
necessary to be done at probe (it is enough to be done up to eth_dev
start), this fix is no problem for the primary process.
There is a precondition that the same process has to operate
a vhost interface from begining to end (eth_dev configuration to
eth_dev close). (This patch leaves it to user's responsibility.)
This precondition will not be a problem in most use cases
(including SPP).

Itsuro Oda (4):
  net/vhost: remove an unused member
  net/vhost: allocate iface_name from heap
  net/vhost: delay vhost driver setup
  net/vhost: make secondary probe complete

 drivers/net/vhost/rte_eth_vhost.c | 152 +-
 1 file changed, 88 insertions(+), 64 deletions(-)

-- 
2.17.0



[dpdk-dev] [PATCH 3/4] net/vhost: delay vhost driver setup

2020-01-07 Thread oda
From: Itsuro Oda 

setting vhost driver is delayed at eth_dev configuration
in order to be able to set it from a secondary process.
---
 drivers/net/vhost/rte_eth_vhost.c | 130 ++
 1 file changed, 78 insertions(+), 52 deletions(-)

diff --git a/drivers/net/vhost/rte_eth_vhost.c 
b/drivers/net/vhost/rte_eth_vhost.c
index 1b07aad24..0b8b5a4ca 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -96,6 +96,8 @@ struct vhost_queue {
 struct pmd_internal {
rte_atomic32_t dev_attached;
char *iface_name;
+   uint64_t flags;
+   uint64_t disable_flags;
uint16_t max_queues;
int vid;
rte_atomic32_t started;
@@ -490,17 +492,6 @@ eth_vhost_tx(void *q, struct rte_mbuf **bufs, uint16_t 
nb_bufs)
return nb_tx;
 }
 
-static int
-eth_dev_configure(struct rte_eth_dev *dev __rte_unused)
-{
-   struct pmd_internal *internal = dev->data->dev_private;
-   const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
-
-   internal->vlan_strip = !!(rxmode->offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
-
-   return 0;
-}
-
 static inline struct internal_list *
 find_internal_resource(char *ifname)
 {
@@ -876,6 +867,62 @@ static struct vhost_device_ops vhost_ops = {
.vring_state_changed = vring_state_changed,
 };
 
+static int
+vhost_driver_setup(struct rte_eth_dev *eth_dev)
+{
+   struct pmd_internal *internal = eth_dev->data->dev_private;
+   struct internal_list *list = NULL;
+   struct rte_vhost_vring_state *vring_state = NULL;
+   unsigned int numa_node = eth_dev->device->numa_node;
+   const char *name = eth_dev->device->name;
+
+   list = rte_zmalloc_socket(name, sizeof(*list), 0, numa_node);
+   if (list == NULL)
+   goto error;
+
+   vring_state = rte_zmalloc_socket(name, sizeof(*vring_state),
+0, numa_node);
+   if (vring_state == NULL)
+   goto error;
+
+   list->eth_dev = eth_dev;
+   pthread_mutex_lock(&internal_list_lock);
+   TAILQ_INSERT_TAIL(&internal_list, list, next);
+   pthread_mutex_unlock(&internal_list_lock);
+
+   rte_spinlock_init(&vring_state->lock);
+   vring_states[eth_dev->data->port_id] = vring_state;
+
+   if (rte_vhost_driver_register(internal->iface_name, internal->flags))
+   goto error;
+
+   if (internal->disable_flags) {
+   if (rte_vhost_driver_disable_features(internal->iface_name,
+ internal->disable_flags))
+   goto error;
+   }
+
+   if (rte_vhost_driver_callback_register(internal->iface_name,
+  &vhost_ops) < 0) {
+   VHOST_LOG(ERR, "Can't register callbacks\n");
+   goto error;
+   }
+
+   if (rte_vhost_driver_start(internal->iface_name) < 0) {
+   VHOST_LOG(ERR, "Failed to start driver for %s\n",
+ internal->iface_name);
+   goto error;
+   }
+
+   return 0;
+
+error:
+   rte_free(vring_state);
+   rte_free(list);
+
+   return -1;
+}
+
 int
 rte_eth_vhost_get_queue_event(uint16_t port_id,
struct rte_eth_vhost_queue_event *event)
@@ -942,6 +989,24 @@ rte_eth_vhost_get_vid_from_port_id(uint16_t port_id)
return vid;
 }
 
+static int
+eth_dev_configure(struct rte_eth_dev *dev)
+{
+   struct pmd_internal *internal = dev->data->dev_private;
+   const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
+
+   /* NOTE: the same process has to operate a vhost interface
+* from begining to end (eth_dev configure to eth_dev close).
+* It is user's responsibility at the moment.
+*/
+   if (vhost_driver_setup(dev) < 0)
+   return -1;
+
+   internal->vlan_strip = !!(rxmode->offloads & DEV_RX_OFFLOAD_VLAN_STRIP);
+
+   return 0;
+}
+
 static int
 eth_dev_start(struct rte_eth_dev *eth_dev)
 {
@@ -1217,16 +1282,10 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char 
*iface_name,
struct pmd_internal *internal = NULL;
struct rte_eth_dev *eth_dev = NULL;
struct rte_ether_addr *eth_addr = NULL;
-   struct rte_vhost_vring_state *vring_state = NULL;
-   struct internal_list *list = NULL;
 
VHOST_LOG(INFO, "Creating VHOST-USER backend on numa socket %u\n",
numa_node);
 
-   list = rte_zmalloc_socket(name, sizeof(*list), 0, numa_node);
-   if (list == NULL)
-   goto error;
-
/* reserve an ethdev entry */
eth_dev = rte_eth_vdev_allocate(dev, sizeof(*internal));
if (eth_dev == NULL)
@@ -1240,11 +1299,6 @@ eth_dev_vhost_create(struct rte_vdev_device *dev, char 
*iface_name,
*eth_addr = base_eth_addr;
eth_addr->addr_bytes[5] = eth_dev->data->port_id;
 
-   vring_state = rte_zmalloc_socket

[dpdk-dev] [PATCH 4/4] net/vhost: make secondary probe complete

2020-01-07 Thread oda
From: Itsuro Oda 

add lacking member setting and make secondary probe complete.
---
 drivers/net/vhost/rte_eth_vhost.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/vhost/rte_eth_vhost.c 
b/drivers/net/vhost/rte_eth_vhost.c
index 0b8b5a4ca..7a501cf91 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1390,8 +1390,11 @@ rte_pmd_vhost_probe(struct rte_vdev_device *dev)
VHOST_LOG(ERR, "Failed to probe %s\n", name);
return -1;
}
-   /* TODO: request info from primary to set up Rx and Tx */
+   eth_dev->rx_pkt_burst = eth_vhost_rx;
+   eth_dev->tx_pkt_burst = eth_vhost_tx;
eth_dev->dev_ops = &ops;
+   if (dev->device.numa_node == SOCKET_ID_ANY)
+   dev->device.numa_node = rte_socket_id();
eth_dev->device = &dev->device;
rte_eth_dev_probing_finish(eth_dev);
return 0;
-- 
2.17.0



Re: [dpdk-dev] [PATCH 0/4] make vhost PMD available for secondary processes

2020-01-07 Thread Itsuro ODA
Hi,

I will fix to add signed-off-by and correct spelling error
according to test-report@dpdk.
Please check the content until then.

Thanks.

On Wed,  8 Jan 2020 15:25:06 +0900
o...@valinux.co.jp wrote:

> From: Itsuro Oda 
> 
> vhost PMD has not been available for secondary processes since
> DPDK v18.11.  (https://bugs.dpdk.org/show_bug.cgi?id=194)
> (for a long term !)
> This series of patches intend to make vhost PMD available for
> secondary processes.
> Because now setting vhost driver to communicate with a vhost-user
> master (ex. Qemu) is accomplished by the probe function of the
> primary process, only the primary process can be a vhost-user
> slave.
> With this patch, setting vhost driver is delayed at eth_dev
> configuration in order to be able to set it from a secondary
> process. Because (in the first place,) setting vhost driver is not
> necessary to be done at probe (it is enough to be done up to eth_dev
> start), this fix is no problem for the primary process.
> There is a precondition that the same process has to operate
> a vhost interface from begining to end (eth_dev configuration to
> eth_dev close). (This patch leaves it to user's responsibility.)
> This precondition will not be a problem in most use cases
> (including SPP).
> 
> Itsuro Oda (4):
>   net/vhost: remove an unused member
>   net/vhost: allocate iface_name from heap
>   net/vhost: delay vhost driver setup
>   net/vhost: make secondary probe complete
> 
>  drivers/net/vhost/rte_eth_vhost.c | 152 +-
>  1 file changed, 88 insertions(+), 64 deletions(-)
> 
> -- 
> 2.17.0

-- 
Itsuro ODA 



Re: [dpdk-dev] [PATCH v1 2/3] doc: add vDPA feature table

2020-01-07 Thread Andrew Rybchenko
On 1/8/20 8:28 AM, Tiwei Bie wrote:
> On Tue, Jan 07, 2020 at 06:39:36PM +0100, Maxime Coquelin wrote:
>> On 12/25/19 4:19 PM, Matan Azrad wrote:
>>> Add vDPA devices features table and explanation.
>>>
>>> Any vDPA driver can add its own supported features by ading a new ini
>>> file to the features directory in doc/guides/vdpadevs/features.
>>>
>>> Signed-off-by: Matan Azrad 
>>> ---
>>>  doc/guides/conf.py|  5 +++
>>>  doc/guides/vdpadevs/features/default.ini  | 55 ++
>>>  doc/guides/vdpadevs/features_overview.rst | 65 
>>> +++
>>>  doc/guides/vdpadevs/index.rst |  1 +
>>>  4 files changed, 126 insertions(+)
>>>  create mode 100644 doc/guides/vdpadevs/features/default.ini
>>>  create mode 100644 doc/guides/vdpadevs/features_overview.rst
>>>
>>> diff --git a/doc/guides/conf.py b/doc/guides/conf.py
>>> index 0892c06..c368fa5 100644
>>> --- a/doc/guides/conf.py
>>> +++ b/doc/guides/conf.py
>>> @@ -401,6 +401,11 @@ def setup(app):
>>>  'Features',
>>>  'Features availability in compression drivers',
>>>  'Feature')
>>> +table_file = dirname(__file__) + '/vdpadevs/overview_feature_table.txt'
>>> +generate_overview_table(table_file, 1,
>>> +'Features',
>>> +'Features availability in vDPA drivers',
>>> +'Feature')
>>>  
>>>  if LooseVersion(sphinx_version) < LooseVersion('1.3.1'):
>>>  print('Upgrade sphinx to version >= 1.3.1 for '
>>> diff --git a/doc/guides/vdpadevs/features/default.ini 
>>> b/doc/guides/vdpadevs/features/default.ini
>>> new file mode 100644
>>> index 000..a3e0bc7
>>> --- /dev/null
>>> +++ b/doc/guides/vdpadevs/features/default.ini
>>> @@ -0,0 +1,55 @@
>>> +;
>>> +; Features of a default vDPA driver.
>>> +;
>>> +; This file defines the features that are valid for inclusion in
>>> +; the other driver files and also the order that they appear in
>>> +; the features table in the documentation. The feature description
>>> +; string should not exceed feature_str_len defined in conf.py.
>>> +;
>> I think some entries below could be removed for vDPA.
> +1
>
>>> +[Features]
>>> +csum =
>>> +guest csum   =
>>> +mac  =
>>> +gso  =
>>> +guest tso4   =
>>> +guest tso6   =
>>> +ecn  =
>>> +ufo  =
>>> +host tso4=
>>> +host tso6=
>>> +mrg rxbuf=
>>> +ctrl vq  =
>>> +ctrl rx  =
>>> +any layout   =
>>> +guest announce   =
>>> +mq   =
>>> +version 1=
>>> +log all  =
>>> +protocol features=
> We may not need to list this. The proto * would imply it.
>
>>> +indirect desc=
>>> +event idx=
>>> +mtu  =
>>> +in_order =
>>> +IOMMU platform   =
>>> +packed   =
>>> +proto mq =
>>> +proto log shmfd  =
>>> +proto rarp   =
>>> +proto reply ack  =
>>> +proto slave req  =
> Ditto. This feature is to be used by other features.
> Features like host notifier would imply it.
>
>>> +proto crypto session =
> We don't need to list this before we officially support
> the crypto vDPA device.
>
>>> +proto host notifier  =
>>> +proto pagefault  =
>>> +Multiprocess aware   =
> There is no support for this in library currently.
> To support it, we need to sync vhost fds and messages
> among processes.
>
>>> +BSD nic_uio  =
>>> +Linux UIO=
>> E.g. UIO, which cannot be used since vDPA requires an IOMMU.
>>
>>> +Linux VFIO   =
>>> +Other kdrv   =
>>> +ARMv7=
>>> +ARMv8=
>>> +Power8   =
>>> +x86-32   =
>>> +x86-64   =
>>> +Usage doc=
>>> +Design doc   =
>>> +Perf doc =
>>> \ No newline at end of file
>>> diff --git a/doc/guides/vdpadevs/features_overview.rst 
>>> b/doc/guides/vdpadevs/features_overview.rst
>>> new file mode 100644
>>> index 000..c7745b7
>>> --- /dev/null
>>> +++ b/doc/guides/vdpadevs/features_overview.rst
>>> @@ -0,0 +1,65 @@
>>> +..  SPDX-License-Identifier: BSD-3-Clause
>>> +Copyright 2019 Mellanox Technologies, Ltd
>>> +
>>> +Overview of vDPA drivers features
>>> +=
>>> +
>>> +This section explains the supported features that are listed in the table 
>>> below.
>>> +
>>> +  * csum - Device can handle packets with partial checksum.
>>> +  * guest csum - Guest can handle packets with partial checksum.
>>> +  * mac - Device has given MAC address.
>>> +  * gso - Device can handle packets with any GSO type.
>>> +  * guest tso4 - Guest can receive TSOv4.
>>> +  * guest tso6 - Guest can receive TSOv6.
>>> +  * ecn - Device can receive TSO with ECN.
>>> +  * ufo - D

Re: [dpdk-dev] [PATCH v4 0/6] OCTEON TX2 End Point Driver

2020-01-07 Thread Mahipal Challa
Hi Gavin,
We have incorporated the changes you suggested in v3, please ack.
We like to take up performance improvement optimizations later( that you 
suggested in v3) and upstream in the future, so for this release 20.02 we like 
to go with the existing patch set sources(v4), please ack.

Thanks,
Mahipal

> -Original Message-
> From: dev  On Behalf Of Mahipal Challa
> Sent: Tuesday, January 7, 2020 7:53 PM
> To: dev@dpdk.org
> Cc: Jerin Jacob Kollanukkaran ; Narayana Prasad Raju
> Athreya ; Subrahmanyam Nilla
> ; Venkateshwarlu Nalla ;
> gavin...@arm.com
> Subject: [dpdk-dev] [PATCH v4 0/6] OCTEON TX2 End Point Driver
> 
> This patchset adds support for OCTEON TX2 end point mode of operation.
> The driver implementation uses DPDK rawdevice sub-system.
> 
> v2:
> * Updated memory barrior API's as per Gavin Hu suggestion.
> 
> v3:
> * Fixed memory leak possibility issues.
> 
> v4:
> * Improved error handling in selftest API.
> 
> Mahipal Challa (6):
>   raw/octeontx2_ep: add build infra and device probe
>   raw/octeontx2_ep: add device configuration
>   raw/octeontx2_ep: add device uninitialization
>   raw/octeontx2_ep: add enqueue operation
>   raw/octeontx2_ep: add dequeue operation
>   raw/octeontx2_ep: add driver self test
> 
>  MAINTAINERS|   5 +
>  config/common_base |   5 +
>  doc/guides/rawdevs/index.rst   |   1 +
>  doc/guides/rawdevs/octeontx2_ep.rst|  89 +++
>  drivers/common/octeontx2/hw/otx2_sdp.h | 184 +
>  drivers/common/octeontx2/otx2_common.c |   9 +
>  drivers/common/octeontx2/otx2_common.h |   4 +
>  .../octeontx2/rte_common_octeontx2_version.map |   6 +
>  drivers/raw/Makefile   |   1 +
>  drivers/raw/meson.build|   1 +
>  drivers/raw/octeontx2_ep/Makefile  |  44 ++
>  drivers/raw/octeontx2_ep/meson.build   |   9 +
>  drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c  | 844
> +
>  drivers/raw/octeontx2_ep/otx2_ep_enqdeq.h  |  52 ++
>  drivers/raw/octeontx2_ep/otx2_ep_rawdev.c  | 361 +
>  drivers/raw/octeontx2_ep/otx2_ep_rawdev.h  | 499 
>  drivers/raw/octeontx2_ep/otx2_ep_test.c| 173 +
>  drivers/raw/octeontx2_ep/otx2_ep_vf.c  | 476 
>  drivers/raw/octeontx2_ep/otx2_ep_vf.h  |  10 +
>  .../rte_rawdev_octeontx2_ep_version.map|   4 +
>  mk/rte.app.mk  |   2 +
>  21 files changed, 2779 insertions(+)
>  create mode 100644 doc/guides/rawdevs/octeontx2_ep.rst
>  create mode 100644 drivers/common/octeontx2/hw/otx2_sdp.h
>  create mode 100644 drivers/raw/octeontx2_ep/Makefile  create mode
> 100644 drivers/raw/octeontx2_ep/meson.build
>  create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_enqdeq.c
>  create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_enqdeq.h
>  create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
>  create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_rawdev.h
>  create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_test.c
>  create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_vf.c
>  create mode 100644 drivers/raw/octeontx2_ep/otx2_ep_vf.h
>  create mode 100644
> drivers/raw/octeontx2_ep/rte_rawdev_octeontx2_ep_version.map
> 
> --
> 1.8.3.1



Re: [dpdk-dev] [PATCH] net/mlx5: fix register c0 usage for metadata entities

2020-01-07 Thread Raslan Darawsheh
Hi,

> -Original Message-
> From: Viacheslav Ovsiienko 
> Sent: Friday, December 20, 2019 9:53 AM
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ;
> sta...@dpdk.org
> Subject: [PATCH] net/mlx5: fix register c0 usage for metadata entities
> 
> The register c0 might be engaged to support META and MARK related items
> and actions. Also, this register might be used by kernel to specify the source
> vport index. The register c0 is split into two 16-bit fields. Depending on the
> mask returned by kernel the PMD can use upper or lower half of register c0.
> This patch adds the missing support for upper half.
> 
> Fixes: e554b672aa05 ("net/mlx5: support flow tag")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Viacheslav Ovsiienko 
> ---
>  drivers/net/mlx5/mlx5_flow_dv.c | 39
> +++
>  1 file changed, 35 insertions(+), 4 deletions(-)

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh