Re: [dpdk-dev] [PATCH 2/4] vdpa/mlx5: support direct HW notifications

2020-04-26 Thread Matan Azrad


From: Maxime Coquelin 
> On 4/15/20 11:47 AM, Maxime Coquelin wrote:
> >
> >
> > On 3/24/20 3:24 PM, Asaf Penso wrote:
> >> From: Matan Azrad 
> >>
> >> Add support for the next 2 callbacks:
> >> get_vfio_device_fd and get_notify_area.
> >>
> >> This will allow direct HW doorbell ringing from guest and will save
> >> CPU usage in host.
> >>
> >> By this patch, the QEMU will map the physical address of the virtio
> >> device in guest directly to the physical address of the HW device
> >> doorbell.
> >>
> >> The guest doorbell write is 2 bytes transaction while some Mellanox
> >> nics support only 4 bytes transactions.
> >>
> >> Remove ConnectX-5 and BF1 devices support which don't support 2B
> >> doorbell writes for HW triggering.
> >
> > Couldn't we have different rte_vdpa_dev_ops depending on whether
> > doorbell write support?
> 
> I'll take the patch so that it lands into -rc1.
> But I would like to discuss the opportunity to have different dev_ops based
> on the device IDs, so that BF can still be supported. Maybe that could be
> done in -rc2?

Mellanox decided to remove BF1 from GA and from upstream support.
GA devices are CX6dx + BF2.

> 
> Reviewed-by: Maxime Coquelin 
> 
> Thanks,
> Maxime
> 



Re: [dpdk-dev] [PATCH v2 2/2] net/mlx5: support flow aging

2020-04-26 Thread Suanming Mou

On 4/24/2020 6:45 PM, Bill Zhou wrote:

Currently, there is no flow aging check and age-out event callback
mechanism for mlx5 driver, this patch implements it. It's included:
- Splitting the current counter container to aged or no-aged container
   since reducing memory consumption. Aged container will allocate extra
   memory to save the aging parameter from user configuration.
- Aging check and age-out event callback mechanism based on current
   counter. When a flow be checked aged-out, RTE_ETH_EVENT_FLOW_AGED
   event will be triggered to applications.
- Implement the new API: rte_flow_get_aged_flows, applications can use
   this API to get aged flows.

Signed-off-by: Bill Zhou 

Reviewed-by: Suanming Mou 

---
v2: Moving aging list from struct mlx5_ibv_shared to struct mlx5_priv,
one port has one aging list. Update event be triggered once after last
call of rte_flow_get_aged_flows.
---
  doc/guides/rel_notes/release_20_05.rst |   1 +
  drivers/net/mlx5/mlx5.c|  86 +++---
  drivers/net/mlx5/mlx5.h|  49 +++-
  drivers/net/mlx5/mlx5_flow.c   | 201 --
  drivers/net/mlx5/mlx5_flow.h   |  16 +-
  drivers/net/mlx5/mlx5_flow_dv.c| 361 +
  drivers/net/mlx5/mlx5_flow_verbs.c |  14 +-
  7 files changed, 607 insertions(+), 121 deletions(-)

diff --git a/doc/guides/rel_notes/release_20_05.rst 
b/doc/guides/rel_notes/release_20_05.rst
index b124c3f287..a5ba8a4792 100644
--- a/doc/guides/rel_notes/release_20_05.rst
+++ b/doc/guides/rel_notes/release_20_05.rst
@@ -141,6 +141,7 @@ New Features
* Added support for creating Relaxed Ordering Memory Regions.
* Added support for jumbo frame size (9K MTU) in Multi-Packet RQ mode.
* Optimized the memory consumption of flow.
+  * Added support for flow aging based on hardware counter.
  
  * **Updated the AESNI MB crypto PMD.**
  
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c

index 57d76cb741..674d0ea9d3 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -437,6 +437,20 @@ mlx5_flow_id_release(struct mlx5_flow_id_pool *pool, 
uint32_t id)
return 0;
  }
  
+/**

+ * Initialize the private aging list information.
+ *
+ * @param[in] priv
+ *   Pointer to the private device data structure.
+ */
+static void
+mlx5_flow_aging_list_init(struct mlx5_priv *priv)
+{
+   TAILQ_INIT(&priv->aged_counters);
+   rte_spinlock_init(&priv->aged_sl);
+   rte_atomic16_set(&priv->trigger_event, 1);
+}
+
  /**
   * Initialize the counters management structure.
   *
@@ -446,11 +460,14 @@ mlx5_flow_id_release(struct mlx5_flow_id_pool *pool, 
uint32_t id)
  static void
  mlx5_flow_counters_mng_init(struct mlx5_ibv_shared *sh)
  {
-   uint8_t i;
+   uint8_t i, age;
  
+	sh->cmng.age = 0;

TAILQ_INIT(&sh->cmng.flow_counters);
-   for (i = 0; i < RTE_DIM(sh->cmng.ccont); ++i)
-   TAILQ_INIT(&sh->cmng.ccont[i].pool_list);
+   for (age = 0; age < RTE_DIM(sh->cmng.ccont[0]); ++age) {
+   for (i = 0; i < RTE_DIM(sh->cmng.ccont); ++i)
+   TAILQ_INIT(&sh->cmng.ccont[i][age].pool_list);
+   }
  }
  
  /**

@@ -480,7 +497,7 @@ static void
  mlx5_flow_counters_mng_close(struct mlx5_ibv_shared *sh)
  {
struct mlx5_counter_stats_mem_mng *mng;
-   uint8_t i;
+   uint8_t i, age = 0;
int j;
int retries = 1024;
  
@@ -491,36 +508,42 @@ mlx5_flow_counters_mng_close(struct mlx5_ibv_shared *sh)

break;
rte_pause();
}
-   for (i = 0; i < RTE_DIM(sh->cmng.ccont); ++i) {
-   struct mlx5_flow_counter_pool *pool;
-   uint32_t batch = !!(i % 2);
  
-		if (!sh->cmng.ccont[i].pools)

-   continue;
-   pool = TAILQ_FIRST(&sh->cmng.ccont[i].pool_list);
-   while (pool) {
-   if (batch) {
-   if (pool->min_dcs)
-   claim_zero
-   (mlx5_devx_cmd_destroy(pool->min_dcs));
-   }
-   for (j = 0; j < MLX5_COUNTERS_PER_POOL; ++j) {
-   if (MLX5_POOL_GET_CNT(pool, j)->action)
-   claim_zero
-   (mlx5_glue->destroy_flow_action
-(MLX5_POOL_GET_CNT(pool, j)->action));
-   if (!batch && MLX5_GET_POOL_CNT_EXT
-   (pool, j)->dcs)
-   claim_zero(mlx5_devx_cmd_destroy
- (MLX5_GET_POOL_CNT_EXT
- (pool, j)->dcs));
+   for (age = 0; age < RTE_DIM(sh->cmng.ccont[0]); ++age) {
+   for (i = 0; i < RTE_DIM(sh->cmng.ccont); ++i) {
+   struct mlx5_flow_coun

Re: [dpdk-dev] [PATCH v9 1/6] lib/eal: implement the family of common bit operation APIs

2020-04-26 Thread Joyce Kong
> -Original Message-
> From: Thomas Monjalon 
> Sent: Sunday, April 26, 2020 3:59 AM
> To: Joyce Kong 
> Cc: step...@networkplumber.org; david.march...@redhat.com;
> m...@smartsharesystems.com; jer...@marvell.com;
> bruce.richard...@intel.com; ravi1.ku...@amd.com; rm...@marvell.com;
> shsha...@marvell.com; xuanziya...@huawei.com;
> cloud.wangxiao...@huawei.com; zhouguoy...@huawei.com; Honnappa
> Nagarahalli ; Gavin Hu
> ; Phil Yang ; dev@dpdk.org; nd
> 
> Subject: Re: [dpdk-dev] [PATCH v9 1/6] lib/eal: implement the family of
> common bit operation APIs
> 
> 24/04/2020 05:21, Joyce Kong:
> > Bitwise operation APIs are defined and used in a lot of PMDs, which
> > caused a huge code duplication. To reduce duplication, this patch
> > consolidates them into a common API family.
> [...]
> > +rte_get_bit32_relaxed(unsigned int nr, volatile uint32_t *addr)
> > +rte_set_bit32_relaxed(unsigned int nr, volatile uint32_t *addr)
> > +rte_clear_bit32_relaxed(unsigned int nr, volatile uint32_t *addr)
> > +rte_test_and_set_bit32_relaxed(unsigned int nr, volatile uint32_t
> > +*addr) rte_test_and_clear_bit32_relaxed(unsigned int nr, volatile
> > +uint32_t *addr) rte_get_bit64_relaxed(unsigned int nr, volatile
> > +uint64_t *addr) rte_set_bit64_relaxed(unsigned int nr, volatile
> > +uint64_t *addr) rte_clear_bit64_relaxed(unsigned int nr, volatile
> > +uint64_t *addr) rte_test_and_set_bit64_relaxed(unsigned int nr,
> > +volatile uint64_t *addr) rte_test_and_clear_bit64_relaxed(unsigned
> > +int nr, volatile uint64_t *addr)
> 
> Sorry, I have one more naming concern with this series.
> I prefer a common namespace for bit operations.
> Would you be OK to prefix all function names with rte_bit_relaxed_?
> 
Hi Thomas,
Do you mean to rename the functions as 'rte_bit_relaxed_get_bit32'?
If the example is ok, I will modify as this in v10.
Thanks, Joyce


Re: [dpdk-dev] [PATCH] app/testpmd: support flow aging

2020-04-26 Thread Bill Zhou


> -Original Message-
> From: Ferruh Yigit 
> Sent: Saturday, April 25, 2020 12:25 AM
> To: Bill Zhou ; wenzhuo...@intel.com;
> jingjing...@intel.com; bernard.iremon...@intel.com; Ori Kam
> 
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] app/testpmd: support flow aging
> 
> On 4/24/2020 11:55 AM, Bill Zhou wrote:
> > Currently, there is no way to check the aging event or to get the
> > current aged flows in testpmd, this patch include those implements, it's
> included:
> > - Registering aging event based on verbose level, when set verbose > 0,
> >   will register this event, otherwise, remove this event. In this event
> >   only dump one line of log to user there is one aging event coming.
> > - Add new command to list all aged flows, meanwhile, we can set
> parameter
> >   to destroy it.
> 
> Can you please document new feature and command?
> 
> Instead of overloading the 'verbose', what do you think having explicit
> command to register aging events? ("flow aged register "?) I think
> many of the verbose usage won't really interest in the flow aging.
> 

Yes, some of the verbose usage indeed not interest in the flow aging. But If we 
use
register or unregister event to one port, sometime, it will repeat many times 
for
every ports.
What do you think if we register this event all the time, and introduce new 
global
var from command to control the event export to application ?
   for example: set aged_flow_event_print_en [0 | 1]

> >
> > Signed-off-by: Bill Zhou 
> 
> <...>


Re: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for status update

2020-04-26 Thread Phil Yang
> -Original Message-
> From: Thomas Monjalon 
> Sent: Sunday, April 26, 2020 1:18 AM
> To: Phil Yang 
> Cc: erik.g.carri...@intel.com; rsanf...@akamai.com; dev@dpdk.org;
> david.march...@redhat.com; Honnappa Nagarahalli
> ; Gavin Hu ; nd
> 
> Subject: Re: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for status update
> 
> 24/04/2020 09:24, Phil Yang:
> > Volatile has no ordering semantics. The rte_timer structure defines
> > timer status as a volatile variable and uses the rte_r/wmb barrier
> > to guarantee inter-thread visibility.
> >
> > This patch optimized the volatile operation with c11 atomic operations
> > and one-way barrier to save the performance penalty. According to the
> > timer_perf_autotest benchmarking results, this patch can uplift 10%~16%
> > timer appending performance, 3%~20% timer resetting performance and
> 45%
> > timer callbacks scheduling performance on aarch64 and no loss in
> > performance for x86.
> >
> > Suggested-by: Honnappa Nagarahalli 
> > Signed-off-by: Phil Yang 
> > Reviewed-by: Gavin Hu 
> > Acked-by: Erik Gabriel Carrillo 
> [...]
> > --- a/lib/librte_timer/rte_timer.h
> > +++ b/lib/librte_timer/rte_timer.h
> > @@ -101,7 +101,7 @@ struct rte_timer
> > -   volatile union rte_timer_status status; /**< Status of timer. */
> > +   union rte_timer_status status; /**< Status of timer. */
> 
> Unfortunately, I cannot merge this patch because it breaks the ABI:
> 
>   [C]'function void rte_timer_init(rte_timer*)' at rte_timer.c:214:1 has some
> indirect sub-type changes:
> parameter 1 of type 'rte_timer*' has sub-type changes:
>   in pointed to type 'struct rte_timer' at rte_timer.h:100:1:
> type size hasn't changed
> 1 data member changes (2 filtered):
>  type of 'volatile rte_timer_status rte_timer::status' changed:
>entity changed from 'volatile rte_timer_status' to 'union
> rte_timer_status' at rte_timer.h:67:1
>type size hasn't changed
> 

I think we can revert it to the original definition of rte_timer and keep the 
union rte_timer_status volatile-qualified. 
Because with or without this 'volatile' qualify, it generates the same code on 
aarch64 and x86.
So it seems acceptable to ignore it to make the ABI compatible?

Thank,
Phil


[dpdk-dev] [PATCH 2/3] maintainers: update for Intel iavf

2020-04-26 Thread Beilei Xing
Replace Wenzhuo Lu with Beilei Xing.

Signed-off-by: Beilei Xing 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index cc2664a..369ab26 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -675,7 +675,7 @@ F: doc/guides/nics/features/fm10k*.ini
 
 Intel iavf
 M: Jingjing Wu 
-M: Wenzhuo Lu 
+M: Beilei Xing 
 T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/iavf/
 F: drivers/common/iavf/
-- 
2.7.4



[dpdk-dev] [PATCH 0/3] update MAINTAINER

2020-04-26 Thread Beilei Xing
Update maintainer for Intel i40e, Intel iavf and driver testind tool.

Beilei Xing (3):
  maintainers: update for Intel i40e
  maintainers: update for Intel iavf
  maintainers: update for driver testing tool

 MAINTAINERS | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

-- 
2.7.4



[dpdk-dev] [PATCH 3/3] maintainers: update for driver testing tool

2020-04-26 Thread Beilei Xing
Replace Jingjing Wu with Beilei Xing.

Signed-off-by: Beilei Xing 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 369ab26..a2d7b76 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1487,7 +1487,7 @@ F: app/test/sample_packet_forward.h
 
 Driver testing tool
 M: Wenzhuo Lu 
-M: Jingjing Wu 
+M: Beilei Xing 
 M: Bernard Iremonger 
 T: git://dpdk.org/next/dpdk-next-net
 F: app/test-pmd/
-- 
2.7.4



[dpdk-dev] [PATCH 1/3] maintainers: update for Intel i40e

2020-04-26 Thread Beilei Xing
Replace Qi Zhang with Jeff Guo.

Signed-off-by: Beilei Xing 
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index cd50160..cc2664a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -658,7 +658,7 @@ F: doc/guides/nics/features/ixgbe*.ini
 
 Intel i40e
 M: Beilei Xing 
-M: Qi Zhang 
+M: Jeff Guo 
 T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/i40e/
 F: doc/guides/nics/i40e.rst
-- 
2.7.4



Re: [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail update on aarch64

2020-04-26 Thread Gavin Hu
Hi Honnappa,

> -Original Message-
> From: Honnappa Nagarahalli 
> Sent: Friday, April 24, 2020 10:46 PM
> To: Gavin Hu ; dev@dpdk.org
> Cc: nd ; david.march...@redhat.com;
> tho...@monjalon.net; jer...@marvell.com; Ruifeng Wang
> ; Phil Yang ; Joyce Kong
> ; Honnappa Nagarahalli
> ; nd 
> Subject: RE: [PATCH v1 2/2] ring: use wfe to wait for ring tail update on
> aarch64
> 
> 
> 
> Hi Gavin,
>   There are other sync modes added to rte_ring library. Can you
> please extend this to other rte_pause() calls in the library?
I looked into the other two calls to rte_cause, they can't use the 
rte_wait_until_equal_XX APIs simply.
Maybe new APIs are required? 
Here is the 32-bit API for your reference:
rte_wait_until_equal_32(volatile uint32_t *addr, uint32_t expected, int 
memorder)
> 
> Thanks,
> Honnappa
> 
> > Subject: [PATCH v1 2/2] ring: use wfe to wait for ring tail update on
> aarch64
> >
> > Instead of polling for tail to be updated, use wfe instruction.
> >
> > Signed-off-by: Gavin Hu 
> > Reviewed-by: Ruifeng Wang 
> > Reviewed-by: Steve Capper 
> > Reviewed-by: Ola Liljedahl 
> > Reviewed-by: Honnappa Nagarahalli 
> > ---
> >  lib/librte_ring/rte_ring_c11_mem.h | 4 ++--
> > lib/librte_ring/rte_ring_generic.h | 3 +--
> >  2 files changed, 3 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_ring/rte_ring_c11_mem.h
> > b/lib/librte_ring/rte_ring_c11_mem.h
> > index 0fb73a337..764d8f186 100644
> > --- a/lib/librte_ring/rte_ring_c11_mem.h
> > +++ b/lib/librte_ring/rte_ring_c11_mem.h
> > @@ -2,6 +2,7 @@
> >   *
> >   * Copyright (c) 2017,2018 HXT-semitech Corporation.
> >   * Copyright (c) 2007-2009 Kip Macy km...@freebsd.org
> > + * Copyright (c) 2019 Arm Limited
> >   * All rights reserved.
> >   * Derived from FreeBSD's bufring.h
> >   * Used as BSD-3 Licensed with permission from Kip Macy.
> > @@ -21,8 +22,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t
> > old_val, uint32_t new_val,
> >  * we need to wait for them to complete
> >  */
> > if (!single)
> > -   while (unlikely(ht->tail != old_val))
> > -   rte_pause();
> > +   rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
> >
> > __atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);  } diff --
> > git a/lib/librte_ring/rte_ring_generic.h 
> > b/lib/librte_ring/rte_ring_generic.h
> > index 953cdbbd5..682852783 100644
> > --- a/lib/librte_ring/rte_ring_generic.h
> > +++ b/lib/librte_ring/rte_ring_generic.h
> > @@ -23,8 +23,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t
> > old_val, uint32_t new_val,
> >  * we need to wait for them to complete
> >  */
> > if (!single)
> > -   while (unlikely(ht->tail != old_val))
> > -   rte_pause();
> > +   rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
> >
> > ht->tail = new_val;
> >  }
> > --
> > 2.17.1



[dpdk-dev] [PATCH v2 1/2] spinlock: use wfe to reduce contention on aarch64

2020-04-26 Thread Gavin Hu
In acquiring a spinlock, cores repeatedly poll the lock variable.
This is replaced by rte_wait_until_equal API.

Running the micro benchmarking and the testpmd and l3fwd traffic tests
on ThunderX2, Ampere eMAG80 and Arm N1SDP, everything went well and no
notable performance gain nor degradation was measured.

Signed-off-by: Gavin Hu 
Reviewed-by: Ruifeng Wang 
Reviewed-by: Phil Yang 
Reviewed-by: Steve Capper 
Reviewed-by: Ola Liljedahl 
Reviewed-by: Honnappa Nagarahalli 
Tested-by: Pavan Nikhilesh 
---
 lib/librte_eal/include/generic/rte_spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/include/generic/rte_spinlock.h 
b/lib/librte_eal/include/generic/rte_spinlock.h
index 87ae7a4f1..40fe49d5a 100644
--- a/lib/librte_eal/include/generic/rte_spinlock.h
+++ b/lib/librte_eal/include/generic/rte_spinlock.h
@@ -65,8 +65,8 @@ rte_spinlock_lock(rte_spinlock_t *sl)
 
while (!__atomic_compare_exchange_n(&sl->locked, &exp, 1, 0,
__ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
-   while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED))
-   rte_pause();
+   rte_wait_until_equal_32((volatile uint32_t *)&sl->locked,
+  0, __ATOMIC_RELAXED);
exp = 0;
}
 }
-- 
2.17.1



[dpdk-dev] [PATCH v2 0/2] Use WFE for spinlock and ring

2020-04-26 Thread Gavin Hu
The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
for a memory location to become equal to a given value'[1].

Use the API for the rte spinlock and ring implementations.

[1] http://patches.dpdk.org/cover/62703/

Gavin Hu (2):
  spinlock: use wfe to reduce contention on aarch64
  ring: use wfe to wait for ring tail update on aarch64

 lib/librte_eal/include/generic/rte_spinlock.h | 4 ++--
 lib/librte_ring/rte_ring_c11_mem.h| 4 ++--
 lib/librte_ring/rte_ring_generic.h| 3 +--
 3 files changed, 5 insertions(+), 6 deletions(-)

-- 
2.17.1



[dpdk-dev] [PATCH v2 2/2] ring: use wfe to wait for ring tail update on aarch64

2020-04-26 Thread Gavin Hu
Instead of polling for tail to be updated, use wfe instruction.

Signed-off-by: Gavin Hu 
Reviewed-by: Ruifeng Wang 
Reviewed-by: Steve Capper 
Reviewed-by: Ola Liljedahl 
Reviewed-by: Honnappa Nagarahalli 
---
 lib/librte_ring/rte_ring_c11_mem.h | 4 ++--
 lib/librte_ring/rte_ring_generic.h | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ring/rte_ring_c11_mem.h 
b/lib/librte_ring/rte_ring_c11_mem.h
index 0fb73a337..b37ef5995 100644
--- a/lib/librte_ring/rte_ring_c11_mem.h
+++ b/lib/librte_ring/rte_ring_c11_mem.h
@@ -2,6 +2,7 @@
  *
  * Copyright (c) 2017,2018 HXT-semitech Corporation.
  * Copyright (c) 2007-2009 Kip Macy km...@freebsd.org
+ * Copyright (c) 2019 Arm Limited
  * All rights reserved.
  * Derived from FreeBSD's bufring.h
  * Used as BSD-3 Licensed with permission from Kip Macy.
@@ -21,8 +22,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t old_val, 
uint32_t new_val,
 * we need to wait for them to complete
 */
if (!single)
-   while (unlikely(ht->tail != old_val))
-   rte_pause();
+   rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
 
__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
 }
diff --git a/lib/librte_ring/rte_ring_generic.h 
b/lib/librte_ring/rte_ring_generic.h
index 953cdbbd5..afce89aec 100644
--- a/lib/librte_ring/rte_ring_generic.h
+++ b/lib/librte_ring/rte_ring_generic.h
@@ -23,8 +23,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t old_val, 
uint32_t new_val,
 * we need to wait for them to complete
 */
if (!single)
-   while (unlikely(ht->tail != old_val))
-   rte_pause();
+   rte_wait_until_equal_32(&ht->tail, old_val, __ATOMIC_RELAXED);
 
ht->tail = new_val;
 }
-- 
2.17.1



Re: [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail update on aarch64

2020-04-26 Thread Gavin Hu
Hi Konstantin,

> -Original Message-
> From: Ananyev, Konstantin 
> Sent: Friday, April 24, 2020 9:24 PM
> To: Gavin Hu ; dev@dpdk.org
> Cc: nd ; david.march...@redhat.com;
> tho...@monjalon.net; jer...@marvell.com; Honnappa Nagarahalli
> ; Ruifeng Wang
> ; Phil Yang ; Joyce Kong
> 
> Subject: RE: [dpdk-dev] [PATCH v1 2/2] ring: use wfe to wait for ring tail
> update on aarch64
> 
> 
> >
> > Instead of polling for tail to be updated, use wfe instruction.
> >
> > Signed-off-by: Gavin Hu 
> > Reviewed-by: Ruifeng Wang 
> > Reviewed-by: Steve Capper 
> > Reviewed-by: Ola Liljedahl 
> > Reviewed-by: Honnappa Nagarahalli 
> > ---
> >  lib/librte_ring/rte_ring_c11_mem.h | 4 ++--
> >  lib/librte_ring/rte_ring_generic.h | 3 +--
> >  2 files changed, 3 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_ring/rte_ring_c11_mem.h
> b/lib/librte_ring/rte_ring_c11_mem.h
> > index 0fb73a337..764d8f186 100644
> > --- a/lib/librte_ring/rte_ring_c11_mem.h
> > +++ b/lib/librte_ring/rte_ring_c11_mem.h
> > @@ -2,6 +2,7 @@
> >   *
> >   * Copyright (c) 2017,2018 HXT-semitech Corporation.
> >   * Copyright (c) 2007-2009 Kip Macy km...@freebsd.org
> > + * Copyright (c) 2019 Arm Limited
> >   * All rights reserved.
> >   * Derived from FreeBSD's bufring.h
> >   * Used as BSD-3 Licensed with permission from Kip Macy.
> > @@ -21,8 +22,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t
> old_val, uint32_t new_val,
> >  * we need to wait for them to complete
> >  */
> > if (!single)
> > -   while (unlikely(ht->tail != old_val))
> > -   rte_pause();
> > +   rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
> >
> > __atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
> >  }
> > diff --git a/lib/librte_ring/rte_ring_generic.h
> b/lib/librte_ring/rte_ring_generic.h
> > index 953cdbbd5..682852783 100644
> > --- a/lib/librte_ring/rte_ring_generic.h
> > +++ b/lib/librte_ring/rte_ring_generic.h
> > @@ -23,8 +23,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t
> old_val, uint32_t new_val,
> >  * we need to wait for them to complete
> >  */
> > if (!single)
> > -   while (unlikely(ht->tail != old_val))
> > -   rte_pause();
> > +   rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
> 
> ./lib/librte_ring/rte_ring_generic.h:26:3: error: implicit declaration of
> function 'rte_wait_until_equal_relaxed_32'; did you mean
> 'rte_wait_until_equal_32'? [-Werror=implicit-function-declaration]
>rte_wait_until_equal_relaxed_32(&ht->tail, old_val);
>^~~
Sorry for the leakage, fixed in v2.
> 
> BTW, after patch #1 compilation fails also:
> ../lib/librte_eal/x86/include/rte_spinlock.h:125:24: error: pointer targets in
> passing argument 1 of 'rte_try_tm' differ in signedness [-Werror=pointer-
> sign]
>   if (likely(rte_try_tm(&sl->locked)))
Fixed in v2, thanks!
> ^
> 
> 
> 
> >
> > ht->tail = new_val;
> >  }
> > --
> > 2.17.1



Re: [dpdk-dev] [PATCH] doc: refine ethernet and VLAN flow rule items

2020-04-26 Thread Dekel Peled
Thanks, PSB.

> -Original Message-
> From: Andrew Rybchenko 
> Sent: Saturday, April 25, 2020 5:00 PM
> To: Dekel Peled ; Ori Kam ;
> john.mcnam...@intel.com; marko.kovace...@intel.com; Thomas Monjalon
> ; ferruh.yi...@intel.com
> Cc: dev@dpdk.org; Asaf Penso 
> Subject: Re: [PATCH] doc: refine ethernet and VLAN flow rule items
> 
> On 4/23/20 9:30 PM, Dekel Peled wrote:
> > Specified pattern may be translated in different manner.
> > For example the pattern "eth / ipv4" can be translated to match
> > untagged packets only, since the pattern doesn't specify a vlan item.
> 
> vlan -> VLAN

I will change to uppercase.

> 
> > It can also be translated to match both tagged and untagged packets,
> > for the same reason.
> > This patch updates the rte_flow documentation to clearly specify the
> > required pattern to use.
> > For example:
> > To match tagged ipv4 packets, the pattern "eth type is 0x8100 / vlan /
> > ipv4 / end" should be used.
> 
> Isn't eth / vlan / ipv4 /end sufficient? What's the difference?
> I guess later should allow any VLAN TPID, but it is greyish since it is HW
> dependent.

In the example I wanted to show explicit rule, to emphasize the importance of 
detailed pattern structure.

> 
> > To match untagged ipv4 packets, the pattern "eth type is 0x0800 /
> > ipv4 / end" should be used.
> 
> What about eth / ipv4 / end?
> Does usage of ipv4 assume that EtherType is 0x0800?

Same as above.

> 
> > To match both tagged and untagged packets, the pattern "eth / end"
> > should be used.
> 
> The interesting question is what should be used if I want either tagged or
> untagged IPv4 packets. I think it worse to mention to make the picture
> complete.

To match any IPV4 packet, either tagged or untagged, need to apply two rules.
One for tagged packets and the other for untagged packets.
I will add this example as well.

> 
> > Signed-off-by: Dekel Peled 
> > ---
> >  doc/guides/prog_guide/rte_flow.rst | 8 
> >  lib/librte_ethdev/rte_flow.h   | 9 +
> >  2 files changed, 17 insertions(+)
> >
> > diff --git a/doc/guides/prog_guide/rte_flow.rst
> > b/doc/guides/prog_guide/rte_flow.rst
> > index cf4368e..0d1c305 100644
> > --- a/doc/guides/prog_guide/rte_flow.rst
> > +++ b/doc/guides/prog_guide/rte_flow.rst
> > @@ -905,6 +905,12 @@ so-called layer 2.5 pattern items such as
> > ``RTE_FLOW_ITEM_TYPE_VLAN``. In  the latter case, ``type`` refers to
> > that of the outer header, with the inner  EtherType/TPID provided by
> > the subsequent pattern item. This is the same  order as on the wire.
> > +If the ``type`` field contains a TPID value, then only tagged packets
> > +will match the pattern.
> 
> Shouldn't we emphasis that "tagged packets with specified TPID will match
> the pattern." since tagged packets could have various TPIDs.

Agree, I will update.

> 
> > +If the ``type`` field contains another EtherType value, then only
> > +untagged packets will match the pattern.
> 
> I'm afraid "another EtherType" is too ambiguous.
> "non-TPID EtherType" is ambiguous as well and HW dependent. May be it is
> better to remove the sentence completely.

I think this sentence is important in order to emphasize the untagged packets 
case.
How about "Otherwise, only untagged packets will match the pattern."?

> 
> > +If the ``ETH`` item is the only item in the pattern, and the ``type``
> > +field is not specified, then both tagged and untagged packets will match
> the pattern.
> >
> >  - ``dst``: destination MAC.
> >  - ``src``: source MAC.
> > @@ -919,6 +925,8 @@ Matches an 802.1Q/ad VLAN tag.
> >  The corresponding standard outer EtherType (TPID) values are
> > ``RTE_ETHER_TYPE_VLAN`` or ``RTE_ETHER_TYPE_QINQ``. It can be
> > overridden by the  preceding pattern item.
> > +If a ``VLAN`` item is present in the pattern, then only tagged
> > +packets will match the pattern.
> >
> >  - ``tci``: tag control information.
> >  - ``inner_type``: inner EtherType or TPID.
> > diff --git a/lib/librte_ethdev/rte_flow.h
> > b/lib/librte_ethdev/rte_flow.h index 132b44e..178e87e 100644
> > --- a/lib/librte_ethdev/rte_flow.h
> > +++ b/lib/librte_ethdev/rte_flow.h
> > @@ -710,6 +710,13 @@ struct rte_flow_item_raw {
> >   * the latter case, @p type refers to that of the outer header, with the
> >   * inner EtherType/TPID provided by the subsequent pattern item. This is
> the
> >   * same order as on the wire.
> > + * If the @p type field contains a TPID value, then only tagged
> > + packets will
> > + * match the pattern.
> > + * If the @p type field contains another EtherType value, then only
> > + untagged
> > + * packets will match the pattern.
> > + * If the @p ETH item is the only item in the pattern, and the @p
> > + type field
> > + * is not specified, then both tagged and untagged packets will match
> > + the
> > + * pattern.
> >   */
> >  struct rte_flow_item_eth {
> > struct rte_ether_addr dst; /**< Destination MAC. */ @@ -734,6
> +741,8
> > @@ struct rte_flow_item_eth {
> >   * The correspon

Re: [dpdk-dev] [PATCH] app/testpmd: fix Rx/Tx stats after clear stats command

2020-04-26 Thread Wei Hu (Xavier)

Hi, Ferruh Yigit

On 2020/4/25 0:12, Ferruh Yigit wrote:

On 4/24/2020 12:07 PM, Wei Hu (Xavier) wrote:

From: Chengwen Feng 

Currently, when running start/clear stats&xstats/stop command many times
based on testpmd application, there are incorrect RX/TX-packets stats as
below:
-- Forward statistics for port 0  --
RX-packets: 18446744073709544808 RX-dropped: 0 ...ignore
TX-packets: 18446744073709536616 TX-dropped: 0 ...ignore


The root cause as below:
1. The struct rte_port of testpmd.h has a member variable
"struct rte_eth_stats stats" to store the last port statistics.
2. When runnig start command, it execute cmd_start_parsed ->
start_packet_forwarding -> fwd_stats_reset, which call rte_eth_stats_get
API function to save current port statistics.
3. When running stop command, it execute fwd_stats_display, which call
rte_eth_stats_get to get current port statistics, and then minus last
port statistics.
4. If we run clear stats or xstats after start command, then run stop,
it may display above incorrect stats because the current Rx/Tx-packets
is lower than the last saved RX/TX-packets(uint64_t overflow).


Looks like valid issue.

Can you please update the title to mention this fixes the forward stats (to
prevent the misunderstanding that issue is in the port stats).

Also can you please update the documentation
(doc/guides/testpmd_app_ug/testpmd_funcs.rst), "clear port" command to say this
will also affect the forward stats output (show fwd)?



This patch fixes it by clearing last port statistics when executing
"clear stats/xstats" command.

Fixes: af75078fece3 ("first public release")
Cc: sta...@dpdk.org

Signed-off-by: Chengwen Feng 
Signed-off-by: Wei Hu (Xavier) 
---
  app/test-pmd/config.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 72f25d152..0d2375607 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -234,10 +234,16 @@ nic_stats_display(portid_t port_id)
  void
  nic_stats_clear(portid_t port_id)
  {
+   struct rte_port *port;
+
if (port_id_is_invalid(port_id, ENABLED_WARN)) {
print_valid_ports();
return;
}
+
+   port = &ports[port_id];
+   /* clear last port statistics because eth stats reset */
+   memset(&port->stats, 0, sizeof(port->stats));


"clear fwd stats" command does same thing in "fwd_stats_reset()" as:
rte_eth_stats_get(pt_id, &ports[pt_id].stats);

I suggest doing same here for consistency, but it should be after
'rte_eth_stats_reset()' in that case.



I will modify it as follows, is it consistent with your comment?
Thanks.

void
fwd_stats_reset(void)
{
streamid_t sm_id;
portid_t pt_id;
int i;

for (i = 0; i < cur_fwd_config.nb_fwd_ports; i++) {
pt_id = fwd_ports_ids[i];
-   rte_eth_stats_get(pt_id, &ports[pt_id].stats);
+   rte_eth_stats_reset(port_id);
+   meset(&ports[pt_id].stats, 0, sizeof(ports[pt_id].stats));
}
for (sm_id = 0; sm_id < cur_fwd_config.nb_fwd_streams; sm_id++) {
struct fwd_stream *fs = fwd_streams[sm_id];

}
}

Regards
Xavier


rte_eth_stats_reset(port_id);
printf("\n  NIC statistics for port %d cleared\n", port_id);
  }
@@ -308,12 +314,17 @@ nic_xstats_display(portid_t port_id)
  void
  nic_xstats_clear(portid_t port_id)
  {
+   struct rte_port *port;
int ret;
  
  	if (port_id_is_invalid(port_id, ENABLED_WARN)) {

print_valid_ports();
return;
}
+
+   port = &ports[port_id];
+   /* clear last port statistics because eth xstats(include stats) reset */
+   memset(&port->stats, 0, sizeof(port->stats));
ret = rte_eth_xstats_reset(port_id);
if (ret != 0) {
printf("%s: Error: failed to reset xstats (port %u): %s",





Re: [dpdk-dev] [PATCH v9 1/6] lib/eal: implement the family of common bit operation APIs

2020-04-26 Thread Thomas Monjalon
26/04/2020 09:18, Joyce Kong:
> From: Thomas Monjalon 
> > 24/04/2020 05:21, Joyce Kong:
> > > Bitwise operation APIs are defined and used in a lot of PMDs, which
> > > caused a huge code duplication. To reduce duplication, this patch
> > > consolidates them into a common API family.
> > [...]
> > > +rte_get_bit32_relaxed(unsigned int nr, volatile uint32_t *addr)
> > > +rte_set_bit32_relaxed(unsigned int nr, volatile uint32_t *addr)
> > > +rte_clear_bit32_relaxed(unsigned int nr, volatile uint32_t *addr)
> > > +rte_test_and_set_bit32_relaxed(unsigned int nr, volatile uint32_t
> > > +*addr) rte_test_and_clear_bit32_relaxed(unsigned int nr, volatile
> > > +uint32_t *addr) rte_get_bit64_relaxed(unsigned int nr, volatile
> > > +uint64_t *addr) rte_set_bit64_relaxed(unsigned int nr, volatile
> > > +uint64_t *addr) rte_clear_bit64_relaxed(unsigned int nr, volatile
> > > +uint64_t *addr) rte_test_and_set_bit64_relaxed(unsigned int nr,
> > > +volatile uint64_t *addr) rte_test_and_clear_bit64_relaxed(unsigned
> > > +int nr, volatile uint64_t *addr)
> > 
> > Sorry, I have one more naming concern with this series.
> > I prefer a common namespace for bit operations.
> > Would you be OK to prefix all function names with rte_bit_relaxed_?
> > 
> Hi Thomas,
> Do you mean to rename the functions as 'rte_bit_relaxed_get_bit32'?
> If the example is ok, I will modify as this in v10.

Yes, thank you.




Re: [dpdk-dev] [PATCH] Fix various typos found by Lintian

2020-04-26 Thread Luca Boccassi
On Sat, 2020-04-25 at 19:47 +0200, Thomas Monjalon wrote:
> 04/03/2020 16:28, Luca Boccassi:
> > On Wed, 2020-03-04 at 14:34 +, Kevin Traynor wrote:
> > > On 29/02/2020 16:37, luca.bocca...@gmail.com  wrote:
> > > > Debian's linter is getting more and more annoy^^smart and now
> > > > parses binaries
> > > > for typos too - CC stable to get it off my back in the next
> > > > release
> > > 
> > > Minor: Probably linter is better trained in the Queen's English
> > > than
> > > me
> > > or it could be personal preference, but 'one' seems to be
> > > referring
> > > to
> > > the user and it reads a bit strange for me. e.g.
> > > 
> > > "Slave %d capabilities doesn't allow one to allocate additional
> > > queues"
> > > "hardware specifications that allow one to handle virtual memory"
> > > "Do not allow one to send packet if the maximum DMA.."
> > > 
> > > as opposed to
> > > 
> > > "Slave %d capabilities don't allow allocation of additional
> > > queues"
> > > "hardware specifications that allow handling of virtual memory"
> > > "Do not allow sending of a packet if the maximum DMA.."
> > 
> > You might be right - but the intent here is not to be correct, it's
> > to
> > get the linter to leave me alone :-)
> 
> I agree with Kevin that the wording "allow one to make" is strange.
> Would lintian leave you alone with "allow making"?
> 
> Anyway the "allow to" sentences are not typos.
> They could be reworded in a separate patch.
> 
> Patch partly applied, except the "allow one to" changes, thanks.

It probably would work - will check in the next cycle.

-- 
Kind regards,
Luca Boccassi


[dpdk-dev] [PATCH] mem: fix build

2020-04-26 Thread Thomas Monjalon
Some compilers (on RHEL7 and CentOS7) were getting this error:
error: "RTE_EXEC_ENV_FREEBSD" is not defined [-Werror=undef]

Existence of a macro must be checked with "#ifdef" or "#if defined".

Fixes: d72e4042c5eb ("mem: exclude unused memory from core dump")

Signed-off-by: Thomas Monjalon 
---
 lib/librte_eal/common/eal_common_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c 
b/lib/librte_eal/common/eal_common_memory.c
index 9a797a4fa3..4c897a13f1 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -42,7 +42,7 @@ static uint64_t system_page_sz;
 
 #ifdef RTE_EXEC_ENV_LINUX
 #define RTE_DONTDUMP MADV_DONTDUMP
-#elif RTE_EXEC_ENV_FREEBSD
+#elif defined RTE_EXEC_ENV_FREEBSD
 #define RTE_DONTDUMP MADV_NOCORE
 #else
 #error "madvise doesn't support this OS"
-- 
2.26.0



Re: [dpdk-dev] [PATCH] mem: fix build

2020-04-26 Thread Raslan Darawsheh
> -Original Message-
> From: dev  On Behalf Of Thomas Monjalon
> Sent: Sunday, April 26, 2020 1:47 PM
> To: dev@dpdk.org
> Cc: david.march...@redhat.com; Anatoly Burakov
> ; Li Feng 
> Subject: [dpdk-dev] [PATCH] mem: fix build
> 
> Some compilers (on RHEL7 and CentOS7) were getting this error:
>   error: "RTE_EXEC_ENV_FREEBSD" is not defined [-Werror=undef]
> 
> Existence of a macro must be checked with "#ifdef" or "#if defined".
> 
> Fixes: d72e4042c5eb ("mem: exclude unused memory from core dump")
> 
> Signed-off-by: Thomas Monjalon 
Tested-by: Raslan Darawsheh 

> ---
>  lib/librte_eal/common/eal_common_memory.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_eal/common/eal_common_memory.c
> b/lib/librte_eal/common/eal_common_memory.c
> index 9a797a4fa3..4c897a13f1 100644
> --- a/lib/librte_eal/common/eal_common_memory.c
> +++ b/lib/librte_eal/common/eal_common_memory.c
> @@ -42,7 +42,7 @@ static uint64_t system_page_sz;
> 
>  #ifdef RTE_EXEC_ENV_LINUX
>  #define RTE_DONTDUMP MADV_DONTDUMP
> -#elif RTE_EXEC_ENV_FREEBSD
> +#elif defined RTE_EXEC_ENV_FREEBSD
>  #define RTE_DONTDUMP MADV_NOCORE
>  #else
>  #error "madvise doesn't support this OS"
> --
> 2.26.0

Kindest regards
Raslan Darawsheh


Re: [dpdk-dev] [PATCH 1/2] eal/windows: replace sys/queue.h with a complete one from FreeBSD

2020-04-26 Thread Jerin Jacob
On Sun, Apr 26, 2020 at 8:53 AM Dmitry Kozlyuk  wrote:
>
> Limited version imported previously lacks STAILQ macros used by tracing
> and SLIST macros used by memory management.  Import a complete file from
> FreeBSD, since its license exception is already approved by Technical
> Board.
>
> Fixes: 321dd5f8fa62 ("trace: add internal init and fini interface")

IMO, we don't need to mark this patch as fixes.

> Signed-off-by: Dmitry Kozlyuk 
> ---


Re: [dpdk-dev] [PATCH] mem: fix build

2020-04-26 Thread Thomas Monjalon
> > Some compilers (on RHEL7 and CentOS7) were getting this error:
> > error: "RTE_EXEC_ENV_FREEBSD" is not defined [-Werror=undef]
> > 
> > Existence of a macro must be checked with "#ifdef" or "#if defined".
> > 
> > Fixes: d72e4042c5eb ("mem: exclude unused memory from core dump")
> > 
> > Signed-off-by: Thomas Monjalon 
> Tested-by: Raslan Darawsheh 

Applied





Re: [dpdk-dev] [PATCH 2/2] eal/windows: fix build by supporting trace

2020-04-26 Thread Jerin Jacob
On Sun, Apr 26, 2020 at 8:53 AM Dmitry Kozlyuk  wrote:
>
> Add EAL private functions to support trace storage:
>
> * eal_persistent_data_path()
> * eal_dir_create()
>
> Replace clock_gettime(CLOCK_REALTIME) with C11 timespec_get().
> Implementation is provided for MinGW-w64 that misses this function.
>
> Provide minimum viable implementations of malloc and timer functions
> used by tracing.
>
> Fixes: 185b7dc1d467 ("trace: save bootup timestamp")
> Fixes: 321dd5f8fa62 ("trace: add internal init and fini interface")
> Reported-by: Pallavi Kadam 
> Signed-off-by: Dmitry Kozlyuk 
> ---

In general, the patch looks good to me. Could you split the patch as two

1) New eal_permanent_data_path()  and eal_dir_create() API definition,
and its implementation with unix and the common code changes to adapt new APIs.
2) Windows implementation of eal_permanent_data_path() and other
missing pieces for
trace implementation.

Some comments below,

>
>  void eal_free_no_trace(void *addr);
>
> +/**
> + * Get absolute path to the directory where permanent data can be stored.
> + *
> + * @return
> + *  Statically allocated string on success, NULL on failure.
> + */
> +const char *
> +eal_permanent_data_path(void);

Do windows have PATH_MAX kind of macro? I think, it is better API
consumer allocates
the memory of size PATH_MAX and implementation fills it, instead of,
the static scheme.

> +
> +/**
> + * Create a directory accessible to the current user only.
> + *
> + * This function does not create intermediate directories,
> + * thus only the last path component may be nonexistent.
> + *
> + * This function succeeds if path already exists and is a directory.
> + *
> + * Platform-independent code should use forward slash as path separator.
> + *
> + * @param path
> + *  Path to be created.
> + * @return
> + *  0 on success, (-1) on failure and rte_errno is set.
> + */
> +int eal_dir_create(const char *path);
>

Looks good to me.


Re: [dpdk-dev] [PATCH 2/2] ethdev: allow unknown link speed

2020-04-26 Thread Matan Azrad
Hi

From: Thomas Monjalon
> When querying the link informations, the link status is a mandatory major
> information.
> Other boolean values are supposed to be accurate:
>   - duplex mode (half/full)
>   - negotiation (auto/fixed)
> 
> This API update is making explicit that the link speed information is 
> optional.
> The value ETH_SPEED_NUM_NONE (0) was already part of the API.
> The value ETH_SPEED_NUM_UNKNOWN (infinite) is added to cover two
> different cases:
>   - speed is not known by the driver
>   - device is virtual
> 
> Suggested-by: Morten Brørup 
> Suggested-by: Benoit Ganne 
> Signed-off-by: Thomas Monjalon 
> ---
>  lib/librte_ethdev/rte_ethdev.h | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index d1a593ad11..2d51fd3444 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -300,6 +300,7 @@ struct rte_eth_stats {
>  #define ETH_SPEED_NUM_50G  5 /**<  50 Gbps */
>  #define ETH_SPEED_NUM_56G  56000 /**<  56 Gbps */
>  #define ETH_SPEED_NUM_100G10 /**< 100 Gbps */
> +#define ETH_SPEED_NUM_UNKNOWN UINT32_MAX /**< Unknown */
> 
>  /**
>   * A structure used to retrieve link-level information of an Ethernet port.
> @@ -2245,15 +2246,16 @@ int rte_eth_allmulticast_disable(uint16_t
> port_id);  int rte_eth_allmulticast_get(uint16_t port_id);
> 
>  /**
> - * Retrieve the status (ON/OFF), the speed (in Mbps) and the mode (HALF-
> DUPLEX
> - * or FULL-DUPLEX) of the physical link of an Ethernet device. It might need
> - * to wait up to 9 seconds in it.
> + * Retrieve the link status (up/down), the duplex mode (half/full),
> + * the negotiation (auto/fixed), and if available, the speed (Mbps).
> + *
> + * It might need to wait up to 9 seconds.
> + * @see rte_eth_link_get_nowait.
>   *
>   * @param port_id
>   *   The port identifier of the Ethernet device.
>   * @param link
> - *   A pointer to an *rte_eth_link* structure to be filled with
> - *   the status, the speed and the mode of the Ethernet device link.
> + *   Link informations written back.
>   * @return
>   *   - (0) if successful.
>   *   - (-ENOTSUP) if the function is not supported in PMD driver.
> @@ -2262,15 +2264,13 @@ int rte_eth_allmulticast_get(uint16_t port_id);
> int rte_eth_link_get(uint16_t port_id, struct rte_eth_link *link);
> 
>  /**
> - * Retrieve the status (ON/OFF), the speed (in Mbps) and the mode (HALF-
> DUPLEX
> - * or FULL-DUPLEX) of the physical link of an Ethernet device. It is a 
> no-wait
> - * version of rte_eth_link_get().
> + * Retrieve the link status (up/down), the duplex mode (half/full),
> + * the negotiation (auto/fixed), and if available, the speed (Mbps).

The current API doesn’t say that link speed is optional, so no link speed 
information means error in API.
When making link speed optional, it may break existing app that expects to get 
error in API when link speed is not available in order to call the API again.
So I think it breaks API.
I agree for the change but think that this is better to integrate it in 20.11 
after sending prior deprecation notice.

Matan


>   *
>   * @param port_id
>   *   The port identifier of the Ethernet device.
>   * @param link
> - *   A pointer to an *rte_eth_link* structure to be filled with
> - *   the status, the speed and the mode of the Ethernet device link.
> + *   Link informations written back.
>   * @return
>   *   - (0) if successful.
>   *   - (-ENOTSUP) if the function is not supported in PMD driver.
> --
> 2.26.0



Re: [dpdk-dev] [PATCH 2/2] eal/windows: fix build by supporting trace

2020-04-26 Thread Dmitry Kozlyuk
On 2020-04-26 17:02 GMT+0530 Jerin Jacob wrote:
> > +/**
> > + * Get absolute path to the directory where permanent data can be stored.
> > + *
> > + * @return
> > + *  Statically allocated string on success, NULL on failure.
> > + */
> > +const char *
> > +eal_permanent_data_path(void);  
> 
> Do windows have PATH_MAX kind of macro? I think, it is better API
> consumer allocates
> the memory of size PATH_MAX and implementation fills it, instead of,
> the static scheme.

This API falls in line with rte_eal_get_runtime_dir() and other
eal_filesystem.h functions, that use static scheme. Logically, its result
never changes. It is race-free and is only called during initialization. What
you propose can be done, but are there any benefits?

While we're at it, don't these declarations belong to eal_filesystem.h? I
left them in eal_private.h, because eal_filesystem.h is mostly Unix-specific.

-- 
Dmitry Kozlyuk


[dpdk-dev] [PATCH 1/3] vdpa/mlx5: manage virtqs by array

2020-04-26 Thread Matan Azrad
As a preparation to listen the virtqs status before the device is
configured, manage the virtqs structures in array instead of list.

Signed-off-by: Matan Azrad 
Acked-by: Viacheslav Ovsiienko 
---
 drivers/vdpa/mlx5/mlx5_vdpa.c   | 43 --
 drivers/vdpa/mlx5/mlx5_vdpa.h   |  2 +-
 drivers/vdpa/mlx5/mlx5_vdpa_lm.c| 43 --
 drivers/vdpa/mlx5/mlx5_vdpa_steer.c | 18 +++
 drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 46 +++--
 5 files changed, 68 insertions(+), 84 deletions(-)

diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
index f1bfe7a..70327df 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
@@ -116,20 +116,18 @@
 {
int did = rte_vhost_get_vdpa_device_id(vid);
struct mlx5_vdpa_priv *priv = mlx5_vdpa_find_priv_resource_by_did(did);
-   struct mlx5_vdpa_virtq *virtq = NULL;
 
if (priv == NULL) {
DRV_LOG(ERR, "Invalid device id: %d.", did);
return -EINVAL;
}
-   SLIST_FOREACH(virtq, &priv->virtq_list, next)
-   if (virtq->index == vring)
-   break;
-   if (!virtq) {
+   if (!priv->configured || vring >= RTE_MIN((int)priv->nr_virtqs,
+   (int)priv->caps.max_num_virtio_queues * 2) ||
+   !priv->virtqs[vring].virtq) {
DRV_LOG(ERR, "Invalid or unconfigured vring id: %d.", vring);
return -EINVAL;
}
-   return mlx5_vdpa_virtq_enable(virtq, state);
+   return mlx5_vdpa_virtq_enable(&priv->virtqs[vring], state);
 }
 
 static int
@@ -482,28 +480,28 @@
rte_errno = ENODEV;
return -rte_errno;
}
-   priv = rte_zmalloc("mlx5 vDPA device private", sizeof(*priv),
-  RTE_CACHE_LINE_SIZE);
-   if (!priv) {
-   DRV_LOG(ERR, "Failed to allocate private memory.");
-   rte_errno = ENOMEM;
-   goto error;
-   }
ret = mlx5_devx_cmd_query_hca_attr(ctx, &attr);
if (ret) {
DRV_LOG(ERR, "Unable to read HCA capabilities.");
rte_errno = ENOTSUP;
goto error;
-   } else {
-   if (!attr.vdpa.valid || !attr.vdpa.max_num_virtio_queues) {
-   DRV_LOG(ERR, "Not enough capabilities to support vdpa,"
-   " maybe old FW/OFED version?");
-   rte_errno = ENOTSUP;
-   goto error;
-   }
-   priv->caps = attr.vdpa;
-   priv->log_max_rqt_size = attr.log_max_rqt_size;
+   } else if (!attr.vdpa.valid || !attr.vdpa.max_num_virtio_queues) {
+   DRV_LOG(ERR, "Not enough capabilities to support vdpa, maybe "
+   "old FW/OFED version?");
+   rte_errno = ENOTSUP;
+   goto error;
+   }
+   priv = rte_zmalloc("mlx5 vDPA device private", sizeof(*priv) +
+  sizeof(struct mlx5_vdpa_virtq) *
+  attr.vdpa.max_num_virtio_queues * 2,
+  RTE_CACHE_LINE_SIZE);
+   if (!priv) {
+   DRV_LOG(ERR, "Failed to allocate private memory.");
+   rte_errno = ENOMEM;
+   goto error;
}
+   priv->caps = attr.vdpa;
+   priv->log_max_rqt_size = attr.log_max_rqt_size;
priv->ctx = ctx;
priv->dev_addr.pci_addr = pci_dev->addr;
priv->dev_addr.type = VDPA_ADDR_PCI;
@@ -519,7 +517,6 @@
goto error;
}
SLIST_INIT(&priv->mr_list);
-   SLIST_INIT(&priv->virtq_list);
pthread_mutex_lock(&priv_list_lock);
TAILQ_INSERT_TAIL(&priv_list, priv, next);
pthread_mutex_unlock(&priv_list_lock);
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h
index 75af410..baec106 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.h
@@ -120,11 +120,11 @@ struct mlx5_vdpa_priv {
uint16_t nr_virtqs;
uint64_t features; /* Negotiated features. */
uint16_t log_max_rqt_size;
-   SLIST_HEAD(virtq_list, mlx5_vdpa_virtq) virtq_list;
struct mlx5_vdpa_steer steer;
struct mlx5dv_var *var;
void *virtq_db_addr;
SLIST_HEAD(mr_list, mlx5_vdpa_query_mr) mr_list;
+   struct mlx5_vdpa_virtq virtqs[];
 };
 
 /**
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_lm.c b/drivers/vdpa/mlx5/mlx5_vdpa_lm.c
index 4457760..77f2eda 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa_lm.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa_lm.c
@@ -15,13 +15,12 @@
.type = MLX5_VIRTQ_MODIFY_TYPE_DIRTY_BITMAP_DUMP_ENABLE,
.dirty_bitmap_dump_enable = enable,
};
-   struct mlx5_vdpa_virtq *virtq;
+   int i;
 
-   SLIST_FOREACH(virtq, &priv->virtq_list, next) {
-   attr.queue_index = virtq->index;
-   

[dpdk-dev] [PATCH 0/3] vdpa/mlx5: recreate a virtq becoming enabled

2020-04-26 Thread Matan Azrad
Since a virtq configuration may be changed in disable state it is better to 
recreate a virtq becoming enabled.
This series adding this behaviour to the mlx5 driver for vDPA.

v2:
1. Address Maxime comments:
- returning -1 for out of range error.
- commit massage spelling.
2. rebase.

Matan Azrad (3):
  vdpa/mlx5: manage virtqs by array
  vdpa/mlx5: separate virtq stop
  vdpa/mlx5: recreate a virtq becoming enabled

 drivers/vdpa/mlx5/mlx5_vdpa.c   |  47 +++
 drivers/vdpa/mlx5/mlx5_vdpa.h   |  54 +++--
 drivers/vdpa/mlx5/mlx5_vdpa_lm.c|  57 --
 drivers/vdpa/mlx5/mlx5_vdpa_steer.c | 103 
 drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 113 +++-
 5 files changed, 231 insertions(+), 143 deletions(-)

-- 
1.8.3.1



[dpdk-dev] [PATCH 2/3] vdpa/mlx5: separate virtq stop

2020-04-26 Thread Matan Azrad
In live migration, before logging the virtq, the driver queries the
virtq indexes after moving it to suspend mode.

Separate this method to new function mlx5_vdpa_virtq_stop as a
preparation for reusing.

Signed-off-by: Matan Azrad 
Acked-by: Viacheslav Ovsiienko 
Reviewed-by: Maxime Coquelin 
---
 drivers/vdpa/mlx5/mlx5_vdpa.h   | 13 +
 drivers/vdpa/mlx5/mlx5_vdpa_lm.c| 17 ++---
 drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 26 ++
 3 files changed, 41 insertions(+), 15 deletions(-)

diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h
index baec106..0edd688 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.h
@@ -308,4 +308,17 @@ int mlx5_vdpa_dirty_bitmap_set(struct mlx5_vdpa_priv 
*priv, uint64_t log_base,
  */
 int mlx5_vdpa_virtq_modify(struct mlx5_vdpa_virtq *virtq, int state);
 
+/**
+ * Stop virtq before destroying it.
+ *
+ * @param[in] priv
+ *   The vdpa driver private structure.
+ * @param[in] index
+ *   The virtq index.
+ *
+ * @return
+ *   0 on success, a negative value otherwise.
+ */
+int mlx5_vdpa_virtq_stop(struct mlx5_vdpa_priv *priv, int index);
+
 #endif /* RTE_PMD_MLX5_VDPA_H_ */
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_lm.c b/drivers/vdpa/mlx5/mlx5_vdpa_lm.c
index 77f2eda..26b7ce1 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa_lm.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa_lm.c
@@ -91,7 +91,6 @@
 int
 mlx5_vdpa_lm_log(struct mlx5_vdpa_priv *priv)
 {
-   struct mlx5_devx_virtq_attr attr = {0};
uint64_t features;
int ret = rte_vhost_get_negotiated_features(priv->vid, &features);
int i;
@@ -103,21 +102,9 @@
if (!RTE_VHOST_NEED_LOG(features))
return 0;
for (i = 0; i < priv->nr_virtqs; ++i) {
-   ret = mlx5_vdpa_virtq_modify(&priv->virtqs[i], 0);
-   if (ret)
-   return -1;
-   if (mlx5_devx_cmd_query_virtq(priv->virtqs[i].virtq, &attr)) {
-   DRV_LOG(ERR, "Failed to query virtq %d.", i);
-   return -1;
-   }
-   DRV_LOG(INFO, "Query vid %d vring %d: hw_available_idx=%d, "
-   "hw_used_index=%d", priv->vid, i,
-   attr.hw_available_index, attr.hw_used_index);
-   ret = rte_vhost_set_vring_base(priv->vid, i,
-  attr.hw_available_index,
-  attr.hw_used_index);
+   ret = mlx5_vdpa_virtq_stop(priv, i);
if (ret) {
-   DRV_LOG(ERR, "Failed to set virtq %d base.", i);
+   DRV_LOG(ERR, "Failed to stop virtq %d.", i);
return -1;
}
rte_vhost_log_used_vring(priv->vid, i, 0,
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c 
b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c
index e07c826..9c0284c 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa_virtq.c
@@ -112,6 +112,32 @@
return mlx5_devx_cmd_modify_virtq(virtq->virtq, &attr);
 }
 
+int
+mlx5_vdpa_virtq_stop(struct mlx5_vdpa_priv *priv, int index)
+{
+   struct mlx5_devx_virtq_attr attr = {0};
+   struct mlx5_vdpa_virtq *virtq = &priv->virtqs[index];
+   int ret = mlx5_vdpa_virtq_modify(virtq, 0);
+
+   if (ret)
+   return -1;
+   if (mlx5_devx_cmd_query_virtq(virtq->virtq, &attr)) {
+   DRV_LOG(ERR, "Failed to query virtq %d.", index);
+   return -1;
+   }
+   DRV_LOG(INFO, "Query vid %d vring %d: hw_available_idx=%d, "
+   "hw_used_index=%d", priv->vid, index,
+   attr.hw_available_index, attr.hw_used_index);
+   ret = rte_vhost_set_vring_base(priv->vid, index,
+  attr.hw_available_index,
+  attr.hw_used_index);
+   if (ret) {
+   DRV_LOG(ERR, "Failed to set virtq %d base.", index);
+   return -1;
+   }
+   return 0;
+}
+
 static uint64_t
 mlx5_vdpa_hva_to_gpa(struct rte_vhost_memory *mem, uint64_t hva)
 {
-- 
1.8.3.1



[dpdk-dev] [PATCH 3/3] vdpa/mlx5: recreate a virtq becoming enabled

2020-04-26 Thread Matan Azrad
The virtq configurations may be changed when it moves from disabled
state to enabled state.

Listen to the state callback even if the device is not configured.
Recreate the virtq when it moves from disabled state to enabled state
and when the device is configured.

Signed-off-by: Matan Azrad 
Acked-by: Viacheslav Ovsiienko 
Reviewed-by: Maxime Coquelin 
---
 drivers/vdpa/mlx5/mlx5_vdpa.c   | 12 +++--
 drivers/vdpa/mlx5/mlx5_vdpa.h   | 39 +++--
 drivers/vdpa/mlx5/mlx5_vdpa_lm.c| 17 +---
 drivers/vdpa/mlx5/mlx5_vdpa_steer.c | 87 ++---
 drivers/vdpa/mlx5/mlx5_vdpa_virtq.c | 59 ++---
 5 files changed, 146 insertions(+), 68 deletions(-)

diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
index 70327df..9f7353d 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
@@ -121,13 +121,11 @@
DRV_LOG(ERR, "Invalid device id: %d.", did);
return -EINVAL;
}
-   if (!priv->configured || vring >= RTE_MIN((int)priv->nr_virtqs,
-   (int)priv->caps.max_num_virtio_queues * 2) ||
-   !priv->virtqs[vring].virtq) {
-   DRV_LOG(ERR, "Invalid or unconfigured vring id: %d.", vring);
-   return -EINVAL;
+   if (vring >= (int)priv->caps.max_num_virtio_queues * 2) {
+   DRV_LOG(ERR, "Too big vring id: %d.", vring);
+   return -E2BIG;
}
-   return mlx5_vdpa_virtq_enable(&priv->virtqs[vring], state);
+   return mlx5_vdpa_virtq_enable(priv, vring, state);
 }
 
 static int
@@ -206,7 +204,7 @@
if (priv->configured)
ret |= mlx5_vdpa_lm_log(priv);
mlx5_vdpa_cqe_event_unset(priv);
-   ret |= mlx5_vdpa_steer_unset(priv);
+   mlx5_vdpa_steer_unset(priv);
mlx5_vdpa_virtqs_release(priv);
mlx5_vdpa_event_qp_global_release(priv);
mlx5_vdpa_mem_dereg(priv);
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h
index 0edd688..fcc216a 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.h
@@ -127,6 +127,24 @@ struct mlx5_vdpa_priv {
struct mlx5_vdpa_virtq virtqs[];
 };
 
+/*
+ * Check whether virtq is for traffic receive.
+ * According to VIRTIO_NET Spec the virtqueues index identity its type by:
+ * 0 receiveq1
+ * 1 transmitq1
+ * ...
+ * 2(N-1) receiveqN
+ * 2(N-1)+1 transmitqN
+ * 2N controlq
+ */
+static inline uint8_t
+is_virtq_recvq(int virtq_index, int nr_vring)
+{
+   if (virtq_index % 2 == 0 && virtq_index != nr_vring - 1)
+   return 1;
+   return 0;
+}
+
 /**
  * Release all the prepared memory regions and all their related resources.
  *
@@ -223,15 +241,17 @@ int mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv 
*priv, uint16_t desc_n,
 /**
  * Enable\Disable virtq..
  *
- * @param[in] virtq
- *   The vdpa driver private virtq structure.
+ * @param[in] priv
+ *   The vdpa driver private structure.
+ * @param[in] index
+ *   The virtq index.
  * @param[in] enable
  *   Set to enable, otherwise disable.
  *
  * @return
  *   0 on success, a negative value otherwise.
  */
-int mlx5_vdpa_virtq_enable(struct mlx5_vdpa_virtq *virtq, int enable);
+int mlx5_vdpa_virtq_enable(struct mlx5_vdpa_priv *priv, int index, int enable);
 
 /**
  * Unset steering and release all its related resources- stop traffic.
@@ -239,7 +259,18 @@ int mlx5_vdpa_event_qp_create(struct mlx5_vdpa_priv *priv, 
uint16_t desc_n,
  * @param[in] priv
  *   The vdpa driver private structure.
  */
-int mlx5_vdpa_steer_unset(struct mlx5_vdpa_priv *priv);
+void mlx5_vdpa_steer_unset(struct mlx5_vdpa_priv *priv);
+
+/**
+ * Update steering according to the received queues status.
+ *
+ * @param[in] priv
+ *   The vdpa driver private structure.
+ *
+ * @return
+ *   0 on success, a negative value otherwise.
+ */
+int mlx5_vdpa_steer_update(struct mlx5_vdpa_priv *priv);
 
 /**
  * Setup steering and all its related resources to enable RSS traffic from the
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_lm.c b/drivers/vdpa/mlx5/mlx5_vdpa_lm.c
index 26b7ce1..460e01d 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa_lm.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa_lm.c
@@ -19,7 +19,8 @@
 
for (i = 0; i < priv->nr_virtqs; ++i) {
attr.queue_index = i;
-   if (mlx5_devx_cmd_modify_virtq(priv->virtqs[i].virtq, &attr)) {
+   if (!priv->virtqs[i].virtq ||
+   mlx5_devx_cmd_modify_virtq(priv->virtqs[i].virtq, &attr)) {
DRV_LOG(ERR, "Failed to modify virtq %d logging.", i);
return -1;
}
@@ -68,7 +69,8 @@
attr.dirty_bitmap_mkey = mr->mkey->id;
for (i = 0; i < priv->nr_virtqs; ++i) {
attr.queue_index = i;
-   if (mlx5_devx_cmd_modify_virtq(priv->virtqs[i].virtq, &attr)) {
+   if (!priv->virtqs[i].virtq ||
+   mlx5_devx_cmd_modify_virtq(priv->v

Re: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for status update

2020-04-26 Thread Carrillo, Erik G



> -Original Message-
> From: Phil Yang 
> Sent: Sunday, April 26, 2020 2:36 AM
> To: tho...@monjalon.net
> Cc: Carrillo, Erik G ; rsanf...@akamai.com;
> dev@dpdk.org; david.march...@redhat.com; Honnappa Nagarahalli
> ; Gavin Hu ; nd
> ; nd 
> Subject: RE: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for status update
> 
> > -Original Message-
> > From: Thomas Monjalon 
> > Sent: Sunday, April 26, 2020 1:18 AM
> > To: Phil Yang 
> > Cc: erik.g.carri...@intel.com; rsanf...@akamai.com; dev@dpdk.org;
> > david.march...@redhat.com; Honnappa Nagarahalli
> > ; Gavin Hu ; nd
> > 
> > Subject: Re: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for status
> > update
> >
> > 24/04/2020 09:24, Phil Yang:
> > > Volatile has no ordering semantics. The rte_timer structure defines
> > > timer status as a volatile variable and uses the rte_r/wmb barrier
> > > to guarantee inter-thread visibility.
> > >
> > > This patch optimized the volatile operation with c11 atomic
> > > operations and one-way barrier to save the performance penalty.
> > > According to the timer_perf_autotest benchmarking results, this
> > > patch can uplift 10%~16% timer appending performance, 3%~20% timer
> > > resetting performance and
> > 45%
> > > timer callbacks scheduling performance on aarch64 and no loss in
> > > performance for x86.
> > >
> > > Suggested-by: Honnappa Nagarahalli 
> > > Signed-off-by: Phil Yang 
> > > Reviewed-by: Gavin Hu 
> > > Acked-by: Erik Gabriel Carrillo 
> > [...]
> > > --- a/lib/librte_timer/rte_timer.h
> > > +++ b/lib/librte_timer/rte_timer.h
> > > @@ -101,7 +101,7 @@ struct rte_timer
> > > - volatile union rte_timer_status status; /**< Status of timer. */
> > > + union rte_timer_status status; /**< Status of timer. */
> >
> > Unfortunately, I cannot merge this patch because it breaks the ABI:
> >
> >   [C]'function void rte_timer_init(rte_timer*)' at rte_timer.c:214:1
> > has some indirect sub-type changes:
> > parameter 1 of type 'rte_timer*' has sub-type changes:
> >   in pointed to type 'struct rte_timer' at rte_timer.h:100:1:
> > type size hasn't changed
> > 1 data member changes (2 filtered):
> >  type of 'volatile rte_timer_status rte_timer::status' changed:
> >entity changed from 'volatile rte_timer_status' to 'union
> > rte_timer_status' at rte_timer.h:67:1
> >type size hasn't changed
> >
> 
> I think we can revert it to the original definition of rte_timer and keep the
> union rte_timer_status volatile-qualified.
> Because with or without this 'volatile' qualify, it generates the same code on
> aarch64 and x86.
> So it seems acceptable to ignore it to make the ABI compatible?
> 
> Thank,
> Phil

I was wondering about this also.  Is the performance improvement on aarch64 
still the same in that case?

Thanks,
Erik


Re: [dpdk-dev] [PATCH 2/2] eal/windows: fix build by supporting trace

2020-04-26 Thread Jerin Jacob
On Sun, Apr 26, 2020 at 5:32 PM Dmitry Kozlyuk  wrote:
>
> On 2020-04-26 17:02 GMT+0530 Jerin Jacob wrote:
> > > +/**
> > > + * Get absolute path to the directory where permanent data can be stored.
> > > + *
> > > + * @return
> > > + *  Statically allocated string on success, NULL on failure.
> > > + */
> > > +const char *
> > > +eal_permanent_data_path(void);
> >
> > Do windows have PATH_MAX kind of macro? I think, it is better API
> > consumer allocates
> > the memory of size PATH_MAX and implementation fills it, instead of,
> > the static scheme.
>
> This API falls in line with rte_eal_get_runtime_dir() and other
> eal_filesystem.h functions, that use static scheme. Logically, its result
> never changes. It is race-free and is only called during initialization. What
> you propose can be done, but are there any benefits?

I thought who will free that memory? It looks like libc creating a static memory
for this item. so, your current eal_permanent_data_path() declarations
looks good to
me.

>
> While we're at it, don't these declarations belong to eal_filesystem.h? I
> left them in eal_private.h, because eal_filesystem.h is mostly Unix-specific.

Understood, Leaving that decision to EAL maintainers.

>
> --
> Dmitry Kozlyuk


Re: [dpdk-dev] [PATCH 2/2] eal/windows: fix build by supporting trace

2020-04-26 Thread Thomas Monjalon
26/04/2020 14:23, Jerin Jacob:
> On Sun, Apr 26, 2020 at 5:32 PM Dmitry Kozlyuk  
> wrote:
> > While we're at it, don't these declarations belong to eal_filesystem.h? I
> > left them in eal_private.h, because eal_filesystem.h is mostly 
> > Unix-specific.

Yes it looks to be a good fit for eal_filesystem.h.
Don't hesitate removing Linux specifics (starting with the file comment).

> Understood, Leaving that decision to EAL maintainers.

There is no EAL maintainer for this.




Re: [dpdk-dev] [PATCH 2/2] eal/windows: fix build by supporting trace

2020-04-26 Thread Dmitry Kozlyuk
On 2020-04-26 17:53 GMT+0530 Jerin Jacob wrote:
> On Sun, Apr 26, 2020 at 5:32 PM Dmitry Kozlyuk  
> wrote:
> >
> > On 2020-04-26 17:02 GMT+0530 Jerin Jacob wrote:  
> > > > +/**
> > > > + * Get absolute path to the directory where permanent data can be 
> > > > stored.
> > > > + *
> > > > + * @return
> > > > + *  Statically allocated string on success, NULL on failure.
> > > > + */
> > > > +const char *
> > > > +eal_permanent_data_path(void);  
> > >
> > > Do windows have PATH_MAX kind of macro? I think, it is better API
> > > consumer allocates
> > > the memory of size PATH_MAX and implementation fills it, instead of,
> > > the static scheme.  
> >
> > This API falls in line with rte_eal_get_runtime_dir() and other
> > eal_filesystem.h functions, that use static scheme. Logically, its result
> > never changes. It is race-free and is only called during initialization. 
> > What
> > you propose can be done, but are there any benefits?  
> 
> I thought who will free that memory? It looks like libc creating a static 
> memory
> for this item. so, your current eal_permanent_data_path() declarations
> looks good to
> me.

Actually, your concern is valid (see man quotes below). Windows implementation
has no such issue because of its own static buffer. I'd use static scheme
in EAL interface for convenience, but change Unix implementation to copy
string returned by libc into a static buffer.

man 3 getenv:
The string pointed to by the return value of getenv() may be
statically allocated, and can be modified by a subsequent call to
getenv(), putenv(3), setenv(3), or unsetenv(3).

man 3 getpwuid:
The return value may point to a static area, and may be overwritten
by subsequent calls to getpwent(3), getpwnam(), or getpwuid().

-- 
Dmitry Kozlyuk


[dpdk-dev] [PATCH] crypto/caam_jr: fix wrong check of fd

2020-04-26 Thread wangyunjian
From: Yunjian Wang 

Zero is a valid fd. It will fail to check the fd if the fd is zero.

Fixes: e7a45f3cc245 ("crypto/caam_jr: add UIO specific operations")
Cc: sta...@dpdk.org

Signed-off-by: Yunjian Wang 
---
 drivers/crypto/caam_jr/caam_jr_uio.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/caam_jr/caam_jr_uio.c 
b/drivers/crypto/caam_jr/caam_jr_uio.c
index b1bb44ca4..658de5460 100644
--- a/drivers/crypto/caam_jr/caam_jr_uio.c
+++ b/drivers/crypto/caam_jr/caam_jr_uio.c
@@ -145,7 +145,7 @@ file_read_first_line(const char root[], const char subdir[],
 "%s/%s/%s", root, subdir, filename);
 
fd = open(absolute_file_name, O_RDONLY);
-   SEC_ASSERT(fd > 0, fd, "Error opening file %s",
+   SEC_ASSERT(fd >= 0, fd, "Error opening file %s",
absolute_file_name);
 
/* read UIO device name from first line in file */
@@ -389,7 +389,7 @@ uio_job_ring *config_job_ring(void)
 
/* Open device file */
job_ring->uio_fd = open(uio_device_file_name, O_RDWR);
-   SEC_ASSERT(job_ring->uio_fd > 0, NULL,
+   SEC_ASSERT(job_ring->uio_fd >= 0, NULL,
"Failed to open UIO device file for job ring %d",
job_ring->jr_id);
 
@@ -488,7 +488,7 @@ sec_cleanup(void)
/* I need to close the fd after shutdown UIO commands need to be
 * sent using the fd
 */
-   if (job_ring->uio_fd != 0) {
+   if (job_ring->uio_fd >= 0) {
CAAM_JR_INFO(
"Closed device file for job ring %d , fd = %d",
job_ring->jr_id, job_ring->uio_fd);
-- 
2.19.1




Re: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for status update

2020-04-26 Thread Phil Yang
> -Original Message-
> From: Carrillo, Erik G 
> Sent: Sunday, April 26, 2020 8:19 PM
> To: Phil Yang ; tho...@monjalon.net
> Cc: rsanf...@akamai.com; dev@dpdk.org; david.march...@redhat.com;
> Honnappa Nagarahalli ; Gavin Hu
> ; nd ; nd 
> Subject: RE: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for status update
> 
> 
> 
> > -Original Message-
> > From: Phil Yang 
> > Sent: Sunday, April 26, 2020 2:36 AM
> > To: tho...@monjalon.net
> > Cc: Carrillo, Erik G ; rsanf...@akamai.com;
> > dev@dpdk.org; david.march...@redhat.com; Honnappa Nagarahalli
> > ; Gavin Hu ; nd
> > ; nd 
> > Subject: RE: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for status
> update
> >
> > > -Original Message-
> > > From: Thomas Monjalon 
> > > Sent: Sunday, April 26, 2020 1:18 AM
> > > To: Phil Yang 
> > > Cc: erik.g.carri...@intel.com; rsanf...@akamai.com; dev@dpdk.org;
> > > david.march...@redhat.com; Honnappa Nagarahalli
> > > ; Gavin Hu ; nd
> > > 
> > > Subject: Re: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for status
> > > update
> > >
> > > 24/04/2020 09:24, Phil Yang:
> > > > Volatile has no ordering semantics. The rte_timer structure defines
> > > > timer status as a volatile variable and uses the rte_r/wmb barrier
> > > > to guarantee inter-thread visibility.
> > > >
> > > > This patch optimized the volatile operation with c11 atomic
> > > > operations and one-way barrier to save the performance penalty.
> > > > According to the timer_perf_autotest benchmarking results, this
> > > > patch can uplift 10%~16% timer appending performance, 3%~20% timer
> > > > resetting performance and
> > > 45%
> > > > timer callbacks scheduling performance on aarch64 and no loss in
> > > > performance for x86.
> > > >
> > > > Suggested-by: Honnappa Nagarahalli 
> > > > Signed-off-by: Phil Yang 
> > > > Reviewed-by: Gavin Hu 
> > > > Acked-by: Erik Gabriel Carrillo 
> > > [...]
> > > > --- a/lib/librte_timer/rte_timer.h
> > > > +++ b/lib/librte_timer/rte_timer.h
> > > > @@ -101,7 +101,7 @@ struct rte_timer
> > > > -   volatile union rte_timer_status status; /**< Status of timer. */
> > > > +   union rte_timer_status status; /**< Status of timer. */
> > >
> > > Unfortunately, I cannot merge this patch because it breaks the ABI:
> > >
> > >   [C]'function void rte_timer_init(rte_timer*)' at rte_timer.c:214:1
> > > has some indirect sub-type changes:
> > > parameter 1 of type 'rte_timer*' has sub-type changes:
> > >   in pointed to type 'struct rte_timer' at rte_timer.h:100:1:
> > > type size hasn't changed
> > > 1 data member changes (2 filtered):
> > >  type of 'volatile rte_timer_status rte_timer::status' changed:
> > >entity changed from 'volatile rte_timer_status' to 'union
> > > rte_timer_status' at rte_timer.h:67:1
> > >type size hasn't changed
> > >
> >
> > I think we can revert it to the original definition of rte_timer and keep 
> > the
> > union rte_timer_status volatile-qualified.
> > Because with or without this 'volatile' qualify, it generates the same code
> on
> > aarch64 and x86.
> > So it seems acceptable to ignore it to make the ABI compatible?
> >
> > Thank,
> > Phil
> 
> I was wondering about this also.  Is the performance improvement on
> aarch64 still the same in that case?

Yes. it is. 
It got the same performance improvement on aarch64 and no performance loss on 
x86.

I will update it in v4.



[dpdk-dev] [PATCH v4] lib/timer: relax barrier for status update

2020-04-26 Thread Phil Yang
Volatile has no ordering semantics. The rte_timer structure defines
timer status as a volatile variable and uses the rte_r/wmb barrier
to guarantee inter-thread visibility.

This patch optimized the volatile operation with c11 atomic operations
and one-way barrier to save the performance penalty. According to the
timer_perf_autotest benchmarking results, this patch can uplift 10%~16%
timer appending performance, 3%~20% timer resetting performance and 45%
timer callbacks scheduling performance on aarch64 and no loss in
performance for x86.

Suggested-by: Honnappa Nagarahalli 
Signed-off-by: Phil Yang 
Reviewed-by: Gavin Hu 
Acked-by: Erik Gabriel Carrillo 
---
It is still using built-ins as the wrapper functions for C11 built-ins
are not defined yet

v4:
fix ABI break in rte_timer struct.

v3:
fix typo.

v2:
Changed the memory ordering comment in timer_set_config_state.

 lib/librte_timer/rte_timer.c | 87 ++--
 1 file changed, 60 insertions(+), 27 deletions(-)

diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
index 269e921..6d19ce4 100644
--- a/lib/librte_timer/rte_timer.c
+++ b/lib/librte_timer/rte_timer.c
@@ -10,7 +10,6 @@
 #include 
 #include 
 
-#include 
 #include 
 #include 
 #include 
@@ -218,7 +217,7 @@ rte_timer_init(struct rte_timer *tim)
 
status.state = RTE_TIMER_STOP;
status.owner = RTE_TIMER_NO_OWNER;
-   tim->status.u32 = status.u32;
+   __atomic_store_n(&tim->status.u32, status.u32, __ATOMIC_RELAXED);
 }
 
 /*
@@ -239,9 +238,9 @@ timer_set_config_state(struct rte_timer *tim,
 
/* wait that the timer is in correct status before update,
 * and mark it as being configured */
-   while (success == 0) {
-   prev_status.u32 = tim->status.u32;
+   prev_status.u32 = __atomic_load_n(&tim->status.u32, __ATOMIC_RELAXED);
 
+   while (success == 0) {
/* timer is running on another core
 * or ready to run on local core, exit
 */
@@ -258,9 +257,15 @@ timer_set_config_state(struct rte_timer *tim,
 * mark it atomically as being configured */
status.state = RTE_TIMER_CONFIG;
status.owner = (int16_t)lcore_id;
-   success = rte_atomic32_cmpset(&tim->status.u32,
- prev_status.u32,
- status.u32);
+   /* CONFIG states are acting as locked states. If the
+* timer is in CONFIG state, the state cannot be changed
+* by other threads. So, we should use ACQUIRE here.
+*/
+   success = __atomic_compare_exchange_n(&tim->status.u32,
+ &prev_status.u32,
+ status.u32, 0,
+ __ATOMIC_ACQUIRE,
+ __ATOMIC_RELAXED);
}
 
ret_prev_status->u32 = prev_status.u32;
@@ -279,20 +284,27 @@ timer_set_running_state(struct rte_timer *tim)
 
/* wait that the timer is in correct status before update,
 * and mark it as running */
-   while (success == 0) {
-   prev_status.u32 = tim->status.u32;
+   prev_status.u32 = __atomic_load_n(&tim->status.u32, __ATOMIC_RELAXED);
 
+   while (success == 0) {
/* timer is not pending anymore */
if (prev_status.state != RTE_TIMER_PENDING)
return -1;
 
-   /* here, we know that timer is stopped or pending,
-* mark it atomically as being configured */
+   /* we know that the timer will be pending at this point
+* mark it atomically as being running
+*/
status.state = RTE_TIMER_RUNNING;
status.owner = (int16_t)lcore_id;
-   success = rte_atomic32_cmpset(&tim->status.u32,
- prev_status.u32,
- status.u32);
+   /* RUNNING states are acting as locked states. If the
+* timer is in RUNNING state, the state cannot be changed
+* by other threads. So, we should use ACQUIRE here.
+*/
+   success = __atomic_compare_exchange_n(&tim->status.u32,
+ &prev_status.u32,
+ status.u32, 0,
+ __ATOMIC_ACQUIRE,
+ __ATOMIC_RELAXED);
}
 
return 0;
@@ -520,10 +532,12 @@ __rte_timer_reset(struct rte_timer *tim, uint64_t expire,
 
/* update state: as we are in CONFIG state, only us can modify
 * the state so we don't need to use cmpset() here */
-   rte_wmb();
status.state = RTE_TI

[dpdk-dev] reenchanting bugzilla with REST commands

2020-04-26 Thread Thomas Monjalon
Hi,

There are 144 bugs open in
https://bugs.dpdk.org/buglist.cgi?bug_status=__open__&product=DPDK

Some regular effort is required to follow and update the bugs.
If you don't have a bugzilla account on dpdk.org, please go to
https://bugs.dpdk.org/createaccount.cgi

In case it could motivate some of us using bugzilla more regularly,
please be aware a REST API is available in bugzilla.

The API documentation is quite good:
https://bugzilla.readthedocs.io/en/latest/api/
For user-authenticated calls, a secret key can be generated at
https://bugs.dpdk.org/userprefs.cgi?tab=apikey


-- example --
Below is a shell command getting (and formatting) not assigned bugs:

curl -s 
'https://bugs.dpdk.org/rest/bug?product=DPDK&status=UNCONFIRMED&status=CONFIRMED&status=IN_PROGRESS&include_fields=id,summary&assigned_to=dev@dpdk.org'
 | tr -d '"' | sed -e 's,^{bugs:\[{,,' -e 's,}]}$,,' -e 's/},{/\n/g' | sed -e 
's/,id:/\n=/' -e 's/^id:\([0-9]*\),\(.*\)/\2\n=\1/' | sed -e 
's,^=,\thttps://bugs.dpdk.org/show_bug.cgi?id=,' -e 's,^summary:,,'




[dpdk-dev] [PATCH v2 0/3] eal/windows: fix build by enabling trace compilation

2020-04-26 Thread Dmitry Kozlyuk
This patch fixes errors caused by using Unix-only functions in tracing
EAL.  It does not provide full tracing support for Windows because of
missing regex implementation.  It introduces new internal EAL wrappers
for directory management and provides basic, but correct implementation
for some EAL functions required for tracing compilation.

This patch implements rte_get_tsc_hz() instead of basing upon a pending
patchset, because fixing the build allows testing said patchset in the
first place, and also re-implemented code is only a few lines.

v2:
* Change title to reflect that only tracing compilation is enabled.
* Split commits adding new API and fixind build on Windows.
* Move new functions to eal_filesystem.h.
* Remove unneeded Fixes: line.

Dmitry Kozlyuk (3):
  eal/windows: replace sys/queue.h with a complete one from FreeBSD
  eal: add internal directory management API
  eal/windows: fix build by enabling trace compilation

 config/meson.build|   2 +
 .../common/eal_common_trace_utils.c   |  29 +-
 lib/librte_eal/common/eal_filesystem.h|  30 +-
 lib/librte_eal/common/meson.build |   5 +
 lib/librte_eal/freebsd/Makefile   |   4 +
 .../include/generic/rte_byteorder.h   |   4 +-
 lib/librte_eal/linux/Makefile |   4 +
 lib/librte_eal/meson.build|   4 +
 lib/librte_eal/unix/eal_unix_filesystem.c |  51 ++
 lib/librte_eal/unix/meson.build   |   6 +
 lib/librte_eal/windows/eal.c  |  92 +++
 lib/librte_eal/windows/eal_thread.c   |   9 +
 lib/librte_eal/windows/eal_windows.h  |   3 +
 lib/librte_eal/windows/include/rte_os.h   |  33 +-
 lib/librte_eal/windows/include/sys/queue.h| 663 --
 15 files changed, 847 insertions(+), 92 deletions(-)
 create mode 100644 lib/librte_eal/unix/eal_unix_filesystem.c
 create mode 100644 lib/librte_eal/unix/meson.build

-- 
2.25.1



[dpdk-dev] [PATCH v2 1/3] eal/windows: replace sys/queue.h with a complete one from FreeBSD

2020-04-26 Thread Dmitry Kozlyuk
Limited version imported previously lacks STAILQ macros used by tracing
and SLIST macros used by memory management.  Import a complete file from
FreeBSD, since its license exception is already approved by Technical
Board.

Signed-off-by: Dmitry Kozlyuk 
---
 lib/librte_eal/windows/include/sys/queue.h | 663 +++--
 1 file changed, 601 insertions(+), 62 deletions(-)

diff --git a/lib/librte_eal/windows/include/sys/queue.h 
b/lib/librte_eal/windows/include/sys/queue.h
index a65949a78..9756bee6f 100644
--- a/lib/librte_eal/windows/include/sys/queue.h
+++ b/lib/librte_eal/windows/include/sys/queue.h
@@ -8,7 +8,36 @@
 #define_SYS_QUEUE_H_
 
 /*
- * This file defines tail queues.
+ * This file defines four types of data structures: singly-linked lists,
+ * singly-linked tail queues, lists and tail queues.
+ *
+ * A singly-linked list is headed by a single forward pointer. The elements
+ * are singly linked for minimum space and pointer manipulation overhead at
+ * the expense of O(n) removal for arbitrary elements. New elements can be
+ * added to the list after an existing element or at the head of the list.
+ * Elements being removed from the head of the list should use the explicit
+ * macro for this purpose for optimum efficiency. A singly-linked list may
+ * only be traversed in the forward direction.  Singly-linked lists are ideal
+ * for applications with large datasets and few or no removals or for
+ * implementing a LIFO queue.
+ *
+ * A singly-linked tail queue is headed by a pair of pointers, one to the
+ * head of the list and the other to the tail of the list. The elements are
+ * singly linked for minimum space and pointer manipulation overhead at the
+ * expense of O(n) removal for arbitrary elements. New elements can be added
+ * to the list after an existing element, at the head of the list, or at the
+ * end of the list. Elements being removed from the head of the tail queue
+ * should use the explicit macro for this purpose for optimum efficiency.
+ * A singly-linked tail queue may only be traversed in the forward direction.
+ * Singly-linked tail queues are ideal for applications with large datasets
+ * and few or no removals or for implementing a FIFO queue.
+ *
+ * A list is headed by a single forward pointer (or an array of forward
+ * pointers for a hash table header). The elements are doubly linked
+ * so that an arbitrary element can be removed without a need to
+ * traverse the list. New elements can be added to the list before
+ * or after an existing element or at the head of the list. A list
+ * may be traversed in either direction.
  *
  * A tail queue is headed by a pair of pointers, one to the head of the
  * list and the other to the tail of the list. The elements are doubly
@@ -17,65 +46,93 @@
  * after an existing element, at the head of the list, or at the end of
  * the list. A tail queue may be traversed in either direction.
  *
+ * For details on the use of these macros, see the queue(3) manual page.
+ *
  * Below is a summary of implemented functions where:
  *  +  means the macro is available
  *  -  means the macro is not available
  *  s  means the macro is available but is slow (runs in O(n) time)
  *
- * TAILQ
- * _HEAD   +
- * _CLASS_HEAD +
- * _HEAD_INITIALIZER   +
- * _ENTRY  +
- * _CLASS_ENTRY+
- * _INIT   +
- * _EMPTY  +
- * _FIRST  +
- * _NEXT   +
- * _PREV   +
- * _LAST   +
- * _LAST_FAST  +
- * _FOREACH+
- * _FOREACH_FROM   +
- * _FOREACH_SAFE   +
- * _FOREACH_FROM_SAFE  +
- * _FOREACH_REVERSE+
- * _FOREACH_REVERSE_FROM   +
- * _FOREACH_REVERSE_SAFE   +
- * _FOREACH_REVERSE_FROM_SAFE  +
- * _INSERT_HEAD+
- * _INSERT_BEFORE  +
- * _INSERT_AFTER   +
- * _INSERT_TAIL+
- * _CONCAT +
- * _REMOVE_AFTER   -
- * _REMOVE_HEAD-
- * _REMOVE +
- * _SWAP   +
+ * SLIST   LISTSTAILQ  TAILQ
+ * _HEAD   +   +   +   +
+ * _CLASS_HEAD +   +   +   +
+ * _HEAD_INITIALIZER   +   +   +   +
+ * _ENTRY  +   +   +   +
+ * _CLASS_ENTRY+   +   +   +
+ * _INIT   +   +   +   +
+ * _EMPTY  +   +   +   +
+ * _FIRST  +   +   +   +
+ * _NEXT   +   +   +   +
+ * _PREV   -   +   -   +
+ * _LAST   -   -   +   +
+ * _LAST_FAST 

[dpdk-dev] [PATCH v2 2/3] eal: add internal directory management API

2020-04-26 Thread Dmitry Kozlyuk
Add functions for handling directories in a platform-independent way:

* eal_persistent_data_path()
* eal_dir_create()

Currently, only tracing requires this API for its common code.

Signed-off-by: Dmitry Kozlyuk 
---
 .../common/eal_common_trace_utils.c   | 26 +++---
 lib/librte_eal/common/eal_filesystem.h| 30 ++-
 lib/librte_eal/freebsd/Makefile   |  4 ++
 lib/librte_eal/linux/Makefile |  4 ++
 lib/librte_eal/meson.build|  4 ++
 lib/librte_eal/unix/eal_unix_filesystem.c | 51 +++
 lib/librte_eal/unix/meson.build   |  6 +++
 7 files changed, 105 insertions(+), 20 deletions(-)
 create mode 100644 lib/librte_eal/unix/eal_unix_filesystem.c
 create mode 100644 lib/librte_eal/unix/meson.build

diff --git a/lib/librte_eal/common/eal_common_trace_utils.c 
b/lib/librte_eal/common/eal_common_trace_utils.c
index fce8892c3..78df3b41e 100644
--- a/lib/librte_eal/common/eal_common_trace_utils.c
+++ b/lib/librte_eal/common/eal_common_trace_utils.c
@@ -3,8 +3,6 @@
  */
 
 #include 
-#include 
-#include 
 #include 
 
 #include 
@@ -321,22 +319,14 @@ trace_dir_default_path_get(char *dir_path)
 {
struct trace *trace = trace_obj_get();
uint32_t size = sizeof(trace->dir);
-   struct passwd *pwd;
-   char *home_dir;
-
-   /* First check for shell environment variable */
-   home_dir = getenv("HOME");
-   if (home_dir == NULL) {
-   /* Fallback to password file entry */
-   pwd = getpwuid(getuid());
-   if (pwd == NULL)
-   return -EINVAL;
-
-   home_dir = pwd->pw_dir;
-   }
+   const char *perm_dir;
+
+   perm_dir = eal_permanent_data_path();
+   if (perm_dir == NULL)
+   return -EINVAL;
 
/* Append dpdk-traces to directory */
-   if (snprintf(dir_path, size, "%s/dpdk-traces/", home_dir) < 0)
+   if (snprintf(dir_path, size, "%s/dpdk-traces/", perm_dir) < 0)
return -ENAMETOOLONG;
 
return 0;
@@ -371,7 +361,7 @@ trace_mkdir(void)
}
 
/* Create the path if it t exist, no "mkdir -p" available here */
-   rc = mkdir(trace->dir, 0700);
+   rc = eal_dir_create(trace->dir);
if (rc < 0 && errno != EEXIST) {
trace_err("mkdir %s failed [%s]", trace->dir, strerror(errno));
rte_errno = errno;
@@ -385,7 +375,7 @@ trace_mkdir(void)
if (rc < 0)
return rc;
 
-   rc = mkdir(trace->dir, 0700);
+   rc = eal_dir_create(trace->dir);
if (rc < 0) {
trace_err("mkdir %s failed [%s]", trace->dir, strerror(errno));
rte_errno = errno;
diff --git a/lib/librte_eal/common/eal_filesystem.h 
b/lib/librte_eal/common/eal_filesystem.h
index 5d21f07c2..77fe3be69 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -4,8 +4,8 @@
 
 /**
  * @file
- * Stores functions and path defines for files and directories
- * on the filesystem for Linux, that are used by the Linux EAL.
+ * Stores functions and path defines for files and directories used by DPDK.
+ * Parts of this file are Unix-specific for historical reasons.
  */
 
 #ifndef EAL_FILESYSTEM_H
@@ -28,6 +28,32 @@ eal_create_runtime_dir(void);
 int
 eal_clean_runtime_dir(void);
 
+/**
+ * Get absolute path to the directory where permanent data can be stored.
+ *
+ * @return
+ *  Statically allocated string on success, NULL on failure.
+ */
+const char *
+eal_permanent_data_path(void);
+
+/**
+ * Create a directory accessible to the current user only.
+ *
+ * This function does not create intermediate directories,
+ * thus only the last path component may be nonexistent.
+ *
+ * This function succeeds if path already exists and is a directory.
+ *
+ * Platform-independent code should use forward slash as path separator.
+ *
+ * @param path
+ *  Path to be created.
+ * @return
+ *  0 on success, (-1) on failure and rte_errno is set.
+ */
+int eal_dir_create(const char *path);
+
 /** Function to return hugefile prefix that's currently set up */
 const char *
 eal_get_hugefile_prefix(void);
diff --git a/lib/librte_eal/freebsd/Makefile b/lib/librte_eal/freebsd/Makefile
index a8400f20a..5170a85ab 100644
--- a/lib/librte_eal/freebsd/Makefile
+++ b/lib/librte_eal/freebsd/Makefile
@@ -7,6 +7,7 @@ LIB = librte_eal.a
 
 ARCH_DIR ?= $(RTE_ARCH)
 VPATH += $(RTE_SDK)/lib/librte_eal/$(ARCH_DIR)
+VPATH += $(RTE_SDK)/lib/librte_eal/unix
 VPATH += $(RTE_SDK)/lib/librte_eal/common
 
 CFLAGS += -I$(SRCDIR)/include
@@ -74,6 +75,9 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += rte_service.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += rte_random.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += rte_reciprocal.c
 
+# from unix dir
+SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += eal_unix_filesystem.c
+
 # from arch dir
 SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) += rte_cpuflags.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_FREEBSD) 

[dpdk-dev] [PATCH v2 3/3] eal/windows: fix build by enabling trace compilation

2020-04-26 Thread Dmitry Kozlyuk
Replace clock_gettime(CLOCK_REALTIME) with C11 timespec_get().
Implementation is provided for MinGW-w64 that misses this function.

Provide minimum viable implementations of malloc and timer functions
used by tracing. Regex stubs are already present in Windows EAL.

Fixes: 185b7dc1d467 ("trace: save bootup timestamp")
Fixes: 321dd5f8fa62 ("trace: add internal init and fini interface")
Reported-by: Pallavi Kadam 
Signed-off-by: Dmitry Kozlyuk 
---
 config/meson.build|  2 +
 .../common/eal_common_trace_utils.c   |  3 +-
 lib/librte_eal/common/meson.build |  5 +
 .../include/generic/rte_byteorder.h   |  4 +-
 lib/librte_eal/windows/eal.c  | 92 +++
 lib/librte_eal/windows/eal_thread.c   |  9 ++
 lib/librte_eal/windows/eal_windows.h  |  3 +
 lib/librte_eal/windows/include/rte_os.h   | 33 +--
 8 files changed, 141 insertions(+), 10 deletions(-)

diff --git a/config/meson.build b/config/meson.build
index e851b407b..91cba9313 100644
--- a/config/meson.build
+++ b/config/meson.build
@@ -267,6 +267,8 @@ if is_windows
# Minimum supported API is Windows 7.
add_project_arguments('-D_WIN32_WINNT=0x0601', language: 'c')
 
+   add_project_link_arguments(['-lshell32', '-lshlwapi'], language: 'c')
+
# Use MinGW-w64 stdio, because DPDK assumes ANSI-compliant formatting.
if cc.get_id() == 'gcc'
add_project_arguments('-D__USE_MINGW_ANSI_STDIO', language: 'c')
diff --git a/lib/librte_eal/common/eal_common_trace_utils.c 
b/lib/librte_eal/common/eal_common_trace_utils.c
index 78df3b41e..1fb5bc772 100644
--- a/lib/librte_eal/common/eal_common_trace_utils.c
+++ b/lib/librte_eal/common/eal_common_trace_utils.c
@@ -7,6 +7,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 #include "eal_filesystem.h"
@@ -300,7 +301,7 @@ trace_epoch_time_save(void)
uint64_t avg, start, end;
 
start = rte_get_tsc_cycles();
-   if (clock_gettime(CLOCK_REALTIME, &epoch) < 0) {
+   if (timespec_get(&epoch, TIME_UTC) < 0) {
trace_err("failed to get the epoch time");
return -1;
}
diff --git a/lib/librte_eal/common/meson.build 
b/lib/librte_eal/common/meson.build
index 155da29b4..f7393b8c3 100644
--- a/lib/librte_eal/common/meson.build
+++ b/lib/librte_eal/common/meson.build
@@ -14,6 +14,11 @@ if is_windows
'eal_common_log.c',
'eal_common_options.c',
'eal_common_thread.c',
+   'eal_common_trace.c',
+   'eal_common_trace_ctf.c',
+   'eal_common_trace_utils.c',
+   'eal_common_string_fns.c',
+   'eal_common_uuid.c',
'rte_option.c',
)
subdir_done()
diff --git a/lib/librte_eal/include/generic/rte_byteorder.h 
b/lib/librte_eal/include/generic/rte_byteorder.h
index 38e8cfd32..bc3ad8e23 100644
--- a/lib/librte_eal/include/generic/rte_byteorder.h
+++ b/lib/librte_eal/include/generic/rte_byteorder.h
@@ -15,9 +15,9 @@
  */
 
 #include 
-#ifdef RTE_EXEC_ENV_FREEBSD
+#if defined(RTE_EXEC_ENV_FREEBSD)
 #include 
-#else
+#elif defined(RTE_EXEC_ENV_LINUX)
 #include 
 #endif
 
diff --git a/lib/librte_eal/windows/eal.c b/lib/librte_eal/windows/eal.c
index 2cf7a04ef..fec7e5001 100644
--- a/lib/librte_eal/windows/eal.c
+++ b/lib/librte_eal/windows/eal.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "eal_windows.h"
 
@@ -208,6 +209,91 @@ eal_parse_args(int argc, char **argv)
return ret;
 }
 
+void *
+eal_malloc_no_trace(const char *type, size_t size, unsigned int align)
+{
+   /* Simulate failure, so that tracing falls back to malloc(). */
+   RTE_SET_USED(type);
+   RTE_SET_USED(size);
+   RTE_SET_USED(align);
+   return NULL;
+}
+
+void
+eal_free_no_trace(void *addr __rte_unused)
+{
+   /* Nothing to free. */
+}
+
+/* MinGW-w64 does not implement timespec_get(). */
+#ifdef RTE_TOOLCHAIN_GCC
+
+int
+timespec_get(struct timespec *ts, int base)
+{
+   static const uint64_t UNITS_PER_SEC = 1000; /* 1 unit = 100 ns */
+
+   FILETIME ft;
+   ULARGE_INTEGER ui;
+
+   GetSystemTimePreciseAsFileTime(&ft);
+   ui.LowPart = ft.dwLowDateTime;
+   ui.HighPart = ft.dwHighDateTime;
+   ts->tv_sec = ui.QuadPart / UNITS_PER_SEC;
+   ts->tv_nsec = ui.QuadPart - (ts->tv_sec * UNITS_PER_SEC);
+   return base;
+}
+
+#endif /* RTE_TOOLCHAIN_GCC */
+
+uint64_t
+rte_get_tsc_hz(void)
+{
+   static LARGE_INTEGER freq; /* static so auto-zeroed */
+
+   if (freq.QuadPart != 0)
+   return freq.QuadPart;
+
+   QueryPerformanceFrequency(&freq);
+   return freq.QuadPart;
+}
+
+const char *
+eal_permanent_data_path(void)
+{
+   static char buffer[PATH_MAX]; /* static so auto-zeroed */
+
+   HRESULT ret;
+
+   if (buffer[0] != '\0')
+   return buffer;
+
+   ret = SHGetFolderPathA(
+   

Re: [dpdk-dev] [PATCH v2 3/3] eal/windows: fix build by enabling trace compilation

2020-04-26 Thread Thomas Monjalon
I think this patch is doing too many things at once.
Why not just disabling tracing on Windows for now,
and apply proper patches for memory management, timer, endianness, etc
in 20.08?

Some cosmetic comments below,

26/04/2020 17:28, Dmitry Kozlyuk:
> Replace clock_gettime(CLOCK_REALTIME) with C11 timespec_get().
> Implementation is provided for MinGW-w64 that misses this function.
> 
> Provide minimum viable implementations of malloc and timer functions
> used by tracing. Regex stubs are already present in Windows EAL.
> 
> Fixes: 185b7dc1d467 ("trace: save bootup timestamp")
> Fixes: 321dd5f8fa62 ("trace: add internal init and fini interface")
> Reported-by: Pallavi Kadam 
> Signed-off-by: Dmitry Kozlyuk 

Blank line is required between "Fixes" block and " Reported/SoB".

[...]
> --- a/lib/librte_eal/include/generic/rte_byteorder.h
> +++ b/lib/librte_eal/include/generic/rte_byteorder.h
> -#ifdef RTE_EXEC_ENV_FREEBSD
> +#if defined(RTE_EXEC_ENV_FREEBSD)

No need to change above line.

>  #include 
> -#else
> +#elif defined(RTE_EXEC_ENV_LINUX)

Parenthesis are useless.

>  #include 
>  #endif





Re: [dpdk-dev] [PATCH 05/14] eal: intr: cleanup resources

2020-04-26 Thread Stephen Hemminger
On Sat, 25 Apr 2020 18:49:23 +0200
David Marchand  wrote:

> >
> > +void
> > +rte_eal_intr_cleanup(void)
> > +{
> > +   pthread_cancel(intr_thread);
> > +   pthread_join(intr_thread, NULL);
> > +   close(intr_pipe.readfd);
> > +   close(intr_pipe.writefd);  
> 
> What happens to the intr_sources callbacks?
> I am unsure we can expect the application to clean this before the eal 
> cleanup.
> 
> It would be worth a followup patch.

The callbacks should be not run after cleanup.
The goal was to cleanup outstanding system resources (as reported by valgrind)
on eal_cleanup


Re: [dpdk-dev] [PATCH v2 3/3] eal/windows: fix build by enabling trace compilation

2020-04-26 Thread Dmitry Kozlyuk
On 2020-04-26 17:38 GMT+0200 Thomas Monjalon wrote:
> I think this patch is doing too many things at once.
> Why not just disabling tracing on Windows for now,
> and apply proper patches for memory management, timer, endianness, etc
> in 20.08?

Sounds reasonable since tracing cannot be fully supported anyway and we're
past deadline. Replacing the series with a trivial patch for now.

> Some cosmetic comments below,

Thanks.

-- 
Dmitry Kozlyuk


[dpdk-dev] [PATCH v3] eal: disable tracing on Windows

2020-04-26 Thread Dmitry Kozlyuk
Fix build errors caused by using Unix-specific functions in common code.
Hide and disable command-line options related to tracing on Windows.

Fixes: 3d26a70ae338 ("trace: add trace configuration parameter")
Fixes: 3b155d24bdaf ("trace: hook subsystem to Linux")

Reported-by: Pallavi Kadam 
Suggested-by: Thomas Monjalon 
Signed-off-by: Dmitry Kozlyuk 
---
 lib/librte_eal/common/eal_common_options.c | 6 ++
 lib/librte_eal/common/eal_common_thread.c  | 5 +
 2 files changed, 11 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_options.c 
b/lib/librte_eal/common/eal_common_options.c
index 7e3a7df9c..418731ca4 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -34,7 +34,9 @@
 #include "eal_options.h"
 #include "eal_filesystem.h"
 #include "eal_private.h"
+#ifndef RTE_EXEC_ENV_WINDOWS
 #include "eal_trace.h"
+#endif
 
 #define BITS_PER_HEX 4
 #define LCORE_OPT_LST 1
@@ -1424,6 +1426,7 @@ eal_parse_common_option(int opt, const char *optarg,
break;
}
 
+#ifndef RTE_EXEC_ENV_WINDOWS
case OPT_TRACE_NUM: {
if (eal_trace_args_save(optarg) < 0) {
RTE_LOG(ERR, EAL, "invalid parameters for --"
@@ -1459,6 +1462,7 @@ eal_parse_common_option(int opt, const char *optarg,
}
break;
}
+#endif /* !RTE_EXEC_ENV_WINDOWS */
 
case OPT_LCORES_NUM:
if (eal_parse_lcores(optarg) < 0) {
@@ -1735,6 +1739,7 @@ eal_common_usage(void)
   "  --"OPT_LOG_LEVEL"=   Set global log level\n"
   "  --"OPT_LOG_LEVEL"=:\n"
   "  Set specific log level\n"
+#ifndef RTE_EXEC_ENV_WINDOWS
   "  --"OPT_TRACE"=\n"
   "  Enable trace based on regular expression 
trace name.\n"
   "  By default, the trace is disabled.\n"
@@ -1758,6 +1763,7 @@ eal_common_usage(void)
   "  reaches its maximum limit.\n"
   "  Default mode is 'overwrite' and 
parameter\n"
   "  must be specified once only.\n"
+#endif /* !RTE_EXEC_ENV_WINDOWS */
   "  -v  Display version information on startup\n"
   "  -h, --help  This help\n"
   "  --"OPT_IN_MEMORY"   Operate entirely in memory. This will\n"
diff --git a/lib/librte_eal/common/eal_common_thread.c 
b/lib/librte_eal/common/eal_common_thread.c
index 20dbcc7a0..f9f588c17 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -15,7 +15,9 @@
 #include 
 #include 
 #include 
+#ifndef RTE_EXEC_ENV_WINDOWS
 #include 
+#endif
 
 #include "eal_internal_cfg.h"
 #include "eal_private.h"
@@ -166,7 +168,10 @@ static void *rte_thread_init(void *arg)
pthread_barrier_destroy(¶ms->configured);
free(params);
}
+
+#ifndef RTE_EXEC_ENV_WINDOWS
__rte_trace_mem_per_thread_alloc();
+#endif
return start_routine(routine_arg);
 }
 
-- 
2.25.1



[dpdk-dev] [PATCH v4] mempool: return 0 if area is too small on populate

2020-04-26 Thread Thomas Monjalon
From: Olivier Matz 

Change rte_mempool_populate_iova() and rte_mempool_populate_iova() to
return 0 instead of -EINVAL when there is not enough room to store one
object, as it can be helpful for applications to distinguish this
specific case.

As this is an ABI change, use symbol versioning to preserve old
behavior for binary applications.

Signed-off-by: Olivier Matz 
Signed-off-by: Thomas Monjalon 
---

We are missing the symbols versioned for the previous ABI:
rte_mempool_populate_iova@@DPDK_20.0
rte_mempool_populate_virt@@DPDK_20.0

changes in v4:
- move populate_iova trace in v20_0_2 base function

changes in v3:
- rebase
- remove deprecation notice
- notify API change in release notes
- fix ABI version from 20.0.1 to 20.0.2 (should be 21 maybe)

---
 doc/guides/rel_notes/deprecation.rst   |  5 --
 doc/guides/rel_notes/release_20_05.rst |  4 ++
 examples/ntb/ntb_fwd.c |  2 +-
 lib/librte_mempool/meson.build |  2 +
 lib/librte_mempool/rte_mempool.c   | 80 ++
 lib/librte_mempool/rte_mempool.h   | 14 ++--
 lib/librte_mempool/rte_mempool_version.map |  7 ++
 7 files changed, 92 insertions(+), 22 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 1339f54f5f..20aa745b77 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -65,11 +65,6 @@ Deprecation Notices
   structure would be made internal (or removed if all dependencies are cleared)
   in future releases.
 
-* mempool: starting from v20.05, the API of rte_mempool_populate_iova()
-  and rte_mempool_populate_virt() will change to return 0 instead
-  of -EINVAL when there is not enough room to store one object. The ABI
-  will be preserved until 20.11.
-
 * ethdev: the legacy filter API, including
   ``rte_eth_dev_filter_supported()``, ``rte_eth_dev_filter_ctrl()`` as well
   as filter types MACVLAN, ETHERTYPE, FLEXIBLE, SYN, NTUPLE, TUNNEL, FDIR,
diff --git a/doc/guides/rel_notes/release_20_05.rst 
b/doc/guides/rel_notes/release_20_05.rst
index b124c3f287..ab20a7d021 100644
--- a/doc/guides/rel_notes/release_20_05.rst
+++ b/doc/guides/rel_notes/release_20_05.rst
@@ -241,6 +241,10 @@ API Changes
Also, make sure to start the actual text at the margin.
=
 
+* mempool: The API of ``rte_mempool_populate_iova()`` and
+  ``rte_mempool_populate_virt()`` changed to return 0 instead of -EINVAL
+  when there is not enough room to store one object.
+
 
 ABI Changes
 ---
diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
index d49189e175..eba8ebf9fa 100644
--- a/examples/ntb/ntb_fwd.c
+++ b/examples/ntb/ntb_fwd.c
@@ -1319,7 +1319,7 @@ ntb_mbuf_pool_create(uint16_t mbuf_seg_size, uint32_t 
nb_mbuf,
mz->len - ntb_info.ntb_hdr_size,
ntb_mempool_mz_free,
(void *)(uintptr_t)mz);
-   if (ret < 0) {
+   if (ret <= 0) {
rte_memzone_free(mz);
rte_mempool_free(mp);
return NULL;
diff --git a/lib/librte_mempool/meson.build b/lib/librte_mempool/meson.build
index a6e861cbfc..7dbe6b9bea 100644
--- a/lib/librte_mempool/meson.build
+++ b/lib/librte_mempool/meson.build
@@ -9,6 +9,8 @@ foreach flag: extra_flags
endif
 endforeach
 
+use_function_versioning = true
+
 sources = files('rte_mempool.c', 'rte_mempool_ops.c',
'rte_mempool_ops_default.c', 'mempool_trace_points.c')
 headers = files('rte_mempool.h', 'rte_mempool_trace.h',
diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index 0be8f9f59d..0a6119d6ad 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "rte_mempool.h"
 #include "rte_mempool_trace.h"
@@ -303,12 +304,17 @@ mempool_ops_alloc_once(struct rte_mempool *mp)
return 0;
 }
 
+__vsym int
+rte_mempool_populate_iova_v20_0_2(struct rte_mempool *mp, char *vaddr,
+   rte_iova_t iova, size_t len, rte_mempool_memchunk_free_cb_t *free_cb,
+   void *opaque);
+
 /* Add objects in the pool, using a physically contiguous memory
  * zone. Return the number of objects added, or a negative value
  * on error.
  */
-static int
-__rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
+__vsym int
+rte_mempool_populate_iova_v20_0_2(struct rte_mempool *mp, char *vaddr,
rte_iova_t iova, size_t len, rte_mempool_memchunk_free_cb_t *free_cb,
void *opaque)
 {
@@ -359,6 +365,8 @@ __rte_mempool_populate_iova(struct rte_mempool *mp, char 
*vaddr,
 
STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
mp->nb_mem_chunks++;
+
+   rte_mempool_trace_populate_iova(mp, vaddr, iova, len, free_cb, opaque

Re: [dpdk-dev] [PATCH] doc: refine ethernet and VLAN flow rule items

2020-04-26 Thread Stephen Hemminger
On Sun, 26 Apr 2020 09:18:54 +
Dekel Peled  wrote:

> Thanks, PSB.
> 
> > -Original Message-
> > From: Andrew Rybchenko 
> > Sent: Saturday, April 25, 2020 5:00 PM
> > To: Dekel Peled ; Ori Kam ;
> > john.mcnam...@intel.com; marko.kovace...@intel.com; Thomas Monjalon
> > ; ferruh.yi...@intel.com
> > Cc: dev@dpdk.org; Asaf Penso 
> > Subject: Re: [PATCH] doc: refine ethernet and VLAN flow rule items
> > 
> > On 4/23/20 9:30 PM, Dekel Peled wrote:  
> > > Specified pattern may be translated in different manner.
> > > For example the pattern "eth / ipv4" can be translated to match
> > > untagged packets only, since the pattern doesn't specify a vlan item.  
> > 
> > vlan -> VLAN  
> 
> I will change to uppercase.
> 
> >   
> > > It can also be translated to match both tagged and untagged packets,
> > > for the same reason.
> > > This patch updates the rte_flow documentation to clearly specify the
> > > required pattern to use.
> > > For example:
> > > To match tagged ipv4 packets, the pattern "eth type is 0x8100 / vlan /
> > > ipv4 / end" should be used.  
> > 
> > Isn't eth / vlan / ipv4 /end sufficient? What's the difference?
> > I guess later should allow any VLAN TPID, but it is greyish since it is HW
> > dependent.  
> 
> In the example I wanted to show explicit rule, to emphasize the importance of 
> detailed pattern structure.
> 
> >   
> > > To match untagged ipv4 packets, the pattern "eth type is 0x0800 /
> > > ipv4 / end" should be used.  
> > 
> > What about eth / ipv4 / end?
> > Does usage of ipv4 assume that EtherType is 0x0800?  
> 
> Same as above.
> 
> >   
> > > To match both tagged and untagged packets, the pattern "eth / end"
> > > should be used.  
> > 
> > The interesting question is what should be used if I want either tagged or
> > untagged IPv4 packets. I think it worse to mention to make the picture
> > complete.  
> 
> To match any IPV4 packet, either tagged or untagged, need to apply two rules.
> One for tagged packets and the other for untagged packets.
> I will add this example as well.
> 
> >   
> > > Signed-off-by: Dekel Peled 

All this reminds me that "code is the best documentation" and there
is no working code that does a complete software implementation of the
rte_flow engine. This means the actual interpretation of what the rte_flow
means is left to documentation and hardware vendors choices in implementation.



[dpdk-dev] [PATCH] bus/pci: optimize pci device probe

2020-04-26 Thread jerinj
From: Jerin Jacob 

If the PCI device is not attached to any driver then there is no
point in probing it. As an optimization, skip the PCI device probe if
the PCI device driver of type RTE_KDRV_NONE.

Signed-off-by: Jerin Jacob 
---
Notes:
--
- virtio drivers does special treatment based on RTE_KDRV_UNKNOWN, That is
the reason allowing RTE_KDRV_UNKNOWN in this patch.
- virio devices uses RTE_KDRV_UNKNOWN for some special meaning, IMO, if it would
  be better, if
a) Introduce the KDRV for virio
b) If the PCIe device of driver type NONE or UNKNOWN then not even add in pci
list
in the scan, It will improve the boot time by avoiding operation on
unwanted device like sorting the PCI devices, scanning it, probe it, managing
it etc.

- Initial problem reported at http://patches.dpdk.org/patch/64999/ as
boot time printf clutter on octeontx2 devices with a lot PCI devices which are
of type RTE_KDRV_NONE.

 drivers/bus/pci/pci_common.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index 3f5542076..2fa3d85ae 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -271,6 +271,8 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
return -EINVAL;

FOREACH_DRIVER_ON_PCIBUS(dr) {
+   if (dev->kdrv == RTE_KDRV_NONE)
+   continue;
rc = rte_pci_probe_one_driver(dr, dev);
if (rc < 0)
/* negative value is an error */
--
2.26.2



Re: [dpdk-dev] [PATCH] bus/pci: set boot-up log prints to absolute minimum

2020-04-26 Thread Jerin Jacob
On Sun, Apr 26, 2020 at 1:42 AM Thomas Monjalon  wrote:
>
> 06/02/2020 15:36, Jerin Jacob:
> > On Thu, Feb 6, 2020 at 7:44 PM Thomas Monjalon  wrote:
> > > 21/01/2020 09:00, jer...@marvell.com:
> > > > From: Jerin Jacob 
> > > >
> > > > Some machines may have a lot of PCI devices, logs from PCI probe
> > > > creates a lot of clutter on boot-up, typically one needs
> > > > to scroll the screen to find other issues in boot-up.
> > > >
> > > > This patch changes the loglevel of PCI probes to `debug`
> > > > to reduce the clutter on default boot-up logs
> > >
> > > I think the PCI probe informations are... informational.
> > > Maybe you are just not interested in info logs.
> > > If this is the case, I suggest to change the log level at runtime.
> >
> > I am wondering, what would be the right balance, Following is DPDK
> > startup output from octeontx2[1]
> > It creates a lot of clutter in the "default" boot up. Why not enable
> > below prints using log level at runtime?
> > I believe it comes as a debug category, i.e information required to
> > debug if something is not working,
> > dpdk bind script already lists what is bound to DPDK.
> >
> > Suggestion to remove clutter?
>
> I suggest using dynamic log level in the PCI driver.
> Unfortunately a lot of old DPDK code is still using the old log macros.
> Some cleanup work is needed here.

Sent an alternative fix to skip probing the devices with RTE_KDRV_NONE.
I think that is the source of the problem.
http://patches.dpdk.org/patch/69351/


>
>


Re: [dpdk-dev] [PATCH v3] eal: disable tracing on Windows

2020-04-26 Thread Jerin Jacob
On Sun, Apr 26, 2020 at 10:12 PM Dmitry Kozlyuk
 wrote:
>
> Fix build errors caused by using Unix-specific functions in common code.
> Hide and disable command-line options related to tracing on Windows.
>
> Fixes: 3d26a70ae338 ("trace: add trace configuration parameter")
> Fixes: 3b155d24bdaf ("trace: hook subsystem to Linux")
>
> Reported-by: Pallavi Kadam 
> Suggested-by: Thomas Monjalon 
> Signed-off-by: Dmitry Kozlyuk 

Acked-by: Jerin Jacob 


> ---
>  lib/librte_eal/common/eal_common_options.c | 6 ++
>  lib/librte_eal/common/eal_common_thread.c  | 5 +
>  2 files changed, 11 insertions(+)
>


Re: [dpdk-dev] [PATCH] bus/pci: set boot-up log prints to absolute minimum

2020-04-26 Thread Thomas Monjalon
26/04/2020 19:42, Jerin Jacob:
> On Sun, Apr 26, 2020 at 1:42 AM Thomas Monjalon  wrote:
> > 06/02/2020 15:36, Jerin Jacob:
> > > On Thu, Feb 6, 2020 at 7:44 PM Thomas Monjalon  
> > > wrote:
> > > > 21/01/2020 09:00, jer...@marvell.com:
> > > > > From: Jerin Jacob 
> > > > >
> > > > > Some machines may have a lot of PCI devices, logs from PCI probe
> > > > > creates a lot of clutter on boot-up, typically one needs
> > > > > to scroll the screen to find other issues in boot-up.
> > > > >
> > > > > This patch changes the loglevel of PCI probes to `debug`
> > > > > to reduce the clutter on default boot-up logs
> > > >
> > > > I think the PCI probe informations are... informational.
> > > > Maybe you are just not interested in info logs.
> > > > If this is the case, I suggest to change the log level at runtime.
> > >
> > > I am wondering, what would be the right balance, Following is DPDK
> > > startup output from octeontx2[1]
> > > It creates a lot of clutter in the "default" boot up. Why not enable
> > > below prints using log level at runtime?
> > > I believe it comes as a debug category, i.e information required to
> > > debug if something is not working,
> > > dpdk bind script already lists what is bound to DPDK.
> > >
> > > Suggestion to remove clutter?
> >
> > I suggest using dynamic log level in the PCI driver.
> > Unfortunately a lot of old DPDK code is still using the old log macros.
> > Some cleanup work is needed here.
> 
> Sent an alternative fix to skip probing the devices with RTE_KDRV_NONE.

No, a PCI PMD can work without a known kernel driver.
This is the case of mlx4/mlx5.

> I think that is the source of the problem.
> http://patches.dpdk.org/patch/69351/

The source of the problem is just changing log levels dynamically
is not possible currently with PCI driver logs.




Re: [dpdk-dev] [PATCH] bus/pci: optimize pci device probe

2020-04-26 Thread Thomas Monjalon
26/04/2020 19:38, jer...@marvell.com:
> From: Jerin Jacob 
> 
> If the PCI device is not attached to any driver then there is no
> point in probing it. As an optimization, skip the PCI device probe if
> the PCI device driver of type RTE_KDRV_NONE.
> 
> Signed-off-by: Jerin Jacob 
> ---
> Notes:
> --
> - virtio drivers does special treatment based on RTE_KDRV_UNKNOWN, That is
> the reason allowing RTE_KDRV_UNKNOWN in this patch.
> - virio devices uses RTE_KDRV_UNKNOWN for some special meaning, IMO, if it 
> would
>   be better, if
> a) Introduce the KDRV for virio
> b) If the PCIe device of driver type NONE or UNKNOWN then not even add in pci
> list
> in the scan, It will improve the boot time by avoiding operation on
> unwanted device like sorting the PCI devices, scanning it, probe it, managing
> it etc.

mlx4/mlx4 uses RTE_KDRV_UNKNOWN.

> - Initial problem reported at http://patches.dpdk.org/patch/64999/ as
> boot time printf clutter on octeontx2 devices with a lot PCI devices which are
> of type RTE_KDRV_NONE.

Add a logtype for PCI driver and adjust log level accordingly
to your preferences.

> @@ -271,6 +271,8 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
>   FOREACH_DRIVER_ON_PCIBUS(dr) {
> + if (dev->kdrv == RTE_KDRV_NONE)
> + continue;
>   rc = rte_pci_probe_one_driver(dr, dev);

Nack




Re: [dpdk-dev] [PATCH v3] eal: disable tracing on Windows

2020-04-26 Thread Thomas Monjalon
26/04/2020 19:46, Jerin Jacob:
> On Sun, Apr 26, 2020 at 10:12 PM Dmitry Kozlyuk
>  wrote:
> >
> > Fix build errors caused by using Unix-specific functions in common code.
> > Hide and disable command-line options related to tracing on Windows.
> >
> > Fixes: 3d26a70ae338 ("trace: add trace configuration parameter")
> > Fixes: 3b155d24bdaf ("trace: hook subsystem to Linux")
> >
> > Reported-by: Pallavi Kadam 
> > Suggested-by: Thomas Monjalon 
> > Signed-off-by: Dmitry Kozlyuk 
> 
> Acked-by: Jerin Jacob 

Applied, thanks




Re: [dpdk-dev] [PATCH] bus/pci: optimize pci device probe

2020-04-26 Thread Jerin Jacob
On Sun, Apr 26, 2020 at 11:38 PM Thomas Monjalon  wrote:
>
> 26/04/2020 19:38, jer...@marvell.com:
> > From: Jerin Jacob 
> >
> > If the PCI device is not attached to any driver then there is no
> > point in probing it. As an optimization, skip the PCI device probe if
> > the PCI device driver of type RTE_KDRV_NONE.
> >
> > Signed-off-by: Jerin Jacob 
> > ---
> > Notes:
> > --
> > - virtio drivers does special treatment based on RTE_KDRV_UNKNOWN, That is
> > the reason allowing RTE_KDRV_UNKNOWN in this patch.
> > - virio devices uses RTE_KDRV_UNKNOWN for some special meaning, IMO, if it 
> > would
> >   be better, if
> > a) Introduce the KDRV for virio
> > b) If the PCIe device of driver type NONE or UNKNOWN then not even add in 
> > pci
> > list
> > in the scan, It will improve the boot time by avoiding operation on
> > unwanted device like sorting the PCI devices, scanning it, probe it, 
> > managing
> > it etc.
>
> mlx4/mlx4 uses RTE_KDRV_UNKNOWN.

OK.

>
> > - Initial problem reported at http://patches.dpdk.org/patch/64999/ as
> > boot time printf clutter on octeontx2 devices with a lot PCI devices which 
> > are
> > of type RTE_KDRV_NONE.
>
> Add a logtype for PCI driver and adjust log level accordingly
> to your preferences.
>
> > @@ -271,6 +271,8 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
> >   FOREACH_DRIVER_ON_PCIBUS(dr) {
> > + if (dev->kdrv == RTE_KDRV_NONE)
> > + continue;
> >   rc = rte_pci_probe_one_driver(dr, dev);
>
> Nack

I understand mlx4/mlx5 is using RTE_KDRV_UNKNOWN, Here we are skipping
the RTE_KDRV_NONE,
What is the use case for probing the devices with RTE_KDRV_NONE?


>
>


Re: [dpdk-dev] [PATCH] bus/pci: set boot-up log prints to absolute minimum

2020-04-26 Thread Jerin Jacob
On Sun, Apr 26, 2020 at 11:37 PM Thomas Monjalon  wrote:
>
> 26/04/2020 19:42, Jerin Jacob:
> > On Sun, Apr 26, 2020 at 1:42 AM Thomas Monjalon  wrote:
> > > 06/02/2020 15:36, Jerin Jacob:
> > > > On Thu, Feb 6, 2020 at 7:44 PM Thomas Monjalon  
> > > > wrote:
> > > > > 21/01/2020 09:00, jer...@marvell.com:
> > > > > > From: Jerin Jacob 
> > > > > >
> > > > > > Some machines may have a lot of PCI devices, logs from PCI probe
> > > > > > creates a lot of clutter on boot-up, typically one needs
> > > > > > to scroll the screen to find other issues in boot-up.
> > > > > >
> > > > > > This patch changes the loglevel of PCI probes to `debug`
> > > > > > to reduce the clutter on default boot-up logs
> > > > >
> > > > > I think the PCI probe informations are... informational.
> > > > > Maybe you are just not interested in info logs.
> > > > > If this is the case, I suggest to change the log level at runtime.
> > > >
> > > > I am wondering, what would be the right balance, Following is DPDK
> > > > startup output from octeontx2[1]
> > > > It creates a lot of clutter in the "default" boot up. Why not enable
> > > > below prints using log level at runtime?
> > > > I believe it comes as a debug category, i.e information required to
> > > > debug if something is not working,
> > > > dpdk bind script already lists what is bound to DPDK.
> > > >
> > > > Suggestion to remove clutter?
> > >
> > > I suggest using dynamic log level in the PCI driver.
> > > Unfortunately a lot of old DPDK code is still using the old log macros.
> > > Some cleanup work is needed here.
> >
> > Sent an alternative fix to skip probing the devices with RTE_KDRV_NONE.
>
> No, a PCI PMD can work without a known kernel driver.
> This is the case of mlx4/mlx5.

Yes. it can work with UNKNOWN, But It not with NONE.

>
> > I think that is the source of the problem.
> > http://patches.dpdk.org/patch/69351/
>
> The source of the problem is just changing log levels dynamically
> is not possible currently with PCI driver logs.

Assuming if we add, dynamic stuff what would the default log level for
RTE_LOG(INFO, EAL, "PCI device "PCI_PRI_FMT" on NUMA socket %i\n",

If it is INFO, Still the problem persist in the default bootup. Right?




>
>


Re: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for status update

2020-04-26 Thread Carrillo, Erik G



> -Original Message-
> From: Phil Yang 
> Sent: Sunday, April 26, 2020 9:20 AM
> To: Carrillo, Erik G ; tho...@monjalon.net
> Cc: rsanf...@akamai.com; dev@dpdk.org; david.march...@redhat.com;
> Honnappa Nagarahalli ; Gavin Hu
> ; nd ; nd ; nd
> 
> Subject: RE: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for status update
> 
> > -Original Message-
> > From: Carrillo, Erik G 
> > Sent: Sunday, April 26, 2020 8:19 PM
> > To: Phil Yang ; tho...@monjalon.net
> > Cc: rsanf...@akamai.com; dev@dpdk.org; david.march...@redhat.com;
> > Honnappa Nagarahalli ; Gavin Hu
> > ; nd ; nd 
> > Subject: RE: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for status
> > update
> >
> >
> >
> > > -Original Message-
> > > From: Phil Yang 
> > > Sent: Sunday, April 26, 2020 2:36 AM
> > > To: tho...@monjalon.net
> > > Cc: Carrillo, Erik G ;
> > > rsanf...@akamai.com; dev@dpdk.org; david.march...@redhat.com;
> > > Honnappa Nagarahalli ; Gavin Hu
> > > ; nd ; nd 
> > > Subject: RE: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for
> > > status
> > update
> > >
> > > > -Original Message-
> > > > From: Thomas Monjalon 
> > > > Sent: Sunday, April 26, 2020 1:18 AM
> > > > To: Phil Yang 
> > > > Cc: erik.g.carri...@intel.com; rsanf...@akamai.com; dev@dpdk.org;
> > > > david.march...@redhat.com; Honnappa Nagarahalli
> > > > ; Gavin Hu ;
> nd
> > > > 
> > > > Subject: Re: [dpdk-dev] [PATCH v3] lib/timer: relax barrier for
> > > > status update
> > > >
> > > > 24/04/2020 09:24, Phil Yang:
> > > > > Volatile has no ordering semantics. The rte_timer structure
> > > > > defines timer status as a volatile variable and uses the
> > > > > rte_r/wmb barrier to guarantee inter-thread visibility.
> > > > >
> > > > > This patch optimized the volatile operation with c11 atomic
> > > > > operations and one-way barrier to save the performance penalty.
> > > > > According to the timer_perf_autotest benchmarking results, this
> > > > > patch can uplift 10%~16% timer appending performance, 3%~20%
> > > > > timer resetting performance and
> > > > 45%
> > > > > timer callbacks scheduling performance on aarch64 and no loss in
> > > > > performance for x86.
> > > > >
> > > > > Suggested-by: Honnappa Nagarahalli
> > > > > 
> > > > > Signed-off-by: Phil Yang 
> > > > > Reviewed-by: Gavin Hu 
> > > > > Acked-by: Erik Gabriel Carrillo 
> > > > [...]
> > > > > --- a/lib/librte_timer/rte_timer.h
> > > > > +++ b/lib/librte_timer/rte_timer.h
> > > > > @@ -101,7 +101,7 @@ struct rte_timer
> > > > > - volatile union rte_timer_status status; /**< Status of timer.
> */
> > > > > + union rte_timer_status status; /**< Status of timer. */
> > > >
> > > > Unfortunately, I cannot merge this patch because it breaks the ABI:
> > > >
> > > >   [C]'function void rte_timer_init(rte_timer*)' at
> > > > rte_timer.c:214:1 has some indirect sub-type changes:
> > > > parameter 1 of type 'rte_timer*' has sub-type changes:
> > > >   in pointed to type 'struct rte_timer' at rte_timer.h:100:1:
> > > > type size hasn't changed
> > > > 1 data member changes (2 filtered):
> > > >  type of 'volatile rte_timer_status rte_timer::status' changed:
> > > >entity changed from 'volatile rte_timer_status' to
> > > > 'union rte_timer_status' at rte_timer.h:67:1
> > > >type size hasn't changed
> > > >
> > >
> > > I think we can revert it to the original definition of rte_timer and
> > > keep the union rte_timer_status volatile-qualified.
> > > Because with or without this 'volatile' qualify, it generates the
> > > same code
> > on
> > > aarch64 and x86.
> > > So it seems acceptable to ignore it to make the ABI compatible?
> > >
> > > Thank,
> > > Phil
> >
> > I was wondering about this also.  Is the performance improvement on
> > aarch64 still the same in that case?
> 
> Yes. it is.
> It got the same performance improvement on aarch64 and no performance
> loss on x86.
> 
> I will update it in v4.

Great - thanks, Phil.


Re: [dpdk-dev] [PATCH] bus/pci: set boot-up log prints to absolute minimum

2020-04-26 Thread Thomas Monjalon
26/04/2020 20:48, Jerin Jacob:
> On Sun, Apr 26, 2020 at 11:37 PM Thomas Monjalon  wrote:
> > 26/04/2020 19:42, Jerin Jacob:
> > > On Sun, Apr 26, 2020 at 1:42 AM Thomas Monjalon  
> > > wrote:
> > > > 06/02/2020 15:36, Jerin Jacob:
> > > > > On Thu, Feb 6, 2020 at 7:44 PM Thomas Monjalon  
> > > > > wrote:
> > > > > > 21/01/2020 09:00, jer...@marvell.com:
> > > > > > > From: Jerin Jacob 
> > > > > > >
> > > > > > > Some machines may have a lot of PCI devices, logs from PCI probe
> > > > > > > creates a lot of clutter on boot-up, typically one needs
> > > > > > > to scroll the screen to find other issues in boot-up.
> > > > > > >
> > > > > > > This patch changes the loglevel of PCI probes to `debug`
> > > > > > > to reduce the clutter on default boot-up logs
> > > > > >
> > > > > > I think the PCI probe informations are... informational.
> > > > > > Maybe you are just not interested in info logs.
> > > > > > If this is the case, I suggest to change the log level at runtime.
> > > > >
> > > > > I am wondering, what would be the right balance, Following is DPDK
> > > > > startup output from octeontx2[1]
> > > > > It creates a lot of clutter in the "default" boot up. Why not enable
> > > > > below prints using log level at runtime?
> > > > > I believe it comes as a debug category, i.e information required to
> > > > > debug if something is not working,
> > > > > dpdk bind script already lists what is bound to DPDK.
> > > > >
> > > > > Suggestion to remove clutter?
> > > >
> > > > I suggest using dynamic log level in the PCI driver.
> > > > Unfortunately a lot of old DPDK code is still using the old log macros.
> > > > Some cleanup work is needed here.
> > >
> > > Sent an alternative fix to skip probing the devices with RTE_KDRV_NONE.
> >
> > No, a PCI PMD can work without a known kernel driver.
> > This is the case of mlx4/mlx5.
> 
> Yes. it can work with UNKNOWN, But It not with NONE.
> 
> > > I think that is the source of the problem.
> > > http://patches.dpdk.org/patch/69351/
> >
> > The source of the problem is just changing log levels dynamically
> > is not possible currently with PCI driver logs.
> 
> Assuming if we add, dynamic stuff what would the default log level for
> RTE_LOG(INFO, EAL, "PCI device "PCI_PRI_FMT" on NUMA socket %i\n",
> 
> If it is INFO, Still the problem persist in the default bootup. Right?

Yes

We need to know in INFO level which devices are successfully probed.
You want to decrease to DEBUG the level of logs announcing a probe.
I think it would be OK if adding an INFO log after probe success.





Re: [dpdk-dev] [PATCH] bus/pci: optimize pci device probe

2020-04-26 Thread Thomas Monjalon
26/04/2020 20:41, Jerin Jacob:
> On Sun, Apr 26, 2020 at 11:38 PM Thomas Monjalon  wrote:
> >
> > 26/04/2020 19:38, jer...@marvell.com:
> > > From: Jerin Jacob 
> > >
> > > If the PCI device is not attached to any driver then there is no
> > > point in probing it. As an optimization, skip the PCI device probe if
> > > the PCI device driver of type RTE_KDRV_NONE.
> > >
> > > Signed-off-by: Jerin Jacob 
> > > ---
> > > Notes:
> > > --
> > > - virtio drivers does special treatment based on RTE_KDRV_UNKNOWN, That is
> > > the reason allowing RTE_KDRV_UNKNOWN in this patch.
> > > - virio devices uses RTE_KDRV_UNKNOWN for some special meaning, IMO, if 
> > > it would
> > >   be better, if
> > > a) Introduce the KDRV for virio
> > > b) If the PCIe device of driver type NONE or UNKNOWN then not even add in 
> > > pci
> > > list
> > > in the scan, It will improve the boot time by avoiding operation on
> > > unwanted device like sorting the PCI devices, scanning it, probe it, 
> > > managing
> > > it etc.
> >
> > mlx4/mlx4 uses RTE_KDRV_UNKNOWN.
> 
> OK.
> 
> > > - Initial problem reported at http://patches.dpdk.org/patch/64999/ as
> > > boot time printf clutter on octeontx2 devices with a lot PCI devices 
> > > which are
> > > of type RTE_KDRV_NONE.
> >
> > Add a logtype for PCI driver and adjust log level accordingly
> > to your preferences.
> >
> > > @@ -271,6 +271,8 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
> > >   FOREACH_DRIVER_ON_PCIBUS(dr) {
> > > + if (dev->kdrv == RTE_KDRV_NONE)
> > > + continue;
> > >   rc = rte_pci_probe_one_driver(dr, dev);
> >
> > Nack
> 
> I understand mlx4/mlx5 is using RTE_KDRV_UNKNOWN, Here we are skipping
> the RTE_KDRV_NONE,
> What is the use case for probing the devices with RTE_KDRV_NONE?

Maybe you are right. I don't remember the use case.
I think I remember these virtio and vmxnet3 PMD were not using UIO:
http://git.dpdk.org/old/virtio-net-pmd/tree/virtio_user.c
http://git.dpdk.org/old/vmxnet3-usermap/tree/pmd/vmxnet3.c

We need to know which case is using following code:

case RTE_KDRV_NONE:
#if defined(RTE_ARCH_X86)
ret = pci_ioport_map(dev, bar, p);
#endif
break;

David, please could you refresh our memory?




Re: [dpdk-dev] [PATCH v4] lib/timer: relax barrier for status update

2020-04-26 Thread Thomas Monjalon
26/04/2020 16:45, Phil Yang:
> Volatile has no ordering semantics. The rte_timer structure defines
> timer status as a volatile variable and uses the rte_r/wmb barrier
> to guarantee inter-thread visibility.
> 
> This patch optimized the volatile operation with c11 atomic operations
> and one-way barrier to save the performance penalty. According to the
> timer_perf_autotest benchmarking results, this patch can uplift 10%~16%
> timer appending performance, 3%~20% timer resetting performance and 45%
> timer callbacks scheduling performance on aarch64 and no loss in
> performance for x86.
> 
> Suggested-by: Honnappa Nagarahalli 
> Signed-off-by: Phil Yang 
> Reviewed-by: Gavin Hu 
> Acked-by: Erik Gabriel Carrillo 

Applied, thanks





Re: [dpdk-dev] [PATCH v2] doc: fix incorrect example for specifying log level

2020-04-26 Thread Thomas Monjalon
18/03/2020 01:58, Xiaolong Ye:
> Now we need to add prefix like lib. to enable the log, also changing val 8 to
> "debug"" which would be more descriptive.
> 
> Fixes: ffb9fd1b0808 ("log: update legacy modules dynamic logs regex")
> Cc: sta...@dpdk.org
> 
> Reported-by: Haiyue Wang 
> Signed-off-by: Xiaolong Ye 

Applied, thanks





Re: [dpdk-dev] [PATCH v2] eal: add madvise to avoid dump memory

2020-04-26 Thread Feng Li
Bruce Richardson  于2020年4月24日周五 下午5:14写道:
>
> On Fri, Apr 24, 2020 at 10:12:10AM +0100, Burakov, Anatoly wrote:
> > On 23-Apr-20 9:04 PM, David Marchand wrote:
> > > On Thu, Apr 23, 2020 at 6:34 PM Burakov, Anatoly
> > >  wrote:
> > > > > diff --git a/lib/librte_eal/common/eal_common_memory.c 
> > > > > b/lib/librte_eal/common/eal_common_memory.c
> > > > > index cc7d54e0c..2d9564b28 100644
> > > > > --- a/lib/librte_eal/common/eal_common_memory.c
> > > > > +++ b/lib/librte_eal/common/eal_common_memory.c
> > > > > @@ -177,6 +177,20 @@ eal_get_virtual_area(void *requested_addr, 
> > > > > size_t *size,
> > > > >after_len = RTE_PTR_DIFF(map_end, aligned_end);
> > > > >if (after_len > 0)
> > > > >munmap(aligned_end, after_len);
> > > > > +
> > > > > + /*
> > > > > +  * Exclude this pages from a core dump.
> > > > > +  */
> > > > > + if (madvise(aligned_addr, *size, MADV_DONTDUMP) != 0)
> > > > > + RTE_LOG(WARNING, EAL, "Madvise with 
> > > > > MADV_DONTDUMP failed: %s\n",
> > > > > + strerror(errno));> +   } else {
> > > > > + /*
> > > > > +  * Exclude this pages from a core dump.
> > > > > +  */
> > > > > + if (madvise(mapped_addr, map_sz, MADV_DONTDUMP) != 0)
> > > > > + RTE_LOG(WARNING, EAL, "Madvise with 
> > > > > MADV_DONTDUMP failed: %s\n",
> > > > > + strerror(errno));
> > > > >}
> > > > >
> > > > >return aligned_addr;
> > > > >
> > > >
> > > > For the contents of this patch,
> > >
> > > MADV_DONTDUMP does not seem POSIX, but as I said [1], there seems to
> > > be a MADV_NOCORE option on FreeBSD.
> > > 1: 
> > > http://inbox.dpdk.org/dev/cajfav8y9ytt-7njuz+md6u8+3xuqyrgp28kd7jy2923epac...@mail.gmail.com/
> > >
> > >
> >
> > Oh, right, so this would probably not compile on FreeBSD. Perhaps this
> > function would have to be OS-specific after all (or call into an OS-specific
> > madvise() after reserving the memory area).
> >
>
> Is it just a differently named flag? If so, I think a single #ifdef macro
> won't kill us in the common code.
>
Just the flag name is different.
I should use RTE_EXEC_ENV_FREEBSD and RTE_EXEC_ENV_LINUX, right?

Another question, in `eal_memalloc.c:alloc_seg`, I should undo the
DONTMAP of the memory region.
Right? @Anatoly

Just few minutes, I have prepared a patch for the OS-specific code:
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -443,4 +443,20 @@ rte_option_usage(void);
 uint64_t
 eal_get_baseaddr(void);

+/**
+ * @internal
+ * Exclude this pages from a core dump.
+ *
+ * @param addr
+ *  The memory region starts.
+ *
+ * @param len
+ *  The memory region length..
+ *
+ * @return
+ * returns 0 or -errno
+ */
+int
+eal_madvise_dontdump(void* addr, size_t len);
+
 #endif /* _EAL_PRIVATE_H_ */
diff --git a/lib/librte_eal/freebsd/eal_memory.c
b/lib/librte_eal/freebsd/eal_memory.c
index a97d8f0f0..585042dde 100644
--- a/lib/librte_eal/freebsd/eal_memory.c
+++ b/lib/librte_eal/freebsd/eal_memory.c
@@ -534,3 +534,9 @@ rte_eal_memseg_init(void)
  memseg_primary_init() :
  memseg_secondary_init();
 }
+
+int
+eal_madvise_dontdump(void* addr, size_t len)
+{
+ return madvise(addr, len, MADV_NOCORE);
+}
diff --git a/lib/librte_eal/linux/eal_memory.c
b/lib/librte_eal/linux/eal_memory.c
index 7a9c97ff8..cfdbfccfe 100644
--- a/lib/librte_eal/linux/eal_memory.c
+++ b/lib/librte_eal/linux/eal_memory.c
@@ -2479,3 +2479,9 @@ rte_eal_memseg_init(void)
 #endif
  memseg_secondary_init();
 }
+
+int
+eal_madvise_dontdump(void* addr, size_t len)
+{
+ return madvise(addr, len, MADV_DONTDUMP);
+}


Re: [dpdk-dev] [dpdk-stable] [PATCH] examples/l2fwd-keepalive: fix packet drops limited mbufs

2020-04-26 Thread Thomas Monjalon
11/03/2020 13:04, Louise Kilheeney:
> MBUF pool of size 8192 was causing packet loss when using four ports. To
> fix this issue this patch specifies the number of MBUF's per port
> instead of having one set MBUF pool size, this way it will adapt to any
> number of ports.
> 
> Fixes: e64833f2273a ("examples/l2fwd-keepalive: add sample application")
> 
> Cc: sta...@dpdk.org

No blank line between Fixes and Cc lines please.

> 
> Signed-off-by: Louise Kilheeney 
> ---
>  examples/l2fwd-keepalive/main.c | 17 ++---
>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/examples/l2fwd-keepalive/main.c b/examples/l2fwd-keepalive/main.c
> index b36834974..0f0010d51 100644
> --- a/examples/l2fwd-keepalive/main.c
> +++ b/examples/l2fwd-keepalive/main.c
>   nb_ports = rte_eth_dev_count_avail();
>   if (nb_ports == 0)
>   rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
>  
> + /* create the mbuf pool */
> + unsigned int total_nb_mbufs = NB_MBUF_PER_PORT * nb_ports;

Please avoid declaring a variable in the middle of a function.
This code style is too much modern for DPDK ;-)
It could break on a random compiler.





Re: [dpdk-dev] [dpdk-stable] [PATCH] examples/l2fwd-keepalive: fix packet drops limited mbufs

2020-04-26 Thread Thomas Monjalon
13/03/2020 04:07, Jiang, MaoX:
> Tested-by:xix.zh...@intel.com

Please provide full name.





Re: [dpdk-dev] [PATCH] [v2 1/1] examples/l2fwd: add cmdline option for forwarding port info

2020-04-26 Thread Thomas Monjalon
05/04/2020 05:52, vattun...@marvell.com:
> From: Vamsi Attunuru 
> 
> Current l2fwd application configures adjacent ports as destination
> ports for forwarding the traffic which is a kind of static mapping
> that can not be altered by the command line options.
> 
> Patch adds a config option to pass the forwarding port pair mapping
> as a command line parameter which allows the user to pass required
> forwarding port mapping.
> 
> If no config argument is specified, destination port map is not
> changed and traffic gets forwarded with existing mapping.
> 
> When port pair mapping is passed in config option, destination port map
> is configured and traffic gets forwarded accordingly.
> 
> Ex: ./l2fwd -c 0xff -- -p 0x3f --config="(0,3)(1,4)(2,5)"

The option name "--config" is vague.
Could we call it "--portmap", or something similar?





Re: [dpdk-dev] [PATCH v2 2/2] l3fwd-power: implement proper shutdown handling

2020-04-26 Thread Thomas Monjalon
20/04/2020 19:56, Anatoly Burakov:
> Currently, shutdown for l3fwd-power application is all over the place
> and may or may not happen either in the signal handler or in the main()
> function. Fix this so that the signal handler will only set the exit
> variable, thereby allowing all of the loops to end properly and proceed
> to deinitialize everything.
> 
> Signed-off-by: Anatoly Burakov 
> Acked-by: David Hunt 
> Reviewed-by: Reshma Pattan 

Applied, thanks





[dpdk-dev] [dpdk-announce] release candidate 20.05-rc1

2020-04-26 Thread Thomas Monjalon
A new DPDK release candidate is ready for testing:
https://git.dpdk.org/dpdk/tag/?id=v20.05-rc1

There are 822 new patches in this snapshot.

Release notes:
https://doc.dpdk.org/guides/rel_notes/release_20_05.html

Highlights of 20.05-rc1, grouped by category:
* General
- low overhead tracing framework
- ring synchronisation modes for VMs and containers
- RCU API for deferred resource reclamation
* Networking
- flow aging API
- driver for Intel Foxville I225
* Cryptography
- event mode in the example application ipsec-secgw
* Baseband
- 5G driver for Intel FPGA-based N3000

We are still expecting some features for -rc2:
- packet processing graph
- telemetry rework
- flow rules performance test
- GCC 10 support

We have four small weeks to complete this release cycle.
DPDK 20.05-rc2 is expected on April 8th.

Please test and report issues on bugs.dpdk.org.
As a community, we must close as many bugs as possible for -rc2.
Please read this bugzilla-related message:
http://mails.dpdk.org/archives/dev/2020-April/165437.html

Thank you everyone




Re: [dpdk-dev] [dpdk-stable] [PATCH] examples/l2fwd-keepalive: fix packet drops limited mbufs

2020-04-26 Thread Jiang, MaoX
Tested-by:zhang,xi 

Best regards,
Jiang Mao


> -Original Message-
> From: stable [mailto:stable-boun...@dpdk.org] On Behalf Of Louise
> Kilheeney
> Sent: Wednesday, March 11, 2020 8:05 PM
> To: dev@dpdk.org
> Cc: Kilheeney, Louise ; sta...@dpdk.org
> Subject: [dpdk-stable] [PATCH] examples/l2fwd-keepalive: fix packet drops
> limited mbufs
> 
> MBUF pool of size 8192 was causing packet loss when using four ports. To fix
> this issue this patch specifies the number of MBUF's per port instead of
> having one set MBUF pool size, this way it will adapt to any number of ports.
> 
> Fixes: e64833f2273a ("examples/l2fwd-keepalive: add sample application")
> 
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Louise Kilheeney 
> ---
>  examples/l2fwd-keepalive/main.c | 17 ++---
>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/examples/l2fwd-keepalive/main.c b/examples/l2fwd-
> keepalive/main.c index b36834974..0f0010d51 100644
> --- a/examples/l2fwd-keepalive/main.c
> +++ b/examples/l2fwd-keepalive/main.c
> @@ -44,7 +44,7 @@
> 
>  #define RTE_LOGTYPE_L2FWD RTE_LOGTYPE_USER1
> 
> -#define NB_MBUF   8192
> +#define NB_MBUF_PER_PORT 3000
> 
>  #define MAX_PKT_BURST 32
>  #define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */ @@ -561,16
> +561,19 @@ main(int argc, char **argv)
>   if (ret < 0)
>   rte_exit(EXIT_FAILURE, "Invalid L2FWD arguments\n");
> 
> - /* create the mbuf pool */
> - l2fwd_pktmbuf_pool = rte_pktmbuf_pool_create("mbuf_pool",
> NB_MBUF, 32,
> - 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
> - if (l2fwd_pktmbuf_pool == NULL)
> - rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
> -
>   nb_ports = rte_eth_dev_count_avail();
>   if (nb_ports == 0)
>   rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
> 
> + /* create the mbuf pool */
> + unsigned int total_nb_mbufs = NB_MBUF_PER_PORT * nb_ports;
> +
> + l2fwd_pktmbuf_pool = rte_pktmbuf_pool_create("mbuf_pool",
> + total_nb_mbufs, 32, 0,
> RTE_MBUF_DEFAULT_BUF_SIZE,
> + rte_socket_id());
> + if (l2fwd_pktmbuf_pool == NULL)
> + rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
> +
>   /* reset l2fwd_dst_ports */
>   for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++)
>   l2fwd_dst_ports[portid] = 0;
> --
> 2.17.1



[dpdk-dev] [PATCH] maintainers: update for Intel ixgbe/igb/igc

2020-04-26 Thread Wei Zhao
Replace Wenzhuo Lu, Alvin Zhang and Konstantin Ananyev
with  Wei Zhao and Jeff Guo.

Signed-off-by: Wei Zhao 
---
 MAINTAINERS | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d31a80929..7a57535ee 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -649,7 +649,8 @@ F: doc/guides/nics/hinic.rst
 F: doc/guides/nics/features/hinic.ini
 
 Intel e1000
-M: Wenzhuo Lu 
+M: Wei Zhao 
+M: Jeff Guo 
 T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/e1000/
 F: doc/guides/nics/e1000em.rst
@@ -658,8 +659,8 @@ F: doc/guides/nics/features/e1000.ini
 F: doc/guides/nics/features/igb*.ini
 
 Intel ixgbe
-M: Wenzhuo Lu 
-M: Konstantin Ananyev 
+M: Wei Zhao 
+M: Jeff Guo 
 T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/ixgbe/
 F: doc/guides/nics/ixgbe.rst
@@ -700,7 +701,8 @@ F: doc/guides/nics/ice.rst
 F: doc/guides/nics/features/ice.ini
 
 Intel igc
-M: Alvin Zhang 
+M: Wei Zhao 
+M: Jeff Guo 
 T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/igc/
 F: doc/guides/nics/igc.rst
-- 
2.19.1



[dpdk-dev] [PATCH] maintainers: update for Intel ixgbe/igb/igc

2020-04-26 Thread Wei Zhao
Replace Wenzhuo Lu, Alvin Zhang and Konstantin Ananyev
with  Wei Zhao and Jeff Guo.

Signed-off-by: Wei Zhao 
---
 MAINTAINERS | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d31a80929..7a57535ee 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -649,7 +649,8 @@ F: doc/guides/nics/hinic.rst
 F: doc/guides/nics/features/hinic.ini
 
 Intel e1000
-M: Wenzhuo Lu 
+M: Wei Zhao 
+M: Jeff Guo 
 T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/e1000/
 F: doc/guides/nics/e1000em.rst
@@ -658,8 +659,8 @@ F: doc/guides/nics/features/e1000.ini
 F: doc/guides/nics/features/igb*.ini
 
 Intel ixgbe
-M: Wenzhuo Lu 
-M: Konstantin Ananyev 
+M: Wei Zhao 
+M: Jeff Guo 
 T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/ixgbe/
 F: doc/guides/nics/ixgbe.rst
@@ -700,7 +701,8 @@ F: doc/guides/nics/ice.rst
 F: doc/guides/nics/features/ice.ini
 
 Intel igc
-M: Alvin Zhang 
+M: Wei Zhao 
+M: Jeff Guo 
 T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/igc/
 F: doc/guides/nics/igc.rst
-- 
2.19.1



[dpdk-dev] [PATCH] maintainers: update for Intel ixgbe/igb/igc

2020-04-26 Thread Wei Zhao
Replace Wenzhuo Lu, Alvin Zhang and Konstantin Ananyev
with Wei Zhao and Jeff Guo.

Signed-off-by: Wei Zhao 
---
 MAINTAINERS | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d31a80929..7a57535ee 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -649,7 +649,8 @@ F: doc/guides/nics/hinic.rst
 F: doc/guides/nics/features/hinic.ini
 
 Intel e1000
-M: Wenzhuo Lu 
+M: Wei Zhao 
+M: Jeff Guo 
 T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/e1000/
 F: doc/guides/nics/e1000em.rst
@@ -658,8 +659,8 @@ F: doc/guides/nics/features/e1000.ini
 F: doc/guides/nics/features/igb*.ini
 
 Intel ixgbe
-M: Wenzhuo Lu 
-M: Konstantin Ananyev 
+M: Wei Zhao 
+M: Jeff Guo 
 T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/ixgbe/
 F: doc/guides/nics/ixgbe.rst
@@ -700,7 +701,8 @@ F: doc/guides/nics/ice.rst
 F: doc/guides/nics/features/ice.ini
 
 Intel igc
-M: Alvin Zhang 
+M: Wei Zhao 
+M: Jeff Guo 
 T: git://dpdk.org/next/dpdk-next-net-intel
 F: drivers/net/igc/
 F: doc/guides/nics/igc.rst
-- 
2.19.1



[dpdk-dev] [PATCH] lib/librte_hash: add rte_hash_del_key_fixed without compact

2020-04-26 Thread Lilijun (Jerry)
The keys idx are stored in rte_hash main bucket key slots and extend bucket key 
stots. 
We iterate every no empty Keys in h->buckets and h->buckets_ext from start to 
last. 
When deleting keys the function __rte_hash_compact_ll() may move last_bkt's key 
to previous bucket in order to compact extend bucket list.
If the previous bucket has been iterated, the moved key may be missed for 
users. 
Then those missed keys are leaked and rte_hash table can't be cleanup.
So we add a new API rte_hash_del_key_fixed() used in iterate loop to avoid this 
bugs.

---
 lib/librte_hash/rte_cuckoo_hash.c| 19 ++-
 lib/librte_hash/rte_hash.h   |  5 +
 lib/librte_hash/rte_hash_version.map |  1 +
 3 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/lib/librte_hash/rte_cuckoo_hash.c 
b/lib/librte_hash/rte_cuckoo_hash.c
index b52630b..2da3c1d 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -1523,7 +1523,7 @@ search_and_remove(const struct rte_hash *h, const void 
*key,
 
 static inline int32_t
 __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
-   hash_sig_t sig)
+   hash_sig_t sig, uint8_t compact)
 {
uint32_t prim_bucket_idx, sec_bucket_idx;
struct rte_hash_bucket *prim_bkt, *sec_bkt, *prev_bkt, *last_bkt;
@@ -1541,7 +1541,8 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, 
const void *key,
/* look for key in primary bucket */
ret = search_and_remove(h, key, prim_bkt, short_sig, &pos);
if (ret != -1) {
-   __rte_hash_compact_ll(h, prim_bkt, pos);
+   if (compact)
+   __rte_hash_compact_ll(h, prim_bkt, pos);
last_bkt = prim_bkt->next;
prev_bkt = prim_bkt;
goto return_bkt;
@@ -1553,7 +1554,8 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, 
const void *key,
FOR_EACH_BUCKET(cur_bkt, sec_bkt) {
ret = search_and_remove(h, key, cur_bkt, short_sig, &pos);
if (ret != -1) {
-   __rte_hash_compact_ll(h, cur_bkt, pos);
+   if (compact)
+   __rte_hash_compact_ll(h, cur_bkt, pos);
last_bkt = sec_bkt->next;
prev_bkt = sec_bkt;
goto return_bkt;
@@ -1607,14 +1609,21 @@ rte_hash_del_key_with_hash(const struct rte_hash *h,
const void *key, hash_sig_t sig)
 {
RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
-   return __rte_hash_del_key_with_hash(h, key, sig);
+   return __rte_hash_del_key_with_hash(h, key, sig, 1);
 }
 
 int32_t
 rte_hash_del_key(const struct rte_hash *h, const void *key)
 {
RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
-   return __rte_hash_del_key_with_hash(h, key, rte_hash_hash(h, key));
+   return __rte_hash_del_key_with_hash(h, key, rte_hash_hash(h, key), 1);
+}
+
+int32_t
+rte_hash_del_key_fixed(const struct rte_hash *h, const void *key)
+{
+   RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
+   return __rte_hash_del_key_with_hash(h, key, rte_hash_hash(h, key), 0);
 }
 
 int
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index eceb365..9b71d8a 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -297,6 +297,11 @@ rte_hash_add_key_with_hash(const struct rte_hash *h, const 
void *key, hash_sig_t
 int32_t
 rte_hash_del_key(const struct rte_hash *h, const void *key);
 
+
+/* for without compact */
+int32_t
+rte_hash_del_key_fixed(const struct rte_hash *h, const void *key);
+
 /**
  * Remove a key from an existing hash table.
  * This operation is not multi-thread safe
diff --git a/lib/librte_hash/rte_hash_version.map 
b/lib/librte_hash/rte_hash_version.map
index 30cc086..1941d17 100644
--- a/lib/librte_hash/rte_hash_version.map
+++ b/lib/librte_hash/rte_hash_version.map
@@ -11,6 +11,7 @@ DPDK_20.0 {
rte_hash_count;
rte_hash_create;
rte_hash_del_key;
+   rte_hash_del_key_fixed;
rte_hash_del_key_with_hash;
rte_hash_find_existing;
rte_hash_free;
-- 
2.19.1

-邮件原件-
发件人: Lilijun (Jerry) 
发送时间: 2020年4月18日 18:00
收件人: 'dev@dpdk.org' ; 'sta...@dpdk.org' 
主题: rte_hash bug: can't iterate all entries when deleting keys in rte_hash 
iterate loop.

Hi all,

In my test, entries can't be cleanup in rte_hash table when deleting keys 
in rte_hash iterate loop. The test steps:
1.  create a hash table table1 with limit 3, ext bucket enabled,  and 
insert 3 entries into this hash table.
2.  create a larger hash table table2 with limit 6, , ext bucket 
enabled.
3.  iterate all entries of table1 and insert them to the table2. Insert new 
1 entries to this table2.
4.  Then flush all entries from table2 b

[dpdk-dev] [PATCH] lib/librte_hash: avoid iterate bugs with delete keys.

2020-04-26 Thread Lilijun (Jerry)
>From 0faea722b82ffe30adfa55d5ea4ad3a23ed30d4e Mon Sep 17 00:00:00 2001
From: Yunjian Wang 
Date: Mon, 27 Apr 2020 11:12:25 +0800
Subject: [PATCH] lib/librte_hash: avoid iterate bugs with delete keys.

The keys idx are stored in rte_hash main and extend bucket key slots.
We iterate every no empty Keys in h->buckets and h->buckets_ext
from start to last. When deleting keys the function
__rte_hash_compact_ll() may move last_bkt's key to previous bucket
in order to compact extend bucket list.
If the previous bucket has been iterated, the moved key may be missed
for users. Then those missed keys are leaked and rte_hash table can't
be cleanup.
So we add a new API rte_hash_del_key_fixed() used in iterate loop
to avoid this bugs.

Signed-off-by: Lilijun 
Signed-off-by: Yunjian Wang 
---
 lib/librte_hash/rte_cuckoo_hash.c| 19 +-
 lib/librte_hash/rte_hash.h   | 30 
 lib/librte_hash/rte_hash_version.map |  1 +
 3 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/lib/librte_hash/rte_cuckoo_hash.c 
b/lib/librte_hash/rte_cuckoo_hash.c
index 38767a8a1..1b6f84759 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -1491,7 +1491,7 @@ search_and_remove(const struct rte_hash *h, const void 
*key,
 
 static inline int32_t
 __rte_hash_del_key_with_hash(const struct rte_hash *h, const void *key,
-   hash_sig_t sig)
+   hash_sig_t sig, uint8_t compact)
 {
uint32_t prim_bucket_idx, sec_bucket_idx;
struct rte_hash_bucket *prim_bkt, *sec_bkt, *prev_bkt, *last_bkt;
@@ -1509,7 +1509,8 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, 
const void *key,
/* look for key in primary bucket */
ret = search_and_remove(h, key, prim_bkt, short_sig, &pos);
if (ret != -1) {
-   __rte_hash_compact_ll(h, prim_bkt, pos);
+   if (compact)
+   __rte_hash_compact_ll(h, prim_bkt, pos);
last_bkt = prim_bkt->next;
prev_bkt = prim_bkt;
goto return_bkt;
@@ -1521,7 +1522,8 @@ __rte_hash_del_key_with_hash(const struct rte_hash *h, 
const void *key,
FOR_EACH_BUCKET(cur_bkt, sec_bkt) {
ret = search_and_remove(h, key, cur_bkt, short_sig, &pos);
if (ret != -1) {
-   __rte_hash_compact_ll(h, cur_bkt, pos);
+   if (compact)
+   __rte_hash_compact_ll(h, cur_bkt, pos);
last_bkt = sec_bkt->next;
prev_bkt = sec_bkt;
goto return_bkt;
@@ -1576,14 +1578,21 @@ rte_hash_del_key_with_hash(const struct rte_hash *h,
const void *key, hash_sig_t sig)
 {
RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
-   return __rte_hash_del_key_with_hash(h, key, sig);
+   return __rte_hash_del_key_with_hash(h, key, sig, 1);
 }
 
 int32_t
 rte_hash_del_key(const struct rte_hash *h, const void *key)
 {
RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
-   return __rte_hash_del_key_with_hash(h, key, rte_hash_hash(h, key));
+   return __rte_hash_del_key_with_hash(h, key, rte_hash_hash(h, key), 1);
+}
+
+int32_t
+rte_hash_del_key_fixed(const struct rte_hash *h, const void *key)
+{
+   RETURN_IF_TRUE(((h == NULL) || (key == NULL)), -EINVAL);
+   return __rte_hash_del_key_with_hash(h, key, rte_hash_hash(h, key), 0);
 }
 
 int
diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
index bff40251b..1bea33809 100644
--- a/lib/librte_hash/rte_hash.h
+++ b/lib/librte_hash/rte_hash.h
@@ -309,6 +309,36 @@ rte_hash_add_key_with_hash(const struct rte_hash *h, const 
void *key, hash_sig_t
 int32_t
 rte_hash_del_key(const struct rte_hash *h, const void *key);
 
+/**
+ * Remove a key without compact ext bucket list from an existing hash table.
+ * This operation is not multi-thread safe
+ * and should only be called from one thread by default.
+ * Thread safety can be enabled by setting flag during
+ * table creation.
+ * If RTE_HASH_EXTRA_FLAGS_NO_FREE_ON_DEL or
+ * RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF is enabled,
+ * the key index returned by rte_hash_add_key_xxx APIs will not be
+ * freed by this API. rte_hash_free_key_with_position API must be called
+ * additionally to free the index associated with the key.
+ * rte_hash_free_key_with_position API should be called after all
+ * the readers have stopped referencing the entry corresponding to
+ * this key. RCU mechanisms could be used to determine such a state.
+ *
+ * @param h
+ *   Hash table to remove the key from.
+ * @param key
+ *   Key to remove from the hash table.
+ * @return
+ *   - -EINVAL if the parameters are invalid.
+ *   - -ENOENT if the key is not found.
+ *   - A positive value that can be used by the caller as an offset into an
+ * arr

Re: [dpdk-dev] [PATCH] crypto/ccp: fix fd leak on probe failure

2020-04-26 Thread Kumar, Ravi1
[AMD Public Use]

Acked-by: Ravi Kumar 
>-Original Message-
>From: wangyunjian  
>Sent: Sunday, April 26, 2020 12:06 PM
>To: dev@dpdk.org; Kumar, Ravi1 
>Cc: jerry.lili...@huawei.com; xudin...@huawei.com; Yunjian Wang 
>; sta...@dpdk.org
>Subject: [dpdk-dev] [PATCH] crypto/ccp: fix fd leak on probe failure
>
>
>From: Yunjian Wang 
>
>Zero is a valid fd. When ccp_probe_device() is failed, the uio_fd won't be 
>closed thus leading fd leak.
>
>Fixes: ef4b04f87fa6 ("crypto/ccp: support device init")
>Cc: sta...@dpdk.org
>
>Signed-off-by: Yunjian Wang 
>---
> drivers/crypto/ccp/ccp_dev.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/crypto/ccp/ccp_dev.c b/drivers/crypto/ccp/ccp_dev.c index 
>80fe6a453..7d98b2eb2 100644
>--- a/drivers/crypto/ccp/ccp_dev.c
>+++ b/drivers/crypto/ccp/ccp_dev.c
>@@ -760,7 +760,7 @@ ccp_probe_device(const char *dirname, uint16_t domain,
>return 0;
> fail:
>CCP_LOG_ERR("CCP Device probe failed");
>-   if (uio_fd > 0)
>+   if (uio_fd >= 0)
>close(uio_fd);
>if (ccp_dev)
>rte_free(ccp_dev);
>--
>2.19.1
>
>
>


Re: [dpdk-dev] [PATCH dpdk-dev v3 2/2] mempool: use shared memzone for rte_mempool_ops

2020-04-26 Thread Tonghao Zhang
On Thu, Apr 23, 2020 at 9:38 PM Andrew Rybchenko
 wrote:
>
> On 4/13/20 5:21 PM, xiangxia.m@gmail.com wrote:
> > From: Tonghao Zhang 
> >
> > The order of mempool initiation affects mempool index in the
> > rte_mempool_ops_table. For example, when building APPs with:
> >
> > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> >
> > The "bucket" mempool will be registered firstly, and its index
> > in table is 0 while the index of "ring" mempool is 1. DPDK
> > uses the mk/rte.app.mk to build APPs, and others, for example,
> > Open vSwitch, use the libdpdk.a or libdpdk.so to build it.
> > The mempool lib linked in dpdk and Open vSwitch is different.
> >
> > The mempool can be used between primary and secondary process,
> > such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled).
> > There will be a crash because dpdk-pdump creates the "ring_mp_mc"
> > ring which index in table is 0, but the index of "bucket" ring
> > is 0 in Open vSwitch. If Open vSwitch use the index 0 to get
> > mempool ops and malloc memory from mempool. The crash will occur:
> >
> > bucket_dequeue (access null and crash)
> > rte_mempool_get_ops (should get "ring_mp_mc",
> >  but get "bucket" mempool)
> > rte_mempool_ops_dequeue_bulk
> > ...
> > rte_pktmbuf_alloc
> > rte_pktmbuf_copy
> > pdump_copy
> > pdump_rx
> > rte_eth_rx_burst
> >
> > To avoid the crash, there are some solution:
> > * constructor priority: Different mempool uses different
> >   priority in RTE_INIT, but it's not easy to maintain.
> >
> > * change mk/rte.app.mk: Change the order in mk/rte.app.mk to
> >   be same as libdpdk.a/libdpdk.so, but when adding a new mempool
> >   driver in future, we must make sure the order.
> >
> > * register mempool orderly: Sort the mempool when registering,
> >   so the lib linked will not affect the index in mempool table.
> >   but the number of mempool libraries may be different.
> >
> > * shared memzone: The primary process allocates a struct in
> >   shared memory named memzone, When we register a mempool ops,
> >   we first get a name and id from the shared struct: with the lock held,
> >   lookup for the registered name and return its index, else
> >   get the last id and copy the name in the struct.
> >
> > Previous discussion: 
> > https://mails.dpdk.org/archives/dev/2020-March/159354.html
> >
> > Suggested-by: Olivier Matz 
> > Suggested-by: Jerin Jacob 
> > Signed-off-by: Tonghao Zhang 
> > ---
> > v2:
> > * fix checkpatch warning
> > ---
> >  lib/librte_mempool/rte_mempool.h | 28 +++-
> >  lib/librte_mempool/rte_mempool_ops.c | 89 
> > 
> >  2 files changed, 96 insertions(+), 21 deletions(-)
> >
> > diff --git a/lib/librte_mempool/rte_mempool.h 
> > b/lib/librte_mempool/rte_mempool.h
> > index c90cf31467b2..2709b9e1d51b 100644
> > --- a/lib/librte_mempool/rte_mempool.h
> > +++ b/lib/librte_mempool/rte_mempool.h
> > @@ -50,6 +50,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #ifdef __cplusplus
> >  extern "C" {
> > @@ -678,7 +679,6 @@ struct rte_mempool_ops {
> >   */
> >  struct rte_mempool_ops_table {
> >   rte_spinlock_t sl; /**< Spinlock for add/delete. */
> > - uint32_t num_ops;  /**< Number of used ops structs in the table. 
> > */
> >   /**
> >* Storage for all possible ops structs.
> >*/
> > @@ -910,6 +910,30 @@ int rte_mempool_ops_get_info(const struct rte_mempool 
> > *mp,
> >   */
> >  int rte_mempool_register_ops(const struct rte_mempool_ops *ops);
> >
> > +struct rte_mempool_shared_ops {
> > + size_t num_mempool_ops;
>
> Is there any specific reason to change type from uint32_t used
> above to size_t? I think that uint32_t is better here since
> it is just a number, not a size of memory or related value.
Thanks for you review. busy to commit other patch, so that be delayed.
I will change that in next version.
> > + struct {
> > + char name[RTE_MEMPOOL_OPS_NAMESIZE];
> > + } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> > +
> > + rte_spinlock_t mempool;
> > +};
> > +
> > +static inline int
> > +mempool_ops_register_cb(const void *arg)
> > +{
> > + const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg;
> > +
> > + return rte_mempool_register_ops(h);
> > +}
> > +
> > +static inline void
> > +mempool_ops_register(const struct rte_mempool_ops *ops)
> > +{
> > + rte_init_register(mempool_ops_register_cb, (const void *)ops,
> > +   RTE_INIT_PRE);
> > +}
> > +
> >  /**
> >   * Macro to statically register the ops of a mempool handler.
> >   * Note that the rte_mempool_register_ops fails silently here when
> > @@ -918,7 +942,7 @@ int rte_mempool_ops_get_info(const struct rte_mempool 
> > *mp,
> >  #define MEMPOOL_REGISTER_OPS(ops)\
> >   RTE_INIT(mp_hdlr_init_##ops)\
> >   { 

[dpdk-dev] [PATCH v2] net/axgbe: support sfp module EEPROM

2020-04-26 Thread asomalap
From: Amaranath Somalapuram 

Adding API for get_module_eeprom and get_module_info.

Signed-off-by: Amaranath Somalapuram 
---
 doc/guides/nics/features/axgbe.ini |   1 +
 drivers/net/axgbe/axgbe_ethdev.c   |   2 +
 drivers/net/axgbe/axgbe_phy.h  |   4 ++
 drivers/net/axgbe/axgbe_phy_impl.c | 107 +
 4 files changed, 114 insertions(+)

diff --git a/doc/guides/nics/features/axgbe.ini 
b/doc/guides/nics/features/axgbe.ini
index 0becaa097..80a3bcee1 100644
--- a/doc/guides/nics/features/axgbe.ini
+++ b/doc/guides/nics/features/axgbe.ini
@@ -18,3 +18,4 @@ Basic stats  = Y
 Linux UIO= Y
 x86-32   = Y
 x86-64   = Y
+Module EEPROM dump   = Y
diff --git a/drivers/net/axgbe/axgbe_ethdev.c b/drivers/net/axgbe/axgbe_ethdev.c
index 867058845..ea2f9bba1 100644
--- a/drivers/net/axgbe/axgbe_ethdev.c
+++ b/drivers/net/axgbe/axgbe_ethdev.c
@@ -214,6 +214,8 @@ static const struct eth_dev_ops axgbe_eth_dev_ops = {
.dev_supported_ptypes_get = axgbe_dev_supported_ptypes_get,
.rx_descriptor_status = axgbe_dev_rx_descriptor_status,
.tx_descriptor_status = axgbe_dev_tx_descriptor_status,
+   .get_module_info  = axgbe_get_module_info,
+   .get_module_eeprom= axgbe_get_module_eeprom,
 };
 
 static int axgbe_phy_reset(struct axgbe_port *pdata)
diff --git a/drivers/net/axgbe/axgbe_phy.h b/drivers/net/axgbe/axgbe_phy.h
index 77ee20a31..d9d7fde41 100644
--- a/drivers/net/axgbe/axgbe_phy.h
+++ b/drivers/net/axgbe/axgbe_phy.h
@@ -188,5 +188,9 @@
 #define DUPLEX_FULL 0x01
 #define DUPLEX_UNKNOWN  0xff
 
+int axgbe_get_module_info(struct rte_eth_dev *dev,
+   struct rte_eth_dev_module_info *modinfo);
+int axgbe_get_module_eeprom(struct rte_eth_dev *dev,
+ struct rte_dev_eeprom_info *info);
 #endif
 /* PHY */
diff --git a/drivers/net/axgbe/axgbe_phy_impl.c 
b/drivers/net/axgbe/axgbe_phy_impl.c
index 02236ec19..53716071f 100644
--- a/drivers/net/axgbe/axgbe_phy_impl.c
+++ b/drivers/net/axgbe/axgbe_phy_impl.c
@@ -141,12 +141,18 @@ enum axgbe_sfp_speed {
 
 #define AXGBE_SFP_EXTD_CC  31
 
+#define AXGBE_SFP_EEPROM_PAGE_SIZE 256
+
 struct axgbe_sfp_eeprom {
u8 base[64];
u8 extd[32];
u8 vendor[32];
 };
 
+struct axgbe_sfp_eeprom_module {
+   u8 base[256];
+};
+
 #define AXGBE_BEL_FUSE_VENDOR  "BEL-FUSE"
 #define AXGBE_BEL_FUSE_PARTNO  "1GBT-SFP06"
 
@@ -734,6 +740,106 @@ static int axgbe_phy_sfp_read_eeprom(struct axgbe_port 
*pdata)
return ret;
 }
 
+int axgbe_get_module_info(struct rte_eth_dev *dev,
+   struct rte_eth_dev_module_info *modinfo)
+{
+   struct axgbe_port *pdata = dev->data->dev_private;
+   struct axgbe_sfp_eeprom sfp_eeprom;
+   uint8_t eeprom_addr;
+   int ret;
+
+   ret = axgbe_phy_get_comm_ownership(pdata);
+
+   if (ret)
+   return -EIO;
+
+   ret = axgbe_phy_sfp_get_mux(pdata);
+
+   if (ret) {
+   PMD_DRV_LOG(ERR, "I2C error setting SFP MUX\n");
+   return ret;
+   }
+
+   eeprom_addr = 0;
+   ret = axgbe_phy_i2c_read(pdata, AXGBE_SFP_SERIAL_ID_ADDRESS,
+   &eeprom_addr, sizeof(eeprom_addr),
+&sfp_eeprom, sizeof(sfp_eeprom));
+
+   if (ret) {
+   PMD_DRV_LOG(ERR, "I2C error reading SFP EEPROM\n");
+   goto put;
+   }
+
+   if (sfp_eeprom.extd[AXGBE_SFP_EXTD_SFF_8472] != 0xff) {
+   if (sfp_eeprom.extd[AXGBE_SFP_EXTD_SFF_8472] == 0) {
+   modinfo->type = RTE_ETH_MODULE_SFF_8079;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8079_LEN;
+   } else {
+   modinfo->type = RTE_ETH_MODULE_SFF_8472;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8472_LEN;
+   }
+   }
+
+
+put:
+   axgbe_phy_sfp_put_mux(pdata);
+   axgbe_phy_put_comm_ownership(pdata);
+   return 0;
+}
+
+int axgbe_get_module_eeprom(struct rte_eth_dev *dev,
+   struct rte_dev_eeprom_info *info)
+{
+   struct axgbe_port *pdata = dev->data->dev_private;
+   struct axgbe_sfp_eeprom_module sfp_eeprom;
+   uint8_t eeprom_addr;
+   uint8_t *data;
+   uint32_t i;
+   int ret;
+
+   ret = axgbe_phy_get_comm_ownership(pdata);
+
+   if (ret)
+   return -EIO;
+
+   if (!info || !info->length || !info->data)
+   return -EINVAL;
+
+   ret = axgbe_phy_sfp_get_mux(pdata);
+   if (ret) {
+   PMD_DRV_LOG(ERR, "I2C error setting SFP MUX\n");
+   return ret;
+   }
+
+   eeprom_addr = 0;
+   ret = axgbe_phy_i2c_read(pdata, AXGBE_SFP_SERIAL_ID_ADDRESS,
+   &eeprom_addr, sizeof(eeprom_addr),
+   &sfp_eeprom, sizeof(

[dpdk-dev] [PATCH v2] net/axgbe: support sfp module EEPROM

2020-04-26 Thread asomalap
From: Amaranath Somalapuram 

Adding API for get_module_eeprom and get_module_info.

Signed-off-by: Amaranath Somalapuram 
---
 doc/guides/nics/features/axgbe.ini |   1 +
 drivers/net/axgbe/axgbe_ethdev.c   |   2 +
 drivers/net/axgbe/axgbe_phy.h  |   4 ++
 drivers/net/axgbe/axgbe_phy_impl.c | 107 +
 4 files changed, 114 insertions(+)

diff --git a/doc/guides/nics/features/axgbe.ini 
b/doc/guides/nics/features/axgbe.ini
index 0becaa097..80a3bcee1 100644
--- a/doc/guides/nics/features/axgbe.ini
+++ b/doc/guides/nics/features/axgbe.ini
@@ -18,3 +18,4 @@ Basic stats  = Y
 Linux UIO= Y
 x86-32   = Y
 x86-64   = Y
+Module EEPROM dump   = Y
diff --git a/drivers/net/axgbe/axgbe_ethdev.c b/drivers/net/axgbe/axgbe_ethdev.c
index 867058845..ea2f9bba1 100644
--- a/drivers/net/axgbe/axgbe_ethdev.c
+++ b/drivers/net/axgbe/axgbe_ethdev.c
@@ -214,6 +214,8 @@ static const struct eth_dev_ops axgbe_eth_dev_ops = {
.dev_supported_ptypes_get = axgbe_dev_supported_ptypes_get,
.rx_descriptor_status = axgbe_dev_rx_descriptor_status,
.tx_descriptor_status = axgbe_dev_tx_descriptor_status,
+   .get_module_info  = axgbe_get_module_info,
+   .get_module_eeprom= axgbe_get_module_eeprom,
 };
 
 static int axgbe_phy_reset(struct axgbe_port *pdata)
diff --git a/drivers/net/axgbe/axgbe_phy.h b/drivers/net/axgbe/axgbe_phy.h
index 77ee20a31..d9d7fde41 100644
--- a/drivers/net/axgbe/axgbe_phy.h
+++ b/drivers/net/axgbe/axgbe_phy.h
@@ -188,5 +188,9 @@
 #define DUPLEX_FULL 0x01
 #define DUPLEX_UNKNOWN  0xff
 
+int axgbe_get_module_info(struct rte_eth_dev *dev,
+   struct rte_eth_dev_module_info *modinfo);
+int axgbe_get_module_eeprom(struct rte_eth_dev *dev,
+ struct rte_dev_eeprom_info *info);
 #endif
 /* PHY */
diff --git a/drivers/net/axgbe/axgbe_phy_impl.c 
b/drivers/net/axgbe/axgbe_phy_impl.c
index 02236ec19..53716071f 100644
--- a/drivers/net/axgbe/axgbe_phy_impl.c
+++ b/drivers/net/axgbe/axgbe_phy_impl.c
@@ -141,12 +141,18 @@ enum axgbe_sfp_speed {
 
 #define AXGBE_SFP_EXTD_CC  31
 
+#define AXGBE_SFP_EEPROM_PAGE_SIZE 256
+
 struct axgbe_sfp_eeprom {
u8 base[64];
u8 extd[32];
u8 vendor[32];
 };
 
+struct axgbe_sfp_eeprom_module {
+   u8 base[256];
+};
+
 #define AXGBE_BEL_FUSE_VENDOR  "BEL-FUSE"
 #define AXGBE_BEL_FUSE_PARTNO  "1GBT-SFP06"
 
@@ -734,6 +740,106 @@ static int axgbe_phy_sfp_read_eeprom(struct axgbe_port 
*pdata)
return ret;
 }
 
+int axgbe_get_module_info(struct rte_eth_dev *dev,
+   struct rte_eth_dev_module_info *modinfo)
+{
+   struct axgbe_port *pdata = dev->data->dev_private;
+   struct axgbe_sfp_eeprom sfp_eeprom;
+   uint8_t eeprom_addr;
+   int ret;
+
+   ret = axgbe_phy_get_comm_ownership(pdata);
+
+   if (ret)
+   return -EIO;
+
+   ret = axgbe_phy_sfp_get_mux(pdata);
+
+   if (ret) {
+   PMD_DRV_LOG(ERR, "I2C error setting SFP MUX\n");
+   return ret;
+   }
+
+   eeprom_addr = 0;
+   ret = axgbe_phy_i2c_read(pdata, AXGBE_SFP_SERIAL_ID_ADDRESS,
+   &eeprom_addr, sizeof(eeprom_addr),
+&sfp_eeprom, sizeof(sfp_eeprom));
+
+   if (ret) {
+   PMD_DRV_LOG(ERR, "I2C error reading SFP EEPROM\n");
+   goto put;
+   }
+
+   if (sfp_eeprom.extd[AXGBE_SFP_EXTD_SFF_8472] != 0xff) {
+   if (sfp_eeprom.extd[AXGBE_SFP_EXTD_SFF_8472] == 0) {
+   modinfo->type = RTE_ETH_MODULE_SFF_8079;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8079_LEN;
+   } else {
+   modinfo->type = RTE_ETH_MODULE_SFF_8472;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8472_LEN;
+   }
+   }
+
+
+put:
+   axgbe_phy_sfp_put_mux(pdata);
+   axgbe_phy_put_comm_ownership(pdata);
+   return 0;
+}
+
+int axgbe_get_module_eeprom(struct rte_eth_dev *dev,
+   struct rte_dev_eeprom_info *info)
+{
+   struct axgbe_port *pdata = dev->data->dev_private;
+   struct axgbe_sfp_eeprom_module sfp_eeprom;
+   uint8_t eeprom_addr;
+   uint8_t *data;
+   uint32_t i;
+   int ret;
+
+   ret = axgbe_phy_get_comm_ownership(pdata);
+
+   if (ret)
+   return -EIO;
+
+   if (!info || !info->length || !info->data)
+   return -EINVAL;
+
+   ret = axgbe_phy_sfp_get_mux(pdata);
+   if (ret) {
+   PMD_DRV_LOG(ERR, "I2C error setting SFP MUX\n");
+   return ret;
+   }
+
+   eeprom_addr = 0;
+   ret = axgbe_phy_i2c_read(pdata, AXGBE_SFP_SERIAL_ID_ADDRESS,
+   &eeprom_addr, sizeof(eeprom_addr),
+   &sfp_eeprom, sizeof(

Re: [dpdk-dev] [PATCH v2] net/axgbe: support sfp module EEPROM

2020-04-26 Thread Somalapuram, Amaranath
Hi Ravi,
Can you acknowledge this patch.

Regards,
S.Amarnath

-Original Message-
From: Somalapuram, Amaranath  
Sent: Monday, April 27, 2020 11:13 AM
To: dev@dpdk.org
Cc: Kumar, Ravi1 
Subject: [PATCH v2] net/axgbe: support sfp module EEPROM

From: Amaranath Somalapuram 

Adding API for get_module_eeprom and get_module_info.

Signed-off-by: Amaranath Somalapuram 
---
 doc/guides/nics/features/axgbe.ini |   1 +
 drivers/net/axgbe/axgbe_ethdev.c   |   2 +
 drivers/net/axgbe/axgbe_phy.h  |   4 ++
 drivers/net/axgbe/axgbe_phy_impl.c | 107 +
 4 files changed, 114 insertions(+)

diff --git a/doc/guides/nics/features/axgbe.ini 
b/doc/guides/nics/features/axgbe.ini
index 0becaa097..80a3bcee1 100644
--- a/doc/guides/nics/features/axgbe.ini
+++ b/doc/guides/nics/features/axgbe.ini
@@ -18,3 +18,4 @@ Basic stats  = Y
 Linux UIO= Y
 x86-32   = Y
 x86-64   = Y
+Module EEPROM dump   = Y
diff --git a/drivers/net/axgbe/axgbe_ethdev.c b/drivers/net/axgbe/axgbe_ethdev.c
index 867058845..ea2f9bba1 100644
--- a/drivers/net/axgbe/axgbe_ethdev.c
+++ b/drivers/net/axgbe/axgbe_ethdev.c
@@ -214,6 +214,8 @@ static const struct eth_dev_ops axgbe_eth_dev_ops = {
.dev_supported_ptypes_get = axgbe_dev_supported_ptypes_get,
.rx_descriptor_status = axgbe_dev_rx_descriptor_status,
.tx_descriptor_status = axgbe_dev_tx_descriptor_status,
+   .get_module_info  = axgbe_get_module_info,
+   .get_module_eeprom= axgbe_get_module_eeprom,
 };
 
 static int axgbe_phy_reset(struct axgbe_port *pdata) diff --git 
a/drivers/net/axgbe/axgbe_phy.h b/drivers/net/axgbe/axgbe_phy.h index 
77ee20a31..d9d7fde41 100644
--- a/drivers/net/axgbe/axgbe_phy.h
+++ b/drivers/net/axgbe/axgbe_phy.h
@@ -188,5 +188,9 @@
 #define DUPLEX_FULL 0x01
 #define DUPLEX_UNKNOWN  0xff
 
+int axgbe_get_module_info(struct rte_eth_dev *dev,
+   struct rte_eth_dev_module_info *modinfo); int 
+axgbe_get_module_eeprom(struct rte_eth_dev *dev,
+ struct rte_dev_eeprom_info *info);
 #endif
 /* PHY */
diff --git a/drivers/net/axgbe/axgbe_phy_impl.c 
b/drivers/net/axgbe/axgbe_phy_impl.c
index 02236ec19..53716071f 100644
--- a/drivers/net/axgbe/axgbe_phy_impl.c
+++ b/drivers/net/axgbe/axgbe_phy_impl.c
@@ -141,12 +141,18 @@ enum axgbe_sfp_speed {
 
 #define AXGBE_SFP_EXTD_CC  31
 
+#define AXGBE_SFP_EEPROM_PAGE_SIZE 256
+
 struct axgbe_sfp_eeprom {
u8 base[64];
u8 extd[32];
u8 vendor[32];
 };
 
+struct axgbe_sfp_eeprom_module {
+   u8 base[256];
+};
+
 #define AXGBE_BEL_FUSE_VENDOR  "BEL-FUSE"
 #define AXGBE_BEL_FUSE_PARTNO  "1GBT-SFP06"
 
@@ -734,6 +740,106 @@ static int axgbe_phy_sfp_read_eeprom(struct axgbe_port 
*pdata)
return ret;
 }
 
+int axgbe_get_module_info(struct rte_eth_dev *dev,
+   struct rte_eth_dev_module_info *modinfo) {
+   struct axgbe_port *pdata = dev->data->dev_private;
+   struct axgbe_sfp_eeprom sfp_eeprom;
+   uint8_t eeprom_addr;
+   int ret;
+
+   ret = axgbe_phy_get_comm_ownership(pdata);
+
+   if (ret)
+   return -EIO;
+
+   ret = axgbe_phy_sfp_get_mux(pdata);
+
+   if (ret) {
+   PMD_DRV_LOG(ERR, "I2C error setting SFP MUX\n");
+   return ret;
+   }
+
+   eeprom_addr = 0;
+   ret = axgbe_phy_i2c_read(pdata, AXGBE_SFP_SERIAL_ID_ADDRESS,
+   &eeprom_addr, sizeof(eeprom_addr),
+&sfp_eeprom, sizeof(sfp_eeprom));
+
+   if (ret) {
+   PMD_DRV_LOG(ERR, "I2C error reading SFP EEPROM\n");
+   goto put;
+   }
+
+   if (sfp_eeprom.extd[AXGBE_SFP_EXTD_SFF_8472] != 0xff) {
+   if (sfp_eeprom.extd[AXGBE_SFP_EXTD_SFF_8472] == 0) {
+   modinfo->type = RTE_ETH_MODULE_SFF_8079;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8079_LEN;
+   } else {
+   modinfo->type = RTE_ETH_MODULE_SFF_8472;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8472_LEN;
+   }
+   }
+
+
+put:
+   axgbe_phy_sfp_put_mux(pdata);
+   axgbe_phy_put_comm_ownership(pdata);
+   return 0;
+}
+
+int axgbe_get_module_eeprom(struct rte_eth_dev *dev,
+   struct rte_dev_eeprom_info *info)
+{
+   struct axgbe_port *pdata = dev->data->dev_private;
+   struct axgbe_sfp_eeprom_module sfp_eeprom;
+   uint8_t eeprom_addr;
+   uint8_t *data;
+   uint32_t i;
+   int ret;
+
+   ret = axgbe_phy_get_comm_ownership(pdata);
+
+   if (ret)
+   return -EIO;
+
+   if (!info || !info->length || !info->data)
+   return -EINVAL;
+
+   ret = axgbe_phy_sfp_get_mux(pdata);
+   if (ret) {
+   PMD_DRV_LOG(ERR, "I2C error sett

[dpdk-dev] [PATCH v1] maintainers: update for AMD xgbe and ccp crypto

2020-04-26 Thread asomalap
From: Amaranath Somalapuram 

Ownership change.

Signed-off-by: Amaranath Somalapuram 
---
 MAINTAINERS | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d31a80929..c3dcedbb3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -566,7 +566,7 @@ F: doc/guides/nics/ena.rst
 F: doc/guides/nics/features/ena.ini
 
 AMD axgbe
-M: Ravi Kumar 
+M: Somalapuram Amaranath 
 F: drivers/net/axgbe/
 F: doc/guides/nics/axgbe.rst
 F: doc/guides/nics/features/axgbe.ini
@@ -953,7 +953,7 @@ T: git://dpdk.org/next/dpdk-next-crypto
 F: doc/guides/cryptodevs/features/default.ini
 
 AMD CCP Crypto
-M: Ravi Kumar 
+M: Somalapuram Amaranath 
 F: drivers/crypto/ccp/
 F: doc/guides/cryptodevs/ccp.rst
 F: doc/guides/cryptodevs/features/ccp.ini
-- 
2.17.1



Re: [dpdk-dev] [PATCH v1] maintainers: update for AMD xgbe and ccp crypto

2020-04-26 Thread Kumar, Ravi1
[AMD Public Use]

Acked-by: Ravi Kumar 
>-Original Message-
>From: Somalapuram, Amaranath  
>Sent: Monday, April 27, 2020 11:42 AM
>To: dev@dpdk.org
>Cc: Kumar, Ravi1 
>Subject: [PATCH v1] maintainers: update for AMD xgbe and ccp crypto
>
>From: Amaranath Somalapuram 
>
>Ownership change.
>
>Signed-off-by: Amaranath Somalapuram 
>---
> MAINTAINERS | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/MAINTAINERS b/MAINTAINERS
>index d31a80929..c3dcedbb3 100644
>--- a/MAINTAINERS
>+++ b/MAINTAINERS
>@@ -566,7 +566,7 @@ F: doc/guides/nics/ena.rst
> F: doc/guides/nics/features/ena.ini
> 
> AMD axgbe
>-M: Ravi Kumar 
>+M: Somalapuram Amaranath 
> F: drivers/net/axgbe/
> F: doc/guides/nics/axgbe.rst
> F: doc/guides/nics/features/axgbe.ini
>@@ -953,7 +953,7 @@ T: git://dpdk.org/next/dpdk-next-crypto
> F: doc/guides/cryptodevs/features/default.ini
> 
> AMD CCP Crypto
>-M: Ravi Kumar 
>+M: Somalapuram Amaranath 
> F: drivers/crypto/ccp/
> F: doc/guides/cryptodevs/ccp.rst
> F: doc/guides/cryptodevs/features/ccp.ini
>-- 
>2.17.1
>


Re: [dpdk-dev] [PATCH 3/3] maintainers: update for driver testing tool

2020-04-26 Thread Wu, Jingjing



> -Original Message-
> From: Xing, Beilei 
> Sent: Sunday, April 26, 2020 4:22 PM
> To: dev@dpdk.org; Wu, Jingjing ; Lu, Wenzhuo
> ; Zhang, Qi Z 
> Subject: [PATCH 3/3] maintainers: update for driver testing tool
> 
> Replace Jingjing Wu with Beilei Xing.
> 
> Signed-off-by: Beilei Xing 

Acked-by: Jingjing Wu 


Re: [dpdk-dev] [PATCH] doc: refine ethernet and VLAN flow rule items

2020-04-26 Thread Ori Kam
Hi Stephen,

> -Original Message-
> From: Stephen Hemminger 
> Sent: Sunday, April 26, 2020 8:02 PM
> To: Dekel Peled 
> Cc: Andrew Rybchenko ; Ori Kam
> ; john.mcnam...@intel.com;
> marko.kovace...@intel.com; Thomas Monjalon ;
> ferruh.yi...@intel.com; dev@dpdk.org; Asaf Penso 
> Subject: Re: [dpdk-dev] [PATCH] doc: refine ethernet and VLAN flow rule items
> 
> On Sun, 26 Apr 2020 09:18:54 +
> Dekel Peled  wrote:
> 
> > Thanks, PSB.
> >
> > > -Original Message-
> > > From: Andrew Rybchenko 
> > > Sent: Saturday, April 25, 2020 5:00 PM
> > > To: Dekel Peled ; Ori Kam ;
> > > john.mcnam...@intel.com; marko.kovace...@intel.com; Thomas Monjalon
> > > ; ferruh.yi...@intel.com
> > > Cc: dev@dpdk.org; Asaf Penso 
> > > Subject: Re: [PATCH] doc: refine ethernet and VLAN flow rule items
> > >
> > > On 4/23/20 9:30 PM, Dekel Peled wrote:
> > > > Specified pattern may be translated in different manner.
> > > > For example the pattern "eth / ipv4" can be translated to match
> > > > untagged packets only, since the pattern doesn't specify a vlan item.
> > >
> > > vlan -> VLAN
> >
> > I will change to uppercase.
> >
> > >
> > > > It can also be translated to match both tagged and untagged packets,
> > > > for the same reason.
> > > > This patch updates the rte_flow documentation to clearly specify the
> > > > required pattern to use.
> > > > For example:
> > > > To match tagged ipv4 packets, the pattern "eth type is 0x8100 / vlan /
> > > > ipv4 / end" should be used.
> > >
> > > Isn't eth / vlan / ipv4 /end sufficient? What's the difference?
> > > I guess later should allow any VLAN TPID, but it is greyish since it is HW
> > > dependent.
> >
> > In the example I wanted to show explicit rule, to emphasize the importance 
> > of
> detailed pattern structure.
> >
> > >
> > > > To match untagged ipv4 packets, the pattern "eth type is 0x0800 /
> > > > ipv4 / end" should be used.
> > >
> > > What about eth / ipv4 / end?
> > > Does usage of ipv4 assume that EtherType is 0x0800?
> >
> > Same as above.
> >
> > >
> > > > To match both tagged and untagged packets, the pattern "eth / end"
> > > > should be used.
> > >
> > > The interesting question is what should be used if I want either tagged or
> > > untagged IPv4 packets. I think it worse to mention to make the picture
> > > complete.
> >
> > To match any IPV4 packet, either tagged or untagged, need to apply two
> rules.
> > One for tagged packets and the other for untagged packets.
> > I will add this example as well.
> >
> > >
> > > > Signed-off-by: Dekel Peled 
> 
> All this reminds me that "code is the best documentation" and there
> is no working code that does a complete software implementation of the
> rte_flow engine. This means the actual interpretation of what the rte_flow
> means is left to documentation and hardware vendors choices in
> implementation.

I agree with you, that is why I think this patch is important. 

Best,
Ori