RE: [RFC PATCH v2] ethdev: add indirect list flow action

2023-05-03 Thread Gregory Etelson
Hello Ivan,

> Overall, the idea of action list indirection sounds good.
> But why deprecate the existing indirect action API?

Deprecation is not necessary.
If the decision is to keep existing indirect action API,
it must be updated to differentiate between
action mutable and flow mutable update parameters.

> Are there any strong reasons for that? This action
> should probably stay. Well, at least, I wouldn't
> expect this patch to immediately introduce some
> transition period. Let's cross that bridge
> when we get there. What do you think?
> 
> Also, what are the target applications/projects to
> benefit from the new indirect action list API?
> Could you please provide some examples. Or,
> alternatively, you could shed light on
> which "shapes" of flow rules would
> be the primary users of the action.
> 

We have 2 usages for the new indirect list API
1. Create single HW action from input list of RTE actions.
For example, MLX5 template API will create single mirror HW action from SAMPLE 
/ JUMP list.
2.  METER_MARK init_color and CONNTRACK last_direction parameters are flow 
mutable, by the actions definitions.
However, current indirect action cannot update flow parameters.

Regards,
Gregory




RE: [PATCH] test/crypto: add cryptodev reconfig test

2023-05-03 Thread Akhil Goyal
> Hi Aakash,
> > Subject: [PATCH] test/crypto: add cryptodev reconfig test
> >
> > Add cryptodev tests to verify that the device supports reconfiguration any
> > number of times via rte_cryptodev_configure API.
> >
> > Signed-off-by: Aakash Sasidharan 
> > ---
> >  app/test/test_cryptodev.c | 81
> > +++
> >  1 file changed, 81 insertions(+)
> 
> Tested on the following PMDs: QAT + SW Crypto PMDs
> (snow3g, null, chachapoly, kasumi, aesni_mb, aesni_gcm, zuc)
> The test results for the testcase were a mix of skipped and passed, based on 
> the
> capabilities of each PMD, no failures.
> 
> Tested-by: Saoirse O'Donovan 

Applied to dpdk-next-crypto
Thanks.


Re: [PATCH v3 05/11] devtools: add acronym in dictionary for commit checks

2023-05-03 Thread Jerin Jacob
On Mon, Apr 24, 2023 at 8:07 PM Sathesh Edara  wrote:
>
> ISM -> Interrupt Status Messages
>
> Signed-off-by: Sathesh Edara 
> ---
>  devtools/words-case.txt | 1 +

Squashed this patch with 6/11.

Series applied to dpdk-next-net-mrvl/for-next-net. Thanks

>  1 file changed, 1 insertion(+)
>
> diff --git a/devtools/words-case.txt b/devtools/words-case.txt
> index 53e029a958..3a7af902bd 100644
> --- a/devtools/words-case.txt
> +++ b/devtools/words-case.txt
> @@ -35,6 +35,7 @@ IP
>  IPsec
>  IPv4
>  IPv6
> +ISM
>  L2
>  L3
>  L4
> --
> 2.31.1
>


RE: [PATCH] doc: fix invalid auth algo in cryptoperf app

2023-05-03 Thread Akhil Goyal
> > Subject: [PATCH] doc: fix invalid auth algo in cryptoperf app
> >
> > 3des-cbc is not an authentication algorithm, hence need to be removed.
> >
> > Fixes: c6baca7adc94 ("doc: describe new performance test application")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Akhil Goyal 
> > ---
> Acked-by: Ciara Power 
Applied to dpdk-next-crypto



[PATCH V3] lib: set/get max memzone segments

2023-05-03 Thread Ophir Munk
In current DPDK the RTE_MAX_MEMZONE definition is unconditionally hard
coded as 2560.  For applications requiring different values of this
parameter – it is more convenient to set the max value via an rte API -
rather than changing the dpdk source code per application.  In many
organizations, the possibility to compile a private DPDK library for a
particular application does not exist at all.  With this option there is
no need to recompile DPDK and it allows using an in-box packaged DPDK.
An example usage for updating the RTE_MAX_MEMZONE would be of an
application that uses the DPDK mempool library which is based on DPDK
memzone library.  The application may need to create a number of
steering tables, each of which will require its own mempool allocation.
This commit is not about how to optimize the application usage of
mempool nor about how to improve the mempool implementation based on
memzone.  It is about how to make the max memzone definition - run-time
customized.
This commit adds an API which must be called before rte_eal_init():
rte_memzone_max_set(int max).  If not called, the default memzone
(RTE_MAX_MEMZONE) is used.  There is also an API to query the effective
max memzone: rte_memzone_max_get().

Signed-off-by: Ophir Munk 
---
 app/test/test_func_reentrancy.c |  2 +-
 app/test/test_malloc_perf.c |  2 +-
 app/test/test_memzone.c | 43 -
 config/rte_config.h |  1 -
 drivers/net/qede/base/bcm_osal.c| 30 --
 drivers/net/qede/base/bcm_osal.h|  3 +++
 drivers/net/qede/qede_main.c|  7 ++
 lib/eal/common/eal_common_memzone.c | 28 +---
 lib/eal/include/rte_memzone.h   | 20 +
 lib/eal/version.map |  4 
 10 files changed, 112 insertions(+), 28 deletions(-)

diff --git a/app/test/test_func_reentrancy.c b/app/test/test_func_reentrancy.c
index d1ed5d4..ae9de6f 100644
--- a/app/test/test_func_reentrancy.c
+++ b/app/test/test_func_reentrancy.c
@@ -51,7 +51,7 @@ typedef void (*case_clean_t)(unsigned lcore_id);
 #define MEMPOOL_ELT_SIZE(sizeof(uint32_t))
 #define MEMPOOL_SIZE(4)
 
-#define MAX_LCORES (RTE_MAX_MEMZONE / (MAX_ITER_MULTI * 4U))
+#define MAX_LCORES (rte_memzone_max_get() / (MAX_ITER_MULTI * 4U))
 
 static uint32_t obj_count;
 static uint32_t synchro;
diff --git a/app/test/test_malloc_perf.c b/app/test/test_malloc_perf.c
index ccec43a..9bd1662 100644
--- a/app/test/test_malloc_perf.c
+++ b/app/test/test_malloc_perf.c
@@ -165,7 +165,7 @@ test_malloc_perf(void)
return -1;
 
if (test_alloc_perf("rte_memzone_reserve", memzone_alloc, memzone_free,
-   NULL, memset_us_gb, RTE_MAX_MEMZONE - 1) < 0)
+   NULL, memset_us_gb, rte_memzone_max_get() - 1) < 0)
return -1;
 
return 0;
diff --git a/app/test/test_memzone.c b/app/test/test_memzone.c
index c9255e5..36b9790 100644
--- a/app/test/test_memzone.c
+++ b/app/test/test_memzone.c
@@ -871,9 +871,18 @@ test_memzone_bounded(void)
 static int
 test_memzone_free(void)
 {
-   const struct rte_memzone *mz[RTE_MAX_MEMZONE + 1];
+   size_t mz_size;
+   const struct rte_memzone **mz;
int i;
char name[20];
+   int rc = -1;
+
+   mz_size = (rte_memzone_max_get() + 1) * sizeof(struct rte_memzone *);
+   mz = rte_zmalloc("memzone_test", mz_size, 0);
+   if (!mz) {
+   printf("Fail allocating memzone test array\n");
+   return rc;
+   }
 
mz[0] = rte_memzone_reserve(TEST_MEMZONE_NAME("tempzone0"), 2000,
SOCKET_ID_ANY, 0);
@@ -881,42 +890,42 @@ test_memzone_free(void)
SOCKET_ID_ANY, 0);
 
if (mz[0] > mz[1])
-   return -1;
+   goto exit_test;
if (!rte_memzone_lookup(TEST_MEMZONE_NAME("tempzone0")))
-   return -1;
+   goto exit_test;
if (!rte_memzone_lookup(TEST_MEMZONE_NAME("tempzone1")))
-   return -1;
+   goto exit_test;
 
if (rte_memzone_free(mz[0])) {
printf("Fail memzone free - tempzone0\n");
-   return -1;
+   goto exit_test;
}
if (rte_memzone_lookup(TEST_MEMZONE_NAME("tempzone0"))) {
printf("Found previously free memzone - tempzone0\n");
-   return -1;
+   goto exit_test;
}
mz[2] = rte_memzone_reserve(TEST_MEMZONE_NAME("tempzone2"), 2000,
SOCKET_ID_ANY, 0);
 
if (mz[2] > mz[1]) {
printf("tempzone2 should have gotten the free entry from 
tempzone0\n");
-   return -1;
+   goto exit_test;
}
if (rte_memzone_free(mz[2])) {
printf("Fail memzone free - tempzone2\n");
-   return -1;
+   goto exit_test;

RE: [PATCH] crypto/qat: fix stack buffer overflow in SGL loop

2023-05-03 Thread Akhil Goyal
> Acked-by: Kai Ji 
Applied to dpdk-next-crypto
Thanks.


RE: [PATCH] crypto/scheduler: fix last element for valid args

2023-05-03 Thread Akhil Goyal
> Acked-by: Kai Ji 
> 
Applied to dpdk-next-crypto
Thanks.


RE: [PATCH] test/crypto: fix return value for snow3g testcase

2023-05-03 Thread Akhil Goyal
> 
> Acked-by: Ciara Power 
Applied to dpdk-next-crypto
Thanks.


Re: [RFC PATCH 1/5] eventdev: add power monitoring API on event port

2023-05-03 Thread Jerin Jacob
On Tue, May 2, 2023 at 4:49 PM Ferruh Yigit  wrote:
>
> On 4/25/2023 5:09 AM, Jerin Jacob wrote:
> > On Mon, Apr 24, 2023 at 9:36 PM Ferruh Yigit  wrote:
> >>
> >> On 4/19/2023 11:15 AM, Jerin Jacob wrote:
> >>> On Wed, Apr 19, 2023 at 3:24 PM Sivaprasad Tummala
> >>>  wrote:
> 
>  A new API to allow power monitoring condition on event port to
>  optimize power when no events are arriving on an event port for
>  the worker core to process in an eventdev based pipelined application.
> 
>  Signed-off-by: Sivaprasad Tummala 
>  + *
>  + * @param dev_id
>  + *   Eventdev id
>  + * @param port_id
>  + *   Eventdev port id
>  + * @param pmc
>  + *   The pointer to power-optimized monitoring condition structure.
>  + *
>  + * @return
>  + *   - 0: Success.
>  + *   -ENOTSUP: Operation not supported.
>  + *   -EINVAL: Invalid parameters.
>  + *   -ENODEV: Invalid device ID.
>  + */
>  +__rte_experimental
>  +int
>  +rte_event_port_get_monitor_addr(uint8_t dev_id, uint8_t port_id,
>  +   struct rte_power_monitor_cond *pmc);
> >>>
> >>> + eventdev driver maintainers
> >>>
> >>> I think, we don't need to expose this application due to applications
> >>> 1)To make applications to be transparent whether power saving is enabled 
> >>> or not?
> >>> 2)Some HW and Arch already supports power managent in driver and in HW
> >>> (Not using  CPU architecture directly)
> >>>
> >>> If so, that will be translated to following,
> >>> a) Add rte_event_port_power_saving_ena_dis(uint8_t dev_id, uint8_t
> >>> port_id, bool ena) for controlling power saving in slowpath.
> >>> b) Create reusable PMD private function based on the CPU architecture
> >>> power saving primitive to cover the PMD don't have native power saving
> >>> support.
> >>> c)Update rte_event_dequeue_burst() burst of PMD callback to use (b).
> >>>
> >>>
> >>
> >> Hi Jerin,
> >
> > Hi Ferruh,
> >
> >>
> >> ethdev approach seems applied here.
> >
> > Understands that. But none of the NIC HW supports power management at
> > HW level like eventdev, so that way
> > for what we are doing for ethdev is a correct abstraction for ethdev.
> >
>
> What I understand is there is HW based event manager and SW based ones,
> SW based ones can benefit more from CPU power optimizations, for HW
> event managers if there is not enough benefit they can just ignore the
> feature.
>
> >>
> >> In ethdev, 'rte_event_port_get_monitor_addr()' equivalent is
> >> 'rte_eth_get_monitor_addr()'.
> >>
> >> Although 'rte_eth_get_monitor_addr()' is public API, it is currently
> >> only called from Rx/Tx callback functions implemented in the power library.
> >> But I assume intention to make it public is to enable users to implement
> >> their own callback functions that has custom algorithm for the power
> >> management.
> >
> > If there is a use case for customizing with own callback, we can provide 
> > that.
> > Provided NULL is valid with default algorithm.
> >
> >>
> >> And probably same is true for the 'rte_event_port_get_monitor_addr()'.
> >>
> >>
> >> Also instead of implementing power features for withing PMDs, isn't it
> >> better to have a common eventdev layer for it?
> >
> > We can have rte_evetdev_pmd_* APIs as non-public APIs.
> > My only objection is to NOT introduce _monitor_ APIs at eventdev level,
> > Instead, _monitor_ is one way to do it in SW, So we need higher level
> > of abstraction.
> >
>
> I see, this seems a trade off between flexibility and usability. If
> application has access to _monitor_ APIs, they can be more flexible to
> implement their own logic.

OK.

>
> Another option can be application provides the policy with an API and
> monitor API used to realize the policy, but for this case it can be
> challenge to find and implement correct policies.

OK. If we can enumerate the policies, then it will be ideal.
On plus side, there will not be any changes in needed in lib/power/


>
> >>
> >> For the PMDs benefit from HW event manager, just not implementing
> >> .get_monitor_addr() dev_ops will make them free from power related APIs.
> >
> > But application fast path code gets diverged by exposing low level 
> > primitives.
> >
>
> I am not clear with concern above, but for application that use default
> callbacks, 'rte_power_eventdev_pmgmt_port_enable()' needs to be called
> to enable this feature, if not called datapath is not impacted.
> And if not dequeue callback added at all, custom or default, data path
> is not impacted at all.

Concerns are around following code[1] when callback is not registered
for this use case.
In eventdev, we are using _one packet at a time_ for a lot of use case
with latency critical workload like L1 processing.
On such cases, the following code will add up.

[1]
  cb = __atomic_load_n((void **)&fp_ops->ev_port.clbk[port_id],
__ATOMIC_RELAXED);
  if (unlikely(cb != NULL))
   nb_rx = rte_event_dequeue_callbacks(dev_id, port_id, 

Re: [PATCH] common/cnxk: update ROC erratas

2023-05-03 Thread Jerin Jacob
On Tue, Apr 11, 2023 at 12:49 PM Ashwin Sekhar T K  wrote:
>
> Update the models where errata IPBUNPA-37480 is applicable.
>
> Signed-off-by: Ashwin Sekhar T K 


Updated the git commit as follows and applied to
dpdk-next-net-eventdev/for-main. Thanks

common/cnxk: update IPBUNPA-37480 errata scope

Update the SoC models where errata IPBUNPA-37480 is applicable.

Signed-off-by: Ashwin Sekhar T K 


> ---
>  drivers/common/cnxk/roc_errata.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/common/cnxk/roc_errata.h 
> b/drivers/common/cnxk/roc_errata.h
> index 2d15e639b7..1333bde629 100644
> --- a/drivers/common/cnxk/roc_errata.h
> +++ b/drivers/common/cnxk/roc_errata.h
> @@ -25,7 +25,8 @@ roc_errata_nix_has_cq_min_size_4k(void)
>  static inline bool
>  roc_errata_npa_has_no_fc_stype_ststp(void)
>  {
> -   return roc_model_is_cn10ka_a0() ? true : false;
> +   return roc_model_is_cn10ka_a0() || roc_model_is_cn10ka_a1() || 
> roc_model_is_cnf10ka_a0() ||
> +  roc_model_is_cnf10kb_a0();
>  }
>
>  /* Errata IPBUNIXTX-39337 */
> --
> 2.25.1
>


Re: [RFC PATCH 1/5] eventdev: add power monitoring API on event port

2023-05-03 Thread Ferruh Yigit
On 5/3/2023 8:58 AM, Jerin Jacob wrote:
> On Tue, May 2, 2023 at 4:49 PM Ferruh Yigit  wrote:
>>
>> On 4/25/2023 5:09 AM, Jerin Jacob wrote:
>>> On Mon, Apr 24, 2023 at 9:36 PM Ferruh Yigit  wrote:

 On 4/19/2023 11:15 AM, Jerin Jacob wrote:
> On Wed, Apr 19, 2023 at 3:24 PM Sivaprasad Tummala
>  wrote:
>>
>> A new API to allow power monitoring condition on event port to
>> optimize power when no events are arriving on an event port for
>> the worker core to process in an eventdev based pipelined application.
>>
>> Signed-off-by: Sivaprasad Tummala 
>> + *
>> + * @param dev_id
>> + *   Eventdev id
>> + * @param port_id
>> + *   Eventdev port id
>> + * @param pmc
>> + *   The pointer to power-optimized monitoring condition structure.
>> + *
>> + * @return
>> + *   - 0: Success.
>> + *   -ENOTSUP: Operation not supported.
>> + *   -EINVAL: Invalid parameters.
>> + *   -ENODEV: Invalid device ID.
>> + */
>> +__rte_experimental
>> +int
>> +rte_event_port_get_monitor_addr(uint8_t dev_id, uint8_t port_id,
>> +   struct rte_power_monitor_cond *pmc);
>
> + eventdev driver maintainers
>
> I think, we don't need to expose this application due to applications
> 1)To make applications to be transparent whether power saving is enabled 
> or not?
> 2)Some HW and Arch already supports power managent in driver and in HW
> (Not using  CPU architecture directly)
>
> If so, that will be translated to following,
> a) Add rte_event_port_power_saving_ena_dis(uint8_t dev_id, uint8_t
> port_id, bool ena) for controlling power saving in slowpath.
> b) Create reusable PMD private function based on the CPU architecture
> power saving primitive to cover the PMD don't have native power saving
> support.
> c)Update rte_event_dequeue_burst() burst of PMD callback to use (b).
>
>

 Hi Jerin,
>>>
>>> Hi Ferruh,
>>>

 ethdev approach seems applied here.
>>>
>>> Understands that. But none of the NIC HW supports power management at
>>> HW level like eventdev, so that way
>>> for what we are doing for ethdev is a correct abstraction for ethdev.
>>>
>>
>> What I understand is there is HW based event manager and SW based ones,
>> SW based ones can benefit more from CPU power optimizations, for HW
>> event managers if there is not enough benefit they can just ignore the
>> feature.
>>

 In ethdev, 'rte_event_port_get_monitor_addr()' equivalent is
 'rte_eth_get_monitor_addr()'.

 Although 'rte_eth_get_monitor_addr()' is public API, it is currently
 only called from Rx/Tx callback functions implemented in the power library.
 But I assume intention to make it public is to enable users to implement
 their own callback functions that has custom algorithm for the power
 management.
>>>
>>> If there is a use case for customizing with own callback, we can provide 
>>> that.
>>> Provided NULL is valid with default algorithm.
>>>

 And probably same is true for the 'rte_event_port_get_monitor_addr()'.


 Also instead of implementing power features for withing PMDs, isn't it
 better to have a common eventdev layer for it?
>>>
>>> We can have rte_evetdev_pmd_* APIs as non-public APIs.
>>> My only objection is to NOT introduce _monitor_ APIs at eventdev level,
>>> Instead, _monitor_ is one way to do it in SW, So we need higher level
>>> of abstraction.
>>>
>>
>> I see, this seems a trade off between flexibility and usability. If
>> application has access to _monitor_ APIs, they can be more flexible to
>> implement their own logic.
> 
> OK.
> 
>>
>> Another option can be application provides the policy with an API and
>> monitor API used to realize the policy, but for this case it can be
>> challenge to find and implement correct policies.
> 
> OK. If we can enumerate the policies, then it will be ideal.
> On plus side, there will not be any changes in needed in lib/power/
> 

If we are talking about a power framework that user defines policies, I
expect parsing/defining policies will be in the power library and will
require changes in the power library anyway.

But as mentioned above it is difficult to define a proper policy, this
is not really related to eventdev, more a power library issue. We can
continue to provide flexibility to user in eventdev and discuss the
policy if a wider forum.

> 
>>

 For the PMDs benefit from HW event manager, just not implementing
 .get_monitor_addr() dev_ops will make them free from power related APIs.
>>>
>>> But application fast path code gets diverged by exposing low level 
>>> primitives.
>>>
>>
>> I am not clear with concern above, but for application that use default
>> callbacks, 'rte_power_eventdev_pmgmt_port_enable()' needs to be called
>> to enable this feature, if not called datapath is not impacted.
>> And if not dequeue callback add

Re: [RFC PATCH 1/5] eventdev: add power monitoring API on event port

2023-05-03 Thread Jerin Jacob
On Wed, May 3, 2023 at 1:44 PM Ferruh Yigit  wrote:
>
> On 5/3/2023 8:58 AM, Jerin Jacob wrote:
> > On Tue, May 2, 2023 at 4:49 PM Ferruh Yigit  wrote:
> >>
> >> On 4/25/2023 5:09 AM, Jerin Jacob wrote:
> >>> On Mon, Apr 24, 2023 at 9:36 PM Ferruh Yigit  wrote:
> 
>  On 4/19/2023 11:15 AM, Jerin Jacob wrote:
> > On Wed, Apr 19, 2023 at 3:24 PM Sivaprasad Tummala
> >  wrote:
> >>
> >> A new API to allow power monitoring condition on event port to
> >> optimize power when no events are arriving on an event port for
> >> the worker core to process in an eventdev based pipelined application.
> >>
> >> Signed-off-by: Sivaprasad Tummala 
> >> + *
> >> + * @param dev_id
> >> + *   Eventdev id
> >> + * @param port_id
> >> + *   Eventdev port id
> >> + * @param pmc
> >> + *   The pointer to power-optimized monitoring condition structure.
> >> + *
> >> + * @return
> >> + *   - 0: Success.
> >> + *   -ENOTSUP: Operation not supported.
> >> + *   -EINVAL: Invalid parameters.
> >> + *   -ENODEV: Invalid device ID.
> >> + */
> >> +__rte_experimental
> >> +int
> >> +rte_event_port_get_monitor_addr(uint8_t dev_id, uint8_t port_id,
> >> +   struct rte_power_monitor_cond *pmc);
> >
> > + eventdev driver maintainers
> >
> > I think, we don't need to expose this application due to applications
> > 1)To make applications to be transparent whether power saving is 
> > enabled or not?
> > 2)Some HW and Arch already supports power managent in driver and in HW
> > (Not using  CPU architecture directly)
> >
> > If so, that will be translated to following,
> > a) Add rte_event_port_power_saving_ena_dis(uint8_t dev_id, uint8_t
> > port_id, bool ena) for controlling power saving in slowpath.
> > b) Create reusable PMD private function based on the CPU architecture
> > power saving primitive to cover the PMD don't have native power saving
> > support.
> > c)Update rte_event_dequeue_burst() burst of PMD callback to use (b).
> >
> >
> 
>  Hi Jerin,
> >>>
> >>> Hi Ferruh,
> >>>
> 
>  ethdev approach seems applied here.
> >>>
> >>> Understands that. But none of the NIC HW supports power management at
> >>> HW level like eventdev, so that way
> >>> for what we are doing for ethdev is a correct abstraction for ethdev.
> >>>
> >>
> >> What I understand is there is HW based event manager and SW based ones,
> >> SW based ones can benefit more from CPU power optimizations, for HW
> >> event managers if there is not enough benefit they can just ignore the
> >> feature.
> >>
> 
>  In ethdev, 'rte_event_port_get_monitor_addr()' equivalent is
>  'rte_eth_get_monitor_addr()'.
> 
>  Although 'rte_eth_get_monitor_addr()' is public API, it is currently
>  only called from Rx/Tx callback functions implemented in the power 
>  library.
>  But I assume intention to make it public is to enable users to implement
>  their own callback functions that has custom algorithm for the power
>  management.
> >>>
> >>> If there is a use case for customizing with own callback, we can provide 
> >>> that.
> >>> Provided NULL is valid with default algorithm.
> >>>
> 
>  And probably same is true for the 'rte_event_port_get_monitor_addr()'.
> 
> 
>  Also instead of implementing power features for withing PMDs, isn't it
>  better to have a common eventdev layer for it?
> >>>
> >>> We can have rte_evetdev_pmd_* APIs as non-public APIs.
> >>> My only objection is to NOT introduce _monitor_ APIs at eventdev level,
> >>> Instead, _monitor_ is one way to do it in SW, So we need higher level
> >>> of abstraction.
> >>>
> >>
> >> I see, this seems a trade off between flexibility and usability. If
> >> application has access to _monitor_ APIs, they can be more flexible to
> >> implement their own logic.
> >
> > OK.
> >
> >>
> >> Another option can be application provides the policy with an API and
> >> monitor API used to realize the policy, but for this case it can be
> >> challenge to find and implement correct policies.
> >
> > OK. If we can enumerate the policies, then it will be ideal.
> > On plus side, there will not be any changes in needed in lib/power/
> >
>
> If we are talking about a power framework that user defines policies, I
> expect parsing/defining policies will be in the power library and will
> require changes in the power library anyway.

OK

>
> But as mentioned above it is difficult to define a proper policy, this
> is not really related to eventdev, more a power library issue. We can
> continue to provide flexibility to user in eventdev and discuss the
> policy if a wider forum.

OK.

>
> >
> >>
> 
>  For the PMDs benefit from HW event manager, just not implementing
>  .get_monitor_addr() dev_ops will make them free from power related APIs.
> >>>
> >>> But application fast p

Re: DPDK 22.11 Troubleshooting

2023-05-03 Thread Bruce Richardson
On Mon, May 01, 2023 at 10:27:05PM +, Gilbert Carrillo wrote:
>Hello,
> 
> 
>I installed DPDK version 22.11 and the QDMA DPDK driver. However, I am
>having trouble compiling the test applications.
> 
> 
>I have a c++ program that has an external buffer and my end goal is to
>attach/map an mbuf to my external buffer for zero-copy DMA.
> 
> 
>Currently I use CMAKE to compile my program, so I was curious if this
>application has to be run on meson/ninja or is there a cmake option? I
>tried compiling the test applications with the makefile but had no
>luck, I could only compile using meson/ninja.
> 

If you have installed DPDK on your system, then a pkg-config file for it
should be available, allowing applications to be built against it using any
build system. The makefiles for the sample applications demonstrate how
this can be done.

If building the example applications with make is failing, can you share
the error messages got here, as it should work ok, once DPDK is correctly
installed on the system. An additional test you can run is "pkg-config
--path libdpdk" to check where DPDK is installed [though not all versions
of pkg-config support --path, I think].

Regards,
/Bruce


RE: [dpdk-dev] [PATCH v1] examples/ip_pipeline: fix build issue with GCC 13

2023-05-03 Thread Ali Alnubani
> -Original Message-
> From: jer...@marvell.com 
> Sent: Tuesday, May 2, 2023 4:49 PM
> To: dev@dpdk.org
> Cc: NBU-Contact-Thomas Monjalon (EXTERNAL) ;
> david.march...@redhat.com; ferruh.yi...@xilinx.com;
> step...@networkplumber.org; Jerin Jacob ;
> sta...@dpdk.org
> Subject: [dpdk-dev] [PATCH v1] examples/ip_pipeline: fix build issue with
> GCC 13
> 
> From: Jerin Jacob 
> 
> Fix the following build issue by initializing req to NULL for
> the local variable.
> 
> In function 'thread_msg_handle', inlined from 'thread_main' at
> ../examples/ip_pipeline/thread.c:3130:6:
> ../examples/ip_pipeline/thread.c:535:20: warning: 'req' may be used
> uninitialized [-Wmaybe-uninitialized]
>   535 | if (req == NULL)
>   |^
> ../examples/ip_pipeline/thread.c: In function 'thread_main':
> ../examples/ip_pipeline/thread.c:433:32: note: 'req' was declared here
>   433 | struct thread_msg_req *req;
> 
> Bugzilla ID: 1220
> Fixes: a8bd581de397 ("examples/ip_pipeline: add thread runtime")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Jerin Jacob 
> ---

Resolves the warning I see in Bugzilla 1220, thanks.

Now I see a failure in another example app:
"""
examples/ntb/ntb_fwd.c:945:23: error: 'rte_rawdev_xstats_get' accessing 8 bytes 
in a region of size 0 [-Werror=stringop-overflow=]
"""
Will open a separate issue for that.

Tested-by: Ali Alnubani 


RE: [dpdk-dev] [PATCH v1] examples/ip_pipeline: fix build issue with GCC 13

2023-05-03 Thread Ali Alnubani
> -Original Message-
> From: Ali Alnubani
> Sent: Wednesday, May 3, 2023 11:30 AM
> To: 'jer...@marvell.com' ; dev@dpdk.org
> Cc: NBU-Contact-Thomas Monjalon (EXTERNAL) ;
> david.march...@redhat.com; ferruh.yi...@xilinx.com;
> step...@networkplumber.org; sta...@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v1] examples/ip_pipeline: fix build issue with
> GCC 13
> 
> > -Original Message-
> > From: jer...@marvell.com 
> > Sent: Tuesday, May 2, 2023 4:49 PM
> > To: dev@dpdk.org
> > Cc: NBU-Contact-Thomas Monjalon (EXTERNAL) ;
> > david.march...@redhat.com; ferruh.yi...@xilinx.com;
> > step...@networkplumber.org; Jerin Jacob ;
> > sta...@dpdk.org
> > Subject: [dpdk-dev] [PATCH v1] examples/ip_pipeline: fix build issue with
> > GCC 13
> >
> > From: Jerin Jacob 
> >
> > Fix the following build issue by initializing req to NULL for
> > the local variable.
> >
> > In function 'thread_msg_handle', inlined from 'thread_main' at
> > ../examples/ip_pipeline/thread.c:3130:6:
> > ../examples/ip_pipeline/thread.c:535:20: warning: 'req' may be used
> > uninitialized [-Wmaybe-uninitialized]
> >   535 | if (req == NULL)
> >   |^
> > ../examples/ip_pipeline/thread.c: In function 'thread_main':
> > ../examples/ip_pipeline/thread.c:433:32: note: 'req' was declared here
> >   433 | struct thread_msg_req *req;
> >
> > Bugzilla ID: 1220
> > Fixes: a8bd581de397 ("examples/ip_pipeline: add thread runtime")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Jerin Jacob 
> > ---
> 
> Resolves the warning I see in Bugzilla 1220, thanks.
> 
> Now I see a failure in another example app:
> """
> examples/ntb/ntb_fwd.c:945:23: error: 'rte_rawdev_xstats_get' accessing 8
> bytes in a region of size 0 [-Werror=stringop-overflow=]
> """
> Will open a separate issue for that.

Nevermind, I see it's already addressed in this patch:
https://patches.dpdk.org/project/dpdk/patch/20230502135045.3541570-2-jer...@marvell.com/

Thanks!


RE: [dpdk-dev] [PATCH v1] examples/ntb: fix build issue with GCC 13

2023-05-03 Thread Ali Alnubani
> -Original Message-
> From: jer...@marvell.com 
> Sent: Tuesday, May 2, 2023 4:51 PM
> To: dev@dpdk.org; Jingjing Wu ; Junfeng Guo
> ; Xiaoyun Li 
> Cc: NBU-Contact-Thomas Monjalon (EXTERNAL) ;
> david.march...@redhat.com; ferruh.yi...@xilinx.com;
> step...@networkplumber.org; Jerin Jacob ;
> sta...@dpdk.org
> Subject: [dpdk-dev] [PATCH v1] examples/ntb: fix build issue with GCC 13
> 
> From: Jerin Jacob 
> 
> Fix the following build issue by not allowing nb_ids to be zero.
> nb_ids can be zero based on rte_rawdev_xstats_get() API
> documentation but it is not valid for the case when second
> argument is NULL.
> 
> examples/ntb/ntb_fwd.c: In function 'ntb_stats_display':
> examples/ntb/ntb_fwd.c:945:23: error: 'rte_rawdev_xstats_get'
> accessing 8 bytes in a region of size 0 [-Werror=stringop-overflow=]
>   945 | if (nb_ids != rte_rawdev_xstats_get(dev_id, ids, values, nb_ids)) {
>   |   ^~
> 
> examples/ntb/ntb_fwd.c:945:23: note: referencing argument 3
> of type 'uint64_t[0]' {aka 'long unsigned int[]'}
> In file included from ../examples/ntb/ntb_fwd.c:17:
> lib/rawdev/rte_rawdev.h:504:1: note: in a call to function
> 'rte_rawdev_xstats_get'
>   504 | rte_rawdev_xstats_get(uint16_t dev_id,
> 
> Fixes: 5194299d6ef5 ("examples/ntb: support more functions")
> Cc: sta...@dpdk.org
> Signed-off-by: Jerin Jacob 
> ---

Resolves the build failure for me on Fedora 38 with gcc 13.1.1.

Tested-by: Ali Alnubani 


[PATCH v3] app/mldev: add internal function for file read

2023-05-03 Thread Srikanth Yalavarthi
Added internal function to read model, input and reference
files with required error checks. This change fixes the
unchecked return value and improper use of negative value
issues reported by coverity scan for file read operations.

Coverity issue: 383742, 383743
Fixes: f6661e6d9a3a ("app/mldev: validate model operations")
Fixes: da6793390596 ("app/mldev: support inference validation")

Signed-off-by: Srikanth Yalavarthi 
---
v3:
* Fix incorrect use of rte_free with free

v2:
* Replace rte_malloc in ml_read_file with malloc

v1:
* Initial patch

 app/test-mldev/test_common.c   | 59 ++
 app/test-mldev/test_common.h   |  2 +
 app/test-mldev/test_inference_common.c | 54 +--
 app/test-mldev/test_model_common.c | 39 -
 4 files changed, 90 insertions(+), 64 deletions(-)

diff --git a/app/test-mldev/test_common.c b/app/test-mldev/test_common.c
index 016b31c6ba..d8a8e8a448 100644
--- a/app/test-mldev/test_common.c
+++ b/app/test-mldev/test_common.c
@@ -5,12 +5,71 @@
 #include 

 #include 
+#include 
 #include 
 #include 

 #include "ml_common.h"
 #include "test_common.h"

+int
+ml_read_file(char *file, size_t *size, char **buffer)
+{
+   char *file_buffer = NULL;
+   long file_size = 0;
+   int ret = 0;
+   FILE *fp;
+
+   fp = fopen(file, "r");
+   if (fp == NULL) {
+   ml_err("Failed to open file: %s\n", file);
+   return -EIO;
+   }
+
+   if (fseek(fp, 0, SEEK_END) == 0) {
+   file_size = ftell(fp);
+   if (file_size == -1) {
+   ret = -EIO;
+   goto error;
+   }
+
+   file_buffer = malloc(file_size);
+   if (file_buffer == NULL) {
+   ml_err("Failed to allocate memory: %s\n", file);
+   ret = -ENOMEM;
+   goto error;
+   }
+
+   if (fseek(fp, 0, SEEK_SET) != 0) {
+   ret = -EIO;
+   goto error;
+   }
+
+   if (fread(file_buffer, sizeof(char), file_size, fp) != 
(unsigned long)file_size) {
+   ml_err("Failed to read file : %s\n", file);
+   ret = -EIO;
+   goto error;
+   }
+   fclose(fp);
+   } else {
+   ret = -EIO;
+   goto error;
+   }
+
+   *buffer = file_buffer;
+   *size = file_size;
+
+   return 0;
+
+error:
+   free(file_buffer);
+
+   if (fp != NULL)
+   fclose(fp);
+
+   return ret;
+}
+
 bool
 ml_test_cap_check(struct ml_options *opt)
 {
diff --git a/app/test-mldev/test_common.h b/app/test-mldev/test_common.h
index a7b2ea652a..7e3634b0c6 100644
--- a/app/test-mldev/test_common.h
+++ b/app/test-mldev/test_common.h
@@ -24,4 +24,6 @@ int ml_test_device_close(struct ml_test *test, struct 
ml_options *opt);
 int ml_test_device_start(struct ml_test *test, struct ml_options *opt);
 int ml_test_device_stop(struct ml_test *test, struct ml_options *opt);

+int ml_read_file(char *file, size_t *size, char **buffer);
+
 #endif /* TEST_COMMON_H */
diff --git a/app/test-mldev/test_inference_common.c 
b/app/test-mldev/test_inference_common.c
index af831fc1bf..9a1c706e11 100644
--- a/app/test-mldev/test_inference_common.c
+++ b/app/test-mldev/test_inference_common.c
@@ -604,10 +604,10 @@ ml_inference_iomem_setup(struct ml_test *test, struct 
ml_options *opt, uint16_t
char mp_name[RTE_MEMPOOL_NAMESIZE];
const struct rte_memzone *mz;
uint64_t nb_buffers;
+   char *buffer = NULL;
uint32_t buff_size;
uint32_t mz_size;
-   uint32_t fsize;
-   FILE *fp;
+   size_t fsize;
int ret;

/* get input buffer size */
@@ -647,51 +647,35 @@ ml_inference_iomem_setup(struct ml_test *test, struct 
ml_options *opt, uint16_t
t->model[fid].reference = NULL;

/* load input file */
-   fp = fopen(opt->filelist[fid].input, "r");
-   if (fp == NULL) {
-   ml_err("Failed to open input file : %s\n", 
opt->filelist[fid].input);
-   ret = -errno;
+   ret = ml_read_file(opt->filelist[fid].input, &fsize, &buffer);
+   if (ret != 0)
goto error;
-   }

-   fseek(fp, 0, SEEK_END);
-   fsize = ftell(fp);
-   fseek(fp, 0, SEEK_SET);
-   if (fsize != t->model[fid].inp_dsize) {
-   ml_err("Invalid input file, size = %u (expected size = %" 
PRIu64 ")\n", fsize,
+   if (fsize == t->model[fid].inp_dsize) {
+   rte_memcpy(t->model[fid].input, buffer, fsize);
+   free(buffer);
+   } else {
+   ml_err("Invalid input file, size = %zu (expected size = %" 
PRIu64 ")\n", fsize,
   t->model[fid].inp_dsize);
ret = -EINVAL;
-   fclose(fp);
-   goto error;
-   

Re: [PATCH 0/2] vhost: add port mirroring function in the vhost lib

2023-05-03 Thread Maxime Coquelin

Hi Cheng,

On 4/21/23 03:09, Cheng Jiang wrote:

Similar to the port mirroring function on the switch or router, this
patch set implements such function on the Vhost lib. When
data is sent to a front-end, it will also send the data to its mirror
front-end. When data is received from a front-end, it will also send
the data to its mirror front-end.


Why not just keeping mirroring in the switch/router?
I am really not convinced this is the way to go:
1. API is too complex
2. It requires async support
3. There is too much code duplication, it increases  virtio-net.c by
   30%, and it is without packed ring support.
4. If mirror port is down for any reason, packets to/from the original
   port are dropped.
5. It seems to assume negotiated features of the two ports are
   identical, e.g. Virtio-net header length? If so, that's not a
   manageable solution.

Regards,
Maxime



Cheng Jiang (2):
   vhost: add ingress API for port mirroring datapath
   vhost: add egress API for port mirroring datapath

  lib/vhost/rte_vhost_async.h |   17 +
  lib/vhost/version.map   |3 +
  lib/vhost/virtio_net.c  | 1266 +++
  3 files changed, 1286 insertions(+)

--
2.35.1





[PATCH v1] devtools: allow variable declaration inside for loop

2023-05-03 Thread Ferruh Yigit
Declaring variable inside for loop is not supported via C89 and it was
checked in checkpatch.sh via commit [1].
But as DPDK supported C standard is becoming C99 [2], declaring variable
inside loop can be allowed.

[1]
Commit 43e73483a4b8 ("devtools: forbid variable declaration inside for")

[2]
https://dpdk.org/patch/121912

Signed-off-by: Ferruh Yigit 
---
Cc: Bruce Richardson 
Cc: David Marchand 
---
 devtools/checkpatches.sh | 8 
 1 file changed, 8 deletions(-)

diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index 15d5d6709445..b5baf6f2b161 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -78,14 +78,6 @@ check_forbidden_additions() { # 
-f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
"$1" || res=1
 
-   # forbid variable declaration inside "for" loop
-   awk -v FOLDERS='.' \
-   -v 
EXPRESSIONS='for[[:space:]]*\\((char|u?int|unsigned|s?size_t)' \
-   -v RET_ON_FAIL=1 \
-   -v MESSAGE='Declaring a variable inside for()' \
-   -f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
-   "$1" || res=1
-
# refrain from new additions of 16/32/64 bits rte_atomicNN_xxx()
awk -v FOLDERS="lib drivers app examples" \
-v EXPRESSIONS="rte_atomic[0-9][0-9]_.*\\\(" \
-- 
2.34.1



Re: [PATCH 1/2] net/cnxk: optimize flow control hysteresis

2023-05-03 Thread Jerin Jacob
On Tue, Apr 11, 2023 at 12:31 PM  wrote:
>
> From: Pavan Nikhilesh 
>
> Set flow control hysteresis to ignore immediate sequence of
> decrement-increment to avoid unnecessary LLC traffic.
> Enable hysteresis when SQ length is more than 512 as it lower
> queue lengths will require faster updates.
>
> Signed-off-by: Pavan Nikhilesh 

Series applied to dpdk-next-net-mrvl/for-next-net. Thanks


> Signed-off-by: Satha Rao 
> ---
>  drivers/net/cnxk/cnxk_ethdev.c | 2 ++
>  drivers/net/cnxk/cnxk_ethdev.h | 2 ++
>  2 files changed, 4 insertions(+)
>
> diff --git a/drivers/net/cnxk/cnxk_ethdev.c b/drivers/net/cnxk/cnxk_ethdev.c
> index 1cae3084e1..42a52ed0ca 100644
> --- a/drivers/net/cnxk/cnxk_ethdev.c
> +++ b/drivers/net/cnxk/cnxk_ethdev.c
> @@ -487,6 +487,8 @@ cnxk_nix_tx_queue_setup(struct rte_eth_dev *eth_dev, 
> uint16_t qid,
> sq->qid = qid;
> sq->nb_desc = nb_desc;
> sq->max_sqe_sz = nix_sq_max_sqe_sz(dev);
> +   if (sq->nb_desc >= CNXK_NIX_DEF_SQ_COUNT)
> +   sq->fc_hyst_bits = 0x1;
>
> if (nix->tx_compl_ena) {
> sq->cqid = sq->qid + dev->nb_rxq;
> diff --git a/drivers/net/cnxk/cnxk_ethdev.h b/drivers/net/cnxk/cnxk_ethdev.h
> index 62a06e5d03..97537de17a 100644
> --- a/drivers/net/cnxk/cnxk_ethdev.h
> +++ b/drivers/net/cnxk/cnxk_ethdev.h
> @@ -49,6 +49,8 @@
>  /* LPB & SPB */
>  #define CNXK_NIX_NUM_POOLS_MAX 2
>
> +#define CNXK_NIX_DEF_SQ_COUNT  512
> +
>  #define CNXK_NIX_RSS_L3_L4_SRC_DST   
>   \
> (RTE_ETH_RSS_L3_SRC_ONLY | RTE_ETH_RSS_L3_DST_ONLY |  
>  \
>  RTE_ETH_RSS_L4_SRC_ONLY | RTE_ETH_RSS_L4_DST_ONLY)
> --
> 2.25.1
>


Re: [PATCH v1] devtools: allow variable declaration inside for loop

2023-05-03 Thread Bruce Richardson
On Wed, May 03, 2023 at 10:50:18AM +0100, Ferruh Yigit wrote:
> Declaring variable inside for loop is not supported via C89 and it was
> checked in checkpatch.sh via commit [1].  But as DPDK supported C
> standard is becoming C99 [2], declaring variable inside loop can be
> allowed.
> 
> [1] Commit 43e73483a4b8 ("devtools: forbid variable declaration inside
> for")
> 
> [2] https://dpdk.org/patch/121912
> 
> Signed-off-by: Ferruh Yigit  --- Cc: Bruce
> Richardson  Cc: David Marchand
>  --- devtools/checkpatches.sh | 8  1
> file changed, 8 deletions(-)
> 

Definite +1 from me for allowing this. However, is the plan still to move
to C99 in this release. I thought we were just going to jump to C11 in
23.11 release? However, I can't see any compilers refusing this if we do
relax things a bit now.

I was thinking that our coding standards doc might need an update for this,
but I don't see the restriction on not doing this documented there, so it
seems no doc change is necessary.

/Bruce


Re: [PATCH next 4/7] vmxnet3: add command to set ring buffer sizes

2023-05-03 Thread Ferruh Yigit
On 4/27/2023 4:59 PM, Ronak Doshi wrote:
> 
> > On 4/27/23, 1:51 AM, "Ferruh Yigit"  > wrote:
>>
>> README doesn't say much.
>>
>> The usage is not standard, and intention is not clear.
>> Can you please dig this issue more to learn the the intention, may be we
>> can find a better way or get rid of them completely?
> 
> Sure, I can take this up, but it will be better if that is investigated and 
> done as a separate patch
> as lot of definitions include those header files. The intention of this patch 
> series is completely different,
> and I don't want to break or delay this patch series. Is that OK?
> 

OK to address it as a separate patch, but please follow the issue.



Re: [PATCH next 0/7] vmxnet3: upgrade to version 7

2023-05-03 Thread Ferruh Yigit
On 4/28/2023 8:05 AM, Ronak Doshi wrote:
> 
>> On 4/27/23, 2:16 AM, "Ferruh Yigit" > > wrote:
>>
>> That document focuses on 1G and 10G network speeds, and uses "AMD
>> Opteron 2384 Processors" (which seems discontinued at this point).
>> I would expect there is an up to date version of the document, but if
>> there isn't is the existing one still relevant to keep?
> 
> I don't think there is any recent public facing performance measurement 
> document
> for vmxnet3. I will remove the reference to this old document and will update 
> the doc
> once we have new document in the future.
> 

OK, it is good to keep documentation up to date.


RE: [PATCH 1/1] eal: add tracepoints to track lcores and services

2023-05-03 Thread Van Haaren, Harry
> -Original Message-
> From: Arnaud Fiorini 
> Sent: Monday, April 24, 2023 4:24 PM
> To: Thomas Monjalon ; Jerin Jacob ;
> Sunil Kumar Kori ; Van Haaren, Harry
> 
> Cc: dev@dpdk.org; Arnaud Fiorini 
> Subject: [PATCH 1/1] eal: add tracepoints to track lcores and services
> 
> The tracepoints added are used to track lcore role and status,
> as well as service mapping and service runstates. These
> tracepoints are then used in analyses in Trace Compass.
> 
> Signed-off-by: Arnaud Fiorini 

I've had a look at these changes, and don't have any big objections;
When disabled, the tracepoints add no measurable overhead here, but
When enabled they cause a ~+50% cycle-cost (eg from ~100 to 150 cycles).
Running the "service_perf_autotest" prints cycle-costs, so this is easy to 
check.

Please add documentation on enabling the traces; today it is hard to 
identify/find
an example of enabling tracing on service-cores. Suggest to add it to:
1) the commit message, how to enable service cores traces only
2) add a section in the service-cores documentation, to enable service-cores 
only, and link to tracing documentation page for more info.

+CC Mattias and Honnappa who have expressed specific interest in service-cores
Performance in the past, any concerns/input Mattias/Honnappa?

@Arnaud, assuming no concerns from wider DPDK community, I'm happy to Ack a v2 
with docs added.


> ---
>  .mailmap |  1 +
>  lib/eal/common/eal_common_thread.c   |  4 ++
>  lib/eal/common/eal_common_trace_points.c | 21 +
>  lib/eal/common/rte_service.c | 18 ++-
>  lib/eal/include/eal_trace_internal.h | 60 
>  5 files changed, 103 insertions(+), 1 deletion(-)
> 
> diff --git a/.mailmap b/.mailmap
> index dc30369117..2a0b132572 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -120,6 +120,7 @@ Archana Muniganti 
> 
>  Archit Pandey 
>  Arkadiusz Kubalewski 
>  Arkadiusz Kusztal 
> +Arnaud Fiorini 
>  Arnon Warshavsky 
>  Arshdeep Kaur 
>  Artem V. Andreev 
> diff --git a/lib/eal/common/eal_common_thread.c
> b/lib/eal/common/eal_common_thread.c
> index 079a385630..25dbdd68e3 100644
> --- a/lib/eal/common/eal_common_thread.c
> +++ b/lib/eal/common/eal_common_thread.c
> @@ -205,6 +205,8 @@ eal_thread_loop(void *arg)
>   __ATOMIC_ACQUIRE)) == NULL)
>   rte_pause();
> 
> + rte_eal_trace_thread_lcore_running(lcore_id, f);
> +
>   /* call the function and store the return value */
>   fct_arg = lcore_config[lcore_id].arg;
>   ret = f(fct_arg);
> @@ -219,6 +221,8 @@ eal_thread_loop(void *arg)
>*/
>   __atomic_store_n(&lcore_config[lcore_id].state, WAIT,
>   __ATOMIC_RELEASE);
> +
> + rte_eal_trace_thread_lcore_stopped(lcore_id);
>   }
> 
>   /* never reached */
> diff --git a/lib/eal/common/eal_common_trace_points.c
> b/lib/eal/common/eal_common_trace_points.c
> index 3f5bf5c55c..0f1240ea3a 100644
> --- a/lib/eal/common/eal_common_trace_points.c
> +++ b/lib/eal/common/eal_common_trace_points.c
> @@ -70,6 +70,27 @@
> RTE_TRACE_POINT_REGISTER(rte_eal_trace_thread_remote_launch,
>   lib.eal.thread.remote.launch)
>  RTE_TRACE_POINT_REGISTER(rte_eal_trace_thread_lcore_ready,
>   lib.eal.thread.lcore.ready)
> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_thread_lcore_running,
> + lib.eal.thread.lcore.running)
> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_thread_lcore_stopped,
> + lib.eal.thread.lcore.stopped)
> +
> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_map_lcore,
> + lib.eal.service.map.lcore)
> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_lcore_state_change,
> + lib.eal.service.lcore.state.change)
> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_lcore_start,
> + lib.eal.service.lcore.start)
> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_lcore_stop,
> + lib.eal.service.lcore.stop)
> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_run_begin,
> + lib.eal.service.run.begin)
> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_runstate_set,
> + lib.eal.service.run.state.set)
> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_run_end,
> + lib.eal.service.run.end)
> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_component_register,
> + lib.eal.service.component.register)
> 
>  RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_callback_register,
>   lib.eal.intr.register)
> diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
> index 42ca1d001d..5daec007aa 100644
> --- a/lib/eal/common/rte_service.c
> +++ b/lib/eal/common/rte_service.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
> 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -16,6 +17,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include "eal_private.h"
> 
> @@ -276,6 +278,8 @@ rte_service_component_register(const struct
> rte_service_spec *spec,
>   if (id_ptr

Re: [PATCH v1] devtools: allow variable declaration inside for loop

2023-05-03 Thread Ferruh Yigit
On 5/3/2023 11:02 AM, Bruce Richardson wrote:
> On Wed, May 03, 2023 at 10:50:18AM +0100, Ferruh Yigit wrote:
>> Declaring variable inside for loop is not supported via C89 and it was
>> checked in checkpatch.sh via commit [1].  But as DPDK supported C
>> standard is becoming C99 [2], declaring variable inside loop can be
>> allowed.
>>
>> [1] Commit 43e73483a4b8 ("devtools: forbid variable declaration inside
>> for")
>>
>> [2] https://dpdk.org/patch/121912
>>
>> Signed-off-by: Ferruh Yigit  --- Cc: Bruce
>> Richardson  Cc: David Marchand
>>  --- devtools/checkpatches.sh | 8  1
>> file changed, 8 deletions(-)
>>
> 
> Definite +1 from me for allowing this. However, is the plan still to move
> to C99 in this release. I thought we were just going to jump to C11 in
> 23.11 release? However, I can't see any compilers refusing this if we do
> relax things a bit now.

I will update the commit log for target as C99/C11 .

> 
> I was thinking that our coding standards doc might need an update for this,
> but I don't see the restriction on not doing this documented there, so it
> seems no doc change is necessary.
> 

Commit 43e73483a4b8 refers to following document:

http://doc.dpdk.org/guides/contributing/coding_style.html#local-variables

Which has: "Variables should be declared at the start of a block of code
rather than in the middle."

Although it is comparing between declaring start of a block and middle
of the code, maybe we can update the document to explicitly state that
declaring variable inside for loop is allowed to prevent confusion.

Let me send a new version with documentation update.




[PATCH v2] devtools: allow variable declaration inside for loop

2023-05-03 Thread Ferruh Yigit
Declaring variable inside for loop is not supported via C89 and it was
checked in checkpatch.sh via commit [1].
But as DPDK supported C standard is becoming C99/C11 [2], declaring
variable inside loop can be allowed.

[1]
Commit 43e73483a4b8 ("devtools: forbid variable declaration inside for")

[2]
https://dpdk.org/patch/121912

Signed-off-by: Ferruh Yigit 
---
Cc: Bruce Richardson 
Cc: David Marchand 

v2:
 * Update coding convention too
---
 devtools/checkpatches.sh | 8 
 doc/guides/contributing/coding_style.rst | 1 +
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
index 15d5d6709445..b5baf6f2b161 100755
--- a/devtools/checkpatches.sh
+++ b/devtools/checkpatches.sh
@@ -78,14 +78,6 @@ check_forbidden_additions() { # 
-f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
"$1" || res=1
 
-   # forbid variable declaration inside "for" loop
-   awk -v FOLDERS='.' \
-   -v 
EXPRESSIONS='for[[:space:]]*\\((char|u?int|unsigned|s?size_t)' \
-   -v RET_ON_FAIL=1 \
-   -v MESSAGE='Declaring a variable inside for()' \
-   -f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
-   "$1" || res=1
-
# refrain from new additions of 16/32/64 bits rte_atomicNN_xxx()
awk -v FOLDERS="lib drivers app examples" \
-v EXPRESSIONS="rte_atomic[0-9][0-9]_.*\\\(" \
diff --git a/doc/guides/contributing/coding_style.rst 
b/doc/guides/contributing/coding_style.rst
index 89db6260cfbf..e18b8d4439ea 100644
--- a/doc/guides/contributing/coding_style.rst
+++ b/doc/guides/contributing/coding_style.rst
@@ -558,6 +558,7 @@ Local Variables
 
 * Variables should be declared at the start of a block of code rather than in 
the middle.
   The exception to this is when the variable is ``const`` in which case the 
declaration must be at the point of first use/assignment.
+  Declaring variable inside a for loop is OK.
 * When declaring variables in functions, multiple variables per line are OK.
   However, if multiple declarations would cause the line to exceed a 
reasonable line length, begin a new set of declarations on the next line rather 
than using a line continuation.
 * Be careful to not obfuscate the code by initializing variables in the 
declarations, only the last variable on a line should be initialized.
-- 
2.34.1



Re: [PATCH v2] devtools: allow variable declaration inside for loop

2023-05-03 Thread Bruce Richardson
On Wed, May 03, 2023 at 11:30:53AM +0100, Ferruh Yigit wrote:
> Declaring variable inside for loop is not supported via C89 and it was
> checked in checkpatch.sh via commit [1].
> But as DPDK supported C standard is becoming C99/C11 [2], declaring
> variable inside loop can be allowed.
> 
> [1]
> Commit 43e73483a4b8 ("devtools: forbid variable declaration inside for")
> 
> [2]
> https://dpdk.org/patch/121912
> 
> Signed-off-by: Ferruh Yigit 
> ---
> Cc: Bruce Richardson 
> Cc: David Marchand 
> 
> v2:
>  * Update coding convention too
> ---

Acked-by: Bruce Richardson 

>  devtools/checkpatches.sh | 8 
>  doc/guides/contributing/coding_style.rst | 1 +
>  2 files changed, 1 insertion(+), 8 deletions(-)
> 
> diff --git a/devtools/checkpatches.sh b/devtools/checkpatches.sh
> index 15d5d6709445..b5baf6f2b161 100755
> --- a/devtools/checkpatches.sh
> +++ b/devtools/checkpatches.sh
> @@ -78,14 +78,6 @@ check_forbidden_additions() { # 
>   -f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
>   "$1" || res=1
>  
> - # forbid variable declaration inside "for" loop
> - awk -v FOLDERS='.' \
> - -v 
> EXPRESSIONS='for[[:space:]]*\\((char|u?int|unsigned|s?size_t)' \
> - -v RET_ON_FAIL=1 \
> - -v MESSAGE='Declaring a variable inside for()' \
> - -f $(dirname $(readlink -f $0))/check-forbidden-tokens.awk \
> - "$1" || res=1
> -
>   # refrain from new additions of 16/32/64 bits rte_atomicNN_xxx()
>   awk -v FOLDERS="lib drivers app examples" \
>   -v EXPRESSIONS="rte_atomic[0-9][0-9]_.*\\\(" \
> diff --git a/doc/guides/contributing/coding_style.rst 
> b/doc/guides/contributing/coding_style.rst
> index 89db6260cfbf..e18b8d4439ea 100644
> --- a/doc/guides/contributing/coding_style.rst
> +++ b/doc/guides/contributing/coding_style.rst
> @@ -558,6 +558,7 @@ Local Variables
>  
>  * Variables should be declared at the start of a block of code rather than 
> in the middle.

I'd love to see this restriction removed in future too. Having a variable
declared on first use in the middle of block I find a far easier way of
working as a) it saves scrolling to look for variable definitions and b) it
makes it far easier when adding/removing blocks of code e.g. commenting out
for testing,  to have all the code together rather than having variables at
the top to add/remove also.

>The exception to this is when the variable is ``const`` in which case the 
> declaration must be at the point of first use/assignment.
> +  Declaring variable inside a for loop is OK.
>  * When declaring variables in functions, multiple variables per line are OK.
>However, if multiple declarations would cause the line to exceed a 
> reasonable line length, begin a new set of declarations on the next line 
> rather than using a line continuation.
>  * Be careful to not obfuscate the code by initializing variables in the 
> declarations, only the last variable on a line should be initialized.
> -- 
> 2.34.1
> 


Re: [RFC PATCH v1 0/4] dts: add dts api docs

2023-05-03 Thread Juraj Linkeš
On Tue, Apr 25, 2023 at 11:43 AM Bruce Richardson
 wrote:
>
> On Tue, Apr 25, 2023 at 10:57:12AM +0200, Juraj Linkeš wrote:
> > On Tue, Apr 25, 2023 at 10:44 AM Bruce Richardson
> >  wrote:
> > >
> > > On Tue, Apr 25, 2023 at 10:20:36AM +0200, Juraj Linkeš wrote:
> > > > On Mon, Apr 3, 2023 at 11:42 AM Bruce Richardson
> > > >  wrote:
> > > > >
> > > > > On Mon, Apr 03, 2023 at 11:17:06AM +0200, Juraj Linkeš wrote:
> > > > > >Hi Bruce, Thomas,
> > > > > >The meson integration is kinda all over the place. I wanted to 
> > > > > > use the
> > > > > >existing conf.py Sphinx config file, but I also wanted to keep 
> > > > > > the docs
> > > > > >separated (because of extra DTS api docs dependencies), so the 
> > > > > > various
> > > > > >pieces are in different places (the config file in one place, 
> > > > > > meson
> > > > > >code in dts directory and generated Sphinx docs are in a new 
> > > > > > directory
> > > > > >in the api build dir, separate from the rest of the Sphinx html).
> > > > > >The big thing here is that I didn't figure out how to separate 
> > > > > > the dts
> > > > > >api build from the rest of the docs. I don't know how the 
> > > > > > -Denable_docs
> > > > > >option is supposed to work. I wanted to use -Denable_dts_docs in 
> > > > > > the
> > > > > >same fashion to decouple the builds, but it doesn't seem to work.
> > > > > >Reading the code I think the original option doesn't actually do
> > > > > >anything - does it work? How is it supposed to work?
> > > > > >Thanks,
> > > > > >Juraj
> > > > >
> > > > > The enable_docs option works by selectively enabling the doc build 
> > > > > tasks
> > > > > using the "build_by_default" parameter on them.
> > > > > See http://git.dpdk.org/dpdk/tree/doc/guides/meson.build#n23 for an
> > > > > example. The custom_target for sphinx is not a dependency of any other
> > > > > task, so whether it gets run or not depends entirely on whether the
> > > > > "build_by_default" and/or "install" options are set.
> > > > >
> > > > > As usual, there may be other stuff that needs cleaning up on this, but
> > > > > that's how it works for now, anyway. [And it does actually work, last 
> > > > > I
> > > > > tested it :-)]
> > > >
> > > > I looked into this and as is so frequently the case, we're both right. 
> > > > :-)
> > > >
> > > > When running according to docs, that is with:
> > > > 1. meson setup doc_build
> > > > 2. ninja -C doc_build doc
> > > >
> > > > it doesn't matter what enable_docs is set to, it always builds the docs.
> > > >
> > >
> > > Yes, I'd forgotten that. That was deliberately done so one could always
> > > request a doc build directly, without having to worry about DPDK config or
> > > building the rest of DPDK.
> > >
> > > > But in the full build it does control whether docs are built, i.e.:
> > > >
> > > > 1. meson setup doc_build
> > > > 2. ninja -C doc_build
> > > > doesn't build the docs, whereas:
> > > >
> > > > 1. meson setup doc_build -Denable_docs=true
> > > > 2. ninja -C doc_build
> > > > builds the docs.
> > > >
> > > > Now the problem in this version is when doing just the doc build
> > > > (ninja -C doc_build doc) both DPDK and DTS docs are built and I'd like
> > > > to separate those (because DTS doc build has additional dependencies).
> > > > I'm thinking the following would be a good solution within the current
> > > > paradigm:
> > > > 1. The -Denable_docs=true and -Denable_dts_docs=true options to
> > > > separate doc builds for the full build.
> > > > 2. Separate dts doc dir for the doc build ("ninja -C doc_build doc"
> > > > for DPDK docs and "ninja -C doc_build dts" (or maybe some other dir)
> > > > for DTS docs).
> > >
> > > How important is it to separate out the dts docs from the regular docs?
> >
> > It is mostly a matter of dependencies.
> >
> > > What are the additional dependencies, and how hard are they to get? If
> > > possible I'd rather not have an additional build config option added for
> > > this.
> >
> > The same dependencies as for running DTS, which are the proper Python
> > version (3.10 and newer) with DTS depencies obtained with Poetry
> > (which is a matter of installing Poetry and running it). As is
> > standard with Python projects, this is all set up in a virtual
> > environment, which needs to be activated before running the doc build.
> > It's documented in more detail in the tools docs:
> > https://doc.dpdk.org/guides/tools/dts.html#setting-up-dts-environment
> >
> > This may be too much of a hassle for DPDK devs to build non-DTS docs,
> > but I don't know whether DPDK devs actually build docs at all.
> >
>
> Can't really say for sure. I suspect most don't build them on a daily
> basis, but would often need to build them before submitting patches with a
> doc change included.
>
> What format are the DTS docs in? I agree that as described above the
> requirements are pretty different than those for the regular DPDK docs.
>

The resu

[RFC PATCH] ring: adding TPAUSE instruction to ring dequeue

2023-05-03 Thread David Coyle
This is NOT for upstreaming. This is being submitted to allow early
comparison testing with the preferred solution, which will add TAPUSE
power management support to the ring library through the addition of
callbacks. Initial stages of the preferred solution are available at
http://dpdk.org/patch/125454.

This patch adds functionality directly to rte_ring_dequeue functions to
monitor the empty reads of the ring. When a configurable number of
empty reads is reached, a TPAUSE instruction is triggered by using
rte_power_pause() on supported architectures. rte_pause() is used on
other architectures. The functionality can be included or excluded at
compilation time using the RTE_RING_PMGMT flag. If included, the new
API can be used to enable/disable the feature on a per-ring basis.
Other related settings can also be configured using the API.

Signed-off-by: David Coyle 
---
 lib/ring/meson.build  |   4 +-
 lib/ring/rte_ring.h   |  13 +-
 lib/ring/rte_ring_pmgmt.c | 367 ++
 lib/ring/rte_ring_pmgmt.h | 149 
 lib/ring/version.map  |  12 ++
 5 files changed, 539 insertions(+), 6 deletions(-)
 create mode 100644 lib/ring/rte_ring_pmgmt.c
 create mode 100644 lib/ring/rte_ring_pmgmt.h

diff --git a/lib/ring/meson.build b/lib/ring/meson.build
index c20685c689..087028542c 100644
--- a/lib/ring/meson.build
+++ b/lib/ring/meson.build
@@ -1,8 +1,8 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017 Intel Corporation
 
-sources = files('rte_ring.c')
-headers = files('rte_ring.h')
+sources = files('rte_ring.c', 'rte_ring_pmgmt.c')
+headers = files('rte_ring.h', 'rte_ring_pmgmt.h')
 # most sub-headers are not for direct inclusion
 indirect_headers += files (
 'rte_ring_core.h',
diff --git a/lib/ring/rte_ring.h b/lib/ring/rte_ring.h
index 7e4cd60650..64dbb5bca2 100644
--- a/lib/ring/rte_ring.h
+++ b/lib/ring/rte_ring.h
@@ -41,6 +41,7 @@ extern "C" {
 
 #include 
 #include 
+#include 
 
 /**
  * Calculate the memory size needed for a ring
@@ -412,8 +413,10 @@ static __rte_always_inline unsigned int
 rte_ring_dequeue_bulk(struct rte_ring *r, void **obj_table, unsigned int n,
unsigned int *available)
 {
-   return rte_ring_dequeue_bulk_elem(r, obj_table, sizeof(void *),
-   n, available);
+   uint32_t nb_rx = rte_ring_dequeue_bulk_elem(r, obj_table, sizeof(void 
*),
+   n, available);
+   RTE_RING_PMGMT_IMPL(nb_rx, r->name);
+   return nb_rx;
 }
 
 /**
@@ -812,8 +815,10 @@ static __rte_always_inline unsigned int
 rte_ring_dequeue_burst(struct rte_ring *r, void **obj_table,
unsigned int n, unsigned int *available)
 {
-   return rte_ring_dequeue_burst_elem(r, obj_table, sizeof(void *),
-   n, available);
+   uint32_t nb_rx = rte_ring_dequeue_burst_elem(r, obj_table, sizeof(void 
*),
+n, available);
+   RTE_RING_PMGMT_IMPL(nb_rx, r->name);
+   return nb_rx;
 }
 
 #ifdef __cplusplus
diff --git a/lib/ring/rte_ring_pmgmt.c b/lib/ring/rte_ring_pmgmt.c
new file mode 100644
index 00..25d64736d0
--- /dev/null
+++ b/lib/ring/rte_ring_pmgmt.c
@@ -0,0 +1,367 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include "rte_ring_pmgmt.h"
+
+static unsigned int emptypoll_max;
+static unsigned int pause_duration;
+
+/* store some internal state */
+static struct ring_conf_data {
+   /** what do we support? */
+   struct rte_cpu_intrinsics intrinsics_support;
+   /** pre-calculated tsc diff for 1us */
+   uint64_t tsc_per_us;
+   /** how many rte_pause can we fit in a microsecond? */
+   uint64_t pause_per_us;
+} global_data;
+
+/**
+ * Possible power management states of a rte ring.
+ */
+enum ring_mgmt_state {
+   /** Device power management is disabled. */
+   RING_MGMT_DISABLED = 0,
+   /** Device power management is enabled. */
+   RING_MGMT_ENABLED
+};
+
+struct ring {
+   char name[RTE_RING_NAMESIZE] __rte_cache_aligned;
+};
+
+struct ring_list_entry {
+   TAILQ_ENTRY(ring_list_entry) next;
+   struct ring ring;
+   uint64_t n_empty_polls;
+   uint64_t n_sleeps;
+};
+
+struct ring_core_cfg {
+   TAILQ_HEAD(ring_list_head, ring_list_entry) head;
+   /**< List of rings associated with this lcore */
+   size_t n_rings;
+   /**< How many rings are in the list? */
+   volatile enum ring_mgmt_state pwr_mgmt_state;
+   /**< State of power management for this ring */
+   uint64_t n_rings_ready_to_sleep;
+   /**< Number of rings ready to enter power optimized state */
+   uint64_t sleep_target;
+   /**< Prevent a ring from triggering sleep multiple times */
+} __rte_cache_aligned;
+static struct ring_core_cfg lcore_cfgs[RTE_MAX_LCORE];
+
+st

[PATCH 0/3] Windows performance enhancements

2023-05-03 Thread Tal Shnaiderman
The following series enables support of 3 hardware offloads on Windows which 
improve PMD throughput.

RX throughput improvements:
**Multi-packet RQ.
**CQE compression.

TX throughput improvement:
**Multi packet send.

Tal Shnaiderman (3):
  net/mlx5: support multi-packet RQ on Windows
  net/mlx5: support CQE compression on Windows
  net/mlx5: support enhanced multi-packet write on Windows

 doc/guides/rel_notes/release_23_07.rst  | 33 +++--
 drivers/common/mlx5/mlx5_devx_cmds.c| 11 +++
 drivers/common/mlx5/mlx5_devx_cmds.h|  5 
 drivers/common/mlx5/windows/mlx5_win_defs.h |  8 -
 drivers/net/mlx5/windows/mlx5_os.c  | 46 -
 5 files changed, 72 insertions(+), 31 deletions(-)

-- 
2.16.1.windows.4



[PATCH 1/3] net/mlx5: support multi-packet RQ on Windows

2023-05-03 Thread Tal Shnaiderman
Multi-Packet RQ can further save PCIe bandwidth by posting a single large
buffer for multiple packets.

Instead of posting a buffer per a packet, one large buffer is posted
to receive multiple packets on the buffer.

Add support for multi-packet RQ on Windows.
The feature is disabled by default and can by enabled
by setting mprq_en=1 in the PMD specific arguments.

Signed-off-by: Tal Shnaiderman 
---
 drivers/common/mlx5/mlx5_devx_cmds.c|  3 +++
 drivers/common/mlx5/mlx5_devx_cmds.h|  2 ++
 drivers/common/mlx5/windows/mlx5_win_defs.h |  8 +++-
 drivers/net/mlx5/windows/mlx5_os.c  | 26 ++
 4 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c 
b/drivers/common/mlx5/mlx5_devx_cmds.c
index d0907fcd49..096bd1d520 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -1076,6 +1076,9 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
 general_obj_types) &
  MLX5_GENERAL_OBJ_TYPES_CAP_CONN_TRACK_OFFLOAD);
attr->rq_delay_drop = MLX5_GET(cmd_hca_cap, hcattr, rq_delay_drop);
+   attr->striding_rq = MLX5_GET(cmd_hca_cap, hcattr, striding_rq);
+   attr->ext_stride_num_range =
+   MLX5_GET(cmd_hca_cap, hcattr, ext_stride_num_range);
attr->max_flow_counter_15_0 = MLX5_GET(cmd_hca_cap, hcattr,
max_flow_counter_15_0);
attr->max_flow_counter_31_16 = MLX5_GET(cmd_hca_cap, hcattr,
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h 
b/drivers/common/mlx5/mlx5_devx_cmds.h
index ce173bc36a..9e7992b1c6 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -282,6 +282,8 @@ struct mlx5_hca_attr {
uint32_t crypto_wrapped_import_method:1;
uint16_t esw_mgr_vport_id; /* E-Switch Mgr vport ID . */
uint16_t max_wqe_sz_sq;
+   uint32_t striding_rq:1;
+   uint32_t ext_stride_num_range:1;
uint32_t set_reg_c:8;
uint32_t nic_flow_table:1;
uint32_t modify_outer_ip_ecn:1;
diff --git a/drivers/common/mlx5/windows/mlx5_win_defs.h 
b/drivers/common/mlx5/windows/mlx5_win_defs.h
index 65da820c5e..885114655f 100644
--- a/drivers/common/mlx5/windows/mlx5_win_defs.h
+++ b/drivers/common/mlx5/windows/mlx5_win_defs.h
@@ -270,4 +270,10 @@ enum {
MLX5_MATCH_INNER_HEADERS= RTE_BIT32(2),
 };
 
-#endif /* MLX5_WIN_DEFS_H */
+#define MLX5_MIN_SINGLE_WQE_LOG_NUM_STRIDES 9
+#define MLX5_MAX_SINGLE_WQE_LOG_NUM_STRIDES 16
+#define MLX5_MIN_SINGLE_STRIDE_LOG_NUM_BYTES 6
+#define MLX5_MAX_SINGLE_STRIDE_LOG_NUM_BYTES 13
+#define MLX5_EXT_MIN_SINGLE_WQE_LOG_NUM_STRIDES 3
+#define IB_QPT_RAW_PACKET 8
+#endif /* __MLX5_WIN_DEFS_H__ */
diff --git a/drivers/net/mlx5/windows/mlx5_os.c 
b/drivers/net/mlx5/windows/mlx5_os.c
index f401264b61..0caa8931e4 100644
--- a/drivers/net/mlx5/windows/mlx5_os.c
+++ b/drivers/net/mlx5/windows/mlx5_os.c
@@ -187,6 +187,32 @@ mlx5_os_capabilities_prepare(struct mlx5_dev_ctx_shared 
*sh)
if (sh->dev_cap.tso)
sh->dev_cap.tso_max_payload_sz = 1 << hca_attr->max_lso_cap;
DRV_LOG(DEBUG, "Counters are not supported.");
+   if (hca_attr->striding_rq) {
+   sh->dev_cap.mprq.enabled = 1;
+   sh->dev_cap.mprq.log_min_stride_size =
+   MLX5_MIN_SINGLE_STRIDE_LOG_NUM_BYTES;
+   sh->dev_cap.mprq.log_max_stride_size =
+   MLX5_MAX_SINGLE_STRIDE_LOG_NUM_BYTES;
+   if (hca_attr->ext_stride_num_range)
+   sh->dev_cap.mprq.log_min_stride_num =
+   MLX5_EXT_MIN_SINGLE_WQE_LOG_NUM_STRIDES;
+   else
+   sh->dev_cap.mprq.log_min_stride_num =
+   MLX5_MIN_SINGLE_WQE_LOG_NUM_STRIDES;
+   sh->dev_cap.mprq.log_max_stride_num =
+   MLX5_MAX_SINGLE_WQE_LOG_NUM_STRIDES;
+   DRV_LOG(DEBUG, "\tmin_single_stride_log_num_of_bytes: %u",
+   sh->dev_cap.mprq.log_min_stride_size);
+   DRV_LOG(DEBUG, "\tmax_single_stride_log_num_of_bytes: %u",
+   sh->dev_cap.mprq.log_max_stride_size);
+   DRV_LOG(DEBUG, "\tmin_single_wqe_log_num_of_strides: %u",
+   sh->dev_cap.mprq.log_min_stride_num);
+   DRV_LOG(DEBUG, "\tmax_single_wqe_log_num_of_strides: %u",
+   sh->dev_cap.mprq.log_max_stride_num);
+   DRV_LOG(DEBUG, "\tmin_stride_wqe_log_size: %u",
+   sh->dev_cap.mprq.log_min_stride_wqe_size);
+   DRV_LOG(DEBUG, "Device supports Multi-Packet RQ.");
+   }
if (hca_attr->rss_ind_tbl_cap) {
/*
 * DPDK doesn't support larger/variable indirection tables.
-- 
2.16.1.windows.4



[PATCH 2/3] net/mlx5: support CQE compression on Windows

2023-05-03 Thread Tal Shnaiderman
CQE Compression reduces PCI overhead by coalescing and compressing
multiple CQEs into a single merged CQE.

Add supported for the CQE compression feature on Windows.
feature is enabled by default unless not supported by the HW
or if the rxq_cqe_comp_en PMD argument is explicitly disabled.

Signed-off-by: Tal Shnaiderman 
---
 drivers/common/mlx5/mlx5_devx_cmds.c |  2 ++
 drivers/common/mlx5/mlx5_devx_cmds.h |  1 +
 drivers/net/mlx5/windows/mlx5_os.c   | 12 
 3 files changed, 15 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c 
b/drivers/common/mlx5/mlx5_devx_cmds.c
index 096bd1d520..a31e4995f5 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -1062,6 +1062,8 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
attr->cqe_compression = MLX5_GET(cmd_hca_cap, hcattr, cqe_compression);
attr->mini_cqe_resp_flow_tag = MLX5_GET(cmd_hca_cap, hcattr,
mini_cqe_resp_flow_tag);
+   attr->cqe_compression_128 = MLX5_GET(cmd_hca_cap, hcattr,
+   cqe_compression_128);
attr->mini_cqe_resp_l3_l4_tag = MLX5_GET(cmd_hca_cap, hcattr,
 mini_cqe_resp_l3_l4_tag);
attr->enhanced_cqe_compression = MLX5_GET(cmd_hca_cap, hcattr,
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h 
b/drivers/common/mlx5/mlx5_devx_cmds.h
index 9e7992b1c6..edcd867c4e 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -284,6 +284,7 @@ struct mlx5_hca_attr {
uint16_t max_wqe_sz_sq;
uint32_t striding_rq:1;
uint32_t ext_stride_num_range:1;
+   uint32_t cqe_compression_128:1;
uint32_t set_reg_c:8;
uint32_t nic_flow_table:1;
uint32_t modify_outer_ip_ecn:1;
diff --git a/drivers/net/mlx5/windows/mlx5_os.c 
b/drivers/net/mlx5/windows/mlx5_os.c
index 0caa8931e4..6527269663 100644
--- a/drivers/net/mlx5/windows/mlx5_os.c
+++ b/drivers/net/mlx5/windows/mlx5_os.c
@@ -237,6 +237,18 @@ mlx5_os_capabilities_prepare(struct mlx5_dev_ctx_shared 
*sh)
} else {
DRV_LOG(DEBUG, "Tunnel offloading is not supported.");
}
+   sh->dev_cap.cqe_comp = 0;
+#if (RTE_CACHE_LINE_SIZE == 128)
+   if (hca_attr->cqe_compression_128)
+   sh->dev_cap.cqe_comp = 1;
+   DRV_LOG(DEBUG, "Rx CQE 128B compression is %ssupported.",
+   sh->dev_cap.cqe_comp ? "" : "not ");
+#else
+   if (hca_attr->cqe_compression)
+   sh->dev_cap.cqe_comp = 1;
+   DRV_LOG(DEBUG, "Rx CQE compression is %ssupported.",
+   sh->dev_cap.cqe_comp ? "" : "not ");
+#endif
snprintf(sh->dev_cap.fw_ver, 64, "%x.%x.%04x",
 MLX5_GET(initial_seg, pv_iseg, fw_rev_major),
 MLX5_GET(initial_seg, pv_iseg, fw_rev_minor),
-- 
2.16.1.windows.4



[PATCH 3/3] net/mlx5: support enhanced multi-packet write on Windows

2023-05-03 Thread Tal Shnaiderman
Add support for enhanced multi-packet write on Windows.

Enhanced multi-packet write allows the Tx burst function to pack up
multiple packets in a single descriptor session to save PCI bandwidth
and improve performance.

The feature can be controlled by the txq_mpw_en PMD argument:

txq_mpw_en=1 - PMD will first attempt to use "enhanced multi packet write"
if the feature is not supported by the HW the legacy "multi packet write"
will be used.
if both are unsupported the multi packet write feature is disabled.

txq_mpw_en=0 - multi packet write is disabled.

txq_mpw_en unset(default) - enhanced multi packet write
will be activated if supported.
if unsupported the multi packet write feature is disabled.

Signed-off-by: Tal Shnaiderman 
---
 doc/guides/rel_notes/release_23_07.rst | 33 -
 drivers/common/mlx5/mlx5_devx_cmds.c   |  6 ++
 drivers/common/mlx5/mlx5_devx_cmds.h   |  2 ++
 drivers/net/mlx5/windows/mlx5_os.c |  8 +++-
 4 files changed, 19 insertions(+), 30 deletions(-)

diff --git a/doc/guides/rel_notes/release_23_07.rst 
b/doc/guides/rel_notes/release_23_07.rst
index a9b1293689..d74551414d 100644
--- a/doc/guides/rel_notes/release_23_07.rst
+++ b/doc/guides/rel_notes/release_23_07.rst
@@ -24,36 +24,11 @@ DPDK Release 23.07
 New Features
 
 
-.. This section should contain new features added in this release.
-   Sample format:
+* **Updated NVIDIA mlx5 driver.**
 
-   * **Add a title in the past tense with a full stop.**
-
- Add a short 1-2 sentence description in the past tense.
- The description should be enough to allow someone scanning
- the release notes to understand the new feature.
-
- If the feature adds a lot of sub-features you can use a bullet list
- like this:
-
- * Added feature foo to do something.
- * Enhanced feature bar to do something else.
-
- Refer to the previous release notes for examples.
-
- Suggested order in release notes items:
- * Core libs (EAL, mempool, ring, mbuf, buses)
- * Device abstraction libs and PMDs (ordered alphabetically by vendor name)
-   - ethdev (lib, PMDs)
-   - cryptodev (lib, PMDs)
-   - eventdev (lib, PMDs)
-   - etc
- * Other libs
- * Apps, Examples, Tools (if significant)
-
- This section is a comment. Do not overwrite or remove it.
- Also, make sure to start the actual text at the margin.
- ===
+  * Added support for multi-packet RQ on Windows.
+  * Added support for CQE compression on Windows.
+  * Added support for enhanced multi-packet write on Windows.
 
 
 Removed Items
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.c 
b/drivers/common/mlx5/mlx5_devx_cmds.c
index a31e4995f5..b2abc742cf 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.c
+++ b/drivers/common/mlx5/mlx5_devx_cmds.c
@@ -1298,6 +1298,12 @@ mlx5_devx_cmd_query_hca_attr(void *ctx,
attr->rss_ind_tbl_cap = MLX5_GET
(per_protocol_networking_offload_caps,
 hcattr, rss_ind_tbl_cap);
+   attr->multi_pkt_send_wqe = MLX5_GET
+   (per_protocol_networking_offload_caps,
+hcattr, multi_pkt_send_wqe);
+   attr->enhanced_multi_pkt_send_wqe = MLX5_GET
+   (per_protocol_networking_offload_caps,
+hcattr, enhanced_multi_pkt_send_wqe);
/* Query HCA attribute for ROCE. */
if (attr->roce) {
hcattr = mlx5_devx_get_hca_cap(ctx, in, out, &rc,
diff --git a/drivers/common/mlx5/mlx5_devx_cmds.h 
b/drivers/common/mlx5/mlx5_devx_cmds.h
index edcd867c4e..c8427d2dbb 100644
--- a/drivers/common/mlx5/mlx5_devx_cmds.h
+++ b/drivers/common/mlx5/mlx5_devx_cmds.h
@@ -285,6 +285,8 @@ struct mlx5_hca_attr {
uint32_t striding_rq:1;
uint32_t ext_stride_num_range:1;
uint32_t cqe_compression_128:1;
+   uint32_t multi_pkt_send_wqe:1;
+   uint32_t enhanced_multi_pkt_send_wqe:1;
uint32_t set_reg_c:8;
uint32_t nic_flow_table:1;
uint32_t modify_outer_ip_ecn:1;
diff --git a/drivers/net/mlx5/windows/mlx5_os.c 
b/drivers/net/mlx5/windows/mlx5_os.c
index 6527269663..b731bdff06 100644
--- a/drivers/net/mlx5/windows/mlx5_os.c
+++ b/drivers/net/mlx5/windows/mlx5_os.c
@@ -173,7 +173,6 @@ mlx5_os_capabilities_prepare(struct mlx5_dev_ctx_shared *sh)
sh->dev_cap.max_qp = 1 << hca_attr->log_max_qp;
sh->dev_cap.max_qp_wr = 1 << hca_attr->log_max_qp_sz;
sh->dev_cap.dv_flow_en = 1;
-   sh->dev_cap.mps = MLX5_MPW_DISABLED;
DRV_LOG(DEBUG, "MPW isn't supported.");
DRV_LOG(DEBUG, "MPLS over GRE/UDP tunnel offloading is no supported.");
sh->dev_cap.hw_csum = hca_attr->csum_cap;
@@ -224,6 +223,13 @@ mlx5_os_capabilities_prepare(struct mlx5_dev_ctx_shared 
*sh)
DRV_LOG(DEBUG

Re: [PATCH] mempool/cnxk: avoid indefinite wait when counting batch alloc pointers

2023-05-03 Thread Jerin Jacob
On Tue, Apr 11, 2023 at 12:52 PM Ashwin Sekhar T K  wrote:
>
> Avoid waiting indefinitely when counting batch alloc pointers by adding a
> wait timeout.
>
> Signed-off-by: Ashwin Sekhar T K 


Updated the git commit as follows and applied to
dpdk-next-net-mrvl/for-next-net. Thanks

mempool/cnxk: avoid hang when counting batch allocs

Avoid waiting indefinitely when counting batch alloc pointers by adding a
wait timeout.

Fixes: 8f2cd7946083 ("mempool/cnxk: add cn10k get count")
Cc: sta...@dpdk.org

Signed-off-by: Ashwin Sekhar T K 

> ---
>  drivers/common/cnxk/roc_npa.h| 15 +--
>  drivers/mempool/cnxk/cn10k_mempool_ops.c |  3 ++-
>  2 files changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/common/cnxk/roc_npa.h b/drivers/common/cnxk/roc_npa.h
> index dd588b0322..1ef3ecc08a 100644
> --- a/drivers/common/cnxk/roc_npa.h
> +++ b/drivers/common/cnxk/roc_npa.h
> @@ -241,19 +241,23 @@ roc_npa_aura_batch_alloc_issue(uint64_t aura_handle, 
> uint64_t *buf,
>  }
>
>  static inline void
> -roc_npa_batch_alloc_wait(uint64_t *cache_line)
> +roc_npa_batch_alloc_wait(uint64_t *cache_line, unsigned int wait_us)
>  {
> +   const uint64_t ticks = (uint64_t)wait_us * plt_tsc_hz() / 
> (uint64_t)1E6;
> +   const uint64_t start = plt_tsc_cycles();
> +
> /* Batch alloc status code is updated in bits [5:6] of the first word
>  * of the 128 byte cache line.
>  */
> while (((__atomic_load_n(cache_line, __ATOMIC_RELAXED) >> 5) & 0x3) ==
>ALLOC_CCODE_INVAL)
> -   ;
> +   if (wait_us && (plt_tsc_cycles() - start) >= ticks)
> +   break;
>  }
>
>  static inline unsigned int
>  roc_npa_aura_batch_alloc_count(uint64_t *aligned_buf, unsigned int num,
> -  unsigned int do_wait)
> +  unsigned int wait_us)
>  {
> unsigned int count, i;
>
> @@ -267,8 +271,7 @@ roc_npa_aura_batch_alloc_count(uint64_t *aligned_buf, 
> unsigned int num,
>
> status = (struct npa_batch_alloc_status_s *)&aligned_buf[i];
>
> -   if (do_wait)
> -   roc_npa_batch_alloc_wait(&aligned_buf[i]);
> +   roc_npa_batch_alloc_wait(&aligned_buf[i], wait_us);
>
> count += status->count;
> }
> @@ -293,7 +296,7 @@ roc_npa_aura_batch_alloc_extract(uint64_t *buf, uint64_t 
> *aligned_buf,
>
> status = (struct npa_batch_alloc_status_s *)&aligned_buf[i];
>
> -   roc_npa_batch_alloc_wait(&aligned_buf[i]);
> +   roc_npa_batch_alloc_wait(&aligned_buf[i], 0);
>
> line_count = status->count;
>
> diff --git a/drivers/mempool/cnxk/cn10k_mempool_ops.c 
> b/drivers/mempool/cnxk/cn10k_mempool_ops.c
> index ba826f0f01..ff0015d8de 100644
> --- a/drivers/mempool/cnxk/cn10k_mempool_ops.c
> +++ b/drivers/mempool/cnxk/cn10k_mempool_ops.c
> @@ -9,6 +9,7 @@
>
>  #define BATCH_ALLOC_SZ  ROC_CN10K_NPA_BATCH_ALLOC_MAX_PTRS
>  #define BATCH_OP_DATA_TABLE_MZ_NAME "batch_op_data_table_mz"
> +#define BATCH_ALLOC_WAIT_US 5
>
>  enum batch_op_status {
> BATCH_ALLOC_OP_NOT_ISSUED = 0,
> @@ -178,7 +179,7 @@ cn10k_mempool_get_count(const struct rte_mempool *mp)
>
> if (mem->status == BATCH_ALLOC_OP_ISSUED)
> count += roc_npa_aura_batch_alloc_count(
> -   mem->objs, BATCH_ALLOC_SZ, 1);
> +   mem->objs, BATCH_ALLOC_SZ, 
> BATCH_ALLOC_WAIT_US);
>
> if (mem->status == BATCH_ALLOC_OP_DONE)
> count += mem->sz;
> --
> 2.25.1
>


RE: [PATCH v2] devtools: allow variable declaration inside for loop

2023-05-03 Thread Morten Brørup
> From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> Sent: Wednesday, 3 May 2023 12.57
> 
> On Wed, May 03, 2023 at 11:30:53AM +0100, Ferruh Yigit wrote:
> > Declaring variable inside for loop is not supported via C89 and it was
> > checked in checkpatch.sh via commit [1].
> > But as DPDK supported C standard is becoming C99/C11 [2], declaring
> > variable inside loop can be allowed.
> >
> > [1]
> > Commit 43e73483a4b8 ("devtools: forbid variable declaration inside
> for")
> >
> > [2]
> > https://dpdk.org/patch/121912
> >
> > Signed-off-by: Ferruh Yigit 
> > ---
> > Cc: Bruce Richardson 
> > Cc: David Marchand 
> >
> > v2:
> >  * Update coding convention too
> > ---
> 
> Acked-by: Bruce Richardson 

Acked-by: Morten Brørup 

[...]

> > @@ -558,6 +558,7 @@ Local Variables
> >
> >  * Variables should be declared at the start of a block of code rather
> than in the middle.
> 
> I'd love to see this restriction removed in future too. Having a
> variable
> declared on first use in the middle of block I find a far easier way of
> working as a) it saves scrolling to look for variable definitions and b)
> it
> makes it far easier when adding/removing blocks of code e.g. commenting
> out
> for testing,  to have all the code together rather than having variables
> at
> the top to add/remove also.

And c) Initializing the variables close to where they are used the first time 
reduces the risk of initializing them incorrectly. Especially when modifying a 
block of code, initialization of its variables might be missed if out of sight. 
(Although this is probably a consequence of "a)".)

I consider it old style to only declare variables at the start of a block of 
code, and this style of coding should be considered obsolete.

If you are really old (like me?), you might remember when function parameters 
were provided like this:

int main(argc, argv)
int argc;
char *argv[];
{
return(0);
}

We have moved on from that to a more modern coding style a long time ago. We 
should also move on to a more modern coding style regarding variable 
declarations.

> 
> >The exception to this is when the variable is ``const`` in which
> case the declaration must be at the point of first use/assignment.
> > +  Declaring variable inside a for loop is OK.
> >  * When declaring variables in functions, multiple variables per line
> are OK.
> >However, if multiple declarations would cause the line to exceed a
> reasonable line length, begin a new set of declarations on the next line
> rather than using a line continuation.
> >  * Be careful to not obfuscate the code by initializing variables in
> the declarations, only the last variable on a line should be
> initialized.
> > --
> > 2.34.1
> >


RE: [RFC PATCH] ring: adding TPAUSE instruction to ring dequeue

2023-05-03 Thread Morten Brørup
> From: David Coyle [mailto:david.co...@intel.com]
> Sent: Wednesday, 3 May 2023 13.39
> 
> This is NOT for upstreaming. This is being submitted to allow early
> comparison testing with the preferred solution, which will add TAPUSE
> power management support to the ring library through the addition of
> callbacks. Initial stages of the preferred solution are available at
> http://dpdk.org/patch/125454.
> 
> This patch adds functionality directly to rte_ring_dequeue functions to
> monitor the empty reads of the ring. When a configurable number of
> empty reads is reached, a TPAUSE instruction is triggered by using
> rte_power_pause() on supported architectures. rte_pause() is used on
> other architectures. The functionality can be included or excluded at
> compilation time using the RTE_RING_PMGMT flag. If included, the new
> API can be used to enable/disable the feature on a per-ring basis.
> Other related settings can also be configured using the API.

I don't understand why DPDK developers keep spending time on trying to invent 
methods to determine application busyness based on entry/exit points in a 
variety of libraries, when the application is in a much better position to 
determine busyness. All of these "busyness measuring" library extensions have 
their own specific assumptions and weird limitations.

I do understand that the goal is power saving, which certainly is relevant! I 
only criticize the measuring methods.

For reference, we implemented something very simple in our application 
framework:
1. When each pipeline stage has completed a burst, it reports if it was busy or 
not.
2. If the pipeline busyness is low, we take a nap to save some power.

And here is the magic twist to this simple algorithm:
3. A pipeline stage is not considered busy unless it processed a full burst, 
and is ready to process more packets immediately. This interpretation of 
busyness has a significant impact on the percentage of time spent napping 
during the low-traffic hours.

This algorithm was very quickly implemented. It might not be perfect, and we do 
intend to improve it (also to determine CPU Utilization on a scale that the end 
user can translate to a linear interpretation of how busy the system is). But I 
seriously doubt that any of the proposed "busyness measuring" library 
extensions are any better.

So: The application knows better, please spend your precious time on something 
useful instead.

@David, my outburst is not directed at you specifically. Generally, I do 
appreciate experimenting as a good way of obtaining knowledge. So thank you for 
sharing your experiments with this audience!

PS: If cruft can be disabled at build time, I generally don't oppose to it.

-Morten



Re: [RFC 07/27] vhost: change to single IOTLB cache per device

2023-05-03 Thread Maxime Coquelin

Hi Chenbo,

On 4/25/23 08:19, Xia, Chenbo wrote:

Hi Maxime,


-Original Message-
From: Maxime Coquelin 
Sent: Friday, March 31, 2023 11:43 PM
To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo
; m...@redhat.com; f...@redhat.com;
jasow...@redhat.com; Liang, Cunming ; Xie, Yongji
; echau...@redhat.com; epere...@redhat.com;
amore...@redhat.com
Cc: Maxime Coquelin 
Subject: [RFC 07/27] vhost: change to single IOTLB cache per device

This patch simplifies IOTLB implementation and improves
IOTLB memory consumption by having a single IOTLB cache
per device, instead of having one per queue.

In order to not impact performance, it keeps an IOTLB lock
per virtqueue, so that there is no contention between
multiple queue trying to acquire it.

Signed-off-by: Maxime Coquelin 
---
  lib/vhost/iotlb.c  | 212 +++--
  lib/vhost/iotlb.h  |  43 ++---
  lib/vhost/vhost.c  |  18 ++--
  lib/vhost/vhost.h  |  16 ++--
  lib/vhost/vhost_user.c |  25 +++--
  5 files changed, 160 insertions(+), 154 deletions(-)



[...]


diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
index d60e39b6bc..81ebef0137 100644
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -7,7 +7,7 @@
   * The vhost-user protocol connection is an external interface, so it
must be
   * robust against invalid inputs.
   *
- * This is important because the vhost-user frontend is only one step
removed
+* This is important because the vhost-user frontend is only one step
removed


This is changed by accident?


Yes, this will be fixed in v1.

Thanks,
Maxime


Thanks,
Chenbo




Re: [RFC 09/27] vhost: add page size info to IOTLB entry

2023-05-03 Thread Maxime Coquelin

Hi Chenbo,

On 4/25/23 08:20, Xia, Chenbo wrote:

Hi Maxime,


-Original Message-
From: Maxime Coquelin 
Sent: Friday, March 31, 2023 11:43 PM
To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo
; m...@redhat.com; f...@redhat.com;
jasow...@redhat.com; Liang, Cunming ; Xie, Yongji
; echau...@redhat.com; epere...@redhat.com;
amore...@redhat.com
Cc: Maxime Coquelin 
Subject: [RFC 09/27] vhost: add page size info to IOTLB entry

VDUSE will close the file descriptor after having mapped
the shared memory, so it will not be possible to get the
page size afterwards.

This patch adds an new page_shift field to the IOTLB entry,
so that the information will be passed at IOTLB cache
insertion time. The information is stored as a bit shift
value so that IOTLB entry keeps fitting in a single
cacheline.

Signed-off-by: Maxime Coquelin 
---
  lib/vhost/iotlb.c  | 46 --
  lib/vhost/iotlb.h  |  2 +-
  lib/vhost/vhost.h  |  1 -
  lib/vhost/vhost_user.c |  8 +---
  4 files changed, 28 insertions(+), 29 deletions(-)

diff --git a/lib/vhost/iotlb.c b/lib/vhost/iotlb.c
index 51f118bc48..188dfb8e38 100644
--- a/lib/vhost/iotlb.c
+++ b/lib/vhost/iotlb.c
@@ -19,14 +19,14 @@ struct vhost_iotlb_entry {
uint64_t uaddr;
uint64_t uoffset;
uint64_t size;
+   uint8_t page_shift;
uint8_t perm;
  };

  #define IOTLB_CACHE_SIZE 2048

  static bool
-vhost_user_iotlb_share_page(struct vhost_iotlb_entry *a, struct
vhost_iotlb_entry *b,
-   uint64_t align)
+vhost_user_iotlb_share_page(struct vhost_iotlb_entry *a, struct
vhost_iotlb_entry *b)
  {
uint64_t a_start, a_end, b_start;

@@ -38,44 +38,41 @@ vhost_user_iotlb_share_page(struct vhost_iotlb_entry
*a, struct vhost_iotlb_entr

/* Assumes entry a lower than entry b */
RTE_ASSERT(a_start < b_start);
-   a_end = RTE_ALIGN_CEIL(a_start + a->size, align);
-   b_start = RTE_ALIGN_FLOOR(b_start, align);
+   a_end = RTE_ALIGN_CEIL(a_start + a->size, RTE_BIT64(a->page_shift));
+   b_start = RTE_ALIGN_FLOOR(b_start, RTE_BIT64(b->page_shift));

return a_end > b_start;
  }

  static void
-vhost_user_iotlb_set_dump(struct virtio_net *dev, struct
vhost_iotlb_entry *node)
+vhost_user_iotlb_set_dump(struct vhost_iotlb_entry *node)
  {
-   uint64_t align, start;
+   uint64_t start;

start = node->uaddr + node->uoffset;
-   align = hua_to_alignment(dev->mem, (void *)(uintptr_t)start);
-
-   mem_set_dump((void *)(uintptr_t)start, node->size, false, align);
+   mem_set_dump((void *)(uintptr_t)start, node->size, false,
RTE_BIT64(node->page_shift));
  }

  static void
-vhost_user_iotlb_clear_dump(struct virtio_net *dev, struct
vhost_iotlb_entry *node,
+vhost_user_iotlb_clear_dump(struct vhost_iotlb_entry *node,
struct vhost_iotlb_entry *prev, struct vhost_iotlb_entry *next)
  {
-   uint64_t align, start, end;
+   uint64_t start, end;

start = node->uaddr + node->uoffset;
end = start + node->size;

-   align = hua_to_alignment(dev->mem, (void *)(uintptr_t)start);
-
/* Skip first page if shared with previous entry. */
-   if (vhost_user_iotlb_share_page(prev, node, align))
-   start = RTE_ALIGN_CEIL(start, align);
+   if (vhost_user_iotlb_share_page(prev, node))
+   start = RTE_ALIGN_CEIL(start, RTE_BIT64(node->page_shift));

/* Skip last page if shared with next entry. */
-   if (vhost_user_iotlb_share_page(node, next, align))
-   end = RTE_ALIGN_FLOOR(end, align);
+   if (vhost_user_iotlb_share_page(node, next))
+   end = RTE_ALIGN_FLOOR(end, RTE_BIT64(node->page_shift));

if (end > start)
-   mem_set_dump((void *)(uintptr_t)start, end - start, false,
align);
+   mem_set_dump((void *)(uintptr_t)start, end - start, false,
+   RTE_BIT64(node->page_shift));
  }

  static struct vhost_iotlb_entry *
@@ -198,7 +195,7 @@ vhost_user_iotlb_cache_remove_all(struct virtio_net
*dev)
vhost_user_iotlb_wr_lock_all(dev);

RTE_TAILQ_FOREACH_SAFE(node, &dev->iotlb_list, next, temp_node) {
-   vhost_user_iotlb_set_dump(dev, node);
+   vhost_user_iotlb_set_dump(node);

TAILQ_REMOVE(&dev->iotlb_list, node, next);
vhost_user_iotlb_pool_put(dev, node);
@@ -223,7 +220,7 @@ vhost_user_iotlb_cache_random_evict(struct virtio_net
*dev)
if (!entry_idx) {
struct vhost_iotlb_entry *next_node =
RTE_TAILQ_NEXT(node, next);

-   vhost_user_iotlb_clear_dump(dev, node, prev_node,
next_node);
+   vhost_user_iotlb_clear_dump(node, prev_node, next_node);

TAILQ_REMOVE(&dev->iotlb_list, node, next);
vhost_user_iotlb_pool_put(dev, node);
@@ -239,7 +236,7 @@ vhost_user_iotlb_cache_random_evict(st

Re: [RFC PATCH] ring: adding TPAUSE instruction to ring dequeue

2023-05-03 Thread Stephen Hemminger
On Wed, 3 May 2023 15:32:27 +0200
Morten Brørup  wrote:

> > From: David Coyle [mailto:david.co...@intel.com]
> > Sent: Wednesday, 3 May 2023 13.39
> > 
> > This is NOT for upstreaming. This is being submitted to allow early
> > comparison testing with the preferred solution, which will add TAPUSE
> > power management support to the ring library through the addition of
> > callbacks. Initial stages of the preferred solution are available at
> > http://dpdk.org/patch/125454.
> > 
> > This patch adds functionality directly to rte_ring_dequeue functions to
> > monitor the empty reads of the ring. When a configurable number of
> > empty reads is reached, a TPAUSE instruction is triggered by using
> > rte_power_pause() on supported architectures. rte_pause() is used on
> > other architectures. The functionality can be included or excluded at
> > compilation time using the RTE_RING_PMGMT flag. If included, the new
> > API can be used to enable/disable the feature on a per-ring basis.
> > Other related settings can also be configured using the API.  
> 
> I don't understand why DPDK developers keep spending time on trying to invent 
> methods to determine application busyness based on entry/exit points in a 
> variety of libraries, when the application is in a much better position to 
> determine busyness. All of these "busyness measuring" library extensions have 
> their own specific assumptions and weird limitations.
> 
> I do understand that the goal is power saving, which certainly is relevant! I 
> only criticize the measuring methods.
> 
> For reference, we implemented something very simple in our application 
> framework:
> 1. When each pipeline stage has completed a burst, it reports if it was busy 
> or not.
> 2. If the pipeline busyness is low, we take a nap to save some power.
> 
> And here is the magic twist to this simple algorithm:
> 3. A pipeline stage is not considered busy unless it processed a full burst, 
> and is ready to process more packets immediately. This interpretation of 
> busyness has a significant impact on the percentage of time spent napping 
> during the low-traffic hours.
> 
> This algorithm was very quickly implemented. It might not be perfect, and we 
> do intend to improve it (also to determine CPU Utilization on a scale that 
> the end user can translate to a linear interpretation of how busy the system 
> is). But I seriously doubt that any of the proposed "busyness measuring" 
> library extensions are any better.
> 
> So: The application knows better, please spend your precious time on 
> something useful instead.

Agree, with Morten. This is not the place to add more code.
Does this have an applicable use case to common DPDK applications like OVS, 
VPP, or even Contrail?
Or is it some intern experiment kind of thing.

The existing l3fwd is an example of how to do PM in application. Switching to 
interrupt mode lets
the kernel help out. And the kernel will no how to handle multicore etc.


> @David, my outburst is not directed at you specifically. Generally, I do 
> appreciate experimenting as a good way of obtaining knowledge. So thank you 
> for sharing your experiments with this audience!
> 
> PS: If cruft can be disabled at build time, I generally don't oppose to it.

Cruft increases test coverage surface, creates technical debt, and makes Linux 
distro maintainers upset.


Re: [PATCH v3] app/mldev: add internal function for file read

2023-05-03 Thread Stephen Hemminger
On Wed, 3 May 2023 01:56:41 -0700
Srikanth Yalavarthi  wrote:

> Added internal function to read model, input and reference
> files with required error checks. This change fixes the
> unchecked return value and improper use of negative value
> issues reported by coverity scan for file read operations.
> 
> Coverity issue: 383742, 383743
> Fixes: f6661e6d9a3a ("app/mldev: validate model operations")
> Fixes: da6793390596 ("app/mldev: support inference validation")
> 
> Signed-off-by: Srikanth Yalavarthi 
> ---
> v3:
> * Fix incorrect use of rte_free with free
> 
> v2:
> * Replace rte_malloc in ml_read_file with malloc
> 
> v1:
> * Initial patch
> 
>  app/test-mldev/test_common.c   | 59 ++
>  app/test-mldev/test_common.h   |  2 +
>  app/test-mldev/test_inference_common.c | 54 +--
>  app/test-mldev/test_model_common.c | 39 -
>  4 files changed, 90 insertions(+), 64 deletions(-)
> 
> diff --git a/app/test-mldev/test_common.c b/app/test-mldev/test_common.c
> index 016b31c6ba..d8a8e8a448 100644
> --- a/app/test-mldev/test_common.c
> +++ b/app/test-mldev/test_common.c
> @@ -5,12 +5,71 @@
>  #include 
> 
>  #include 
> +#include 
>  #include 
>  #include 
> 
>  #include "ml_common.h"
>  #include "test_common.h"
> 
> +int
> +ml_read_file(char *file, size_t *size, char **buffer)
> +{
> + char *file_buffer = NULL;
> + long file_size = 0;
> + int ret = 0;
> + FILE *fp;
> +
> + fp = fopen(file, "r");
> + if (fp == NULL) {
> + ml_err("Failed to open file: %s\n", file);
> + return -EIO;
> + }
> +
> + if (fseek(fp, 0, SEEK_END) == 0) {
> + file_size = ftell(fp);
> + if (file_size == -1) {
> + ret = -EIO;
> + goto error;
> + }
> +
> + file_buffer = malloc(file_size);
> + if (file_buffer == NULL) {
> + ml_err("Failed to allocate memory: %s\n", file);
> + ret = -ENOMEM;
> + goto error;
> + }
> +
> + if (fseek(fp, 0, SEEK_SET) != 0) {
> + ret = -EIO;
> + goto error;
> + }
> +
> + if (fread(file_buffer, sizeof(char), file_size, fp) != 
> (unsigned long)file_size) {
> + ml_err("Failed to read file : %s\n", file);
> + ret = -EIO;
> + goto error;
> + }
> + fclose(fp);
> + } else {


Granted this is a test program. But why did you ignore my feedback that this
is the slowest way to read a file. Stdio requires extra buffering, use regular 
read() or
better yet mmap().


RE: [EXT] Re: [PATCH v3] app/mldev: add internal function for file read

2023-05-03 Thread Srikanth Yalavarthi
> -Original Message-
> From: Stephen Hemminger 
> Sent: 03 May 2023 20:24
> To: Srikanth Yalavarthi 
> Cc: Anup Prabhu ; dev@dpdk.org; Shivah Shankar
> Shankar Narayan Rao ; Prince Takkar
> 
> Subject: [EXT] Re: [PATCH v3] app/mldev: add internal function for file read
> 
> External Email
> 
> --
> On Wed, 3 May 2023 01:56:41 -0700
> Srikanth Yalavarthi  wrote:
> 
> > Added internal function to read model, input and reference files with
> > required error checks. This change fixes the unchecked return value
> > and improper use of negative value issues reported by coverity scan
> > for file read operations.
> >
> > Coverity issue: 383742, 383743
> > Fixes: f6661e6d9a3a ("app/mldev: validate model operations")
> > Fixes: da6793390596 ("app/mldev: support inference validation")
> >
> > Signed-off-by: Srikanth Yalavarthi 
> > ---
> > v3:
> > * Fix incorrect use of rte_free with free
> >
> > v2:
> > * Replace rte_malloc in ml_read_file with malloc
> >
> > v1:
> > * Initial patch
> >
> >  app/test-mldev/test_common.c   | 59 ++
> >  app/test-mldev/test_common.h   |  2 +
> >  app/test-mldev/test_inference_common.c | 54 +--
> >  app/test-mldev/test_model_common.c | 39 -
> >  4 files changed, 90 insertions(+), 64 deletions(-)
> >
> > diff --git a/app/test-mldev/test_common.c
> > b/app/test-mldev/test_common.c index 016b31c6ba..d8a8e8a448 100644
> > --- a/app/test-mldev/test_common.c
> > +++ b/app/test-mldev/test_common.c
> > @@ -5,12 +5,71 @@
> >  #include 
> >
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >
> >  #include "ml_common.h"
> >  #include "test_common.h"
> >
> > +int
> > +ml_read_file(char *file, size_t *size, char **buffer) {
> > +   char *file_buffer = NULL;
> > +   long file_size = 0;
> > +   int ret = 0;
> > +   FILE *fp;
> > +
> > +   fp = fopen(file, "r");
> > +   if (fp == NULL) {
> > +   ml_err("Failed to open file: %s\n", file);
> > +   return -EIO;
> > +   }
> > +
> > +   if (fseek(fp, 0, SEEK_END) == 0) {
> > +   file_size = ftell(fp);
> > +   if (file_size == -1) {
> > +   ret = -EIO;
> > +   goto error;
> > +   }
> > +
> > +   file_buffer = malloc(file_size);
> > +   if (file_buffer == NULL) {
> > +   ml_err("Failed to allocate memory: %s\n", file);
> > +   ret = -ENOMEM;
> > +   goto error;
> > +   }
> > +
> > +   if (fseek(fp, 0, SEEK_SET) != 0) {
> > +   ret = -EIO;
> > +   goto error;
> > +   }
> > +
> > +   if (fread(file_buffer, sizeof(char), file_size, fp) != (unsigned
> long)file_size) {
> > +   ml_err("Failed to read file : %s\n", file);
> > +   ret = -EIO;
> > +   goto error;
> > +   }
> > +   fclose(fp);
> > +   } else {
> 
> 
> Granted this is a test program. But why did you ignore my feedback that this
> is the slowest way to read a file. Stdio requires extra buffering, use regular
> read() or better yet mmap().

Agree on the improvement, but, considering that this is a test code and these 
operations are done in slow-path, I would prefer to have the implementation 
based on C library calls rather than using system calls.

Also, using system calls may not make this code portable? Though we are not 
supporting this app on platforms other than Linux, as of now.
Pls let me know what you think.

I had shared my additional comments on v2 patch.


Re: [PATCH v2] devtools: allow variable declaration inside for loop

2023-05-03 Thread Thomas Monjalon
03/05/2023 14:19, Morten Brørup:
> > From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> > Sent: Wednesday, 3 May 2023 12.57
> > 
> > On Wed, May 03, 2023 at 11:30:53AM +0100, Ferruh Yigit wrote:
> > > Declaring variable inside for loop is not supported via C89 and it was
> > > checked in checkpatch.sh via commit [1].
> > > But as DPDK supported C standard is becoming C99/C11 [2], declaring
> > > variable inside loop can be allowed.
> > >
> > > [1]
> > > Commit 43e73483a4b8 ("devtools: forbid variable declaration inside
> > for")
> > >
> > > [2]
> > > https://dpdk.org/patch/121912
> > >
> > > Signed-off-by: Ferruh Yigit 
> > > ---
> > > Cc: Bruce Richardson 
> > > Cc: David Marchand 
> > >
> > > v2:
> > >  * Update coding convention too
> > > ---
> > 
> > Acked-by: Bruce Richardson 
> 
> Acked-by: Morten Brørup 
> 
> [...]
> 
> > > @@ -558,6 +558,7 @@ Local Variables
> > >
> > >  * Variables should be declared at the start of a block of code rather
> > than in the middle.
> > 
> > I'd love to see this restriction removed in future too. Having a
> > variable
> > declared on first use in the middle of block I find a far easier way of
> > working as a) it saves scrolling to look for variable definitions and b)
> > it
> > makes it far easier when adding/removing blocks of code e.g. commenting
> > out
> > for testing,  to have all the code together rather than having variables
> > at
> > the top to add/remove also.
> 
> And c) Initializing the variables close to where they are used the first time 
> reduces the risk of initializing them incorrectly. Especially when modifying 
> a block of code, initialization of its variables might be missed if out of 
> sight. (Although this is probably a consequence of "a)".)
> 
> I consider it old style to only declare variables at the start of a block of 
> code, and this style of coding should be considered obsolete.
> 
> If you are really old (like me?), you might remember when function parameters 
> were provided like this:
> 
> int main(argc, argv)
> int argc;
> char *argv[];
> {
>   return(0);
> }
> 
> We have moved on from that to a more modern coding style a long time ago. We 
> should also move on to a more modern coding style regarding variable 
> declarations.

Old men are used to look for variable types at the beginning of functions.
Having only new code adopting a different style may be confusing a little.
Note I'm not against it, just asking for more feedbacks.




Re: [PATCH v2] devtools: allow variable declaration inside for loop

2023-05-03 Thread Tyler Retzlaff
On Wed, May 03, 2023 at 05:01:01PM +0200, Thomas Monjalon wrote:
> 03/05/2023 14:19, Morten Brørup:
> > > From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> > > Sent: Wednesday, 3 May 2023 12.57
> > > 
> > > On Wed, May 03, 2023 at 11:30:53AM +0100, Ferruh Yigit wrote:
> > > > Declaring variable inside for loop is not supported via C89 and it was
> > > > checked in checkpatch.sh via commit [1].
> > > > But as DPDK supported C standard is becoming C99/C11 [2], declaring
> > > > variable inside loop can be allowed.
> > > >
> > > > [1]
> > > > Commit 43e73483a4b8 ("devtools: forbid variable declaration inside
> > > for")
> > > >
> > > > [2]
> > > > https://dpdk.org/patch/121912
> > > >
> > > > Signed-off-by: Ferruh Yigit 
> > > > ---
> > > > Cc: Bruce Richardson 
> > > > Cc: David Marchand 
> > > >
> > > > v2:
> > > >  * Update coding convention too
> > > > ---
> > > 
> > > Acked-by: Bruce Richardson 
> > 
> > Acked-by: Morten Brørup 
> > 
> > [...]
> > 
> > > > @@ -558,6 +558,7 @@ Local Variables
> > > >
> > > >  * Variables should be declared at the start of a block of code rather
> > > than in the middle.
> > > 
> > > I'd love to see this restriction removed in future too. Having a
> > > variable
> > > declared on first use in the middle of block I find a far easier way of
> > > working as a) it saves scrolling to look for variable definitions and b)
> > > it
> > > makes it far easier when adding/removing blocks of code e.g. commenting
> > > out
> > > for testing,  to have all the code together rather than having variables
> > > at
> > > the top to add/remove also.
> > 
> > And c) Initializing the variables close to where they are used the first 
> > time reduces the risk of initializing them incorrectly. Especially when 
> > modifying a block of code, initialization of its variables might be missed 
> > if out of sight. (Although this is probably a consequence of "a)".)
> > 
> > I consider it old style to only declare variables at the start of a block 
> > of code, and this style of coding should be considered obsolete.
> > 
> > If you are really old (like me?), you might remember when function 
> > parameters were provided like this:
> > 
> > int main(argc, argv)
> > int argc;
> > char *argv[];
> > {
> > return(0);
> > }

heh, k&r C

> > 
> > We have moved on from that to a more modern coding style a long time ago. 
> > We should also move on to a more modern coding style regarding variable 
> > declarations.
> 
> Old men are used to look for variable types at the beginning of functions.
> Having only new code adopting a different style may be confusing a little.
> Note I'm not against it, just asking for more feedbacks.

Acked-by: Tyler Retzlaff 

+1 for declare in minimum necessary scope
+1 for declare at first use (enables more use of const)

thank you!


RE: [RFC PATCH 1/5] eventdev: add power monitoring API on event port

2023-05-03 Thread Tummala, Sivaprasad
[AMD Official Use Only - General]

Hi Jerin, 

> -Original Message-
> From: Jerin Jacob 
> Sent: Wednesday, May 3, 2023 1:57 PM
> To: Yigit, Ferruh 
> Cc: Tummala, Sivaprasad ;
> david.h...@intel.com; jer...@marvell.com; harry.van.haa...@intel.com;
> dev@dpdk.org; Pavan Nikhilesh ; McDaniel, Timothy
> ; Shijith Thotton ;
> Hemant Agrawal ; Sachin Saxena
> ; Mattias Rönnblom
> ; Peter Mccarthy
> ; Liang Ma 
> Subject: Re: [RFC PATCH 1/5] eventdev: add power monitoring API on event port
> 
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
> 
> 
> On Wed, May 3, 2023 at 1:44 PM Ferruh Yigit  wrote:
> >
> > On 5/3/2023 8:58 AM, Jerin Jacob wrote:
> > > On Tue, May 2, 2023 at 4:49 PM Ferruh Yigit  wrote:
> > >>
> > >> On 4/25/2023 5:09 AM, Jerin Jacob wrote:
> > >>> On Mon, Apr 24, 2023 at 9:36 PM Ferruh Yigit 
> wrote:
> > 
> >  On 4/19/2023 11:15 AM, Jerin Jacob wrote:
> > > On Wed, Apr 19, 2023 at 3:24 PM Sivaprasad Tummala
> > >  wrote:
> > >>
> > >> A new API to allow power monitoring condition on event port to
> > >> optimize power when no events are arriving on an event port for
> > >> the worker core to process in an eventdev based pipelined 
> > >> application.
> > >>
> > >> Signed-off-by: Sivaprasad Tummala 
> > >> + *
> > >> + * @param dev_id
> > >> + *   Eventdev id
> > >> + * @param port_id
> > >> + *   Eventdev port id
> > >> + * @param pmc
> > >> + *   The pointer to power-optimized monitoring condition structure.
> > >> + *
> > >> + * @return
> > >> + *   - 0: Success.
> > >> + *   -ENOTSUP: Operation not supported.
> > >> + *   -EINVAL: Invalid parameters.
> > >> + *   -ENODEV: Invalid device ID.
> > >> + */
> > >> +__rte_experimental
> > >> +int
> > >> +rte_event_port_get_monitor_addr(uint8_t dev_id, uint8_t port_id,
> > >> +   struct rte_power_monitor_cond *pmc);
> > >
> > > + eventdev driver maintainers
> > >
> > > I think, we don't need to expose this application due to
> > > applications 1)To make applications to be transparent whether power
> saving is enabled or not?
> > > 2)Some HW and Arch already supports power managent in driver and
> > > in HW (Not using  CPU architecture directly)
> > >
> > > If so, that will be translated to following,
> > > a) Add rte_event_port_power_saving_ena_dis(uint8_t dev_id,
> > > uint8_t port_id, bool ena) for controlling power saving in slowpath.
> > > b) Create reusable PMD private function based on the CPU
> > > architecture power saving primitive to cover the PMD don't have
> > > native power saving support.
> > > c)Update rte_event_dequeue_burst() burst of PMD callback to use (b).
> > >
> > >
> > 
> >  Hi Jerin,
> > >>>
> > >>> Hi Ferruh,
> > >>>
> > 
> >  ethdev approach seems applied here.
> > >>>
> > >>> Understands that. But none of the NIC HW supports power management
> > >>> at HW level like eventdev, so that way for what we are doing for
> > >>> ethdev is a correct abstraction for ethdev.
> > >>>
> > >>
> > >> What I understand is there is HW based event manager and SW based
> > >> ones, SW based ones can benefit more from CPU power optimizations,
> > >> for HW event managers if there is not enough benefit they can just
> > >> ignore the feature.
> > >>
> > 
> >  In ethdev, 'rte_event_port_get_monitor_addr()' equivalent is
> >  'rte_eth_get_monitor_addr()'.
> > 
> >  Although 'rte_eth_get_monitor_addr()' is public API, it is
> >  currently only called from Rx/Tx callback functions implemented in the
> power library.
> >  But I assume intention to make it public is to enable users to
> >  implement their own callback functions that has custom algorithm
> >  for the power management.
> > >>>
> > >>> If there is a use case for customizing with own callback, we can provide
> that.
> > >>> Provided NULL is valid with default algorithm.
> > >>>
> > 
> >  And probably same is true for the 'rte_event_port_get_monitor_addr()'.
> > 
> > 
> >  Also instead of implementing power features for withing PMDs,
> >  isn't it better to have a common eventdev layer for it?
> > >>>
> > >>> We can have rte_evetdev_pmd_* APIs as non-public APIs.
> > >>> My only objection is to NOT introduce _monitor_ APIs at eventdev
> > >>> level, Instead, _monitor_ is one way to do it in SW, So we need
> > >>> higher level of abstraction.
> > >>>
> > >>
> > >> I see, this seems a trade off between flexibility and usability. If
> > >> application has access to _monitor_ APIs, they can be more flexible
> > >> to implement their own logic.
> > >
> > > OK.
> > >
> > >>
> > >> Another option can be application provides the policy with an API
> > >> and monitor API used to realize the policy, but for this case it
> > >> can b

[PATCH] build: announce requirement for C11

2023-05-03 Thread Bruce Richardson
Add a deprecation notice informing users that we will require a C11
compiler from 23.11 release onwards. This requirement was agreed by
technical board to enable use of newer C language features, e.g.
standard atomics. [1]

[1] 
http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/

Signed-off-by: Bruce Richardson 
---
 doc/guides/rel_notes/deprecation.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index dcc1ca1696..9a391d2c49 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -11,6 +11,12 @@ here.
 Deprecation Notices
 ---
 
+* C Compiler: From DPDK 23.11 onwards,
+  building DPDK will require a C compiler which supports the C11 standard, or 
later.
+  Please note:
+ - C11 is supported from GCC version 5 onwards, and is the default 
language version in that release
+ - C11 is the default compilation mode in Clang from version 3.6
+
 * kvargs: The function ``rte_kvargs_process`` will get a new parameter
   for returning key match count. It will ease handling of no-match case.
 
-- 
2.39.2



Re: [PATCH 1/1] eal: add tracepoints to track lcores and services

2023-05-03 Thread Mattias Rönnblom
On 2023-05-03 12:14, Van Haaren, Harry wrote:
>> -Original Message-
>> From: Arnaud Fiorini 
>> Sent: Monday, April 24, 2023 4:24 PM
>> To: Thomas Monjalon ; Jerin Jacob ;
>> Sunil Kumar Kori ; Van Haaren, Harry
>> 
>> Cc: dev@dpdk.org; Arnaud Fiorini 
>> Subject: [PATCH 1/1] eal: add tracepoints to track lcores and services
>>
>> The tracepoints added are used to track lcore role and status,
>> as well as service mapping and service runstates. These
>> tracepoints are then used in analyses in Trace Compass.
>>
>> Signed-off-by: Arnaud Fiorini 
> 
> I've had a look at these changes, and don't have any big objections;
> When disabled, the tracepoints add no measurable overhead here, but
> When enabled they cause a ~+50% cycle-cost (eg from ~100 to 150 cycles).
> Running the "service_perf_autotest" prints cycle-costs, so this is easy to 
> check.
> 
> Please add documentation on enabling the traces; today it is hard to 
> identify/find
> an example of enabling tracing on service-cores. Suggest to add it to:
> 1) the commit message, how to enable service cores traces only
> 2) add a section in the service-cores documentation, to enable service-cores 
> only, and link to tracing documentation page for more info.
> 
> +CC Mattias and Honnappa who have expressed specific interest in service-cores
> Performance in the past, any concerns/input Mattias/Honnappa?
> 

Adding tracepoints to DPDK services seems like a great idea to me.

> @Arnaud, assuming no concerns from wider DPDK community, I'm happy to Ack a 
> v2 with docs added.
> 
> 
>> ---
>>   .mailmap |  1 +
>>   lib/eal/common/eal_common_thread.c   |  4 ++
>>   lib/eal/common/eal_common_trace_points.c | 21 +
>>   lib/eal/common/rte_service.c | 18 ++-
>>   lib/eal/include/eal_trace_internal.h | 60 
>>   5 files changed, 103 insertions(+), 1 deletion(-)
>>
>> diff --git a/.mailmap b/.mailmap
>> index dc30369117..2a0b132572 100644
>> --- a/.mailmap
>> +++ b/.mailmap
>> @@ -120,6 +120,7 @@ Archana Muniganti 
>> 
>>   Archit Pandey 
>>   Arkadiusz Kubalewski 
>>   Arkadiusz Kusztal 
>> +Arnaud Fiorini 
>>   Arnon Warshavsky 
>>   Arshdeep Kaur 
>>   Artem V. Andreev 
>> diff --git a/lib/eal/common/eal_common_thread.c
>> b/lib/eal/common/eal_common_thread.c
>> index 079a385630..25dbdd68e3 100644
>> --- a/lib/eal/common/eal_common_thread.c
>> +++ b/lib/eal/common/eal_common_thread.c
>> @@ -205,6 +205,8 @@ eal_thread_loop(void *arg)
>>  __ATOMIC_ACQUIRE)) == NULL)
>>  rte_pause();
>>
>> +rte_eal_trace_thread_lcore_running(lcore_id, f);
>> +
>>  /* call the function and store the return value */
>>  fct_arg = lcore_config[lcore_id].arg;
>>  ret = f(fct_arg);
>> @@ -219,6 +221,8 @@ eal_thread_loop(void *arg)
>>   */
>>  __atomic_store_n(&lcore_config[lcore_id].state, WAIT,
>>  __ATOMIC_RELEASE);
>> +
>> +rte_eal_trace_thread_lcore_stopped(lcore_id);
>>  }
>>
>>  /* never reached */
>> diff --git a/lib/eal/common/eal_common_trace_points.c
>> b/lib/eal/common/eal_common_trace_points.c
>> index 3f5bf5c55c..0f1240ea3a 100644
>> --- a/lib/eal/common/eal_common_trace_points.c
>> +++ b/lib/eal/common/eal_common_trace_points.c
>> @@ -70,6 +70,27 @@
>> RTE_TRACE_POINT_REGISTER(rte_eal_trace_thread_remote_launch,
>>  lib.eal.thread.remote.launch)
>>   RTE_TRACE_POINT_REGISTER(rte_eal_trace_thread_lcore_ready,
>>  lib.eal.thread.lcore.ready)
>> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_thread_lcore_running,
>> +lib.eal.thread.lcore.running)
>> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_thread_lcore_stopped,
>> +lib.eal.thread.lcore.stopped)
>> +
>> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_map_lcore,
>> +lib.eal.service.map.lcore)
>> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_lcore_state_change,
>> +lib.eal.service.lcore.state.change)
>> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_lcore_start,
>> +lib.eal.service.lcore.start)
>> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_lcore_stop,
>> +lib.eal.service.lcore.stop)
>> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_run_begin,
>> +lib.eal.service.run.begin)
>> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_runstate_set,
>> +lib.eal.service.run.state.set)
>> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_run_end,
>> +lib.eal.service.run.end)
>> +RTE_TRACE_POINT_REGISTER(rte_eal_trace_service_component_register,
>> +lib.eal.service.component.register)
>>
>>   RTE_TRACE_POINT_REGISTER(rte_eal_trace_intr_callback_register,
>>  lib.eal.intr.register)
>> diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
>> index 42ca1d001d..5daec007aa 100644
>> --- a/lib/eal/common/rte_service.c
>> +++ b/lib/eal/common/rte_service.c
>> @@ -9,6 +9,7 @@
>>   #include 
>>   #include 
>>
>> +#include 
>>   #include

RE: [RFC PATCH] ring: adding TPAUSE instruction to ring dequeue

2023-05-03 Thread Coyle, David
Hi Morten

> -Original Message-
> From: Morten Brørup 
> 
> > From: David Coyle [mailto:david.co...@intel.com]
> > Sent: Wednesday, 3 May 2023 13.39
> >
> > This is NOT for upstreaming. This is being submitted to allow early
> > comparison testing with the preferred solution, which will add TAPUSE
> > power management support to the ring library through the addition of
> > callbacks. Initial stages of the preferred solution are available at
> > http://dpdk.org/patch/125454.
> >
> > This patch adds functionality directly to rte_ring_dequeue functions
> > to monitor the empty reads of the ring. When a configurable number of
> > empty reads is reached, a TPAUSE instruction is triggered by using
> > rte_power_pause() on supported architectures. rte_pause() is used on
> > other architectures. The functionality can be included or excluded at
> > compilation time using the RTE_RING_PMGMT flag. If included, the new
> > API can be used to enable/disable the feature on a per-ring basis.
> > Other related settings can also be configured using the API.
> 
> I don't understand why DPDK developers keep spending time on trying to
> invent methods to determine application busyness based on entry/exit
> points in a variety of libraries, when the application is in a much better
> position to determine busyness. All of these "busyness measuring" library
> extensions have their own specific assumptions and weird limitations.
> 
> I do understand that the goal is power saving, which certainly is relevant! I
> only criticize the measuring methods.
> 
> For reference, we implemented something very simple in our application
> framework:
> 1. When each pipeline stage has completed a burst, it reports if it was busy 
> or
> not.
> 2. If the pipeline busyness is low, we take a nap to save some power.
> 
> And here is the magic twist to this simple algorithm:
> 3. A pipeline stage is not considered busy unless it processed a full burst, 
> and
> is ready to process more packets immediately. This interpretation of
> busyness has a significant impact on the percentage of time spent napping
> during the low-traffic hours.
> 
> This algorithm was very quickly implemented. It might not be perfect, and we
> do intend to improve it (also to determine CPU Utilization on a scale that the
> end user can translate to a linear interpretation of how busy the system is).
> But I seriously doubt that any of the proposed "busyness measuring" library
> extensions are any better.
> 
> So: The application knows better, please spend your precious time on
> something useful instead.
> 
> @David, my outburst is not directed at you specifically. Generally, I do
> appreciate experimenting as a good way of obtaining knowledge. So thank
> you for sharing your experiments with this audience!
> 
> PS: If cruft can be disabled at build time, I generally don't oppose to it.

[DC] Appreciate that feedback, and it is certainly another way of looking at
and tackling the problem that we are ultimately trying to solve (i.e power
saving)

The problem however is that we work with a large number of ISVs and operators,
each with their own workload architecture and implementation. That means we
would have to work individually with each of these to integrate this type of
pipeline-stage-busyness algorithm into their applications. And as these
applications are usually commercial, non-open-source applications, that could
prove to be very difficult.

Also most ISVs and operators don't want to have to worry about changing their
application, especially their fast-path dataplane, in order to get power
savings. They prefer for it to just happen without them caring about the finer
details.

For these reasons, consolidating the busyness algorithms down into the DPDK
libraries and PMDs is currently the preferred solution. As you say though, the
libraries and PMDs may not be in the best position to determine the busyness
of the pipeline, but it provides a good balance between achieving power savings
and ease of adoption.

It's also worth calling out again that this patch is only to allow early
testing by some customers of the benefit of adding TPAUSE support to the ring
library. We don't intend on this patch being upstreamed. The preferred longer
term solution is to use callbacks from the ring library to initiate the pause
(either via the DPDK power management API or through functions that an ISV
may write themselves). This is mentioned in the commit message.

Also, the pipeline stage busyness algorithm that you have added to your
pipeline - have you ever considered implementing this into DPDK as a generic
type library. This could certainly be of benefit to other DPDK application
developers, and having this mechanism in DPDK could again ease the adoption
and realisation of power savings for others. I understand though if this is your
own secret sauce and you want to keep it like that :)

David


Re: [PATCH] build: announce requirement for C11

2023-05-03 Thread Tyler Retzlaff
On Wed, May 03, 2023 at 04:14:13PM +0100, Bruce Richardson wrote:
> Add a deprecation notice informing users that we will require a C11
> compiler from 23.11 release onwards. This requirement was agreed by
> technical board to enable use of newer C language features, e.g.
> standard atomics. [1]
> 
> [1] 
> http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/
> 
> Signed-off-by: Bruce Richardson 
> ---
>  doc/guides/rel_notes/deprecation.rst | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index dcc1ca1696..9a391d2c49 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -11,6 +11,12 @@ here.
>  Deprecation Notices
>  ---
>  
> +* C Compiler: From DPDK 23.11 onwards,
> +  building DPDK will require a C compiler which supports the C11 standard, 
> or later.
> +  Please note:
> + - C11 is supported from GCC version 5 onwards, and is the default 
> language version in that release
> + - C11 is the default compilation mode in Clang from version 3.6

suggest adding an additional qualification that

  C11 conformant compiler including support for optional standard atomics

  does NOT #define __STDC_NO_ATOMICS__ 1

  which requires providing the stdatomic.h header and feature. this
  shouldn't be contentious since both gcc and clang have support.
  
Acked-by: Tyler Retzlaff 

> +
>  * kvargs: The function ``rte_kvargs_process`` will get a new parameter
>for returning key match count. It will ease handling of no-match case.
>  
> -- 
> 2.39.2


Re: [PATCH] build: announce requirement for C11

2023-05-03 Thread Bruce Richardson
On Wed, May 03, 2023 at 08:35:44AM -0700, Tyler Retzlaff wrote:
> On Wed, May 03, 2023 at 04:14:13PM +0100, Bruce Richardson wrote:
> > Add a deprecation notice informing users that we will require a C11
> > compiler from 23.11 release onwards. This requirement was agreed by
> > technical board to enable use of newer C language features, e.g.
> > standard atomics. [1]
> > 
> > [1] 
> > http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/
> > 
> > Signed-off-by: Bruce Richardson 
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/doc/guides/rel_notes/deprecation.rst 
> > b/doc/guides/rel_notes/deprecation.rst
> > index dcc1ca1696..9a391d2c49 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -11,6 +11,12 @@ here.
> >  Deprecation Notices
> >  ---
> >  
> > +* C Compiler: From DPDK 23.11 onwards,
> > +  building DPDK will require a C compiler which supports the C11 standard, 
> > or later.
> > +  Please note:
> > + - C11 is supported from GCC version 5 onwards, and is the default 
> > language version in that release
> > + - C11 is the default compilation mode in Clang from version 3.6
> 
> suggest adding an additional qualification that
> 
>   C11 conformant compiler including support for optional standard atomics
> 
>   does NOT #define __STDC_NO_ATOMICS__ 1
> 
>   which requires providing the stdatomic.h header and feature. this
>   shouldn't be contentious since both gcc and clang have support.
>   

Agree, that is good to clarify. I'll wait a while for more feedback and
then do a V2.

> Acked-by: Tyler Retzlaff 
> 
> > +
> >  * kvargs: The function ``rte_kvargs_process`` will get a new parameter
> >for returning key match count. It will ease handling of no-match case.
> >  
> > -- 
> > 2.39.2


Re: [PATCH] build: announce requirement for C11

2023-05-03 Thread Ferruh Yigit
On 5/3/2023 4:14 PM, Bruce Richardson wrote:
> Add a deprecation notice informing users that we will require a C11
> compiler from 23.11 release onwards. This requirement was agreed by
> technical board to enable use of newer C language features, e.g.
> standard atomics. [1]
> 
> [1] 
> http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/
> 
> Signed-off-by: Bruce Richardson 
> ---
>  doc/guides/rel_notes/deprecation.rst | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index dcc1ca1696..9a391d2c49 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -11,6 +11,12 @@ here.
>  Deprecation Notices
>  ---
>  
> +* C Compiler: From DPDK 23.11 onwards,
> +  building DPDK will require a C compiler which supports the C11 standard, 
> or later.
> +  Please note:
> + - C11 is supported from GCC version 5 onwards, and is the default 
> language version in that release
> + - C11 is the default compilation mode in Clang from version 3.6
> +
>  * kvargs: The function ``rte_kvargs_process`` will get a new parameter
>for returning key match count. It will ease handling of no-match case.
>  

This only applies to DPDK internals, right?
Application linked with DPDK library won't have this requirement,
meaning DPDK public headers won't rely on C99 and C11 features.

Although this is a deprecation notice for DPDK, if above is correct,
does it make sense to highlight it to not confuse users?


Re: [PATCH] build: announce requirement for C11

2023-05-03 Thread Tyler Retzlaff
On Wed, May 03, 2023 at 04:39:14PM +0100, Ferruh Yigit wrote:
> On 5/3/2023 4:14 PM, Bruce Richardson wrote:
> > Add a deprecation notice informing users that we will require a C11
> > compiler from 23.11 release onwards. This requirement was agreed by
> > technical board to enable use of newer C language features, e.g.
> > standard atomics. [1]
> > 
> > [1] 
> > http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/
> > 
> > Signed-off-by: Bruce Richardson 
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/doc/guides/rel_notes/deprecation.rst 
> > b/doc/guides/rel_notes/deprecation.rst
> > index dcc1ca1696..9a391d2c49 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -11,6 +11,12 @@ here.
> >  Deprecation Notices
> >  ---
> >  
> > +* C Compiler: From DPDK 23.11 onwards,
> > +  building DPDK will require a C compiler which supports the C11 standard, 
> > or later.
> > +  Please note:
> > + - C11 is supported from GCC version 5 onwards, and is the default 
> > language version in that release
> > + - C11 is the default compilation mode in Clang from version 3.6
> > +
> >  * kvargs: The function ``rte_kvargs_process`` will get a new parameter
> >for returning key match count. It will ease handling of no-match case.
> >  
> 
> This only applies to DPDK internals, right?
> Application linked with DPDK library won't have this requirement,
> meaning DPDK public headers won't rely on C99 and C11 features.
> 
> Although this is a deprecation notice for DPDK, if above is correct,
> does it make sense to highlight it to not confuse users?

it applies to applications as well because dpdk exposes integers
expected to be atomic via headers and inline functions use atomic
functions both internal/external.


Re: [PATCH] build: announce requirement for C11

2023-05-03 Thread Bruce Richardson
On Wed, May 03, 2023 at 04:39:14PM +0100, Ferruh Yigit wrote:
> On 5/3/2023 4:14 PM, Bruce Richardson wrote:
> > Add a deprecation notice informing users that we will require a C11
> > compiler from 23.11 release onwards. This requirement was agreed by
> > technical board to enable use of newer C language features, e.g.
> > standard atomics. [1]
> > 
> > [1] 
> > http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/
> > 
> > Signed-off-by: Bruce Richardson 
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/doc/guides/rel_notes/deprecation.rst 
> > b/doc/guides/rel_notes/deprecation.rst
> > index dcc1ca1696..9a391d2c49 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -11,6 +11,12 @@ here.
> >  Deprecation Notices
> >  ---
> >  
> > +* C Compiler: From DPDK 23.11 onwards,
> > +  building DPDK will require a C compiler which supports the C11 standard, 
> > or later.
> > +  Please note:
> > + - C11 is supported from GCC version 5 onwards, and is the default 
> > language version in that release
> > + - C11 is the default compilation mode in Clang from version 3.6
> > +
> >  * kvargs: The function ``rte_kvargs_process`` will get a new parameter
> >for returning key match count. It will ease handling of no-match case.
> >  
> 
> This only applies to DPDK internals, right?
> Application linked with DPDK library won't have this requirement,
> meaning DPDK public headers won't rely on C99 and C11 features.
>
No, AFAIK, that is not correct. Originally I had thought that that would be
the case - hence the special-case tests for the headers in my previous C99
patch - but the consensus at the DPDK techboard was that we won't require
all headers to remain C89 compatible.

Originally, I was unsure about this, but now I agree with this position, on
the basis that since GCC 5, unless you have been explicitly requesting an
older standard, the compiler is using C11 rules. Therefore, everyone using
these later GCC versions is already using C11+.

/Bruce 


Re: [PATCH] build: announce requirement for C11

2023-05-03 Thread Tyler Retzlaff
On Wed, May 03, 2023 at 04:57:40PM +0100, Bruce Richardson wrote:
> On Wed, May 03, 2023 at 04:39:14PM +0100, Ferruh Yigit wrote:
> > On 5/3/2023 4:14 PM, Bruce Richardson wrote:
> > > Add a deprecation notice informing users that we will require a C11
> > > compiler from 23.11 release onwards. This requirement was agreed by
> > > technical board to enable use of newer C language features, e.g.
> > > standard atomics. [1]
> > > 
> > > [1] 
> > > http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/
> > > 
> > > Signed-off-by: Bruce Richardson 
> > > ---
> > >  doc/guides/rel_notes/deprecation.rst | 6 ++
> > >  1 file changed, 6 insertions(+)
> > > 
> > > diff --git a/doc/guides/rel_notes/deprecation.rst 
> > > b/doc/guides/rel_notes/deprecation.rst
> > > index dcc1ca1696..9a391d2c49 100644
> > > --- a/doc/guides/rel_notes/deprecation.rst
> > > +++ b/doc/guides/rel_notes/deprecation.rst
> > > @@ -11,6 +11,12 @@ here.
> > >  Deprecation Notices
> > >  ---
> > >  
> > > +* C Compiler: From DPDK 23.11 onwards,
> > > +  building DPDK will require a C compiler which supports the C11 
> > > standard, or later.
> > > +  Please note:
> > > + - C11 is supported from GCC version 5 onwards, and is the default 
> > > language version in that release
> > > + - C11 is the default compilation mode in Clang from version 3.6
> > > +
> > >  * kvargs: The function ``rte_kvargs_process`` will get a new parameter
> > >for returning key match count. It will ease handling of no-match case.
> > >  
> > 
> > This only applies to DPDK internals, right?
> > Application linked with DPDK library won't have this requirement,
> > meaning DPDK public headers won't rely on C99 and C11 features.
> >
> No, AFAIK, that is not correct. Originally I had thought that that would be
> the case - hence the special-case tests for the headers in my previous C99
> patch - but the consensus at the DPDK techboard was that we won't require
> all headers to remain C89 compatible.
> 

this is my recollection as well, everything is being aligned to C11,
which is why we are waiting for 23.11 release before the changes can
come in.

> Originally, I was unsure about this, but now I agree with this position, on
> the basis that since GCC 5, unless you have been explicitly requesting an
> older standard, the compiler is using C11 rules. Therefore, everyone using
> these later GCC versions is already using C11+.

> 
> /Bruce 


Re: DPDK 22.11 Troubleshooting

2023-05-03 Thread Bruce Richardson
On Wed, May 03, 2023 at 04:22:05PM +, Gilbert Carrillo wrote:
> Hi Bruce, 
> 
> Thank you for the response.
> 
> There is no errors when I run the makefile, however I do see a difference in 
> the programs. I don't believe the makefile is linking all the libraries 
> together as intended.
> 
> For example, when I run the ethtool sample program and compile it using 
> meson, it works fine and rte_eth_dev_count_avail() returns the correct 
> amount. However, when I compile ethtool with the makefile and run it 
> rte_eth_dev_count_avail() returns 0.
> 

Note that by default the meson build will statically link the examples,
while the makefile will dynamically load the drivers at runtime. That may
explain the difference. Can you try building and running using "make
static" rather than just "make".

/Bruce


Re: [PATCH] build: announce requirement for C11

2023-05-03 Thread Ferruh Yigit
On 5/3/2023 4:57 PM, Bruce Richardson wrote:
> On Wed, May 03, 2023 at 04:39:14PM +0100, Ferruh Yigit wrote:
>> On 5/3/2023 4:14 PM, Bruce Richardson wrote:
>>> Add a deprecation notice informing users that we will require a C11
>>> compiler from 23.11 release onwards. This requirement was agreed by
>>> technical board to enable use of newer C language features, e.g.
>>> standard atomics. [1]
>>>
>>> [1] 
>>> http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/
>>>
>>> Signed-off-by: Bruce Richardson 
>>> ---
>>>  doc/guides/rel_notes/deprecation.rst | 6 ++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/doc/guides/rel_notes/deprecation.rst 
>>> b/doc/guides/rel_notes/deprecation.rst
>>> index dcc1ca1696..9a391d2c49 100644
>>> --- a/doc/guides/rel_notes/deprecation.rst
>>> +++ b/doc/guides/rel_notes/deprecation.rst
>>> @@ -11,6 +11,12 @@ here.
>>>  Deprecation Notices
>>>  ---
>>>  
>>> +* C Compiler: From DPDK 23.11 onwards,
>>> +  building DPDK will require a C compiler which supports the C11 standard, 
>>> or later.
>>> +  Please note:
>>> + - C11 is supported from GCC version 5 onwards, and is the default 
>>> language version in that release
>>> + - C11 is the default compilation mode in Clang from version 3.6
>>> +
>>>  * kvargs: The function ``rte_kvargs_process`` will get a new parameter
>>>for returning key match count. It will ease handling of no-match case.
>>>  
>>
>> This only applies to DPDK internals, right?
>> Application linked with DPDK library won't have this requirement,
>> meaning DPDK public headers won't rely on C99 and C11 features.
>>
> No, AFAIK, that is not correct. Originally I had thought that that would be
> the case - hence the special-case tests for the headers in my previous C99
> patch - but the consensus at the DPDK techboard was that we won't require
> all headers to remain C89 compatible.
> 

Ah, thanks for clarification. The previous patch made me think we are
keeping headers C89 compatible, I missed the final decision.

> Originally, I was unsure about this, but now I agree with this position, on
> the basis that since GCC 5, unless you have been explicitly requesting an
> older standard, the compiler is using C11 rules. Therefore, everyone using
> these later GCC versions is already using C11+.
> 

Not just compiler version, application enforcing an old standard also
may cause trouble.

But I can see this is simpler from DPDK perspective.



[RFC PATCH] power: refactor uncore power management interfaces

2023-05-03 Thread Sivaprasad Tummala
From: Sivaprasad Tummala 

currently the uncore power management implementation is vendor specific.
Added new vendor agnostic uncore power interface similar to rte_power
and subsequently will rename specific implementations
("rte_power_intel_uncore") to "power_intel_uncore" along with functions.

Signed-off-by: Sivaprasad Tummala 
---
 lib/power/rte_power_uncore.h | 234 +++
 1 file changed, 234 insertions(+)
 create mode 100644 lib/power/rte_power_uncore.h

diff --git a/lib/power/rte_power_uncore.h b/lib/power/rte_power_uncore.h
new file mode 100644
index 00..196fb9ec01
--- /dev/null
+++ b/lib/power/rte_power_uncore.h
@@ -0,0 +1,234 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ * Copyright(c) 2023 AMD Corporation
+ */
+
+#ifndef RTE_POWER_UNCORE_H
+#define RTE_POWER_UNCORE_H
+
+/**
+ * @file
+ * RTE Uncore Frequency Management
+ */
+
+#include 
+#include "rte_power.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Uncore Power Management Environment */
+enum uncore_power_mgmt_env { UNCORE_PM_ENV_NOT_SET,
+   UNCORE_PM_ENV_INTEL_UNCORE, UNCORE_PM_ENV_AMD_HSMP};
+
+/**
+ * Initialize uncore frequency management for specific die on a package.
+ * It will get the available frequencies and prepare to set new die 
frequencies.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ *  Package number.
+ *  Each physical CPU in a system is referred to as a package.
+ * @param die
+ *  Die number.
+ *  Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+__rte_experimental
+int
+rte_power_uncore_init(unsigned int pkg, unsigned int die);
+
+/**
+ * Exit uncore frequency management on a specific die on a package.
+ * It will restore uncore min and* max values to previous values
+ * before initialization of API.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ *  Package number.
+ *  Each physical CPU in a system is referred to as a package.
+ * @param die
+ *  Die number.
+ *  Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ *  - 0 on success.
+ *  - Negative on error.
+ */
+__rte_experimental
+int
+rte_power_uncore_exit(unsigned int pkg, unsigned int die);
+
+/**
+ * Return the current index of available frequencies of a specific die on a 
package.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ *  Package number.
+ *  Each physical CPU in a system is referred to as a package.
+ * @param die
+ *  Die number.
+ *  Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ *  The current index of available frequencies.
+ *  If error, it will return 'RTE_POWER_INVALID_FREQ_INDEX = (~0)'.
+ */
+typedef uint32_t (*rte_power_get_uncore_freq_t)(unsigned int pkg, unsigned int 
die);
+
+extern rte_power_get_uncore_freq_t rte_power_get_uncore_freq;
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to specified index value.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ *  Package number.
+ *  Each physical CPU in a system is referred to as a package.
+ * @param die
+ *  Die number.
+ *  Each package can have several dies connected together via the uncore mesh.
+ * @param index
+ *  The index of available frequencies.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+typedef int (*rte_power_set_uncore_freq_t)(unsigned int pkg, unsigned int die, 
uint32_t index);
+
+extern rte_power_set_uncore_freq_t rte_power_set_uncore_freq;
+
+/**
+ * Function pointer definition for generic frequency change functions.
+ *
+ * @param pkg
+ *  Package number.
+ *  Each physical CPU in a system is referred to as a package.
+ * @param die
+ *  Die number.
+ *  Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on success without frequency changed.
+ *  - Negative on error.
+ */
+typedef int (*rte_power_uncore_freq_change_t)(unsigned int pkg, unsigned int 
die);
+
+/**
+ * Set minimum and maximum uncore frequency for specified die on a package
+ * to maximum value according to the available frequencies.
+ * It should be protected outside of this function for threadsafe.
+ *
+ * This function should NOT be called in the fast path.
+ *
+ * @param pkg
+ *  Package number.
+ *  Each physical CPU in a system is referred to as a package.
+ * @param die
+ *  Die number.
+ *  Each package can have several dies connected together via the uncore mesh.
+ *
+ * @return
+ *  - 1 on success with frequency changed.
+ *  - 0 on 

Re: DPDK 22.11 Troubleshooting

2023-05-03 Thread Bruce Richardson
On Wed, May 03, 2023 at 04:53:20PM +, Gilbert Carrillo wrote:
> Make static returns an error (see attached).
> 
> v/R,
> Gilbert
> 

JFYI, best practice is to reply below last message when replying in mailing
list. It just makes it easier for readers to follow the conversation as it
flows from top to bottom.

Looks like we are missing libelf in the linker command when using "make
static". May need some investigation.

> -Original Message-
> From: Bruce Richardson  
> Sent: Wednesday, May 3, 2023 10:35 AM
> To: Gilbert Carrillo 
> Cc: dev@dpdk.org
> Subject: Re: DPDK 22.11 Troubleshooting
> 
> On Wed, May 03, 2023 at 04:22:05PM +, Gilbert Carrillo wrote:
> > Hi Bruce,
> > 
> > Thank you for the response.
> > 
> > There is no errors when I run the makefile, however I do see a difference 
> > in the programs. I don't believe the makefile is linking all the libraries 
> > together as intended.
> > 
> > For example, when I run the ethtool sample program and compile it using 
> > meson, it works fine and rte_eth_dev_count_avail() returns the correct 
> > amount. However, when I compile ethtool with the makefile and run it 
> > rte_eth_dev_count_avail() returns 0.
> > 
> 
> Note that by default the meson build will statically link the examples, while 
> the makefile will dynamically load the drivers at runtime. That may explain 
> the difference. Can you try building and running using "make static" rather 
> than just "make".
> 
> /Bruce




Re: DPDK 22.11 Troubleshooting

2023-05-03 Thread Bruce Richardson
On Wed, May 03, 2023 at 04:53:20PM +, Gilbert Carrillo wrote:
> Make static returns an error (see attached).
> 
> v/R,
> Gilbert
>

To help investigate this issue, can you perhaps include the text of the
full build output when you run "make static". On my system I see libelf
listed on the linker flags when I run "make static", and things link
properly. I'm wondering how my setup may differ from yours.

/Bruce
 
> -Original Message-
> From: Bruce Richardson  
> Sent: Wednesday, May 3, 2023 10:35 AM
> To: Gilbert Carrillo 
> Cc: dev@dpdk.org
> Subject: Re: DPDK 22.11 Troubleshooting
> 
> On Wed, May 03, 2023 at 04:22:05PM +, Gilbert Carrillo wrote:
> > Hi Bruce,
> > 
> > Thank you for the response.
> > 
> > There is no errors when I run the makefile, however I do see a difference 
> > in the programs. I don't believe the makefile is linking all the libraries 
> > together as intended.
> > 
> > For example, when I run the ethtool sample program and compile it using 
> > meson, it works fine and rte_eth_dev_count_avail() returns the correct 
> > amount. However, when I compile ethtool with the makefile and run it 
> > rte_eth_dev_count_avail() returns 0.
> > 
> 
> Note that by default the meson build will statically link the examples, while 
> the makefile will dynamically load the drivers at runtime. That may explain 
> the difference. Can you try building and running using "make static" rather 
> than just "make".
> 
> /Bruce




Re: [PATCH v3 next 7/7] net/vmxnet3: update to version 7

2023-05-03 Thread Ferruh Yigit
On 4/28/2023 8:10 AM, Ronak Doshi wrote:
> diff --git a/doc/guides/rel_notes/release_23_07.rst 
> b/doc/guides/rel_notes/release_23_07.rst
> index a9b1293689..907a06cd62 100644
> --- a/doc/guides/rel_notes/release_23_07.rst
> +++ b/doc/guides/rel_notes/release_23_07.rst
> @@ -55,6 +55,12 @@ New Features
>   Also, make sure to start the actual text at the margin.
>   ===
> 
> +   * **Added vmxnet3 version 7 support.**
> +
> + Added support for vmxnet3 version 7 which includes support
> + for uniform passthrough(UPT). The patches also add support
> + for new capability registers, large passthru BAR and some
> + performance enhancements for UPT.

Update added as continuation of section comment, I will fix while merging.


[PATCH v2] build: announce requirement for C11

2023-05-03 Thread Bruce Richardson
Add a deprecation notice informing users that we will require a C11
compiler from 23.11 release onwards. This requirement was agreed by
technical board to enable use of newer C language features, e.g.
standard atomics. [1]

[1] 
http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/

Signed-off-by: Bruce Richardson 
Acked-by: Tyler Retzlaff 

---

V2:
- add requirement for stdatomics
- fix sphinx formatting
---
 doc/guides/rel_notes/deprecation.rst | 9 +
 1 file changed, 9 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index dcc1ca1696..70c6019d26 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -11,6 +11,15 @@ here.
 Deprecation Notices
 ---
 
+* C Compiler: From DPDK 23.11 onwards,
+  building DPDK will require a C compiler which supports the C11 standard,
+  including support for C11 standard atomics.
+
+  Please note:
+
+  * C11 is supported from GCC version 5 onwards, and is the default language 
version in that release
+  * C11 is the default compilation mode in Clang from version 3.6
+
 * kvargs: The function ``rte_kvargs_process`` will get a new parameter
   for returning key match count. It will ease handling of no-match case.
 
-- 
2.39.2



Re: [PATCH v3 next 0/7] net/vmxnet3: upgrade to version 7

2023-05-03 Thread Ferruh Yigit
On 4/28/2023 8:10 AM, Ronak Doshi wrote:
> vmxnet3 emulation has recently added several new features including
> support for uniform passthrough(UPT). To make UPT work vmxnet3 has
> to be enhanced as per the new specification. This patch series
> extends the vmxnet3 driver to leverage these new features.
> 
> Compatibility is maintained using existing vmxnet3 versioning mechanism as
> follows:
> - new features added to vmxnet3 emulation are associated with new vmxnet3
>version viz. vmxnet3 version 7.
> - emulation advertises all the versions it supports to the driver.
> - during initialization, vmxnet3 driver picks the highest version number
> supported by both the emulation and the driver and configures emulation
> to run at that version.
> 
> In particular, following changes are introduced:
> 
> Patch 1:
>   This patch introduces utility macros for vmxnet3 version 7 comparison
>   and updates Copyright information.
> 
> Patch 2:
>   This patch adds new capability registers to fine control enablement of
>   individual features based on emulation and passthrough.
> 
> Patch 3:
>   This patch adds support for large passthrough BAR register.
> 
> Patch 4:
>   This patch introduces new command to set ring buffer sizes to pass this
>   information to the hardware.
> 
> Patch 5:
>   For better performance, hardware has a requirement to limit number of TSO
>   descriptors. This patch adds that support.
> 
> Patch 6:
>   Avoid updating rxprod register when in UPT for performance reasons.
> 
> Patch 7:
>   With all vmxnet3 version 7 changes incorporated in the vmxnet3 driver,
>   with this patch, the driver can configure emulation to run at vmxnet3
>   version 7.
> 
> Changes v2->v3:
> - removed reference to old performance document
> 
> Changes in v2:
> - modified the title to include "net/"
> - addressed checkpatch complaints and some typo in patch commits
> - removed RTE_ETH_DEV_CAPA_PASS_THRU as it was specific to vmxnet3
> - added new features information in release notes
> - updated feature related information in vmxnet3 driver documentation
> 
> Ronak Doshi (7):
>   net/vmxnet3: prepare for version 7 changes
>   net/vmxnet3: add support for capability registers
>   net/vmxnet3: add support for large passthrough BAR register
>   net/vmxnet3: add command to set ring buffer sizes
>   net/vmxnet3: limit number of TXDs used for TSO packet
>   net/vmxnet3: avoid updating rxprod register frequently
>   net/vmxnet3: update to version 7

Series applied to dpdk-next-net/main, thanks.


Re: [PATCH v2] dts: replace pexpect with fabric

2023-05-03 Thread Jeremy Spewock
On Tue, May 2, 2023 at 9:00 AM Juraj Linkeš 
wrote:

> On Fri, Apr 28, 2023 at 9:04 PM Jeremy Spewock 
> wrote:
> >
> >
> >
> > On Mon, Apr 24, 2023 at 9:35 AM Juraj Linkeš 
> wrote:
> >>
> >> Pexpect is not a dedicated SSH connection library while Fabric is. With
> >> Fabric, all SSH-related logic is provided and we can just focus on
> >> what's DTS specific.
> >>
> >> Signed-off-by: Juraj Linkeš 
> >> ---
> >>  doc/guides/tools/dts.rst  |  29 +-
> >>  dts/conf.yaml |   2 +-
> >>  dts/framework/exception.py|  10 +-
> >>  dts/framework/remote_session/linux_session.py |  31 +-
> >>  dts/framework/remote_session/os_session.py|  51 +++-
> >>  dts/framework/remote_session/posix_session.py |  48 +--
> >>  .../remote_session/remote/remote_session.py   |  35 ++-
> >>  .../remote_session/remote/ssh_session.py  | 287 ++
> >>  dts/framework/testbed_model/sut_node.py   |  12 +-
> >>  dts/framework/utils.py|   9 -
> >>  dts/poetry.lock   | 161 --
> >>  dts/pyproject.toml|   2 +-
> >>  12 files changed, 376 insertions(+), 301 deletions(-)
> >>
> >> diff --git a/doc/guides/tools/dts.rst b/doc/guides/tools/dts.rst
> >> index ebd6dceb6a..d15826c098 100644
> >> --- a/doc/guides/tools/dts.rst
> >> +++ b/doc/guides/tools/dts.rst
> >> @@ -95,9 +95,14 @@ Setting up DTS environment
> >>
> >>  #. **SSH Connection**
> >>
> >> -   DTS uses Python pexpect for SSH connections between DTS environment
> and the other hosts.
> >> -   The pexpect implementation is a wrapper around the ssh command in
> the DTS environment.
> >> -   This means it'll use the SSH agent providing the ssh command and
> its keys.
> >> +   DTS uses the Fabric Python library for SSH connections between DTS
> environment
> >> +   and the other hosts.
> >> +   The authentication method used is pubkey authentication.
> >> +   Fabric tries to use a passed key/certificate,
> >> +   then any key it can with through an SSH agent,
> >> +   then any "id_rsa", "id_dsa" or "id_ecdsa" key discoverable in
> ``~/.ssh/``
> >> +   (with any matching OpenSSH-style certificates).
> >> +   DTS doesn't pass any keys, so Fabric tries to use the other two
> methods.
> >>
> >>
> >>  Setting up System Under Test
> >> @@ -132,6 +137,21 @@ There are two areas that need to be set up on a
> System Under Test:
> >>   It's possible to use the hugepage configuration already present
> on the SUT.
> >>   If you wish to do so, don't specify the hugepage configuration in
> the DTS config file.
> >>
> >> +#. **User with administrator privileges**
> >> +
> >> +.. _sut_admin_user:
> >> +
> >> +   DTS needs administrator privileges to run DPDK applications (such
> as testpmd) on the SUT.
> >> +   The SUT user must be able run commands in privileged mode without
> asking for password.
> >> +   On most Linux distributions, it's a matter of setting up
> passwordless sudo:
> >> +
> >> +   #. Run ``sudo visudo`` and check that it contains ``%sudo
>  ALL=(ALL:ALL) ALL``.
> >> +
> >> +   #. Add the SUT user to the sudo group with:
> >> +
> >> +   .. code-block:: console
> >> +
> >> +  sudo usermod -aG sudo 
> >>
> >>  Running DTS
> >>  ---
> >> @@ -151,7 +171,8 @@ which is a template that illustrates what can be
> configured in DTS:
> >>   :start-at: executions:
> >>
> >>
> >> -The user must be root or any other user with prompt starting with
> ``#``.
> >> +The user must have :ref:`administrator privileges `
> >> +which don't require password authentication.
> >>  The other fields are mostly self-explanatory
> >>  and documented in more detail in
> ``dts/framework/config/conf_yaml_schema.json``.
> >>
> >> diff --git a/dts/conf.yaml b/dts/conf.yaml
> >> index a9bd8a3ecf..129801d87c 100644
> >> --- a/dts/conf.yaml
> >> +++ b/dts/conf.yaml
> >> @@ -16,7 +16,7 @@ executions:
> >>  nodes:
> >>- name: "SUT 1"
> >>  hostname: sut1.change.me.localhost
> >> -user: root
> >> +user: dtsuser
> >>  arch: x86_64
> >>  os: linux
> >>  lcores: ""
> >> diff --git a/dts/framework/exception.py b/dts/framework/exception.py
> >> index ca353d98fc..44ff4e979a 100644
> >> --- a/dts/framework/exception.py
> >> +++ b/dts/framework/exception.py
> >> @@ -62,13 +62,19 @@ class SSHConnectionError(DTSError):
> >>  """
> >>
> >>  host: str
> >> +errors: list[str]
> >>  severity: ClassVar[ErrorSeverity] = ErrorSeverity.SSH_ERR
> >>
> >> -def __init__(self, host: str):
> >> +def __init__(self, host: str, errors: list[str] | None = None):
> >>  self.host = host
> >> +self.errors = [] if errors is None else errors
> >>
> >>  def __str__(self) -> str:
> >> -return f"Error trying to connect with {self.host}"
> >> +message = f"Error trying to connect with {self.host}."
> >> +if self.errors:
> >> +message += f" Errors encountered wh

Re: [RFC PATCH v1 5/5] dts: add traffic generator node to dts runner

2023-05-03 Thread Jeremy Spewock
Acked-by: Jeremy Spewock 

On Thu, Apr 20, 2023 at 5:51 AM Juraj Linkeš 
wrote:

> Initialize the TG node and do basic verification.
>
> Signed-off-by: Juraj Linkeš 
> ---
>  dts/framework/dts.py| 42 -
>  dts/framework/testbed_model/__init__.py |  1 +
>  2 files changed, 28 insertions(+), 15 deletions(-)
>
> diff --git a/dts/framework/dts.py b/dts/framework/dts.py
> index 0502284580..9c82bfe1f4 100644
> --- a/dts/framework/dts.py
> +++ b/dts/framework/dts.py
> @@ -9,7 +9,7 @@
>  from .logger import DTSLOG, getLogger
>  from .test_result import BuildTargetResult, DTSResult, ExecutionResult,
> Result
>  from .test_suite import get_test_suites
> -from .testbed_model import SutNode
> +from .testbed_model import SutNode, TGNode, Node
>  from .utils import check_dts_python_version
>
>  dts_logger: DTSLOG = getLogger("DTSRunner")
> @@ -27,28 +27,40 @@ def run_all() -> None:
>  # check the python version of the server that run dts
>  check_dts_python_version()
>
> -nodes: dict[str, SutNode] = {}
> +nodes: dict[str, Node] = {}
>  try:
>  # for all Execution sections
>  for execution in CONFIGURATION.executions:
>  sut_node = None
> +tg_node = None
>  if execution.system_under_test.name in nodes:
>  # a Node with the same name already exists
>  sut_node = nodes[execution.system_under_test.name]
> -else:
> -# the SUT has not been initialized yet
> -try:
> +
> +if execution.traffic_generator_system.name in nodes:
> +# a Node with the same name already exists
> +tg_node = nodes[execution.traffic_generator_system.name]
> +
> +try:
> +if not sut_node:
>  sut_node = SutNode(execution.system_under_test)
> -result.update_setup(Result.PASS)
> -except Exception as e:
> -dts_logger.exception(
> -f"Connection to node
> {execution.system_under_test} failed."
> -)
> -result.update_setup(Result.FAIL, e)
> -else:
> -nodes[sut_node.name] = sut_node
> -
> -if sut_node:
> +if not tg_node:
> +tg_node = TGNode(execution.traffic_generator_system)
> +tg_node.verify()
> +result.update_setup(Result.PASS)
> +except Exception as e:
> +failed_node = execution.system_under_test.name
> +if sut_node:
> +failed_node = execution.traffic_generator_system.name
> +dts_logger.exception(
> +f"Creation of node {failed_node} failed."
> +)
> +result.update_setup(Result.FAIL, e)
> +else:
> +nodes[sut_node.name] = sut_node
> +nodes[tg_node.name] = tg_node
> +
> +if sut_node and tg_node:
>  _run_execution(sut_node, execution, result)
>
>  except Exception as e:
> diff --git a/dts/framework/testbed_model/__init__.py
> b/dts/framework/testbed_model/__init__.py
> index f54a947051..5cbb859e47 100644
> --- a/dts/framework/testbed_model/__init__.py
> +++ b/dts/framework/testbed_model/__init__.py
> @@ -20,3 +20,4 @@
>  )
>  from .node import Node
>  from .sut_node import SutNode
> +from .tg_node import TGNode
> --
> 2.30.2
>
>


Re: [EXT] Re: [PATCH v3] app/mldev: add internal function for file read

2023-05-03 Thread Stephen Hemminger
On Wed, 3 May 2023 14:59:40 +
Srikanth Yalavarthi  wrote:

> > 
> > Granted this is a test program. But why did you ignore my feedback that this
> > is the slowest way to read a file. Stdio requires extra buffering, use 
> > regular
> > read() or better yet mmap().  
> 
> Agree on the improvement, but, considering that this is a test code and these 
> operations are done in slow-path, I would prefer to have the implementation 
> based on C library calls rather than using system calls.
> 
> Also, using system calls may not make this code portable? Though we are not 
> supporting this app on platforms other than Linux, as of now.
> Pls let me know what you think.
> 
> I had shared my additional comments on v2 patch.

Using system calls read/write is used lots of places in DPDK already and is 
portable
to all the supported platforms.


RE: [RFC PATCH] ring: adding TPAUSE instruction to ring dequeue

2023-05-03 Thread Morten Brørup
> From: Coyle, David [mailto:david.co...@intel.com]
> Sent: Wednesday, 3 May 2023 17.32
> 
> Hi Morten
> 
> > From: Morten Brørup 
> >
> > > From: David Coyle [mailto:david.co...@intel.com]
> > > Sent: Wednesday, 3 May 2023 13.39
> > >
> > > This is NOT for upstreaming. This is being submitted to allow early
> > > comparison testing with the preferred solution, which will add
> TAPUSE
> > > power management support to the ring library through the addition of
> > > callbacks. Initial stages of the preferred solution are available at
> > > http://dpdk.org/patch/125454.
> > >
> > > This patch adds functionality directly to rte_ring_dequeue functions
> > > to monitor the empty reads of the ring. When a configurable number
> of
> > > empty reads is reached, a TPAUSE instruction is triggered by using
> > > rte_power_pause() on supported architectures. rte_pause() is used on
> > > other architectures. The functionality can be included or excluded
> at
> > > compilation time using the RTE_RING_PMGMT flag. If included, the new
> > > API can be used to enable/disable the feature on a per-ring basis.
> > > Other related settings can also be configured using the API.
> >
> > I don't understand why DPDK developers keep spending time on trying to
> > invent methods to determine application busyness based on entry/exit
> > points in a variety of libraries, when the application is in a much
> better
> > position to determine busyness. All of these "busyness measuring"
> library
> > extensions have their own specific assumptions and weird limitations.
> >
> > I do understand that the goal is power saving, which certainly is
> relevant! I
> > only criticize the measuring methods.
> >
> > For reference, we implemented something very simple in our application
> > framework:
> > 1. When each pipeline stage has completed a burst, it reports if it
> was busy or
> > not.
> > 2. If the pipeline busyness is low, we take a nap to save some power.
> >
> > And here is the magic twist to this simple algorithm:
> > 3. A pipeline stage is not considered busy unless it processed a full
> burst, and
> > is ready to process more packets immediately. This interpretation of
> > busyness has a significant impact on the percentage of time spent
> napping
> > during the low-traffic hours.
> >
> > This algorithm was very quickly implemented. It might not be perfect,
> and we
> > do intend to improve it (also to determine CPU Utilization on a scale
> that the
> > end user can translate to a linear interpretation of how busy the
> system is).
> > But I seriously doubt that any of the proposed "busyness measuring"
> library
> > extensions are any better.
> >
> > So: The application knows better, please spend your precious time on
> > something useful instead.
> >
> > @David, my outburst is not directed at you specifically. Generally, I
> do
> > appreciate experimenting as a good way of obtaining knowledge. So
> thank
> > you for sharing your experiments with this audience!
> >
> > PS: If cruft can be disabled at build time, I generally don't oppose
> to it.
> 
> [DC] Appreciate that feedback, and it is certainly another way of
> looking at
> and tackling the problem that we are ultimately trying to solve (i.e
> power
> saving)
> 
> The problem however is that we work with a large number of ISVs and
> operators,
> each with their own workload architecture and implementation. That means
> we
> would have to work individually with each of these to integrate this
> type of
> pipeline-stage-busyness algorithm into their applications. And as these
> applications are usually commercial, non-open-source applications, that
> could
> prove to be very difficult.
> 
> Also most ISVs and operators don't want to have to worry about changing
> their
> application, especially their fast-path dataplane, in order to get power
> savings. They prefer for it to just happen without them caring about the
> finer
> details.
> 
> For these reasons, consolidating the busyness algorithms down into the
> DPDK
> libraries and PMDs is currently the preferred solution. As you say
> though, the
> libraries and PMDs may not be in the best position to determine the
> busyness
> of the pipeline, but it provides a good balance between achieving power
> savings
> and ease of adoption.

Thank you for describing the business logic driving this technical approach. 
Now I get it!

Automagic busyness monitoring and power management would be excellent. But what 
I see on the mailing list is a bunch of incoherent attempts at doing this. (And 
I don't mean your patches, I mean all the patches for automagic power 
management.) And the cost is not insignificant: Pollution of DPDK all over the 
place, in both drivers and libraries.

I would much rather see a top-down approach, so we could all work towards a 
unified solution.

However, I understand that customers are impatient, so I accept that in reality 
we have to live with these weird "code injection" based solutions until 
something sane bec

RE: [PATCH] build: announce requirement for C11

2023-05-03 Thread Morten Brørup
> From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> Sent: Wednesday, 3 May 2023 17.14
> 
> Add a deprecation notice informing users that we will require a C11
> compiler from 23.11 release onwards. This requirement was agreed by
> technical board to enable use of newer C language features, e.g.
> standard atomics. [1]
> 
> [1]
> http://inbox.dpdk.org/dev/DBAPR08MB58148CEC3E1454E8848A938998AB9@DBAPR08
> MB5814.eurprd08.prod.outlook.com/
> 
> Signed-off-by: Bruce Richardson 

Acked-by: Morten Brørup 



RE: [PATCH V3] lib: set/get max memzone segments

2023-05-03 Thread Morten Brørup
> From: Ophir Munk [mailto:ophi...@nvidia.com]
> Sent: Wednesday, 3 May 2023 09.27
> 
> In current DPDK the RTE_MAX_MEMZONE definition is unconditionally hard
> coded as 2560.  For applications requiring different values of this
> parameter – it is more convenient to set the max value via an rte API -
> rather than changing the dpdk source code per application.  In many
> organizations, the possibility to compile a private DPDK library for a
> particular application does not exist at all.  With this option there is
> no need to recompile DPDK and it allows using an in-box packaged DPDK.
> An example usage for updating the RTE_MAX_MEMZONE would be of an
> application that uses the DPDK mempool library which is based on DPDK
> memzone library.  The application may need to create a number of
> steering tables, each of which will require its own mempool allocation.
> This commit is not about how to optimize the application usage of
> mempool nor about how to improve the mempool implementation based on
> memzone.  It is about how to make the max memzone definition - run-time
> customized.
> This commit adds an API which must be called before rte_eal_init():
> rte_memzone_max_set(int max).  If not called, the default memzone
> (RTE_MAX_MEMZONE) is used.  There is also an API to query the effective
> max memzone: rte_memzone_max_get().
> 
> Signed-off-by: Ophir Munk 

I retracted my objection to the RFC, but should also have added:

Acked-by: Morten Brørup 



Re: [dpdk-dev] [PATCH v2] ring: fix use after free in ring release

2023-05-03 Thread Konstantin Ananyev

03/05/2023 06:44, Honnappa Nagarahalli пишет:








After the memzone is freed, it is not removed from the 'rte_ring_tailq'.
If rte_ring_lookup is called at this time, it will cause a
use-after-free problem. This change prevents that from happening.

Fixes: 4e32101f9b01 ("ring: support freeing")
Cc: sta...@dpdk.org

Suggested-by: Honnappa Nagarahalli 
Signed-off-by: Yunjian Wang 
---
v2: update code suggested by Honnappa Nagarahalli
---
   lib/ring/rte_ring.c | 8 +++-
   1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/lib/ring/rte_ring.c b/lib/ring/rte_ring.c index
8ed455043d..2755323b8a 100644
--- a/lib/ring/rte_ring.c
+++ b/lib/ring/rte_ring.c
@@ -333,11 +333,6 @@ rte_ring_free(struct rte_ring *r)
return;
}

-   if (rte_memzone_free(r->memzone) != 0) {
-   RTE_LOG(ERR, RING, "Cannot free memory\n");
-   return;
-   }
-
ring_list = RTE_TAILQ_CAST(rte_ring_tailq.head, rte_ring_list);
rte_mcfg_tailq_write_lock();

@@ -354,6 +349,9 @@ rte_ring_free(struct rte_ring *r)

TAILQ_REMOVE(ring_list, te, next);

+   if (rte_memzone_free(r->memzone) != 0)
+   RTE_LOG(ERR, RING, "Cannot free memory\n");
+


I nit: I think it is a bit better to first release the lock and then
free the memzone.

I think both of our suggestions are contradictory. Any reason why you want

to free outside the locked region?


Don't know what you mean by 'both suggestions' here.

I wrote 'both of our suggestions'. Essentially, in v1, freeing the memzone was 
outside of the lock. I suggested to move it inside and you are suggesting to 
move it inside.



Ah ok, I missed v1 and your comments for it.
As I said before, I don't think that we need to hold qlock here
while calling mmezone_free().
Though I don't see any harm with it either.
I'd personally would move memzone_free() after releasing qlock,
but if you guys prefer to keep it as it is - I wouldn't insist.




I think I gave only one - move memzone_free() after tailq_write_unlock().
To be more precise:
1) rte_mcfg_tailq_write_lock();
...
2) TAILQ_REMOVE(...);
3) rte_mcfg_tailq_write_unlock();
4) rte_memzone_free(r->memzone);

As I remember, memzones are protected by their own lock (mlock), so we
don't need to hold qlock to free a memzone, after ring was already removed
from the ring_list.



I thought, since it belongs to the ring being freed, it makes sense to free it

while holding the lock to avoid any race conditions (though, I have not
checked what those are).


As I understand, it is ok with current design to grab mlock while holding qlock.
So, there is nothing wrong with current patch, I just think that in that case 
it is
excessive, and can be safely avoided.




Apart from that, LGTM.
Acked-by: Konstantin Ananyev 


rte_mcfg_tailq_write_unlock();

rte_free(te);
--
2.33.0










Re: [EXT] Re: [PATCH v3] app/mldev: add internal function for file read

2023-05-03 Thread Tyler Retzlaff
On Wed, May 03, 2023 at 11:28:26AM -0700, Stephen Hemminger wrote:
> On Wed, 3 May 2023 14:59:40 +
> Srikanth Yalavarthi  wrote:
> 
> > > 
> > > Granted this is a test program. But why did you ignore my feedback that 
> > > this
> > > is the slowest way to read a file. Stdio requires extra buffering, use 
> > > regular
> > > read() or better yet mmap().  
> > 
> > Agree on the improvement, but, considering that this is a test code and 
> > these operations are done in slow-path, I would prefer to have the 
> > implementation based on C library calls rather than using system calls.
> > 
> > Also, using system calls may not make this code portable? Though we are not 
> > supporting this app on platforms other than Linux, as of now.
> > Pls let me know what you think.
> > 
> > I had shared my additional comments on v2 patch.
> 
> Using system calls read/write is used lots of places in DPDK already and is 
> portable
> to all the supported platforms.

well almost, the windows standard c library implements a subset of
POSIX.1 (ISO/IEC 9945-1:1996) and there should be a strong emphasis on
`a subset' as in it is not fully conformant to any specific POSIX standard.

also because they aren't technically part of the standard C library
(again POSIX is not standard C) they are exposed with different names on
windows by prepending a leading `_' to the names. so you get `_read' instead
of `read' for example.

you can force exposure of the non-conforming names (i.e. the POSIX
names) with the _CRT_DECLARE_NONSTDC_NAMES define but if you do and you use
them you may then get deprecation warnings.

anyway, i read above nobody cares if this code ever runs on anything but
Linux ~forever so i won't make it my business to comment further unless
there is a desire to include windows.


RE: [dpdk-dev] [PATCH v2] ring: fix use after free in ring release

2023-05-03 Thread Honnappa Nagarahalli


> >
> 
> 
> 
> > After the memzone is freed, it is not removed from the 'rte_ring_tailq'.
> > If rte_ring_lookup is called at this time, it will cause a
> > use-after-free problem. This change prevents that from happening.
> >
> > Fixes: 4e32101f9b01 ("ring: support freeing")
> > Cc: sta...@dpdk.org
> >
> > Suggested-by: Honnappa Nagarahalli
> 
> > Signed-off-by: Yunjian Wang 
> > ---
> > v2: update code suggested by Honnappa Nagarahalli
> > ---
> >lib/ring/rte_ring.c | 8 +++-
> >1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/lib/ring/rte_ring.c b/lib/ring/rte_ring.c index
> > 8ed455043d..2755323b8a 100644
> > --- a/lib/ring/rte_ring.c
> > +++ b/lib/ring/rte_ring.c
> > @@ -333,11 +333,6 @@ rte_ring_free(struct rte_ring *r)
> > return;
> > }
> >
> > -   if (rte_memzone_free(r->memzone) != 0) {
> > -   RTE_LOG(ERR, RING, "Cannot free memory\n");
> > -   return;
> > -   }
> > -
> > ring_list = RTE_TAILQ_CAST(rte_ring_tailq.head,
> rte_ring_list);
> > rte_mcfg_tailq_write_lock();
> >
> > @@ -354,6 +349,9 @@ rte_ring_free(struct rte_ring *r)
> >
> > TAILQ_REMOVE(ring_list, te, next);
> >
> > +   if (rte_memzone_free(r->memzone) != 0)
> > +   RTE_LOG(ERR, RING, "Cannot free memory\n");
> > +
> 
>  I nit: I think it is a bit better to first release the lock and
>  then free the memzone.
> >>> I think both of our suggestions are contradictory. Any reason why
> >>> you want
> >> to free outside the locked region?
> >>
> >>
> >> Don't know what you mean by 'both suggestions' here.
> > I wrote 'both of our suggestions'. Essentially, in v1, freeing the memzone
> was outside of the lock. I suggested to move it inside and you are suggesting
> to move it inside.
> 
> 
> Ah ok, I missed v1 and your comments for it.
> As I said before, I don't think that we need to hold qlock here while calling
> mmezone_free().
> Though I don't see any harm with it either.
> I'd personally would move memzone_free() after releasing qlock, but if you
> guys prefer to keep it as it is - I wouldn't insist.
I looked at other libraries, stack library is the closest. Stack library frees 
the memzone outside the lock. I think we should keep it consistent.
I am fine to move the free outside the lock.

> 
> >
> >> I think I gave only one - move memzone_free() after tailq_write_unlock().
> >> To be more precise:
> >> 1) rte_mcfg_tailq_write_lock();
> >> ...
> >> 2) TAILQ_REMOVE(...);
> >> 3) rte_mcfg_tailq_write_unlock();
> >> 4) rte_memzone_free(r->memzone);
> >>
> >> As I remember, memzones are protected by their own lock (mlock), so
> >> we don't need to hold qlock to free a memzone, after ring was already
> >> removed from the ring_list.
> >>
> >>>
> >>> I thought, since it belongs to the ring being freed, it makes sense
> >>> to free it
> >> while holding the lock to avoid any race conditions (though, I have
> >> not checked what those are).
> >>
> >>
> >> As I understand, it is ok with current design to grab mlock while holding
> qlock.
> >> So, there is nothing wrong with current patch, I just think that in
> >> that case it is excessive, and can be safely avoided.
> >>
> >>>
>  Apart from that, LGTM.
>  Acked-by: Konstantin Ananyev 
> 
> > rte_mcfg_tailq_write_unlock();
> >
> > rte_free(te);
> > --
> > 2.33.0
> 
> >>>
> >



RE: 21.11.4 patches review and test

2023-05-03 Thread Xu, HailinX
> -Original Message-
> From: Kevin Traynor 
> Sent: Tuesday, May 2, 2023 5:35 PM
> To: Xu, HailinX ; sta...@dpdk.org
> Cc: Stokes, Ian ; Mcnamara, John
> ; Luca Boccassi ; Xu, Qian Q
> ; Thomas Monjalon ; Peng,
> Yuan ; Chen, Zhaoyan ;
> dev@dpdk.org
> Subject: Re: 21.11.4 patches review and test
> 
> On 20/04/2023 11:32, Kevin Traynor wrote:
> > On 20/04/2023 03:40, Xu, HailinX wrote:
> >>> -Original Message-
> >>> From: Xu, HailinX 
> >>> Sent: Thursday, April 13, 2023 2:13 PM
> >>> To: Kevin Traynor ; sta...@dpdk.org
> >>> Cc: dev@dpdk.org; Abhishek Marathe
> ;
> >>> Ali Alnubani ; Walker, Benjamin
> >>> ; David Christensen
> >>> ; Hemant Agrawal
> ;
> >>> Stokes, Ian ; Jerin Jacob
> >>> ; Mcnamara, John ;
> >>> Ju-Hyoung Lee ; Luca Boccassi
> >>> ; Pei Zhang ; Xu, Qian Q
> >>> ; Raslan Darawsheh ;
> Thomas
> >>> Monjalon ; yangh...@redhat.com; Peng, Yuan
> >>> ; Chen, Zhaoyan 
> >>> Subject: RE: 21.11.4 patches review and test
> >>>
>  -Original Message-
>  From: Kevin Traynor 
>  Sent: Thursday, April 6, 2023 7:38 PM
>  To: sta...@dpdk.org
>  Cc: dev@dpdk.org; Abhishek Marathe
>  ; Ali Alnubani
>  ; Walker, Benjamin ;
>  David Christensen ; Hemant Agrawal
>  ; Stokes, Ian ; Jerin
>  Jacob ; Mcnamara, John
>  ; Ju-Hyoung Lee ;
>  Kevin Traynor ; Luca Boccassi
>  ; Pei Zhang ; Xu, Qian Q
>  ; Raslan Darawsheh ;
> >>> Thomas
>  Monjalon ; yangh...@redhat.com; Peng, Yuan
>  ; Chen, Zhaoyan 
>  Subject: 21.11.4 patches review and test
> 
>  Hi all,
> 
>  Here is a list of patches targeted for stable release 21.11.4.
> 
>  The planned date for the final release is 25th April.
> 
>  Please help with testing and validation of your use cases and
>  report any issues/results with reply-all to this mail. For the
>  final release the fixes and reported validations will be added to the
> release notes.
> 
>  A release candidate tarball can be found at:
> 
>    https://dpdk.org/browse/dpdk-stable/tag/?id=v21.11.4-rc1
> 
>  These patches are located at branch 21.11 of dpdk-stable repo:
>    https://dpdk.org/browse/dpdk-stable/
> 
>  Thanks.
> 
>  Kevin
> >>>
> >>> HI All,
> >>>
> >>> Update the test status for Intel part. Till now dpdk21.11.4-rc1
> >>> validation test rate is 85%. No critical issue is found.
> >>> 2 new bugs are found, 1 new issue is under confirming by Intel Dev.
> >>> New bugs:   --20.11.8-rc1 also has these two issues
> >>> 1.
> pvp_qemu_multi_paths_port_restart:perf_pvp_qemu_vector_rx_mac:
> >>> performance drop about 23.5% when send small packets
> >>>   https://bugs.dpdk.org/show_bug.cgi?id=1212-- no fix yet
> >>> 2. some of the virtio tests are failing:-- Intel dev is under
> investigating
> >>> # Basic Intel(R) NIC testing
> >>> * Build & CFLAG compile: cover the build test combination with
> >>> latest GCC/Clang version and the popular OS revision such as
> >>> Ubuntu20.04, Ubuntu22.04, Fedora35, Fedora37, RHEL8.6, RHEL8.4,
> >>> FreeBSD13.1, SUSE15, CentOS7.9, etc.
> >>> - All test done. No new dpdk issue is found.
> >>> * PF(i40e, ixgbe): test scenarios including
> >>> RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc.
> >>> - All test done. No new dpdk issue is found.
> >>> * VF(i40e, ixgbe): test scenarios including
> >>> VF-RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc.
> >>> - All test done. No new dpdk issue is found.
> >>> * PF/VF(ice): test scenarios including Switch features/Package
> >>> Management/Flow Director/Advanced Tx/Advanced RSS/ACL/DCF/Flexible
> >>> Descriptor, etc.
> >>> - All test done. No new dpdk issue is found.
> >>> * Intel NIC single core/NIC performance: test scenarios including
> >>> PF/VF single core performance test, etc.
> >>> - All test done. No new dpdk issue is found.
> >>> * IPsec: test scenarios including ipsec/ipsec-gw/ipsec library basic
> >>> test - QAT&SW/FIB library, etc.
> >>> - On going.
> >>>
> >>> # Basic cryptodev and virtio testing
> >>> * Virtio: both function and performance test are covered. Such as
> >>> PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf
> >>> testing/VMAWARE ESXI 8.0, etc.
> >>> - All test done. found bug1.
> >>> * Cryptodev:
> >>> *Function test: test scenarios including Cryptodev API
> >>> testing/CompressDev ISA-L/QAT/ZLIB PMD Testing/FIPS, etc.
> >>>   - Execution rate is 90%. found bug2.
> >>> *Performance test: test scenarios including Thoughput
> >>> Performance/Cryptodev Latency, etc.
> >>>   - All test done. No new dpdk issue is found.
> >>>
> >>> Regards,
> >>> Xu, Hailin
> >> Update the test status for Intel part. completed dpdk21.11.4-rc1 all
> validation. No critical issue is found.
> >
> > Hi. Thanks for testing.
> >
> >> 2 new bugs are found, 1 new issue is under confirming by Intel Dev.
> >> New bugs: --20.11.8-rc1 also has these two issues
> >> 1.
> pvp_q

[PATCH] doc: fix typo in graph lib doc

2023-05-03 Thread Ashwin Sekhar T K
Fix typo in graph lib doc.

Fixes: 4dc6d8e63c168 ("doc: add graph library guide")

Signed-off-by: Ashwin Sekhar T K 
---
 doc/guides/prog_guide/graph_lib.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/graph_lib.rst 
b/doc/guides/prog_guide/graph_lib.rst
index 1cfdc86433..4ab0623f44 100644
--- a/doc/guides/prog_guide/graph_lib.rst
+++ b/doc/guides/prog_guide/graph_lib.rst
@@ -173,7 +173,7 @@ Create the graph object
 ~~~
 Now that the nodes are linked, Its time to create a graph by including
 the required nodes. The application can provide a set of node patterns to
-form a graph object. The ``famish()`` API used underneath for the pattern
+form a graph object. The ``fnmatch()`` API used underneath for the pattern
 matching to include the required nodes. After the graph create any changes to
 nodes or graph is not allowed.
 
-- 
2.25.1



Re: [PATCH] doc: fix typo in graph lib doc

2023-05-03 Thread Jerin Jacob
On Thu, May 4, 2023 at 10:01 AM Ashwin Sekhar T K  wrote:
>
> Fix typo in graph lib doc.
>
> Fixes: 4dc6d8e63c168 ("doc: add graph library guide")
>
> Signed-off-by: Ashwin Sekhar T K 

Acked-by: Jerin Jacob 


> ---
>  doc/guides/prog_guide/graph_lib.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/doc/guides/prog_guide/graph_lib.rst 
> b/doc/guides/prog_guide/graph_lib.rst
> index 1cfdc86433..4ab0623f44 100644
> --- a/doc/guides/prog_guide/graph_lib.rst
> +++ b/doc/guides/prog_guide/graph_lib.rst
> @@ -173,7 +173,7 @@ Create the graph object
>  ~~~
>  Now that the nodes are linked, Its time to create a graph by including
>  the required nodes. The application can provide a set of node patterns to
> -form a graph object. The ``famish()`` API used underneath for the pattern
> +form a graph object. The ``fnmatch()`` API used underneath for the pattern
>  matching to include the required nodes. After the graph create any changes to
>  nodes or graph is not allowed.
>
> --
> 2.25.1
>


RE: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions

2023-05-03 Thread Ruifeng Wang
> -Original Message-
> From: Konstantin Ananyev 
> Sent: Monday, May 1, 2023 9:29 PM
> To: zhou...@loongson.cn
> Cc: dev@dpdk.org; maob...@loongson.cn; qiming.y...@intel.com; 
> wenjun1...@intel.com;
> Ruifeng Wang ; d...@linux.vnet.ibm.com
> Subject: Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx 
> functions
> 
> > Segmentation fault has been observed while running the
> > ixgbe_recv_pkts_lro() function to receive packets on the Loongson
> > 3C5000 processor which has 64 cores and 4 NUMA nodes.
> >
> > From the ixgbe_recv_pkts_lro() function, we found that as long as the
> > first packet has the EOP bit set, and the length of this packet is
> > less than or equal to rxq->crc_len, the segmentation fault will
> > definitely happen even though on the other platforms, such as X86.
> >
> > Because when processd the first packet the first_seg->next will be
> > NULL, if at the same time this packet has the EOP bit set and its
> > length is less than or equal to rxq->crc_len, the following loop will be 
> > excecuted:
> >
> > for (lp = first_seg; lp->next != rxm; lp = lp->next)
> > ;
> >
> > We know that the first_seg->next will be NULL under this condition. So
> > the expression of lp->next->next will cause the segmentation fault.
> >
> > Normally, the length of the first packet with EOP bit set will be
> > greater than rxq->crc_len. However, the out-of-order execution of CPU
> > may make the read ordering of the status and the rest of the
> > descriptor fields in this function not be correct. The related codes are as 
> > following:
> >
> > rxdp = &rx_ring[rx_id];
> >  #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);
> >
> > if (!(staterr & IXGBE_RXDADV_STAT_DD))
> > break;
> >
> >  #2 rxd = *rxdp;
> >
> > The sentence #2 may be executed before sentence #1. This action is
> > likely to make the ready packet zero length. If the packet is the
> > first packet and has the EOP bit set, the above segmentation fault will 
> > happen.
> >
> > So, we should add rte_rmb() to ensure the read ordering be correct. We
> > also did the same thing in the ixgbe_recv_pkts() function to make the
> > rxd data be valid even thougth we did not find segmentation fault in this 
> > function.
> >
> > Signed-off-by: Min Zhou 

"Fixes" tag for backport.

> > ---
> > v2:
> > - Make the calling of rte_rmb() for all platforms
> > ---
> >  drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
> > b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..302a5ab7ff 100644
> > --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> > +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> > @@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf 
> > **rx_pkts,
> > staterr = rxdp->wb.upper.status_error;
> > if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
> > break;
> > +
> > +   rte_rmb();
> > rxd = *rxdp;
> 
> 
> 
> Indeed, looks like a problem to me on systems with relaxed MO.
> Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers.

Thanks, Konstantin.

> About a fix - looks right, but a bit excessive to me - as I understand all we 
> need here is
> to prevent re-ordering by CPU itself.
> So rte_smp_rmb() seems enough here.

Agree that rte_rmb() is excessive.
rte_smp_rmb() or rte_atomic_thread_fence(__ATOMIC_ACQUIRE) is enough.
And it is better to add a comment to justify the barrier.

> Or might be just:
> staterr = __atomic_load_n(&rxdp->wb.upper.status_error, __ATOMIC_ACQUIRE);
> 
> 
> > /*
> > @@ -2122,6 +2124,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf 
> > **rx_pkts,
> uint16_t nb_pkts,

With the proper barrier in place, I think the long comments at the beginning of 
this loop can be removed.

> > if (!(staterr & IXGBE_RXDADV_STAT_DD))
> > break;
> >
> > +   rte_rmb();
> > rxd = *rxdp;
> >
> > PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u "
> > --
> > 2.31.1