RE: [PATCH v2 3/8] mbuf: fix Tx checksum offload examples

2024-04-16 Thread Morten Brørup
> From: Ferruh Yigit [mailto:ferruh.yi...@amd.com]
> Sent: Monday, 15 April 2024 17.08
> 
> On 4/12/2024 3:44 PM, Morten Brørup wrote:
>  Mandate use of rte_eth_tx_prepare() in the mbuf Tx checksum
> >> offload
>  examples.
> >>>
> >>> I strongly disagree with this change!
> >>>
> >>> It will cause a huge performance degradation for shaping
> >> applications:
> >>>
> >>> A packet will be processed and finalized at an output or
> >> forwarding
> >> pipeline stage, where some other fields might also be written,
> >> so
> >>> zeroing e.g. the out_ip checksum at this stage has low cost
> >> (no new
> >> cache misses).
> >>>
> >>> Then, the packet might be queued for QoS or similar.
> >>>
> >>> If rte_eth_tx_prepare() must be called at the egress pipeline
> >> stage,
> >> it has to write to the packet and cause a cache miss per packet,
> >>> instead of simply passing on the packet to the NIC hardware.
> >>>
> >>> It must be possible to finalize the packet at the
> >> output/forwarding
> >> pipeline stage!
> >>
> >> If you can finalize your packet on  output/forwarding, then why
> >> you
> >> can't invoke tx_prepare() on the same stage?
> >> There seems to be some misunderstanding about what tx_prepare()
> >> does -
> >> in fact it doesn't communicate with HW queue (doesn't update TXD
> >> ring,
> >> etc.), what it does - just make changes in mbuf itself.
> >> Yes, it reads some fields in SW TX queue struct (max number of
> >> TXDs per
> >> packet, etc.), but AFAIK it is safe
> >> to call tx_prepare() and tx_burst() from different threads.
> >> At least on implementations I am aware about.
> >> Just checked the docs - it seems not stated explicitly anywhere,
> >> might
> >> be that's why it causing such misunderstanding.
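
A minimal sketch of the split being discussed here, assuming the PMD allows
tx_prepare() and tx_burst() to run on different lcores (as noted above, this
is not stated explicitly in the docs); the ring, port numbers and burst size
of 32 are illustrative only:

#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_ring.h>

/* Output/forwarding stage: headers are already hot in cache, so running
 * tx_prepare() here costs no extra cache miss.  Anything it rejects is
 * dropped before the packets are queued for QoS. */
static void
forward_stage(uint16_t port_id, uint16_t queue_id, struct rte_ring *qos_ring,
        struct rte_mbuf **pkts, uint16_t nb_pkts)
{
    uint16_t nb_prep = rte_eth_tx_prepare(port_id, queue_id, pkts, nb_pkts);

    if (nb_prep < nb_pkts)
        rte_pktmbuf_free_bulk(&pkts[nb_prep], nb_pkts - nb_prep);
    rte_ring_enqueue_burst(qos_ring, (void **)pkts, nb_prep, NULL);
}

/* Egress stage (possibly another lcore): the mbufs need no further
 * modification, they are just handed to the NIC. */
static void
egress_stage(uint16_t port_id, uint16_t queue_id, struct rte_ring *qos_ring)
{
    struct rte_mbuf *pkts[32];
    unsigned int n = rte_ring_dequeue_burst(qos_ring, (void **)pkts, 32, NULL);
    uint16_t sent = rte_eth_tx_burst(port_id, queue_id, pkts, (uint16_t)n);

    if (sent < n)
        rte_pktmbuf_free_bulk(&pkts[sent], n - sent);
}

With this split the egress stage never has to dirty packet cache lines,
which is the property the shaping use case above depends on.
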
> >>
> >>>
> >>> Also, how is rte_eth_tx_prepare() supposed to work for cloned
> >> packets
> >> egressing on different NIC hardware?
> >>
> >> If you create a clone of full packet (including L2/L3) headers
> >> then
> >> obviously such construction might not
> >> work properly with tx_prepare() over two different NICs.
> >> Though In majority of cases you do clone segments with data,
> >> while at
> >> least L2 headers are put into different segments.
> >> One simple approach would be to keep L3 header in that separate
> >> segment.
> >> But yes, there is a problem when you'll need to send exactly the
> >> same
> >> packet over different NICs.
> >> As I remember, for bonding PMD things don't work quite well here
> >> - you
> >> might have a bond over 2 NICs with
> >> different tx_prepare() and which one to call might be not clear
> >> till
> >> actual PMD tx_burst() is invoked.
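
A rough sketch of the "separate header segment" approach mentioned above, for
the case where the same payload is sent over several NICs: a shared read-only
payload is cloned per port and chained behind a small private segment holding
the L2/L3 headers, so each port's tx_prepare() only ever writes its own
segment. The helper name and pool layout are made up for illustration, and the
bonding caveat above still applies:

#include <rte_mbuf.h>

/* hdr_pool: small private segments, one per transmitted copy.
 * clone_pool: indirect mbufs referencing the shared payload. */
static struct rte_mbuf *
make_per_port_copy(struct rte_mbuf *payload, struct rte_mempool *hdr_pool,
        struct rte_mempool *clone_pool)
{
    struct rte_mbuf *hdr = rte_pktmbuf_alloc(hdr_pool);
    struct rte_mbuf *clone;

    if (hdr == NULL)
        return NULL;

    clone = rte_pktmbuf_clone(payload, clone_pool);
    if (clone == NULL) {
        rte_pktmbuf_free(hdr);
        return NULL;
    }

    /* Build this port's L2/L3 headers in the private segment here,
     * e.g. via rte_pktmbuf_append(hdr, hdr_len), and set the Tx
     * offload flags/fields on 'hdr'. */

    if (rte_pktmbuf_chain(hdr, clone) != 0) {
        rte_pktmbuf_free(hdr);
        rte_pktmbuf_free(clone);
        return NULL;
    }
    return hdr;    /* this port's tx_prepare() may modify 'hdr' only */
}
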
> >>
> >>>
> >>> In theory, it might get even worse if we make this opaque
> >> instead of
> >> transparent and standardized:
> >>> One PMD might reset out_ip checksum to 0x, and another PMD
> >> might
> >> reset it to 0x.
> >>
> >>>
> >>> I can only see one solution:
> >>> We need to standardize on common minimum requirements for how
> >> to
> >> prepare packets for each TX offload.
> >>
> >> If we can make each and every vendor to agree here - that
> >> definitely
> >> will help to simplify things quite a bit.
> >
> > An API is more than a function name and parameters.
> > It also has preconditions and postconditions.
> >
> > All major NIC vendors are contributing to DPDK.
> > It should be possible to reach consensus for reasonable minimum
> >> requirements
>  for offloads.
> > Hardware- and driver-specific exceptions can be documented with
> >> the offload
>  flag, or with rte_eth_rx/tx_burst(), like the note to
> > rte_eth_rx_burst():
> > "Some drivers using vector instructions require that nb_pkts is
> >> divisible by
>  4 or 8, depending on the driver implementation."
> 
>  If we introduce a rule that everyone supposed to follow and then
> >> straightway
>  allow people to have a 'documented exceptions',
>  for me it means like 'no rule' in practice.
>  A 'documented exceptions' approach might work if you have 5
> >> different PMDs to
>  support, but not when you have 50+.
>  No-one would write an app with possible 10 different exception
> cases
> >> in his
>  head.
>  Again, with such approach we can forget about backward
> >> compatibility.
>  I think we already had this discussion before, my opinion remains
> >> the same
>  here -
>  'documented exceptions' approach is a way to trouble.
> >>>
> >>> The "minimum requirements" should be the lowest common denominator
> of
> >> all NICs.
> >>> Exceptions should be extremely few, for outlier NICs that still want
> >> to provide an offload and its driver is unable to live up to the
> >>> minimum requirements.
>

Re: How to remove a word from the dictionary.txt we are using in the CI/CD pipeline.

2024-04-16 Thread Thomas Monjalon
We already have such an exception:
devtools/build-dict.sh


16/04/2024 00:19, Patrick Robb:
> Hi Aditya,
> 
> I don't run these CI checks, but I think the checkpatch script in CI
> is using https://github.com/codespell-project/codespell as the
> dictionary provider. So, one possibility is suggesting to that project
> that deque is a valid word which should be supported in software
> projects. Then, the person who maintains this check would have to
> update their codespell repo. I haven't tried submitting such a change
> to this project before, but it looks like they have some open PRs for
> word policy suggestions like yours:
> https://github.com/codespell-project/codespell/pull/2812
> 
> It looks like the first and fourth from this list would need to become
> valid uses?
> 
> ./codespell_lib/data/dictionary_code.txt:deque->dequeue
> ./codespell_lib/data/dictionary.txt:dequed->dequeued
> ./codespell_lib/data/dictionary.txt:dequeing->dequeuing
> ./codespell_lib/data/dictionary.txt:deques->dequeues
> 
> Ali, are you the person who maintains the checkpatch run which posts
> results to patchwork?
> 
> Otherwise I think it is up to the DPDK maintainers to decide if an
> exception is appropriate. I believe they have done this in the past.
> It might not hurt to suggest the change to codespell, and then if
> they're not on board, requesting an exception for your DPDK patch.
> 
> One possibility is starting a DPDK specific exceptions list, which
> people could add to alongside their patches. I don't think the
> maintainers will favor this though as I expect they will want to keep
> checkpatch pretty simple and standard.
> 
> Adding the dev mailing list as I'm guessing other DPDK devs have run
> into the same issue and might be able to provide more feedback.
> 
> On Mon, Apr 15, 2024 at 5:17 PM Honnappa Nagarahalli
>  wrote:
> >
> > + Patrick
> >
> >
> >
> > From: Aditya Ambadipudi 
> > Date: Monday, April 15, 2024 at 3:41 PM
> > To: c...@dpdk.org 
> > Cc: Honnappa Nagarahalli , Dhruv Tripathi 
> > , Wathsala Wathawana Vithanage 
> > 
> > Subject: How to remove a word from the dictionary.txt we are using in the 
> > CI/CD pipeline.
> >
> > Hello folks,
> >
> > My name is Aditya Ambadipudi. I work as a software engineer at Arm.
> >
> > Recently we submitted a patch set for DPDK:
> >
> > http://patches.dpdk.org/project/dpdk/patch/20240401013729.1466298-3-aditya.ambadip...@arm.com/
> >
> > And the CI/CD is failing for this patch because of the word Deque.
> >
> > 1787:adaquate->adequate
> >
> > 1788:adaquately->adequately
> >
> > 1789:adaquetely->adequately
> >
> > 1790:adaquit->adequate
> >
> > 1791:adaquitly->adequately
> >
> > 1975:adecuate->adequate
> >
> > 1977:adequat->adequate
> >
> > 1978:adequatly->adequately
> >
> > 1979:adequit->adequate
> >
> > 1980:adequite->adequate
> >
> > 1981:adequitely->adequately
> >
> > 1982:adequitly->adequately
> >
> > 18484:dequed->dequeued
> >
> > 18485:dequeing->dequeuing
> >
> > 18486:deques->dequeues
> >
> > 30005:inadiquate->inadequate
> >
> > 30006:inadquate->inadequate
> >
> > 61309:deque->dequeue
> >
> >
> >
> >
> >
> > Codespell wants us to substitute all the places where we used the word 
> > "deque" with "dequeue".
> >
> >
> >
> > We were wondering if it is possible to remove this rule from that 
> > dictionary that the CI/CD pipeline uses.  Or if there is a way to 
> > circumvent that rule.
> >
> > The CR we created was specifically to add a Deque library to DPDK. The data 
> > structure we have created is a Deque and there is no other word that we can 
> > think of that describes that data structure better than "Deque".
> >
> >
> >
> > Thank you,
> >
> > Aditya Ambadipudi
> >
> >
> >
> 







Re: [PATCH] mailmap: clean up vmware entries

2024-04-16 Thread Thomas Monjalon
16/04/2024 01:39, Jochen Behrens:
> VMware was acquired by Broadcom, and vmware e-mail addresses will
> be invalid in the future.
> 
> Change e-mail to broadcom.com for those still in the company and
> remove the other entries.
> 
> Signed-off-by: Jochen Behrens 
> ---
> -Bharat Mota 
> +Bharat Mota 

You need to keep the old address in order to map old addresses which are in the 
repository.




[PATCH v1] crypto: fix build issues on crypto callbacks macro undefined

2024-04-16 Thread Ganapati Kundapura
The crypto callbacks macro is defined with value 1 but used with #ifdef, so
when the config value is changed to 0 to disable it, the crypto callback
code is still compiled in.

Define the crypto callbacks macro without a value; undef it to disable.

Wrap the crypto callback code with the RTE_CRYPTO_CALLBACKS macro to fix
build issues when the macro is undefined.

As the callback head nodes always hold a valid pointer, check the next
node from the head to determine whether any callbacks are registered.

Fixes: 1c3ffb9 ("cryptodev: add enqueue and dequeue callbacks")
Fixes: 5523a75 ("test/crypto: add case for enqueue/dequeue callbacks")

Signed-off-by: Ganapati Kundapura 
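
For reference, the #ifdef vs. value behaviour described in the commit message
can be reproduced with a stand-alone sketch (illustrative only, not part of
the patch):

#include <stdio.h>

#define FEATURE_A 0   /* value 0 does NOT disable an #ifdef guard */
#define FEATURE_B     /* defined without a value */

int main(void)
{
#ifdef FEATURE_A
    printf("FEATURE_A block compiled despite the value 0\n");
#endif
#if FEATURE_A
    printf("never compiled: #if sees the value 0\n");
#endif
#ifdef FEATURE_B
    printf("FEATURE_B block compiled (no value needed)\n");
#endif
    return 0;
}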

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 1703ebccf1..1a592f2302 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -14547,6 +14547,7 @@ test_null_burst_operation(void)
return TEST_SUCCESS;
 }
 
+#ifdef RTE_CRYPTO_CALLBACKS
 static uint16_t
 test_enq_callback(uint16_t dev_id, uint16_t qp_id, struct rte_crypto_op **ops,
  uint16_t nb_ops, void *user_param)
@@ -14784,6 +14785,7 @@ test_deq_callback_setup(void)
 
return TEST_SUCCESS;
 }
+#endif /* RTE_CRYPTO_CALLBACKS */
 
 static void
 generate_gmac_large_plaintext(uint8_t *data)
@@ -18069,8 +18071,10 @@ static struct unit_test_suite cryptodev_gen_testsuite  
= {
TEST_CASE_ST(ut_setup, ut_teardown,
test_device_configure_invalid_queue_pair_ids),
TEST_CASE_ST(ut_setup, ut_teardown, test_stats),
+#ifdef RTE_CRYPTO_CALLBACKS
TEST_CASE_ST(ut_setup, ut_teardown, test_enq_callback_setup),
TEST_CASE_ST(ut_setup, ut_teardown, test_deq_callback_setup),
+#endif
TEST_CASES_END() /**< NULL terminate unit test array */
}
 };
diff --git a/config/rte_config.h b/config/rte_config.h
index dd7bb0d35b..b647a69ba8 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -72,7 +72,7 @@
 /* cryptodev defines */
 #define RTE_CRYPTO_MAX_DEVS 64
 #define RTE_CRYPTODEV_NAME_LEN 64
-#define RTE_CRYPTO_CALLBACKS 1
+#define RTE_CRYPTO_CALLBACKS   /* No Value, undef/comment out to disable */
 
 /* compressdev defines */
 #define RTE_COMPRESS_MAX_DEVS 64
diff --git a/lib/cryptodev/rte_cryptodev.c b/lib/cryptodev/rte_cryptodev.c
index 886eb7adc4..5f3b17a517 100644
--- a/lib/cryptodev/rte_cryptodev.c
+++ b/lib/cryptodev/rte_cryptodev.c
@@ -628,6 +628,7 @@ rte_cryptodev_asym_xform_capability_check_hash(
return ret;
 }
 
+#ifdef RTE_CRYPTO_CALLBACKS
 /* spinlock for crypto device enq callbacks */
 static rte_spinlock_t rte_cryptodev_callback_lock = RTE_SPINLOCK_INITIALIZER;
 
@@ -744,6 +745,7 @@ cryptodev_cb_init(struct rte_cryptodev *dev)
cryptodev_cb_cleanup(dev);
return -ENOMEM;
 }
+#endif /* RTE_CRYPTO_CALLBACKS */
 
 const char *
 rte_cryptodev_get_feature_name(uint64_t flag)
@@ -1244,9 +1246,11 @@ rte_cryptodev_configure(uint8_t dev_id, struct 
rte_cryptodev_config *config)
if (*dev->dev_ops->dev_configure == NULL)
return -ENOTSUP;
 
+#ifdef RTE_CRYPTO_CALLBACKS
rte_spinlock_lock(&rte_cryptodev_callback_lock);
cryptodev_cb_cleanup(dev);
rte_spinlock_unlock(&rte_cryptodev_callback_lock);
+#endif
 
/* Setup new number of queue pairs and reconfigure device. */
diag = rte_cryptodev_queue_pairs_config(dev, config->nb_queue_pairs,
@@ -1257,6 +1261,7 @@ rte_cryptodev_configure(uint8_t dev_id, struct 
rte_cryptodev_config *config)
return diag;
}
 
+#ifdef RTE_CRYPTO_CALLBACKS
rte_spinlock_lock(&rte_cryptodev_callback_lock);
diag = cryptodev_cb_init(dev);
rte_spinlock_unlock(&rte_cryptodev_callback_lock);
@@ -1264,6 +1269,7 @@ rte_cryptodev_configure(uint8_t dev_id, struct 
rte_cryptodev_config *config)
CDEV_LOG_ERR("Callback init failed for dev_id=%d", dev_id);
return diag;
}
+#endif
 
rte_cryptodev_trace_configure(dev_id, config);
return (*dev->dev_ops->dev_configure)(dev, config);
@@ -1485,6 +1491,7 @@ rte_cryptodev_queue_pair_setup(uint8_t dev_id, uint16_t 
queue_pair_id,
socket_id);
 }
 
+#ifdef RTE_CRYPTO_CALLBACKS
 struct rte_cryptodev_cb *
 rte_cryptodev_add_enq_callback(uint8_t dev_id,
   uint16_t qp_id,
@@ -1763,6 +1770,7 @@ rte_cryptodev_remove_deq_callback(uint8_t dev_id,
rte_spinlock_unlock(&rte_cryptodev_callback_lock);
return ret;
 }
+#endif /* RTE_CRYPTO_CALLBACKS */
 
 int
 rte_cryptodev_stats_get(uint8_t dev_id, struct rte_cryptodev_stats *stats)
diff --git a/lib/cryptodev/rte_cryptodev.h b/lib/cryptodev/rte_cryptodev.h
index 00ba6a234a..b811b458d5 100644
--- a/lib/cryptodev/rte_cryptodev.h
+++ b/lib/cryptodev/rte_cryptodev.h
@@ -1910,7 +1910,7 @@ rte_cryptodev_dequeue_burst(uint8_t dev_id, uint16_t 
qp_id,
nb_ops = fp_ops->dequeue_burst(qp, ops, nb_ops);
 
 #ifdef RTE_CRYPTO_CALLBACKS
-   i

RE: [PATCH] crypto: fix crypto callbacks macro

2024-04-16 Thread Power, Ciara



> -Original Message-
> From: Power, Ciara 
> Sent: Monday, April 15, 2024 1:36 PM
> To: dev@dpdk.org
> Cc: Power, Ciara ; Gujjar, Abhinandan S
> ; sta...@dpdk.org; Akhil Goyal
> ; Fan Zhang 
> Subject: [PATCH] crypto: fix crypto callbacks macro
> 
> The crypto callbacks macro is being used with ifdef instead of if, so when the
> config file value is changed to 0 to disable, the code is still being 
> compiled in.
> 
> Fixes: 1c3ffb95595e ("cryptodev: add enqueue and dequeue callbacks")
> Cc: abhinandan.guj...@intel.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Ciara Power 
> ---
>  lib/cryptodev/rte_cryptodev.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/cryptodev/rte_cryptodev.h b/lib/cryptodev/rte_cryptodev.h 
> index
> 00ba6a234a..357d4bcf9c 100644
> --- a/lib/cryptodev/rte_cryptodev.h
> +++ b/lib/cryptodev/rte_cryptodev.h
> @@ -1909,7 +1909,7 @@ rte_cryptodev_dequeue_burst(uint8_t dev_id,
> uint16_t qp_id,
> 
>   nb_ops = fp_ops->dequeue_burst(qp, ops, nb_ops);
> 
> -#ifdef RTE_CRYPTO_CALLBACKS
> +#if RTE_CRYPTO_CALLBACKS
>   if (unlikely(fp_ops->qp.deq_cb != NULL)) {
>   struct rte_cryptodev_cb_rcu *list;
>   struct rte_cryptodev_cb *cb;
> @@ -1976,7 +1976,7 @@ rte_cryptodev_enqueue_burst(uint8_t dev_id,
> uint16_t qp_id,
> 
>   fp_ops = &rte_crypto_fp_ops[dev_id];
>   qp = fp_ops->qp.data[qp_id];
> -#ifdef RTE_CRYPTO_CALLBACKS
> +#if RTE_CRYPTO_CALLBACKS
>   if (unlikely(fp_ops->qp.enq_cb != NULL)) {
>   struct rte_cryptodev_cb_rcu *list;
>   struct rte_cryptodev_cb *cb;
> --
> 2.25.1

Rejecting this patch, as an alternative fix was sent along with some other 
fixes: 
https://patchwork.dpdk.org/project/dpdk/patch/20240416081222.3002268-1-ganapati.kundap...@intel.com/



Re: [PATCH 1/5] dts: fix InteractiveShell command prompt filtering

2024-04-16 Thread Juraj Linkeš
On Fri, Apr 12, 2024 at 1:11 PM Luca Vizzarro  wrote:
>
> When sending a command using an instance of InteractiveShell the output
> is meant to filter out the leading shell prompt. The filtering logic is
> present but the line is appended anyways.
>

I don't think that's what's happening here. The output collecting
logic is "stop when we encounter a prompt, but not the prompt with the
command we sent". We could change the comment though.

> Bugzilla ID: 1411
> Fixes: 88489c0501af ("dts: add smoke tests")
>
> Signed-off-by: Luca Vizzarro 
> Reviewed-by: Paul Szczepanek 
> Reviewed-by: Jack Bond-Preston 
> ---
> Cc: Jeremy Spewock 
> ---
>  dts/framework/remote_session/interactive_shell.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/dts/framework/remote_session/interactive_shell.py 
> b/dts/framework/remote_session/interactive_shell.py
> index 5cfe202e15..8a9bf96ea9 100644
> --- a/dts/framework/remote_session/interactive_shell.py
> +++ b/dts/framework/remote_session/interactive_shell.py
> @@ -132,11 +132,11 @@ def send_command(self, command: str, prompt: str | None 
> = None) -> str:
>  self._stdin.flush()
>  out: str = ""
>  for line in self._stdout:
> -out += line
>  if prompt in line and not line.rstrip().endswith(
>  command.rstrip()
>  ):  # ignore line that sent command
>  break
> +out += line

If we do this, we'll only filter out the last prompt, which may not be
desirable, since the last prompt is there only because all of our
interactive shells force an extra prompt with _command_extra_chars.

One thing we could improve though is removing the distribution welcome
message from logs, or at least separate it from the first command sent
with the interactive shell. The second option will allow us to see
clearly that an interactive session has been established, although we
could just emit a shorter log (something like "Started a testpmd
session" and then flush the welcome screen output).

>  self._logger.debug(f"Got output: {out}")
>  return out
>
> --
> 2.34.1
>


Re: [PATCH 2/5] dts: skip first line of send_command output

2024-04-16 Thread Juraj Linkeš
On Fri, Apr 12, 2024 at 1:11 PM Luca Vizzarro  wrote:
>
> The first line of the InteractiveShell send_command method is generally
> the command input field. This sometimes is unwanted, therefore this
> commit enables the possibility of omitting the first line from the
> returned output.
>

Oh, the first commit message was confusing. It said leading prompt
which I understood to be the first prompt (the one with the command).
I see that this commit actually addresses what I thought the first
commit was trying to do.

> Signed-off-by: Luca Vizzarro 
> Reviewed-by: Paul Szczepanek 
> Reviewed-by: Jack Bond-Preston 
> ---
>  dts/framework/remote_session/interactive_shell.py | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/dts/framework/remote_session/interactive_shell.py 
> b/dts/framework/remote_session/interactive_shell.py
> index 8a9bf96ea9..e290a083e9 100644
> --- a/dts/framework/remote_session/interactive_shell.py
> +++ b/dts/framework/remote_session/interactive_shell.py
> @@ -105,7 +105,9 @@ def _start_application(self, get_privileged_command: 
> Callable[[str], str] | None
>  start_command = get_privileged_command(start_command)
>  self.send_command(start_command)
>
> -def send_command(self, command: str, prompt: str | None = None) -> str:
> +def send_command(
> +self, command: str, prompt: str | None = None, skip_first_line: bool 
> = False

Do we generally want or don't want to include the first line? When do
we absolutely not want to include it?

> +) -> str:
>  """Send `command` and get all output before the expected ending 
> string.
>
>  Lines that expect input are not included in the stdout buffer, so 
> they cannot
> @@ -121,6 +123,7 @@ def send_command(self, command: str, prompt: str | None = 
> None) -> str:
>  command: The command to send.
>  prompt: After sending the command, `send_command` will be 
> expecting this string.
>  If :data:`None`, will use the class's default prompt.
> +skip_first_line: Skip the first line when capturing the output.
>
>  Returns:
>  All output in the buffer before expected string.
> @@ -132,6 +135,9 @@ def send_command(self, command: str, prompt: str | None = 
> None) -> str:
>  self._stdin.flush()
>  out: str = ""
>  for line in self._stdout:
> +if skip_first_line:
> +skip_first_line = False
> +continue

Is there ever a reason to distinguish between the first line and the
line with the command on it?

>  if prompt in line and not line.rstrip().endswith(
>  command.rstrip()
>  ):  # ignore line that sent command
> --
> 2.34.1
>


RE: [PATCH 2/2] dma/cnxk: remove completion pool

2024-04-16 Thread Vamsi Krishna Attunuru



> -Original Message-
> From: pbhagavat...@marvell.com 
> Sent: Saturday, April 6, 2024 3:43 PM
> To: Jerin Jacob ; Vamsi Krishna Attunuru
> ; Pavan Nikhilesh Bhagavatula
> ; Shijith Thotton 
> Cc: dev@dpdk.org
> Subject: [PATCH 2/2] dma/cnxk: remove completion pool
> 
> From: Pavan Nikhilesh 
> 
> Use DMA ops to store metadata, remove use of completion pool.
> 
> Signed-off-by: Pavan Nikhilesh 
> ---

Acked-by: Vamsi Attunuru 



Re: [PATCH 3/5] dts: add parsing utility module

2024-04-16 Thread Juraj Linkeš
On Fri, Apr 12, 2024 at 1:11 PM Luca Vizzarro  wrote:
>
> Adds parsing text into a custom data structure. It provides a new
> `TextParser` dataclass to be inherited. This implements the `parse`
> method, which combined with the parser functions, it can automatically
> parse the value for each field.
>

From this commit message, I don't know why we're adding the module.
What are we going to use it for? Since you mentioned you'll send a v2,
I'll wait with review after that as I think it'll make it a bit easier
to review.


Re: [PATCH 4/5] dts: add `show port info` command to TestPmdShell

2024-04-16 Thread Juraj Linkeš
On Fri, Apr 12, 2024 at 1:11 PM Luca Vizzarro  wrote:
>
> Add a new TestPmdPort data structure to represent the output
> returned by `show port info`, which is implemented as part of
> TestPmdShell.
>
> The TestPmdPort data structure and its derived classes are modelled
> based on the relevant testpmd source code.
>
> This implementation makes extensive use of regular expressions, which
> all parse individually. The rationale behind this is to lower the risk
> of the testpmd output changing as part of development. Therefore
> minimising breakage.
>
> Bugzilla ID: 1407
>
> Signed-off-by: Luca Vizzarro 
> Reviewed-by: Paul Szczepanek 



> +@dataclass
> +class TestPmdPort(TextParser):

This and the classes above are missing docstrings.



> @@ -225,6 +664,38 @@ def set_forward_mode(self, mode: TestPmdForwardingModes, 
> verify: bool = True):
>  f"Test pmd failed to set fwd mode to {mode.value}"
>  )
>
> +def show_port_info_all(self) -> list[TestPmdPort]:
> +"""Returns the information of all the ports."""

Can we add sample output so that the format of what we're trying to
parse is clear?

> +output = self.send_command("show port info all")
> +
> +ports = []
> +iter = re.finditer(r"\*+.+\*+", output)
> +if next(iter, False):  # we are slicing retrospectively, skip first 
> block
> +start_pos = 0
> +for block in iter:
> +end_pos = block.start()
> +ports.append(TestPmdPort.parse(output[start_pos:end_pos]))
> +start_pos = end_pos
> +
> +ports.append(TestPmdPort.parse(output[start_pos:]))
> +
> +return ports

Can this be done the same way it's done in the last commit?

iter = re.finditer(r"(^  #*.+#*$[^#]+)^  #*$", output, re.MULTILINE)
return [TestPmdPortStats.parse(block.group(1)) for block in iter]

Looks much better.

> +
> +def show_port_info(self, port_id: int) -> TestPmdPort:
> +"""Returns the given port information.
> +
> +Args:
> +port_id: The port ID to gather information for.
> +
> +Raises:
> +InteractiveCommandExecutionError: If `port_id` is invalid.
> +"""
> +output = self.send_command(f"show port info {port_id}", 
> skip_first_line=True)
> +if output.startswith("Invalid port"):
> +raise InteractiveCommandExecutionError("invalid port given")
> +
> +return TestPmdPort.parse(output)
> +
>  def close(self) -> None:
>  """Overrides :meth:`~.interactive_shell.close`."""
>  self.send_command("quit", "")
> --
> 2.34.1
>


Re: [PATCH 5/5] dts: add `show port stats` command to TestPmdShell

2024-04-16 Thread Juraj Linkeš
On Fri, Apr 12, 2024 at 1:11 PM Luca Vizzarro  wrote:
>
> Add a new TestPmdPortStats data structure to represent the output
> returned by `show port stats`, which is implemented as part of
> TestPmdShell.
>
> Bugzilla ID: 1407
>
> Signed-off-by: Luca Vizzarro 
> Reviewed-by: Paul Szczepanek 

Acked-by: Juraj Linkeš 


RE: [PATCH v1] crypto: fix build issues on crypto callbacks macro undefined

2024-04-16 Thread Gujjar, Abhinandan S



> -Original Message-
> From: Kundapura, Ganapati 
> Sent: Tuesday, April 16, 2024 1:42 PM
> To: dev@dpdk.org
> Cc: Gujjar, Abhinandan S ; Power, Ciara
> ; gak...@marvell.com; fanzhang@gmail.com
> Subject: [PATCH v1] crypto: fix build issues on crypto callbacks macro 
> undefined
> 
> Crypto callbacks macro is defined with value 1 and being used with ifdef, on
> config value is changed to 0 to disable, crypto callback changes still being
> compiled.
> 
> Defined crypto callbacks macro without value, undef to disable
> 
> Wrapped crypto callback changes with RTE_CRYPTO_CALLBACKS macro to fix
> build issues when macro is undefined.
> 
> As callback head nodes have valid pointer, this patch checks the next node 
> from
> the head if callbacks registered.
> 
> Fixes: 1c3ffb9 ("cryptodev: add enqueue and dequeue callbacks")
> Fixes: 5523a75 ("test/crypto: add case for enqueue/dequeue callbacks")
> 
> Signed-off-by: Ganapati Kundapura 
>
Acked-by: Abhinandan Gujjar 
 
> diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c index
> 1703ebccf1..1a592f2302 100644
> --- a/app/test/test_cryptodev.c
> +++ b/app/test/test_cryptodev.c
> @@ -14547,6 +14547,7 @@ test_null_burst_operation(void)
>   return TEST_SUCCESS;
>  }
> 
> +#ifdef RTE_CRYPTO_CALLBACKS
>  static uint16_t
>  test_enq_callback(uint16_t dev_id, uint16_t qp_id, struct rte_crypto_op 
> **ops,
> uint16_t nb_ops, void *user_param)
> @@ -14784,6 +14785,7 @@ test_deq_callback_setup(void)
> 
>   return TEST_SUCCESS;
>  }
> +#endif /* RTE_CRYPTO_CALLBACKS */
> 
>  static void
>  generate_gmac_large_plaintext(uint8_t *data) @@ -18069,8 +18071,10 @@
> static struct unit_test_suite cryptodev_gen_testsuite  = {
>   TEST_CASE_ST(ut_setup, ut_teardown,
> 
>   test_device_configure_invalid_queue_pair_ids),
>   TEST_CASE_ST(ut_setup, ut_teardown, test_stats),
> +#ifdef RTE_CRYPTO_CALLBACKS
>   TEST_CASE_ST(ut_setup, ut_teardown,
> test_enq_callback_setup),
>   TEST_CASE_ST(ut_setup, ut_teardown,
> test_deq_callback_setup),
> +#endif
>   TEST_CASES_END() /**< NULL terminate unit test array */
>   }
>  };
> diff --git a/config/rte_config.h b/config/rte_config.h index
> dd7bb0d35b..b647a69ba8 100644
> --- a/config/rte_config.h
> +++ b/config/rte_config.h
> @@ -72,7 +72,7 @@
>  /* cryptodev defines */
>  #define RTE_CRYPTO_MAX_DEVS 64
>  #define RTE_CRYPTODEV_NAME_LEN 64
> -#define RTE_CRYPTO_CALLBACKS 1
> +#define RTE_CRYPTO_CALLBACKS /* No Value, undef/comment out to
> disable */
> 
>  /* compressdev defines */
>  #define RTE_COMPRESS_MAX_DEVS 64
> diff --git a/lib/cryptodev/rte_cryptodev.c b/lib/cryptodev/rte_cryptodev.c 
> index
> 886eb7adc4..5f3b17a517 100644
> --- a/lib/cryptodev/rte_cryptodev.c
> +++ b/lib/cryptodev/rte_cryptodev.c
> @@ -628,6 +628,7 @@ rte_cryptodev_asym_xform_capability_check_hash(
>   return ret;
>  }
> 
> +#ifdef RTE_CRYPTO_CALLBACKS
>  /* spinlock for crypto device enq callbacks */  static rte_spinlock_t
> rte_cryptodev_callback_lock = RTE_SPINLOCK_INITIALIZER;
> 
> @@ -744,6 +745,7 @@ cryptodev_cb_init(struct rte_cryptodev *dev)
>   cryptodev_cb_cleanup(dev);
>   return -ENOMEM;
>  }
> +#endif /* RTE_CRYPTO_CALLBACKS */
> 
>  const char *
>  rte_cryptodev_get_feature_name(uint64_t flag) @@ -1244,9 +1246,11 @@
> rte_cryptodev_configure(uint8_t dev_id, struct rte_cryptodev_config *config)
>   if (*dev->dev_ops->dev_configure == NULL)
>   return -ENOTSUP;
> 
> +#ifdef RTE_CRYPTO_CALLBACKS
>   rte_spinlock_lock(&rte_cryptodev_callback_lock);
>   cryptodev_cb_cleanup(dev);
>   rte_spinlock_unlock(&rte_cryptodev_callback_lock);
> +#endif
> 
>   /* Setup new number of queue pairs and reconfigure device. */
>   diag = rte_cryptodev_queue_pairs_config(dev, config->nb_queue_pairs,
> @@ -1257,6 +1261,7 @@ rte_cryptodev_configure(uint8_t dev_id, struct
> rte_cryptodev_config *config)
>   return diag;
>   }
> 
> +#ifdef RTE_CRYPTO_CALLBACKS
>   rte_spinlock_lock(&rte_cryptodev_callback_lock);
>   diag = cryptodev_cb_init(dev);
>   rte_spinlock_unlock(&rte_cryptodev_callback_lock);
> @@ -1264,6 +1269,7 @@ rte_cryptodev_configure(uint8_t dev_id, struct
> rte_cryptodev_config *config)
>   CDEV_LOG_ERR("Callback init failed for dev_id=%d", dev_id);
>   return diag;
>   }
> +#endif
> 
>   rte_cryptodev_trace_configure(dev_id, config);
>   return (*dev->dev_ops->dev_configure)(dev, config); @@ -1485,6
> +1491,7 @@ rte_cryptodev_queue_pair_setup(uint8_t dev_id, uint16_t
> queue_pair_id,
>   socket_id);
>  }
> 
> +#ifdef RTE_CRYPTO_CALLBACKS
>  struct rte_cryptodev_cb *
>  rte_cryptodev_add_enq_callback(uint8_t dev_id,
>  uint16_t qp_id,
> @@ -1763,6 +1770,7 @@ rte_cryptodev_remove_deq_callback(uint8_t
> dev_id,
>   rte_spinlock_unlock(&rte_cryptodev_callback_lock);
>   return ret;
>

RE: [PATCH v2 3/8] mbuf: fix Tx checksum offload examples

2024-04-16 Thread Konstantin Ananyev



> > > > > > > > > > Mandate use of rte_eth_tx_prepare() in the mbuf Tx
> > checksum
> > > > offload
> > > > > > > > > > examples.
> > > > > > > > >
> > > > > > > > > I strongly disagree with this change!
> > > > > > > > >
> > > > > > > > > It will cause a huge performance degradation for shaping
> > > > applications:
> > > > > > > > >
> > > > > > > > > A packet will be processed and finalized at an output or
> > > > forwarding
> > > > > > > > pipeline stage, where some other fields might also be
> > written,
> > > > so
> > > > > > > > > zeroing e.g. the out_ip checksum at this stage has low
> > cost
> > > > (no new
> > > > > > > > cache misses).
> > > > > > > > >
> > > > > > > > > Then, the packet might be queued for QoS or similar.
> > > > > > > > >
> > > > > > > > > If rte_eth_tx_prepare() must be called at the egress
> > pipeline
> > > > stage,
> > > > > > > > it has to write to the packet and cause a cache miss per
> > packet,
> > > > > > > > > instead of simply passing on the packet to the NIC
> > hardware.
> > > > > > > > >
> > > > > > > > > It must be possible to finalize the packet at the
> > > > output/forwarding
> > > > > > > > pipeline stage!
> > > > > > > >
> > > > > > > > If you can finalize your packet on  output/forwarding, then
> > why
> > > > you
> > > > > > > > can't invoke tx_prepare() on the same stage?
> > > > > > > > There seems to be some misunderstanding about what
> > tx_prepare()
> > > > does -
> > > > > > > > in fact it doesn't communicate with HW queue (doesn't update
> > TXD
> > > > ring,
> > > > > > > > etc.), what it does - just make changes in mbuf itself.
> > > > > > > > Yes, it reads some fields in SW TX queue struct (max number
> > of
> > > > TXDs per
> > > > > > > > packet, etc.), but AFAIK it is safe
> > > > > > > > to call tx_prepare() and tx_burst() from different threads.
> > > > > > > > At least on implementations I am aware about.
> > > > > > > > Just checked the docs - it seems not stated explicitly
> > anywhere,
> > > > might
> > > > > > > > be that's why it causing such misunderstanding.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Also, how is rte_eth_tx_prepare() supposed to work for
> > cloned
> > > > packets
> > > > > > > > egressing on different NIC hardware?
> > > > > > > >
> > > > > > > > If you create a clone of full packet (including L2/L3)
> > headers
> > > > then
> > > > > > > > obviously such construction might not
> > > > > > > > work properly with tx_prepare() over two different NICs.
> > > > > > > > Though In majority of cases you do clone segments with data,
> > > > while at
> > > > > > > > least L2 headers are put into different segments.
> > > > > > > > One simple approach would be to keep L3 header in that
> > separate
> > > > segment.
> > > > > > > > But yes, there is a problem when you'll need to send exactly
> > the
> > > > same
> > > > > > > > packet over different NICs.
> > > > > > > > As I remember, for bonding PMD things don't work quite well
> > here
> > > > - you
> > > > > > > > might have a bond over 2 NICs with
> > > > > > > > different tx_prepare() and which one to call might be not
> > clear
> > > > till
> > > > > > > > actual PMD tx_burst() is invoked.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > In theory, it might get even worse if we make this opaque
> > > > instead of
> > > > > > > > transparent and standardized:
> > > > > > > > > One PMD might reset out_ip checksum to 0x, and another
> > PMD
> > > > might
> > > > > > > > reset it to 0x.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > I can only see one solution:
> > > > > > > > > We need to standardize on common minimum requirements for
> > how
> > > > to
> > > > > > > > prepare packets for each TX offload.
> > > > > > > >
> > > > > > > > If we can make each and every vendor to agree here - that
> > > > definitely
> > > > > > > > will help to simplify things quite a bit.
> > > > > > >
> > > > > > > An API is more than a function name and parameters.
> > > > > > > It also has preconditions and postconditions.
> > > > > > >
> > > > > > > All major NIC vendors are contributing to DPDK.
> > > > > > > It should be possible to reach consensus for reasonable
> > minimum
> > > > requirements
> > > > > > for offloads.
> > > > > > > Hardware- and driver-specific exceptions can be documented
> > with
> > > > the offload
> > > > > > flag, or with rte_eth_rx/tx_burst(), like the note to
> > > > > > > rte_eth_rx_burst():
> > > > > > > "Some drivers using vector instructions require that nb_pkts
> > is
> > > > divisible by
> > > > > > 4 or 8, depending on the driver implementation."
> > > > > >
> > > > > > If we introduce a rule that everyone supposed to follow and then
> > > > straightway
> > > > > > allow people to have a 'documented exceptions',
> > > > > > for me it means like 'no rule' in practice.
> > > > > > A 'documented exceptions' approach might work if you have 5
> > > > different PMDs to
> > > > > > support, but not when you have 50+.
> > > > > > No

RE: [PATCH v2 3/8] mbuf: fix Tx checksum offload examples

2024-04-16 Thread Konstantin Ananyev


> > From: Ferruh Yigit [mailto:ferruh.yi...@amd.com]
> > Sent: Monday, 15 April 2024 17.08
> >
> > On 4/12/2024 3:44 PM, Morten Brørup wrote:
> >  Mandate use of rte_eth_tx_prepare() in the mbuf Tx checksum
> > >> offload
> >  examples.
> > >>>
> > >>> I strongly disagree with this change!
> > >>>
> > >>> It will cause a huge performance degradation for shaping
> > >> applications:
> > >>>
> > >>> A packet will be processed and finalized at an output or
> > >> forwarding
> > >> pipeline stage, where some other fields might also be written,
> > >> so
> > >>> zeroing e.g. the out_ip checksum at this stage has low cost
> > >> (no new
> > >> cache misses).
> > >>>
> > >>> Then, the packet might be queued for QoS or similar.
> > >>>
> > >>> If rte_eth_tx_prepare() must be called at the egress pipeline
> > >> stage,
> > >> it has to write to the packet and cause a cache miss per packet,
> > >>> instead of simply passing on the packet to the NIC hardware.
> > >>>
> > >>> It must be possible to finalize the packet at the
> > >> output/forwarding
> > >> pipeline stage!
> > >>
> > >> If you can finalize your packet on  output/forwarding, then why
> > >> you
> > >> can't invoke tx_prepare() on the same stage?
> > >> There seems to be some misunderstanding about what tx_prepare()
> > >> does -
> > >> in fact it doesn't communicate with HW queue (doesn't update TXD
> > >> ring,
> > >> etc.), what it does - just make changes in mbuf itself.
> > >> Yes, it reads some fields in SW TX queue struct (max number of
> > >> TXDs per
> > >> packet, etc.), but AFAIK it is safe
> > >> to call tx_prepare() and tx_burst() from different threads.
> > >> At least on implementations I am aware about.
> > >> Just checked the docs - it seems not stated explicitly anywhere,
> > >> might
> > >> be that's why it causing such misunderstanding.
> > >>
> > >>>
> > >>> Also, how is rte_eth_tx_prepare() supposed to work for cloned
> > >> packets
> > >> egressing on different NIC hardware?
> > >>
> > >> If you create a clone of full packet (including L2/L3) headers
> > >> then
> > >> obviously such construction might not
> > >> work properly with tx_prepare() over two different NICs.
> > >> Though In majority of cases you do clone segments with data,
> > >> while at
> > >> least L2 headers are put into different segments.
> > >> One simple approach would be to keep L3 header in that separate
> > >> segment.
> > >> But yes, there is a problem when you'll need to send exactly the
> > >> same
> > >> packet over different NICs.
> > >> As I remember, for bonding PMD things don't work quite well here
> > >> - you
> > >> might have a bond over 2 NICs with
> > >> different tx_prepare() and which one to call might be not clear
> > >> till
> > >> actual PMD tx_burst() is invoked.
> > >>
> > >>>
> > >>> In theory, it might get even worse if we make this opaque
> > >> instead of
> > >> transparent and standardized:
> > >>> One PMD might reset out_ip checksum to 0x, and another PMD
> > >> might
> > >> reset it to 0x.
> > >>
> > >>>
> > >>> I can only see one solution:
> > >>> We need to standardize on common minimum requirements for how
> > >> to
> > >> prepare packets for each TX offload.
> > >>
> > >> If we can make each and every vendor to agree here - that
> > >> definitely
> > >> will help to simplify things quite a bit.
> > >
> > > An API is more than a function name and parameters.
> > > It also has preconditions and postconditions.
> > >
> > > All major NIC vendors are contributing to DPDK.
> > > It should be possible to reach consensus for reasonable minimum
> > >> requirements
> >  for offloads.
> > > Hardware- and driver-specific exceptions can be documented with
> > >> the offload
> >  flag, or with rte_eth_rx/tx_burst(), like the note to
> > > rte_eth_rx_burst():
> > > "Some drivers using vector instructions require that nb_pkts is
> > >> divisible by
> >  4 or 8, depending on the driver implementation."
> > 
> >  If we introduce a rule that everyone supposed to follow and then
> > >> straightway
> >  allow people to have a 'documented exceptions',
> >  for me it means like 'no rule' in practice.
> >  A 'documented exceptions' approach might work if you have 5
> > >> different PMDs to
> >  support, but not when you have 50+.
> >  No-one would write an app with possible 10 different exception
> > cases
> > >> in his
> >  head.
> >  Again, with such approach we can forget about backward
> > >> compatibility.
> >  I think we already had this discussion before, my opinion remains
> > >> the same
> >  here -
> >  'documented exceptions' approach is a way to trouble.
> > >>>
> > >>> The "minim

[PATCH v2] app/testpmd: fix lcore ID restriction

2024-04-16 Thread Sivaprasad Tummala
With modern CPUs it is possible to have a higher CPU count and
thus a higher RTE_MAX_LCORE. In the testpmd application, the
forwarding cores option "--nb-cores" is hard-limited to 255.

The patch fixes this constraint and also widens the lcore ID
type to 32 bits to align with the rte lcore APIs.

Fixes: af75078fece3 ("first public release")
Cc: sta...@dpdk.org

Signed-off-by: Sivaprasad Tummala 
---
v2:
* Fixed type mismatch in comparison
* Fixed incorrect format specifier
---
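
The casts added below are presumably needed because nb_txq and nb_fwd_ports
are 16-bit values whose product promotes to a signed int, which can no longer
be compared with the now 32-bit unsigned lcoreid_t without a signed/unsigned
mismatch warning. A stand-alone illustration (not part of the patch):

#include <stdint.h>

typedef uint32_t lcoreid_t;    /* as after this patch */

static int
enough_lcores(uint16_t nb_txq, uint16_t nb_fwd_ports, lcoreid_t nb_fwd_lcores)
{
    /* nb_txq * nb_fwd_ports promotes to (signed) int; comparing it
     * directly with the unsigned nb_fwd_lcores triggers -Wsign-compare,
     * hence the explicit cast. */
    return (lcoreid_t)(nb_txq * nb_fwd_ports) < nb_fwd_lcores;
}
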
 app/test-pmd/config.c | 6 +++---
 app/test-pmd/parameters.c | 4 ++--
 app/test-pmd/testpmd.h| 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index ba1007ace6..6b28c22c96 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -4785,9 +4785,9 @@ fwd_stream_on_other_lcores(uint16_t domain_id, lcoreid_t 
src_lc,
continue;
printf("Shared Rx queue group %u queue %hu can't be 
scheduled on different cores:\n",
   share_group, share_rxq);
-   printf("  lcore %hhu Port %hu queue %hu\n",
+   printf("  lcore %u Port %hu queue %hu\n",
   src_lc, src_port, src_rxq);
-   printf("  lcore %hhu Port %hu queue %hu\n",
+   printf("  lcore %u Port %hu queue %hu\n",
   lc_id, fs->rx_port, fs->rx_queue);
printf("Please use --nb-cores=%hu to limit number of 
forwarding cores\n",
   nb_rxq);
@@ -5159,7 +5159,7 @@ icmp_echo_config_setup(void)
lcoreid_t lc_id;
uint16_t  sm_id;
 
-   if ((nb_txq * nb_fwd_ports) < nb_fwd_lcores)
+   if ((lcoreid_t)(nb_txq * nb_fwd_ports) < nb_fwd_lcores)
cur_fwd_config.nb_fwd_lcores = (lcoreid_t)
(nb_txq * nb_fwd_ports);
else
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index c13f7564bf..22364e09ab 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -1071,8 +1071,8 @@ launch_args_parse(int argc, char** argv)
break;
case TESTPMD_OPT_NB_CORES_NUM:
n = atoi(optarg);
-   if (n > 0 && n <= nb_lcores)
-   nb_fwd_lcores = (uint8_t) n;
+   if (n > 0 && (lcoreid_t)n <= nb_lcores)
+   nb_fwd_lcores = (lcoreid_t) n;
else
rte_exit(EXIT_FAILURE,
"nb-cores should be > 0 and <= %d\n",
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 0afae7d771..9facd7f281 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -84,7 +84,7 @@ extern volatile uint8_t f_quit;
 /* Maximum number of pools supported per Rx queue */
 #define MAX_MEMPOOL 8
 
-typedef uint8_t  lcoreid_t;
+typedef uint32_t lcoreid_t;
 typedef uint16_t portid_t;
 typedef uint16_t queueid_t;
 typedef uint16_t streamid_t;
-- 
2.34.1



RE: [PATCH] common/mlx5: fix unsigned signed mismatch warning

2024-04-16 Thread Dariusz Sosnowski
> -Original Message-
> From: Tyler Retzlaff 
> Sent: Tuesday, April 16, 2024 01:10
> To: dev@dpdk.org
> Cc: Dariusz Sosnowski ; Dekel Peled
> ; Matan Azrad ; Ori Kam
> ; Suanming Mou ; Slava
> Ovsiienko ; Tyler Retzlaff
> ; dek...@mellanox.com; Matan Azrad
> ; sta...@dpdk.org
> Subject: [PATCH] common/mlx5: fix unsigned signed mismatch warning
> 
> Use unsigned int for 2 loop indexes that are being compared against an
> unsigned int struct field to avoid signed/unsigned mismatch warning.
> 
> Fixes: 718d166e5504 ("net/mlx5: create advanced RxQ table via DevX")
> Fixes: e1da60a8a6e9 ("common/mlx5: add DevX command to modify RQT")
> Cc: dek...@mellanox.com
> Cc: ma...@mellanox.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Tyler Retzlaff 

Acked-by: Dariusz Sosnowski 

Thanks,
Dariusz Sosnowski


[DPDK/ethdev Bug 1416] net/af_packet: tx_burst() can modify packets

2024-04-16 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1416

Bug ID: 1416
   Summary: net/af_packet: tx_burst() can modify packets
   Product: DPDK
   Version: 24.03
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: ethdev
  Assignee: dev@dpdk.org
  Reporter: konstantin.v.anan...@yandex.ru
  Target Milestone: ---

According to the ethdev doc, in general, PMD tx_burst() should not modify mbuf
contents. To be more specific:

ethdev/rte_ethdev.h:6396
...
 * @note This function must not modify mbufs (including packets data)
 * unless the refcnt is 1.
 * An exception is the bonding PMD, which does not have "Tx prepare" support,
 * in this case, mbufs may be modified.
...

Though, looking at eth_af_packet_tx(), it looks to me like it does modify
the packet contents without any check of the refcnt, etc.:

static uint16_t
eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
{
...
for (i = 0; i < nb_pkts; i++) {
mbuf = *bufs++;

...

/* insert vlan info if necessary */
if (mbuf->ol_flags & RTE_MBUF_F_TX_VLAN) {
if (rte_vlan_insert(&mbuf)) {
rte_pktmbuf_free(mbuf);
continue; 

AFAIU, it copies the mbuf contents into pbuf anyway (just a few lines below).
So the fix might be to simply insert the VLAN tag at the copying stage.
Feel free to correct me, if I missed something.
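
A minimal sketch of that direction - leaving the mbuf untouched and emitting
the 802.1Q tag while copying the data into the af_packet frame. It only covers
the first segment and a single tag, and is illustrative only, not a tested fix:

#include <string.h>
#include <rte_byteorder.h>
#include <rte_ether.h>
#include <rte_mbuf.h>

/* Copy one (single-segment) mbuf into the TX frame buffer, inserting the
 * VLAN tag on the fly instead of calling rte_vlan_insert() on a possibly
 * shared mbuf.  Returns the number of bytes written to 'frame'. */
static uint16_t
copy_with_vlan(const struct rte_mbuf *mbuf, uint8_t *frame)
{
    const uint8_t *src = rte_pktmbuf_mtod(mbuf, const uint8_t *);
    uint16_t len = rte_pktmbuf_data_len(mbuf);
    uint16_t tpid = rte_cpu_to_be_16(RTE_ETHER_TYPE_VLAN);
    uint16_t tci = rte_cpu_to_be_16(mbuf->vlan_tci);
    const size_t mac_len = 2 * RTE_ETHER_ADDR_LEN;    /* 12 bytes */

    if ((mbuf->ol_flags & RTE_MBUF_F_TX_VLAN) == 0) {
        memcpy(frame, src, len);
        return len;
    }

    memcpy(frame, src, mac_len);                          /* dst + src MAC  */
    memcpy(frame + mac_len, &tpid, sizeof(tpid));         /* 0x8100         */
    memcpy(frame + mac_len + 2, &tci, sizeof(tci));       /* VLAN TCI       */
    memcpy(frame + mac_len + 4, src + mac_len, len - mac_len); /* type + payload */
    return len + sizeof(struct rte_vlan_hdr);
}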

-- 
You are receiving this mail because:
You are the assignee for the bug.

RE: [PATCH v2 3/8] mbuf: fix Tx checksum offload examples

2024-04-16 Thread Konstantin Ananyev


> > > > > > > > > > > Mandate use of rte_eth_tx_prepare() in the mbuf Tx
> > > checksum
> > > > > offload
> > > > > > > > > > > examples.
> > > > > > > > > >
> > > > > > > > > > I strongly disagree with this change!
> > > > > > > > > >
> > > > > > > > > > It will cause a huge performance degradation for shaping
> > > > > applications:
> > > > > > > > > >
> > > > > > > > > > A packet will be processed and finalized at an output or
> > > > > forwarding
> > > > > > > > > pipeline stage, where some other fields might also be
> > > written,
> > > > > so
> > > > > > > > > > zeroing e.g. the out_ip checksum at this stage has low
> > > cost
> > > > > (no new
> > > > > > > > > cache misses).
> > > > > > > > > >
> > > > > > > > > > Then, the packet might be queued for QoS or similar.
> > > > > > > > > >
> > > > > > > > > > If rte_eth_tx_prepare() must be called at the egress
> > > pipeline
> > > > > stage,
> > > > > > > > > it has to write to the packet and cause a cache miss per
> > > packet,
> > > > > > > > > > instead of simply passing on the packet to the NIC
> > > hardware.
> > > > > > > > > >
> > > > > > > > > > It must be possible to finalize the packet at the
> > > > > output/forwarding
> > > > > > > > > pipeline stage!
> > > > > > > > >
> > > > > > > > > If you can finalize your packet on  output/forwarding, then
> > > why
> > > > > you
> > > > > > > > > can't invoke tx_prepare() on the same stage?
> > > > > > > > > There seems to be some misunderstanding about what
> > > tx_prepare()
> > > > > does -
> > > > > > > > > in fact it doesn't communicate with HW queue (doesn't update
> > > TXD
> > > > > ring,
> > > > > > > > > etc.), what it does - just make changes in mbuf itself.
> > > > > > > > > Yes, it reads some fields in SW TX queue struct (max number
> > > of
> > > > > TXDs per
> > > > > > > > > packet, etc.), but AFAIK it is safe
> > > > > > > > > to call tx_prepare() and tx_burst() from different threads.
> > > > > > > > > At least on implementations I am aware about.
> > > > > > > > > Just checked the docs - it seems not stated explicitly
> > > anywhere,
> > > > > might
> > > > > > > > > be that's why it causing such misunderstanding.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Also, how is rte_eth_tx_prepare() supposed to work for
> > > cloned
> > > > > packets
> > > > > > > > > egressing on different NIC hardware?
> > > > > > > > >
> > > > > > > > > If you create a clone of full packet (including L2/L3)
> > > headers
> > > > > then
> > > > > > > > > obviously such construction might not
> > > > > > > > > work properly with tx_prepare() over two different NICs.
> > > > > > > > > Though In majority of cases you do clone segments with data,
> > > > > while at
> > > > > > > > > least L2 headers are put into different segments.
> > > > > > > > > One simple approach would be to keep L3 header in that
> > > separate
> > > > > segment.
> > > > > > > > > But yes, there is a problem when you'll need to send exactly
> > > the
> > > > > same
> > > > > > > > > packet over different NICs.
> > > > > > > > > As I remember, for bonding PMD things don't work quite well
> > > here
> > > > > - you
> > > > > > > > > might have a bond over 2 NICs with
> > > > > > > > > different tx_prepare() and which one to call might be not
> > > clear
> > > > > till
> > > > > > > > > actual PMD tx_burst() is invoked.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > In theory, it might get even worse if we make this opaque
> > > > > instead of
> > > > > > > > > transparent and standardized:
> > > > > > > > > > One PMD might reset out_ip checksum to 0x, and another
> > > PMD
> > > > > might
> > > > > > > > > reset it to 0x.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I can only see one solution:
> > > > > > > > > > We need to standardize on common minimum requirements for
> > > how
> > > > > to
> > > > > > > > > prepare packets for each TX offload.
> > > > > > > > >
> > > > > > > > > If we can make each and every vendor to agree here - that
> > > > > definitely
> > > > > > > > > will help to simplify things quite a bit.
> > > > > > > >
> > > > > > > > An API is more than a function name and parameters.
> > > > > > > > It also has preconditions and postconditions.
> > > > > > > >
> > > > > > > > All major NIC vendors are contributing to DPDK.
> > > > > > > > It should be possible to reach consensus for reasonable
> > > minimum
> > > > > requirements
> > > > > > > for offloads.
> > > > > > > > Hardware- and driver-specific exceptions can be documented
> > > with
> > > > > the offload
> > > > > > > flag, or with rte_eth_rx/tx_burst(), like the note to
> > > > > > > > rte_eth_rx_burst():
> > > > > > > > "Some drivers using vector instructions require that nb_pkts
> > > is
> > > > > divisible by
> > > > > > > 4 or 8, depending on the driver implementation."
> > > > > > >
> > > > > > > If we introduce a rule that everyone supposed to follow and then
> > > > > straightway
> > > > > > > allow peopl

Re: [PATCH 1/5] dts: fix InteractiveShell command prompt filtering

2024-04-16 Thread Luca Vizzarro

Thank you for your review Juraj!

On 16/04/2024 09:40, Juraj Linkeš wrote:


When sending a command using an instance of InteractiveShell the output
is meant to filter out the leading shell prompt. The filtering logic is
present but the line is appended anyways.



I don't think that's what's happening here. The output collecting
logic is "stop when we encounter a prompt, but not the prompt with the
command we sent". We could change the comment though.


Yeah, I actually identified the behaviour better after writing this. 
Will update the commit body accordingly. And I mixed-up leading with 
trailing! This is meant to say "trailing shell prompt".



@@ -132,11 +132,11 @@ def send_command(self, command: str, prompt: str | None = 
None) -> str:
  self._stdin.flush()
  out: str = ""
  for line in self._stdout:
-out += line
  if prompt in line and not line.rstrip().endswith(
  command.rstrip()
  ):  # ignore line that sent command
  break
+out += line


If we do this, we'll only filter out the last prompt, which may not be
desirable, since the last prompt is there only because all of our
interactive shells force an extra prompt with _command_extra_chars.


Could you please expand more on this?


One thing we could improve though is removing the distribution welcome
message from logs, or at least separate it from the first command sent
with the interactive shell. The second option will allow us to see
clearly that an interactive session has been established, although we
could just emit a shorter log (something like "Started a testpmd
session" and then flush the welcome screen output).


I am not sure what you are referring to exactly, could you also expand 
more on this please?


Given it's not particularly explained, I thought having two command 
prompts (especially a trailing one) was an error. The main reason behind 
this is that when we go to parse the port info, the last entry which is 
"device private info" appears to be open ended, and I couldn't gather 
much information from the testpmd source code. So I opted to parse 
everything until the end. With a trailing command prompt this meant 
that: device_private_info="testpmd> ".


Re: [PATCH 2/5] dts: skip first line of send_command output

2024-04-16 Thread Luca Vizzarro

On 16/04/2024 09:48, Juraj Linkeš wrote:

Oh, the first commit message was confusing. It said leading prompt
which I understood to be the first prompt (the one with the command).
I see that this commit actually addresses what I thought the first
commit was trying to do.


Yes, my bad!


-def send_command(self, command: str, prompt: str | None = None) -> str:
+def send_command(
+self, command: str, prompt: str | None = None, skip_first_line: bool = 
False


Do we generally want or don't want to include the first line? When do
we absolutely not want to include it?


In the case of `show port info/stats {x}` if the provided port is 
invalid, then the first message starts with `Invalid port`. By providing 
an output that skips the command prompt, this is easily checked with 
output.startswith("Invalid port") as you may have noticed in the next 
commit. Otherwise it'd be a bit more complicated. Personally, I am not 
sure whether we care about the first line. With my limited knowledge I 
don't see a reason to include it (just as much as the trailing prompt).



+) -> str:
  """Send `command` and get all output before the expected ending 
string.

  Lines that expect input are not included in the stdout buffer, so 
they cannot
@@ -121,6 +123,7 @@ def send_command(self, command: str, prompt: str | None = 
None) -> str:
  command: The command to send.
  prompt: After sending the command, `send_command` will be 
expecting this string.
  If :data:`None`, will use the class's default prompt.
+skip_first_line: Skip the first line when capturing the output.

  Returns:
  All output in the buffer before expected string.
@@ -132,6 +135,9 @@ def send_command(self, command: str, prompt: str | None = 
None) -> str:
  self._stdin.flush()
  out: str = ""
  for line in self._stdout:
+if skip_first_line:
+skip_first_line = False
+continue


Is there ever a reason to distinguish between the first line and the
line with the command on it?


As above, not really sure. Would this always be a command prompt? The 
doubt arises only because I don't understand why we'd need the command 
prompt fed back.





  if prompt in line and not line.rstrip().endswith(
  command.rstrip()
  ):  # ignore line that sent command
--
2.34.1





Re: [PATCH 3/5] dts: add parsing utility module

2024-04-16 Thread Luca Vizzarro

On 16/04/2024 09:59, Juraj Linkeš wrote:

 From this commit message, I don't know why we're adding the module.
What are we going to use it for? Since you mentioned you'll send a v2,
I'll wait with review after that as I think it'll make it a bit easier
to review.


Ack. Will rewrite the commit body.


Re: [PATCH 4/5] dts: add `show port info` command to TestPmdShell

2024-04-16 Thread Luca Vizzarro

On 16/04/2024 10:03, Juraj Linkeš wrote:

+@dataclass
+class TestPmdPort(TextParser):


This and the classes above are missing docstrings.


Ack.


@@ -225,6 +664,38 @@ def set_forward_mode(self, mode: TestPmdForwardingModes, 
verify: bool = True):
  f"Test pmd failed to set fwd mode to {mode.value}"
  )

+def show_port_info_all(self) -> list[TestPmdPort]:
+"""Returns the information of all the ports."""


Can we add sample output so that the format of what we're trying to
parse is clear?


Ack.


+output = self.send_command("show port info all")
+
+ports = []
+iter = re.finditer(r"\*+.+\*+", output)
+if next(iter, False):  # we are slicing retrospectively, skip first 
block
+start_pos = 0
+for block in iter:
+end_pos = block.start()
+ports.append(TestPmdPort.parse(output[start_pos:end_pos]))
+start_pos = end_pos
+
+ports.append(TestPmdPort.parse(output[start_pos:]))
+
+return ports


Can this be done the same way it's done in the last commit?

iter = re.finditer(r"(^  #*.+#*$[^#]+)^  #*$", output, re.MULTILINE)
return [TestPmdPortStats.parse(block.group(1)) for block in iter]

Looks much better.


I agree that it looks much better. I gave it a first attempt to come up 
with a regular expression that is not too complicated and is able to 
match blocks individually. I've noticed that blocks start with:


  \n* Infos for port X 

but don't have an actual ending delimiter, unlike for the stats. I'll 
experiment with some look ahead constructs. The easiest solution is to 
match everything that is not * ([^*]+) but can we be certain that there 
won't be any asterisk in the actual information?


[PATCH v1 1/2] test/dma: update the sg test to verify wrap around case

2024-04-16 Thread Vidya Sagar Velumuri
Run the SG test in a loop to verify the wrap-around case.
The total number of commands submitted is made larger than the number of
descriptors allocated in order to exercise this scenario.

Signed-off-by: Vidya Sagar Velumuri 

diff --git a/app/test/test_dmadev.c b/app/test/test_dmadev.c
index 143e1bcd68..7462e90831 100644
--- a/app/test/test_dmadev.c
+++ b/app/test/test_dmadev.c
@@ -393,34 +393,26 @@ test_stop_start(int16_t dev_id, uint16_t vchan)
 }
 
 static int
-test_enqueue_sg_copies(int16_t dev_id, uint16_t vchan)
+test_enqueue_sg(int16_t dev_id, uint16_t vchan, unsigned int n_sge, unsigned 
int test_len)
 {
-   unsigned int src_len, dst_len, n_sge, len, i, j, k;
char orig_src[COPY_LEN], orig_dst[COPY_LEN];
-   struct rte_dma_info info = { 0 };
+   unsigned int src_len, dst_len, i, j, k;
enum rte_dma_status_code status;
uint16_t id, n_src, n_dst;
 
-   if (rte_dma_info_get(dev_id, &info) < 0)
-   ERR_RETURN("Failed to get dev info");
-
-   if (info.max_sges < 2)
-   ERR_RETURN("Test needs minimum 2 SG pointers");
-
-   n_sge = info.max_sges;
-
for (n_src = 1; n_src <= n_sge; n_src++) {
for (n_dst = 1; n_dst <= n_sge; n_dst++) {
/* Normalize SG buffer lengths */
-   len = COPY_LEN;
-   len -= (len % (n_src * n_dst));
-   dst_len = len / n_dst;
-   src_len = len / n_src;
-
+   unsigned int len = test_len - (test_len % (n_src * 
n_dst));
struct rte_dma_sge sg_src[n_sge], sg_dst[n_sge];
struct rte_mbuf *src[n_sge], *dst[n_sge];
char *src_data[n_sge], *dst_data[n_sge];
 
+   dst_len = len / n_dst;
+   src_len = len / n_src;
+   if (dst_len == 0 || src_len == 0)
+   continue;
+
for (i = 0 ; i < len; i++)
orig_src[i] = rte_rand() & 0xFF;
 
@@ -511,6 +503,27 @@ test_enqueue_sg_copies(int16_t dev_id, uint16_t vchan)
return 0;
 }
 
+static int
+test_enqueue_sg_copies(int16_t dev_id, uint16_t vchan)
+{
+   struct rte_dma_info info = { 0 };
+   unsigned int n_sge, len;
+   int loop_count = 0;
+
+   if (rte_dma_info_get(dev_id, &info) < 0)
+   ERR_RETURN("Failed to get dev info");
+
+   n_sge = RTE_MIN(info.max_sges, TEST_SG_MAX);
+   len = COPY_LEN;
+
+   do {
+   test_enqueue_sg(dev_id, vchan, n_sge, len);
+   loop_count++;
+   } while (loop_count * n_sge * n_sge < TEST_RINGSIZE * 3);
+
+   return 0;
+}
+
 /* Failure handling test cases - global macros and variables for those tests*/
 #define COMP_BURST_SZ  16
 #define OPT_FENCE(idx) ((fence && idx == 8) ? RTE_DMA_OP_FLAG_FENCE : 0)
diff --git a/app/test/test_dmadev_api.c b/app/test/test_dmadev_api.c
index d40c05cfbf..6a07ed593b 100644
--- a/app/test/test_dmadev_api.c
+++ b/app/test/test_dmadev_api.c
@@ -16,7 +16,6 @@ extern int test_dma_api(uint16_t dev_id);
 
 #define TEST_MEMCPY_SIZE   1024
 #define TEST_WAIT_US_VAL   5
-#define TEST_SG_MAX 64
 
 static int16_t test_dev_id;
 static int16_t invalid_dev_id;
diff --git a/app/test/test_dmadev_api.h b/app/test/test_dmadev_api.h
index 33fbc5bd41..a03f7acd4f 100644
--- a/app/test/test_dmadev_api.h
+++ b/app/test/test_dmadev_api.h
@@ -2,4 +2,6 @@
  * Copyright(c) 2021 HiSilicon Limited
  */
 
+#define TEST_SG_MAX 64
+
 int test_dma_api(uint16_t dev_id);
-- 
2.25.1



[PATCH v1 2/2] test/dma: add functions to verify zero and one fill

2024-04-16 Thread Vidya Sagar Velumuri
Add test cases to verify zero fill and one fill

Signed-off-by: Vidya Sagar Velumuri 

diff --git a/app/test/test.h b/app/test/test.h
index 15e23d297f..0ca6519f6e 100644
--- a/app/test/test.h
+++ b/app/test/test.h
@@ -27,6 +27,10 @@
 
 #include 
 
+#ifndef ARRAY_SIZE
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+#endif
+
 #define TEST_ASSERT RTE_TEST_ASSERT
 
 #define TEST_ASSERT_EQUAL RTE_TEST_ASSERT_EQUAL
diff --git a/app/test/test_dmadev.c b/app/test/test_dmadev.c
index 7462e90831..b21994d592 100644
--- a/app/test/test_dmadev.c
+++ b/app/test/test_dmadev.c
@@ -869,42 +869,52 @@ test_completion_handling(int16_t dev_id, uint16_t vchan)
 static int
 test_enqueue_fill(int16_t dev_id, uint16_t vchan)
 {
+   uint64_t pattern[3] = {0x0, 0xfedcba9876543210, 0xffffffffffffffff};
const unsigned int lengths[] = {8, 64, 1024, 50, 100, 89};
+   unsigned int i, j, k;
struct rte_mbuf *dst;
char *dst_data;
-   uint64_t pattern = 0xfedcba9876543210;
-   unsigned int i, j;
 
dst = rte_pktmbuf_alloc(pool);
if (dst == NULL)
ERR_RETURN("Failed to allocate mbuf\n");
dst_data = rte_pktmbuf_mtod(dst, char *);
 
-   for (i = 0; i < RTE_DIM(lengths); i++) {
-   /* reset dst_data */
-   memset(dst_data, 0, rte_pktmbuf_data_len(dst));
+   for (k = 0; k < ARRAY_SIZE(pattern); k++) {
+   printf("Test fill pattern: 0x%016lx\n", pattern[k]);
+   for (i = 0; i < RTE_DIM(lengths); i++) {
+   /* reset dst_data */
+   memset(dst_data, 0, rte_pktmbuf_data_len(dst));
+
+   /* perform the fill operation */
+   int id = rte_dma_fill(dev_id, vchan, pattern[k],
+   rte_pktmbuf_iova(dst), lengths[i], 
RTE_DMA_OP_FLAG_SUBMIT);
+   if (id < 0) {
+   if (id == -ENOTSUP) {
+   rte_pktmbuf_free(dst);
+   break;
+   }
+   ERR_RETURN("Error with rte_dma_fill\n");
+   }
+   await_hw(dev_id, vchan);
 
-   /* perform the fill operation */
-   int id = rte_dma_fill(dev_id, vchan, pattern,
-   rte_pktmbuf_iova(dst), lengths[i], 
RTE_DMA_OP_FLAG_SUBMIT);
-   if (id < 0)
-   ERR_RETURN("Error with rte_dma_fill\n");
-   await_hw(dev_id, vchan);
+   if (rte_dma_completed(dev_id, vchan, 1, NULL, NULL) != 
1)
+   ERR_RETURN("Error: fill operation failed 
(length: %u)\n",
+  lengths[i]);
+   /* check the data from the fill operation is correct */
+   for (j = 0; j < lengths[i]; j++) {
+   char pat_byte = ((char *)&pattern[k])[j % 8];
 
-   if (rte_dma_completed(dev_id, vchan, 1, NULL, NULL) != 1)
-   ERR_RETURN("Error: fill operation failed (length: 
%u)\n", lengths[i]);
-   /* check the data from the fill operation is correct */
-   for (j = 0; j < lengths[i]; j++) {
-   char pat_byte = ((char *)&pattern)[j % 8];
-   if (dst_data[j] != pat_byte)
-   ERR_RETURN("Error with fill operation (lengths 
= %u): got (%x), not (%x)\n",
-   lengths[i], dst_data[j], 
pat_byte);
+   if (dst_data[j] != pat_byte)
+   ERR_RETURN("Error with fill operation 
(lengths = %u): got (%x), not (%x)\n",
+   lengths[i], 
dst_data[j], pat_byte);
+   }
+   /* check that the data after the fill operation was not 
written to */
+   for (; j < rte_pktmbuf_data_len(dst); j++)
+   if (dst_data[j] != 0)
+   ERR_RETURN("Error, fill operation wrote 
too far (lengths = %u): got (%x), not (%x)\n",
+   lengths[i], 
dst_data[j], 0);
}
-   /* check that the data after the fill operation was not written 
to */
-   for (; j < rte_pktmbuf_data_len(dst); j++)
-   if (dst_data[j] != 0)
-   ERR_RETURN("Error, fill operation wrote too far 
(lengths = %u): got (%x), not (%x)\n",
-   lengths[i], dst_data[j], 0);
}
 
rte_pktmbuf_free(dst);
-- 
2.25.1



[PATCH] doc/sample_app_ug: add baseline mode

2024-04-16 Thread Karen Kelly
Update the PMD Power Management Mode section of the l3fwd-power
sample app to add the baseline option, which was missing from the
original commit.

Fixes: a9ea60702ecb ("examples/l3fwd-power: add baseline PMD management mode")

Signed-off-by: Karen Kelly 
---
 doc/guides/sample_app_ug/l3_forward_power_man.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/doc/guides/sample_app_ug/l3_forward_power_man.rst 
b/doc/guides/sample_app_ug/l3_forward_power_man.rst
index 4a6f33bf4f..9c9684fea7 100644
--- a/doc/guides/sample_app_ug/l3_forward_power_man.rst
+++ b/doc/guides/sample_app_ug/l3_forward_power_man.rst
@@ -280,6 +280,9 @@ will use automatic PMD power management.
 This mode is limited to one queue per core,
 and has three available power management schemes:
 
+``baseline``
+  This mode will not enable any power saving features.
+
 ``monitor``
   This will use ``rte_power_monitor()`` function to enter
   a power-optimized state (subject to platform support).
-- 
2.34.1



[PATCH v2] usertools: add telemetry exporter

2024-04-16 Thread Robin Jarry
For now the telemetry socket is local to the machine running a DPDK
application. Also, there is no official "schema" for the exposed
metrics. Add a framework and a script to collect and expose these
metrics to telemetry and observability aggregators such as Prometheus,
Carbon or InfluxDB. The exposed data must be designed with end-users in
mind; some DPDK terminology or internals may not make sense to everyone.

The script only serves as an entry point and does not know anything
about any specific metrics nor JSON data structures exposed in the
telemetry socket.

It uses dynamically loaded endpoint exporters which are basic python
files that must implement two functions:

 def info() -> dict[MetricName, MetricInfo]:
 Mapping of metric names to their description and type.

 def metrics(sock: TelemetrySocket) -> list[MetricValue]:
 Request data from sock and return it as metric values. A metric
 value is a 3-tuple: (name: str, value: any, labels: dict). Each
 name must be present in info().

The sock argument passed to metrics() has a single method:

 def cmd(self, uri: str, arg: any = None) -> dict | list:
 Request JSON data to the telemetry socket and parse it to python
 values.
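
As an illustration only, a hypothetical extra endpoint could look roughly
like the following (the metric names, telemetry URIs and reply field names
below are assumptions and may not match the real exporter or every DPDK
version; MetricInfo is assumed to be a plain dict here):

def info() -> dict:
    return {
        "port_rx_packets": {"description": "Received packets.", "type": "counter"},
        "port_tx_packets": {"description": "Transmitted packets.", "type": "counter"},
    }


def metrics(sock) -> list:
    values = []
    # Assumed URIs/fields: "/ethdev/list" returning the port ids and
    # "/ethdev/stats" returning a dict with "ipackets"/"opackets".
    for port_id in sock.cmd("/ethdev/list"):
        stats = sock.cmd("/ethdev/stats", port_id)
        labels = {"port": str(port_id)}
        values.append(("port_rx_packets", stats["ipackets"], labels))
        values.append(("port_tx_packets", stats["opackets"], labels))
    return values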

The main script invokes endpoints and exports the data into an output
format. For now, only two formats are implemented:

* openmetrics/prometheus: text based format exported via a local HTTP
  server.
* carbon/graphite: binary (python pickle) format exported to a distant
  carbon TCP server.

As a starting point, 3 built-in endpoints are implemented:

* counters: ethdev hardware counters
* cpu: lcore usage
* memory: overall memory usage

The goal is to keep all built-in endpoints in the DPDK repository so
that they can be updated along with the telemetry JSON data structures.

Example output for the openmetrics:// format:

 ~# dpdk-telemetry-exporter.py -o openmetrics://:9876 &
 INFO using endpoint: counters (from .../telemetry-endpoints/counters.py)
 INFO using endpoint: cpu (from .../telemetry-endpoints/cpu.py)
 INFO using endpoint: memory (from .../telemetry-endpoints/memory.py)
 INFO listening on port 9876
 [1] 838829

 ~$ curl http://127.0.0.1:9876/
 # HELP dpdk_cpu_total_cycles Total number of CPU cycles.
 # TYPE dpdk_cpu_total_cycles counter
 # HELP dpdk_cpu_busy_cycles Number of busy CPU cycles.
 # TYPE dpdk_cpu_busy_cycles counter
 dpdk_cpu_total_cycles{cpu="73", numa="0"} 4353385274702980
 dpdk_cpu_busy_cycles{cpu="73", numa="0"} 6215932860
 dpdk_cpu_total_cycles{cpu="9", numa="0"} 4353385274745740
 dpdk_cpu_busy_cycles{cpu="9", numa="0"} 6215932860
 dpdk_cpu_total_cycles{cpu="8", numa="0"} 4353383451895540
 dpdk_cpu_busy_cycles{cpu="8", numa="0"} 6171923160
 dpdk_cpu_total_cycles{cpu="72", numa="0"} 4353385274817320
 dpdk_cpu_busy_cycles{cpu="72", numa="0"} 6215932860
 # HELP dpdk_memory_total_bytes The total size of reserved memory in bytes.
 # TYPE dpdk_memory_total_bytes gauge
 # HELP dpdk_memory_used_bytes The currently used memory in bytes.
 # TYPE dpdk_memory_used_bytes gauge
 dpdk_memory_total_bytes 1073741824
 dpdk_memory_used_bytes 794197376

Link: 
https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format
Link: 
https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#text-format
Link: 
https://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-pickle-protocol
Link: 
https://github.com/influxdata/telegraf/tree/master/plugins/inputs/prometheus
Signed-off-by: Robin Jarry 
---

Notes:
v2:

* Refuse to run if no endpoints are enabled.
* Handle endpoint errors gracefully without failing the whole query.

 usertools/dpdk-telemetry-exporter.py  | 405 ++
 usertools/meson.build |   6 +
 usertools/telemetry-endpoints/counters.py |  47 +++
 usertools/telemetry-endpoints/cpu.py  |  29 ++
 usertools/telemetry-endpoints/memory.py   |  37 ++
 5 files changed, 524 insertions(+)
 create mode 100755 usertools/dpdk-telemetry-exporter.py
 create mode 100644 usertools/telemetry-endpoints/counters.py
 create mode 100644 usertools/telemetry-endpoints/cpu.py
 create mode 100644 usertools/telemetry-endpoints/memory.py

diff --git a/usertools/dpdk-telemetry-exporter.py 
b/usertools/dpdk-telemetry-exporter.py
new file mode 100755
index ..f8d873ad856c
--- /dev/null
+++ b/usertools/dpdk-telemetry-exporter.py
@@ -0,0 +1,405 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2023 Robin Jarry
+
+r'''
+DPDK telemetry exporter.
+
+It uses dynamically loaded endpoint exporters which are basic python files that
+must implement two functions:
+
+def info() -> dict[MetricName, MetricInfo]:
+"""
+Mapping of metric names to their description and type.
+"""
+
+def metrics(sock: TelemetrySocket) -> list[MetricValue]:
+"""
+Request data from sock and return it as metric values. A metric value
+is a 3-tuple: (n

[PATCH 0/5] use RTE_DIM where possible

2024-04-16 Thread Stephen Hemminger
There is macro for computing the number of elements in an array RTE_DIM.
But it is not used in many places where it could be.
Based on similar coccinelle script in Linux.

Stephen Hemminger (5):
  cocci: add script to use RTE_DIM
  app: use RTE_DIM
  lib: use RTE_DIM
  examples _use RTE_DIM
  drivers/net: use RTE_DIM

 app/test-flow-perf/main.c  |  4 +--
 app/test/test_ethdev_link.c|  3 +-
 app/test/test_security_inline_macsec.c | 47 +-
 devtools/cocci/rte-dim.cocci   | 23 +
 drivers/net/ark/ark_pktchkr.c  |  2 +-
 drivers/net/ark/ark_pktgen.c   |  2 +-
 drivers/net/bnxt/bnxt_hwrm.c   | 12 +++
 drivers/net/e1000/em_rxtx.c|  3 +-
 drivers/net/iavf/iavf_ipsec_crypto.c   |  3 +-
 drivers/net/igc/igc_ethdev.c   |  3 +-
 drivers/net/ipn3ke/ipn3ke_tm.c |  3 +-
 drivers/net/ngbe/ngbe_ethdev.c |  6 ++--
 drivers/net/octeontx/octeontx_stats.h  |  3 +-
 drivers/net/txgbe/txgbe_ethdev.c   |  9 ++---
 drivers/net/txgbe/txgbe_ethdev_vf.c|  3 +-
 examples/l3fwd/main.c  |  3 +-
 examples/qos_sched/init.c  |  3 +-
 lib/cmdline/cmdline_vt100.c|  4 ++-
 lib/latencystats/rte_latencystats.c|  3 +-
 19 files changed, 73 insertions(+), 66 deletions(-)
 create mode 100644 devtools/cocci/rte-dim.cocci

-- 
2.43.0



[PATCH 1/5] cocci: add script to use RTE_DIM

2024-04-16 Thread Stephen Hemminger
New coccinelle script to find places where RTE_DIM should be used.

Signed-off-by: Stephen Hemminger 
---
 devtools/cocci/rte-dim.cocci | 23 +++
 1 file changed, 23 insertions(+)
 create mode 100644 devtools/cocci/rte-dim.cocci

diff --git a/devtools/cocci/rte-dim.cocci b/devtools/cocci/rte-dim.cocci
new file mode 100644
index 00..e7fcadec98
--- /dev/null
+++ b/devtools/cocci/rte-dim.cocci
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: BSD-3-Clause
+// Use RTE_DIM macro instead of dividing sizeof an array by sizeof an element
+//
+// Based on the Linux kernel's array_size.cocci
+//
+@@
+type T;
+T[] E;
+@@
+(
+|
+- (sizeof(E)/sizeof(E[...]))
++ RTE_DIM(E)
+|
+- (sizeof(E)/sizeof(*E))
++ RTE_DIM(E)
+|
+- (sizeof(E)/sizeof(T))
++ RTE_DIM(E)
+|
+- RTE_DIM((E))
++ RTE_DIM(E)
+)
-- 
2.43.0



[PATCH 2/5] app: use RTE_DIM

2024-04-16 Thread Stephen Hemminger
Use RTE_DIM instead of computing directly with sizeof.
Patch automatically generated via cocci/rte_dim.cocci.

Signed-off-by: Stephen Hemminger 
---
 app/test-flow-perf/main.c  |  4 +--
 app/test/test_ethdev_link.c|  3 +-
 app/test/test_security_inline_macsec.c | 47 +-
 3 files changed, 25 insertions(+), 29 deletions(-)

diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
index e224ef6798..e9ef1ae04c 100644
--- a/app/test-flow-perf/main.c
+++ b/app/test-flow-perf/main.c
@@ -965,9 +965,7 @@ args_parse(int argc, char **argv)
"meter-profile") == 0) {
i = 0;
token = strsep(&optarg, ",\0");
-   while (token != NULL && i < sizeof(
-   meter_profile_values) /
-   sizeof(uint64_t)) {
+   while (token != NULL && i < 
RTE_DIM(meter_profile_values)) {
meter_profile_values[i++] = atol(token);
token = strsep(&optarg, ",\0");
}
diff --git a/app/test/test_ethdev_link.c b/app/test/test_ethdev_link.c
index f063a5fe26..e305df71be 100644
--- a/app/test/test_ethdev_link.c
+++ b/app/test/test_ethdev_link.c
@@ -135,8 +135,7 @@ test_link_speed_all_values(void)
{ "Invalid",   50505 }
};
 
-   for (i = 0; i < sizeof(speed_str_map) / sizeof(struct link_speed_t);
-   i++) {
+   for (i = 0; i < RTE_DIM(speed_str_map); i++) {
speed = rte_eth_link_speed_to_str(speed_str_map[i].link_speed);
TEST_ASSERT_BUFFERS_ARE_EQUAL(speed_str_map[i].value,
speed, strlen(speed_str_map[i].value),
diff --git a/app/test/test_security_inline_macsec.c 
b/app/test/test_security_inline_macsec.c
index f11e9da8c3..26f7504dc2 100644
--- a/app/test/test_security_inline_macsec.c
+++ b/app/test/test_security_inline_macsec.c
@@ -1294,7 +1294,7 @@ test_inline_macsec_encap_all(void)
opts.sectag_insert_mode = 1;
opts.mtu = RTE_ETHER_MTU;
 
-   size = (sizeof(list_mcs_cipher_vectors) / 
sizeof((list_mcs_cipher_vectors)[0]));
+   size = RTE_DIM(list_mcs_cipher_vectors);
for (i = 0; i < size; i++) {
cur_td = &list_mcs_cipher_vectors[i];
err = test_macsec(&cur_td, MCS_ENCAP, &opts);
@@ -1332,7 +1332,7 @@ test_inline_macsec_decap_all(void)
opts.sectag_insert_mode = 1;
opts.mtu = RTE_ETHER_MTU;
 
-   size = (sizeof(list_mcs_cipher_vectors) / 
sizeof((list_mcs_cipher_vectors)[0]));
+   size = RTE_DIM(list_mcs_cipher_vectors);
for (i = 0; i < size; i++) {
cur_td = &list_mcs_cipher_vectors[i];
err = test_macsec(&cur_td, MCS_DECAP, &opts);
@@ -1371,7 +1371,7 @@ test_inline_macsec_auth_only_all(void)
opts.sectag_insert_mode = 1;
opts.mtu = RTE_ETHER_MTU;
 
-   size = (sizeof(list_mcs_integrity_vectors) / 
sizeof((list_mcs_integrity_vectors)[0]));
+   size = RTE_DIM(list_mcs_integrity_vectors);
 
for (i = 0; i < size; i++) {
cur_td = &list_mcs_integrity_vectors[i];
@@ -1410,7 +1410,7 @@ test_inline_macsec_verify_only_all(void)
opts.sectag_insert_mode = 1;
opts.mtu = RTE_ETHER_MTU;
 
-   size = (sizeof(list_mcs_integrity_vectors) / 
sizeof((list_mcs_integrity_vectors)[0]));
+   size = RTE_DIM(list_mcs_integrity_vectors);
 
for (i = 0; i < size; i++) {
cur_td = &list_mcs_integrity_vectors[i];
@@ -1451,7 +1451,7 @@ test_inline_macsec_encap_decap_all(void)
opts.sectag_insert_mode = 1;
opts.mtu = RTE_ETHER_MTU;
 
-   size = (sizeof(list_mcs_cipher_vectors) / 
sizeof((list_mcs_cipher_vectors)[0]));
+   size = RTE_DIM(list_mcs_cipher_vectors);
 
for (i = 0; i < size; i++) {
cur_td = &list_mcs_cipher_vectors[i];
@@ -1492,7 +1492,7 @@ test_inline_macsec_auth_verify_all(void)
opts.sectag_insert_mode = 1;
opts.mtu = RTE_ETHER_MTU;
 
-   size = (sizeof(list_mcs_integrity_vectors) / 
sizeof((list_mcs_integrity_vectors)[0]));
+   size = RTE_DIM(list_mcs_integrity_vectors);
 
for (i = 0; i < size; i++) {
cur_td = &list_mcs_integrity_vectors[i];
@@ -1578,7 +1578,7 @@ test_inline_macsec_with_vlan(void)
opts.nb_td = 1;
opts.mtu = RTE_ETHER_MTU;
 
-   size = (sizeof(list_mcs_vlan_vectors) / 
sizeof((list_mcs_vlan_vectors)[0]));
+   size = RTE_DIM(list_mcs_vlan_vectors);
 
for (i = 0; i < size; i++) {
cur_td = &list_mcs_vlan_vectors[i];
@@ -1653,7 +1653,7 @@ test_inline_macsec_pkt_drop(void)
opts.sectag_insert_mode = 1;
opts.mtu = RTE_ETHER_MTU;
 
-   size = (sizeof(list_mcs_err_cipher_vec

[PATCH 3/5] lib: use RTE_DIM

2024-04-16 Thread Stephen Hemminger
Use RTE_DIM instead of computing directly with sizeof.
Patch automatically generated via cocci/rte_dim.cocci.

Signed-off-by: Stephen Hemminger 
---
 lib/cmdline/cmdline_vt100.c | 4 +++-
 lib/latencystats/rte_latencystats.c | 3 +--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/cmdline/cmdline_vt100.c b/lib/cmdline/cmdline_vt100.c
index 4c9a46c953..61a2ecdd72 100644
--- a/lib/cmdline/cmdline_vt100.c
+++ b/lib/cmdline/cmdline_vt100.c
@@ -8,6 +8,8 @@
 #include 
 #include 
 
+#include 
+
 #include "cmdline_vt100.h"
 
 const char *cmdline_vt100_commands[] = {
@@ -56,7 +58,7 @@ match_command(char *buf, unsigned int size)
size_t cmdlen;
unsigned int i = 0;
 
-   for (i=0 ; i

[PATCH 4/5] examples _use RTE_DIM

2024-04-16 Thread Stephen Hemminger
Use RTE_DIM instead of computing directly with sizeof.
Patch automatically generated via cocci/rte_dim.cocci.

Signed-off-by: Stephen Hemminger 
---
 examples/l3fwd/main.c | 3 +--
 examples/qos_sched/init.c | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 8d32ae1dd5..3960d85202 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -116,8 +116,7 @@ static struct lcore_params lcore_params_array_default[] = {
 };
 
 static struct lcore_params * lcore_params = lcore_params_array_default;
-static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /
-   sizeof(lcore_params_array_default[0]);
+static uint16_t nb_lcore_params = RTE_DIM(lcore_params_array_default);
 
 static struct rte_eth_conf port_conf = {
.rxmode = {
diff --git a/examples/qos_sched/init.c b/examples/qos_sched/init.c
index d8abae635a..75629e36ea 100644
--- a/examples/qos_sched/init.c
+++ b/examples/qos_sched/init.c
@@ -206,8 +206,7 @@ struct rte_sched_subport_params 
subport_params[MAX_SCHED_SUBPORTS] = {
.n_pipes_per_subport_enabled = 4096,
.qsize = {64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64},
.pipe_profiles = pipe_profiles,
-   .n_pipe_profiles = sizeof(pipe_profiles) /
-   sizeof(struct rte_sched_pipe_params),
+   .n_pipe_profiles = RTE_DIM(pipe_profiles),
.n_max_pipe_profiles = MAX_SCHED_PIPE_PROFILES,
.cman_params = NULL,
},
-- 
2.43.0



[PATCH 5/5] drivers/net: use RTE_DIM

2024-04-16 Thread Stephen Hemminger
Use RTE_DIM instead of computing directly with sizeof.
Patch automatically generated via cocci/rte_dim.cocci.
Code in base/ subdirectory manually excluded.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/ark/ark_pktchkr.c |  2 +-
 drivers/net/ark/ark_pktgen.c  |  2 +-
 drivers/net/bnxt/bnxt_hwrm.c  | 12 ++--
 drivers/net/e1000/em_rxtx.c   |  3 +--
 drivers/net/iavf/iavf_ipsec_crypto.c  |  3 +--
 drivers/net/igc/igc_ethdev.c  |  3 +--
 drivers/net/ipn3ke/ipn3ke_tm.c|  3 +--
 drivers/net/ngbe/ngbe_ethdev.c|  6 ++
 drivers/net/octeontx/octeontx_stats.h |  3 +--
 drivers/net/txgbe/txgbe_ethdev.c  |  9 +++--
 drivers/net/txgbe/txgbe_ethdev_vf.c   |  3 +--
 11 files changed, 19 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ark/ark_pktchkr.c b/drivers/net/ark/ark_pktchkr.c
index e1f336c73c..63c298fca8 100644
--- a/drivers/net/ark/ark_pktchkr.c
+++ b/drivers/net/ark/ark_pktchkr.c
@@ -320,7 +320,7 @@ options(const char *id)
 {
unsigned int i;
 
-   for (i = 0; i < sizeof(toptions) / sizeof(struct OPTIONS); i++) {
+   for (i = 0; i < RTE_DIM(toptions); i++) {
if (strcmp(id, toptions[i].opt) == 0)
return &toptions[i];
}
diff --git a/drivers/net/ark/ark_pktgen.c b/drivers/net/ark/ark_pktgen.c
index 69ff7072b2..98e692c3ad 100644
--- a/drivers/net/ark/ark_pktgen.c
+++ b/drivers/net/ark/ark_pktgen.c
@@ -297,7 +297,7 @@ options(const char *id)
 {
unsigned int i;
 
-   for (i = 0; i < sizeof(toptions) / sizeof(struct OPTIONS); i++) {
+   for (i = 0; i < RTE_DIM(toptions); i++) {
if (strcmp(id, toptions[i].opt) == 0)
return &toptions[i];
}
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 6ea7089a3f..68d7382806 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -72,13 +72,13 @@ const char *media_type[] = { "Unknown", "Twisted Pair",
"Direct Attached Copper", "Fiber"
 };
 
-#define MAX_MEDIA_TYPE (sizeof(media_type) / sizeof(const char *))
+#define MAX_MEDIA_TYPE RTE_DIM(media_type)
 
 const char *link_status_str[] = { "Down. No link or cable detected.",
"Down. No link, but a cable has been detected.", "Up.",
 };
 
-#define MAX_LINK_STR (sizeof(link_status_str) / sizeof(const char *))
+#define MAX_LINK_STR RTE_DIM(link_status_str)
 
 const char *fec_mode[] = {
"No active FEC",
@@ -90,13 +90,13 @@ const char *fec_mode[] = {
"FEC RS(272,257)"
 };
 
-#define MAX_FEC_MODE (sizeof(fec_mode) / sizeof(const char *))
+#define MAX_FEC_MODE RTE_DIM(fec_mode)
 
 const char *signal_mode[] = {
"NRZ", "PAM4", "PAM4_112"
 };
 
-#define MAX_SIG_MODE (sizeof(signal_mode) / sizeof(const char *))
+#define MAX_SIG_MODE RTE_DIM(signal_mode)
 
 /* multi-purpose multi-key table container.
  * Add a unique entry for a new PHY attribs as per HW CAS.
@@ -226,7 +226,7 @@ struct link_speeds2_tbl {
},
 };
 
-#define BNXT_SPEEDS2_TBL_SZ (sizeof(link_speeds2_tbl) / 
sizeof(*link_speeds2_tbl))
+#define BNXT_SPEEDS2_TBL_SZ RTE_DIM(link_speeds2_tbl)
 
 /* In hwrm_phy_qcfg reports trained up speeds in link_speed(offset:0x8[31:16]) 
*/
 struct link_speeds_tbl {
@@ -269,7 +269,7 @@ struct link_speeds_tbl {
},
 };
 
-#define BNXT_SPEEDS_TBL_SZ (sizeof(link_speeds_tbl) / sizeof(*link_speeds_tbl))
+#define BNXT_SPEEDS_TBL_SZ RTE_DIM(link_speeds_tbl)
 
 static const char *bnxt_get_xcvr_type(uint32_t 
xcvr_identifier_type_tx_lpi_timer)
 {
diff --git a/drivers/net/e1000/em_rxtx.c b/drivers/net/e1000/em_rxtx.c
index df5fbb7823..59d7793787 100644
--- a/drivers/net/e1000/em_rxtx.c
+++ b/drivers/net/e1000/em_rxtx.c
@@ -1666,8 +1666,7 @@ em_rctl_bsize(__rte_unused enum e1000_mac_type hwtyp, 
uint32_t *bufsz)
 * ***
 */
 
-   for (i = 0; i != sizeof(bufsz_to_rctl) / sizeof(bufsz_to_rctl[0]);
-   i++) {
+   for (i = 0; i != RTE_DIM(bufsz_to_rctl); i++) {
if (rctl_bsize >= bufsz_to_rctl[i].bufsz) {
*bufsz = bufsz_to_rctl[i].bufsz;
return bufsz_to_rctl[i].rctl;
diff --git a/drivers/net/iavf/iavf_ipsec_crypto.c 
b/drivers/net/iavf/iavf_ipsec_crypto.c
index 6fd45ff45f..d3ce666ef0 100644
--- a/drivers/net/iavf/iavf_ipsec_crypto.c
+++ b/drivers/net/iavf/iavf_ipsec_crypto.c
@@ -1483,8 +1483,7 @@ iavf_ipsec_crypto_capabilities_get(void *device)
 * crypto capabilities, except for last element of the array which is
 * the null termination
 */
-   for (i = 0; i < ((sizeof(iavf_security_capabilities) /
-   sizeof(iavf_security_capabilities[0])) - 1); i++) {
+   for (i = 0; i < (RTE_DIM(iavf_security_capabilities) - 1); i++) {
iavf_security_capabilities[i].crypto_capabilities =
iavf_sctx->crypto_capabilit

Re: [PATCH 2/5] app: use RTE_DIM

2024-04-16 Thread Tyler Retzlaff
On Tue, Apr 16, 2024 at 08:19:28AM -0700, Stephen Hemminger wrote:
> Use RTE_DIM instead of computing directly with sizeof.
> Patch automatically generated via cocci/rte_dim.cocci.
> 
> Signed-off-by: Stephen Hemminger 
> ---

Acked-by: Tyler Retzlaff 



[PATCH 0/4] net/mlx5: some unrelated fixes and improvements

2024-04-16 Thread Michael Baum
This patch-set groups some unrelated fixes and improvements in MLX5 PMD
code.

Michael Baum (4):
  net/mlx5: fix secondary process port close
  net/mlx5/hws: fix GENEVE option class partial mask
  net/mlx5/hws: add fragment packet ID matching support
  net/mlx5/hws: remove table type DONTCARE

 drivers/net/mlx5/hws/mlx5dr.h |  1 -
 drivers/net/mlx5/hws/mlx5dr_definer.c | 47 +++
 drivers/net/mlx5/hws/mlx5dr_definer.h |  2 ++
 drivers/net/mlx5/mlx5.c   | 15 +
 drivers/net/mlx5/mlx5_flow.h  |  2 +-
 5 files changed, 37 insertions(+), 30 deletions(-)

-- 
2.25.1



[PATCH 3/4] net/mlx5/hws: add fragment packet ID matching support

2024-04-16 Thread Michael Baum
Add HWS support for matching on the IPv4 fragment packet ID field.

Signed-off-by: Michael Baum 
Reviewed-by: Alex Vesker 
Acked-by: Matan Azrad 
---
 drivers/net/mlx5/hws/mlx5dr_definer.c | 11 ++-
 drivers/net/mlx5/hws/mlx5dr_definer.h |  2 ++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/hws/mlx5dr_definer.c 
b/drivers/net/mlx5/hws/mlx5dr_definer.c
index f1f544deab..7bb328e85e 100644
--- a/drivers/net/mlx5/hws/mlx5dr_definer.c
+++ b/drivers/net/mlx5/hws/mlx5dr_definer.c
@@ -171,6 +171,7 @@ struct mlx5dr_definer_conv_data {
X(SET,  ipv4_version,   STE_IPV4,   
rte_ipv4_hdr) \
X(SET_BE16, ipv4_frag,  v->fragment_offset, 
rte_ipv4_hdr) \
X(SET_BE16, ipv4_len,   v->total_length,
rte_ipv4_hdr) \
+   X(SET_BE16, ipv4_identification,v->packet_id,   
rte_ipv4_hdr) \
X(SET,  ip_fragmented,  !!v->fragment_offset,   
rte_ipv4_hdr) \
X(SET_BE16, ipv6_payload_len,   v->hdr.payload_len, 
rte_flow_item_ipv6) \
X(SET,  ipv6_proto, v->hdr.proto,   
rte_flow_item_ipv6) \
@@ -1026,7 +1027,7 @@ mlx5dr_definer_conv_item_ipv4(struct 
mlx5dr_definer_conv_data *cd,
if (!m)
return 0;
 
-   if (m->packet_id || m->hdr_checksum ||
+   if (m->hdr_checksum ||
(l && (l->next_proto_id || l->type_of_service))) {
rte_errno = ENOTSUP;
return rte_errno;
@@ -1060,6 +1061,14 @@ mlx5dr_definer_conv_item_ipv4(struct 
mlx5dr_definer_conv_data *cd,
DR_CALC_SET(fc, eth_l3, protocol_next_header, inner);
}
 
+   if (m->packet_id) {
+   fc = &cd->fc[DR_CALC_FNAME(IP_ID, inner)];
+   fc->item_idx = item_idx;
+   fc->is_range = l && l->packet_id;
+   fc->tag_set = &mlx5dr_definer_ipv4_identification_set;
+   DR_CALC_SET(fc, eth_l3, identification, inner);
+   }
+
if (m->total_length) {
fc = &cd->fc[DR_CALC_FNAME(IP_LEN, inner)];
fc->item_idx = item_idx;
diff --git a/drivers/net/mlx5/hws/mlx5dr_definer.h 
b/drivers/net/mlx5/hws/mlx5dr_definer.h
index ca530ebf30..a42ba9b81a 100644
--- a/drivers/net/mlx5/hws/mlx5dr_definer.h
+++ b/drivers/net/mlx5/hws/mlx5dr_definer.h
@@ -75,6 +75,8 @@ enum mlx5dr_definer_fname {
MLX5DR_DEFINER_FNAME_IP_VERSION_I,
MLX5DR_DEFINER_FNAME_IP_FRAG_O,
MLX5DR_DEFINER_FNAME_IP_FRAG_I,
+   MLX5DR_DEFINER_FNAME_IP_ID_O,
+   MLX5DR_DEFINER_FNAME_IP_ID_I,
MLX5DR_DEFINER_FNAME_IP_LEN_O,
MLX5DR_DEFINER_FNAME_IP_LEN_I,
MLX5DR_DEFINER_FNAME_IP_TOS_O,
-- 
2.25.1



[PATCH 1/4] net/mlx5: fix secondary process port close

2024-04-16 Thread Michael Baum
The "mlx5_dev_close()" function is used for both primary and secondary
processes.

If a secondary process uses this function after the primary process is
closed, the priv structure isn't valid anymore.
The function accesses the priv structure to get the "sh" pointer in the
part shared between processes, causing a crash for the secondary process.

This patch avoids this access and prints a warning in this case.

Fixes: f5177bdc8b76 ("net/mlx5: add GENEVE TLV options parser API")
Cc: michae...@nvidia.com
Cc: sta...@dpdk.org

Signed-off-by: Michael Baum 
Acked-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index d1a63822a5..585b4d5497 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -2295,11 +2295,13 @@ int
 mlx5_dev_close(struct rte_eth_dev *dev)
 {
struct mlx5_priv *priv = dev->data->dev_private;
-   struct mlx5_dev_ctx_shared *sh = priv->sh;
+   struct mlx5_dev_ctx_shared *sh;
unsigned int i;
int ret;
 
if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+   if (!priv)
+   DRV_LOG(WARNING, "primary process is already closed");
/* Check if process_private released. */
if (!dev->process_private)
return 0;
@@ -2308,6 +2310,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
rte_eth_dev_release_port(dev);
return 0;
}
+   sh = priv->sh;
if (!sh)
return 0;
if (priv->shared_refcnt) {
@@ -2326,9 +2329,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
}
 #endif
DRV_LOG(DEBUG, "port %u closing device \"%s\"",
-   dev->data->port_id,
-   ((priv->sh->cdev->ctx != NULL) ?
-   mlx5_os_get_ctx_device_name(priv->sh->cdev->ctx) : ""));
+   dev->data->port_id, sh->ibdev_name);
/*
 * If default mreg copy action is removed at the stop stage,
 * the search will return none and nothing will be done anymore.
@@ -2402,7 +2403,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
mlx5_free(priv->rss_conf.rss_key);
if (priv->reta_idx != NULL)
mlx5_free(priv->reta_idx);
-   if (priv->sh->dev_cap.vf)
+   if (sh->dev_cap.vf)
mlx5_os_mac_addr_flush(dev);
if (priv->nl_socket_route >= 0)
close(priv->nl_socket_route);
@@ -2445,7 +2446,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
if (priv->hrxqs)
mlx5_list_destroy(priv->hrxqs);
mlx5_free(priv->ext_rxqs);
-   priv->sh->port[priv->dev_port - 1].nl_ih_port_id = RTE_MAX_ETHPORTS;
+   sh->port[priv->dev_port - 1].nl_ih_port_id = RTE_MAX_ETHPORTS;
/*
 * The interrupt handler port id must be reset before priv is reset
 * since 'mlx5_dev_interrupt_nl_cb' uses priv.
@@ -2457,7 +2458,7 @@ mlx5_dev_close(struct rte_eth_dev *dev)
 * mlx5_os_mac_addr_flush() uses ibdev_path for retrieving
 * ifindex if Netlink fails.
 */
-   mlx5_free_shared_dev_ctx(priv->sh);
+   mlx5_free_shared_dev_ctx(sh);
if (priv->domain_id != RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID) {
unsigned int c = 0;
uint16_t port_id;
-- 
2.25.1



[PATCH 4/4] net/mlx5/hws: remove table type DONTCARE

2024-04-16 Thread Michael Baum
This patch removes the "MLX5DR_TABLE_TYPE_DONTCARE" enum value and uses
the correct type instead, even for places where the type is "don't care".

Signed-off-by: Michael Baum 
Reviewed-by: Alex Vesker 
Acked-by: Dariusz Sosnowski 
---
 drivers/net/mlx5/hws/mlx5dr.h |  1 -
 drivers/net/mlx5/hws/mlx5dr_definer.c | 26 +++---
 drivers/net/mlx5/mlx5_flow.h  |  2 +-
 3 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/drivers/net/mlx5/hws/mlx5dr.h b/drivers/net/mlx5/hws/mlx5dr.h
index 80e118a980..36ecccf9ac 100644
--- a/drivers/net/mlx5/hws/mlx5dr.h
+++ b/drivers/net/mlx5/hws/mlx5dr.h
@@ -18,7 +18,6 @@ enum mlx5dr_table_type {
MLX5DR_TABLE_TYPE_NIC_TX,
MLX5DR_TABLE_TYPE_FDB,
MLX5DR_TABLE_TYPE_MAX,
-   MLX5DR_TABLE_TYPE_DONTCARE = MLX5DR_TABLE_TYPE_MAX,
 };
 
 enum mlx5dr_matcher_resource_mode {
diff --git a/drivers/net/mlx5/hws/mlx5dr_definer.c 
b/drivers/net/mlx5/hws/mlx5dr_definer.c
index 7bb328e85e..a0f95c6923 100644
--- a/drivers/net/mlx5/hws/mlx5dr_definer.c
+++ b/drivers/net/mlx5/hws/mlx5dr_definer.c
@@ -152,6 +152,7 @@ struct mlx5dr_definer_conv_data {
uint8_t geneve_opt_ok_idx;
uint8_t geneve_opt_data_idx;
enum rte_flow_item_type last_item;
+   enum mlx5dr_table_type table_type;
 };
 
 /* Xmacro used to create generic item setter from items */
@@ -1754,7 +1755,7 @@ mlx5dr_definer_conv_item_tag(struct 
mlx5dr_definer_conv_data *cd,
if (item->type == RTE_FLOW_ITEM_TYPE_TAG)
reg = flow_hw_get_reg_id_from_ctx(cd->ctx,
  RTE_FLOW_ITEM_TYPE_TAG,
- MLX5DR_TABLE_TYPE_DONTCARE,
+ cd->table_type,
  v->index);
else
reg = (int)v->index;
@@ -1817,8 +1818,7 @@ mlx5dr_definer_conv_item_quota(struct 
mlx5dr_definer_conv_data *cd,
 {
int mtr_reg = flow_hw_get_reg_id_from_ctx(cd->ctx,
  
RTE_FLOW_ITEM_TYPE_METER_COLOR,
- MLX5DR_TABLE_TYPE_DONTCARE,
- 0);
+ cd->table_type, 0);
struct mlx5dr_definer_fc *fc;
 
if (mtr_reg < 0) {
@@ -1837,7 +1837,6 @@ mlx5dr_definer_conv_item_quota(struct 
mlx5dr_definer_conv_data *cd,
 
 static int
 mlx5dr_definer_conv_item_metadata(struct mlx5dr_definer_conv_data *cd,
- enum mlx5dr_table_type table_domain_type,
  struct rte_flow_item *item,
  int item_idx)
 {
@@ -1850,7 +1849,7 @@ mlx5dr_definer_conv_item_metadata(struct 
mlx5dr_definer_conv_data *cd,
return 0;
 
reg = flow_hw_get_reg_id_from_ctx(cd->ctx, RTE_FLOW_ITEM_TYPE_META,
- table_domain_type, -1);
+ cd->table_type, -1);
if (reg <= 0) {
DR_LOG(ERR, "Invalid register for item metadata");
rte_errno = EINVAL;
@@ -2159,7 +2158,7 @@ mlx5dr_definer_conv_item_conntrack(struct 
mlx5dr_definer_conv_data *cd,
 
reg = flow_hw_get_reg_id_from_ctx(cd->ctx,
  RTE_FLOW_ITEM_TYPE_CONNTRACK,
- MLX5DR_TABLE_TYPE_DONTCARE, -1);
+ cd->table_type, -1);
if (reg <= 0) {
DR_LOG(ERR, "Invalid register for item conntrack");
rte_errno = EINVAL;
@@ -2302,7 +2301,7 @@ mlx5dr_definer_conv_item_meter_color(struct 
mlx5dr_definer_conv_data *cd,
 
reg = flow_hw_get_reg_id_from_ctx(cd->ctx,
  RTE_FLOW_ITEM_TYPE_METER_COLOR,
- MLX5DR_TABLE_TYPE_DONTCARE, 0);
+ cd->table_type, 0);
MLX5_ASSERT(reg > 0);
 
fc = mlx5dr_definer_get_register_fc(cd, reg);
@@ -2897,7 +2896,6 @@ mlx5dr_definer_conv_item_compare_field(const struct 
rte_flow_field_data *f,
   const struct rte_flow_field_data 
*other_f,
   struct mlx5dr_definer_conv_data *cd,
   int item_idx,
-  enum mlx5dr_table_type table_domain_type,
   enum mlx5dr_definer_compare_dw_selectors 
dw_offset)
 {
struct mlx5dr_definer_fc *fc = NULL;
@@ -2913,7 +2911,7 @@ mlx5dr_definer_conv_item_compare_field(const struct 
rte_flow_field_data *f,
case RTE_FLOW_FIELD_META:
reg = flow_hw_get_reg_id_from_ctx(cd->ctx,
  RTE_FLOW_ITEM_TYPE_META,
- tab

[PATCH 2/4] net/mlx5/hws: fix GENEVE option class partial mask

2024-04-16 Thread Michael Baum
When GENEVE option parser is configured, the class field has 3 optional
modes:
1. ignored - ignore this field.
2. fixed - this field is part of option identifier along with type
   field. In this mode, the exact value is provided in "spec"
   field during pattern template creation and mask must be 0xffff.
3. matchable - class field isn't part of the identifier and only mask is
   provided in pattern template creation. The mask can be
   any value like all other fields.

In the current implementation, when the class mask isn't 0, pattern template
creation fails for mask != 0xffff regardless of the class mode.

This patch fixes this validation to apply only when the class mode is fixed.

Fixes: 8f8dad4289e0 ("net/mlx5/hws: support GENEVE options matching")
Cc: va...@nvidia.com
Cc: sta...@dpdk.org

Signed-off-by: Michael Baum 
Reviewed-by: Alex Vesker 
---
 drivers/net/mlx5/hws/mlx5dr_definer.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/mlx5/hws/mlx5dr_definer.c 
b/drivers/net/mlx5/hws/mlx5dr_definer.c
index 35a2ed2048..f1f544deab 100644
--- a/drivers/net/mlx5/hws/mlx5dr_definer.c
+++ b/drivers/net/mlx5/hws/mlx5dr_definer.c
@@ -2500,11 +2500,6 @@ mlx5dr_definer_conv_item_geneve_opt(struct 
mlx5dr_definer_conv_data *cd,
goto out_not_supp;
}
 
-   if (m->option_class && m->option_class != RTE_BE16(UINT16_MAX)) {
-   DR_LOG(ERR, "Geneve option class has invalid mask");
-   goto out_not_supp;
-   }
-
ret = mlx5_get_geneve_hl_data(cd->ctx,
  v->option_type,
  v->option_class,
@@ -2517,6 +2512,11 @@ mlx5dr_definer_conv_item_geneve_opt(struct 
mlx5dr_definer_conv_data *cd,
goto out_not_supp;
}
 
+   if (ok_bit_on_class && m->option_class != RTE_BE16(UINT16_MAX)) {
+   DR_LOG(ERR, "Geneve option class has invalid mask");
+   goto out_not_supp;
+   }
+
if (!ok_bit_on_class && m->option_class) {
/* DW0 is used, we will match type, class */
if (!num_of_dws || hl_dws[0].dw_mask != UINT32_MAX) {
-- 
2.25.1



Re: [PATCH 3/5] lib: use RTE_DIM

2024-04-16 Thread Tyler Retzlaff
On Tue, Apr 16, 2024 at 08:19:29AM -0700, Stephen Hemminger wrote:
> Use RTE_DIM instead of computing directly with sizeof.
> Patch automatically generated via cocci/rte_dim.cocci.
> 
> Signed-off-by: Stephen Hemminger 
> ---
Acked-by: Tyler Retzlaff 



Re: [PATCH 4/5] examples _use RTE_DIM

2024-04-16 Thread Tyler Retzlaff
On Tue, Apr 16, 2024 at 08:19:30AM -0700, Stephen Hemminger wrote:
> Use RTE_DIM instead of computing directly with sizeof.
> Patch automatically generated via cocci/rte_dim.cocci.
> 
> Signed-off-by: Stephen Hemminger 
> ---

Acked-by: Tyler Retzlaff 



Re: [DPDK/ethdev Bug 1416] net/af_packet: tx_burst() can modify packets

2024-04-16 Thread Stephen Hemminger
On Tue, 16 Apr 2024 10:29:53 +
bugzi...@dpdk.org wrote:

> static uint16_t
> eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
> {
> ...
> for (i = 0; i < nb_pkts; i++) {
> mbuf = *bufs++;
> 
> ...
> 
> /* insert vlan info if necessary */
> if (mbuf->ol_flags & RTE_MBUF_F_TX_VLAN) {
> if (rte_vlan_insert(&mbuf)) {
> rte_pktmbuf_free(mbuf);
> continue; 
> 
> AFAIU, it does copy of mbuf contents into pbuf anyway (just few line below).
> So the fix might be - simply insert VLAN tag at copying stage.
> Feel free to correct me, if I missed something.

vlan_insert will fail if the mbuf is has refcnt > 1.

static inline int rte_vlan_insert(struct rte_mbuf **m)
{
struct rte_ether_hdr *oh, *nh;
struct rte_vlan_hdr *vh;

/* Can't insert header if mbuf is shared */
if (!RTE_MBUF_DIRECT(*m) || rte_mbuf_refcnt_read(*m) > 1)
return -EINVAL;


Re: [PATCH v2] net/bonding: failover of LACP with mode 4 takes long time

2024-04-16 Thread Ferruh Yigit
On 6/6/2022 3:34 PM, Gaoxiang Liu wrote:
> When the primary port of bond slaves with bond mode 4 linked down,
> the system id of the other slave ports changed.
> It may cause some switches to renegotiate,
> and the process takes a few seconds. It is not acceptable for any
> Telcos.
> We need sub-second switchover time like in linux.
> 
> Set the mac of the bond port to the slave port's system to solve the
> problem.
> 
> Bugzilla ID: 551
> Fixes: 46fb43683679 ("bond: add mode 4")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Gaoxiang Liu 
> 
>

This is another old patch. As far as I can see from bugzilla, the issue
was reported by Kiran, the fix was proposed by Nandu, and it was shared
on the mailing list by Gaoxiang.

As it is addressing two bugzilla items, and we are at the beginning of
the release cycle, I am willing to take risk and apply this patch
without ack from a bonding maintainer.
If we found some issues we can revert this back.

Chas, Connor, if you have any objection, or if you need time for review
please let me know, otherwise I will merge the patch soon.

And if anyone can provide test, like Kiran who reported the issue, or
any other stakeholder, it helps a lot to give confidence to the fix.


Thanks,
ferruh




RE: [DPDK/ethdev Bug 1416] net/af_packet: tx_burst() can modify packets

2024-04-16 Thread Konstantin Ananyev



> 
> > static uint16_t
> > eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
> > {
> > ...
> > for (i = 0; i < nb_pkts; i++) {
> > mbuf = *bufs++;
> >
> > ...
> >
> > /* insert vlan info if necessary */
> > if (mbuf->ol_flags & RTE_MBUF_F_TX_VLAN) {
> > if (rte_vlan_insert(&mbuf)) {
> > rte_pktmbuf_free(mbuf);
> > continue;
> >
> > AFAIU, it does copy of mbuf contents into pbuf anyway (just few line below).
> > So the fix might be - simply insert VLAN tag at copying stage.
> > Feel free to correct me, if I missed something.
> 
> vlan_insert will fail if the mbuf is has refcnt > 1.
> 
> static inline int rte_vlan_insert(struct rte_mbuf **m)
> {
>   struct rte_ether_hdr *oh, *nh;
>   struct rte_vlan_hdr *vh;
> 
>   /* Can't insert header if mbuf is shared */
>   if (!RTE_MBUF_DIRECT(*m) || rte_mbuf_refcnt_read(*m) > 1)
>   return -EINVAL;

You are right, I missed that.
Will close it then.
Thanks
Konstantin



[DPDK/ethdev Bug 1416] net/af_packet: tx_burst() can modify packets

2024-04-16 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1416

Konstantin Ananyev (konstantin.v.anan...@yandex.ru) changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Konstantin Ananyev (konstantin.v.anan...@yandex.ru) ---
As Stephen Hemminger  pointed out:

vlan_insert will fail if the mbuf is has refcnt > 1.

static inline int rte_vlan_insert(struct rte_mbuf **m)
{
struct rte_ether_hdr *oh, *nh;
struct rte_vlan_hdr *vh;

/* Can't insert header if mbuf is shared */
if (!RTE_MBUF_DIRECT(*m) || rte_mbuf_refcnt_read(*m) > 1)
return -EINVAL;

So closing as not a bug.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[PATCH] hash: cast away atomic qualification

2024-04-16 Thread Tyler Retzlaff
rte_free accepts only non-cva qualified arguments so cast away
RTE_ATOMIC qualification for tbl_chng_cnt and h->tbl_chng_cnt when
calling rte_free.

Signed-off-by: Tyler Retzlaff 
---
 lib/hash/rte_cuckoo_hash.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c
index 9cf9464..b31a3d9 100644
--- a/lib/hash/rte_cuckoo_hash.c
+++ b/lib/hash/rte_cuckoo_hash.c
@@ -481,7 +481,7 @@ struct rte_hash *
rte_free(buckets);
rte_free(buckets_ext);
rte_free(k);
-   rte_free(tbl_chng_cnt);
+   rte_free((void *)(uintptr_t)tbl_chng_cnt);
rte_free(ext_bkt_to_free);
return NULL;
 }
@@ -526,7 +526,7 @@ struct rte_hash *
rte_free(h->key_store);
rte_free(h->buckets);
rte_free(h->buckets_ext);
-   rte_free(h->tbl_chng_cnt);
+   rte_free((void *)(uintptr_t)h->tbl_chng_cnt);
rte_free(h->ext_bkt_to_free);
rte_free(h->hash_rcu_cfg);
rte_free(h);
-- 
1.8.3.1



RE: [DPDK/ethdev Bug 1416] net/af_packet: tx_burst() can modify packets

2024-04-16 Thread Morten Brørup
> > > static uint16_t
> > > eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t
> nb_pkts)
> > > {
> > > ...
> > > for (i = 0; i < nb_pkts; i++) {
> > > mbuf = *bufs++;
> > >
> > > ...
> > >
> > > /* insert vlan info if necessary */
> > > if (mbuf->ol_flags & RTE_MBUF_F_TX_VLAN) {
> > > if (rte_vlan_insert(&mbuf)) {
> > > rte_pktmbuf_free(mbuf);
> > > continue;
> > >
> > > AFAIU, it does copy of mbuf contents into pbuf anyway (just few line
> below).
> > > So the fix might be - simply insert VLAN tag at copying stage.
> > > Feel free to correct me, if I missed something.
> >
> > vlan_insert will fail if the mbuf is has refcnt > 1.
> >
> > static inline int rte_vlan_insert(struct rte_mbuf **m)
> > {
> > struct rte_ether_hdr *oh, *nh;
> > struct rte_vlan_hdr *vh;
> >
> > /* Can't insert header if mbuf is shared */
> > if (!RTE_MBUF_DIRECT(*m) || rte_mbuf_refcnt_read(*m) > 1)
> > return -EINVAL;
> 
> You are right, I missed that.
> Will close it then.

Don't close, silent drop is also a bug.

The VLAN tag could be inserted when copying, as originally suggested.

Alternatively, set tpacket2_hdr's tp_vlan_tci and tp_vlan_tpid fields and set 
TP_STATUS_VLAN_VALID and TP_STATUS_VLAN_TPID_VALID flags in tp_status. If the 
kernel reads those?



Re: [PATCH 5/5] drivers/net: use RTE_DIM

2024-04-16 Thread Ferruh Yigit
On 4/16/2024 4:19 PM, Stephen Hemminger wrote:
> Use RTE_DIM instead of computing directly with sizeof.
> Patch automatically generated via cocci/rte_dim.cocci.
> Code in base/ subdirectory manually excluded.
> 
> Signed-off-by: Stephen Hemminger 
> 

Updated ones looks good to me, but I can see a few more, I don't know if
you excluded base file one intentionally,
searched as `git grep "sizeof.*\[0\]" drivers/net/`


-
drivers/net/bnxt/tf_core/cfa_tcam_mgr.h:28:#define ARRAY_SIZE(_array)
(sizeof(_array) / sizeof(_array[0]))

-
drivers/net/hinic/base/hinic_pmd_nicio.c:449:  /
sizeof(hinic_hw_rx_buf_size[0]);

-
drivers/net/ice/ice_ethdev.c:330:#define ICE_NB_MBUF_XSTATS
(sizeof(ice_mbuf_strings) / sizeof(ice_mbuf_strings[0]))

-
drivers/net/i40e/i40e_ethdev.c:554:#define I40E_NB_MBUF_XSTATS
(sizeof(i40e_mbuf_strings) / sizeof(i40e_mbuf_strings[0]))

-
drivers/net/i40e/base/i40e_adminq.h:126:if (!((u32)aq_rc <
(sizeof(aq_to_posix) / sizeof((aq_to_posix)[0]

-
drivers/net/ipn3ke/ipn3ke_representor.c:506:/
sizeof(ipn3ke_rpst_hw_port_strings[0]))
drivers/net/ipn3ke/ipn3ke_representor.c:517:/
sizeof(ipn3ke_rpst_rxq_prio_strings[0]))
drivers/net/ipn3ke/ipn3ke_representor.c:530:/
sizeof(ipn3ke_rpst_txq_prio_strings[0]))

-
drivers/net/ixgbe/base/ixgbe_x550.c:443:for (i = 0; i <
sizeof(ixgbe_fw_map) / sizeof(ixgbe_fw_map[0]); ++i) {
drivers/net/ixgbe/base/ixgbe_x550.c:700:for (i = 0; i <
sizeof(ixgbe_fw_map) / sizeof(ixgbe_fw_map[0]); ++i) {

-
drivers/net/ixgbe/base/ixgbe_x550.c:3338:   bufsz = sizeof(buf) /
sizeof(buf[0]);

-
drivers/net/mlx5/hws/mlx5dr_internal.h:52:#define ARRAY_SIZE(x)
(sizeof(x) / sizeof((x)[0]))




[PATCH] eal: fix missing type in dtor macro expansion

2024-04-16 Thread Tyler Retzlaff
RTE_FINI expansion failed to specify void * type for storage of the
destructor function pointer, resulting in it defaulting to type ``int``.

Update the macro to specify ``void *`` as the type so the correct size
is allocated in the segment.

Fixes: 64eff943ca82 ("eal: implement constructors for MSVC")
Cc: roret...@linux.microsoft.com
Cc: sta...@dpdk.org

Signed-off-by: Tyler Retzlaff 
---
 lib/eal/include/rte_common.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/eal/include/rte_common.h b/lib/eal/include/rte_common.h
index 298a5c6..618ed56 100644
--- a/lib/eal/include/rte_common.h
+++ b/lib/eal/include/rte_common.h
@@ -291,7 +291,7 @@ static void __attribute__((destructor(RTE_PRIO(prio)), 
used)) func(void)
 #define RTE_FINI_PRIO(name, priority) \
static void name(void); \
__pragma(const_seg(DTOR_PRIORITY_TO_SECTION(priority))) \
-   __declspec(allocate(DTOR_PRIORITY_TO_SECTION(priority))) name ## 
_pointer = &name; \
+   __declspec(allocate(DTOR_PRIORITY_TO_SECTION(priority))) void *name ## 
_pointer = &name; \
__pragma(const_seg()) \
static void name(void)
 #endif
-- 
1.8.3.1



RE: [DPDK/ethdev Bug 1416] net/af_packet: tx_burst() can modify packets

2024-04-16 Thread Konstantin Ananyev



> > > > static uint16_t
> > > > eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t
> > nb_pkts)
> > > > {
> > > > ...
> > > > for (i = 0; i < nb_pkts; i++) {
> > > > mbuf = *bufs++;
> > > >
> > > > ...
> > > >
> > > > /* insert vlan info if necessary */
> > > > if (mbuf->ol_flags & RTE_MBUF_F_TX_VLAN) {
> > > > if (rte_vlan_insert(&mbuf)) {
> > > > rte_pktmbuf_free(mbuf);
> > > > continue;
> > > >
> > > > AFAIU, it does copy of mbuf contents into pbuf anyway (just few line
> > below).
> > > > So the fix might be - simply insert VLAN tag at copying stage.
> > > > Feel free to correct me, if I missed something.
> > >
> > > vlan_insert will fail if the mbuf is has refcnt > 1.
> > >
> > > static inline int rte_vlan_insert(struct rte_mbuf **m)
> > > {
> > >   struct rte_ether_hdr *oh, *nh;
> > >   struct rte_vlan_hdr *vh;
> > >
> > >   /* Can't insert header if mbuf is shared */
> > >   if (!RTE_MBUF_DIRECT(*m) || rte_mbuf_refcnt_read(*m) > 1)
> > >   return -EINVAL;
> >
> > You are right, I missed that.
> > Will close it then.
> 
> Don't close, silent drop is also a bug.
> 
> The VLAN tag could be inserted when copying, as originally suggested.

Agree, but to me that would be enhancement request, not a bug report.

> 
> Alternatively, set tpacket2_hdr's tp_vlan_tci and tp_vlan_tpid fields and set 
> TP_STATUS_VLAN_VALID and
> TP_STATUS_VLAN_TPID_VALID flags in tp_status. If the kernel reads those?



Re: [PATCH 5/5] drivers/net: use RTE_DIM

2024-04-16 Thread Stephen Hemminger
On Tue, 16 Apr 2024 17:29:25 +0100
Ferruh Yigit  wrote:

> On 4/16/2024 4:19 PM, Stephen Hemminger wrote:
> > Use RTE_DIM instead of computing directly with sizeof.
> > Patch automatically generated via cocci/rte_dim.cocci.
> > Code in base/ subdirectory manually excluded.
> > 
> > Signed-off-by: Stephen Hemminger 
> >   
> 
> Updated ones looks good to me, but I can see a few more, I don't know if
> you excluded base file one intentionally,
> searched as `git grep "sizeof.*\[0\]" drivers/net/`
> 
> 
> -
> drivers/net/bnxt/tf_core/cfa_tcam_mgr.h:28:#define ARRAY_SIZE(_array)
> (sizeof(_array) / sizeof(_array[0]))
> 
> -

For the drivers that choose to define and use ARRAY_SIZE() that is fine.

Since base/ directories often come from other upstream repos, I ignored those.

Something about the Intel drivers was causing warnings and the script would
not automatically change those.



RE: [DPDK/ethdev Bug 1416] net/af_packet: tx_burst() can modify packets

2024-04-16 Thread Morten Brørup
> > > > > static uint16_t
> > > > > eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t
> > > nb_pkts)
> > > > > {
> > > > > ...
> > > > > for (i = 0; i < nb_pkts; i++) {
> > > > > mbuf = *bufs++;
> > > > >
> > > > > ...
> > > > >
> > > > > /* insert vlan info if necessary */
> > > > > if (mbuf->ol_flags & RTE_MBUF_F_TX_VLAN) {
> > > > > if (rte_vlan_insert(&mbuf)) {
> > > > > rte_pktmbuf_free(mbuf);
> > > > > continue;
> > > > >
> > > > > AFAIU, it does copy of mbuf contents into pbuf anyway (just few
> line
> > > below).
> > > > > So the fix might be - simply insert VLAN tag at copying stage.
> > > > > Feel free to correct me, if I missed something.
> > > >
> > > > vlan_insert will fail if the mbuf is has refcnt > 1.
> > > >
> > > > static inline int rte_vlan_insert(struct rte_mbuf **m)
> > > > {
> > > > struct rte_ether_hdr *oh, *nh;
> > > > struct rte_vlan_hdr *vh;
> > > >
> > > > /* Can't insert header if mbuf is shared */
> > > > if (!RTE_MBUF_DIRECT(*m) || rte_mbuf_refcnt_read(*m) > 1)
> > > > return -EINVAL;
> > >
> > > You are right, I missed that.
> > > Will close it then.
> >
> > Don't close, silent drop is also a bug.
> >
> > The VLAN tag could be inserted when copying, as originally suggested.
> 
> Agree, but to me that would be enhancement request, not a bug report.

Hmm... there is still a bug, although slightly different:

net/af_packet: tx_burst() silently drops packets with RTE_MBUF_F_TX_VLAN if 
mbuf is shared

And the suggested fixes would fix this (other) bug.

> 
> >
> > Alternatively, set tpacket2_hdr's tp_vlan_tci and tp_vlan_tpid fields
> and set TP_STATUS_VLAN_VALID and
> > TP_STATUS_VLAN_TPID_VALID flags in tp_status. If the kernel reads
> those?



RE: [PATCH] eal: fix missing type in dtor macro expansion

2024-04-16 Thread Morten Brørup
> From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com]
> Sent: Tuesday, 16 April 2024 18.33
> 
> RTE_FINI expansion failed to specify void * type for storage of the
> destructor function pointer, resulting in it defaulting to type ``int``.
> 
> Update the macro to specify ``void *`` as the type so the correct size
> is allocated in the segment.
> 
> Fixes: 64eff943ca82 ("eal: implement constructors for MSVC")
> Cc: roret...@linux.microsoft.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Tyler Retzlaff 
> ---

Acked-by: Morten Brørup 



[PATCH v3] dts: Change hugepage runtime config to 2MB Exclusively

2024-04-16 Thread Nicholas Pratte
The previous implementation configures and allocates hugepage sizes
based on a system default. This can lead to two problems: overallocation of
hugepages (which may crash the remote host), and configuration of hugepage
sizes that are not recommended during runtime. This new implementation
allows only 2MB hugepage allocation during runtime; any other unique
hugepage size must be configured by the end-user for initializing DTS.

If the amount of 2MB hugepages requested exceeds the amount of 2MB
hugepages already configured on the system, then the system will remount
hugepages to cover the difference. If the amount of hugepages requested is
either less than or equal to the amount already configured on the system,
then nothing is done.
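
For illustration only, the check-then-remount behaviour boils down to
something like the following standalone sketch for a local Linux machine
(the real DTS code performs these steps over a remote session with its own
helpers; the remount command and paths here are assumptions):

from pathlib import Path
import subprocess

NR_2MB = Path("/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages")

def ensure_2mb_hugepages(requested: int) -> None:
    configured = int(NR_2MB.read_text())
    if configured >= requested:
        # Enough 2MB hugepages are already reserved; nothing is done.
        return
    # Reserve the requested amount, then remount hugetlbfs so the new pages
    # become available to DPDK applications (requires root).
    NR_2MB.write_text(str(requested))
    subprocess.run(["mount", "-o", "remount", "/dev/hugepages"], check=True)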

Bugzilla ID: 1370
Signed-off-by: Nicholas Pratte 
Reviewed-by: Jeremy Spewock 
---
 doc/guides/tools/dts.rst |  7 +-
 dts/conf.yaml|  4 ++--
 dts/framework/config/__init__.py |  4 ++--
 dts/framework/config/conf_yaml_schema.json   |  6 ++---
 dts/framework/config/types.py|  2 +-
 dts/framework/testbed_model/linux_session.py | 24 +++-
 dts/framework/testbed_model/node.py  |  8 ++-
 dts/framework/testbed_model/os_session.py|  4 +++-
 8 files changed, 37 insertions(+), 22 deletions(-)

diff --git a/doc/guides/tools/dts.rst b/doc/guides/tools/dts.rst
index 47b218b2c6..1103db0faa 100644
--- a/doc/guides/tools/dts.rst
+++ b/doc/guides/tools/dts.rst
@@ -131,7 +131,12 @@ There are two areas that need to be set up on a System 
Under Test:
 
  You may specify the optional hugepage configuration in the DTS config 
file.
  If you do, DTS will take care of configuring hugepages,
- overwriting your current SUT hugepage configuration.
+ overwriting your current SUT hugepage configuration. Configuration of 
hugepages via DTS
+ allows only for configuration of 2MB hugepages. Thus, if your needs 
require hugepage
+ sizes not equal to 2MB, then these configurations must be done 
outside of the DTS
+ framework; moreover, if you do not desire the use of 2MB hugepages, and 
instead prefer
+ other sizes (e.g. 1GB), then we assume that hugepages have been manually 
configured before
+ runtime.
 
* System under test configuration
 
diff --git a/dts/conf.yaml b/dts/conf.yaml
index 8068345dd5..56c3ae6f4c 100644
--- a/dts/conf.yaml
+++ b/dts/conf.yaml
@@ -35,7 +35,7 @@ nodes:
 lcores: "" # use all the available logical cores
 use_first_core: false # tells DPDK to use any physical core
 memory_channels: 4 # tells DPDK to use 4 memory channels
-hugepages:  # optional; if removed, will use system hugepage configuration
+hugepages_2mb: # optional; if removed, will use system hugepage 
configuration
 amount: 256
 force_first_numa: false
 ports:
@@ -71,7 +71,7 @@ nodes:
 os_driver: rdma
 peer_node: "SUT 1"
 peer_pci: ":00:08.1"
-hugepages:  # optional; if removed, will use system hugepage configuration
+hugepages_2mb: # optional; if removed, will use system hugepage 
configuration
 amount: 256
 force_first_numa: false
 traffic_generator:
diff --git a/dts/framework/config/__init__.py b/dts/framework/config/__init__.py
index 4cb5c74059..b6f820e39e 100644
--- a/dts/framework/config/__init__.py
+++ b/dts/framework/config/__init__.py
@@ -255,8 +255,8 @@ def from_dict(
 Either an SUT or TG configuration instance.
 """
 hugepage_config = None
-if "hugepages" in d:
-hugepage_config_dict = d["hugepages"]
+if "hugepages_2mb" in d:
+hugepage_config_dict = d["hugepages_2mb"]
 if "force_first_numa" not in hugepage_config_dict:
 hugepage_config_dict["force_first_numa"] = False
 hugepage_config = HugepageConfiguration(**hugepage_config_dict)
diff --git a/dts/framework/config/conf_yaml_schema.json 
b/dts/framework/config/conf_yaml_schema.json
index 4731f4511d..f4d7199523 100644
--- a/dts/framework/config/conf_yaml_schema.json
+++ b/dts/framework/config/conf_yaml_schema.json
@@ -146,7 +146,7 @@
 "compiler"
   ]
 },
-"hugepages": {
+"hugepages_2mb": {
   "type": "object",
   "description": "Optional hugepage configuration. If not specified, 
hugepages won't be configured and DTS will use system configuration.",
   "properties": {
@@ -253,8 +253,8 @@
 "type": "integer",
 "description": "How many memory channels to use. Optional, 
defaults to 1."
   },
-  "hugepages": {
-"$ref": "#/definitions/hugepages"
+  "hugepages_2mb": {
+"$ref": "#/definitions/hugepages_2mb"
   },
   "ports": {
 "type": "array",
diff --git a/dts/framework/config/types.py b/dts/framework/config/types.py
index 1927910d88..016e0c3dbd 100644
--- a/dts/framework/config/types

RE: [PATCH v3] dts: Change hugepage runtime config to 2MB Exclusively

2024-04-16 Thread Morten Brørup
> From: Nicholas Pratte [mailto:npra...@iol.unh.edu]
> Sent: Tuesday, 16 April 2024 20.19
> 
> The previous implementation configures and allocates hugepage sizes
> based on a system default. This can lead to two problems: overallocation
> of
> hugepages (which may crash the remote host), and configuration of
> hugepage
> sizes that are not recommended during runtime. This new implementation
> allows only 2MB hugepage allocation during runtime; any other unique
> hugepage size must be configured by the end-user for initializing DTS.
> 
> If the amount of 2MB hugepages requested exceeds the amount of 2MB
> hugepages already configured on the system, then the system will remount
> hugepages to cover the difference. If the amount of hugepages requested
> is
> either less than or equal to the amount already configured on the
> system,
> then nothing is done.
> 
> Bugzilla ID: 1370
> Signed-off-by: Nicholas Pratte 
> Reviewed-by: Jeremy Spewock 
> ---

This version LGTM.
Acked-by: Morten Brørup 



Re: [PATCH v3] dts: Change hugepage runtime config to 2MB Exclusively

2024-04-16 Thread Nicholas Pratte
I like the changes suggested by Morten, and made such modifications
accordingly. Given that the hugepage size will likely be determined by
arch going forward, and with patch 1360 and the ongoing config changes
in mind, I decided to create a property method in the Node object that
can be modified to our liking in the future (credit to Jeremy for this
idea). This solution is designed in such a way that, once arch is
discovered and removed from the config, hugepage sizes can be
determined right after discovering the arch (this is just to save me
time in the future when working on patch 1360). I also opted to move
the size variable back into the methods under linux_session.py (per
Morten's suggestions), but given the implementation Jeremy and I worked
on, there is no need to keep the get_hugepage_size() method in
linux_session.py as we likely won't need it going forward, so it has
been removed.

On Tue, Apr 16, 2024 at 2:18 PM Nicholas Pratte  wrote:
>
> The previous implementation configures and allocates hugepage sizes
> based on a system default. This can lead to two problems: overallocation of
> hugepages (which may crash the remote host), and configuration of hugepage
> sizes that are not recommended during runtime. This new implementation
> allows only 2MB hugepage allocation during runtime; any other unique
> hugepage size must be configured by the end-user for initializing DTS.
>
> If the amount of 2MB hugepages requested exceeds the amount of 2MB
> hugepages already configured on the system, then the system will remount
> hugepages to cover the difference. If the amount of hugepages requested is
> either less than or equal to the amount already configured on the system,
> then nothing is done.
>
> Bugzilla ID: 1370
> Signed-off-by: Nicholas Pratte 
> Reviewed-by: Jeremy Spewock 
> ---
>  doc/guides/tools/dts.rst |  7 +-
>  dts/conf.yaml|  4 ++--
>  dts/framework/config/__init__.py |  4 ++--
>  dts/framework/config/conf_yaml_schema.json   |  6 ++---
>  dts/framework/config/types.py|  2 +-
>  dts/framework/testbed_model/linux_session.py | 24 +++-
>  dts/framework/testbed_model/node.py  |  8 ++-
>  dts/framework/testbed_model/os_session.py|  4 +++-
>  8 files changed, 37 insertions(+), 22 deletions(-)
>
> diff --git a/doc/guides/tools/dts.rst b/doc/guides/tools/dts.rst
> index 47b218b2c6..1103db0faa 100644
> --- a/doc/guides/tools/dts.rst
> +++ b/doc/guides/tools/dts.rst
> @@ -131,7 +131,12 @@ There are two areas that need to be set up on a System 
> Under Test:
>
>   You may specify the optional hugepage configuration in the DTS config 
> file.
>   If you do, DTS will take care of configuring hugepages,
> - overwriting your current SUT hugepage configuration.
> + overwriting your current SUT hugepage configuration. Hugepage configuration
> + via DTS supports only 2MB hugepages. If you require hugepage sizes other
> + than 2MB, these must be configured outside of the DTS framework; likewise,
> + if you prefer not to use 2MB hugepages and want other sizes instead
> + (e.g. 1GB), it is assumed that the hugepages have been configured manually
> + before runtime.
>
> * System under test configuration
>
> diff --git a/dts/conf.yaml b/dts/conf.yaml
> index 8068345dd5..56c3ae6f4c 100644
> --- a/dts/conf.yaml
> +++ b/dts/conf.yaml
> @@ -35,7 +35,7 @@ nodes:
>  lcores: "" # use all the available logical cores
>  use_first_core: false # tells DPDK to use any physical core
>  memory_channels: 4 # tells DPDK to use 4 memory channels
> -hugepages:  # optional; if removed, will use system hugepage 
> configuration
> +hugepages_2mb: # optional; if removed, will use system hugepage 
> configuration
>  amount: 256
>  force_first_numa: false
>  ports:
> @@ -71,7 +71,7 @@ nodes:
>  os_driver: rdma
>  peer_node: "SUT 1"
>  peer_pci: ":00:08.1"
> -hugepages:  # optional; if removed, will use system hugepage 
> configuration
> +hugepages_2mb: # optional; if removed, will use system hugepage 
> configuration
>  amount: 256
>  force_first_numa: false
>  traffic_generator:
> diff --git a/dts/framework/config/__init__.py 
> b/dts/framework/config/__init__.py
> index 4cb5c74059..b6f820e39e 100644
> --- a/dts/framework/config/__init__.py
> +++ b/dts/framework/config/__init__.py
> @@ -255,8 +255,8 @@ def from_dict(
>  Either an SUT or TG configuration instance.
>  """
>  hugepage_config = None
> -if "hugepages" in d:
> -hugepage_config_dict = d["hugepages"]
> +if "hugepages_2mb" in d:
> +hugepage_config_dict = d["hugepages_2mb"]
>  if "force_first_numa" not in hugepage_config_dict:
>  hugepage_config_di

Re: [DPDK/ethdev Bug 1416] net/af_packet: tx_burst() can modify packets

2024-04-16 Thread Stephen Hemminger
On Tue, 16 Apr 2024 20:11:05 +0200
Morten Brørup  wrote:

> > > > > > static uint16_t
> > > > > > eth_af_packet_tx(void *queue, struct rte_mbuf **bufs, uint16_t  
> > > > nb_pkts)  
> > > > > > {
> > > > > > ...
> > > > > > for (i = 0; i < nb_pkts; i++) {
> > > > > > mbuf = *bufs++;
> > > > > >
> > > > > > ...
> > > > > >
> > > > > > /* insert vlan info if necessary */
> > > > > > if (mbuf->ol_flags & RTE_MBUF_F_TX_VLAN) {
> > > > > > if (rte_vlan_insert(&mbuf)) {
> > > > > > rte_pktmbuf_free(mbuf);
> > > > > > continue;
> > > > > >
> > > > > > AFAIU, it does copy of mbuf contents into pbuf anyway (just few  
> > line  
> > > > below).  
> > > > > > So the fix might be - simply insert VLAN tag at copying stage.
> > > > > > Feel free to correct me, if I missed something.  
> > > > >
> > > > > vlan_insert will fail if the mbuf has refcnt > 1.
> > > > >
> > > > > static inline int rte_vlan_insert(struct rte_mbuf **m)
> > > > > {
> > > > >   struct rte_ether_hdr *oh, *nh;
> > > > >   struct rte_vlan_hdr *vh;
> > > > >
> > > > >   /* Can't insert header if mbuf is shared */
> > > > >   if (!RTE_MBUF_DIRECT(*m) || rte_mbuf_refcnt_read(*m) > 1)
> > > > >   return -EINVAL;  
> > > >
> > > > You are right, I missed that.
> > > > Will close it then.  
> > >
> > > Don't close, silent drop is also a bug.
> > >
> > > The VLAN tag could be inserted when copying, as originally suggested.  
> > 
> > Agree, but to me that would be enhancement request, not a bug report.  
> 
> Hmm... there is still a bug, although slightly different:
> 
> net/af_packet: tx_burst() silently drops packets with RTE_MBUF_F_TX_VLAN if 
> mbuf is shared
> 
> And the suggested fixes would fix this (other) bug.


In older DPDK, vlan_insert would try to clone the mbuf by doing a
rte_pktmbuf_clone().
But that was buggy and removed by:

commit 15a74163b12ed9b8b980b1576bdd8de16d60612b
Author: Ferruh Yigit 
Date:   Tue Apr 16 16:51:26 2019 +0100

net: forbid VLAN insert in shared mbuf

The vlan_insert() is buggy when it tries to handle the shared mbufs,
instead don't support inserting VLAN tag into shared mbufs and return
an error for that case.

Signed-off-by: Ferruh Yigit 
Acked-by: Olivier Matz 





RE: [DPDK/ethdev Bug 1416] net/af_packet: tx_burst() can modify packets

2024-04-16 Thread Morten Brørup
+TO: John W. Linville, AF_PACKET maintainer

> > > > > > > static uint16_t
> > > > > > > eth_af_packet_tx(void *queue, struct rte_mbuf **bufs,
> uint16_t
> > > > > nb_pkts)
> > > > > > > {
> > > > > > > ...
> > > > > > > for (i = 0; i < nb_pkts; i++) {
> > > > > > > mbuf = *bufs++;
> > > > > > >
> > > > > > > ...
> > > > > > >
> > > > > > > /* insert vlan info if necessary */
> > > > > > > if (mbuf->ol_flags & RTE_MBUF_F_TX_VLAN) {
> > > > > > > if (rte_vlan_insert(&mbuf)) {
> > > > > > > rte_pktmbuf_free(mbuf);
> > > > > > > continue;
> > > > > > >
> > > > > > > AFAIU, it does copy of mbuf contents into pbuf anyway (just
> few
> > > line
> > > > > below).
> > > > > > > So the fix might be - simply insert VLAN tag at copying
> stage.
> > > > > > > Feel free to correct me, if I missed something.
> > > > > >
> > > > > > vlan_insert will fail if the mbuf has refcnt > 1.
> > > > > >
> > > > > > static inline int rte_vlan_insert(struct rte_mbuf **m)
> > > > > > {
> > > > > > struct rte_ether_hdr *oh, *nh;
> > > > > > struct rte_vlan_hdr *vh;
> > > > > >
> > > > > > /* Can't insert header if mbuf is shared */
> > > > > > if (!RTE_MBUF_DIRECT(*m) || rte_mbuf_refcnt_read(*m) > 1)
> > > > > > return -EINVAL;
> > > > >
> > > > > You are right, I missed that.
> > > > > Will close it then.
> > > >
> > > > Don't close, silent drop is also a bug.
> > > >
> > > > The VLAN tag could be inserted when copying, as originally suggested.
> > >
> > > Agree, but to me that would be enhancement request, not a bug
> report.
> >
> > Hmm... there is still a bug, although slightly different:
> >
> > net/af_packet: tx_burst() silently drops packets with
> RTE_MBUF_F_TX_VLAN if mbuf is shared

I took a more thorough look at the code.
I was wrong: The drop is not silent, the err_pkts counter is incremented.
But it is still a bug (non-conformance) to claim VLAN insert capability and
not fully support it.

> >
> > And the suggested fixes would fix this (other) bug.

The kernel doesn't look at TP_STATUS_VLAN_VALID in TX direction, so setting the 
VLAN tag in the tpacket2_hdr doesn't work; inserting it in the data when 
copying would be the fix.

Before the copy loop, the first segment should be copied, possibly with VLAN 
insertion.
And the copy loop should copy the following segments (if any), i.e. starting at 
*tmp_mbuf = mbuf->next.

As a separate bug, the check for oversize packets doesn't consider the length 
of an inserted VLAN tag, and could cause a buffer overrun if the packet is 
large enough.
However, with a default buffer size of 2048, this is very unlikely in reality.
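
To make the suggested fix concrete, here is a minimal, untested sketch of
copying the first segment into the af_packet frame buffer with the VLAN tag
inserted on the fly. The helper name is invented; the caller would then copy
mbuf->next onwards unchanged, and its oversize check would include the extra
4 bytes of the tag. None of this is existing driver code.

#include <rte_byteorder.h>
#include <rte_ether.h>
#include <rte_mbuf.h>
#include <rte_memcpy.h>

/* Hypothetical helper: copy the first mbuf segment into the af_packet
 * frame buffer "pbuf", inserting the VLAN header from mbuf->vlan_tci
 * when RTE_MBUF_F_TX_VLAN is set. Returns the number of bytes written.
 * Assumes the first segment holds at least the Ethernet header. */
static uint16_t
copy_first_seg_with_vlan(uint8_t *pbuf, const struct rte_mbuf *mbuf)
{
	const struct rte_ether_hdr *oh =
		rte_pktmbuf_mtod(mbuf, const struct rte_ether_hdr *);
	uint16_t seg_len = rte_pktmbuf_data_len(mbuf);
	struct rte_ether_hdr *nh = (struct rte_ether_hdr *)pbuf;
	struct rte_vlan_hdr *vh = (struct rte_vlan_hdr *)(nh + 1);

	if (!(mbuf->ol_flags & RTE_MBUF_F_TX_VLAN)) {
		rte_memcpy(pbuf, oh, seg_len);
		return seg_len;
	}

	/* Keep the MAC addresses, move the original EtherType behind the tag. */
	rte_memcpy(nh, oh, 2 * RTE_ETHER_ADDR_LEN);
	nh->ether_type = rte_cpu_to_be_16(RTE_ETHER_TYPE_VLAN);
	vh->vlan_tci = rte_cpu_to_be_16(mbuf->vlan_tci);
	vh->eth_proto = oh->ether_type;
	rte_memcpy(vh + 1, oh + 1, seg_len - sizeof(*oh));
	return seg_len + sizeof(*vh);
}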

> 
> 
> In older DPDK, vlan_insert would try and clone the mbuf, but doing a
> rte_pktmbuf_clone().
> But that was buggy and removed by:
> 
> commit 15a74163b12ed9b8b980b1576bdd8de16d60612b
> Author: Ferruh Yigit 
> Date:   Tue Apr 16 16:51:26 2019 +0100
> 
> net: forbid VLAN insert in shared mbuf
> 
> The vlan_insert() is buggy when it tries to handle the shared mbufs,
> instead don't support inserting VLAN tag into shared mbufs and
> return
> an error for that case.
> 
> Signed-off-by: Ferruh Yigit 
> Acked-by: Olivier Matz 
> 
> 



[PATCH v2] mailmap: clean up vmware entries

2024-04-16 Thread Jochen Behrens
VMware was acquired by Broadcom, and vmware e-mail addresses will
be invalid in the future.

Add broadcom.com addresses for those still in the company.

Signed-off-by: Jochen Behrens 
---
 .mailmap | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/.mailmap b/.mailmap
index 3843868716..6d99110ffe 100644
--- a/.mailmap
+++ b/.mailmap
@@ -173,7 +173,7 @@ Ben Walker 
 Bernard Iremonger 
 Bert van Leeuwen 
 Bhagyada Modali 
-Bharat Mota 
+Bharat Mota  
 Bhuvan Mital 
 Bill Hong 
 Billy McFall 
@@ -184,7 +184,7 @@ Bin Zheng 
 Björn Töpel 
 Bo Chen 
 Boleslav Stankevich 
-Boon Ang 
+Boon Ang  
 Boris Ouretskey 
 Boris Pismenny 
 Brad Larson 
@@ -469,7 +469,7 @@ Guduri Prathyusha 
 Guido Barzini 
 Guillaume Gaudonville 
 Guinan Sun 
-Guolin Yang 
+Guolin Yang  
 Guo Xin 
 Guoyang Zhou 
 Guruprasad Rao 
@@ -659,13 +659,13 @@ Jing Chen 
 Jingguo Fu 
 Jingjing Wu 
 Jingzhao Ni 
-Jin Heo 
+Jin Heo  
 Jin Ling 
 Jin Liu 
 Jin Yu 
 Jiri Slaby 
 Job Abraham 
-Jochen Behrens 
+Jochen Behrens  
 Joey Xing 
 Johan Faltstrom 
 Johan Källström 
@@ -1075,7 +1075,7 @@ Padraig Connolly 
 Pallantla Poornima 
 Pallavi Kadam 
 Pandi Kumar Maharajan 
-Pankaj Gupta  
+Pankaj Gupta   

 Panu Matilainen 
 Paolo Valerio 
 Parav Pandit  
@@ -1228,7 +1228,7 @@ Roman Kapl 
 Roman Korynkevych 
 Roman Storozhenko 
 Roman Zhukov  
-Ronak Doshi 
+Ronak Doshi  
 Ron Beider 
 Ronghua Zhang 
 RongQiang Xie 
-- 
2.34.1




Reminder - DPDK Tech Board Meeting - Tomorrow, Wed. April 17, 2024 - 8am Pacific/11am Eastern/1500h UTC

2024-04-16 Thread Nathan Southern
You have been invited to a recurring meeting for Data Plane Development Kit
(DPDK)

Agenda (read-only):
https://annuel.framapad.org/p/r.0c3cc4d1e011214183872a98f6b5c7db



Minutes:
http://core.dpdk.org/techboard/minutes


Ways to join meeting:

1. Join from PC, Mac, iPad, or Android

https://zoom-lfx.platform.linuxfoundation.org/meeting/96459488340?password=d808f1f6-0a28-4165-929e-5a5bcae7efeb


2. Join via audio

One tap mobile:
US: +12532158782,,96459488340# or +13462487799,,96459488340

Or dial:
US: +1 253 215 8782 or +1 346 248 7799 or +1 669 900 6833 or +1 301 715
8592 or +1 312 626 6799 or +1 646 374 8656 or 877 369 0926 (Toll Free) or
855 880 1246 (Toll Free)
Canada: +1 647 374 4685 or +1 647 558 0588 or +1 778 907 2071 or +1 204 272
7920 or +1 438 809 7799 or +1 587 328 1099 or 855 703 8985 (Toll Free)

Meeting ID: 96459488340

Meeting Passcode: 699526


International numbers: https://zoom.us/u/alwnPIaVT



[PATCH v2 1/2] eventdev/dma: reorganize event DMA ops

2024-04-16 Thread pbhagavatula
From: Pavan Nikhilesh 

Re-organize event DMA ops structure to allow holding
source and destination pointers without the need for
additional memory, the mempool allocating memory for
rte_event_dma_adapter_ops can size the structure to
accommodate all the needed source and destination
pointers.

Add multiple words for holding user metadata, adapter
implementation specific metadata and event metadata.
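
Not part of the patch, just a reading aid: a small sketch of how an
application could size the op mempool for the reworked layout and fill one
copy operation, with a single source and a single destination kept in the
trailing src_dst_seg[] array (the field introduced here). The function names
and the nb_ops/src_iova/dst_iova/len parameters are invented for the example.

#include <rte_event_dma_adapter.h>
#include <rte_lcore.h>
#include <rte_mempool.h>

/* Element size covers the op plus two trailing rte_dma_sge entries
 * (one source, one destination), mirroring the test_perf_common.c hunk. */
static struct rte_mempool *
create_dma_op_pool(const char *name, unsigned int nb_ops)
{
	unsigned int elt_size = sizeof(struct rte_event_dma_adapter_op) +
				2 * sizeof(struct rte_dma_sge);

	return rte_mempool_create(name, nb_ops, elt_size, 256, 0,
				  NULL, NULL, NULL, NULL, rte_socket_id(), 0);
}

/* Sources are stored first in src_dst_seg[], destinations follow. */
static void
fill_copy_op(struct rte_event_dma_adapter_op *op, rte_iova_t src_iova,
	     rte_iova_t dst_iova, uint32_t len)
{
	op->nb_src = 1;
	op->nb_dst = 1;
	op->src_dst_seg[0].addr = src_iova;
	op->src_dst_seg[0].length = len;
	op->src_dst_seg[1].addr = dst_iova;
	op->src_dst_seg[1].length = len;
	op->flags = RTE_DMA_OP_FLAG_SUBMIT;
}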

Signed-off-by: Pavan Nikhilesh 
---
 v2 Changes:
 - Fix 32bit compilation

 app/test-eventdev/test_perf_common.c| 26 
 app/test/test_event_dma_adapter.c   | 20 +++--
 doc/guides/prog_guide/event_dma_adapter.rst |  2 +-
 drivers/dma/cnxk/cnxk_dmadev_fp.c   | 39 +++--
 lib/eventdev/rte_event_dma_adapter.c| 27 
 lib/eventdev/rte_event_dma_adapter.h| 46 +++--
 6 files changed, 72 insertions(+), 88 deletions(-)

diff --git a/app/test-eventdev/test_perf_common.c 
b/app/test-eventdev/test_perf_common.c
index 93e6132de8..db0f9c1f3b 100644
--- a/app/test-eventdev/test_perf_common.c
+++ b/app/test-eventdev/test_perf_common.c
@@ -1503,7 +1503,6 @@ perf_event_dev_port_setup(struct evt_test *test, struct 
evt_options *opt,
prod = 0;
for (; port < perf_nb_event_ports(opt); port++) {
struct prod_data *p = &t->prod[port];
-   struct rte_event *response_info;
uint32_t flow_id;

p->dev_id = opt->dev_id;
@@ -1523,13 +1522,10 @@ perf_event_dev_port_setup(struct evt_test *test, struct 
evt_options *opt,
for (flow_id = 0; flow_id < t->nb_flows; flow_id++) {
rte_mempool_get(t->da_op_pool, (void **)&op);

-   op->src_seg = rte_malloc(NULL, sizeof(struct 
rte_dma_sge), 0);
-   op->dst_seg = rte_malloc(NULL, sizeof(struct 
rte_dma_sge), 0);
-
-   op->src_seg->addr = 
rte_pktmbuf_iova(rte_pktmbuf_alloc(pool));
-   op->dst_seg->addr = 
rte_pktmbuf_iova(rte_pktmbuf_alloc(pool));
-   op->src_seg->length = 1024;
-   op->dst_seg->length = 1024;
+   op->src_dst_seg[0].addr = 
rte_pktmbuf_iova(rte_pktmbuf_alloc(pool));
+   op->src_dst_seg[1].addr = 
rte_pktmbuf_iova(rte_pktmbuf_alloc(pool));
+   op->src_dst_seg[0].length = 1024;
+   op->src_dst_seg[1].length = 1024;
op->nb_src = 1;
op->nb_dst = 1;
op->flags = RTE_DMA_OP_FLAG_SUBMIT;
@@ -1537,12 +1533,6 @@ perf_event_dev_port_setup(struct evt_test *test, struct 
evt_options *opt,
op->dma_dev_id = dma_dev_id;
op->vchan = vchan_id;

-   response_info = (struct rte_event *)((uint8_t 
*)op +
-sizeof(struct 
rte_event_dma_adapter_op));
-   response_info->queue_id = p->queue_id;
-   response_info->sched_type = 
RTE_SCHED_TYPE_ATOMIC;
-   response_info->flow_id = flow_id;
-
p->da.dma_op[flow_id] = op;
}

@@ -2036,7 +2026,7 @@ perf_dmadev_setup(struct evt_test *test, struct 
evt_options *opt)
return -ENODEV;
}

-   elt_size = sizeof(struct rte_event_dma_adapter_op) + sizeof(struct 
rte_event);
+   elt_size = sizeof(struct rte_event_dma_adapter_op) + (sizeof(struct 
rte_dma_sge) * 2);
t->da_op_pool = rte_mempool_create("dma_op_pool", opt->pool_sz, 
elt_size, 256,
   0, NULL, NULL, NULL, NULL, 
rte_socket_id(), 0);
if (t->da_op_pool == NULL) {
@@ -2085,10 +2075,8 @@ perf_dmadev_destroy(struct evt_test *test, struct 
evt_options *opt)
for (flow_id = 0; flow_id < t->nb_flows; flow_id++) {
op = p->da.dma_op[flow_id];

-   rte_pktmbuf_free((struct rte_mbuf 
*)(uintptr_t)op->src_seg->addr);
-   rte_pktmbuf_free((struct rte_mbuf 
*)(uintptr_t)op->dst_seg->addr);
-   rte_free(op->src_seg);
-   rte_free(op->dst_seg);
+   rte_pktmbuf_free((struct rte_mbuf 
*)(uintptr_t)op->src_dst_seg[0].addr);
+   rte_pktmbuf_free((struct rte_mbuf 
*)(uintptr_t)op->src_dst_seg[1].addr);
rte_mempool_put(op->op_mp, op);
}

diff --git a/app/test/test_event_dma_adapter.c 
b/app/test/test_event_dma_adapter.c
index 35b417b69f..d9dff4ff7d 100644
--- a/app/test/test_event_dma_adapter.c
+++ b/app/test/test_event_dma_adapter.c
@@ -235,7 +235,

[PATCH v2 2/2] dma/cnxk: remove completion pool

2024-04-16 Thread pbhagavatula
From: Pavan Nikhilesh 

Use DMA ops to store metadata, remove use of completion pool.

Signed-off-by: Pavan Nikhilesh 
Acked-by: Vamsi Attunuru 
---
 drivers/dma/cnxk/cnxk_dmadev.c   | 53 ++--
 drivers/dma/cnxk/cnxk_dmadev.h   | 24 +--
 drivers/dma/cnxk/cnxk_dmadev_fp.c| 79 +---
 drivers/event/cnxk/cnxk_eventdev_adptr.c | 47 +++---
 4 files changed, 45 insertions(+), 158 deletions(-)

diff --git a/drivers/dma/cnxk/cnxk_dmadev.c b/drivers/dma/cnxk/cnxk_dmadev.c
index 4ab3cfbdf2..dfd7222713 100644
--- a/drivers/dma/cnxk/cnxk_dmadev.c
+++ b/drivers/dma/cnxk/cnxk_dmadev.c
@@ -2,6 +2,8 @@
  * Copyright (C) 2021 Marvell International Ltd.
  */
 
+#include 
+
 #include 
 
 static int cnxk_stats_reset(struct rte_dma_dev *dev, uint16_t vchan);
@@ -30,8 +32,7 @@ cnxk_dmadev_vchan_free(struct cnxk_dpi_vf_s *dpivf, uint16_t 
vchan)
 {
struct cnxk_dpi_conf *dpi_conf;
uint16_t num_vchans;
-   uint16_t max_desc;
-   int i, j;
+   int i;
 
if (vchan == RTE_DMA_ALL_VCHAN) {
num_vchans = dpivf->num_vchans;
@@ -46,12 +47,6 @@ cnxk_dmadev_vchan_free(struct cnxk_dpi_vf_s *dpivf, uint16_t 
vchan)
 
for (; i < num_vchans; i++) {
dpi_conf = &dpivf->conf[i];
-   max_desc = dpi_conf->c_desc.max_cnt + 1;
-   if (dpi_conf->c_desc.compl_ptr) {
-   for (j = 0; j < max_desc; j++)
-   rte_free(dpi_conf->c_desc.compl_ptr[j]);
-   }
-
rte_free(dpi_conf->c_desc.compl_ptr);
dpi_conf->c_desc.compl_ptr = NULL;
}
@@ -261,7 +256,7 @@ cnxk_dmadev_vchan_setup(struct rte_dma_dev *dev, uint16_t 
vchan,
if (max_desc > CNXK_DPI_MAX_DESC)
max_desc = CNXK_DPI_MAX_DESC;
 
-   size = (max_desc * sizeof(struct cnxk_dpi_compl_s *));
+   size = (max_desc * sizeof(uint8_t) * CNXK_DPI_COMPL_OFFSET);
dpi_conf->c_desc.compl_ptr = rte_zmalloc(NULL, size, 0);
 
if (dpi_conf->c_desc.compl_ptr == NULL) {
@@ -269,16 +264,8 @@ cnxk_dmadev_vchan_setup(struct rte_dma_dev *dev, uint16_t 
vchan,
return -ENOMEM;
}
 
-   for (i = 0; i < max_desc; i++) {
-   dpi_conf->c_desc.compl_ptr[i] =
-   rte_zmalloc(NULL, sizeof(struct cnxk_dpi_compl_s), 0);
-   if (!dpi_conf->c_desc.compl_ptr[i]) {
-   plt_err("Failed to allocate for descriptor memory");
-   return -ENOMEM;
-   }
-
-   dpi_conf->c_desc.compl_ptr[i]->cdata = CNXK_DPI_REQ_CDATA;
-   }
+   for (i = 0; i < max_desc; i++)
+   dpi_conf->c_desc.compl_ptr[i * CNXK_DPI_COMPL_OFFSET] = 
CNXK_DPI_REQ_CDATA;
 
dpi_conf->c_desc.max_cnt = (max_desc - 1);
 
@@ -301,10 +288,8 @@ cnxk_dmadev_start(struct rte_dma_dev *dev)
dpi_conf->pnum_words = 0;
dpi_conf->pending = 0;
dpi_conf->desc_idx = 0;
-   for (j = 0; j < dpi_conf->c_desc.max_cnt + 1; j++) {
-   if (dpi_conf->c_desc.compl_ptr[j])
-   dpi_conf->c_desc.compl_ptr[j]->cdata = 
CNXK_DPI_REQ_CDATA;
-   }
+   for (j = 0; j < dpi_conf->c_desc.max_cnt + 1; j++)
+   dpi_conf->c_desc.compl_ptr[j * CNXK_DPI_COMPL_OFFSET] = 
CNXK_DPI_REQ_CDATA;
nb_desc += dpi_conf->c_desc.max_cnt + 1;
cnxk_stats_reset(dev, i);
dpi_conf->completed_offset = 0;
@@ -382,22 +367,22 @@ cnxk_dmadev_completed(void *dev_private, uint16_t vchan, 
const uint16_t nb_cpls,
struct cnxk_dpi_vf_s *dpivf = dev_private;
struct cnxk_dpi_conf *dpi_conf = &dpivf->conf[vchan];
struct cnxk_dpi_cdesc_data_s *c_desc = &dpi_conf->c_desc;
-   struct cnxk_dpi_compl_s *comp_ptr;
+   uint8_t status;
int cnt;
 
for (cnt = 0; cnt < nb_cpls; cnt++) {
-   comp_ptr = c_desc->compl_ptr[c_desc->head];
-
-   if (comp_ptr->cdata) {
-   if (comp_ptr->cdata == CNXK_DPI_REQ_CDATA)
+   status = c_desc->compl_ptr[c_desc->head * 
CNXK_DPI_COMPL_OFFSET];
+   if (status) {
+   if (status == CNXK_DPI_REQ_CDATA)
break;
*has_error = 1;
dpi_conf->stats.errors++;
+   c_desc->compl_ptr[c_desc->head * CNXK_DPI_COMPL_OFFSET] 
=
+   CNXK_DPI_REQ_CDATA;
CNXK_DPI_STRM_INC(*c_desc, head);
break;
}
-
-   comp_ptr->cdata = CNXK_DPI_REQ_CDATA;
+   c_desc->compl_ptr[c_desc->head * CNXK_DPI_COMPL_OFFSET] = 
CNXK_DPI_REQ_CDATA;
CNXK_DPI_STRM_INC(*c_desc, head);
}
 
@@ -414,19 +399,17 @@ cnxk_dmadev_completed_status(void *dev_private, 

RE: [EXTERNAL] [PATCH 2/5] app: use RTE_DIM

2024-04-16 Thread Akhil Goyal
> Use RTE_DIM instead of computing directly with sizeof.
> Patch automatically generated via cocci/rte_dim.cocci.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  app/test-flow-perf/main.c  |  4 +--
>  app/test/test_ethdev_link.c|  3 +-
>  app/test/test_security_inline_macsec.c | 47 +-
>  3 files changed, 25 insertions(+), 29 deletions(-)
Acked-by: Akhil Goyal 
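
For context, RTE_DIM() (from rte_common.h) expands to the element count of a
fixed-size array; a tiny illustration with a made-up array:

#include <stdint.h>
#include <rte_common.h>

static const uint16_t example_ids[] = { 1, 100, 200 };

/* Before: open-coded element count. */
static const unsigned int n_before = sizeof(example_ids) / sizeof(example_ids[0]);
/* After: same value via the macro. */
static const unsigned int n_after = RTE_DIM(example_ids);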


RE: [EXTERNAL] [PATCH v2 76/83] app/test: move alignment attribute on types

2024-04-16 Thread Akhil Goyal
> Move location of __rte_aligned(a) to new conventional location. The new
> placement between {struct,union} and the tag allows the desired
> alignment to be imparted on the type regardless of the toolchain being
> used for both C and C++. Additionally, it avoids confusion by Doxygen
> when generating documentation.
> 
> Signed-off-by: Tyler Retzlaff 
> Acked-by: Morten Brørup 
> ---
>  app/test/test_cryptodev_aead_test_vectors.h|  6 +++---
>  app/test/test_cryptodev_blockcipher.h  |  2 +-
>  app/test/test_cryptodev_kasumi_test_vectors.h  |  2 +-
>  app/test/test_cryptodev_mixed_test_vectors.h   |  2 +-
>  app/test/test_cryptodev_security_docsis_test_vectors.h |  2 +-
>  app/test/test_cryptodev_snow3g_test_vectors.h  |  2 +-
>  app/test/test_cryptodev_zuc_test_vectors.h |  2 +-

Acked-by: Akhil Goyal 
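
A small illustration of the convention (the struct names, and the assumption
that the attribute previously trailed the definition, are mine, not from the
patch):

#include <stdint.h>
#include <rte_common.h>

/* Older placement: attribute trailing the definition. */
struct old_placement {
	uint8_t data[64];
} __rte_aligned(64);

/* New conventional placement: between the struct keyword and the tag,
 * so the alignment applies to the type regardless of toolchain, in C and C++. */
struct __rte_aligned(64) new_placement {
	uint8_t data[64];
};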


RE: [EXTERNAL] [PATCH v2 01/83] examples: move alignment attribute on types

2024-04-16 Thread Akhil Goyal
> Move location of __rte_aligned(a) to new conventional location. The new
> placement between {struct,union} and the tag allows the desired
> alignment to be imparted on the type regardless of the toolchain being
> used for both C and C++. Additionally, it avoids confusion by Doxygen
> when generating documentation.
> 
> Signed-off-by: Tyler Retzlaff 
> Reviewed-by: Morten Brørup 
> Acked-by: Morten Brørup 
> ---


>  examples/ipsec-secgw/ipsec-secgw.c |  4 +--
>  examples/ipsec-secgw/ipsec-secgw.h |  8 +++---
>  examples/ipsec-secgw/ipsec.h   | 22 +++
>  examples/ipsec-secgw/ipsec_worker.h|  4 +--

Acked-by: Akhil Goyal