RE: [RFC 1/5] ethdev: add port affinity match item

2023-01-18 Thread Jiawei(Jonny) Wang
Hi,

> 
> 21/12/2022 11:29, Jiawei Wang:
> > +   /**
> > +* Matches on the physical port affinity of the received packet.
> > +*
> > +* See struct rte_flow_item_port_affinity.
> > +*/
> > +   RTE_FLOW_ITEM_TYPE_PORT_AFFINITY,
> >  };
> 
> I'm not sure about the word "affinity".
> I think you want to match on a physical port.
> It could be a global physical port id or an index in the group of physical 
> ports
> connected to a single DPDK port.
> In first case, the name of the item could be RTE_FLOW_ITEM_TYPE_PHY_PORT,
> in the second case, the name could be
> RTE_FLOW_ITEM_TYPE_MHPSDP_PHY_PORT,
> "MHPSDP" meaning "Multiple Hardware Ports - Single DPDK Port".
> We could replace "PHY" with "HW" as well.
>

Since DPDK only probes and attaches the single port, the first case does not
seem to fit here.
Here, 'affinity' stands for the packet's association with the actual physical port.
 
> Note that we cannot use the new item
> RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT
> because we are in a case where multiple hardware ports are merged in a single
> software represented port.
> 
> 
> [...]
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ITEM_TYPE_PORT_AFFINITY
> > + *
> > + * For the multiple hardware ports connect to a single DPDK port
> > +(mhpsdp),
> > + * use this item to match the hardware port affinity of the packets.
> > + */
> > +struct rte_flow_item_port_affinity {
> > +   uint8_t affinity; /**< port affinity value. */ };
> 
> We need to define how the port numbering is done.
> Is it driver-dependent?
> Does it start at 0? etc...
> 
> 

A user can define any value they want; one use case is that a packet could be
received and sent on the same port, in which case they can set the same
'affinity' value in both the flow and the queue configuration.

The flow behavior is driver dependent.

Thanks.
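
As a rough illustration (assuming the RFC definitions quoted above, whose
names are still under discussion), an application could match on the item
like this:

#include <rte_flow.h>

/* Minimal sketch, assuming the RFC names above (they may still change):
 * match packets received with affinity 1 on a single DPDK port and steer
 * them to Rx queue 0.
 */
static struct rte_flow *
create_affinity_flow(uint16_t port_id, struct rte_flow_error *error)
{
    struct rte_flow_attr attr = { .ingress = 1 };
    struct rte_flow_item_port_affinity spec = { .affinity = 1 };
    struct rte_flow_item pattern[] = {
        { .type = RTE_FLOW_ITEM_TYPE_PORT_AFFINITY, .spec = &spec },
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };
    struct rte_flow_action_queue queue = { .index = 0 };
    struct rte_flow_action actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };

    return rte_flow_create(port_id, &attr, pattern, actions, error);
}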


RE: [RFC 2/5] ethdev: introduce the affinity field in Tx queue API

2023-01-18 Thread Jiawei(Jonny) Wang
Hi,


> 
> 21/12/2022 11:29, Jiawei Wang:
> > For the multiple hardware ports connect to a single DPDK port
> > (mhpsdp), the previous patch introduces the new rte flow item to match
> > the port affinity of the received packets.
> >
> > This patch adds the tx_affinity setting in Tx queue API, the affinity
> > value reflects packets be sent to which hardware port.
> 
> I think "affinity" means we would like packet to be sent on a specific 
> hardware
> port, but it is not mandatory.
> Is it the meaning you want? Or should it be a mandatory port?
> 

Right, it is an optional setting, not mandatory.

> > Adds the new tx_affinity field into the padding hole of rte_eth_txconf
> > structure, the size of rte_eth_txconf keeps the same. Adds a suppress
> > type for structure change in the ABI check file.
> >
> > This patch adds the testpmd command line:
> > testpmd> port config (port_id) txq (queue_id) affinity (value)
> >
> > For example, there're two hardware ports connects to a single DPDK
> 
> connects -> connected
> 

OK, will fix in next version.

> > port (port id 0), and affinity 1 stood for hard port 1 and affinity
> > 2 stood for hardware port 2, used the below command to config tx
> > affinity for each TxQ:
> > port config 0 txq 0 affinity 1
> > port config 0 txq 1 affinity 1
> > port config 0 txq 2 affinity 2
> > port config 0 txq 3 affinity 2
> >
> > These commands config the TxQ index 0 and TxQ index 1 with affinity 1,
> > uses TxQ 0 or TxQ 1 send packets, these packets will be sent from the
> > hardware port 1, and similar with hardware port 2 if sending packets
> > with TxQ 2 or TxQ 3.
> 
> [...]
> > @@ -212,6 +212,10 @@ API Changes
> > +* ethdev: added a new field:
> > +
> > +  - Tx affinity per-queue ``rte_eth_txconf.tx_affinity``
> 
> Adding a new field is not an API change because existing applications don't
> need to update their code if they don't care this new field.
> I think you can remove this note.
> 

OK, will remove in next version.

> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -1138,6 +1138,7 @@ struct rte_eth_txconf {
> >   less free descriptors than this value. */
> >
> > uint8_t tx_deferred_start; /**< Do not start queue with
> > rte_eth_dev_start(). */
> > +   uint8_t tx_affinity; /**< Drives the setting of affinity per-queue.
> > +*/
> 
> Why "Drives"? It is the setting, right?
> rte_eth_txconf is per-queue so no need to repeat.
> I think a good comment here would be to mention it is a physical port index 
> for
> mhpsdp.
> Another good comment would be to specify how ports are numbered.
> 

OK, will update the comment for this new setting.

Thanks.
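
For illustration, a minimal sketch (assuming the tx_affinity field name from
this RFC, which is still under discussion) of configuring four Tx queues so
that queues 0/1 are bound to hardware port 1 and queues 2/3 to hardware
port 2:

#include <rte_ethdev.h>

/* Hedged sketch based on the RFC: the tx_affinity field may be renamed. */
static int
setup_tx_affinity(uint16_t port_id, uint16_t nb_desc, unsigned int socket_id)
{
    struct rte_eth_dev_info dev_info;
    struct rte_eth_txconf txconf;
    uint16_t q;
    int ret;

    ret = rte_eth_dev_info_get(port_id, &dev_info);
    if (ret != 0)
        return ret;
    txconf = dev_info.default_txconf;
    for (q = 0; q < 4; q++) {
        /* hardware port index; 0 would mean "no affinity" */
        txconf.tx_affinity = (q < 2) ? 1 : 2;
        ret = rte_eth_tx_queue_setup(port_id, q, nb_desc, socket_id, &txconf);
        if (ret != 0)
            return ret;
    }
    return 0;
}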



RE: [RFC 2/5] ethdev: introduce the affinity field in Tx queue API

2023-01-24 Thread Jiawei(Jonny) Wang
Hi,

> 18/01/2023 15:44, Jiawei(Jonny) Wang:
> > > 21/12/2022 11:29, Jiawei Wang:
> > > > For the multiple hardware ports connect to a single DPDK port
> > > > (mhpsdp), the previous patch introduces the new rte flow item to
> > > > match the port affinity of the received packets.
> > > >
> > > > This patch adds the tx_affinity setting in Tx queue API, the
> > > > affinity value reflects packets be sent to which hardware port.
> > >
> > > I think "affinity" means we would like packet to be sent on a
> > > specific hardware port, but it is not mandatory.
> > > Is it the meaning you want? Or should it be a mandatory port?
> >
> > Right, it is an optional setting, not mandatory.
> 
> I think there is a misunderstanding.
> I mean that "affinity" with port 0 may suggest that we try to send to port 0 
> but
> sometimes the packet will be sent to port 1.
>
> And I think you want the packet to be always sent to port 0 if affinity is 0, 
> right?
>

These packets should always be sent to hardware port 0 if 'affinity' is set
to hardware port 0.
As for 'affinity is 0': the value 0 means that no affinity is set and traffic
keeps the same behavior as before, for example, routing between different
hardware ports.
 
> If yes, I think the word "affinity" does not convey the right idea.
> And again, the naming should give the idea that we are talking about multiple
> ports merged in one DPDK port.
> 

OK, how about 'tx_mhpsdp_hwport'?
'mhpsdp' is as mentioned before, and 'hwport' stands for 'hardware port'.

> > > > Adds the new tx_affinity field into the padding hole of
> > > > rte_eth_txconf structure, the size of rte_eth_txconf keeps the
> > > > same. Adds a suppress type for structure change in the ABI check file.
> > > >
> > > > This patch adds the testpmd command line:
> > > > testpmd> port config (port_id) txq (queue_id) affinity (value)
> > > >
> > > > For example, there're two hardware ports connects to a single DPDK
> > >
> > > connects -> connected
> >
> > OK, will fix in next version.
> >
> > > > port (port id 0), and affinity 1 stood for hard port 1 and
> > > > affinity
> > > > 2 stood for hardware port 2, used the below command to config tx
> > > > affinity for each TxQ:
> > > > port config 0 txq 0 affinity 1
> > > > port config 0 txq 1 affinity 1
> > > > port config 0 txq 2 affinity 2
> > > > port config 0 txq 3 affinity 2
> > > >
> > > > These commands config the TxQ index 0 and TxQ index 1 with
> > > > affinity 1, uses TxQ 0 or TxQ 1 send packets, these packets will
> > > > be sent from the hardware port 1, and similar with hardware port 2
> > > > if sending packets with TxQ 2 or TxQ 3.
> > >
> > > [...]
> > > > @@ -212,6 +212,10 @@ API Changes
> > > > +* ethdev: added a new field:
> > > > +
> > > > +  - Tx affinity per-queue ``rte_eth_txconf.tx_affinity``
> > >
> > > Adding a new field is not an API change because existing
> > > applications don't need to update their code if they don't care this new 
> > > field.
> > > I think you can remove this note.
> >
> > OK, will remove in next version.
> >
> > > > --- a/lib/ethdev/rte_ethdev.h
> > > > +++ b/lib/ethdev/rte_ethdev.h
> > > > @@ -1138,6 +1138,7 @@ struct rte_eth_txconf {
> > > >   less free descriptors than this 
> > > > value. */
> > > >
> > > > uint8_t tx_deferred_start; /**< Do not start queue with
> > > > rte_eth_dev_start(). */
> > > > +   uint8_t tx_affinity; /**< Drives the setting of affinity 
> > > > per-queue.
> > > > +*/
> > >
> > > Why "Drives"? It is the setting, right?
> > > rte_eth_txconf is per-queue so no need to repeat.
> > > I think a good comment here would be to mention it is a physical
> > > port index for mhpsdp.
> > > Another good comment would be to specify how ports are numbered.
> >
> > OK, will update the comment for this new setting.
> >
> > Thanks.
> 
> 



RE: [RFC 1/5] ethdev: add port affinity match item

2023-01-24 Thread Jiawei(Jonny) Wang
Hi,


> > >
> > > 21/12/2022 11:29, Jiawei Wang:
> > > > +   /**
> > > > +* Matches on the physical port affinity of the received packet.
> > > > +*
> > > > +* See struct rte_flow_item_port_affinity.
> > > > +*/
> > > > +   RTE_FLOW_ITEM_TYPE_PORT_AFFINITY,
> > > >  };
> > >
> > > I'm not sure about the word "affinity".
> > > I think you want to match on a physical port.
> > > It could be a global physical port id or an index in the group of
> > > physical ports connected to a single DPDK port.
> > > In first case, the name of the item could be
> > > RTE_FLOW_ITEM_TYPE_PHY_PORT, in the second case, the name could be
> > > RTE_FLOW_ITEM_TYPE_MHPSDP_PHY_PORT,
> > > "MHPSDP" meaning "Multiple Hardware Ports - Single DPDK Port".
> > > We could replace "PHY" with "HW" as well.
> > >
> >
> > Since DPDK only probes and attaches the single port, the first case does
> > not seem to fit here.
> > Here, 'affinity' stands for the packet's association with the actual
> > physical port.
> 
> I think it is more than affinity because the packet is effectively received 
> from
> this port.
> And the other concern is that this name does not give any clue that we are
> talking about multiple ports merged in a single one.
> 

Is RTE_FLOW_ITEM_TYPE_MHPSDP_HW_PORT better? @Ori Kam, WDYT?

> > > Note that we cannot use the new item
> > > RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT
> > > because we are in a case where multiple hardware ports are merged in
> > > a single software represented port.
> > >
> > >
> > > [...]
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this structure may change without prior
> > > > +notice
> > > > + *
> > > > + * RTE_FLOW_ITEM_TYPE_PORT_AFFINITY
> > > > + *
> > > > + * For the multiple hardware ports connect to a single DPDK port
> > > > +(mhpsdp),
> > > > + * use this item to match the hardware port affinity of the packets.
> > > > + */
> > > > +struct rte_flow_item_port_affinity {
> > > > +   uint8_t affinity; /**< port affinity value. */ };
> > >
> > > We need to define how the port numbering is done.
> > > Is it driver-dependent?
> > > Does it start at 0? etc...
> >
> > A user can define any value they want; one use case is that a packet could
> > be received and sent on the same port, in which case they can set the same
> > 'affinity' value in both the flow and the queue configuration.
> 
> No it does not work.
> If ports are numbered 1 and 2, and user thinks it is 0 and 1, the port 2 
> won't be
> matched at all.
> 

OK, I can update the documentation so that affinity 0 means no affinity on the
Tx side, and matching on affinity 0 will then result in an error.
For the above case, the user should use 1 and 2 to match.

> > The flow behavior is driver dependent.
> >
> > Thanks.
> 
> 
> 
> 



RE: [PATCH v2 2/2] ethdev: introduce the PHY affinity field in Tx queue API

2023-02-01 Thread Jiawei(Jonny) Wang


> 30/01/2023 18:00, Jiawei Wang:
> > --- a/devtools/libabigail.abignore
> > +++ b/devtools/libabigail.abignore
> > @@ -20,6 +20,11 @@
> >  [suppress_file]
> >  soname_regexp = ^librte_.*mlx.*glue\.
> >
> > +; Ignore fields inserted in middle padding of rte_eth_txconf
> > +[suppress_type]
> > +name = rte_eth_txconf
> > +has_data_member_inserted_between =
> > +{offset_after(tx_deferred_start), offset_of(offloads)}
> 
> You are adding the exception inside
> "Core suppression rules: DO NOT TOUCH".
> 
> Please move it at the end in the section "Temporary exceptions till next major
> ABI version"
> 

OK, will move.

> Also the rule does not work.
> It should be:
>   has_data_member_inserted_between = {offset_of(tx_deferred_start),
> offset_of(offloads)}
> 

Thanks, will change it and send a new version.
> 
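
For reference, the corrected stanza suggested above would then read (placed
under the "Temporary exceptions till next major ABI version" section):

; Ignore fields inserted in middle padding of rte_eth_txconf
[suppress_type]
        name = rte_eth_txconf
        has_data_member_inserted_between = {offset_of(tx_deferred_start), offset_of(offloads)}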



RE: [PATCH v2 1/2] ethdev: add PHY affinity match item

2023-02-01 Thread Jiawei(Jonny) Wang
Hi,

> -Original Message-
> From: Andrew Rybchenko 
> Sent: Wednesday, February 1, 2023 4:50 PM
> On 1/30/23 20:00, Jiawei Wang wrote:
> > For the multiple hardware ports connect to a single DPDK port
> > (mhpsdp),
> 
> Sorry, what is mhpsdp?
> 

(m)ultiple (h)ardware (p)orts (s)ingle (D)PDK (p)ort.
It's a short name for "multiple hardware ports connected to a single DPDK port".

> > currently, there is no information to indicate the packet belongs to
> > which hardware port.
> >
> > This patch introduces a new phy affinity item in rte flow API, and
> 
> "This patch introduces ..." -> "Introduce ..."
> rte -> RTE
> 

OK.
> > the phy affinity value reflects the physical port of the received packets.
> >
> > While uses the phy affinity as a matching item in the flow, and sets
> > the same phy_affinity value on the tx queue, then the packet can be
> > sent from
> 
> tx -> Tx
> 

OK.
> > the same hardware port with received.
> >
> > This patch also adds the testpmd command line to match the new item:
> > flow create 0 ingress group 0 pattern phy_affinity affinity is 1 /
> > end actions queue index 0 / end
> >
> > The above command means that creates a flow on a single DPDK port and
> > matches the packet from the first physical port (assume the phy
> > affinity 1
> 
> Why is it numbered from 1, not 0? Anyway it should be defined in the
> documentation below.
> 

When using the phy affinity as a matching item in the flow, and setting the
same phy_affinity value on the Tx queue, the packet can be sent from the same
hardware port on which it was received.

So, if phy affinity 0 means no affinity, then the first value should be 1.


> > stands for the first port) and redirects these packets into RxQ 0.
> >
> > Signed-off-by: Jiawei Wang 
> 
> [snip]
> 
> > diff --git a/doc/guides/rel_notes/release_23_03.rst
> > b/doc/guides/rel_notes/release_23_03.rst
> > index c15f6fbb9f..a1abd67771 100644
> > --- a/doc/guides/rel_notes/release_23_03.rst
> > +++ b/doc/guides/rel_notes/release_23_03.rst
> > @@ -69,6 +69,11 @@ New Features
> >   ``rte_event_dev_config::nb_single_link_event_port_queues`` parameter
> >   required for eth_rx, eth_tx, crypto and timer eventdev adapters.
> >
> > +* **Added rte_flow support for matching PHY Affinity fields.**
> 
> Why "Affinity", not "affinity"?
> 

Correct, will update.
> > +
> > +  For the multiple hardware ports connect to a single DPDK port
> > + (mhpsdp),  Added ``phy_affinity`` item in rte_flow to support
> > + physical affinity of  the packets.
> 
> Please, add one more empty line to have two before the next section.
> 
OK.
> >
> >   Removed Items
> >   -
> 
> [snip]
> 
> > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
> > b60987db4b..56c04ea37c 100644
> > --- a/lib/ethdev/rte_flow.h
> > +++ b/lib/ethdev/rte_flow.h
> 
> > @@ -2103,6 +2110,27 @@ static const struct rte_flow_item_meter_color
> rte_flow_item_meter_color_mask = {
> >   };
> >   #endif
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ITEM_TYPE_PHY_AFFINITY
> > + *
> > + * For the multiple hardware ports connect to a single DPDK port
> > +(mhpsdp),
> > + * use this item to match the physical affinity of the packets.
> > + */
> > +struct rte_flow_item_phy_affinity {
> > +   uint8_t affinity; /**< physical affinity value. */
> 
> Sorry, I'd like to know how application should find out which values may be
> used here? How many physical ports are behind this one DPDK ethdev?
> 

Like in a Linux bonding scenario, multiple physical ports (for example PF1 and
PF2) can be added into a bond port in the slave role; DPDK only probes and
attaches the bond master port (bond0), so there are two phy affinity values in
total.

The PMD can define the phy affinity and map it to the physical port, or I can
document the numbering at the RTE level.

> Also, please, define which value should be used for the first port 0 or 1. 
> I'd vote
> for 0.

If we need to define the affinity numbering, I prefer to use 1 for the first
port and reserve 0, which also keeps the same value as the Tx side (the second
patch introduces tx_phy_affinity).
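
Putting the two patches together, a rough sketch (assuming the v2 names
phy_affinity / tx_phy_affinity and numbering starting at 1) of pinning
received and transmitted traffic to the same physical port:

#include <rte_ethdev.h>
#include <rte_flow.h>

/* Hedged sketch, v2 naming: steer packets received on the first physical
 * port (phy affinity 1) to Rx queue 0, and bind Tx queue 0 to the same
 * physical port so replies leave on the port the packets arrived on.
 */
static int
pin_queue_to_first_phy_port(uint16_t port_id, struct rte_flow_error *error)
{
    struct rte_flow_attr attr = { .ingress = 1 };
    struct rte_flow_item_phy_affinity aff = { .affinity = 1 };
    struct rte_flow_item pattern[] = {
        { .type = RTE_FLOW_ITEM_TYPE_PHY_AFFINITY, .spec = &aff },
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };
    struct rte_flow_action_queue q0 = { .index = 0 };
    struct rte_flow_action actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &q0 },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };
    struct rte_eth_dev_info dev_info;
    struct rte_eth_txconf txconf;
    int ret;

    if (rte_flow_create(port_id, &attr, pattern, actions, error) == NULL)
        return -1;

    ret = rte_eth_dev_info_get(port_id, &dev_info);
    if (ret != 0)
        return ret;
    txconf = dev_info.default_txconf;
    txconf.tx_phy_affinity = 1; /* same value as the Rx match above */
    return rte_eth_tx_queue_setup(port_id, 0, 512,
                                  rte_eth_dev_socket_id(port_id), &txconf);
}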






RE: [PATCH v2 2/2] ethdev: introduce the PHY affinity field in Tx queue API

2023-02-01 Thread Jiawei(Jonny) Wang

Hi,

> -Original Message-
> From: Andrew Rybchenko 
> Subject: Re: [PATCH v2 2/2] ethdev: introduce the PHY affinity field in Tx 
> queue
> API
> 
> On 1/30/23 20:00, Jiawei Wang wrote:
> > For the multiple hardware ports connect to a single DPDK port
> > (mhpsdp), the previous patch introduces the new rte flow item to match
> > the phy affinity of the received packets.
> >
> > This patch adds the tx_phy_affinity setting in Tx queue API, the
> > affinity
> 
> "This patch adds" -> "Add ..."
> 
OK, will change it to start with "Add the tx_phy_affinity".

> > value reflects packets be sent to which hardware port.
> > Value 0 is no affinity and traffic will be routed between different
> > physical ports,
> 
> Who will it be routed?
> 

Assume there are two slave physical ports bonded and DPDK attached to the bond
master port. The packets can be sent from either the first or the second
physical port; it depends on the PMD driver and the low-level 'routing'
selection.

> > if 0 is disabled then try to match on phy_affinity 0 will result in an
> > error.
> 
> Why are you talking about matching here?
> 

In the previous patch we mentioned that the same phy affinity can be used to
handle the packet on the same hardware port, so if 0 means no affinity,
matching on it should report an error.

> >
> > Adds the new tx_phy_affinity field into the padding hole of
> > rte_eth_txconf structure, the size of rte_eth_txconf keeps the same.
> > Adds a suppress type for structure change in the ABI check file.
> >
> > This patch adds the testpmd command line:
> > testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
> >
> > For example, there're two hardware ports 0 and 1 connected to a single
> > DPDK port (port id 0), and phy_affinity 1 stood for hardware port 0
> > and phy_affinity 2 stood for hardware port 1, used the below command
> > to config tx phy affinity for per Tx Queue:
> >  port config 0 txq 0 phy_affinity 1
> >  port config 0 txq 1 phy_affinity 1
> >  port config 0 txq 2 phy_affinity 2
> >  port config 0 txq 3 phy_affinity 2
> >
> > These commands config the TxQ index 0 and TxQ index 1 with phy
> > affinity 1, uses TxQ 0 or TxQ 1 send packets, these packets will be
> > sent from the hardware port 0, and similar with hardware port 1 if
> > sending packets with TxQ 2 or TxQ 3.
> 
> Frankly speaking I dislike it. Why do we need to expose it on generic ethdev
> layer? IMHO dynamic mbuf field would be a better solution to control Tx
> routing to a specific PHY port.
> 

OK, but the phy affinity is not part of the packet information (like a
timestamp is).
Second, the phy affinity belongs to the queue layer, that is, the phy affinity
value should keep the same behavior per queue.
After the TxQ is created, the packets should be sent to the same physical port
when using the same TxQ index.

> IMHO, we definitely need dev_info information about a number of physical
> ports behind. Advertising value greater than 0 should mean that PMD supports
> corresponding mbuf dynamic field to control the outgoing physical port on Tx (or
> should just reject packets on prepare which try to specify outgoing phy port
> otherwise). In the same way the information may be provided on Rx.
> 

See above; I think the phy affinity is queue-level, not per-packet.

> I'm OK to have 0 as no phy affinity value and greater than zero as specified 
> phy
> affinity. I.e. no dynamic flag is required.
> 

Thanks for agreement.

> Also I think that order of patches should be different.
> We should start from a patch which provides dev_info and flow API matching
> and action should be in later patch.
>

OK.
 
> >
> > Signed-off-by: Jiawei Wang 
> 
> [snip]



RE: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue API

2023-02-06 Thread Jiawei(Jonny) Wang
Hi,

@Andrew, @Thomas, @Ori, 

Could you please help to review the patch?

Thanks.

> -Original Message-
> From: Jiawei Wang 
> Sent: Friday, February 3, 2023 9:34 PM
> To: Slava Ovsiienko ; Ori Kam ;
> NBU-Contact-Thomas Monjalon (EXTERNAL) ;
> andrew.rybche...@oktetlabs.ru; Aman Singh ;
> Yuying Zhang ; Ferruh Yigit 
> Cc: dev@dpdk.org; Raslan Darawsheh 
> Subject: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue
> API
> 
> When multiple physical ports are connected to a single DPDK port,
> (example: kernel bonding, DPDK bonding, failsafe, etc.), we want to know
> which physical port is used for Rx and Tx.
> 
> This patch maps a DPDK Tx queue with a physical port, by adding
> tx_phy_affinity setting in Tx queue.
> The affinity number is the physical port ID where packets will be sent.
> Value 0 means no affinity and traffic could be routed to any connected
> physical ports, this is the default current behavior.
> 
> The number of physical ports is reported with rte_eth_dev_info_get().
> 
> The new tx_phy_affinity field is added into the padding hole of rte_eth_txconf
> structure, the size of rte_eth_txconf keeps the same.
> An ABI check rule needs to be added to avoid false warning.
> 
> Add the testpmd command line:
> testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
> 
> For example, there're two physical ports connected to a single DPDK port
> (port id 0), and phy_affinity 1 stood for the first physical port and 
> phy_affinity
> 2 stood for the second physical port.
> Use the below commands to config tx phy affinity for per Tx Queue:
> port config 0 txq 0 phy_affinity 1
> port config 0 txq 1 phy_affinity 1
> port config 0 txq 2 phy_affinity 2
> port config 0 txq 3 phy_affinity 2
> 
> These commands config the Tx Queue index 0 and Tx Queue index 1 with phy
> affinity 1, uses Tx Queue 0 or Tx Queue 1 send packets, these packets will be
> sent from the first physical port, and similar with the second physical port 
> if
> sending packets with Tx Queue 2 or Tx Queue 3.
> 
> Signed-off-by: Jiawei Wang 
> ---

snip

> 2.18.1



Re: [dpdk-dev] [PATCH v8 1/2] ethdev: add pre-defined meter policy API

2021-04-19 Thread Jiawei(Jonny) Wang
Thanks for Jasvinder's comments!
Li is on vacation, so I helped to update the code changes based on your comments;
V9 has been sent with your ack.

> -Original Message-
> From: Singh, Jasvinder 
> Sent: Monday, April 19, 2021 8:35 PM
> To: Li Zhang ; dek...@nvidia.com; Ori Kam
> ; Slava Ovsiienko ; Matan
> Azrad ; Shahaf Shuler ;
> Dumitrescu, Cristian ; lir...@marvell.com;
> jer...@marvell.com; Yigit, Ferruh ;
> ajit.khapa...@broadcom.com; Wisam Monther ; Li,
> Xiaoyun ; NBU-Contact-Thomas Monjalon
> ; Andrew Rybchenko
> ; Ray Kinsella ; Neil
> Horman 
> Cc: dev@dpdk.org; Raslan Darawsheh ; Roni Bar Yanai
> ; Haifei Luo ; Jiawei(Jonny) Wang
> 
> Subject: RE: [PATCH v8 1/2] ethdev: add pre-defined meter policy API
> 
> 
> 
> > +/* MTR meter policy add */
> > +static int
> > +pmd_mtr_meter_policy_add(struct rte_eth_dev *dev,
> > +   uint32_t meter_policy_id,
> > +   struct rte_mtr_meter_policy_params *policy,
> > +   struct rte_mtr_error *error)
> > +{
> > +   struct pmd_internals *p = dev->data->dev_private;
> > +   struct softnic_mtr_meter_policy_list *mpl = &p-
> > >mtr.meter_policies;
> > +   struct softnic_mtr_meter_policy *mp;
> > +   const struct rte_flow_action *act;
> > +   const struct rte_flow_action_meter_color *recolor;
> > +   uint32_t i;
> > +   bool valid_act_found;
> > +
> > +   if (policy == NULL)
> > +   return -rte_mtr_error_set(error,
> > +   EINVAL,
> > +   RTE_MTR_ERROR_TYPE_METER_POLICY,
> > +   NULL,
> > +   "Null meter policy invalid");
> > +
> > +   /* Meter policy must not exist. */
> > +   mp = softnic_mtr_meter_policy_find(p, meter_policy_id);
> > +   if (mp != NULL)
> > +   return -rte_mtr_error_set(error,
> > +   EINVAL,
> > +   RTE_MTR_ERROR_TYPE_METER_POLICY_ID,
> > +   NULL,
> > +   "Meter policy already exists");
> > +
> > +   for (i = 0; i < RTE_COLORS; i++) {
> > +   if (policy->actions[i] == NULL)
> > +   return -rte_mtr_error_set(error,
> > +   EINVAL,
> > +   RTE_MTR_ERROR_TYPE_METER_POLICY,
> > +   NULL,
> > +   "Null action list");
> > +   for (act = policy->actions[i], valid_act_found = false;
> > +   act && act->type != RTE_FLOW_ACTION_TYPE_END;
> > +   act++) {
> 
> Hi Li,
> No need to check "act" in for loop instruction as it is validated before the 
> for
> loop.  Also, would be good to skip all the steps inside this loop for the 
> actions
> like RTE_FLOW_ACTION_TYPE_VOID. After for loop, will be good to check
> "valid_act_found" is true else return an error for that color action.
> 
> Rest seems good to me
> 
> With above fix for softnic-
> Acked-by: Jasvinder Singh 
> 
> 
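
A rough sketch (not the actual softnic patch) of how the loop could be
restructured per the review above: the redundant "act" check is dropped,
RTE_FLOW_ACTION_TYPE_VOID entries are skipped, and valid_act_found is checked
once per color after the loop. This drops into pmd_mtr_meter_policy_add() as
quoted; the per-action validation body is elided since it is not shown above.

    for (i = 0; i < RTE_COLORS; i++) {
            if (policy->actions[i] == NULL)
                    return -rte_mtr_error_set(error,
                            EINVAL,
                            RTE_MTR_ERROR_TYPE_METER_POLICY,
                            NULL,
                            "Null action list");
            valid_act_found = false;
            for (act = policy->actions[i];
                 act->type != RTE_FLOW_ACTION_TYPE_END; act++) {
                    if (act->type == RTE_FLOW_ACTION_TYPE_VOID)
                            continue;
                    /* per-action validation (elided, as in the quoted patch) */
                    valid_act_found = true;
            }
            if (!valid_act_found)
                    return -rte_mtr_error_set(error,
                            EINVAL,
                            RTE_MTR_ERROR_TYPE_METER_POLICY,
                            NULL,
                            "No valid action found for this color");
    }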



Re: [dpdk-dev] [PATCH v5 00/14] Add ASO meter support in MLX5 PMD

2021-04-20 Thread Jiawei(Jonny) Wang
Hi,

> -Original Message-
> From: Thomas Monjalon 
> Sent: Tuesday, April 20, 2021 5:59 AM
> To: Li Zhang ; Ferruh Yigit 
> Cc: Ori Kam ; Slava Ovsiienko ;
> Matan Azrad ; dev@dpdk.org; Raslan Darawsheh
> ; Asaf Penso ; Jiawei(Jonny)
> Wang 
> Subject: Re: [dpdk-dev] [PATCH v5 00/14] Add ASO meter support in MLX5
> PMD
> 
> 19/04/2021 23:42, Ferruh Yigit:
> > On 4/15/2021 4:11 PM, Li Zhang wrote:
> > > To support more meters and better performance, MLX HW provide ASO
> > > flow meter.
> > > It can expose millions of ASO flow meter context's in HW.
> > > This ASO object can allocate the large bulk meter objects.
> > > This patch set implement the ASO flow meter for mlx5 driver.
> > > MLX5 PMD driver will be responsible for ASO flow meter manage to HW.
> > >
> >
> > What is ASO?
> > Search yields "Advanced Steering Operation" but that seems Mellanox
> > jargon, would you mind adding some mlx documentation to describe it?
> > If there are some design considerations around it, it may be good to
> document that too.

Yes, ASO means "Advanced Steering Operation"; it is MLX5 internal terminology,
and we will add the expansion in the commit log.

> > Also please provide the long version of the abbreviation in the commit
> > log, at least at first usage of it.
> >

Ok, will add it.

> > And what do you think mentioning from this new support in the release
> notes?
> 
> Yes, new PMD feature should be announced in the release notes, it seems to
> be a miss.
> 

Yes, will add the description in release_notes_21.05.rst

> > >   doc/guides/nics/mlx5.rst  |   6 +
> > >   drivers/common/mlx5/mlx5_devx_cmds.c  |  68 ++
> > >   drivers/common/mlx5/mlx5_devx_cmds.h  |  26 +-
> > >   drivers/common/mlx5/mlx5_prm.h|  81 +-
> > >   drivers/common/mlx5/version.map   |   1 +
> > >   drivers/net/mlx5/linux/mlx5_os.c  |  20 +-
> > >   drivers/net/mlx5/meson.build  |   2 +-
> > >   drivers/net/mlx5/mlx5.c   |  98 +-
> > >   drivers/net/mlx5/mlx5.h   | 258 +-
> > >   drivers/net/mlx5/mlx5_flow.c  | 334 +--
> > >   drivers/net/mlx5/mlx5_flow.h  | 212 ++---
> > >   .../mlx5/{mlx5_flow_age.c => mlx5_flow_aso.c} | 289 +-
> > >   drivers/net/mlx5/mlx5_flow_dv.c   | 792 +++-
> > >   drivers/net/mlx5/mlx5_flow_meter.c| 873 --
> > >   drivers/net/mlx5/mlx5_utils.h |  90 ++
> > >   15 files changed, 2320 insertions(+), 830 deletions(-)
> 
> 

Thanks.


Re: [dpdk-dev] [PATCH v5 11/14] net/mlx5: aso flow meter send WQE and CQE handle

2021-04-20 Thread Jiawei(Jonny) Wang
Hi,

> -Original Message-
> From: Thomas Monjalon 
> Sent: Tuesday, April 20, 2021 6:02 AM
> To: Li Zhang ; Ferruh Yigit ; Asaf
> Penso 
> Cc: Ori Kam ; Slava Ovsiienko ;
> Matan Azrad ; dev@dpdk.org; Raslan Darawsheh
> ; Jiawei(Jonny) Wang 
> Subject: Re: [dpdk-dev] [PATCH v5 11/14] net/mlx5: aso flow meter send
> WQE and CQE handle
> 
> 19/04/2021 23:46, Ferruh Yigit:
> > On 4/15/2021 4:11 PM, Li Zhang wrote:
> > > ASO flow meter send WQE and CQE handle functions
> > >
> >
> > Can you please update patch title, to stat with the action?
> 
> You mean "start" with the action verb.
> 

Will change the title to: "net/mlx5: add meter ASO queue management".

> > And can you please clarify the abbreviations used, same for all patch
> commit logs?
> 
> And "aso" should be uppercase "ASO" like any acronym.
> 
> 

Yes, change to uppercase.

Thanks.



Re: [dpdk-dev] [PATCH v9 0/2] Support meter policy API

2021-04-20 Thread Jiawei(Jonny) Wang
Hi,

> -Original Message-
> From: Ferruh Yigit 
> Sent: Tuesday, April 20, 2021 7:37 PM
> To: Jiawei(Jonny) Wang ; Matan Azrad
> ; Ori Kam ; Slava Ovsiienko
> ; Shahaf Shuler 
> Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> ; Raslan Darawsheh ; Roni
> Bar Yanai 
> Subject: Re: [dpdk-dev] [PATCH v9 0/2] Support meter policy API
> 
> On 4/19/2021 5:08 PM, Jiawei Wang wrote:
> > Currently, the flow meter policy does not support multiple actions per
> > color; also the allowed action types per color are very limited.
> > In addition, the policy cannot be pre-defined.
> >
> > Due to the growing in flow actions offload abilities there is a
> > potential for the user to use variety of actions per color differently.
> > This new meter policy API comes to allow this potential in the most
> > ethdev common way using rte_flow action definition.
> > A list of rte_flow actions will be provided by the user per color in
> > order to create a meter policy.
> > In addition, the API forces to pre-define the policy before the meters
> > creation in order to allow sharing of single policy with multiple
> > meters efficiently.
> >
> > meter_policy_id is added into struct rte_mtr_params.
> > So that it can get the policy during the meters creation.
> >
> > Add two common policy template as macros in the header file,
> >
> > RFC ("ethdev: add pre-defined meter policy API")
> > https://patchwork.dpdk.org/project/dpdk/patch/20210318085815.804896-
> 1-
> > l...@nvidia.com/
> >
> > Depends-on: series=16351  ("Add ASO meter support in MLX5 PMD ")
> > https://patchwork.dpdk.org/project/dpdk/list/?series=16351
> >
> > V2: Delete default policy and change relation doc files.
> > V3: Fix coding style issues.
> > V4: Fix comments about Depends-on
> > V5: Fix comments about rte_mtr_meter_policy_add.
> > V6: Delete policy example.
> > V7: Fix comments and place two common policy template as macros.
> > V8: Fix rebase conflict issues and CI warning
> > V9: Rebase and Fix the comments for softnic driver.
> >
> > Haifei Luo (1):
> >app/testpmd: support policy actions per color
> >
> > Li Zhang (1):
> >ethdev: add pre-defined meter policy API
> >
> 
> Hi Li, Haifei,
> 
> This patch conflicts with merged integrity check/conntrack features, also the
> dependent mlx5 ASO set send a new version.
> 
> Can you please send a new version rebasing on latest versions?
> 
> 

Thanks, Ferruh, for pointing it out. I have rebased and also updated the link
to the new mlx5 ASO series.
The latest v10 has already been sent.

Thanks.



Re: [dpdk-dev] [PATCH] doc: add sampling and mirroring in testpmd guide

2021-03-30 Thread Jiawei(Jonny) Wang
Hi Ferruh,

> -Original Message-
> From: Ferruh Yigit 
> Sent: Friday, March 26, 2021 1:11 AM
> To: Jiawei(Jonny) Wang ; Slava Ovsiienko
> ; xiaoyun...@intel.com; Ori Kam
> 
> Cc: dev@dpdk.org; Andrew Rybchenko ; NBU-
> Contact-Thomas Monjalon 
> Subject: Re: [PATCH] doc: add sampling and mirroring in testpmd guide
> 
> On 3/9/2021 1:18 PM, Jiawei Wang wrote:
> > Update documentation for sample action usage in testpmd and show the
> > command line example.
> >
> 
> +1 to document this.
> 
> Indeed for all testpmd flow update, it must be mandatory to update "Flow
> rules management" section for it, Ori what do you think?
> 
> > Signed-off-by: Jiawei Wang 
> > Acked-by: Viacheslav Ovsiienko 
> 
> <...>
> 
> > +Sample Sampling/Mirroring rules
> > +~~~
> > +
> > +Sample/Mirroring rules can be set by the following commands
> > +
> > +NIC-RX Sampling rule, the matched ingress packets are duplicated and
> > +sent to the queue 1, and each second packets are marked with 0x1234
> > +and sent to queue 0.
> > +
> 
> Also each second packets duplicated to queue 1, isn't it, because of 'ratio 
> 2'?
> 

Yes, 'ratio=2' means that 50% of packets will be sampled.

> > +::
> > +
> > + testpmd> set sample_actions 0 mark id  0x1234 / queue index 0 / end
> > + testpmd> flow create 0 ingress group 1 pattern eth / end actions
> > +sample ratio 2 index 0 / queue index 1 / end
> > +
> > +E-Switch Sampling rule, the matched ingress packets are duplicated
> > +and sent to port id 2, and each second packets are sent to eswitch
> manager.
> > +
> 
> what is 'E-Switch', or "eswitch manager", isn't the mirror rule generic?

Here E-Switch means 'transfer=1' in the flow attribute; I will remove it since
the concept is specific to Mellanox HW.


Re: [dpdk-dev] [PATCH] doc: add sampling and mirroring in testpmd guide

2021-03-30 Thread Jiawei(Jonny) Wang
Hi Thomas,

> -Original Message-
> From: Thomas Monjalon 
> Sent: Friday, March 26, 2021 1:16 AM
> To: Jiawei(Jonny) Wang ; Slava Ovsiienko
> ; Ori Kam ; Ferruh Yigit
> 
> Cc: xiaoyun...@intel.com; dev@dpdk.org; Andrew Rybchenko
> 
> Subject: Re: [PATCH] doc: add sampling and mirroring in testpmd guide
> 
> 25/03/2021 18:10, Ferruh Yigit:
> > On 3/9/2021 1:18 PM, Jiawei Wang wrote:
> > > Update documentation for sample action usage in testpmd and show the
> > > command line example.
> > >
> >
> > +1 to document this.
> >
> > Indeed for all testpmd flow update, it must be mandatory to update
> > "Flow rules management" section for it, Ori what do you think?
> >
> > > Signed-off-by: Jiawei Wang 
> > > Acked-by: Viacheslav Ovsiienko 
> >
> > <...>
> >
> > > +Sample Sampling/Mirroring rules
> > > +~~~
> > > +
> > > +Sample/Mirroring rules can be set by the following commands
> > > +
> > > +NIC-RX Sampling rule, the matched ingress packets are duplicated
> > > +and sent to the queue 1, and each second packets are marked with
> > > +0x1234 and sent to queue 0.
> > > +
> >
> > Also each second packets duplicated to queue 1, isn't it, because of 'ratio 
> > 2'?
> >
> > > +::
> > > +
> > > + testpmd> set sample_actions 0 mark id  0x1234 / queue index 0 /
> > > + testpmd> end flow create 0 ingress group 1 pattern eth / end
> > > + testpmd> actions
> > > +sample ratio 2 index 0 / queue index 1 / end
> > > +
> > > +E-Switch Sampling rule, the matched ingress packets are duplicated
> > > +and sent to port id 2, and each second packets are sent to eswitch
> manager.
> > > +
> >
> > what is 'E-Switch', or "eswitch manager", isn't the mirror rule generic?
> 
> This is the HW switch in Mellanox devices.
> It should not be mentioned in this generic doc.
> 

Yes, I will remove it and send a v2 patch.
Thanks.


Re: [dpdk-dev] [PATCH] doc: add sampling and mirroring in testpmd guide

2021-03-31 Thread Jiawei(Jonny) Wang
Hi Ferruh,

> -Original Message-
> From: Ferruh Yigit 
> Sent: Wednesday, March 31, 2021 5:45 PM
> To: Jiawei(Jonny) Wang ; Slava Ovsiienko
> ; xiaoyun...@intel.com; Ori Kam
> 
> Cc: dev@dpdk.org; Andrew Rybchenko ; NBU-
> Contact-Thomas Monjalon 
> Subject: Re: [PATCH] doc: add sampling and mirroring in testpmd guide
> 
> On 3/31/2021 7:38 AM, Jiawei(Jonny) Wang wrote:
> > Hi Ferruh,
> >
> >> -Original Message-
> >> From: Ferruh Yigit 
> >> Sent: Friday, March 26, 2021 1:11 AM
> >> To: Jiawei(Jonny) Wang ; Slava Ovsiienko
> >> ; xiaoyun...@intel.com; Ori Kam
> >> 
> >> Cc: dev@dpdk.org; Andrew Rybchenko ;
> NBU-
> >> Contact-Thomas Monjalon 
> >> Subject: Re: [PATCH] doc: add sampling and mirroring in testpmd guide
> >>
> >> On 3/9/2021 1:18 PM, Jiawei Wang wrote:
> >>> Update documentation for sample action usage in testpmd and show the
> >>> command line example.
> >>>
> >>
> >> +1 to document this.
> >>
> >> Indeed for all testpmd flow update, it must be mandatory to update
> >> "Flow rules management" section for it, Ori what do you think?
> >>
> >>> Signed-off-by: Jiawei Wang 
> >>> Acked-by: Viacheslav Ovsiienko 
> >>
> >> <...>
> >>
> >>> +Sample Sampling/Mirroring rules
> >>> +~~~
> >>> +
> >>> +Sample/Mirroring rules can be set by the following commands
> >>> +
> >>> +NIC-RX Sampling rule, the matched ingress packets are duplicated
> >>> +and sent to the queue 1, and each second packets are marked with
> >>> +0x1234 and sent to queue 0.
> >>> +
> >>
> >> Also each second packets duplicated to queue 1, isn't it, because of 'ratio
> 2'?
> >>
> >
> > Yes, 'ratio=2' means that 50% of packets will be sampled.
> >
> 
> I thought 'sample' action also applies to the "queue index 1" but most
> probably it only applies to 'sample_actions', so original paragraph is 
> correct,
> perhaps except the 'duplicated' detail, and my comment "each second
> packets duplicated to queue 1" is wrong.
> 

Here the sample action is 'queue index 0' and the original action is 'queue
index 1'; the sample action only impacts the duplicated packets, so 50% are
duplicated to queue 0.

> Can you please confirm if below is correct:
> Assume 10 packets are received, from 1 to 10, after below rule queue status
> will be:
> Queue0: 2, 4, 6, 8, 10 [All marked with 0x1234, duplicated packets]
> Queue1: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 [Original packets]
> 
> 

My description was not correct here.
The 5 duplicated packets, marked with 0x1234, will be sent to queue 0,
but the duplicated-packet order is not strictly 'each second packet'.
All 10 packets will be sent to queue 1 without the mark.

I will update as below:
"
NIC-RX Sampling rule, the matched ingress packets are sent to the queue 1,
and 50% packets are duplicated and marked with 0x1234 and sent to queue 0.
"

> >>> +::
> >>> +
> >>> + testpmd> set sample_actions 0 mark id  0x1234 / queue index 0 /
> >>> + testpmd> end flow create 0 ingress group 1 pattern eth / end
> >>> + testpmd> actions
> >>> +sample ratio 2 index 0 / queue index 1 / end
> >>> +
> >>> +E-Switch Sampling rule, the matched ingress packets are duplicated
> >>> +and sent to port id 2, and each second packets are sent to eswitch
> >> manager.
> >>> +
> >>
> >> what is 'E-Switch', or "eswitch manager", isn't the mirror rule generic?
> >
> > Here E-Switch means 'transfer=1' in the flow attribute; I will remove it
> > since the concept is specific to Mellanox HW.
> >

Thanks.
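
For reference, a rough C equivalent of the NIC-RX sampling rule above (a
sketch, not taken from the patch): all matched ingress packets go to queue 1,
while 50% of them (ratio 2) are duplicated, marked with 0x1234 and sent to
queue 0.

#include <rte_flow.h>

static struct rte_flow *
create_sampling_flow(uint16_t port_id, struct rte_flow_error *error)
{
    struct rte_flow_attr attr = { .group = 1, .ingress = 1 };
    struct rte_flow_item pattern[] = {
        { .type = RTE_FLOW_ITEM_TYPE_ETH },
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };
    struct rte_flow_action_mark mark = { .id = 0x1234 };
    struct rte_flow_action_queue sample_queue = { .index = 0 };
    struct rte_flow_action sample_actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark },
        { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &sample_queue },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };
    struct rte_flow_action_sample sample = {
        .ratio = 2,                /* sample 1 of every 2 packets (50%) */
        .actions = sample_actions, /* applied only to the duplicates */
    };
    struct rte_flow_action_queue orig_queue = { .index = 1 };
    struct rte_flow_action actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_SAMPLE, .conf = &sample },
        { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &orig_queue },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };

    return rte_flow_create(port_id, &attr, pattern, actions, error);
}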


Re: [dpdk-dev] [PATCH v3 1/8] app/testpmd: store VXLAN/NVGRE encap data globally

2021-03-31 Thread Jiawei(Jonny) Wang
Hello Ferruh,

> -Original Message-
> From: Ferruh Yigit 
> Sent: Wednesday, March 31, 2021 8:08 PM
> To: Salem Sol ; dev@dpdk.org
> Cc: Jiawei(Jonny) Wang ; Ori Kam ;
> Xiaoyun Li 
> Subject: Re: [dpdk-dev] [PATCH v3 1/8] app/testpmd: store VXLAN/NVGRE
> encap data globally
> 
> On 3/17/2021 9:26 AM, Salem Sol wrote:
> > From: Jiawei Wang 
> >
> > With the current code the VXLAN/NVGRE parsing routine stored the
> > configuration of the header on stack, this might lead to overwriting
> > the data on the stack.
> >
> > This patch stores the external data of vxlan and nvgre encap into
> > global data as a pre-step to supporting vxlan and nvgre encap as a
> > sample actions.
> >
> 
> I didn't get what was on stack and moved in to the global data, can you
> please elaborate.
> 

This patch splits the function and saves the input data into the input
parameter; that is why it is called a "pre-step" here, before the vxlan/nvgre
data is stored into global data in a later patch.

The global var for sample action is defined in: 
(https://patches.dpdk.org/project/dpdk/patch/20210317092610.71000-5-sal...@nvidia.com/)
struct action_vxlan_encap_data sample_vxlan_encap[RAW_SAMPLE_CONFS_MAX_NUM];

> For example for nvgre, 'action_nvgre_encap_data' is pointer in stack but the
> actual object is 'ctx->object', so it is not really in the stack.
> 

After calling 'set sample 0 nvgre ..', the data is stored into 'ctx->object',
and 'ctx->object' will be reused by the following CLI command. After that,
when we try to use the previous 'sample action' to fetch the nvgre data, the
data may be lost.

For the sample action, using global data preserves the previously configured
nvgre data.

> Tough, OK to refactor and split the function as preparation to support the
> sample action.
> 
> > Signed-off-by: Jiawei Wang 
> 
> <...>
> 
> > -/** Parse VXLAN encap action. */
> > +/** Setup VXLAN encap configuration. */
> >   static int
> > -parse_vc_action_vxlan_encap(struct context *ctx, const struct token
> *token,
> > -   const char *str, unsigned int len,
> > -   void *buf, unsigned int size)
> > +parse_setup_vxlan_encap_data
> > +   (struct action_vxlan_encap_data *action_vxlan_encap_data)
> 
> Can you please fix the syntax, I guess this is done to keep within in 80 
> column
> limit, but from readability perspective I think better to go over the 80 
> column
> a little instead of breaking the line like this.
> 

OK, will change it into one line.

> <...>
> 
> > +/** Setup NVGRE encap configuration. */ static int
> > +parse_setup_nvgre_encap_data
> > +   (struct action_nvgre_encap_data *action_nvgre_encap_data)
> {
> > +   if (!action_nvgre_encap_data)
> > +   return -1;
> 
> This is a static function, and the input of it is in our control, so not sure 
> if we
> should verify the input here.
> But if we need to, can you please check the return value of this function
> where it is called.

I agree with you; we can remove this check inside the function, since we make
sure the input is valid before the call.

Thanks.


Re: [dpdk-dev] [PATCH] net/mlx5: fix mirror flow split with L3 encapsulation

2021-07-23 Thread Jiawei(Jonny) Wang
Hello, 

> -Original Message-
> From: Thomas Monjalon 
> Sent: Friday, July 23, 2021 8:57 PM
> To: Slava Ovsiienko ; Jiawei(Jonny) Wang
> 
> Cc: Matan Azrad ; Ori Kam ;
> Shahaf Shuler ; dev@dpdk.org; Raslan Darawsheh
> ; sta...@dpdk.org
> Subject: Re: [PATCH] net/mlx5: fix mirror flow split with L3 encapsulation
> 
> 23/07/2021 14:43, Jiawei Wang:
> > Due to hardware limitations, the decap action can't follow the
> 
> Which decap action? Pleas be more specific.

All of the decap actions: vxlan_decap, nvgre_decap and raw_decap.
Will add the details.

> 
> > sample action in the same flow, to keep the original action order of
> > sample and decap actions the flow was internally split by PMD, and
> > decap action was moved into suffix subflow in the new table.
> 
> "the" new table? which one?

" sample and decap actions the flow was internally split by PMD, " means
that the PMD splits the flow internally into two subflows: a prefix subflow
with the sample action in the original table, and a suffix subflow with the
decap action in a new table.

> 
> Was there an issue? Above seems to say all is fine.
> 

The issue is the 'L3 encapsulation' case (raw_decap + raw_encap) mentioned
below; the previous code did not consider this decap/encap combination.
In this case the flow does not need to be split.

> > There is a specific combination of raw decap and raw encap actions to
> > specify "L3 encapsulation" packet transformation - raw decap action to
> > remove L2 header and raw encap to add the tunnel header.
> 
> Is this combination the problem? It was not working before this patch?
> Please describe what happened.
> 

L3 encap works without the sample action, and sample works without L3 encap;

> > This specific L3 encapsulation is encoded as a single packet reformat
> > hardware transaction and is supported by hardware after sample action
> > (no hardware limitations for packet reformat).
> >
> > The patch checks whether the decap action is the part of "L3
> encapsulation"
> 
> I think you mean "is part of".

Thanks, will fix in new version.

> 
> > and does not move the decap action into suffix subflow for the case.
> >
> > Fixes: cafd87f62a06 ("net/mlx5: fix VLAN push/pop and decap actions
> > with mirror")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Jiawei Wang 
> > Acked-by: Viacheslav Ovsiienko 
> 
> 

Thanks.



Re: [dpdk-dev] [PATCH] net/mlx5: fix mirror flow split with L3 encapsulation

2021-07-26 Thread Jiawei(Jonny) Wang



> -Original Message-
> From: Thomas Monjalon 
> Sent: Friday, July 23, 2021 10:10 PM
> To: Slava Ovsiienko ; Jiawei(Jonny) Wang
> 
> Cc: Matan Azrad ; Ori Kam ;
> Shahaf Shuler ; dev@dpdk.org; Raslan Darawsheh
> ; sta...@dpdk.org
> Subject: Re: [PATCH] net/mlx5: fix mirror flow split with L3 encapsulation
> 
> 23/07/2021 16:01, Jiawei(Jonny) Wang:
> > Hello,
> >
> > From: Thomas Monjalon 
> > > 23/07/2021 14:43, Jiawei Wang:
> > > > Due to hardware limitations, the decap action can't follow the
> > >
> > > Which decap action? Pleas be more specific.
> >
> > All of the decap actions: vxlan_decap, nvgre_decap and raw_decap.
> > Will add the details.
> >
> > >
> > > > sample action in the same flow, to keep the original action order
> > > > of sample and decap actions the flow was internally split by PMD,
> > > > and decap action was moved into suffix subflow in the new table.
> > >
> > > "the" new table? which one?
> >
> > " sample and decap actions the flow was internally split by PMD, "
> > means that the PMD splits the flow internally into two subflows: a prefix
> > subflow with the sample action in the original table, and a suffix subflow
> > with the decap action in a new table.
> >
> > >
> > > Was there an issue? Above seems to say all is fine.
> > >
> >
> > The issue is the 'L3 encapsulation' case (raw_decap + raw_encap) mentioned
> > below; the previous code did not consider this decap/encap combination.
> > In this case the flow does not need to be split.
> >
> > > > There is a specific combination of raw decap and raw encap actions
> > > > to specify "L3 encapsulation" packet transformation - raw decap
> > > > action to remove L2 header and raw encap to add the tunnel header.
> > >
> > > Is this combination the problem? It was not working before this patch?
> > > Please describe what happened.
> > >
> >
> > L3 encap works without the sample action, and sample works without
> > L3 encap;
> 
> OK please be more explicit in the next version.
> 

The v2 patch is ready:
https://patchwork.dpdk.org/project/dpdk/patch/20210726062233.30657-1-jiaw...@nvidia.com/
Please help to review it, and thanks for your comments.

> > > > This specific L3 encapsulation is encoded as a single packet
> > > > reformat hardware transaction and is supported by hardware after
> > > > sample action (no hardware limitations for packet reformat).
> > > >
> > > > The patch checks whether the decap action is the part of "L3
> > > encapsulation"
> > >
> > > I think you mean "is part of".
> >
> > Thanks, will fix in new version.
> >
> > >
> > > > and does not move the decap action into suffix subflow for the case.
> > > >
> > > > Fixes: cafd87f62a06 ("net/mlx5: fix VLAN push/pop and decap
> > > > actions with mirror")
> > > > Cc: sta...@dpdk.org
> > > >
> > > > Signed-off-by: Jiawei Wang 
> > > > Acked-by: Viacheslav Ovsiienko 
> 
> 



RE: [PATCH v2] net/mlx5: fix the NIC egress flow mismatch in switchdev mode

2021-11-11 Thread Jiawei(Jonny) Wang
Hello Raslan,

I need to send a new V3 patch for this change; sorry for the late update.

Thanks.
B.R.
Jonny

> -Original Message-
> From: Raslan Darawsheh 
> Sent: Thursday, November 11, 2021 9:00 PM
> To: Jiawei(Jonny) Wang ; Slava Ovsiienko
> ; Matan Azrad ; Ori Kam
> ; Ori Kam ; Yongseok Koh
> 
> Cc: dev@dpdk.org; sta...@dpdk.org
> Subject: RE: [PATCH v2] net/mlx5: fix the NIC egress flow mismatch in
> switchdev mode
> 
> Hi,
> 
> > -Original Message-
> > From: Jiawei(Jonny) Wang 
> > Sent: Friday, November 5, 2021 10:59 AM
> > To: Slava Ovsiienko ; Matan Azrad
> > ; Ori Kam ; Ori Kam
> > ; Yongseok Koh 
> > Cc: dev@dpdk.org; Raslan Darawsheh ;
> > sta...@dpdk.org
> > Subject: [PATCH v2] net/mlx5: fix the NIC egress flow mismatch in
> > switchdev mode
> >
> > When E-Switch mode was enabled, the NIC egress flows always added the
> > source vport in the matcher item, then cause the flows cannot be hit.
> > This commit fixes the issue that removes the source vport as matcher
> > for NIC egress flow.
> > Also adding the validation that rejects the NIC egress flows on the
> > representor ports.
> >
> > Fixes: ce777b147bf8 ("net/mlx5: fix E-Switch flow without port item")
> > Cc: sta...@dpdk.org
> 
> Patch applied to next-net-mlx,
> 
> Kindest regards,
> Raslan Darawsheh


Re: [dpdk-dev] [dpdk-stable] [PATCH 2/2] net/mlx5: use global default miss for E-Switch sampling

2021-01-27 Thread Jiawei(Jonny) Wang
Hi Ferruh,

> -Original Message-
> From: Ferruh Yigit 
> Sent: Wednesday, January 27, 2021 8:50 PM
> To: Jiawei(Jonny) Wang ; Slava Ovsiienko
> ; Matan Azrad ; Ori Kam
> ; NBU-Contact-Thomas Monjalon
> 
> Cc: dev@dpdk.org; Raslan Darawsheh ;
> sta...@dpdk.org; Kevin Traynor ; Luca Boccassi
> 
> Subject: Re: [dpdk-stable] [PATCH 2/2] net/mlx5: use global default miss for
> E-Switch sampling
> 
> On 1/26/2021 12:53 PM, Jiawei Wang wrote:
> > In E-Switch steering domain there was dedicated default miss action
> > created for every sampling flow.
> > The patch replaces this one with the global default miss action.
> >
> 
> Hi Jiawei,
> 
> The impact of the patch is not clear, what was the problem using action per
> flow? What will change when global action is used? Is this fixing some
> problem or is it optimization by preventing per flow action?
> 

The patch is an optimization, not a fix; the MLX5 PMD can reuse the global
action and does not need to allocate a new one each time.

> > Cc: sta...@dpdk.org
> >
> 
> Since this looks like optimization, I am removing the stable tag for now.

Yes,  please remove this tag.
Thanks.

> 
> If it is a fix, please reply with a fixes line, I can update it in next-net.
> 
> > Signed-off-by: Jiawei Wang 
> > Acked-by: Viacheslav Ovsiienko 



RE: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue API

2023-02-10 Thread Jiawei(Jonny) Wang
Hi,

> -Original Message-
> From: Ferruh Yigit 
> Sent: Friday, February 10, 2023 3:45 AM
> To: Jiawei(Jonny) Wang ; Slava Ovsiienko
> ; Ori Kam ; NBU-Contact-
> Thomas Monjalon (EXTERNAL) ;
> andrew.rybche...@oktetlabs.ru; Aman Singh ;
> Yuying Zhang 
> Cc: dev@dpdk.org; Raslan Darawsheh 
> Subject: Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx 
> queue
> API
> 
> On 2/3/2023 1:33 PM, Jiawei Wang wrote:
> > When multiple physical ports are connected to a single DPDK port,
> > (example: kernel bonding, DPDK bonding, failsafe, etc.), we want to
> > know which physical port is used for Rx and Tx.
> >
> 
> I assume "kernel bonding" is out of context, but this patch concerns DPDK
> bonding, failsafe or softnic. (I will refer them as virtual bonding
> device.)
> 

"Kernel bonding" can be thought of as Linux bonding.

> To use specific queues of the virtual bonding device may interfere with the
> logic of these devices, like bonding modes or RSS of the underlying devices. I
> can see feature focuses on a very specific use case, but not sure if all 
> possible
> side effects taken into consideration.
> 
> 
> And although the feature is only relevant to virtual bonding device, core
> ethdev structures are updated for this. Most use cases won't need these, so is
> there a way to reduce the scope of the changes to virtual bonding devices?
> 
> 
> There are a few very core ethdev APIs, like:
> rte_eth_dev_configure()
> rte_eth_tx_queue_setup()
> rte_eth_rx_queue_setup()
> rte_eth_dev_start()
> rte_eth_dev_info_get()
> 
> Almost every user of ethdev uses these APIs, since these are so fundamental I
> am for being a little more conservative on these APIs.
> 
> Every eccentric features are targetting these APIs first because they are
> common and extending them gives an easy solution, but in long run making
> these APIs more complex, harder to maintain and harder for PMDs to support
> them correctly. So I am for not updating them unless it is a generic use case.
> 
> 
> Also as we talked about PMDs supporting them, I assume your coming PMD
> patch will be implementing 'tx_phy_affinity' config option only for mlx 
> drivers.
> What will happen for other NICs? Will they silently ignore the config option
> from user? So this is a problem for the DPDK application portability.
> 

Yes, the PMD patch is for net/mlx5 only; the 'tx_phy_affinity' can be used by
the HW to choose the queue mapped to a physical port.

Other NICs ignore this new configuration for now; or should we add a check in
the queue setup?

> 
> 
> As far as I understand target is application controlling which sub-device is 
> used
> under the virtual bonding device, can you please give more information why
> this is required, perhaps it can help to provide a better/different solution.
> Like adding the ability to use both bonding device and sub-device for data 
> path,
> this way application can use whichever it wants. (this is just first solution 
> I
> come with, I am not suggesting as replacement solution, but if you can 
> describe
> the problem more I am sure other people can come with better solutions.)
> 

For example:
There are two physical ports (assume device interfaces eth2 and eth3), and
these two devices are bonded into one interface (assume bond0).
The DPDK application probed/attached only bond0 (DPDK port id 0); while
sending traffic from the DPDK port, we want to know which physical port
(eth2 or eth3) the packet is sent on.

With the new configuration, the queue can be configured with the underlay
device, and then the DPDK application can send the traffic to the correct
queue as desired.

Adding all the devices into DPDK would mean creating multiple Rx/Tx queue
resources on each of them.


> And isn't this against the application transparent to underneath device being
> bonding device or actual device?
> 
> 
> > This patch maps a DPDK Tx queue with a physical port, by adding
> > tx_phy_affinity setting in Tx queue.
> > The affinity number is the physical port ID where packets will be
> > sent.
> > Value 0 means no affinity and traffic could be routed to any connected
> > physical ports, this is the default current behavior.
> >
> > The number of physical ports is reported with rte_eth_dev_info_get().
> >
> > The new tx_phy_affinity field is added into the padding hole of
> > rte_eth_txconf structure, the size of rte_eth_txconf keeps the same.
> > An ABI check rule needs to be added to avoid false warning.
> >
> > Add the testpmd command line:
> > testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
> >
> > For example, there'

RE: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx queue API

2023-02-14 Thread Jiawei(Jonny) Wang
Hi,

> -Original Message-
> From: Ferruh Yigit 
> Sent: Friday, February 10, 2023 3:45 AM
> To: Jiawei(Jonny) Wang ; Slava Ovsiienko
> ; Ori Kam ; NBU-Contact-
> Thomas Monjalon (EXTERNAL) ;
> andrew.rybche...@oktetlabs.ru; Aman Singh ;
> Yuying Zhang 
> Cc: dev@dpdk.org; Raslan Darawsheh 
> Subject: Re: [PATCH v4 1/2] ethdev: introduce the PHY affinity field in Tx 
> queue
> API
> 
> On 2/3/2023 1:33 PM, Jiawei Wang wrote:
> > When multiple physical ports are connected to a single DPDK port,
> > (example: kernel bonding, DPDK bonding, failsafe, etc.), we want to
> > know which physical port is used for Rx and Tx.
> >
> 
> I assume "kernel bonding" is out of context, but this patch concerns DPDK
> bonding, failsafe or softnic. (I will refer them as virtual bonding
> device.)
> 
> To use specific queues of the virtual bonding device may interfere with the
> logic of these devices, like bonding modes or RSS of the underlying devices. I
> can see feature focuses on a very specific use case, but not sure if all 
> possible
> side effects taken into consideration.
> 
> 
> And although the feature is only relevant to virtual bonding device, core
> ethdev structures are updated for this. Most use cases won't need these, so is
> there a way to reduce the scope of the changes to virtual bonding devices?
> 
> 
> There are a few very core ethdev APIs, like:
> rte_eth_dev_configure()
> rte_eth_tx_queue_setup()
> rte_eth_rx_queue_setup()
> rte_eth_dev_start()
> rte_eth_dev_info_get()
> 
> Almost every user of ethdev uses these APIs, since these are so fundamental I
> am for being a little more conservative on these APIs.
> 
> Every eccentric features are targetting these APIs first because they are
> common and extending them gives an easy solution, but in long run making
> these APIs more complex, harder to maintain and harder for PMDs to support
> them correctly. So I am for not updating them unless it is a generic use case.
> 
> 
> Also as we talked about PMDs supporting them, I assume your coming PMD
> patch will be implementing 'tx_phy_affinity' config option only for mlx 
> drivers.
> What will happen for other NICs? Will they silently ignore the config option
> from user? So this is a problem for the DPDK application portability.
> 
> 
> 
> As far as I understand target is application controlling which sub-device is 
> used
> under the virtual bonding device, can you please give more information why
> this is required, perhaps it can help to provide a better/different solution.
> Like adding the ability to use both bonding device and sub-device for data 
> path,
> this way application can use whichever it wants. (this is just first solution 
> I
> come with, I am not suggesting as replacement solution, but if you can 
> describe
> the problem more I am sure other people can come with better solutions.)
> 
> And isn't this against the application being transparent to whether the
> underlying device is a bonding device or an actual device?
> 
> 

OK, I will send a new version with separate functions in the ethdev layer
to support mapping a Tx queue to a port and getting the number of ports.
These functions work through a device ops callback; other NICs will report
the operation as unsupported when the ops callback is NULL.
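
For illustration, a minimal sketch of how an application might use these
functions once available (the names follow the later v5 patch in this series;
the queue-to-port split simply mirrors the testpmd example, and error handling
is trimmed):

#include <rte_ethdev.h>

/* Map the first half of the Tx queues to aggregated port 1 and the second
 * half to aggregated port 2. Called after rte_eth_dev_configure() and
 * before rte_eth_dev_start().
 */
static int
setup_tx_affinity(uint16_t port_id, uint16_t nb_txq)
{
	int count = rte_eth_dev_count_aggr_ports(port_id);
	uint16_t q;

	if (count <= 0)
		return count; /* not an aggregated device, or op not supported */

	for (q = 0; q < nb_txq; q++) {
		uint8_t affinity = (q < nb_txq / 2) ? 1 : 2;
		int ret = rte_eth_dev_map_aggr_tx_affinity(port_id, q, affinity);

		if (ret != 0)
			return ret;
	}
	return 0;
}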

> > This patch maps a DPDK Tx queue with a physical port, by adding
> > tx_phy_affinity setting in Tx queue.
> > The affinity number is the physical port ID where packets will be
> > sent.
> > Value 0 means no affinity and traffic could be routed to any connected
> > physical ports, this is the default current behavior.
> >
> > The number of physical ports is reported with rte_eth_dev_info_get().
> >
> > The new tx_phy_affinity field is added into the padding hole of
> > rte_eth_txconf structure, the size of rte_eth_txconf keeps the same.
> > An ABI check rule needs to be added to avoid false warning.
> >
> > Add the testpmd command line:
> > testpmd> port config (port_id) txq (queue_id) phy_affinity (value)
> >
> > For example, there're two physical ports connected to a single DPDK
> > port (port id 0), and phy_affinity 1 stood for the first physical port
> > and phy_affinity 2 stood for the second physical port.
> > Use the below commands to configure the Tx phy affinity per Tx queue:
> > port config 0 txq 0 phy_affinity 1
> > port config 0 txq 1 phy_affinity 1
> > port config 0 txq 2 phy_affinity 2
> > port config 0 txq 3 phy_affinity 2
> >
> > These commands config the Tx Queue index 0 and Tx Queue index 1 with
> > phy affinity 1, uses Tx Queue 0 or Tx Queue

RE: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports

2023-02-15 Thread Jiawei(Jonny) Wang
Hi Ori, Thomas and Ferruh,

Could you please help to review it?

Thanks.

> -Original Message-
> From: Jiawei Wang 
> Sent: Tuesday, February 14, 2023 11:49 PM
> To: Slava Ovsiienko ; Ori Kam ;
> NBU-Contact-Thomas Monjalon (EXTERNAL) ;
> andrew.rybche...@oktetlabs.ru; Aman Singh ;
> Yuying Zhang ; Ferruh Yigit 
> Cc: dev@dpdk.org; Raslan Darawsheh 
> Subject: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports
> 
> When multiple ports are aggregated into a single DPDK port,
> (example: Linux bonding, DPDK bonding, failsafe, etc.), we want to know which
> port is used for Tx via a queue.
> 
> This patch introduces the new ethdev API rte_eth_dev_map_aggr_tx_affinity(),
> it's used to map a Tx queue with an aggregated port of the DPDK port
> (specified with port_id), The affinity is the number of the aggregated port.
> Value 0 means no affinity and traffic could be routed to any aggregated port,
> this is the default current behavior.
> 
> The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> 
> Add the trace point for ethdev rte_eth_dev_count_aggr_ports() and
> rte_eth_dev_map_aggr_tx_affinity() functions.
> 
> Add the testpmd command line:
> testpmd> port config (port_id) txq (queue_id) affinity (value)
> 
> For example, there're two physical ports connected to a single DPDK port (port
> id 0), and affinity 1 stood for the first physical port and affinity 2 stood 
> for the
> second physical port.
> Use the below commands to configure the Tx phy affinity per Tx queue:
> port config 0 txq 0 affinity 1
> port config 0 txq 1 affinity 1
> port config 0 txq 2 affinity 2
> port config 0 txq 3 affinity 2
> 
> These commands configure Tx queue index 0 and Tx queue index 1 with phy
> affinity 1; packets sent with Tx queue 0 or Tx queue 1 will be sent from the
> first physical port, and similarly from the second physical port when
> sending packets with Tx queue 2 or Tx queue 3.
> 
> Signed-off-by: Jiawei Wang 
> ---
>  app/test-pmd/cmdline.c  | 96 +
>  doc/guides/rel_notes/release_23_03.rst  |  7 ++
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst | 14 +++
>  lib/ethdev/ethdev_driver.h  | 39 +
>  lib/ethdev/ethdev_trace.h   | 17 
>  lib/ethdev/ethdev_trace_points.c|  6 ++
>  lib/ethdev/rte_ethdev.c | 49 +++
>  lib/ethdev/rte_ethdev.h | 46 ++
>  lib/ethdev/version.map  |  2 +
>  9 files changed, 276 insertions(+)
> 
snip
> 2.18.1



RE: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports

2023-02-16 Thread Jiawei(Jonny) Wang


> -Original Message-
> From: Ferruh Yigit 
> Sent: Friday, February 17, 2023 1:58 AM
> To: Jiawei(Jonny) Wang ; Slava Ovsiienko
> ; Ori Kam ; NBU-Contact-
> Thomas Monjalon (EXTERNAL) ;
> andrew.rybche...@oktetlabs.ru; Aman Singh ;
> Yuying Zhang 
> Cc: dev@dpdk.org; Raslan Darawsheh 
> Subject: Re: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated
> ports
> 
> On 2/14/2023 3:48 PM, Jiawei Wang wrote:
> > When multiple ports are aggregated into a single DPDK port,
> > (example: Linux bonding, DPDK bonding, failsafe, etc.), we want to
> > know which port is used for Tx via a queue.
> >
> > This patch introduces the new ethdev API
> > rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue with
> > an aggregated port of the DPDK port (specified with port_id), The
> > affinity is the number of the aggregated port.
> > Value 0 means no affinity and traffic could be routed to any
> > aggregated port, this is the default current behavior.
> >
> > The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> >
> > Add the trace point for ethdev rte_eth_dev_count_aggr_ports() and
> > rte_eth_dev_map_aggr_tx_affinity() functions.
> >
> > Add the testpmd command line:
> > testpmd> port config (port_id) txq (queue_id) affinity (value)
> >
> > For example, there're two physical ports connected to a single DPDK
> > port (port id 0), and affinity 1 stood for the first physical port and
> > affinity 2 stood for the second physical port.
> > Use the below commands to configure the Tx phy affinity per Tx queue:
> > port config 0 txq 0 affinity 1
> > port config 0 txq 1 affinity 1
> > port config 0 txq 2 affinity 2
> > port config 0 txq 3 affinity 2
> >
> > These commands configure Tx queue index 0 and Tx queue index 1 with
> > phy affinity 1; packets sent with Tx queue 0 or Tx queue 1 will be
> > sent from the first physical port, and similarly from the second
> > physical port when sending packets with Tx queue 2 or Tx queue 3.
> >
> > Signed-off-by: Jiawei Wang 
> 
> <...>
> 
> > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
> > dc0a4eb12c..1d5b3a16b2 100644
> > --- a/lib/ethdev/rte_ethdev.c
> > +++ b/lib/ethdev/rte_ethdev.c
> > @@ -6915,6 +6915,55 @@
> rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t
> *ptypes
> > return j;
> >  }
> >
> > +int rte_eth_dev_count_aggr_ports(uint16_t port_id) {
> > +   struct rte_eth_dev *dev;
> > +   int ret;
> > +
> > +   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +   dev = &rte_eth_devices[port_id];
> > +
> > +   if (*dev->dev_ops->count_aggr_ports == NULL)
> > +   return -ENOTSUP;
> 
> What do you think to return a default value when dev_ops is not defined,
> assuming device is not a bounded device.
> Not sure which one is better for application, return a default value or error.
> 

For a device which isn't a bonded device, the count should be zero.
So, we can return 0 as the default value if the PMD doesn't support the op.

From the application's perspective, it only needs to check that the count > 0.

> 
> > +   ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
> > +
> > +   rte_eth_trace_count_aggr_ports(port_id, ret);
> > +
> > +   return ret;
> > +}
> > +
> > +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t
> tx_queue_id,
> > +uint8_t affinity)
> > +{
> > +   struct rte_eth_dev *dev;
> > +   int ret;
> > +
> > +   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +   dev = &rte_eth_devices[port_id];
> > +
> > +   if (tx_queue_id >= dev->data->nb_tx_queues) {
> > +   RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n",
> tx_queue_id);
> > +   return -EINVAL;
> > +   }
> > +
> 
> Although documentation says this API should be called before configure, if 
> user
> misses it I guess above can crash, is there a way to add runtime check, like
> checking 'dev->data->dev_configured'?
> 

OK, I will add the check and report an error if (dev->data->dev_configured == 0).
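
For reference, a rough sketch of what the added check might look like near the
top of rte_eth_dev_map_aggr_tx_affinity() (the log message wording is only
illustrative):

	/* The Tx queue count is only meaningful once the port is configured. */
	if (dev->data->dev_configured == 0) {
		RTE_ETHDEV_LOG(ERR,
			"Port %u must be configured before Tx affinity mapping\n",
			port_id);
		return -EINVAL;
	}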

> 
> > +   if (*dev->dev_ops->map_aggr_tx_affinity == NULL)
> > +   return -ENOTSUP;
> > +
> > +   if (dev->data->dev_started) {
> > +   RTE_ETHDEV_LOG(ERR,
> > +   "Port %u must be stopped to allow configu

RE: [PATCH v5 2/2] ethdev: add Aggregated affinity match item

2023-02-16 Thread Jiawei(Jonny) Wang
Hi,

> -Original Message-
> From: Thomas Monjalon 
> Sent: Friday, February 17, 2023 1:46 AM
> To: Jiawei(Jonny) Wang 
> Cc: Slava Ovsiienko ; Ori Kam ;
> andrew.rybche...@oktetlabs.ru; Aman Singh ;
> Yuying Zhang ; Ferruh Yigit ;
> dev@dpdk.org; Raslan Darawsheh 
> Subject: Re: [PATCH v5 2/2] ethdev: add Aggregated affinity match item
> 
> For the title, I suggest
> ethdev: add flow matching of aggregated port
> 
> 14/02/2023 16:48, Jiawei Wang:
> > When multiple ports are aggregated into a single DPDK port,
> > (example: Linux bonding, DPDK bonding, failsafe, etc.), we want to
> > know which port is used for Rx and Tx.
> >
> > This patch allows to map a Rx queue with an aggregated port by using a
> > flow rule. The new item is called RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.
> >
> > When using the aggregated affinity as a matching item in the flow
> > rule, and setting the same affinity value by calling
> > rte_eth_dev_map_aggr_tx_affinity(), the packet can be sent from
> > the same port as the receiving one.
> > The affinity numbering starts from 1, then trying to match on
> > aggr_affinity 0 will result in an error.
> >
> > Add the testpmd command line to match the new item:
> > flow create 0 ingress group 0 pattern aggr_affinity affinity is 1 /
> > end actions queue index 0 / end
> >
> > The above command means that creates a flow on a single DPDK port and
> > matches the packet from the first physical port and redirects these
> > packets into Rx queue 0.
> >
> > Signed-off-by: Jiawei Wang 
> 
> Acked-by: Thomas Monjalon 
> 

OK, I will update the title in the next patch; thanks for the Ack.


RE: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports

2023-02-16 Thread Jiawei(Jonny) Wang



> -Original Message-
> From: Thomas Monjalon 
> Sent: Friday, February 17, 2023 1:42 AM
> To: Jiawei(Jonny) Wang 
> Cc: Slava Ovsiienko ; Ori Kam ;
> andrew.rybche...@oktetlabs.ru; Aman Singh ;
> Yuying Zhang ; Ferruh Yigit ;
> dev@dpdk.org; Raslan Darawsheh 
> Subject: Re: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated
> ports
> 
> For the title, I suggest
> ethdev: add Tx queue mapping of aggregated ports
> 
> 14/02/2023 16:48, Jiawei Wang:
> > When multiple ports are aggregated into a single DPDK port,
> > (example: Linux bonding, DPDK bonding, failsafe, etc.), we want to
> > know which port is used for Tx via a queue.
> >
> > This patch introduces the new ethdev API
> > rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue with
> > an aggregated port of the DPDK port (specified with port_id), The
> > affinity is the number of the aggregated port.
> > Value 0 means no affinity and traffic could be routed to any
> > aggregated port, this is the default current behavior.
> >
> > The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> >
> > Add the trace point for ethdev rte_eth_dev_count_aggr_ports() and
> > rte_eth_dev_map_aggr_tx_affinity() functions.
> >
> > Add the testpmd command line:
> > testpmd> port config (port_id) txq (queue_id) affinity (value)
> >
> > For example, there're two physical ports connected to a single DPDK
> > port (port id 0), and affinity 1 stood for the first physical port and
> > affinity 2 stood for the second physical port.
> > Use the below commands to configure the Tx phy affinity per Tx queue:
> > port config 0 txq 0 affinity 1
> > port config 0 txq 1 affinity 1
> > port config 0 txq 2 affinity 2
> > port config 0 txq 3 affinity 2
> >
> > These commands configure Tx queue index 0 and Tx queue index 1 with
> > phy affinity 1; packets sent with Tx queue 0 or Tx queue 1 will be
> > sent from the first physical port, and similarly from the second
> > physical port when sending packets with Tx queue 2 or Tx queue 3.
> >
> > Signed-off-by: Jiawei Wang 
> 
> Acked-by: Thomas Monjalon 
> 

OK, I will update the title in the next patch; thanks for the Ack.


RE: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated ports

2023-02-17 Thread Jiawei(Jonny) Wang


> -Original Message-
> From: Andrew Rybchenko 
> Sent: Friday, February 17, 2023 4:24 PM
> To: Ferruh Yigit ; Jiawei(Jonny) Wang
> ; Slava Ovsiienko ; Ori Kam
> ; NBU-Contact-Thomas Monjalon (EXTERNAL)
> ; Aman Singh ; Yuying
> Zhang 
> Cc: dev@dpdk.org; Raslan Darawsheh 
> Subject: Re: [PATCH v5 1/2] ethdev: introduce the Tx map API for aggregated
> ports
> 
> On 2/16/23 20:58, Ferruh Yigit wrote:
> > On 2/14/2023 3:48 PM, Jiawei Wang wrote:
> >> When multiple ports are aggregated into a single DPDK port,
> >> (example: Linux bonding, DPDK bonding, failsafe, etc.), we want to
> >> know which port is used for Tx via a queue.
> >>
> >> This patch introduces the new ethdev API
> >> rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue with
> >> an aggregated port of the DPDK port (specified with port_id), The
> >> affinity is the number of the aggregated port.
> >> Value 0 means no affinity and traffic could be routed to any
> >> aggregated port, this is the default current behavior.
> >>
> >> The maximum number of affinity is given by
> rte_eth_dev_count_aggr_ports().
> >>
> >> Add the trace point for ethdev rte_eth_dev_count_aggr_ports() and
> >> rte_eth_dev_map_aggr_tx_affinity() functions.
> >>
> >> Add the testpmd command line:
> >> testpmd> port config (port_id) txq (queue_id) affinity (value)
> >>
> >> For example, there're two physical ports connected to a single DPDK
> >> port (port id 0), and affinity 1 stood for the first physical port
> >> and affinity 2 stood for the second physical port.
> >> Use the below commands to configure the Tx phy affinity per Tx queue:
> >>  port config 0 txq 0 affinity 1
> >>  port config 0 txq 1 affinity 1
> >>  port config 0 txq 2 affinity 2
> >>  port config 0 txq 3 affinity 2
> >>
> >> These commands configure Tx queue index 0 and Tx queue index 1 with
> >> phy affinity 1; packets sent with Tx queue 0 or Tx queue 1 will be
> >> sent from the first physical port, and similarly from the second
> >> physical port when sending packets with Tx queue 2 or Tx queue 3.
> >>
> >> Signed-off-by: Jiawei Wang 
> >
> > <...>
> >
> >> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
> >> dc0a4eb12c..1d5b3a16b2 100644
> >> --- a/lib/ethdev/rte_ethdev.c
> >> +++ b/lib/ethdev/rte_ethdev.c
> >> @@ -6915,6 +6915,55 @@
> rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t
> *ptypes
> >>return j;
> >>   }
> >>
> >> +int rte_eth_dev_count_aggr_ports(uint16_t port_id) {
> >> +  struct rte_eth_dev *dev;
> >> +  int ret;
> >> +
> >> +  RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> >> +  dev = &rte_eth_devices[port_id];
> >> +
> >> +  if (*dev->dev_ops->count_aggr_ports == NULL)
> >> +  return -ENOTSUP;
> >
> > What do you think to return a default value when dev_ops is not
> > defined, assuming device is not a bounded device.
> > Not sure which one is better for application, return a default value
> > or error.
> 
> I think 0 is better here. It simply means that
> rte_eth_dev_map_aggr_tx_affinity() cannot be used as well as corresponding
> flow API item.
> It will be true even for bonding as long as corresponding API is not 
> supported.
> 

Will send the new patch later with this change.

> >> +  ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
> >> +
> >> +  rte_eth_trace_count_aggr_ports(port_id, ret);
> >> +
> >> +  return ret;
> >> +}
> >> +
> >> +int rte_eth_dev_map_aggr_tx_affinity(uint16_t port_id, uint16_t
> tx_queue_id,
> >> +   uint8_t affinity)
> >> +{
> >> +  struct rte_eth_dev *dev;
> >> +  int ret;
> >> +
> >> +  RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> >> +  dev = &rte_eth_devices[port_id];
> >> +
> >> +  if (tx_queue_id >= dev->data->nb_tx_queues) {
> >> +  RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u\n",
> tx_queue_id);
> >> +  return -EINVAL;
> >> +  }
> >> +
> >
> > Although documentation says this API should be called before
> > configure,
> 
> Documentation says "after". Anyway, it is bette

RE: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports

2023-02-17 Thread Jiawei(Jonny) Wang


> -Original Message-
> From: Ferruh Yigit 
> Sent: Friday, February 17, 2023 9:00 PM
> To: Andrew Rybchenko ; Jiawei(Jonny)
> Wang ; Slava Ovsiienko ; Ori
> Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
> ; Aman Singh ; Yuying
> Zhang 
> Cc: dev@dpdk.org; Raslan Darawsheh 
> Subject: Re: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports
> 
> On 2/17/2023 12:56 PM, Andrew Rybchenko wrote:
> >> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> >> index 6a550cfc83..b7fdc454a8 100644
> >> --- a/lib/ethdev/ethdev_driver.h
> >> +++ b/lib/ethdev/ethdev_driver.h
> >> @@ -1171,6 +1171,40 @@ typedef int (*eth_tx_descriptor_dump_t)(const
> >> struct rte_eth_dev *dev,
> >>   uint16_t queue_id, uint16_t offset,
> >>   uint16_t num, FILE *file);
> >>   +/**
> >> + * @internal
> >> + * Get the number of aggregated ports.
> >> + *
> >> + * @param port_id
> >> + *   The port identifier of the Ethernet device.
> >> + *
> >> + * @return
> >> + *   Negative errno value on error, 0 or positive on success.
> >> + *
> >> + * @retval >=0
> >> + *   The number of aggregated port if success.
> >> + * @retval -ENOTSUP
> >> + *   Get aggregated ports API is not supported.
> >> + */
> >> +typedef int (*eth_count_aggr_ports_t)(uint16_t port_id);
> >
> > Why does use port_id as the first parameter whereas all other driver
> > callbacks use 'struct rte_eth_dev *'?
> 
> Ahh, this is wrong, internal functions should use 'struct rte_eth_dev *', not
> port_id handler.
> 
> Thanks Andrew for catching it.

Thanks Andrew and Ferruh, I will send the new version to fix it.


RE: [PATCH v6 0/2] Add Tx queue mapping of aggregated ports

2023-02-17 Thread Jiawei(Jonny) Wang


> -Original Message-
> From: Ferruh Yigit 
> Sent: Friday, February 17, 2023 9:01 PM
> To: Jiawei(Jonny) Wang ; Slava Ovsiienko
> ; Ori Kam ; NBU-Contact-
> Thomas Monjalon (EXTERNAL) ;
> andrew.rybche...@oktetlabs.ru
> Cc: dev@dpdk.org; Raslan Darawsheh 
> Subject: Re: [PATCH v6 0/2] Add Tx queue mapping of aggregated ports
> 
> On 2/17/2023 10:50 AM, Jiawei Wang wrote:
> > When multiple ports are aggregated into a single DPDK port,
> > (example: Linux bonding, DPDK bonding, failsafe, etc.), we want to
> > know which port is used for Rx and Tx.
> >
> > This patch introduces the new ethdev API
> > rte_eth_dev_map_aggr_tx_affinity(), it's used to map a Tx queue with
> > an aggregated port of the DPDK port (specified with port_id), The
> > affinity is the number of the aggregated port.
> > Value 0 means no affinity and traffic could be routed to any
> > aggregated port, this is the default current behavior.
> >
> > The maximum number of affinity is given by rte_eth_dev_count_aggr_ports().
> >
> > This patch allows to map a Rx queue with an aggregated port by using a
> > flow rule. The new item is called RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.
> >
> > When using the aggregated affinity as a matching item in the flow
> > rule, and setting the same affinity value by calling
> > rte_eth_dev_map_aggr_tx_affinity(), the packet can be sent from
> > the same port as the receiving one.
> > The affinity numbering starts from 1, then trying to match on
> > aggr_affinity 0 will result in an error.
> >
> > RFC:
> > http://patches.dpdk.org/project/dpdk/cover/20221221102934.13822-1-jiaw
> > e...@nvidia.com/
> >
> > v6:
> > * Update the commit titles.
> > * Return 0 by default if dev_ops.count_aggr_ports is not defined.
> > * Adds the dev_configure and affinity value checking before call
> map_aggr_tx_affinity.
> > * Update the rte_eth_dev_count_aggr_ports description.
> >
> > v5:
> > * Adds rte_eth_dev_map_aggr_tx_affinity() to map a Tx queue to an
> aggregated port.
> > * Adds rte_eth_dev_count_aggr_ports() to get the number of aggregated
> ports.
> > * Updates the flow item RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY.
> >
> > v4:
> > * Rebase the latest code
> > * Update new field description
> > * Update release note
> > * Reword the commit log to make clear
> >
> > v3:
> > * Update exception rule
> > * Update the commit log
> > * Add the description for PHY affinity and numbering definition
> > * Add the number of physical ports into device info
> > * Change the patch order
> >
> > v2: Update based on the comments
> >
> > Jiawei Wang (2):
> >   ethdev: add Tx queue mapping of aggregated ports
> >   ethdev: add flow matching of aggregated port
> 
> It looks like there will be a new version, can you please rebase next version 
> on
> top of next-net [1], there are multiple conflicts with this patch, it is 
> safer if you
> resolve them yourself.
> 
> [1]
> https://git.dpdk.org/next/dpdk-next-net/

Sure, will send the v7 and rebase on it.


RE: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports

2023-02-17 Thread Jiawei(Jonny) Wang
Hi Andrew,

> -Original Message-
> From: Andrew Rybchenko 
> Sent: Friday, February 17, 2023 8:57 PM
> To: Jiawei(Jonny) Wang ; Slava Ovsiienko
> ; Ori Kam ; NBU-Contact-
> Thomas Monjalon (EXTERNAL) ;
> ferruh.yi...@amd.com; Aman Singh ; Yuying
> Zhang 
> Cc: dev@dpdk.org; Raslan Darawsheh 
> Subject: Re: [PATCH v6 1/2] ethdev: add Tx queue mapping of aggregated ports
> 
[snip]
> > --- a/lib/ethdev/rte_ethdev.c
> > +++ b/lib/ethdev/rte_ethdev.c
> > @@ -6946,6 +6946,78 @@
> rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t
> *ptypes
> > return j;
> >   }
> >
> > +int rte_eth_dev_count_aggr_ports(uint16_t port_id) {
> > +   struct rte_eth_dev *dev;
> > +   int ret;
> > +
> > +   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +   dev = &rte_eth_devices[port_id];
> > +
> > +   if (*dev->dev_ops->count_aggr_ports == NULL)
> 
> Is it OK that tracing is lost in this case?
> 

Do you mean that we don't need tracing in this case?

> > +   return 0;
> > +   ret = eth_err(port_id, (*dev->dev_ops->count_aggr_ports)(port_id));
> > +
> > +   rte_eth_trace_count_aggr_ports(port_id, ret);
> > +
> > +   return ret;
> > +}
> > +
> 
> [snip]



RE: [PATCH] doc: update sample action description for mlx5

2022-11-04 Thread Jiawei(Jonny) Wang
Hi,

Sorry for the late reply.

> -Original Message-
> From: Thomas Monjalon 
> Sent: Friday, October 14, 2022 9:11 PM
> To: Jiawei(Jonny) Wang 
> Cc: Raslan Darawsheh ; Asaf Penso ;
> Matan Azrad ; Slava Ovsiienko
> ; dev@dpdk.org
> Subject: Re: [PATCH] doc: update sample action description for mlx5
> 
> 14/10/2022 12:14, Jiawei Wang:
> > This patch adds mlx5 description about E-Switch mirroring flow
> > (RTE_FLOW_ACTION_TYPE_SAMPLE with sample ratio=1) with encap action,
> > then supports the uplink port only in the sample actions list or in
> > the one flow.
> >
> > Signed-off-by: Jiawei Wang 
> [...]
> > +  - For E-Switch mirroring flow, supports ``ENCAP`` actions only to the
> > +UpLink Port and in either of sample actions list or in the flow.
> 
> Sorry I don't understand the second part of the sentence:
> "and in either of sample actions list or in the flow."
> 
> Please could you explain so we can find a simpler wording?
> 

The RTE sample action supports a list of actions, defined in the structure below:

struct rte_flow_action_sample {
	uint32_t ratio; /**< Packets sampled equals to '1/ratio'. */
	const struct rte_flow_action *actions;
	/**< Sub-action list specific for the sampling hit cases. */
};

So the second part means that the 'encap' with 'port' combination can be either
in the 'sub-action' list of the sample action or in the flow actions passed when
the user calls rte_flow_create(,, actions[]);
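
As an illustration only (the port id, the ratio and the encapsulation buffer
are placeholders, not values taken from the patch), an E-Switch mirroring flow
with the 'encap' + 'port' pair inside the sample sub-action list could be
sketched as:

#include <rte_flow.h>

/* Sub-actions applied only to the sampled (mirrored) copy. */
static uint8_t encap_data[64];               /* placeholder raw encap header */
static struct rte_flow_action_raw_encap encap = {
	.data = encap_data,
	.size = sizeof(encap_data),
};
static struct rte_flow_action_port_id uplink = { .id = 0 }; /* placeholder uplink */
static struct rte_flow_action sample_sub_actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_RAW_ENCAP, .conf = &encap },
	{ .type = RTE_FLOW_ACTION_TYPE_PORT_ID,   .conf = &uplink },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};
static struct rte_flow_action_sample sample_conf = {
	.ratio = 1,                              /* ratio = 1 => full mirroring */
	.actions = sample_sub_actions,
};
/* Top-level action list passed to rte_flow_create(): sample + original fate. */
static struct rte_flow_action actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_SAMPLE,  .conf = &sample_conf },
	{ .type = RTE_FLOW_ACTION_TYPE_PORT_ID, .conf = &uplink },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};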

> 



Re: [dpdk-dev] Duplicating traffic with RTE Flow

2021-03-10 Thread Jiawei(Jonny) Wang
Hi Jan,

Sorry for the late response.

The first rule is invalid: the port action only works in the FDB domain, so
'transfer' is needed here.
The second rule should be OK; could you please check whether port 1 was enabled
in your DPDK application?

Thanks.
Jonny

> -Original Message-
> From: Jan Viktorin 
> Sent: Monday, March 1, 2021 10:43 PM
> To: Slava Ovsiienko 
> Cc: Asaf Penso ; dev@dpdk.org; Ori Kam
> ; Jiawei(Jonny) Wang 
> Subject: Re: [dpdk-dev] Duplicating traffic with RTE Flow
> 
> On Mon, 1 Mar 2021 14:34:07 +
> Slava Ovsiienko  wrote:
> 
> > Hi, Jan
> >
> > To use port action (I see it is in your sample action list) the flow
> > should be applied to the FDB domain, ie "transfer" attribute should be
> specified:
> >
> > flow validate 0 ingress transfer...
> 
> As you can see (however, it's a bit messy in the response below, in [1], it is
> better), I tried both. First without transfer and second with. The first gives
> hint "action is valid in transfer mode only" but the second try with transfer
> gives "Operation not supported".
> 
> Jan
> 
> [1] http://mails.dpdk.org/archives/dev/2021-March/200475.html
> 
> >
> > With best regards, Slava
> >
> > > -Original Message-
> > > From: Jan Viktorin 
> > > Sent: Monday, March 1, 2021 14:21
> > > To: Asaf Penso 
> > > Cc: dev@dpdk.org; Ori Kam ; Jiawei(Jonny) Wang
> > > ; Slava Ovsiienko 
> > > Subject: Re: [dpdk-dev] Duplicating traffic with RTE Flow
> > >
> > > Hello Asaf,
> > >
> > > it is a while we were in touch regarding this topic. Finally, I am
> > > again trying to get work this feature. I've seen that sampling is
> > > already upstreamed which is great. However, I am not very successful
> > > with that. There is nearly no documentation, just [1], I found no 
> > > examples,
> just commit logs...
> > >
> > > I tried:
> > >
> > >  > set sample_actions 0 port_id id 1 / end  > flow validate 0
> > > ingress pattern end actions sample ratio 1 index 0 / drop / end
> > >  port_flow_complain(): Caught PMD error type 1 (cause unspecified):
> > > port id action is valid in transfer mode only: Operation not
> > > supported  > flow validate
> > > 0 ingress transfer pattern end actions sample ratio 1 index 0 / drop
> > > / end
> > >  port_flow_complain(): Caught PMD error type 1 (cause unspecified):
> > > (no stated reason): Operation not supported
> > >
> > > Using CentOS 7, DPDK 20.11.0, OFED-5.2-1.0.4.
> > > NICs: MT2892 Family [ConnectX-6 Dx] 101d (fw 22.28.1002), MT27800
> > > Family [ConnectX-5] 1017 (fw 16.27.2008).
> > >
> > > My primary goal is to be able to deliver exactly the same packets
> > > both to DPDK and to the Linux kernel. Doing this at RTE Flow level
> > > would be great due to performance and transparency.
> > >
> > > Jan
> > >
> > > [1]
> > > https://doc.dpdk.org/guides/prog_guide/rte_flow.html#action-sample
> > >
> > > On Fri, 18 Sep 2020 14:23:42 +
> > > Asaf Penso  wrote:
> > >
> > > > Hello Jan,
> > > >
> > > > You can have a look in series [1] where we propose to add APIs to
> > > DPDK20.11 for both mirroring and sampling for packets, with
> > > additional actions of the different traffic.
> > > >
> > > > [1]
> > > > http://patches.dpdk.org/project/dpdk/list/?series=12045
> > > >
> > > > Regards,
> > > > Asaf Penso
> > > >
> > > > >-Original Message-
> > > > >From: dev  On Behalf Of Jan Viktorin
> > > > >Sent: Friday, September 18, 2020 3:56 PM
> > > > >To: dev@dpdk.org
> > > > >Subject: [dpdk-dev] Duplicating traffic with RTE Flow
> > > > >
> > > > >Hello all,
> > > > >
> > > > >we are looking for a way to duplicate ingress traffic in hardware.
> > > > >
> > > > >There is an example in [1] suggesting to insert two fate actions
> > > > >into the RTE Flow actions array like:
> > > > >
> > > > >  flow create 0 ingress pattern end \
> > > > >  actions queue index 0 / void / queue index 1 / end
> > > > >
> > > > >But our experience is that PMDs reject two fate actions (tried
> > > > >with mlx5). Another similar approach would be to deliver ever

Re: [dpdk-dev] Duplicating traffic with RTE Flow

2021-03-12 Thread Jiawei(Jonny) Wang
Hi Jan,

> -Original Message-
> From: Jan Viktorin 
> Sent: Friday, March 12, 2021 12:33 AM
> To: Jiawei(Jonny) Wang 
> Cc: Slava Ovsiienko ; Asaf Penso
> ; dev@dpdk.org; Ori Kam 
> Subject: Re: [dpdk-dev] Duplicating traffic with RTE Flow
> 
> On Thu, 11 Mar 2021 02:11:07 +
> "Jiawei(Jonny) Wang"  wrote:
> 
> > Hi Jan,
> >
> > Sorry for late response,
> >
> > First rule is invalid, port only works on FDB domain so need
> > 'transfer' here; Second rule should be ok,  could you please check if the
> port 1 was enabled on you dpdk application?
> 
> I assume that it is enabled, see full transcript:
> 
>  $ ofed_info
>  MLNX_OFED_LINUX-5.2-1.0.4.0 (OFED-5.2-1.0.4):
>  ...
>  $ sudo dpdk-testpmd -v -- -i
>  EAL: Detected 24 lcore(s)
>  EAL: Detected 1 NUMA nodes
>  EAL: RTE Version: 'DPDK 20.11.0'
>  EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>  EAL: Selected IOVA mode 'PA'
>  EAL: No available hugepages reported in hugepages-1048576kB
>  EAL: Probing VFIO support...
>  EAL: Probe PCI driver: mlx5_pci (15b3:1017) device: :04:00.0 (socket 0)
>  mlx5_pci: No available register for Sampler.
>  mlx5_pci: Size 0x is not power of 2, will be aligned to 0x1.
>  EAL: Probe PCI driver: mlx5_pci (15b3:1017) device: :04:00.1 (socket 0)
>  mlx5_pci: No available register for Sampler.
>  mlx5_pci: Size 0x is not power of 2, will be aligned to 0x1.
>  EAL: No legacy callbacks, legacy socket not created  Interactive-mode
> selected
>  testpmd: create a new mbuf pool : n=331456, size=2176,
> socket=0
>  testpmd: preferred mempool ops selected: ring_mp_mc  Configuring Port 0
> (socket 0)  Port 0: B8:59:9F:E2:09:F6  Configuring Port 1 (socket 0)  Port 1:
> B8:59:9F:E2:09:F7  Checking link statuses...
>  Done

It seems that you start two PF ports here; Port 1 is not a VF port.
An FDB rule can steer packets from a PF to its VFs and vice versa. Could you
please try to open the VF ports and start testpmd with representor=.

Thanks.

>  testpmd> port start 1
>  Port 1 is now not stopped
>  Please stop the ports first
>  Done
>  testpmd> set sample_actions 0 port_id id 1 / end  testpmd> flow validate 0
> ingress transfer pattern end actions sample ratio 1 index 0 / drop / end
>  port_flow_complain(): Caught PMD error type 1 (cause unspecified): (no
> stated reason): Operation not supported  testpmd> flow create 0 ingress
> transfer pattern end actions sample ratio 1 index 0 / drop / end
>  port_flow_complain(): Caught PMD error type 1 (cause unspecified): (no
> stated reason): Operation not supported  testpmd>  Stopping port 0...
>  Stopping ports...
>  Done
> 
>  Stopping port 1...
>  Stopping ports...
>  Done
> 
>  Shutting down port 0...
>  Closing ports...
>  Port 0 is closed
>  Done
> 
>  Shutting down port 1...
>  Closing ports...
>  Port 1 is closed
>  Done
> 
>  Bye...
> 
> Jan
> 
> >
> > Thanks.
> > Jonny
> >
> > > -Original Message-
> > > From: Jan Viktorin 
> > > Sent: Monday, March 1, 2021 10:43 PM
> > > To: Slava Ovsiienko 
> > > Cc: Asaf Penso ; dev@dpdk.org; Ori Kam
> > > ; Jiawei(Jonny) Wang 
> > > Subject: Re: [dpdk-dev] Duplicating traffic with RTE Flow
> > >
> > > On Mon, 1 Mar 2021 14:34:07 +
> > > Slava Ovsiienko  wrote:
> > >
> > > > Hi, Jan
> > > >
> > > > To use port action (I see it is in your sample action list) the
> > > > flow should be applied to the FDB domain, ie "transfer" attribute
> > > > should be
> > > specified:
> > > >
> > > > flow validate 0 ingress transfer...
> > >
> > > As you can see (however, it's a bit messy in the response below, in
> > > [1], it is better), I tried both. First without transfer and second
> > > with. The first gives hint "action is valid in transfer mode only"
> > > but the second try with transfer gives "Operation not supported".
> > >
> > > Jan
> > >
> > > [1] http://mails.dpdk.org/archives/dev/2021-March/200475.html
> > >
> > > >
> > > > With best regards, Slava
> > > >
> > > > > -Original Message-
> > > > > From: Jan Viktorin 
> > > > > Sent: Monday, March 1, 2021 14:21
> > > > > To: Asaf Penso 
> > > > > Cc: dev@dpdk.org; Ori Kam ; Jiawei(Jonny) Wang
> > > > > ; Slava Ovsiienko 
> > > > > Subject: Re: [dpdk-dev] Duplicating

Re: [dpdk-dev] [PATCH] doc: update sample actions support in mlx5 guide

2021-03-12 Thread Jiawei(Jonny) Wang
Hi Thomas,

> -Original Message-
> From: Thomas Monjalon 
> Sent: Tuesday, March 9, 2021 9:39 PM
> To: Jiawei(Jonny) Wang 
> Cc: Slava Ovsiienko ; Matan Azrad
> ; Ori Kam ; dev@dpdk.org; Raslan
> Darawsheh 
> Subject: Re: [PATCH] doc: update sample actions support in mlx5 guide
> 
> 09/03/2021 14:09, Jiawei Wang:
> > +  - Supports ``MARK``, ``COUNT``, ``QUEUE``, ``RSS`` as sample actions for
> NIC Rx sampling/mirroring flow.
> 
> "sample actions for NIC Rx sampling" is redundant.
> I suggest "Supports [...] actions for NIC Rx flow sampling/mirroring."
> 

These actions are supported in the sample action list; the user can also add
modify, decap and encap actions in the NIC Rx flow.
How about this:

"For NIC Rx flow with sample action, supports [...] actions in the sample
action list"?

> > +  - Supports ``RAW ENCAP``, ``Port ID`` as sample actions for E-Switch
> mirroring (sample ratio = 1) flow.
> 
> I suggest "Supports [...] actions for E-Switch flow mirroring."
> 
> If lines are long, do not hesitate to start a new line after a punctuation.
> 

Ok, I will split into two lines.

Thanks.


Re: [dpdk-dev] Duplicating traffic with RTE Flow

2021-03-15 Thread Jiawei(Jonny) Wang
Hi Jan,


> -Original Message-
> From: Jan Viktorin 
> Sent: Monday, March 15, 2021 9:22 PM
> To: Jiawei(Jonny) Wang 
> Cc: Slava Ovsiienko ; Asaf Penso
> ; dev@dpdk.org; Ori Kam 
> Subject: Re: [dpdk-dev] Duplicating traffic with RTE Flow
> 
> Hello Jiawei,
> 
> On Fri, 12 Mar 2021 09:32:44 +
> "Jiawei(Jonny) Wang"  wrote:
> 
> > Hi Jan,
> >
> > > -Original Message-
> > > From: Jan Viktorin 
> > > Sent: Friday, March 12, 2021 12:33 AM
> > > To: Jiawei(Jonny) Wang 
> > > Cc: Slava Ovsiienko ; Asaf Penso
> > > ; dev@dpdk.org; Ori Kam 
> > > Subject: Re: [dpdk-dev] Duplicating traffic with RTE Flow
> > >
> > > On Thu, 11 Mar 2021 02:11:07 +
> > > "Jiawei(Jonny) Wang"  wrote:
> > >
> > > > Hi Jan,
> > > >
> > > > Sorry for late response,
> > > >
> > > > First rule is invalid, port only works on FDB domain so need
> > > > 'transfer' here; Second rule should be ok,  could you please check
> > > > if the
> > > port 1 was enabled on you dpdk application?
> > >
> > > I assume that it is enabled, see full transcript:
> > >
> > >  $ ofed_info
> > >  MLNX_OFED_LINUX-5.2-1.0.4.0 (OFED-5.2-1.0.4):
> > >  ...
> > >  $ sudo dpdk-testpmd -v -- -i
> > >  EAL: Detected 24 lcore(s)
> > >  EAL: Detected 1 NUMA nodes
> > >  EAL: RTE Version: 'DPDK 20.11.0'
> > >  EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> > >  EAL: Selected IOVA mode 'PA'
> > >  EAL: No available hugepages reported in hugepages-1048576kB
> > >  EAL: Probing VFIO support...
> > >  EAL: Probe PCI driver: mlx5_pci (15b3:1017) device: :04:00.0
> > > (socket 0)
> > >  mlx5_pci: No available register for Sampler.
> > >  mlx5_pci: Size 0x is not power of 2, will be aligned to 0x1.
> > >  EAL: Probe PCI driver: mlx5_pci (15b3:1017) device: :04:00.1
> > > (socket 0)
> > >  mlx5_pci: No available register for Sampler.
> > >  mlx5_pci: Size 0x is not power of 2, will be aligned to 0x1.
> > >  EAL: No legacy callbacks, legacy socket not created
> > > Interactive-mode selected
> > >  testpmd: create a new mbuf pool : n=331456, size=2176,
> > > socket=0
> > >  testpmd: preferred mempool ops selected: ring_mp_mc  Configuring
> > > Port 0 (socket 0)  Port 0: B8:59:9F:E2:09:F6  Configuring Port 1 (socket 
> > > 0)
> Port 1:
> > > B8:59:9F:E2:09:F7  Checking link statuses...
> > >  Done
> >
> > Seems that you start two PF port here,  Port 1 is not VF port; FDB
> > rule can steering the packet form PF to its VFs and vice versa, Could
> > you please try to open the VF ports and start the testpmd with
> representor=.
> 
> I did not know this, so I tried with VFs:
> 
>  # echo 2 > /sys/class/net/hge1/device/sriov_numvfs
>  # echo switchdev > /sys/class/net/hge1/compat/devlink/mode
> 
>  # dpdk-testpmd -v -a ':05:00.1,representor=[0-1]' -- -i
>  EAL: Detected 24 lcore(s)
>  EAL: Detected 1 NUMA nodes
>  EAL: RTE Version: 'DPDK 20.11.0'
>  EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
>  EAL: Selected IOVA mode 'VA'
>  EAL: No available hugepages reported in hugepages-1048576kB
>  EAL: Probing VFIO support...
>  EAL: Probe PCI driver: mlx5_pci (15b3:1017) device: :05:00.1 (socket 0)
>  mlx5_pci: No available register for Sampler.
>  mlx5_pci: Size 0x is not power of 2, will be aligned to 0x1.
>  mlx5_pci: No available register for Sampler.
>  mlx5_pci: No available register for Sampler.
>  EAL: No legacy callbacks, legacy socket not created  Interactive-mode
> selected
>  testpmd: create a new mbuf pool : n=331456, size=2176,
> socket=0
>  testpmd: preferred mempool ops selected: ring_mp_mc
> 
>  Warning! port-topology=paired and odd forward ports number, the last port
> will pair with itself.
> 
>  Configuring Port 0 (socket 0)
>  Port 0: B8:59:9F:E2:09:F7
>  Configuring Port 1 (socket 0)
>  Port 1: B2:57:D6:72:F3:31
>  Configuring Port 2 (socket 0)
>  Port 2: 9E:CB:D0:73:59:CE
>  Checking link statuses...
>  Done
>  testpmd> show port summary all
>  Number of available ports: 3
>  Port MAC Address   Name Driver Status   Link
>  0B8:59:9F:E2:09:F7 :05:00.1 mlx5_pci   up   100 Gbps
>  1B2:57:D6:72:F3:31 :05:00.1_representor_0 mlx5_pci   up   100
> Gbps
>  29E:CB:D0:73:59:CE 

Re: [dpdk-dev] [PATCH] net/mlx5: support RSS expansion for NVGRE

2021-05-12 Thread Jiawei(Jonny) Wang



> -Original Message-
> From: Slava Ovsiienko 
> Sent: Wednesday, May 12, 2021 5:37 PM
> To: Jiawei(Jonny) Wang ; Matan Azrad
> ; Ori Kam ; Jack Min
> ; NBU-Contact-Thomas Monjalon
> 
> Cc: dev@dpdk.org; Raslan Darawsheh ;
> sta...@dpdk.org
> Subject: RE: [PATCH] net/mlx5: support RSS expansion for NVGRE
> 
> > -Original Message-
> > From: Jiawei(Jonny) Wang 
> > Sent: Wednesday, May 12, 2021 9:43
> > To: Matan Azrad ; Slava Ovsiienko
> > ; Ori Kam ; Jack Min
> > ; NBU-Contact-Thomas Monjalon
> > 
> > Cc: dev@dpdk.org; Raslan Darawsheh ;
> > sta...@dpdk.org
> > Subject: [PATCH] net/mlx5: support RSS expansion for NVGRE
> >
> > Currently RSS expansion only support GRE and GRE KEY.
> > This patch add RSS expansion for NVGRE item so PMD can expand flow
> > item correctly.
> Please, fix typos:
> support -> supportS
> add -> addS
> 
> Also, the "fix" should be in headline.
> 

Thanks Slava! Will send a v2 patch to fix it.

> With best regards,
> Slava
> 
> 
> 
> >[snip]



Re: [dpdk-dev] [PATCH 4/8] net/mlx5: add the validate sample action

2020-07-01 Thread Jiawei(Jonny) Wang



> -Original Message-
> From: Ori Kam 
> Sent: Wednesday, July 1, 2020 2:00 AM
> To: Jiawei(Jonny) Wang ; Slava Ovsiienko
> ; Matan Azrad 
> Cc: dev@dpdk.org; Thomas Monjalon ; Raslan
> Darawsheh ; ian.sto...@intel.com;
> f...@redhat.com; Jiawei(Jonny) Wang 
> Subject: RE: [PATCH 4/8] net/mlx5: add the validate sample action
> 
> Hi Jiawei,
> 
> PSB.
> 
> Best,
> Ori
> 
> > -Original Message-
> > From: Jiawei Wang 
> > Sent: Thursday, June 25, 2020 7:26 PM
> > Subject: [PATCH 4/8] net/mlx5: add the validate sample action
> >
> > Add sample action validate function.
> >
> > For Sample flow support NIC-RX and FDB domain, must include an action
> > of a dest TIR in NIC_RX or DEFAULT_MISS in FDB.
> 
> What is the DEFAULT_MISS action?
> I think from reading the code that you mean that no action is allowed and it
> is always goes to e-switch manager / go to PF, am I correct?
> 
Yes, you're right. For FDB, no additional action is allowed for sampling; the
default action is to go to the e-switch manager port.
DEFAULT_MISS is an rdma-core action that steers the packet to the default miss
of the steering domain; for the FDB domain, that is the e-switch manager port.

I'll update the commit log description.

Thanks.

> >
> > Only NIC_RX support with addition optinal actions.
> >
> > Signed-off-by: Jiawei Wang 
> > ---
> >  drivers/net/mlx5/linux/mlx5_os.c |  14 +
> >  drivers/net/mlx5/mlx5.h  |   1 +
> >  drivers/net/mlx5/mlx5_flow.h |   1 +
> >  drivers/net/mlx5/mlx5_flow_dv.c  | 130
> > +++
> >  4 files changed, 146 insertions(+)
> >
> > diff --git a/drivers/net/mlx5/linux/mlx5_os.c
> > b/drivers/net/mlx5/linux/mlx5_os.c
> > index f0147e6..5c057d3 100644
> > --- a/drivers/net/mlx5/linux/mlx5_os.c
> > +++ b/drivers/net/mlx5/linux/mlx5_os.c
> > @@ -878,6 +878,20 @@
> > }
> > }
> >  #endif
> > +#if defined(HAVE_MLX5DV_DR) &&
> > defined(HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE)
> > +   if (config.hca_attr.log_max_ft_sampler_num > 0  &&
> > +   config.dv_flow_en) {
> > +   priv->sampler_en = 1;
> > +   DRV_LOG(DEBUG, "The Sampler enabled!\n");
> > +   } else {
> > +   priv->sampler_en = 0;
> > +   if (!config.hca_attr.log_max_ft_sampler_num)
> > +   DRV_LOG(WARNING, "No available register
> > for"
> > +   " Sampler.");
> > +   else
> > +   DRV_LOG(DEBUG, "DV flow is not
> > supported!\n");
> > +   }
> > +#endif
> > }
> > if (config.mprq.enabled && mprq) {
> > if (config.mprq.stride_num_n &&
> > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> > 8a09ebc..c2a875c 100644
> > --- a/drivers/net/mlx5/mlx5.h
> > +++ b/drivers/net/mlx5/mlx5.h
> > @@ -607,6 +607,7 @@ struct mlx5_priv {
> > unsigned int counter_fallback:1; /* Use counter fallback management.
> > */
> > unsigned int mtr_en:1; /* Whether support meter. */
> > unsigned int mtr_reg_share:1; /* Whether support meter REG_C
> share.
> > */
> > +   unsigned int sampler_en:1; /* Whether support sampler. */
> > uint16_t domain_id; /* Switch domain identifier. */
> > uint16_t vport_id; /* Associated VF vport index (if any). */
> > uint32_t vport_meta_tag; /* Used for vport index match ove VF LAG.
> > */ diff --git a/drivers/net/mlx5/mlx5_flow.h
> > b/drivers/net/mlx5/mlx5_flow.h index 2c96677..902380b 100644
> > --- a/drivers/net/mlx5/mlx5_flow.h
> > +++ b/drivers/net/mlx5/mlx5_flow.h
> > @@ -200,6 +200,7 @@ enum mlx5_feature_name {  #define
> > MLX5_FLOW_ACTION_SET_IPV4_DSCP (1ull << 32)  #define
> > MLX5_FLOW_ACTION_SET_IPV6_DSCP (1ull << 33)  #define
> > MLX5_FLOW_ACTION_AGE (1ull << 34)
> > +#define MLX5_FLOW_ACTION_SAMPLE (1ull << 35)
> >
> >  #define MLX5_FLOW_FATE_ACTIONS \
> > (MLX5_FLOW_ACTION_DROP | MLX5_FLOW_ACTION_QUEUE | \ diff
> --git
> > a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> > index f174009..710c0f3 100644
> > --- a/drivers/net/mlx5/mlx5_flow_dv.c
> > +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> > @@ -3925,6 +3925,127 @@ struct field_modify_info modify_tcp[] = {  }
> >
> >  /**
> > + * Validate the 

Re: [dpdk-dev] [PATCH 6/8] net/mlx5: update translate function for sample action

2020-07-01 Thread Jiawei(Jonny) Wang



> -Original Message-
> From: Ori Kam 
> Sent: Wednesday, July 1, 2020 3:55 AM
> To: Jiawei(Jonny) Wang ; Slava Ovsiienko
> ; Matan Azrad 
> Cc: dev@dpdk.org; Thomas Monjalon ; Raslan
> Darawsheh ; ian.sto...@intel.com;
> f...@redhat.com; Jiawei(Jonny) Wang 
> Subject: RE: [PATCH 6/8] net/mlx5: update translate function for sample
> action
> 
> Hi Jiawei,
> PSB,
> 
> Thanks,
> Ori
> 
> > -Original Message-
> > From: Jiawei Wang 
> > Sent: Thursday, June 25, 2020 7:26 PM
> > Subject: [PATCH 6/8] net/mlx5: update translate function for sample
> > action
> >
> > Translate the attribute of sample action that include sample ratio and
> > sub actions list, then create the sample DR action.
> >
> > Signed-off-by: Jiawei Wang 
> > ---
> >  drivers/net/mlx5/mlx5_flow.c|  16 +-
> >  drivers/net/mlx5/mlx5_flow.h|  14 +-
> >  drivers/net/mlx5/mlx5_flow_dv.c | 502
> > +++-
> >  3 files changed, 511 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_flow.c
> > b/drivers/net/mlx5/mlx5_flow.c index 7c65a9a..73ef290 100644
> > --- a/drivers/net/mlx5/mlx5_flow.c
> > +++ b/drivers/net/mlx5/mlx5_flow.c
> > @@ -4569,10 +4569,14 @@ uint32_t mlx5_flow_adjust_priority(struct
> > rte_eth_dev *dev, int32_t priority,
> > int hairpin_flow;
> > uint32_t hairpin_id = 0;
> > struct rte_flow_attr attr_tx = { .priority = 0 };
> > +   struct rte_flow_attr attr_factor = {0};
> > int ret;
> >
> > -   hairpin_flow = flow_check_hairpin_split(dev, attr, actions);
> > -   ret = flow_drv_validate(dev, attr, items, p_actions_rx,
> > +   memcpy((void *)&attr_factor, (const void *)attr, sizeof(*attr));
> > +   if (external)
> > +   attr_factor.group *= MLX5_FLOW_TABLE_FACTOR;
> > +   hairpin_flow = flow_check_hairpin_split(dev, &attr_factor, actions);
> > +   ret = flow_drv_validate(dev, &attr_factor, items, p_actions_rx,
> > external, hairpin_flow, error);
> > if (ret < 0)
> > return 0;
> > @@ -4591,7 +4595,7 @@ uint32_t mlx5_flow_adjust_priority(struct
> > rte_eth_dev *dev, int32_t priority,
> > rte_errno = ENOMEM;
> > goto error_before_flow;
> > }
> > -   flow->drv_type = flow_get_drv_type(dev, attr);
> > +   flow->drv_type = flow_get_drv_type(dev, &attr_factor);
> > if (hairpin_id != 0)
> > flow->hairpin_flow_id = hairpin_id;
> > MLX5_ASSERT(flow->drv_type > MLX5_FLOW_TYPE_MIN && @@ -
> 4637,7
> > +4641,7 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev *dev,
> > int32_t priority,
> >  * depending on configuration. In the simplest
> >  * case it just creates unmodified original flow.
> >  */
> > -   ret = flow_create_split_outer(dev, flow, attr,
> > +   ret = flow_create_split_outer(dev, flow, &attr_factor,
> >   buf->entry[i].pattern,
> >   p_actions_rx, external, idx,
> >   error);
> > @@ -4674,8 +4678,8 @@ uint32_t mlx5_flow_adjust_priority(struct
> > rte_eth_dev *dev, int32_t priority,
> >  * the egress Flows belong to the different device and
> >  * copy table should be updated in peer NIC Rx domain.
> >  */
> > -   if (attr->ingress &&
> > -   (external || attr->group != MLX5_FLOW_MREG_CP_TABLE_GROUP))
> > {
> > +   if (attr_factor.ingress &&
> > +   (external || attr_factor.group !=
> > MLX5_FLOW_MREG_CP_TABLE_GROUP)) {
> > ret = flow_mreg_update_copy_table(dev, flow, actions,
> error);
> > if (ret)
> > goto error;
> > diff --git a/drivers/net/mlx5/mlx5_flow.h
> > b/drivers/net/mlx5/mlx5_flow.h index 941de5f..4163183 100644
> > --- a/drivers/net/mlx5/mlx5_flow.h
> > +++ b/drivers/net/mlx5/mlx5_flow.h
> > @@ -369,6 +369,13 @@ enum mlx5_flow_fate_type {
> > MLX5_FLOW_FATE_MAX,
> >  };
> >
> > +/*
> > + * Max number of actions per DV flow.
> > + * See CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED
> > + * in rdma-core file providers/mlx5/verbs.c.
> > + */
> > +#define MLX5_DV_MAX_NUMBER_OF_ACTIONS 8
> > +
> >  /* Matcher PRM representation */
> >  struct mlx5_flow_dv_match_params {
> > size_t size;
> > @@ -599,13 +606,6 @@ struct mlx5_f

Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for rte flow

2020-07-05 Thread Jiawei(Jonny) Wang
Thanks Andrew and Ori,

> -Original Message-
> From: Ori Kam 
> Sent: Sunday, July 5, 2020 6:19 PM
> To: Andrew Rybchenko ; Jiawei(Jonny) Wang
> ; Slava Ovsiienko ;
> Matan Azrad 
> Cc: dev@dpdk.org; Thomas Monjalon ; Raslan
> Darawsheh ; ian.sto...@intel.com;
> f...@redhat.com
> Subject: RE: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action for
> rte flow
> 
> Hi Andrew,
> 
> I replied to some of your comments.
> Best,
> Ori
> > -Original Message-
> > From: dev  On Behalf Of Andrew Rybchenko
> > Sent: Saturday, July 4, 2020 4:05 PM
> > Subject: Re: [dpdk-dev] [PATCH v2 1/7] ethdev: introduce sample action
> > for rte flow
> >
> > On 7/2/20 9:43 PM, Jiawei Wang wrote:
> > > When using full offload, all traffic will be handled by the HW, and
> > > directed to the requested vf or wire, the control application loses
> >
> > vf->VF
will change in v3 patch.
> >
> > > visibility on the traffic.
> > > So there's a need for an action that will enable the control
> > > application some visibility.
> > >
> > > The solution is introduced a new action that will sample the
> > > incoming traffic and send a duplicated traffic in some predefined
> > > ratio to the application, while the original packet will continue to
> > > the target destination.
> > >
> >
> > May be 1 packet per second is a better sampling approach?
> > Or just different.
> >
> Those are two different things, lets take a packet that arrives once every two
> seconds and we ask to sample once every second, this means that we will
> always get that packet.
> Also as far as I understand the use case is to have some visibility about the
> traffic.
> so you can assume that if a packet is sent once per second the application
> will get the packet with very high delay and very low visibility. Let's take a
> use case where the hypervisor wants to check if one of the VMs is abusing the system
> (sends DDOS packets, or just trying to understand the network) in this case
> we can assume that the VM will send large amount of traffic. and if we only
> check once per second the application will not be able to understand the
> traffic meaning, but if we sample 1% of the traffic then the application will
> see very fast the type of the traffic the VM is sending and if it is trying to
> abuse the system.
> So I vote in favor of keeping as is.
> 
> 
> > > The packets sampled equals is '1/ratio', if the ratio value be set
> > > to 1 , means that the packets would be completely mirrored. The
> > > sample packet
> >
> > Comma on the next line looks bad.
ok, will move the Comma to the same line.
> >
> > > can be assigned with different set of actions from the original packet.
> > >
> > > In order to support the sample packet in rte_flow, new rte_flow
> > > action definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> > rte_flow_action_sample
> > > will be introduced.
> > >
> > > Signed-off-by: Jiawei Wang 
> > > Acked-by: Ori Kam 
> > > ---
> > >  doc/guides/prog_guide/rte_flow.rst | 25
> +
> > >  doc/guides/rel_notes/release_20_08.rst |  6 ++
> > >  lib/librte_ethdev/rte_flow.c   |  1 +
> > >  lib/librte_ethdev/rte_flow.h   | 28 
> > >  4 files changed, 60 insertions(+)
> > >
> > > diff --git a/doc/guides/prog_guide/rte_flow.rst
> > b/doc/guides/prog_guide/rte_flow.rst
> > > index d5dd18c..50dfe1f 100644
> > > --- a/doc/guides/prog_guide/rte_flow.rst
> > > +++ b/doc/guides/prog_guide/rte_flow.rst
> > > @@ -2645,6 +2645,31 @@ timeout passed without any matching on the
> flow.
> > > | ``context``  | user input flow context |
> > > +--+-+
> > >
> > > +Action: ``SAMPLE``
> > > +^^
> > > +
> > > +Adds a sample action to a matched flow.
> > > +
> > > +The matching packets will be duplicated to a special queue or vport
> >
> > what is vport above?
> >
> I think it should be port (when using E-Switch)
Yes, the destination port works in e-switch mode; will change vport -> port.
> 
> > > +with the predefined ``ratio``, the packets sampled equals is '1/ratio'.
> > > +All the packets continues to the target destination.
> >
> > continues -> continue (if I'm not mistaken)
ok, will change.
> >
> > > +
> &

Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow

2020-06-28 Thread Jiawei(Jonny) Wang

On Friday, June 26, 2020 7:10 PM Jerin Jacob  Wrote:
>
> On Fri, Jun 26, 2020 at 4:16 PM Thomas Monjalon 
> wrote:
> >
> > 26/06/2020 12:35, Jerin Jacob:
> > > On Fri, Jun 26, 2020 at 12:59 AM Thomas Monjalon
>  wrote:
> > > >
> > > > 25/06/2020 19:55, Jerin Jacob:
> > > > > On Thu, Jun 25, 2020 at 10:20 PM Jiawei Wang
>  wrote:
> > > > > >
> > > > > > When using full offload, all traffic will be handled by the
> > > > > > HW, and directed to the requested vf or wire, the control
> > > > > > application loses visibility on the traffic.
> > > > > > So there's a need for an action that will enable the control
> > > > > > application some visibility.
> > > > > >
> > > > > > The solution is introduced a new action that will sample the
> > > > > > incoming traffic and send a duplicated traffic in some
> > > > > > predefined ratio to the application, while the original packet
> > > > > > will continue to the target destination.
> > > > > >
> > > > > > The packets sampled equals is '1/ratio', if the ratio value be
> > > > > > set to 1 , means that the packets would be completely
> > > > > > mirrored. The sample packet can be assigned with different set of
> actions from the original packet.
> > > > > >
> > > > > > In order to support the sample packet in rte_flow, new
> > > > > > rte_flow action definition RTE_FLOW_ACTION_TYPE_SAMPLE and
> > > > > > structure rte_flow_action_sample
> > > > >
> > > > > Isn't mirroring the packet? How about,
> > > > > RTE_FLOW_ACTION_TYPE_MIRROR I am not able to understand, Why
> it is called sample.
> > > >
> > > > Sampling is a partial mirroring.
> > >
> > > I think, By definition, _sampling_ is the _selection_ of items from
> > > a specific group.
> > > I think, _sampling_ is not dictating, what is the real action for
> > > the "selected"  items.
> > > One can get confused with the selected ones can be for forward, drop
> > > any other action.
> >
> > I see. Good design question (I will let others reply).
> >
> > > So IMO, explicit mirror keyword usage makes it is clear.

A sampled packet is duplicated from the incoming traffic at the specified ratio
and goes to the separate sample actions;
ratio=1 is 100% duplication, i.e. mirroring.
All packets continue to the default flow actions.
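
As a concrete sketch (queue indexes and the ratio are arbitrary placeholders),
a NIC Rx rule built on the proposed action could duplicate roughly one out of
every 16 matching packets to a monitoring queue, while all packets still follow
the default fate action:

#include <rte_flow.h>

/* Sub-action for the sampled copies: steer them to monitoring queue 3. */
static struct rte_flow_action_queue monitor_queue = { .index = 3 };
static struct rte_flow_action sample_sub_actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &monitor_queue },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};
/* ratio = 16 => roughly 1/16 of the matching packets are duplicated. */
static struct rte_flow_action_sample sample_conf = {
	.ratio = 16,
	.actions = sample_sub_actions,
};
/* Original packets keep going to their default destination (queue 0 here). */
static struct rte_flow_action_queue default_queue = { .index = 0 };
static struct rte_flow_action actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_SAMPLE, .conf = &sample_conf },
	{ .type = RTE_FLOW_ACTION_TYPE_QUEUE,  .conf = &default_queue },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};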

> > >
> > > Some more related questions:
> > > 1) What is the real use case for ratio? I am not against adding a
> > > ratio attribute if the MLX hardware supports it. It will be good to
> > > know the use case from the application perspective? And what basics
> > > application set ratio != 1?
> >
> > If I understand well, some applications want to check, by picking
> > random packets, that the processing is not failing.
> 
> Not clear to me. I will wait for another explanation if any.
> In what basics application set .1 vs .8?

The real use case is monitoring the traffic with full offload.
When packets hit the sample flow, the matching packets are sampled and sent
to a specific queue.
Aligned with the OVS sFlow probability, the user application can set it to a
different value.

> 
> >
> > > 2) If it is for "rate-limiting" or "policing", why not use rte_mtr
> > > object (rte_mtr.h) via rte_flow action.

The sample ratio isn't the same as 'meter'; the sampling ratio is applied to the
incoming packets (roughly one packet sampled out of every 'ratio' packets). The
sampled packets are then duplicated and go to the other sample actions.


> > > 3) One of the issue for driver developers and application writers
> > > are overlapping APIs. This would overlap with
> > > rte_eth_mirror_rule_set() API.
> > >
> > > Can we deprecate rte_eth_mirror_rule_set() API? It will be a pain
> > > for all to have overlapping APIs. We have not fixed the VLAN filter
> > > API overlap with rte_flow in ethdev. Its being TODO for multiple
> > > releases now.
> >
> > Ooooh yes!
> > I think flow-based API is more powerful, and should deprecate old
> > port-based API.
> 
> +1 from me.
> 
> it is taking too much effort and time to make support duplicate APIs.
> 
> > I want to help deprecating such API in 20.11 if possible.
> 
> Please start that discussion. In this case, it is clear API overlap with
> rte_eth_mirror_rule_set().
> We should not have two separate paths for the same function in the same
> ethdev library.
> 
> 
> 
> >
> > > > Full mirroring is sampling 100% packets (ratio = 1).
> > > > That's why only one action is enough.
> >
> >
> >


Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow

2020-06-28 Thread Jiawei(Jonny) Wang


> -Original Message-
> From: Jerin Jacob 
> Sent: Sunday, June 28, 2020 9:38 PM
> To: Jiawei(Jonny) Wang 
> Cc: Thomas Monjalon ; Ori Kam
> ; Slava Ovsiienko ;
> Matan Azrad ; dpdk-dev ; Raslan
> Darawsheh ; ian.sto...@intel.com;
> f...@redhat.com; Ferruh Yigit ; Andrew Rybchenko
> 
> Subject: Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte
> flow
> 
> On Sun, Jun 28, 2020 at 6:46 PM Jiawei(Jonny) Wang
>  wrote:
> >
> >
> > On Friday, June 26, 2020 7:10 PM Jerin Jacob 
> Wrote:
> > >
> > > On Fri, Jun 26, 2020 at 4:16 PM Thomas Monjalon
> > > 
> > > wrote:
> > > >
> > > > 26/06/2020 12:35, Jerin Jacob:
> > > > > On Fri, Jun 26, 2020 at 12:59 AM Thomas Monjalon
> > >  wrote:
> > > > > >
> > > > > > 25/06/2020 19:55, Jerin Jacob:
> > > > > > > On Thu, Jun 25, 2020 at 10:20 PM Jiawei Wang
> > >  wrote:
> > > > > > > >
> > > > > > > > When using full offload, all traffic will be handled by
> > > > > > > > the HW, and directed to the requested vf or wire, the
> > > > > > > > control application loses visibility on the traffic.
> > > > > > > > So there's a need for an action that will enable the
> > > > > > > > control application some visibility.
> > > > > > > >
> > > > > > > > The solution is introduced a new action that will sample
> > > > > > > > the incoming traffic and send a duplicated traffic in some
> > > > > > > > predefined ratio to the application, while the original
> > > > > > > > packet will continue to the target destination.
> > > > > > > >
> > > > > > > > The packets sampled equals is '1/ratio', if the ratio
> > > > > > > > value be set to 1 , means that the packets would be
> > > > > > > > completely mirrored. The sample packet can be assigned
> > > > > > > > with different set of
> > > actions from the original packet.
> > > > > > > >
> > > > > > > > In order to support the sample packet in rte_flow, new
> > > > > > > > rte_flow action definition RTE_FLOW_ACTION_TYPE_SAMPLE
> and
> > > > > > > > structure rte_flow_action_sample
> > > > > > >
> > > > > > > Isn't mirroring the packet? How about,
> > > > > > > RTE_FLOW_ACTION_TYPE_MIRROR I am not able to understand,
> Why
> > > it is called sample.
> > > > > >
> > > > > > Sampling is a partial mirroring.
> > > > >
> > > > > I think, By definition, _sampling_ is the _selection_ of items
> > > > > from a specific group.
> > > > > I think, _sampling_ is not dictating, what is the real action
> > > > > for the "selected"  items.
> > > > > One can get confused with the selected ones can be for forward,
> > > > > drop any other action.
> > > >
> > > > I see. Good design question (I will let others reply).
> > > >
> > > > > So IMO, explicit mirror keyword usage makes it is clear.
> >
> > Sampled packet is duplicated from incoming traffic at specific ratio
> > and will go to different sample actions;
> > ratio=1 is 100% duplication or mirroring.
> > All packets will continue to go to default flow actions.
> 
> Functionality is clear from the git commit log(Not from action name).
> The only question is what would be the appropriate name for this action.
> RTE_FLOW_ACTION_TYPE_SAMPLE vs RTE_FLOW_ACTION_TYPE_MIRROR
> 
> >
> > > > >
> > > > > Some more related questions:
> > > > > 1) What is the real use case for ratio? I am not against adding
> > > > > a ratio attribute if the MLX hardware supports it. It will be
> > > > > good to know the use case from the application perspective? And
> > > > > what basics application set ratio != 1?
> > > >
> > > > If I understand well, some applications want to check, by picking
> > > > random packets, that the processing is not failing.
> > >
> > > Not clear to me. I will wait for another explanation if any.
> > > In what basics application set .1 vs .8?
> >
> > The real case is like monitoring the traffic with full offload: the matching
> > packets are sampled and sent to a specific queue, in line with the OVS sFlow
> > probability, and the user application can set the ratio accordingly.

Re: [dpdk-dev] [PATCH 1/8] ethdev: introduce sample action for rte flow

2020-06-28 Thread Jiawei(Jonny) Wang

On Sunday, June 28, 2020 4:27 PM, Andrew Rybchenko  
wrote:
> 
> On 6/25/20 7:26 PM, Jiawei Wang wrote:
> > When using full offload, all traffic will be handled by the HW, and
> > directed to the requested vf or wire, the control application loses
> > visibility on the traffic.
> > So there's a need for an action that will enable the control
> > application some visibility.
> >
> > The solution is introduced a new action that will sample the incoming
> > traffic and send a duplicated traffic in some predefined ratio to the
> > application, while the original packet will continue to the target
> > destination.
> >
> > The packets sampled equals is '1/ratio', if the ratio value be set to
> > 1 , means that the packets would be completely mirrored. The sample
> > packet can be assigned with different set of actions from the original
> packet.
> >
> > In order to support the sample packet in rte_flow, new rte_flow action
> > definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure
> > rte_flow_action_sample will be introduced.
> >
> > Signed-off-by: Jiawei Wang 
> 
> [snip]
> 
> > @@ -2709,6 +2716,28 @@ struct rte_flow_action {  struct rte_flow;
> >
> >  /**
> > + * @warning
> > + * @b EXPERIMENTAL: this structure may change without prior notice
> > + *
> > + * RTE_FLOW_ACTION_TYPE_SAMPLE
> > + *
> > + * Adds a sample action to a matched flow.
> > + *
> > + * The matching packets will be duplicated to a special queue or
> > +vport
> > + * in the predefined probabiilty, All the packets continues
> > +processing
> > + * on the default flow path.
> > + *
> > + * When the sample ratio is set to 1 then the packets will be 100%
> mirrored.
> > + * Additional action list be supported to add for sampled or mirrored
> packets.
> > + */
> > +struct rte_flow_action_sample {
> > +   /* packets sampled equals to '1/ratio' */
> > +   const uint32_t ratio;
> > +   /* sub-action list specific for the sampling hit cases */
> > +   const struct rte_flow_action *actions;
> 
> This design idea does not look good to me from the very beginning. IMHO it
> does not fit flow API overall design.
> I mean sub-action list.
> 
> As I understand Linux iptables solves it on match level (i.e. in pattern). 
> E.g.
> "limit" extension which is basically sampling. Sampling using meta pattern
> item in combination with PASSTHRU action (to make sampling actions non-
> terminating if required) is a better solution from design point of view.

In our design there are a sample flow path and a normal flow path, and each 
path can have different actions.
The defined sub-action list applies only to the sampled packets on the sample 
flow path;
on the normal path, all packets continue with the original actions.
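
To make the two paths concrete, here is a hedged C sketch (action arrays only;
the port ids are invented) of a ratio=1 mirroring rule equivalent to the
testpmd example later in this thread: the sub-action list steers the duplicated
copies to port 1, while the original action list keeps forwarding every packet
to port 2.

#include <rte_flow.h>

/* Sub-action list: applied only to the duplicated (sampled) copies. */
static const struct rte_flow_action_port_id mirror_port = { .id = 1 };
static const struct rte_flow_action mirror_actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_PORT_ID, .conf = &mirror_port },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};

/* ratio = 1: every matched packet is duplicated, i.e. full mirroring. */
static const struct rte_flow_action_sample mirror_sample = {
	.ratio = 1,
	.actions = mirror_actions,
};

/* Original action list: the normal path that every packet keeps following. */
static const struct rte_flow_action_port_id orig_port = { .id = 2 };
static const struct rte_flow_action flow_actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_SAMPLE, .conf = &mirror_sample },
	{ .type = RTE_FLOW_ACTION_TYPE_PORT_ID, .conf = &orig_port },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};
/* flow_actions would be passed to rte_flow_create() on a transfer flow
 * (attr.transfer = 1) together with the desired pattern.
 */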


Re: [dpdk-dev] [PATCH v6] app/testpmd: fix testpmd packets dump overlapping

2021-01-08 Thread Jiawei(Jonny) Wang



> -Original Message-
> From: Stephen Hemminger 
> Sent: Wednesday, January 6, 2021 11:55 PM
> To: Jiawei(Jonny) Wang 
> Cc: ferruh.yi...@intel.com; wenzhuo...@intel.com; beilei.x...@intel.com;
> bernard.iremon...@intel.com; Ori Kam ; Slava
> Ovsiienko ; NBU-Contact-Thomas Monjalon
> ; Raslan Darawsheh ;
> dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6] app/testpmd: fix testpmd packets dump
> overlapping
> 
> On Wed,  6 Jan 2021 16:13:37 +0200
> Jiawei Wang  wrote:
> 
> > +print_ether_addr(const char *what, const struct rte_ether_addr
> *eth_addr,
> > +char print_buf[], int buf_size, int *cur_len)
> 
> Use size_t instead of int for sizes?
> The length can never be negative?

The buf_size and cur_len are always non-negative: buf_size is the total size 
of the string buffer (8K by default),
and cur_len is the current offset at which the next dump output is written, 
starting from 0.

Thanks. 


Re: [dpdk-dev] [PATCH v6] app/testpmd: fix testpmd packets dump overlapping

2021-01-08 Thread Jiawei(Jonny) Wang



> -Original Message-
> From: Stephen Hemminger 
> Sent: Wednesday, January 6, 2021 11:56 PM
> To: Jiawei(Jonny) Wang 
> Cc: ferruh.yi...@intel.com; wenzhuo...@intel.com; beilei.x...@intel.com;
> bernard.iremon...@intel.com; Ori Kam ; Slava
> Ovsiienko ; NBU-Contact-Thomas Monjalon
> ; Raslan Darawsheh ;
> dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v6] app/testpmd: fix testpmd packets dump
> overlapping
> 
> On Wed,  6 Jan 2021 16:13:37 +0200
> Jiawei Wang  wrote:
> 
> > +   int buf_size = MAX_STRING_LEN;
> > +   char print_buf[buf_size];
> 
> Don't use dynamic sized array if you don't have to.
> Various static checkers can see look for out of bounds access to fixed size
> array.

OK, I will change it to:
char print_buf[MAX_STRING_LEN];

Thanks.



Re: [dpdk-dev] [PATCH] app/testpmd: fix testpmd packets dump overlapping

2020-11-13 Thread Jiawei(Jonny) Wang
Hi Ferruh, 

Thanks for your comments.

> -Original Message-
> From: Ferruh Yigit 
> Sent: Friday, November 13, 2020 12:45 AM
> To: Jiawei(Jonny) Wang ; wenzhuo...@intel.com;
> beilei.x...@intel.com; bernard.iremon...@intel.com; Ori Kam
> ; NBU-Contact-Thomas Monjalon
> ; Raslan Darawsheh 
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] app/testpmd: fix testpmd packets dump
> overlapping
> 
> On 11/12/2020 8:36 AM, Jiawei Wang wrote:
> > When testpmd enabled the verbosity for the received packets, if two
> > packets was received at the same time, for example, sampling packet
> > and normal packet, the dump output of these packets may be overlapping
> > due to multiple core handled the multiple queues simultaneously.
> >
> 
> Hi Jiawei,
> 
> Is the problem observer only when having sampling? Do you observe the
> problem when multiple cores Rx?
We saw this problem happen easily with sampling.
And yes, I can also see the overlapping with multiple-core Rx when the traffic 
generator runs testpmd with --txonly-multi-flow.
> 
> > The patch uses one string buffer that collects all the packet dump
> > output into this buffer and then printout it at last, that guarantee
> > to printout separately the dump output per packet.
> >
> > Fixes: d862c45 ("app/testpmd: move dumping packets to a separate
> > function")
> >
> > Signed-off-by: Jiawei Wang 
> > ---
> >   app/test-pmd/util.c | 238
> ++--
> >   1 file changed, 177 insertions(+), 61 deletions(-)
> >
> > diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c index
> > 649bf8f..47b75b0 100644
> > --- a/app/test-pmd/util.c
> > +++ b/app/test-pmd/util.c
> > @@ -12,15 +12,20 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >
> >   #include "testpmd.h"
> >
> > -static inline void
> > -print_ether_addr(const char *what, const struct rte_ether_addr
> > *eth_addr)
> > +#define MAX_STRING_LEN 8192
> > +
> > +static inline int
> > +print_ether_addr(const char *what, const struct rte_ether_addr
> *eth_addr,
> > +char print_buf[], int buf_size)
> >   {
> > char buf[RTE_ETHER_ADDR_FMT_SIZE];
> > +
> > rte_ether_format_addr(buf, RTE_ETHER_ADDR_FMT_SIZE,
> eth_addr);
> > -   printf("%s%s", what, buf);
> > +   return snprintf(print_buf, buf_size, "%s%s", what, buf);
> >   }
> >
> >   static inline bool
> > @@ -74,13 +79,18 @@
> > uint32_t vx_vni;
> > const char *reason;
> > int dynf_index;
> > +   int buf_size = MAX_STRING_LEN * nb_pkts;
> > +   char print_buf[buf_size];
> 
> This is a large value to allocate from stack, specially with larger burst 
> size.
> Allocating from heap is an option but it has a cost.
> 
> So what do you think print per packet, instead of print per burst? This also
> prevents display latency of the packet log.
Good idea; then we only need to keep MAX_STRING_LEN of stack space, and each 
packet of the burst can reuse the buffer.
> 
> > +   int cur_len = 0;
> >
> > +   memset(print_buf, 0, sizeof(print_buf));
> > if (!nb_pkts)
> > return;
> > -   printf("port %u/queue %u: %s %u packets\n",
> > -   port_id, queue,
> > -  is_rx ? "received" : "sent",
> > -  (unsigned int) nb_pkts);
> > +   cur_len += snprintf(print_buf + cur_len, buf_size - cur_len,
> > +   "port %u/queue %u: %s %u packets\n",
> > +   port_id, queue,
> > +   is_rx ? "received" : "sent",
> > +   (unsigned int) nb_pkts);
> > for (i = 0; i < nb_pkts; i++) {
> > int ret;
> > struct rte_flow_error error;
> > @@ -93,95 +103,183 @@
> > is_encapsulation = RTE_ETH_IS_TUNNEL_PKT(packet_type);
> > ret = rte_flow_get_restore_info(port_id, mb, &info, &error);
> > if (!ret) {
> > -   printf("restore info:");
> > +   cur_len += snprintf(print_buf + cur_len,
> > +   buf_size - cur_len,
> > +   "restore info:");
> 
> This is not safe.
> 'snprintf' returns size of the required buffer size, not written chars, this 
> can
> make "buf_size - cur_len" a negative value at some point, and since 'size'
> type is unsigned, the negative value will wrap to a very large number and
> 'snprintf' will write past the end of the buffer.

Re: [dpdk-dev] [PATCH v2] app/testpmd: fix testpmd packets dump overlapping

2020-11-18 Thread Jiawei(Jonny) Wang
Hi Ferruh,

> -Original Message-
> From: Ferruh Yigit 
> Sent: Tuesday, November 17, 2020 8:07 PM
> To: Jiawei(Jonny) Wang ; wenzhuo...@intel.com;
> beilei.x...@intel.com; bernard.iremon...@intel.com; Ori Kam
> ; NBU-Contact-Thomas Monjalon
> ; Raslan Darawsheh 
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] app/testpmd: fix testpmd packets dump
> overlapping
> 
> On 11/14/2020 12:01 PM, Jiawei Wang wrote:
> > When testpmd enabled the verbosity for the received packets, if two
> > packets was received at the same time, for example, sampling packet
> > and normal packet, the dump output of these packets may be overlapping
> > due to multiple core handled the multiple queues simultaneously.
> >
> > The patch uses one string buffer that collects all the packet dump
> > output into this buffer and then printout it at last, that guarantee
> > to printout separately the dump output per packet.
> >
> > Fixes: d862c45 ("app/testpmd: move dumping packets to a separate
> > function")
> >
> > Signed-off-by: Jiawei Wang 
> > ---
> > v2:
> > * Print dump output of per packet instead of per burst.
> > * Add the checking for return value of 'snprintf' and exit if required size
> exceed the print buffer size.
> > * Update the log messages.
> > ---
> >   app/test-pmd/util.c | 378
> 
> >   1 file changed, 295 insertions(+), 83 deletions(-)
> >
> > diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c index
> > 649bf8f..ffae590 100644
> > --- a/app/test-pmd/util.c
> > +++ b/app/test-pmd/util.c
> > @@ -12,15 +12,20 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >
> >   #include "testpmd.h"
> >
> > -static inline void
> > -print_ether_addr(const char *what, const struct rte_ether_addr
> > *eth_addr)
> > +#define MAX_STRING_LEN 8192
> 
> Isn't '8192' too large for a single packet, what do you think for '2048'?
> 
2K is OK for the normal case, but consider the case where mbuf dynamic flag 
names are registered:
the total maximum length can then reach 64 * RTE_MBUF_DYN_NAMESIZE = 4096
 // char dynf_names[64][RTE_MBUF_DYN_NAMESIZE];
So 2048 is not enough if all dynf_names are configured with the maximum name 
length.

How about 5K? I think that should be enough for now.
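
Just to spell out the arithmetic behind that choice (a sketch, assuming
RTE_MBUF_DYN_NAMESIZE is 64 as defined in rte_mbuf_dyn.h, together with the
64-entry dynf_names table quoted above):

#include <rte_mbuf_dyn.h>

/* Worst case for the dynamic-flag names alone:
 *   64 slots * RTE_MBUF_DYN_NAMESIZE (64 bytes) = 4096 bytes,
 * so a 2048-byte buffer cannot hold even that part of the dump, while
 * 8192 (or ~5K) leaves headroom for addresses, offloads, tunnel info, etc.
 */
#define MAX_STRING_LEN 8192
_Static_assert(MAX_STRING_LEN > 64 * RTE_MBUF_DYN_NAMESIZE,
	       "dump buffer must at least fit all dynamic flag names");
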
> <...>
> 
> > @@ -93,95 +103,269 @@
> > is_encapsulation = RTE_ETH_IS_TUNNEL_PKT(packet_type);
> > ret = rte_flow_get_restore_info(port_id, mb, &info, &error);
> > if (!ret) {
> > -   printf("restore info:");
> > +   cur_len += snprintf(print_buf + cur_len,
> > +   buf_size - cur_len,
> > +   "restore info:");
> 
> What do you think having a macro like below to simplfy the code:
> 
>   #define FOO(buf, buf_size, cur_len, ...) \
>   do { \
>  if (cur_len >= buf_size) break; \
>  cur_len += snprintf(buf + cur_len, buf_size - cur_len, __VA_ARGS__); 
> \
>   } while (0)
> 
> It can be used as:
>   FOO(print_buf, buf_size, cur_len, "restore info:");
> 
Agree, I will move this common code into a macro.
I will use "MKDUMPSTR" as the macro name, meaning "make a dump string 
buffer".
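
For reference, a sketch of what that macro could look like — essentially the
reviewer's suggestion above under a new name, not necessarily the exact final
code:

#include <stdio.h>

/* Append formatted text to buf at offset cur_len, silently stopping once
 * the buffer is full.
 */
#define MKDUMPSTR(buf, buf_size, cur_len, ...) \
do { \
	if ((cur_len) >= (buf_size)) \
		break; \
	(cur_len) += snprintf((buf) + (cur_len), (buf_size) - (cur_len), \
			      __VA_ARGS__); \
} while (0)

/* e.g. inside dump_pkt_burst():
 *	MKDUMPSTR(print_buf, buf_size, cur_len, "restore info:");
 */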

> > +   if (cur_len >= buf_size)
> > +   goto error;
> 
> This 'goto' goes out of the loop, not sure about breking the loop and not
> processing all packets if buffer is out of space for one of the packets.
> Anyway if you switch to macro above, the 'goto' is removed completely.
> 
Using 'break' only breaks out of the do{}while and continues with the 
following processing, but nothing more is written into print_buf since the 
macro check breaks first.
Or should we use 'goto label' instead of 'break' in the macro?
> <...>
> 
> > +   if (cur_len >= buf_size)
> > +   goto error;
> > +   TESTPMD_LOG(INFO, "%s", print_buf);
> 
> Using 'TESTPMD_LOG' appends "testpmd: " at the beggining of the each line,
> for this case it is noise I think, what do you think turning back to printf() 
> as
> done originally?
> 
OK, I will use printf.
> > +   cur_len = 0;
> > }
> > +   return;
> > +error:
> > +   TESTPMD_LOG(INFO, "%s the output was truncated ...\n", print_buf);
> 
> What do you think having something shorter to notify the truncation, I think
> some special chars can work better, something like "...", "@@", "-->", "???" ?
> 
OK, I will use simple chars for the truncated case, like below:
if (cur_len >= buf_size)
printf("%s ...\n", print_buf);

> >   }
> >
> >   uint16_t
> >

Thanks for your comments; I will make the changes and send a v3 patch.

Thanks.
Jonny


Re: [dpdk-dev] [PATCH v3] app/testpmd: fix testpmd packets dump overlapping

2020-11-22 Thread Jiawei(Jonny) Wang
Hi Ferruh,

> -Original Message-
> From: Ferruh Yigit 
> Sent: Saturday, November 21, 2020 1:50 AM
> To: Jiawei(Jonny) Wang ; wenzhuo...@intel.com;
> beilei.x...@intel.com; bernard.iremon...@intel.com; Ori Kam
> ; Slava Ovsiienko ; NBU-
> Contact-Thomas Monjalon ; Raslan Darawsheh
> 
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v3] app/testpmd: fix testpmd packets dump overlapping
> 
> On 11/20/2020 5:35 PM, Jiawei Wang wrote:
> > When testpmd enabled the verbosity for the received packets, if two
> > packets were received at the same time, for example, sampling packet
> > and normal packet, the dump output of these packets may be overlapping
> > due to multiple core handling the multiple queues simultaneously.
> >
> > The patch uses one string buffer that collects all the packet dump
> > output into this buffer and then printouts it at last, that guarantees
> > to printout separately the dump output per packet.
> >
> > Fixes: d862c45 ("app/testpmd: move dumping packets to a separate
> > function")
> >
> > Signed-off-by: Jiawei Wang 
> 
> <...>
> 
> > @@ -74,13 +85,16 @@
> > uint32_t vx_vni;
> > const char *reason;
> > int dynf_index;
> > +   int buf_size = MAX_STRING_LEN;
> > +   char print_buf[buf_size];
> > +   int cur_len = 0;
> >
> > +   memset(print_buf, 0, sizeof(print_buf));
> 
> Should 'print_buf' cleaned per each packet below, if not can we drop 'memset'
> completely?
> 
Since 'snprintf' always appends a terminating null character ('\0') after the 
written string,
and in the code the following characters are appended starting from the 
previous null character, only one
'\0' is left after the last 'snprintf' call:
snprintf(buf + cur_len, buf_size - cur_len ..
At the end, 'printf("%s")' prints the string buffer up to the terminating null 
character ('\0'),
so it's OK even without a 'memset' call to clean 'print_buf' for each packet.
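
A tiny standalone illustration of that point (not testpmd code): the buffer is
never zeroed, yet every snprintf() null-terminates what it wrote, so the final
printf() stops exactly at the end of the assembled string.

#include <stdio.h>

int
main(void)
{
	char buf[64];	/* intentionally not cleared with memset */
	int cur = 0;

	cur += snprintf(buf + cur, sizeof(buf) - cur, "port %u", 0u);
	cur += snprintf(buf + cur, sizeof(buf) - cur, "/queue %u: ", 3u);
	cur += snprintf(buf + cur, sizeof(buf) - cur, "%u packets", 1u);

	printf("%s\n", buf);	/* prints: port 0/queue 3: 1 packets */
	return 0;
}
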
> <...>
> 
> > +   if (cur_len >= buf_size) {
> > +   printf("%s ...\n", print_buf);
> > +   break;
> 
> Why break here? Wouldn't just append some chars to indicate trancation and
> continue be OK?
> 
> 
Yes, the 'break' is not needed here.



Re: [dpdk-dev] [PATCH v3] app/testpmd: fix testpmd packets dump overlapping

2020-11-23 Thread Jiawei(Jonny) Wang
Hi Ferruh,

> -Original Message-
> From: Ferruh Yigit 
> Sent: Monday, November 23, 2020 6:23 PM
> To: Jiawei(Jonny) Wang ; wenzhuo...@intel.com;
> beilei.x...@intel.com; bernard.iremon...@intel.com; Ori Kam
> ; Slava Ovsiienko ; NBU-
> Contact-Thomas Monjalon ; Raslan Darawsheh
> 
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v3] app/testpmd: fix testpmd packets dump overlapping
> 
> On 11/23/2020 6:13 AM, Jiawei(Jonny) Wang wrote:
> > Hi Ferruh,
> >
> >> -Original Message-
> >> From: Ferruh Yigit 
> >> Sent: Saturday, November 21, 2020 1:50 AM
> >> To: Jiawei(Jonny) Wang ; wenzhuo...@intel.com;
> >> beilei.x...@intel.com; bernard.iremon...@intel.com; Ori Kam
> >> ; Slava Ovsiienko ; NBU-
> >> Contact-Thomas Monjalon ; Raslan Darawsheh
> >> 
> >> Cc: dev@dpdk.org
> >> Subject: Re: [PATCH v3] app/testpmd: fix testpmd packets dump
> >> overlapping
> >>
> >> On 11/20/2020 5:35 PM, Jiawei Wang wrote:
> >>> When testpmd enabled the verbosity for the received packets, if two
> >>> packets were received at the same time, for example, sampling packet
> >>> and normal packet, the dump output of these packets may be
> >>> overlapping due to multiple core handling the multiple queues
> simultaneously.
> >>>
> >>> The patch uses one string buffer that collects all the packet dump
> >>> output into this buffer and then printouts it at last, that
> >>> guarantees to printout separately the dump output per packet.
> >>>
> >>> Fixes: d862c45 ("app/testpmd: move dumping packets to a separate
> >>> function")
> >>>
> >>> Signed-off-by: Jiawei Wang 
> >>
> >> <...>
> >>
> >>> @@ -74,13 +85,16 @@
> >>>   uint32_t vx_vni;
> >>>   const char *reason;
> >>>   int dynf_index;
> >>> + int buf_size = MAX_STRING_LEN;
> >>> + char print_buf[buf_size];
> >>> + int cur_len = 0;
> >>>
> >>> + memset(print_buf, 0, sizeof(print_buf));
> >>
> >> Should 'print_buf' cleaned per each packet below, if not can we drop
> 'memset'
> >> completely?
> >>
> > Since the 'snprintf' always append the a terminating null
> > character('\0') after the written string, and in the code the
> > following character be appended start from previous null character, so only
> one '\0' will be appended in the last 'snprintf' calls,
> > snprintf(buf + cur_len, buf_size - cur_len ..
> > At the last 'printf("%s") will print the string buffer up to a terminating 
> > null
> character ('\0').
> > so it's ok even not 'memset' calls to clean 'print_buf' per each packets.
> 
> agree, so why not drop the memset completely?
Yes, the memset is not needed before the 'snprintf' calls; I removed it in the 
latest patch: https://patchwork.dpdk.org/patch/84465/
Thanks.



Re: [dpdk-dev] [PATCH v6 11/12] app/testpmd: add port and encap support for sample action

2020-09-22 Thread Jiawei(Jonny) Wang
Yes, nice catch.

I will fix it.

Thanks.
B.R.
Jonny

> -Original Message-
> From: Ajit Khaparde 
> Sent: Tuesday, September 22, 2020 6:28 AM
> To: Jiawei(Jonny) Wang 
> Cc: Ori Kam ; Slava Ovsiienko ;
> Matan Azrad ; NBU-Contact-Thomas Monjalon
> ; Ferruh Yigit ; Marko
> Kovacevic ; Andrew Rybchenko
> ; dpdk-dev ; Raslan
> Darawsheh ; ian.sto...@intel.com; f...@redhat.com;
> Asaf Penso 
> Subject: Re: [dpdk-dev] [PATCH v6 11/12] app/testpmd: add port and encap
> support for sample action
> 
> On Tue, Sep 8, 2020 at 11:50 PM Jiawei Wang  wrote:
> >
> > Use sample action with ratio is 1 for mirroring flow, add supports to
> > set the different port or encap action for mirrored packets.
> >
> > The example of test-pmd command:
> >
> > 1. set sample_actions 1 port_id id 1 / end
> >flow create 0 ... pattern eth / end actions
> > sample ratio 1 index 1 / port_id id 2...
> > The flow will result in all the matched ingress packets will be sent
> > to port 2, and also mirrored the packets and sent to port 2.
> 
>   ^
> 
> You probably meant "and also mirrored the packets and sent to port 1"?
> 
> >
> > 2. set raw_encap 0 eth src.../ ipv4.../...
> >set raw_encap 1 eth src.../ ipv4.../...
> >set sample_actions 2 raw_encap index 0 / port_id id 0 / end
> >flow create 0 ... pattern eth / end actions
> > sample ratio 1 index 2 / raw_encap index 1 / port_id id 0...
> > The flow will result in all the matched egress packets will be
> > encapsulated and sent to wire, and also mirrored the packets and with
> > the different encapsulated data and sent to wire.
> >
> > Signed-off-by: Jiawei Wang 
> > ---
> >  app/test-pmd/cmdline_flow.c | 16 
> >  1 file changed, 16 insertions(+)
> >
> > diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> > index 27fa294..1860657 100644
> > --- a/app/test-pmd/cmdline_flow.c
> > +++ b/app/test-pmd/cmdline_flow.c
> > @@ -514,6 +514,8 @@ struct raw_sample_conf {  struct
> > rte_flow_action_mark sample_mark[RAW_SAMPLE_CONFS_MAX_NUM];
> >  struct rte_flow_action_queue
> sample_queue[RAW_SAMPLE_CONFS_MAX_NUM];
> >  struct rte_flow_action_count
> sample_count[RAW_SAMPLE_CONFS_MAX_NUM];
> > +struct rte_flow_action_port_id
> > +sample_port_id[RAW_SAMPLE_CONFS_MAX_NUM];
> > +struct rte_flow_action_raw_encap
> > +sample_encap[RAW_SAMPLE_CONFS_MAX_NUM];
> >
> >  /** Maximum number of subsequent tokens and arguments on the stack.
> > */  #define CTX_STACK_SIZE 16 @@ -1456,6 +1458,8 @@ struct
> > parse_action_priv {
> > ACTION_QUEUE,
> > ACTION_MARK,
> > ACTION_COUNT,
> > +   ACTION_PORT_ID,
> > +   ACTION_RAW_ENCAP,
> > ACTION_NEXT,
> > ZERO,
> >  };
> > @@ -7009,6 +7013,18 @@ static int comp_set_sample_index(struct
> context *, const struct token *,
> > (const void *)action->conf, size);
> > action->conf = &sample_queue[idx];
> > break;
> > +   case RTE_FLOW_ACTION_TYPE_RAW_ENCAP:
> > +   size = sizeof(struct rte_flow_action_raw_encap);
> > +   rte_memcpy(&sample_encap[idx],
> > +   (const void *)action->conf, size);
> > +   action->conf = &sample_encap[idx];
> > +   break;
> > +   case RTE_FLOW_ACTION_TYPE_PORT_ID:
> > +   size = sizeof(struct rte_flow_action_port_id);
> > +   rte_memcpy(&sample_port_id[idx],
> > +   (const void *)action->conf, size);
> > +   action->conf = &sample_port_id[idx];
> > +   break;
> > default:
> > printf("Error - Not supported action\n");
> > return;
> > --
> > 1.8.3.1
> >


Re: [dpdk-dev] Duplicating traffic with RTE Flow

2020-09-22 Thread Jiawei(Jonny) Wang
Hi Jan,

Sorry for the late response. Could you check the latest patches below that 
support flow-based traffic sampling? (based on: net/enic: support VXLAN decap 
action combined with VLAN pop)
https://patchwork.dpdk.org/project/dpdk/list/?series=12410

" The solution introduces a new action that will sample the incoming
traffic and send a duplicated traffic with the specified ratio to the
application, while the original packet will continue to the target
destination."
And,
 set sample_actions 1 port_id id 1 / end
   flow create 0 ... pattern eth / end actions
sample ratio 1 index 1 / port_id id 2...
The flow will result in all the matched ingress packets being sent to
port 2, and also mirrored and sent to port 1.

Thanks.
B.R.

Jonny

> -Original Message-
> From: Jan Viktorin 
> Sent: Tuesday, September 22, 2020 4:04 AM
> To: Asaf Penso 
> Cc: dev@dpdk.org; Ori Kam ; Jiawei(Jonny) Wang
> ; Slava Ovsiienko 
> Subject: Re: [dpdk-dev] Duplicating traffic with RTE Flow
> 
> On Fri, 18 Sep 2020 14:23:42 +
> Asaf Penso  wrote:
> 
> > Hello Jan,
> >
> > You can have a look in series [1] where we propose to add APIs to
> DPDK20.11 for both mirroring and sampling for packets, with additional
> actions of the different traffic.
> >
> > [1]
> >
> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpatch
> >
> es.dpdk.org%2Fproject%2Fdpdk%2Flist%2F%3Fseries%3D12045&data=
> 02%7C
> >
> 01%7Cjiaweiw%40nvidia.com%7C8c9585855f9640f37ae608d85e698dbb%7C43
> 083d1
> >
> 5727340c1b7db39efd9ccc17a%7C0%7C1%7C637363154745490399&sdata
> =mdG51
> >
> UgntQvMjs%2BPpRozwt2dtAcdWR8j9MXBtZ3%2Bl8k%3D&reserved=0
> 
> Thanks! Can you please recommend me a base where I can apply this series?
> For current main (dc18be1d8) I got:
> 
> error: patch failed: drivers/net/mlx5/mlx5_flow_dv.c:9537
> error: drivers/net/mlx5/mlx5_flow_dv.c: patch does not apply
> error: patch failed: drivers/net/mlx5/mlx5_flow_dv.c:80
> error: drivers/net/mlx5/mlx5_flow_dv.c: patch does not apply
> error: patch failed: drivers/net/mlx5/mlx5_flow_dv.c:9007
> error: drivers/net/mlx5/mlx5_flow_dv.c: patch does not apply
> 
> Jan
> 
> >
> > Regards,
> > Asaf Penso
> >
> > >-Original Message-
> > >From: dev  On Behalf Of Jan Viktorin
> > >Sent: Friday, September 18, 2020 3:56 PM
> > >To: dev@dpdk.org
> > >Subject: [dpdk-dev] Duplicating traffic with RTE Flow
> > >
> > >Hello all,
> > >
> > >we are looking for a way to duplicate ingress traffic in hardware.
> > >
> > >There is an example in [1] suggesting to insert two fate actions into
> > >the RTE Flow actions array like:
> > >
> > >  flow create 0 ingress pattern end \
> > >  actions queue index 0 / void / queue index 1 / end
> > >
> > >But our experience is that PMDs reject two fate actions (tried with
> > >mlx5). Another similar approach would be to deliver every single
> > >packet into two virtual
> > >functions:
> > >
> > >  flow create 0 ingress pattern end \
> > > actions vf index 0 / vf index 1 / end
> > >
> > >Third possibility was to use passthru:
> > >
> > >  flow create 0 ingress pattern end \
> > >  actions passthru / vf index 0 / end  flow create 0 ingress
> > > pattern end \
> > >  actions vf index 1 / end
> > >
> > >Again, tried on mlx5 and it does not support the passthru.
> > >
> > >Last idea was to use isolate with passthru (to deliver both to DPDK
> > >application and to the kernel) but again there was no support on mlx5 for
> passthru...
> > >
> > >  flow isolate 0 true
> > >  flow create 0 ingress pattern end actions passthru / rss end / end
> > >
> > >Is there any other possibility or PMD+NIC that is known to solve such
> issue?
> > >
> > >Thanks
> > >Jan Viktorin
> > >
> > >[1]
> > >https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fd
> oc
> > >.dpdk
> > >.org%2Fguides%2Fprog_guide%2Frte_flow.html%23table-rte-flow-
> redirect-
> > >queue-5-
> > >3&data=02%7C01%7Casafp%40nvidia.com%7C1a46005bec5245e729e
> 708d
> > >85bd24caf%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637360
> 3060
> > >73519816&sdata=EOF%2Fz62crvBZK8rwzwKIWxj5cVlfPVnU3FLmcL9X
> 2w0%3
> > >D&reserved=0



Re: [dpdk-dev] Duplicating traffic with RTE Flow

2020-09-26 Thread Jiawei(Jonny) Wang


> -Original Message-
> From: Jan Viktorin 
> Sent: Wednesday, September 23, 2020 4:30 PM
> To: Jiawei(Jonny) Wang 
> Cc: Asaf Penso ; dev@dpdk.org; Ori Kam
> ; Slava Ovsiienko 
> Subject: Re: [dpdk-dev] Duplicating traffic with RTE Flow
> 
> On Wed, 23 Sep 2020 02:28:03 +
> "Jiawei(Jonny) Wang"  wrote:
> 
> > Hi Jan,
> >
> > Sorry for late response, Could you check the below latest patches that
> > support flow-based traffic sampling? (based on: net/enic:
> > support VXLAN decap action combined with VLAN pop)
> >
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatc
> >
> hwork.dpdk.org%2Fproject%2Fdpdk%2Flist%2F%3Fseries%3D12410&da
> ta=02
> > %7C01%7Cjiaweiw%40nvidia.com%7Cf5fad57a2357485f8fe308d85f9adcca%
> 7C4308
> >
> 3d15727340c1b7db39efd9ccc17a%7C0%7C1%7C637364466050833489&sd
> ata=1l
> > rhsIVvHpiTSA7c4k6ceMnQQsDRs2UtgWnvRomTS7s%3D&reserved=0
> >
> > " The solution introduces a new action that will sample the incoming
> > traffic and send a duplicated traffic with the specified ratio to the
> > application, while the original packet will continue to the target
> > destination."
> > And,
> >  set sample_actions 1 port_id id 1 / end
> >flow create 0 ... pattern eth / end actions
> > sample ratio 1 index 1 / port_id id 2...
> > The flow will result in all the matched ingress packets will be sent
> > to port 2, and also mirrored the packets and sent to port 1.
> 
> Hi,
> 
> excuse me, but what am I doing wrong?
> 
> $ git log -1 --oneline
> a4ab862 net/enic: support VXLAN decap action combined with VLAN pop
> 
> $ curl
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatc
> hwork.dpdk.org%2Fseries%2F12410%2Fmbox%2F&data=02%7C01%7Cji
> aweiw%40nvidia.com%7Cf5fad57a2357485f8fe308d85f9adcca%7C43083d1572
> 7340c1b7db39efd9ccc17a%7C0%7C1%7C637364466050833489&sdata=ZD
> glETAx9A9C3iaXFPhLUb2vtUTGE7R8aK5ngJF7lko%3D&reserved=0 >
> sample-action-rte-flow.patch
> 
> $ git apply sample-action-rte-flow.patch
> error: patch failed: doc/guides/rel_notes/release_20_11.rst:62
> error: doc/guides/rel_notes/release_20_11.rst: patch does not apply
> 
> Or...
> 
> $ git am -3 sample-action-rte-flow.patch
> Applying: ethdev: introduce sample action for rte flow
> fatal: sha1 information is lacking or useless
> (doc/guides/prog_guide/rte_flow.rst).
> Repository lacks necessary blobs to fall back on 3-way merge.
> Cannot fall back to three-way merge.
> Patch failed at 0001 ethdev: introduce sample action for rte flow
> 
> Jan
> 
Hi Jan,

Please try the v8 patch: 
https://patchwork.dpdk.org/project/dpdk/list/?series=12525

I rebased it on the latest main code and it should work for you.

Thanks.
Jonny
> >
> > Thanks.
> > B.R.
> >
> > Jonny
> >
> > > -Original Message-
> > > From: Jan Viktorin 
> > > Sent: Tuesday, September 22, 2020 4:04 AM
> > > To: Asaf Penso 
> > > Cc: dev@dpdk.org; Ori Kam ; Jiawei(Jonny) Wang
> > > ; Slava Ovsiienko 
> > > Subject: Re: [dpdk-dev] Duplicating traffic with RTE Flow
> > >
> > > On Fri, 18 Sep 2020 14:23:42 +
> > > Asaf Penso  wrote:
> > >
> > > > Hello Jan,
> > > >
> > > > You can have a look in series [1] where we propose to add APIs to
> > > >
> > > DPDK20.11 for both mirroring and sampling for packets, with
> > > additional actions of the different traffic.
> > > >
> > > > [1]
> > > >
> > >
> https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpat
> > > ch
> > >
> > > >
> > >
> es.dpdk.org%2Fproject%2Fdpdk%2Flist%2F%3Fseries%3D12045&data=
> > > 02%7C
> > > >
> > >
> 01%7Cjiaweiw%40nvidia.com%7C8c9585855f9640f37ae608d85e698dbb%7C43
> > > 083d1
> > > >
> > >
> 5727340c1b7db39efd9ccc17a%7C0%7C1%7C637363154745490399&sdata
> > > =mdG51
> > > >
> > >
> UgntQvMjs%2BPpRozwt2dtAcdWR8j9MXBtZ3%2Bl8k%3D&reserved=0
> > >
> > > Thanks! Can you please recommend me a base where I can apply this
> > > series? For current main (dc18be1d8) I got:
> > >
> > > error: patch failed: drivers/net/mlx5/mlx5_flow_dv.c:9537
> > > error: drivers/net/mlx5/mlx5_flow_dv.c: patch does not apply
> > > error: patch failed: drivers/net/mlx5/mlx5_flow_dv.c:80
> > > error: drivers/net/mlx5/mlx5_flow_dv.c: patch does not apply
> > > error: patch failed: d

Re: [dpdk-dev] [PATCH v9 0/3] support the flow-based traffic sampling

2020-10-09 Thread Jiawei(Jonny) Wang
For the MLX5 PMD patch set that implements flow-based sampling, please see: 
https://patchwork.dpdk.org/project/dpdk/list/?series=12829

Thanks.
Jonny

> -Original Message-
> From: dev  On Behalf Of Jiawei Wang
> Sent: Friday, October 9, 2020 9:46 PM
> To: Ori Kam ; Slava Ovsiienko ;
> Matan Azrad ; NBU-Contact-Thomas Monjalon
> ; ferruh.yi...@intel.com;
> marko.kovace...@intel.com; arybche...@solarflare.com
> Cc: dev@dpdk.org; Raslan Darawsheh ;
> ian.sto...@intel.com; f...@redhat.com; Asaf Penso 
> Subject: [dpdk-dev] [PATCH v9 0/3] support the flow-based traffic sampling
> 
> This patch set implement the flow-based traffic sampling.
> 
> The solution is introduced a new rte_flow action that will sample the
> incoming traffic and send a duplicated traffic with the specified ratio to the
> application, while the original packet will continue to the target 
> destination.
> 
> If the sample ratio value be set to 1, means that the packets would be
> completely mirrored. The sample packet can be assigned with different set of
> actions from the original packet.
> 
> 
> v9:
> * Rebase patches based on the latest code.
> * Separate the MLX5 PMD changes into another patches.
> 
> v8:
> * Rebase patches based on the latest code.
> * Update the offloads dependencies document for sample flow.
> * Update sample flow limitation document.
> 
> v7:
> * Removed change in [PATCH 12/12] net/mlx5: support the native port id
> actions for mirroring, should use sample action.
> * Update the PMD code to match the new rdma-core API for mirroring.
> * Optimize the sample flow split routine.
> * Update code changes and commit log based on the review.
> * Add E-Switch sample flow limitation document.
> 
> v6:
> * Update the function that restore vport through metadata register c0 for
> FDB sampler.
> * Add multiple destination support.
> * Support the remote mirroring with different encapsulation header.
> * Fix coverity error.
> 
> v5:
> * Add the release note.
> * Remove Make changes since it's deprecated.
> 
> v4:
> * Rebase.
> * Fix the coding style issue.
> 
> v3:
> * Remove 'const' of ratio field.
> * Update description and commit messages.
> 
> v2:
> * Rebase patches based on the latest code.
> * Update rte_flow and release documents.
> * Fix the compile error.
> * Removed unnecessary change in [PATCH 7/8] net/mlx5: update the
> metadata register c0 support since FDB will use 5-tuple to do match.
> * Update changes based on the comments.
> 
> Jiawei Wang (3):
>   ethdev: introduce sample action for rte flow
>   app/testpmd: add testpmd command for sample action
>   app/testpmd: add port and encap support for sample action
> 
>  app/test-pmd/cmdline_flow.c| 301
> -
>  doc/guides/prog_guide/rte_flow.rst |  25 +++
>  doc/guides/rel_notes/release_20_11.rst |   6 +
>  lib/librte_ethdev/rte_flow.c   |   1 +
>  lib/librte_ethdev/rte_flow.h   |  30 
>  5 files changed, 354 insertions(+), 9 deletions(-)
> 
> --
> 1.8.3.1