Re: [dpdk-dev] [RFC] some questions for speed_capa usage

2021-01-23 Thread oulijun




On 2021/1/18 19:23, Ferruh Yigit wrote:

On 1/18/2021 10:27 AM, oulijun wrote:

Hi,


The 'speed_capa' field is reported by the rte_eth_dev_info_get API. How 
should users use this field?


1) The driver reports only the capabilities supported by the NIC, and 
users only obtain the capabilities.
In this case, a speed bit in 'speed_capa' may not be supported by the 
current transmission medium,

for example, copper interfaces versus optical interface modules.

2) The field is used only to inform users of the speeds supported 
by the current transmission medium,
and users set a fixed speed or auto-negotiation through 
'link_speeds' according to the field.


Judging by the existing implementations across drivers, it seems 
that both of the above behaviors exist.


How should we report and use it?



Hi Lijun,

When the driver reports the capabilities supported by the NIC, we tend 
to mark this feature as partially supported.


The expectation is that the driver reports the capability for the current 
configuration: the PHY/FW/transmission medium, whatever it is.


The driver should return the currently supported values so that the 
application can select one, as you said.



Thank you for your answer. I see.

Regards,
ferruh



Re: [dpdk-dev] [dpdklab] RE: [dpdk-stable] [PATCH v4] mbuf: fix reset on mbuf free

2021-01-23 Thread Morten Brørup
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Lincoln Lavoie
> Sent: Thursday, January 21, 2021 5:35 PM
> To: Morten Brørup
> 
> Hi All,
> 
> Trying to follow the specific conversation.  It is correct, the lab
> does
> not list the specific throughput values achieved by the hardware, as
> that
> data can be sensitive to the hardware vendors, etc. The purpose of the
> lab
> is to check for degradations caused by patches, so the difference is
> really
> the important factor.  The comparison is against a prior run on the
> same
> hardware, via the DPDK main branch, so any delta should be caused by
> the
> specific patch changes (excluding statistical "wiggle").
> 

Thank you for clarifying, Lincoln.

This sounds like a perfect solution to meet that purpose.

I request that you change the output to show the relative difference (i.e. 
percentage above/below baseline), instead of the absolute difference (i.e. 
number of packets per second).

By showing a percentage, anyone reading the test report can understand if the 
difference is insignificant, or big enough to require further discussion before 
accepting the patch. Showing the difference in packets per second requires the 
reader of the test report to have prior knowledge about the expected 
performance.

> If the group would prefer, we could calculate additional references if
> desired (i.e. difference from the last official release, or a monthly
> run
> of the current, etc.).  We just need the community to define their
> needs,
> and we can add this to the development queue.
> 

I retract my suggestion about adding additional references. You clearly 
explained how your baselining works, and I think it fully serves the needs of a 
regression test.

So please just put this small change in your development queue: output the 
difference in percent, with a few decimal places, instead of the 
difference in packets per second.

For readability, performance drops should be shown as negative values, and 
increases as positive, e.g.:

Difference = (NewPPS - BaselinePPS) / BaselinePPS * 100.00 %.


If we want to compare performance against official releases, current or older, 
it should go elsewhere, not in the continuous testing environment. E.g. the 
release notes could include a report showing differences from the last few 
official releases. But that is a task for another day.


> Cheers,
> Lincoln
> 
> 
> On Thu, Jan 21, 2021 at 4:29 AM Morten Brørup
> 
> wrote:
> 
> > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ferruh Yigit
> > > Sent: Thursday, January 21, 2021 10:19 AM
> > >
> > > On 1/15/2021 6:39 PM, Ali Alnubani wrote:
> > > > Hi,
> > > > Adding Ferruh and Zhaoyan,
> > > >
> > > >> Ali,
> > > >>
> > > >> You reported some performance regression, did you confirm it?
> > > >> If I get no reply by monday, I'll proceed with this patch.
> > > >
> > > > Sure I'll confirm by Monday.
> > > >
> > > > Doesn't the regression also reproduce on the Lab's Intel servers?
> > > > Even though the check iol-intel-Performance isn't failing, I can
> see
> > > that the throughput differences from expected for this patch are
> less
> > > than those of another patch that was tested only 20 minutes
> earlier.
> > > Both patches were applied to the same tree:
> > > >
> > > > https://mails.dpdk.org/archives/test-report/2021-
> January/173927.html
> > > >> | 64 | 512 | 1.571   |
> > > >
> > > > https://mails.dpdk.org/archives/test-report/2021-
> January/173919.html
> > > >> | 64 | 512 | 2.698   |
> > > >
> > > > Assuming that pw86457 doesn't have an effect on this test, it
> looks
> > > to me that this patch caused a regression in Intel hardware as
> well.
> > > >
> > > > Can someone update the baseline's expected values for the Intel
> NICs
> > > and rerun the test on this patch?
> > > >
> > >
> > > Zhaoyan said that the baseline is calculated dynamically;
> > > what I understand is that the baseline is set based on the previous
> > > day's performance results, so it shouldn't require updating.
> >
> > That sounds smart!
> >
> > Perhaps another reference baseline could be added, for informational
> > purposes only:
> > Deviation from the performance of the last official release.
> >
> > >
> > > But cc'ed the lab for more details.
> >
> >
> 
> --
> *Lincoln Lavoie*
> Senior Engineer, Broadband Technologies
> 21 Madbury Rd., Ste. 100, Durham, NH 03824
> lylav...@iol.unh.edu
> https://www.iol.unh.edu
> +1-603-674-2755 (m)
> 



Re: [dpdk-dev] [PATCH v3 01/22] ethdev: fix MTU size exceeds max rx packet length

2021-01-23 Thread oulijun




On 2021/1/19 16:46, oulijun wrote:



On 2021/1/18 18:42, Ferruh Yigit wrote:

On 1/15/2021 10:44 AM, oulijun wrote:

Hi Steve
This is a very good job! But I have some question and suggestions.
Please check it.

On 2021/1/14 17:45, Steve Yang wrote:

Ethdev is using default Ethernet overhead to decide if provided
'max_rx_pkt_len' value is bigger than max (non jumbo) MTU value,
and limits it to MAX if it is.

Since the Ethernet overhead used by the application/driver differs from
the ethdev one, the check result is wrong.

If the driver is using Ethernet overhead bigger than the default one,
the provided 'max_rx_pkt_len' is trimmed down, and in the driver when
correct Ethernet overhead is used to convert back, the resulting MTU is
less than the intended one, causing some packets to be dropped.

Like,
app -> max_rx_pkt_len = 1500/*mtu*/ + 22/*overhead*/ = 1522
ethdev  -> 1522 > 1518/*MAX*/; max_rx_pkt_len = 1518
driver  -> MTU = 1518 - 22 = 1496
Packets with size 1497-1500 are dropped although the intention is to be able
to send/receive them.

The fix is to make ethdev use the correct Ethernet overhead for port,
instead of default one.

Fixes: 59d0ecdbf0e1 ("ethdev: MTU accessors")

Signed-off-by: Steve Yang 
---
  lib/librte_ethdev/rte_ethdev.c | 27 ++++++++++++++++++++++++---
  1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 17ddacc78d..19ca4c4512 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -1292,8 +1292,10 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 	struct rte_eth_dev *dev;
 	struct rte_eth_dev_info dev_info;
 	struct rte_eth_conf orig_conf;
+	uint16_t overhead_len;
 	int diag;
 	int ret;
+	uint16_t old_mtu;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
@@ -1319,10 +1321,20 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 	memcpy(&dev->data->dev_conf, dev_conf,
 	       sizeof(dev->data->dev_conf));
 
+	/* Backup mtu for rollback */
+	old_mtu = dev->data->mtu;
+
 	ret = rte_eth_dev_info_get(port_id, &dev_info);
 	if (ret != 0)
 		goto rollback;
 
+	/* Get the real Ethernet overhead length */
+	if (dev_info.max_mtu != UINT16_MAX &&
+	    dev_info.max_rx_pktlen > dev_info.max_mtu)
+		overhead_len = dev_info.max_rx_pktlen - dev_info.max_mtu;
+	else
+		overhead_len = RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN;
+
The Ethernet frame header length supported by each NIC may differ, which 
causes the difference in allowed packet size, and the drops, that you 
describe.
The difference in Ethernet frame header length affects the maximum 
Ethernet frame length, which is the boundary value

for enabling jumbo frames through 'max_rx_pkt_len'.
However, we need to do the same thing you do above to get the 
overhead_len every time, which will
cause a lot of duplicate code in the framework and apps. For example, 
parsing and processing of the '--max-pkt-len=xxx' parameter,
and "cmd_config_max_pkt_len_parsed" in testpmd, and here in the 
dev_configure API.

It's a little redundant and troublesome.

Maybe it is necessary for the driver to directly report the supported 
maximum Ethernet frame length through the rte_eth_dev_info_get API.

As following:
struct rte_eth_dev_info {
 
 /**
  * The maximum Ethernet frame length supported by each
  * driver varies with the Ethernet header length.
  */
 uint16_t eth_max_len;
 
}

int
rte_eth_dev_info_get(uint16_t port_id, struct rte_eth_dev_info *dev_info)

{
 xxx
 dev_info->min_mtu = RTE_ETHER_MIN_MTU;
 dev_info->max_mtu = UINT16_MAX;
 dev_info->eth_max_len = RTE_ETHER_MAX_LEN;
 xxx
}

And then:
xxx_dev_infos_get(struct rte_eth_dev *eth_dev, struct rte_eth_dev_info *info)

{
 xxx
 info->eth_max_len = XXX_ETHER_MAX_LEN;
 xxx
}
What do you think?


Hi Lijun,

The 'max_mtu' field was added in the past to solve this exact same 
problem; perhaps it would have been better to add something like 
'eth_max_len' at that time.


But as Steve mentioned, we are planning to switch to a more MTU-based 
configuration, which should remove the need for the field you mention.


 	/* If number of queues specified by application for both Rx and Tx is
 	 * zero, use driver preferred values. This cannot be done individually
 	 * as it is valid for either Tx or Rx (but not both) to be zero.
@@ -1410,11 +1422,18 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 			goto rollback;
 		}
 	} else {
-		if (dev_conf->rxmode.max_rx_pkt_len < RTE_ETHER_MIN_LEN ||
-		    dev_conf->rxmode.max_rx_pkt_len > RTE_ETHER_MAX_LEN)
+		uint16_t pktlen = dev_conf->rxmode.max_rx_pkt_len;
+
+		if (pktlen < RTE_ETHER_MIN_MTU + overhead_len ||
+		    pktlen > RTE_ETHER_MTU + overhead_len)
 			/* Use default

Re: [dpdk-dev] [PATCH v8 2/3] build: use Python pmdinfogen

2021-01-23 Thread Thomas Monjalon
22/01/2021 23:24, Dmitry Kozlyuk:
> On Fri, 22 Jan 2021 21:57:15 +0100, Thomas Monjalon wrote:
> > 22/01/2021 21:31, Dmitry Kozlyuk:
> > > On Wed, 20 Jan 2021 11:24:21 +0100, Thomas Monjalon wrote:  
> > > > 20/01/2021 08:23, Dmitry Kozlyuk:  
> > > > > On Wed, 20 Jan 2021 01:05:59 +0100, Thomas Monjalon wrote:
> > > > > > This is now the right timeframe to introduce this change
> > > > > > with the new Python module dependency.
> > > > > > Unfortunately, the ABI check is returning an issue:
> > > > > > 
> > > > > > 'const char mlx5_common_pci_pmd_info[62]' was changed
> > > > > > to 'const char mlx5_common_pci_pmd_info[60]' at 
> > > > > > rte_common_mlx5.pmd.c
> > > > > 
> > > > > Will investigate and fix ASAP.  
> > > 
> > > Now that I think of it: strings like this change every time new PCI IDs 
> > > are
> > > added to a PMD, but AFAIK adding PCI IDs is not considered an ABI 
> > > breakage,
> > > is it? One example is 28c9a7d7b48e ("net/mlx5: add ConnectX-6 Lx device 
> > > ID")
> > > added 2020-07-08, i.e. clearly outside of ABI change window.  
> > 
> > You're right.
> > 
> > > "xxx_pmd_info" changes are due to JSON formatting (new is more canonical),
> > > which can be worked around easily, if the above is wrong.  
> > 
> > If the new format is better, please keep it.
> > What we need is an exception for the pmdinfo symbols
> > in the file devtools/libabigail.abignore.
> > You can probably use a regex for these symbols.
> 
> This would allow real breakages to pass the ABI check; abidiff doesn't analyze
> variable content, and it's not easy to compare. Maybe later a script can be
> added that checks lines with RTE_DEVICE_IN in patches. There are at most 32 of
> 5494 relevant commits between 19.11 and 20.11, though.
> 
> To verify there are no meaningful changes I ensured empty diff between
> results of the following command for "main" and the branch:
> 
>   find build/drivers -name '*.so' -exec usertools/dpdk-pmdinfo.py

For now we cannot do such check as part of the ABI checker.
And we cannot merge this patch if the ABI check fails.
I think the only solution is to allow any change in the pmdinfo variables.
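Such an exception could be sketched as follows. This is a hypothetical fragment for devtools/libabigail.abignore using libabigail's suppression-file syntax; the exact regex is an assumption, matching generated "xxx_pmd_info" symbols by their common suffix:

```ini
; Hypothetical suppression: the pmdinfo strings change legitimately
; whenever PCI IDs are added or the JSON formatting changes, so ignore
; any size/content change in variables ending in "_pmd_info".
[suppress_variable]
        name_regexp = _pmd_info$
```

With such a rule in place, abidiff would no longer flag the array-size change reported above, while still checking all other exported variables.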