Re: [dpdk-dev] [PATCH 1/2] net/mlx5: fix MARK action in active tunnel offload
Hi,

> -----Original Message-----
> From: Gregory Etelson
> Sent: Wednesday, January 20, 2021 9:17 PM
> To: dev@dpdk.org
> Cc: Gregory Etelson; Matan Azrad; Raslan Darawsheh; Slava Ovsiienko;
> Shahaf Shuler
> Subject: [PATCH 1/2] net/mlx5: fix MARK action in active tunnel offload
>
> Tunnel offload mode allows application to restore partially offloaded
> tunneled packets to its original state.
> MLX5 PMD stores internal data required to restore partially offloaded
> packet in packet mark section. Therefore MLX5 PMD will not allow
> applications to use mark action if tunnel offload mode was activated.
> The restriction is applied both to regular and tunnel offload rules.
>
> The patch rejects application rules with mark action while tunnel
> offload is active.
>

Missing:
Fixes: 4ec6360de37d ("net/mlx5: implement tunnel offload")
Cc: sta...@dpdk.org

Both tags were added to both patches during integration.

> Signed-off-by: Gregory Etelson
> Acked-by: Viacheslav Ovsiienko
> ---

Series applied to next-net-mlx,

Kindest regards
Raslan Darawsheh
Re: [dpdk-dev] [PATCH] net/mlx5: fix wrong Flow Tag decompression
Hi,

> -----Original Message-----
> From: Alexander Kozyrev
> Sent: Thursday, January 14, 2021 11:32 PM
> To: dev@dpdk.org
> Cc: sta...@dpdk.org; Raslan Darawsheh; Slava Ovsiienko; Matan Azrad
> Subject: [PATCH] net/mlx5: fix wrong Flow Tag decompression
>
> Packets can get a wrong Flow Tag on x86 architecture with the Flow Tag
> compression format (rxq_cqe_comp_en=2) enabled inside the SSE Rx burst.
> The shuffle mask that extracts a Flow Tag from the pair of compressed
> CQEs is reversed. This leads to the wrong Flow Tag assignment.
> Correct the shuffle mask to get proper bytes for a Flow Tag from miniCQEs.
>
> Fixes: 54c2d46b160 ("net/mlx5: support flow tag and packet header miniCQEs")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Alexander Kozyrev
> ---

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh
[dpdk-dev] Re: [PATCH v1 1/3] test/ring: reduce iteration numbers to make test duration shorter
Hi, Konstantin

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: January 22, 2021 21:16
> To: Feifei Wang; Honnappa Nagarahalli; Olivier Matz; Gavin Hu
> Cc: dev@dpdk.org; nd; sta...@dpdk.org
> Subject: RE: [PATCH v1 1/3] test/ring: reduce iteration numbers to make test
> duration shorter
>
> >
> > When testing ring performance in the case that multiple lcores are
> > mapped to the same physical core, e.g. --lcores '(0-3)@10', it takes a
> > very long time to wait for the "enqueue_dequeue_bulk_helper" to
> > finish. This is because too much iteration numbers and extremely low
> > efficiency for enqueue and dequeue with this kind of core mapping.
> > Following are the test results to show the above phenomenon:
> >
> > x86-Intel(R) Xeon(R) Gold 6240:
> > $sudo ./app/test/dpdk-test --lcores '(0-1)@25'
> > Testing using two hyperthreads(bulk (size: 8):)
> > iter_shift:         3 5 7 9 11 13 *15 17 19 21 23
> > run time:           7s 7s 7s 8s 9s 16s 47s 170s 660s >0.5h >1h
> > legacy APIs: SP/SC: 37116 40525 40525 40209 40367 40407 40541 NoData NoData
> > legacy APIs: MP/MC: 56141150657 40526 40526 40526 40625 40585 NoData NoData
> >
> > aarch64-n1sdp:
> > $sudo ./app/test/dpdk-test --lcore '(0-1)@1'
> > Testing using two hyperthreads(bulk (size: 8):)
> > iter_shift:         3 5 7 9 11 13 *15 17 19 21 23
> > run time:           8s 8s 8s 9s 9s 14s 34s 111s 418s 25min >1h
> > legacy APIs: SP/SC: 0.4 0.2 0.1 488 488 488 488 488 489 489 NoData
> > legacy APIs: MP/MC: 0.4 0.3 0.2 488 488 488 488 490 489 489 NoData
> >
> > As the number of iterations increases, so does the time which is
> > required to run the program. Currently (iter_shift = 23), it will take
> > more than 1 hour to wait for the test to finish. To fix this, the
> > "iter_shift" should decrease and ensure enough iterations to keep the
> > test data stable. In order to achieve this, we also test with "-l" EAL
> > argument:
> >
> > x86-Intel(R) Xeon(R) Gold 6240:
> > $sudo ./app/test/dpdk-test -l 25-26
> > Testing using two NUMA nodes(bulk (size: 8):)
> > iter_shift:         3 5 7 9 11 13 *15 17 19 21 23
> > run time:           6s 6s 6s 6s 6s 6s 6s 7s 8s 11s 27s
> > legacy APIs: SP/SC: 4720132254 83 91 73 81 75 95
> > legacy APIs: MP/MC: 441818240 245270250249252 250 253
> >
> > aarch64-n1sdp:
> > $sudo ./app/test/dpdk-test -l 1-2
> > Testing using two physical cores(bulk (size: 8):)
> > iter_shift:         3 5 7 9 11 13 *15 17 19 21 23
> > run time:           8s 8s 8s 8s 8s 8s 8s 9s 9s 11s 23s
> > legacy APIs: SP/SC: 0.7 0.4 1.2 1.8 2.0 2.0 2.0 2.0 2.0 2.0 2.0
> > legacy APIs: MP/MC: 0.3 0.4 1.3 1.9 2.9 2.9 2.9 2.9 2.9 2.9 2.9
> >
> > According to above test data, when "iter_shift" is set as "15", the
> > test run time is reduced to less than 1 minute and the test result can
> > keep stable in x86 and aarch64 servers.
> >
> > Fixes: 1fa5d0099efc ("test/ring: add custom element size performance tests")
> > Cc: honnappa.nagaraha...@arm.com
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Feifei Wang
> > Reviewed-by: Honnappa Nagarahalli
> > Reviewed-by: Ruifeng Wang
> > ---
> >  app/test/test_ring_perf.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
> > index e63e25a86..fd82e2041 100644
> > --- a/app/test/test_ring_perf.c
> > +++ b/app/test/test_ring_perf.c
> > @@ -178,7 +178,7 @@ enqueue_dequeue_bulk_helper(const unsigned int flag, const int esize,
> >  	struct thread_params *p)
> >  {
> >  	int ret;
> > -	const unsigned int iter_shift = 23;
> > +	const unsigned int iter_shift = 15;
> >  	const unsigned int iterations = 1 << iter_shift;
> >  	struct rte_ring *r = p->r;
> >  	unsigned int bsize = p->size;
> > --
>
> I think it would be better to rework the test(s) to terminate after some
> timeout (30s or so), and report number of ops per timeout.
> Anyway, as a short term fix, I am ok with it.
> Acked-by: Konstantin Ananyev

Ok, thanks very much.

Best Regards
Feifei

> >
> > 2.17.1
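
For reference, a minimal sketch of what the change means in numbers, together with the timeout-based approach suggested in the review. The helper names (`iterations_for_shift`, `count_ops_for`, `dummy_op`) are hypothetical and are not part of test_ring_perf.c; `clock()` is used only to keep the sketch portable, and a real test would more likely read a cheaper cycle counter.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Loop count the test derives from the shift: 1 << iter_shift. */
static uint64_t
iterations_for_shift(unsigned int iter_shift)
{
	assert(iter_shift < 64);
	return UINT64_C(1) << iter_shift;
}

/*
 * Hypothetical time-bounded variant: call op() repeatedly until at
 * least budget_ticks (in clock() units) have elapsed, then return the
 * number of calls made. op() always runs at least once.
 */
static uint64_t
count_ops_for(clock_t budget_ticks, void (*op)(void))
{
	const clock_t start = clock();
	uint64_t ops = 0;

	do {
		op();
		ops++;
	} while (clock() - start < budget_ticks);
	return ops;
}

/* Placeholder for one enqueue/dequeue round. */
static void
dummy_op(void)
{
}
```

With these definitions, `iterations_for_shift(23) / iterations_for_shift(15)` is 256, so the patch runs 256 times fewer iterations per measurement. A call such as `count_ops_for(30 * CLOCKS_PER_SEC, dummy_op)` would correspond to the "30s or so" budget mentioned above.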
[dpdk-dev] [Bug 625] MLX5: Jumbo frames are being received as rx error
https://bugs.dpdk.org/show_bug.cgi?id=625

            Bug ID: 625
           Summary: MLX5: Jumbo frames are being received as rx error
           Product: DPDK
           Version: unspecified
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: critical
          Priority: Normal
         Component: testpmd
          Assignee: dev@dpdk.org
          Reporter: wis...@mellanox.com
  Target Milestone: ---

How to reproduce:

Start testpmd:

./dpdk-testpmd -n 4 -w :03:00.0,representor=[0,1],txq_inline=647,rx_vec_en=1,dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=3 -- --mbcache=512 -i --nb-cores=15 --rxq=5 --txq=5 --txd=1024 --rxd=1024 --burst=64 --enable-scatter --tx-offloads=0x8000 --forward-mode=rxonly

Then:

port stop all
port config all max-pkt-len 3072
port start all
start
port config mtu 0 3072
port config mtu 1 3072
port config mtu 2 3072
set verbose 1
start

From the TG side, change the MTU and send this packet:

ifconfig ens5 mtu 5000 up

packet = Ether(dst='00:16:3e:26:f8:28', src='00:16:3e:4b:62:07', type=2048)/IP(version=4, ihl=5, tos=28, len=2162, id=1, flags=0, frag=0, ttl=183, proto=0, chksum=9433, src='173.214.92.106', 
dst='183.162.20.179')/Raw(load=b'UnHRTCfPDLyiuxvZLANJOMISLXdwsAvnJNGuWrmVbvoSgrLOZZrerhPrPqVBQxCtWpuDpNfTJjudHBpynTiNzUjVLSWUJcOTOZVrQBrHZskfhWiGFqTTojwFAlsCcVkijAPnDxFmAuUMZnGzHixCaxZyZWLJtsLlipINwedoWZrUjOhpJxZKopphomYILQZleBtxyDVnNbYidrWTVWvXXSFgIiedDbjnKYYgqdzSPexWlsfylkitOzlRgKDAaCgMxoBawORJJmNHREwZjSRpIbpuZjCdxSnbwRSXxaDVRJtshJRPWoUcnxNvCqSXknClEXVchBcHJbPCUMyEnLpTRfQJLlYOsTnCeMruorXFtJHiXeApLdbKTHbLZFYKHpUtlVvYCviTZSKzYrZHMZIKvYyXtoAzcXtPathoCYSRBeJxctuLLoUeXeOYzzKZVNEGfSqCFBvKHAytcBEamnNpoHIbcruqNGvoulFLBchUgZCAeeGVFtcjEJTDsIyjjswXJnGAcNCHoxQxKIlTfwTcQplEgYFFtMlaQvhZMnTvazAnAvEnZaZxDiYzMmPoquPvlrRbqVvXWoNmYvrYKtXkMqmldstyoUeWcDcuFSRrmLEEkeEZPDkAEjyoyOzRgKCyStTzyziUgwCdUKbJypEUsjtGBZIecsJwRYCrNGVAqxwuWEicZkOTAQTFeLPvIZJvyxfyfXRqvYdaXfoHYTiEoOlbIzIpzmBNYAUVYUuyRSkmuJftMSoSnFejPvxkLkzVXJyrWdgFkhQDcPOzkxXuNFKENPgXQrqyaPsUJXMCqHGsosqIslXDOfQcBZTwdXYmsYWcYhTTtBsTUlabvheIwJNwNQGVcLNowOwBhtgcUTzByFcQHyZaZeZgGcVLjAjijwBrQJBwZrAsFBODEpKqrkfWxoOahMyWTnTnmjMAoLFSkmbxIDFnuSNQcIqnXlYuGyUPyByEzubTyydeQJQjoUjANWVsDuqjGLgCAxRGOjHpQZczlMnwUSbuBjqIwfHnoTAaOwnWkVoXFoTxJDornnZpXHIcgAvuiSWLpmKXcoitIVpuxfVNcrQTbgaQdYCQbCGwurDphLOwpfeChPRUWKgxEpmfvKNdmsoEOmaIBuoeTySKGQaXdODITTuwGsHMzOftXfDPBUqKCDrlqBNwXySHkQppOfVXMQbYbXLnyFJnyMIzfvsefQFTkSVAplXdlrldxogRoZxvYgoEIEFWqLmaPFhuSNPbUBDEDCogFNIAgIpzBJvYQepKXaZJbCSLWKKGxGRiOYTWordNdzqLGkeOBzPdocuIFHLLFRzmCcciRkfxrOlhGzHTgLFYPRVINJXVfIljUvUsMtXjjzWUfNJrEYcuTehAAffmXPQShQjKGUvaBCbLoLNfyMcafmlqhSvWsOEtYZsPWumXSYsGozjdFClQfdAulPCcMhWjiVuHhOymZtoRQDYkaypXDuDrmFDjMEbkIIENNHQxNJictbCoJHMcibUdGmtLbzViPOCrsPduErDJRtowWJYtrYLpkhOwTwmGdmUkzvqezMtRmxvBPHqDzewuQECdItsOfPMspTXUFhMTlhIDECfvnbYBcDMsZLPqlXEXqVthxhgGvKQzMFYgPCrcsfLpFuwMTuReBLMSbFILfImCwYeYesPImFTqjAcXUjyUjIuTQPeSKIuMVBgHbBxYAWCmoEtULPOOsLGQxJtKuCAuIEevNDXfsgUUPsAKqkMhggzlaQUXBEhdGZGmpBgYTWDXEazpyxnDzxYXqjHcgyWywIxxZOLIjFVlbMzgiyjfFyAwnTsDljierIXuUrntKxmZEulaOfjxObbuWTeVUOLPJYsjnOHYkLdDJaryezThEDACYGFeVPujPZpiSyIzDyWKStHHQZLcfthRbgjxhZPWRutxScpxGKvKNkQbwGssHbVdorflWUblauODVcrHmPJMSNaQoOFHyQA
wvXrCJLWNmVjsgIONDdJOJJtUrZLorLbFnMaZatbEgRBSWTksWkNQtDGneemrJpfzgUlRNrxMQVYdeomkdNOnvGTaCpyhRifrcqjThCKYHmHAuCOgcDwUYwfhUQbVCDhBcvbyOoBhAnRRuyBTQfqixUdsFBmXQaJvJimCoonMUsWHzq')

sendp(packet, iface = 'ens5', count = 1)

Results:

Waiting for lcores to finish...

  -- Forward statistics for port 0 --
  RX-packets: 0    RX-dropped: 0    RX-total: 0
  RX-error: 1
  RX-nombufs: 0
  TX-packets: 0    TX-dropped: 0    TX-total: 0

  -- Forward statistics for port 1 --
  RX-packets: 0    RX-dropped: 0    RX-total: 0
  TX-packets: 0    TX-dropped: 0    TX-total: 0

  -- Forward statistics for port 2 --
  RX-packets: 0    RX-dropped: 0    RX-total: 0
  TX-packets: 0    TX-dropped: 0    TX-total: 0

  +++ Accumulated forward statistics for all ports +++
  RX-packets: 0    RX-dropped: 0    RX-total: 0
  TX-packets: 0    TX-dropped: 0    TX-total: 0

Done.

The packet is received as rx-error.

The issue was introduced by:

commit 761c4d66900fd7db6927f57eb610f543cc0908e4
Author: Steve Yang
Date:   Mon Jan 18 07:04:08 2021 +

    app/testpmd: fix max Rx packet length for VLAN packets

    When the max rx packet length is smaller than the sum of mtu size and
    ether overhead size, it
Re: [dpdk-dev] [PATCH] net/mlx5: fix refuse empty VLAN validation
Hi,

> -----Original Message-----
> From: Shiri Kuzin
> Sent: Tuesday, January 19, 2021 7:07 PM
> To: dev@dpdk.org
> Cc: Matan Azrad; Slava Ovsiienko; Shahaf Shuler; Raslan Darawsheh;
> sta...@dpdk.org
> Subject: [PATCH] net/mlx5: fix refuse empty VLAN validation
>
> In verbs, an empty VLAN is equivalent to a packet without VLAN layer,
> hence, the VLAN item should not be empty and this case is rejected.
>
> However, the case for ether type of VLAN without following VLAN item
> was not validated, allowing the creation of a flow with empty
> VLAN item.
>
> To fix this issue a validation was added requiring ether type of VLAN
> will be followed with VLAN item.
>
> Fixes: 0b1edd21cd78 ("net/mlx5: refuse empty VLAN flow specification")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Shiri Kuzin
> Acked-by: Matan Azrad
> ---

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh
[dpdk-dev] [PATCH 2/4] net/mlx5: fix secondary process port detach crash
When a secondary process starts, the rte_eth_dev_attach_secondary()
function initializes the port device data in struct rte_eth_dev so that
it is shared with the primary process port.

When a failsafe sub-port hot-plug happens, both the primary and the
secondary process release the sub-port. The primary process clears the
sub-port device data in the fs_dev_remove() deactivate stage before it
requests the secondary process to release the sub-port. In this case,
the secondary process can no longer get the priv memory pointer from
the shared device data memory, because that memory has already been
cleared.

What the secondary process needs in port detach is the UAR table size,
in order to unmap the UAR addresses. It used the Tx queue number in
priv as the size of the UAR table.

In fact, uar_table_sz in struct mlx5_proc_priv means the size of the
UAR register table, that is, the number of UAR records. However, the
code incorrectly set this field to the size of the mlx5_proc_priv
structure.

This commit fixes the UAR table size to match the relevant Tx queue
number, and uses the UAR table size directly so that the secondary
process does not access the priv pointer in the shared device data
memory when unmapping the UAR addresses.

Fixes: 120dc4a7dcd3 ("net/mlx5: remove device register remap")
cc: sta...@dpdk.org

Signed-off-by: Suanming Mou
Acked-by: Viacheslav Ovsiienko
---
 drivers/net/mlx5/mlx5.c     |  6 +++---
 drivers/net/mlx5/mlx5_txq.c | 21 +
 2 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 3730f32..efedbb9 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1251,13 +1251,13 @@ struct mlx5_dev_ctx_shared *
 	 */
 	ppriv_size =
 		sizeof(struct mlx5_proc_priv) + priv->txqs_n * sizeof(void *);
-	ppriv = mlx5_malloc(MLX5_MEM_RTE, ppriv_size, RTE_CACHE_LINE_SIZE,
-			    dev->device->numa_node);
+	ppriv = mlx5_malloc(MLX5_MEM_RTE | MLX5_MEM_ZERO, ppriv_size,
+			    RTE_CACHE_LINE_SIZE, dev->device->numa_node);
 	if (!ppriv) {
 		rte_errno = ENOMEM;
 		return -rte_errno;
 	}
-	ppriv->uar_table_sz = ppriv_size;
+	ppriv->uar_table_sz = priv->txqs_n;
 	dev->process_private = ppriv;
 	return 0;
 }
diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
index b81bb4a..c53af10 100644
--- a/drivers/net/mlx5/mlx5_txq.c
+++ b/drivers/net/mlx5/mlx5_txq.c
@@ -634,18 +634,23 @@
 void
 mlx5_tx_uar_uninit_secondary(struct rte_eth_dev *dev)
 {
-	struct mlx5_priv *priv = dev->data->dev_private;
-	struct mlx5_txq_data *txq;
-	struct mlx5_txq_ctrl *txq_ctrl;
+	struct mlx5_proc_priv *ppriv = (struct mlx5_proc_priv *)
+					dev->process_private;
+	const size_t page_size = rte_mem_page_size();
+	void *addr;
 	unsigned int i;
 
+	if (page_size == (size_t)-1) {
+		DRV_LOG(ERR, "Failed to get mem page size");
+		return;
+	}
 	MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_SECONDARY);
-	for (i = 0; i != priv->txqs_n; ++i) {
-		if (!(*priv->txqs)[i])
+	for (i = 0; i != ppriv->uar_table_sz; ++i) {
+		if (!ppriv->uar_table[i])
 			continue;
-		txq = (*priv->txqs)[i];
-		txq_ctrl = container_of(txq, struct mlx5_txq_ctrl, txq);
-		txq_uar_uninit_secondary(txq_ctrl);
+		addr = ppriv->uar_table[i];
+		rte_mem_unmap(RTE_PTR_ALIGN_FLOOR(addr, page_size), page_size);
+
 	}
 }
-- 
1.8.3.1
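
The two points in this patch (the table size is an entry count, not a byte count, and each entry is unmapped from its page-aligned base) can be sketched in plain C as follows. This is only an illustration under stated assumptions: `struct proc_priv_sketch`, `proc_priv_alloc`, `proc_priv_mapped_count` and `page_floor` are hypothetical stand-ins, `calloc()` stands in for mlx5_malloc() with MLX5_MEM_ZERO, and `page_floor()` mimics what RTE_PTR_ALIGN_FLOOR does for a power-of-two page size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mlx5_proc_priv. */
struct proc_priv_sketch {
	size_t uar_table_sz;	/* number of entries in uar_table[] */
	void *uar_table[];	/* one UAR address per Tx queue */
};

/* Allocate a zeroed table with one slot per Tx queue. */
static struct proc_priv_sketch *
proc_priv_alloc(size_t txqs_n)
{
	size_t bytes = sizeof(struct proc_priv_sketch) +
		       txqs_n * sizeof(void *);
	struct proc_priv_sketch *pp = calloc(1, bytes);

	if (pp == NULL)
		return NULL;
	/* Store the entry count; storing 'bytes' here was the bug. */
	pp->uar_table_sz = txqs_n;
	return pp;
}

/* Walk the table by entry count, skipping unmapped (NULL) slots. */
static size_t
proc_priv_mapped_count(const struct proc_priv_sketch *pp)
{
	size_t i, n = 0;

	for (i = 0; i != pp->uar_table_sz; ++i)
		if (pp->uar_table[i] != NULL)
			n++;
	return n;
}

/* Round an address down to its page base (page_size: power of two). */
static uintptr_t
page_floor(uintptr_t addr, size_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return addr & ~((uintptr_t)page_size - 1);
}
```

Because the table is zeroed at allocation, a loop bounded by `uar_table_sz` can safely skip slots that were never mapped, without touching any shared device data.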
[dpdk-dev] [PATCH 0/4] net/mlx: fix secondary process bugs
This patch series fixes several secondary process bugs.

Suanming Mou (4):
  net/mlx5: fix invalid multi-process ID
  net/mlx5: fix secondary process port detach crash
  net/mlx5: fix secondary process attach port Tx queue
  net/mlx4: fix secondary process attach port Tx queue

 drivers/net/mlx4/mlx4.c             | 10 +-
 drivers/net/mlx4/mlx4.h             |  4
 drivers/net/mlx4/mlx4_mp.c          | 24
 drivers/net/mlx4/mlx4_rxtx.h        |  1 +
 drivers/net/mlx4/mlx4_txq.c         | 28
 drivers/net/mlx5/linux/mlx5_mp_os.c | 19 +++
 drivers/net/mlx5/linux/mlx5_os.c    |  4 ++--
 drivers/net/mlx5/mlx5.c             |  8
 drivers/net/mlx5/mlx5.h             |  6 +-
 drivers/net/mlx5/mlx5_txq.c         | 21 +
 10 files changed, 105 insertions(+), 20 deletions(-)

-- 
1.8.3.1
[dpdk-dev] [PATCH 1/4] net/mlx5: fix invalid multi-process ID
The device port_id is used for inter-process communication and must be
the same for both the primary and the secondary process.

This IPC port_id was configured with an invalid temporary value in the
port spawn routine. The temporary value was used by the function
rte_eth_dev_get_port_by_name() to check whether the port exists.

This commit sets the mp port_id to the rte_eth_dev port_id.

Fixes: 2eb4d0107acc ("net/mlx5: refactor PCI probing on Linux")

Signed-off-by: Suanming Mou
Acked-by: Viacheslav Ovsiienko
---
 drivers/net/mlx5/linux/mlx5_os.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 9ac1d46..1d91b92 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -930,8 +930,6 @@
 	priv->dev_port = spawn->phys_port;
 	priv->pci_dev = spawn->pci_dev;
 	priv->mtu = RTE_ETHER_MTU;
-	priv->mp_id.port_id = port_id;
-	strlcpy(priv->mp_id.name, MLX5_MP_NAME, RTE_MP_MAX_NAME_LEN);
 	/* Some internal functions rely on Netlink sockets, open them now. */
 	priv->nl_socket_rdma = mlx5_nl_init(NETLINK_RDMA);
 	priv->nl_socket_route = mlx5_nl_init(NETLINK_ROUTE);
@@ -1347,6 +1345,8 @@
 		eth_dev->data->dev_flags |= RTE_ETH_DEV_REPRESENTOR;
 		eth_dev->data->representor_id = priv->representor_id;
 	}
+	priv->mp_id.port_id = eth_dev->data->port_id;
+	strlcpy(priv->mp_id.name, MLX5_MP_NAME, RTE_MP_MAX_NAME_LEN);
 	/*
 	 * Store associated network device interface index. This index
 	 * is permanent throughout the lifetime of device. So, we may store
-- 
1.8.3.1
[dpdk-dev] [PATCH 3/4] net/mlx5: fix secondary process attach port Tx queue
Currently, the secondary process port UAR register mapping used by the
Tx queues is done during port initialization.

Unfortunately, in the port hot-plug case, the secondary process is
requested to initialize the port before the primary process has
completed the device configuration, so the port Tx queue number is not
configured yet. Hence, the secondary process gets a Tx queue number of
zero during probing, and the UAR registers are not mapped correctly.

This commit checks the configured number of Tx queues in the secondary
process when the port start is requested. If a Tx queue number mismatch
is found, the UAR mapping is reinitialized accordingly.

Fixes: 2aac5b5d119f ("net/mlx5: sync stop/start with secondary process")
cc: sta...@dpdk.org

Signed-off-by: Suanming Mou
Acked-by: Viacheslav Ovsiienko
---
 drivers/net/mlx5/linux/mlx5_mp_os.c | 19 +++
 drivers/net/mlx5/mlx5.c             |  2 +-
 drivers/net/mlx5/mlx5.h             |  6 +-
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_mp_os.c b/drivers/net/mlx5/linux/mlx5_mp_os.c
index 08ade75..95372e2 100644
--- a/drivers/net/mlx5/linux/mlx5_mp_os.c
+++ b/drivers/net/mlx5/linux/mlx5_mp_os.c
@@ -115,6 +115,7 @@
 	const struct mlx5_mp_param *param =
 		(const struct mlx5_mp_param *)mp_msg->param;
 	struct rte_eth_dev *dev;
+	struct mlx5_proc_priv *ppriv;
 	struct mlx5_priv *priv;
 	int ret;
 
@@ -132,6 +133,20 @@
 		rte_mb();
 		dev->rx_pkt_burst = mlx5_select_rx_function(dev);
 		dev->tx_pkt_burst = mlx5_select_tx_function(dev);
+		ppriv = (struct mlx5_proc_priv *)dev->process_private;
+		/* If Tx queue number changes, re-initialize UAR. */
+		if (ppriv->uar_table_sz != priv->txqs_n) {
+			mlx5_tx_uar_uninit_secondary(dev);
+			mlx5_proc_priv_uninit(dev);
+			ret = mlx5_proc_priv_init(dev);
+			if (ret)
+				return -rte_errno;
+			ret = mlx5_tx_uar_init_secondary(dev, mp_msg->fds[0]);
+			if (ret) {
+				mlx5_proc_priv_uninit(dev);
+				return -rte_errno;
+			}
+		}
 		mp_init_msg(&priv->mp_id, &mp_res, param->type);
 		res->result = 0;
 		ret = rte_mp_reply(&mp_res, peer);
@@ -183,6 +198,10 @@
 		return;
 	}
 	mp_init_msg(&priv->mp_id, &mp_req, type);
+	if (type == MLX5_MP_REQ_START_RXTX) {
+		mp_req.num_fds = 1;
+		mp_req.fds[0] = ((struct ibv_context *)priv->sh->ctx)->cmd_fd;
+	}
 	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
 	if (ret) {
 		if (rte_errno != ENOTSUP)
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index efedbb9..b82e767 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1268,7 +1268,7 @@ struct mlx5_dev_ctx_shared *
  * @param dev
  *   Pointer to Ethernet device structure.
  */
-static void
+void
 mlx5_proc_priv_uninit(struct rte_eth_dev *dev)
 {
 	if (!dev->process_private)
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 101e9c2..899241e 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -743,7 +743,10 @@ struct mlx5_dev_ctx_shared {
 	struct mlx5_dev_shared_port port[]; /* per device port data array. */
 };
 
-/* Per-process private structure. */
+/*
+ * Per-process private structure.
+ * Caution, secondary pocess may rebuid the struct during port start.
+ */
 struct mlx5_proc_priv {
 	size_t uar_table_sz;
 	/* Size of UAR register table. */
@@ -998,6 +1001,7 @@ struct rte_hairpin_peer_info {
 
 int mlx5_getenv_int(const char *);
 int mlx5_proc_priv_init(struct rte_eth_dev *dev);
+void mlx5_proc_priv_uninit(struct rte_eth_dev *dev);
 int mlx5_udp_tunnel_port_add(struct rte_eth_dev *dev,
 			      struct rte_eth_udp_tunnel *udp_tunnel);
 uint16_t mlx5_eth_find_next(uint16_t port_id, struct rte_pci_device *pci_dev);
-- 
1.8.3.1
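
The "rebuild on mismatch" check added to the secondary-process start handler can be sketched in isolation as below. This is a simplified illustration, not the driver code: `struct uar_dir` and `uar_dir_sync` are hypothetical names, `calloc()`/`free()` stand in for the driver allocators, and the unmapping and remapping of the UAR pages themselves is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-process private data. */
struct uar_dir {
	size_t uar_table_sz;	/* number of entries in uar_table[] */
	void *uar_table[];	/* one slot per Tx queue */
};

/*
 * Make *dirp hold exactly txqs_n slots.
 * Returns 0 if the table already had the right size, 1 if it was
 * rebuilt, and -1 on allocation failure (*dirp is then NULL).
 */
static int
uar_dir_sync(struct uar_dir **dirp, size_t txqs_n)
{
	struct uar_dir *dir = *dirp;

	if (dir != NULL && dir->uar_table_sz == txqs_n)
		return 0;
	/* The real driver unmaps the old UAR pages before this point. */
	free(dir);
	*dirp = NULL;
	dir = calloc(1, sizeof(*dir) + txqs_n * sizeof(void *));
	if (dir == NULL)
		return -1;
	dir->uar_table_sz = txqs_n;
	*dirp = dir;
	return 1;
}
```

In the hot-plug scenario described above, the first call would happen at probe time with a queue count of 0, and the call at port start would see the configured count and rebuild the table; later starts with an unchanged count leave it alone.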
[dpdk-dev] [PATCH 4/4] net/mlx4: fix secondary process attach port Tx queue
Currently, the secondary process port UAR register mapping used by the
Tx queues is done during port initialization.

Unfortunately, in the port hot-plug case, the secondary process is
requested to initialize the port while the primary process is still
probing the port. At that time, the port Tx queue number is not
configured yet, so the secondary process gets a Tx queue number of 0.
As a result, the UAR registers are not mapped.

This commit adds a check of the Tx queue number in the secondary
process when the port start is requested. If the Tx queue number does
not match, the UAR mapping is redone with the latest Tx queue number.

Fixes: 0203d33a1059 ("net/mlx4: support secondary process")
cc: sta...@dpdk.org

Signed-off-by: Suanming Mou
Acked-by: Viacheslav Ovsiienko
---
 drivers/net/mlx4/mlx4.c      | 10 +-
 drivers/net/mlx4/mlx4.h      |  4
 drivers/net/mlx4/mlx4_mp.c   | 24
 drivers/net/mlx4/mlx4_rxtx.h |  1 +
 drivers/net/mlx4/mlx4_txq.c  | 28
 5 files changed, 62 insertions(+), 5 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 495b4fc..919a934 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -195,7 +195,7 @@ struct mlx4_conf {
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-static int
+int
 mlx4_proc_priv_init(struct rte_eth_dev *dev)
 {
 	struct mlx4_proc_priv *ppriv;
@@ -207,13 +207,13 @@ struct mlx4_conf {
 	 */
 	ppriv_size = sizeof(struct mlx4_proc_priv) +
 		     dev->data->nb_tx_queues * sizeof(void *);
-	ppriv = rte_malloc_socket("mlx4_proc_priv", ppriv_size,
-				  RTE_CACHE_LINE_SIZE, dev->device->numa_node);
+	ppriv = rte_zmalloc_socket("mlx4_proc_priv", ppriv_size,
+				   RTE_CACHE_LINE_SIZE, dev->device->numa_node);
 	if (!ppriv) {
 		rte_errno = ENOMEM;
 		return -rte_errno;
 	}
-	ppriv->uar_table_sz = ppriv_size;
+	ppriv->uar_table_sz = dev->data->nb_tx_queues;
 	dev->process_private = ppriv;
 	return 0;
 }
@@ -224,7 +224,7 @@ struct mlx4_conf {
  * @param dev
  *   Pointer to Ethernet device structure.
  */
-static void
+void
 mlx4_proc_priv_uninit(struct rte_eth_dev *dev)
 {
 	if (!dev->process_private)
diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
index c6cb294..87710d3 100644
--- a/drivers/net/mlx4/mlx4.h
+++ b/drivers/net/mlx4/mlx4.h
@@ -197,6 +197,10 @@ struct mlx4_priv {
 #define PORT_ID(priv) ((priv)->dev_data->port_id)
 #define ETH_DEV(priv) (&rte_eth_devices[PORT_ID(priv)])
 
+int mlx4_proc_priv_init(struct rte_eth_dev *dev);
+void mlx4_proc_priv_uninit(struct rte_eth_dev *dev);
+
+
 /* mlx4_ethdev.c */
 
 int mlx4_get_ifname(const struct mlx4_priv *priv, char (*ifname)[IF_NAMESIZE]);
diff --git a/drivers/net/mlx4/mlx4_mp.c b/drivers/net/mlx4/mlx4_mp.c
index eca0c20..3622d61 100644
--- a/drivers/net/mlx4/mlx4_mp.c
+++ b/drivers/net/mlx4/mlx4_mp.c
@@ -111,6 +111,9 @@
 	const struct mlx4_mp_param *param =
 		(const struct mlx4_mp_param *)mp_msg->param;
 	struct rte_eth_dev *dev;
+#ifdef HAVE_IBV_MLX4_UAR_MMAP_OFFSET
+	struct mlx4_proc_priv *ppriv;
+#endif
 	int ret;
 
 	MLX4_ASSERT(rte_eal_process_type() == RTE_PROC_SECONDARY);
@@ -126,6 +129,21 @@
 		rte_mb();
 		dev->tx_pkt_burst = mlx4_tx_burst;
 		dev->rx_pkt_burst = mlx4_rx_burst;
+#ifdef HAVE_IBV_MLX4_UAR_MMAP_OFFSET
+		ppriv = (struct mlx4_proc_priv *)dev->process_private;
+		if (ppriv->uar_table_sz != dev->data->nb_tx_queues) {
+			mlx4_tx_uar_uninit_secondary(dev);
+			mlx4_proc_priv_uninit(dev);
+			ret = mlx4_proc_priv_init(dev);
+			if (ret)
+				return -rte_errno;
+			ret = mlx4_tx_uar_init_secondary(dev, mp_msg->fds[0]);
+			if (ret) {
+				mlx4_proc_priv_uninit(dev);
+				return -rte_errno;
+			}
+		}
+#endif
 		mp_init_msg(dev, &mp_res, param->type);
 		res->result = 0;
 		ret = rte_mp_reply(&mp_res, peer);
@@ -163,6 +181,7 @@
 	struct rte_mp_reply mp_rep;
 	struct mlx4_mp_param *res __rte_unused;
 	struct timespec ts = {.tv_sec = MLX4_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0};
+	struct mlx4_priv *priv;
 	int ret;
 	int i;
 
@@ -175,6 +194,11 @@
 		return;
 	}
 	mp_init_msg(dev, &mp_req, type);
+	if (type == MLX4_MP_REQ_START_RXTX) {
+		priv = dev->data->dev_private;
+		mp_req.num_fds = 1;
+		mp_req.fds[0] = priv->ctx->cmd_fd;
+	}
 	ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
Re: [dpdk-dev] [PATCH] doc: update mlx5 flow MARK action description
Hi,

> -----Original Message-----
> From: Viacheslav Ovsiienko
> Sent: Friday, December 11, 2020 2:06 PM
> To: dev@dpdk.org
> Cc: Raslan Darawsheh; Matan Azrad; sta...@dpdk.org
> Subject: [PATCH] doc: update mlx5 flow MARK action description
>
> There some limitations added for the MARK action value range.
>
> Fixes: 2d241515ebaf ("net/mlx5: add devarg for extensive metadata support")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Viacheslav Ovsiienko
> ---

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh
Re: [dpdk-dev] [PATCH v1] devtools: update abi ignore for cryptodev
"Kinsella, Ray" writes: > On 22/01/2021 13:09, Dodji Seketeli wrote: >> Thomas Monjalon writes: >> >> [...] >> > Then I've added (quickly) a libabigail exception rule: > > [suppress_type] > name = rte_cryptodev > has_data_member_inserted_between = {0, 1023} > > Now we want to improve this rule to restrict the offsets > to the padding at the end of the struct only, > so we keep forbidding changes in existing fields, > and forbidding additions further the current struct size. > Is this new rule good? > > has_data_member_inserted_between = {offset_after(attached), end} Yes, this rule should do what you think it says. > Do you confirm that the keyword "end" means the old reference size? Yes I do. > What else do we need to check for adding a new field in a padding? Actually, that rule will work independantly of it there is enough padding or not. It'll shut down the change report, even if the added data exceeds the padding. >>> >>> I don't understand why. >>> If "end" means the old reference size, then addition after the old size >>> should be reported, isn't it? >> >> Yes, you are right. >> >> What I meant is that even if (in an hypothetical case, not yours) the >> padding was so "small" that it wasn't going up to the 'end' of the >> struct, that rule would have still shut down the change report. > > Understood - you are talking about padding between members. Exactly. Cheers, -- Dodji
[dpdk-dev] Re: [PATCH v1 3/3] ring: rename and refactor ring library
Hi, Konstantin

> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: January 22, 2021 21:09
> To: Feifei Wang; Honnappa Nagarahalli
> Cc: dev@dpdk.org; nd
> Subject: RE: [PATCH v1 3/3] ring: rename and refactor ring library
>
> >
> > For legacy modes, rename ring_generic/c11 to ring_generic/c11_pvt.
> > Furthermore, add new file ring_elem_pvt.h which includes
> > ring_do_eq/deq and ring element copy/delete APIs.
> >
> > For other modes, rename xx_c11_mem to xx_elem_pvt. Move all private
> > APIs into these new header files.
> >
> > Suggested-by: Honnappa Nagarahalli
> > Signed-off-by: Feifei Wang
> > Reviewed-by: Honnappa Nagarahalli
> > Reviewed-by: Ruifeng Wang
> > ---
> >  lib/librte_ring/meson.build                   |  15 +-
> >  .../{rte_ring_c11_mem.h => ring_c11_pvt.h}    |   9 +-
> >  lib/librte_ring/ring_elem_pvt.h               | 385 ++
> >  ...{rte_ring_generic.h => ring_generic_pvt.h} |   6 +-
> >  ...ring_hts_c11_mem.h => ring_hts_elem_pvt.h} |  88 +++-
> >  ...ng_peek_c11_mem.h => ring_peek_elem_pvt.h} |  75 +++-
> >  ...ring_rts_c11_mem.h => ring_rts_elem_pvt.h} |  88 +++-
> >  lib/librte_ring/rte_ring_elem.h               | 374 +
> >  lib/librte_ring/rte_ring_hts.h                |  84 +---
> >  lib/librte_ring/rte_ring_peek.h               |  71 +---
> >  lib/librte_ring/rte_ring_peek_zc.h            |   2 +-
> >  lib/librte_ring/rte_ring_rts.h                |  84 +---
> >  12 files changed, 646 insertions(+), 635 deletions(-)
> >  rename lib/librte_ring/{rte_ring_c11_mem.h => ring_c11_pvt.h} (96%)
> >  create mode 100644 lib/librte_ring/ring_elem_pvt.h
> >  rename lib/librte_ring/{rte_ring_generic.h => ring_generic_pvt.h} (98%)
> >  rename lib/librte_ring/{rte_ring_hts_c11_mem.h => ring_hts_elem_pvt.h} (60%)
> >  rename lib/librte_ring/{rte_ring_peek_c11_mem.h => ring_peek_elem_pvt.h} (62%)
> >  rename lib/librte_ring/{rte_ring_rts_c11_mem.h => ring_rts_elem_pvt.h} (62%)
> >
>
> Sorry, but I don't understand the purpose of that patch.
> As I remember by DPDK naming convention all installable headers should
> have 'rte_' prefix. Same for public defines (RTE_).
> Why to abandon it here?
This change follows the discussion between you and Honnappa about the ring library:
https://mails.dpdk.org/archives/dev/2020-May/166803.html
Its purpose is to separate the external APIs, which users can call, from the
internal APIs, which users cannot call. The internal functions are placed in
the files named xx_pvt.h.

Best Regards
Feifei

>
> > diff --git a/lib/librte_ring/meson.build b/lib/librte_ring/meson.build
> > index 36fdcb6a5..98eac5810 100644
> > --- a/lib/librte_ring/meson.build
> > +++ b/lib/librte_ring/meson.build
> > @@ -2,15 +2,16 @@
> >  # Copyright(c) 2017 Intel Corporation
> >
> >  sources = files('rte_ring.c')
> > -headers = files('rte_ring.h',
> > +headers = files('ring_c11_pvt.h',
> > +		'ring_elem_pvt.h',
> > +		'ring_generic_pvt.h',
> > +		'ring_hts_elem_pvt.h',
> > +		'ring_peek_elem_pvt.h',
> > +		'ring_rts_elem_pvt.h',
> > +		'rte_ring.h',
> >  		'rte_ring_core.h',
> >  		'rte_ring_elem.h',
> > -		'rte_ring_c11_mem.h',
> > -		'rte_ring_generic.h',
> >  		'rte_ring_hts.h',
> > -		'rte_ring_hts_c11_mem.h',
> >  		'rte_ring_peek.h',
> > -		'rte_ring_peek_c11_mem.h',
> >  		'rte_ring_peek_zc.h',
> > -		'rte_ring_rts.h',
> > -		'rte_ring_rts_c11_mem.h')
> > +		'rte_ring_rts.h')
> > diff --git a/lib/librte_ring/rte_ring_c11_mem.h b/lib/librte_ring/ring_c11_pvt.h
> > similarity index 96%
> > rename from lib/librte_ring/rte_ring_c11_mem.h
> > rename to lib/librte_ring/ring_c11_pvt.h
> > index 7f5eba262..9f2f5318f 100644
> > --- a/lib/librte_ring/rte_ring_c11_mem.h
> > +++ b/lib/librte_ring/ring_c11_pvt.h
> > @@ -7,8 +7,8 @@
> >   * Used as BSD-3 Licensed with permission from Kip Macy.
> >   */
> >
> > -#ifndef _RTE_RING_C11_MEM_H_
> > -#define _RTE_RING_C11_MEM_H_
> > +#ifndef _RING_C11_PVT_H_
> > +#define _RING_C11_PVT_H_
> >
> >  static __rte_always_inline void
> >  __rte_ring_update_tail(struct rte_ring_headtail *ht, uint32_t old_val,
> > @@ -69,9 +69,6 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
> >  		/* Ensure the head is read before tail */
> >  		__atomic_thread_fence(__ATOMIC_ACQUIRE);
> >
> > -		/* load-acquire synchronize with store-release of ht->tail
> > -		 * in update_tail.
> > -		 */
> >  		cons_tail = __atomic_load_n(&r->cons.tail,
> >  					__ATOMIC_ACQUIRE);
> >
> > @@ -178,4 +175,4 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
> >  	return n;
> >  }
> >
> > -#endif /* _RTE_RING_C11_MEM_H_ */
> > +#endif /* _RING_C11_PVT_H_ */
> > diff --git a/lib/librte_ring/ring_elem_pvt.h b/lib/librte_ring/ring_elem_pvt.h
> > new file mode 100644
> > index 0..8003e5edc
Re: [dpdk-dev] [PATCH v5 1/3] PCI: use PCI standard sysfs entry to get PIO address
Hi Huawei,

>-----Original Message-----
>From: dev On Behalf Of 谢华伟(此时此刻)
>Sent: Friday, January 15, 2021 2:24 AM
>To: Maxime Coquelin; ferruh.yi...@intel.com
>Cc: dev@dpdk.org; anatoly.bura...@intel.com; david.march...@redhat.com;
>zhihong.w...@intel.com; chenbo@intel.com; gr...@u256.net
>Subject: Re: [dpdk-dev] [PATCH v5 1/3] PCI: use PCI standard sysfs entry to get
>PIO address
>
>
>On 2021/1/12 16:07, Maxime Coquelin wrote:
>> Hi Huawei,
>>
>> The title should be under the form:
>> "bus/pci: use PCI standard sysfs entry to get PIO address"
>>
>> On 10/22/20 5:51 PM, 谢华伟(此时此刻) wrote:
>>> From: "huawei.xhw"
>>>
>>> Previously with igb_uio we get PIO address from igb_uio sysfs entry,
>>> with uio_pci_generic, we get PIO address from /proc/ioports.

It would be great to explain a little more about what this patch is trying to do.

>>>
>>> Signed-off-by: huawei.xhw
>> In order to comply with the contribution rules, your name must be
>> disaplyed under the form:
>>
>> Signed-off-by: Firstname Lastname
>Would fix this.
>>> --- >>> drivers/bus/pci/linux/pci.c | 77 >>> - >>> drivers/bus/pci/linux/pci_uio.c | 64 -- >>> 2 files changed, 46 insertions(+), 95 deletions(-) >>> >>> diff --git a/drivers/bus/pci/linux/pci.c >>> b/drivers/bus/pci/linux/pci.c index 2e1808b..0f38abf 100644 >>> --- a/drivers/bus/pci/linux/pci.c >>> +++ b/drivers/bus/pci/linux/pci.c >>> @@ -677,71 +677,6 @@ int rte_pci_write_config(const struct >rte_pci_device *device, >>> } >>> } >>> >>> -#if defined(RTE_ARCH_X86) >>> -static int >>> -pci_ioport_map(struct rte_pci_device *dev, int bar __rte_unused, >>> - struct rte_pci_ioport *p) >>> -{ >>> - uint16_t start, end; >>> - FILE *fp; >>> - char *line = NULL; >>> - char pci_id[16]; >>> - int found = 0; >>> - size_t linesz; >>> - >>> - if (rte_eal_iopl_init() != 0) { >>> - RTE_LOG(ERR, EAL, "%s(): insufficient ioport permissions for >PCI device %s\n", >>> - __func__, dev->name); >>> - return -1; >>> - } >>> - >>> - snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT, >>> -dev->addr.domain, dev->addr.bus, >>> -dev->addr.devid, dev->addr.function); >>> - >>> - fp = fopen("/proc/ioports", "r"); >>> - if (fp == NULL) { >>> - RTE_LOG(ERR, EAL, "%s(): can't open ioports\n", __func__); >>> - return -1; >>> - } >>> - >>> - while (getdelim(&line, &linesz, '\n', fp) > 0) { >>> - char *ptr = line; >>> - char *left; >>> - int n; >>> - >>> - n = strcspn(ptr, ":"); >>> - ptr[n] = 0; >>> - left = &ptr[n + 1]; >>> - >>> - while (*left && isspace(*left)) >>> - left++; >>> - >>> - if (!strncmp(left, pci_id, strlen(pci_id))) { >>> - found = 1; >>> - >>> - while (*ptr && isspace(*ptr)) >>> - ptr++; >>> - >>> - sscanf(ptr, "%04hx-%04hx", &start, &end); >>> - >>> - break; >>> - } >>> - } >>> - >>> - free(line); >>> - fclose(fp); >>> - >>> - if (!found) >>> - return -1; >>> - >>> - p->base = start; >>> - RTE_LOG(DEBUG, EAL, "PCI Port IO found start=0x%x\n", start); >>> - >>> - return 0; >>> -} >>> -#endif >>> - >>> int >>> rte_pci_ioport_map(struct rte_pci_device *dev, int bar, >>> struct 
rte_pci_ioport *p) >>> @@ -756,14 +691,8 @@ int rte_pci_write_config(const struct >rte_pci_device *device, >>> break; >>> #endif >>> case RTE_PCI_KDRV_IGB_UIO: >>> - ret = pci_uio_ioport_map(dev, bar, p); >>> - break; >>> case RTE_PCI_KDRV_UIO_GENERIC: >>> -#if defined(RTE_ARCH_X86) >>> - ret = pci_ioport_map(dev, bar, p); >>> -#else >>> ret = pci_uio_ioport_map(dev, bar, p); -#endif >>> break; >>> default: >>> break; >>> @@ -830,14 +759,8 @@ int rte_pci_write_config(const struct >rte_pci_device *device, >>> break; >>> #endif >>> case RTE_PCI_KDRV_IGB_UIO: >>> - ret = pci_uio_ioport_unmap(p); >>> - break; >>> case RTE_PCI_KDRV_UIO_GENERIC: >>> -#if defined(RTE_ARCH_X86) >>> - ret = 0; >>> -#else >>> ret = pci_uio_ioport_unmap(p); >>> -#endif >>> break; >>> default: >>> break; >>> diff --git a/drivers/bus/pci/linux/pci_uio.c >>> b/drivers/bus/pci/linux/pci_uio.c index f3305a2..01f2a40 100644 >>> --- a/drivers/bus/pci/linux/pci_uio.c >>> +++ b/drivers/bus/pci/linux/pci_uio.c >>> @@ -373,10 +373,13 @@ >>> pci_uio_ioport_map(struct rte_pci_device *dev, int bar, >>>struct rte_pci_ioport *p) >>> { >>> + FILE *f = NULL; >>> char dirname[PATH_MAX]; >>> char filename[PATH_MAX]; >>> - int uio_num; >>> - unsigned long start; >>> + char buf[BUFSIZ]; >>> + uint64_t phys_addr, end_addr
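A minimal sketch, in Python rather than the patch's C, of the approach the patch title describes: take the PIO base from the device's sysfs `resource` file instead of scanning /proc/ioports. It assumes the usual one-BAR-per-line layout of `start end flags` in hexadecimal and the Linux `IORESOURCE_IO` / `IORESOURCE_MEM` flag values; the helper names `parse_resource_line` and `pio_base` are hypothetical and do not appear in the patch.

```python
# Linux kernel resource type flags (include/linux/ioport.h).
IORESOURCE_IO = 0x00000100
IORESOURCE_MEM = 0x00000200


def parse_resource_line(line):
    """Split one 'start end flags' line of a PCI sysfs 'resource' file."""
    start, end, flags = (int(field, 16) for field in line.split())
    return start, end, flags


def pio_base(resource_text, bar):
    """Return the port I/O base of the given BAR, or None if it is not an I/O BAR."""
    lines = resource_text.splitlines()
    start, _end, flags = parse_resource_line(lines[bar])
    if not flags & IORESOURCE_IO:
        return None
    if start > 0xFFFF:
        # x86 port numbers are 16 bits wide; anything larger is not a valid port.
        raise ValueError("PIO resource %#x is too large" % start)
    return start
```

With this layout the lookup is an index into the file by BAR number, so no string matching on the PCI address is needed.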
Re: [dpdk-dev] [PATCH v5 2/3] PCI: support MMIO in rte_pci_ioport_map/unap/read/write
Hi Huawei, Nice work, just some small comments. >-Original Message- >From: dev On Behalf Of 谢华伟(此时此刻) >Sent: Thursday, October 22, 2020 11:51 PM >To: ferruh.yi...@intel.com >Cc: dev@dpdk.org; maxime.coque...@redhat.com; >anatoly.bura...@intel.com; david.march...@redhat.com; >zhihong.w...@intel.com; chenbo@intel.com; gr...@u256.net; 谢华伟(此 >时此刻) >Subject: [dpdk-dev] [PATCH v5 2/3] PCI: support MMIO in >rte_pci_ioport_map/unap/read/write > >From: "huawei.xhw" > >If IO BAR, we get PIO address. >If MMIO BAR, we get mapped virtual address. >We distinguish PIO and MMIO by their address like how kernel does. >ioread/write8/16/32 is provided to access PIO/MMIO. >BTW, for virtio on arch other than x86, BAR flag indicates PIO but is mapped. > >Signed-off-by: huawei.xhw >--- > drivers/bus/pci/linux/pci.c | 4 -- > drivers/bus/pci/linux/pci_uio.c | 123 ++--- >--- > 2 files changed, 82 insertions(+), 45 deletions(-) > >diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c index >0f38abf..0dc99e9 100644 >--- a/drivers/bus/pci/linux/pci.c >+++ b/drivers/bus/pci/linux/pci.c >@@ -715,8 +715,6 @@ int rte_pci_write_config(const struct rte_pci_device >*device, > break; > #endif > case RTE_PCI_KDRV_IGB_UIO: >- pci_uio_ioport_read(p, data, len, offset); >- break; > case RTE_PCI_KDRV_UIO_GENERIC: > pci_uio_ioport_read(p, data, len, offset); > break; >@@ -736,8 +734,6 @@ int rte_pci_write_config(const struct rte_pci_device >*device, > break; > #endif > case RTE_PCI_KDRV_IGB_UIO: >- pci_uio_ioport_write(p, data, len, offset); >- break; > case RTE_PCI_KDRV_UIO_GENERIC: > pci_uio_ioport_write(p, data, len, offset); > break; >diff --git a/drivers/bus/pci/linux/pci_uio.c b/drivers/bus/pci/linux/pci_uio.c >index 01f2a40..c19382f 100644 >--- a/drivers/bus/pci/linux/pci_uio.c >+++ b/drivers/bus/pci/linux/pci_uio.c >@@ -379,14 +379,9 @@ > char buf[BUFSIZ]; > uint64_t phys_addr, end_addr, flags; > unsigned long base; >+ bool iobar; > int i; > >- if (rte_eal_iopl_init() != 0) { >- 
RTE_LOG(ERR, EAL, "%s(): insufficient ioport permissions for >PCI device %s\n", >- __func__, dev->name); >- return -1; >- } >- > /* open and read addresses of the corresponding resource in sysfs */ > snprintf(filename, sizeof(filename), "%s/" PCI_PRI_FMT "/resource", > rte_pci_get_sysfs_path(), dev->addr.domain, dev->addr.bus, >@@ -408,15 +403,30 @@ > &end_addr, &flags) < 0) > goto error; > >- if (!(flags & IORESOURCE_IO)) { >- RTE_LOG(ERR, EAL, "%s(): bar resource other than IO is not >supported\n", __func__); >+ if (flags & IORESOURCE_IO) { >+ iobar = 1; >+ base = (unsigned long)phys_addr; >+ RTE_LOG(INFO, EAL, "%s(): PIO BAR %08lx detected\n", >__func__, base); >+ } else if (flags & IORESOURCE_MEM) { >+ iobar = 0; >+ base = (unsigned long)dev->mem_resource[bar].addr; >+ RTE_LOG(INFO, EAL, "%s(): MMIO BAR %08lx detected\n", >__func__, base); Same here, INFO level seems chatty. >+ } else { >+ RTE_LOG(ERR, EAL, "%s(): unknown BAR type\n", __func__); >+ goto error; >+ } >+ >+ >+ if (iobar && rte_eal_iopl_init() != 0) { >+ RTE_LOG(ERR, EAL, "%s(): insufficient ioport permissions for >PCI device %s\n", >+ __func__, dev->name); > goto error; > } Same as Maxime's suggestion, please move this block as well. >- base = (unsigned long)phys_addr; >- RTE_LOG(INFO, EAL, "%s(): PIO BAR %08lx detected\n", __func__, >base); > >- if (base > UINT16_MAX) >+ if (iobar && (base > UINT16_MAX)) { PIO_MAX defined below, please use it here. UNI16_MAX used in patch 1/3 as well. >+ RTE_LOG(ERR, EAL, "%s(): %08lx too large PIO resource\n", >__func__, >+base); > goto error; >+ } > > /* FIXME only for primary process ? */ > if (dev->intr_handle.type == RTE_INTR_HANDLE_UNKNOWN) { @@ - >517,6 +527,61 @@ } #endif > >+#define PIO_MAX 0x1 >+static inline uint8_t ioread8(void *addr) { >+ uint8_t val; >+ >+ val = (uint64_t)(uintptr_t)addr >= PIO_MAX ? 
>+ *(volatile uint8_t *)addr : >+ inb((unsigned long)addr); >+ >+ return val; >+} >+ >+static inline uint16_t ioread16(void *addr) { >+ uint16_t val; >+ >+ val = (uint64_t)(uintptr_t)addr >= PIO_MAX ? >+ *(volatile uint16_t *)addr : >+ inw((unsigned long)addr); >+ >+ return val; >+} >+ >+static inline uint32_t ioread32(void *addr) { >+ uint32_t val; >+ >+ val = (uint64_t)(uintptr_t)addr >= PIO_MAX ? >+ *(volatile ui
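The dispatch in the quoted `ioread8/16/32` helpers can be sketched as follows. This is an illustration only: the `#define PIO_MAX` value is cut short in this archive, so the sketch assumes 0x10000 (the size of the x86 port space), and it replaces `inb()` and the volatile pointer dereference with two plain dictionaries. The names `is_pio`, `ports` and `memory` are hypothetical.

```python
# Assumption: port numbers fit in 16 bits, so any "address" below 0x10000
# is treated as a port and anything above as a mapped virtual address.
PIO_MAX = 0x10000


def is_pio(addr):
    """Classify an address the way the patch describes: by its value alone."""
    return addr < PIO_MAX


def ioread8(addr, ports, memory):
    """Read one byte from the fake port space or the fake mapped memory."""
    source = ports if is_pio(addr) else memory
    return source[addr] & 0xFF
```

The design relies on mapped virtual addresses never falling below the port range, which is why a single `base` value can carry either kind of BAR.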
Re: [dpdk-dev] [PATCH] net/bnxt: fix null termination of receive mbuf chain
On Fri, Jan 22, 2021 at 1:49 PM Lance Richardson wrote: > > The last mbuf in a multi-segment packet needs to be > NULL-terminated. > > Fixes: 0958d8b6435d ("net/bnxt: support LRO") > Cc: sta...@dpdk.org > Signed-off-by: Lance Richardson > Reviewed-by: Somnath Kotur > Reviewed-by: Ajit Kumar Khaparde Patch applied to dpdk-next-net-brcm. > --- > drivers/net/bnxt/bnxt_rxr.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c > index 969cae19fc..c34a8905e7 100644 > --- a/drivers/net/bnxt/bnxt_rxr.c > +++ b/drivers/net/bnxt/bnxt_rxr.c > @@ -325,6 +325,7 @@ static int bnxt_rx_pages(struct bnxt_rx_queue *rxq, > */ > rte_bitmap_set(rxr->ag_bitmap, ag_cons); > } > + last->next = NULL; > bnxt_prod_ag_mbuf(rxq); > return 0; > } > -- > 2.25.1 >
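Why the one-line fix matters can be shown with a toy model. The sketch below is not the bnxt code; `Mbuf`, `chain_segments` and `walk` are hypothetical names. It assumes only what the commit message states: segments are linked through a `next` pointer, and a recycled buffer may still carry a stale `next` from an earlier packet.

```python
class Mbuf:
    """Toy stand-in for a packet buffer segment."""

    def __init__(self, name):
        self.name = name
        self.next = None


def chain_segments(head, segments):
    """Append segments to head and terminate the chain explicitly."""
    last = head
    for seg in segments:
        last.next = seg
        last = seg
    # Without this line a recycled segment could still point at an
    # unrelated buffer, and anything walking the chain would run past
    # the end of the packet.
    last.next = None
    return head


def walk(head):
    """Collect segment names by following next pointers."""
    names = []
    while head is not None:
        names.append(head.name)
        head = head.next
    return names
```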
[dpdk-dev] [PATCH v10 0/3] pmdinfogen: rewrite in Python
This patchset implements existing pmdinfogen logic in Python, replaces and removes the old code. The goals of rewriting are: * easier maintenance by using a more high-level language, * simpler build process without host application and libelf, * foundation for adding Windows support. Identity of generated PMD information is checked by comparing output of pmdinfo before and after the patch: find build/drivers -name '*.so' -exec usertools/dpdk-pmdinfo.py Acked-by: Neil Horman Tested-by: Jie Zhou --- Changes in v10: * Suppress ABI warnings for generated strings (Thomas). Dmitry Kozlyuk (3): pmdinfogen: add Python implementation build: use Python pmdinfogen pmdinfogen: remove C implementation .github/workflows/build.yml | 4 +- .travis.yml | 2 +- MAINTAINERS | 3 +- buildtools/gen-pmdinfo-cfile.sh | 6 +- buildtools/meson.build| 15 + buildtools/pmdinfogen.py | 189 +++ buildtools/pmdinfogen/meson.build | 14 - buildtools/pmdinfogen/pmdinfogen.c| 456 -- buildtools/pmdinfogen/pmdinfogen.h| 119 --- devtools/libabigail.abignore | 4 + doc/guides/freebsd_gsg/build_dpdk.rst | 3 +- doc/guides/linux_gsg/sys_reqs.rst | 6 + drivers/meson.build | 2 +- meson.build | 1 - 14 files changed, 225 insertions(+), 599 deletions(-) create mode 100755 buildtools/pmdinfogen.py delete mode 100644 buildtools/pmdinfogen/meson.build delete mode 100644 buildtools/pmdinfogen/pmdinfogen.c delete mode 100644 buildtools/pmdinfogen/pmdinfogen.h -- 2.29.2
[dpdk-dev] [PATCH v10 1/3] pmdinfogen: add Python implementation
Using a high-level, interpreted language simplifies maintenance and build process. Furthermore, ELF handling is delegated to pyelftools package. Original logic is kept, the copyright recognizes that. Signed-off-by: Dmitry Kozlyuk --- buildtools/pmdinfogen.py | 189 +++ 1 file changed, 189 insertions(+) create mode 100755 buildtools/pmdinfogen.py diff --git a/buildtools/pmdinfogen.py b/buildtools/pmdinfogen.py new file mode 100755 index 0..0cca47ff1 --- /dev/null +++ b/buildtools/pmdinfogen.py @@ -0,0 +1,189 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: BSD-3-Clause +# Copyright (c) 2016 Neil Horman +# Copyright (c) 2020 Dmitry Kozlyuk + +import argparse +import ctypes +import json +import sys +import tempfile + +from elftools.elf.elffile import ELFFile +from elftools.elf.sections import SymbolTableSection + + +class ELFSymbol: +def __init__(self, image, symbol): +self._image = image +self._symbol = symbol + +@property +def size(self): +return self._symbol["st_size"] + +@property +def value(self): +data = self._image.get_section_data(self._symbol["st_shndx"]) +base = self._symbol["st_value"] +return data[base:base + self.size] + +@property +def string_value(self): +value = self.value +return value[:-1].decode() if value else "" + + +class ELFImage: +def __init__(self, data): +self._image = ELFFile(data) +self._symtab = self._image.get_section_by_name(".symtab") +if not isinstance(self._symtab, SymbolTableSection): +raise Exception(".symtab section is not a symbol table") + +@property +def is_big_endian(self): +return not self._image.little_endian + +def get_section_data(self, name): +return self._image.get_section(name).data() + +def find_by_name(self, name): +symbol = self._symtab.get_symbol_by_name(name) +return ELFSymbol(self, symbol[0]) if symbol else None + +def find_by_prefix(self, prefix): +for i in range(self._symtab.num_symbols()): +symbol = self._symtab.get_symbol(i) +if symbol.name.startswith(prefix): +yield ELFSymbol(self, symbol) + + +def 
define_rte_pci_id(is_big_endian): +base_type = ctypes.LittleEndianStructure +if is_big_endian: +base_type = ctypes.BigEndianStructure + +class rte_pci_id(base_type): +_pack_ = True +_fields_ = [ +("class_id", ctypes.c_uint32), +("vendor_id", ctypes.c_uint16), +("device_id", ctypes.c_uint16), +("subsystem_vendor_id", ctypes.c_uint16), +("subsystem_device_id", ctypes.c_uint16), +] + +return rte_pci_id + + +class Driver: +OPTIONS = [ +("params", "_param_string_export"), +("kmod", "_kmod_dep_export"), +] + +def __init__(self, name, options): +self.name = name +for key, value in options.items(): +setattr(self, key, value) +self.pci_ids = [] + +@classmethod +def load(cls, image, symbol): +name = symbol.string_value + +options = {} +for key, suffix in cls.OPTIONS: +option_symbol = image.find_by_name("__%s%s" % (name, suffix)) +if option_symbol: +value = option_symbol.string_value +options[key] = value + +driver = cls(name, options) + +pci_table_name_symbol = image.find_by_name("__%s_pci_tbl_export" % name) +if pci_table_name_symbol: +driver.pci_ids = cls._load_pci_ids(image, pci_table_name_symbol) + +return driver + +@staticmethod +def _load_pci_ids(image, table_name_symbol): +table_name = table_name_symbol.string_value +table_symbol = image.find_by_name(table_name) +if not table_symbol: +raise Exception("PCI table declared but not defined: %s" % table_name) + +rte_pci_id = define_rte_pci_id(image.is_big_endian) + +pci_id_size = ctypes.sizeof(rte_pci_id) +pci_ids_desc = rte_pci_id * (table_symbol.size // pci_id_size) +pci_ids = pci_ids_desc.from_buffer_copy(table_symbol.value) +result = [] +for pci_id in pci_ids: +if not pci_id.device_id: +break +result.append([ +pci_id.vendor_id, +pci_id.device_id, +pci_id.subsystem_vendor_id, +pci_id.subsystem_device_id, +]) +return result + +def dump(self, file): +dumped = json.dumps(self.__dict__) +escaped = dumped.replace('"', '\\"') +print( +'const char %s_pmd_info[] __attribute__((used)) = "PMD_INFO_STRING= %s";' +% (self.name, 
escaped), +file=file, +) + + +def load_drivers(image): +drivers = [] +for symbol in image.find_by_prefix("this_pmd_name
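The endian-aware table decoding used by `_load_pci_ids` can be tried in isolation. The sketch below follows the field layout shown in the patch but is a standalone approximation, not the patch itself: it omits `_pack_` (the five fields are naturally aligned to 12 bytes), feeds raw bytes built with `struct.pack` instead of an ELF symbol, and returns only the vendor and device IDs. The names `define_pci_id` and `decode_pci_table` are hypothetical.

```python
import ctypes
import struct


def define_pci_id(is_big_endian):
    """Build a ctypes structure whose byte order matches the target image."""
    base = ctypes.BigEndianStructure if is_big_endian else ctypes.LittleEndianStructure

    class PciId(base):
        _fields_ = [
            ("class_id", ctypes.c_uint32),
            ("vendor_id", ctypes.c_uint16),
            ("device_id", ctypes.c_uint16),
            ("subsystem_vendor_id", ctypes.c_uint16),
            ("subsystem_device_id", ctypes.c_uint16),
        ]

    return PciId


def decode_pci_table(raw, is_big_endian):
    """Decode (vendor, device) pairs up to the all-zero sentinel entry."""
    pci_id = define_pci_id(is_big_endian)
    count = len(raw) // ctypes.sizeof(pci_id)
    table = (pci_id * count).from_buffer_copy(raw)
    ids = []
    for entry in table:
        if not entry.device_id:
            break  # a zero device ID marks the end of the table
        ids.append((entry.vendor_id, entry.device_id))
    return ids
```

Selecting the base class at run time is what lets a build host of one byte order read tables compiled for a target of the other.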
[dpdk-dev] [PATCH v10 2/3] build: use Python pmdinfogen
Use the same interpreter to run pmdinfogen as for other build scripts. Adjust wrapper script accordingly and also don't suppress stderr from ar and pmdinfogen. Add configure-time check for elftools Python module for Unix hosts. Add pyelftools to CI configuration and build requirements for Linux and FreeBSD. Windows targets are not currently using pmdinfogen. Suppress ABI warnings about generated PMD information strings. Signed-off-by: Dmitry Kozlyuk --- .github/workflows/build.yml | 4 ++-- .travis.yml | 2 +- buildtools/gen-pmdinfo-cfile.sh | 6 +++--- buildtools/meson.build| 15 +++ devtools/libabigail.abignore | 4 doc/guides/freebsd_gsg/build_dpdk.rst | 3 ++- doc/guides/linux_gsg/sys_reqs.rst | 6 ++ drivers/meson.build | 2 +- meson.build | 1 - 9 files changed, 34 insertions(+), 9 deletions(-) diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml index 0b72df0eb..a5b579add 100644 --- a/.github/workflows/build.yml +++ b/.github/workflows/build.yml @@ -91,8 +91,8 @@ jobs: run: sudo apt update - name: Install packages run: sudo apt install -y ccache libnuma-dev python3-setuptools -python3-wheel python3-pip ninja-build libbsd-dev libpcap-dev -libibverbs-dev libcrypto++-dev libfdt-dev libjansson-dev +python3-wheel python3-pip python3-pyelftools ninja-build libbsd-dev +libpcap-dev libibverbs-dev libcrypto++-dev libfdt-dev libjansson-dev - name: Install libabigail build dependencies if no cache is available if: env.ABI_CHECKS == 'true' && steps.libabigail-cache.outputs.cache-hit != 'true' run: sudo apt install -y autoconf automake libtool pkg-config libxml2-dev diff --git a/.travis.yml b/.travis.yml index 5aa7ad49f..4391af1d5 100644 --- a/.travis.yml +++ b/.travis.yml @@ -14,7 +14,7 @@ addons: apt: update: true packages: &required_packages - - [libnuma-dev, python3-setuptools, python3-wheel, python3-pip, ninja-build] + - [libnuma-dev, python3-setuptools, python3-wheel, python3-pip, python3-pyelftools, ninja-build] - [libbsd-dev, libpcap-dev, libibverbs-dev, 
libcrypto++-dev, libfdt-dev, libjansson-dev] _aarch64_packages: &aarch64_packages diff --git a/buildtools/gen-pmdinfo-cfile.sh b/buildtools/gen-pmdinfo-cfile.sh index 43059cf36..109ee461e 100755 --- a/buildtools/gen-pmdinfo-cfile.sh +++ b/buildtools/gen-pmdinfo-cfile.sh @@ -4,11 +4,11 @@ arfile=$1 output=$2 -pmdinfogen=$3 +shift 2 +pmdinfogen=$* # The generated file must not be empty if compiled in pedantic mode echo 'static __attribute__((unused)) const char *generator = "'$0'";' > $output for ofile in `ar t $arfile` ; do - ar p $arfile $ofile | $pmdinfogen - - >> $output 2> /dev/null + ar p $arfile $ofile | $pmdinfogen - - >> $output done -exit 0 diff --git a/buildtools/meson.build b/buildtools/meson.build index 04808dabc..dd4c0f640 100644 --- a/buildtools/meson.build +++ b/buildtools/meson.build @@ -17,3 +17,18 @@ else endif map_to_win_cmd = py3 + files('map_to_win.py') sphinx_wrapper = py3 + files('call-sphinx-build.py') +pmdinfogen = py3 + files('pmdinfogen.py') + +# TODO: starting from Meson 0.51.0 use +# python3 = import('python').find_installation('python', +# modules : python3_required_modules) +python3_required_modules = [] +if host_machine.system() != 'windows' + python3_required_modules = ['elftools'] +endif +foreach module : python3_required_modules + script = 'import importlib.util; import sys; exit(importlib.util.find_spec("@0@") is None)' + if run_command(py3, '-c', script.format(module)).returncode() != 0 + error('missing python module: @0@'.format(module)) + endif +endforeach diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore index 1dc84fa74..05afccc1a 100644 --- a/devtools/libabigail.abignore +++ b/devtools/libabigail.abignore @@ -16,3 +16,7 @@ [suppress_type] name = rte_cryptodev has_data_member_inserted_between = {0, 1023} + +; Ignore all changes in generated PMD information strings. 
+[suppress_variable] +name_regex = _pmd_info$ diff --git a/doc/guides/freebsd_gsg/build_dpdk.rst b/doc/guides/freebsd_gsg/build_dpdk.rst index e3005a7f3..bed353473 100644 --- a/doc/guides/freebsd_gsg/build_dpdk.rst +++ b/doc/guides/freebsd_gsg/build_dpdk.rst @@ -14,10 +14,11 @@ The following FreeBSD packages are required to build DPDK: * meson * ninja * pkgconf +* py37-pyelftools These can be installed using (as root):: - pkg install meson pkgconf + pkg install meson pkgconf py37-pyelftools To compile the required kernel modules for memory management and working with physical NIC devices, the kernel sources for FreeBSD also diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst index be714adf2..a05b5bd81 100644 -
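The configure-time check added to buildtools/meson.build runs a Python one-liner built around `importlib.util.find_spec`. The same test, written out as a small function, might look like the sketch below; `missing_modules` is a hypothetical name, and the sketch assumes top-level module names only, as in the patch's `['elftools']` list.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

`find_spec` locates a module without executing it, which keeps the check cheap and free of import side effects.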
[dpdk-dev] [PATCH v10 3/3] pmdinfogen: remove C implementation
Delete the files no longer used in build process. Add myself as maintainer of new implementation. Signed-off-by: Dmitry Kozlyuk --- MAINTAINERS| 3 +- buildtools/pmdinfogen/meson.build | 14 - buildtools/pmdinfogen/pmdinfogen.c | 456 - buildtools/pmdinfogen/pmdinfogen.h | 119 4 files changed, 2 insertions(+), 590 deletions(-) delete mode 100644 buildtools/pmdinfogen/meson.build delete mode 100644 buildtools/pmdinfogen/pmdinfogen.c delete mode 100644 buildtools/pmdinfogen/pmdinfogen.h diff --git a/MAINTAINERS b/MAINTAINERS index aa973a396..65f6fffd1 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -133,7 +133,8 @@ F: lib/*/*.map Driver information M: Neil Horman -F: buildtools/pmdinfogen/ +M: Dmitry Kozlyuk +F: buildtools/pmdinfogen.py F: usertools/dpdk-pmdinfo.py F: doc/guides/tools/pmdinfo.rst diff --git a/buildtools/pmdinfogen/meson.build b/buildtools/pmdinfogen/meson.build deleted file mode 100644 index 670528fac..0 --- a/buildtools/pmdinfogen/meson.build +++ /dev/null @@ -1,14 +0,0 @@ -# SPDX-License-Identifier: BSD-3-Clause -# Copyright(c) 2017 Intel Corporation - -if is_windows - subdir_done() -endif - -pmdinfogen_inc = [global_inc] -pmdinfogen_inc += include_directories('../../lib/librte_eal/include') -pmdinfogen_inc += include_directories('../../lib/librte_pci') -pmdinfogen = executable('pmdinfogen', - 'pmdinfogen.c', - include_directories: pmdinfogen_inc, - native: true) diff --git a/buildtools/pmdinfogen/pmdinfogen.c b/buildtools/pmdinfogen/pmdinfogen.c deleted file mode 100644 index a68d1ea99..0 --- a/buildtools/pmdinfogen/pmdinfogen.c +++ /dev/null @@ -1,456 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 - * Postprocess pmd object files to export hw support - * - * Copyright 2016 Neil Horman - * Based in part on modpost.c from the linux kernel - */ - -#include -#include -#include -#include -#include -#include -#include - -#include -#include "pmdinfogen.h" - -#ifdef RTE_ARCH_64 -#define ADDR_SIZE 64 -#else -#define ADDR_SIZE 32 -#endif - -static int use_stdin, 
use_stdout; - -static const char *sym_name(struct elf_info *elf, Elf_Sym *sym) -{ - if (sym) - return elf->strtab + sym->st_name; - else - return "(unknown)"; -} - -static void *grab_file(const char *filename, unsigned long *size) -{ - struct stat st; - void *map = MAP_FAILED; - int fd = -1; - - if (!use_stdin) { - fd = open(filename, O_RDONLY); - if (fd < 0) - return NULL; - } else { - /* from stdin, use a temporary file to mmap */ - FILE *infile; - char buffer[1024]; - int n; - - infile = tmpfile(); - if (infile == NULL) { - perror("tmpfile"); - return NULL; - } - fd = dup(fileno(infile)); - fclose(infile); - if (fd < 0) - return NULL; - - n = read(STDIN_FILENO, buffer, sizeof(buffer)); - while (n > 0) { - if (write(fd, buffer, n) != n) - goto failed; - n = read(STDIN_FILENO, buffer, sizeof(buffer)); - } - } - - if (fstat(fd, &st)) - goto failed; - - *size = st.st_size; - map = mmap(NULL, *size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0); - -failed: - close(fd); - if (map == MAP_FAILED) - return NULL; - return map; -} - -/** - * Return a copy of the next line in a mmap'ed file. - * spaces in the beginning of the line is trimmed away. - * Return a pointer to a static buffer. 
- **/ -static void release_file(void *file, unsigned long size) -{ - munmap(file, size); -} - - -static void *get_sym_value(struct elf_info *info, const Elf_Sym *sym) -{ - return RTE_PTR_ADD(info->hdr, - info->sechdrs[sym->st_shndx].sh_offset + sym->st_value); -} - -static Elf_Sym *find_sym_in_symtab(struct elf_info *info, - const char *name, Elf_Sym *last) -{ - Elf_Sym *idx; - if (last) - idx = last+1; - else - idx = info->symtab_start; - - for (; idx < info->symtab_stop; idx++) { - const char *n = sym_name(info, idx); - if (!strncmp(n, name, strlen(name))) - return idx; - } - return NULL; -} - -static int parse_elf(struct elf_info *info, const char *filename) -{ - unsigned int i; - Elf_Ehdr *hdr; - Elf_Shdr *sechdrs; - Elf_Sym *sym; - int endian; - unsigned int symtab_idx = ~0U, symtab_shndx_idx = ~0U; - - hdr = grab_file(filename, &info->size); - if (!hdr) { - perror(filename); - exit(1)
Re: [dpdk-dev] [PATCH v8 2/3] build: use Python pmdinfogen
On Sat, 23 Jan 2021 12:38:45 +0100, Thomas Monjalon wrote: > 22/01/2021 23:24, Dmitry Kozlyuk: > > On Fri, 22 Jan 2021 21:57:15 +0100, Thomas Monjalon wrote: > > > 22/01/2021 21:31, Dmitry Kozlyuk: > > > > On Wed, 20 Jan 2021 11:24:21 +0100, Thomas Monjalon wrote: > > > > > 20/01/2021 08:23, Dmitry Kozlyuk: > > > > > > On Wed, 20 Jan 2021 01:05:59 +0100, Thomas Monjalon wrote: > > > > > > > This is now the right timeframe to introduce this change > > > > > > > with the new Python module dependency. > > > > > > > Unfortunately, the ABI check is returning an issue: > > > > > > > > > > > > > > 'const char mlx5_common_pci_pmd_info[62]' was changed > > > > > > > to 'const char mlx5_common_pci_pmd_info[60]' at > > > > > > > rte_common_mlx5.pmd.c > > > > > > > > > > > > Will investigate and fix ASAP. > > > > > > > > Now that I think of it: strings like this change every time new PCI IDs > > > > are > > > > added to a PMD, but AFAIK adding PCI IDs is not considered an ABI > > > > breakage, > > > > is it? One example is 28c9a7d7b48e ("net/mlx5: add ConnectX-6 Lx device > > > > ID") > > > > added 2020-07-08, i.e. clearly outside of ABI change window. > > > > > > You're right. > > > > > > > "xxx_pmd_info" changes are due to JSON formatting (new is more > > > > canonical), > > > > which can be worked around easily, if the above is wrong. > > > > > > If the new format is better, please keep it. > > > What we need is an exception for the pmdinfo symbols > > > in the file devtools/libabigail.abignore. > > > You can probably use a regex for these symbols. > > > > This would allow real breakages to pass ABI check, abidiff doesn't analyze > > variable content and it's not easy to compare. Maybe later a script can be > > added that checks lines with RTE_DEVICE_IN in patches. There are at most 32 > > of > > 5494 relevant commits between 19.11 and 20.11, though. 
> > > > To verify there are no meaningful changes I ensured empty diff between > > results of the following command for "main" and the branch: > > > > find build/drivers -name '*.so' -exec usertools/dpdk-pmdinfo.py > > For now we cannot do such check as part of the ABI checker. > And we cannot merge this patch if the ABI check fails. > I think the only solution is to allow any change in the pmdinfo variables. Sent v10 with the suppression. Such a check, however, *can* be implemented: at the ABI check stage we have two install directories that dpdk-pmdinfo.py can inspect. A script can then check that the diff contains only additions, i.e. that no device support is removed.
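The "only additions" check proposed here could be sketched as below. No such script exists in the series; this is a hypothetical illustration that assumes the PMD information of both installs has already been collected into dictionaries mapping a driver name to its list of PCI ID entries. How that data is extracted from dpdk-pmdinfo.py output is deliberately left out.

```python
def removed_devices(old_info, new_info):
    """Return PCI IDs present in old_info but missing from new_info, per driver."""
    removed = {}
    for driver, old_ids in old_info.items():
        new_ids = {tuple(entry) for entry in new_info.get(driver, [])}
        gone = [tuple(entry) for entry in old_ids if tuple(entry) not in new_ids]
        if gone:
            removed[driver] = gone
    return removed
```

An empty result means every previously supported device is still listed, so newly added IDs pass while removals are reported.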
[dpdk-dev] [RFC 0/1] lib/librte_ethdev: Meter algorithms support packet per second.
Li Zhang (1): lib/librte_ethdev: add definitions for packet per second. lib/librte_ethdev/rte_mtr.h | 15 +++ 1 file changed, 15 insertions(+) -- 2.27.0
[dpdk-dev] [PATCH] [RFC]: adds support PPS(packet per second) on meter
Currently the flow meter algorithms in rte_flow support only bytes per second (BPS), for example the Single Rate Three Color Marker (srTCM, RFC 2697). This RFC adds a packets per second (PPS) definition to the meter algorithm structures, so that the rte_mtr APIs can support srTCM in PPS mode. The following structures will be extended: rte_mtr_algorithm rte_mtr_meter_profile Signed-off-by: Li Zhang --- lib/librte_ethdev/rte_mtr.h | 28 1 file changed, 28 insertions(+) diff --git a/lib/librte_ethdev/rte_mtr.h b/lib/librte_ethdev/rte_mtr.h index 916a09c5c3..6413892aec 100644 --- a/lib/librte_ethdev/rte_mtr.h +++ b/lib/librte_ethdev/rte_mtr.h @@ -119,6 +119,9 @@ enum rte_mtr_algorithm { /** Two Rate Three Color Marker (trTCM) - IETF RFC 4115. */ RTE_MTR_TRTCM_RFC4115, + + /** Single Rate Three Color Marker (srTCM) in Packet per second mode */ + RTE_MTR_SRTCM_PPS, }; /** @@ -171,6 +174,18 @@ struct rte_mtr_meter_profile { /** Excess Burst Size (EBS) (bytes). */ uint64_t ebs; } trtcm_rfc4115; + + /** Items only valid when *alg* is set to srTCM - PPS. */ + struct { + /** Committed Information Rate (CIR)(packets/second). */ + uint64_t cir; + + /** Committed Burst Size (CBS) (bytes). */ + uint64_t cbs; + + /** Excess Burst Size (EBS) (bytes). */ + uint64_t ebs; + } srtcm_pps; }; }; @@ -317,6 +332,13 @@ struct rte_mtr_capabilities { */ uint32_t meter_trtcm_rfc4115_n_max; + /** Maximum number of MTR objects that can have their meter configured +* to run the srTCM packet per second algorithm. The value of 0 +* indicates this metering algorithm is not supported. +* The maximum value is *n_max*. +*/ + uint32_t meter_srtcm_pps_n_max; + /** Maximum traffic rate that can be metered by a single MTR object. For * srTCM RFC 2697, this is the maximum CIR rate. For trTCM RFC 2698, * this is the maximum PIR rate. 
For trTCM RFC 4115, this is the maximum @@ -342,6 +364,12 @@ struct rte_mtr_capabilities { */ int color_aware_trtcm_rfc4115_supported; + /** + * When non-zero, it indicates that color aware mode is supported for + * the srTCM packet per second metering algorithm. + */ + int color_aware_srtcm_pps_supported; + /** When non-zero, it indicates that the policer packet recolor actions * are supported. * @see enum rte_mtr_policer_action -- 2.21.0
[dpdk-dev] [PATCH] [RFC, v2]: adds support PPS(packet per second) on meter
Currently the flow meter algorithms in rte_flow support only bytes per second (BPS), for example the Single Rate Three Color Marker (srTCM, RFC 2697). This RFC adds a packets per second (PPS) definition to the meter algorithm structures, so that the rte_mtr APIs can support srTCM in PPS mode. The following structures will be extended: rte_mtr_algorithm rte_mtr_meter_profile Signed-off-by: Li Zhang --- lib/librte_ethdev/rte_mtr.h | 28 1 file changed, 28 insertions(+) diff --git a/lib/librte_ethdev/rte_mtr.h b/lib/librte_ethdev/rte_mtr.h index 916a09c5c3..3e88904faf 100644 --- a/lib/librte_ethdev/rte_mtr.h +++ b/lib/librte_ethdev/rte_mtr.h @@ -119,6 +119,9 @@ enum rte_mtr_algorithm { /** Two Rate Three Color Marker (trTCM) - IETF RFC 4115. */ RTE_MTR_TRTCM_RFC4115, + + /** Single Rate Three Color Marker (srTCM) in Packet per second mode */ + RTE_MTR_SRTCM_PPS, }; /** @@ -171,6 +174,18 @@ struct rte_mtr_meter_profile { /** Excess Burst Size (EBS) (bytes). */ uint64_t ebs; } trtcm_rfc4115; + + /** Items only valid when *alg* is set to srTCM - PPS. */ + struct { + /** Committed Information Rate (CIR)(packets/second). */ + uint64_t cir; + + /** Committed Burst Size (CBS) (bytes). */ + uint64_t cbs; + + /** Excess Burst Size (EBS) (bytes). */ + uint64_t ebs; + } srtcm_pps; }; }; @@ -317,6 +332,13 @@ struct rte_mtr_capabilities { */ uint32_t meter_trtcm_rfc4115_n_max; + /** Maximum number of MTR objects that can have their meter configured +* to run the srTCM packet per second algorithm. The value of 0 +* indicates this metering algorithm is not supported. +* The maximum value is *n_max*. +*/ + uint32_t meter_srtcm_pps_n_max; + /** Maximum traffic rate that can be metered by a single MTR object. For * srTCM RFC 2697, this is the maximum CIR rate. For trTCM RFC 2698, * this is the maximum PIR rate. 
For trTCM RFC 4115, this is the maximum @@ -342,6 +364,12 @@ struct rte_mtr_capabilities { */ int color_aware_trtcm_rfc4115_supported; + /** +* When non-zero, it indicates that color aware mode is supported for +* the srTCM packet per second metering algorithm. +*/ + int color_aware_srtcm_pps_supported; + /** When non-zero, it indicates that the policer packet recolor actions * are supported. * @see enum rte_mtr_policer_action -- 2.21.0
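What srTCM in packet mode could mean in practice is sketched below. This is an assumption-laden illustration, not the behaviour of any driver: it applies the color-blind RFC 2697 rules with every packet costing one token, and it counts CBS and EBS in packets, whereas the comments in the RFC patch still describe them in bytes. The class name `SrtcmPps` and the microsecond timestamps are hypothetical.

```python
GREEN, YELLOW, RED = "green", "yellow", "red"


class SrtcmPps:
    """Color-blind srTCM in the style of RFC 2697, with buckets counted in packets."""

    def __init__(self, cir, cbs, ebs):
        self.cir = cir    # committed rate, packets per second (must be > 0)
        self.cbs = cbs    # committed burst, in packets (assumption)
        self.ebs = ebs    # excess burst, in packets (assumption)
        self.tc = cbs     # both buckets start full
        self.te = ebs
        self.last_us = 0  # time of the last refill, in microseconds

    def _refill(self, now_us):
        tokens = (now_us - self.last_us) * self.cir // 1_000_000
        if tokens <= 0:
            return
        # Advance only by the time that produced whole tokens.
        self.last_us += tokens * 1_000_000 // self.cir
        # New tokens fill the committed bucket first, then the excess bucket.
        to_c = min(tokens, self.cbs - self.tc)
        self.tc += to_c
        self.te = min(self.ebs, self.te + tokens - to_c)

    def color(self, now_us):
        """Meter one packet arriving at now_us and return its color."""
        self._refill(now_us)
        if self.tc >= 1:
            self.tc -= 1
            return GREEN
        if self.te >= 1:
            self.te -= 1
            return YELLOW
        return RED
```

Because the cost per packet is fixed, the meter no longer depends on packet length, which is the point of a PPS mode.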
[dpdk-dev] [PATCH] doc: clarify disclosure time slot when no response
Sometimes the security team does not send a confirmation mail back to the reporter within three business days. This means the reported vulnerability is either of low severity or not a real vulnerability. The reporter should assume that the issue needs the shortest embargo. After that, the reporter can submit it through the normal Bugzilla process or publish a fix patch. Signed-off-by: Marvin Liu Signed-off-by: Qian Xu diff --git a/doc/guides/contributing/vulnerability.rst b/doc/guides/contributing/vulnerability.rst index b6300252ad..cda814fa69 100644 --- a/doc/guides/contributing/vulnerability.rst +++ b/doc/guides/contributing/vulnerability.rst @@ -99,6 +99,11 @@ Following information must be included in the mail: * Reporter credit * Bug ID (empty and restricted for future reference) +If no confirmation mail is sent back to the reporter in this period, it means the security +team considers this vulnerability to be of low severity. The shortest embargo, **two weeks**, +is then required. The reporter can submit the bug through the normal process or +publish a patch. + CVE Request --- -- 2.17.1
Re: [dpdk-dev] [PATCH v5 2/3] PCI: support MMIO in rte_pci_ioport_map/unap/read/write
On 2021/1/24 23:22, Xueming(Steven) Li wrote: + } else if (flags & IORESOURCE_MEM) { + iobar = 0; + base = (unsigned long)dev->mem_resource[bar].addr; + RTE_LOG(INFO, EAL, "%s(): MMIO BAR %08lx detected\n", __func__, base); Same here, INFO level seems chatty. makes sense. would remove it. + } else { + RTE_LOG(ERR, EAL, "%s(): unknown BAR type\n", __func__); + goto error; + } + + + if (iobar && rte_eal_iopl_init() != 0) { + RTE_LOG(ERR, EAL, "%s(): insufficient ioport permissions for PCI device %s\n", + __func__, dev->name); goto error; } Same as Maxime's suggestion, please move this block as well. Thanks. It is already moved in v6 patch. - base = (unsigned long)phys_addr; - RTE_LOG(INFO, EAL, "%s(): PIO BAR %08lx detected\n", __func__, base); - if (base > UINT16_MAX) + if (iobar && (base > UINT16_MAX)) { PIO_MAX defined below, please use it here. UNI16_MAX used in patch 1/3 as well. ok.
[dpdk-dev] [PATCH v2] net/ixgbe: disable NFS filtering
From: Dapeng Yu

Disable NFS header filtering whether or not NFS packet coalescing is required. This behavior is aligned with the ixgbe kernel driver.

Fixes: b826efba6de4 ("net/ixgbe: align register setting when RSC is disabled")
Fixes: 8eecb3295aed ("ixgbe: add LRO support")
Cc: sta...@dpdk.org

Signed-off-by: Dapeng Yu
---
 drivers/net/ixgbe/ixgbe_rxtx.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index cc8f70e6d..2efd054f7 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -4923,15 +4923,11 @@ ixgbe_set_rsc(struct rte_eth_dev *dev)
 	/* RFCTL configuration */
 	rfctl = IXGBE_READ_REG(hw, IXGBE_RFCTL);
 	if ((rsc_capable) && (rx_conf->offloads & DEV_RX_OFFLOAD_TCP_LRO))
-		/*
-		 * Since NFS packets coalescing is not supported - clear
-		 * RFCTL.NFSW_DIS and RFCTL.NFSR_DIS when RSC is
-		 * enabled.
-		 */
-		rfctl &= ~(IXGBE_RFCTL_RSC_DIS | IXGBE_RFCTL_NFSW_DIS |
-			   IXGBE_RFCTL_NFSR_DIS);
+		rfctl &= ~IXGBE_RFCTL_RSC_DIS;
 	else
 		rfctl |= IXGBE_RFCTL_RSC_DIS;
+	/* disable NFS filtering */
+	rfctl |= IXGBE_RFCTL_NFSW_DIS | IXGBE_RFCTL_NFSR_DIS;
 	IXGBE_WRITE_REG(hw, IXGBE_RFCTL, rfctl);
 
 	/* If LRO hasn't been requested - we are done here. */
-- 
2.27.0
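For readers who want to see the register arithmetic in isolation, the following is a minimal sketch of the RFCTL value the patched code would write. It is not driver code: the helper name `rfctl_after_patch` and the `RFCTL_*` bit positions are assumptions made up for illustration, and the real definitions should be taken from the ixgbe headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only (assumption); see the ixgbe headers. */
#define RFCTL_RSC_DIS  (1u << 5)
#define RFCTL_NFSW_DIS (1u << 6)
#define RFCTL_NFSR_DIS (1u << 7)

/* Hypothetical helper: value written to RFCTL after the patch is applied. */
uint32_t rfctl_after_patch(uint32_t rfctl, bool rsc_capable, bool lro_requested)
{
	if (rsc_capable && lro_requested)
		rfctl &= ~RFCTL_RSC_DIS;   /* RSC enabled */
	else
		rfctl |= RFCTL_RSC_DIS;    /* RSC stays disabled */

	/* NFS filtering is disabled in both branches. */
	rfctl |= RFCTL_NFSW_DIS | RFCTL_NFSR_DIS;
	return rfctl;
}
```

The point of the sketch is that the two NFS bits no longer depend on the RSC decision; only `RFCTL_RSC_DIS` does.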
[dpdk-dev] [PATCH v2] app/testpmd: avoid exit without terminal restore
From: Dapeng Yu

In interactive mode, if testpmd exits by calling rte_exit() without restoring the terminal attributes, the terminal will no longer echo keyboard input.

Register a function with atexit() in prompt(). When exit() is called from rte_exit(), the registered function restores the terminal attributes.

Fixes: 5a8fb55c48ab ("app/testpmd: support unidirectional configuration")
Cc: sta...@dpdk.org

Signed-off-by: Dapeng Yu
---
 app/test-pmd/cmdline.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 89034c8b7..f7e18ba3d 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -17116,6 +17116,7 @@ cmdline_read_from_file(const char *filename)
 void
 prompt(void)
 {
+	int ret;
 	/* initialize non-constant commands */
 	cmd_set_fwd_mode_init();
 	cmd_set_fwd_retry_mode_init();
@@ -17123,15 +17124,23 @@ prompt(void)
 	testpmd_cl = cmdline_stdin_new(main_ctx, "testpmd> ");
 	if (testpmd_cl == NULL)
 		return;
+
+	ret = atexit(prompt_exit);
+	if (ret != 0)
+		printf("Cannot set exit function for cmdline\n");
+
 	cmdline_interact(testpmd_cl);
-	cmdline_stdin_exit(testpmd_cl);
+	if (ret != 0)
+		cmdline_stdin_exit(testpmd_cl);
 }
 
 void
 prompt_exit(void)
 {
-	if (testpmd_cl != NULL)
+	if (testpmd_cl != NULL) {
 		cmdline_quit(testpmd_cl);
+		cmdline_stdin_exit(testpmd_cl);
+	}
 }
 
 static void
-- 
2.27.0
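The general technique, saving terminal attributes and restoring them from an atexit() handler, can be sketched outside testpmd with plain POSIX termios calls. This is only an illustration under stated assumptions: `enter_prompt_mode` and `restore_terminal` are hypothetical names, and the DPDK cmdline library does its own bookkeeping that is not reproduced here.

```c
#include <stdlib.h>
#include <termios.h>

static struct termios saved_attrs;
static int saved_fd = -1; /* -1: nothing saved, nothing to restore */

/* Exit handler: put the terminal back the way it was found. */
void restore_terminal(void)
{
	if (saved_fd >= 0) {
		tcsetattr(saved_fd, TCSANOW, &saved_attrs);
		saved_fd = -1;
	}
}

/*
 * Hypothetical helper: save the attributes of fd, register the exit
 * handler, then turn off canonical mode and echo as a line editor would.
 * Returns 0 on success, -1 if fd is not a terminal or a call fails.
 */
int enter_prompt_mode(int fd)
{
	struct termios raw;

	if (tcgetattr(fd, &saved_attrs) != 0)
		return -1;
	saved_fd = fd;

	if (atexit(restore_terminal) != 0) {
		saved_fd = -1;
		return -1;
	}

	raw = saved_attrs;
	raw.c_lflag &= ~(tcflag_t)(ICANON | ECHO);
	if (tcsetattr(fd, TCSANOW, &raw) != 0) {
		saved_fd = -1;
		return -1;
	}
	return 0;
}
```

A caller would typically pass the standard input descriptor. Because the handler clears `saved_fd` after restoring, calling it both explicitly and again at exit is harmless.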
[dpdk-dev] [PATCH v1] net/iavf: fix unsupported VLAN offload requested
If the underlying PF doesn't support a specific ethertype or the ability to toggle VLAN insertion and/or stripping, then the VF prevents sending an invalid message to the PF.

Fixes: 1c301e8c3cff ("net/iavf: support new VLAN capabilities")

Signed-off-by: Haiyue Wang
---
 drivers/net/iavf/iavf_vchnl.c | 36 ---
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
index c82925eceb..9b8c4d113a 100644
--- a/drivers/net/iavf/iavf_vchnl.c
+++ b/drivers/net/iavf/iavf_vchnl.c
@@ -528,23 +528,21 @@ int
 iavf_config_vlan_strip_v2(struct iavf_adapter *adapter, bool enable)
 {
 	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
-	struct virtchnl_vlan_supported_caps *supported_caps;
+	struct virtchnl_vlan_supported_caps *stripping_caps;
 	struct virtchnl_vlan_setting vlan_strip;
 	struct iavf_cmd_info args;
-	uint32_t stripping_caps;
 	uint32_t *ethertype;
 	int ret;
 
-	supported_caps = &vf->vlan_v2_caps.offloads.stripping_support;
-	if (supported_caps->outer) {
-		stripping_caps = supported_caps->outer;
+	stripping_caps = &vf->vlan_v2_caps.offloads.stripping_support;
+
+	if ((stripping_caps->outer & VIRTCHNL_VLAN_ETHERTYPE_8100) &&
+	    (stripping_caps->outer & VIRTCHNL_VLAN_TOGGLE))
 		ethertype = &vlan_strip.outer_ethertype_setting;
-	} else {
-		stripping_caps = supported_caps->inner;
+	else if ((stripping_caps->inner & VIRTCHNL_VLAN_ETHERTYPE_8100) &&
+		 (stripping_caps->inner & VIRTCHNL_VLAN_TOGGLE))
 		ethertype = &vlan_strip.inner_ethertype_setting;
-	}
-
-	if (!(stripping_caps & VIRTCHNL_VLAN_ETHERTYPE_8100))
+	else
 		return -ENOTSUP;
 
 	memset(&vlan_strip, 0, sizeof(vlan_strip));
@@ -570,23 +568,21 @@ int
 iavf_config_vlan_insert_v2(struct iavf_adapter *adapter, bool enable)
 {
 	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
-	struct virtchnl_vlan_supported_caps *supported_caps;
+	struct virtchnl_vlan_supported_caps *insertion_caps;
 	struct virtchnl_vlan_setting vlan_insert;
 	struct iavf_cmd_info args;
-	uint32_t insertion_caps;
 	uint32_t *ethertype;
 	int ret;
 
-	supported_caps = &vf->vlan_v2_caps.offloads.insertion_support;
-	if (supported_caps->outer) {
-		insertion_caps = supported_caps->outer;
+	insertion_caps = &vf->vlan_v2_caps.offloads.insertion_support;
+
+	if ((insertion_caps->outer & VIRTCHNL_VLAN_ETHERTYPE_8100) &&
+	    (insertion_caps->outer & VIRTCHNL_VLAN_TOGGLE))
 		ethertype = &vlan_insert.outer_ethertype_setting;
-	} else {
-		insertion_caps = supported_caps->inner;
+	else if ((insertion_caps->inner & VIRTCHNL_VLAN_ETHERTYPE_8100) &&
+		 (insertion_caps->inner & VIRTCHNL_VLAN_TOGGLE))
 		ethertype = &vlan_insert.inner_ethertype_setting;
-	}
-
-	if (!(insertion_caps & VIRTCHNL_VLAN_ETHERTYPE_8100))
+	else
 		return -ENOTSUP;
 
 	memset(&vlan_insert, 0, sizeof(vlan_insert));
-- 
2.30.0
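The selection rule the patch introduces (use the outer field only if the outer capabilities include both the 0x8100 ethertype and the toggle bit, otherwise try the inner field, otherwise give up) can be sketched on its own. The struct layouts, the `CAP_*` bit values, and the helper names below are simplified stand-ins invented for illustration, not the virtchnl definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative capability bits (assumption, not the virtchnl values). */
#define CAP_ETHERTYPE_8100 (1u << 0)
#define CAP_TOGGLE         (1u << 1)

/* Simplified stand-ins for the virtchnl capability and setting structs. */
struct vlan_caps {
	uint32_t outer;
	uint32_t inner;
};

struct vlan_setting {
	uint32_t outer_ethertype_setting;
	uint32_t inner_ethertype_setting;
};

/* A capability word is usable only if both bits are present. */
static int cap_usable(uint32_t caps)
{
	return (caps & CAP_ETHERTYPE_8100) && (caps & CAP_TOGGLE);
}

/*
 * Hypothetical helper: returns the setting field to fill in, or NULL
 * when neither outer nor inner offload can be toggled (the driver
 * returns -ENOTSUP in that case).
 */
uint32_t *pick_ethertype_field(const struct vlan_caps *caps,
			       struct vlan_setting *setting)
{
	if (cap_usable(caps->outer))
		return &setting->outer_ethertype_setting;
	if (cap_usable(caps->inner))
		return &setting->inner_ethertype_setting;
	return NULL;
}
```

Compared with the removed code, an outer capability word that is non-zero but incomplete no longer blocks the inner path.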
[dpdk-dev] [Bug 435] Proposed improvement to non-interactive loop timing
https://bugs.dpdk.org/show_bug.cgi?id=435

Zhang, RobinX (robinx.zh...@intel.com) changed:

           What    |Removed                     |Added
         Resolution|---                         |WONTFIX
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |robinx.zh...@intel.com

--- Comment #1 from Zhang, RobinX (robinx.zh...@intel.com) ---
Hi Charlie,

You can send your proposal to the community for further discussion. Here is the contributor's guideline:
https://doc.dpdk.org/guides/contributing/index.html

Since this Bugzilla entry is not a bug, I will close it. Thanks.

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [dpdk-dev] [PATCH v3 1/3] ethdev: fix MTU doesn't update when jumbo frame disabled
Hi Steve,

In the current modification, the MTU is updated based on 'max_rx_pkt_len' regardless of whether jumbo frame is enabled. Now the MTU is correct when jumbo frame is disabled. However, when jumbo frame is enabled, the MTU value may be inconsistent with the definition of an enabled jumbo frame. For example:

1/ DEV_RX_OFFLOAD_JUMBO_FRAME is set;
2/ max_rx_pkt_len = 1200
3/ dev->data->mtu = 1200 - overhead_len(18) = 1182

In the rte_eth_dev_configure API, the check for 'max_rx_pkt_len' is as follows:

if (dev_conf->rxmode.offloads & DEV_RX_OFFLOAD_JUMBO_FRAME) { //jumbo frame enabled
	if (dev_conf->rxmode.max_rx_pkt_len > dev_info.max_rx_pktlen) {
		goto rollback;
	} else if (*dev_conf->rxmode.max_rx_pkt_len < RTE_ETHER_MIN_LEN*) {
		goto rollback;
	}
} else { //jumbo frame disabled
	if (pktlen < RTE_ETHER_MIN_MTU + overhead_len ||
	    pktlen > RTE_ETHER_MTU + overhead_len)
		/* Use default value */
		dev->data->dev_conf.rxmode.max_rx_pkt_len = RTE_ETHER_MTU + overhead_len;
}

The application sets DEV_RX_OFFLOAD_JUMBO_FRAME to enable jumbo frame, and the framework API needs to update the MTU based on 'max_rx_pkt_len'. However, the framework API uses *RTE_ETHER_MIN_LEN(64)* to verify the lower bound of 'max_rx_pkt_len', instead of "RTE_ETHER_MTU + overhead_len".

As far as I know, if the application sets DEV_RX_OFFLOAD_JUMBO_FRAME and 'max_rx_pkt_len' is 1200, the framework API or the driver should return a failure.

As mentioned in this patch set, testpmd sets the jumbo frame offload only when the requested 'max_rx_pkt_len' is greater than "RTE_ETHER_MTU + eth_overhead".

I really don't understand it. How do you understand this behavior?

Thanks.

On 2021/1/22 17:01, Steve Yang wrote:

The MTU value should be updated to 'max_rx_pkt_len - overhead' regardless of whether the JUMBO FRAME offload is enabled. If the MTU is not updated, the user will get wrong MTU info from some commands, e.g. 'show port info all' in the testpmd tool.

Actually, 'max_rx_pkt_len' is already used for other purposes in many places, even though it is documented as 'Only used if JUMBO_FRAME enabled'. For example, 'max_rx_pkt_len' may be used as 'rx_ctx.rxmax' in i40e.

Fixes: bf0f90d92d30 ("ethdev: fix max Rx packet length check")

Signed-off-by: Steve Yang
---
 lib/librte_ethdev/rte_ethdev.c | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index daf5f24f7e..42857e3b67 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -1421,10 +1421,6 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 			ret = -EINVAL;
 			goto rollback;
 		}
-
-		/* Scale the MTU size to adapt max_rx_pkt_len */
-		dev->data->mtu = dev->data->dev_conf.rxmode.max_rx_pkt_len -
-				overhead_len;
 	} else {
 		uint16_t pktlen = dev_conf->rxmode.max_rx_pkt_len;
 		if (pktlen < RTE_ETHER_MIN_MTU + overhead_len ||
@@ -1434,6 +1430,10 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
 				RTE_ETHER_MTU + overhead_len;
 	}
 
+	/* Scale the MTU size to adapt max_rx_pkt_len */
+	dev->data->mtu = dev->data->dev_conf.rxmode.max_rx_pkt_len -
+			overhead_len;
+
 	/*
 	 * If LRO is enabled, check that the maximum aggregated packet
 	 * size is supported by the configured device.
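To make the arithmetic in this thread concrete, here is a small standalone sketch of the length check and the MTU derivation as the quoted patch arranges them. It is a simplification, not the ethdev code: `resolve_mtu` is a hypothetical helper, the `ETHER_*` constants mirror the values of RTE_ETHER_MIN_LEN (64), RTE_ETHER_MIN_MTU (68) and RTE_ETHER_MTU (1500), and the overhead of 18 bytes is the figure used in the example above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define ETHER_MIN_LEN 64u   /* mirrors RTE_ETHER_MIN_LEN */
#define ETHER_MIN_MTU 68u   /* mirrors RTE_ETHER_MIN_MTU */
#define ETHER_MTU     1500u /* mirrors RTE_ETHER_MTU */
#define OVERHEAD_LEN  18u   /* overhead_len from the example in the mail */

/*
 * Hypothetical helper: validate max_rx_pkt_len as the configure path
 * does, then derive the MTU from it in both the jumbo and non-jumbo case.
 * Returns 0 and stores the MTU, or -EINVAL when the jumbo check fails.
 */
int resolve_mtu(bool jumbo, uint32_t max_rx_pkt_len,
		uint32_t dev_max_rx_pktlen, uint32_t *mtu)
{
	if (jumbo) {
		if (max_rx_pkt_len > dev_max_rx_pktlen ||
		    max_rx_pkt_len < ETHER_MIN_LEN)
			return -EINVAL;
	} else if (max_rx_pkt_len < ETHER_MIN_MTU + OVERHEAD_LEN ||
		   max_rx_pkt_len > ETHER_MTU + OVERHEAD_LEN) {
		/* Out of range without jumbo: fall back to the default. */
		max_rx_pkt_len = ETHER_MTU + OVERHEAD_LEN;
	}

	/* After the patch, this runs for both branches. */
	*mtu = max_rx_pkt_len - OVERHEAD_LEN;
	return 0;
}
```

With these constants, a jumbo-enabled request of 1200 bytes passes the lower bound of 64 and yields an MTU of 1182, which is exactly the case the reply questions.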
[dpdk-dev] [PATCH v1] net/iavf: support to config VLAN filter
Add VLAN filtering enable/disable configuration, applied when the VF has successfully negotiated that capability with the PF.

Signed-off-by: Haiyue Wang
---
 drivers/net/iavf/iavf.h        |  1 +
 drivers/net/iavf/iavf_ethdev.c |  7 ++
 drivers/net/iavf/iavf_vchnl.c  | 40 ++
 3 files changed, 48 insertions(+)

diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index c934d2e614..9d84e2604d 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -314,6 +314,7 @@ int iavf_configure_queues(struct iavf_adapter *adapter,
 int iavf_get_supported_rxdid(struct iavf_adapter *adapter);
 int iavf_config_vlan_strip_v2(struct iavf_adapter *adapter, bool enable);
 int iavf_config_vlan_insert_v2(struct iavf_adapter *adapter, bool enable);
+int iavf_config_vlan_filter_v2(struct iavf_adapter *adapter, bool enable);
 int iavf_add_del_vlan_v2(struct iavf_adapter *adapter, uint16_t vlanid,
 			 bool add);
 int iavf_get_vlan_offload_caps_v2(struct iavf_adapter *adapter);
diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index cf6ea0b15c..8e741031ec 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -1086,6 +1086,13 @@ iavf_dev_vlan_offload_set_v2(struct rte_eth_dev *dev, int mask)
 		enable = !!(rxmode->offloads & DEV_RX_OFFLOAD_VLAN_FILTER);
 
 		iavf_iterate_vlan_filters_v2(dev, enable);
+
+		err = iavf_config_vlan_filter_v2(adapter, enable);
+		/* If not support, the filtering is already disabled by PF */
+		if (err == -ENOTSUP && !enable)
+			err = 0;
+		if (err)
+			return -EIO;
 	}
 
 	if (mask & ETH_VLAN_STRIP_MASK) {
diff --git a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
index 9b8c4d113a..7af143a21b 100644
--- a/drivers/net/iavf/iavf_vchnl.c
+++ b/drivers/net/iavf/iavf_vchnl.c
@@ -604,6 +604,46 @@ iavf_config_vlan_insert_v2(struct iavf_adapter *adapter, bool enable)
 	return ret;
 }
 
+int
+iavf_config_vlan_filter_v2(struct iavf_adapter *adapter, bool enable)
+{
+	struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
+	struct virtchnl_vlan_supported_caps *filtering_caps;
+	struct virtchnl_vlan_setting vlan_filter;
+	struct iavf_cmd_info args;
+	uint32_t *ethertype;
+	int ret;
+
+	filtering_caps = &vf->vlan_v2_caps.filtering.filtering_support;
+
+	if ((filtering_caps->outer & VIRTCHNL_VLAN_ETHERTYPE_8100) &&
+	    (filtering_caps->outer & VIRTCHNL_VLAN_TOGGLE))
+		ethertype = &vlan_filter.outer_ethertype_setting;
+	else if ((filtering_caps->inner & VIRTCHNL_VLAN_ETHERTYPE_8100) &&
+		 (filtering_caps->inner & VIRTCHNL_VLAN_TOGGLE))
+		ethertype = &vlan_filter.inner_ethertype_setting;
+	else
+		return -ENOTSUP;
+
+	memset(&vlan_filter, 0, sizeof(vlan_filter));
+	vlan_filter.vport_id = vf->vsi_res->vsi_id;
+	*ethertype = VIRTCHNL_VLAN_ETHERTYPE_8100;
+
+	args.ops = enable ? VIRTCHNL_OP_ENABLE_VLAN_FILTERING_V2 :
+			    VIRTCHNL_OP_DISABLE_VLAN_FILTERING_V2;
+	args.in_args = (uint8_t *)&vlan_filter;
+	args.in_args_size = sizeof(vlan_filter);
+	args.out_buffer = vf->aq_resp;
+	args.out_size = IAVF_AQ_BUF_SZ;
+	ret = iavf_execute_vf_cmd(adapter, &args);
+	if (ret)
+		PMD_DRV_LOG(ERR, "fail to execute command %s",
+			    enable ? "VIRTCHNL_OP_ENABLE_VLAN_FILTERING_V2" :
+				     "VIRTCHNL_OP_DISABLE_VLAN_FILTERING_V2");
+
+	return ret;
+}
+
 int
 iavf_add_del_vlan_v2(struct iavf_adapter *adapter, uint16_t vlanid, bool add)
 {
-- 
2.30.0
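The error mapping added to iavf_dev_vlan_offload_set_v2() is easy to misread in diff form, so here it is as a tiny standalone function. The name `filter_toggle_status` is hypothetical; the logic is taken from the hunk above: an unsupported toggle is tolerated only when the caller asked to disable filtering, and every other failure is reported as -EIO.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: map the result of the filter configuration call
 * to the status returned to the caller.
 */
int filter_toggle_status(int config_ret, bool enable)
{
	/* Not supported and disabling: filtering is already off in the PF. */
	if (config_ret == -ENOTSUP && !enable)
		return 0;

	return config_ret ? -EIO : 0;
}
```

Note that a request to enable filtering on a PF without the capability still fails, which matches the comment in the patch.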