RE: [RFC v3 23/26] dev: hide driver object
Looks good to me. Acked-by: Jay Jayatheerthan > -Original Message- > From: David Marchand > Sent: Thursday, July 28, 2022 8:57 PM > To: dev@dpdk.org > Cc: Chautru, Nicolas ; Parav Pandit > ; Xueming Li ; Hemant > Agrawal ; Sachin Saxena ; > Stephen Hemminger ; > Long Li ; Zhang, Roy Fan ; > Sunila Sahu ; Ashish Gupta > ; Ajit Khaparde ; > Raveendra Padasalagi > ; Vikas Gupta ; > Chandubabu Namburu ; > Ankur Dwivedi ; Anoob Joseph ; > Tejasree Kondoj ; > Gagandeep Singh ; Richardson, Bruce > ; Laatz, Kevin ; > McDaniel, Timothy ; Jerin Jacob > ; Elena Agostini ; Loftus, > Ciara ; Zhang, Qi Z ; Shepard > Siegel ; Ed Czeck > ; John Miller ; > Webster, Steven ; Peters, > Matt ; Rasesh Mody ; Shahed > Shaikh ; Somnath Kotur > ; Rahul Lakkireddy > ; Su, Simei ; Wu, Wenjun1 > ; Marcin Wojtas ; Michal Krawczyk > ; Shai Brandes > ; Evgeny Schemeilin ; Igor Chauskin > ; Daley, John > ; Hyong Youb Kim ; Gaetan Rivet > ; Wang, Xiao W > ; Zhang, Yuying ; Xing, Beilei > ; Wu, Jingjing > ; Yang, Qiming ; Matan Azrad > ; Viacheslav Ovsiienko > ; Chaoyong He ; Niklas > Soderlund ; Harman > Kalra ; Devendra Singh Rawat ; > Andrew Rybchenko > ; Maciej Czekaj ; Maxime > Coquelin ; Xia, > Chenbo ; Jochen Behrens ; Jakub > Palider ; Tomasz Duszynski > ; Ori Kam ; Akhil Goyal > ; Chengwen Feng > ; Ray Kinsella ; Thomas Monjalon > ; Ferruh Yigit > ; Gujjar, Abhinandan S > ; Jayatheerthan, Jay > ; > Matz, Olivier ; Pattan, Reshma > > Subject: [RFC v3 23/26] dev: hide driver object > > Make rte_driver opaque for non internal users. > This will make extending this object possible without breaking the ABI. > > Introduce a new driver header and move rte_driver definition. > Update drivers and library to use the internal header. > > Some applications may have been dereferencing rte_driver objects, mark > this object's accessors as stable. > > Signed-off-by: David Marchand > --- > Changes since RFC v2: > - updated release notes, > - marked accessors as stable, > > --- > doc/guides/rel_notes/release_22_11.rst| 2 ++ > drivers/baseband/acc100/rte_acc100_pmd.c | 2 +- > .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 2 +- > drivers/baseband/fpga_lte_fec/fpga_lte_fec.c | 2 +- > drivers/bus/auxiliary/auxiliary_params.c | 2 +- > drivers/bus/auxiliary/bus_auxiliary_driver.h | 2 +- > drivers/bus/dpaa/bus_dpaa_driver.h| 2 +- > drivers/bus/fslmc/bus_fslmc_driver.h | 2 +- > drivers/bus/fslmc/fslmc_vfio.c| 2 +- > drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c | 2 +- > drivers/bus/fslmc/portal/dpaa2_hw_dpci.c | 2 +- > drivers/bus/fslmc/portal/dpaa2_hw_dpio.c | 2 +- > drivers/bus/fslmc/portal/dpaa2_hw_dprc.c | 2 +- > drivers/bus/pci/bus_pci_driver.h | 2 +- > drivers/bus/pci/pci_params.c | 2 +- > drivers/bus/vdev/bus_vdev_driver.h| 2 +- > drivers/bus/vdev/vdev.c | 2 +- > drivers/bus/vdev/vdev_params.c| 2 +- > drivers/bus/vmbus/bus_vmbus_driver.h | 2 +- > drivers/common/qat/dev/qat_dev_gen4.c | 2 +- > drivers/common/qat/qat_qp.c | 2 +- > drivers/compress/zlib/zlib_pmd_ops.c | 2 +- > drivers/crypto/bcmfs/bcmfs_qp.c | 2 +- > drivers/crypto/bcmfs/bcmfs_sym_pmd.c | 2 +- > drivers/crypto/ccp/rte_ccp_pmd.c | 2 +- > drivers/crypto/cnxk/cn10k_cryptodev.c | 2 +- > drivers/crypto/cnxk/cn9k_cryptodev.c | 2 +- > drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c | 2 +- > drivers/crypto/dpaa_sec/dpaa_sec.c| 2 +- > drivers/crypto/scheduler/scheduler_pmd_ops.c | 2 +- > drivers/dma/idxd/idxd_bus.c | 1 + > drivers/event/dlb2/dlb2.c | 2 +- > drivers/event/dlb2/pf/dlb2_pf.c | 2 +- > drivers/event/dpaa/dpaa_eventdev.c| 2 +- > drivers/event/dpaa2/dpaa2_eventdev.c | 2 +- > drivers/event/dpaa2/dpaa2_hw_dpcon.c | 2 +- > drivers/event/octeontx/ssovf_evdev.c | 2 +- > drivers/event/skeleton/skeleton_eventdev.c| 2 +- > drivers/gpu/cuda/cuda.c | 2 +- > drivers/mempool/dpaa2/dpaa2_hw_mempool.c | 2 +- > drivers/net/af_xdp/rte_eth_af_xdp.c | 2 +- > drivers/net/ark/ark_global.h | 2 +- > drivers/net/avp/avp_ethdev.c | 2 +- > drivers/net/axgbe/axgbe_common.h | 2 +- > drivers/net/bnx2x/bnx2x_ethdev.c | 2 +- > drivers/net/bnxt/bnxt_ethdev.c| 2 +- > drivers/net/bnxt/rte_pmd_bnxt.c | 2 +- > drivers/net/cxgbe/base/t4_hw.c| 2 +- > drivers/net/cxgbe/cxgbe_ethdev.c | 2 +- > drivers/net/cxgbe/cxgbe_main.c| 2 +- > drivers/net/cxgbe/sge.c | 2 +- > drivers/net/dpaa2/base/dpaa2_hw_dpni.c| 2 +- > drivers/net/dpa
RE: [PATCH v9 1/4] ethdev: introduce protocol header API
Hi Thomas, Sorry so long to response your email. > -Original Message- > From: Thomas Monjalon > Sent: Thursday, July 7, 2022 5:05 PM > To: Wu, WenxuanX > Cc: andrew.rybche...@oktetlabs.ru; Li, Xiaoyun ; > ferruh.yi...@xilinx.com; Singh, Aman Deep ; > dev@dpdk.org; Zhang, Yuying ; Zhang, Qi Z > ; jerinjac...@gmail.com; > step...@networkplumber.org > Subject: Re: [PATCH v9 1/4] ethdev: introduce protocol header API > > 13/06/2022 12:25, wenxuanx...@intel.com: > > From: Wenxuan Wu > > > > This patch added new ethdev API to retrieve supported protocol header > > mask of a PMD, which helps to configure protocol header based buffer split. > > > > Signed-off-by: Wenxuan Wu > > --- > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change without prior notice > > + * > > + * Get supported header protocols to split supported by PMD. > > + * The API will return error if the device is not valid. > > + * > > + * @param port_id > > + * The port identifier of the device. > > + * @param ptype > > + * Supported protocol headers of driver. > > It doesn't say where to find the types. > Please give the prefix. Sorry I didn't catch your point, are you referring the ptype should be composed of RTE_PTYPE_*? Could you explain it in more detail? > > > + * @return > > + * - (-ENOTSUP) if header protocol is not supported by device. > > + * - (-ENODEV) if *port_id* invalid. > > + * - (-EIO) if device is removed. > > + * - (0) on success. > > + */ > > +__rte_experimental > > +int rte_eth_supported_hdrs_get(uint16_t port_id, > > + uint32_t *ptype); > > The function name is not precise enough. > There should be the word "split" in its name. Thanks for the suggestion, it will be revised in the next version. Thanks, Yuan >
RE: [PATCH v9 1/4] ethdev: introduce protocol header API
Hi Andrew, Apologies for the delay in getting back to you. > -Original Message- > From: Andrew Rybchenko > Sent: Friday, July 8, 2022 11:01 PM > To: Wu, WenxuanX ; tho...@monjalon.net; Li, > Xiaoyun ; ferruh.yi...@xilinx.com; Singh, Aman Deep > ; dev@dpdk.org; Zhang, Yuying > ; Zhang, Qi Z ; > jerinjac...@gmail.com > Cc: step...@networkplumber.org > Subject: Re: [PATCH v9 1/4] ethdev: introduce protocol header API > > On 6/13/22 13:25, wenxuanx...@intel.com wrote: > > From: Wenxuan Wu > > > > This patch added new ethdev API to retrieve supported protocol header > > mask > > This patch added -> Add Thanks for your catch, will fix in the next version. > > > of a PMD, which helps to configure protocol header based buffer split. > > I'd like to see motivation why single mask is considered sufficient. > I.e. why don't we follow ptypes approach which is move flexible, but a bit > more complicated. > > Looking at RTE_PTYPE_* defines carefully it looks like below API simply > cannot provide information that we can split after TCP or UDP. As Xuan replied in the patch 2, we think maybe RTE_PTYPE_* is enough. Any insights are welcome. > > > > > Signed-off-by: Wenxuan Wu > > [snip] > > > /** > >* @internal > >* Dump private info from device to a file. > > @@ -1281,6 +1296,9 @@ struct eth_dev_ops { > > /** Set IP reassembly configuration */ > > eth_ip_reassembly_conf_set_t ip_reassembly_conf_set; > > > > + /** Get supported ptypes to split */ > > + eth_buffer_split_hdr_ptype_get_t hdrs_supported_ptypes_get; > > + > > It is better to be consistent with naming. I.e. just cut prefix "eth_" > and suffix "_t". > > Also the type name sounds like it get current split configuration, not > supported one. Thank you for your suggestion, will fix in the next version. > > > /** Dump private info from device */ > > eth_dev_priv_dump_t eth_dev_priv_dump; > > }; > > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index > > 29a3d80466..e1f2a0ffe3 100644 > > --- a/lib/ethdev/rte_ethdev.c > > +++ b/lib/ethdev/rte_ethdev.c > > @@ -1636,9 +1636,10 @@ rte_eth_dev_is_removed(uint16_t port_id) > > } > > > > static int > > -rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg, > > -uint16_t n_seg, uint32_t *mbp_buf_size, > > -const struct rte_eth_dev_info *dev_info) > > +rte_eth_rx_queue_check_split(uint16_t port_id, > > + const struct rte_eth_rxseg_split *rx_seg, > > + int16_t n_seg, uint32_t *mbp_buf_size, > > + const struct rte_eth_dev_info *dev_info) > > { > > const struct rte_eth_rxseg_capa *seg_capa = &dev_info- > >rx_seg_capa; > > struct rte_mempool *mp_first; > > @@ -1694,13 +1695,7 @@ rte_eth_rx_queue_check_split(const struct > rte_eth_rxseg_split *rx_seg, > > } > > offset += seg_idx != 0 ? 0 : RTE_PKTMBUF_HEADROOM; > > *mbp_buf_size = rte_pktmbuf_data_room_size(mpl); > > - length = length != 0 ? length : *mbp_buf_size; > > - if (*mbp_buf_size < length + offset) { > > I don't understand why the check goes away completely. Thanks for your catch, it should be in the patch 2, will fix in the next version. > > > - RTE_ETHDEV_LOG(ERR, > > - "%s mbuf_data_room_size %u < %u > (segment length=%u + segment offset=%u)\n", > > - mpl->name, *mbp_buf_size, > > - length + offset, length, offset); > > - return -EINVAL; > > + > > Unnecessary empty line > > > } > > Shouldn't the curly bracket go away as well together with its 'if' Thanks for your catch, will fix in the next version. > > > } > > return 0; > > @@ -1779,7 +1774,7 @@ rte_eth_rx_queue_setup(uint16_t port_id, > uint16_t rx_queue_id, > > n_seg = rx_conf->rx_nseg; > > > > if (rx_conf->offloads & > RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT) { > > - ret = rte_eth_rx_queue_check_split(rx_seg, n_seg, > > + ret = rte_eth_rx_queue_check_split(port_id, rx_seg, > n_seg, > >&mbp_buf_size, > >&dev_info); > > if (ret != 0) > > @@ -5844,6 +5839,20 @@ rte_eth_ip_reassembly_conf_set(uint16_t > port_id, > >(*dev->dev_ops->ip_reassembly_conf_set)(dev, conf)); > > } > > > > +int > > +rte_eth_supported_hdrs_get(uint16_t port_id, uint32_t *ptypes) { > > + struct rte_eth_dev *dev; > > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV); > > + dev = &rte_eth_devices[port_id]; > > ptypes must be checked vs NULL Thanks for your catch, will fix in the next version. > > > + > > + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops- > >hdrs_supported_ptypes_get, > > +
[PATCH v3] pcapng: fix write more packets than IOV_MAX limit
The rte_pcapng_write_packets() function fails when we try to write more packets than the IOV_MAX limit. writev() system call is limited by the IOV_MAX limit. The iovcnt argument is valid if it is greater than 0 and less than or equal to IOV_MAX as defined in . To avoid this problem, we can check that all segments of the next packet will fit into the iovec buffer, whose capacity will be limited by the IOV_MAX limit. If not, we flush the current iovec buffer to the file by calling writev() and, if successful, fit the current packet at the beginning of the flushed iovec buffer. Fixes: 8d23ce8f5ee9 ("pcapng: add new library for writing pcapng files") Cc: step...@networkplumber.org Signed-off-by: Mário Kuka --- v3: * Remove unwanted changes to check for partial writes from the writev(). app/test/test_pcapng.c | 42 +++- lib/pcapng/rte_pcapng.c | 47 - 2 files changed, 64 insertions(+), 25 deletions(-) diff --git a/app/test/test_pcapng.c b/app/test/test_pcapng.c index 320dacea34..7f51946fff 100644 --- a/app/test/test_pcapng.c +++ b/app/test/test_pcapng.c @@ -110,7 +110,7 @@ test_setup(void) } /* Make a pool for cloned packets */ - mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool", NUM_PACKETS, + mp = rte_pktmbuf_pool_create_by_ops("pcapng_test_pool", IOV_MAX + NUM_PACKETS, 0, 0, rte_pcapng_mbuf_size(pkt_len), SOCKET_ID_ANY, "ring_mp_sc"); @@ -237,6 +237,45 @@ test_validate(void) return ret; } +static int +test_write_over_limit_iov_max(void) +{ + struct rte_mbuf *orig; + struct rte_mbuf *clones[IOV_MAX + NUM_PACKETS] = { }; + struct dummy_mbuf mbfs; + unsigned int i; + ssize_t len; + + /* make a dummy packet */ + mbuf1_prepare(&mbfs, pkt_len); + + /* clone them */ + orig = &mbfs.mb[0]; + for (i = 0; i < IOV_MAX + NUM_PACKETS; i++) { + struct rte_mbuf *mc; + + mc = rte_pcapng_copy(port_id, 0, orig, mp, pkt_len, + rte_get_tsc_cycles(), 0); + if (mc == NULL) { + fprintf(stderr, "Cannot copy packet\n"); + return -1; + } + clones[i] = mc; + } + + /* write it to capture file */ + len = rte_pcapng_write_packets(pcapng, clones, IOV_MAX + NUM_PACKETS); + + rte_pktmbuf_free_bulk(clones, IOV_MAX + NUM_PACKETS); + + if (len <= 0) { + fprintf(stderr, "Write of packets failed\n"); + return -1; + } + + return 0; +} + static void test_cleanup(void) { @@ -256,6 +295,7 @@ unit_test_suite test_pcapng_suite = { TEST_CASE(test_write_packets), TEST_CASE(test_write_stats), TEST_CASE(test_validate), + TEST_CASE(test_write_over_limit_iov_max), TEST_CASES_END() } }; diff --git a/lib/pcapng/rte_pcapng.c b/lib/pcapng/rte_pcapng.c index 06ad712bd1..e41cf909e1 100644 --- a/lib/pcapng/rte_pcapng.c +++ b/lib/pcapng/rte_pcapng.c @@ -551,33 +551,16 @@ rte_pcapng_copy(uint16_t port_id, uint32_t queue, return NULL; } -/* Count how many segments are in this array of mbufs */ -static unsigned int -mbuf_burst_segs(struct rte_mbuf *pkts[], unsigned int n) -{ - unsigned int i, iovcnt; - - for (iovcnt = 0, i = 0; i < n; i++) { - const struct rte_mbuf *m = pkts[i]; - - __rte_mbuf_sanity_check(m, 1); - - iovcnt += m->nb_segs; - } - return iovcnt; -} - /* Write pre-formatted packets to file. */ ssize_t rte_pcapng_write_packets(rte_pcapng_t *self, struct rte_mbuf *pkts[], uint16_t nb_pkts) { - int iovcnt = mbuf_burst_segs(pkts, nb_pkts); - struct iovec iov[iovcnt]; - unsigned int i, cnt; - ssize_t ret; + struct iovec iov[IOV_MAX]; + unsigned int i, cnt = 0; + ssize_t ret, total = 0; - for (i = cnt = 0; i < nb_pkts; i++) { + for (i = 0; i < nb_pkts; i++) { struct rte_mbuf *m = pkts[i]; struct pcapng_enhance_packet_block *epb; @@ -589,6 +572,20 @@ rte_pcapng_write_packets(rte_pcapng_t *self, return -1; } + /* +* Handle case of highly fragmented and large burst size +* Note: this assumes that max segments per mbuf < IOV_MAX +*/ + if (unlikely(cnt + m->nb_segs >= IOV_MAX)) { + ret = writev(self->outfd, iov, cnt); + if (unlikely(ret < 0)) { + rte_errno = errno; + return -1; + } + total += ret; +
Re: [PATCH v2 2/2] pcapng: check if writev() returns a partial write
As pcapng is used in the dpdk application it writes to a file. You could repurpose it to something else, but even a pipe will not give partial writes unless you configure the pipe as non-blocking. Writing to a non-blocking pipe is going to have a load of other problems. This seems like a purely hypothetical case, can't see why it needs to be addressed. OK, I'm sending patch v3 which only fixes the IVO_MAX limit issue. I've removed the changes related to the partial write check. On 29/07/2022 20:14, Stephen Hemminger wrote: On Fri, 29 Jul 2022 19:08:41 +0200 Mário Kuka wrote: Since this is being written to a file, handling partial writes makes little sense. The only case where partial write would happen would be if filesystem was full. Retrying just adds unnecessary complexity. If you really want to track this, then add a dropped counter. But the file descriptor doesn't have to refer to just a regular file, what if it's a socket or a pipe or some device? The pcapng documentation doesn't say anything about any restrictions, so the implementation should be fully generic. What's the point of a function to write packets to a file descriptor where there's a risk that it won't write all the packets or that the file will by corrupted due to a partial write and still not even let me know about it? As pcapng is used in the dpdk application it writes to a file. You could repurpose it to something else, but even a pipe will not give partial writes unless you configure the pipe as non-blocking. Writing to a non-blocking pipe is going to have a load of other problems. This seems like a purely hypothetical case, can't see why it needs to be addressed. smime.p7s Description: S/MIME Cryptographic Signature
Re: [PATCH v9 1/4] ethdev: introduce protocol header API
01/08/2022 09:09, Wang, YuanX: > Hi Thomas, > > Sorry so long to response your email. > > From: Thomas Monjalon > > 13/06/2022 12:25, wenxuanx...@intel.com: > > > From: Wenxuan Wu > > > > > > This patch added new ethdev API to retrieve supported protocol header > > > mask of a PMD, which helps to configure protocol header based buffer > > > split. > > > > > > Signed-off-by: Wenxuan Wu > > > --- > > > +/** > > > + * @warning > > > + * @b EXPERIMENTAL: this API may change without prior notice > > > + * > > > + * Get supported header protocols to split supported by PMD. > > > + * The API will return error if the device is not valid. > > > + * > > > + * @param port_id > > > + * The port identifier of the device. > > > + * @param ptype > > > + * Supported protocol headers of driver. > > > > It doesn't say where to find the types. > > Please give the prefix. > > Sorry I didn't catch your point, are you referring the ptype should be > composed of RTE_PTYPE_*? > Could you explain it in more detail? Yes just give to the user the required info to use the function. If ptype must be composed with RTE_PTYPE_*, it must be said. Thanks
[PATCH 0/2] vhost fixes for OVS SIGSEGV in PMD
Hi, the real meat is in patch 1/2, which fixes a segmentation fault in the OVS PMD thread when resynchronizing with QEMU after the guest application has been killed with SIGKILL. This fixes an issue where the guest DPDK application is able to crash the OVS process on the host. Patch 2/2 is just an improvement in the current error handling. For your review and comments, Claudio Claudio Fontana (2): vhost: fix error handling in virtio_dev_tx_split vhost: improve error handling in desc_to_mbuf lib/vhost/virtio_net.c | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) -- 2.26.2
[PATCH 1/2] vhost: fix error handling in virtio_dev_tx_split
in virtio_dev_split we add a check for invalid nr_vec, mainly for nr_vec == 0 (but add a check for BUF_VECTOR_MAX too), and bail out before calling desc_to_mbuf, otherwise in desc_to_mbuf we end up trying to memcpy from a source address buf_vec[0] that is an uninitialized stack variable. This should fix errors that have been reported in multiple occasions from telcos to the DPDK, OVS and QEMU projects, as this affects in particular the openvswitch/DPDK, QEMU vhost-user setup. The back trace looks roughly like this, depending on the specific rte_memcpy selected, etc, in any case the "src" parameter is garbage (in this example containing 0 + dev->host_hlen(12 = 0xc)). Thread 153 "pmd-c88/id:150" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7f64e5e6b700 (LWP 141373)] rte_mov128blocks (n=2048, src=0xc , dst=0x150da4480) at ../lib/eal/x86/include/rte_memcpy.h:384 (gdb) bt 0 rte_mov128blocks (n=2048, src=0xc, dst=0x150da4480) 1 rte_memcpy_generic (n=2048, src=0xc, dst=0x150da4480) 2 rte_memcpy (n=2048, src=0xc, dst=) 3 sync_fill_seg 4 desc_to_mbuf 5 virtio_dev_tx_split 6 virtio_dev_tx_split_legacy 7 0x7f676fea0fef in rte_vhost_dequeue_burst 8 0x7f6772005a62 in netdev_dpdk_vhost_rxq_recv 9 0x7f6771f38116 in netdev_rxq_recv 10 0x7f6771f03d96 in dp_netdev_process_rxq_port 11 0x7f6771f04239 in pmd_thread_main 12 0x7f6771f92aff in ovsthread_wrapper 13 0x7f6771c1b6ea in start_thread 14 0x7f6771933a8f in clone Tested-by: Claudio Fontana Signed-off-by: Claudio Fontana --- lib/vhost/virtio_net.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c index 35fa4670fd..0b8db2046e 100644 --- a/lib/vhost/virtio_net.c +++ b/lib/vhost/virtio_net.c @@ -2917,9 +2917,16 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq, vq->last_avail_idx + i, &nr_vec, buf_vec, &head_idx, &buf_len, - VHOST_ACCESS_RO) < 0)) + VHOST_ACCESS_RO) < 0)) { + dropped += 1; + i++; break; - + } + if (unlikely(nr_vec < 1 || nr_vec >= BUF_VECTOR_MAX)) { + dropped += 1; + i++; + break; + } update_shadow_used_ring_split(vq, head_idx, 0); err = virtio_dev_pktmbuf_prep(dev, pkts[i], buf_len); -- 2.26.2
[PATCH 2/2] vhost: improve error handling in desc_to_mbuf
check when increasing vec_idx that it is still valid in the (buf_len < dev->vhost_hlen) case too. Signed-off-by: Claudio Fontana --- lib/vhost/virtio_net.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c index 0b8db2046e..6d34feaf73 100644 --- a/lib/vhost/virtio_net.c +++ b/lib/vhost/virtio_net.c @@ -2701,12 +2701,15 @@ desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, if (unlikely(buf_len < dev->vhost_hlen)) { buf_offset = dev->vhost_hlen - buf_len; vec_idx++; + if (unlikely(vec_idx >= nr_vec)) + goto error; buf_addr = buf_vec[vec_idx].buf_addr; buf_iova = buf_vec[vec_idx].buf_iova; buf_len = buf_vec[vec_idx].buf_len; buf_avail = buf_len - buf_offset; } else if (buf_len == dev->vhost_hlen) { - if (unlikely(++vec_idx >= nr_vec)) + vec_idx++; + if (unlikely(vec_idx >= nr_vec)) goto error; buf_addr = buf_vec[vec_idx].buf_addr; buf_iova = buf_vec[vec_idx].buf_iova; -- 2.26.2
Re: Question about pattern types for rte_flow RSS rule
On 8/1/22 06:39, lihuisong (C) wrote: 在 2022/7/31 17:40, Andrew Rybchenko 写道: Hi, Huisong! On 7/29/22 05:30, lihuisong (C) wrote: Hi Ori, and all, For RSS flow rule, pattern item types and RSS types in action are an inclusive relationship, and RSS types contain pattern item types. I disagree with the statement. We can redirect various packets, but apply RSS on subset only. Everything else goes to the first queue (first entry in the redirection table in fact). Sorry, the statement above is inaccurate. I mean, pattern item type and RSS types in action are the same for creating specified types RSS rule, and this configuration is duplicate in this case(in <1> command, I think). Use one of them can specify which packet type flow performs action(like <1> or <2> command). No, I still disagree. Pattern is about filtering. It defines on which packets flow rule actions are applied. RSS types choose packet header fields (if applicable) to calculate hash. I.e. RSS types are no about filtering. Create a RSS rule that redirect ipv4-tcp flow to queues 1 and 2: <1> flow create 0 ingress pattern eth / ipv4 / tcp / end actions rss types ipv4-tcp end queues 1 2 end / end Yes, this one will redirect IPv4 TCP packets only. <2> flow create 0 ingress pattern end actions rss types ipv4-tcp end queues 1 2 end / end The rule will redirect all packets to queues 1 and 2. Some packets will always go to queue 1, e.g. ARP packets. <3> flow create 0 ingress pattern eth / ipv4 / tcp / end actions rss end queues 1 2 end / end This one will redirect IPv4 TCP and will use RSS hash function specified in dev_conf on device configuration or default RSS hash function chosen by corresponding driver. What do you think about it, @Andrew and @Ori? Is it necessary to set pattern item types when specify RSS types to create a rule? No, it is not strictly required. It depends on what you want. How should the user set and how should the driver do? Pattern and action are not strictly related in the case of RSS. Pattern defines on which packets the rule is applied. Action defines what the rule does. If hash function is not applicable to a packet, e.g. ARP packet and ipv4-tcp hash function, the hash is 0 and goes via redirection table entry 0. I know this rule. Ori has already explained the usage of RSS rule in rte_flow API, but I still have a confusion momentioned above. Andrew. Looking forward to your reply. Regards, Huisong 在 2022/7/13 9:34, lihuisong (C) 写道: Hi all, Can someone open my confusion? I'm looking forward to your reply. Thanks, Huisong. 在 2022/7/7 11:50, lihuisong (C) 写道: Hi all, From following testpmd command: 'flow create 0 ingress pattern eth / ipv4 / tcp / end actions rss types ipv4-tcp l3-src-only end queues end / end' and "flow create 0 ingress pattern end actions rss types ipv4-tcp l3-src-only end queues end / end" I have some confusions about rte_flow RSS rule: 1> Do pattern item types need to set when configure rte_flow RSS rule? 2> Does the driver need to check and process the pattern? (After all, the RSS types in actions alreadly contain all RSS offload types.) Have someone explains it? Regards, Huisong . . .
Re: segfault in ovs in setup with DPDK, qemu vhost-user
On 7/30/22 18:23, Claudio Fontana wrote: > On 7/30/22 18:17, Claudio Fontana wrote: >> Hello all, >> >> with the latest DPDK, openvswitch and qemu >> >> DPDK tag v22.07 >> openvswitch tag v2.17.1 >> qemu v7.1-git 22.07.2022 >> >> and a DPDK setup which involves also an ubuntu guest with DPDK 16.11 >> test-pmd application (but also verified with DPDK 19.x), >> with an external traffic generator to cause some load, >> >> I am able to cause a segfault in OVS (ovs-vswitchd) inside the DPDK >> libraries by doing (from the guest): >> >> bind the device, start testpmd, >> SIGKILL of testpmd, >> immediately restart testpmd, >> rinse and repeat. >> >> Once every few restarts, the following segfault happens (may take anything >> from a few seconds to minutes): >> >> >> Thread 153 "pmd-c88/id:150" received signal SIGSEGV, Segmentation fault. >> [Switching to Thread 0x7f64e5e6b700 (LWP 141373)] >> rte_mov128blocks (n=2048, src=0xc > 0xc>, dst=0x150da4480 "h\005\312❇\377\377\377\377\377\377\b") at >> ../lib/eal/x86/include/rte_memcpy.h:384 >> 384 ../lib/eal/x86/include/rte_memcpy.h: No such file or directory. >> (gdb) bt >> #0 rte_mov128blocks (n=2048, src=0xc > address 0xc>, >> dst=0x150da4480 "h\005\312❇\377\377\377\377\377\377\b") at >> ../lib/eal/x86/include/rte_memcpy.h:384 >> #1 rte_memcpy_generic (n=2048, src=0xc, dst=0x150da4480) at >> ../lib/eal/x86/include/rte_memcpy.h:484 >> #2 rte_memcpy (n=2048, src=0xc, dst=) at >> ../lib/eal/x86/include/rte_memcpy.h:851 >> #3 sync_fill_seg (to_desc=false, cpy_len=2048, buf_iova=, >> buf_addr=12, mbuf_offset=0, m=0x150da4140, >> vq=0x2200400680, dev=0x2200d3d740) at ../lib/vhost/virtio_net.c:1119 >> #4 desc_to_mbuf (is_async=false, slot_idx=0, legacy_ol_flags=true, >> mbuf_pool=0x17fe7df00, m=0x150da4140, nr_vec=, >> buf_vec=0x7f64e5e67ca0, vq=0x2200400680, dev=0x2200d3d740) at >> ../lib/vhost/virtio_net.c:2747 >> #5 virtio_dev_tx_split (legacy_ol_flags=true, count=, >> count@entry=0, pkts=pkts@entry=0x0, >> mbuf_pool=mbuf_pool@entry=0x150da4140, vq=vq@entry=0xe5e67d34, >> dev=dev@entry=0x7f64e5e694d0) at ../lib/vhost/virtio_net.c:2943 >> #6 virtio_dev_tx_split_legacy (dev=dev@entry=0x2200d3d740, >> vq=vq@entry=0x2200400680, mbuf_pool=mbuf_pool@entry=0x17fe7df00, >> pkts=pkts@entry=0x7f64e5e69600, count=count@entry=32) at >> ../lib/vhost/virtio_net.c:2979 >> #7 0x7f676fea0fef in rte_vhost_dequeue_burst (vid=vid@entry=0, >> queue_id=queue_id@entry=1, mbuf_pool=0x17fe7df00, >> pkts=pkts@entry=0x7f64e5e69600, count=count@entry=32) at >> ../lib/vhost/virtio_net.c:3331 >> #8 0x7f6772005a62 in netdev_dpdk_vhost_rxq_recv (rxq=, >> batch=0x7f64e5e695f0, qfill=0x0) >> at ../lib/netdev-dpdk.c:2393 >> #9 0x7f6771f38116 in netdev_rxq_recv (rx=, >> batch=batch@entry=0x7f64e5e695f0, qfill=) >> at ../lib/netdev.c:727 >> #10 0x7f6771f03d96 in dp_netdev_process_rxq_port >> (pmd=pmd@entry=0x7f64e5e6c010, rxq=0x254d730, port_no=2) >> at ../lib/dpif-netdev.c:5317 >> #11 0x7f6771f04239 in pmd_thread_main (f_=) at >> ../lib/dpif-netdev.c:6945 >> #12 0x7f6771f92aff in ovsthread_wrapper (aux_=) at >> ../lib/ovs-thread.c:422 >> #13 0x7f6771c1b6ea in start_thread () from /lib64/libpthread.so.0 >> #14 0x7f6771933a8f in clone () from /lib64/libc.so.6 >> >> When run in gdb as shown above, ovs-vswitchd on the host gets a SIGSEGV and >> drops to gdb as shown above, >> so as a result QEMU stops when trying to read a response from ovs as such: >> >> 0 0x7f0a093991e9 in poll () from target:/lib64/libc.so.6 >> #1 0x7f0a0b06c9a9 in ?? () from target:/usr/lib64/libglib-2.0.so.0 >> #2 0x7f0a0b06ccf2 in g_main_loop_run () from >> target:/usr/lib64/libglib-2.0.so.0 >> #3 0x561a5cd04747 in vhost_user_read (dev=dev@entry=0x561a5e640df0, >> msg=msg@entry=0x7f09ff7fd160) >> at ../hw/virtio/vhost-user.c:406 >> #4 0x561a5cd04c7e in vhost_user_get_vring_base (dev=0x561a5e640df0, >> ring=0x7f09ff7fd428) >> at ../hw/virtio/vhost-user.c:1261 >> #5 0x561a5cd0043f in vhost_virtqueue_stop >> (dev=dev@entry=0x561a5e640df0, vdev=vdev@entry=0x561a5f78ae50, >> vq=0x561a5e641070, idx=0) at ../hw/virtio/vhost.c:1216 >> #6 0x561a5cd034fa in vhost_dev_stop (hdev=hdev@entry=0x561a5e640df0, >> vdev=vdev@entry=0x561a5f78ae50) >> at ../hw/virtio/vhost.c:1872 >> #7 0x561a5cb623fa in vhost_net_stop_one (net=0x561a5e640df0, >> dev=dev@entry=0x561a5f78ae50) >> at ../hw/net/vhost_net.c:315 >> #8 0x561a5cb6295e in vhost_net_stop (dev=dev@entry=0x561a5f78ae50, >> ncs=0x561a5f808970, >> data_queue_pairs=data_queue_pairs@entry=4, cvq=cvq@entry=0) at >> ../hw/net/vhost_net.c:427 >> #9 0x561a5cccef79 in virtio_net_vhost_status (status=, >> n=0x561a5f78ae50) >> at ../hw/net/virtio-net.c:298 >> #10 virtio_net_set_status (vdev=0x561a5f78ae50, status=0 '\000') at >> ../hw/net/virtio-net.c:372 >> #11 0x561a5ccfb36b in virtio_set_status (vdev=vdev@entry=0
Re: segfault in ovs in setup with DPDK, qemu vhost-user
On 8/1/22 13:57, Claudio Fontana wrote: > On 7/30/22 18:23, Claudio Fontana wrote: >> On 7/30/22 18:17, Claudio Fontana wrote: >>> Hello all, >>> >>> with the latest DPDK, openvswitch and qemu >>> >>> DPDK tag v22.07 >>> openvswitch tag v2.17.1 >>> qemu v7.1-git 22.07.2022 >>> >>> and a DPDK setup which involves also an ubuntu guest with DPDK 16.11 >>> test-pmd application (but also verified with DPDK 19.x), >>> with an external traffic generator to cause some load, >>> >>> I am able to cause a segfault in OVS (ovs-vswitchd) inside the DPDK >>> libraries by doing (from the guest): >>> >>> bind the device, start testpmd, >>> SIGKILL of testpmd, >>> immediately restart testpmd, >>> rinse and repeat. >>> >>> Once every few restarts, the following segfault happens (may take anything >>> from a few seconds to minutes): >>> >>> >>> Thread 153 "pmd-c88/id:150" received signal SIGSEGV, Segmentation fault. >>> [Switching to Thread 0x7f64e5e6b700 (LWP 141373)] >>> rte_mov128blocks (n=2048, src=0xc >> 0xc>, dst=0x150da4480 "h\005\312❇\377\377\377\377\377\377\b") at >>> ../lib/eal/x86/include/rte_memcpy.h:384 >>> 384 ../lib/eal/x86/include/rte_memcpy.h: No such file or directory. >>> (gdb) bt >>> #0 rte_mov128blocks (n=2048, src=0xc >> address 0xc>, >>> dst=0x150da4480 "h\005\312❇\377\377\377\377\377\377\b") at >>> ../lib/eal/x86/include/rte_memcpy.h:384 >>> #1 rte_memcpy_generic (n=2048, src=0xc, dst=0x150da4480) at >>> ../lib/eal/x86/include/rte_memcpy.h:484 >>> #2 rte_memcpy (n=2048, src=0xc, dst=) at >>> ../lib/eal/x86/include/rte_memcpy.h:851 >>> #3 sync_fill_seg (to_desc=false, cpy_len=2048, buf_iova=, >>> buf_addr=12, mbuf_offset=0, m=0x150da4140, >>> vq=0x2200400680, dev=0x2200d3d740) at ../lib/vhost/virtio_net.c:1119 >>> #4 desc_to_mbuf (is_async=false, slot_idx=0, legacy_ol_flags=true, >>> mbuf_pool=0x17fe7df00, m=0x150da4140, nr_vec=, >>> buf_vec=0x7f64e5e67ca0, vq=0x2200400680, dev=0x2200d3d740) at >>> ../lib/vhost/virtio_net.c:2747 >>> #5 virtio_dev_tx_split (legacy_ol_flags=true, count=, >>> count@entry=0, pkts=pkts@entry=0x0, >>> mbuf_pool=mbuf_pool@entry=0x150da4140, vq=vq@entry=0xe5e67d34, >>> dev=dev@entry=0x7f64e5e694d0) at ../lib/vhost/virtio_net.c:2943 >>> #6 virtio_dev_tx_split_legacy (dev=dev@entry=0x2200d3d740, >>> vq=vq@entry=0x2200400680, mbuf_pool=mbuf_pool@entry=0x17fe7df00, >>> pkts=pkts@entry=0x7f64e5e69600, count=count@entry=32) at >>> ../lib/vhost/virtio_net.c:2979 >>> #7 0x7f676fea0fef in rte_vhost_dequeue_burst (vid=vid@entry=0, >>> queue_id=queue_id@entry=1, mbuf_pool=0x17fe7df00, >>> pkts=pkts@entry=0x7f64e5e69600, count=count@entry=32) at >>> ../lib/vhost/virtio_net.c:3331 >>> #8 0x7f6772005a62 in netdev_dpdk_vhost_rxq_recv (rxq=, >>> batch=0x7f64e5e695f0, qfill=0x0) >>> at ../lib/netdev-dpdk.c:2393 >>> #9 0x7f6771f38116 in netdev_rxq_recv (rx=, >>> batch=batch@entry=0x7f64e5e695f0, qfill=) >>> at ../lib/netdev.c:727 >>> #10 0x7f6771f03d96 in dp_netdev_process_rxq_port >>> (pmd=pmd@entry=0x7f64e5e6c010, rxq=0x254d730, port_no=2) >>> at ../lib/dpif-netdev.c:5317 >>> #11 0x7f6771f04239 in pmd_thread_main (f_=) at >>> ../lib/dpif-netdev.c:6945 >>> #12 0x7f6771f92aff in ovsthread_wrapper (aux_=) at >>> ../lib/ovs-thread.c:422 >>> #13 0x7f6771c1b6ea in start_thread () from /lib64/libpthread.so.0 >>> #14 0x7f6771933a8f in clone () from /lib64/libc.so.6 >>> >>> When run in gdb as shown above, ovs-vswitchd on the host gets a SIGSEGV and >>> drops to gdb as shown above, >>> so as a result QEMU stops when trying to read a response from ovs as such: >>> >>> 0 0x7f0a093991e9 in poll () from target:/lib64/libc.so.6 >>> #1 0x7f0a0b06c9a9 in ?? () from target:/usr/lib64/libglib-2.0.so.0 >>> #2 0x7f0a0b06ccf2 in g_main_loop_run () from >>> target:/usr/lib64/libglib-2.0.so.0 >>> #3 0x561a5cd04747 in vhost_user_read (dev=dev@entry=0x561a5e640df0, >>> msg=msg@entry=0x7f09ff7fd160) >>> at ../hw/virtio/vhost-user.c:406 >>> #4 0x561a5cd04c7e in vhost_user_get_vring_base (dev=0x561a5e640df0, >>> ring=0x7f09ff7fd428) >>> at ../hw/virtio/vhost-user.c:1261 >>> #5 0x561a5cd0043f in vhost_virtqueue_stop >>> (dev=dev@entry=0x561a5e640df0, vdev=vdev@entry=0x561a5f78ae50, >>> vq=0x561a5e641070, idx=0) at ../hw/virtio/vhost.c:1216 >>> #6 0x561a5cd034fa in vhost_dev_stop (hdev=hdev@entry=0x561a5e640df0, >>> vdev=vdev@entry=0x561a5f78ae50) >>> at ../hw/virtio/vhost.c:1872 >>> #7 0x561a5cb623fa in vhost_net_stop_one (net=0x561a5e640df0, >>> dev=dev@entry=0x561a5f78ae50) >>> at ../hw/net/vhost_net.c:315 >>> #8 0x561a5cb6295e in vhost_net_stop (dev=dev@entry=0x561a5f78ae50, >>> ncs=0x561a5f808970, >>> data_queue_pairs=data_queue_pairs@entry=4, cvq=cvq@entry=0) at >>> ../hw/net/vhost_net.c:427 >>> #9 0x561a5cccef79 in virtio_net_vhost_status (status=, >>> n=0x561a5f78ae50) >>> at ../hw/net/virtio-net.c:298 >>> #10 virtio_net_se
[PATCH] app/testpmd: add command line argument 'rx-metadata'
Presently, rx metadata is sent to PMD by default, leading to a performance drop as processing for the same in rx path takes extra cycles. Hence, introducing command line argument, 'rx-metadata' to control passing rx metadata to PMD. By default it’s disabled. ci: skip_checkpatch Signed-off-by: Hanumanth Pothula Change-Id: If7b6bbc7489d3e9df89c63e000d71ea2f7fe9afd --- app/test-pmd/parameters.c | 4 app/test-pmd/testpmd.c| 6 +- app/test-pmd/testpmd.h| 2 ++ 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c index e3c9757f3f..daf1218977 100644 --- a/app/test-pmd/parameters.c +++ b/app/test-pmd/parameters.c @@ -213,6 +213,7 @@ usage(char* progname) printf(" --hairpin-mode=0xXX: bitmask set the hairpin port mode.\n" "0x10 - explicit Tx rule, 0x02 - hairpin ports paired\n" "0x01 - hairpin ports loop, 0x00 - hairpin port self\n"); + printf(" --rx-metadata: send rx metadata to driver \n"); } #ifdef RTE_LIB_CMDLINE @@ -710,6 +711,7 @@ launch_args_parse(int argc, char** argv) { "record-burst-stats", 0, 0, 0 }, { PARAM_NUM_PROCS, 1, 0, 0 }, { PARAM_PROC_ID,1, 0, 0 }, + { "rx-metadata",0, 0, 0 }, { 0, 0, 0, 0 }, }; @@ -1510,6 +1512,8 @@ launch_args_parse(int argc, char** argv) num_procs = atoi(optarg); if (!strcmp(lgopts[opt_idx].name, PARAM_PROC_ID)) proc_id = atoi(optarg); + if (!strcmp(lgopts[opt_idx].name, "rx-metadata")) + rx_metadata_negotiate = 1; break; case 'h': usage(argv[0]); diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index addcbcac85..ebbde5dfc9 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -411,6 +411,9 @@ uint8_t clear_ptypes = true; /* Hairpin ports configuration mode. */ uint16_t hairpin_mode; +/* send rx metadata */ +uint8_t rx_metadata_negotiate; + /* Pretty printing of ethdev events */ static const char * const eth_event_desc[] = { [RTE_ETH_EVENT_UNKNOWN] = "unknown", @@ -1628,7 +1631,8 @@ init_config_port_offloads(portid_t pid, uint32_t socket_id) int ret; int i; - eth_rx_metadata_negotiate_mp(pid); + if (rx_metadata_negotiate) + eth_rx_metadata_negotiate_mp(pid); port->dev_conf.txmode = tx_mode; port->dev_conf.rxmode = rx_mode; diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h index fb2f5195d3..8a9168c51e 100644 --- a/app/test-pmd/testpmd.h +++ b/app/test-pmd/testpmd.h @@ -621,6 +621,8 @@ extern struct rte_ether_addr peer_eth_addrs[RTE_MAX_ETHPORTS]; extern uint32_t burst_tx_delay_time; /**< Burst tx delay time(us) for mac-retry. */ extern uint32_t burst_tx_retry_num; /**< Burst tx retry number for mac-retry. */ +extern uint8_t rx_metadata_negotiate; + #ifdef RTE_LIB_GRO #define GRO_DEFAULT_ITEM_NUM_PER_FLOW 32 #define GRO_DEFAULT_FLOW_NUM (RTE_GRO_MAX_BURST_ITEM_NUM / \ -- 2.25.1
[PATCH] app/testpmd: add command line argument 'rx-metadata'
Presently, rx metadata is sent to PMD by default, leading to a performance drop as processing for the same in rx path takes extra cycles. Hence, introducing command line argument, 'rx-metadata' to control passing rx metadata to PMD. By default it’s disabled. Signed-off-by: Hanumanth Pothula --- app/test-pmd/parameters.c | 4 app/test-pmd/testpmd.c| 6 +- app/test-pmd/testpmd.h| 2 ++ 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c index e3c9757f3f..daf1218977 100644 --- a/app/test-pmd/parameters.c +++ b/app/test-pmd/parameters.c @@ -213,6 +213,7 @@ usage(char* progname) printf(" --hairpin-mode=0xXX: bitmask set the hairpin port mode.\n" "0x10 - explicit Tx rule, 0x02 - hairpin ports paired\n" "0x01 - hairpin ports loop, 0x00 - hairpin port self\n"); + printf(" --rx-metadata: send rx metadata to driver \n"); } #ifdef RTE_LIB_CMDLINE @@ -710,6 +711,7 @@ launch_args_parse(int argc, char** argv) { "record-burst-stats", 0, 0, 0 }, { PARAM_NUM_PROCS, 1, 0, 0 }, { PARAM_PROC_ID,1, 0, 0 }, + { "rx-metadata",0, 0, 0 }, { 0, 0, 0, 0 }, }; @@ -1510,6 +1512,8 @@ launch_args_parse(int argc, char** argv) num_procs = atoi(optarg); if (!strcmp(lgopts[opt_idx].name, PARAM_PROC_ID)) proc_id = atoi(optarg); + if (!strcmp(lgopts[opt_idx].name, "rx-metadata")) + rx_metadata_negotiate = 1; break; case 'h': usage(argv[0]); diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index addcbcac85..ebbde5dfc9 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -411,6 +411,9 @@ uint8_t clear_ptypes = true; /* Hairpin ports configuration mode. */ uint16_t hairpin_mode; +/* send rx metadata */ +uint8_t rx_metadata_negotiate; + /* Pretty printing of ethdev events */ static const char * const eth_event_desc[] = { [RTE_ETH_EVENT_UNKNOWN] = "unknown", @@ -1628,7 +1631,8 @@ init_config_port_offloads(portid_t pid, uint32_t socket_id) int ret; int i; - eth_rx_metadata_negotiate_mp(pid); + if (rx_metadata_negotiate) + eth_rx_metadata_negotiate_mp(pid); port->dev_conf.txmode = tx_mode; port->dev_conf.rxmode = rx_mode; diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h index fb2f5195d3..8a9168c51e 100644 --- a/app/test-pmd/testpmd.h +++ b/app/test-pmd/testpmd.h @@ -621,6 +621,8 @@ extern struct rte_ether_addr peer_eth_addrs[RTE_MAX_ETHPORTS]; extern uint32_t burst_tx_delay_time; /**< Burst tx delay time(us) for mac-retry. */ extern uint32_t burst_tx_retry_num; /**< Burst tx retry number for mac-retry. */ +extern uint8_t rx_metadata_negotiate; + #ifdef RTE_LIB_GRO #define GRO_DEFAULT_ITEM_NUM_PER_FLOW 32 #define GRO_DEFAULT_FLOW_NUM (RTE_GRO_MAX_BURST_ITEM_NUM / \ -- 2.25.1
Re: [PATCH v1] graph: fix out of bounds access when re-allocate node objs
On Wed, Jul 27, 2022 at 8:10 AM Zhirun Yan wrote: > > For __rte_node_enqueue_prologue(), If the number of objs is more than > the node->size * 2, the extra objs will write out of bounds memory. > It should use __rte_node_stream_alloc_size() to request enough memory. > > And for rte_node_next_stream_put(), it will re-allocate a small size, > when the node free space is small and new objs is less than the current > node->size. Some objs pointers behind new size may be lost. And it will > cause memory leak. It should request enough size of memory, containing > the original objs and new objs at least. > > Fixes: 40d4f51403ec ("graph: implement fastpath routines") > > Signed-off-by: Zhirun Yan > Signed-off-by: Liang, Cunming > --- > lib/graph/rte_graph_worker.h | 14 ++ > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/lib/graph/rte_graph_worker.h b/lib/graph/rte_graph_worker.h > index 0c0b9c095a..b7d145c3cb 100644 > --- a/lib/graph/rte_graph_worker.h > +++ b/lib/graph/rte_graph_worker.h > @@ -218,13 +218,16 @@ static __rte_always_inline void > __rte_node_enqueue_prologue(struct rte_graph *graph, struct rte_node *node, > const uint16_t idx, const uint16_t space) > { > + uint32_t req_size; > > /* Add to the pending stream list if the node is new */ > if (idx == 0) > __rte_node_enqueue_tail_update(graph, node); > > - if (unlikely(node->size < (idx + space))) > - __rte_node_stream_alloc(graph, node); > + if (unlikely(node->size < (idx + space))) { > + req_size = rte_align32pow2(node->size + space); > + __rte_node_stream_alloc_size(graph, node, req_size); > + } Change looks good to me. Please have an inline function to avoid code duplication(Same change in rte_node_next_stream_get()) With above change: Acked-by: Jerin Jacob > } > > /** > @@ -430,9 +433,12 @@ rte_node_next_stream_get(struct rte_graph *graph, struct > rte_node *node, > node = __rte_node_next_node_get(node, next); > const uint16_t idx = node->idx; > uint16_t free_space = node->size - idx; > + uint32_t req_size; > > - if (unlikely(free_space < nb_objs)) > - __rte_node_stream_alloc_size(graph, node, nb_objs); > + if (unlikely(free_space < nb_objs)) { > + req_size = rte_align32pow2(node->size + nb_objs); > + __rte_node_stream_alloc_size(graph, node, req_size); > + } > > return &node->objs[idx]; > } > -- > 2.25.1 >
Re: Question about pattern types for rte_flow RSS rule
On 8/1/22 16:27, lihuisong (C) wrote: 在 2022/8/1 19:53, Andrew Rybchenko 写道: On 8/1/22 06:39, lihuisong (C) wrote: 在 2022/7/31 17:40, Andrew Rybchenko 写道: Hi, Huisong! On 7/29/22 05:30, lihuisong (C) wrote: Hi Ori, and all, For RSS flow rule, pattern item types and RSS types in action are an inclusive relationship, and RSS types contain pattern item types. I disagree with the statement. We can redirect various packets, but apply RSS on subset only. Everything else goes to the first queue (first entry in the redirection table in fact). Sorry, the statement above is inaccurate. I mean, pattern item type and RSS types in action are the same for creating specified types RSS rule, and this configuration is duplicate in this case(in <1> command, I think). Use one of them can specify which packet type flow performs action(like <1> or <2> command). No, I still disagree. Pattern is about filtering. It defines on which packets flow rule actions are applied. RSS types choose packet header fields (if applicable) to calculate hash. I.e. RSS types are no about filtering. Create a RSS rule that redirect ipv4-tcp flow to queues 1 and 2: <1> flow create 0 ingress pattern eth / ipv4 / tcp / end actions rss types ipv4-tcp end queues 1 2 end / end Yes, this one will redirect IPv4 TCP packets only. <2> flow create 0 ingress pattern end actions rss types ipv4-tcp end queues 1 2 end / end The rule will redirect all packets to queues 1 and 2. Some packets will always go to queue 1, e.g. ARP packets. In this case, only nonfrag ipv4 TCP packet flow calculate RSS hash and may distribute to queue 1 and 2 in hardware, and other packets will always go to queue 1, right? Yes. <3> flow create 0 ingress pattern eth / ipv4 / tcp / end actions rss end queues 1 2 end / end This one will redirect IPv4 TCP and will use RSS hash function specified in dev_conf on device configuration or default RSS hash function chosen by corresponding driver. What do you think about it, @Andrew and @Ori? Is it necessary to set pattern item types when specify RSS types to create a rule? No, it is not strictly required. It depends on what you want. How should the user set and how should the driver do? Pattern and action are not strictly related in the case of RSS. Pattern defines on which packets the rule is applied. Action defines what the rule does. If hash function is not applicable to a packet, e.g. ARP packet and ipv4-tcp hash function, the hash is 0 and goes via redirection table entry 0. I know this rule. Ori has already explained the usage of RSS rule in rte_flow API, but I still have a confusion momentioned above. Andrew. Looking forward to your reply. Regards, Huisong 在 2022/7/13 9:34, lihuisong (C) 写道: Hi all, Can someone open my confusion? I'm looking forward to your reply. Thanks, Huisong. 在 2022/7/7 11:50, lihuisong (C) 写道: Hi all, From following testpmd command: 'flow create 0 ingress pattern eth / ipv4 / tcp / end actions rss types ipv4-tcp l3-src-only end queues end / end' and "flow create 0 ingress pattern end actions rss types ipv4-tcp l3-src-only end queues end / end" I have some confusions about rte_flow RSS rule: 1> Do pattern item types need to set when configure rte_flow RSS rule? 2> Does the driver need to check and process the pattern? (After all, the RSS types in actions alreadly contain all RSS offload types.) Have someone explains it? Regards, Huisong . . . .
Re: [PATCH v9 2/4] ethdev: introduce protocol hdr based buffer split
On 7/21/22 06:24, Ding, Xuan wrote: Hi Andrew, -Original Message- From: Andrew Rybchenko Sent: 2022年7月8日 23:01 To: Wu, WenxuanX ; tho...@monjalon.net; Li, Xiaoyun ; ferruh.yi...@xilinx.com; Singh, Aman Deep ; dev@dpdk.org; Zhang, Yuying ; Zhang, Qi Z ; jerinjac...@gmail.com Cc: step...@networkplumber.org; Ding, Xuan ; Wang, YuanX ; Ray Kinsella Subject: Re: [PATCH v9 2/4] ethdev: introduce protocol hdr based buffer split On 6/13/22 13:25, wenxuanx...@intel.com wrote: From: Wenxuan Wu Currently, Rx buffer split supports length based split. With Rx queue offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT enabled and Rx packet segment configured, PMD will be able to split the received packets into multiple segments. However, length based buffer split is not suitable for NICs that do split based on protocol headers. Given an arbitrarily variable length in Rx packet segment, it is almost impossible to pass a fixed protocol header to driver. Besides, the existence of tunneling results in the composition of a packet is various, which makes the situation even worse. This patch extends current buffer split to support protocol header based buffer split. A new proto_hdr field is introduced in the reserved field of rte_eth_rxseg_split structure to specify protocol header. The proto_hdr field defines the split position of packet, splitting will always happens after the protocol header defined in the Rx packet segment. When Rx queue offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is enabled and corresponding protocol header is configured, driver will split the ingress packets into multiple segments. struct rte_eth_rxseg_split { struct rte_mempool *mp; /* memory pools to allocate segment from */ uint16_t length; /* segment maximal data length, configures "split point" */ uint16_t offset; /* data offset from beginning of mbuf data buffer */ uint32_t proto_hdr; /* inner/outer L2/L3/L4 protocol header, configures "split point" */ There is a big problem here that using RTE_PTYPE_* defines I can't request split after either TCP or UDP header. Sorry, for some reason I missed your reply. Current RTE_PTYPE_* list all the tunnel and L2/L3/L4 protocol headers (both outer and inner). Do you mean that we should support higher layer protocols after L4? I think tunnel and L2/L3/L4 protocol headers are enough. In DPDK, we don't parse higher level protocols after L4. And the higher layer protocols are richer, we can't list all of them. What do you think? It looks like you don't get my point. You simply cannot say: RTE_PTYPE_L4_TCP | RTE_PTYPE_L4_UDP since it is numerically equal to RTE_PTYPE_L4_FRAG. May be the design limitation is acceptable. I have no strong opinion, but it must be clear for all that the limitation exists. }; If both inner and outer L2/L3/L4 level protocol header split can be supported by a PMD. Corresponding protocol header capability is RTE_PTYPE_L2_ETHER, RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV6, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP, RTE_PTYPE_L4_SCTP, RTE_PTYPE_INNER_L2_ETHER, RTE_PTYPE_INNER_L3_IPV4, RTE_PTYPE_INNER_L3_IPV6, RTE_PTYPE_INNER_L4_TCP, RTE_PTYPE_INNER_L4_UDP, RTE_PTYPE_INNER_L4_SCTP. I think there is no point to list above defines here if it is not the only supported defines. Yes, since we use a API to return the protocol header driver supported to split, there is no need to list the incomplete RTE_PTYPE* here. Please see next version. For example, let's suppose we configured the Rx queue with the following segments: seg0 - pool0, proto_hdr0=RTE_PTYPE_L3_IPV4, off0=2B seg1 - pool1, proto_hdr1=RTE_PTYPE_L4_UDP, off1=128B seg2 - pool2, off1=0B The packet consists of MAC_IPV4_UDP_PAYLOAD will be split like following: seg0 - ipv4 header @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0 seg1 - udp header @ 128 in mbuf from pool1 seg2 - payload @ 0 in mbuf from pool2 Sorry, but I still see no definition what should happen with, for example, ARP packet with above config. Thanks, because the following reply was not answered in v8, the definition has not been added in v9 yet. " Our NIC only supports to split the packets into two segments, so there will be an exact match for the only one protocol header configured. Back to this question, for the set of proto_hdrs configured, it can have two behaviors: 1. The aggressive way is to split on longest match you mentioned, E.g. we configure split on ETH-IPV4-TCP, when receives ETH-IPV4-UDP or ETH-IPV6, it can also split on ETH-IPV4 or ETH. 2. A more conservative way is to split only when the packets meet the all protocol headers in the Rx packet segment. In the above situation, it will not do split for ETH-IPV4-UDP and ETH-IPV6. I prefer the second behavior, because the split is usually for the inner most header and payload, if it does not meet, the rest of the h
Re: [PATCH v3] pcapng: fix write more packets than IOV_MAX limit
On Mon, 1 Aug 2022 10:40:56 +0200 Mário Kuka wrote: > The rte_pcapng_write_packets() function fails when we try to write more > packets than the IOV_MAX limit. writev() system call is limited by the > IOV_MAX limit. The iovcnt argument is valid if it is greater than 0 and > less than or equal to IOV_MAX as defined in . > > To avoid this problem, we can check that all segments of the next > packet will fit into the iovec buffer, whose capacity will be limited > by the IOV_MAX limit. If not, we flush the current iovec buffer to the > file by calling writev() and, if successful, fit the current packet at > the beginning of the flushed iovec buffer. > > Fixes: 8d23ce8f5ee9 ("pcapng: add new library for writing pcapng files") > Cc: step...@networkplumber.org > > Signed-off-by: Mário Kuka Thanks for fixing this. Acked-by: Stephen Hemminger
Re: [RFC v3 23/26] dev: hide driver object
On Thu, Jul 28, 2022 at 10:00 AM Bruce Richardson wrote: > > On Thu, Jul 28, 2022 at 05:26:37PM +0200, David Marchand wrote: > > Make rte_driver opaque for non internal users. > > This will make extending this object possible without breaking the ABI. > > > > Introduce a new driver header and move rte_driver definition. > > Update drivers and library to use the internal header. > > > > Some applications may have been dereferencing rte_driver objects, mark > > this object's accessors as stable. > > > > Signed-off-by: David Marchand > > --- > Acked-by: Bruce Richardson Acked-by: Ajit Khaparde smime.p7s Description: S/MIME Cryptographic Signature
Re: [PATCH] app/testpmd: add command line argument 'rx-metadata'
Hi Hanumanth, Thanks for expanding support for the NIC-to-PMD Rx metadata feature. I do not object the idea of the patch, it looks aceeptable. However, please find my comments below. On Mon, 1 Aug 2022, Hanumanth Pothula wrote: Presently, rx metadata is sent to PMD by default, leading to a performance drop as processing for the same in rx path takes extra cycles. Hence, introducing command line argument, 'rx-metadata' to control passing rx metadata to PMD. By default it’s disabled. Signed-off-by: Hanumanth Pothula --- app/test-pmd/parameters.c | 4 app/test-pmd/testpmd.c | 6 +- app/test-pmd/testpmd.h | 2 ++ 3 files changed, 11 insertions(+), 1 deletion(-) diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c index e3c9757f3f..daf1218977 100644 --- a/app/test-pmd/parameters.c +++ b/app/test-pmd/parameters.c @@ -213,6 +213,7 @@ usage(char* progname) printf(" --hairpin-mode=0xXX: bitmask set the hairpin port mode.\n" " 0x10 - explicit Tx rule, 0x02 - hairpin ports paired\n" " 0x01 - hairpin ports loop, 0x00 - hairpin port self\n"); + printf(" --rx-metadata: send rx metadata to driver \n"); In accordance with the above "printf", one should have 2 space characters between the opening double quote character and "--". Also, it is quite common to capitalise "R" in "Rx". The space character before "\n" seems redundant. Regarding the option name, I would appreciate if it gets more precise and clear. Consider: "--nic-to-pmd-rx-metadata". I do not insist on this particular name. Perhaps it pays to have a bit more verbose description as well. Consider: "let the NIC deliver per-packet Rx metadata to PMD". } #ifdef RTE_LIB_CMDLINE @@ -710,6 +711,7 @@ launch_args_parse(int argc, char** argv) { "record-burst-stats", 0, 0, 0 }, { PARAM_NUM_PROCS, 1, 0, 0 }, { PARAM_PROC_ID, 1, 0, 0 }, + { "rx-metadata", 0, 0, 0 }, { 0, 0, 0, 0 }, }; @@ -1510,6 +1512,8 @@ launch_args_parse(int argc, char** argv) num_procs = atoi(optarg); if (!strcmp(lgopts[opt_idx].name, PARAM_PROC_ID)) proc_id = atoi(optarg); + if (!strcmp(lgopts[opt_idx].name, "rx-metadata")) + rx_metadata_negotiate = 1; break; case 'h': usage(argv[0]); diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index addcbcac85..ebbde5dfc9 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -411,6 +411,9 @@ uint8_t clear_ptypes = true; /* Hairpin ports configuration mode. */ uint16_t hairpin_mode; +/* send rx metadata */ Consider: "/* Send Rx metadata */". +uint8_t rx_metadata_negotiate; + /* Pretty printing of ethdev events */ static const char * const eth_event_desc[] = { [RTE_ETH_EVENT_UNKNOWN] = "unknown", @@ -1628,7 +1631,8 @@ init_config_port_offloads(portid_t pid, uint32_t socket_id) int ret; int i; - eth_rx_metadata_negotiate_mp(pid); + if (rx_metadata_negotiate) + eth_rx_metadata_negotiate_mp(pid); port->dev_conf.txmode = tx_mode; port->dev_conf.rxmode = rx_mode; diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h index fb2f5195d3..8a9168c51e 100644 --- a/app/test-pmd/testpmd.h +++ b/app/test-pmd/testpmd.h @@ -621,6 +621,8 @@ extern struct rte_ether_addr peer_eth_addrs[RTE_MAX_ETHPORTS]; extern uint32_t burst_tx_delay_time; /**< Burst tx delay time(us) for mac-retry. */ extern uint32_t burst_tx_retry_num; /**< Burst tx retry number for mac-retry. */ +extern uint8_t rx_metadata_negotiate; + #ifdef RTE_LIB_GRO #define GRO_DEFAULT_ITEM_NUM_PER_FLOW 32 #define GRO_DEFAULT_FLOW_NUM (RTE_GRO_MAX_BURST_ITEM_NUM / \ -- 2.25.1 -- Best regards, Ivan M
[PATCH v2 0/2] vhost fixes for OVS SIGSEGV in PMD
This is an alternative, more general fix compared with PATCH v1. The series fixes a segmentation fault in the OVS PMD thread when resynchronizing with QEMU after the guest application has been killed with SIGKILL (patch 1/2), The segmentation fault can be caused by the guest DPDK application, which is able this way to crash the OVS process on the host, see the backtrace in patch 1/2. Patch 2/2 is an additional improvement in the current error handling. --- Changes from v1: * patch 1/2: instead of only fixing virtio_dev_tx_split, put the check for nr_vec == 0 inside desc_to_mbuf and mbuf_to_desc, so that in no case they attempt to read and dereference addresses from the buf_vec[] array when it does not contain any valid elements. --- For your review and comments, Claudio Claudio Fontana (2): vhost: check for nr_vec == 0 in desc_to_mbuf, mbuf_to_desc vhost: improve error handling in desc_to_mbuf lib/vhost/virtio_net.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) -- 2.26.2
[PATCH 1/2] vhost: check for nr_vec == 0 in desc_to_mbuf, mbuf_to_desc
in virtio_dev_split we cannot currently call desc_to_mbuf with nr_vec == 0, or we end up trying to rte_memcpy from a source address buf_vec[0] that is an uninitialized stack variable. Improve this in general by having desc_to_mbuf and mbuf_to_desc return -1 when called with an invalid nr_vec == 0, which should fix any other instance of this problem. This should fix errors that have been reported in multiple occasions from telcos to the DPDK, OVS and QEMU projects, as this affects in particular the openvswitch/DPDK, QEMU vhost-user setup when the guest DPDK application abruptly goes away via SIGKILL and then reconnects. The back trace looks roughly like this, depending on the specific rte_memcpy selected, etc, in any case the "src" parameter is garbage (in this example containing 0 + dev->host_hlen(12 = 0xc)). Thread 153 "pmd-c88/id:150" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7f64e5e6b700 (LWP 141373)] rte_mov128blocks (n=2048, src=0xc , dst=0x150da4480) at ../lib/eal/x86/include/rte_memcpy.h:384 (gdb) bt 0 rte_mov128blocks (n=2048, src=0xc, dst=0x150da4480) 1 rte_memcpy_generic (n=2048, src=0xc, dst=0x150da4480) 2 rte_memcpy (n=2048, src=0xc, dst=) 3 sync_fill_seg 4 desc_to_mbuf 5 virtio_dev_tx_split 6 virtio_dev_tx_split_legacy 7 0x7f676fea0fef in rte_vhost_dequeue_burst 8 0x7f6772005a62 in netdev_dpdk_vhost_rxq_recv 9 0x7f6771f38116 in netdev_rxq_recv 10 0x7f6771f03d96 in dp_netdev_process_rxq_port 11 0x7f6771f04239 in pmd_thread_main 12 0x7f6771f92aff in ovsthread_wrapper 13 0x7f6771c1b6ea in start_thread 14 0x7f6771933a8f in clone Tested-by: Claudio Fontana Signed-off-by: Claudio Fontana --- lib/vhost/virtio_net.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c index 35fa4670fd..8d0d223983 100644 --- a/lib/vhost/virtio_net.c +++ b/lib/vhost/virtio_net.c @@ -1153,7 +1153,7 @@ mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, struct virtio_net_hdr_mrg_rxbuf tmp_hdr, *hdr = NULL; struct vhost_async *async = vq->async; - if (unlikely(m == NULL)) + if (unlikely(m == NULL) || nr_vec == 0) return -1; buf_addr = buf_vec[vec_idx].buf_addr; @@ -2673,6 +2673,9 @@ desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, struct vhost_async *async = vq->async; struct async_inflight_info *pkts_info; +if (unlikely(nr_vec == 0)) { + return -1; +} buf_addr = buf_vec[vec_idx].buf_addr; buf_iova = buf_vec[vec_idx].buf_iova; buf_len = buf_vec[vec_idx].buf_len; @@ -2917,9 +2920,11 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq, vq->last_avail_idx + i, &nr_vec, buf_vec, &head_idx, &buf_len, - VHOST_ACCESS_RO) < 0)) + VHOST_ACCESS_RO) < 0)) { + dropped += 1; + i++; break; - + } update_shadow_used_ring_split(vq, head_idx, 0); err = virtio_dev_pktmbuf_prep(dev, pkts[i], buf_len); -- 2.26.2
[PATCH 2/2] vhost: improve error handling in desc_to_mbuf
check when increasing vec_idx that it is still valid in the (buf_len < dev->vhost_hlen) case too. Signed-off-by: Claudio Fontana --- lib/vhost/virtio_net.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c index 8d0d223983..229e484f2d 100644 --- a/lib/vhost/virtio_net.c +++ b/lib/vhost/virtio_net.c @@ -2704,12 +2704,15 @@ desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, if (unlikely(buf_len < dev->vhost_hlen)) { buf_offset = dev->vhost_hlen - buf_len; vec_idx++; + if (unlikely(vec_idx >= nr_vec)) + goto error; buf_addr = buf_vec[vec_idx].buf_addr; buf_iova = buf_vec[vec_idx].buf_iova; buf_len = buf_vec[vec_idx].buf_len; buf_avail = buf_len - buf_offset; } else if (buf_len == dev->vhost_hlen) { - if (unlikely(++vec_idx >= nr_vec)) + vec_idx++; + if (unlikely(vec_idx >= nr_vec)) goto error; buf_addr = buf_vec[vec_idx].buf_addr; buf_iova = buf_vec[vec_idx].buf_iova; -- 2.26.2
[PATCH v3 0/2] vhost fixes for OVS SIGSEGV in PMD
This is an alternative, more general fix compared with PATCH v1, and fixes style issues in v2. The series fixes a segmentation fault in the OVS PMD thread when resynchronizing with QEMU after the guest application has been killed with SIGKILL (patch 1/2), The segmentation fault can be caused by the guest DPDK application, which is able this way to crash the OVS process on the host, see the backtrace in patch 1/2. Patch 2/2 is an additional improvement in the current error handling. --- Changes from v2: fix warnings from checkpatch. --- Changes from v1: * patch 1/2: instead of only fixing virtio_dev_tx_split, put the check for nr_vec == 0 inside desc_to_mbuf and mbuf_to_desc, so that in no case they attempt to read and dereference addresses from the buf_vec[] array when it does not contain any valid elements. --- For your review and comments, Claudio Claudio Fontana (2): vhost: check for nr_vec == 0 in desc_to_mbuf, mbuf_to_desc vhost: improve error handling in desc_to_mbuf lib/vhost/virtio_net.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) -- 2.26.2
[PATCH v3 1/2] vhost: check for nr_vec == 0 in desc_to_mbuf, mbuf_to_desc
in virtio_dev_split we cannot currently call desc_to_mbuf with nr_vec == 0, or we end up trying to rte_memcpy from a source address buf_vec[0] that is an uninitialized stack variable. Improve this in general by having desc_to_mbuf and mbuf_to_desc return -1 when called with an invalid nr_vec == 0, which should fix any other instance of this problem. This should fix errors that have been reported in multiple occasions from telcos to the DPDK, OVS and QEMU projects, as this affects in particular the openvswitch/DPDK, QEMU vhost-user setup when the guest DPDK application abruptly goes away via SIGKILL and then reconnects. The back trace looks roughly like this, depending on the specific rte_memcpy selected, etc, in any case the "src" parameter is garbage (in this example containing 0 + dev->host_hlen(12 = 0xc)). Thread 153 "pmd-c88/id:150" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7f64e5e6b700 (LWP 141373)] rte_mov128blocks (n=2048, src=0xc , dst=0x150da4480) at ../lib/eal/x86/include/rte_memcpy.h:384 (gdb) bt 0 rte_mov128blocks (n=2048, src=0xc, dst=0x150da4480) 1 rte_memcpy_generic (n=2048, src=0xc, dst=0x150da4480) 2 rte_memcpy (n=2048, src=0xc, dst=) 3 sync_fill_seg 4 desc_to_mbuf 5 virtio_dev_tx_split 6 virtio_dev_tx_split_legacy 7 0x7f676fea0fef in rte_vhost_dequeue_burst 8 0x7f6772005a62 in netdev_dpdk_vhost_rxq_recv 9 0x7f6771f38116 in netdev_rxq_recv 10 0x7f6771f03d96 in dp_netdev_process_rxq_port 11 0x7f6771f04239 in pmd_thread_main 12 0x7f6771f92aff in ovsthread_wrapper 13 0x7f6771c1b6ea in start_thread 14 0x7f6771933a8f in clone Tested-by: Claudio Fontana Signed-off-by: Claudio Fontana --- lib/vhost/virtio_net.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c index 35fa4670fd..eb19e54c2b 100644 --- a/lib/vhost/virtio_net.c +++ b/lib/vhost/virtio_net.c @@ -1153,7 +1153,7 @@ mbuf_to_desc(struct virtio_net *dev, struct vhost_virtqueue *vq, struct virtio_net_hdr_mrg_rxbuf tmp_hdr, *hdr = NULL; struct vhost_async *async = vq->async; - if (unlikely(m == NULL)) + if (unlikely(m == NULL) || nr_vec == 0) return -1; buf_addr = buf_vec[vec_idx].buf_addr; @@ -2673,6 +2673,9 @@ desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, struct vhost_async *async = vq->async; struct async_inflight_info *pkts_info; + if (unlikely(nr_vec == 0)) + return -1; + buf_addr = buf_vec[vec_idx].buf_addr; buf_iova = buf_vec[vec_idx].buf_iova; buf_len = buf_vec[vec_idx].buf_len; @@ -2917,9 +2920,11 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq, vq->last_avail_idx + i, &nr_vec, buf_vec, &head_idx, &buf_len, - VHOST_ACCESS_RO) < 0)) + VHOST_ACCESS_RO) < 0)) { + dropped += 1; + i++; break; - + } update_shadow_used_ring_split(vq, head_idx, 0); err = virtio_dev_pktmbuf_prep(dev, pkts[i], buf_len); -- 2.26.2
[PATCH v3 2/2] vhost: improve error handling in desc_to_mbuf
check when increasing vec_idx that it is still valid in the (buf_len < dev->vhost_hlen) case too. Tested-by: Claudio Fontana Signed-off-by: Claudio Fontana --- lib/vhost/virtio_net.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c index eb19e54c2b..20ed951979 100644 --- a/lib/vhost/virtio_net.c +++ b/lib/vhost/virtio_net.c @@ -2704,12 +2704,15 @@ desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq, if (unlikely(buf_len < dev->vhost_hlen)) { buf_offset = dev->vhost_hlen - buf_len; vec_idx++; + if (unlikely(vec_idx >= nr_vec)) + goto error; buf_addr = buf_vec[vec_idx].buf_addr; buf_iova = buf_vec[vec_idx].buf_iova; buf_len = buf_vec[vec_idx].buf_len; buf_avail = buf_len - buf_offset; } else if (buf_len == dev->vhost_hlen) { - if (unlikely(++vec_idx >= nr_vec)) + vec_idx++; + if (unlikely(vec_idx >= nr_vec)) goto error; buf_addr = buf_vec[vec_idx].buf_addr; buf_iova = buf_vec[vec_idx].buf_iova; -- 2.26.2
Re: [PATCH v3 1/2] vhost: check for nr_vec == 0 in desc_to_mbuf, mbuf_to_desc
On Tue, 2 Aug 2022 02:49:37 +0200 Claudio Fontana wrote: > Tested-by: Claudio Fontana > Signed-off-by: Claudio Fontana Minor nit having both tags by same author is redundant and unnecessary.
RE: [PATCH v4] net/i40e: fix the issue caused by PF and VF release order
Hi Ke, Please update commit log. LGTM. > -Original Message- > From: Zhang, Ke1X > Sent: Friday, July 15, 2022 5:04 PM > To: tho...@monjalon.net; Zhang, Yuying ; Xing, > Beilei ; ferruh.yi...@xilinx.com; Zhou, YidingX > ; dev@dpdk.org > Cc: Zhang, Ke1X ; sta...@dpdk.org > Subject: [PATCH v4] net/i40e: fix the issue caused by PF and VF release order > > A segmentation fault occurs when testpmd exit. > > This is due to fetching the device name from PF, PF is freed firstly and then > VF > representor is called later. > > This commit fixes the bug by fetching the device name from VF representor > not PF. instead of PF > > Fixes: e391a7b7f815 ("net/i40e: fix multi-process shared data") > Cc: sta...@dpdk.org > > Signed-off-by: Ke Zhang Acked-by: Yuying Zhang > > --- > v4: Update the commit log > v3: Change the design and fix code in driver > v2: Change the testpmd code to fix this issue > --- > --- > drivers/net/i40e/i40e_vf_representor.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/drivers/net/i40e/i40e_vf_representor.c > b/drivers/net/i40e/i40e_vf_representor.c > index 7f8e81858e..bcd445bcdd 100644 > --- a/drivers/net/i40e/i40e_vf_representor.c > +++ b/drivers/net/i40e/i40e_vf_representor.c > @@ -29,8 +29,6 @@ i40e_vf_representor_dev_infos_get(struct rte_eth_dev > *ethdev, > struct rte_eth_dev_info *dev_info) > { > struct i40e_vf_representor *representor = ethdev->data- > >dev_private; > - struct rte_eth_dev_data *pf_dev_data = > - representor->adapter->pf.dev_data; > > /* get dev info for the vdev */ > dev_info->device = ethdev->device; > @@ -104,7 +102,7 @@ i40e_vf_representor_dev_infos_get(struct > rte_eth_dev *ethdev, > }; > > dev_info->switch_info.name = > - rte_eth_devices[pf_dev_data->port_id].device->name; > + rte_eth_devices[ethdev->data->port_id].device->name; > dev_info->switch_info.domain_id = representor->switch_domain_id; > dev_info->switch_info.port_id = representor->vf_id; > > -- > 2.25.1
Re: [PATCH] net/bnxt: fix null pointer dereference in bnxt_hwrm_port_led_cfg()
On Tue, Aug 2, 2022 at 8:33 AM Mao YingMing wrote: > Thanks for the patch > From: maoyingming > > VFs's "bp->leds" is allways null, check bp->leds is > not null before use bp->leds->num_leds. Typo in 'always' You can just say ' For VFs , bp->leds is uninitialized' > > segfault backtrace in trex program when use VF: > 11: bnxt_hwrm_port_led_cfg (bp=0x23ffb2140, led_on=true) > 10: bnxt_dev_led_on_op (dev=0x22d7780 ) > 9: rte_eth_led_on (port_id=0) > 8: DpdkTRexPortAttr::set_led (this=0x23b6ce0, on=true) > 7: DpdkTRexPortAttr::DpdkTRexPortAttr > 6: CTRexExtendedDriverBnxt::create_port_attr > 5: CPhyEthIF::Create > 4: CGlobalTRex::device_start > 3: CGlobalTRex::Create > 2: main_test > 1: main > > Fixes: d4d5a04 ("net/bnxt: fix unnecessary memory allocation") > Cc: sta...@dpdk.org > > Signed-off-by: Mao YingMing > --- > drivers/net/bnxt/bnxt_hwrm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c > index 9c52573..41e6067 100644 > --- a/drivers/net/bnxt/bnxt_hwrm.c > +++ b/drivers/net/bnxt/bnxt_hwrm.c > @@ -4535,7 +4535,7 @@ int bnxt_hwrm_port_led_cfg(struct bnxt *bp, bool led_on) > uint16_t duration = 0; > int rc, i; > > - if (!bp->leds->num_leds || BNXT_VF(bp)) > + if (BNXT_VF(bp) || (!bp->leds) || (!bp->leds->num_leds)) > return -EOPNOTSUPP; > > HWRM_PREP(&req, HWRM_PORT_LED_CFG, BNXT_USE_CHIMP_MB); > -- > 1.8.3.1 > smime.p7s Description: S/MIME Cryptographic Signature
[PATCH] net/bnxt: fix null pointer dereference in bnxt_hwrm_port_led_cfg()
From: maoyingming VFs's "bp->leds" is allways null, check bp->leds is not null before use bp->leds->num_leds. segfault backtrace in trex program when use VF: 11: bnxt_hwrm_port_led_cfg (bp=0x23ffb2140, led_on=true) 10: bnxt_dev_led_on_op (dev=0x22d7780 ) 9: rte_eth_led_on (port_id=0) 8: DpdkTRexPortAttr::set_led (this=0x23b6ce0, on=true) 7: DpdkTRexPortAttr::DpdkTRexPortAttr 6: CTRexExtendedDriverBnxt::create_port_attr 5: CPhyEthIF::Create 4: CGlobalTRex::device_start 3: CGlobalTRex::Create 2: main_test 1: main Fixes: d4d5a04 ("net/bnxt: fix unnecessary memory allocation") Cc: sta...@dpdk.org Signed-off-by: Mao YingMing --- drivers/net/bnxt/bnxt_hwrm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c index 9c52573..41e6067 100644 --- a/drivers/net/bnxt/bnxt_hwrm.c +++ b/drivers/net/bnxt/bnxt_hwrm.c @@ -4535,7 +4535,7 @@ int bnxt_hwrm_port_led_cfg(struct bnxt *bp, bool led_on) uint16_t duration = 0; int rc, i; - if (!bp->leds->num_leds || BNXT_VF(bp)) + if (BNXT_VF(bp) || (!bp->leds) || (!bp->leds->num_leds)) return -EOPNOTSUPP; HWRM_PREP(&req, HWRM_PORT_LED_CFG, BNXT_USE_CHIMP_MB); -- 1.8.3.1
Re: 21.11.2 patches review and test
Hi Luca, The dpdk 21.11.2-rc1 test result from Red Hat looks good. We tested below 17 scenarios and all got PASS on RHEL8: - Guest with device assignment(PF) throughput testing(1G hugepage size): PASS - Guest with device assignment(PF) throughput testing(2M hugepage size) : PASS - Guest with device assignment(VF) throughput testing: PASS - PVP (host dpdk testpmd as vswitch) 1Q: throughput testing: PASS - PVP vhost-user 2Q throughput testing: PASS - PVP vhost-user 1Q - cross numa node throughput testing: PASS - Guest with vhost-user 2 queues throughput testing: PASS - vhost-user reconnect with dpdk-client, qemu-server: qemu reconnect: PASS - vhost-user reconnect with dpdk-client, qemu-server: ovs reconnect: PASS - PVP 1Q live migration testing: PASS - PVP 1Q cross numa node live migration testing: PASS - Guest with ovs+dpdk+vhost-user 1Q live migration testing: PASS - Guest with ovs+dpdk+vhost-user 1Q live migration testing (2M): PASS - Guest with ovs+dpdk+vhost-user 2Q live migration testing: PASS - Guest with ovs+dpdk+vhost-user 4Q live migration testing: PASS - Host PF + DPDK testing: PASS - Host VF + DPDK testing: PASS Versions: - kernel 4.18 - qemu 6.2 - dpdk: git://dpdk.org/dpdk-stable (remotes/origin/21.11) - # git log Author: Luca Boccassi Date: Mon Jul 18 09:56:26 2022 +0100 version: 21.11.2-rc1 Signed-off-by: Luca Boccassi - NICs: X540-AT2 NIC(ixgbe, 10G) Best Regards, YangHang Liu On Thu, Jul 28, 2022 at 8:34 PM Luca Boccassi wrote: > On Mon, 2022-07-18 at 10:58 +0100, luca.bocca...@gmail.com wrote: > > Hi all, > > > > Here is a list of patches targeted for stable release 21.11.2. > > > > The planned date for the final release is August 29th. > > > > Please help with testing and validation of your use cases and report > > any issues/results with reply-all to this mail. For the final release > > the fixes and reported validations will be added to the release notes. > > > > A release candidate tarball can be found at: > > > > https://dpdk.org/browse/dpdk-stable/tag/?id=v21.11.2-rc1 > > > > These patches are located at branch 21.11 of dpdk-stable repo: > > https://dpdk.org/browse/dpdk-stable/ > > > > Thanks. > > > > Luca Boccassi > > Hello, > > Any update from any of the validation teams? Any indication on how the > tests are going? > > -- > Kind regards, > Luca Boccassi > >
[PATCHv2] net/bnxt: fix null pointer dereference in bnxt_hwrm_port_led_cfg()
For VFs, bp->leds is uninitialized, check bp->leds is not null before use bp->leds->num_leds. segfault backtrace in trex program when use VF: 11: bnxt_hwrm_port_led_cfg (bp=0x23ffb2140, led_on=true) 10: bnxt_dev_led_on_op (dev=0x22d7780 ) 9: rte_eth_led_on (port_id=0) 8: DpdkTRexPortAttr::set_led (this=0x23b6ce0, on=true) 7: DpdkTRexPortAttr::DpdkTRexPortAttr 6: CTRexExtendedDriverBnxt::create_port_attr 5: CPhyEthIF::Create 4: CGlobalTRex::device_start 3: CGlobalTRex::Create 2: main_test 1: main Fixes: d4d5a04 ("net/bnxt: fix unnecessary memory allocation") Cc: sta...@dpdk.org Signed-off-by: Mao YingMing --- drivers/net/bnxt/bnxt_hwrm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c index 9c52573..51e1e2d 100644 --- a/drivers/net/bnxt/bnxt_hwrm.c +++ b/drivers/net/bnxt/bnxt_hwrm.c @@ -4535,7 +4535,7 @@ int bnxt_hwrm_port_led_cfg(struct bnxt *bp, bool led_on) uint16_t duration = 0; int rc, i; - if (!bp->leds->num_leds || BNXT_VF(bp)) + if (BNXT_VF(bp) || !bp->leds || !bp->leds->num_leds) return -EOPNOTSUPP; HWRM_PREP(&req, HWRM_PORT_LED_CFG, BNXT_USE_CHIMP_MB); -- 1.8.3.1
Re: [PATCHv2] net/bnxt: fix null pointer dereference in bnxt_hwrm_port_led_cfg()
On Tue, Aug 2, 2022 at 9:17 AM Mao YingMing wrote: > > For VFs, bp->leds is uninitialized, check bp->leds is > not null before use bp->leds->num_leds. Sorry for missing this the first time around, Rephrase it to say 'before checking for bp->leds->num_leds' > segfault backtrace in trex program when use VF: > 11: bnxt_hwrm_port_led_cfg (bp=0x23ffb2140, led_on=true) > 10: bnxt_dev_led_on_op (dev=0x22d7780 ) > 9: rte_eth_led_on (port_id=0) > 8: DpdkTRexPortAttr::set_led (this=0x23b6ce0, on=true) > 7: DpdkTRexPortAttr::DpdkTRexPortAttr > 6: CTRexExtendedDriverBnxt::create_port_attr > 5: CPhyEthIF::Create > 4: CGlobalTRex::device_start > 3: CGlobalTRex::Create > 2: main_test > 1: main > > Fixes: d4d5a04 ("net/bnxt: fix unnecessary memory allocation") > Cc: sta...@dpdk.org > > Signed-off-by: Mao YingMing > --- > drivers/net/bnxt/bnxt_hwrm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c > index 9c52573..51e1e2d 100644 > --- a/drivers/net/bnxt/bnxt_hwrm.c > +++ b/drivers/net/bnxt/bnxt_hwrm.c > @@ -4535,7 +4535,7 @@ int bnxt_hwrm_port_led_cfg(struct bnxt *bp, bool led_on) > uint16_t duration = 0; > int rc, i; > > - if (!bp->leds->num_leds || BNXT_VF(bp)) > + if (BNXT_VF(bp) || !bp->leds || !bp->leds->num_leds) > return -EOPNOTSUPP; > > HWRM_PREP(&req, HWRM_PORT_LED_CFG, BNXT_USE_CHIMP_MB); > -- > 1.8.3.1 > smime.p7s Description: S/MIME Cryptographic Signature
[PATCH v3] net/bnxt: fix null pointer dereference in bnxt_hwrm_port_led_cfg()
For VFs, bp->leds is uninitialized, check bp->leds is not null before checking for bp->leds->num_leds. segfault backtrace in trex program when use VF: 11: bnxt_hwrm_port_led_cfg (bp=0x23ffb2140, led_on=true) 10: bnxt_dev_led_on_op (dev=0x22d7780 ) 9: rte_eth_led_on (port_id=0) 8: DpdkTRexPortAttr::set_led (this=0x23b6ce0, on=true) 7: DpdkTRexPortAttr::DpdkTRexPortAttr 6: CTRexExtendedDriverBnxt::create_port_attr 5: CPhyEthIF::Create 4: CGlobalTRex::device_start 3: CGlobalTRex::Create 2: main_test 1: main Fixes: d4d5a04 ("net/bnxt: fix unnecessary memory allocation") Cc: sta...@dpdk.org Signed-off-by: Mao YingMing --- drivers/net/bnxt/bnxt_hwrm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c index 9c52573..51e1e2d 100644 --- a/drivers/net/bnxt/bnxt_hwrm.c +++ b/drivers/net/bnxt/bnxt_hwrm.c @@ -4535,7 +4535,7 @@ int bnxt_hwrm_port_led_cfg(struct bnxt *bp, bool led_on) uint16_t duration = 0; int rc, i; - if (!bp->leds->num_leds || BNXT_VF(bp)) + if (BNXT_VF(bp) || !bp->leds || !bp->leds->num_leds) return -EOPNOTSUPP; HWRM_PREP(&req, HWRM_PORT_LED_CFG, BNXT_USE_CHIMP_MB); -- 1.8.3.1
Re: [PATCH v3] net/bnxt: fix null pointer dereference in bnxt_hwrm_port_led_cfg()
On Tue, Aug 2, 2022 at 9:38 AM Mao YingMing wrote: > > For VFs, bp->leds is uninitialized, check bp->leds is > not null before checking for bp->leds->num_leds. > > segfault backtrace in trex program when use VF: > 11: bnxt_hwrm_port_led_cfg (bp=0x23ffb2140, led_on=true) > 10: bnxt_dev_led_on_op (dev=0x22d7780 ) > 9: rte_eth_led_on (port_id=0) > 8: DpdkTRexPortAttr::set_led (this=0x23b6ce0, on=true) > 7: DpdkTRexPortAttr::DpdkTRexPortAttr > 6: CTRexExtendedDriverBnxt::create_port_attr > 5: CPhyEthIF::Create > 4: CGlobalTRex::device_start > 3: CGlobalTRex::Create > 2: main_test > 1: main > > Fixes: d4d5a04 ("net/bnxt: fix unnecessary memory allocation") > Cc: sta...@dpdk.org > > Signed-off-by: Mao YingMing > --- > drivers/net/bnxt/bnxt_hwrm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c > index 9c52573..51e1e2d 100644 > --- a/drivers/net/bnxt/bnxt_hwrm.c > +++ b/drivers/net/bnxt/bnxt_hwrm.c > @@ -4535,7 +4535,7 @@ int bnxt_hwrm_port_led_cfg(struct bnxt *bp, bool led_on) > uint16_t duration = 0; > int rc, i; > > - if (!bp->leds->num_leds || BNXT_VF(bp)) > + if (BNXT_VF(bp) || !bp->leds || !bp->leds->num_leds) > return -EOPNOTSUPP; > > HWRM_PREP(&req, HWRM_PORT_LED_CFG, BNXT_USE_CHIMP_MB); > -- > 1.8.3.1 > Acked-by: Somnath Kotur smime.p7s Description: S/MIME Cryptographic Signature
答复: [PATCH v3] net/bnxt: fix null pointer dereference in bnxt_hwrm_port_led_cfg()
Thans for your review. Mao Yingming -邮件原件- 发件人: Somnath Kotur 发送时间: 2022年8月2日 12:09 收件人: Mao,Yingming 抄送: dev@dpdk.org; Ajit Khaparde ; Kalesh AP ; Thomas Monjalon 主题: Re: [PATCH v3] net/bnxt: fix null pointer dereference in bnxt_hwrm_port_led_cfg() On Tue, Aug 2, 2022 at 9:38 AM Mao YingMing wrote: > > For VFs, bp->leds is uninitialized, check bp->leds is > not null before checking for bp->leds->num_leds. > > segfault backtrace in trex program when use VF: > 11: bnxt_hwrm_port_led_cfg (bp=0x23ffb2140, led_on=true) > 10: bnxt_dev_led_on_op (dev=0x22d7780 ) > 9: rte_eth_led_on (port_id=0) > 8: DpdkTRexPortAttr::set_led (this=0x23b6ce0, on=true) > 7: DpdkTRexPortAttr::DpdkTRexPortAttr > 6: CTRexExtendedDriverBnxt::create_port_attr > 5: CPhyEthIF::Create > 4: CGlobalTRex::device_start > 3: CGlobalTRex::Create > 2: main_test > 1: main > > Fixes: d4d5a04 ("net/bnxt: fix unnecessary memory allocation") > Cc: sta...@dpdk.org > > Signed-off-by: Mao YingMing > --- > drivers/net/bnxt/bnxt_hwrm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c > index 9c52573..51e1e2d 100644 > --- a/drivers/net/bnxt/bnxt_hwrm.c > +++ b/drivers/net/bnxt/bnxt_hwrm.c > @@ -4535,7 +4535,7 @@ int bnxt_hwrm_port_led_cfg(struct bnxt *bp, bool led_on) > uint16_t duration = 0; > int rc, i; > > - if (!bp->leds->num_leds || BNXT_VF(bp)) > + if (BNXT_VF(bp) || !bp->leds || !bp->leds->num_leds) > return -EOPNOTSUPP; > > HWRM_PREP(&req, HWRM_PORT_LED_CFG, BNXT_USE_CHIMP_MB); > -- > 1.8.3.1 > Acked-by: Somnath Kotur