Re: [PATCH V3 5/5] arm64: dts: renesas: beacon kits: Setup AVB refclk
On Wed, Feb 24, 2021 at 12:52 PM Adam Ford wrote: > The AVB refererence clock assumes an external clock that runs reference > automatically. Because the Versaclock is wired to provide the > AVB refclock, the device tree needs to reference it in order for the > driver to start the clock. > > Signed-off-by: Adam Ford Reviewed-by: Geert Uytterhoeven i.e. will queue in renesas-devel (with the typo fixed) once the DT bindings have been accepted. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH V3 4/5] net: ethernet: ravb: Enable optional refclk
Hi Adam, On Wed, Feb 24, 2021 at 12:52 PM Adam Ford wrote: > For devices that use a programmable clock for the AVB reference clock, > the driver may need to enable them. Add code to find the optional clock > and enable it when available. > > Signed-off-by: Adam Ford Thanks for your patch! > --- a/drivers/net/ethernet/renesas/ravb_main.c > +++ b/drivers/net/ethernet/renesas/ravb_main.c > @@ -2148,6 +2148,13 @@ static int ravb_probe(struct platform_device *pdev) > goto out_release; > } > > + priv->refclk = devm_clk_get_optional(&pdev->dev, "refclk"); > + if (IS_ERR(priv->refclk)) { > + error = PTR_ERR(priv->refclk); > + goto out_release; > + } > + clk_prepare_enable(priv->refclk); > + Shouldn't the reference clock be disabled in case of any failure below? > ndev->max_mtu = 2048 - (ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN); > ndev->min_mtu = ETH_MIN_MTU; > > @@ -2260,6 +2267,9 @@ static int ravb_remove(struct platform_device *pdev) > if (priv->chip_id != RCAR_GEN2) > ravb_ptp_stop(ndev); > > + if (priv->refclk) > + clk_disable_unprepare(priv->refclk); > + > dma_free_coherent(ndev->dev.parent, priv->desc_bat_size, > priv->desc_bat, > priv->desc_bat_dma); > /* Set reset mode */ Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: Re: [RFC v4 07/11] vduse: Introduce VDUSE - vDPA Device in Userspace
On Thu, Mar 4, 2021 at 2:27 PM Jason Wang wrote: > > > On 2021/2/23 7:50 下午, Xie Yongji wrote: > > This VDUSE driver enables implementing vDPA devices in userspace. > > Both control path and data path of vDPA devices will be able to > > be handled in userspace. > > > > In the control path, the VDUSE driver will make use of message > > mechnism to forward the config operation from vdpa bus driver > > to userspace. Userspace can use read()/write() to receive/reply > > those control messages. > > > > In the data path, VDUSE_IOTLB_GET_FD ioctl will be used to get > > the file descriptors referring to vDPA device's iova regions. Then > > userspace can use mmap() to access those iova regions. Besides, > > userspace can use ioctl() to inject interrupt and use the eventfd > > mechanism to receive virtqueue kicks. > > > > Signed-off-by: Xie Yongji > > --- > > Documentation/userspace-api/ioctl/ioctl-number.rst |1 + > > drivers/vdpa/Kconfig | 10 + > > drivers/vdpa/Makefile |1 + > > drivers/vdpa/vdpa_user/Makefile|5 + > > drivers/vdpa/vdpa_user/vduse_dev.c | 1348 > > > > include/uapi/linux/vduse.h | 136 ++ > > 6 files changed, 1501 insertions(+) > > create mode 100644 drivers/vdpa/vdpa_user/Makefile > > create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c > > create mode 100644 include/uapi/linux/vduse.h > > > > diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst > > b/Documentation/userspace-api/ioctl/ioctl-number.rst > > index a4c75a28c839..71722e6f8f23 100644 > > --- a/Documentation/userspace-api/ioctl/ioctl-number.rst > > +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst > > @@ -300,6 +300,7 @@ Code Seq#Include File > > Comments > > 'z' 10-4F drivers/s390/crypto/zcrypt_api.h > > conflict! > > '|' 00-7F linux/media.h > > 0x80 00-1F linux/fb.h > > +0x81 00-1F linux/vduse.h > > 0x89 00-06 arch/x86/include/asm/sockios.h > > 0x89 0B-DF linux/sockios.h > > 0x89 E0-EF linux/sockios.h > > SIOCPROTOPRIVATE range > > diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig > > index ffd1e098bfd2..92f07715e3b6 100644 > > --- a/drivers/vdpa/Kconfig > > +++ b/drivers/vdpa/Kconfig > > @@ -25,6 +25,16 @@ config VDPA_SIM_NET > > help > > vDPA networking device simulator which loops TX traffic back to RX. > > > > +config VDPA_USER > > + tristate "VDUSE (vDPA Device in Userspace) support" > > + depends on EVENTFD && MMU && HAS_DMA > > + select DMA_OPS > > + select VHOST_IOTLB > > + select IOMMU_IOVA > > + help > > + With VDUSE it is possible to emulate a vDPA Device > > + in a userspace program. > > + > > config IFCVF > > tristate "Intel IFC VF vDPA driver" > > depends on PCI_MSI > > diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile > > index d160e9b63a66..66e97778ad03 100644 > > --- a/drivers/vdpa/Makefile > > +++ b/drivers/vdpa/Makefile > > @@ -1,5 +1,6 @@ > > # SPDX-License-Identifier: GPL-2.0 > > obj-$(CONFIG_VDPA) += vdpa.o > > obj-$(CONFIG_VDPA_SIM) += vdpa_sim/ > > +obj-$(CONFIG_VDPA_USER) += vdpa_user/ > > obj-$(CONFIG_IFCVF)+= ifcvf/ > > obj-$(CONFIG_MLX5_VDPA) += mlx5/ > > diff --git a/drivers/vdpa/vdpa_user/Makefile > > b/drivers/vdpa/vdpa_user/Makefile > > new file mode 100644 > > index ..260e0b26af99 > > --- /dev/null > > +++ b/drivers/vdpa/vdpa_user/Makefile > > @@ -0,0 +1,5 @@ > > +# SPDX-License-Identifier: GPL-2.0 > > + > > +vduse-y := vduse_dev.o iova_domain.o > > + > > +obj-$(CONFIG_VDPA_USER) += vduse.o > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c > > b/drivers/vdpa/vdpa_user/vduse_dev.c > > new file mode 100644 > > index ..393bf99c48be > > --- /dev/null > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c > > @@ -0,0 +1,1348 @@ > > +// SPDX-License-Identifier: GPL-2.0-only > > +/* > > + * VDUSE: vDPA Device in Userspace > > + * > > + * Copyright (C) 2020 Bytedance Inc. and/or its affiliates. All rights > > reserved. > > + * > > + * Author: Xie Yongji > > + * > > + */ > > + > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > + > > +#include "iova_domain.h" > > + > > +#define DRV_VERSION "1.0" > > +#define DRV_AUTHOR "Yongji Xie " > > +#define DRV_DESC "vDPA Device in Userspace" > > +#define DRV_LICENSE "GPL v2" > > + > > +#define VDUSE_DEV_MAX (1U << MINORBITS) > > + > > +struct vduse_virtqueue { > > + u16 index; > > + bool ready; > > + spinlock_t kick_lock; > > + spinlock_t irq_lock; > > + struct eventfd_ctx *kickfd; > > + struct v
Re: Re: [RFC v4 11/11] vduse: Support binding irq to the specified cpu
On Thu, Mar 4, 2021 at 3:30 PM Jason Wang wrote: > > > On 2021/2/23 7:50 下午, Xie Yongji wrote: > > Add a parameter for the ioctl VDUSE_INJECT_VQ_IRQ to support > > injecting virtqueue's interrupt to the specified cpu. > > > How userspace know which CPU is this irq for? It looks to me we need to > do it at different level. > > E.g introduce some API in sys to allow admin to tune for that. > > But I think we can do that in antoher patch on top of this series. > OK. I will think more about it. Thanks, Yongji
Re: [PATCH net] docs: networking: drop special stable handling
On Wed, Mar 03, 2021 at 05:00:07PM +, patchwork-bot+netdev...@kernel.org wrote: > Hello: > > This patch was applied to netdev/net.git (refs/heads/master): > > On Tue, 2 Mar 2021 18:46:43 -0800 you wrote: > > Leave it to Greg. > > > > Signed-off-by: Jakub Kicinski > > --- > > Documentation/networking/netdev-FAQ.rst | 72 ++- > > Documentation/process/stable-kernel-rules.rst | 6 -- > > Documentation/process/submitting-patches.rst | 5 -- > > 3 files changed, 6 insertions(+), 77 deletions(-) > > Here is the summary with links: > - [net] docs: networking: drop special stable handling > https://git.kernel.org/netdev/net/c/dbbe7c962c3a Excellent news, thanks
Re: [virtio-dev] Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero
On 2021/3/3 4:29 下午, Cornelia Huck wrote: On Wed, 3 Mar 2021 12:01:01 +0800 Jason Wang wrote: On 2021/3/2 8:08 下午, Cornelia Huck wrote: On Mon, 1 Mar 2021 11:51:08 +0800 Jason Wang wrote: On 2021/3/1 5:25 上午, Michael S. Tsirkin wrote: On Fri, Feb 26, 2021 at 04:19:16PM +0800, Jason Wang wrote: On 2021/2/26 2:53 上午, Michael S. Tsirkin wrote: Confused. What is wrong with the above? It never reads the field unless the feature has been offered by device. So the spec said: " The following driver-read-only field, max_virtqueue_pairs only exists if VIRTIO_NET_F_MQ is set. " If I read this correctly, there will be no max_virtqueue_pairs field if the VIRTIO_NET_F_MQ is not offered by device? If yes the offsetof() violates what spec said. Thanks I think that's a misunderstanding. This text was never intended to imply that field offsets change beased on feature bits. We had this pain with legacy and we never wanted to go back there. This merely implies that without VIRTIO_NET_F_MQ the field should not be accessed. Exists in the sense "is accessible to driver". Let's just clarify that in the spec, job done. Ok, agree. That will make things more eaiser. Yes, that makes much more sense. What about adding the following to the "Basic Facilities of a Virtio Device/Device Configuration Space" section of the spec: "If an optional configuration field does not exist, the corresponding space is still present, but reserved." This became interesting after re-reading some of the qemu codes. E.g in virtio-net.c we had: *static VirtIOFeature feature_sizes[] = { {.flags = 1ULL << VIRTIO_NET_F_MAC, .end = endof(struct virtio_net_config, mac)}, {.flags = 1ULL << VIRTIO_NET_F_STATUS, .end = endof(struct virtio_net_config, status)}, {.flags = 1ULL << VIRTIO_NET_F_MQ, .end = endof(struct virtio_net_config, max_virtqueue_pairs)}, {.flags = 1ULL << VIRTIO_NET_F_MTU, .end = endof(struct virtio_net_config, mtu)}, {.flags = 1ULL << VIRTIO_NET_F_SPEED_DUPLEX, .end = endof(struct virtio_net_config, duplex)}, {.flags = (1ULL << VIRTIO_NET_F_RSS) | (1ULL << VIRTIO_NET_F_HASH_REPORT), .end = endof(struct virtio_net_config, supported_hash_types)}, {} };* *It has a implict dependency chain. E.g MTU doesn't presnet if DUPLEX/RSS is not offered ... * But I think it covers everything up to the relevant field, no? So MTU is included if we have the feature bit, even if we don't have DUPLEX/RSS. Given that a config space may be shorter (but must not collapse non-existing fields), maybe a better wording would be: "If an optional configuration field does not exist, the corresponding space will still be present if it is not at the end of the configuration space (i.e., further configuration fields exist.) This should work but I think we need to define the end of configuration space first? This implies that a given field, if it exists, is always at the same offset from the beginning of the configuration space." (Do we need to specify what a device needs to do if the driver tries to access a non-existing field? We cannot really make assumptions about config space accesses; virtio-ccw can just copy a chunk of config space that contains non-existing fields... Right, it looks to me ccw doesn't depends on config len which is kind of interesting. I think the answer depends on the implementation of both transport and device: (virtio-ccw is a bit odd, because channel I/O does not have the concept of a config space and I needed to come up with something) Ok. Let's take virtio-net-pci as an example, it fills status unconditionally in virtio_net_set_config() even if VIRTIO_NET_F_STATUS is not negotiated. So with the above feature_sizes: 1) if one of the MQ, MTU, DUPLEX or RSS is implemented, driver can still read status in this case 2) if none of the above four is negotied, driver can only read a 0xFF (virtio_config_readb()) I guess the device could ignore writes and return zeroes on read?) So consider the above, it might be too late to fix/clarify that in the spec on the expected behaviour of reading/writing non-exist fields. We could still advise behaviour via SHOULD, though I'm not sure it adds much at this point in time. It really feels a bit odd that a driver can still read and write fields for features that it did not negotiate, but I fear we're stuck with it. Yes, since the device (at least virtio-pci) implment thing like this. Thanks Thanks I've opened https://github.com/oasis-tcs/virtio-spec/issues/98 for the spec issues. - To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
Re: BUG: soft lockup in ieee80211_tasklet_handler
On Tue, 2021-03-02 at 20:01 +0100, Dmitry Vyukov wrote: > > Looking at the reproducer that mostly contains just perf_event_open, > It may be the old known issue of perf_event_open with some extreme > parameters bringing down kernel. > +perf maintainers > And as far as I remember +Peter had some patch to restrict > perf_event_open parameters. > > r0 = perf_event_open(&(0x7f01d000)={0x1, 0x70, 0x0, 0x0, 0x0, 0x0, > 0x0, 0x3ff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xfffe, 0x0, @perf_config_ext}, 0x0, > 0x0, 0x, 0x0) Oh! Thanks for looking. Seems that also applies to https://syzkaller.appspot.com/bug?extid=d6219cf21f26bdfcc22e FWIW. I was still tracking that one, but never had a chance to look at it (also way down the list since it was reported as directly in hwsim) johannes
Re: [PATCH net 1/3] sh_eth: fix TRSCER mask for SH771x
On Sun, Feb 28, 2021 at 9:54 PM Sergey Shtylyov wrote: > According to the SH7710, SH7712, SH7713 Group User's Manual: Hardware, > Rev. 3.00, the TRSCER register actually has only bit 7 valid (and named > differently), with all the other bits reserved. Apparently, this was not > the case with some early revisions of the manual as we have the other > bits declared (and set) in the original driver. Follow the suit and add > the explicit sh_eth_cpu_data::trscer_err_mask initializer for SH771x... > > Fixes: 86a74ff21a7a ("net: sh_eth: add support for Renesas SuperH Ethernet") > Signed-off-by: Sergey Shtylyov Reviewed-by: Geert Uytterhoeven Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH net 2/3] sh_eth: fix TRSCER mask for R7S72100
On Sun, Feb 28, 2021 at 9:54 PM Sergey Shtylyov wrote: > According to the RZ/A1H Group, RZ/A1M Group User's Manual: Hardware, > Rev. 4.00, the TRSCER register has bit 9 reserved, hence we can't use > the driver's default TRSCER mask. Add the explicit initializer for > sh_eth_cpu_data::trscer_err_mask for R7S72100. > > Fixes: db893473d313 ("sh_eth: Add support for r7s72100") > Signed-off-by: Sergey Shtylyov Reviewed-by: Geert Uytterhoeven Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH net 3/3] sh_eth: fix TRSCER mask for R7S9210
On Sun, Feb 28, 2021 at 9:56 PM Sergey Shtylyov wrote: > According to the RZ/A2M Group User's Manual: Hardware, Rev. 2.00, > the TRSCER register has bit 9 reserved, hence we can't use the driver's > default TRSCER mask. Add the explicit initializer for sh_eth_cpu_data:: > trscer_err_mask for R7S9210. > > Fixes: 6e0bb04d0e4f ("sh_eth: Add R7S9210 support") > Signed-off-by: Sergey Shtylyov Reviewed-by: Geert Uytterhoeven Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH intel-net 1/3] i40e: move headroom initialization to i40e_configure_rx_ring
On Wed, 3 Mar 2021 16:39:26 +0100 Maciej Fijalkowski wrote: > i40e_rx_offset(), that is supposed to initialize the Rx buffer headroom, > relies on I40E_RXR_FLAGS_BUILD_SKB_ENABLED flag. > > Currently, the callsite of mentioned function is placed incorrectly > within i40e_setup_rx_descriptors() where Rx ring's build skb flag is not > set yet. This causes the XDP_REDIRECT to be partially broken due to > inability to create xdp_frame in the headroom space, as the headroom is > 0. > > For the record, below is the call graph: > > i40e_vsi_open > i40e_vsi_setup_rx_resources > i40e_setup_rx_descriptors >i40e_rx_offset() <-- sets offset to 0 as build_skb flag is set below > > i40e_vsi_configure_rx > i40e_configure_rx_ring >set_ring_build_skb_enabled(ring) <-- set build_skb flag > > Fix this by moving i40e_rx_offset() to i40e_configure_rx_ring() after > the flag setting. > > Fixes: f7bb0d71d658 ("i40e: store the result of i40e_rx_offset() onto > i40e_ring") > Reported-by: Jesper Dangaard Brouer > Co-developed-by: Jesper Dangaard Brouer > Signed-off-by: Jesper Dangaard Brouer > Signed-off-by: Maciej Fijalkowski > --- > drivers/net/ethernet/intel/i40e/i40e_main.c | 13 + > drivers/net/ethernet/intel/i40e/i40e_txrx.c | 12 > 2 files changed, 13 insertions(+), 12 deletions(-) Acked-by: Jesper Dangaard Brouer Tested-by: Jesper Dangaard Brouer I'm currently looking at extending samples/bpf/ xdp_redirect_map to detect the situation. As with this bug the redirect tests/sample programs will just report really high performance numbers (because packets are dropped earlier due to err). Knowing what performance numbers to expect, I could see that they were out-of-spec, and investigated the root-cause. I assume Intel QA tested XDP-redirect and didn't find the bug due to this. Red Hat QA also use samples/bpf/xdp* and based on the reports I get from them, I could not blame them if this bug would slip through, as the tool reports "good" results. -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer
Re: Re: [RFC v4 10/11] vduse: Introduce a workqueue for irq injection
On Thu, Mar 4, 2021 at 2:59 PM Jason Wang wrote: > > > On 2021/2/23 7:50 下午, Xie Yongji wrote: > > This patch introduces a workqueue to support injecting > > virtqueue's interrupt asynchronously. This is mainly > > for performance considerations which makes sure the push() > > and pop() for used vring can be asynchronous. > > > Do you have pref numbers for this patch? > No, I can do some tests for it if needed. Another problem is the VIRTIO_RING_F_EVENT_IDX feature will be useless if we call irq callback in ioctl context. Something like: virtqueue_push(); virtio_notify(); ioctl() - irq_cb() virtqueue_get_buf() The used vring is always empty each time we call virtqueue_push() in userspace. Not sure if it is what we expected. Thanks, Yongji
[PATCH net 0/2] nexthop: Do not flush blackhole nexthops when loopback goes down
From: Ido Schimmel Patch #1 prevents blackhole nexthops from being flushed when the loopback device goes down given that as far as user space is concerned, these nexthops do not have a nexthop device. Patch #2 adds a test case. There are no regressions in fib_nexthops.sh with this change: # ./fib_nexthops.sh ... Tests passed: 165 Tests failed: 0 Ido Schimmel (2): nexthop: Do not flush blackhole nexthops when loopback goes down selftests: fib_nexthops: Test blackhole nexthops when loopback goes down net/ipv4/nexthop.c | 10 +++--- tools/testing/selftests/net/fib_nexthops.sh | 8 2 files changed, 15 insertions(+), 3 deletions(-) -- 2.29.2
[PATCH net 2/2] selftests: fib_nexthops: Test blackhole nexthops when loopback goes down
From: Ido Schimmel Test that blackhole nexthops are not flushed when the loopback device goes down. Output without previous patch: # ./fib_nexthops.sh -t basic Basic functional tests -- TEST: List with nothing defined [ OK ] TEST: Nexthop get on non-existent id[ OK ] TEST: Nexthop with no device or gateway [ OK ] TEST: Nexthop with down device [ OK ] TEST: Nexthop with device that is linkdown [ OK ] TEST: Nexthop with device only [ OK ] TEST: Nexthop with duplicate id [ OK ] TEST: Blackhole nexthop [ OK ] TEST: Blackhole nexthop with other attributes [ OK ] TEST: Blackhole nexthop with loopback device down [FAIL] TEST: Create group [ OK ] TEST: Create group with blackhole nexthop [FAIL] TEST: Create multipath group where 1 path is a blackhole[ OK ] TEST: Multipath group can not have a member replaced by blackhole [ OK ] TEST: Create group with non-existent nexthop[ OK ] TEST: Create group with same nexthop multiple times [ OK ] TEST: Replace nexthop with nexthop group[ OK ] TEST: Replace nexthop group with nexthop[ OK ] TEST: Nexthop group and device [ OK ] TEST: Test proto flush [ OK ] TEST: Nexthop group and blackhole [ OK ] Tests passed: 19 Tests failed: 2 Output with previous patch: # ./fib_nexthops.sh -t basic Basic functional tests -- TEST: List with nothing defined [ OK ] TEST: Nexthop get on non-existent id[ OK ] TEST: Nexthop with no device or gateway [ OK ] TEST: Nexthop with down device [ OK ] TEST: Nexthop with device that is linkdown [ OK ] TEST: Nexthop with device only [ OK ] TEST: Nexthop with duplicate id [ OK ] TEST: Blackhole nexthop [ OK ] TEST: Blackhole nexthop with other attributes [ OK ] TEST: Blackhole nexthop with loopback device down [ OK ] TEST: Create group [ OK ] TEST: Create group with blackhole nexthop [ OK ] TEST: Create multipath group where 1 path is a blackhole[ OK ] TEST: Multipath group can not have a member replaced by blackhole [ OK ] TEST: Create group with non-existent nexthop[ OK ] TEST: Create group with same nexthop multiple times [ OK ] TEST: Replace nexthop with nexthop group[ OK ] TEST: Replace nexthop group with nexthop[ OK ] TEST: Nexthop group and device [ OK ] TEST: Test proto flush [ OK ] TEST: Nexthop group and blackhole [ OK ] Tests passed: 21 Tests failed: 0 Signed-off-by: Ido Schimmel Reviewed-by: David Ahern --- tools/testing/selftests/net/fib_nexthops.sh | 8 1 file changed, 8 insertions(+) diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh index 4c7d33618437..d98fb85e201c 100755 --- a/tools/testing/selftests/net/fib_nexthops.sh +++ b/tools/testing/selftests/net/fib_nexthops.sh @@ -1524,6 +1524,14 @@ basic() run_cmd "$IP nexthop replace id 2 blackhole dev veth1" log_test $? 2 "Blackhole nexthop with other attributes" + # blackhole nexthop should not be affected by the state of the loopback + # device + run_cmd "$IP link set dev lo down" + check_nexthop "id 2" "id 2 blackhole" + log_test $? 0 "Blackhole nexthop with loopback device down" + + run_cmd "$IP link set dev lo up" + # # groups # -- 2.29.2
[PATCH net 1/2] nexthop: Do not flush blackhole nexthops when loopback goes down
From: Ido Schimmel As far as user space is concerned, blackhole nexthops do not have a nexthop device and therefore should not be affected by the administrative or carrier state of any netdev. However, when the loopback netdev goes down all the blackhole nexthops are flushed. This happens because internally the kernel associates blackhole nexthops with the loopback netdev. This behavior is both confusing to those not familiar with kernel internals and also diverges from the legacy API where blackhole IPv4 routes are not flushed when the loopback netdev goes down: # ip route add blackhole 198.51.100.0/24 # ip link set dev lo down # ip route show 198.51.100.0/24 blackhole 198.51.100.0/24 Blackhole IPv6 routes are flushed, but at least user space knows that they are associated with the loopback netdev: # ip -6 route show 2001:db8:1::/64 blackhole 2001:db8:1::/64 dev lo metric 1024 pref medium Fix this by only flushing blackhole nexthops when the loopback netdev is unregistered. Fixes: ab84be7e54fc ("net: Initial nexthop code") Signed-off-by: Ido Schimmel Reported-by: Donald Sharp Reviewed-by: David Ahern --- net/ipv4/nexthop.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c index f1c6cbdb9e43..743777bce179 100644 --- a/net/ipv4/nexthop.c +++ b/net/ipv4/nexthop.c @@ -1399,7 +1399,7 @@ static int insert_nexthop(struct net *net, struct nexthop *new_nh, /* rtnl */ /* remove all nexthops tied to a device being deleted */ -static void nexthop_flush_dev(struct net_device *dev) +static void nexthop_flush_dev(struct net_device *dev, unsigned long event) { unsigned int hash = nh_dev_hashfn(dev->ifindex); struct net *net = dev_net(dev); @@ -1411,6 +1411,10 @@ static void nexthop_flush_dev(struct net_device *dev) if (nhi->fib_nhc.nhc_dev != dev) continue; + if (nhi->reject_nh && + (event == NETDEV_DOWN || event == NETDEV_CHANGE)) + continue; + remove_nexthop(net, nhi->nh_parent, NULL); } } @@ -2189,11 +2193,11 @@ static int nh_netdev_event(struct notifier_block *this, switch (event) { case NETDEV_DOWN: case NETDEV_UNREGISTER: - nexthop_flush_dev(dev); + nexthop_flush_dev(dev, event); break; case NETDEV_CHANGE: if (!(dev_get_flags(dev) & (IFF_RUNNING | IFF_LOWER_UP))) - nexthop_flush_dev(dev); + nexthop_flush_dev(dev, event); break; case NETDEV_CHANGEMTU: info_ext = ptr; -- 2.29.2
[PATCH net] ethtool: Add indicator field for link_mode validity to link_ksettings
Some drivers clear the 'ethtool_link_ksettings' struct in their get_link_ksettings() callback, before populating it with actual values. Such drivers will set the new 'link_mode' field to zero, resulting in user space receiving wrong link mode information given that zero is a valid value for the field. Fix this by introducing a new boolean field ('link_mode_valid'), which indicates whether the 'link_mode' field is valid or not. Set it in mlxsw which is currently the only driver supporting the new API. Another problem is that some drivers (notably tun) can report random values in the 'link_mode' field. This can result in a general protection fault when the field is used as an index to the 'link_mode_params' array [1]. This happens because such drivers implement their set_link_ksettings() callback by simply overwriting their private copy of 'ethtool_link_ksettings' struct with the one they get from the stack, which is not always properly initialized. Fix this by making sure that the new 'link_mode_valid' field is always initialized to 'false' before invoking the set_link_ksettings() callback. [1] general protection fault, probably for non-canonical address 0xdc00f14cc32c: [#1] PREEMPT SMP KASAN KASAN: probably user-memory-access in range [0x00078a661960-0x00078a661967] CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446 Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 +38 d0 7c 08 84 d2 0f 85 b9 RSP: 0018:c900019df7a0 EFLAGS: 00010202 RAX: dc00 RBX: 888026136008 RCX: RDX: f14cc32c RSI: 873439ca RDI: 00078a661960 RBP: 8880 R08: R09: 88802613606f R10: 873439bc R11: R12: R13: 88802613606c R14: 888011d0c210 R15: 888011d0c210 FS: 00749300() GS:8880b9c0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 004b60f0 CR3: 185c2000 CR4: 001506f0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37 ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586 ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656 ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620 dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842 dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440 sock_do_ioctl+0x148/0x2d0 net/socket.c:1060 sock_ioctl+0x477/0x6a0 net/socket.c:1177 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: c8907043c6ac9 ("ethtool: Get link mode in use instead of speed and duplex parameters") Signed-off-by: Danielle Ratson Reported-by: Eric Dumazet Reviewed-by: Ido Schimmel --- drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c | 9 - include/linux/ethtool.h| 1 + net/ethtool/ioctl.c| 6 -- 3 files changed, 9 insertions(+), 7 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c index 0bd64169bf81..66eb1cd17bc4 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c @@ -1232,14 +1232,14 @@ mlxsw_sp1_from_ptys_link_mode(struct mlxsw_sp *mlxsw_sp, bool carrier_ok, { int i; - cmd->link_mode = -1; - if (!carrier_ok) return; for (i = 0; i < MLXSW_SP1_PORT_LINK_MODE_LEN; i++) { - if (ptys_eth_proto & mlxsw_sp1_port_link_mode[i].mask) + if (ptys_eth_proto & mlxsw_sp1_port_link_mode[i].mask) { cmd->link_mode = mlxsw_sp1_port_link_mode[i].mask_ethtool; + cmd->link_mode_valid = true; + } } } @@ -1672,8 +1672,6 @@ mlxsw_sp2_from_ptys_link_mode(struct mlxsw_sp *mlxsw_sp, bool carrier_ok, struct mlxsw_sp2_port_link_mode link; int i; - cmd->link_mode = -1; - if (!carrier_ok) return; @@ -1681,6 +1679,7 @@ mlxsw_sp2_from_ptys_link_mode(struct mlxsw_sp *mlxsw_sp, bool carrier_ok, if (ptys_eth_proto & mlxsw_sp2_port_link_mode[i].mask) { link = mlxsw_sp2_port_link_mode[i]; cmd->link_mode = link.mask_ethtool[1]; + cm
Re: [PATCH] net: mac802154: Fix null pointer dereference
Hi, thanks for your reply! On Wed, 2021-03-03 at 21:40 -0500, Alexander Aring wrote: > Hi, > > On Wed, 3 Mar 2021 at 11:28, Pavel Skripkin > wrote: > > syzbot found general protection fault in crypto_destroy_tfm()[1]. > > It was caused by wrong clean up loop in llsec_key_alloc(). > > If one of the tfm array members won't be initialized it will cause > > NULL dereference in crypto_destroy_tfm(). > > > > Call Trace: > > crypto_free_aead include/crypto/aead.h:191 [inline] [1] > > llsec_key_alloc net/mac802154/llsec.c:156 [inline] > > mac802154_llsec_key_add+0x9e0/0xcc0 net/mac802154/llsec.c:249 > > ieee802154_add_llsec_key+0x56/0x80 net/mac802154/cfg.c:338 > > rdev_add_llsec_key net/ieee802154/rdev-ops.h:260 [inline] > > nl802154_add_llsec_key+0x3d3/0x560 net/ieee802154/nl802154.c:1584 > > genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739 > > genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] > > genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800 > > netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 > > genl_rcv+0x24/0x40 net/netlink/genetlink.c:811 > > netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] > > netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 > > netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 > > sock_sendmsg_nosec net/socket.c:654 [inline] > > sock_sendmsg+0xcf/0x120 net/socket.c:674 > > sys_sendmsg+0x6e8/0x810 net/socket.c:2350 > > ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 > > __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 > > do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 > > entry_SYSCALL_64_after_hwframe+0x44/0xae > > > > Signed-off-by: Pavel Skripkin > > Reported-by: syzbot+12cf5fbfdeba210a8...@syzkaller.appspotmail.com > > --- > > net/mac802154/llsec.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/net/mac802154/llsec.c b/net/mac802154/llsec.c > > index 585d33144c33..6709f186f777 100644 > > --- a/net/mac802154/llsec.c > > +++ b/net/mac802154/llsec.c > > @@ -151,7 +151,7 @@ llsec_key_alloc(const struct > > ieee802154_llsec_key *template) > > err_tfm0: > > crypto_free_sync_skcipher(key->tfm0); > > err_tfm: > > - for (i = 0; i < ARRAY_SIZE(key->tfm); i++) > > + for (; i >= 0; i--) > > if (key->tfm[i]) > > I think this need to be: > > if (!IS_ERR_OR_NULL(key->tfm[i])) > > otherwise we still run into issues for the current iterator when > key->tfm[i] is in range of IS_ERR(). Oh... I got it completly wrong, I'm sorry. If it's still not fixed, I'll send rigth patch for that. > > - Alex -- With regards, Pavel Skripkin
Re: [Patch bpf-next v2 2/9] sock: introduce sk_prot->update_proto()
On Wed, 3 Mar 2021 at 18:21, Cong Wang wrote: > > Yeah, I am not surprised we can change tcp_update_ulp() too, but > why should I bother kTLS when I do not have to? What you suggest > could at most save us a bit of code size, not a big gain. So, I'd keep > its return value as it is, unless you see any other benefits. I think the end result is code that is easier to understand and therefore maintain. Keep it as it is if you prefer. > BTW, I will rename it to 'psock_update_sk_prot', please let me know > if you have any better names. SGTM. -- Lorenz Bauer | Systems Engineer 6th Floor, County Hall/The Riverside Building, SE1 7PB, UK www.cloudflare.com
[PATCH] i40e: improve locking of mac_filter_hash
i40e_config_vf_promiscuous_mode() calls i40e_getnum_vf_vsi_vlan_filters() without acquiring the mac_filter_hash_lock spinlock. This is unsafe because mac_filter_hash may get altered in another thread while i40e_getnum_vf_vsi_vlan_filters() traverses the hashes. Simply adding the spinlock in i40e_getnum_vf_vsi_vlan_filters() is not possible as it already gets called in i40e_get_vlan_list_sync() with the spinlock held. Therefore adding a wrapper that acquires the spinlock and call the correct function where appropriate. Fixes: 37d318d7805f ("i40e: Remove scheduling while atomic possibility") Fix-suggested-by: Paolo Abeni Signed-off-by: Stefan Assmann --- .../ethernet/intel/i40e/i40e_virtchnl_pf.c| 23 --- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c index 1b6ec9be155a..61d3e9a00f65 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c @@ -1101,12 +1101,12 @@ static int i40e_quiesce_vf_pci(struct i40e_vf *vf) } /** - * i40e_getnum_vf_vsi_vlan_filters + * __i40e_getnum_vf_vsi_vlan_filters * @vsi: pointer to the vsi * * called to get the number of VLANs offloaded on this VF **/ -static int i40e_getnum_vf_vsi_vlan_filters(struct i40e_vsi *vsi) +static int __i40e_getnum_vf_vsi_vlan_filters(struct i40e_vsi *vsi) { struct i40e_mac_filter *f; u16 num_vlans = 0, bkt; @@ -1119,6 +1119,23 @@ static int i40e_getnum_vf_vsi_vlan_filters(struct i40e_vsi *vsi) return num_vlans; } +/** + * i40e_getnum_vf_vsi_vlan_filters + * @vsi: pointer to the vsi + * + * wrapper for __i40e_getnum_vf_vsi_vlan_filters() with spinlock held + **/ +static int i40e_getnum_vf_vsi_vlan_filters(struct i40e_vsi *vsi) +{ + int num_vlans; + + spin_lock_bh(&vsi->mac_filter_hash_lock); + num_vlans = __i40e_getnum_vf_vsi_vlan_filters(vsi); + spin_unlock_bh(&vsi->mac_filter_hash_lock); + + return num_vlans; +} + /** * i40e_get_vlan_list_sync * @vsi: pointer to the VSI @@ -1136,7 +1153,7 @@ static void i40e_get_vlan_list_sync(struct i40e_vsi *vsi, u16 *num_vlans, int bkt; spin_lock_bh(&vsi->mac_filter_hash_lock); - *num_vlans = i40e_getnum_vf_vsi_vlan_filters(vsi); + *num_vlans = __i40e_getnum_vf_vsi_vlan_filters(vsi); *vlan_list = kcalloc(*num_vlans, sizeof(**vlan_list), GFP_ATOMIC); if (!(*vlan_list)) goto err; -- 2.29.2
Re: [PATCH v2 1/1] xfrm: Use actual socket sk instead of skb socket for xfrm_output_resume
On Tue, Mar 02, 2021 at 08:00:04AM +1300, Evan Nimmo wrote: > A situation can occur where the interface bound to the sk is different > to the interface bound to the sk attached to the skb. The interface > bound to the sk is the correct one however this information is lost inside > xfrm_output2 and instead the sk on the skb is used in xfrm_output_resume > instead. This assumes that the sk bound interface and the bound interface > attached to the sk within the skb are the same which can lead to lookup > failures inside ip_route_me_harder resulting in the packet being dropped. > > We have an l2tp v3 tunnel with ipsec protection. The tunnel is in the > global VRF however we have an encapsulated dot1q tunnel interface that > is within a different VRF. We also have a mangle rule that marks the > packets causing them to be processed inside ip_route_me_harder. > > Prior to commit 31c70d5956fc ("l2tp: keep original skb ownership") this > worked fine as the sk attached to the skb was changed from the dot1q > encapsulated interface to the sk for the tunnel which meant the interface > bound to the sk and the interface bound to the skb were identical. > Commit 46d6c5ae953c ("netfilter: use actual socket sk rather than skb sk > when routing harder") fixed some of these issues however a similar > problem existed in the xfrm code. > > Fixes: 31c70d5956fc ("l2tp: keep original skb ownership") > > Signed-off-by: Evan Nimmo Applied, thanks Evan!
RE: [PATCH] gianfar: fix jumbo packets+napi+rx overrun crash
>-Original Message- >From: michael-...@fami-braun.de >Sent: Thursday, March 4, 2021 5:13 AM >To: Claudiu Manoil >Cc: netdev@vger.kernel.org; Michael Braun >Subject: [PATCH] gianfar: fix jumbo packets+napi+rx overrun crash > >From: Michael Braun > >When using jumbo packets and overrunning rx queue with napi enabled, >the following sequence is observed in gfar_add_rx_frag: > Hi, Could you help provide some context, e.g.: On what board/soc were you able to trigger this issue? How often does the overrun occur? What's the use case? Is the issue triggered with smaller packets than 9600B? Increasing the Rx ring size does significantly reduce ring overruns? Thanks. > | lstatus | | skb | >t | lstatus, size, flags| first | len, data_len, *ptr | >---+--+---+---+ >13 | 18002348, 9032, INTERRUPT LAST | 0 | 9600, 8000, f554c12e | >12 | 1640, 1600, INTERRUPT| 0 | 8000, 6400, f554c12e | >11 | 1640, 1600, INTERRUPT| 0 | 6400, 4800, f554c12e | >10 | 1640, 1600, INTERRUPT| 0 | 4800, 3200, f554c12e | >09 | 1640, 1600, INTERRUPT| 0 | 3200, 1600, f554c12e | >08 | 14000640, 1600, INTERRUPT FIRST | 0 | 1600, 0, f554c12e | >07 | 14000640, 1600, INTERRUPT FIRST | 1 | 0,0, f554c12e | >06 | 1c80, 128, INTERRUPT LAST FIRST | 1 | 0,0, abf3bd6e | >05 | 18002348, 9032, INTERRUPT LAST | 0 | 8000, 6400, c5a57780 | >04 | 1640, 1600, INTERRUPT| 0 | 6400, 4800, c5a57780 | >03 | 1640, 1600, INTERRUPT| 0 | 4800, 3200, c5a57780 | >02 | 1640, 1600, INTERRUPT| 0 | 3200, 1600, c5a57780 | >01 | 1640, 1600, INTERRUPT| 0 | 1600, 0, c5a57780 | >00 | 14000640, 1600, INTERRUPT FIRST | 1 | 0,0, c5a57780 | > >So at t=7 a new packets is started but not finished, probably due to rx >overrun - but rx overrun is not indicated in the flags. Instead a new >packets starts at t=8. This results in skb->len to exceed size for the LAST >fragment at t=13 and thus a negative fragment size added to the skb. > >This then crashes: > >kernel BUG at include/linux/skbuff.h:2277! >Oops: Exception in kernel mode, sig: 5 [#1] >... >NIP [c04689f4] skb_pull+0x2c/0x48 >LR [c03f62ac] gfar_clean_rx_ring+0x2e4/0x844 >Call Trace: >[ec4bfd38] [c06a84c4] _raw_spin_unlock_irqrestore+0x60/0x7c (unreliable) >[ec4bfda8] [c03f6a44] gfar_poll_rx_sq+0x48/0xe4 >[ec4bfdc8] [c048d504] __napi_poll+0x54/0x26c >[ec4bfdf8] [c048d908] net_rx_action+0x138/0x2c0 >[ec4bfe68] [c06a8f34] __do_softirq+0x3a4/0x4fc >[ec4bfed8] [c0040150] run_ksoftirqd+0x58/0x70 >[ec4bfee8] [c0066ecc] smpboot_thread_fn+0x184/0x1cc >[ec4bff08] [c0062718] kthread+0x140/0x144 >[ec4bff38] [c0012350] ret_from_kernel_thread+0x14/0x1c > >This patch fixes this by checking for computed LAST fragment size, so a >negative sized fragment is never added. >In order to prevent the newer rx frame from getting corrupted, the FIRST >flag is checked to discard the incomplete older frame. > >Signed-off-by: Michael Braun >--- > drivers/net/ethernet/freescale/gianfar.c | 14 ++ > 1 file changed, 14 insertions(+) > >diff --git a/drivers/net/ethernet/freescale/gianfar.c >b/drivers/net/ethernet/freescale/gianfar.c >index 541de32ea662..2aecae23bfd0 100644 >--- a/drivers/net/ethernet/freescale/gianfar.c >+++ b/drivers/net/ethernet/freescale/gianfar.c >@@ -2390,6 +2390,10 @@ static bool gfar_add_rx_frag(struct gfar_rx_buff >*rxb, u32 lstatus, > if (lstatus & BD_LFLAG(RXBD_LAST)) > size -= skb->len; > >+ WARN(size < 0, "gianfar: rx fragment size underflow"); >+ if (size < 0) >+ return false; >+ > skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page, > rxb->page_offset + RXBUF_ALIGNMENT, > size, GFAR_RXB_TRUESIZE); >@@ -2552,6 +2556,16 @@ static int gfar_clean_rx_ring(struct gfar_priv_rx_q >*rx_queue, > if (lstatus & BD_LFLAG(RXBD_EMPTY)) > break; > >+ /* lost RXBD_LAST descriptor due to overrun */ >+ if (skb && >+ (lstatus & BD_LFLAG(RXBD_FIRST))) { >+ /* discard faulty buffer */ >+ dev_kfree_skb(skb); >+ skb = NULL; >+ >+ /* can continue normally */ >+ } >+ This is indeed an invalid state. If you hit this condition, discarding the skb is the right thing to do. Acked-by: Claudiu Manoil > /* order rx buffer descriptor reads */ > rmb(); > >-- >2.20.1
[PATCH -next] iwlwifi: mvm: fix old-style static const declaration
From: Wei Yongjun GCC reports warning as follows: drivers/net/wireless/intel/iwlwifi/mvm/rfi.c:14:1: warning: 'static' is not at beginning of declaration [-Wold-style-declaration] 14 | const static struct iwl_rfi_lut_entry iwl_rfi_table[IWL_RFI_LUT_SIZE] = { | ^ Move static to the beginning of declaration. Fixes: 21254908cbe9 ("iwlwifi: mvm: add RFI-M support") Reported-by: Hulk Robot Signed-off-by: Wei Yongjun --- drivers/net/wireless/intel/iwlwifi/mvm/rfi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/rfi.c b/drivers/net/wireless/intel/iwlwifi/mvm/rfi.c index 873919048143..4d5a99cbcc9d 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/rfi.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/rfi.c @@ -11,7 +11,7 @@ * DDR needs frequency in units of 16.666MHz, so provide FW with the * frequency values in the adjusted format. */ -const static struct iwl_rfi_lut_entry iwl_rfi_table[IWL_RFI_LUT_SIZE] = { +static const struct iwl_rfi_lut_entry iwl_rfi_table[IWL_RFI_LUT_SIZE] = { /* LPDDR4 */ /* frequency 3733MHz */
net: mscc: ocelot: issue with uninitialized pointer read in ocelot_flower_parse_key
Hi, Static analysis with Coverity had detected an uninitialized pointer read in function ocelot_flower_parse_key in drivers/net/ethernet/mscc/ocelot_flower.c introduced by commit: commit 75944fda1dfe836fdd406bef6cb3cc8a80f7af83 Author: Xiaoliang Yang Date: Fri Oct 2 15:02:23 2020 +0300 net: mscc: ocelot: offload ingress skbedit and vlan actions to VCAP IS1 The analysis is as follows: 531 10. Condition flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_IPV4_ADDRS), taking true branch. 11. Condition proto == 2048, taking true branch. 532if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_IPV4_ADDRS) && 533proto == ETH_P_IP) { 12. var_decl: Declaring variable match without initializer. 534struct flow_match_ipv4_addrs match; 535u8 *tmp; 536 13. Condition filter->block_id == VCAP_ES0, taking false branch. 537if (filter->block_id == VCAP_ES0) { 538NL_SET_ERR_MSG_MOD(extack, 539 "VCAP ES0 cannot match on IP address"); 540return -EOPNOTSUPP; 541} 542 14. Condition filter->block_id == VCAP_IS1, taking true branch. Uninitialized pointer read (UNINIT) 15. uninit_use: Using uninitialized value match.mask. 543if (filter->block_id == VCAP_IS1 && *(u32 *)&match.mask->dst) { 544NL_SET_ERR_MSG_MOD(extack, 545 "Key type S1_NORMAL cannot match on destination IP"); 546return -EOPNOTSUPP; 547} match is declared in line 534 and is not initialized and the uninitialized match.mask is being dereferenced on line 543. Not sure what intent was on this and how to fix, hence I'm reporting this issue. Colin
[PATCH net] net: mscc: ocelot: properly reject destination IP keys in VCAP IS1
From: Vladimir Oltean An attempt is made to warn the user about the fact that VCAP IS1 cannot offload keys matching on destination IP (at least given the current half key format), but sadly that warning fails miserably in practice, due to the fact that it operates on an uninitialized "match" variable. We must first decode the keys from the flow rule. Fixes: 75944fda1dfe ("net: mscc: ocelot: offload ingress skbedit and vlan actions to VCAP IS1") Reported-by: Colin Ian King Signed-off-by: Vladimir Oltean --- drivers/net/ethernet/mscc/ocelot_flower.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c b/drivers/net/ethernet/mscc/ocelot_flower.c index c3ac026f6aea..a41b458b1b3e 100644 --- a/drivers/net/ethernet/mscc/ocelot_flower.c +++ b/drivers/net/ethernet/mscc/ocelot_flower.c @@ -540,13 +540,14 @@ ocelot_flower_parse_key(struct ocelot *ocelot, int port, bool ingress, return -EOPNOTSUPP; } + flow_rule_match_ipv4_addrs(rule, &match); + if (filter->block_id == VCAP_IS1 && *(u32 *)&match.mask->dst) { NL_SET_ERR_MSG_MOD(extack, "Key type S1_NORMAL cannot match on destination IP"); return -EOPNOTSUPP; } - flow_rule_match_ipv4_addrs(rule, &match); tmp = &filter->key.ipv4.sip.value.addr[0]; memcpy(tmp, &match.key->src, 4); -- 2.25.1
Re: [PATCH net] net: mscc: ocelot: properly reject destination IP keys in VCAP IS1
On 04/03/2021 10:29, Vladimir Oltean wrote: > From: Vladimir Oltean > > An attempt is made to warn the user about the fact that VCAP IS1 cannot > offload keys matching on destination IP (at least given the current half > key format), but sadly that warning fails miserably in practice, due to > the fact that it operates on an uninitialized "match" variable. We must > first decode the keys from the flow rule. > > Fixes: 75944fda1dfe ("net: mscc: ocelot: offload ingress skbedit and vlan > actions to VCAP IS1") > Reported-by: Colin Ian King > Signed-off-by: Vladimir Oltean > --- > drivers/net/ethernet/mscc/ocelot_flower.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c > b/drivers/net/ethernet/mscc/ocelot_flower.c > index c3ac026f6aea..a41b458b1b3e 100644 > --- a/drivers/net/ethernet/mscc/ocelot_flower.c > +++ b/drivers/net/ethernet/mscc/ocelot_flower.c > @@ -540,13 +540,14 @@ ocelot_flower_parse_key(struct ocelot *ocelot, int > port, bool ingress, > return -EOPNOTSUPP; > } > > + flow_rule_match_ipv4_addrs(rule, &match); > + > if (filter->block_id == VCAP_IS1 && *(u32 *)&match.mask->dst) { > NL_SET_ERR_MSG_MOD(extack, > "Key type S1_NORMAL cannot match on > destination IP"); > return -EOPNOTSUPP; > } > > - flow_rule_match_ipv4_addrs(rule, &match); > tmp = &filter->key.ipv4.sip.value.addr[0]; > memcpy(tmp, &match.key->src, 4); > > Thanks, looks good to me. Reviewed-by: Colin Ian King
Re: Re: [RFC v4 09/11] Documentation: Add documentation for VDUSE
On Thu, Mar 4, 2021 at 2:40 PM Jason Wang wrote: > > > On 2021/2/23 7:50 下午, Xie Yongji wrote: > > VDUSE (vDPA Device in Userspace) is a framework to support > > implementing software-emulated vDPA devices in userspace. This > > document is intended to clarify the VDUSE design and usage. > > > > Signed-off-by: Xie Yongji > > --- > > Documentation/userspace-api/index.rst | 1 + > > Documentation/userspace-api/vduse.rst | 112 > > ++ > > 2 files changed, 113 insertions(+) > > create mode 100644 Documentation/userspace-api/vduse.rst > > > > diff --git a/Documentation/userspace-api/index.rst > > b/Documentation/userspace-api/index.rst > > index acd2cc2a538d..f63119130898 100644 > > --- a/Documentation/userspace-api/index.rst > > +++ b/Documentation/userspace-api/index.rst > > @@ -24,6 +24,7 @@ place where this information is gathered. > > ioctl/index > > iommu > > media/index > > + vduse > > > > .. only:: subproject and html > > > > diff --git a/Documentation/userspace-api/vduse.rst > > b/Documentation/userspace-api/vduse.rst > > new file mode 100644 > > index ..2a20e686bb59 > > --- /dev/null > > +++ b/Documentation/userspace-api/vduse.rst > > @@ -0,0 +1,112 @@ > > +== > > +VDUSE - "vDPA Device in Userspace" > > +== > > + > > +vDPA (virtio data path acceleration) device is a device that uses a > > +datapath which complies with the virtio specifications with vendor > > +specific control path. vDPA devices can be both physically located on > > +the hardware or emulated by software. VDUSE is a framework that makes it > > +possible to implement software-emulated vDPA devices in userspace. > > + > > +How VDUSE works > > + > > +Each userspace vDPA device is created by the VDUSE_CREATE_DEV ioctl on > > +the character device (/dev/vduse/control). Then a device file with the > > +specified name (/dev/vduse/$NAME) will appear, which can be used to > > +implement the userspace vDPA device's control path and data path. > > > It's better to mention that in order to le thte device to be registered > on the bus, admin need to use the management API(netlink) to create the > vDPA device. > > Some codes to demnonstrate how to create the device will be better. > OK. > > > + > > +To implement control path, a message-based communication protocol and some > > +types of control messages are introduced in the VDUSE framework: > > + > > +- VDUSE_SET_VQ_ADDR: Set the vring address of virtqueue. > > + > > +- VDUSE_SET_VQ_NUM: Set the size of virtqueue > > + > > +- VDUSE_SET_VQ_READY: Set ready status of virtqueue > > + > > +- VDUSE_GET_VQ_READY: Get ready status of virtqueue > > + > > +- VDUSE_SET_VQ_STATE: Set the state for virtqueue > > + > > +- VDUSE_GET_VQ_STATE: Get the state for virtqueue > > + > > +- VDUSE_SET_FEATURES: Set virtio features supported by the driver > > + > > +- VDUSE_GET_FEATURES: Get virtio features supported by the device > > + > > +- VDUSE_SET_STATUS: Set the device status > > + > > +- VDUSE_GET_STATUS: Get the device status > > + > > +- VDUSE_SET_CONFIG: Write to device specific configuration space > > + > > +- VDUSE_GET_CONFIG: Read from device specific configuration space > > + > > +- VDUSE_UPDATE_IOTLB: Notify userspace to update the memory mapping in > > device IOTLB > > + > > +Those control messages are mostly based on the vdpa_config_ops in > > +include/linux/vdpa.h which defines a unified interface to control > > +different types of vdpa device. Userspace needs to read()/write() > > +on the VDUSE device file to receive/reply those control messages > > +from/to VDUSE kernel module as follows: > > + > > +.. code-block:: c > > + > > + static int vduse_message_handler(int dev_fd) > > + { > > + int len; > > + struct vduse_dev_request req; > > + struct vduse_dev_response resp; > > + > > + len = read(dev_fd, &req, sizeof(req)); > > + if (len != sizeof(req)) > > + return -1; > > + > > + resp.request_id = req.unique; > > + > > + switch (req.type) { > > + > > + /* handle different types of message */ > > + > > + } > > + > > + len = write(dev_fd, &resp, sizeof(resp)); > > + if (len != sizeof(resp)) > > + return -1; > > + > > + return 0; > > + } > > + > > +In the deta path, vDPA device's iova regions will be mapped into userspace > > +with the help of VDUSE_IOTLB_GET_FD ioctl on the VDUSE device file: > > + > > +- VDUSE_IOTLB_GET_FD: get the file descriptor to iova region. Userspace can > > + access this iova region by passing the fd to mmap(). > > > It would be better to have codes to explain how it is expected to work here. > OK. > > > + > > +Besides, the following ioctls on the VDUSE device file are provided to > > support > > +interrupt injection and setting up e
[PATCH net 1/2] net: dsa: sja1105: fix SGMII PCS being forced to SPEED_UNKNOWN instead of SPEED_10
From: Vladimir Oltean When using MLO_AN_PHY or MLO_AN_FIXED, the MII_BMCR of the SGMII PCS is read before resetting the switch so it can be reprogrammed afterwards. This works for the speeds of 1Gbps and 100Mbps, but not for 10Mbps, because SPEED_10 is actually 0, so AND-ing anything with 0 is false, therefore that last branch is dead code. Do what others do (genphy_read_status_fixed, phy_mii_ioctl) and just remove the check for SPEED_10, let it fall into the default case. Fixes: ffe10e679cec ("net: dsa: sja1105: Add support for the SGMII port") Signed-off-by: Vladimir Oltean --- drivers/net/dsa/sja1105/sja1105_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c index 7692338730df..c1982615c631 100644 --- a/drivers/net/dsa/sja1105/sja1105_main.c +++ b/drivers/net/dsa/sja1105/sja1105_main.c @@ -1922,7 +1922,7 @@ int sja1105_static_config_reload(struct sja1105_private *priv, speed = SPEED_1000; else if (bmcr & BMCR_SPEED100) speed = SPEED_100; - else if (bmcr & BMCR_SPEED10) + else speed = SPEED_10; sja1105_sgmii_pcs_force_speed(priv, speed); -- 2.25.1
[PATCH net 2/2] net: dsa: sja1105: fix ucast/bcast flooding always remaining enabled
From: Vladimir Oltean In the blamed patch I managed to introduce a bug while moving code around: the same logic is applied to the ucast_egress_floods and bcast_egress_floods variables both on the "if" and the "else" branches. This is clearly an unintended change compared to how the code used to be prior to that bugfix, so restore it. Fixes: 7f7ccdea8c73 ("net: dsa: sja1105: fix leakage of flooded frames outside bridging domain") Signed-off-by: Vladimir Oltean --- drivers/net/dsa/sja1105/sja1105_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c index c1982615c631..51ea104c63bb 100644 --- a/drivers/net/dsa/sja1105/sja1105_main.c +++ b/drivers/net/dsa/sja1105/sja1105_main.c @@ -3369,14 +3369,14 @@ static int sja1105_port_ucast_bcast_flood(struct sja1105_private *priv, int to, if (flags.val & BR_FLOOD) priv->ucast_egress_floods |= BIT(to); else - priv->ucast_egress_floods |= BIT(to); + priv->ucast_egress_floods &= ~BIT(to); } if (flags.mask & BR_BCAST_FLOOD) { if (flags.val & BR_BCAST_FLOOD) priv->bcast_egress_floods |= BIT(to); else - priv->bcast_egress_floods |= BIT(to); + priv->bcast_egress_floods &= ~BIT(to); } return sja1105_manage_flood_domains(priv); -- 2.25.1
Re: cleanup unused or almost unused IOMMU APIs and the FSL PAMU driver
On Mon, Mar 01, 2021 at 09:42:40AM +0100, Christoph Hellwig wrote: > Diffstat: > arch/powerpc/include/asm/fsl_pamu_stash.h | 12 > drivers/gpu/drm/msm/adreno/adreno_gpu.c |2 > drivers/iommu/amd/iommu.c | 23 > drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 85 --- > drivers/iommu/arm/arm-smmu/arm-smmu.c | 122 +--- > drivers/iommu/dma-iommu.c |8 > drivers/iommu/fsl_pamu.c| 264 -- > drivers/iommu/fsl_pamu.h| 10 > drivers/iommu/fsl_pamu_domain.c | 694 > ++-- > drivers/iommu/fsl_pamu_domain.h | 46 - > drivers/iommu/intel/iommu.c | 55 -- > drivers/iommu/iommu.c | 75 --- > drivers/soc/fsl/qbman/qman_portal.c | 56 -- > drivers/vfio/vfio_iommu_type1.c | 31 - > drivers/vhost/vdpa.c| 10 > include/linux/iommu.h | 81 --- > 16 files changed, 214 insertions(+), 1360 deletions(-) Nice cleanup, thanks. The fsl_pamu driver and interface has always been a little bit of an alien compared to other IOMMU drivers. I am inclined to merge this after -rc3 is out, given some reviews. Can you also please add changelogs to the last three patches? Thanks, Joerg
[PATCH 01/18] thunderbolt: Disable retry logic for intra-domain control packets
In most cases the response packet is lost because the router in question was disconnected by the user. Resending the control packet in that case just adds unnecessary delays, so disable that for intra-domain control packets. For inter-domain (XDomain) packets we continue retrying. This also aligns the driver better what the Intel connection manager firmware is doing. Signed-off-by: Mika Westerberg --- drivers/thunderbolt/ctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/thunderbolt/ctl.c b/drivers/thunderbolt/ctl.c index f1aeaff9f368..875922133782 100644 --- a/drivers/thunderbolt/ctl.c +++ b/drivers/thunderbolt/ctl.c @@ -17,7 +17,7 @@ #define TB_CTL_RX_PKG_COUNT10 -#define TB_CTL_RETRIES 4 +#define TB_CTL_RETRIES 1 /** * struct tb_ctl - Thunderbolt control channel -- 2.30.1
[PATCH 04/18] Documentation / thunderbolt: Drop speed/lanes entries for XDomain
These are actually not needed as we already have similar entries that apply to all devices on the Thunderbolt bus. Cc: Isaac Hazan Signed-off-by: Mika Westerberg --- .../ABI/testing/sysfs-bus-thunderbolt | 28 --- 1 file changed, 28 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-thunderbolt b/Documentation/ABI/testing/sysfs-bus-thunderbolt index d7f09d011b6d..bfa4ca6f3fc1 100644 --- a/Documentation/ABI/testing/sysfs-bus-thunderbolt +++ b/Documentation/ABI/testing/sysfs-bus-thunderbolt @@ -1,31 +1,3 @@ -What: /sys/bus/thunderbolt/devices//rx_speed -Date: Feb 2021 -KernelVersion: 5.11 -Contact: Isaac Hazan -Description: This attribute reports the XDomain RX speed per lane. - All RX lanes run at the same speed. - -What: /sys/bus/thunderbolt/devices//rx_lanes -Date: Feb 2021 -KernelVersion: 5.11 -Contact: Isaac Hazan -Description: This attribute reports the number of RX lanes the XDomain - is using simultaneously through its upstream port. - -What: /sys/bus/thunderbolt/devices//tx_speed -Date: Feb 2021 -KernelVersion: 5.11 -Contact: Isaac Hazan -Description: This attribute reports the XDomain TX speed per lane. - All TX lanes run at the same speed. - -What: /sys/bus/thunderbolt/devices//tx_lanes -Date: Feb 2021 -KernelVersion: 5.11 -Contact: Isaac Hazan -Description: This attribute reports number of TX lanes the XDomain - is using simultaneously through its upstream port. - What: /sys/bus/thunderbolt/devices/.../domainX/boot_acl Date: Jun 2018 KernelVersion: 4.17 -- 2.30.1
[PATCH 05/18] thunderbolt: Add more logging to XDomain connections
Currently the driver is pretty quiet when another host is connected which makes debugging possible issues harder. For this reason add more logging on debug level that can be turned on as needed. While there log the host-to-host connection on info level analogous to routers and retimers. Signed-off-by: Mika Westerberg --- drivers/thunderbolt/xdomain.c | 34 +++--- 1 file changed, 31 insertions(+), 3 deletions(-) diff --git a/drivers/thunderbolt/xdomain.c b/drivers/thunderbolt/xdomain.c index 7cf8b9c85ab7..584bb5ec06f8 100644 --- a/drivers/thunderbolt/xdomain.c +++ b/drivers/thunderbolt/xdomain.c @@ -591,6 +591,8 @@ static void tb_xdp_handle_request(struct work_struct *work) finalize_property_block(); + tb_dbg(tb, "%llx: received XDomain request %#x\n", route, pkg->type); + switch (pkg->type) { case PROPERTIES_REQUEST: ret = tb_xdp_properties_response(tb, ctl, route, sequence, uuid, @@ -1002,9 +1004,12 @@ static void tb_xdomain_get_uuid(struct work_struct *work) uuid_t uuid; int ret; + dev_dbg(&xd->dev, "requesting remote UUID\n"); + ret = tb_xdp_uuid_request(tb->ctl, xd->route, xd->uuid_retries, &uuid); if (ret < 0) { if (xd->uuid_retries-- > 0) { + dev_dbg(&xd->dev, "failed to request UUID, retrying\n"); queue_delayed_work(xd->tb->wq, &xd->get_uuid_work, msecs_to_jiffies(100)); } else { @@ -1013,6 +1018,8 @@ static void tb_xdomain_get_uuid(struct work_struct *work) return; } + dev_dbg(&xd->dev, "got remote UUID %pUb\n", &uuid); + if (uuid_equal(&uuid, xd->local_uuid)) dev_dbg(&xd->dev, "intra-domain loop detected\n"); @@ -1052,11 +1059,15 @@ static void tb_xdomain_get_properties(struct work_struct *work) u32 gen = 0; int ret; + dev_dbg(&xd->dev, "requesting remote properties\n"); + ret = tb_xdp_properties_request(tb->ctl, xd->route, xd->local_uuid, xd->remote_uuid, xd->properties_retries, &block, &gen); if (ret < 0) { if (xd->properties_retries-- > 0) { + dev_dbg(&xd->dev, + "failed to request remote properties, retrying\n"); queue_delayed_work(xd->tb->wq, &xd->get_properties_work, msecs_to_jiffies(1000)); } else { @@ -1123,6 +1134,11 @@ static void tb_xdomain_get_properties(struct work_struct *work) dev_err(&xd->dev, "failed to add XDomain device\n"); return; } + dev_info(&xd->dev, "new host found, vendor=%#x device=%#x\n", +xd->vendor, xd->device); + if (xd->vendor_name && xd->device_name) + dev_info(&xd->dev, "%s %s\n", xd->vendor_name, +xd->device_name); } else { kobject_uevent(&xd->dev.kobj, KOBJ_CHANGE); } @@ -1143,13 +1159,19 @@ static void tb_xdomain_properties_changed(struct work_struct *work) properties_changed_work.work); int ret; + dev_dbg(&xd->dev, "sending properties changed notification\n"); + ret = tb_xdp_properties_changed_request(xd->tb->ctl, xd->route, xd->properties_changed_retries, xd->local_uuid); if (ret) { - if (xd->properties_changed_retries-- > 0) + if (xd->properties_changed_retries-- > 0) { + dev_dbg(&xd->dev, + "failed to send properties changed notification, retrying\n"); queue_delayed_work(xd->tb->wq, &xd->properties_changed_work, msecs_to_jiffies(1000)); + } + dev_err(&xd->dev, "failed to send properties changed notification\n"); return; } @@ -1390,6 +1412,10 @@ struct tb_xdomain *tb_xdomain_alloc(struct tb *tb, struct device *parent, xd->dev.groups = xdomain_attr_groups; dev_set_name(&xd->dev, "%u-%llx", tb->index, route); + dev_dbg(&xd->dev, "local UUID %pUb\n", local_uuid); + if (remote_uuid) + dev_dbg(&xd->dev, "remote UUID %pUb\n", remote_uuid); + /* * This keeps the DMA powered on as long as we have active * connection to another host. @@ -1452,10 +1478,12 @@ void tb_xdomain_remove(struct tb_xdomain *xd) pm_runtime_put_noidle(&xd->dev); pm_runtime_set_suspended(&xd->dev); - if (!device_is_registered(&xd->dev)) + if (!device_is_registered(&xd->dev)) {
[PATCH 03/18] thunderbolt: Decrease control channel timeout for software connection manager
When the firmware connection manager is not proxying between the software and the hardware we can decrease the timeout for control packets significantly. The USB4 spec recommends 10 ms +- 1 ms but we use slightly larger value (100 ms) which is recommendation from Intel Thunderbolt firmware folks. When firmware connection manager is running then we keep using the existing 5000 ms. To implement this we move the control channel allocation to tb_domain_alloc(), and pass the timeout from that function to the tb_ctl_alloc(). Then make both connection manager implementations pass the timeout when they alloc the domain structure. While there update kernel-doc of struct tb_ctl to match the reality. Signed-off-by: Mika Westerberg --- drivers/thunderbolt/ctl.c| 15 +--- drivers/thunderbolt/ctl.h| 5 ++- drivers/thunderbolt/domain.c | 66 +--- drivers/thunderbolt/icm.c| 2 +- drivers/thunderbolt/tb.c | 4 ++- drivers/thunderbolt/tb.h | 2 +- 6 files changed, 49 insertions(+), 45 deletions(-) diff --git a/drivers/thunderbolt/ctl.c b/drivers/thunderbolt/ctl.c index b79be1f02d92..0fb5e04191e2 100644 --- a/drivers/thunderbolt/ctl.c +++ b/drivers/thunderbolt/ctl.c @@ -29,6 +29,7 @@ * @request_queue_lock: Lock protecting @request_queue * @request_queue: List of outstanding requests * @running: Is the control channel running at the moment + * @timeout_msec: Default timeout for non-raw control messages * @callback: Callback called when hotplug message is received * @callback_data: Data passed to @callback */ @@ -43,6 +44,7 @@ struct tb_ctl { struct list_head request_queue; bool running; + int timeout_msec; event_cb callback; void *callback_data; }; @@ -613,6 +615,7 @@ struct tb_cfg_result tb_cfg_request_sync(struct tb_ctl *ctl, /** * tb_ctl_alloc() - allocate a control channel * @nhi: Pointer to NHI + * @timeout_msec: Default timeout used with non-raw control messages * @cb: Callback called for plug events * @cb_data: Data passed to @cb * @@ -620,13 +623,15 @@ struct tb_cfg_result tb_cfg_request_sync(struct tb_ctl *ctl, * * Return: Returns a pointer on success or NULL on failure. */ -struct tb_ctl *tb_ctl_alloc(struct tb_nhi *nhi, event_cb cb, void *cb_data) +struct tb_ctl *tb_ctl_alloc(struct tb_nhi *nhi, int timeout_msec, event_cb cb, + void *cb_data) { int i; struct tb_ctl *ctl = kzalloc(sizeof(*ctl), GFP_KERNEL); if (!ctl) return NULL; ctl->nhi = nhi; + ctl->timeout_msec = timeout_msec; ctl->callback = cb; ctl->callback_data = cb_data; @@ -829,7 +834,7 @@ struct tb_cfg_result tb_cfg_reset(struct tb_ctl *ctl, u64 route) req->response_size = sizeof(reply); req->response_type = TB_CFG_PKG_RESET; - res = tb_cfg_request_sync(ctl, req, TB_CFG_DEFAULT_TIMEOUT); + res = tb_cfg_request_sync(ctl, req, ctl->timeout_msec); tb_cfg_request_put(req); @@ -1005,7 +1010,7 @@ int tb_cfg_read(struct tb_ctl *ctl, void *buffer, u64 route, u32 port, enum tb_cfg_space space, u32 offset, u32 length) { struct tb_cfg_result res = tb_cfg_read_raw(ctl, buffer, route, port, - space, offset, length, TB_CFG_DEFAULT_TIMEOUT); + space, offset, length, ctl->timeout_msec); switch (res.err) { case 0: /* Success */ @@ -1031,7 +1036,7 @@ int tb_cfg_write(struct tb_ctl *ctl, const void *buffer, u64 route, u32 port, enum tb_cfg_space space, u32 offset, u32 length) { struct tb_cfg_result res = tb_cfg_write_raw(ctl, buffer, route, port, - space, offset, length, TB_CFG_DEFAULT_TIMEOUT); + space, offset, length, ctl->timeout_msec); switch (res.err) { case 0: /* Success */ @@ -1069,7 +1074,7 @@ int tb_cfg_get_upstream_port(struct tb_ctl *ctl, u64 route) u32 dummy; struct tb_cfg_result res = tb_cfg_read_raw(ctl, &dummy, route, 0, TB_CFG_SWITCH, 0, 1, - TB_CFG_DEFAULT_TIMEOUT); + ctl->timeout_msec); if (res.err == 1) return -EIO; if (res.err) diff --git a/drivers/thunderbolt/ctl.h b/drivers/thunderbolt/ctl.h index 2eafbfea5dff..e8c64898dfce 100644 --- a/drivers/thunderbolt/ctl.h +++ b/drivers/thunderbolt/ctl.h @@ -21,15 +21,14 @@ struct tb_ctl; typedef bool (*event_cb)(void *data, enum tb_cfg_pkg_type type, const void *buf, size_t size); -struct tb_ctl *tb_ctl_alloc(struct tb_nhi *nhi, event_cb cb, void *cb_data); +struct tb_ctl *tb_ctl_alloc(struct tb_nhi *nhi, int timeout_msec, event_cb cb, + void *cb_data); void tb_ctl_start(struct tb_ctl *ctl
[PATCH 07/18] thunderbolt: Use pseudo-random number as initial property block generation
As recommended by USB4 inter-domain service spec use pseudo-random value instead of zero as initial XDomain property block generation value. Signed-off-by: Mika Westerberg --- drivers/thunderbolt/xdomain.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/thunderbolt/xdomain.c b/drivers/thunderbolt/xdomain.c index a1657663a95e..cfe6fa7e84f4 100644 --- a/drivers/thunderbolt/xdomain.c +++ b/drivers/thunderbolt/xdomain.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include #include @@ -1880,6 +1881,7 @@ int tb_xdomain_init(void) tb_property_add_immediate(xdomain_property_dir, "deviceid", 0x1); tb_property_add_immediate(xdomain_property_dir, "devicerv", 0x8100); + xdomain_property_block_gen = prandom_u32(); return 0; } -- 2.30.1
[PATCH 13/18] thunderbolt: Allow multiple DMA tunnels over a single XDomain connection
Currently we have had an artificial limitation of a single DMA tunnel per XDomain connection. However, hardware wise there is no such limit and software based connection manager can take advantage of all the DMA rings available on the host to establish tunnels. For this reason make the tb_xdomain_[enable|disable]_paths() to take the DMA ring and HopID as parameter instead of storing them in the struct tb_xdomain. We also add API functions to allocate input and output HopIDs of the XDomain connection that the service drivers can use instead of hard-coding. Also convert the two existing service drivers over to this API. Signed-off-by: Mika Westerberg --- drivers/net/thunderbolt.c | 49 +--- drivers/thunderbolt/dma_test.c | 35 - drivers/thunderbolt/domain.c | 24 -- drivers/thunderbolt/icm.c | 32 +--- drivers/thunderbolt/tb.c | 48 +++- drivers/thunderbolt/tb.h | 16 +++- drivers/thunderbolt/tunnel.c | 82 --- drivers/thunderbolt/tunnel.h | 8 +- drivers/thunderbolt/xdomain.c | 139 ++--- include/linux/thunderbolt.h| 32 +--- 10 files changed, 340 insertions(+), 125 deletions(-) diff --git a/drivers/net/thunderbolt.c b/drivers/net/thunderbolt.c index ed3743dc62b9..5c9ec91b6e78 100644 --- a/drivers/net/thunderbolt.c +++ b/drivers/net/thunderbolt.c @@ -28,7 +28,6 @@ #define TBNET_LOGOUT_TIMEOUT 100 #define TBNET_RING_SIZE256 -#define TBNET_LOCAL_PATH 0xf #define TBNET_LOGIN_RETRIES60 #define TBNET_LOGOUT_RETRIES 5 #define TBNET_MATCH_FRAGS_ID BIT(1) @@ -154,8 +153,8 @@ struct tbnet_ring { * @login_sent: ThunderboltIP login message successfully sent * @login_received: ThunderboltIP login message received from the remote * host - * @transmit_path: HopID the other end needs to use building the - *opposite side path. + * @local_transmit_path: HopID we are using to send out packets + * @remote_transmit_path: HopID the other end is using to send packets to us * @connection_lock: Lock serializing access to @login_sent, * @login_received and @transmit_path. * @login_retries: Number of login retries currently done @@ -184,7 +183,8 @@ struct tbnet { atomic_t command_id; bool login_sent; bool login_received; - u32 transmit_path; + int local_transmit_path; + int remote_transmit_path; struct mutex connection_lock; int login_retries; struct delayed_work login_work; @@ -257,7 +257,7 @@ static int tbnet_login_request(struct tbnet *net, u8 sequence) atomic_inc_return(&net->command_id)); request.proto_version = TBIP_LOGIN_PROTO_VERSION; - request.transmit_path = TBNET_LOCAL_PATH; + request.transmit_path = net->local_transmit_path; return tb_xdomain_request(xd, &request, sizeof(request), TB_CFG_PKG_XDOMAIN_RESP, &reply, @@ -364,10 +364,10 @@ static void tbnet_tear_down(struct tbnet *net, bool send_logout) mutex_lock(&net->connection_lock); if (net->login_sent && net->login_received) { - int retries = TBNET_LOGOUT_RETRIES; + int ret, retries = TBNET_LOGOUT_RETRIES; while (send_logout && retries-- > 0) { - int ret = tbnet_logout_request(net); + ret = tbnet_logout_request(net); if (ret != -ETIMEDOUT) break; } @@ -377,8 +377,16 @@ static void tbnet_tear_down(struct tbnet *net, bool send_logout) tbnet_free_buffers(&net->rx_ring); tbnet_free_buffers(&net->tx_ring); - if (tb_xdomain_disable_paths(net->xd)) + ret = tb_xdomain_disable_paths(net->xd, + net->local_transmit_path, + net->rx_ring.ring->hop, + net->remote_transmit_path, + net->tx_ring.ring->hop); + if (ret) netdev_warn(net->dev, "failed to disable DMA paths\n"); + + tb_xdomain_release_in_hopid(net->xd, net->remote_transmit_path); + net->remote_transmit_path = 0; } net->login_retries = 0; @@ -424,7 +432,7 @@ static int tbnet_handle_packet(const void *buf, size_t size, void *data) if (!ret) { mutex_lock(&net->connection_lock); net->login_received = true; - net->transmit_path = pkg->transmit_path; + net->remote_transmit_path = pkg->transmit_path; /* If we reached the number of max retries or * previous logout, schedule another round of @@ -597
[PATCH 00/18] thunderbolt: Align with USB4 inter-domain and DROM specs
Hi all, The latest USB4 spec [1] also includes inter-domain (peer-to-peer, XDomain) and DROM (per-device ROM) specs. There are sligth differences between what the driver is doing now and what the spec say so this series tries to align the driver(s) with that. We also improve the "service" stack so that it is possible to run multiple DMA tunnels over a single XDomain connection, and update the two existing service drivers accordingly. We also decrease the control channel timeout when software based connection manager is used. The USB4 DROM spec adds a new product descriptor that includes the device and IDs instead of the generic entries in the Thunderbotl 1-3 DROMs. This series updates the driver to parse this descriptor too. [1] https://www.usb.org/document-library/usb4tm-specification Mika Westerberg (18): thunderbolt: Disable retry logic for intra-domain control packets thunderbolt: Do not pass timeout for tb_cfg_reset() thunderbolt: Decrease control channel timeout for software connection manager Documentation / thunderbolt: Drop speed/lanes entries for XDomain thunderbolt: Add more logging to XDomain connections thunderbolt: Do not re-establish XDomain DMA paths automatically thunderbolt: Use pseudo-random number as initial property block generation thunderbolt: Align XDomain protocol timeouts with the spec thunderbolt: Add tb_property_copy_dir() thunderbolt: Add support for maxhopid XDomain property thunderbolt: Use dedicated flow control for DMA tunnels thunderbolt: Drop unused tb_port_set_initial_credits() thunderbolt: Allow multiple DMA tunnels over a single XDomain connection net: thunderbolt: Align the driver to the USB4 networking spec thunderbolt: Add KUnit tests for XDomain properties thunderbolt: Add KUnit tests for DMA tunnels thunderbolt: Check quirks in tb_switch_add() thunderbolt: Add support for USB4 DROM .../ABI/testing/sysfs-bus-thunderbolt | 35 +- drivers/net/thunderbolt.c | 56 +- drivers/thunderbolt/ctl.c | 21 +- drivers/thunderbolt/ctl.h | 8 +- drivers/thunderbolt/dma_test.c| 35 +- drivers/thunderbolt/domain.c | 90 ++-- drivers/thunderbolt/eeprom.c | 105 +++- drivers/thunderbolt/icm.c | 34 +- drivers/thunderbolt/property.c| 71 +++ drivers/thunderbolt/switch.c | 26 +- drivers/thunderbolt/tb.c | 52 +- drivers/thunderbolt/tb.h | 19 +- drivers/thunderbolt/test.c| 492 ++ drivers/thunderbolt/tunnel.c | 102 +++- drivers/thunderbolt/tunnel.h | 8 +- drivers/thunderbolt/xdomain.c | 416 +-- include/linux/thunderbolt.h | 54 +- 17 files changed, 1220 insertions(+), 404 deletions(-) -- 2.30.1
[PATCH 10/18] thunderbolt: Add support for maxhopid XDomain property
USB4 inter-domain spec mandates that the compatible hosts expose a new property "maxhopid" that tells the connection manager on the other side what is the maximum supported input HopID over the connection. Since this is depend on the lane adapter the cable is connected it needs to be filled in dynamically. For this reason we take a copy of the global properties and fill then for each XDomain connection upon first connect, and then keep updating it if the generation changes as services are being added/removed. We also take advantage of this copy to fill in the hostname. We also expose this maxhopid as an attribute under each XDomain device. While there drop kernel-doc entry for property_lock which seems to be left there when the structure was originally introduced. Signed-off-by: Mika Westerberg --- .../ABI/testing/sysfs-bus-thunderbolt | 7 + drivers/thunderbolt/xdomain.c | 206 ++ include/linux/thunderbolt.h | 19 +- 3 files changed, 138 insertions(+), 94 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-bus-thunderbolt b/Documentation/ABI/testing/sysfs-bus-thunderbolt index bfa4ca6f3fc1..c41c68f64693 100644 --- a/Documentation/ABI/testing/sysfs-bus-thunderbolt +++ b/Documentation/ABI/testing/sysfs-bus-thunderbolt @@ -134,6 +134,13 @@ Contact: thunderbolt-softw...@lists.01.org Description: This attribute contains name of this device extracted from the device DROM. +What: /sys/bus/thunderbolt/devices/.../maxhopid +Date: Jul 2021 +KernelVersion: 5.13 +Contact: Mika Westerberg +Description: Only set for XDomains. The maximum HopID the other host + supports as its input HopID. + What: /sys/bus/thunderbolt/devices/.../rx_speed Date: Jan 2020 KernelVersion: 5.5 diff --git a/drivers/thunderbolt/xdomain.c b/drivers/thunderbolt/xdomain.c index ffa9cc9e0e7d..ab56757d7c24 100644 --- a/drivers/thunderbolt/xdomain.c +++ b/drivers/thunderbolt/xdomain.c @@ -24,6 +24,7 @@ #define XDOMAIN_PROPERTIES_RETRIES 10 #define XDOMAIN_PROPERTIES_CHANGED_RETRIES 10 #define XDOMAIN_BONDING_WAIT 100 /* ms */ +#define XDOMAIN_DEFAULT_MAX_HOPID 15 struct xdomain_request_work { struct work_struct work; @@ -35,13 +36,15 @@ static bool tb_xdomain_enabled = true; module_param_named(xdomain, tb_xdomain_enabled, bool, 0444); MODULE_PARM_DESC(xdomain, "allow XDomain protocol (default: true)"); -/* Serializes access to the properties and protocol handlers below */ +/* + * Serializes access to the properties and protocol handlers below. If + * you need to take both this lock and the struct tb_xdomain lock, take + * this one first. + */ static DEFINE_MUTEX(xdomain_lock); /* Properties exposed to the remote domains */ static struct tb_property_dir *xdomain_property_dir; -static u32 *xdomain_property_block; -static u32 xdomain_property_block_len; static u32 xdomain_property_block_gen; /* Additional protocol handlers */ @@ -386,8 +389,7 @@ static int tb_xdp_properties_request(struct tb_ctl *ctl, u64 route, } static int tb_xdp_properties_response(struct tb *tb, struct tb_ctl *ctl, - u64 route, u8 sequence, const uuid_t *src_uuid, - const struct tb_xdp_properties *req) + struct tb_xdomain *xd, u8 sequence, const struct tb_xdp_properties *req) { struct tb_xdp_properties_response *res; size_t total_size; @@ -399,39 +401,39 @@ static int tb_xdp_properties_response(struct tb *tb, struct tb_ctl *ctl, * protocol supports forwarding, though which we might add * support later on. */ - if (!uuid_equal(src_uuid, &req->dst_uuid)) { - tb_xdp_error_response(ctl, route, sequence, + if (!uuid_equal(xd->local_uuid, &req->dst_uuid)) { + tb_xdp_error_response(ctl, xd->route, sequence, ERROR_UNKNOWN_DOMAIN); return 0; } - mutex_lock(&xdomain_lock); + mutex_lock(&xd->lock); - if (req->offset >= xdomain_property_block_len) { - mutex_unlock(&xdomain_lock); + if (req->offset >= xd->local_property_block_len) { + mutex_unlock(&xd->lock); return -EINVAL; } - len = xdomain_property_block_len - req->offset; + len = xd->local_property_block_len - req->offset; len = min_t(u16, len, TB_XDP_PROPERTIES_MAX_DATA_LENGTH); total_size = sizeof(*res) + len * 4; res = kzalloc(total_size, GFP_KERNEL); if (!res) { - mutex_unlock(&xdomain_lock); + mutex_unlock(&xd->lock); return -ENOMEM; } - tb_xdp_fill_header(&res->hdr, route, sequence, PROPERTIES_RESPONSE, + tb_xdp_fill_header(&res->hdr, xd->route, sequence, PROPERTIES_RESPONSE, total_size); - res->generation = xdomai
[PATCH 15/18] thunderbolt: Add KUnit tests for XDomain properties
This adds KUnit tests for parsing, formatting and copying of XDomain properties. Signed-off-by: Mika Westerberg --- drivers/thunderbolt/test.c | 252 + 1 file changed, 252 insertions(+) diff --git a/drivers/thunderbolt/test.c b/drivers/thunderbolt/test.c index 464c2d37b992..4e1e7ae2d90d 100644 --- a/drivers/thunderbolt/test.c +++ b/drivers/thunderbolt/test.c @@ -1594,6 +1594,255 @@ static void tb_test_tunnel_port_on_path(struct kunit *test) tb_tunnel_free(dp_tunnel); } +static const u32 root_directory[] = { + 0x55584401, /* "UXD" v1 */ + 0x0018, /* Root directory length */ + 0x76656e64, /* "vend" */ + 0x6f726964, /* "orid" */ + 0x7601, /* "v" R 1 */ + 0x0a27, /* Immediate value, ! Vendor ID */ + 0x76656e64, /* "vend" */ + 0x6f726964, /* "orid" */ + 0x7403, /* "t" R 3 */ + 0x001a, /* Text leaf offset, (“Apple Inc.”) */ + 0x64657669, /* "devi" */ + 0x63656964, /* "ceid" */ + 0x7601, /* "v" R 1 */ + 0x000a, /* Immediate value, ! Device ID */ + 0x64657669, /* "devi" */ + 0x63656964, /* "ceid" */ + 0x7403, /* "t" R 3 */ + 0x001d, /* Text leaf offset, (“Macintosh”) */ + 0x64657669, /* "devi" */ + 0x63657276, /* "cerv" */ + 0x7601, /* "v" R 1 */ + 0x8100, /* Immediate value, Device Revision */ + 0x6e657477, /* "netw" */ + 0x6f726b00, /* "ork" */ + 0x4414, /* "D" R 20 */ + 0x0021, /* Directory data offset, (Network Directory) */ + 0x4170706c, /* "Appl" */ + 0x6520496e, /* "e In" */ + 0x632e, /* "c." ! */ + 0x4d616369, /* "Maci" */ + 0x6e746f73, /* "ntos" */ + 0x6800, /* "h" */ + 0x, /* padding */ + 0xca8961c6, /* Directory UUID, Network Directory */ + 0x9541ce1c, /* Directory UUID, Network Directory */ + 0x5949b8bd, /* Directory UUID, Network Directory */ + 0x4f5a5f2e, /* Directory UUID, Network Directory */ + 0x70727463, /* "prtc" */ + 0x6964, /* "id" */ + 0x7601, /* "v" R 1 */ + 0x0001, /* Immediate value, Network Protocol ID */ + 0x70727463, /* "prtc" */ + 0x76657273, /* "vers" */ + 0x7601, /* "v" R 1 */ + 0x0001, /* Immediate value, Network Protocol Version */ + 0x70727463, /* "prtc" */ + 0x72657673, /* "revs" */ + 0x7601, /* "v" R 1 */ + 0x0001, /* Immediate value, Network Protocol Revision */ + 0x70727463, /* "prtc" */ + 0x73746e73, /* "stns" */ + 0x7601, /* "v" R 1 */ + 0x, /* Immediate value, Network Protocol Settings */ +}; + +static const uuid_t network_dir_uuid = + UUID_INIT(0xc66189ca, 0x1cce, 0x4195, + 0xbd, 0xb8, 0x49, 0x59, 0x2e, 0x5f, 0x5a, 0x4f); + +static void tb_test_property_parse(struct kunit *test) +{ + struct tb_property_dir *dir, *network_dir; + struct tb_property *p; + + dir = tb_property_parse_dir(root_directory, ARRAY_SIZE(root_directory)); + KUNIT_ASSERT_TRUE(test, dir != NULL); + + p = tb_property_find(dir, "foo", TB_PROPERTY_TYPE_TEXT); + KUNIT_ASSERT_TRUE(test, !p); + + p = tb_property_find(dir, "vendorid", TB_PROPERTY_TYPE_TEXT); + KUNIT_ASSERT_TRUE(test, p != NULL); + KUNIT_EXPECT_STREQ(test, p->value.text, "Apple Inc."); + + p = tb_property_find(dir, "vendorid", TB_PROPERTY_TYPE_VALUE); + KUNIT_ASSERT_TRUE(test, p != NULL); + KUNIT_EXPECT_EQ(test, p->value.immediate, (u32)0xa27); + + p = tb_property_find(dir, "deviceid", TB_PROPERTY_TYPE_TEXT); + KUNIT_ASSERT_TRUE(test, p != NULL); + KUNIT_EXPECT_STREQ(test, p->value.text, "Macintosh"); + + p = tb_property_find(dir, "deviceid", TB_PROPERTY_TYPE_VALUE); + KUNIT_ASSERT_TRUE(test, p != NULL); + KUNIT_EXPECT_EQ(test, p->value.immediate, (u32)0xa); + + p = tb_property_find(dir, "missing", TB_PROPERTY_TYPE_DIRECTORY); + KUNIT_ASSERT_TRUE(test, !p); + + p = tb_property_find(dir, "network", TB_PROPERTY_TYPE_DIRECTORY); + KUNIT_ASSERT_TRUE(test, p != NULL); + + network_dir = p->value.dir; + KUNIT_EXPECT_TRUE(test, uuid_equal(network_dir->uuid, &network_dir_uuid)); + + p = tb_property_find(network_dir, "prtcid", TB_PROPERTY_TYPE_VALUE); + KUNIT_ASSERT_TRUE(test, p != NULL); + KUNIT_EXPECT_EQ(test, p->value.immediate, (u32)0x1); + + p = tb_property_find(network_dir, "prtcvers", TB_PROPERTY_TYPE_VALUE); + KUNIT_ASSERT_TRUE(test, p != NULL); + KUNIT_EXPECT_EQ(test, p->value.immediate, (u32)0x1); + + p = tb_property_find(network_dir, "prtcrevs", TB_PROPERTY_TYPE_
[PATCH 11/18] thunderbolt: Use dedicated flow control for DMA tunnels
The USB4 inter-domain service spec recommends using dedicated flow control scheme so update the driver accordingly. Signed-off-by: Mika Westerberg --- drivers/thunderbolt/tunnel.c | 20 1 file changed, 4 insertions(+), 16 deletions(-) diff --git a/drivers/thunderbolt/tunnel.c b/drivers/thunderbolt/tunnel.c index 6557b6e07009..2e7ec037a73e 100644 --- a/drivers/thunderbolt/tunnel.c +++ b/drivers/thunderbolt/tunnel.c @@ -794,24 +794,14 @@ static u32 tb_dma_credits(struct tb_port *nhi) return min(max_credits, 13U); } -static int tb_dma_activate(struct tb_tunnel *tunnel, bool active) -{ - struct tb_port *nhi = tunnel->src_port; - u32 credits; - - credits = active ? tb_dma_credits(nhi) : 0; - return tb_port_set_initial_credits(nhi, credits); -} - -static void tb_dma_init_path(struct tb_path *path, unsigned int isb, -unsigned int efc, u32 credits) +static void tb_dma_init_path(struct tb_path *path, unsigned int efc, u32 credits) { int i; path->egress_fc_enable = efc; path->ingress_fc_enable = TB_PATH_ALL; path->egress_shared_buffer = TB_PATH_NONE; - path->ingress_shared_buffer = isb; + path->ingress_shared_buffer = TB_PATH_NONE; path->priority = 5; path->weight = 1; path->clear_fc = true; @@ -856,7 +846,6 @@ struct tb_tunnel *tb_tunnel_alloc_dma(struct tb *tb, struct tb_port *nhi, if (!tunnel) return NULL; - tunnel->activate = tb_dma_activate; tunnel->src_port = nhi; tunnel->dst_port = dst; @@ -869,8 +858,7 @@ struct tb_tunnel *tb_tunnel_alloc_dma(struct tb *tb, struct tb_port *nhi, tb_tunnel_free(tunnel); return NULL; } - tb_dma_init_path(path, TB_PATH_NONE, TB_PATH_SOURCE | TB_PATH_INTERNAL, -credits); + tb_dma_init_path(path, TB_PATH_SOURCE | TB_PATH_INTERNAL, credits); tunnel->paths[i++] = path; } @@ -881,7 +869,7 @@ struct tb_tunnel *tb_tunnel_alloc_dma(struct tb *tb, struct tb_port *nhi, tb_tunnel_free(tunnel); return NULL; } - tb_dma_init_path(path, TB_PATH_SOURCE, TB_PATH_ALL, credits); + tb_dma_init_path(path, TB_PATH_ALL, credits); tunnel->paths[i++] = path; } -- 2.30.1
[PATCH 08/18] thunderbolt: Align XDomain protocol timeouts with the spec
The USB4 inter-domain service spec has slightly different recommended timeouts for the XDomain protocol so align the driver with those. Signed-off-by: Mika Westerberg --- drivers/thunderbolt/xdomain.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/thunderbolt/xdomain.c b/drivers/thunderbolt/xdomain.c index cfe6fa7e84f4..ffa9cc9e0e7d 100644 --- a/drivers/thunderbolt/xdomain.c +++ b/drivers/thunderbolt/xdomain.c @@ -19,9 +19,9 @@ #include "tb.h" -#define XDOMAIN_DEFAULT_TIMEOUT5000 /* ms */ +#define XDOMAIN_DEFAULT_TIMEOUT1000 /* ms */ #define XDOMAIN_UUID_RETRIES 10 -#define XDOMAIN_PROPERTIES_RETRIES 60 +#define XDOMAIN_PROPERTIES_RETRIES 10 #define XDOMAIN_PROPERTIES_CHANGED_RETRIES 10 #define XDOMAIN_BONDING_WAIT 100 /* ms */ -- 2.30.1
[PATCH 09/18] thunderbolt: Add tb_property_copy_dir()
This function takes a deep copy of the properties. We need this in order to support more dynamic properties per XDomain connection as required by the USB4 inter-domain service spec. Signed-off-by: Mika Westerberg --- drivers/thunderbolt/property.c | 71 ++ include/linux/thunderbolt.h| 1 + 2 files changed, 72 insertions(+) diff --git a/drivers/thunderbolt/property.c b/drivers/thunderbolt/property.c index d5b0cdb8f0b1..dc555cda98e6 100644 --- a/drivers/thunderbolt/property.c +++ b/drivers/thunderbolt/property.c @@ -501,6 +501,77 @@ ssize_t tb_property_format_dir(const struct tb_property_dir *dir, u32 *block, return ret < 0 ? ret : 0; } +/** + * tb_property_copy_dir() - Take a deep copy of directory + * @dir: Directory to copy + * + * This function takes a deep copy of @dir and returns back the copy. In + * case of error returns %NULL. The resulting directory needs to be + * released by calling tb_property_free_dir(). + */ +struct tb_property_dir *tb_property_copy_dir(const struct tb_property_dir *dir) +{ + struct tb_property *property, *p = NULL; + struct tb_property_dir *d; + + if (!dir) + return NULL; + + d = tb_property_create_dir(dir->uuid); + if (!d) + return NULL; + + list_for_each_entry(property, &dir->properties, list) { + struct tb_property *p; + + p = tb_property_alloc(property->key, property->type); + if (!p) + goto err_free; + + p->length = property->length; + + switch (property->type) { + case TB_PROPERTY_TYPE_DIRECTORY: + p->value.dir = tb_property_copy_dir(property->value.dir); + if (!p->value.dir) + goto err_free; + break; + + case TB_PROPERTY_TYPE_DATA: + p->value.data = kmemdup(property->value.data, + property->length * 4, + GFP_KERNEL); + if (!p->value.data) + goto err_free; + break; + + case TB_PROPERTY_TYPE_TEXT: + p->value.text = kzalloc(p->length * 4, GFP_KERNEL); + if (!p->value.text) + goto err_free; + strcpy(p->value.text, property->value.text); + break; + + case TB_PROPERTY_TYPE_VALUE: + p->value.immediate = property->value.immediate; + break; + + default: + break; + } + + list_add_tail(&p->list, &d->properties); + } + + return d; + +err_free: + kfree(p); + tb_property_free_dir(d); + + return NULL; +} + /** * tb_property_add_immediate() - Add immediate property to directory * @parent: Directory to add the property diff --git a/include/linux/thunderbolt.h b/include/linux/thunderbolt.h index 7ec977161f5c..003a9ad29168 100644 --- a/include/linux/thunderbolt.h +++ b/include/linux/thunderbolt.h @@ -146,6 +146,7 @@ struct tb_property_dir *tb_property_parse_dir(const u32 *block, size_t block_len); ssize_t tb_property_format_dir(const struct tb_property_dir *dir, u32 *block, size_t block_len); +struct tb_property_dir *tb_property_copy_dir(const struct tb_property_dir *dir); struct tb_property_dir *tb_property_create_dir(const uuid_t *uuid); void tb_property_free_dir(struct tb_property_dir *dir); int tb_property_add_immediate(struct tb_property_dir *parent, const char *key, -- 2.30.1
[PATCH 14/18] net: thunderbolt: Align the driver to the USB4 networking spec
The USB4 networking spec (USB4NET) recommends different timeouts, and also suggest that the driver sets the 64k frame support flag in the properties block. Make the networking driver to honor this. Signed-off-by: Mika Westerberg --- drivers/net/thunderbolt.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/net/thunderbolt.c b/drivers/net/thunderbolt.c index 5c9ec91b6e78..9a6a8353e192 100644 --- a/drivers/net/thunderbolt.c +++ b/drivers/net/thunderbolt.c @@ -25,12 +25,13 @@ /* Protocol timeouts in ms */ #define TBNET_LOGIN_DELAY 4500 #define TBNET_LOGIN_TIMEOUT500 -#define TBNET_LOGOUT_TIMEOUT 100 +#define TBNET_LOGOUT_TIMEOUT 1000 #define TBNET_RING_SIZE256 #define TBNET_LOGIN_RETRIES60 -#define TBNET_LOGOUT_RETRIES 5 +#define TBNET_LOGOUT_RETRIES 10 #define TBNET_MATCH_FRAGS_ID BIT(1) +#define TBNET_64K_FRAMES BIT(2) #define TBNET_MAX_MTU SZ_64K #define TBNET_FRAME_SIZE SZ_4K #define TBNET_MAX_PAYLOAD_SIZE \ @@ -1367,7 +1368,7 @@ static int __init tbnet_init(void) * the moment. */ tb_property_add_immediate(tbnet_dir, "prtcstns", - TBNET_MATCH_FRAGS_ID); + TBNET_MATCH_FRAGS_ID | TBNET_64K_FRAMES); ret = tb_register_property_dir("network", tbnet_dir); if (ret) { -- 2.30.1
[PATCH 02/18] thunderbolt: Do not pass timeout for tb_cfg_reset()
There is only one user for this function and it passes the default timeout to it anyway, so remove the parameter completely. This is also needed in the subsequent patch where we allow connection manager implementations to use different timeout for non-raw control channel messages. Signed-off-by: Mika Westerberg --- drivers/thunderbolt/ctl.c| 6 ++ drivers/thunderbolt/ctl.h| 3 +-- drivers/thunderbolt/switch.c | 2 +- 3 files changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/thunderbolt/ctl.c b/drivers/thunderbolt/ctl.c index 875922133782..b79be1f02d92 100644 --- a/drivers/thunderbolt/ctl.c +++ b/drivers/thunderbolt/ctl.c @@ -802,14 +802,12 @@ static bool tb_cfg_copy(struct tb_cfg_request *req, const struct ctl_pkg *pkg) * tb_cfg_reset() - send a reset packet and wait for a response * @ctl: Control channel pointer * @route: Router string for the router to send reset - * @timeout_msec: Timeout in ms how long to wait for the response * * If the switch at route is incorrectly configured then we will not receive a * reply (even though the switch will reset). The caller should check for * -ETIMEDOUT and attempt to reconfigure the switch. */ -struct tb_cfg_result tb_cfg_reset(struct tb_ctl *ctl, u64 route, - int timeout_msec) +struct tb_cfg_result tb_cfg_reset(struct tb_ctl *ctl, u64 route) { struct cfg_reset_pkg request = { .header = tb_cfg_make_header(route) }; struct tb_cfg_result res = { 0 }; @@ -831,7 +829,7 @@ struct tb_cfg_result tb_cfg_reset(struct tb_ctl *ctl, u64 route, req->response_size = sizeof(reply); req->response_type = TB_CFG_PKG_RESET; - res = tb_cfg_request_sync(ctl, req, timeout_msec); + res = tb_cfg_request_sync(ctl, req, TB_CFG_DEFAULT_TIMEOUT); tb_cfg_request_put(req); diff --git a/drivers/thunderbolt/ctl.h b/drivers/thunderbolt/ctl.h index 97cb03b38953..2eafbfea5dff 100644 --- a/drivers/thunderbolt/ctl.h +++ b/drivers/thunderbolt/ctl.h @@ -124,8 +124,7 @@ static inline struct tb_cfg_header tb_cfg_make_header(u64 route) } int tb_cfg_ack_plug(struct tb_ctl *ctl, u64 route, u32 port, bool unplug); -struct tb_cfg_result tb_cfg_reset(struct tb_ctl *ctl, u64 route, - int timeout_msec); +struct tb_cfg_result tb_cfg_reset(struct tb_ctl *ctl, u64 route); struct tb_cfg_result tb_cfg_read_raw(struct tb_ctl *ctl, void *buffer, u64 route, u32 port, enum tb_cfg_space space, u32 offset, diff --git a/drivers/thunderbolt/switch.c b/drivers/thunderbolt/switch.c index 2a95b4ce06c0..218869c6ee21 100644 --- a/drivers/thunderbolt/switch.c +++ b/drivers/thunderbolt/switch.c @@ -1331,7 +1331,7 @@ int tb_switch_reset(struct tb_switch *sw) TB_CFG_SWITCH, 2, 2); if (res.err) return res.err; - res = tb_cfg_reset(sw->tb->ctl, tb_route(sw), TB_CFG_DEFAULT_TIMEOUT); + res = tb_cfg_reset(sw->tb->ctl, tb_route(sw)); if (res.err > 0) return -EIO; return res.err; -- 2.30.1
[PATCH 16/18] thunderbolt: Add KUnit tests for DMA tunnels
Add a couple of tests to check DMA tunneling functionality. Signed-off-by: Mika Westerberg --- drivers/thunderbolt/test.c | 240 + 1 file changed, 240 insertions(+) diff --git a/drivers/thunderbolt/test.c b/drivers/thunderbolt/test.c index 4e1e7ae2d90d..5ff5a03bc9ce 100644 --- a/drivers/thunderbolt/test.c +++ b/drivers/thunderbolt/test.c @@ -119,6 +119,7 @@ static struct tb_switch *alloc_host(struct kunit *test) sw->ports[7].config.type = TB_TYPE_NHI; sw->ports[7].config.max_in_hop_id = 11; sw->ports[7].config.max_out_hop_id = 11; + sw->ports[7].config.nfc_credits = 0x4180; sw->ports[8].config.type = TB_TYPE_PCIE_DOWN; sw->ports[8].config.max_in_hop_id = 8; @@ -1594,6 +1595,240 @@ static void tb_test_tunnel_port_on_path(struct kunit *test) tb_tunnel_free(dp_tunnel); } +static void tb_test_tunnel_dma(struct kunit *test) +{ + struct tb_port *nhi, *port; + struct tb_tunnel *tunnel; + struct tb_switch *host; + + /* +* Create DMA tunnel from NHI to port 1 and back. +* +* [Host 1] +*1 ^ In HopID 1 -> Out HopID 8 +* | +* v In HopID 8 -> Out HopID 1 +* Domain border +* | +* [Host 2] +*/ + host = alloc_host(test); + nhi = &host->ports[7]; + port = &host->ports[1]; + + tunnel = tb_tunnel_alloc_dma(NULL, nhi, port, 8, 1, 8, 1); + KUNIT_ASSERT_TRUE(test, tunnel != NULL); + KUNIT_EXPECT_EQ(test, tunnel->type, (enum tb_tunnel_type)TB_TUNNEL_DMA); + KUNIT_EXPECT_PTR_EQ(test, tunnel->src_port, nhi); + KUNIT_EXPECT_PTR_EQ(test, tunnel->dst_port, port); + KUNIT_ASSERT_EQ(test, tunnel->npaths, (size_t)2); + /* RX path */ + KUNIT_ASSERT_EQ(test, tunnel->paths[0]->path_length, 1); + KUNIT_EXPECT_PTR_EQ(test, tunnel->paths[0]->hops[0].in_port, port); + KUNIT_EXPECT_EQ(test, tunnel->paths[0]->hops[0].in_hop_index, 8); + KUNIT_EXPECT_PTR_EQ(test, tunnel->paths[0]->hops[0].out_port, nhi); + KUNIT_EXPECT_EQ(test, tunnel->paths[0]->hops[0].next_hop_index, 1); + /* TX path */ + KUNIT_ASSERT_EQ(test, tunnel->paths[1]->path_length, 1); + KUNIT_EXPECT_PTR_EQ(test, tunnel->paths[1]->hops[0].in_port, nhi); + KUNIT_EXPECT_EQ(test, tunnel->paths[1]->hops[0].in_hop_index, 1); + KUNIT_EXPECT_PTR_EQ(test, tunnel->paths[1]->hops[0].out_port, port); + KUNIT_EXPECT_EQ(test, tunnel->paths[1]->hops[0].next_hop_index, 8); + + tb_tunnel_free(tunnel); +} + +static void tb_test_tunnel_dma_rx(struct kunit *test) +{ + struct tb_port *nhi, *port; + struct tb_tunnel *tunnel; + struct tb_switch *host; + + /* +* Create DMA RX tunnel from port 1 to NHI. +* +* [Host 1] +*1 ^ +* | +* | In HopID 15 -> Out HopID 2 +* Domain border +* | +* [Host 2] +*/ + host = alloc_host(test); + nhi = &host->ports[7]; + port = &host->ports[1]; + + tunnel = tb_tunnel_alloc_dma(NULL, nhi, port, -1, -1, 15, 2); + KUNIT_ASSERT_TRUE(test, tunnel != NULL); + KUNIT_EXPECT_EQ(test, tunnel->type, (enum tb_tunnel_type)TB_TUNNEL_DMA); + KUNIT_EXPECT_PTR_EQ(test, tunnel->src_port, nhi); + KUNIT_EXPECT_PTR_EQ(test, tunnel->dst_port, port); + KUNIT_ASSERT_EQ(test, tunnel->npaths, (size_t)1); + /* RX path */ + KUNIT_ASSERT_EQ(test, tunnel->paths[0]->path_length, 1); + KUNIT_EXPECT_PTR_EQ(test, tunnel->paths[0]->hops[0].in_port, port); + KUNIT_EXPECT_EQ(test, tunnel->paths[0]->hops[0].in_hop_index, 15); + KUNIT_EXPECT_PTR_EQ(test, tunnel->paths[0]->hops[0].out_port, nhi); + KUNIT_EXPECT_EQ(test, tunnel->paths[0]->hops[0].next_hop_index, 2); + + tb_tunnel_free(tunnel); +} + +static void tb_test_tunnel_dma_tx(struct kunit *test) +{ + struct tb_port *nhi, *port; + struct tb_tunnel *tunnel; + struct tb_switch *host; + + /* +* Create DMA TX tunnel from NHI to port 1. +* +* [Host 1] +*1 | In HopID 2 -> Out HopID 15 +* | +* v +* Domain border +* | +* [Host 2] +*/ + host = alloc_host(test); + nhi = &host->ports[7]; + port = &host->ports[1]; + + tunnel = tb_tunnel_alloc_dma(NULL, nhi, port, 15, 2, -1, -1); + KUNIT_ASSERT_TRUE(test, tunnel != NULL); + KUNIT_EXPECT_EQ(test, tunnel->type, (enum tb_tunnel_type)TB_TUNNEL_DMA); + KUNIT_EXPECT_PTR_EQ(test, tunnel->src_port, nhi); + KUNIT_EXPECT_PTR_EQ(test, tunnel->dst_port, port); + KUNIT_ASSERT_EQ(test, tunnel->npaths, (size_t)1); + /* TX path */ + KUNIT_ASSERT_EQ(test, tunnel->paths[0]->path_length, 1); + KUNIT_EXPECT_PTR_EQ(test, tunnel->paths[0]->
[PATCH 17/18] thunderbolt: Check quirks in tb_switch_add()
This makes it more visible on the main path of adding router. Signed-off-by: Mika Westerberg --- drivers/thunderbolt/eeprom.c | 1 - drivers/thunderbolt/switch.c | 2 ++ 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/thunderbolt/eeprom.c b/drivers/thunderbolt/eeprom.c index dd03d3096653..aecb0b9f0c75 100644 --- a/drivers/thunderbolt/eeprom.c +++ b/drivers/thunderbolt/eeprom.c @@ -610,7 +610,6 @@ int tb_drom_read(struct tb_switch *sw) sw->uid = header->uid; sw->vendor = header->vendor_id; sw->device = header->model_id; - tb_check_quirks(sw); crc = tb_crc32(sw->drom + TB_DROM_DATA_START, header->data_len); if (crc != header->data_crc32) { diff --git a/drivers/thunderbolt/switch.c b/drivers/thunderbolt/switch.c index 7ac37a1f95e1..72b43c7c0651 100644 --- a/drivers/thunderbolt/switch.c +++ b/drivers/thunderbolt/switch.c @@ -2520,6 +2520,8 @@ int tb_switch_add(struct tb_switch *sw) } tb_sw_dbg(sw, "uid: %#llx\n", sw->uid); + tb_check_quirks(sw); + ret = tb_switch_set_uuid(sw); if (ret) { dev_err(&sw->dev, "failed to set UUID\n"); -- 2.30.1
[PATCH 12/18] thunderbolt: Drop unused tb_port_set_initial_credits()
This function is not used anymore in the driver so we can remove it. Signed-off-by: Mika Westerberg --- drivers/thunderbolt/switch.c | 22 -- drivers/thunderbolt/tb.h | 1 - 2 files changed, 23 deletions(-) diff --git a/drivers/thunderbolt/switch.c b/drivers/thunderbolt/switch.c index 218869c6ee21..7ac37a1f95e1 100644 --- a/drivers/thunderbolt/switch.c +++ b/drivers/thunderbolt/switch.c @@ -626,28 +626,6 @@ int tb_port_add_nfc_credits(struct tb_port *port, int credits) TB_CFG_PORT, ADP_CS_4, 1); } -/** - * tb_port_set_initial_credits() - Set initial port link credits allocated - * @port: Port to set the initial credits - * @credits: Number of credits to to allocate - * - * Set initial credits value to be used for ingress shared buffering. - */ -int tb_port_set_initial_credits(struct tb_port *port, u32 credits) -{ - u32 data; - int ret; - - ret = tb_port_read(port, &data, TB_CFG_PORT, ADP_CS_5, 1); - if (ret) - return ret; - - data &= ~ADP_CS_5_LCA_MASK; - data |= (credits << ADP_CS_5_LCA_SHIFT) & ADP_CS_5_LCA_MASK; - - return tb_port_write(port, &data, TB_CFG_PORT, ADP_CS_5, 1); -} - /** * tb_port_clear_counter() - clear a counter in TB_CFG_COUNTER * @port: Port whose counters to clear diff --git a/drivers/thunderbolt/tb.h b/drivers/thunderbolt/tb.h index d6ad45686488..ec8cdbc761fa 100644 --- a/drivers/thunderbolt/tb.h +++ b/drivers/thunderbolt/tb.h @@ -860,7 +860,6 @@ static inline bool tb_switch_tmu_is_enabled(const struct tb_switch *sw) int tb_wait_for_port(struct tb_port *port, bool wait_if_unplugged); int tb_port_add_nfc_credits(struct tb_port *port, int credits); -int tb_port_set_initial_credits(struct tb_port *port, u32 credits); int tb_port_clear_counter(struct tb_port *port, int counter); int tb_port_unlock(struct tb_port *port); int tb_port_enable(struct tb_port *port); -- 2.30.1
[PATCH 18/18] thunderbolt: Add support for USB4 DROM
USB4 router DROM differs sligthly from Thunderbolt 1-3 DROM. For instance it does not include UID and CRC8 in the header section, and it has product descriptor genereric entry to describe the product IDs and related information. If the "Version" field in the DROM header section reads 3 it means the router only has USB4 DROM and if it reads 1 it means the router supports TBT3 compatible DROM. For this reason, update the DROM parsing code to support "pure" USB4 DROMs too. While there drop the extra empty line at the end of tb_drom_read(). Signed-off-by: Mika Westerberg --- drivers/thunderbolt/eeprom.c | 104 +++ 1 file changed, 80 insertions(+), 24 deletions(-) diff --git a/drivers/thunderbolt/eeprom.c b/drivers/thunderbolt/eeprom.c index aecb0b9f0c75..46d0906a3070 100644 --- a/drivers/thunderbolt/eeprom.c +++ b/drivers/thunderbolt/eeprom.c @@ -277,6 +277,16 @@ struct tb_drom_entry_port { u8 unknown4:2; } __packed; +/* USB4 product descriptor */ +struct tb_drom_entry_desc { + struct tb_drom_entry_header header; + u16 bcdUSBSpec; + u16 idVendor; + u16 idProduct; + u16 bcdProductFWRevision; + u32 TID; + u8 productHWRevision; +}; /** * tb_drom_read_uid_only() - Read UID directly from DROM @@ -329,6 +339,16 @@ static int tb_drom_parse_entry_generic(struct tb_switch *sw, if (!sw->device_name) return -ENOMEM; break; + case 9: { + const struct tb_drom_entry_desc *desc = + (const struct tb_drom_entry_desc *)entry; + + if (!sw->vendor && !sw->device) { + sw->vendor = desc->idVendor; + sw->device = desc->idProduct; + } + break; + } } return 0; @@ -521,6 +541,51 @@ static int tb_drom_read_n(struct tb_switch *sw, u16 offset, u8 *val, return tb_eeprom_read_n(sw, offset, val, count); } +static int tb_drom_parse(struct tb_switch *sw) +{ + const struct tb_drom_header *header = + (const struct tb_drom_header *)sw->drom; + u32 crc; + + crc = tb_crc8((u8 *) &header->uid, 8); + if (crc != header->uid_crc8) { + tb_sw_warn(sw, + "DROM UID CRC8 mismatch (expected: %#x, got: %#x), aborting\n", + header->uid_crc8, crc); + return -EINVAL; + } + if (!sw->uid) + sw->uid = header->uid; + sw->vendor = header->vendor_id; + sw->device = header->model_id; + + crc = tb_crc32(sw->drom + TB_DROM_DATA_START, header->data_len); + if (crc != header->data_crc32) { + tb_sw_warn(sw, + "DROM data CRC32 mismatch (expected: %#x, got: %#x), continuing\n", + header->data_crc32, crc); + } + + return tb_drom_parse_entries(sw); +} + +static int usb4_drom_parse(struct tb_switch *sw) +{ + const struct tb_drom_header *header = + (const struct tb_drom_header *)sw->drom; + u32 crc; + + crc = tb_crc32(sw->drom + TB_DROM_DATA_START, header->data_len); + if (crc != header->data_crc32) { + tb_sw_warn(sw, + "DROM data CRC32 mismatch (expected: %#x, got: %#x), aborting\n", + header->data_crc32, crc); + return -EINVAL; + } + + return tb_drom_parse_entries(sw); +} + /** * tb_drom_read() - Copy DROM to sw->drom and parse it * @sw: Router whose DROM to read and parse @@ -534,7 +599,6 @@ static int tb_drom_read_n(struct tb_switch *sw, u16 offset, u8 *val, int tb_drom_read(struct tb_switch *sw) { u16 size; - u32 crc; struct tb_drom_header *header; int res, retries = 1; @@ -599,30 +663,21 @@ int tb_drom_read(struct tb_switch *sw) goto err; } - crc = tb_crc8((u8 *) &header->uid, 8); - if (crc != header->uid_crc8) { - tb_sw_warn(sw, - "drom uid crc8 mismatch (expected: %#x, got: %#x), aborting\n", - header->uid_crc8, crc); - goto err; - } - if (!sw->uid) - sw->uid = header->uid; - sw->vendor = header->vendor_id; - sw->device = header->model_id; + tb_sw_dbg(sw, "DROM version: %d\n", header->device_rom_revision); - crc = tb_crc32(sw->drom + TB_DROM_DATA_START, header->data_len); - if (crc != header->data_crc32) { - tb_sw_warn(sw, - "drom data crc32 mismatch (expected: %#x, got: %#x), continuing\n", - header->data_crc32, crc); + switch (header->device_rom_revision) { + case 3: + res = usb4_drom_parse(sw); + break; + default: + tb_sw_warn(sw, "DROM device_rom_revision %#x unknown\n", +
[PATCH 06/18] thunderbolt: Do not re-establish XDomain DMA paths automatically
This step is actually not needed. The service drivers themselves will handle this once they have negotiated the service up and running again with the remote side. Also dropping this makes it easier to add support for multiple DMA tunnels over a single XDomain connection. Signed-off-by: Mika Westerberg --- drivers/thunderbolt/xdomain.c | 35 ++- include/linux/thunderbolt.h | 2 -- 2 files changed, 2 insertions(+), 35 deletions(-) diff --git a/drivers/thunderbolt/xdomain.c b/drivers/thunderbolt/xdomain.c index 584bb5ec06f8..a1657663a95e 100644 --- a/drivers/thunderbolt/xdomain.c +++ b/drivers/thunderbolt/xdomain.c @@ -946,19 +946,6 @@ static int populate_properties(struct tb_xdomain *xd, return 0; } -/* Called with @xd->lock held */ -static void tb_xdomain_restore_paths(struct tb_xdomain *xd) -{ - if (!xd->resume) - return; - - xd->resume = false; - if (xd->transmit_path) { - dev_dbg(&xd->dev, "re-establishing DMA path\n"); - tb_domain_approve_xdomain_paths(xd->tb, xd); - } -} - static inline struct tb_switch *tb_xdomain_parent(struct tb_xdomain *xd) { return tb_to_switch(xd->dev.parent); @@ -1084,16 +1071,8 @@ static void tb_xdomain_get_properties(struct work_struct *work) mutex_lock(&xd->lock); /* Only accept newer generation properties */ - if (xd->properties && gen <= xd->property_block_gen) { - /* -* On resume it is likely that the properties block is -* not changed (unless the other end added or removed -* services). However, we need to make sure the existing -* DMA paths are restored properly. -*/ - tb_xdomain_restore_paths(xd); + if (xd->properties && gen <= xd->property_block_gen) goto err_free_block; - } dir = tb_property_parse_dir(block, ret); if (!dir) { @@ -1118,8 +1097,6 @@ static void tb_xdomain_get_properties(struct work_struct *work) tb_xdomain_update_link_attributes(xd); - tb_xdomain_restore_paths(xd); - mutex_unlock(&xd->lock); kfree(block); @@ -1332,15 +1309,7 @@ static int __maybe_unused tb_xdomain_suspend(struct device *dev) static int __maybe_unused tb_xdomain_resume(struct device *dev) { - struct tb_xdomain *xd = tb_to_xdomain(dev); - - /* -* Ask tb_xdomain_get_properties() restore any existing DMA -* paths after properties are re-read. -*/ - xd->resume = true; - start_handshake(xd); - + start_handshake(tb_to_xdomain(dev)); return 0; } diff --git a/include/linux/thunderbolt.h b/include/linux/thunderbolt.h index 659a0a810fa1..7ec977161f5c 100644 --- a/include/linux/thunderbolt.h +++ b/include/linux/thunderbolt.h @@ -185,7 +185,6 @@ void tb_unregister_property_dir(const char *key, struct tb_property_dir *dir); * @link_speed: Speed of the link in Gb/s * @link_width: Width of the link (1 or 2) * @is_unplugged: The XDomain is unplugged - * @resume: The XDomain is being resumed * @needs_uuid: If the XDomain does not have @remote_uuid it will be * queried first * @transmit_path: HopID which the remote end expects us to transmit @@ -231,7 +230,6 @@ struct tb_xdomain { unsigned int link_speed; unsigned int link_width; bool is_unplugged; - bool resume; bool needs_uuid; u16 transmit_path; u16 transmit_ring; -- 2.30.1
Re: [regression] Kernel panic on resume from sleep
Looks good so far, but need to wait some more time as the issue was irregular. Do you have any explanation why the calls disorder caused the panic just occasionally? Also, the same (wrong) order I can see in the 3.16 kernel code, but it has worked fine with this kernel in all cases. So what is different in 5.10? Thanks Zbynek On Wed, Mar 3, 2021 at 2:44 AM Jakub Kicinski wrote: > > On Mon, 1 Mar 2021 23:11:05 +0100 Zbynek Michl wrote: > > Hello, > > > > Can anybody help me with the following kernel issue? > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983595 > > > > Do I understand it correctly that the kernel crashes due to the bug in > > the alx driver? > > Order of calls on resume looks suspicious, can you give this a try? > > diff --git a/drivers/net/ethernet/atheros/alx/main.c > b/drivers/net/ethernet/atheros/alx/main.c > index 9b7f1af5f574..9e02f8864593 100644 > --- a/drivers/net/ethernet/atheros/alx/main.c > +++ b/drivers/net/ethernet/atheros/alx/main.c > @@ -1894,13 +1894,16 @@ static int alx_resume(struct device *dev) > > if (!netif_running(alx->dev)) > return 0; > - netif_device_attach(alx->dev); > > rtnl_lock(); > err = __alx_open(alx, true); > rtnl_unlock(); > + if (err) > + return err; > > - return err; > + netif_device_attach(alx->dev); > + > + return 0; > } > > static SIMPLE_DEV_PM_OPS(alx_pm_ops, alx_suspend, alx_resume);
stmmac driver timeout issue
Hello Andrew, Hello Jakub, You may can give some suggestions based on your great networking knowledge, thanks in advance! I found that add vlan id hw filter (stmmac_vlan_rx_add_vid) have possibility timeout when accessing VLAN Filter registers during ifup/ifdown stress test, and restore vlan id hw filter (stmmac_restore_hw_vlan_rx_fltr) always timeout when access VLAN Filter registers. My hardware is i.MX8MP (drivers/net/ethernet/stmicro/stmmac/dwmac-imx.c, RGMII interface, RTL8211FDI-CG PHY), it needs fix mac speed(imx_dwmac_fix_speed), it indirectly involved in phylink_link_up. After debugging, if phylink_link_up is called later than adding vlan id hw filter, it will report timeout, so I guess we need fix mac speed before accessing VLAN Filter registers. Error like below: [ 106.389879] 8021q: adding VLAN 0 to HW filter on device eth1 [ 106.395644] imx-dwmac 30bf.ethernet eth1: Timeout accessing MAC_VLAN_Tag_Filter [ 108.160734] imx-dwmac 30bf.ethernet eth1: Link is Up - 100Mbps/Full - flow control rx/tx ->->-> which means accessing VLAN Filter registers before phylink_link_up is called. Same case when system resume back, [ 1763.842294] imx-dwmac 30bf.ethernet eth1: configuring for phy/rgmii-id link mode [ 1763.853084] imx-dwmac 30bf.ethernet eth1: No Safety Features support found [ 1763.853186] imx-dwmac 30bf.ethernet eth1: Timeout accessing MAC_VLAN_Tag_Filter [ 1763.873465] usb usb1: root hub lost power or was reset [ 1763.873469] usb usb2: root hub lost power or was reset [ 1764.090321] PM: resume devices took 0.248 seconds [ 1764.257381] OOM killer enabled. [ 1764.260518] Restarting tasks ... done. [ 1764.265229] PM: suspend exit === suspend 12 times === [ 1765.887915] imx-dwmac 30bf.ethernet eth1: Link is Up - 100Mbps/Full - flow control rx/tx ->->-> which means accessing VLAN Filter registers before phylink_link_up is called. My question is that some MAC controllers need RXC clock from RGMII interface to reset DAM or access to some registers. If there is any way to ensure phylink_link_up is invoked synchronously when we need it. I am not sure this timeout caused by a fix mac speed is needed before accessing VLAN Filter registers, is there ang hints, thanks a lot! We have another board i.MX8DXL which don't need fix mac speed attach to AR8031 PHY, can't reproduce this issue. Best Regards, Joakim Zhang
[PATCH 1/1] net: usb: qmi_wwan: allow qmimux add/del with master up
There's no reason for preventing the creation and removal of qmimux network interfaces when the underlying interface is up. This makes qmi_wwan mux implementation more similar to the rmnet one, simplifying userspace management of the same logical interfaces. Fixes: c6adf77953bc ("net: usb: qmi_wwan: add qmap mux protocol support") Reported-by: Aleksander Morgado Signed-off-by: Daniele Palmas --- drivers/net/usb/qmi_wwan.c | 14 -- 1 file changed, 14 deletions(-) diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index 17a050521b86..6700f1970b24 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -429,13 +429,6 @@ static ssize_t add_mux_store(struct device *d, struct device_attribute *attr, c goto err; } - /* we don't want to modify a running netdev */ - if (netif_running(dev->net)) { - netdev_err(dev->net, "Cannot change a running device\n"); - ret = -EBUSY; - goto err; - } - ret = qmimux_register_device(dev->net, mux_id); if (!ret) { info->flags |= QMI_WWAN_FLAG_MUX; @@ -465,13 +458,6 @@ static ssize_t del_mux_store(struct device *d, struct device_attribute *attr, c if (!rtnl_trylock()) return restart_syscall(); - /* we don't want to modify a running netdev */ - if (netif_running(dev->net)) { - netdev_err(dev->net, "Cannot change a running device\n"); - ret = -EBUSY; - goto err; - } - del_dev = qmimux_find_dev(dev, mux_id); if (!del_dev) { netdev_err(dev->net, "mux_id not present\n"); -- 2.17.1
Re: [PATCH net 1/2] net: dsa: sja1105: fix SGMII PCS being forced to SPEED_UNKNOWN instead of SPEED_10
On Thu, Mar 04, 2021 at 12:56:53PM +0200, Vladimir Oltean wrote: > From: Vladimir Oltean > > When using MLO_AN_PHY or MLO_AN_FIXED, the MII_BMCR of the SGMII PCS is > read before resetting the switch so it can be reprogrammed afterwards. > This works for the speeds of 1Gbps and 100Mbps, but not for 10Mbps, > because SPEED_10 is actually 0, so AND-ing anything with 0 is false, > therefore that last branch is dead code. > > Do what others do (genphy_read_status_fixed, phy_mii_ioctl) and just > remove the check for SPEED_10, let it fall into the default case. > > Fixes: ffe10e679cec ("net: dsa: sja1105: Add support for the SGMII port") > Signed-off-by: Vladimir Oltean Reviewed-by: Andrew Lunn Andrew
Re: [PATCH net 2/2] net: dsa: sja1105: fix ucast/bcast flooding always remaining enabled
On Thu, Mar 04, 2021 at 12:56:54PM +0200, Vladimir Oltean wrote: > From: Vladimir Oltean > > In the blamed patch I managed to introduce a bug while moving code > around: the same logic is applied to the ucast_egress_floods and > bcast_egress_floods variables both on the "if" and the "else" branches. Some static analysers will report this. > This is clearly an unintended change compared to how the code used to be > prior to that bugfix, so restore it. > > Fixes: 7f7ccdea8c73 ("net: dsa: sja1105: fix leakage of flooded frames > outside bridging domain") > Signed-off-by: Vladimir Oltean Reviewed-by: Andrew Lunn Andrew
Re: [virtio-dev] Re: [PATCH] vdpa/mlx5: set_features should allow reset to zero
On Thu, 4 Mar 2021 16:24:16 +0800 Jason Wang wrote: > On 2021/3/3 4:29 下午, Cornelia Huck wrote: > > On Wed, 3 Mar 2021 12:01:01 +0800 > > Jason Wang wrote: > > > >> On 2021/3/2 8:08 下午, Cornelia Huck wrote: > >>> On Mon, 1 Mar 2021 11:51:08 +0800 > >>> Jason Wang wrote: > >>> > On 2021/3/1 5:25 上午, Michael S. Tsirkin wrote: > > On Fri, Feb 26, 2021 at 04:19:16PM +0800, Jason Wang wrote: > >> On 2021/2/26 2:53 上午, Michael S. Tsirkin wrote: > >>> Confused. What is wrong with the above? It never reads the > >>> field unless the feature has been offered by device. > >> So the spec said: > >> > >> " > >> > >> The following driver-read-only field, max_virtqueue_pairs only exists > >> if > >> VIRTIO_NET_F_MQ is set. > >> > >> " > >> > >> If I read this correctly, there will be no max_virtqueue_pairs field > >> if the > >> VIRTIO_NET_F_MQ is not offered by device? If yes the offsetof() > >> violates > >> what spec said. > >> > >> Thanks > > I think that's a misunderstanding. This text was never intended to > > imply that field offsets change beased on feature bits. > > We had this pain with legacy and we never wanted to go back there. > > > > This merely implies that without VIRTIO_NET_F_MQ the field > > should not be accessed. Exists in the sense "is accessible to driver". > > > > Let's just clarify that in the spec, job done. > Ok, agree. That will make things more eaiser. > >>> Yes, that makes much more sense. > >>> > >>> What about adding the following to the "Basic Facilities of a Virtio > >>> Device/Device Configuration Space" section of the spec: > >>> > >>> "If an optional configuration field does not exist, the corresponding > >>> space is still present, but reserved." > >> > >> This became interesting after re-reading some of the qemu codes. > >> > >> E.g in virtio-net.c we had: > >> > >> *static VirtIOFeature feature_sizes[] = { > >> {.flags = 1ULL << VIRTIO_NET_F_MAC, > >> .end = endof(struct virtio_net_config, mac)}, > >> {.flags = 1ULL << VIRTIO_NET_F_STATUS, > >> .end = endof(struct virtio_net_config, status)}, > >> {.flags = 1ULL << VIRTIO_NET_F_MQ, > >> .end = endof(struct virtio_net_config, max_virtqueue_pairs)}, > >> {.flags = 1ULL << VIRTIO_NET_F_MTU, > >> .end = endof(struct virtio_net_config, mtu)}, > >> {.flags = 1ULL << VIRTIO_NET_F_SPEED_DUPLEX, > >> .end = endof(struct virtio_net_config, duplex)}, > >> {.flags = (1ULL << VIRTIO_NET_F_RSS) | (1ULL << > >> VIRTIO_NET_F_HASH_REPORT), > >> .end = endof(struct virtio_net_config, supported_hash_types)}, > >> {} > >> };* > >> > >> *It has a implict dependency chain. E.g MTU doesn't presnet if > >> DUPLEX/RSS is not offered ... > >> * > > But I think it covers everything up to the relevant field, no? So MTU > > is included if we have the feature bit, even if we don't have > > DUPLEX/RSS. > > > > Given that a config space may be shorter (but must not collapse > > non-existing fields), maybe a better wording would be: > > > > "If an optional configuration field does not exist, the corresponding > > space will still be present if it is not at the end of the > > configuration space (i.e., further configuration fields exist.) > > > This should work but I think we need to define the end of configuration > space first? What about sidestepping this: "...the corresponding space will still be present, unless no further configuration fields exist." ? > > > This > > implies that a given field, if it exists, is always at the same offset > > from the beginning of the configuration space."
Re: [PATCH/v5] bpf: add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_ENCAP_L2_ETH
On Thu, Mar 4, 2021 at 1:41 AM Xuesen Huang wrote: > > From: Xuesen Huang > > bpf_skb_adjust_room sets the inner_protocol as skb->protocol for packets > encapsulation. But that is not appropriate when pushing Ethernet header. > > Add an option to further specify encap L2 type and set the inner_protocol > as ETH_P_TEB. > > Suggested-by: Willem de Bruijn > Signed-off-by: Xuesen Huang > Signed-off-by: Zhiyong Cheng > Signed-off-by: Li Wang Acked-by: Willem de Bruijn
Re: [PATCH] selftests_bpf: extend test_tc_tunnel test with vxlan
On Thu, Mar 4, 2021 at 1:42 AM Xuesen Huang wrote: > > From: Xuesen Huang > > Add BPF_F_ADJ_ROOM_ENCAP_L2_ETH flag to the existing tests which > encapsulates the ethernet as the inner l2 header. > > Update a vxlan encapsulation test case. > > Signed-off-by: Xuesen Huang > Signed-off-by: Li Wang > Signed-off-by: Willem de Bruijn Please mark patch target: [PATCH bpf-next] > --- > tools/testing/selftests/bpf/progs/test_tc_tunnel.c | 113 > ++--- > tools/testing/selftests/bpf/test_tc_tunnel.sh | 15 ++- > 2 files changed, 111 insertions(+), 17 deletions(-) > -static __always_inline int encap_ipv4(struct __sk_buff *skb, __u8 > encap_proto, > - __u16 l2_proto) > +static __always_inline int __encap_ipv4(struct __sk_buff *skb, __u8 > encap_proto, > + __u16 l2_proto, __u16 ext_proto) > { > __u16 udp_dst = UDP_PORT; > struct iphdr iph_inner; > struct v4hdr h_outer; > struct tcphdr tcph; > int olen, l2_len; > + __u8 *l2_hdr = NULL; > int tcp_off; > __u64 flags; > > @@ -141,7 +157,11 @@ static __always_inline int encap_ipv4(struct __sk_buff > *skb, __u8 encap_proto, > break; > case ETH_P_TEB: > l2_len = ETH_HLEN; > - udp_dst = ETH_OVER_UDP_PORT; > + if (ext_proto & EXTPROTO_VXLAN) { > + udp_dst = VXLAN_UDP_PORT; > + l2_len += sizeof(struct vxlanhdr); > + } else > + udp_dst = ETH_OVER_UDP_PORT; > break; > } > flags |= BPF_F_ADJ_ROOM_ENCAP_L2(l2_len); > @@ -171,14 +191,26 @@ static __always_inline int encap_ipv4(struct __sk_buff > *skb, __u8 encap_proto, > } > > /* add L2 encap (if specified) */ > + l2_hdr = (__u8 *)&h_outer + olen; > switch (l2_proto) { > case ETH_P_MPLS_UC: > - *((__u32 *)((__u8 *)&h_outer + olen)) = mpls_label; > + *(__u32 *)l2_hdr = mpls_label; > break; > case ETH_P_TEB: > - if (bpf_skb_load_bytes(skb, 0, (__u8 *)&h_outer + olen, > - ETH_HLEN)) > + flags |= BPF_F_ADJ_ROOM_ENCAP_L2_ETH; > + > + if (ext_proto & EXTPROTO_VXLAN) { > + struct vxlanhdr *vxlan_hdr = (struct vxlanhdr > *)l2_hdr; > + > + vxlan_hdr->vx_flags = VXLAN_FLAGS; > + vxlan_hdr->vx_vni = bpf_htonl((VXLAN_VNI & > VXLAN_VNI_MASK) << 8); > + > + l2_hdr += sizeof(struct vxlanhdr); should this be l2_len? (here and ipv6 below) > +SEC("encap_vxlan_eth") > +int __encap_vxlan_eth(struct __sk_buff *skb) > +{ > + if (skb->protocol == __bpf_constant_htons(ETH_P_IP)) > + return __encap_ipv4(skb, IPPROTO_UDP, > + ETH_P_TEB, > + EXTPROTO_VXLAN); non-standard indentation: align with the opening parenthesis. (here and ipv6 below)
[PATCH] net: mellanox: mlx5: fix error return code in mlx5_fpga_device_start()
When mlx5_is_fpga_lookaside() returns a non-zero value, no error return code is assigned. To fix this bug, err is assigned with -EINVAL as error return code. Reported-by: TOTE Robot Signed-off-by: Jia-Ju Bai --- drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c b/drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c index 2ce4241459ce..c9e6da97126f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c @@ -198,8 +198,10 @@ int mlx5_fpga_device_start(struct mlx5_core_dev *mdev) mlx5_fpga_info(fdev, "FPGA card %s:%u\n", mlx5_fpga_name(fpga_id), fpga_id); /* No QPs if FPGA does not participate in net processing */ - if (mlx5_is_fpga_lookaside(fpga_id)) + if (mlx5_is_fpga_lookaside(fpga_id)) { + err = -EINVAL; goto out; + } mlx5_fpga_info(fdev, "%s(%d): image, version %u; SBU %06x:%04x version %d\n", mlx5_fpga_image_name(fdev->last_oper_image), -- 2.17.1
Re: [PATCH 10/11] pragma once: delete few backslashes
On 28/02/2021 17:05, Alexey Dobriyan wrote: > From 251ca5673886b5bb0a42004944290b9d2b267a4a Mon Sep 17 00:00:00 2001 > From: Alexey Dobriyan > Date: Fri, 19 Feb 2021 13:37:24 +0300 > Subject: [PATCH 10/11] pragma once: delete few backslashes > > Some macros contain one backslash too many and end up being the last > macro in a header file. When #pragma once conversion script truncates > the last #endif and whitespace before it, such backslash triggers > a warning about "OMG file ends up in a backslash-newline". > > Needless to say I don't want to handle another case in my script, > so delete useless backslashes instead. > > Signed-off-by: Alexey Dobriyan > --- > arch/arc/include/asm/cacheflush.h | 2 +- > drivers/net/ethernet/mellanox/mlxsw/item.h | 2 +- > include/linux/once.h | 2 +- > include/media/drv-intf/exynos-fimc.h | 2 +- > 4 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/arch/arc/include/asm/cacheflush.h > b/arch/arc/include/asm/cacheflush.h > index e201b4b1655a..46704c341b17 100644 > --- a/arch/arc/include/asm/cacheflush.h > +++ b/arch/arc/include/asm/cacheflush.h > @@ -112,6 +112,6 @@ do { > \ > } while (0) > > #define copy_from_user_page(vma, page, vaddr, dst, src, len) \ > - memcpy(dst, src, len); \ > + memcpy(dst, src, len) > This changebar also removes a semicolon. It looks plausibly correct, but the commit message ought to mention it. -ed
Re: [PATCH] net: mac802154: Fix null pointer dereference
Hi, On Thu, 4 Mar 2021 at 04:23, Pavel Skripkin wrote: ... > > > > I think this need to be: > > > > if (!IS_ERR_OR_NULL(key->tfm[i])) > > > > otherwise we still run into issues for the current iterator when > > key->tfm[i] is in range of IS_ERR(). > > Oh... I got it completly wrong, I'm sorry. If it's still not fixed, > I'll send rigth patch for that. > please resend your patch. We will review again. - Alex
Re: WARNING in carl9170_usb_submit_cmd_urb/usb_submit_urb
syzbot has bisected this issue to: commit 6a66a7ded12baa6ebbb2e3e82f8cb91382814839 Author: zhangyi (F) Date: Thu Feb 13 06:38:20 2020 + jbd2: move the clearing of b_modified flag to the journal_unmap_buffer() bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14bd498ed0 start commit: f69d02e3 Merge tag 'misc-5.12-2021-03-02' of git://git.ker.. git tree: upstream final oops: https://syzkaller.appspot.com/x/report.txt?x=16bd498ed0 console output: https://syzkaller.appspot.com/x/log.txt?x=12bd498ed0 kernel config: https://syzkaller.appspot.com/x/.config?x=e0da2d01cc636e2c dashboard link: https://syzkaller.appspot.com/bug?extid=9468df99cb63a4a4c4e1 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11770346d0 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17be69ccd0 Reported-by: syzbot+9468df99cb63a4a4c...@syzkaller.appspotmail.com Fixes: 6a66a7ded12b ("jbd2: move the clearing of b_modified flag to the journal_unmap_buffer()") For information about bisection process see: https://goo.gl/tpsmEJ#bisection
Re: [PATCH v4 0/6] can: c_can: add support to 64 message objects
On 02.03.2021 22:54:29, Dario Binacchi wrote: > > The D_CAN controller supports up to 128 messages. Until now the driver > only managed 32 messages although Sitara processors and DRA7 SOC can > handle 64. > > The series was tested on a beaglebone board. > > Note: > I have not changed the type of tx_field (belonging to the c_can_priv > structure) to atomic64_t because I think the atomic_t type has size > of at least 32 bits on x86 and arm, which is enough to handle 64 > messages. > http://marc.info/?l=linux-can&m=139746476821294&w=2 reports the results > of tests performed just on x86 and arm architectures. Applied to linux-can-next/testing. I've added some cleanup patches to the series. I'll send it around soon, please test. regards, Marc -- Pengutronix e.K. | Marc Kleine-Budde | Embedded Linux | https://www.pengutronix.de | Vertretung West/Dortmund | Phone: +49-231-2826-924 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- | signature.asc Description: PGP signature
Re: [PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
On 2021-03-01 08:42, Christoph Hellwig wrote: Use explicit methods for setting and querying the information instead. Now that everyone's using iommu-dma, is there any point in bouncing this through the drivers at all? Seems like it would make more sense for the x86 drivers to reflect their private options back to iommu_dma_strict (and allow Intel's caching mode to override it as well), then have iommu_dma_init_domain just test !iommu_dma_strict && domain->ops->flush_iotlb_all. Robin. Also remove the now unused iommu_domain_get_attr functionality. Signed-off-by: Christoph Hellwig --- drivers/iommu/amd/iommu.c | 23 ++--- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 47 ++--- drivers/iommu/arm/arm-smmu/arm-smmu.c | 56 + drivers/iommu/dma-iommu.c | 8 ++- drivers/iommu/intel/iommu.c | 27 ++ drivers/iommu/iommu.c | 19 +++ include/linux/iommu.h | 17 ++- 7 files changed, 51 insertions(+), 146 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index a69a8b573e40d0..37a8e51db17656 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -1771,24 +1771,11 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev) return acpihid_device_group(dev); } -static int amd_iommu_domain_get_attr(struct iommu_domain *domain, - enum iommu_attr attr, void *data) +static bool amd_iommu_dma_use_flush_queue(struct iommu_domain *domain) { - switch (domain->type) { - case IOMMU_DOMAIN_UNMANAGED: - return -ENODEV; - case IOMMU_DOMAIN_DMA: - switch (attr) { - case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: - *(int *)data = !amd_iommu_unmap_flush; - return 0; - default: - return -ENODEV; - } - break; - default: - return -EINVAL; - } + if (domain->type != IOMMU_DOMAIN_DMA) + return false; + return !amd_iommu_unmap_flush; } /* @@ -2257,7 +2244,7 @@ const struct iommu_ops amd_iommu_ops = { .release_device = amd_iommu_release_device, .probe_finalize = amd_iommu_probe_finalize, .device_group = amd_iommu_device_group, - .domain_get_attr = amd_iommu_domain_get_attr, + .dma_use_flush_queue = amd_iommu_dma_use_flush_queue, .get_resv_regions = amd_iommu_get_resv_regions, .put_resv_regions = generic_iommu_put_resv_regions, .is_attach_deferred = amd_iommu_is_attach_deferred, diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 8594b4a8304375..bf96172e8c1f71 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2449,33 +2449,21 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev) return group; } -static int arm_smmu_domain_get_attr(struct iommu_domain *domain, - enum iommu_attr attr, void *data) +static bool arm_smmu_dma_use_flush_queue(struct iommu_domain *domain) { struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); - switch (domain->type) { - case IOMMU_DOMAIN_UNMANAGED: - switch (attr) { - case DOMAIN_ATTR_NESTING: - *(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED); - return 0; - default: - return -ENODEV; - } - break; - case IOMMU_DOMAIN_DMA: - switch (attr) { - case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: - *(int *)data = smmu_domain->non_strict; - return 0; - default: - return -ENODEV; - } - break; - default: - return -EINVAL; - } + if (domain->type != IOMMU_DOMAIN_DMA) + return false; + return smmu_domain->non_strict; +} + + +static void arm_smmu_dma_enable_flush_queue(struct iommu_domain *domain) +{ + if (domain->type != IOMMU_DOMAIN_DMA) + return; + to_smmu_domain(domain)->non_strict = true; } static int arm_smmu_domain_set_attr(struct iommu_domain *domain, @@ -2505,13 +2493,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain, } break; case IOMMU_DOMAIN_DMA: - switch(attr) { - case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: - smmu_domain->non_strict = *(int *)data; - break; - default: - ret = -EN
[PATCH v2] net: mac802154: Fix general protection fault
syzbot found general protection fault in crypto_destroy_tfm()[1]. It was caused by wrong clean up loop in llsec_key_alloc(). If one of the tfm array members is in IS_ERR() range it will cause general protection fault in clean up function [1]. Call Trace: crypto_free_aead include/crypto/aead.h:191 [inline] [1] llsec_key_alloc net/mac802154/llsec.c:156 [inline] mac802154_llsec_key_add+0x9e0/0xcc0 net/mac802154/llsec.c:249 ieee802154_add_llsec_key+0x56/0x80 net/mac802154/cfg.c:338 rdev_add_llsec_key net/ieee802154/rdev-ops.h:260 [inline] nl802154_add_llsec_key+0x3d3/0x560 net/ieee802154/nl802154.c:1584 genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 genl_rcv+0x24/0x40 net/netlink/genetlink.c:811 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae Signed-off-by: Pavel Skripkin Reported-by: syzbot+9ec037722d2603a9f...@syzkaller.appspotmail.com Change-Id: I29f7ac641a039096d63d1e6070bb32cb5a3beb07 --- net/mac802154/llsec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/mac802154/llsec.c b/net/mac802154/llsec.c index 585d33144c33..0ead2ced 100644 --- a/net/mac802154/llsec.c +++ b/net/mac802154/llsec.c @@ -152,7 +152,7 @@ llsec_key_alloc(const struct ieee802154_llsec_key *template) crypto_free_sync_skcipher(key->tfm0); err_tfm: for (i = 0; i < ARRAY_SIZE(key->tfm); i++) - if (key->tfm[i]) + if (!IS_ERR_OR_NULL(key->tfm[i])) crypto_free_aead(key->tfm[i]); kfree_sensitive(key); -- 2.25.1
Re: [PATCH v4 5/6] can: c_can: prepare to up the message objects number
On 02.03.2021 22:54:34, Dario Binacchi wrote: > diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c > index 77b9aee56154..0052ba5197e0 100644 > --- a/drivers/net/can/c_can/c_can.c > +++ b/drivers/net/can/c_can/c_can.c [...] > -struct net_device *alloc_c_can_dev(void) > +struct net_device *alloc_c_can_dev(int msg_obj_num) > { > struct net_device *dev; > struct c_can_priv *priv; > + int msg_obj_tx_num = msg_obj_num / 2; > > - dev = alloc_candev(sizeof(struct c_can_priv), C_CAN_MSG_OBJ_TX_NUM); > + dev = alloc_candev(sizeof(*priv) + sizeof(u32) * msg_obj_tx_num, > +msg_obj_tx_num); I've converted this to make use of the struct_size() macro: + dev = alloc_candev(struct_size(priv, dlc, msg_obj_tx_num), + msg_obj_tx_num); regards, Marc -- Pengutronix e.K. | Marc Kleine-Budde | Embedded Linux | https://www.pengutronix.de | Vertretung West/Dortmund | Phone: +49-231-2826-924 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917- | signature.asc Description: PGP signature
Re: [Patch bpf-next] skmsg: add function doc for skb->_sk_redir
Hello: This patch was applied to bpf/bpf-next.git (refs/heads/master): On Mon, 1 Mar 2021 10:48:05 -0800 you wrote: > From: Cong Wang > > This should fix the following warning: > > include/linux/skbuff.h:932: warning: Function parameter or member > '_sk_redir' not described in 'sk_buff' > > [...] Here is the summary with links: - [bpf-next] skmsg: add function doc for skb->_sk_redir https://git.kernel.org/bpf/bpf-next/c/6ed6e1c761f6 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
Re: [PATCH 16/17] iommu: remove DOMAIN_ATTR_IO_PGTABLE_CFG
On 2021-03-01 08:42, Christoph Hellwig wrote: Signed-off-by: Christoph Hellwig Moreso than the previous patch, where the feature is at least relatively generic (note that there's a bunch of in-flight development around DOMAIN_ATTR_NESTING), I'm really not convinced that it's beneficial to bloat the generic iommu_ops structure with private driver-specific interfaces. The attribute interface is a great compromise for these kinds of things, and you can easily add type-checked wrappers around it for external callers (maybe even make the actual attributes internal between the IOMMU core and drivers) if that's your concern. Robin. --- drivers/gpu/drm/msm/adreno/adreno_gpu.c | 2 +- drivers/iommu/arm/arm-smmu/arm-smmu.c | 40 +++-- drivers/iommu/iommu.c | 9 ++ include/linux/iommu.h | 9 +- 4 files changed, 29 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index 0f184c3dd9d9ec..78d98ab2ee3a68 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -191,7 +191,7 @@ void adreno_set_llc_attributes(struct iommu_domain *iommu) struct io_pgtable_domain_attr pgtbl_cfg; pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_ARM_OUTER_WBWA; - iommu_domain_set_attr(iommu, DOMAIN_ATTR_IO_PGTABLE_CFG, &pgtbl_cfg); + iommu_domain_set_pgtable_attr(iommu, &pgtbl_cfg); } struct msm_gem_address_space * diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index 2e17d990d04481..2858999c86dfd1 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -1515,40 +1515,22 @@ static int arm_smmu_domain_enable_nesting(struct iommu_domain *domain) return ret; } -static int arm_smmu_domain_set_attr(struct iommu_domain *domain, - enum iommu_attr attr, void *data) +static int arm_smmu_domain_set_pgtable_attr(struct iommu_domain *domain, + struct io_pgtable_domain_attr *pgtbl_cfg) { - int ret = 0; struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); + int ret = -EPERM; - mutex_lock(&smmu_domain->init_mutex); - - switch(domain->type) { - case IOMMU_DOMAIN_UNMANAGED: - switch (attr) { - case DOMAIN_ATTR_IO_PGTABLE_CFG: { - struct io_pgtable_domain_attr *pgtbl_cfg = data; - - if (smmu_domain->smmu) { - ret = -EPERM; - goto out_unlock; - } + if (domain->type != IOMMU_DOMAIN_UNMANAGED) + return -EINVAL; - smmu_domain->pgtbl_cfg = *pgtbl_cfg; - break; - } - default: - ret = -ENODEV; - } - break; - case IOMMU_DOMAIN_DMA: - ret = -ENODEV; - break; - default: - ret = -EINVAL; + mutex_lock(&smmu_domain->init_mutex); + if (!smmu_domain->smmu) { + smmu_domain->pgtbl_cfg = *pgtbl_cfg; + ret = 0; } -out_unlock: mutex_unlock(&smmu_domain->init_mutex); + return ret; } @@ -1609,7 +1591,7 @@ static struct iommu_ops arm_smmu_ops = { .device_group = arm_smmu_device_group, .dma_use_flush_queue= arm_smmu_dma_use_flush_queue, .dma_enable_flush_queue = arm_smmu_dma_enable_flush_queue, - .domain_set_attr= arm_smmu_domain_set_attr, + .domain_set_pgtable_attr = arm_smmu_domain_set_pgtable_attr, .domain_enable_nesting = arm_smmu_domain_enable_nesting, .of_xlate = arm_smmu_of_xlate, .get_resv_regions = arm_smmu_get_resv_regions, diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 2e9e058501a953..8490aefd4b41f8 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -2693,6 +2693,15 @@ int iommu_domain_enable_nesting(struct iommu_domain *domain) } EXPORT_SYMBOL_GPL(iommu_domain_enable_nesting); +int iommu_domain_set_pgtable_attr(struct iommu_domain *domain, + struct io_pgtable_domain_attr *pgtbl_cfg) +{ + if (!domain->ops->domain_set_pgtable_attr) + return -EINVAL; + return domain->ops->domain_set_pgtable_attr(domain, pgtbl_cfg); +} +EXPORT_SYMBOL_GPL(iommu_domain_set_pgtable_attr); + void iommu_get_resv_regions(struct device *dev, struct list_head *list) { const struct iommu_ops *ops = dev->bus->iommu_ops; diff --git a/include/linux/iommu.h b/include/linux/iommu.h index aed88aa3bd3edf..39d3ed4d2700ac 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -40,6 +40,7 @@ struct iommu_domain; struct notifier_block; struct iommu_sva; struct iommu_fault_event; +stru
Re: [PATCH bpf-next] selftests/bpf: Fix test_attach_probe for powerpc uprobes
On Thu, Mar 04, 2021 at 11:46:27AM +1100, Michael Ellerman wrote: > "Naveen N. Rao" writes: > > On 2021/03/02 11:35AM, Jiri Olsa wrote: > >> On Mon, Mar 01, 2021 at 02:58:53PM -0800, Yonghong Song wrote: > >> > > >> > > >> > On 3/1/21 11:04 AM, Jiri Olsa wrote: > >> > > When testing uprobes we the test gets GEP (Global Entry Point) > >> > > address from kallsyms, but then the function is called locally > >> > > so the uprobe is not triggered. > >> > > > >> > > Fixing this by adjusting the address to LEP (Local Entry Point) > >> > > for powerpc arch. > >> > > > >> > > Signed-off-by: Jiri Olsa > >> > > --- > >> > > .../selftests/bpf/prog_tests/attach_probe.c| 18 > >> > > +- > >> > > 1 file changed, 17 insertions(+), 1 deletion(-) > >> > > > >> > > diff --git a/tools/testing/selftests/bpf/prog_tests/attach_probe.c > >> > > b/tools/testing/selftests/bpf/prog_tests/attach_probe.c > >> > > index a0ee87c8e1ea..c3cfb48d3ed0 100644 > >> > > --- a/tools/testing/selftests/bpf/prog_tests/attach_probe.c > >> > > +++ b/tools/testing/selftests/bpf/prog_tests/attach_probe.c > >> > > @@ -2,6 +2,22 @@ > >> > > #include > >> > > #include "test_attach_probe.skel.h" > >> > > +#if defined(__powerpc64__) > > > > This needs to be specific to ELF v2 ABI, so you'll need to check > > _CALL_ELF. See commit d5c2e2c17ae1d6 ("perf probe ppc64le: Prefer symbol > > table lookup over DWARF") for an example. > > > >> > > +/* > >> > > + * We get the GEP (Global Entry Point) address from kallsyms, > >> > > + * but then the function is called locally, so we need to adjust > >> > > + * the address to get LEP (Local Entry Point). > >> > > >> > Any documentation in the kernel about this behavior? This will > >> > help to validate the change without trying with powerpc64 qemu... > > > > I don't think we have documented this in the kernel anywhere, but this > > is specific to the ELF v2 ABI and is described there: > > - 2.3.2.1. Function Prologue: > > > > http://cdn.openpowerfoundation.org/wp-content/uploads/resources/leabi/content/dbdoclet.50655240___RefHeading___Toc377640597.html > > - 3.4.1. Symbol Values: > > > > http://cdn.openpowerfoundation.org/wp-content/uploads/resources/leabi/content/dbdoclet.50655241_95185.html > > There's a comment in ppc_function_entry(), but I don't think we have any > actual "documentation". > > static inline unsigned long ppc_function_entry(void *func) > { > #ifdef PPC64_ELF_ABI_v2 > u32 *insn = func; > > /* >* A PPC64 ABIv2 function may have a local and a global entry >* point. We need to use the local entry point when patching >* functions, so identify and step over the global entry point >* sequence. hm, so I need to do the instructions check below as well >* >* The global entry point sequence is always of the form: >* >* addis r2,r12, >* addi r2,r2, >* >* A linker optimisation may convert the addis to lis: >* >* lis r2, >* addi r2,r2, >*/ > if *insn & OP_RT_RA_MASK) == ADDIS_R2_R12) || >((*insn & OP_RT_RA_MASK) == LIS_R2)) && > ((*(insn+1) & OP_RT_RA_MASK) == ADDI_R2_R2)) is this check/instructions specific to kernel code? In the test prog I see following instructions: Dump of assembler code for function get_base_addr: 0x10034cb0 <+0>: lis r2,4256 0x10034cb4 <+4>: addir2,r2,31488 ... but first instruction does not match the check in kernel code above: 1.insn value: 0x3c4010a0 2.insn value: 0x38427b00 the used defines are: #define OP_RT_RA_MASK 0xUL #define LIS_R2 0x3c02UL #define ADDIS_R2_R120x3c4cUL #define ADDI_R2_R2 0x3842UL maybe we could skip the check, and run the test twice: first on kallsym address and if the uprobe is not hit we will run it again on address + 8 thanks, jirka > return (unsigned long)(insn + 2); > else > return (unsigned long)func; > > > cheers >
Re: [PATCH 3/3] PCI: Convert rtw88 power cycle quirk to shutdown quirk
[+cc Rafael, linux-pm] On Thu, Mar 04, 2021 at 02:07:18PM +0800, Kai-Heng Feng wrote: > On Sat, Feb 27, 2021 at 2:17 AM Bjorn Helgaas wrote: > > On Fri, Feb 26, 2021 at 02:31:31PM +0100, Heiner Kallweit wrote: > > > On 26.02.2021 13:18, Kai-Heng Feng wrote: > > > > On Fri, Feb 26, 2021 at 8:10 PM Heiner Kallweit > > > > wrote: > > > >> > > > >> On 26.02.2021 08:12, Kalle Valo wrote: > > > >>> Kai-Heng Feng writes: > > > >>> > > > Now we have a generic D3 shutdown quirk, so convert the original > > > approach to a PCI quirk. > > > > > > Signed-off-by: Kai-Heng Feng > > > --- > > > drivers/net/wireless/realtek/rtw88/pci.c | 2 -- > > > drivers/pci/quirks.c | 6 ++ > > > 2 files changed, 6 insertions(+), 2 deletions(-) > > > >>> > > > >>> It would have been nice to CC linux-wireless also on patches 1-2. I > > > >>> only > > > >>> saw patch 3 and had to search the rest of patches from lkml. > > > >>> > > > >>> I assume this goes via the PCI tree so: > > > >>> > > > >>> Acked-by: Kalle Valo > > > >> > > > >> To me it looks odd to (mis-)use the quirk mechanism to set a device > > > >> to D3cold on shutdown. As I see it the quirk mechanism is used to work > > > >> around certain device misbehavior. And setting a device to a D3 > > > >> state on shutdown is a normal activity, and the shutdown() callback > > > >> seems to be a good place for it. > > > >> I miss an explanation what the actual benefit of the change is. > > > > > > > > To make putting device to D3 more generic, as there are more than one > > > > device need the quirk. > > > > > > > > Here's the discussion: > > > > https://lore.kernel.org/linux-usb/00de6927-3fa6-a9a3-2d65-2b4d4e8f0...@linux.intel.com/ > > > > > > > > > > Thanks for the link. For the AMD USB use case I don't have a strong > > > opinion, > > > what's considered the better option may be a question of personal taste. > > > For rtw88 however I'd still consider it over-engineering to replace a > > > simple > > > call to pci_set_power_state() with a PCI quirk. > > > I may be biased here because I find it sometimes bothering if I want to > > > look up how a device is handled and in addition to checking the respective > > > driver I also have to grep through quirks.c whether there's any special > > > handling. > > > > I haven't looked at these patches carefully, but in general, I agree > > that quirks should be used to work around hardware defects in the > > device. If the device behaves correctly per spec, we should use a > > different mechanism so the code remains generic and all devices get > > the benefit. > > > > If we do add quirks, the commit log should explain what the device > > defect is. > > So maybe it's reasonable to put all PCI devices to D3 at shutdown? I don't know off-hand. I added Rafael and linux-pm in case they do. If not, I suggest working up a patch to do that and a commit log that explains why that's a good idea and then we can have a discussion about it. This thread really doesn't have that justification. It says "putting device X in D3cold at shutdown saves 0.03w while in S5", but doesn't explain why that's safe or desirable for all devices. Bjorn
[PATCH net] ibmvnic: always store valid MAC address
The last change to ibmvnic_set_mac(), 8fc3672a8ad3, meant to prevent users from setting an invalid MAC address on an ibmvnic interface that has not been brought up yet. The change also prevented the requested MAC address from being stored by the adapter object for an ibmvnic interface when the state of the ibmvnic interface is VNIC_PROBED - that is after probing has finished but before the ibmvnic interface is brought up. The MAC address stored by the adapter object is used and sent to the hypervisor for checking when an ibmvnic interface is brought up. The ibmvnic driver ignoring the requested MAC address when in VNIC_PROBED state caused LACP bonds (bonds in 802.3ad mode) with more than one slave to malfunction. The bonding code must be able to change the MAC address of its slaves before they are brought up during enslaving. The inability of kernels with 8fc3672a8ad3 to set the MAC addresses of bonding slaves is observable in the output of "ip address show". The MAC addresses of the slaves are the same as the MAC address of the bond on a working system whereas the slaves retain their original MAC addresses on a system with a malfunctioning LACP bond. Fixes: 8fc3672a8ad3 ("ibmvnic: fix ibmvnic_set_mac") Signed-off-by: Jiri Wiesner --- drivers/net/ethernet/ibm/ibmvnic.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 3bad762083c5..b6102ccf9b90 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -1906,10 +1906,9 @@ static int ibmvnic_set_mac(struct net_device *netdev, void *p) if (!is_valid_ether_addr(addr->sa_data)) return -EADDRNOTAVAIL; - if (adapter->state != VNIC_PROBED) { - ether_addr_copy(adapter->mac_addr, addr->sa_data); + ether_addr_copy(adapter->mac_addr, addr->sa_data); + if (adapter->state != VNIC_PROBED) rc = __ibmvnic_set_mac(netdev, addr->sa_data); - } return rc; } -- 2.26.2
Re: [PATCH] scsi: target: vhost-scsi: remove redundant initialization of variable ret
On Wed, Mar 03, 2021 at 01:43:39PM +, Colin King wrote: > From: Colin Ian King > > The variable ret is being initialized with a value that is never read > and it is being updated later with a new value. The initialization is > redundant and can be removed. > > Addresses-Coverity: ("Unused value") > Signed-off-by: Colin Ian King > --- > drivers/vhost/scsi.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Which kernel version is this patch based on? If it's a fix for a patch that hasn't landed yet, please indicate this. A "Fixes: ..." tag should be added to this patch as well. I looked at linux.git/master commit f69d02e37a85645aa90d18cacfff36dba370f797 and see this: static int __init vhost_scsi_init(void) { int ret = -ENOMEM; pr_debug("TCM_VHOST fabric module %s on %s/%s" " on "UTS_RELEASE"\n", VHOST_SCSI_VERSION, utsname()->sysname, utsname()->machine); /* * Use our own dedicated workqueue for submitting I/O into * target core to avoid contention within system_wq. */ vhost_scsi_workqueue = alloc_workqueue("vhost_scsi", 0, 0); if (!vhost_scsi_workqueue) goto out; We need ret's initialization value here ^ ret = vhost_scsi_register(); if (ret < 0) goto out_destroy_workqueue; ret = target_register_template(&vhost_scsi_ops); if (ret < 0) goto out_vhost_scsi_deregister; return 0; out_vhost_scsi_deregister: vhost_scsi_deregister(); out_destroy_workqueue: destroy_workqueue(vhost_scsi_workqueue); out: return ret; }; > > diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c > index d16c04dcc144..9129ab8187fd 100644 > --- a/drivers/vhost/scsi.c > +++ b/drivers/vhost/scsi.c > @@ -2465,7 +2465,7 @@ static const struct target_core_fabric_ops > vhost_scsi_ops = { > > static int __init vhost_scsi_init(void) > { > - int ret = -ENOMEM; > + int ret; > > pr_debug("TCM_VHOST fabric module %s on %s/%s" > " on "UTS_RELEASE"\n", VHOST_SCSI_VERSION, utsname()->sysname, > -- > 2.30.0 > signature.asc Description: PGP signature
Minor update to bridge-utils
Small changes to bridge-utils to address some minor issues. 1. The default branch is main not master 2. Fixed some compiler warnings because Gcc 10 and Clang now do checks for string overflow. 3. Made a backup repository mirror at github. 4. Fixed version string printed This is not a required update, none of these are important or critical to users. Do not intend to do any new features or serious fixes to bridge-utils; all users should convert to iproute2/bridge command.
Re: [PATCH 1/1] net: usb: qmi_wwan: allow qmimux add/del with master up
Daniele Palmas writes: > There's no reason for preventing the creation and removal > of qmimux network interfaces when the underlying interface > is up. > > This makes qmi_wwan mux implementation more similar to the > rmnet one, simplifying userspace management of the same > logical interfaces. > > Fixes: c6adf77953bc ("net: usb: qmi_wwan: add qmap mux protocol support") > Reported-by: Aleksander Morgado > Signed-off-by: Daniele Palmas Acked-by: Bjørn Mork
Re: [regression] Kernel panic on resume from sleep
On Thu, 4 Mar 2021 13:50:01 +0100 Zbynek Michl wrote: > Looks good so far, but need to wait some more time as the issue was irregular. > > Do you have any explanation why the calls disorder caused the panic > just occasionally? Depends if kernel attempts to try to send a packet before __alx_open() finishes. You can probably make it more likely by running trafgen, iperf or such while suspending and resuming? > Also, the same (wrong) order I can see in the 3.16 kernel code, but it > has worked fine with this kernel in all cases. So what is different in > 5.10? At some point in between those versions the driver got modified to allocate and free the NAPI structures dynamically. I didn't look too closely to find out if things indeed worked 100% correctly before, but now they will reliably crash on a NULL pointer dereference if transmission comes before open is done.
[PATCH net] netdevsim: init u64 stats for 32bit hardware
From: Hillf Danton Init the u64 stats in order to avoid the lockdep prints on the 32bit hardware like INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 0 PID: 4695 Comm: syz-executor.0 Not tainted 5.11.0-rc5-syzkaller #0 Hardware name: ARM-Versatile Express Backtrace: [<826fc5b8>] (dump_backtrace) from [<826fc82c>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252) [<826fc814>] (show_stack) from [<8270d1f8>] (__dump_stack lib/dump_stack.c:79 [inline]) [<826fc814>] (show_stack) from [<8270d1f8>] (dump_stack+0xa8/0xc8 lib/dump_stack.c:120) [<8270d150>] (dump_stack) from [<802bf9c0>] (assign_lock_key kernel/locking/lockdep.c:935 [inline]) [<8270d150>] (dump_stack) from [<802bf9c0>] (register_lock_class+0xabc/0xb68 kernel/locking/lockdep.c:1247) [<802bef04>] (register_lock_class) from [<802baa2c>] (__lock_acquire+0x84/0x32d4 kernel/locking/lockdep.c:4711) [<802ba9a8>] (__lock_acquire) from [<802be840>] (lock_acquire.part.0+0xf0/0x554 kernel/locking/lockdep.c:5442) [<802be750>] (lock_acquire.part.0) from [<802bed10>] (lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5415) [<802beca4>] (lock_acquire) from [<81560548>] (seqcount_lockdep_reader_access include/linux/seqlock.h:103 [inline]) [<802beca4>] (lock_acquire) from [<81560548>] (__u64_stats_fetch_begin include/linux/u64_stats_sync.h:164 [inline]) [<802beca4>] (lock_acquire) from [<81560548>] (u64_stats_fetch_begin include/linux/u64_stats_sync.h:175 [inline]) [<802beca4>] (lock_acquire) from [<81560548>] (nsim_get_stats64+0xdc/0xf0 drivers/net/netdevsim/netdev.c:70) [<8156046c>] (nsim_get_stats64) from [<81e2efa0>] (dev_get_stats+0x44/0xd0 net/core/dev.c:10405) [<81e2ef5c>] (dev_get_stats) from [<81e53204>] (rtnl_fill_stats+0x38/0x120 net/core/rtnetlink.c:1211) [<81e531cc>] (rtnl_fill_stats) from [<81e59d58>] (rtnl_fill_ifinfo+0x6d4/0x148c net/core/rtnetlink.c:1783) [<81e59684>] (rtnl_fill_ifinfo) from [<81e5ceb4>] (rtmsg_ifinfo_build_skb+0x9c/0x108 net/core/rtnetlink.c:3798) [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo_event net/core/rtnetlink.c:3830 [inline]) [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo_event net/core/rtnetlink.c:3821 [inline]) [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo+0x44/0x70 net/core/rtnetlink.c:3839) [<81e5d068>] (rtmsg_ifinfo) from [<81e45c2c>] (register_netdevice+0x664/0x68c net/core/dev.c:10103) [<81e455c8>] (register_netdevice) from [<815608bc>] (nsim_create+0xf8/0x124 drivers/net/netdevsim/netdev.c:317) [<815607c4>] (nsim_create) from [<81561184>] (__nsim_dev_port_add+0x108/0x188 drivers/net/netdevsim/dev.c:941) [<8156107c>] (__nsim_dev_port_add) from [<815620d8>] (nsim_dev_port_add_all drivers/net/netdevsim/dev.c:990 [inline]) [<8156107c>] (__nsim_dev_port_add) from [<815620d8>] (nsim_dev_probe+0x5cc/0x750 drivers/net/netdevsim/dev.c:1119) [<81561b0c>] (nsim_dev_probe) from [<815661dc>] (nsim_bus_probe+0x10/0x14 drivers/net/netdevsim/bus.c:287) [<815661cc>] (nsim_bus_probe) from [<811724c0>] (really_probe+0x100/0x50c drivers/base/dd.c:554) [<811723c0>] (really_probe) from [<811729c4>] (driver_probe_device+0xf8/0x1c8 drivers/base/dd.c:740) [<811728cc>] (driver_probe_device) from [<81172fe4>] (__device_attach_driver+0x8c/0xf0 drivers/base/dd.c:846) [<81172f58>] (__device_attach_driver) from [<8116fee0>] (bus_for_each_drv+0x88/0xd8 drivers/base/bus.c:431) [<8116fe58>] (bus_for_each_drv) from [<81172c6c>] (__device_attach+0xdc/0x1d0 drivers/base/dd.c:914) [<81172b90>] (__device_attach) from [<8117305c>] (device_initial_probe+0x14/0x18 drivers/base/dd.c:961) [<81173048>] (device_initial_probe) from [<81171358>] (bus_probe_device+0x90/0x98 drivers/base/bus.c:491) [<811712c8>] (bus_probe_device) from [<8116e77c>] (device_add+0x320/0x824 drivers/base/core.c:3109) [<8116e45c>] (device_add) from [<8116ec9c>] (device_register+0x1c/0x20 drivers/base/core.c:3182) [<8116ec80>] (device_register) from [<81566710>] (nsim_bus_dev_new drivers/net/netdevsim/bus.c:336 [inline]) [<8116ec80>] (device_register) from [<81566710>] (new_device_store+0x178/0x208 drivers/net/netdevsim/bus.c:215) [<81566598>] (new_device_store) from [<8116fcb4>] (bus_attr_store+0x2c/0x38 drivers/base/bus.c:122) [<8116fc88>] (bus_attr_store) from [<805b4b8c>] (sysfs_kf_write+0x48/0x54 fs/sysfs/file.c:139) [<805b4b44>] (sysfs_kf_write) from [<805b3c90>] (kernfs_fop_write_iter+0x128/0x1ec fs/kernfs/file.c:296) [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (call_write_iter include/linux/fs.h:1901 [inline]) [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (new_sync_write fs/read_write.c:518 [inline]) [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (vfs_write+0x3dc/0x57c fs/read_write.c:605) [<804d1f20>] (vfs_write) from [<804d2604>] (ksys_write+0x68/0xec fs/read_write.c:658) [<804d259c>] (ksys_write)
[PATCH] net: sched: avoid duplicates in classes dump
This is a follow up of commit ea3274695353 ("net: sched: avoid duplicates in qdisc dump") which has fixed the issue only for the qdisc dump. The duplicate printing also occurs when dumping the classes via tc class show dev eth0 Fixes: 59cc1f61f09c ("net: sched: convert qdisc linked list to hashtable") Signed-off-by: Maximilian Heyne --- net/sched/sch_api.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index e2e4353db8a7..f87d07736a14 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -2168,7 +2168,7 @@ static int tc_dump_tclass_qdisc(struct Qdisc *q, struct sk_buff *skb, static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb, struct tcmsg *tcm, struct netlink_callback *cb, - int *t_p, int s_t) + int *t_p, int s_t, bool recur) { struct Qdisc *q; int b; @@ -2179,7 +2179,7 @@ static int tc_dump_tclass_root(struct Qdisc *root, struct sk_buff *skb, if (tc_dump_tclass_qdisc(root, skb, tcm, cb, t_p, s_t) < 0) return -1; - if (!qdisc_dev(root)) + if (!qdisc_dev(root) || !recur) return 0; if (tcm->tcm_parent) { @@ -2214,13 +2214,13 @@ static int tc_dump_tclass(struct sk_buff *skb, struct netlink_callback *cb) s_t = cb->args[0]; t = 0; - if (tc_dump_tclass_root(dev->qdisc, skb, tcm, cb, &t, s_t) < 0) + if (tc_dump_tclass_root(dev->qdisc, skb, tcm, cb, &t, s_t, true) < 0) goto done; dev_queue = dev_ingress_queue(dev); if (dev_queue && tc_dump_tclass_root(dev_queue->qdisc_sleeping, skb, tcm, cb, - &t, s_t) < 0) + &t, s_t, false) < 0) goto done; done: -- 2.16.6 Amazon Development Center Germany GmbH Krausenstr. 38 10117 Berlin Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B Sitz: Berlin Ust-ID: DE 289 237 879
Re: [PATCH net] ethtool: Add indicator field for link_mode validity to link_ksettings
On Thu, 4 Mar 2021 11:09:33 +0200 Danielle Ratson wrote: > Some drivers clear the 'ethtool_link_ksettings' struct in their > get_link_ksettings() callback, before populating it with actual values. > Such drivers will set the new 'link_mode' field to zero, resulting in > user space receiving wrong link mode information given that zero is a > valid value for the field. > > Fix this by introducing a new boolean field ('link_mode_valid'), which > indicates whether the 'link_mode' field is valid or not. Set it in mlxsw > which is currently the only driver supporting the new API. > > Another problem is that some drivers (notably tun) can report random > values in the 'link_mode' field. This can result in a general protection > fault when the field is used as an index to the 'link_mode_params' array > [1]. > > This happens because such drivers implement their set_link_ksettings() > callback by simply overwriting their private copy of > 'ethtool_link_ksettings' struct with the one they get from the stack, > which is not always properly initialized. > > Fix this by making sure that the new 'link_mode_valid' field is always > initialized to 'false' before invoking the set_link_ksettings() > callback. > > [1] > general protection fault, probably for non-canonical address > 0xdc00f14cc32c: [#1] PREEMPT SMP KASAN > KASAN: probably user-memory-access in range > [0x00078a661960-0x00078a661967] > CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446 > Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 > 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 > 83 e0 07 83 c0 03 > +38 d0 7c 08 84 d2 0f 85 b9 > RSP: 0018:c900019df7a0 EFLAGS: 00010202 > RAX: dc00 RBX: 888026136008 RCX: > RDX: f14cc32c RSI: 873439ca RDI: 00078a661960 > RBP: 8880 R08: R09: 88802613606f > R10: 873439bc R11: R12: > R13: 88802613606c R14: 888011d0c210 R15: 888011d0c210 > FS: 00749300() GS:8880b9c0() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 004b60f0 CR3: 185c2000 CR4: 001506f0 > DR0: DR1: DR2: > DR3: DR6: fffe0ff0 DR7: 0400 > Call Trace: > linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37 > ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586 > ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656 > ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620 > dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842 > dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440 > sock_do_ioctl+0x148/0x2d0 net/socket.c:1060 > sock_ioctl+0x477/0x6a0 net/socket.c:1177 > vfs_ioctl fs/ioctl.c:48 [inline] > __do_sys_ioctl fs/ioctl.c:753 [inline] > __se_sys_ioctl fs/ioctl.c:739 [inline] > __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 > do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > Fixes: c8907043c6ac9 ("ethtool: Get link mode in use instead of speed and > duplex parameters") > Signed-off-by: Danielle Ratson > Reported-by: Eric Dumazet > Reviewed-by: Ido Schimmel Reviewed-by: Jakub Kicinski BTW shouldn't __ethtool_get_link_ksettings() be moved to common.c?
Re: [PATCH net 1/3] net: ethernet: ixgbe: don't propagate -ENODEV from ixgbe_mii_bus_init()
On Wed, 3 Mar 2021 17:06:47 -0800 Tony Nguyen wrote: > From: Bartosz Golaszewski > > It's a valid use-case for ixgbe_mii_bus_init() to return -ENODEV - we > still want to finalize the registration of the ixgbe device. Check the > error code and don't bail out if err == -ENODEV. > > This fixes an issue on C3000 family of SoCs where four ixgbe devices > share a single MDIO bus and ixgbe_mii_bus_init() returns -ENODEV for > three of them but we still want to register them. > > Fixes: 09ef193fef7e ("net: ethernet: ixgbe: check the return value of > ixgbe_mii_bus_init()") > Reported-by: Yongxin Liu > Signed-off-by: Bartosz Golaszewski > Tested-by: Tony Brelinski > Signed-off-by: Tony Nguyen Are you sure this is not already fixed upstream by: bd7f14df9492 ("ixgbe: fix probing of multi-port devices with one MDIO") ?
Re: [PATCH bpf-next v4 1/5] bpf: consolidate shared test timing code
On Wed, Mar 3, 2021 at 2:18 AM Lorenz Bauer wrote: > > Share the timing / signal interruption logic between different > implementations of PROG_TEST_RUN. There is a change in behaviour > as well. We check the loop exit condition before checking for > pending signals. This resolves an edge case where a signal > arrives during the last iteration. Instead of aborting with > EINTR we return the successful result to user space. > > Signed-off-by: Lorenz Bauer > --- LGTM. Acked-by: Andrii Nakryiko > net/bpf/test_run.c | 141 + > 1 file changed, 78 insertions(+), 63 deletions(-) > [...]
[RFC PATCH V2 net-next 0/5] ethtool: Extend module EEPROM dump API
Ethtool supports module EEPROM dumps via the `ethtool -m ` command. But in current state its functionality is limited - offset and length parameters, which are used to specify a linear desired region of EEPROM data to dump, is not enough, considering emergence of complex module EEPROM layouts such as CMIS 4.0. Moreover, CMIS 4.0 extends the amount of pages that may be accessible by introducing another parameter for page addressing - banks. Besides, currently module EEPROM is represented as a chunk of concatenated pages, where lower 128 bytes of all pages, except page 00h, are omitted. Offset and length are used to address parts of this fake linear memory. But in practice drivers, which implement get_module_info() and get_module_eeprom() ethtool ops still calculate page number and set I2C address on their own. This series tackles these issues by adding ethtool op, which allows to pass page number, bank number and I2C address in addition to offset and length parameters to the driver, adds corresponding netlink infrastructure and implements the new interface in mlx5 driver. This allows to extend userspace 'ethtool -m' CLI by adding new parameters - page, bank and i2c. New command line format: ethtool -m [hex on|off] [raw on|off] [offset N] [length N] [page N] [bank N] [i2c N] The consequence of this series is a possibility to dump arbitrary EEPROM page at a time, in contrast to dumps of concatenated pages. Therefore, offset and length change their semantics and may be used only to specify a part of data within a page, which size is currently limited to 256 bytes. As for backwards compatibility with get_module_info() and get_module_eeprom() pair, the series addresses it as well by implementing a fallback mechanism. As mentioned earlier, drivers derive a page number from 'global' offset, so this can be done vice versa without their involvement thanks to standardization. If kernel netlink handler of 'ethtool -m' command detects that new ethtool op is not supported by the driver, it calculates offset from given page number and page offset and calls old ndos, if they are available. Change log: v1 -> v2: - Limited i2c_address values by 127 - Added page bound check for offset and length - Added defines for these two points - Added extack to ndo parameters - Moved ethnl_ops_begin(dev) and set error path accordingly Vladyslav Tarasiuk (5): ethtool: Allow network drivers to dump arbitrary EEPROM data net/mlx5: Refactor module EEPROM query net/mlx5: Implement get_module_eeprom_data_by_page() net/mlx5: Add support for DSFP module EEPROM dumps ethtool: Add fallback to get_module_eeprom from netlink command .../ethernet/mellanox/mlx5/core/en_ethtool.c | 42 +++ .../net/ethernet/mellanox/mlx5/core/port.c| 101 ++-- include/linux/ethtool.h | 7 +- include/linux/mlx5/port.h | 12 + include/uapi/linux/ethtool.h | 26 ++ include/uapi/linux/ethtool_netlink.h | 19 ++ net/ethtool/Makefile | 2 +- net/ethtool/eeprom.c | 239 ++ net/ethtool/netlink.c | 10 + net/ethtool/netlink.h | 2 + 10 files changed, 430 insertions(+), 30 deletions(-) create mode 100644 net/ethtool/eeprom.c -- 2.18.2
[RFC PATCH V2 net-next 3/5] net/mlx5: Implement get_module_eeprom_data_by_page()
From: Vladyslav Tarasiuk Implement ethtool_ops::get_module_eeprom_data_by_page() to enable support of new SFP standards. Signed-off-by: Vladyslav Tarasiuk --- .../ethernet/mellanox/mlx5/core/en_ethtool.c | 42 +++ .../net/ethernet/mellanox/mlx5/core/port.c| 33 +++ include/linux/mlx5/port.h | 2 + 3 files changed, 77 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c index abdf721bb264..6766ffb07c81 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c @@ -1769,6 +1769,47 @@ static int mlx5e_get_module_eeprom(struct net_device *netdev, return 0; } +static int mlx5e_get_module_eeprom_data_by_page(struct net_device *netdev, + const struct ethtool_eeprom_data *page_data, + struct netlink_ext_ack *extack) +{ + struct mlx5e_priv *priv = netdev_priv(netdev); + struct mlx5_module_eeprom_query_params query; + struct mlx5_core_dev *mdev = priv->mdev; + u8 *data = page_data->data; + int size_read; + int i = 0; + + if (!page_data->length) + return -EINVAL; + + memset(data, 0, page_data->length); + + query.offset = page_data->offset; + query.i2c_address = page_data->i2c_address; + query.bank = page_data->bank; + query.page = page_data->page; + while (i < page_data->length) { + query.size = page_data->length - i; + size_read = mlx5_query_module_eeprom_data(mdev, &query, data + i); + + /* Done reading */ + if (!size_read) + return 0; + + if (size_read < 0) { + netdev_err(priv->netdev, "%s: mlx5_query_module_eeprom_data failed:0x%x\n", + __func__, size_read); + return 0; + } + + i += size_read; + query.offset += size_read; + } + + return 0; +} + int mlx5e_ethtool_flash_device(struct mlx5e_priv *priv, struct ethtool_flash *flash) { @@ -2148,6 +2189,7 @@ const struct ethtool_ops mlx5e_ethtool_ops = { .set_wol = mlx5e_set_wol, .get_module_info = mlx5e_get_module_info, .get_module_eeprom = mlx5e_get_module_eeprom, + .get_module_eeprom_data_by_page = mlx5e_get_module_eeprom_data_by_page, .flash_device = mlx5e_flash_device, .get_priv_flags= mlx5e_get_priv_flags, .set_priv_flags= mlx5e_set_priv_flags, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c index 9b9f870d67a4..f7a16fdfb8d3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/port.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c @@ -428,6 +428,39 @@ int mlx5_query_module_eeprom(struct mlx5_core_dev *dev, } EXPORT_SYMBOL_GPL(mlx5_query_module_eeprom); +int mlx5_query_module_eeprom_data(struct mlx5_core_dev *dev, + struct mlx5_module_eeprom_query_params *params, + u8 *data) +{ + u8 module_id; + int err; + + err = mlx5_query_module_num(dev, ¶ms->module_number); + if (err) + return err; + + err = mlx5_query_module_id(dev, params->module_number, &module_id); + if (err) + return err; + + if (module_id != MLX5_MODULE_ID_SFP && + module_id != MLX5_MODULE_ID_QSFP && + module_id != MLX5_MODULE_ID_QSFP28 && + module_id != MLX5_MODULE_ID_QSFP_PLUS) { + mlx5_core_err(dev, "Module ID not recognized: 0x%x\n", module_id); + return -EINVAL; + } + + if (params->i2c_address != MLX5_I2C_ADDR_HIGH && + params->i2c_address != MLX5_I2C_ADDR_LOW) { + mlx5_core_err(dev, "I2C address not recognized: 0x%x\n", params->i2c_address); + return -EINVAL; + } + + return mlx5_query_mcia(dev, params, data); +} +EXPORT_SYMBOL_GPL(mlx5_query_module_eeprom_data); + static int mlx5_query_port_pvlc(struct mlx5_core_dev *dev, u32 *pvlc, int pvlc_size, u8 local_port) { diff --git a/include/linux/mlx5/port.h b/include/linux/mlx5/port.h index 90b87aa82db3..887cd43b41e8 100644 --- a/include/linux/mlx5/port.h +++ b/include/linux/mlx5/port.h @@ -209,6 +209,8 @@ void mlx5_query_port_fcs(struct mlx5_core_dev *mdev, bool *supported, bool *enabled); int mlx5_query_module_eeprom(struct mlx5_core_dev *dev, u16 offset, u16 size, u8 *data); +int mlx5_query_module_eeprom_data(struct mlx5_core_dev *dev, + struct mlx5_module_eeprom_query_params *par
[RFC PATCH V2 net-next 2/5] net/mlx5: Refactor module EEPROM query
From: Vladyslav Tarasiuk Prepare for ethtool_ops::get_module_eeprom_data() implementation by extracting common part of mlx5_query_module_eeprom() into a separate function. Signed-off-by: Vladyslav Tarasiuk --- .../net/ethernet/mellanox/mlx5/core/port.c| 79 +++ include/linux/mlx5/port.h | 9 +++ 2 files changed, 54 insertions(+), 34 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c index 4bb219565c58..9b9f870d67a4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/port.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c @@ -353,67 +353,78 @@ static void mlx5_sfp_eeprom_params_set(u16 *i2c_addr, int *page_num, u16 *offset *offset -= MLX5_EEPROM_PAGE_LENGTH; } -int mlx5_query_module_eeprom(struct mlx5_core_dev *dev, -u16 offset, u16 size, u8 *data) +static int mlx5_query_mcia(struct mlx5_core_dev *dev, + struct mlx5_module_eeprom_query_params *params, u8 *data) { - int module_num, status, err, page_num = 0; u32 in[MLX5_ST_SZ_DW(mcia_reg)] = {}; u32 out[MLX5_ST_SZ_DW(mcia_reg)]; - u16 i2c_addr = 0; - u8 module_id; + int status, err; void *ptr; + u16 size; + + size = min_t(int, params->size, MLX5_EEPROM_MAX_BYTES); + + MLX5_SET(mcia_reg, in, l, 0); + MLX5_SET(mcia_reg, in, size, size); + MLX5_SET(mcia_reg, in, module, params->module_number); + MLX5_SET(mcia_reg, in, device_address, params->offset); + MLX5_SET(mcia_reg, in, page_number, params->page); + MLX5_SET(mcia_reg, in, i2c_device_address, params->i2c_address); - err = mlx5_query_module_num(dev, &module_num); + err = mlx5_core_access_reg(dev, in, sizeof(in), out, + sizeof(out), MLX5_REG_MCIA, 0, 0); if (err) return err; - err = mlx5_query_module_id(dev, module_num, &module_id); + status = MLX5_GET(mcia_reg, out, status); + if (status) { + mlx5_core_err(dev, "query_mcia_reg failed: status: 0x%x\n", + status); + return -EIO; + } + + ptr = MLX5_ADDR_OF(mcia_reg, out, dword_0); + memcpy(data, ptr, size); + + return size; +} + +int mlx5_query_module_eeprom(struct mlx5_core_dev *dev, +u16 offset, u16 size, u8 *data) +{ + struct mlx5_module_eeprom_query_params query = {0}; + u8 module_id; + int err; + + err = mlx5_query_module_num(dev, &query.module_number); + if (err) + return err; + + err = mlx5_query_module_id(dev, query.module_number, &module_id); if (err) return err; switch (module_id) { case MLX5_MODULE_ID_SFP: - mlx5_sfp_eeprom_params_set(&i2c_addr, &page_num, &offset); + mlx5_sfp_eeprom_params_set(&query.i2c_address, &query.page, &query.offset); break; case MLX5_MODULE_ID_QSFP: case MLX5_MODULE_ID_QSFP_PLUS: case MLX5_MODULE_ID_QSFP28: - mlx5_qsfp_eeprom_params_set(&i2c_addr, &page_num, &offset); + mlx5_qsfp_eeprom_params_set(&query.i2c_address, &query.page, &query.offset); break; default: mlx5_core_err(dev, "Module ID not recognized: 0x%x\n", module_id); return -EINVAL; } - if (offset + size > MLX5_EEPROM_PAGE_LENGTH) + if (query.offset + size > MLX5_EEPROM_PAGE_LENGTH) /* Cross pages read, read until offset 256 in low page */ size -= offset + size - MLX5_EEPROM_PAGE_LENGTH; - size = min_t(int, size, MLX5_EEPROM_MAX_BYTES); + query.size = size; - MLX5_SET(mcia_reg, in, l, 0); - MLX5_SET(mcia_reg, in, module, module_num); - MLX5_SET(mcia_reg, in, i2c_device_address, i2c_addr); - MLX5_SET(mcia_reg, in, page_number, page_num); - MLX5_SET(mcia_reg, in, device_address, offset); - MLX5_SET(mcia_reg, in, size, size); - - err = mlx5_core_access_reg(dev, in, sizeof(in), out, - sizeof(out), MLX5_REG_MCIA, 0, 0); - if (err) - return err; - - status = MLX5_GET(mcia_reg, out, status); - if (status) { - mlx5_core_err(dev, "query_mcia_reg failed: status: 0x%x\n", - status); - return -EIO; - } - - ptr = MLX5_ADDR_OF(mcia_reg, out, dword_0); - memcpy(data, ptr, size); - - return size; + return mlx5_query_mcia(dev, &query, data); } EXPORT_SYMBOL_GPL(mlx5_query_module_eeprom); diff --git a/include/linux/mlx5/port.h b/include/linux/mlx5/port.h index 23edd2db4803..90b87aa82db3 100644 --- a/include/linux/mlx5/port.h +++ b/include/linux/mlx5/port.h @@ -62,6 +62,15 @@ enum mlx5_an_status { #def
[RFC PATCH V2 net-next 1/5] ethtool: Allow network drivers to dump arbitrary EEPROM data
From: Vladyslav Tarasiuk Define get_module_eeprom_data_by_page() ethtool callback and implement netlink infrastructure. get_module_eeprom_data_by_page() allows network drivers to dump a part of module's EEPROM specified by page and bank numbers along with offset and length. It is effectively a netlink replacement for get_module_info() and get_module_eeprom() pair, which is needed due to emergence of complex non-linear EEPROM layouts. Signed-off-by: Vladyslav Tarasiuk --- include/linux/ethtool.h | 7 +- include/uapi/linux/ethtool.h | 26 + include/uapi/linux/ethtool_netlink.h | 19 net/ethtool/Makefile | 2 +- net/ethtool/eeprom.c | 157 +++ net/ethtool/netlink.c| 10 ++ net/ethtool/netlink.h| 2 + 7 files changed, 221 insertions(+), 2 deletions(-) create mode 100644 net/ethtool/eeprom.c diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index ec4cd3921c67..2f65aae5f492 100644 --- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -81,6 +81,7 @@ enum { #define ETH_RSS_HASH_NO_CHANGE 0 struct net_device; +struct netlink_ext_ack; /* Some generic methods drivers may use in their ethtool_ops */ u32 ethtool_op_get_link(struct net_device *dev); @@ -410,6 +411,8 @@ struct ethtool_pause_stats { * @get_ethtool_phy_stats: Return extended statistics about the PHY device. * This is only useful if the device maintains PHY statistics and * cannot use the standard PHY library helpers. + * @get_module_eeprom_data_by_page: Get a region of plug-in module EEPROM data + * from specified page. Returns a negative error code or zero. * * All operations are optional (i.e. the function pointer may be set * to %NULL) and callers must take this into account. Callers must @@ -515,6 +518,9 @@ struct ethtool_ops { const struct ethtool_tunable *, void *); int (*set_phy_tunable)(struct net_device *, const struct ethtool_tunable *, const void *); + int (*get_module_eeprom_data_by_page)(struct net_device *dev, + const struct ethtool_eeprom_data *page, + struct netlink_ext_ack *extack); }; int ethtool_check_ops(const struct ethtool_ops *ops); @@ -538,7 +544,6 @@ int ethtool_virtdev_set_link_ksettings(struct net_device *dev, const struct ethtool_link_ksettings *cmd, u32 *dev_speed, u8 *dev_duplex); -struct netlink_ext_ack; struct phy_device; struct phy_tdr_config; diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h index cde753bb2093..2459571fc1d1 100644 --- a/include/uapi/linux/ethtool.h +++ b/include/uapi/linux/ethtool.h @@ -340,6 +340,28 @@ struct ethtool_eeprom { __u8data[0]; }; +/** + * struct ethtool_eeprom_data - EEPROM dump from specified page + * @offset: Offset within the specified EEPROM page to begin read, in bytes. + * @length: Number of bytes to read. + * @page: Page number to read from. + * @bank: Page bank number to read from, if applicable by EEPROM spec. + * @i2c_address: I2C address of a page. Value less than 0x7f expected. Most + * EEPROMs use 0x50 or 0x51. + * @data: Pointer to buffer with EEPROM data of @length size. + * + * This can be used to manage pages during EEPROM dump in ethtool and pass + * required information to the driver. + */ +struct ethtool_eeprom_data { + __u32 offset; + __u32 length; + __u32 page; + __u32 bank; + __u32 i2c_address; + __u8*data; +}; + /** * struct ethtool_eee - Energy Efficient Ethernet information * @cmd: ETHTOOL_{G,S}EEE @@ -1865,6 +1887,10 @@ static inline int ethtool_validate_duplex(__u8 duplex) #define ETH_MODULE_SFF_8636_MAX_LEN 640 #define ETH_MODULE_SFF_8436_MAX_LEN 640 +#define ETH_MODULE_EEPROM_MAX_LEN 640 +#define ETH_MODULE_EEPROM_PAGE_LEN 256 +#define ETH_MODULE_MAX_I2C_ADDRESS 0x7f + /* Reset flags */ /* The reset() operation must clear the flags for the components which * were actually reset. On successful return, the flags indicate the diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index a286635ac9b8..60dd848d0b54 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -42,6 +42,7 @@ enum { ETHTOOL_MSG_CABLE_TEST_ACT, ETHTOOL_MSG_CABLE_TEST_TDR_ACT, ETHTOOL_MSG_TUNNEL_INFO_GET, + ETHTOOL_MSG_EEPROM_DATA_GET, /* add new constants above here */ __ETHTOOL_MSG_USER_CNT, @@ -80,6 +81,7 @@ enum { ETHTOOL_MSG_CABLE_TEST_NTF, ETHTOOL_MSG_CABLE_TEST_TDR_NTF, ETHTOOL_MSG_TUNNEL_INFO_GET_REPLY, + ETHTOOL_MSG_EEPROM_DATA_GET_REPLY, /*
[RFC PATCH V2 net-next 4/5] net/mlx5: Add support for DSFP module EEPROM dumps
From: Vladyslav Tarasiuk Allow the driver to recognise DSFP transceiver module ID and therefore allow its EEPROM dumps using ethtool. Signed-off-by: Vladyslav Tarasiuk --- drivers/net/ethernet/mellanox/mlx5/core/port.c | 3 ++- include/linux/mlx5/port.h | 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c index f7a16fdfb8d3..3a7aa6b05198 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/port.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c @@ -446,7 +446,8 @@ int mlx5_query_module_eeprom_data(struct mlx5_core_dev *dev, if (module_id != MLX5_MODULE_ID_SFP && module_id != MLX5_MODULE_ID_QSFP && module_id != MLX5_MODULE_ID_QSFP28 && - module_id != MLX5_MODULE_ID_QSFP_PLUS) { + module_id != MLX5_MODULE_ID_QSFP_PLUS && + module_id != MLX5_MODULE_ID_DSFP) { mlx5_core_err(dev, "Module ID not recognized: 0x%x\n", module_id); return -EINVAL; } diff --git a/include/linux/mlx5/port.h b/include/linux/mlx5/port.h index 887cd43b41e8..71b4373cb96c 100644 --- a/include/linux/mlx5/port.h +++ b/include/linux/mlx5/port.h @@ -45,6 +45,7 @@ enum mlx5_module_id { MLX5_MODULE_ID_QSFP = 0xC, MLX5_MODULE_ID_QSFP_PLUS= 0xD, MLX5_MODULE_ID_QSFP28 = 0x11, + MLX5_MODULE_ID_DSFP = 0x1B, }; enum mlx5_an_status { -- 2.18.2
[RFC PATCH V2 net-next 5/5] ethtool: Add fallback to get_module_eeprom from netlink command
From: Vladyslav Tarasiuk In case netlink get_module_eeprom_data_by_page() callback is not implemented by the driver, try to call old get_module_info() and get_module_eeprom() pair. Recalculate parameters to get_module_eeprom() offset and len using page number and their sizes. Return error if this can't be done. Signed-off-by: Vladyslav Tarasiuk --- net/ethtool/eeprom.c | 84 +++- 1 file changed, 83 insertions(+), 1 deletion(-) diff --git a/net/ethtool/eeprom.c b/net/ethtool/eeprom.c index 2618a55b9a40..72c7714a9d37 100644 --- a/net/ethtool/eeprom.c +++ b/net/ethtool/eeprom.c @@ -26,6 +26,88 @@ struct eeprom_data_reply_data { #define EEPROM_DATA_REPDATA(__reply_base) \ container_of(__reply_base, struct eeprom_data_reply_data, base) +static int fallback_set_params(struct eeprom_data_req_info *request, + struct ethtool_modinfo *modinfo, + struct ethtool_eeprom *eeprom) +{ + u32 offset = request->offset; + u32 length = request->length; + + if (request->page) + offset = 128 + request->page * 128 + offset; + + if (!length) + length = modinfo->eeprom_len; + + if (offset >= modinfo->eeprom_len) + return -EINVAL; + + if (modinfo->eeprom_len < offset + length) + length = modinfo->eeprom_len - offset; + + eeprom->cmd = ETHTOOL_GMODULEEEPROM; + eeprom->len = length; + eeprom->offset = offset; + + switch (modinfo->type) { + case ETH_MODULE_SFF_8079: + if (request->page > 1) + return -EINVAL; + break; + case ETH_MODULE_SFF_8472: + if (request->page > 3) + return -EINVAL; + break; + case ETH_MODULE_SFF_8436: + case ETH_MODULE_SFF_8636: + if (request->page > 3) + return -EINVAL; + break; + } + return 0; +} + +static int eeprom_data_fallback(struct eeprom_data_req_info *request, + struct eeprom_data_reply_data *reply, + struct genl_info *info) +{ + struct net_device *dev = reply->base.dev; + struct ethtool_modinfo modinfo = {0}; + struct ethtool_eeprom eeprom = {0}; + u8 *data; + int err; + + if ((!dev->ethtool_ops->get_module_info && +!dev->ethtool_ops->get_module_eeprom) || + request->bank || request->i2c_address) { + return -EOPNOTSUPP; + } + modinfo.cmd = ETHTOOL_GMODULEINFO; + err = dev->ethtool_ops->get_module_info(dev, &modinfo); + if (err < 0) + return err; + + err = fallback_set_params(request, &modinfo, &eeprom); + if (err < 0) + return err; + + data = kmalloc(eeprom.len, GFP_KERNEL); + if (!data) + return -ENOMEM; + err = dev->ethtool_ops->get_module_eeprom(dev, &eeprom, data); + if (err < 0) + goto err_out; + + reply->data = data; + reply->length = eeprom.len; + + return 0; + +err_out: + kfree(data); + return err; +} + static int eeprom_data_prepare_data(const struct ethnl_req_info *req_base, struct ethnl_reply_data *reply_base, struct genl_info *info) @@ -37,7 +119,7 @@ static int eeprom_data_prepare_data(const struct ethnl_req_info *req_base, int err; if (!dev->ethtool_ops->get_module_eeprom_data_by_page) - return -EOPNOTSUPP; + return eeprom_data_fallback(request, reply, info); page_data.offset = request->offset; page_data.length = request->length; -- 2.18.2
Re: [PATCH net 1/3] net: ethernet: ixgbe: don't propagate -ENODEV from ixgbe_mii_bus_init()
On Thu, 2021-03-04 at 10:50 -0800, Jakub Kicinski wrote: > On Wed, 3 Mar 2021 17:06:47 -0800 Tony Nguyen wrote: > > From: Bartosz Golaszewski > > > > It's a valid use-case for ixgbe_mii_bus_init() to return -ENODEV - > > we > > still want to finalize the registration of the ixgbe device. Check > > the > > error code and don't bail out if err == -ENODEV. > > > > This fixes an issue on C3000 family of SoCs where four ixgbe > > devices > > share a single MDIO bus and ixgbe_mii_bus_init() returns -ENODEV > > for > > three of them but we still want to register them. > > > > Fixes: 09ef193fef7e ("net: ethernet: ixgbe: check the return value > > of ixgbe_mii_bus_init()") > > Reported-by: Yongxin Liu > > Signed-off-by: Bartosz Golaszewski > > Tested-by: Tony Brelinski > > Signed-off-by: Tony Nguyen > > Are you sure this is not already fixed upstream by: > > bd7f14df9492 ("ixgbe: fix probing of multi-port devices with one > MDIO") > > ? That looks to solve this issue. I'll drop this patch and resend the series. Thanks, Tony
[BUG] hitting bug when running spinlock test
hi, I'm getting attached BUG/crash when running in parralel selftests, like: while :; do ./test_progs -t spinlock; done while :; do ./test_progs ; done it's the latest bpf-next/master, I can send the .config if needed, but I don't think there's anything special about it, because I saw the bug on other servers with different generic configs it looks like it's related to cgroup local storage, for some reason the storage deref returns NULL I'm bit lost in this code, so any help would be great ;-) thanks, jirka --- ... [ 382.324440] bpf_testmod: loading out-of-tree module taints kernel. [ 382.330670] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel [ 480.391667] perf: interrupt took too long (2540 > 2500), lowering kernel.perf_event_max_sample_rate to 78000 [ 480.401730] perf: interrupt took too long (6860 > 6751), lowering kernel.perf_event_max_sample_rate to 29000 [ 480.416172] perf: interrupt took too long (8602 > 8575), lowering kernel.perf_event_max_sample_rate to 23000 [ 480.433053] BUG: kernel NULL pointer dereference, address: [ 480.440014] #PF: supervisor read access in kernel mode [ 480.445153] #PF: error_code(0x) - not-present page [ 480.450294] PGD 800133a18067 P4D 800133a18067 PUD 10c019067 PMD 0 [ 480.457164] Oops: [#1] PREEMPT SMP PTI [ 480.461350] CPU: 6 PID: 16689 Comm: test_progs Tainted: G IOE 5.11.0+ #11 [ 480.469263] Hardware name: Dell Inc. PowerEdge R440/08CYF7, BIOS 1.7.0 12/14/2018 [ 480.476742] RIP: 0010:bpf_get_local_storage+0x13/0x50 [ 480.481797] Code: e8 92 c5 8e 00 5d 89 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00 83 7f 18 15 74 10 65 48 8b 05 6d c6 e2 7e <48> 8b 00 48 83 c0 10 c3 55 48 89 e5 53 65 48 8b 05 60 c6 e2 7e8 [ 480.500540] RSP: 0018:c90001bd3ce0 EFLAGS: 00010293 [ 480.505766] RAX: RBX: 982a2595 RCX: 0018 [ 480.512901] RDX: 0001 RSI: RDI: 888149ccf000 [ 480.520034] RBP: c90001bd3d20 R08: c90001bd3d04 R09: 888105121600 [ 480.527164] R10: d3b93420 R11: 025c R12: 0734 [ 480.534299] R13: 888149ccc710 R14: R15: c9379048 [ 480.541430] FS: 7f8f2357b640() GS:8897e098() knlGS: [ 480.549515] CS: 0010 DS: ES: CR0: 80050033 [ 480.555262] CR2: CR3: 00014e826006 CR4: 007706e0 [ 480.562395] DR0: DR1: DR2: [ 480.569527] DR3: DR6: fffe0ff0 DR7: 0400 [ 480.576660] PKRU: 5554 [ 480.579372] Call Trace: [ 480.581829] ? bpf_prog_c48154a736e5c014_bpf_sping_lock_test+0x2ba/0x860 [ 480.588526] bpf_test_run+0x127/0x2b0 [ 480.592192] ? __build_skb_around+0xb0/0xc0 [ 480.596378] bpf_prog_test_run_skb+0x32f/0x6b0 [ 480.600824] __do_sys_bpf+0xa94/0x2240 [ 480.604577] ? debug_smp_processor_id+0x17/0x20 [ 480.609107] ? __perf_event_task_sched_in+0x32/0x340 [ 480.614077] __x64_sys_bpf+0x1a/0x20 [ 480.617653] do_syscall_64+0x38/0x50 [ 480.621233] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 480.626286] RIP: 0033:0x7f8f2467f55d [ 480.629865] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d eb 78 0c 00 f7 d8 64 89 018 [ 480.648611] RSP: 002b:7f8f2357ad58 EFLAGS: 0206 ORIG_RAX: 0141 [ 480.656175] RAX: ffda RBX: RCX: 7f8f2467f55d [ 480.663308] RDX: 0078 RSI: 7f8f2357ad60 RDI: 000a [ 480.670442] RBP: 7f8f2357ae28 R08: R09: 0008 [ 480.677574] R10: R11: 0206 R12: 7f8f2357ae2c [ 480.684707] R13: 022df420 R14: R15: 7f8f2357b640 [ 480.691842] Modules linked in: bpf_testmod(OE) intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ipmi_ssif irqbypass rapl intel_cstate dell_smbios intel_uncore mei_] [ 480.739134] CR2: [ 480.742452] ---[ end trace 807177cbb5e3b3da ]--- [ 480.752174] RIP: 0010:bpf_get_local_storage+0x13/0x50 [ 480.757230] Code: e8 92 c5 8e 00 5d 89 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f 1f 44 00 00 83 7f 18 15 74 10 65 48 8b 05 6d c6 e2 7e <48> 8b 00 48 83 c0 10 c3 55 48 89 e5 53 65 48 8b 05 60 c6 e2 7e8 [ 480.775976] RSP: 0018:c90001bd3ce0 EFLAGS: 00010293 [ 480.781202] RAX: RBX: 982a2595 RCX: 0018 [ 480.788335] RDX: 0001 RSI: RDI: 888149ccf000 [ 480.795466] RBP: c90001bd3d20 R08: c90001bd3d04 R09: 888105121600 [ 480.802598] R10: d3b93420 R11: 025c R12: 0734 [ 480.809730] R13: 888149ccc710 R14: 00
Re: [PATCH net] net: tcp: don't allocate fast clones for fastopen SYN
On Thu, 4 Mar 2021 13:51:15 +0100 Eric Dumazet wrote: > I think we are over thinking this really (especially if the fix needs > a change in core networking or drivers) > > We can reuse TSQ logic to have a chance to recover when the clone is > eventually freed. > This will be more generic, not only for the SYN+data of FastOpen. > > Can you please test the following patch ? #7 - Eric comes up with something much better :) But so far doesn't seem to quite do it, I'm looking but maybe you'll know right away (FWIW testing a v5.6 backport but I don't think TSQ changed?): On __tcp_retransmit_skb kretprobe: ==> Hit TFO case ret:-16 ca_state:0 skb:888fdb4bac00! First hit: __tcp_retransmit_skb+1 tcp_rcv_state_process+2488 tcp_v6_do_rcv+405 tcp_v6_rcv+2984 ip6_protocol_deliver_rcu+180 ip6_input_finish+17 Successful hit: __tcp_retransmit_skb+1 tcp_retransmit_skb+18 tcp_retransmit_timer+716 tcp_write_timer_handler+136 tcp_write_timer+141 call_timer_fn+43 skb:888fdb4bac00 --- delay:51642us bytes_acked:1
Re: [PATCH net] net: usb: cdc_ncm: don't spew notifications
On Wed, Jan 20, 2021 at 5:04 PM Jakub Kicinski wrote: > > On Wed, 20 Jan 2021 03:38:32 + Hayes Wang wrote: > > Grant Grundler > > > Sent: Wednesday, January 20, 2021 9:12 AM > > > Subject: [PATCH net] net: usb: cdc_ncm: don't spew notifications > > > > > > RTL8156 sends notifications about every 32ms. > > > Only display/log notifications when something changes. > > > > > > This issue has been reported by others: > > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1832472 > > > https://lkml.org/lkml/2020/8/27/1083 > > > > > > Chrome OS cannot support RTL8156 until this is fixed. > > > > > > Signed-off-by: Grant Grundler > > > > Reviewed-by: Hayes Wang > > Applied, thanks! > > net should be merged back into net-next by the end of the day, so > if the other patches depend on this one to apply cleanly please keep > an eye and post after that happens. If there is no conflict you can > just post them with [PATCH net-next] now. Jakub, sorry, I "dropped the ball" on this one. I'll repost with "net-next" a bit later today. cheers, grant
[PATCH net v2 0/2][pull request] Intel Wired LAN Driver Updates 2021-03-04
This series contains updates to ixgbe and ixgbevf drivers. Antony Antony adds a check to fail for non transport mode SA with offload as this is not supported for ixgbe and ixgbevf. Dinghao Liu fixes a memory leak on failure to program a perfect filter for ixgbe. v2: - Dropped patch 1 The following are changes since commit a9ecb0cbf03746b17a7c13bd8e3464e6789f73e8: rtnetlink: using dev_base_seq from target net and are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue 10GbE Antony Antony (1): ixgbe: fail to create xfrm offload of IPsec tunnel mode SA Dinghao Liu (1): ixgbe: Fix memleak in ixgbe_configure_clsu32 drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 5 + drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 -- drivers/net/ethernet/intel/ixgbevf/ipsec.c | 5 + 3 files changed, 14 insertions(+), 2 deletions(-) -- 2.26.2
[PATCH net v2 2/2] ixgbe: Fix memleak in ixgbe_configure_clsu32
From: Dinghao Liu When ixgbe_fdir_write_perfect_filter_82599() fails, input allocated by kzalloc() has not been freed, which leads to memleak. Signed-off-by: Dinghao Liu Reviewed-by: Paul Menzel Tested-by: Tony Brelinski Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index fae84202d870..9f3f12e2ccf2 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -9565,8 +9565,10 @@ static int ixgbe_configure_clsu32(struct ixgbe_adapter *adapter, ixgbe_atr_compute_perfect_hash_82599(&input->filter, mask); err = ixgbe_fdir_write_perfect_filter_82599(hw, &input->filter, input->sw_idx, queue); - if (!err) - ixgbe_update_ethtool_fdir_entry(adapter, input, input->sw_idx); + if (err) + goto err_out_w_lock; + + ixgbe_update_ethtool_fdir_entry(adapter, input, input->sw_idx); spin_unlock(&adapter->fdir_perfect_lock); if ((uhtid != 0x800) && (adapter->jump_tables[uhtid])) -- 2.26.2
[PATCH net v2 1/2] ixgbe: fail to create xfrm offload of IPsec tunnel mode SA
From: Antony Antony Based on talks and indirect references ixgbe IPsec offlod do not support IPsec tunnel mode offload. It can only support IPsec transport mode offload. Now explicitly fail when creating non transport mode SA with offload to avoid false performance expectations. Fixes: 63a67fe229ea ("ixgbe: add ipsec offload add and remove SA") Signed-off-by: Antony Antony Acked-by: Shannon Nelson Tested-by: Tony Brelinski Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 5 + drivers/net/ethernet/intel/ixgbevf/ipsec.c | 5 + 2 files changed, 10 insertions(+) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c index eca73526ac86..54d47265a7ac 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c @@ -575,6 +575,11 @@ static int ixgbe_ipsec_add_sa(struct xfrm_state *xs) return -EINVAL; } + if (xs->props.mode != XFRM_MODE_TRANSPORT) { + netdev_err(dev, "Unsupported mode for ipsec offload\n"); + return -EINVAL; + } + if (ixgbe_ipsec_check_mgmt_ip(xs)) { netdev_err(dev, "IPsec IP addr clash with mgmt filters\n"); return -EINVAL; diff --git a/drivers/net/ethernet/intel/ixgbevf/ipsec.c b/drivers/net/ethernet/intel/ixgbevf/ipsec.c index 5170dd9d8705..caaea2c920a6 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ipsec.c +++ b/drivers/net/ethernet/intel/ixgbevf/ipsec.c @@ -272,6 +272,11 @@ static int ixgbevf_ipsec_add_sa(struct xfrm_state *xs) return -EINVAL; } + if (xs->props.mode != XFRM_MODE_TRANSPORT) { + netdev_err(dev, "Unsupported mode for ipsec offload\n"); + return -EINVAL; + } + if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) { struct rx_sa rsa; -- 2.26.2
[PATCHv2] gianfar: fix jumbo packets+napi+rx overrun crash
From: Michael Braun When using jumbo packets and overrunning rx queue with napi enabled, the following sequence is observed in gfar_add_rx_frag: | lstatus | | skb | t | lstatus, size, flags| first | len, data_len, *ptr | ---+--+---+---+ 13 | 18002348, 9032, INTERRUPT LAST | 0 | 9600, 8000, f554c12e | 12 | 1640, 1600, INTERRUPT| 0 | 8000, 6400, f554c12e | 11 | 1640, 1600, INTERRUPT| 0 | 6400, 4800, f554c12e | 10 | 1640, 1600, INTERRUPT| 0 | 4800, 3200, f554c12e | 09 | 1640, 1600, INTERRUPT| 0 | 3200, 1600, f554c12e | 08 | 14000640, 1600, INTERRUPT FIRST | 0 | 1600, 0, f554c12e | 07 | 14000640, 1600, INTERRUPT FIRST | 1 | 0,0, f554c12e | 06 | 1c80, 128, INTERRUPT LAST FIRST | 1 | 0,0, abf3bd6e | 05 | 18002348, 9032, INTERRUPT LAST | 0 | 8000, 6400, c5a57780 | 04 | 1640, 1600, INTERRUPT| 0 | 6400, 4800, c5a57780 | 03 | 1640, 1600, INTERRUPT| 0 | 4800, 3200, c5a57780 | 02 | 1640, 1600, INTERRUPT| 0 | 3200, 1600, c5a57780 | 01 | 1640, 1600, INTERRUPT| 0 | 1600, 0, c5a57780 | 00 | 14000640, 1600, INTERRUPT FIRST | 1 | 0,0, c5a57780 | So at t=7 a new packets is started but not finished, probably due to rx overrun - but rx overrun is not indicated in the flags. Instead a new packets starts at t=8. This results in skb->len to exceed size for the LAST fragment at t=13 and thus a negative fragment size added to the skb. This then crashes: kernel BUG at include/linux/skbuff.h:2277! Oops: Exception in kernel mode, sig: 5 [#1] ... NIP [c04689f4] skb_pull+0x2c/0x48 LR [c03f62ac] gfar_clean_rx_ring+0x2e4/0x844 Call Trace: [ec4bfd38] [c06a84c4] _raw_spin_unlock_irqrestore+0x60/0x7c (unreliable) [ec4bfda8] [c03f6a44] gfar_poll_rx_sq+0x48/0xe4 [ec4bfdc8] [c048d504] __napi_poll+0x54/0x26c [ec4bfdf8] [c048d908] net_rx_action+0x138/0x2c0 [ec4bfe68] [c06a8f34] __do_softirq+0x3a4/0x4fc [ec4bfed8] [c0040150] run_ksoftirqd+0x58/0x70 [ec4bfee8] [c0066ecc] smpboot_thread_fn+0x184/0x1cc [ec4bff08] [c0062718] kthread+0x140/0x144 [ec4bff38] [c0012350] ret_from_kernel_thread+0x14/0x1c This patch fixes this by checking for computed LAST fragment size, so a negative sized fragment is never added. In order to prevent the newer rx frame from getting corrupted, the FIRST flag is checked to discard the incomplete older frame. Signed-off-by: Michael Braun --- drivers/net/ethernet/freescale/gianfar.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c index 541de32ea662..1cf8ef717453 100644 --- a/drivers/net/ethernet/freescale/gianfar.c +++ b/drivers/net/ethernet/freescale/gianfar.c @@ -2390,6 +2390,10 @@ static bool gfar_add_rx_frag(struct gfar_rx_buff *rxb, u32 lstatus, if (lstatus & BD_LFLAG(RXBD_LAST)) size -= skb->len; + WARN(size < 0, "gianfar: rx fragment size underflow"); + if (size < 0) + return false; + skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page, rxb->page_offset + RXBUF_ALIGNMENT, size, GFAR_RXB_TRUESIZE); @@ -2552,6 +2556,17 @@ static int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, if (lstatus & BD_LFLAG(RXBD_EMPTY)) break; + /* lost RXBD_LAST descriptor due to overrun */ + if (skb && + (lstatus & BD_LFLAG(RXBD_FIRST))) { + /* discard faulty buffer */ + dev_kfree_skb(skb); + skb = NULL; + rx_queue->stats.rx_dropped++; + + /* can continue normally */ + } + /* order rx buffer descriptor reads */ rmb(); -- 2.20.1
Re: [PATCH] net: mellanox: mlx5: fix error return code in mlx5_fpga_device_start()
On 04.03.2021 15:18, Jia-Ju Bai wrote: > When mlx5_is_fpga_lookaside() returns a non-zero value, no error > return code is assigned. > To fix this bug, err is assigned with -EINVAL as error return code. > To me it looks like the current behavior is intentional. Did you verify that it's actually an error condition if the function returns true? Please don't blindly trust such code checkers. > Reported-by: TOTE Robot > Signed-off-by: Jia-Ju Bai > --- > drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c > b/drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c > index 2ce4241459ce..c9e6da97126f 100644 > --- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c > +++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c > @@ -198,8 +198,10 @@ int mlx5_fpga_device_start(struct mlx5_core_dev *mdev) > mlx5_fpga_info(fdev, "FPGA card %s:%u\n", mlx5_fpga_name(fpga_id), > fpga_id); > > /* No QPs if FPGA does not participate in net processing */ > - if (mlx5_is_fpga_lookaside(fpga_id)) > + if (mlx5_is_fpga_lookaside(fpga_id)) { > + err = -EINVAL; > goto out; > + } > > mlx5_fpga_info(fdev, "%s(%d): image, version %u; SBU %06x:%04x version > %d\n", > mlx5_fpga_image_name(fdev->last_oper_image), >
Re: [PATCHv2] gianfar: fix jumbo packets+napi+rx overrun crash
Hi Michael, On Thu, Mar 04, 2021 at 08:52:52PM +0100, michael-...@fami-braun.de wrote: > From: Michael Braun > > When using jumbo packets and overrunning rx queue with napi enabled, > the following sequence is observed in gfar_add_rx_frag: > >| lstatus | | skb | > t | lstatus, size, flags| first | len, data_len, *ptr | > ---+--+---+---+ > 13 | 18002348, 9032, INTERRUPT LAST | 0 | 9600, 8000, f554c12e | > 12 | 1640, 1600, INTERRUPT| 0 | 8000, 6400, f554c12e | > 11 | 1640, 1600, INTERRUPT| 0 | 6400, 4800, f554c12e | > 10 | 1640, 1600, INTERRUPT| 0 | 4800, 3200, f554c12e | > 09 | 1640, 1600, INTERRUPT| 0 | 3200, 1600, f554c12e | > 08 | 14000640, 1600, INTERRUPT FIRST | 0 | 1600, 0, f554c12e | > 07 | 14000640, 1600, INTERRUPT FIRST | 1 | 0,0, f554c12e | > 06 | 1c80, 128, INTERRUPT LAST FIRST | 1 | 0,0, abf3bd6e | > 05 | 18002348, 9032, INTERRUPT LAST | 0 | 8000, 6400, c5a57780 | > 04 | 1640, 1600, INTERRUPT| 0 | 6400, 4800, c5a57780 | > 03 | 1640, 1600, INTERRUPT| 0 | 4800, 3200, c5a57780 | > 02 | 1640, 1600, INTERRUPT| 0 | 3200, 1600, c5a57780 | > 01 | 1640, 1600, INTERRUPT| 0 | 1600, 0, c5a57780 | > 00 | 14000640, 1600, INTERRUPT FIRST | 1 | 0,0, c5a57780 | > > So at t=7 a new packets is started but not finished, probably due to rx > overrun - but rx overrun is not indicated in the flags. Instead a new > packets starts at t=8. This results in skb->len to exceed size for the LAST > fragment at t=13 and thus a negative fragment size added to the skb. > > This then crashes: > > kernel BUG at include/linux/skbuff.h:2277! > Oops: Exception in kernel mode, sig: 5 [#1] > ... > NIP [c04689f4] skb_pull+0x2c/0x48 > LR [c03f62ac] gfar_clean_rx_ring+0x2e4/0x844 > Call Trace: > [ec4bfd38] [c06a84c4] _raw_spin_unlock_irqrestore+0x60/0x7c (unreliable) > [ec4bfda8] [c03f6a44] gfar_poll_rx_sq+0x48/0xe4 > [ec4bfdc8] [c048d504] __napi_poll+0x54/0x26c > [ec4bfdf8] [c048d908] net_rx_action+0x138/0x2c0 > [ec4bfe68] [c06a8f34] __do_softirq+0x3a4/0x4fc > [ec4bfed8] [c0040150] run_ksoftirqd+0x58/0x70 > [ec4bfee8] [c0066ecc] smpboot_thread_fn+0x184/0x1cc > [ec4bff08] [c0062718] kthread+0x140/0x144 > [ec4bff38] [c0012350] ret_from_kernel_thread+0x14/0x1c > > This patch fixes this by checking for computed LAST fragment size, so a > negative sized fragment is never added. > In order to prevent the newer rx frame from getting corrupted, the FIRST > flag is checked to discard the incomplete older frame. > > Signed-off-by: Michael Braun > --- Just for my understanding, do you have a reproducer for the issue? I notice you haven't answered Claudiu's questions posted on v1. On LS1021A I cannot trigger this apparent hardware issue even if I force RX overruns (by reducing the ring size). Judging from the "NIP" register from your stack trace, this is a PowerPC device, which one is it?
Re: [PATCH AUTOSEL 5.10 050/217] rsi: Fix TX EAPOL packet handling against iwlwifi AP
On Tue, Mar 02, 2021 at 08:25:49PM +0100, Marek Vasut wrote: On 12/23/20 3:13 AM, Sasha Levin wrote: Hello Sasha, From: Marek Vasut [ Upstream commit 65277100caa2f2c62b6f3c4648b90d6f0435f3bc ] In case RSI9116 SDIO WiFi operates in STA mode against Intel 9260 in AP mode, the association fails. The former is using wpa_supplicant during association, the later is set up using hostapd: [...] Was this patch possibly missed from 5.10.y ? I'm not sure what happened there, but I can queue it up. Also, while at it, I think it might make sense to pick the following two patches as well, they dramatically reduce interrupt rate of the RSI WiFi device, so it stops overloading lower-end devices: 287431463e786 ("rsi: Move card interrupt handling to RX thread") And this one too. abd131a19f6b8 ("rsi: Clean up loop in the interrupt handler") But not this one, it looks like just a cleanup. Why is it needed? -- Thanks, Sasha
Re: [PATCH] net: sched: avoid duplicates in classes dump
On Thu, Mar 4, 2021 at 6:44 AM Maximilian Heyne wrote: > > This is a follow up of commit ea3274695353 ("net: sched: avoid > duplicates in qdisc dump") which has fixed the issue only for the qdisc > dump. > > The duplicate printing also occurs when dumping the classes via > tc class show dev eth0 > > Fixes: 59cc1f61f09c ("net: sched: convert qdisc linked list to hashtable") > Signed-off-by: Maximilian Heyne Seems reasonable. If you can show the difference of tc class dump before and after this patch in your changelog, it would be helpful to understand the bug here. Thanks.
[PATCH] netlink.7: note not reliable if NETLINK_NO_ENOBUFS
This patch adds a note to the netlink manpage that if NETLINK_NO_ENOBUFS is set there is no additional handling to make netlink reliable. It just disables the error notification. The used word "avoid" receiving ENOBUFS errors can be interpreted that netlink tries to do some additional queue handling to avoid that such scenario occurs at all, e.g. like zerocopy which tries to avoid memory copy. However disable is not the right word here as well that in some cases ENOBUFS can be still received. This patch makes clear that there will no additional handling to put netlink in a more reliable mode. Signed-off-by: Alexander Aring --- man7/netlink.7 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/man7/netlink.7 b/man7/netlink.7 index c69bb62bf..2cb0d1a55 100644 --- a/man7/netlink.7 +++ b/man7/netlink.7 @@ -478,7 +478,7 @@ errors. .\"Author: Pablo Neira Ayuso This flag can be used by unicast and broadcast listeners to avoid receiving .B ENOBUFS -errors. +errors. Note it does not turn netlink into any kind of more reliable mode. .TP .BR NETLINK_LISTEN_ALL_NSID " (since Linux 4.2)" .\"commit 59324cf35aba5336b611074028777838a963d03b -- 2.26.2
[PATCH resend] netlink.7: note not reliable if NETLINK_NO_ENOBUFS
This patch adds a note to the netlink manpage that if NETLINK_NO_ENOBUFS is set there is no additional handling to make netlink reliable. It just disables the error notification. The used word "avoid" receiving ENOBUFS errors can be interpreted that netlink tries to do some additional queue handling to avoid that such scenario occurs at all, e.g. like zerocopy which tries to avoid memory copy. However disable is not the right word here as well that in some cases ENOBUFS can be still received. This patch makes clear that there will no additional handling to put netlink in a more reliable mode. Signed-off-by: Alexander Aring --- resend: - forgot linux-man mailinglist in cc, sorry. man7/netlink.7 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/man7/netlink.7 b/man7/netlink.7 index c69bb62bf..2cb0d1a55 100644 --- a/man7/netlink.7 +++ b/man7/netlink.7 @@ -478,7 +478,7 @@ errors. .\"Author: Pablo Neira Ayuso This flag can be used by unicast and broadcast listeners to avoid receiving .B ENOBUFS -errors. +errors. Note it does not turn netlink into any kind of more reliable mode. .TP .BR NETLINK_LISTEN_ALL_NSID " (since Linux 4.2)" .\"commit 59324cf35aba5336b611074028777838a963d03b -- 2.26.2
Re: [PATCH AUTOSEL 5.10 050/217] rsi: Fix TX EAPOL packet handling against iwlwifi AP
On 3/4/21 9:47 PM, Sasha Levin wrote: On Tue, Mar 02, 2021 at 08:25:49PM +0100, Marek Vasut wrote: On 12/23/20 3:13 AM, Sasha Levin wrote: Hello Sasha, From: Marek Vasut [ Upstream commit 65277100caa2f2c62b6f3c4648b90d6f0435f3bc ] In case RSI9116 SDIO WiFi operates in STA mode against Intel 9260 in AP mode, the association fails. The former is using wpa_supplicant during association, the later is set up using hostapd: [...] Was this patch possibly missed from 5.10.y ? I'm not sure what happened there, but I can queue it up. Thank you Also, while at it, I think it might make sense to pick the following two patches as well, they dramatically reduce interrupt rate of the RSI WiFi device, so it stops overloading lower-end devices: 287431463e786 ("rsi: Move card interrupt handling to RX thread") And this one too. Thanks abd131a19f6b8 ("rsi: Clean up loop in the interrupt handler") But not this one, it looks like just a cleanup. Why is it needed? Now I got confused, yes, please skip abd131a19f6b8, thanks for spotting it. (I still have one more patch for the RSI wifi which I need to send out, but that's for later)
Re: [PATCH net] net: tcp: don't allocate fast clones for fastopen SYN
On Thu, Mar 04, 2021 at 08:41:45PM +0100, Eric Dumazet wrote: > On Thu, Mar 4, 2021 at 8:06 PM Jakub Kicinski wrote: > > > > On Thu, 4 Mar 2021 13:51:15 +0100 Eric Dumazet wrote: > > > I think we are over thinking this really (especially if the fix needs > > > a change in core networking or drivers) > > > > > > We can reuse TSQ logic to have a chance to recover when the clone is > > > eventually freed. > > > This will be more generic, not only for the SYN+data of FastOpen. > > > > > > Can you please test the following patch ? > > > > #7 - Eric comes up with something much better :) > > > > > > But so far doesn't seem to quite do it, I'm looking but maybe you'll > > know right away (FWIW testing a v5.6 backport but I don't think TSQ > > changed?): > > > > On __tcp_retransmit_skb kretprobe: > > > > ==> Hit TFO case ret:-16 ca_state:0 skb:888fdb4bac00! > > > > First hit: > > __tcp_retransmit_skb+1 > > tcp_rcv_state_process+2488 > > tcp_v6_do_rcv+405 > > tcp_v6_rcv+2984 > > ip6_protocol_deliver_rcu+180 > > ip6_input_finish+17 > > > > Successful hit: > > __tcp_retransmit_skb+1 > > tcp_retransmit_skb+18 > > tcp_retransmit_timer+716 > > tcp_write_timer_handler+136 > > tcp_write_timer+141 > > call_timer_fn+43 > > > > skb:888fdb4bac00 --- delay:51642us bytes_acked:1 > > > Humm maybe one of the conditions used in tcp_tsq_write() does not hold... > > if (tp->lost_out > tp->retrans_out && > tp->snd_cwnd > tcp_packets_in_flight(tp)) { > tcp_mstamp_refresh(tp); > tcp_xmit_retransmit_queue(sk); > } > > Maybe FastOpen case is 'special' and tp->lost_out is wrong. Something like this? (completely untested) -- Jonathan diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 69a545db80d2..92bc9b0f4955 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -5995,8 +5995,10 @@ static bool tcp_rcv_fastopen_synack(struct sock *sk, struct sk_buff *synack, else tp->fastopen_client_fail = TFO_DATA_NOT_ACKED; skb_rbtree_walk_from(data) { + tcp_mark_skb_lost(sk, data); if (__tcp_retransmit_skb(sk, data, 1)) break; + tp->retrans_out += tcp_skb_pcount(data); } tcp_rearm_rto(sk); NET_INC_STATS(sock_net(sk),