Re: [PATCH iproute2 5/6] man8/bridge.8: explain self vs master for "bridge fdb add"
Thank you very much Vladimir for improving this man page. I am still struggling with the meaning of the bridge attributes and sometimes the man page has caused more confusion. In the section about 'bridge link set' Self vs master mention physical device vs software bridge. Would it make sense to use the same terminology here? The attributes are listed under 'bridge fdb add' not under 'bridge fdb show'. Is it correct that the attributes displayed by 'show' are a 1-to-1 representation of the ones set by 'add'? What about the entries that are not manually set, like bridge learned adresses? Is it possible to add some explanation about those as well? On 11.02.21 11:45, Vladimir Oltean wrote: > From: Vladimir Oltean > > The "usually hardware" and "usually software" distinctions make no > sense, try to clarify what these do based on the actual kernel behavior. > > Signed-off-by: Vladimir Oltean > --- > man/man8/bridge.8 | 15 --- > 1 file changed, 12 insertions(+), 3 deletions(-) > > diff --git a/man/man8/bridge.8 b/man/man8/bridge.8 > index 1dc0aec83f09..d0bcd708bb61 100644 > --- a/man/man8/bridge.8 > +++ b/man/man8/bridge.8 > @@ -533,12 +533,21 @@ specified. > .sp > > .B self > -- the address is associated with the port drivers fdb. Usually hardware > - (default). > +- the operation is fulfilled directly by the driver for the specified network > +device. If the network device belongs to a master like a bridge, then the > +bridge is bypassed and not notified of this operation (and if the device does > +notify the bridge, it is driver-specific behavior and not mandated by this > +flag, check the driver for more details). The "bridge fdb add" command can > also > +be used on the bridge device itself, and in this case, the added fdb entries > +will be locally terminated (not forwarded). In the latter case, the "self" > flag > +is mandatory. Maybe I misunderstand this sentence, but I can do a 'bridge fdb add' without 'self' on the bridge device. And the address shows up under 'bridge fdb show'. So what does mandatory mean here? The flag is set by default if "master" is not specified. > .sp > > .B master > -- the address is associated with master devices fdb. Usually software. > +- if the specified network device is a port that belongs to a master device > +such as a bridge, the operation is fulfilled by the master device's driver, > +which may in turn notify the port driver too of the address. If the specified > +device is a master itself, such as a bridge, this flag is invalid. > .sp > > .B router >
Re: [PATCH 1/2 net-next] net/mlx5e: TC: Fix IS_ERR() vs NULL checks
On Mon, Sep 28, 2020 at 06:31:04PM +, Ariel Levkovich wrote: > On Sep 28, 2020, at 13:42, Dan Carpenter wrote: > > > > The mlx5_tc_ct_init() function doesn't return error pointers it returns > > NULL. Also we need to set the error codes on this path. > > > > Fixes: aedd133d17bc ("net/mlx5e: Support CT offload for tc nic flows") > > Signed-off-by: Dan Carpenter > > --- > > drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 8 ++-- > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c > > b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c > > index 104b1c339de0..438fbcf478d1 100644 > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c > > @@ -5224,8 +5224,10 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv) > > > >tc->ct = mlx5_tc_ct_init(priv, tc->chains, &priv->fs.tc.mod_hdr, > > MLX5_FLOW_NAMESPACE_KERNEL); > > -if (IS_ERR(tc->ct)) > > +if (!tc->ct) { > > +err = -ENOMEM; > >goto err_ct; > > +} > > Hi Dan, > That was implement like that on purpose. If mlx5_tc_init_ct returns NULL it > means the device doesn’t support CT offload which can happen with older > devices or old FW on the devices. > However, in this case we want to continue with the rest of the Tc > initialization because we can still support other TC offloads. No need to > fail the entire TC init in this case. Only if mlx5_tc_init_ct return err_ptr > that means the tc init failed not because of lack of support but due to a > real error and only then we want to fail the rest of the tc init. > > Your change will break compatibility for devices/FW versions that don’t have > CT offload support. > When we have a function like this which is optional then returning NULL is a special kind of success as you say. Returning NULL should not generate a warning message. At the same time, if the user enables the option and the code fails because we are low on memory then returning an error pointer is the correct behavior. Just because the feature is optional does not mean we should ignore what the user told us to do. This code never returns error pointers. It always returns NULL/success when an allocation fails. That triggers the first static checker warning from last year. Now Smatch is complaining about a new static checker warning: drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:4754 mlx5e_tc_esw_init() warn: missing error code here? 'IS_ERR()' failed. 'err' = '0' 4708 int mlx5e_tc_esw_init(struct rhashtable *tc_ht) 4709 { 4710 const size_t sz_enc_opts = sizeof(struct tunnel_match_enc_opts); 4711 struct mlx5_rep_uplink_priv *uplink_priv; 4712 struct mlx5e_rep_priv *rpriv; 4713 struct mapping_ctx *mapping; 4714 struct mlx5_eswitch *esw; 4715 struct mlx5e_priv *priv; 4716 int err = 0; 4717 4718 uplink_priv = container_of(tc_ht, struct mlx5_rep_uplink_priv, tc_ht); 4719 rpriv = container_of(uplink_priv, struct mlx5e_rep_priv, uplink_priv); 4720 priv = netdev_priv(rpriv->netdev); 4721 esw = priv->mdev->priv.eswitch; 4722 4723 uplink_priv->ct_priv = mlx5_tc_ct_init(netdev_priv(priv->netdev), 4724 esw_chains(esw), 4725 &esw->offloads.mod_hdr, 4726 MLX5_FLOW_NAMESPACE_FDB); 4727 if (IS_ERR(uplink_priv->ct_priv)) 4728 goto err_ct; If mlx5_tc_ct_init() fails, which it should do if kmalloc() fails but currently it does not, then the error should be propagated all the way back. So this code should preserve the error code instead of returning success. 4729 4730 mapping = mapping_create(sizeof(struct tunnel_match_key), 4731 TUNNEL_INFO_BITS_MASK, true); regards, dan carpenter
Re: [PATCH net-next V1] net: followup adjust net_device layout for cacheline usage
On Fri, 12 Feb 2021 18:03:44 +0100 Eric Dumazet wrote: > On 2/12/21 2:50 PM, Jesper Dangaard Brouer wrote: > > As Eric pointed out in response to commit 28af22c6c8df ("net: adjust > > net_device layout for cacheline usage") the netdev_features_t members > > wanted_features and hw_features are only used in control path. > > > > Thus, this patch reorder the netdev_features_t to let more members that > > are used in fast path into the 3rd cacheline. Whether these members are > > read depend on SKB properties, which are hinted as comments. The member > > mpls_features could not fit in the cacheline, but it was the least > > commonly used (depend on CONFIG_NET_MPLS_GSO). > > > > In the future we should consider relocating member gso_partial_features > > to be closer to member gso_max_segs. (see usage in gso_features_check()). > > > > Suggested-by: Eric Dumazet > > Signed-off-by: Jesper Dangaard Brouer > > --- > > include/linux/netdevice.h | 11 +++ > > 1 file changed, 7 insertions(+), 4 deletions(-) > > > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > > index bfadf3b82f9c..3898bb167579 100644 > > --- a/include/linux/netdevice.h > > +++ b/include/linux/netdevice.h > > @@ -1890,13 +1890,16 @@ struct net_device { > > unsigned short needed_headroom; > > unsigned short needed_tailroom; > > > > + /* Fast path features - via netif_skb_features */ > > netdev_features_t features; > > + netdev_features_t vlan_features; /* if skb_vlan_tagged */ > > + netdev_features_t hw_enc_features; /* if skb->encapsulation */ > > + netdev_features_t gso_partial_features;/* if skb_is_gso */ > > + netdev_features_t mpls_features; /* if eth_p_mpls+NET_MPLS_GSO */ > > + > > + /* Control path features */ > > netdev_features_t hw_features; > > netdev_features_t wanted_features; > > - netdev_features_t vlan_features; > > - netdev_features_t hw_enc_features; > > - netdev_features_t mpls_features; > > - netdev_features_t gso_partial_features; > > > > unsigned intmin_mtu; > > unsigned intmax_mtu; > > > > > > > Please also note we currently have at least 3 distinct blocks for tx path. > > Presumably netdev_features_t are only used in TX, so should be grouped with > the other TX > sections. > > > /* --- cacheline 3 boundary (192 bytes) --- */ > ... > netdev_features_t features; /* 0xe0 0x8 */ > > ... Lots of ctrl stuff > > > /* --- cacheline 14 boundary (896 bytes) --- */ > struct netdev_queue * _tx __attribute__((__aligned__(64))); /* > 0x380 0x8 */ > > > > > /* Mix of unrelated control stuff like rtnl_link_ops > > /* --- cacheline 31 boundary (1984 bytes) --- */ > unsigned int gso_max_size; /* 0x7c0 0x4 */ > u16gso_max_segs; /* 0x7c4 0x2 */ > > > > Ideally we should move _all_ control/slow_path stuff at the very end of the > structure, > in order to not pollute the cache lines we need for data path, to keep them > as small > and packed as possible. > > This could be done one field at a time, to ease code review. > > We should have something like this > > /* section used in RX (fast) path */ > /* section used in both RX/TX (fast) path */ > /* section used in TX (fast) path */ > /* section used for slow path, and control path */ I fully agree with above, but this is the long term plan, that I have added to my TODO list. This patch is a followup to commit 28af22c6c8df ("net: adjust net_device layout for cacheline usage") for fixing that the feature members got partitioned into two cache-lines. -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer
Re: [RFC PATCH v4 07/17] af_vsock: rest of SEQPACKET support
On 11.02.2021 15:27, Stefano Garzarella wrote: > On Sun, Feb 07, 2021 at 06:16:12PM +0300, Arseny Krasnov wrote: >> This does rest of SOCK_SEQPACKET support: >> 1) Adds socket ops for SEQPACKET type. >> 2) Allows to create socket with SEQPACKET type. >> >> Signed-off-by: Arseny Krasnov >> --- >> net/vmw_vsock/af_vsock.c | 37 - >> 1 file changed, 36 insertions(+), 1 deletion(-) >> >> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c >> index a033d3340ac4..c77998a14018 100644 >> --- a/net/vmw_vsock/af_vsock.c >> +++ b/net/vmw_vsock/af_vsock.c >> @@ -452,6 +452,7 @@ int vsock_assign_transport(struct vsock_sock *vsk, >> struct vsock_sock *psk) >> new_transport = transport_dgram; >> break; >> case SOCK_STREAM: >> +case SOCK_SEQPACKET: >> if (vsock_use_local_transport(remote_cid)) >> new_transport = transport_local; >> else if (remote_cid <= VMADDR_CID_HOST || !transport_h2g || >> @@ -459,6 +460,15 @@ int vsock_assign_transport(struct vsock_sock *vsk, >> struct vsock_sock *psk) >> new_transport = transport_g2h; >> else >> new_transport = transport_h2g; >> + >> +if (sk->sk_type == SOCK_SEQPACKET) { >> +if (!new_transport || >> +!new_transport->seqpacket_seq_send_len || >> +!new_transport->seqpacket_seq_send_eor || >> +!new_transport->seqpacket_seq_get_len || >> +!new_transport->seqpacket_dequeue) >> +return -ESOCKTNOSUPPORT; >> +} > Maybe we should move this check after the try_module_get() call, since > the memory pointed by 'new_transport' pointer can be deallocated in the > meantime. > > Also, if the socket had a transport before, we should deassign it before > returning an error. I think previous transport is deassigned immediately after this 'switch()' on sk->sk_type: if (vsk->transport) { ... vsock_deassign_transport(vsk); } Ok, check will be moved after 'try_module_get()'. > >> break; >> default: >> return -ESOCKTNOSUPPORT; >> @@ -684,6 +694,7 @@ static int __vsock_bind(struct sock *sk, struct >> sockaddr_vm *addr) >> >> switch (sk->sk_socket->type) { >> case SOCK_STREAM: >> +case SOCK_SEQPACKET: >> spin_lock_bh(&vsock_table_lock); >> retval = __vsock_bind_connectible(vsk, addr); >> spin_unlock_bh(&vsock_table_lock); >> @@ -769,7 +780,7 @@ static struct sock *__vsock_create(struct net *net, >> >> static bool sock_type_connectible(u16 type) >> { >> -return type == SOCK_STREAM; >> +return (type == SOCK_STREAM) || (type == SOCK_SEQPACKET); >> } >> >> static void __vsock_release(struct sock *sk, int level) >> @@ -2199,6 +2210,27 @@ static const struct proto_ops vsock_stream_ops = { >> .sendpage = sock_no_sendpage, >> }; >> >> +static const struct proto_ops vsock_seqpacket_ops = { >> +.family = PF_VSOCK, >> +.owner = THIS_MODULE, >> +.release = vsock_release, >> +.bind = vsock_bind, >> +.connect = vsock_connect, >> +.socketpair = sock_no_socketpair, >> +.accept = vsock_accept, >> +.getname = vsock_getname, >> +.poll = vsock_poll, >> +.ioctl = sock_no_ioctl, >> +.listen = vsock_listen, >> +.shutdown = vsock_shutdown, >> +.setsockopt = vsock_connectible_setsockopt, >> +.getsockopt = vsock_connectible_getsockopt, >> +.sendmsg = vsock_connectible_sendmsg, >> +.recvmsg = vsock_connectible_recvmsg, >> +.mmap = sock_no_mmap, >> +.sendpage = sock_no_sendpage, >> +}; >> + >> static int vsock_create(struct net *net, struct socket *sock, >> int protocol, int kern) >> { >> @@ -2219,6 +2251,9 @@ static int vsock_create(struct net *net, struct socket >> *sock, >> case SOCK_STREAM: >> sock->ops = &vsock_stream_ops; >> break; >> +case SOCK_SEQPACKET: >> +sock->ops = &vsock_seqpacket_ops; >> +break; >> default: >> return -ESOCKTNOSUPPORT; >> } >> -- >> 2.25.1 >> >
Re: [PATCH 2/4] net: stmmac: Add Toshiba Visconti SoCs glue driver
On Mon, Feb 15, 2021 at 04:28:09PM +0900, Nobuhiro Iwamatsu wrote: > Hi, > > Thanks for your review. > > On Mon, Feb 15, 2021 at 08:07:21AM +0200, Leon Romanovsky wrote: > > On Mon, Feb 15, 2021 at 02:06:53PM +0900, Nobuhiro Iwamatsu wrote: > > > Add dwmac-visconti to the stmmac driver in Toshiba Visconti ARM SoCs. > > > This patch contains only the basic function of the device. There is no > > > clock control, PM, etc. yet. These will be added in the future. > > > > > > Signed-off-by: Nobuhiro Iwamatsu > > > --- > > > drivers/net/ethernet/stmicro/stmmac/Kconfig | 8 + > > > drivers/net/ethernet/stmicro/stmmac/Makefile | 1 + > > > .../ethernet/stmicro/stmmac/dwmac-visconti.c | 285 ++ > > > 3 files changed, 294 insertions(+) > > > create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c > > > > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig > > > b/drivers/net/ethernet/stmicro/stmmac/Kconfig > > > index 53f14c5a9e02..55ba67a550b9 100644 > > > --- a/drivers/net/ethernet/stmicro/stmmac/Kconfig > > > +++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig > > > @@ -219,6 +219,14 @@ config DWMAC_INTEL_PLAT > > > This selects the Intel platform specific glue layer support for > > > the stmmac device driver. This driver is used for the Intel Keem Bay > > > SoC. > > > + > > > +config DWMAC_VISCONTI > > > + bool "Toshiba Visconti DWMAC support" > > > + def_bool y > > > > Sorry, I sent the wrong patchset that didn't fix this point out. > > > I asked it before, but never received an answer. > > I have received your point out and have sent an email with the content > to remove this line. But it may not have arrived yet... > > > Why did you use "def_bool y" and not "default y"? Isn't it supposed to be > > "depends on STMMAC_ETH"? And probably it shouldn't be set as a default as > > "y". > > > > The reason why "def_bool y" was set is that the wrong fix was left when > debugging. Also, I don't think it is necessary to set "default y". > This is also incorrect because it says "bool" Toshiba Visconti DWMAC > support "". I change it to trustate in the new patch. > > And this driver is enabled when STMMAC_PLATFORM was Y. And STMMAC_PLATFORM > depends on STMMAC_ETH. > So I understand that STMMAC_ETH does not need to be dependents. Is this > understanding wrong? This is correct understanding, just need to clean other entries in that Kconfig that depends on STMMAC_ETH. Thanks > > > Thanks > > > > Best regards, > Nobuhiro
Re: [PATCH net-next RFC v3] net: hdlc_x25: Queue outgoing LAPB frames
On Sun, Feb 14, 2021 at 11:27:03PM -0800, Xie He wrote: > When sending packets, we will first hand over the (L3) packets to the > LAPB module. The LAPB module will then hand over the corresponding LAPB > (L2) frames back to us for us to transmit. > > The LAPB module can also emit LAPB (L2) frames at any time, even without > an (L3) packet currently being sent on the device. This happens when the > LAPB module tries to send (L3) packets queued up in its internal queue, > or when the LAPB module decides to send some (L2) control frame. > > This means we need to have a queue for these outgoing LAPB (L2) frames, > otherwise frames can be dropped if sent when the hardware driver is > already busy in transmitting. The queue needs to be controlled by > the hardware driver's netif_stop_queue and netif_wake_queue calls. > Therefore, we need to use the device's qdisc TX queue for this purpose. > However, currently outgoing LAPB (L2) frames are not queued. > > On the other hand, outgoing (L3) packets (before they are handed over > to the LAPB module) don't need to be queued, because the LAPB module > already has an internal queue for them, and is able to queue new outgoing > (L3) packets at any time. However, currently outgoing (L3) packets are > being queued in the device's qdisc TX queue, which is controlled by > the hardware driver's netif_stop_queue and netif_wake_queue calls. > This is unnecessary and meaningless. > > To fix these issues, we can split the HDLC device into two devices - > a virtual X.25 device and the actual HDLC device, use the virtual X.25 > device to send (L3) packets and then use the actual HDLC device to > queue LAPB (L2) frames. The outgoing (L2) LAPB queue will be controlled > by the hardware driver's netif_stop_queue and netif_wake_queue calls, > while outgoing (L3) packets will not be affected by these calls. > > Cc: Martin Schiller > Signed-off-by: Xie He > --- > > Change from RFC v2: > Simplified the commit message. > Dropped the x25_open fix which is already merged into net-next now. > Use HDLC_MAX_MTU as the mtu of the X.25 virtual device. > Add an explanation to the documentation about the X.25 virtual device. > > Change from RFC v1: > Properly initialize state(hdlc)->x25_dev and state(hdlc)->x25_dev_lock. > > --- > Documentation/networking/generic-hdlc.rst | 3 + > drivers/net/wan/hdlc_x25.c| 153 ++ > 2 files changed, 130 insertions(+), 26 deletions(-) <...> > +static void x25_setup_virtual_dev(struct net_device *dev) > +{ > + dev->netdev_ops = &hdlc_x25_netdev_ops; > + dev->type= ARPHRD_X25; > + dev->addr_len= 0; > + dev->hard_header_len = 0; > + dev->mtu = HDLC_MAX_MTU; > + > + /* When transmitting data: > + * first we'll remove a pseudo header of 1 byte, > + * then the LAPB module will prepend an LAPB header of at most 3 bytes. > + */ > + dev->needed_headroom = 3 - 1; 3 - 1 = 2 Thanks > +} > +
Re: [PATCH iproute2 5/6] man8/bridge.8: explain self vs master for "bridge fdb add"
Hi Alexandra, On Mon, Feb 15, 2021 at 09:22:47AM +0100, Alexandra Winter wrote: > In the section about 'bridge link set' Self vs master mention physical > device vs software bridge. Would it make sense to use the same > terminology here? You mean like this? .TP .B self operation is fulfilled by the driver of the specified network interface. .TP .B master operation is fulfilled by the specified interface's master, for example a bridge, which in turn may or may not notify the underlying network interface driver. This flag is considered implicit by the kernel if 'self' was not specified. > The attributes are listed under 'bridge fdb add' not under 'bridge fdb > show'. Is it correct that the attributes displayed by 'show' are a > 1-to-1 representation of the ones set by 'add'? Bah, not quite. I'll try to summarize below. > What about the entries that are not manually set, like bridge learned > adresses? Is it possible to add some explanation about those as well? Ok, challenge accepted. Here's my take on 'bridge fdb show', I haven't used most of these options (I'm commenting solely based on code inspection) so if anybody with more experience could chime in, I'd be happy to adjust the wording. .SS bridge fdb show - list forwarding entries. This command displays the current forwarding table. By default all FDB entries in the system are shown. The following options can be used to reduce the number of displayed entries: .TP .B br Filter the output to contain only the FDB entries of the specified bridge, or belonging to ports of the specified bridge (optional). .B brport Filter the output to contain only the FDB entries present on the specified network interface (bridge port). This flag is optional. .B dev Same as "brport". .B vlan Filter the output to contain only the FDB entries with the specified VLAN ID (optional). .B dynamic Filter out the local/permanent (not forwarded) FDB entries. .B state Filter the output to contain only the FDB entries having the specified state. The bridge FDB is modeled as a neighbouring protocol for PF_BRIDGE (similar to what ARP is for IPv4 and ND is for IPv6). Therefore, an FDB entry has a NUD ("Network Unreachability Detection") state given by the generic neighbouring layer. The following are the valid components of an FDB entry state (more than one may be valid at the same time): .B permanent Associated with the generic NUD_PERMANENT state, which means that the L2 address of the neighbor has been statically configured by the user and therefore there is no need for a neighbour resolution. For the bridge FDB, it means that an FDB entry is 'local', i.e. the L2 address belongs to a local interface. .B reachable Associated with the generic NUD_REACHABLE state, which means that the L2 address has been resolved by the neighbouring protocol. A reachable bridge FDB entry can have two sub-states (static and dynamic) detailed below. .B static Associated with the generic NUD_NOARP state, which is used to denote a neighbour for which no protocol is needed to resolve the mapping between the L3 address and L2 address. For the bridge FDB, the neighbour resolution protocol is source MAC address learning, therefore a static FDB entry is one that has not been learnt. .B dynamic Is a NUD_REACHABLE entry that lacks the NUD_NOARP state, therefore has been resolved through address learning. .B stale Associated with the generic NUD_STALE state. Denotes an FDB entry that was last updated longer ago than the bridge's hold time, but not yet removed. The hold time is equal to the forward_delay (if the STP topology is still changing) or to the ageing_time otherwise. .PP In the resulting output, each FDB entry can have one or more of the following flags: .B self This entry is present in the FDB of the specified network interface driver. .B router ??? .B extern_learn This entry has been added to the master interface's FDB by the lower port driver, as a result of hardware address learning. .B offload This entry is present in the hardware FDB of a lower port and also associated with an entry of the master interface. .B master This entry is present in the software FDB of the master interface of this lower port. .B sticky This entry cannot be migrated to another port by the address learning process. .PP With the .B -statistics option, the command becomes verbose. It prints out the last updated and last used time for each entry. > > .B self > > -- the address is associated with the port drivers fdb. Usually hardware > > - (default). > > +- the operation is fulfilled directly by the driver for the specified > > network > > +device. If the network device belongs to a master like a bridge, then the > > +bridge is bypassed and not notified of this operation (and if the device > > does > > +notify the bridge, it is driver-specific behavior and not mandated by this > > +flag, check the driver for more details). The "bridge fdb add" command can > > also > > +be used on the bridge device
Re: [Patch bpf-next v3 1/5] bpf: clean up sockmap related Kconfigs
On Sat, 13 Feb 2021 at 21:44, Cong Wang wrote: > > From: Cong Wang > > As suggested by John, clean up sockmap related Kconfigs: > > Reduce the scope of CONFIG_BPF_STREAM_PARSER down to TCP stream > parser, to reflect its name. > > Make the rest sockmap code simply depend on CONFIG_BPF_SYSCALL. > And leave CONFIG_NET_SOCK_MSG untouched, as it is used by > non-sockmap cases. For the series: Reviewed-by: Lorenz Bauer Jakub, John: can you please take another look at the assembly in patch 3? -- Lorenz Bauer | Systems Engineer 6th Floor, County Hall/The Riverside Building, SE1 7PB, UK www.cloudflare.com
Re: [PATCH 1/2] lockdep: add lockdep_assert_not_held()
On Sun, Feb 14, 2021 at 06:53:01PM +0100, Peter Zijlstra wrote: > On Fri, Feb 12, 2021 at 04:28:42PM -0700, Shuah Khan wrote: > > > +#define lockdep_assert_not_held(l) do {\ > > + WARN_ON(debug_locks && lockdep_is_held(l)); \ > > + } while (0) > > + > > This thing isn't as straight forward as you might think, but it'll > mostly work. > > Notably this thing will misfire when lockdep_off() is employed. It > certainyl needs a comment to explain the subtleties. I think something like so will work, but please double check. diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index b9e9adec73e8..c8b0d292bf8e 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -294,11 +294,15 @@ extern void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie); #define lockdep_depth(tsk) (debug_locks ? (tsk)->lockdep_depth : 0) -#define lockdep_assert_held(l) do {\ - WARN_ON(debug_locks && !lockdep_is_held(l));\ +#define lockdep_assert_held(l) do {\ + WARN_ON(debug_locks && lockdep_is_held(l) == 0)); \ } while (0) -#define lockdep_assert_held_write(l) do {\ +#define lockdep_assert_not_held(l) do {\ + WARN_ON(debug_locks && lockdep_is_held(l) == 1)); \ + } while (0) + +#define lockdep_assert_held_write(l) do {\ WARN_ON(debug_locks && !lockdep_is_held_type(l, 0));\ } while (0) diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c index c1418b47f625..983ba206f7b2 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -5467,7 +5467,7 @@ noinstr int lock_is_held_type(const struct lockdep_map *lock, int read) int ret = 0; if (unlikely(!lockdep_enabled())) - return 1; /* avoid false negative lockdep_assert_held() */ + return -1; /* avoid false negative lockdep_assert_held() */ raw_local_irq_save(flags); check_flags(flags);
Re: [PATCH iproute2 5/6] man8/bridge.8: explain self vs master for "bridge fdb add"
On 15.02.21 11:32, Vladimir Oltean wrote: > Hi Alexandra, > > On Mon, Feb 15, 2021 at 09:22:47AM +0100, Alexandra Winter wrote: >> In the section about 'bridge link set' Self vs master mention physical >> device vs software bridge. Would it make sense to use the same >> terminology here? > > You mean like this? > > .TP > .B self > operation is fulfilled by the driver of the specified network interface. > > .TP > .B master > operation is fulfilled by the specified interface's master, for example > a bridge, which in turn may or may not notify the underlying network > interface driver. This flag is considered implicit by the kernel if > 'self' was not specified. > Actually, I found your first (more verbose) proposal more helpful. >> The attributes are listed under 'bridge fdb add' not under 'bridge fdb >> show'. Is it correct that the attributes displayed by 'show' are a >> 1-to-1 representation of the ones set by 'add'? > > Bah, not quite. I'll try to summarize below. > >> What about the entries that are not manually set, like bridge learned >> adresses? Is it possible to add some explanation about those as well? > > Ok, challenge accepted. Here's my take on 'bridge fdb show', I haven't > used most of these options (I'm commenting solely based on code > inspection) so if anybody with more experience could chime in, I'd be > happy to adjust the wording. > > > .SS bridge fdb show - list forwarding entries. > > This command displays the current forwarding table. By default all FDB > entries in the system are shown. The following options can be used to > reduce the number of displayed entries: > > .TP > .B br > Filter the output to contain only the FDB entries of the specified > bridge, or belonging to ports of the specified bridge (optional). > > .B brport > Filter the output to contain only the FDB entries present on the > specified network interface (bridge port). This flag is optional. > > .B dev > Same as "brport". > > .B vlan > Filter the output to contain only the FDB entries with the specified > VLAN ID (optional). > > .B dynamic > Filter out the local/permanent (not forwarded) FDB entries. > > .B state > Filter the output to contain only the FDB entries having the specified > state. The bridge FDB is modeled as a neighbouring protocol for > PF_BRIDGE (similar to what ARP is for IPv4 and ND is for IPv6). > Therefore, an FDB entry has a NUD ("Network Unreachability Detection") > state given by the generic neighbouring layer. > > The following are the valid components of an FDB entry state (more than > one may be valid at the same time): > > .B permanent > Associated with the generic NUD_PERMANENT state, which means that the L2 > address of the neighbor has been statically configured by the user and > therefore there is no need for a neighbour resolution. > For the bridge FDB, it means that an FDB entry is 'local', i.e. the L2 > address belongs to a local interface. > > .B reachable > Associated with the generic NUD_REACHABLE state, which means that the L2 > address has been resolved by the neighbouring protocol. A reachable > bridge FDB entry can have two sub-states (static and dynamic) detailed > below. > > .B static > Associated with the generic NUD_NOARP state, which is used to denote a > neighbour for which no protocol is needed to resolve the mapping between > the L3 address and L2 address. For the bridge FDB, the neighbour > resolution protocol is source MAC address learning, therefore a static > FDB entry is one that has not been learnt. > > .B dynamic > Is a NUD_REACHABLE entry that lacks the NUD_NOARP state, therefore has > been resolved through address learning. > > .B stale > Associated with the generic NUD_STALE state. Denotes an FDB entry that > was last updated longer ago than the bridge's hold time, but not yet > removed. The hold time is equal to the forward_delay (if the STP > topology is still changing) or to the ageing_time otherwise. > > > .PP > In the resulting output, each FDB entry can have one or more of the > following flags: > > .B self > This entry is present in the FDB of the specified network interface driver. > > .B router > ??? > > .B extern_learn > This entry has been added to the master interface's FDB by the lower > port driver, as a result of hardware address learning. > > .B offload > This entry is present in the hardware FDB of a lower port and also > associated with an entry of the master interface. > > .B master > This entry is present in the software FDB of the master interface of > this lower port. > > .B sticky > This entry cannot be migrated to another port by the address learning > process. > > .PP > With the > .B -statistics > option, the command becomes verbose. It prints out the last updated > and last used time for each entry. > Thank you so much!! This will be very helpful. >>> .B self >>> -- the address is associated with the port drivers fdb. Usually hardware >>> - (default). >>> +- the operation is fulfilled dire
Re: linux-next: manual merge of the net-next tree with the net tree
On Mon, Feb 15, 2021 at 11:43:54AM +1100, Stephen Rothwell wrote: > Hi all, > > Today's linux-next merge of the net-next tree got a conflict in: > > tools/testing/selftests/net/forwarding/tc_flower.sh > > between commit: > > d2126838050c ("flow_dissector: fix TTL and TOS dissection on IPv4 > fragments") > > from the net tree and commits: > > 203ee5cd7235 ("selftests: tc: Add basic mpls_* matching support for > tc-flower") > c09bfd9a5df9 ("selftests: tc: Add generic mpls matching support for > tc-flower") > > from the net-next tree. > > I fixed it up (see below) and can carry the fix as necessary. This > is now fixed as far as linux-next is concerned, but any non trivial > conflicts should be mentioned to your upstream maintainer when your tree > is submitted for merging. You may also want to consider cooperating > with the maintainer of the conflicting tree to minimise any particularly > complex conflicts. > > -- > Cheers, > Stephen Rothwell > > diff --cc tools/testing/selftests/net/forwarding/tc_flower.sh > index b11d8e6b5bc1,a554838666c4.. > --- a/tools/testing/selftests/net/forwarding/tc_flower.sh > +++ b/tools/testing/selftests/net/forwarding/tc_flower.sh > @@@ -3,7 -3,9 +3,9 @@@ > > ALL_TESTS="match_dst_mac_test match_src_mac_test match_dst_ip_test \ > match_src_ip_test match_ip_flags_test match_pcp_test match_vlan_test \ > - match_ip_tos_test match_indev_test match_ip_ttl_test" > + match_ip_tos_test match_indev_test match_mpls_label_test \ > + match_mpls_tc_test match_mpls_bos_test match_mpls_ttl_test \ > -match_mpls_lse_test" > ++match_mpls_lse_test match_ip_ttl_test" That's technically right. But I think it'd be nicer to have "match_ip_ttl_test" appear between "match_ip_tos_test" and "match_indev_test", rather than at the end of the list. Before these commits, ALL_TESTS listed the tests in the order they were implemented in the rest of the file. So I'd rather continue following this implicit rule, if at all possible. Also it makes sense to keep grouping all match_ip_*_test together. > NUM_NETIFS=2 > source tc_common.sh > source lib.sh
Re: Regressions with VMBus/VSCs hardening changes
On Fri, Feb 12, 2021 at 05:50:50PM +0100, Andrea Parri wrote: > Hi all, [...] > 2) IIUC a8c3209998afb5 could be dropped (after rebase) without further modi- >fications to hyperv-next. I've reverted the said patch from hyperv-next. Wei.
Re: linux-next: manual merge of the net-next tree with the net tree
On Mon, 2021-02-15 at 12:01 +0100, Guillaume Nault wrote: > Before these commits, ALL_TESTS listed the tests in the order they were > implemented in the rest of the file. So I'd rather continue following > this implicit rule, if at all possible. Also it makes sense to keep > grouping all match_ip_*_test together. yes, it makes sense. I can follow-up with a commit for net-next (when tree re-opens), where the "ordering" in ALL_TESTS is restored. Ok? thanks, -- davide
[PATCH net] cxgb4/chtls/cxgbit: Keeping the max ofld immediate data size same in cxgb4 and ulds
The Max imm data size in cxgb4 is not similar to the max imm data size in the chtls. This caused an mismatch in output of is_ofld_imm() of cxgb4 and chtls. So fixed this by keeping the max wreq size of imm data same in both chtls and cxgb4 as MAX_IMM_OFLD_TX_DATA_WR_LEN. As cxgb4's max imm. data value for ofld packets is changed to MAX_IMM_OFLD_TX_DATA_WR_LEN. Using the same in cxgbit also. Fixes: 36bedb3f2e5b8 ("crypto: chtls - Inline TLS record Tx") Signed-off-by: Ayush Sawal --- drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h| 3 +++ drivers/net/ethernet/chelsio/cxgb4/sge.c | 11 --- .../ethernet/chelsio/inline_crypto/chtls/chtls_cm.h | 3 --- drivers/target/iscsi/cxgbit/cxgbit_target.c | 3 +-- 4 files changed, 12 insertions(+), 8 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h index 1b49f2fa9b18..34546f5312ee 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h @@ -46,6 +46,9 @@ #define MAX_ULD_QSETS 16 #define MAX_ULD_NPORTS 4 +/* ulp_mem_io + ulptx_idata + payload + padding */ +#define MAX_IMM_ULPTX_WR_LEN (32 + 8 + 256 + 8) + /* CPL message priority levels */ enum { CPL_PRIORITY_DATA = 0, /* data messages */ diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c index 196652a114c5..3334c9e2152a 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/sge.c +++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c @@ -2842,17 +2842,22 @@ int t4_mgmt_tx(struct adapter *adap, struct sk_buff *skb) * @skb: the packet * * Returns true if a packet can be sent as an offload WR with immediate - * data. We currently use the same limit as for Ethernet packets. + * data. + * FW_OFLD_TX_DATA_WR limits the payload to 255 bytes due to 8-bit field. + * However, FW_ULPTX_WR commands have a 256 byte immediate only + * payload limit. */ static inline int is_ofld_imm(const struct sk_buff *skb) { struct work_request_hdr *req = (struct work_request_hdr *)skb->data; unsigned long opcode = FW_WR_OP_G(ntohl(req->wr_hi)); - if (opcode == FW_CRYPTO_LOOKASIDE_WR) + if (unlikely(opcode == FW_ULPTX_WR)) + return skb->len <= MAX_IMM_ULPTX_WR_LEN; + else if (opcode == FW_CRYPTO_LOOKASIDE_WR) return skb->len <= SGE_MAX_WR_LEN; else - return skb->len <= MAX_IMM_TX_PKT_LEN; + return skb->len <= MAX_IMM_OFLD_TX_DATA_WR_LEN; } /** diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h index 47ba81e42f5d..b1161bdeda4d 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h @@ -50,9 +50,6 @@ #define MIN_RCV_WND (24 * 1024U) #define LOOPBACK(x) (((x) & htonl(0xff00)) == htonl(0x7f00)) -/* ulp_mem_io + ulptx_idata + payload + padding */ -#define MAX_IMM_ULPTX_WR_LEN (32 + 8 + 256 + 8) - /* for TX: a skb must have a headroom of at least TX_HEADER_LEN bytes */ #define TX_HEADER_LEN \ (sizeof(struct fw_ofld_tx_data_wr) + sizeof(struct sge_opaque_hdr)) diff --git a/drivers/target/iscsi/cxgbit/cxgbit_target.c b/drivers/target/iscsi/cxgbit/cxgbit_target.c index 9b3eb2e8c92a..b926e1d6c7b8 100644 --- a/drivers/target/iscsi/cxgbit/cxgbit_target.c +++ b/drivers/target/iscsi/cxgbit/cxgbit_target.c @@ -86,8 +86,7 @@ static int cxgbit_is_ofld_imm(const struct sk_buff *skb) if (likely(cxgbit_skcb_flags(skb) & SKCBF_TX_ISO)) length += sizeof(struct cpl_tx_data_iso); -#define MAX_IMM_TX_PKT_LEN 256 - return length <= MAX_IMM_TX_PKT_LEN; + return length <= MAX_IMM_OFLD_TX_DATA_WR_LEN; } /* -- 2.28.0.rc1.6.gae46588
Re: [RFC PATCH 03/13] nexthop: Add netlink defines and enumerators for resilient NH groups
Ido Schimmel writes: > On Sat, Feb 13, 2021 at 12:16:45PM -0700, David Ahern wrote: >> On 2/8/21 1:42 PM, Petr Machata wrote: >> > @@ -52,8 +53,50 @@ enum { >> >NHA_FDB,/* flag; nexthop belongs to a bridge fdb */ >> >/* if NHA_FDB is added, OIF, BLACKHOLE, ENCAP cannot be set */ >> > >> > + /* nested; resilient nexthop group attributes */ >> > + NHA_RES_GROUP, >> > + /* nested; nexthop bucket attributes */ >> > + NHA_RES_BUCKET, >> > + >> >__NHA_MAX, >> > }; >> > >> > #define NHA_MAX (__NHA_MAX - 1) >> > + >> > +enum { >> > + NHA_RES_GROUP_UNSPEC, >> > + /* Pad attribute for 64-bit alignment. */ >> > + NHA_RES_GROUP_PAD = NHA_RES_GROUP_UNSPEC, >> > + >> > + /* u32; number of nexthop buckets in a resilient nexthop group */ >> > + NHA_RES_GROUP_BUCKETS, >> >> u32 is overkill; arguably u16 (64k) should be more than enough buckets >> for any real use case. > > We wanted to make it future-proof, but I think we can live with 64k. At > least in Spectrum the maximum is 4k. I don't know about other devices, > but I guess it is not more than 64k. OK, no problem. I was thinking of keeping the API as u32, and tracking as u16 internally, but let's not add baggage at this stage already. Push comes to shove there can be another u32 attribute mutexed with this one.
Re: [RFC PATCH 04/13] nexthop: Add implementation of resilient next-hop groups
David Ahern writes: > On 2/8/21 1:42 PM, Petr Machata wrote: >> @@ -212,7 +254,7 @@ static inline bool nexthop_is_multipath(const struct >> nexthop *nh) >> struct nh_group *nh_grp; >> >> nh_grp = rcu_dereference_rtnl(nh->nh_grp); >> -return nh_grp->mpath; >> +return nh_grp->mpath || nh_grp->resilient; > > That pattern shows up multiple times; since this is datapath, it would > be worth adding a new bool for it (is_multipath or has_nexthops or > something else along those lines) OK.
Re: linux-next: manual merge of the net-next tree with the net tree
Hi Davide, On Mon, 15 Feb 2021 12:35:37 +0100 Davide Caratti wrote: > > On Mon, 2021-02-15 at 12:01 +0100, Guillaume Nault wrote: > > Before these commits, ALL_TESTS listed the tests in the order they were > > implemented in the rest of the file. So I'd rather continue following > > this implicit rule, if at all possible. Also it makes sense to keep > > grouping all match_ip_*_test together. > > yes, it makes sense. I can follow-up with a commit for net-next (when > tree re-opens), where the "ordering" in ALL_TESTS is restored. Ok? The ordering is not set in stone yet (I have only done the merge in the linux-next tree), just make sure that Dave knows what it should look like when he merges the net and net-next trees. -- Cheers, Stephen Rothwell pgpN6LBT4u1D7.pgp Description: OpenPGP digital signature
Re: [PATCH v14 2/4] phy: Add media type and speed serdes configuration interfaces
Hi Steen, On 12/02/21 6:35 pm, Steen Hegelund wrote: > Hi Kishon, > > On Fri, 2021-02-12 at 17:02 +0530, Kishon Vijay Abraham I wrote: >> EXTERNAL EMAIL: Do not click links or open attachments unless you >> know the content is safe >> >> Hi Steen, >> >> On 10/02/21 2:22 pm, Steen Hegelund wrote: >>> Provide new phy configuration interfaces for media type and speed >>> that >>> allows allows e.g. PHYs used for ethernet to be configured with >>> this >>> information. >>> >>> Signed-off-by: Lars Povlsen >>> Signed-off-by: Steen Hegelund >>> Reviewed-by: Andrew Lunn >>> Reviewed-by: Alexandre Belloni >>> --- >>> drivers/phy/phy-core.c | 30 ++ >>> include/linux/phy/phy.h | 26 ++ >>> 2 files changed, 56 insertions(+) >>> >>> diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c >>> index 71cb10826326..ccb575b13777 100644 >>> --- a/drivers/phy/phy-core.c >>> +++ b/drivers/phy/phy-core.c >>> @@ -373,6 +373,36 @@ int phy_set_mode_ext(struct phy *phy, enum >>> phy_mode mode, int submode) >>> } >>> EXPORT_SYMBOL_GPL(phy_set_mode_ext); >>> >>> +int phy_set_media(struct phy *phy, enum phy_media media) >>> +{ >>> + int ret; >>> + >>> + if (!phy || !phy->ops->set_media) >>> + return 0; >>> + >>> + mutex_lock(&phy->mutex); >>> + ret = phy->ops->set_media(phy, media); >>> + mutex_unlock(&phy->mutex); >>> + >>> + return ret; >>> +} >>> +EXPORT_SYMBOL_GPL(phy_set_media); >>> + >>> +int phy_set_speed(struct phy *phy, int speed) >>> +{ >>> + int ret; >>> + >>> + if (!phy || !phy->ops->set_speed) >>> + return 0; >>> + >>> + mutex_lock(&phy->mutex); >>> + ret = phy->ops->set_speed(phy, speed); >>> + mutex_unlock(&phy->mutex); >>> + >>> + return ret; >>> +} >>> +EXPORT_SYMBOL_GPL(phy_set_speed); >> >> Can't speed derived from mode? Do we need a separate set_speed >> function? >> >> Thanks >> Kishon > > Yes the client will need to be able to choose the speed as needed: > e.g. lower than the serdes mode supports, in case the the media or the > other end is not capable of running that speed. > > An example is a 10G and 25G serdes connected via DAC and as there is no > inband autoneg, the 25G client would have to manually select 10G speed > to communicate with its partner. Okay. Is it going to be some sort of manual negotiation where the Ethernet controller invokes set_speed with different speeds? Or the Ethernet controller will get the speed using some out of band mechanism and invokes set_speed once with the actual speed? Thanks Kishon > >> >>> + >>> int phy_reset(struct phy *phy) >>> { >>> int ret; >>> diff --git a/include/linux/phy/phy.h b/include/linux/phy/phy.h >>> index e435bdb0bab3..e4fd69a1faa7 100644 >>> --- a/include/linux/phy/phy.h >>> +++ b/include/linux/phy/phy.h >>> @@ -44,6 +44,12 @@ enum phy_mode { >>> PHY_MODE_DP >>> }; >>> >>> +enum phy_media { >>> + PHY_MEDIA_DEFAULT, >>> + PHY_MEDIA_SR, >>> + PHY_MEDIA_DAC, >>> +}; >>> + >>> /** >>> * union phy_configure_opts - Opaque generic phy configuration >>> * >>> @@ -64,6 +70,8 @@ union phy_configure_opts { >>> * @power_on: powering on the phy >>> * @power_off: powering off the phy >>> * @set_mode: set the mode of the phy >>> + * @set_media: set the media type of the phy (optional) >>> + * @set_speed: set the speed of the phy (optional) >>> * @reset: resetting the phy >>> * @calibrate: calibrate the phy >>> * @release: ops to be performed while the consumer relinquishes >>> the PHY >>> @@ -75,6 +83,8 @@ struct phy_ops { >>> int (*power_on)(struct phy *phy); >>> int (*power_off)(struct phy *phy); >>> int (*set_mode)(struct phy *phy, enum phy_mode mode, int >>> submode); >>> + int (*set_media)(struct phy *phy, enum phy_media media); >>> + int (*set_speed)(struct phy *phy, int speed); >>> >>> /** >>> * @configure: >>> @@ -215,6 +225,8 @@ int phy_power_off(struct phy *phy); >>> int phy_set_mode_ext(struct phy *phy, enum phy_mode mode, int >>> submode); >>> #define phy_set_mode(phy, mode) \ >>> phy_set_mode_ext(phy, mode, 0) >>> +int phy_set_media(struct phy *phy, enum phy_media media); >>> +int phy_set_speed(struct phy *phy, int speed); >>> int phy_configure(struct phy *phy, union phy_configure_opts >>> *opts); >>> int phy_validate(struct phy *phy, enum phy_mode mode, int submode, >>> union phy_configure_opts *opts); >>> @@ -344,6 +356,20 @@ static inline int phy_set_mode_ext(struct phy >>> *phy, enum phy_mode mode, >>> #define phy_set_mode(phy, mode) \ >>> phy_set_mode_ext(phy, mode, 0) >>> >>> +static inline int phy_set_media(struct phy *phy, enum phy_media >>> media) >>> +{ >>> + if (!phy) >>> + return 0; >>> + return -ENOSYS; >>> +} >>> + >>> +static inline int phy_set_speed(struct phy *phy, int speed) >>> +{ >>> + if (!phy) >>> + return 0; >>> + return
Re: [net-next] tcp: Sanitize CMSG flags and reserved args in tcp_zerocopy_receive.
Hi Arjun, url: https://github.com/0day-ci/linux/commits/Arjun-Roy/tcp-Sanitize-CMSG-flags-and-reserved-args-in-tcp_zerocopy_receive/20210212-052537 base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git e4b62cf7559f2ef9a022de235e5a09a8d7ded520 config: x86_64-randconfig-m001-20210209 (attached as .config) compiler: gcc-9 (Debian 9.3.0-15) 9.3.0 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot Reported-by: Dan Carpenter smatch warnings: net/ipv4/tcp.c:4158 do_tcp_getsockopt() warn: check for integer overflow 'len' vim +/len +4158 net/ipv4/tcp.c 3fdadf7d27e3fb Dmitry Mishin2006-03-20 3896 static int do_tcp_getsockopt(struct sock *sk, int level, 3fdadf7d27e3fb Dmitry Mishin2006-03-20 3897int optname, char __user *optval, int __user *optlen) ^1da177e4c3f41 Linus Torvalds 2005-04-16 3898 { 295f7324ff8d9e Arnaldo Carvalho de Melo 2005-08-09 3899struct inet_connection_sock *icsk = inet_csk(sk); ^1da177e4c3f41 Linus Torvalds 2005-04-16 3900struct tcp_sock *tp = tcp_sk(sk); 6fa251663069e0 Nikolay Borisov 2016-02-03 3901struct net *net = sock_net(sk); ^1da177e4c3f41 Linus Torvalds 2005-04-16 3902int val, len; "len" is int. [ snip ] 05255b823a6173 Eric Dumazet 2018-04-27 4146 #ifdef CONFIG_MMU 05255b823a6173 Eric Dumazet 2018-04-27 4147case TCP_ZEROCOPY_RECEIVE: { 7eeba1706eba6d Arjun Roy2021-01-20 4148struct scm_timestamping_internal tss; e0fecb289ad3fd Arjun Roy2020-12-10 4149struct tcp_zerocopy_receive zc = {}; 05255b823a6173 Eric Dumazet 2018-04-27 4150int err; 05255b823a6173 Eric Dumazet 2018-04-27 4151 05255b823a6173 Eric Dumazet 2018-04-27 4152if (get_user(len, optlen)) 05255b823a6173 Eric Dumazet 2018-04-27 4153 return -EFAULT; c8856c05145490 Arjun Roy2020-02-14 4154if (len < offsetofend(struct tcp_zerocopy_receive, length)) 05255b823a6173 Eric Dumazet 2018-04-27 4155 return -EINVAL; The problem is that negative values of "len" are type promoted to high positive values. So the fix is to write this as: if (len < 0 || len < offsetofend(struct tcp_zerocopy_receive, length)) return -EINVAL; 110912bdf28392 Arjun Roy2021-02-11 4156if (unlikely(len > sizeof(zc))) { 110912bdf28392 Arjun Roy2021-02-11 4157 err = check_zeroed_user(optval + sizeof(zc), 110912bdf28392 Arjun Roy2021-02-11 @4158 len - sizeof(zc)); Potentially "len - a negative value". 110912bdf28392 Arjun Roy2021-02-11 4159 if (err < 1) 110912bdf28392 Arjun Roy2021-02-11 4160 return err == 0 ? -EINVAL : err; c8856c05145490 Arjun Roy2020-02-14 4161 len = sizeof(zc); 0b7f41f68710cc Arjun Roy2020-02-25 4162 if (put_user(len, optlen)) 0b7f41f68710cc Arjun Roy2020-02-25 4163 return -EFAULT; 0b7f41f68710cc Arjun Roy2020-02-25 4164} --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip
[PATCH] b43: N-PHY: Fix the update of coef for the PHY revision >= 3case
From: Colin Ian King The documentation for the PHY update [1] states: Loop 4 times with index i If PHY Revision >= 3 Copy table[i] to coef[i] Otherwise Set coef[i] to 0 the copy of the table to coef is currently implemented the wrong way around, table is being updated from uninitialized values in coeff. Fix this by swapping the assignment around. [1] https://bcm-v4.sipsolutions.net/802.11/PHY/N/RestoreCal/ Fixes: 2f258b74d13c ("b43: N-PHY: implement restoring general configuration") Addresses-Coverity: ("Uninitialized scalar variable") Signed-off-by: Colin Ian King --- drivers/net/wireless/broadcom/b43/phy_n.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/wireless/broadcom/b43/phy_n.c b/drivers/net/wireless/broadcom/b43/phy_n.c index b669dff24b6e..665b737fbb0d 100644 --- a/drivers/net/wireless/broadcom/b43/phy_n.c +++ b/drivers/net/wireless/broadcom/b43/phy_n.c @@ -5311,7 +5311,7 @@ static void b43_nphy_restore_cal(struct b43_wldev *dev) for (i = 0; i < 4; i++) { if (dev->phy.rev >= 3) - table[i] = coef[i]; + coef[i] = table[i]; else coef[i] = 0; } -- 2.30.0
Re: [PATCH iproute2 5/6] man8/bridge.8: explain self vs master for "bridge fdb add"
On Mon, Feb 15, 2021 at 11:53:42AM +0100, Alexandra Winter wrote: > Actually, I found your first (more verbose) proposal more helpful. Sorry, I don't understand. Do you want me to copy the whole explanation from bridge fdb add to bridge link set? > >> Maybe I misunderstand this sentence, but I can do a 'bridge fdb add' > >> without 'self' > >> on the bridge device. And the address shows up under 'bridge fdb show'. > >> So what does mandatory mean here? > > > > It's right in the next sentence: > > > >> The flag is set by default if "master" is not specified. > > > > It's mandatory and implicit if "master" is not specified, ergo 'bridge > > fdb add dev br0' will work because 'master' is not specified (it is > > implicitly 'bridge fdb add dev br0 self'. But 'bridge fdb add dev br0 > > master' will fail, because the 'self' flag is no longer implicit (since > > 'master' was specified) but mandatory and absent. > > > > I'm not sure what I can do to improve this. > > > Maybe the sentence under 'master': > " If the specified > +device is a master itself, such as a bridge, this flag is invalid." > is sufficient to defien this situation. And no need to explain mandatory > implicit defaults > in the first paragraph? I don't understand this either. Could you paste here how you think this paragraph should read?
Re: [PATCH 2/4] net: stmmac: Add Toshiba Visconti SoCs glue driver
On Mon, Feb 15, 2021 at 10:23 AM Leon Romanovsky wrote: > On Mon, Feb 15, 2021 at 04:28:09PM +0900, Nobuhiro Iwamatsu wrote: > > > > Sorry, I sent the wrong patchset that didn't fix this point out. > > > > > I asked it before, but never received an answer. > > > > I have received your point out and have sent an email with the content > > to remove this line. But it may not have arrived yet... > > > > > Why did you use "def_bool y" and not "default y"? Isn't it supposed to be > > > "depends on STMMAC_ETH"? And probably it shouldn't be set as a default as > > > "y". > > > > > > > The reason why "def_bool y" was set is that the wrong fix was left when > > debugging. Also, I don't think it is necessary to set "default y". > > This is also incorrect because it says "bool" Toshiba Visconti DWMAC > > support "". I change it to trustate in the new patch. > > > > And this driver is enabled when STMMAC_PLATFORM was Y. And STMMAC_PLATFORM > > depends on STMMAC_ETH. > > So I understand that STMMAC_ETH does not need to be dependents. Is this > > understanding wrong? > > This is correct understanding, just need to clean other entries in that > Kconfig that depends on STMMAC_ETH. 'tristate' with no default sounds right. I see that some platforms have a default according to the platform, which also makes sense but isn't required. What I would suggest though is a dependency on the platform, to make it easier to disable the front-end based on which platforms are enabled. This would end up as config DWMAC_VISCONTI tristate "Toshiba Visconti DWMAC support" depends on ARCH_VISCONTI || COMPILE_TEST depends on OF && COMMON_CLK # only add this line if it's required for compilation default ARCH_VISCONTI Arnd
AW: HSR/PRP sequence counter issue with Cisco Redbox
> On Wed, Jan 27, 2021 at 6:32 AM Wenzel, Marco eberle.de> wrote: > > > > Hi, > > > > we have figured out an issue with the current PRP driver when trying to > communicate with Cisco IE 2000 industrial Ethernet switches in Redbox > mode. The Cisco always resets the HSR/PRP sequence counter to "1" at low > traffic (<= 1 frame in 400 ms). It can be reproduced by a simple ICMP echo > request with 1 s interval between a Linux box running with PRP and a VDAN > behind the Cisco Redbox. The Linux box then always receives frames with > sequence counter "1" and drops them. The behavior is not configurable at > the Cisco Redbox. > > > > I fixed it by ignoring sequence counters with value "1" at the sequence > counter check in hsr_register_frame_out (): > > > > diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c index > > 5c97de459905..630c238e81f0 100644 > > --- a/net/hsr/hsr_framereg.c > > +++ b/net/hsr/hsr_framereg.c > > @@ -411,7 +411,7 @@ void hsr_register_frame_in(struct hsr_node *node, > > struct hsr_port *port, int hsr_register_frame_out(struct hsr_port *port, > struct hsr_node *node, > >u16 sequence_nr) { > > - if (seq_nr_before_or_eq(sequence_nr, node->seq_out[port->type])) > > + if (seq_nr_before_or_eq(sequence_nr, > > + node->seq_out[port->type]) && (sequence_nr != 1)) > > return 1; > > > > node->seq_out[port->type] = sequence_nr; > > > > > > Do you think this could be a solution? Should this patch be officially > > applied > in order to avoid other users running into these communication issues? > > This isn't the correct way to solve the problem. IEC 62439-3 defines > EntryForgetTime as "Time after which an entry is removed from the duplicate > table" with a value of 400ms and states devices should usually be configured > to keep entries in the table for a much shorter time. hsr_framereg.c needs to > be reworked to handle this according to the specification. Sorry for the delay but I did not have the time to take a closer look at the problem until now. My suggestion for the EntryForgetTime feature would be the following: A time_out element will be added to the hsr_node structure, which always stores the current time when entering hsr_register_frame_out(). If the last stored time is older than EntryForgetTime (400 ms) the sequence number check will be ignored. diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c index 5c97de459905..a97bffbd2581 100644 --- a/net/hsr/hsr_framereg.c +++ b/net/hsr/hsr_framereg.c @@ -164,8 +164,10 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr, * as initialization. (0 could trigger an spurious ring error warning). */ now = jiffies; - for (i = 0; i < HSR_PT_PORTS; i++) + for (i = 0; i < HSR_PT_PORTS; i++) { new_node->time_in[i] = now; + new_node->time_out[i] = now; + } for (i = 0; i < HSR_PT_PORTS; i++) new_node->seq_out[i] = seq_out; @@ -411,9 +413,12 @@ void hsr_register_frame_in(struct hsr_node *node, struct hsr_port *port, int hsr_register_frame_out(struct hsr_port *port, struct hsr_node *node, u16 sequence_nr) { - if (seq_nr_before_or_eq(sequence_nr, node->seq_out[port->type])) + if (seq_nr_before_or_eq(sequence_nr, node->seq_out[port->type]) && +time_is_after_jiffies(node->time_out[port->type] + msecs_to_jiffies(HSR_ENTRY_FORGET_TIME))) { return 1; + } + node->time_out[port->type] = jiffies; node->seq_out[port->type] = sequence_nr; return 0; } diff --git a/net/hsr/hsr_framereg.h b/net/hsr/hsr_framereg.h index 86b43f539f2c..d9628e7a5f05 100644 --- a/net/hsr/hsr_framereg.h +++ b/net/hsr/hsr_framereg.h @@ -75,6 +75,7 @@ struct hsr_node { enum hsr_port_type addr_B_port; unsigned long time_in[HSR_PT_PORTS]; booltime_in_stale[HSR_PT_PORTS]; + unsigned long time_out[HSR_PT_PORTS]; /* if the node is a SAN */ boolsan_a; boolsan_b; diff --git a/net/hsr/hsr_main.h b/net/hsr/hsr_main.h index 7dc92ce5a134..f79ca55d6986 100644 --- a/net/hsr/hsr_main.h +++ b/net/hsr/hsr_main.h @@ -21,6 +21,7 @@ #define HSR_LIFE_CHECK_INTERVAL 2000 /* ms */ #define HSR_NODE_FORGET_TIME 6 /* ms */ #define HSR_ANNOUNCE_INTERVAL100 /* ms */ +#define HSR_ENTRY_FORGET_TIME400 /* ms */ /* By how much may slave1 and slave2 timestamps of latest received frame from * each node differ before we notify of communication problem? This approach works fine with the Cisco IE 2000 and I think it implements the correct way to handle sequence numbers as defined in IEC 62439-3. Regards, Marco Wenzel > > > > Thanks > > Marco Wenzel > > Regards, > George McCollister
Re: [PATCH iproute2 5/6] man8/bridge.8: explain self vs master for "bridge fdb add"
On 15.02.21 13:13, Vladimir Oltean wrote: > On Mon, Feb 15, 2021 at 11:53:42AM +0100, Alexandra Winter wrote: >> Actually, I found your first (more verbose) proposal more helpful. > > Sorry, I don't understand. Do you want me to copy the whole explanation > from bridge fdb add to bridge link set? > Maybe I misunderstand this sentence, but I can do a 'bridge fdb add' without 'self' on the bridge device. And the address shows up under 'bridge fdb show'. So what does mandatory mean here? >>> >>> It's right in the next sentence: >>> The flag is set by default if "master" is not specified. >>> >>> It's mandatory and implicit if "master" is not specified, ergo 'bridge >>> fdb add dev br0' will work because 'master' is not specified (it is >>> implicitly 'bridge fdb add dev br0 self'. But 'bridge fdb add dev br0 >>> master' will fail, because the 'self' flag is no longer implicit (since >>> 'master' was specified) but mandatory and absent. >>> >>> I'm not sure what I can do to improve this. >>> >> Maybe the sentence under 'master': >> " If the specified >> +device is a master itself, such as a bridge, this flag is invalid." >> is sufficient to defien this situation. And no need to explain mandatory >> implicit defaults >> in the first paragraph? > > I don't understand this either. Could you paste here how you think this > paragraph should read? > Sorry, I did not mean to cause confusion. Your original proposal: .B self -- the address is associated with the port drivers fdb. Usually hardware - (default). +- the operation is fulfilled directly by the driver for the specified network +device. If the network device belongs to a master like a bridge, then the +bridge is bypassed and not notified of this operation (and if the device does +notify the bridge, it is driver-specific behavior and not mandated by this +flag, check the driver for more details). The "bridge fdb add" command can also +be used on the bridge device itself, and in this case, the added fdb entries +will be locally terminated (not forwarded). In the latter case, the "self" flag +is mandatory. The flag is set by default if "master" is not specified. .sp .B master -- the address is associated with master devices fdb. Usually software. +- if the specified network device is a port that belongs to a master device +such as a bridge, the operation is fulfilled by the master device's driver, +which may in turn notify the port driver too of the address. If the specified +device is a master itself, such as a bridge, this flag is invalid. .sp The above is fine with me and IMHO much better than it is today. But if you ask me I would change it to: .B self - the operation is fulfilled directly by the driver for the specified physical device. If the network device belongs to a master like a bridge, then the bridge is bypassed and not notified of this operation (and if the device does notify the bridge, it is driver-specific behavior and not mandated by this flag, check the driver for more details). The "bridge fdb add" command can also be used on the bridge device itself, and in this case, the added fdb entries will be locally terminated (not forwarded). The flag is set by default if "master" is not specified. .sp .B master - if the specified network device is a port that belongs to a master device such as a software bridge, the operation is fulfilled by the master device's driver, which may in turn notify the port driver too of the address. If the specified device is a master itself, such as a bridge, this flag is invalid. .sp
Re: [net-next PATCH v5 15/15] net: dpaa2-mac: Add ACPI support for DPAA2 MAC driver
On Mon, Feb 08, 2021 at 04:28:31PM +, Russell King - ARM Linux admin wrote: > On Mon, Feb 08, 2021 at 08:42:44PM +0530, Calvin Johnson wrote: > > Modify dpaa2_mac_connect() to support ACPI along with DT. > > Modify dpaa2_mac_get_node() to get the dpmac fwnode from either > > DT or ACPI. > > > > Replace of_get_phy_mode with fwnode_get_phy_mode to get > > phy-mode for a dpmac_node. > > > > Use helper function phylink_fwnode_phy_connect() to find phy_dev and > > connect to mac->phylink. > > > > Signed-off-by: Calvin Johnson > > I don't think this does the full job. > > > static int dpaa2_pcs_create(struct dpaa2_mac *mac, > > - struct device_node *dpmac_node, int id) > > + struct fwnode_handle *dpmac_node, > > + int id) > > { > > struct mdio_device *mdiodev; > > - struct device_node *node; > > + struct fwnode_handle *node; > > > > - node = of_parse_phandle(dpmac_node, "pcs-handle", 0); > > - if (!node) { > > + node = fwnode_find_reference(dpmac_node, "pcs-handle", 0); > > + if (IS_ERR(node)) { > > /* do not error out on old DTS files */ > > netdev_warn(mac->net_dev, "pcs-handle node not found\n"); > > return 0; > > } > > > > - if (!of_device_is_available(node)) { > > + if (!of_device_is_available(to_of_node(node))) { > > If "node" is an ACPI node, then to_of_node() returns NULL, and > of_device_is_available(NULL) is false. So, if we're using ACPI > and we enter this path, we will always hit the error below: > > > netdev_err(mac->net_dev, "pcs-handle node not available\n"); > > - of_node_put(node); > > + of_node_put(to_of_node(node)); > > return -ENODEV; > > } > > > @@ -306,7 +321,7 @@ int dpaa2_mac_connect(struct dpaa2_mac *mac) > > * error out if the interface mode requests them and there is no PHY > > * to act upon them > > */ > > - if (of_phy_is_fixed_link(dpmac_node) && > > + if (of_phy_is_fixed_link(to_of_node(dpmac_node)) && > > If "dpmac_node" is an ACPI node, to_of_node() will return NULL, and > of_phy_is_fixed_link() will oops. I think of_phy_is_fixed_link() needs to be fixed. I'll add below fix. --- a/drivers/net/mdio/of_mdio.c +++ b/drivers/net/mdio/of_mdio.c @@ -439,6 +439,9 @@ bool of_phy_is_fixed_link(struct device_node *np) int len, err; const char *managed; + if (!np) + return false; + /* New binding */ dn = of_get_child_by_name(np, "fixed-link"); if (dn) { Regards Calvin
Re: [PATCH net-next 09/12] net: dsa: tag_ocelot: create separate tagger for Seville
Hi Vladimir, url: https://github.com/0day-ci/linux/commits/Vladimir-Oltean/PTP-for-DSA-tag_ocelot_8021q/20210213-081857 base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 3c5a2fd042d0bfac71a2dfb99515723d318df47b config: i386-randconfig-m031-20210209 (attached as .config) compiler: gcc-9 (Debian 9.3.0-15) 9.3.0 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot Reported-by: Dan Carpenter smatch warnings: net/dsa/tag_ocelot.c:59 ocelot_xmit() warn: should '(((1))) << (dp->index)' be a 64 bit type? net/dsa/tag_ocelot.c:71 seville_xmit() warn: should '(((1))) << (dp->index)' be a 64 bit type? vim +59 net/dsa/tag_ocelot.c 9d88a16c0fc930 Vladimir Oltean 2021-02-13 52 static struct sk_buff *ocelot_xmit(struct sk_buff *skb, 9d88a16c0fc930 Vladimir Oltean 2021-02-13 53 struct net_device *netdev) 9d88a16c0fc930 Vladimir Oltean 2021-02-13 54 { 9d88a16c0fc930 Vladimir Oltean 2021-02-13 55 struct dsa_port *dp = dsa_slave_to_port(netdev); 9d88a16c0fc930 Vladimir Oltean 2021-02-13 56 void *injection; 9d88a16c0fc930 Vladimir Oltean 2021-02-13 57 9d88a16c0fc930 Vladimir Oltean 2021-02-13 58 ocelot_xmit_common(skb, netdev, cpu_to_be32(0x888a), &injection); 9d88a16c0fc930 Vladimir Oltean 2021-02-13 @59 ocelot_ifh_set_dest(injection, BIT(dp->index)); db->index is less than db->num_ports which 32 or less but sometimes it comes from the device tree so who knows. The ocelot_ifh_set_dest() function takes a u64 though and that suggests that BIT() should be changed to BIT_ULL(). 9d88a16c0fc930 Vladimir Oltean 2021-02-13 60 9d88a16c0fc930 Vladimir Oltean 2021-02-13 61 return skb; 9d88a16c0fc930 Vladimir Oltean 2021-02-13 62 } 9d88a16c0fc930 Vladimir Oltean 2021-02-13 63 9d88a16c0fc930 Vladimir Oltean 2021-02-13 64 static struct sk_buff *seville_xmit(struct sk_buff *skb, 9d88a16c0fc930 Vladimir Oltean 2021-02-13 65 struct net_device *netdev) 9d88a16c0fc930 Vladimir Oltean 2021-02-13 66 { 9d88a16c0fc930 Vladimir Oltean 2021-02-13 67 struct dsa_port *dp = dsa_slave_to_port(netdev); 9d88a16c0fc930 Vladimir Oltean 2021-02-13 68 void *injection; 9d88a16c0fc930 Vladimir Oltean 2021-02-13 69 9d88a16c0fc930 Vladimir Oltean 2021-02-13 70 ocelot_xmit_common(skb, netdev, cpu_to_be32(0x8885), &injection); 9d88a16c0fc930 Vladimir Oltean 2021-02-13 @71 seville_ifh_set_dest(injection, BIT(dp->index)); Same. 9d88a16c0fc930 Vladimir Oltean 2021-02-13 72 8dce89aa5f3274 Vladimir Oltean 2019-11-14 73 return skb; 8dce89aa5f3274 Vladimir Oltean 2019-11-14 74 } --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip
Re: [PATCH 1/2] lockdep: add lockdep_assert_not_held()
On Mon, 2021-02-15 at 11:44 +0100, Peter Zijlstra wrote: > > I think something like so will work, but please double check. Yeah, that looks better. > +++ b/include/linux/lockdep.h > @@ -294,11 +294,15 @@ extern void lock_unpin_lock(struct lockdep_map *lock, > struct pin_cookie); > > #define lockdep_depth(tsk) (debug_locks ? (tsk)->lockdep_depth : 0) > > -#define lockdep_assert_held(l) do {\ > - WARN_ON(debug_locks && !lockdep_is_held(l));\ > +#define lockdep_assert_held(l) do { > \ > + WARN_ON(debug_locks && lockdep_is_held(l) == 0)); \ > } while (0) That doesn't really need to change? It's the same. > -#define lockdep_assert_held_write(l) do {\ > +#define lockdep_assert_not_held(l) do {\ > + WARN_ON(debug_locks && lockdep_is_held(l) == 1)); \ > + } while (0) > + > +#define lockdep_assert_held_write(l) do {\ > WARN_ON(debug_locks && !lockdep_is_held_type(l, 0));\ > } while (0) > > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c > index c1418b47f625..983ba206f7b2 100644 > --- a/kernel/locking/lockdep. > +++ b/kernel/locking/lockdep.c > @@ -5467,7 +5467,7 @@ noinstr int lock_is_held_type(const struct lockdep_map > *lock, int read) > int ret = 0; > > if (unlikely(!lockdep_enabled())) > - return 1; /* avoid false negative lockdep_assert_held() */ > + return -1; /* avoid false negative lockdep_assert_held() */ Maybe add lockdep_assert_not_held() to the comment, to explain the -1 (vs non-zero)? johannes
Re: [PATCH net v1 1/3] net: phy: mscc: adding LCPLL reset to VSC8514
On Fri, 2021-02-12 at 16:20 +0100, Andrew Lunn wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you > know the content is safe > > On Fri, Feb 12, 2021 at 03:06:41PM +0100, Bjarni Jonasson wrote: > > +static u32 vsc85xx_csr_read(struct phy_device *phydev, > > + enum csr_target target, u32 reg); > > +static int vsc85xx_csr_write(struct phy_device *phydev, > > + enum csr_target target, u32 reg, u32 > > val); > > + > > Hi Bjarni > > No forward definitions please. Move the code around so they are not > required. Sometimes it is best to do such a move as a preparation > patch. Sure, I will remove them. > > > @@ -1569,8 +1664,16 @@ static int vsc8514_config_pre_init(struct > > phy_device *phydev) > > {0x16b2, 0x7000}, > > {0x16b4, 0x0814}, > > }; > > + struct device *dev = &phydev->mdio.dev; > > unsigned int i; > > u16 reg; > > + int ret; > > Hard to say from the limited context, but is reverse christmass tree > being preserved here? I will dobbelcheck. > > Andrew
Re: [PATCH net-next 09/12] net: dsa: tag_ocelot: create separate tagger for Seville
Hi Dan, On Mon, Feb 15, 2021 at 04:00:04PM +0300, Dan Carpenter wrote: > db->index is less than db->num_ports which 32 or less but sometimes it > comes from the device tree so who knows. The destination port mask is copied into a 12-bit field of the packet, starting at bit offset 67 and ending at 56: static inline void ocelot_ifh_set_dest(void *injection, u64 dest) { packing(injection, &dest, 67, 56, OCELOT_TAG_LEN, PACK, 0); } So this DSA tagging protocol supports at most 12 bits, which is clearly less than 32. Attempting to send to a port number > 12 will cause the packing() call to truncate way before there will be 32-bit truncation due to type promotion of the BIT(port) argument towards u64. > The ocelot_ifh_set_dest() function takes a u64 though and that > suggests that BIT() should be changed to BIT_ULL(). I understand that you want to silence the warning, which fundamentally comes from the packing() API which works with u64 values and nothing of a smaller size. So I can send a patch which replaces BIT(port) with BIT_ULL(port), even if in practice both are equally fine.
Re: [PATCH net v1 1/3] net: phy: mscc: adding LCPLL reset to VSC8514
On Fri, 2021-02-12 at 16:54 +0100, Andrew Lunn wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you > know the content is safe > > On Fri, Feb 12, 2021 at 03:06:41PM +0100, Bjarni Jonasson wrote: > > At Power-On Reset, transients may cause the LCPLL to lock onto a > > clock that is momentarily unstable. This is normally seen in QSGMII > > setups where the higher speed 6G SerDes is being used. > > This patch adds an initial LCPLL Reset to the PHY (first instance) > > to avoid this issue. > > Hi Bjarni > > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html > > These patches are rather large for stable, and not obviously correct. > > There these problems hitting real users running stable kernels? Or is > it so broken it never really worked? > >Andrew Correct, the current linux driver is unstable and has never really worked properly. Our in-house SDK driver already have these fixes and the upstreamed version has been lagging behind. And yes the patches are big, we had to pull out the calibration from the 8051 into the driver.
[PATCH net-next] net: mscc: ocelot: avoid type promotion when calling ocelot_ifh_set_dest
From: Vladimir Oltean Smatch is confused by the fact that a 32-bit BIT(port) macro is passed as argument to the ocelot_ifh_set_dest function and warns: ocelot_xmit() warn: should '(((1))) << (dp->index)' be a 64 bit type? seville_xmit() warn: should '(((1))) << (dp->index)' be a 64 bit type? The destination port mask is copied into a 12-bit field of the packet, starting at bit offset 67 and ending at 56. So this DSA tagging protocol supports at most 12 bits, which is clearly less than 32. Attempting to send to a port number > 12 will cause the packing() call to truncate way before there will be 32-bit truncation due to type promotion of the BIT(port) argument towards u64. Therefore, smatch's fears that BIT(port) will do the wrong thing and cause unexpected truncation for "port" values >= 32 are unfounded. Nonetheless, let's silence the warning by explicitly passing an u64 value to ocelot_ifh_set_dest, such that the compiler does not need to do a questionable type promotion. Reported-by: kernel test robot Reported-by: Dan Carpenter Signed-off-by: Vladimir Oltean --- drivers/net/ethernet/mscc/ocelot.c | 2 +- net/dsa/tag_ocelot.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c index 8d97c731e953..5d13087c85d6 100644 --- a/drivers/net/ethernet/mscc/ocelot.c +++ b/drivers/net/ethernet/mscc/ocelot.c @@ -803,7 +803,7 @@ void ocelot_port_inject_frame(struct ocelot *ocelot, int port, int grp, QS_INJ_CTRL_SOF, QS_INJ_CTRL, grp); ocelot_ifh_set_bypass(ifh, 1); - ocelot_ifh_set_dest(ifh, BIT(port)); + ocelot_ifh_set_dest(ifh, BIT_ULL(port)); ocelot_ifh_set_tag_type(ifh, IFH_TAG_TYPE_C); ocelot_ifh_set_vid(ifh, skb_vlan_tag_get(skb)); ocelot_ifh_set_rew_op(ifh, rew_op); diff --git a/net/dsa/tag_ocelot.c b/net/dsa/tag_ocelot.c index a7dd61c8e005..f9df9cac81c5 100644 --- a/net/dsa/tag_ocelot.c +++ b/net/dsa/tag_ocelot.c @@ -56,7 +56,7 @@ static struct sk_buff *ocelot_xmit(struct sk_buff *skb, void *injection; ocelot_xmit_common(skb, netdev, cpu_to_be32(0x888a), &injection); - ocelot_ifh_set_dest(injection, BIT(dp->index)); + ocelot_ifh_set_dest(injection, BIT_ULL(dp->index)); return skb; } @@ -68,7 +68,7 @@ static struct sk_buff *seville_xmit(struct sk_buff *skb, void *injection; ocelot_xmit_common(skb, netdev, cpu_to_be32(0x8885), &injection); - seville_ifh_set_dest(injection, BIT(dp->index)); + seville_ifh_set_dest(injection, BIT_ULL(dp->index)); return skb; } -- 2.25.1
Re: [PATCH net v1 3/3] net: phy: mscc: coma mode disabled for VSC8514
On Fri, 2021-02-12 at 16:32 +, Vladimir Oltean wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you > know the content is safe > > On Fri, Feb 12, 2021 at 03:06:43PM +0100, Bjarni Jonasson wrote: > > The 'coma mode' (configurable through sw or hw) provides an > > optional feature that may be used to control when the PHYs become > > active. > > The typical usage is to synchronize the link-up time across > > all PHY instances. This patch releases coma mode if not done by > > hardware, > > otherwise the phys will not link-up. > > > > Signed-off-by: Steen Hegelund > > Signed-off-by: Bjarni Jonasson > > Fixes: e4f9ba642f0b ("net: phy: mscc: add support for VSC8514 > > PHY.") > > --- > > drivers/net/phy/mscc/mscc_main.c | 13 + > > 1 file changed, 13 insertions(+) > > > > diff --git a/drivers/net/phy/mscc/mscc_main.c > > b/drivers/net/phy/mscc/mscc_main.c > > index 7546d9cc3abd..0600b592618b 100644 > > --- a/drivers/net/phy/mscc/mscc_main.c > > +++ b/drivers/net/phy/mscc/mscc_main.c > > @@ -1418,6 +1418,18 @@ static void vsc8584_get_base_addr(struct > > phy_device *phydev) > > vsc8531->addr = addr; > > } > > > > +static void vsc85xx_coma_mode_release(struct phy_device *phydev) > > +{ > > + /* The coma mode (pin or reg) provides an optional feature > > that > > + * may be used to control when the PHYs become active. > > + * Alternatively the COMA_MODE pin may be connected low > > + * so that the PHYs are fully active once out of reset. > > + */ > > + __phy_write(phydev, MSCC_EXT_PAGE_ACCESS, > > MSCC_PHY_PAGE_EXTENDED_GPIO); > > + __phy_write(phydev, MSCC_PHY_GPIO_CONTROL_2, 0x0600); > > + __phy_write(phydev, MSCC_EXT_PAGE_ACCESS, > > MSCC_PHY_PAGE_STANDARD); > > Can you please do: > phy_write_paged(phydev, MSCC_PHY_PAGE_EXTENDED_GPIO, > MSCC_PHY_GPIO_CONTROL_2, 0x0600); > > And can you please provide some definitions for what 0x0600 is? > My reference manual says that: > > Bit 13: > COMA_MODE output enable (active low) > Bit 12: > COMA_MODE output data > Bit 11: > COMA_MODE input data > Bit 10: > Reserved > Bit 9: > Tri-state enable for LEDs > > 0x600 is BIT(10) | BIT(9). But BIT(10) is reserved. Sure this is > correct? I can see this is unclear. The code is actualy writing zero to bit 12 and 13. Bit 9 and 10 are not interesting in this context. I will change it to use phy_modify_paged() bit 12 and 13. > > > +} > > + > > static int vsc8584_config_init(struct phy_device *phydev) > > { > > struct vsc8531_private *vsc8531 = phydev->priv; > > @@ -2610,6 +2622,7 @@ static int vsc8514_config_init(struct > > phy_device *phydev) > > ret = vsc8514_config_host_serdes(phydev); > > if (ret) > > goto err; > > + vsc85xx_coma_mode_release(phydev); > > } > > > > phy_unlock_mdio_bus(phydev); > > -- > > 2.17.1 > >
Re: [PATCH net v1 1/3] net: phy: mscc: adding LCPLL reset to VSC8514
On Fri, 2021-02-12 at 10:53 -0800, Jakub Kicinski wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you > know the content is safe > > On Fri, 12 Feb 2021 15:06:41 +0100 Bjarni Jonasson wrote: > > At Power-On Reset, transients may cause the LCPLL to lock onto a > > clock that is momentarily unstable. This is normally seen in QSGMII > > setups where the higher speed 6G SerDes is being used. > > This patch adds an initial LCPLL Reset to the PHY (first instance) > > to avoid this issue. > > > > Signed-off-by: Steen Hegelund > > Signed-off-by: Bjarni Jonasson > > Fixes: e4f9ba642f0b ("net: phy: mscc: add support for VSC8514 > > PHY.") > > Please make sure each commit builds cleanly with W=1 C=1. > > This one appears to not build at all? Sorry about that, I will make sure.
Re: [PATCH] net: dsa: mv88e6xxx: prevent 2500BASEX mode override
On Mon, 15 Feb 2021 06:15:59 + Nathan Rossi wrote: > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c > b/drivers/net/dsa/mv88e6xxx/chip.c > index 54aa942eed..5c52906b29 100644 > --- a/drivers/net/dsa/mv88e6xxx/chip.c > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > @@ -650,6 +650,13 @@ static void mv88e6xxx_validate(struct dsa_switch *ds, > int port, > if (chip->info->ops->phylink_validate) > chip->info->ops->phylink_validate(chip, port, mask, state); > > + /* Advertise 2500BASEX only if 1000BASEX is not configured, this > + * prevents phylink_helper_basex_speed from always overriding the > + * 1000BASEX mode since auto negotiation is always enabled. > + */ > + if (state->interface == PHY_INTERFACE_MODE_1000BASEX) > + phylink_clear(mask, 2500baseX_Full); > + I don't quite like this. This problem should be either solved in phylink_helper_basex_speed() or somewhere in the mv88e6xxx code, but near the call to phylink_helper_basex_speed(). Putting a solution to the behaviour of phylink_helper_basex_speed() it into the validate() method when phylink_helper_basex_speed() is called from a different place will complicate debugging in the future. If we start solving problems in this kind of way, the driver will become totally unreadable, IMO. Marek
Re: [PATCH v14 2/4] phy: Add media type and speed serdes configuration interfaces
On Mon, Feb 15, 2021 at 05:25:10PM +0530, Kishon Vijay Abraham I wrote: > Okay. Is it going to be some sort of manual negotiation where the > Ethernet controller invokes set_speed with different speeds? Or the > Ethernet controller will get the speed using some out of band mechanism > and invokes set_speed once with the actual speed? Hi Kishon There are a few different mechanism possible. The SFP has an EEPROM which contains lots of parameters. One is the maximum baud rate the module supports. PHYLINK will combine this information with the MAC capabilities to determine the default speed. The users can select the mode the MAC works in, e.g. 1000BaseX vs 2500BaseX, via ethtool -s. Different modes needs different speeds. Some copper PHYs will change there host side interface baud rate when the media side interface changes mode. 10GBASE-X for 10G copper, 5GBase-X for 5G COPPER, 2500Base-X for 2.5G copper, and SGMII for old school 10/100/1G Ethernet. Mainline Linux has no support for it, but some 'vendor crap' will do a manual negotiation, simply trying different speeds and see if the SERDES establishes link. There is nothing standardised for this, as far as i know. Andrew
Re: [PATCH net-next 09/12] net: dsa: tag_ocelot: create separate tagger for Seville
On Mon, Feb 15, 2021 at 03:19:31PM +0200, Vladimir Oltean wrote: > Hi Dan, > > On Mon, Feb 15, 2021 at 04:00:04PM +0300, Dan Carpenter wrote: > > db->index is less than db->num_ports which 32 or less but sometimes it > > comes from the device tree so who knows. > > The destination port mask is copied into a 12-bit field of the packet, > starting at bit offset 67 and ending at 56: > > static inline void ocelot_ifh_set_dest(void *injection, u64 dest) > { > packing(injection, &dest, 67, 56, OCELOT_TAG_LEN, PACK, 0); > } > > So this DSA tagging protocol supports at most 12 bits, which is clearly > less than 32. Attempting to send to a port number > 12 will cause the > packing() call to truncate way before there will be 32-bit truncation > due to type promotion of the BIT(port) argument towards u64. > > > The ocelot_ifh_set_dest() function takes a u64 though and that > > suggests that BIT() should be changed to BIT_ULL(). > > I understand that you want to silence the warning, which fundamentally > comes from the packing() API which works with u64 values and nothing of > a smaller size. So I can send a patch which replaces BIT(port) with > BIT_ULL(port), even if in practice both are equally fine. I don't have a strong feeling about this... Generally silencing warnings just to make a checker happy is the wrong idea. To be honest, I normally ignore these warnings. But I have been looking at them recently to try figure out if we could make it so it would only generate a warning where "db->index" was known as possibly being in the 32-63 range. So I looked at this one. And now I see some ways that Smatch could have parsed this better... regards, dan carpenter
[PATCH v1 2/3] string: Move onoff() helper under string.h hood
We have already an implementation and a lot of code that can benefit of the onoff() helper. Move it under string.h hood. Signed-off-by: Andy Shevchenko --- drivers/gpu/drm/i915/i915_utils.h | 5 - include/linux/string.h| 5 + 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h index e6da5a951132..d2ac357896d4 100644 --- a/drivers/gpu/drm/i915/i915_utils.h +++ b/drivers/gpu/drm/i915/i915_utils.h @@ -409,11 +409,6 @@ wait_remaining_ms_from_jiffies(unsigned long timestamp_jiffies, int to_wait_ms) #define MBps(x) KBps(1000 * (x)) #define GBps(x) ((u64)1000 * MBps((x))) -static inline const char *onoff(bool v) -{ - return v ? "on" : "off"; -} - static inline const char *enableddisabled(bool v) { return v ? "enabled" : "disabled"; diff --git a/include/linux/string.h b/include/linux/string.h index fd946a5e18c8..2a0589e945d9 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -313,4 +313,9 @@ static inline const char *yesno(bool yes) return yes ? "yes" : "no"; } +static inline const char *onoff(bool on) +{ + return on ? "on" : "off"; +} + #endif /* _LINUX_STRING_H_ */ -- 2.30.0
[PATCH v1 1/3] string: Consolidate yesno() helpers under string.h hood
We have already few similar implementation and a lot of code that can benefit of the yesno() helper. Consolidate yesno() helpers under string.h hood. Signed-off-by: Andy Shevchenko --- .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c| 6 +- drivers/gpu/drm/i915/i915_utils.h| 6 +- drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c | 12 +--- include/linux/string.h | 5 + 4 files changed, 8 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c index 360952129b6d..7fde4f90e513 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c @@ -23,6 +23,7 @@ * */ +#include #include #include @@ -49,11 +50,6 @@ struct dmub_debugfs_trace_entry { uint32_t param1; }; -static inline const char *yesno(bool v) -{ - return v ? "yes" : "no"; -} - /* parse_write_buffer_into_params - Helper function to parse debugfs write buffer into an array * * Function takes in attributes passed to debugfs write entry diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h index abd4dcd9f79c..e6da5a951132 100644 --- a/drivers/gpu/drm/i915/i915_utils.h +++ b/drivers/gpu/drm/i915/i915_utils.h @@ -27,6 +27,7 @@ #include #include +#include #include #include #include @@ -408,11 +409,6 @@ wait_remaining_ms_from_jiffies(unsigned long timestamp_jiffies, int to_wait_ms) #define MBps(x) KBps(1000 * (x)) #define GBps(x) ((u64)1000 * MBps((x))) -static inline const char *yesno(bool v) -{ - return v ? "yes" : "no"; -} - static inline const char *onoff(bool v) { return v ? "on" : "off"; diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c index 7d49fd4edc9e..c857d73abbd7 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c @@ -34,6 +34,7 @@ #include #include +#include #include #include #include @@ -2015,17 +2016,6 @@ static const struct file_operations rss_debugfs_fops = { /* RSS Configuration. */ -/* Small utility function to return the strings "yes" or "no" if the supplied - * argument is non-zero. - */ -static const char *yesno(int x) -{ - static const char *yes = "yes"; - static const char *no = "no"; - - return x ? yes : no; -} - static int rss_config_show(struct seq_file *seq, void *v) { struct adapter *adapter = seq->private; diff --git a/include/linux/string.h b/include/linux/string.h index 9521d8cab18e..fd946a5e18c8 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -308,4 +308,9 @@ static __always_inline size_t str_has_prefix(const char *str, const char *prefix return strncmp(str, prefix, len) == 0 ? len : 0; } +static inline const char *yesno(bool yes) +{ + return yes ? "yes" : "no"; +} + #endif /* _LINUX_STRING_H_ */ -- 2.30.0
[PATCH v1 3/3] string: Move enableddisabled() helper under string.h hood
We have already an implementation and a lot of code that can benefit of the enableddisabled() helper. Move it under string.h hood. Signed-off-by: Andy Shevchenko --- drivers/gpu/drm/i915/i915_utils.h | 5 - include/linux/string.h| 5 + 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h index d2ac357896d4..b05d72b4dd93 100644 --- a/drivers/gpu/drm/i915/i915_utils.h +++ b/drivers/gpu/drm/i915/i915_utils.h @@ -409,11 +409,6 @@ wait_remaining_ms_from_jiffies(unsigned long timestamp_jiffies, int to_wait_ms) #define MBps(x) KBps(1000 * (x)) #define GBps(x) ((u64)1000 * MBps((x))) -static inline const char *enableddisabled(bool v) -{ - return v ? "enabled" : "disabled"; -} - void add_taint_for_CI(struct drm_i915_private *i915, unsigned int taint); static inline void __add_taint_for_CI(unsigned int taint) { diff --git a/include/linux/string.h b/include/linux/string.h index 2a0589e945d9..25f055aa4c31 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -318,4 +318,9 @@ static inline const char *onoff(bool on) return on ? "on" : "off"; } +static inline const char *enableddisabled(bool enabled) +{ + return enabled ? "enabled" : "disabled"; +} + #endif /* _LINUX_STRING_H_ */ -- 2.30.0
Re: [PATCH v1 1/3] string: Consolidate yesno() helpers under string.h hood
Am 15.02.21 um 15:21 schrieb Andy Shevchenko: We have already few similar implementation and a lot of code that can benefit of the yesno() helper. Consolidate yesno() helpers under string.h hood. Signed-off-by: Andy Shevchenko Looks like a good idea to me, feel free to add an Acked-by: Christian König to the series. But looking at the use cases for this, wouldn't it make more sense to teach kprintf some new format modifier for this? Christian. --- .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c| 6 +- drivers/gpu/drm/i915/i915_utils.h| 6 +- drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c | 12 +--- include/linux/string.h | 5 + 4 files changed, 8 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c index 360952129b6d..7fde4f90e513 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c @@ -23,6 +23,7 @@ * */ +#include #include #include @@ -49,11 +50,6 @@ struct dmub_debugfs_trace_entry { uint32_t param1; }; -static inline const char *yesno(bool v) -{ - return v ? "yes" : "no"; -} - /* parse_write_buffer_into_params - Helper function to parse debugfs write buffer into an array * * Function takes in attributes passed to debugfs write entry diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h index abd4dcd9f79c..e6da5a951132 100644 --- a/drivers/gpu/drm/i915/i915_utils.h +++ b/drivers/gpu/drm/i915/i915_utils.h @@ -27,6 +27,7 @@ #include #include +#include #include #include #include @@ -408,11 +409,6 @@ wait_remaining_ms_from_jiffies(unsigned long timestamp_jiffies, int to_wait_ms) #define MBps(x) KBps(1000 * (x)) #define GBps(x) ((u64)1000 * MBps((x))) -static inline const char *yesno(bool v) -{ - return v ? "yes" : "no"; -} - static inline const char *onoff(bool v) { return v ? "on" : "off"; diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c index 7d49fd4edc9e..c857d73abbd7 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c @@ -34,6 +34,7 @@ #include #include +#include #include #include #include @@ -2015,17 +2016,6 @@ static const struct file_operations rss_debugfs_fops = { /* RSS Configuration. */ -/* Small utility function to return the strings "yes" or "no" if the supplied - * argument is non-zero. - */ -static const char *yesno(int x) -{ - static const char *yes = "yes"; - static const char *no = "no"; - - return x ? yes : no; -} - static int rss_config_show(struct seq_file *seq, void *v) { struct adapter *adapter = seq->private; diff --git a/include/linux/string.h b/include/linux/string.h index 9521d8cab18e..fd946a5e18c8 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -308,4 +308,9 @@ static __always_inline size_t str_has_prefix(const char *str, const char *prefix return strncmp(str, prefix, len) == 0 ? len : 0; } +static inline const char *yesno(bool yes) +{ + return yes ? "yes" : "no"; +} + #endif /* _LINUX_STRING_H_ */
commit 0f0aefd733f7 to linux-firmware effectively broke all of the setups with old kernels
Hi! Seems the commit 0f0aefd733f7 to linux-firmware effectively broke all of the setups with the old kernels. Firmware name is an ABI (!) and replacing it like this will definitely break systems with older kernels. Linux firmware package likely, but unfortunately, should carry on both versions as long as it's needed. Alternative solution is to provide the links during installation. Btw, I haven't seen the driver change for that. Care to provide a commit ID in upstream? -- With Best Regards, Andy Shevchenko
Re: [PATCH v1 1/3] string: Consolidate yesno() helpers under string.h hood
On Mon, 15 Feb 2021, Andy Shevchenko wrote: > We have already few similar implementation and a lot of code that can benefit > of the yesno() helper. Consolidate yesno() helpers under string.h hood. Good luck. I gave up after just four versions. [1] Acked-by: Jani Nikula BR, Jani. [1] http://lore.kernel.org/r/20191023131308.9420-1-jani.nik...@intel.com > > Signed-off-by: Andy Shevchenko > --- > .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c| 6 +- > drivers/gpu/drm/i915/i915_utils.h| 6 +- > drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c | 12 +--- > include/linux/string.h | 5 + > 4 files changed, 8 insertions(+), 21 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c > index 360952129b6d..7fde4f90e513 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c > @@ -23,6 +23,7 @@ > * > */ > > +#include > #include > > #include > @@ -49,11 +50,6 @@ struct dmub_debugfs_trace_entry { > uint32_t param1; > }; > > -static inline const char *yesno(bool v) > -{ > - return v ? "yes" : "no"; > -} > - > /* parse_write_buffer_into_params - Helper function to parse debugfs write > buffer into an array > * > * Function takes in attributes passed to debugfs write entry > diff --git a/drivers/gpu/drm/i915/i915_utils.h > b/drivers/gpu/drm/i915/i915_utils.h > index abd4dcd9f79c..e6da5a951132 100644 > --- a/drivers/gpu/drm/i915/i915_utils.h > +++ b/drivers/gpu/drm/i915/i915_utils.h > @@ -27,6 +27,7 @@ > > #include > #include > +#include > #include > #include > #include > @@ -408,11 +409,6 @@ wait_remaining_ms_from_jiffies(unsigned long > timestamp_jiffies, int to_wait_ms) > #define MBps(x) KBps(1000 * (x)) > #define GBps(x) ((u64)1000 * MBps((x))) > > -static inline const char *yesno(bool v) > -{ > - return v ? "yes" : "no"; > -} > - > static inline const char *onoff(bool v) > { > return v ? "on" : "off"; > diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c > b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c > index 7d49fd4edc9e..c857d73abbd7 100644 > --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c > +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c > @@ -34,6 +34,7 @@ > > #include > #include > +#include > #include > #include > #include > @@ -2015,17 +2016,6 @@ static const struct file_operations rss_debugfs_fops = > { > /* RSS Configuration. > */ > > -/* Small utility function to return the strings "yes" or "no" if the supplied > - * argument is non-zero. > - */ > -static const char *yesno(int x) > -{ > - static const char *yes = "yes"; > - static const char *no = "no"; > - > - return x ? yes : no; > -} > - > static int rss_config_show(struct seq_file *seq, void *v) > { > struct adapter *adapter = seq->private; > diff --git a/include/linux/string.h b/include/linux/string.h > index 9521d8cab18e..fd946a5e18c8 100644 > --- a/include/linux/string.h > +++ b/include/linux/string.h > @@ -308,4 +308,9 @@ static __always_inline size_t str_has_prefix(const char > *str, const char *prefix > return strncmp(str, prefix, len) == 0 ? len : 0; > } > > +static inline const char *yesno(bool yes) > +{ > + return yes ? "yes" : "no"; > +} > + > #endif /* _LINUX_STRING_H_ */ -- Jani Nikula, Intel Open Source Graphics Center
Re: [PATCH v1 1/3] string: Consolidate yesno() helpers under string.h hood
+Cc: Sakari and printk people On Mon, Feb 15, 2021 at 4:28 PM Christian König wrote: > Am 15.02.21 um 15:21 schrieb Andy Shevchenko: > > We have already few similar implementation and a lot of code that can > > benefit > > of the yesno() helper. Consolidate yesno() helpers under string.h hood. > > > > Signed-off-by: Andy Shevchenko > > Looks like a good idea to me, feel free to add an Acked-by: Christian > König to the series. Thanks. > But looking at the use cases for this, wouldn't it make more sense to > teach kprintf some new format modifier for this? As a next step? IIRC Sakari has at some point the series converted yesno and Co. to something which I don't remember the details of. Guys, what do you think? -- With Best Regards, Andy Shevchenko
Re: [PATCH net-next 09/12] net: dsa: tag_ocelot: create separate tagger for Seville
On Mon, Feb 15, 2021 at 05:15:21PM +0300, Dan Carpenter wrote: > On Mon, Feb 15, 2021 at 03:19:31PM +0200, Vladimir Oltean wrote: > > Hi Dan, > > > > On Mon, Feb 15, 2021 at 04:00:04PM +0300, Dan Carpenter wrote: > > > db->index is less than db->num_ports which 32 or less but sometimes it > > > comes from the device tree so who knows. > > > > The destination port mask is copied into a 12-bit field of the packet, > > starting at bit offset 67 and ending at 56: > > > > static inline void ocelot_ifh_set_dest(void *injection, u64 dest) > > { > > packing(injection, &dest, 67, 56, OCELOT_TAG_LEN, PACK, 0); > > } > > > > So this DSA tagging protocol supports at most 12 bits, which is clearly > > less than 32. Attempting to send to a port number > 12 will cause the > > packing() call to truncate way before there will be 32-bit truncation > > due to type promotion of the BIT(port) argument towards u64. > > > > > The ocelot_ifh_set_dest() function takes a u64 though and that > > > suggests that BIT() should be changed to BIT_ULL(). > > > > I understand that you want to silence the warning, which fundamentally > > comes from the packing() API which works with u64 values and nothing of > > a smaller size. So I can send a patch which replaces BIT(port) with > > BIT_ULL(port), even if in practice both are equally fine. > > I don't have a strong feeling about this... Generally silencing > warnings just to make a checker happy is the wrong idea. In this case it is a harmless wrong idea. > To be honest, I normally ignore these warnings. But I have been looking > at them recently to try figure out if we could make it so it would only > generate a warning where "db->index" was known as possibly being in the > 32-63 range. So I looked at this one. > > And now I see some ways that Smatch could have parsed this better... For DSA, the dp->index should be lower than ds->num_ports by construction (see dsa_switch_touch_ports). In turn, ds->num_ports is set to constant values smaller than 12 in felix_pci_probe() and in seville_probe(). For ocelot on the other hand, there is a restriction put in mscc_ocelot_init_ports that the port must be <= ocelot->num_phys_ports, which is set to "of_get_child_count(ports)". So there is indeed a possible attack by device tree on the ocelot driver. The number of physical ports does not depend on device tree (arch/mips/boot/dts/mscc/ocelot.dtsi), but should be hardcoded to 11. How many ports there are defined in DT doesn't change how many physical ports there are. For example, the CPU port module is at index 11, and in the code it is referenced as ocelot->ports[ocelot->num_phys_ports]. If num_phys_ports has any other value than 11, the driver malfunctions because the index of the CPU port is misidentified. I'd rather fix this if there was some way in which static analysis could then determine that "port" is bounded by a constant smaller than the truncation threshold. However, I'm not sure how to classify the severity of this problem. There's a million of other ways in which the system can malfunction if it is being "attacked by device tree".
Re: [PATCH] net: dsa: mv88e6xxx: prevent 2500BASEX mode override
On Mon, Feb 15, 2021 at 02:47:56PM +0100, Marek Behun wrote: > On Mon, 15 Feb 2021 06:15:59 + > Nathan Rossi wrote: > > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c > > b/drivers/net/dsa/mv88e6xxx/chip.c > > index 54aa942eed..5c52906b29 100644 > > --- a/drivers/net/dsa/mv88e6xxx/chip.c > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > > @@ -650,6 +650,13 @@ static void mv88e6xxx_validate(struct dsa_switch *ds, > > int port, > > if (chip->info->ops->phylink_validate) > > chip->info->ops->phylink_validate(chip, port, mask, state); > > > > + /* Advertise 2500BASEX only if 1000BASEX is not configured, this > > +* prevents phylink_helper_basex_speed from always overriding the > > +* 1000BASEX mode since auto negotiation is always enabled. > > +*/ > > + if (state->interface == PHY_INTERFACE_MODE_1000BASEX) > > + phylink_clear(mask, 2500baseX_Full); > > + > > I don't quite like this. This problem should be either solved in > phylink_helper_basex_speed() or somewhere in the mv88e6xxx code, but near > the call to phylink_helper_basex_speed(). > > Putting a solution to the behaviour of phylink_helper_basex_speed() it > into the validate() method when phylink_helper_basex_speed() is called > from a different place will complicate debugging in the future. If > we start solving problems in this kind of way, the driver will become > totally unreadable, IMO. If we can't switch between 1000base-X and 2500base-X, then we should not be calling phylink_helper_basex_speed() - and only one of those two capabilities should be set in the validation callback. I thought there were DSA switches where we could program the CMODE to switch between these two... -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: [net-next] tcp: Sanitize CMSG flags and reserved args in tcp_zerocopy_receive.
On 2/15/21 5:03 AM, Dan Carpenter wrote: > Hi Arjun, > > url: > https://github.com/0day-ci/linux/commits/Arjun-Roy/tcp-Sanitize-CMSG-flags-and-reserved-args-in-tcp_zerocopy_receive/20210212-052537 > base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git > e4b62cf7559f2ef9a022de235e5a09a8d7ded520 > config: x86_64-randconfig-m001-20210209 (attached as .config) > compiler: gcc-9 (Debian 9.3.0-15) 9.3.0 > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot > Reported-by: Dan Carpenter > > smatch warnings: > net/ipv4/tcp.c:4158 do_tcp_getsockopt() warn: check for integer overflow 'len' > > vim +/len +4158 net/ipv4/tcp.c > > 3fdadf7d27e3fb Dmitry Mishin2006-03-20 3896 static int > do_tcp_getsockopt(struct sock *sk, int level, > 3fdadf7d27e3fb Dmitry Mishin2006-03-20 3897 int > optname, char __user *optval, int __user *optlen) > ^1da177e4c3f41 Linus Torvalds 2005-04-16 3898 { > 295f7324ff8d9e Arnaldo Carvalho de Melo 2005-08-09 3899 struct > inet_connection_sock *icsk = inet_csk(sk); > ^1da177e4c3f41 Linus Torvalds 2005-04-16 3900 struct tcp_sock > *tp = tcp_sk(sk); > 6fa251663069e0 Nikolay Borisov 2016-02-03 3901 struct net *net > = sock_net(sk); > ^1da177e4c3f41 Linus Torvalds 2005-04-16 3902 int val, len; > > "len" is int. > > [ snip ] > 05255b823a6173 Eric Dumazet 2018-04-27 4146 #ifdef CONFIG_MMU > 05255b823a6173 Eric Dumazet 2018-04-27 4147 case > TCP_ZEROCOPY_RECEIVE: { > 7eeba1706eba6d Arjun Roy2021-01-20 4148 struct > scm_timestamping_internal tss; > e0fecb289ad3fd Arjun Roy2020-12-10 4149 struct > tcp_zerocopy_receive zc = {}; > 05255b823a6173 Eric Dumazet 2018-04-27 4150 int err; > 05255b823a6173 Eric Dumazet 2018-04-27 4151 > 05255b823a6173 Eric Dumazet 2018-04-27 4152 if > (get_user(len, optlen)) > 05255b823a6173 Eric Dumazet 2018-04-27 4153 > return -EFAULT; > c8856c05145490 Arjun Roy2020-02-14 4154 if (len > < offsetofend(struct tcp_zerocopy_receive, length)) > 05255b823a6173 Eric Dumazet 2018-04-27 4155 > return -EINVAL; > > > The problem is that negative values of "len" are type promoted to high > positive values. So the fix is to write this as: > > if (len < 0 || len < offsetofend(struct tcp_zerocopy_receive, length)) > return -EINVAL; > > 110912bdf28392 Arjun Roy2021-02-11 4156 if > (unlikely(len > sizeof(zc))) { > 110912bdf28392 Arjun Roy2021-02-11 4157 > err = check_zeroed_user(optval + sizeof(zc), > 110912bdf28392 Arjun Roy2021-02-11 @4158 > len - sizeof(zc)); > > > Potentially "len - a negative value". > > get_user(len, optlen) is called multiple times in that function. len < 0 was checked after the first one at the top. Also, maybe I am missing something here, but offsetofend can not return a negative value, so this checks catches len < 0 as well: if (len < offsetofend(struct tcp_zerocopy_receive, length)) return -EINVAL;
Re: [PATCH net-next 1/2] net: mvneta: Remove per-cpu queue mapping for Armada 3700
Hi Pali, On Mon, 15 Feb 2021 00:50:58 +0100 Pali Rohár wrote: >> According to Errata #23 "The per-CPU GbE interrupt is limited to Core >> 0", we can't use the per-cpu interrupt mechanism on the Armada 3700 >> familly. >> >> This is correctly checked for RSS configuration, but the initial queue >> mapping is still done by having the queues spread across all the CPUs in >> the system, both in the init path and in the cpu_hotplug path. > >Hello Maxime! > >This patch looks like a bug fix for Armada 3700 SoC. What about marking >this commit with Fixes line? E.g.: > >Fixes: 2636ac3cc2b4 ("net: mvneta: Add network support for Armada 3700 > SoC") Yes you're correct, I'll add that to the V2 ! Thanks for the review, Maxime -- Maxime Chevallier, Bootlin Embedded Linux and kernel engineering https://bootlin.com
Re: [PATCH net-next 2/2] net: mvneta: Implement mqprio support
Hi Andrew, On Sat, 13 Feb 2021 20:45:25 +0100 Andrew Lunn wrote: >On Fri, Feb 12, 2021 at 04:12:20PM +0100, Maxime Chevallier wrote: >> +static void mvneta_setup_rx_prio_map(struct mvneta_port *pp) >> +{ >> +int i; >> +u32 val = 0; > >Hi Maxime > >Reverse Chrismtass tree please. Ah yes sorry, I'll fix that in V2. Thanks for the review, Maxime -- Maxime Chevallier, Bootlin Embedded Linux and kernel engineering https://bootlin.com
Re: [PATCH] net: dsa: mv88e6xxx: prevent 2500BASEX mode override
> If we can't switch between 1000base-X and 2500base-X, then we should > not be calling phylink_helper_basex_speed() - and only one of those > two capabilities should be set in the validation callback. I thought > there were DSA switches where we could program the CMODE to switch > between these two... 6390 family has a writable CMODE for ports 9 and 10, and you can select various SERDES modes. Nathan, what device are you using? Andrew
Re: [net-next PATCH v5 15/15] net: dpaa2-mac: Add ACPI support for DPAA2 MAC driver
On Mon, Feb 15, 2021 at 2:33 PM Calvin Johnson wrote: > On Mon, Feb 08, 2021 at 04:28:31PM +, Russell King - ARM Linux admin > wrote: ... > I think of_phy_is_fixed_link() needs to be fixed. I'll add below fix. > > --- a/drivers/net/mdio/of_mdio.c > +++ b/drivers/net/mdio/of_mdio.c > @@ -439,6 +439,9 @@ bool of_phy_is_fixed_link(struct device_node *np) > int len, err; > const char *managed; > > + if (!np) > + return false; AFAICS this doesn't add anything: all of the of_* APIs should handle OF nodes being NULL below. > /* New binding */ > dn = of_get_child_by_name(np, "fixed-link"); > if (dn) { -- With Best Regards, Andy Shevchenko
Re: [PATCH 2/4] net: stmmac: Add Toshiba Visconti SoCs glue driver
Hi, On Mon, Feb 15, 2021 at 11:22:33AM +0200, Leon Romanovsky wrote: > On Mon, Feb 15, 2021 at 04:28:09PM +0900, Nobuhiro Iwamatsu wrote: > > > > I have received your point out and have sent an email with the content > > to remove this line. But it may not have arrived yet... > > > > > Why did you use "def_bool y" and not "default y"? Isn't it supposed to be > > > "depends on STMMAC_ETH"? And probably it shouldn't be set as a default as > > > "y". > > > > > > > The reason why "def_bool y" was set is that the wrong fix was left when > > debugging. Also, I don't think it is necessary to set "default y". > > This is also incorrect because it says "bool" Toshiba Visconti DWMAC > > support "". I change it to trustate in the new patch. > > > > And this driver is enabled when STMMAC_PLATFORM was Y. And STMMAC_PLATFORM > > depends on STMMAC_ETH. > > So I understand that STMMAC_ETH does not need to be dependents. Is this > > understanding wrong? > > This is correct understanding, just need to clean other entries in that > Kconfig that depends on STMMAC_ETH. OK, thanks. Best regards, Nobuhiro
Re: [net-next PATCH v5 15/15] net: dpaa2-mac: Add ACPI support for DPAA2 MAC driver
On Mon, Feb 15, 2021 at 5:13 PM Andy Shevchenko wrote: > > On Mon, Feb 15, 2021 at 2:33 PM Calvin Johnson > wrote: > > On Mon, Feb 08, 2021 at 04:28:31PM +, Russell King - ARM Linux admin > > wrote: > > ... > > > I think of_phy_is_fixed_link() needs to be fixed. I'll add below fix. > > > > --- a/drivers/net/mdio/of_mdio.c > > +++ b/drivers/net/mdio/of_mdio.c > > @@ -439,6 +439,9 @@ bool of_phy_is_fixed_link(struct device_node *np) > > int len, err; > > const char *managed; > > > > + if (!np) > > + return false; > > AFAICS this doesn't add anything: all of the of_* APIs should handle > OF nodes being NULL below. > > > /* New binding */ > > dn = of_get_child_by_name(np, "fixed-link"); > > if (dn) { Yes, of_get_next_child() and of_get_property() are NULL aware. So, the check is redundant. -- With Best Regards, Andy Shevchenko
Re: [PATCH] net: dsa: mv88e6xxx: prevent 2500BASEX mode override
On Mon, 15 Feb 2021 14:57:57 + Russell King - ARM Linux admin wrote: > On Mon, Feb 15, 2021 at 02:47:56PM +0100, Marek Behun wrote: > > On Mon, 15 Feb 2021 06:15:59 + > > Nathan Rossi wrote: > > > > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c > > > b/drivers/net/dsa/mv88e6xxx/chip.c > > > index 54aa942eed..5c52906b29 100644 > > > --- a/drivers/net/dsa/mv88e6xxx/chip.c > > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > > > @@ -650,6 +650,13 @@ static void mv88e6xxx_validate(struct dsa_switch > > > *ds, int port, > > > if (chip->info->ops->phylink_validate) > > > chip->info->ops->phylink_validate(chip, port, mask, state); > > > > > > + /* Advertise 2500BASEX only if 1000BASEX is not configured, this > > > + * prevents phylink_helper_basex_speed from always overriding the > > > + * 1000BASEX mode since auto negotiation is always enabled. > > > + */ > > > + if (state->interface == PHY_INTERFACE_MODE_1000BASEX) > > > + phylink_clear(mask, 2500baseX_Full); > > > + > > > > I don't quite like this. This problem should be either solved in > > phylink_helper_basex_speed() or somewhere in the mv88e6xxx code, but near > > the call to phylink_helper_basex_speed(). > > > > Putting a solution to the behaviour of phylink_helper_basex_speed() it > > into the validate() method when phylink_helper_basex_speed() is called > > from a different place will complicate debugging in the future. If > > we start solving problems in this kind of way, the driver will become > > totally unreadable, IMO. > > If we can't switch between 1000base-X and 2500base-X, then we should > not be calling phylink_helper_basex_speed() - and only one of those > two capabilities should be set in the validation callback. I thought > there were DSA switches where we could program the CMODE to switch > between these two... There are. At least Peridot, Topaz and Amethyst support switching between these modes. But only on some ports. This problem happnes on Peridot X, I think. On Peridot X there are - port 0: RGMII - ports 9-10: capable of 1, 2.5 and 10G SerDes (10G via XAUI/RXAUI, so multiple lanes) - ports 1-8: with copper PHYs - some of these can instead be set to use the unused SerDes lanes of ports 9-10, but only in 1000base-x mode So the problem can happen if you set port 9 or 10 to only use one SerDes lane, and use the spare lanes for the 1G ports. On these ports 2500base-x is not supported, only 1000base-x (maybe sgmii, I don't remember) Marek
general protection fault in nl802154_add_llsec_key
Hello, syzbot found the following issue on: HEAD commit:57baf8cc net: axienet: Handle deferred probe on clock prop.. git tree: net console output: https://syzkaller.appspot.com/x/log.txt?x=15c84ed2d0 kernel config: https://syzkaller.appspot.com/x/.config?x=8cb23303ddb9411f dashboard link: https://syzkaller.appspot.com/bug?extid=ce4e062c2d51977ddc50 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1332a822d0 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12760822d0 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+ce4e062c2d51977dd...@syzkaller.appspotmail.com general protection fault, probably for non-canonical address 0xdc00: [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x-0x0007] CPU: 1 PID: 8444 Comm: syz-executor279 Not tainted 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:nla_len include/net/netlink.h:1148 [inline] RIP: 0010:nla_parse_nested_deprecated include/net/netlink.h:1231 [inline] RIP: 0010:nl802154_add_llsec_key+0x1f7/0x560 net/ieee802154/nl802154.c:1547 Code: 00 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 2f 03 00 00 48 8b 93 28 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 d1 48 c1 e9 03 <0f> b6 0c 01 48 89 d0 83 e0 07 83 c0 01 38 c8 7c 08 84 c9 0f 85 c9 RSP: 0018:c90001adf4a0 EFLAGS: 00010246 RAX: dc00 RBX: 88802117c000 RCX: RDX: RSI: 88892195 RDI: 88802117c128 RBP: 19200035be95 R08: R09: R10: 87315ffa R11: R12: 88801c222000 R13: 88801bb34bd0 R14: c90001adf8b0 R15: FS: 01dcd300() GS:8880b9d0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 26c8 CR3: 2146a000 CR4: 001506e0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494 genl_rcv+0x24/0x40 net/netlink/genetlink.c:811 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 sys_sendmsg+0x6e8/0x810 net/socket.c:2345 ___sys_sendmsg+0xf3/0x170 net/socket.c:2399 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2432 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x43f8a9 Code: 28 c3 e8 5a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:7ffe56c91f68 EFLAGS: 0246 ORIG_RAX: 002e RAX: ffda RBX: 004004a0 RCX: 0043f8a9 RDX: RSI: 2a00 RDI: 0004 RBP: 00403310 R08: 004004a0 R09: 004004a0 R10: 0003 R11: 0246 R12: 004033a0 R13: R14: 004ad018 R15: 004004a0 Modules linked in: ---[ end trace 1844a7d5c231fd88 ]--- RIP: 0010:nla_len include/net/netlink.h:1148 [inline] RIP: 0010:nla_parse_nested_deprecated include/net/netlink.h:1231 [inline] RIP: 0010:nl802154_add_llsec_key+0x1f7/0x560 net/ieee802154/nl802154.c:1547 Code: 00 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 2f 03 00 00 48 8b 93 28 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 d1 48 c1 e9 03 <0f> b6 0c 01 48 89 d0 83 e0 07 83 c0 01 38 c8 7c 08 84 c9 0f 85 c9 RSP: 0018:c90001adf4a0 EFLAGS: 00010246 RAX: dc00 RBX: 88802117c000 RCX: RDX: RSI: 88892195 RDI: 88802117c128 RBP: 19200035be95 R08: R09: R10: 87315ffa R11: R12: 88801c222000 R13: 88801bb34bd0 R14: c90001adf8b0 R15: FS: 01dcd300() GS:8880b9d0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 26c8 CR3: 2146a000 CR4: 001506e0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. syzbot ca
Re: [PATCH 2/4] net: stmmac: Add Toshiba Visconti SoCs glue driver
Hi, On Mon, Feb 15, 2021 at 01:19:18PM +0100, Arnd Bergmann wrote: > On Mon, Feb 15, 2021 at 10:23 AM Leon Romanovsky wrote: > > On Mon, Feb 15, 2021 at 04:28:09PM +0900, Nobuhiro Iwamatsu wrote: > > > > > > Sorry, I sent the wrong patchset that didn't fix this point out. > > > > > > > I asked it before, but never received an answer. > > > > > > I have received your point out and have sent an email with the content > > > to remove this line. But it may not have arrived yet... > > > > > > > Why did you use "def_bool y" and not "default y"? Isn't it supposed to > > > > be > > > > "depends on STMMAC_ETH"? And probably it shouldn't be set as a default > > > > as "y". > > > > > > > > > > The reason why "def_bool y" was set is that the wrong fix was left when > > > debugging. Also, I don't think it is necessary to set "default y". > > > This is also incorrect because it says "bool" Toshiba Visconti DWMAC > > > support "". I change it to trustate in the new patch. > > > > > > And this driver is enabled when STMMAC_PLATFORM was Y. And STMMAC_PLATFORM > > > depends on STMMAC_ETH. > > > So I understand that STMMAC_ETH does not need to be dependents. Is this > > > understanding wrong? > > > > This is correct understanding, just need to clean other entries in that > > Kconfig that depends on STMMAC_ETH. > > 'tristate' with no default sounds right. I see that some platforms have a > default according to the platform, which also makes sense but isn't > required. What I would suggest though is a dependency on the platform, > to make it easier to disable the front-end based on which platforms > are enabled. This would end up as > > config DWMAC_VISCONTI > tristate "Toshiba Visconti DWMAC support" > depends on ARCH_VISCONTI || COMPILE_TEST > depends on OF && COMMON_CLK # only add this line if it's > required for compilation > default ARCH_VISCONTI > The fix at hand is the same as your suggestion. Thank you for your comment. > Arnd > Best regards, Nobuhiro
[net-next] net: mvpp2: Add TX flow control support for jumbo frames
From: Stefan Chulski With MTU less than 1500B on all ports, the driver uses per CPU pool mode. If one of the ports set to jumbo frame MTU size, all ports move to shared pools mode. Here, buffer manager TX Flow Control reconfigured on all ports. Signed-off-by: Stefan Chulski --- drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 26 1 file changed, 26 insertions(+) diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c index 222e9a3..10c17d1 100644 --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c @@ -924,6 +924,25 @@ static void mvpp2_bm_pool_update_fc(struct mvpp2_port *port, spin_unlock_irqrestore(&port->priv->mss_spinlock, flags); } +/* disable/enable flow control for BM pool on all ports */ +static void mvpp2_bm_pool_update_priv_fc(struct mvpp2 *priv, bool en) +{ + struct mvpp2_port *port; + int i; + + for (i = 0; i < priv->port_count; i++) { + port = priv->port_list[i]; + if (port->priv->percpu_pools) { + for (i = 0; i < port->nrxqs; i++) + mvpp2_bm_pool_update_fc(port, &port->priv->bm_pools[i], + port->tx_fc & en); + } else { + mvpp2_bm_pool_update_fc(port, port->pool_long, port->tx_fc & en); + mvpp2_bm_pool_update_fc(port, port->pool_short, port->tx_fc & en); + } + } +} + static int mvpp2_enable_global_fc(struct mvpp2 *priv) { int val, timeout = 0; @@ -4913,6 +4932,7 @@ static int mvpp2_set_mac_address(struct net_device *dev, void *p) */ static int mvpp2_bm_switch_buffers(struct mvpp2 *priv, bool percpu) { + bool change_percpu = (percpu != priv->percpu_pools); int numbufs = MVPP2_BM_POOLS_NUM, i; struct mvpp2_port *port = NULL; bool status[MVPP2_MAX_PORTS]; @@ -4928,6 +4948,9 @@ static int mvpp2_bm_switch_buffers(struct mvpp2 *priv, bool percpu) if (priv->percpu_pools) numbufs = port->nrxqs * 2; + if (change_percpu) + mvpp2_bm_pool_update_priv_fc(priv, false); + for (i = 0; i < numbufs; i++) mvpp2_bm_pool_destroy(port->dev->dev.parent, priv, &priv->bm_pools[i]); @@ -4942,6 +4965,9 @@ static int mvpp2_bm_switch_buffers(struct mvpp2 *priv, bool percpu) mvpp2_open(port->dev); } + if (change_percpu) + mvpp2_bm_pool_update_priv_fc(priv, true); + return 0; } -- 1.9.1
selftests: tls: multi_chunk_sendfile: Test terminated by timeout (constant failure)
Hi Pooja, commit 0e6fbe39bdf71b4e665767bcbf53567a3e6d0623 Author: Pooja Trivedi Date: Fri Jun 5 16:01:18 2020 + net/tls(TLS_SW): Add selftest for 'chunked' sendfile test your multi_chunk_sendfile test is constantly failing for me (x86_64 defconfig + tools/testing/selftests/net/config + TLS) on 5.10.y: ... # RUN tls.12.multi_chunk_sendfile ... # multi_chunk_sendfile: Test terminated by timeout # FAIL tls.12.multi_chunk_sendfile not ok 6 tls.12.multi_chunk_sendfile ... # RUN tls.13.multi_chunk_sendfile ... # multi_chunk_sendfile: Test terminated by timeout # FAIL tls.13.multi_chunk_sendfile not ok 51 tls.13.multi_chunk_sendfile i tried bumping up the timeout to an insane value, but that didn't change the outcome - anything particular i should check? -- bye, p.
[PATCH v4 1/4] dt-bindings: net: Add DT bindings for Toshiba Visconti TMPV7700 SoC
Add device tree bindings for ethernet controller of Toshiba Visconti TMPV7700 SoC series. Signed-off-by: Nobuhiro Iwamatsu --- .../bindings/net/toshiba,visconti-dwmac.yaml | 85 +++ 1 file changed, 85 insertions(+) create mode 100644 Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml diff --git a/Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml b/Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml new file mode 100644 index ..59724d18e6f3 --- /dev/null +++ b/Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml @@ -0,0 +1,85 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: "http://devicetree.org/schemas/net/toshiba,visconti-dwmac.yaml#"; +$schema: "http://devicetree.org/meta-schemas/core.yaml#"; + +title: Toshiba Visconti DWMAC Ethernet controller + +maintainers: + - Nobuhiro Iwamatsu + +select: + properties: +compatible: + contains: +enum: + - toshiba,visconti-dwmac + required: +- compatible + +allOf: + - $ref: "snps,dwmac.yaml#" + +properties: + compatible: +oneOf: + - items: + - enum: + - toshiba,visconti-dwmac + - const: snps,dwmac-4.20a + + reg: +maxItems: 1 + + clocks: +items: + - description: main clock + - description: PHY reference clock + + clock-names: +items: + - const: stmmaceth + - const: phy_ref_clk + +required: + - compatible + - reg + - clocks + - clock-names + +unevaluatedProperties: false + +examples: + - | +#include + +soc { +#address-cells = <2>; +#size-cells = <2>; + +piether: ethernet@2800 { +compatible = "toshiba,visconti-dwmac", "snps,dwmac-4.20a"; +reg = <0 0x2800 0 0x1>; +interrupts = ; +interrupt-names = "macirq"; +clocks = <&clk300mhz>, <&clk125mhz>; +clock-names = "stmmaceth", "phy_ref_clk"; +snps,txpbl = <4>; +snps,rxpbl = <4>; +snps,tso; +phy-mode = "rgmii-id"; +phy-handle = <&phy0>; + +mdio0 { +#address-cells = <0x1>; +#size-cells = <0x0>; +compatible = "snps,dwmac-mdio"; + +phy0: ethernet-phy@1 { +device_type = "ethernet-phy"; +reg = <0x1>; +}; +}; +}; +}; -- 2.30.0.rc2
[PATCH v4 0/4] net: stmmac: Add Toshiba Visconti SoCs glue driver
Hi, This series is the ethernet driver for Toshiba's ARM SoC, Visconti[0]. This provides DT binding documentation, device driver, MAINTAINER files, and updates to DT files. Best regards, Nobuhiro [0]: https://toshiba.semicon-storage.com/ap-en/semiconductor/product/image-recognition-processors-visconti.htmli Updates: dt-bindings: net: Add DT bindings for Toshiba Visconti TMPV7700 SoC v3 -> v4: No update. Resend the correct patch series. v2 -> v3: Change to the correct dwmac version. Remove description of reg property. Update examples, drop dma-coherent, add snps,tso, fix phy-mode. v1 -> v2: No update. net: stmmac: Add Toshiba Visconti SoCs glue driver v3 -> v4: No update. Resend the correct patch series. https://lore.kernel.org/netdev/20210215050655.2532-3-nobuhiro1.iwama...@toshiba.co.jp/ did not contain the following fixes: - Drop def_bool y in Kconfig - Change from bool to tristate in Kconfig option. - Add 'default ARCH_VISCONTI' to Kconfig option. v2 -> v3: Remove code that is no longer needed by using compatible string. Drop def_bool y in Kconfig Change from bool to tristate in Kconfig option. Add 'default ARCH_VISCONTI' to Kconfig option. v1 -> v2: Use reverse christmas tree ordering for local variable declarations. MAINTAINERS: Add entries for Toshiba Visconti ethernet controller v3 -> v4: No update. Resend the correct patch series. v2 -> v3: No update. v1 -> v2: No update. arm: dts: visconti: Add DT support for Toshiba Visconti5 ethernet controller v3 -> v4: No update v2 -> v3: Add "snps,dwmac-4.20a" as compatible string. Add snps,tso. v1 -> v2: No update. Nobuhiro Iwamatsu (4): dt-bindings: net: Add DT bindings for Toshiba Visconti TMPV7700 SoC net: stmmac: Add Toshiba Visconti SoCs glue driver MAINTAINERS: Add entries for Toshiba Visconti ethernet controller arm: dts: visconti: Add DT support for Toshiba Visconti5 ethernet controller .../bindings/net/toshiba,visconti-dwmac.yaml | 85 ++ MAINTAINERS | 2 + .../boot/dts/toshiba/tmpv7708-rm-mbrc.dts | 18 ++ arch/arm64/boot/dts/toshiba/tmpv7708.dtsi | 25 ++ drivers/net/ethernet/stmicro/stmmac/Kconfig | 8 + drivers/net/ethernet/stmicro/stmmac/Makefile | 1 + .../ethernet/stmicro/stmmac/dwmac-visconti.c | 285 ++ 7 files changed, 424 insertions(+) create mode 100644 Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c -- 2.30.0.rc2
[PATCH v4 2/4] net: stmmac: Add Toshiba Visconti SoCs glue driver
Add dwmac-visconti to the stmmac driver in Toshiba Visconti ARM SoCs. This patch contains only the basic function of the device. There is no clock control, PM, etc. yet. These will be added in the future. Signed-off-by: Nobuhiro Iwamatsu --- drivers/net/ethernet/stmicro/stmmac/Kconfig | 8 + drivers/net/ethernet/stmicro/stmmac/Makefile | 1 + .../ethernet/stmicro/stmmac/dwmac-visconti.c | 285 ++ 3 files changed, 294 insertions(+) create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig index 53f14c5a9e02..e675ba12fde2 100644 --- a/drivers/net/ethernet/stmicro/stmmac/Kconfig +++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig @@ -219,6 +219,14 @@ config DWMAC_INTEL_PLAT This selects the Intel platform specific glue layer support for the stmmac device driver. This driver is used for the Intel Keem Bay SoC. + +config DWMAC_VISCONTI + tristate "Toshiba Visconti DWMAC support" + default ARCH_VISCONTI + depends on OF && COMMON_CLK && (ARCH_VISCONTI || COMPILE_TEST) + help + Support for ethernet controller on Visconti SoCs. + endif config DWMAC_INTEL diff --git a/drivers/net/ethernet/stmicro/stmmac/Makefile b/drivers/net/ethernet/stmicro/stmmac/Makefile index 24e6145d4eae..366740ab9c5a 100644 --- a/drivers/net/ethernet/stmicro/stmmac/Makefile +++ b/drivers/net/ethernet/stmicro/stmmac/Makefile @@ -29,6 +29,7 @@ obj-$(CONFIG_DWMAC_DWC_QOS_ETH) += dwmac-dwc-qos-eth.o obj-$(CONFIG_DWMAC_INTEL_PLAT) += dwmac-intel-plat.o obj-$(CONFIG_DWMAC_GENERIC)+= dwmac-generic.o obj-$(CONFIG_DWMAC_IMX8) += dwmac-imx.o +obj-$(CONFIG_DWMAC_VISCONTI) += dwmac-visconti.o stmmac-platform-objs:= stmmac_platform.o dwmac-altr-socfpga-objs := altr_tse_pcs.o dwmac-socfpga.o diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c new file mode 100644 index ..b7a0c57dfbfb --- /dev/null +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c @@ -0,0 +1,285 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Toshiba Visconti Ethernet Support + * + * (C) Copyright 2020 TOSHIBA CORPORATION + * (C) Copyright 2020 Toshiba Electronic Devices & Storage Corporation + */ + +#include +#include +#include +#include + +#include "stmmac_platform.h" +#include "dwmac4.h" + +#define REG_ETHER_CONTROL 0x52D4 +#define ETHER_ETH_CONTROL_RESET BIT(17) + +#define REG_ETHER_CLOCK_SEL0x52D0 +#define ETHER_CLK_SEL_TX_CLK_EN BIT(0) +#define ETHER_CLK_SEL_RX_CLK_EN BIT(1) +#define ETHER_CLK_SEL_RMII_CLK_EN BIT(2) +#define ETHER_CLK_SEL_RMII_CLK_RST BIT(3) +#define ETHER_CLK_SEL_DIV_SEL_2 BIT(4) +#define ETHER_CLK_SEL_DIV_SEL_20 BIT(0) +#define ETHER_CLK_SEL_FREQ_SEL_125M(BIT(9) | BIT(8)) +#define ETHER_CLK_SEL_FREQ_SEL_50M BIT(9) +#define ETHER_CLK_SEL_FREQ_SEL_25M BIT(8) +#define ETHER_CLK_SEL_FREQ_SEL_2P5MBIT(0) +#define ETHER_CLK_SEL_TX_CLK_EXT_SEL_IN BIT(0) +#define ETHER_CLK_SEL_TX_CLK_EXT_SEL_TXC BIT(10) +#define ETHER_CLK_SEL_TX_CLK_EXT_SEL_DIV BIT(11) +#define ETHER_CLK_SEL_RX_CLK_EXT_SEL_IN BIT(0) +#define ETHER_CLK_SEL_RX_CLK_EXT_SEL_RXC BIT(12) +#define ETHER_CLK_SEL_RX_CLK_EXT_SEL_DIV BIT(13) +#define ETHER_CLK_SEL_TX_CLK_O_TX_I BIT(0) +#define ETHER_CLK_SEL_TX_CLK_O_RMII_I BIT(14) +#define ETHER_CLK_SEL_TX_O_E_N_IN BIT(15) +#define ETHER_CLK_SEL_RMII_CLK_SEL_IN BIT(0) +#define ETHER_CLK_SEL_RMII_CLK_SEL_RX_C BIT(16) + +#define ETHER_CLK_SEL_RX_TX_CLK_EN (ETHER_CLK_SEL_RX_CLK_EN | ETHER_CLK_SEL_TX_CLK_EN) + +#define ETHER_CONFIG_INTF_MII 0 +#define ETHER_CONFIG_INTF_RGMII BIT(0) +#define ETHER_CONFIG_INTF_RMII BIT(2) + +struct visconti_eth { + void __iomem *reg; + u32 phy_intf_sel; + struct clk *phy_ref_clk; + spinlock_t lock; /* lock to protect register update */ +}; + +static void visconti_eth_fix_mac_speed(void *priv, unsigned int speed) +{ + struct visconti_eth *dwmac = priv; + unsigned int val, clk_sel_val; + unsigned long flags; + + spin_lock_irqsave(&dwmac->lock, flags); + + /* adjust link */ + val = readl(dwmac->reg + MAC_CTRL_REG); + val &= ~(GMAC_CONFIG_PS | GMAC_CONFIG_FES); + + switch (speed) { + case SPEED_1000: + if (dwmac->phy_intf_sel == ETHER_CONFIG_INTF_RGMII) + clk_sel_val = ETHER_CLK_SEL_FREQ_SEL_125M; + break; + case SPEED_100: + if (dwmac->phy_intf_sel == ETHER_CONFIG_INTF_RGMII) + clk_sel_val = ETHER_CLK_SEL_FREQ_SEL_25M; + if (dwmac->phy_intf_sel == ETHER_CONFIG_INTF_RMII) + clk_sel_val = ETHER_CLK_SEL_DIV_SEL_2; + val |= GMAC_CONFIG_PS | GMAC_CONFIG_FES; + break; + case SPEED_10: + if (dwmac->phy_intf_sel == ETHER_CONFIG_INTF_
[PATCH v4 4/4] arm: dts: visconti: Add DT support for Toshiba Visconti5 ethernet controller
Add the ethernet controller node in Toshiba Visconti5 SoC-specific DT file. And enable this node in TMPV7708 RM main board's board-specific DT file. Signed-off-by: Nobuhiro Iwamatsu --- .../boot/dts/toshiba/tmpv7708-rm-mbrc.dts | 18 + arch/arm64/boot/dts/toshiba/tmpv7708.dtsi | 25 +++ 2 files changed, 43 insertions(+) diff --git a/arch/arm64/boot/dts/toshiba/tmpv7708-rm-mbrc.dts b/arch/arm64/boot/dts/toshiba/tmpv7708-rm-mbrc.dts index ed0bf7f13f54..48fa8776e36f 100644 --- a/arch/arm64/boot/dts/toshiba/tmpv7708-rm-mbrc.dts +++ b/arch/arm64/boot/dts/toshiba/tmpv7708-rm-mbrc.dts @@ -41,3 +41,21 @@ &uart1 { clocks = <&uart_clk>; clock-names = "apb_pclk"; }; + +&piether { + status = "okay"; + phy-handle = <&phy0>; + phy-mode = "rgmii-id"; + clocks = <&clk300mhz>, <&clk125mhz>; + clock-names = "stmmaceth", "phy_ref_clk"; + + mdio0 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "snps,dwmac-mdio"; + phy0: ethernet-phy@1 { + device_type = "ethernet-phy"; + reg = <0x1>; + }; + }; +}; diff --git a/arch/arm64/boot/dts/toshiba/tmpv7708.dtsi b/arch/arm64/boot/dts/toshiba/tmpv7708.dtsi index 242f25f4e12a..3366786699fc 100644 --- a/arch/arm64/boot/dts/toshiba/tmpv7708.dtsi +++ b/arch/arm64/boot/dts/toshiba/tmpv7708.dtsi @@ -134,6 +134,20 @@ uart_clk: uart-clk { #clock-cells = <0>; }; + clk125mhz: clk125mhz { + compatible = "fixed-clock"; + clock-frequency = <12500>; + #clock-cells = <0>; + clock-output-names = "clk125mhz"; + }; + + clk300mhz: clk300mhz { + compatible = "fixed-clock"; + clock-frequency = <3>; + #clock-cells = <0>; + clock-output-names = "clk300mhz"; + }; + soc { #address-cells = <2>; #size-cells = <2>; @@ -384,6 +398,17 @@ spi6: spi@28146000 { #size-cells = <0>; status = "disabled"; }; + + piether: ethernet@2800 { + compatible = "toshiba,visconti-dwmac", "snps,dwmac-4.20a"; + reg = <0 0x2800 0 0x1>; + interrupts = ; + interrupt-names = "macirq"; + snps,txpbl = <4>; + snps,rxpbl = <4>; + snps,tso; + status = "disabled"; + }; }; }; -- 2.30.0.rc2
[PATCH v4 3/4] MAINTAINERS: Add entries for Toshiba Visconti ethernet controller
Add entries for Toshiba Visconti ethernet controller binding and driver. Signed-off-by: Nobuhiro Iwamatsu --- MAINTAINERS | 2 ++ 1 file changed, 2 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index cbf4b94f89d4..6be4bdaabf32 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2641,8 +2641,10 @@ L: linux-arm-ker...@lists.infradead.org (moderated for non-subscribers) S: Supported T: git git://git.kernel.org/pub/scm/linux/kernel/git/iwamatsu/linux-visconti.git F: Documentation/devicetree/bindings/arm/toshiba.yaml +F: Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml F: Documentation/devicetree/bindings/pinctrl/toshiba,tmpv7700-pinctrl.yaml F: arch/arm64/boot/dts/toshiba/ +F: drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c F: drivers/pinctrl/visconti/ N: visconti -- 2.30.0.rc2
Re: [PATCH] net: dsa: mv88e6xxx: prevent 2500BASEX mode override
On Mon, Feb 15, 2021 at 04:16:27PM +0100, Marek Behun wrote: > On Mon, 15 Feb 2021 14:57:57 + > Russell King - ARM Linux admin wrote: > > > On Mon, Feb 15, 2021 at 02:47:56PM +0100, Marek Behun wrote: > > > On Mon, 15 Feb 2021 06:15:59 + > > > Nathan Rossi wrote: > > > > > > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c > > > > b/drivers/net/dsa/mv88e6xxx/chip.c > > > > index 54aa942eed..5c52906b29 100644 > > > > --- a/drivers/net/dsa/mv88e6xxx/chip.c > > > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > > > > @@ -650,6 +650,13 @@ static void mv88e6xxx_validate(struct dsa_switch > > > > *ds, int port, > > > > if (chip->info->ops->phylink_validate) > > > > chip->info->ops->phylink_validate(chip, port, mask, > > > > state); > > > > > > > > + /* Advertise 2500BASEX only if 1000BASEX is not configured, this > > > > +* prevents phylink_helper_basex_speed from always overriding > > > > the > > > > +* 1000BASEX mode since auto negotiation is always enabled. > > > > +*/ > > > > + if (state->interface == PHY_INTERFACE_MODE_1000BASEX) > > > > + phylink_clear(mask, 2500baseX_Full); > > > > + > > > > > > I don't quite like this. This problem should be either solved in > > > phylink_helper_basex_speed() or somewhere in the mv88e6xxx code, but near > > > the call to phylink_helper_basex_speed(). > > > > > > Putting a solution to the behaviour of phylink_helper_basex_speed() it > > > into the validate() method when phylink_helper_basex_speed() is called > > > from a different place will complicate debugging in the future. If > > > we start solving problems in this kind of way, the driver will become > > > totally unreadable, IMO. > > > > If we can't switch between 1000base-X and 2500base-X, then we should > > not be calling phylink_helper_basex_speed() - and only one of those > > two capabilities should be set in the validation callback. I thought > > there were DSA switches where we could program the CMODE to switch > > between these two... > > There are. At least Peridot, Topaz and Amethyst support switching > between these modes. But only on some ports. > > This problem happnes on Peridot X, I think. > > On Peridot X there are > - port 0: RGMII > - ports 9-10: capable of 1, 2.5 and 10G SerDes (10G via > XAUI/RXAUI, so multiple lanes) > - ports 1-8: with copper PHYs > - some of these can instead be set to use the unused SerDes lanes > of ports 9-10, but only in 1000base-x mode > > So the problem can happen if you set port 9 or 10 to only use one > SerDes lane, and use the spare lanes for the 1G ports. > On these ports 2500base-x is not supported, only 1000base-x (maybe > sgmii, I don't remember) It sounds like the modes are not reporting correctly then before calling phylink_helper_basex_speed(). If the port can only be used at 1000base-X, then it should not be allowing 2500base-X to be set prior to calling phylink_helper_basex_speed(). -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH net-next 4/4] net: dsa: don't set skb->offload_fwd_mark when not offloading the bridge
On Sun, Feb 14, 2021 at 9:54 AM Vladimir Oltean wrote: [snip] > diff --git a/net/dsa/tag_xrs700x.c b/net/dsa/tag_xrs700x.c > index 858cdf9d2913..215ecceea89e 100644 > --- a/net/dsa/tag_xrs700x.c > +++ b/net/dsa/tag_xrs700x.c > @@ -45,8 +45,7 @@ static struct sk_buff *xrs700x_rcv(struct sk_buff *skb, > struct net_device *dev, > if (pskb_trim_rcsum(skb, skb->len - 1)) > return NULL; > > - /* Frame is forwarded by hardware, don't forward in software. */ > - skb->offload_fwd_mark = 1; > + dsa_default_offload_fwd_mark(skb); Does it make sense that the following would have worked prior to this change? Is this only an issue for bridging between DSA ports when offloading is supported? lan0 is a port an an xrs700x switch: ip link set eth0 up ip link del veth0 ip link add veth0 type veth peer name veth1 for eth in veth0 veth1 lan1; do ip link set ${eth} up done ip link add br0 type bridge ip link set veth1 master br0 ip link set lan1 master br0 ip link set br0 up ip addr add 192.168.2.1/24 dev veth0 # ping host connected to physical LAN that lan0 is on ping 192.168.2.249 (works!) I was trying to come up with a way to test this change and expected this would fail (and your patch) would fix it based on what you're described. -George > > return skb; > } > -- > 2.25.1 >
[PATCH bpf-next V1 0/2] bpf: Updates for BPF-helper bpf_check_mtu
The FIB lookup example[1] show how the IP-header field tot_len (iph->tot_len) is used as input to perform the MTU check. The recently added MTU check helper bpf_check_mtu() should also support this type of MTU check. Lets add this feature before merge window, please. This is a followup to 34b2021cc616 ("bpf: Add BPF-helper for MTU checking"). [1] samples/bpf/xdp_fwd_kern.c --- Jesper Dangaard Brouer (2): bpf: BPF-helper for MTU checking add length input selftests/bpf: Tests using bpf_check_mtu BPF-helper input mtu_len param include/uapi/linux/bpf.h | 17 ++-- net/core/filter.c | 12 ++- tools/testing/selftests/bpf/prog_tests/check_mtu.c |4 + tools/testing/selftests/bpf/progs/test_check_mtu.c | 92 4 files changed, 117 insertions(+), 8 deletions(-) --
[PATCH bpf-next V1 2/2] selftests/bpf: Tests using bpf_check_mtu BPF-helper input mtu_len param
Add tests that use mtu_len as input parameter in BPF-helper bpf_check_mtu(). The BPF-helper is avail from both XDP and TC context. Add two tests per context, one that tests below MTU and one that exceeds the MTU. Signed-off-by: Jesper Dangaard Brouer --- tools/testing/selftests/bpf/prog_tests/check_mtu.c |4 + tools/testing/selftests/bpf/progs/test_check_mtu.c | 92 2 files changed, 96 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/check_mtu.c b/tools/testing/selftests/bpf/prog_tests/check_mtu.c index 36af1c138faf..b62a39315336 100644 --- a/tools/testing/selftests/bpf/prog_tests/check_mtu.c +++ b/tools/testing/selftests/bpf/prog_tests/check_mtu.c @@ -128,6 +128,8 @@ static void test_check_mtu_xdp(__u32 mtu, __u32 ifindex) test_check_mtu_run_xdp(skel, skel->progs.xdp_use_helper, mtu); test_check_mtu_run_xdp(skel, skel->progs.xdp_exceed_mtu, mtu); test_check_mtu_run_xdp(skel, skel->progs.xdp_minus_delta, mtu); + test_check_mtu_run_xdp(skel, skel->progs.xdp_input_len, mtu); + test_check_mtu_run_xdp(skel, skel->progs.xdp_input_len_exceed, mtu); cleanup: test_check_mtu__destroy(skel); @@ -187,6 +189,8 @@ static void test_check_mtu_tc(__u32 mtu, __u32 ifindex) test_check_mtu_run_tc(skel, skel->progs.tc_exceed_mtu, mtu); test_check_mtu_run_tc(skel, skel->progs.tc_exceed_mtu_da, mtu); test_check_mtu_run_tc(skel, skel->progs.tc_minus_delta, mtu); + test_check_mtu_run_tc(skel, skel->progs.tc_input_len, mtu); + test_check_mtu_run_tc(skel, skel->progs.tc_input_len_exceed, mtu); cleanup: test_check_mtu__destroy(skel); } diff --git a/tools/testing/selftests/bpf/progs/test_check_mtu.c b/tools/testing/selftests/bpf/progs/test_check_mtu.c index b7787b43f9db..c4a9bae96e75 100644 --- a/tools/testing/selftests/bpf/progs/test_check_mtu.c +++ b/tools/testing/selftests/bpf/progs/test_check_mtu.c @@ -105,6 +105,54 @@ int xdp_minus_delta(struct xdp_md *ctx) return retval; } +SEC("xdp") +int xdp_input_len(struct xdp_md *ctx) +{ + int retval = XDP_PASS; /* Expected retval on successful test */ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + __u32 ifindex = GLOBAL_USER_IFINDEX; + __u32 data_len = data_end - data; + + /* API allow user give length to check as input via mtu_len param, +* resulting MTU value is still output in mtu_len param after call. +* +* Input len is L3, like MTU and iph->tot_len. +* Remember XDP data_len is L2. +*/ + __u32 mtu_len = data_len - ETH_HLEN; + + if (bpf_check_mtu(ctx, ifindex, &mtu_len, 0, 0)) + retval = XDP_ABORTED; + + global_bpf_mtu_xdp = mtu_len; + return retval; +} + +SEC("xdp") +int xdp_input_len_exceed(struct xdp_md *ctx) +{ + int retval = XDP_ABORTED; /* Fail */ + __u32 ifindex = GLOBAL_USER_IFINDEX; + int err; + + /* API allow user give length to check as input via mtu_len param, +* resulting MTU value is still output in mtu_len param after call. +* +* Input length value is L3 size like MTU. +*/ + __u32 mtu_len = GLOBAL_USER_MTU; + + mtu_len += 1; /* Exceed with 1 */ + + err = bpf_check_mtu(ctx, ifindex, &mtu_len, 0, 0); + if (err == BPF_MTU_CHK_RET_FRAG_NEEDED) + retval = XDP_PASS ; /* Success in exceeding MTU check */ + + global_bpf_mtu_xdp = mtu_len; + return retval; +} + SEC("classifier") int tc_use_helper(struct __sk_buff *ctx) { @@ -196,3 +244,47 @@ int tc_minus_delta(struct __sk_buff *ctx) global_bpf_mtu_xdp = mtu_len; return retval; } + +SEC("classifier") +int tc_input_len(struct __sk_buff *ctx) +{ + int retval = BPF_OK; /* Expected retval on successful test */ + __u32 ifindex = GLOBAL_USER_IFINDEX; + + /* API allow user give length to check as input via mtu_len param, +* resulting MTU value is still output in mtu_len param after call. +* +* Input length value is L3 size. +*/ + __u32 mtu_len = GLOBAL_USER_MTU; + + if (bpf_check_mtu(ctx, ifindex, &mtu_len, 0, 0)) + retval = BPF_DROP; + + global_bpf_mtu_xdp = mtu_len; + return retval; +} + +SEC("classifier") +int tc_input_len_exceed(struct __sk_buff *ctx) +{ + int retval = BPF_DROP; /* Fail */ + __u32 ifindex = GLOBAL_USER_IFINDEX; + int err; + + /* API allow user give length to check as input via mtu_len param, +* resulting MTU value is still output in mtu_len param after call. +* +* Input length value is L3 size like MTU. +*/ + __u32 mtu_len = GLOBAL_USER_MTU; + + mtu_len += 1; /* Exceed with 1 */ + + err = bpf_check_mtu(ctx, ifindex, &mtu_len, 0, 0); + if (err == BPF_MTU_CHK_RET_FRAG_NEEDED) + retval = BPF_OK; /* Success in ex
[PATCH bpf-next V1 1/2] bpf: BPF-helper for MTU checking add length input
The FIB lookup example[1] show how the IP-header field tot_len (iph->tot_len) is used as input to perform the MTU check. This patch extend the BPF-helper bpf_check_mtu() with the same ability to provide the length as user parameter input, via mtu_len parameter. [1] samples/bpf/xdp_fwd_kern.c Signed-off-by: Jesper Dangaard Brouer --- include/uapi/linux/bpf.h | 17 +++-- net/core/filter.c| 12 ++-- 2 files changed, 21 insertions(+), 8 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 4c24daa43bac..9c8aa50dc8a5 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3850,8 +3850,7 @@ union bpf_attr { * * long bpf_check_mtu(void *ctx, u32 ifindex, u32 *mtu_len, s32 len_diff, u64 flags) * Description - - * Check ctx packet size against exceeding MTU of net device (based + * Check packet size against exceeding MTU of net device (based * on *ifindex*). This helper will likely be used in combination * with helpers that adjust/change the packet size. * @@ -3868,6 +3867,14 @@ union bpf_attr { * against the current net device. This is practical if this isn't * used prior to redirect. * + * On input *mtu_len* must be a valid pointer, else verifier will + * reject BPF program. If the value *mtu_len* is initialized to + * zero then the ctx packet size is use. When value *mtu_len* is + * provided as input this specify the L3 length that the MTU check + * is done against. Remeber XDP and TC length operate at L2, but + * this value is L3 as this correlate to MTU and IP-header tot_len + * values which are L3 (similar behavior as bpf_fib_lookup). + * * The Linux kernel route table can configure MTUs on a more * specific per route level, which is not provided by this helper. * For route level MTU checks use the **bpf_fib_lookup**\ () @@ -3892,11 +3899,9 @@ union bpf_attr { * * On return *mtu_len* pointer contains the MTU value of the net * device. Remember the net device configured MTU is the L3 size, - * which is returned here and XDP and TX length operate at L2. + * which is returned here and XDP and TC length operate at L2. * Helper take this into account for you, but remember when using - * MTU value in your BPF-code. On input *mtu_len* must be a valid - * pointer and be initialized (to zero), else verifier will reject - * BPF program. + * MTU value in your BPF-code. * * Return * * 0 on success, and populate MTU value in *mtu_len* pointer. diff --git a/net/core/filter.c b/net/core/filter.c index 7059cf604d94..fcc3bda85960 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -5660,7 +5660,7 @@ BPF_CALL_5(bpf_skb_check_mtu, struct sk_buff *, skb, if (unlikely(flags & ~(BPF_MTU_CHK_SEGS))) return -EINVAL; - if (unlikely(flags & BPF_MTU_CHK_SEGS && len_diff)) + if (unlikely(flags & BPF_MTU_CHK_SEGS && (len_diff || *mtu_len))) return -EINVAL; dev = __dev_via_ifindex(dev, ifindex); @@ -5670,7 +5670,11 @@ BPF_CALL_5(bpf_skb_check_mtu, struct sk_buff *, skb, mtu = READ_ONCE(dev->mtu); dev_len = mtu + dev->hard_header_len; - skb_len = skb->len + len_diff; /* minus result pass check */ + + /* If set use *mtu_len as input, L3 as iph->tot_len (like fib_lookup) */ + skb_len = *mtu_len ? *mtu_len + dev->hard_header_len : skb->len; + + skb_len += len_diff; /* minus result pass check */ if (skb_len <= dev_len) { ret = BPF_MTU_CHK_RET_SUCCESS; goto out; @@ -5715,6 +5719,10 @@ BPF_CALL_5(bpf_xdp_check_mtu, struct xdp_buff *, xdp, /* Add L2-header as dev MTU is L3 size */ dev_len = mtu + dev->hard_header_len; + /* Use *mtu_len as input, L3 as iph->tot_len (like fib_lookup) */ + if (*mtu_len) + xdp_len = *mtu_len + dev->hard_header_len; + xdp_len += len_diff; /* minus result pass check */ if (xdp_len > dev_len) ret = BPF_MTU_CHK_RET_FRAG_NEEDED;
[PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk
Hi, This set is another approach towards addressing the below issue: // load xdp prog and xskmap and add entry to xskmap at idx 10 $ sudo ./xdpsock -i ens801f0 -t -q 10 // add entry to xskmap at idx 11 $ sudo ./xdpsock -i ens801f0 -t -q 11 terminate one of the processes and another one is unable to work due to the fact that the XDP prog was unloaded from interface. Previous attempt was, to put it mildly, a bit broken, as there was no synchronization between updates to additional map, as Bjorn pointed out. See https://lore.kernel.org/netdev/20190603131907.13395-5-maciej.fijalkow...@intel.com/ In the meantime bpf_link was introduced and it seems that it can address the issue of refcounting the XDP prog on interface. More info on commit messages. Thanks. Maciej Fijalkowski (3): libbpf: xsk: use bpf_link libbpf: clear map_info before each bpf_obj_get_info_by_fd samples: bpf: do not unload prog within xdpsock samples/bpf/xdpsock_user.c | 55 -- tools/lib/bpf/xsk.c| 147 +++-- 2 files changed, 139 insertions(+), 63 deletions(-) -- 2.20.1
[PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
Currently, if there are multiple xdpsock instances running on a single interface and in case one of the instances is terminated, the rest of them are left in an inoperable state due to the fact of unloaded XDP prog from interface. To address that, step away from setting bpf prog in favour of bpf_link. This means that refcounting of BPF resources will be done automatically by bpf_link itself. When setting up BPF resources during xsk socket creation, check whether bpf_link for a given ifindex already exists via set of calls to bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd and comparing the ifindexes from bpf_link and xsk socket. If there's no bpf_link yet, create one for a given XDP prog and unload explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set. If bpf_link is already at a given ifindex and underlying program is not AF-XDP one, bail out or update the bpf_link's prog given the presence of XDP_FLAGS_UPDATE_IF_NOEXIST. Signed-off-by: Maciej Fijalkowski --- tools/lib/bpf/xsk.c | 143 +--- 1 file changed, 122 insertions(+), 21 deletions(-) diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c index 20500fb1f17e..5911868efa43 100644 --- a/tools/lib/bpf/xsk.c +++ b/tools/lib/bpf/xsk.c @@ -28,6 +28,7 @@ #include #include #include +#include #include "bpf.h" #include "libbpf.h" @@ -70,6 +71,7 @@ struct xsk_ctx { int ifindex; struct list_head list; int prog_fd; + int link_fd; int xsks_map_fd; char ifname[IFNAMSIZ]; }; @@ -409,7 +411,7 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk) static const int log_buf_size = 16 * 1024; struct xsk_ctx *ctx = xsk->ctx; char log_buf[log_buf_size]; - int err, prog_fd; + int prog_fd; /* This is the fallback C-program: * SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx) @@ -499,14 +501,47 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk) return prog_fd; } - err = bpf_set_link_xdp_fd(xsk->ctx->ifindex, prog_fd, - xsk->config.xdp_flags); - if (err) { - close(prog_fd); - return err; + ctx->prog_fd = prog_fd; + return 0; +} + +static int xsk_create_bpf_link(struct xsk_socket *xsk) +{ + /* bpf_link only accepts XDP_FLAGS_MODES, but xsk->config.xdp_flags +* might have set XDP_FLAGS_UPDATE_IF_NOEXIST +*/ + DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts, + .flags = (xsk->config.xdp_flags & XDP_FLAGS_MODES)); + struct xsk_ctx *ctx = xsk->ctx; + __u32 prog_id; + int link_fd; + int err; + + /* for !XDP_FLAGS_UPDATE_IF_NOEXIST, unload the program first, if any, +* so that bpf_link can be attached +*/ + if (!(xsk->config.xdp_flags & XDP_FLAGS_UPDATE_IF_NOEXIST)) { + err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags); + if (err) { + pr_warn("getting XDP prog id failed\n"); + return err; + } + if (prog_id) { + err = bpf_set_link_xdp_fd(ctx->ifindex, -1, 0); + if (err < 0) { + pr_warn("detaching XDP prog failed\n"); + return err; + } + } } - ctx->prog_fd = prog_fd; + link_fd = bpf_link_create(ctx->prog_fd, xsk->ctx->ifindex, BPF_XDP, &opts); + if (link_fd < 0) { + pr_warn("bpf_link_create failed: %s\n", strerror(errno)); + return link_fd; + } + + ctx->link_fd = link_fd; return 0; } @@ -625,8 +660,32 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk) } err = 0; - if (ctx->xsks_map_fd == -1) + if (ctx->xsks_map_fd != -1) + goto out_map_ids; + + /* if prog load is forced and we found bpf_link on a given ifindex BUT +* the underlying XDP prog is not an AF-XDP one, close old prog's fd, +* load AF-XDP and update existing bpf_link +*/ + if (!(xsk->config.xdp_flags & XDP_FLAGS_UPDATE_IF_NOEXIST)) { + close(ctx->prog_fd); + err = xsk_create_bpf_maps(xsk); + if (err) + goto out_map_ids; + + err = xsk_load_xdp_prog(xsk); + if (err) { + xsk_delete_bpf_maps(xsk); + goto out_map_ids; + } + + /* if update failed, caller will close fds */ + err = bpf_link_update(ctx->link_fd, ctx->prog_fd, 0); + if (err) + pr_warn("bpf_link_update failed: %s\n", strerror(errno)); + } else { err = -ENOENT; + } out_map_ids: free(map
[PATCH net-next] octeontx2-pf: fix an off by one bug in otx2_get_fecparam()
The "<= FEC_MAX_INDEX" comparison should be "< FEC_MAX_INDEX". I did some cleanup in this function to hopefully make the code a bit clearer. There was no blank line after the declaration block. The closing curly brace on the fec[] declaration normally goes on a line by itself. And I removed the FEC_MAX_INDEX define and used ARRAY_SIZE(fec) instead. Fixes: d0cf9503e908 ("octeontx2-pf: ethtool fec mode support") Signed-off-by: Dan Carpenter --- .../net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c| 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c index 237e5d3321d4..0eaf11107cb7 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c @@ -955,16 +955,17 @@ static int otx2_get_fecparam(struct net_device *netdev, ETHTOOL_FEC_OFF, ETHTOOL_FEC_BASER, ETHTOOL_FEC_RS, - ETHTOOL_FEC_BASER | ETHTOOL_FEC_RS}; -#define FEC_MAX_INDEX 4 - if (pfvf->linfo.fec < FEC_MAX_INDEX) + ETHTOOL_FEC_BASER | ETHTOOL_FEC_RS, + }; + + if (pfvf->linfo.fec < ARRAY_SIZE(fec)) fecparam->active_fec = fec[pfvf->linfo.fec]; rsp = otx2_get_fwdata(pfvf); if (IS_ERR(rsp)) return PTR_ERR(rsp); - if (rsp->fwdata.supported_fec <= FEC_MAX_INDEX) { + if (rsp->fwdata.supported_fec < ARRAY_SIZE(fec)) { if (!rsp->fwdata.supported_fec) fecparam->fec = ETHTOOL_FEC_NONE; else -- 2.30.0
Re: [PATCH] net: dsa: mv88e6xxx: prevent 2500BASEX mode override
On Mon, 15 Feb 2021 15:29:44 + Russell King - ARM Linux admin wrote: > On Mon, Feb 15, 2021 at 04:16:27PM +0100, Marek Behun wrote: > > On Mon, 15 Feb 2021 14:57:57 + > > Russell King - ARM Linux admin wrote: > > > > > On Mon, Feb 15, 2021 at 02:47:56PM +0100, Marek Behun wrote: > > > > On Mon, 15 Feb 2021 06:15:59 + > > > > Nathan Rossi wrote: > > > > > > > > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c > > > > > b/drivers/net/dsa/mv88e6xxx/chip.c > > > > > index 54aa942eed..5c52906b29 100644 > > > > > --- a/drivers/net/dsa/mv88e6xxx/chip.c > > > > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > > > > > @@ -650,6 +650,13 @@ static void mv88e6xxx_validate(struct dsa_switch > > > > > *ds, int port, > > > > > if (chip->info->ops->phylink_validate) > > > > > chip->info->ops->phylink_validate(chip, port, mask, > > > > > state); > > > > > > > > > > + /* Advertise 2500BASEX only if 1000BASEX is not configured, this > > > > > + * prevents phylink_helper_basex_speed from always overriding > > > > > the > > > > > + * 1000BASEX mode since auto negotiation is always enabled. > > > > > + */ > > > > > + if (state->interface == PHY_INTERFACE_MODE_1000BASEX) > > > > > + phylink_clear(mask, 2500baseX_Full); > > > > > + > > > > > > > > I don't quite like this. This problem should be either solved in > > > > phylink_helper_basex_speed() or somewhere in the mv88e6xxx code, but > > > > near > > > > the call to phylink_helper_basex_speed(). > > > > > > > > Putting a solution to the behaviour of phylink_helper_basex_speed() it > > > > into the validate() method when phylink_helper_basex_speed() is called > > > > from a different place will complicate debugging in the future. If > > > > we start solving problems in this kind of way, the driver will become > > > > totally unreadable, IMO. > > > > > > If we can't switch between 1000base-X and 2500base-X, then we should > > > not be calling phylink_helper_basex_speed() - and only one of those > > > two capabilities should be set in the validation callback. I thought > > > there were DSA switches where we could program the CMODE to switch > > > between these two... > > > > There are. At least Peridot, Topaz and Amethyst support switching > > between these modes. But only on some ports. > > > > This problem happnes on Peridot X, I think. > > > > On Peridot X there are > > - port 0: RGMII > > - ports 9-10: capable of 1, 2.5 and 10G SerDes (10G via > > XAUI/RXAUI, so multiple lanes) > > - ports 1-8: with copper PHYs > > - some of these can instead be set to use the unused SerDes lanes > > of ports 9-10, but only in 1000base-x mode > > > > So the problem can happen if you set port 9 or 10 to only use one > > SerDes lane, and use the spare lanes for the 1G ports. > > On these ports 2500base-x is not supported, only 1000base-x (maybe > > sgmii, I don't remember) > > It sounds like the modes are not reporting correctly then before calling > phylink_helper_basex_speed(). If the port can only be used at 1000base-X, > then it should not be allowing 2500base-X to be set prior to calling > phylink_helper_basex_speed(). > Hmm. It doesn't seem that way. Ports 1-8 only set support for 1000baseT and 1000baseX. And for lower modes if state->interface is not 8023z. Nathan, what switch do you use and on which port does this happen? Marek
Re: [PATCH 1/2] lockdep: add lockdep_assert_not_held()
On Mon, Feb 15, 2021 at 02:12:30PM +0100, Johannes Berg wrote: > On Mon, 2021-02-15 at 11:44 +0100, Peter Zijlstra wrote: > > > > I think something like so will work, but please double check. > > Yeah, that looks better. > > > +++ b/include/linux/lockdep.h > > @@ -294,11 +294,15 @@ extern void lock_unpin_lock(struct lockdep_map *lock, > > struct pin_cookie); > > > > #define lockdep_depth(tsk) (debug_locks ? (tsk)->lockdep_depth : 0) > > > > -#define lockdep_assert_held(l) do {\ > > - WARN_ON(debug_locks && !lockdep_is_held(l));\ > > +#define lockdep_assert_held(l) do { > > \ > > + WARN_ON(debug_locks && lockdep_is_held(l) == 0)); \ > > } while (0) > > That doesn't really need to change? It's the same. Correct, but I found it more symmetric vs the not implementation below. > > -#define lockdep_assert_held_write(l) do {\ > > +#define lockdep_assert_not_held(l) do {\ > > + WARN_ON(debug_locks && lockdep_is_held(l) == 1)); \ > > + } while (0) > > + > > +#define lockdep_assert_held_write(l) do { > > \ > > WARN_ON(debug_locks && !lockdep_is_held_type(l, 0));\ > > } while (0) > > > > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c > > index c1418b47f625..983ba206f7b2 100644 > > --- a/kernel/locking/lockdep. > > +++ b/kernel/locking/lockdep.c > > @@ -5467,7 +5467,7 @@ noinstr int lock_is_held_type(const struct > > lockdep_map *lock, int read) > > int ret = 0; > > > > if (unlikely(!lockdep_enabled())) > > - return 1; /* avoid false negative lockdep_assert_held() */ > > + return -1; /* avoid false negative lockdep_assert_held() */ > > Maybe add lockdep_assert_not_held() to the comment, to explain the -1 > (vs non-zero)? Yeah, or frob a '*' in there.
Re: [PATCH v1 1/3] string: Consolidate yesno() helpers under string.h hood
On Mon, Feb 15, 2021 at 04:37:50PM +0200, Jani Nikula wrote: > On Mon, 15 Feb 2021, Andy Shevchenko > wrote: > > We have already few similar implementation and a lot of code that can > > benefit > > of the yesno() helper. Consolidate yesno() helpers under string.h hood. > > Good luck. I gave up after just four versions. [1] Thanks for a pointer! I like your version, but here we also discussing a possibility to do something like %py[DOY]. It will consolidate all those RO or whatever sections inside one data structure. > Acked-by: Jani Nikula > > [1] http://lore.kernel.org/r/20191023131308.9420-1-jani.nik...@intel.com -- With Best Regards, Andy Shevchenko
Re: [net-next] tcp: Sanitize CMSG flags and reserved args in tcp_zerocopy_receive.
On Mon, Feb 15, 2021 at 08:04:11AM -0700, David Ahern wrote: > On 2/15/21 5:03 AM, Dan Carpenter wrote: > > Hi Arjun, > > > > url: > > https://github.com/0day-ci/linux/commits/Arjun-Roy/tcp-Sanitize-CMSG-flags-and-reserved-args-in-tcp_zerocopy_receive/20210212-052537 > > > > base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git > > e4b62cf7559f2ef9a022de235e5a09a8d7ded520 > > config: x86_64-randconfig-m001-20210209 (attached as .config) > > compiler: gcc-9 (Debian 9.3.0-15) 9.3.0 > > > > If you fix the issue, kindly add following tag as appropriate > > Reported-by: kernel test robot > > Reported-by: Dan Carpenter > > > > smatch warnings: > > net/ipv4/tcp.c:4158 do_tcp_getsockopt() warn: check for integer overflow > > 'len' > > > > vim +/len +4158 net/ipv4/tcp.c > > > > 3fdadf7d27e3fb Dmitry Mishin2006-03-20 3896 static int > > do_tcp_getsockopt(struct sock *sk, int level, > > 3fdadf7d27e3fb Dmitry Mishin2006-03-20 3897int > > optname, char __user *optval, int __user *optlen) > > ^1da177e4c3f41 Linus Torvalds 2005-04-16 3898 { > > 295f7324ff8d9e Arnaldo Carvalho de Melo 2005-08-09 3899struct > > inet_connection_sock *icsk = inet_csk(sk); > > ^1da177e4c3f41 Linus Torvalds 2005-04-16 3900struct tcp_sock > > *tp = tcp_sk(sk); > > 6fa251663069e0 Nikolay Borisov 2016-02-03 3901struct net *net > > = sock_net(sk); > > ^1da177e4c3f41 Linus Torvalds 2005-04-16 3902int val, len; > > > > "len" is int. > > > > [ snip ] > > 05255b823a6173 Eric Dumazet 2018-04-27 4146 #ifdef CONFIG_MMU > > 05255b823a6173 Eric Dumazet 2018-04-27 4147case > > TCP_ZEROCOPY_RECEIVE: { > > 7eeba1706eba6d Arjun Roy2021-01-20 4148struct > > scm_timestamping_internal tss; > > e0fecb289ad3fd Arjun Roy2020-12-10 4149struct > > tcp_zerocopy_receive zc = {}; > > 05255b823a6173 Eric Dumazet 2018-04-27 4150int err; > > 05255b823a6173 Eric Dumazet 2018-04-27 4151 > > 05255b823a6173 Eric Dumazet 2018-04-27 4152if > > (get_user(len, optlen)) > > 05255b823a6173 Eric Dumazet 2018-04-27 4153 > > return -EFAULT; > > c8856c05145490 Arjun Roy2020-02-14 4154if (len > > < offsetofend(struct tcp_zerocopy_receive, length)) > > 05255b823a6173 Eric Dumazet 2018-04-27 4155 > > return -EINVAL; > > > > > > The problem is that negative values of "len" are type promoted to high > > positive values. So the fix is to write this as: > > > > if (len < 0 || len < offsetofend(struct tcp_zerocopy_receive, length)) > > return -EINVAL; > > > > 110912bdf28392 Arjun Roy2021-02-11 4156if > > (unlikely(len > sizeof(zc))) { > > 110912bdf28392 Arjun Roy2021-02-11 4157 > > err = check_zeroed_user(optval + sizeof(zc), > > 110912bdf28392 Arjun Roy2021-02-11 @4158 > > len - sizeof(zc)); > > > > > > Potentially "len - a negative value". > > > > > > get_user(len, optlen) is called multiple times in that function. len < 0 > was checked after the first one at the top. > What you're describing is a "Double Fetch" bug, where the attack is we get some data from the user, and we verify it, then we get it from the user a second time and trust it. The problem is that the user modifies it between the first and second get_user() call so it ends up being a security vulnerability. But I'm glad you pointed out the first get_user() because it has an ancient, harmless pre git bug in it. net/ipv4/tcp.c 3888 static int do_tcp_getsockopt(struct sock *sk, int level, 3889 int optname, char __user *optval, int __user *optlen) 3890 { 3891 struct inet_connection_sock *icsk = inet_csk(sk); 3892 struct tcp_sock *tp = tcp_sk(sk); 3893 struct net *net = sock_net(sk); 3894 int val, len; 3895 3896 if (get_user(len, optlen)) 3897 return -EFAULT; 3898 3899 len = min_t(unsigned int, len, sizeof(int)); 3900 3901 if (len < 0) ^^^ This is impossible. "len" has to be in the 0-4 range because of the min_t() assignment. It's harmless though and the condition should just be removed. 3902 return -EINVAL; 3903 3904 switch (optname) { 3905 case TCP_MAXSEG: Anyway, I will create a new Smatch warning for this situation. > Also, maybe I am missing something here, but offsetofend can not return > a negative value, so this checks catches len < 0 as well: > > if (le
Re: [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk
On 2021-02-15 16:46, Maciej Fijalkowski wrote: Hi, This set is another approach towards addressing the below issue: // load xdp prog and xskmap and add entry to xskmap at idx 10 $ sudo ./xdpsock -i ens801f0 -t -q 10 // add entry to xskmap at idx 11 $ sudo ./xdpsock -i ens801f0 -t -q 11 terminate one of the processes and another one is unable to work due to the fact that the XDP prog was unloaded from interface. Previous attempt was, to put it mildly, a bit broken, as there was no synchronization between updates to additional map, as Bjorn pointed out. See https://lore.kernel.org/netdev/20190603131907.13395-5-maciej.fijalkow...@intel.com/ In the meantime bpf_link was introduced and it seems that it can address the issue of refcounting the XDP prog on interface. More info on commit messages. For the series: Reviewed-by: Björn Töpel Acked-by: Björn Töpel Finally, bpf_link/scoped XDP support! Thanks a lot! Björn Thanks. Maciej Fijalkowski (3): libbpf: xsk: use bpf_link libbpf: clear map_info before each bpf_obj_get_info_by_fd samples: bpf: do not unload prog within xdpsock samples/bpf/xdpsock_user.c | 55 -- tools/lib/bpf/xsk.c| 147 +++-- 2 files changed, 139 insertions(+), 63 deletions(-)
[PATCH bpf-next 3/3] samples: bpf: do not unload prog within xdpsock
With the introduction of bpf_link in xsk's libbpf part, there's no further need for explicit unload of prog on xdpsock's termination. When process dies, the bpf_link's refcount will be decremented and resources will be unloaded/freed under the hood in case when there are no more active users. While at it, don't dump stats on error path. Signed-off-by: Maciej Fijalkowski --- samples/bpf/xdpsock_user.c | 55 ++ 1 file changed, 14 insertions(+), 41 deletions(-) diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c index db0cb73513a5..96246313e342 100644 --- a/samples/bpf/xdpsock_user.c +++ b/samples/bpf/xdpsock_user.c @@ -96,7 +96,6 @@ static int opt_xsk_frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE; static int opt_timeout = 1000; static bool opt_need_wakeup = true; static u32 opt_num_xsks = 1; -static u32 prog_id; static bool opt_busy_poll; static bool opt_reduced_cap; @@ -462,59 +461,37 @@ static void *poller(void *arg) return NULL; } -static void remove_xdp_program(void) +static void int_exit(int sig) { - u32 curr_prog_id = 0; - int cmd = CLOSE_CONN; - - if (bpf_get_link_xdp_id(opt_ifindex, &curr_prog_id, opt_xdp_flags)) { - printf("bpf_get_link_xdp_id failed\n"); - exit(EXIT_FAILURE); - } - if (prog_id == curr_prog_id) - bpf_set_link_xdp_fd(opt_ifindex, -1, opt_xdp_flags); - else if (!curr_prog_id) - printf("couldn't find a prog id on a given interface\n"); - else - printf("program on interface changed, not removing\n"); - - if (opt_reduced_cap) { - if (write(sock, &cmd, sizeof(int)) < 0) { - fprintf(stderr, "Error writing into stream socket: %s", strerror(errno)); - exit(EXIT_FAILURE); - } - } + benchmark_done = true; } -static void int_exit(int sig) +static void __exit_with_error(int error, const char *file, const char *func, + int line) { - benchmark_done = true; + fprintf(stderr, "%s:%s:%i: errno: %d/\"%s\"\n", file, func, + line, error, strerror(error)); + exit(EXIT_FAILURE); } +#define exit_with_error(error) __exit_with_error(error, __FILE__, __func__, __LINE__) + static void xdpsock_cleanup(void) { struct xsk_umem *umem = xsks[0]->umem->umem; - int i; + int i, cmd = CLOSE_CONN; dump_stats(); for (i = 0; i < num_socks; i++) xsk_socket__delete(xsks[i]->xsk); (void)xsk_umem__delete(umem); - remove_xdp_program(); -} -static void __exit_with_error(int error, const char *file, const char *func, - int line) -{ - fprintf(stderr, "%s:%s:%i: errno: %d/\"%s\"\n", file, func, - line, error, strerror(error)); - dump_stats(); - remove_xdp_program(); - exit(EXIT_FAILURE); + if (opt_reduced_cap) { + if (write(sock, &cmd, sizeof(int)) < 0) + exit_with_error(errno); + } } -#define exit_with_error(error) __exit_with_error(error, __FILE__, __func__, \ -__LINE__) static void swap_mac_addresses(void *data) { struct ether_header *eth = (struct ether_header *)data; @@ -880,10 +857,6 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem, if (ret) exit_with_error(-ret); - ret = bpf_get_link_xdp_id(opt_ifindex, &prog_id, opt_xdp_flags); - if (ret) - exit_with_error(-ret); - xsk->app_stats.rx_empty_polls = 0; xsk->app_stats.fill_fail_polls = 0; xsk->app_stats.copy_tx_sendtos = 0; -- 2.20.1
Re: [PATCH 1/2] lockdep: add lockdep_assert_not_held()
On Mon, 2021-02-15 at 17:04 +0100, Peter Zijlstra wrote: > On Mon, Feb 15, 2021 at 02:12:30PM +0100, Johannes Berg wrote: > > On Mon, 2021-02-15 at 11:44 +0100, Peter Zijlstra wrote: > > > I think something like so will work, but please double check. > > > > Yeah, that looks better. > > > > > +++ b/include/linux/lockdep.h > > > @@ -294,11 +294,15 @@ extern void lock_unpin_lock(struct lockdep_map > > > *lock, struct pin_cookie); > > > > > > #define lockdep_depth(tsk) (debug_locks ? (tsk)->lockdep_depth : 0) > > > > > > -#define lockdep_assert_held(l) do {\ > > > - WARN_ON(debug_locks && !lockdep_is_held(l));\ > > > +#define lockdep_assert_held(l) do { > > > \ > > > + WARN_ON(debug_locks && lockdep_is_held(l) == 0)); \ > > > } while (0) > > > > That doesn't really need to change? It's the same. > > Correct, but I found it more symmetric vs the not implementation below. Fair enough. One might argue that you should have an enum lockdep_lock_state { LOCK_STATE_NOT_HELD, /* 0 now */ LOCK_STATE_HELD, /* 1 now */ LOCK_STATE_UNKNOWN, /* -1 with your patch but might as well be 2 */ }; :) johannes
Re: [PATCH] b43: N-PHY: Fix the update of coef for the PHY revision >= 3case
On 2/15/21 6:05 AM, Colin King wrote: From: Colin Ian King The documentation for the PHY update [1] states: Loop 4 times with index i If PHY Revision >= 3 Copy table[i] to coef[i] Otherwise Set coef[i] to 0 the copy of the table to coef is currently implemented the wrong way around, table is being updated from uninitialized values in coeff. Fix this by swapping the assignment around. [1] https://bcm-v4.sipsolutions.net/802.11/PHY/N/RestoreCal/ Fixes: 2f258b74d13c ("b43: N-PHY: implement restoring general configuration") Addresses-Coverity: ("Uninitialized scalar variable") Signed-off-by: Colin Ian King --- drivers/net/wireless/broadcom/b43/phy_n.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/wireless/broadcom/b43/phy_n.c b/drivers/net/wireless/broadcom/b43/phy_n.c index b669dff24b6e..665b737fbb0d 100644 --- a/drivers/net/wireless/broadcom/b43/phy_n.c +++ b/drivers/net/wireless/broadcom/b43/phy_n.c @@ -5311,7 +5311,7 @@ static void b43_nphy_restore_cal(struct b43_wldev *dev) for (i = 0; i < 4; i++) { if (dev->phy.rev >= 3) - table[i] = coef[i]; + coef[i] = table[i]; else coef[i] = 0; } Acked-by: Larry Finger Good catch, thanks. Larry
[PATCH][next] i40e: Fix uninitialized variable mfs_max
From: Colin Ian King The variable mfs_max is not initialized and is being compared to find the maximum value. Fix this by initializing it to 0. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: 90bc8e003be2 ("i40e: Add hardware configuration for software based DCB") Signed-off-by: Colin Ian King --- drivers/net/ethernet/intel/i40e/i40e_dcb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_dcb.c b/drivers/net/ethernet/intel/i40e/i40e_dcb.c index 7b73a279d46e..243b0d2b7b72 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_dcb.c +++ b/drivers/net/ethernet/intel/i40e/i40e_dcb.c @@ -1636,7 +1636,7 @@ void i40e_dcb_hw_calculate_pool_sizes(struct i40e_hw *hw, u32 total_pool_size = 0; int shared_pool_size; /* Need signed variable */ u32 port_pb_size; - u32 mfs_max; + u32 mfs_max = 0; u32 pcirtt; u8 i; -- 2.30.0
[PATCH bpf-next 2/3] libbpf: clear map_info before each bpf_obj_get_info_by_fd
xsk_lookup_bpf_maps, based on prog_fd, looks whether current prog has a reference to XSKMAP. BPF prog can include insns that work on various BPF maps and this is covered by iterating through map_ids. The bpf_map_info that is passed to bpf_obj_get_info_by_fd for filling needs to be cleared at each iteration, so that it doesn't any outdated fields and that is currently missing in the function of interest. To fix that, zero-init map_info via memset before each bpf_obj_get_info_by_fd call. Also, since the area of this code is touched, in general strcmp is considered harmful, so let's convert it to strncmp and provide the length of the array name that we're looking for. Signed-off-by: Maciej Fijalkowski --- tools/lib/bpf/xsk.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c index 5911868efa43..fb259c0bba93 100644 --- a/tools/lib/bpf/xsk.c +++ b/tools/lib/bpf/xsk.c @@ -616,6 +616,7 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk) __u32 i, *map_ids, num_maps, prog_len = sizeof(struct bpf_prog_info); __u32 map_len = sizeof(struct bpf_map_info); struct bpf_prog_info prog_info = {}; + const char *map_name = "xsks_map"; struct xsk_ctx *ctx = xsk->ctx; struct bpf_map_info map_info; int fd, err; @@ -645,13 +646,14 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk) if (fd < 0) continue; + memset(&map_info, 0, map_len); err = bpf_obj_get_info_by_fd(fd, &map_info, &map_len); if (err) { close(fd); continue; } - if (!strcmp(map_info.name, "xsks_map")) { + if (!strncmp(map_info.name, map_name, strlen(map_name))) { ctx->xsks_map_fd = fd; continue; } -- 2.20.1
RE: [EXT] Re: Phylink flow control support on ports with MLO_AN_FIXED auto negotiation
> > > I discussed it with Andrew earlier last year, and his response was: > > > > > > DT configuration of pause for fixed link probably is sufficient. I > > > don't remember it ever been really discussed for DSA. It was a > > > Melanox discussion about limiting pause for the CPU. So I think it > > > is safe to not implement ethtool -A, at least until somebody has a > > > real use case for it. > > > > > > So I chose not to support it - no point supporting features that > > > people aren't using. If you have a "real use case" then it can be added. > > > > This patch may be sufficient - I haven't fully considered all the > > implications of changing this though. > > Did you try this patch? What's the outcome? For me patch worked as expected. Thanks, Stefan. > > > > drivers/net/phy/phylink.c | 9 +++-- > > 1 file changed, 3 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c > > index 7e0fdc17c8ee..2ee0d4dcf9a5 100644 > > --- a/drivers/net/phy/phylink.c > > +++ b/drivers/net/phy/phylink.c > > @@ -673,7 +673,6 @@ static void phylink_resolve(struct work_struct *w) > > switch (pl->cur_link_an_mode) { > > case MLO_AN_PHY: > > link_state = pl->phy_state; > > - phylink_apply_manual_flow(pl, &link_state); > > mac_config = link_state.link; > > break; > > > > @@ -698,11 +697,12 @@ static void phylink_resolve(struct work_struct > *w) > > link_state.pause = pl->phy_state.pause; > > mac_config = true; > > } > > - phylink_apply_manual_flow(pl, &link_state); > > break; > > } > > } > > > > + phylink_apply_manual_flow(pl, &link_state); > > + > > if (mac_config) { > > if (link_state.interface != pl->link_config.interface) { > > /* The interface has changed, force the link down > and @@ -1639,9 > > +1639,6 @@ int phylink_ethtool_set_pauseparam(struct phylink *pl, > > > > ASSERT_RTNL(); > > > > - if (pl->cur_link_an_mode == MLO_AN_FIXED) > > - return -EOPNOTSUPP; > > - > > if (!phylink_test(pl->supported, Pause) && > > !phylink_test(pl->supported, Asym_Pause)) > > return -EOPNOTSUPP; > > @@ -1684,7 +1681,7 @@ int phylink_ethtool_set_pauseparam(struct > phylink *pl, > > /* Update our in-band advertisement, triggering a renegotiation if > > * the advertisement changed. > > */ > > - if (!pl->phydev) > > + if (!pl->phydev && pl->cur_link_an_mode != MLO_AN_FIXED) > > phylink_change_inband_advert(pl); > > > > mutex_unlock(&pl->state_mutex); > > > > -- > > RMK's Patch system: > > https://urldefense.proofpoint.com/v2/url?u=https- > 3A__www.armlinux.org. > > > uk_developer_patches_&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=DDQ > 3dKwkTIxK > > > Al6_Bs7GMx4zhJArrXKN2mDMOXGh7lg&m=bLvAkwDrYioAER_dvXEqutiRiU > W57bKfscMh > > 71TGDxw&s=p5jgFDs7cxtpIE9LZ3ogOahGzpuKjG4PHOcPF6qXnXI&e= > > FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! > > -- > RMK's Patch system: https://urldefense.proofpoint.com/v2/url?u=https- > 3A__www.armlinux.org.uk_developer_patches_&d=DwIBAg&c=nKjWec2b6 > R0mOyPaz7xtfQ&r=DDQ3dKwkTIxKAl6_Bs7GMx4zhJArrXKN2mDMOXGh7lg& > m=bLvAkwDrYioAER_dvXEqutiRiUW57bKfscMh71TGDxw&s=p5jgFDs7cxtpIE9 > LZ3ogOahGzpuKjG4PHOcPF6qXnXI&e= > FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
[PATCH] i40e: Fix incorrect use of ip6src due to copy-paste coding error
From: Colin Ian King It appears that the call of ipv6_add_any for the destination address is using ip6src instead of ip6dst, this looks like a copy-paste coding error. Fix this by replacing ip6src with ip6dst. Addresses-Coverity: ("Copy-paste error") Fixes: efca91e89b67 ("i40e: Add flow director support for IPv6") Signed-off-by: Colin Ian King --- drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c index 8a4dd77a12da..a8a2b5f683a2 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c +++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c @@ -4250,7 +4250,7 @@ static int i40e_check_fdir_input_set(struct i40e_vsi *vsi, (struct in6_addr *)&ipv6_full_mask)) new_mask |= I40E_L3_V6_DST_MASK; else if (ipv6_addr_any((struct in6_addr *) - &tcp_ip6_spec->ip6src)) + &tcp_ip6_spec->ip6dst)) new_mask &= ~I40E_L3_V6_DST_MASK; else return -EOPNOTSUPP; -- 2.30.0
Re: [EXT] Re: Phylink flow control support on ports with MLO_AN_FIXED auto negotiation
On Mon, Feb 15, 2021 at 04:19:19PM +, Stefan Chulski wrote: > > > > I discussed it with Andrew earlier last year, and his response was: > > > > > > > > DT configuration of pause for fixed link probably is sufficient. I > > > > don't remember it ever been really discussed for DSA. It was a > > > > Melanox discussion about limiting pause for the CPU. So I think it > > > > is safe to not implement ethtool -A, at least until somebody has a > > > > real use case for it. > > > > > > > > So I chose not to support it - no point supporting features that > > > > people aren't using. If you have a "real use case" then it can be added. > > > > > > This patch may be sufficient - I haven't fully considered all the > > > implications of changing this though. > > > > Did you try this patch? What's the outcome? > > For me patch worked as expected. Great, thanks. Andrew, do you have any further opinions on this subject, or shall I sent a proper patch for net-next? -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
Re: HSR/PRP sequence counter issue with Cisco Redbox
On Mon, Feb 15, 2021 at 6:30 AM Wenzel, Marco wrote: > > > On Wed, Jan 27, 2021 at 6:32 AM Wenzel, Marco > eberle.de> wrote: > > > > > > Hi, > > > > > > we have figured out an issue with the current PRP driver when trying to > > communicate with Cisco IE 2000 industrial Ethernet switches in Redbox > > mode. The Cisco always resets the HSR/PRP sequence counter to "1" at low > > traffic (<= 1 frame in 400 ms). It can be reproduced by a simple ICMP echo > > request with 1 s interval between a Linux box running with PRP and a VDAN > > behind the Cisco Redbox. The Linux box then always receives frames with > > sequence counter "1" and drops them. The behavior is not configurable at > > the Cisco Redbox. > > > > > > I fixed it by ignoring sequence counters with value "1" at the sequence > > counter check in hsr_register_frame_out (): > > > > > > diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c index > > > 5c97de459905..630c238e81f0 100644 > > > --- a/net/hsr/hsr_framereg.c > > > +++ b/net/hsr/hsr_framereg.c > > > @@ -411,7 +411,7 @@ void hsr_register_frame_in(struct hsr_node *node, > > > struct hsr_port *port, int hsr_register_frame_out(struct hsr_port *port, > > struct hsr_node *node, > > >u16 sequence_nr) { > > > - if (seq_nr_before_or_eq(sequence_nr, node->seq_out[port->type])) > > > + if (seq_nr_before_or_eq(sequence_nr, > > > + node->seq_out[port->type]) && (sequence_nr != 1)) > > > return 1; > > > > > > node->seq_out[port->type] = sequence_nr; > > > > > > > > > Do you think this could be a solution? Should this patch be officially > > > applied > > in order to avoid other users running into these communication issues? > > > > This isn't the correct way to solve the problem. IEC 62439-3 defines > > EntryForgetTime as "Time after which an entry is removed from the duplicate > > table" with a value of 400ms and states devices should usually be configured > > to keep entries in the table for a much shorter time. hsr_framereg.c needs > > to > > be reworked to handle this according to the specification. > > Sorry for the delay but I did not have the time to take a closer look at the > problem until now. > > My suggestion for the EntryForgetTime feature would be the following: A > time_out element will be added to the hsr_node structure, which always stores > the current time when entering hsr_register_frame_out(). If the last stored > time is older than EntryForgetTime (400 ms) the sequence number check will be > ignored. > > diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c > index 5c97de459905..a97bffbd2581 100644 > --- a/net/hsr/hsr_framereg.c > +++ b/net/hsr/hsr_framereg.c > @@ -164,8 +164,10 @@ static struct hsr_node *hsr_add_node(struct hsr_priv > *hsr, > * as initialization. (0 could trigger an spurious ring error > warning). > */ > now = jiffies; > - for (i = 0; i < HSR_PT_PORTS; i++) > + for (i = 0; i < HSR_PT_PORTS; i++) { > new_node->time_in[i] = now; > + new_node->time_out[i] = now; > + } > for (i = 0; i < HSR_PT_PORTS; i++) > new_node->seq_out[i] = seq_out; > > @@ -411,9 +413,12 @@ void hsr_register_frame_in(struct hsr_node *node, struct > hsr_port *port, > int hsr_register_frame_out(struct hsr_port *port, struct hsr_node *node, >u16 sequence_nr) > { > - if (seq_nr_before_or_eq(sequence_nr, node->seq_out[port->type])) > + if (seq_nr_before_or_eq(sequence_nr, node->seq_out[port->type]) && > +time_is_after_jiffies(node->time_out[port->type] + > msecs_to_jiffies(HSR_ENTRY_FORGET_TIME))) { > return 1; > + } > > + node->time_out[port->type] = jiffies; > node->seq_out[port->type] = sequence_nr; > return 0; > } > diff --git a/net/hsr/hsr_framereg.h b/net/hsr/hsr_framereg.h > index 86b43f539f2c..d9628e7a5f05 100644 > --- a/net/hsr/hsr_framereg.h > +++ b/net/hsr/hsr_framereg.h > @@ -75,6 +75,7 @@ struct hsr_node { > enum hsr_port_type addr_B_port; > unsigned long time_in[HSR_PT_PORTS]; > booltime_in_stale[HSR_PT_PORTS]; > + unsigned long time_out[HSR_PT_PORTS]; > /* if the node is a SAN */ > boolsan_a; > boolsan_b; > diff --git a/net/hsr/hsr_main.h b/net/hsr/hsr_main.h > index 7dc92ce5a134..f79ca55d6986 100644 > --- a/net/hsr/hsr_main.h > +++ b/net/hsr/hsr_main.h > @@ -21,6 +21,7 @@ > #define HSR_LIFE_CHECK_INTERVAL 2000 /* ms */ > #define HSR_NODE_FORGET_TIME 6 /* ms */ > #define HSR_ANNOUNCE_INTERVAL100 /* ms */ > +#define HSR_ENTRY_FORGET_TIME400 /* ms */ > > /* By how much may slave1 and slave2 timestamps of latest received frame from > * each node differ before we notify of communication problem? > > > This app
re: octeontx2-af: cn10k: MAC internal loopback support
Hi, Static analysis on linux-next today using Coverity found an issue in the following commit: commit 3ad3f8f93c81f81d6e28b2e286b03669cc1fb3b0 Author: Hariprasad Kelam Date: Thu Feb 11 21:28:34 2021 +0530 octeontx2-af: cn10k: MAC internal loopback support The analysis is as follows: 723 static int rvu_cgx_config_intlbk(struct rvu *rvu, u16 pcifunc, bool en) 724 { 725struct mac_ops *mac_ops; 1. var_decl: Declaring variable lmac_id without initializer. 726u8 cgx_id, lmac_id; 727 2. Condition !is_cgx_config_permitted(rvu, pcifunc), taking false branch. 728if (!is_cgx_config_permitted(rvu, pcifunc)) 729return -EPERM; 730 Uninitialized scalar variable (UNINIT) 731mac_ops = get_mac_ops(rvu_cgx_pdata(cgx_id, rvu)); 732 Uninitialized scalar variable (UNINIT) 3. uninit_use_in_call: Using uninitialized value lmac_id when calling *mac_ops->mac_lmac_intl_lbk. 733return mac_ops->mac_lmac_intl_lbk(rvu_cgx_pdata(cgx_id, rvu), 734 lmac_id, en); 735 } Variables cgx_id and lmac_id are no longer being initialized and garbage values are being passed into function calls. Originally, these variables were being initialized with a call to rvu_get_cgx_lmac_id() but that has now been removed. Colin
Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
Maciej Fijalkowski writes: > Currently, if there are multiple xdpsock instances running on a single > interface and in case one of the instances is terminated, the rest of > them are left in an inoperable state due to the fact of unloaded XDP > prog from interface. > > To address that, step away from setting bpf prog in favour of bpf_link. > This means that refcounting of BPF resources will be done automatically > by bpf_link itself. > > When setting up BPF resources during xsk socket creation, check whether > bpf_link for a given ifindex already exists via set of calls to > bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd > and comparing the ifindexes from bpf_link and xsk socket. One consideration here is that bpf_link_get_fd_by_id() is a privileged operation (privileged as in CAP_SYS_ADMIN), so this has the side effect of making AF_XDP privileged as well. Is that the intention? Another is that the AF_XDP code is in the process of moving to libxdp (see in-progress PR [0]), and this approach won't carry over as-is to that model, because libxdp has to pin the bpf_link fds. However, in libxdp we can solve the original problem in a different way, and in fact I already suggested to Magnus that we should do this (see [1]); so one way forward could be to address it during the merge in libxdp? It should be possible to address the original issue (two instances of xdpsock breaking each other when they exit), but applications will still need to do an explicit unload operation before exiting (i.e., the automatic detach on bpf_link fd closure will take more work, and likely require extending the bpf_link kernel support)... -Toke [0] https://github.com/xdp-project/xdp-tools/pull/92 [1] https://github.com/xdp-project/xdp-tools/pull/92#discussion_r576204719
Re: [PATCH net-next RFC v3] net: hdlc_x25: Queue outgoing LAPB frames
On Mon, Feb 15, 2021 at 1:25 AM Leon Romanovsky wrote: > > > + /* When transmitting data: > > + * first we'll remove a pseudo header of 1 byte, > > + * then the LAPB module will prepend an LAPB header of at most 3 > > bytes. > > + */ > > + dev->needed_headroom = 3 - 1; > > 3 - 1 = 2 > > Thanks Actually this is intentional. It makes the numbers more meaningful. The compiler should automatically generate the "2" so there would be no runtime penalty.
Re: [PATCH net-next 4/4] net: dsa: don't set skb->offload_fwd_mark when not offloading the bridge
Hi George, On Mon, Feb 15, 2021 at 09:48:38AM -0600, George McCollister wrote: > On Sun, Feb 14, 2021 at 9:54 AM Vladimir Oltean wrote: > [snip] > > diff --git a/net/dsa/tag_xrs700x.c b/net/dsa/tag_xrs700x.c > > index 858cdf9d2913..215ecceea89e 100644 > > --- a/net/dsa/tag_xrs700x.c > > +++ b/net/dsa/tag_xrs700x.c > > @@ -45,8 +45,7 @@ static struct sk_buff *xrs700x_rcv(struct sk_buff *skb, > > struct net_device *dev, > > if (pskb_trim_rcsum(skb, skb->len - 1)) > > return NULL; > > > > - /* Frame is forwarded by hardware, don't forward in software. */ > > - skb->offload_fwd_mark = 1; > > + dsa_default_offload_fwd_mark(skb); > > Does it make sense that the following would have worked prior to this > change? Is this only an issue for bridging between DSA ports when > offloading is supported? lan0 is a port an an xrs700x switch: > > ip link set eth0 up > ip link del veth0 > ip link add veth0 type veth peer name veth1 > > for eth in veth0 veth1 lan1; do > ip link set ${eth} up > done > ip link add br0 type bridge > ip link set veth1 master br0 > ip link set lan1 master br0 > ip link set br0 up > > ip addr add 192.168.2.1/24 dev veth0 > > # ping host connected to physical LAN that lan0 is on > ping 192.168.2.249 (works!) > > I was trying to come up with a way to test this change and expected > this would fail (and your patch) would fix it based on what you're > described. No, the configuration you've shown should be supported and functional already (as you've noticed, in fact). I call it 'bridging with a foreign interface', where a foreign interface is a bridge port that has a different switchdev mark compared to the DSA switch. A switchdev mark is a number assigned to every bridge port by nbp_switchdev_mark_set, based on the "physical switch id"*. There is a simple rule with switchdev: on reception of an skb, the bridge checks if it was marked as 'already forwarded in hardware' (checks if skb->offload_fwd_mark == 1), and if it is, it puts a mark of its own on that skb, with the switchdev mark of the ingress port. Then during forwarding, it enforces that the egress port must have a different switchdev mark than the ingress one (this is done in nbp_switchdev_allowed_egress). The veth driver does not implement any sort of method for retrieving a physical switch id (neither devlink nor .ndo_get_port_parent_id), therefore the bridge assigns it a switchdev mark of 0, and packets coming from it will always have skb->offload_fwd_mark = 0. So there aren't any restrictions. Problems appear as soon as software bridging is attempted between two interfaces that have the same switchdev mark. If skb->offload_fwd_mark=1, the bridge will say that forwarding was already performed in hw, so it will deny it in sw. The issue is that a bond0 (or hsr0) upper of lan0 will be assigned the same switchdev mark as lan0 itself, because the function that assigns switchdev marks to bridge ports, nbp_switchdev_mark_set, recurses through that port's lower interfaces until it finds something that implements devlink. What I tested is actually pretty laughable and a far cry from a real-life scenario: I commented out the .port_bridge_join and .port_bridge_leave methods of a driver and made sure that forwarding between ports still works regardless of what uppers they have (even that used not to). But this bypasses the switchdev mark checks in nbp_switchdev_allowed_egress because the skb->offload_fwd_mark=0 now. This is an important prerequisite for seamless operation, true, but it isn't quite what we want. For one thing, we may have a topology like this: +-- br0 -+ / / |\ / / | \ / / | \ / /| \ / / |\ /| | bond0 / | | /\ swp0 swp1 swp2 swp3 swp4 where it is desirable that the presence of swp3 and swp4 under a non-offloaded LAG does not preclude us from doing hardware bridging beteen swp0, swp1 and swp2. But this creates an impossible paradox if we continue in the way that I started in this patch. When the CPU receives a packet from swp0 (say, due to flooding), the tagger must set skb->offload_fwd_mark to something. If we set it to 0, then the bridge will forward it towards swp1, swp2 and bond0. But the switch has already forwarded it towards swp1 and swp2 (not to bond0, remember, that isn't offloaded, so as far as the switch is concerned, ports swp3 and swp4 are not looking up the FDB, and the entire bond0 is a destination that is strictly behind the CPU). But we don't want duplicated traffic towards swp1 and swp2, so it's not ok to set skb->offload_fwd_mark = 0. If we set it to 1, then the bridge will not forward the skb towards the ports with the same switchdev mark, i.e. not to swp1, swp2 and bond0. Towards swp1 and swp2 that's ok, but towards bond0? It should have forwarded the skb there. An actual solution to this problem, which has nothing to do with my se
Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link
On 2021-02-15 18:07, Toke Høiland-Jørgensen wrote: Maciej Fijalkowski writes: Currently, if there are multiple xdpsock instances running on a single interface and in case one of the instances is terminated, the rest of them are left in an inoperable state due to the fact of unloaded XDP prog from interface. To address that, step away from setting bpf prog in favour of bpf_link. This means that refcounting of BPF resources will be done automatically by bpf_link itself. When setting up BPF resources during xsk socket creation, check whether bpf_link for a given ifindex already exists via set of calls to bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd and comparing the ifindexes from bpf_link and xsk socket. One consideration here is that bpf_link_get_fd_by_id() is a privileged operation (privileged as in CAP_SYS_ADMIN), so this has the side effect of making AF_XDP privileged as well. Is that the intention? We're already using, e.g., bpf_map_get_fd_by_id() which has that as well. So we're assuming that for XDP setup already! Another is that the AF_XDP code is in the process of moving to libxdp (see in-progress PR [0]), and this approach won't carry over as-is to that model, because libxdp has to pin the bpf_link fds. I was assuming there were two modes of operations for AF_XDP in libxdp. One which is with the multi-program support (which AFAIK is why the pinning is required), and one "like the current libbpf" one. For the latter Maciej's series would be a good fit, no? However, in libxdp we can solve the original problem in a different way, and in fact I already suggested to Magnus that we should do this (see [1]); so one way forward could be to address it during the merge in libxdp? It should be possible to address the original issue (two instances of xdpsock breaking each other when they exit), but applications will still need to do an explicit unload operation before exiting (i.e., the automatic detach on bpf_link fd closure will take more work, and likely require extending the bpf_link kernel support)... I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If we're months ahead, then I'd really like to see this in libbpf until the merge. However, I'll leave that for Magnus/you to decide! Bottom line; I'd *really* like bpf_link behavior (process scoped) for AF_XDP sooner than later! ;-) Thanks for the input! Björn -Toke [0] https://github.com/xdp-project/xdp-tools/pull/92 [1] https://github.com/xdp-project/xdp-tools/pull/92#discussion_r576204719
[net-next PATCH] octeontx2-af: cn10k: Fixes CN10K RPM reference issue
This patch fixes references to uninitialized variables and debugfs entry name for CN10K platform and HW_TSO flag check. Signed-off-by: Geetha sowjanya Signed-off-by: Sunil Goutham This patch fixes the bug introduced by the commit 3ad3f8f93c81 ("octeontx2-af: cn10k: MAC internal loopback support". These changes are not yet merged into net branch, hence submitting to net-next. --- drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c | 2 ++ .../net/ethernet/marvell/octeontx2/af/rvu_debugfs.c | 2 +- .../net/ethernet/marvell/octeontx2/nic/otx2_txrx.c| 11 ++- 3 files changed, 9 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c index 3a1809c28e83..e668e482383a 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c @@ -722,12 +722,14 @@ u32 rvu_cgx_get_fifolen(struct rvu *rvu) static int rvu_cgx_config_intlbk(struct rvu *rvu, u16 pcifunc, bool en) { + int pf = rvu_get_pf(pcifunc); struct mac_ops *mac_ops; u8 cgx_id, lmac_id; if (!is_cgx_config_permitted(rvu, pcifunc)) return -EPERM; + rvu_get_cgx_lmac_id(rvu->pf2cgxlmac_map[pf], &cgx_id, &lmac_id); mac_ops = get_mac_ops(rvu_cgx_pdata(cgx_id, rvu)); return mac_ops->mac_lmac_intl_lbk(rvu_cgx_pdata(cgx_id, rvu), diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c index 48a84c65804c..094124b695dc 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c @@ -2432,7 +2432,7 @@ void rvu_dbg_init(struct rvu *rvu) debugfs_create_file("rvu_pf_cgx_map", 0444, rvu->rvu_dbg.root, rvu, &rvu_dbg_rvu_pf_cgx_map_fops); else - debugfs_create_file("rvu_pf_cgx_map", 0444, rvu->rvu_dbg.root, + debugfs_create_file("rvu_pf_rpm_map", 0444, rvu->rvu_dbg.root, rvu, &rvu_dbg_rvu_pf_cgx_map_fops); create: diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c index 3f778fc054b5..22ec03a618b1 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c @@ -816,22 +816,23 @@ static bool is_hw_tso_supported(struct otx2_nic *pfvf, { int payload_len, last_seg_size; + if (test_bit(HW_TSO, &pfvf->hw.cap_flag)) + return true; + + /* On 96xx A0, HW TSO not supported */ + if (!is_96xx_B0(pfvf->pdev)) + return false; /* HW has an issue due to which when the payload of the last LSO * segment is shorter than 16 bytes, some header fields may not * be correctly modified, hence don't offload such TSO segments. */ - if (!is_96xx_B0(pfvf->pdev)) - return true; payload_len = skb->len - (skb_transport_offset(skb) + tcp_hdrlen(skb)); last_seg_size = payload_len % skb_shinfo(skb)->gso_size; if (last_seg_size && last_seg_size < 16) return false; - if (!test_bit(HW_TSO, &pfvf->hw.cap_flag)) - return false; - return true; } -- 2.17.1
Re: [EXT] re: octeontx2-af: cn10k: MAC internal loopback support
Hi Colin, I have submitted the patch fixing the reported issue to net-next branch. Thank you, Geetha. From: Colin Ian King Sent: Monday, February 15, 2021 10:22 PM To: Hariprasad Kelam Cc: Sunil Kovvuri Goutham; Linu Cherian; Geethasowjanya Akula; Jerin Jacob Kollanukkaran; Hariprasad Kelam; Subbaraya Sundeep Bhatta; netdev@vger.kernel.org; linux-ker...@vger.kernel.org Subject: [EXT] re: octeontx2-af: cn10k: MAC internal loopback support External Email -- Hi, Static analysis on linux-next today using Coverity found an issue in the following commit: commit 3ad3f8f93c81f81d6e28b2e286b03669cc1fb3b0 Author: Hariprasad Kelam Date: Thu Feb 11 21:28:34 2021 +0530 octeontx2-af: cn10k: MAC internal loopback support The analysis is as follows: 723 static int rvu_cgx_config_intlbk(struct rvu *rvu, u16 pcifunc, bool en) 724 { 725struct mac_ops *mac_ops; 1. var_decl: Declaring variable lmac_id without initializer. 726u8 cgx_id, lmac_id; 727 2. Condition !is_cgx_config_permitted(rvu, pcifunc), taking false branch. 728if (!is_cgx_config_permitted(rvu, pcifunc)) 729return -EPERM; 730 Uninitialized scalar variable (UNINIT) 731mac_ops = get_mac_ops(rvu_cgx_pdata(cgx_id, rvu)); 732 Uninitialized scalar variable (UNINIT) 3. uninit_use_in_call: Using uninitialized value lmac_id when calling *mac_ops->mac_lmac_intl_lbk. 733return mac_ops->mac_lmac_intl_lbk(rvu_cgx_pdata(cgx_id, rvu), 734 lmac_id, en); 735 } Variables cgx_id and lmac_id are no longer being initialized and garbage values are being passed into function calls. Originally, these variables were being initialized with a call to rvu_get_cgx_lmac_id() but that has now been removed. Colin
Re: commit 0f0aefd733f7 to linux-firmware effectively broke all of the setups with old kernels
> Seems the commit 0f0aefd733f7 to linux-firmware effectively broke all > of the setups with the old kernels. Firmware name is an ABI (!) and > replacing it like this will definitely break systems with older > kernels. Linux firmware package likely, but unfortunately, should > carry on both versions as long as it's needed. Alternative solution is > to provide the links during installation. It does provide the links using the copy-firmware.sh and the details in WHENCE. The alternative is to leave firmwares in place with CVEs.
Re: [PATCH net-next v2 1/7] mld: convert from timer to delayed work
On Sun, Feb 14, 2021 at 2:56 AM Taehee Yoo wrote: > > > > On 21. 2. 14. 오전 4:07, Cong Wang wrote: > > On Sat, Feb 13, 2021 at 9:51 AM Taehee Yoo wrote: > >> -static void mld_dad_start_timer(struct inet6_dev *idev, unsigned > long delay) > >> +static void mld_dad_start_work(struct inet6_dev *idev, unsigned > long delay) > >> { > >> unsigned long tv = prandom_u32() % delay; > >> > >> - if (!mod_timer(&idev->mc_dad_timer, jiffies+tv+2)) > >> + if (!mod_delayed_work(mld_wq, &idev->mc_dad_work, > msecs_to_jiffies(tv + 2))) > > > > IIUC, before this patch 'delay' is in jiffies, after this patch it is > in msecs? > > > > Ah, I understand, It's my mistake. > I didn't change the behavior of 'delay' in this patchset. > So, 'delay' is still in jiffies, not msecs. > Therefore, msecs_to_jiffies() should not be used in this patchset. > I will send a v3 patch, which doesn't use msecs_to_jiffies(). > Thanks! > > By the way, I think the 'delay' is from the > unsolicited_report_interval() and it just return value of > idev->cnf.mldv{1 | 2}_unsolicited_report_interval. > I think this value is msecs, not jiffies. > So, It should be converted to use msecs_to_jiffies(), I think. > How do you think about it? Hmm? I think it is in jiffies: .mldv1_unsolicited_report_interval = 10 * HZ, .mldv2_unsolicited_report_interval = HZ, > > > [...] > > > >> -static void mld_dad_timer_expire(struct timer_list *t) > >> +static void mld_dad_work(struct work_struct *work) > >> { > >> - struct inet6_dev *idev = from_timer(idev, t, mc_dad_timer); > >> + struct inet6_dev *idev = container_of(to_delayed_work(work), > >> + struct inet6_dev, > >> + mc_dad_work); > >> > >> + rtnl_lock(); > > > > Any reason why we need RTNL after converting the timer to > > delayed work? > > > > For the moment, RTNL is not needed. > But the Resources, which are used by delayed_work will be protected by > RTNL instead of other locks. > So, It just pre-adds RTNL and the following patches will delete other locks. Sounds like this change does not belong to this patch. ;) If so, please move it to where ever more appropriate. Thanks.
Re: [net-next PATCH] octeontx2-af: cn10k: Fixes CN10K RPM reference issue
On Mon, Feb 15, 2021 at 11:27 PM Geetha sowjanya wrote: > > This patch fixes references to uninitialized variables and > debugfs entry name for CN10K platform and HW_TSO flag check. > > Signed-off-by: Geetha sowjanya > Signed-off-by: Sunil Goutham > > This patch fixes the bug introduced by the commit > 3ad3f8f93c81 ("octeontx2-af: cn10k: MAC internal loopback support". > These changes are not yet merged into net branch, hence submitting > to net-next. > > --- > drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c | 2 ++ > .../net/ethernet/marvell/octeontx2/af/rvu_debugfs.c | 2 +- > .../net/ethernet/marvell/octeontx2/nic/otx2_txrx.c| 11 ++- > 3 files changed, 9 insertions(+), 6 deletions(-) > > diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c > b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c > index 3f778fc054b5..22ec03a618b1 100644 > --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c > +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c > @@ -816,22 +816,23 @@ static bool is_hw_tso_supported(struct otx2_nic *pfvf, > { > int payload_len, last_seg_size; > > + if (test_bit(HW_TSO, &pfvf->hw.cap_flag)) > + return true; > + > + /* On 96xx A0, HW TSO not supported */ > + if (!is_96xx_B0(pfvf->pdev)) > + return false; > > /* HW has an issue due to which when the payload of the last LSO > * segment is shorter than 16 bytes, some header fields may not > * be correctly modified, hence don't offload such TSO segments. > */ > - if (!is_96xx_B0(pfvf->pdev)) > - return true; > > payload_len = skb->len - (skb_transport_offset(skb) + > tcp_hdrlen(skb)); > last_seg_size = payload_len % skb_shinfo(skb)->gso_size; > if (last_seg_size && last_seg_size < 16) > return false; > > - if (!test_bit(HW_TSO, &pfvf->hw.cap_flag)) > - return false; > - > return true; > } The HW_TSO flag should not be set for B0 silicon as well, otherwise the checks related to HW issue mentioned above will not come into effect. Thanks, Sunil.
Re: [EXT] Re: Phylink flow control support on ports with MLO_AN_FIXED auto negotiation
On Mon, Feb 15, 2021 at 04:19:19PM +, Stefan Chulski wrote: > > > > I discussed it with Andrew earlier last year, and his response was: > > > > > > > > DT configuration of pause for fixed link probably is sufficient. I > > > > don't remember it ever been really discussed for DSA. It was a > > > > Melanox discussion about limiting pause for the CPU. So I think it > > > > is safe to not implement ethtool -A, at least until somebody has a > > > > real use case for it. > > > > > > > > So I chose not to support it - no point supporting features that > > > > people aren't using. If you have a "real use case" then it can be added. > > > > > > This patch may be sufficient - I haven't fully considered all the > > > implications of changing this though. > > > > Did you try this patch? What's the outcome? > > For me patch worked as expected. Hi Stefan Russell's patch allows it, but i would be interested in knows why you actually need it. What is your use case for changing this on the fly? Andrew
Re: commit 0f0aefd733f7 to linux-firmware effectively broke all of the setups with old kernels
On Mon, Feb 15, 2021 at 8:03 PM Peter Robinson wrote: > > > Seems the commit 0f0aefd733f7 to linux-firmware effectively broke all > > of the setups with the old kernels. Firmware name is an ABI (!) and > > replacing it like this will definitely break systems with older > > kernels. Linux firmware package likely, but unfortunately, should > > carry on both versions as long as it's needed. Alternative solution is > > to provide the links during installation. > > It does provide the links using the copy-firmware.sh and the details in > WHENCE. > > The alternative is to leave firmwares in place with CVEs. Good, thanks, I haven't looked into that script. -- With Best Regards, Andy Shevchenko
RE: [Patch bpf-next v3 1/5] bpf: clean up sockmap related Kconfigs
Cong Wang wrote: > From: Cong Wang > > As suggested by John, clean up sockmap related Kconfigs: > > Reduce the scope of CONFIG_BPF_STREAM_PARSER down to TCP stream > parser, to reflect its name. > > Make the rest sockmap code simply depend on CONFIG_BPF_SYSCALL. > And leave CONFIG_NET_SOCK_MSG untouched, as it is used by > non-sockmap cases. > > Cc: John Fastabend > Cc: Daniel Borkmann > Cc: Jakub Sitnicki > Cc: Lorenz Bauer > Signed-off-by: Cong Wang > --- Thanks for doing this. Acked-by: John Fastabend
[PATCH AUTOSEL 5.10 2/6] NET: usb: qmi_wwan: Adding support for Cinterion MV31
From: Christoph Schemmel [ Upstream commit a4dc7eee9106a9d2a6e08b442db19677aa9699c7 ] Adding support for Cinterion MV31 with PID 0x00B7. T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 11 Spd=5000 MxCh= 0 D: Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1 P: Vendor=1e2d ProdID=00b7 Rev=04.14 S: Manufacturer=Cinterion S: Product=Cinterion USB Mobile Broadband S: SerialNumber=b3246eed C: #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=896mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option Signed-off-by: Christoph Schemmel Link: https://lore.kernel.org/r/20210202084523.4371-1-christoph.schem...@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/usb/qmi_wwan.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index ce73df4c137ea..b223536e07bed 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -1332,6 +1332,7 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x1e2d, 0x0082, 5)},/* Cinterion PHxx,PXxx (2 RmNet) */ {QMI_FIXED_INTF(0x1e2d, 0x0083, 4)},/* Cinterion PHxx,PXxx (1 RmNet + USB Audio)*/ {QMI_QUIRK_SET_DTR(0x1e2d, 0x00b0, 4)}, /* Cinterion CLS8 */ + {QMI_FIXED_INTF(0x1e2d, 0x00b7, 0)},/* Cinterion MV31 RmNet */ {QMI_FIXED_INTF(0x413c, 0x81a2, 8)},/* Dell Wireless 5806 Gobi(TM) 4G LTE Mobile Broadband Card */ {QMI_FIXED_INTF(0x413c, 0x81a3, 8)},/* Dell Wireless 5570 HSPA+ (42Mbps) Mobile Broadband Card */ {QMI_FIXED_INTF(0x413c, 0x81a4, 8)},/* Dell Wireless 5570e HSPA+ (42Mbps) Mobile Broadband Card */ -- 2.27.0
[PATCH AUTOSEL 5.10 3/6] cxgb4: Add new T6 PCI device id 0x6092
From: Raju Rangoju [ Upstream commit 3401e4aa43a540881cc97190afead650e709c418 ] Signed-off-by: Raju Rangoju Link: https://lore.kernel.org/r/20210202182511.8109-1-ra...@chelsio.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h b/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h index 0c5373462cedb..0b1b5f9c67d47 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h +++ b/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h @@ -219,6 +219,7 @@ CH_PCI_DEVICE_ID_TABLE_DEFINE_BEGIN CH_PCI_ID_TABLE_FENTRY(0x6089), /* Custom T62100-KR */ CH_PCI_ID_TABLE_FENTRY(0x608a), /* Custom T62100-CR */ CH_PCI_ID_TABLE_FENTRY(0x608b), /* Custom T6225-CR */ + CH_PCI_ID_TABLE_FENTRY(0x6092), /* Custom T62100-CR-LOM */ CH_PCI_DEVICE_ID_TABLE_DEFINE_END; #endif /* __T4_PCI_ID_TBL_H__ */ -- 2.27.0
[PATCH AUTOSEL 5.4 1/4] NET: usb: qmi_wwan: Adding support for Cinterion MV31
From: Christoph Schemmel [ Upstream commit a4dc7eee9106a9d2a6e08b442db19677aa9699c7 ] Adding support for Cinterion MV31 with PID 0x00B7. T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 11 Spd=5000 MxCh= 0 D: Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1 P: Vendor=1e2d ProdID=00b7 Rev=04.14 S: Manufacturer=Cinterion S: Product=Cinterion USB Mobile Broadband S: SerialNumber=b3246eed C: #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=896mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option Signed-off-by: Christoph Schemmel Link: https://lore.kernel.org/r/20210202084523.4371-1-christoph.schem...@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/usb/qmi_wwan.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c index 72a3a5dc51319..5a1d21aae2a9e 100644 --- a/drivers/net/usb/qmi_wwan.c +++ b/drivers/net/usb/qmi_wwan.c @@ -1354,6 +1354,7 @@ static const struct usb_device_id products[] = { {QMI_FIXED_INTF(0x1e2d, 0x0082, 5)},/* Cinterion PHxx,PXxx (2 RmNet) */ {QMI_FIXED_INTF(0x1e2d, 0x0083, 4)},/* Cinterion PHxx,PXxx (1 RmNet + USB Audio)*/ {QMI_QUIRK_SET_DTR(0x1e2d, 0x00b0, 4)}, /* Cinterion CLS8 */ + {QMI_FIXED_INTF(0x1e2d, 0x00b7, 0)},/* Cinterion MV31 RmNet */ {QMI_FIXED_INTF(0x413c, 0x81a2, 8)},/* Dell Wireless 5806 Gobi(TM) 4G LTE Mobile Broadband Card */ {QMI_FIXED_INTF(0x413c, 0x81a3, 8)},/* Dell Wireless 5570 HSPA+ (42Mbps) Mobile Broadband Card */ {QMI_FIXED_INTF(0x413c, 0x81a4, 8)},/* Dell Wireless 5570e HSPA+ (42Mbps) Mobile Broadband Card */ -- 2.27.0