Re: [PATCH iproute2 5/6] man8/bridge.8: explain self vs master for "bridge fdb add"

2021-02-15 Thread Alexandra Winter
Thank you very much Vladimir for improving this man page. I am still struggling 
with the meaning of the bridge attributes and sometimes
the man page has caused more confusion.

In the section about 'bridge link set' Self vs master mention physical device 
vs software bridge. Would it make sense to use the same
terminology here?

The attributes are listed under 'bridge fdb add' not under 'bridge fdb show'. 
Is it correct that the attributes displayed by 'show'
are a 1-to-1 representation of the ones set by 'add'? What about the entries 
that are not manually set, like bridge learned adresses?
Is it possible to add some explanation about those as well?

On 11.02.21 11:45, Vladimir Oltean wrote:
> From: Vladimir Oltean 
> 
> The "usually hardware" and "usually software" distinctions make no
> sense, try to clarify what these do based on the actual kernel behavior.
> 
> Signed-off-by: Vladimir Oltean 
> ---
>  man/man8/bridge.8 | 15 ---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/man/man8/bridge.8 b/man/man8/bridge.8
> index 1dc0aec83f09..d0bcd708bb61 100644
> --- a/man/man8/bridge.8
> +++ b/man/man8/bridge.8
> @@ -533,12 +533,21 @@ specified.
>  .sp
>  
>  .B self
> -- the address is associated with the port drivers fdb. Usually hardware
> -  (default).
> +- the operation is fulfilled directly by the driver for the specified network
> +device. If the network device belongs to a master like a bridge, then the
> +bridge is bypassed and not notified of this operation (and if the device does
> +notify the bridge, it is driver-specific behavior and not mandated by this
> +flag, check the driver for more details). The "bridge fdb add" command can 
> also
> +be used on the bridge device itself, and in this case, the added fdb entries
> +will be locally terminated (not forwarded). In the latter case, the "self" 
> flag
> +is mandatory. 
Maybe I misunderstand this sentence, but I can do a 'bridge fdb add' without 
'self'
on the bridge device. And the address shows up under 'bridge fdb show'.
So what does mandatory mean here?
The flag is set by default if "master" is not specified.
>  .sp
>  
>  .B master
> -- the address is associated with master devices fdb. Usually software.
> +- if the specified network device is a port that belongs to a master device
> +such as a bridge, the operation is fulfilled by the master device's driver,
> +which may in turn notify the port driver too of the address. If the specified
> +device is a master itself, such as a bridge, this flag is invalid.
>  .sp
>  
>  .B router
> 


Re: [PATCH 1/2 net-next] net/mlx5e: TC: Fix IS_ERR() vs NULL checks

2021-02-15 Thread Dan Carpenter
On Mon, Sep 28, 2020 at 06:31:04PM +, Ariel Levkovich wrote:
> On Sep 28, 2020, at 13:42, Dan Carpenter  wrote:
> > 
> > The mlx5_tc_ct_init() function doesn't return error pointers it returns
> > NULL.  Also we need to set the error codes on this path.
> > 
> > Fixes: aedd133d17bc ("net/mlx5e: Support CT offload for tc nic flows")
> > Signed-off-by: Dan Carpenter 
> > ---
> > drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 8 ++--
> > 1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c 
> > b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > index 104b1c339de0..438fbcf478d1 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> > @@ -5224,8 +5224,10 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv)
> > 
> >tc->ct = mlx5_tc_ct_init(priv, tc->chains, &priv->fs.tc.mod_hdr,
> > MLX5_FLOW_NAMESPACE_KERNEL);
> > -if (IS_ERR(tc->ct))
> > +if (!tc->ct) {
> > +err = -ENOMEM;
> >goto err_ct;
> > +}
> 
> Hi Dan,
> That was implement like that on purpose. If mlx5_tc_init_ct returns NULL it 
> means the device doesn’t support CT offload which can happen with older 
> devices or old FW on the devices.
> However, in this case we want to continue with the rest of the Tc 
> initialization because we can still support other TC offloads. No need to 
> fail the entire TC init in this case. Only if mlx5_tc_init_ct return err_ptr 
> that means the tc init failed not because of lack of support but due to a 
> real error and only then we want to fail the rest of the tc init.
> 
> Your change will break compatibility for devices/FW versions that don’t have 
> CT offload support.
> 

When we have a function like this which is optional then returning NULL
is a special kind of success as you say.  Returning NULL should not
generate a warning message.  At the same time, if the user enables the
option and the code fails because we are low on memory then returning an
error pointer is the correct behavior.  Just because the feature is
optional does not mean we should ignore what the user told us to do.

This code never returns error pointers.  It always returns NULL/success
when an allocation fails.  That triggers the first static checker
warning from last year.  Now Smatch is complaining about a new static
checker warning:

drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:4754
mlx5e_tc_esw_init() warn: missing error code here? 'IS_ERR()' failed. 'err' = 
'0'

  4708  int mlx5e_tc_esw_init(struct rhashtable *tc_ht)
  4709  {
  4710  const size_t sz_enc_opts = sizeof(struct tunnel_match_enc_opts);
  4711  struct mlx5_rep_uplink_priv *uplink_priv;
  4712  struct mlx5e_rep_priv *rpriv;
  4713  struct mapping_ctx *mapping;
  4714  struct mlx5_eswitch *esw;
  4715  struct mlx5e_priv *priv;
  4716  int err = 0;
  4717  
  4718  uplink_priv = container_of(tc_ht, struct mlx5_rep_uplink_priv, 
tc_ht);
  4719  rpriv = container_of(uplink_priv, struct mlx5e_rep_priv, 
uplink_priv);
  4720  priv = netdev_priv(rpriv->netdev);
  4721  esw = priv->mdev->priv.eswitch;
  4722  
  4723  uplink_priv->ct_priv = 
mlx5_tc_ct_init(netdev_priv(priv->netdev),
  4724 esw_chains(esw),
  4725 &esw->offloads.mod_hdr,
  4726 MLX5_FLOW_NAMESPACE_FDB);
  4727  if (IS_ERR(uplink_priv->ct_priv))
  4728  goto err_ct;

If mlx5_tc_ct_init() fails, which it should do if kmalloc() fails but
currently it does not, then the error should be propagated all the way
back.  So this code should preserve the error code instead of returning
success.

  4729  
  4730  mapping = mapping_create(sizeof(struct tunnel_match_key),
  4731   TUNNEL_INFO_BITS_MASK, true);

regards,
dan carpenter



Re: [PATCH net-next V1] net: followup adjust net_device layout for cacheline usage

2021-02-15 Thread Jesper Dangaard Brouer
On Fri, 12 Feb 2021 18:03:44 +0100
Eric Dumazet  wrote:

> On 2/12/21 2:50 PM, Jesper Dangaard Brouer wrote:
> > As Eric pointed out in response to commit 28af22c6c8df ("net: adjust
> > net_device layout for cacheline usage") the netdev_features_t members
> > wanted_features and hw_features are only used in control path.
> > 
> > Thus, this patch reorder the netdev_features_t to let more members that
> > are used in fast path into the 3rd cacheline. Whether these members are
> > read depend on SKB properties, which are hinted as comments. The member
> > mpls_features could not fit in the cacheline, but it was the least
> > commonly used (depend on CONFIG_NET_MPLS_GSO).
> > 
> > In the future we should consider relocating member gso_partial_features
> > to be closer to member gso_max_segs. (see usage in gso_features_check()).
> > 
> > Suggested-by: Eric Dumazet 
> > Signed-off-by: Jesper Dangaard Brouer 
> > ---
> >  include/linux/netdevice.h |   11 +++
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> > 
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index bfadf3b82f9c..3898bb167579 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -1890,13 +1890,16 @@ struct net_device {
> > unsigned short  needed_headroom;
> > unsigned short  needed_tailroom;
> >  
> > +   /* Fast path features - via netif_skb_features */
> > netdev_features_t   features;
> > +   netdev_features_t   vlan_features;   /* if skb_vlan_tagged */
> > +   netdev_features_t   hw_enc_features; /* if skb->encapsulation */
> > +   netdev_features_t   gso_partial_features;/* if skb_is_gso */
> > +   netdev_features_t   mpls_features; /* if eth_p_mpls+NET_MPLS_GSO */
> > +
> > +   /* Control path features */
> > netdev_features_t   hw_features;
> > netdev_features_t   wanted_features;
> > -   netdev_features_t   vlan_features;
> > -   netdev_features_t   hw_enc_features;
> > -   netdev_features_t   mpls_features;
> > -   netdev_features_t   gso_partial_features;
> >  
> > unsigned intmin_mtu;
> > unsigned intmax_mtu;
> > 
> >   
> 
> 
> Please also note we currently have at least 3 distinct blocks for tx path.
> 
> Presumably netdev_features_t are only used in TX, so should be grouped with 
> the other TX
> sections.
> 
> 
> /* --- cacheline 3 boundary (192 bytes) --- */   
> ...
> netdev_features_t  features; /*  0xe0   0x8 */  
> 
> ... Lots of ctrl stuff
> 
> 
> /* --- cacheline 14 boundary (896 bytes) --- */
>  struct netdev_queue *  _tx __attribute__((__aligned__(64))); /* 
> 0x380   0x8 */  
> 
> 
> 
> 
> /* Mix of unrelated control stuff like rtnl_link_ops
> 
>  /* --- cacheline 31 boundary (1984 bytes) --- */ 
>  unsigned int   gso_max_size; /* 0x7c0   0x4 */
>  u16gso_max_segs; /* 0x7c4   0x2 */ 
> 
> 
> 
> Ideally we should move _all_ control/slow_path stuff at the very end of the 
> structure,
> in order to not pollute the cache lines we need for data path, to keep them 
> as small
> and packed as possible.
> 
> This could be done one field at a time, to ease code review.
> 
> We should have something like this 
> 
> /* section used in RX (fast) path */
> /* section used in both RX/TX (fast) path */
> /* section used in TX (fast) path */
> /* section used for slow path, and control path */

I fully agree with above, but this is the long term plan, that I have
added to my TODO list.

This patch is a followup to commit 28af22c6c8df ("net: adjust
net_device layout for cacheline usage") for fixing that the feature
members got partitioned into two cache-lines.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer



Re: [RFC PATCH v4 07/17] af_vsock: rest of SEQPACKET support

2021-02-15 Thread Arseny Krasnov


On 11.02.2021 15:27, Stefano Garzarella wrote:
> On Sun, Feb 07, 2021 at 06:16:12PM +0300, Arseny Krasnov wrote:
>> This does rest of SOCK_SEQPACKET support:
>> 1) Adds socket ops for SEQPACKET type.
>> 2) Allows to create socket with SEQPACKET type.
>>
>> Signed-off-by: Arseny Krasnov 
>> ---
>> net/vmw_vsock/af_vsock.c | 37 -
>> 1 file changed, 36 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>> index a033d3340ac4..c77998a14018 100644
>> --- a/net/vmw_vsock/af_vsock.c
>> +++ b/net/vmw_vsock/af_vsock.c
>> @@ -452,6 +452,7 @@ int vsock_assign_transport(struct vsock_sock *vsk, 
>> struct vsock_sock *psk)
>>  new_transport = transport_dgram;
>>  break;
>>  case SOCK_STREAM:
>> +case SOCK_SEQPACKET:
>>  if (vsock_use_local_transport(remote_cid))
>>  new_transport = transport_local;
>>  else if (remote_cid <= VMADDR_CID_HOST || !transport_h2g ||
>> @@ -459,6 +460,15 @@ int vsock_assign_transport(struct vsock_sock *vsk, 
>> struct vsock_sock *psk)
>>  new_transport = transport_g2h;
>>  else
>>  new_transport = transport_h2g;
>> +
>> +if (sk->sk_type == SOCK_SEQPACKET) {
>> +if (!new_transport ||
>> +!new_transport->seqpacket_seq_send_len ||
>> +!new_transport->seqpacket_seq_send_eor ||
>> +!new_transport->seqpacket_seq_get_len ||
>> +!new_transport->seqpacket_dequeue)
>> +return -ESOCKTNOSUPPORT;
>> +}
> Maybe we should move this check after the try_module_get() call, since 
> the memory pointed by 'new_transport' pointer can be deallocated in the 
> meantime.
>
> Also, if the socket had a transport before, we should deassign it before 
> returning an error.

I think previous transport is deassigned immediately after this

'switch()' on sk->sk_type:

if (vsk->transport) {

    ...

    vsock_deassign_transport(vsk);

}


Ok, check will be moved after 'try_module_get()'.

>
>>  break;
>>  default:
>>  return -ESOCKTNOSUPPORT;
>> @@ -684,6 +694,7 @@ static int __vsock_bind(struct sock *sk, struct 
>> sockaddr_vm *addr)
>>
>>  switch (sk->sk_socket->type) {
>>  case SOCK_STREAM:
>> +case SOCK_SEQPACKET:
>>  spin_lock_bh(&vsock_table_lock);
>>  retval = __vsock_bind_connectible(vsk, addr);
>>  spin_unlock_bh(&vsock_table_lock);
>> @@ -769,7 +780,7 @@ static struct sock *__vsock_create(struct net *net,
>>
>> static bool sock_type_connectible(u16 type)
>> {
>> -return type == SOCK_STREAM;
>> +return (type == SOCK_STREAM) || (type == SOCK_SEQPACKET);
>> }
>>
>> static void __vsock_release(struct sock *sk, int level)
>> @@ -2199,6 +2210,27 @@ static const struct proto_ops vsock_stream_ops = {
>>  .sendpage = sock_no_sendpage,
>> };
>>
>> +static const struct proto_ops vsock_seqpacket_ops = {
>> +.family = PF_VSOCK,
>> +.owner = THIS_MODULE,
>> +.release = vsock_release,
>> +.bind = vsock_bind,
>> +.connect = vsock_connect,
>> +.socketpair = sock_no_socketpair,
>> +.accept = vsock_accept,
>> +.getname = vsock_getname,
>> +.poll = vsock_poll,
>> +.ioctl = sock_no_ioctl,
>> +.listen = vsock_listen,
>> +.shutdown = vsock_shutdown,
>> +.setsockopt = vsock_connectible_setsockopt,
>> +.getsockopt = vsock_connectible_getsockopt,
>> +.sendmsg = vsock_connectible_sendmsg,
>> +.recvmsg = vsock_connectible_recvmsg,
>> +.mmap = sock_no_mmap,
>> +.sendpage = sock_no_sendpage,
>> +};
>> +
>> static int vsock_create(struct net *net, struct socket *sock,
>>  int protocol, int kern)
>> {
>> @@ -2219,6 +2251,9 @@ static int vsock_create(struct net *net, struct socket 
>> *sock,
>>  case SOCK_STREAM:
>>  sock->ops = &vsock_stream_ops;
>>  break;
>> +case SOCK_SEQPACKET:
>> +sock->ops = &vsock_seqpacket_ops;
>> +break;
>>  default:
>>  return -ESOCKTNOSUPPORT;
>>  }
>> -- 
>> 2.25.1
>>
>


Re: [PATCH 2/4] net: stmmac: Add Toshiba Visconti SoCs glue driver

2021-02-15 Thread Leon Romanovsky
On Mon, Feb 15, 2021 at 04:28:09PM +0900, Nobuhiro Iwamatsu wrote:
> Hi,
>
> Thanks for your review.
>
> On Mon, Feb 15, 2021 at 08:07:21AM +0200, Leon Romanovsky wrote:
> > On Mon, Feb 15, 2021 at 02:06:53PM +0900, Nobuhiro Iwamatsu wrote:
> > > Add dwmac-visconti to the stmmac driver in Toshiba Visconti ARM SoCs.
> > > This patch contains only the basic function of the device. There is no
> > > clock control, PM, etc. yet. These will be added in the future.
> > >
> > > Signed-off-by: Nobuhiro Iwamatsu 
> > > ---
> > >  drivers/net/ethernet/stmicro/stmmac/Kconfig   |   8 +
> > >  drivers/net/ethernet/stmicro/stmmac/Makefile  |   1 +
> > >  .../ethernet/stmicro/stmmac/dwmac-visconti.c  | 285 ++
> > >  3 files changed, 294 insertions(+)
> > >  create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c
> > >
> > > diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig 
> > > b/drivers/net/ethernet/stmicro/stmmac/Kconfig
> > > index 53f14c5a9e02..55ba67a550b9 100644
> > > --- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
> > > +++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
> > > @@ -219,6 +219,14 @@ config DWMAC_INTEL_PLAT
> > > This selects the Intel platform specific glue layer support for
> > > the stmmac device driver. This driver is used for the Intel Keem Bay
> > > SoC.
> > > +
> > > +config DWMAC_VISCONTI
> > > + bool "Toshiba Visconti DWMAC support"
> > > + def_bool y
> >
>
> Sorry, I sent the wrong patchset that didn't fix this point out.
>
> > I asked it before, but never received an answer.
>
> I have received your point out and have sent an email with the content
> to remove this line. But it may not have arrived yet...
>
> > Why did you use "def_bool y" and not "default y"? Isn't it supposed to be
> > "depends on STMMAC_ETH"? And probably it shouldn't be set as a default as 
> > "y".
> >
>
> The reason why "def_bool y" was set is that the wrong fix was left when
> debugging. Also, I don't think it is necessary to set "default y".
> This is also incorrect because it says "bool" Toshiba Visconti DWMAC
> support "". I change it to trustate in the new patch.
>
> And this driver is enabled when STMMAC_PLATFORM was Y. And STMMAC_PLATFORM
> depends on STMMAC_ETH.
> So I understand that STMMAC_ETH does not need to be dependents. Is this
> understanding wrong?

This is correct understanding, just need to clean other entries in that
Kconfig that depends on STMMAC_ETH.

Thanks

>
> > Thanks
> >
>
> Best regards,
>   Nobuhiro


Re: [PATCH net-next RFC v3] net: hdlc_x25: Queue outgoing LAPB frames

2021-02-15 Thread Leon Romanovsky
On Sun, Feb 14, 2021 at 11:27:03PM -0800, Xie He wrote:
> When sending packets, we will first hand over the (L3) packets to the
> LAPB module. The LAPB module will then hand over the corresponding LAPB
> (L2) frames back to us for us to transmit.
>
> The LAPB module can also emit LAPB (L2) frames at any time, even without
> an (L3) packet currently being sent on the device. This happens when the
> LAPB module tries to send (L3) packets queued up in its internal queue,
> or when the LAPB module decides to send some (L2) control frame.
>
> This means we need to have a queue for these outgoing LAPB (L2) frames,
> otherwise frames can be dropped if sent when the hardware driver is
> already busy in transmitting. The queue needs to be controlled by
> the hardware driver's netif_stop_queue and netif_wake_queue calls.
> Therefore, we need to use the device's qdisc TX queue for this purpose.
> However, currently outgoing LAPB (L2) frames are not queued.
>
> On the other hand, outgoing (L3) packets (before they are handed over
> to the LAPB module) don't need to be queued, because the LAPB module
> already has an internal queue for them, and is able to queue new outgoing
> (L3) packets at any time. However, currently outgoing (L3) packets are
> being queued in the device's qdisc TX queue, which is controlled by
> the hardware driver's netif_stop_queue and netif_wake_queue calls.
> This is unnecessary and meaningless.
>
> To fix these issues, we can split the HDLC device into two devices -
> a virtual X.25 device and the actual HDLC device, use the virtual X.25
> device to send (L3) packets and then use the actual HDLC device to
> queue LAPB (L2) frames. The outgoing (L2) LAPB queue will be controlled
> by the hardware driver's netif_stop_queue and netif_wake_queue calls,
> while outgoing (L3) packets will not be affected by these calls.
>
> Cc: Martin Schiller 
> Signed-off-by: Xie He 
> ---
>
> Change from RFC v2:
> Simplified the commit message.
> Dropped the x25_open fix which is already merged into net-next now.
> Use HDLC_MAX_MTU as the mtu of the X.25 virtual device.
> Add an explanation to the documentation about the X.25 virtual device.
>
> Change from RFC v1:
> Properly initialize state(hdlc)->x25_dev and state(hdlc)->x25_dev_lock.
>
> ---
>  Documentation/networking/generic-hdlc.rst |   3 +
>  drivers/net/wan/hdlc_x25.c| 153 ++
>  2 files changed, 130 insertions(+), 26 deletions(-)

<...>

> +static void x25_setup_virtual_dev(struct net_device *dev)
> +{
> + dev->netdev_ops  = &hdlc_x25_netdev_ops;
> + dev->type= ARPHRD_X25;
> + dev->addr_len= 0;
> + dev->hard_header_len = 0;
> + dev->mtu = HDLC_MAX_MTU;
> +
> + /* When transmitting data:
> +  * first we'll remove a pseudo header of 1 byte,
> +  * then the LAPB module will prepend an LAPB header of at most 3 bytes.
> +  */
> + dev->needed_headroom = 3 - 1;

3 - 1 = 2

Thanks

> +}
> +


Re: [PATCH iproute2 5/6] man8/bridge.8: explain self vs master for "bridge fdb add"

2021-02-15 Thread Vladimir Oltean
Hi Alexandra,

On Mon, Feb 15, 2021 at 09:22:47AM +0100, Alexandra Winter wrote:
> In the section about 'bridge link set' Self vs master mention physical
> device vs software bridge. Would it make sense to use the same
> terminology here?

You mean like this?

.TP
.B self
operation is fulfilled by the driver of the specified network interface.

.TP
.B master
operation is fulfilled by the specified interface's master, for example
a bridge, which in turn may or may not notify the underlying network
interface driver. This flag is considered implicit by the kernel if
'self' was not specified.

> The attributes are listed under 'bridge fdb add' not under 'bridge fdb
> show'. Is it correct that the attributes displayed by 'show' are a
> 1-to-1 representation of the ones set by 'add'?

Bah, not quite. I'll try to summarize below.

> What about the entries that are not manually set, like bridge learned
> adresses? Is it possible to add some explanation about those as well?

Ok, challenge accepted. Here's my take on 'bridge fdb show', I haven't
used most of these options (I'm commenting solely based on code
inspection) so if anybody with more experience could chime in, I'd be
happy to adjust the wording.


.SS bridge fdb show - list forwarding entries.

This command displays the current forwarding table. By default all FDB
entries in the system are shown. The following options can be used to
reduce the number of displayed entries:

.TP
.B br
Filter the output to contain only the FDB entries of the specified
bridge, or belonging to ports of the specified bridge (optional).

.B brport
Filter the output to contain only the FDB entries present on the
specified network interface (bridge port). This flag is optional.

.B dev
Same as "brport".

.B vlan
Filter the output to contain only the FDB entries with the specified
VLAN ID (optional).

.B dynamic
Filter out the local/permanent (not forwarded) FDB entries.

.B state
Filter the output to contain only the FDB entries having the specified
state. The bridge FDB is modeled as a neighbouring protocol for
PF_BRIDGE (similar to what ARP is for IPv4 and ND is for IPv6).
Therefore, an FDB entry has a NUD ("Network Unreachability Detection")
state given by the generic neighbouring layer.

The following are the valid components of an FDB entry state (more than
one may be valid at the same time):

.B permanent
Associated with the generic NUD_PERMANENT state, which means that the L2
address of the neighbor has been statically configured by the user and
therefore there is no need for a neighbour resolution.
For the bridge FDB, it means that an FDB entry is 'local', i.e. the L2
address belongs to a local interface.

.B reachable
Associated with the generic NUD_REACHABLE state, which means that the L2
address has been resolved by the neighbouring protocol. A reachable
bridge FDB entry can have two sub-states (static and dynamic) detailed
below.

.B static
Associated with the generic NUD_NOARP state, which is used to denote a
neighbour for which no protocol is needed to resolve the mapping between
the L3 address and L2 address. For the bridge FDB, the neighbour
resolution protocol is source MAC address learning, therefore a static
FDB entry is one that has not been learnt.

.B dynamic
Is a NUD_REACHABLE entry that lacks the NUD_NOARP state, therefore has
been resolved through address learning.

.B stale
Associated with the generic NUD_STALE state. Denotes an FDB entry that
was last updated longer ago than the bridge's hold time, but not yet
removed. The hold time is equal to the forward_delay (if the STP
topology is still changing) or to the ageing_time otherwise.


.PP
In the resulting output, each FDB entry can have one or more of the
following flags:

.B self
This entry is present in the FDB of the specified network interface driver.

.B router
???

.B extern_learn
This entry has been added to the master interface's FDB by the lower
port driver, as a result of hardware address learning.

.B offload
This entry is present in the hardware FDB of a lower port and also
associated with an entry of the master interface.

.B master
This entry is present in the software FDB of the master interface of
this lower port.

.B sticky
This entry cannot be migrated to another port by the address learning
process.

.PP
With the
.B -statistics
option, the command becomes verbose. It prints out the last updated
and last used time for each entry.

> >  .B self
> > -- the address is associated with the port drivers fdb. Usually hardware
> > -  (default).
> > +- the operation is fulfilled directly by the driver for the specified 
> > network
> > +device. If the network device belongs to a master like a bridge, then the
> > +bridge is bypassed and not notified of this operation (and if the device 
> > does
> > +notify the bridge, it is driver-specific behavior and not mandated by this
> > +flag, check the driver for more details). The "bridge fdb add" command can 
> > also
> > +be used on the bridge device

Re: [Patch bpf-next v3 1/5] bpf: clean up sockmap related Kconfigs

2021-02-15 Thread Lorenz Bauer
On Sat, 13 Feb 2021 at 21:44, Cong Wang  wrote:
>
> From: Cong Wang 
>
> As suggested by John, clean up sockmap related Kconfigs:
>
> Reduce the scope of CONFIG_BPF_STREAM_PARSER down to TCP stream
> parser, to reflect its name.
>
> Make the rest sockmap code simply depend on CONFIG_BPF_SYSCALL.
> And leave CONFIG_NET_SOCK_MSG untouched, as it is used by
> non-sockmap cases.

For the series:

Reviewed-by: Lorenz Bauer 

Jakub, John: can you please take another look at the assembly in patch 3?

-- 
Lorenz Bauer  |  Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK

www.cloudflare.com


Re: [PATCH 1/2] lockdep: add lockdep_assert_not_held()

2021-02-15 Thread Peter Zijlstra
On Sun, Feb 14, 2021 at 06:53:01PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 12, 2021 at 04:28:42PM -0700, Shuah Khan wrote:
> 
> > +#define lockdep_assert_not_held(l) do {\
> > +   WARN_ON(debug_locks && lockdep_is_held(l)); \
> > +   } while (0)
> > +
> 
> This thing isn't as straight forward as you might think, but it'll
> mostly work.
> 
> Notably this thing will misfire when lockdep_off() is employed. It
> certainyl needs a comment to explain the subtleties.

I think something like so will work, but please double check.

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index b9e9adec73e8..c8b0d292bf8e 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -294,11 +294,15 @@ extern void lock_unpin_lock(struct lockdep_map *lock, 
struct pin_cookie);
 
 #define lockdep_depth(tsk) (debug_locks ? (tsk)->lockdep_depth : 0)
 
-#define lockdep_assert_held(l) do {\
-   WARN_ON(debug_locks && !lockdep_is_held(l));\
+#define lockdep_assert_held(l) do {\
+   WARN_ON(debug_locks && lockdep_is_held(l) == 0));   \
} while (0)
 
-#define lockdep_assert_held_write(l)   do {\
+#define lockdep_assert_not_held(l) do {\
+   WARN_ON(debug_locks && lockdep_is_held(l) == 1));   \
+   } while (0)
+
+#define lockdep_assert_held_write(l)   do {\
WARN_ON(debug_locks && !lockdep_is_held_type(l, 0));\
} while (0)
 
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index c1418b47f625..983ba206f7b2 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -5467,7 +5467,7 @@ noinstr int lock_is_held_type(const struct lockdep_map 
*lock, int read)
int ret = 0;
 
if (unlikely(!lockdep_enabled()))
-   return 1; /* avoid false negative lockdep_assert_held() */
+   return -1; /* avoid false negative lockdep_assert_held() */
 
raw_local_irq_save(flags);
check_flags(flags);


Re: [PATCH iproute2 5/6] man8/bridge.8: explain self vs master for "bridge fdb add"

2021-02-15 Thread Alexandra Winter



On 15.02.21 11:32, Vladimir Oltean wrote:
> Hi Alexandra,
> 
> On Mon, Feb 15, 2021 at 09:22:47AM +0100, Alexandra Winter wrote:
>> In the section about 'bridge link set' Self vs master mention physical
>> device vs software bridge. Would it make sense to use the same
>> terminology here?
> 
> You mean like this?
> 
> .TP
> .B self
> operation is fulfilled by the driver of the specified network interface.
> 
> .TP
> .B master
> operation is fulfilled by the specified interface's master, for example
> a bridge, which in turn may or may not notify the underlying network
> interface driver. This flag is considered implicit by the kernel if
> 'self' was not specified.
> 
Actually, I found your first (more verbose) proposal more helpful. 

>> The attributes are listed under 'bridge fdb add' not under 'bridge fdb
>> show'. Is it correct that the attributes displayed by 'show' are a
>> 1-to-1 representation of the ones set by 'add'?
> 
> Bah, not quite. I'll try to summarize below.
> 
>> What about the entries that are not manually set, like bridge learned
>> adresses? Is it possible to add some explanation about those as well?
> 
> Ok, challenge accepted. Here's my take on 'bridge fdb show', I haven't
> used most of these options (I'm commenting solely based on code
> inspection) so if anybody with more experience could chime in, I'd be
> happy to adjust the wording.
> 
> 
> .SS bridge fdb show - list forwarding entries.
> 
> This command displays the current forwarding table. By default all FDB
> entries in the system are shown. The following options can be used to
> reduce the number of displayed entries:
> 
> .TP
> .B br
> Filter the output to contain only the FDB entries of the specified
> bridge, or belonging to ports of the specified bridge (optional).
> 
> .B brport
> Filter the output to contain only the FDB entries present on the
> specified network interface (bridge port). This flag is optional.
> 
> .B dev
> Same as "brport".
> 
> .B vlan
> Filter the output to contain only the FDB entries with the specified
> VLAN ID (optional).
> 
> .B dynamic
> Filter out the local/permanent (not forwarded) FDB entries.
> 
> .B state
> Filter the output to contain only the FDB entries having the specified
> state. The bridge FDB is modeled as a neighbouring protocol for
> PF_BRIDGE (similar to what ARP is for IPv4 and ND is for IPv6).
> Therefore, an FDB entry has a NUD ("Network Unreachability Detection")
> state given by the generic neighbouring layer.
> 
> The following are the valid components of an FDB entry state (more than
> one may be valid at the same time):
> 
> .B permanent
> Associated with the generic NUD_PERMANENT state, which means that the L2
> address of the neighbor has been statically configured by the user and
> therefore there is no need for a neighbour resolution.
> For the bridge FDB, it means that an FDB entry is 'local', i.e. the L2
> address belongs to a local interface.
> 
> .B reachable
> Associated with the generic NUD_REACHABLE state, which means that the L2
> address has been resolved by the neighbouring protocol. A reachable
> bridge FDB entry can have two sub-states (static and dynamic) detailed
> below.
> 
> .B static
> Associated with the generic NUD_NOARP state, which is used to denote a
> neighbour for which no protocol is needed to resolve the mapping between
> the L3 address and L2 address. For the bridge FDB, the neighbour
> resolution protocol is source MAC address learning, therefore a static
> FDB entry is one that has not been learnt.
> 
> .B dynamic
> Is a NUD_REACHABLE entry that lacks the NUD_NOARP state, therefore has
> been resolved through address learning.
> 
> .B stale
> Associated with the generic NUD_STALE state. Denotes an FDB entry that
> was last updated longer ago than the bridge's hold time, but not yet
> removed. The hold time is equal to the forward_delay (if the STP
> topology is still changing) or to the ageing_time otherwise.
> 
> 
> .PP
> In the resulting output, each FDB entry can have one or more of the
> following flags:
> 
> .B self
> This entry is present in the FDB of the specified network interface driver.
> 
> .B router
> ???
> 
> .B extern_learn
> This entry has been added to the master interface's FDB by the lower
> port driver, as a result of hardware address learning.
> 
> .B offload
> This entry is present in the hardware FDB of a lower port and also
> associated with an entry of the master interface.
> 
> .B master
> This entry is present in the software FDB of the master interface of
> this lower port.
> 
> .B sticky
> This entry cannot be migrated to another port by the address learning
> process.
> 
> .PP
> With the
> .B -statistics
> option, the command becomes verbose. It prints out the last updated
> and last used time for each entry.
> 
Thank you so much!! This will be very helpful.

>>>  .B self
>>> -- the address is associated with the port drivers fdb. Usually hardware
>>> -  (default).
>>> +- the operation is fulfilled dire

Re: linux-next: manual merge of the net-next tree with the net tree

2021-02-15 Thread Guillaume Nault
On Mon, Feb 15, 2021 at 11:43:54AM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the net-next tree got a conflict in:
> 
>   tools/testing/selftests/net/forwarding/tc_flower.sh
> 
> between commit:
> 
>   d2126838050c ("flow_dissector: fix TTL and TOS dissection on IPv4 
> fragments")
> 
> from the net tree and commits:
> 
>   203ee5cd7235 ("selftests: tc: Add basic mpls_* matching support for 
> tc-flower")
>   c09bfd9a5df9 ("selftests: tc: Add generic mpls matching support for 
> tc-flower")
> 
> from the net-next tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> -- 
> Cheers,
> Stephen Rothwell
> 
> diff --cc tools/testing/selftests/net/forwarding/tc_flower.sh
> index b11d8e6b5bc1,a554838666c4..
> --- a/tools/testing/selftests/net/forwarding/tc_flower.sh
> +++ b/tools/testing/selftests/net/forwarding/tc_flower.sh
> @@@ -3,7 -3,9 +3,9 @@@
>   
>   ALL_TESTS="match_dst_mac_test match_src_mac_test match_dst_ip_test \
>   match_src_ip_test match_ip_flags_test match_pcp_test match_vlan_test \
> - match_ip_tos_test match_indev_test match_ip_ttl_test"
> + match_ip_tos_test match_indev_test match_mpls_label_test \
> + match_mpls_tc_test match_mpls_bos_test match_mpls_ttl_test \
>  -match_mpls_lse_test"
> ++match_mpls_lse_test match_ip_ttl_test"

That's technically right. But I think it'd be nicer to have
"match_ip_ttl_test" appear between "match_ip_tos_test" and
"match_indev_test", rather than at the end of the list.

Before these commits, ALL_TESTS listed the tests in the order they were
implemented in the rest of the file. So I'd rather continue following
this implicit rule, if at all possible. Also it makes sense to keep
grouping all match_ip_*_test together.

>   NUM_NETIFS=2
>   source tc_common.sh
>   source lib.sh



Re: Regressions with VMBus/VSCs hardening changes

2021-02-15 Thread Wei Liu
On Fri, Feb 12, 2021 at 05:50:50PM +0100, Andrea Parri wrote:
> Hi all,
[...]
> 2) IIUC a8c3209998afb5 could be dropped (after rebase) without further modi-
>fications to hyperv-next.

I've reverted the said patch from hyperv-next.

Wei.


Re: linux-next: manual merge of the net-next tree with the net tree

2021-02-15 Thread Davide Caratti
On Mon, 2021-02-15 at 12:01 +0100, Guillaume Nault wrote:
> Before these commits, ALL_TESTS listed the tests in the order they were
> implemented in the rest of the file. So I'd rather continue following
> this implicit rule, if at all possible. Also it makes sense to keep
> grouping all match_ip_*_test together.

yes, it makes sense. I can follow-up with a commit for net-next (when
tree re-opens), where the "ordering" in ALL_TESTS is restored. Ok?

thanks,
-- 
davide



[PATCH net] cxgb4/chtls/cxgbit: Keeping the max ofld immediate data size same in cxgb4 and ulds

2021-02-15 Thread Ayush Sawal
The Max imm data size in cxgb4 is not similar to the max imm data size
in the chtls. This caused an mismatch in output of is_ofld_imm() of
cxgb4 and chtls. So fixed this by keeping the max wreq size of imm data
same in both chtls and cxgb4 as MAX_IMM_OFLD_TX_DATA_WR_LEN.

As cxgb4's max imm. data value for ofld packets is changed to
MAX_IMM_OFLD_TX_DATA_WR_LEN. Using the same in cxgbit also.

Fixes: 36bedb3f2e5b8 ("crypto: chtls - Inline TLS record Tx")
Signed-off-by: Ayush Sawal 
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h|  3 +++
 drivers/net/ethernet/chelsio/cxgb4/sge.c  | 11 ---
 .../ethernet/chelsio/inline_crypto/chtls/chtls_cm.h   |  3 ---
 drivers/target/iscsi/cxgbit/cxgbit_target.c   |  3 +--
 4 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
index 1b49f2fa9b18..34546f5312ee 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
@@ -46,6 +46,9 @@
 #define MAX_ULD_QSETS 16
 #define MAX_ULD_NPORTS 4
 
+/* ulp_mem_io + ulptx_idata + payload + padding */
+#define MAX_IMM_ULPTX_WR_LEN (32 + 8 + 256 + 8)
+
 /* CPL message priority levels */
 enum {
CPL_PRIORITY_DATA = 0,  /* data messages */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c 
b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index 196652a114c5..3334c9e2152a 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -2842,17 +2842,22 @@ int t4_mgmt_tx(struct adapter *adap, struct sk_buff 
*skb)
  * @skb: the packet
  *
  * Returns true if a packet can be sent as an offload WR with immediate
- * data.  We currently use the same limit as for Ethernet packets.
+ * data.
+ * FW_OFLD_TX_DATA_WR limits the payload to 255 bytes due to 8-bit field.
+ *  However, FW_ULPTX_WR commands have a 256 byte immediate only
+ *  payload limit.
  */
 static inline int is_ofld_imm(const struct sk_buff *skb)
 {
struct work_request_hdr *req = (struct work_request_hdr *)skb->data;
unsigned long opcode = FW_WR_OP_G(ntohl(req->wr_hi));
 
-   if (opcode == FW_CRYPTO_LOOKASIDE_WR)
+   if (unlikely(opcode == FW_ULPTX_WR))
+   return skb->len <= MAX_IMM_ULPTX_WR_LEN;
+   else if (opcode == FW_CRYPTO_LOOKASIDE_WR)
return skb->len <= SGE_MAX_WR_LEN;
else
-   return skb->len <= MAX_IMM_TX_PKT_LEN;
+   return skb->len <= MAX_IMM_OFLD_TX_DATA_WR_LEN;
 }
 
 /**
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h 
b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h
index 47ba81e42f5d..b1161bdeda4d 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h
+++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h
@@ -50,9 +50,6 @@
 #define MIN_RCV_WND (24 * 1024U)
 #define LOOPBACK(x) (((x) & htonl(0xff00)) == htonl(0x7f00))
 
-/* ulp_mem_io + ulptx_idata + payload + padding */
-#define MAX_IMM_ULPTX_WR_LEN (32 + 8 + 256 + 8)
-
 /* for TX: a skb must have a headroom of at least TX_HEADER_LEN bytes */
 #define TX_HEADER_LEN \
(sizeof(struct fw_ofld_tx_data_wr) + sizeof(struct sge_opaque_hdr))
diff --git a/drivers/target/iscsi/cxgbit/cxgbit_target.c 
b/drivers/target/iscsi/cxgbit/cxgbit_target.c
index 9b3eb2e8c92a..b926e1d6c7b8 100644
--- a/drivers/target/iscsi/cxgbit/cxgbit_target.c
+++ b/drivers/target/iscsi/cxgbit/cxgbit_target.c
@@ -86,8 +86,7 @@ static int cxgbit_is_ofld_imm(const struct sk_buff *skb)
if (likely(cxgbit_skcb_flags(skb) & SKCBF_TX_ISO))
length += sizeof(struct cpl_tx_data_iso);
 
-#define MAX_IMM_TX_PKT_LEN 256
-   return length <= MAX_IMM_TX_PKT_LEN;
+   return length <= MAX_IMM_OFLD_TX_DATA_WR_LEN;
 }
 
 /*
-- 
2.28.0.rc1.6.gae46588



Re: [RFC PATCH 03/13] nexthop: Add netlink defines and enumerators for resilient NH groups

2021-02-15 Thread Petr Machata


Ido Schimmel  writes:

> On Sat, Feb 13, 2021 at 12:16:45PM -0700, David Ahern wrote:
>> On 2/8/21 1:42 PM, Petr Machata wrote:
>> > @@ -52,8 +53,50 @@ enum {
>> >NHA_FDB,/* flag; nexthop belongs to a bridge fdb */
>> >/* if NHA_FDB is added, OIF, BLACKHOLE, ENCAP cannot be set */
>> >  
>> > +  /* nested; resilient nexthop group attributes */
>> > +  NHA_RES_GROUP,
>> > +  /* nested; nexthop bucket attributes */
>> > +  NHA_RES_BUCKET,
>> > +
>> >__NHA_MAX,
>> >  };
>> >  
>> >  #define NHA_MAX   (__NHA_MAX - 1)
>> > +
>> > +enum {
>> > +  NHA_RES_GROUP_UNSPEC,
>> > +  /* Pad attribute for 64-bit alignment. */
>> > +  NHA_RES_GROUP_PAD = NHA_RES_GROUP_UNSPEC,
>> > +
>> > +  /* u32; number of nexthop buckets in a resilient nexthop group */
>> > +  NHA_RES_GROUP_BUCKETS,
>> 
>> u32 is overkill; arguably u16 (64k) should be more than enough buckets
>> for any real use case.
>
> We wanted to make it future-proof, but I think we can live with 64k. At
> least in Spectrum the maximum is 4k. I don't know about other devices,
> but I guess it is not more than 64k.

OK, no problem. I was thinking of keeping the API as u32, and tracking
as u16 internally, but let's not add baggage at this stage already. Push
comes to shove there can be another u32 attribute mutexed with this one.


Re: [RFC PATCH 04/13] nexthop: Add implementation of resilient next-hop groups

2021-02-15 Thread Petr Machata


David Ahern  writes:

> On 2/8/21 1:42 PM, Petr Machata wrote:
>> @@ -212,7 +254,7 @@ static inline bool nexthop_is_multipath(const struct 
>> nexthop *nh)
>>  struct nh_group *nh_grp;
>>  
>>  nh_grp = rcu_dereference_rtnl(nh->nh_grp);
>> -return nh_grp->mpath;
>> +return nh_grp->mpath || nh_grp->resilient;
>
> That pattern shows up multiple times; since this is datapath, it would
> be worth adding a new bool for it (is_multipath or has_nexthops or
> something else along those lines)

OK.


Re: linux-next: manual merge of the net-next tree with the net tree

2021-02-15 Thread Stephen Rothwell
Hi Davide,

On Mon, 15 Feb 2021 12:35:37 +0100 Davide Caratti  wrote:
>
> On Mon, 2021-02-15 at 12:01 +0100, Guillaume Nault wrote:
> > Before these commits, ALL_TESTS listed the tests in the order they were
> > implemented in the rest of the file. So I'd rather continue following
> > this implicit rule, if at all possible. Also it makes sense to keep
> > grouping all match_ip_*_test together.  
> 
> yes, it makes sense. I can follow-up with a commit for net-next (when
> tree re-opens), where the "ordering" in ALL_TESTS is restored. Ok?

The ordering is not set in stone yet (I have only done the merge in the
linux-next tree), just make sure that Dave knows what it should look
like when he merges the net and net-next trees.

-- 
Cheers,
Stephen Rothwell


pgpN6LBT4u1D7.pgp
Description: OpenPGP digital signature


Re: [PATCH v14 2/4] phy: Add media type and speed serdes configuration interfaces

2021-02-15 Thread Kishon Vijay Abraham I
Hi Steen,

On 12/02/21 6:35 pm, Steen Hegelund wrote:
> Hi Kishon,
> 
> On Fri, 2021-02-12 at 17:02 +0530, Kishon Vijay Abraham I wrote:
>> EXTERNAL EMAIL: Do not click links or open attachments unless you
>> know the content is safe
>>
>> Hi Steen,
>>
>> On 10/02/21 2:22 pm, Steen Hegelund wrote:
>>> Provide new phy configuration interfaces for media type and speed
>>> that
>>> allows allows e.g. PHYs used for ethernet to be configured with
>>> this
>>> information.
>>>
>>> Signed-off-by: Lars Povlsen 
>>> Signed-off-by: Steen Hegelund 
>>> Reviewed-by: Andrew Lunn 
>>> Reviewed-by: Alexandre Belloni 
>>> ---
>>>  drivers/phy/phy-core.c  | 30 ++
>>>  include/linux/phy/phy.h | 26 ++
>>>  2 files changed, 56 insertions(+)
>>>
>>> diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c
>>> index 71cb10826326..ccb575b13777 100644
>>> --- a/drivers/phy/phy-core.c
>>> +++ b/drivers/phy/phy-core.c
>>> @@ -373,6 +373,36 @@ int phy_set_mode_ext(struct phy *phy, enum
>>> phy_mode mode, int submode)
>>>  }
>>>  EXPORT_SYMBOL_GPL(phy_set_mode_ext);
>>>
>>> +int phy_set_media(struct phy *phy, enum phy_media media)
>>> +{
>>> + int ret;
>>> +
>>> + if (!phy || !phy->ops->set_media)
>>> + return 0;
>>> +
>>> + mutex_lock(&phy->mutex);
>>> + ret = phy->ops->set_media(phy, media);
>>> + mutex_unlock(&phy->mutex);
>>> +
>>> + return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(phy_set_media);
>>> +
>>> +int phy_set_speed(struct phy *phy, int speed)
>>> +{
>>> + int ret;
>>> +
>>> + if (!phy || !phy->ops->set_speed)
>>> + return 0;
>>> +
>>> + mutex_lock(&phy->mutex);
>>> + ret = phy->ops->set_speed(phy, speed);
>>> + mutex_unlock(&phy->mutex);
>>> +
>>> + return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(phy_set_speed);
>>
>> Can't speed derived from mode? Do we need a separate set_speed
>> function?
>>
>> Thanks
>> Kishon
> 
> Yes the client will need to be able to choose the speed as needed: 
> e.g. lower than the serdes mode supports, in case the the media or the
> other end is not capable of running that speed.  
> 
> An example is a 10G and 25G serdes connected via DAC and as there is no
> inband autoneg, the 25G client would have to manually select 10G speed
> to communicate with its partner.

Okay. Is it going to be some sort of manual negotiation where the
Ethernet controller invokes set_speed with different speeds? Or the
Ethernet controller will get the speed using some out of band mechanism
and invokes set_speed once with the actual speed?

Thanks
Kishon
> 
>>
>>> +
>>>  int phy_reset(struct phy *phy)
>>>  {
>>>   int ret;
>>> diff --git a/include/linux/phy/phy.h b/include/linux/phy/phy.h
>>> index e435bdb0bab3..e4fd69a1faa7 100644
>>> --- a/include/linux/phy/phy.h
>>> +++ b/include/linux/phy/phy.h
>>> @@ -44,6 +44,12 @@ enum phy_mode {
>>>   PHY_MODE_DP
>>>  };
>>>
>>> +enum phy_media {
>>> + PHY_MEDIA_DEFAULT,
>>> + PHY_MEDIA_SR,
>>> + PHY_MEDIA_DAC,
>>> +};
>>> +
>>>  /**
>>>   * union phy_configure_opts - Opaque generic phy configuration
>>>   *
>>> @@ -64,6 +70,8 @@ union phy_configure_opts {
>>>   * @power_on: powering on the phy
>>>   * @power_off: powering off the phy
>>>   * @set_mode: set the mode of the phy
>>> + * @set_media: set the media type of the phy (optional)
>>> + * @set_speed: set the speed of the phy (optional)
>>>   * @reset: resetting the phy
>>>   * @calibrate: calibrate the phy
>>>   * @release: ops to be performed while the consumer relinquishes
>>> the PHY
>>> @@ -75,6 +83,8 @@ struct phy_ops {
>>>   int (*power_on)(struct phy *phy);
>>>   int (*power_off)(struct phy *phy);
>>>   int (*set_mode)(struct phy *phy, enum phy_mode mode, int
>>> submode);
>>> + int (*set_media)(struct phy *phy, enum phy_media media);
>>> + int (*set_speed)(struct phy *phy, int speed);
>>>
>>>   /**
>>>    * @configure:
>>> @@ -215,6 +225,8 @@ int phy_power_off(struct phy *phy);
>>>  int phy_set_mode_ext(struct phy *phy, enum phy_mode mode, int
>>> submode);
>>>  #define phy_set_mode(phy, mode) \
>>>   phy_set_mode_ext(phy, mode, 0)
>>> +int phy_set_media(struct phy *phy, enum phy_media media);
>>> +int phy_set_speed(struct phy *phy, int speed);
>>>  int phy_configure(struct phy *phy, union phy_configure_opts
>>> *opts);
>>>  int phy_validate(struct phy *phy, enum phy_mode mode, int submode,
>>>    union phy_configure_opts *opts);
>>> @@ -344,6 +356,20 @@ static inline int phy_set_mode_ext(struct phy
>>> *phy, enum phy_mode mode,
>>>  #define phy_set_mode(phy, mode) \
>>>   phy_set_mode_ext(phy, mode, 0)
>>>
>>> +static inline int phy_set_media(struct phy *phy, enum phy_media
>>> media)
>>> +{
>>> + if (!phy)
>>> + return 0;
>>> + return -ENOSYS;
>>> +}
>>> +
>>> +static inline int phy_set_speed(struct phy *phy, int speed)
>>> +{
>>> + if (!phy)
>>> + return 0;
>>> + return

Re: [net-next] tcp: Sanitize CMSG flags and reserved args in tcp_zerocopy_receive.

2021-02-15 Thread Dan Carpenter
Hi Arjun,

url:
https://github.com/0day-ci/linux/commits/Arjun-Roy/tcp-Sanitize-CMSG-flags-and-reserved-args-in-tcp_zerocopy_receive/20210212-052537
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 
e4b62cf7559f2ef9a022de235e5a09a8d7ded520
config: x86_64-randconfig-m001-20210209 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 
Reported-by: Dan Carpenter 

smatch warnings:
net/ipv4/tcp.c:4158 do_tcp_getsockopt() warn: check for integer overflow 'len'

vim +/len +4158 net/ipv4/tcp.c

3fdadf7d27e3fb Dmitry Mishin2006-03-20  3896  static int 
do_tcp_getsockopt(struct sock *sk, int level,
3fdadf7d27e3fb Dmitry Mishin2006-03-20  3897int 
optname, char __user *optval, int __user *optlen)
^1da177e4c3f41 Linus Torvalds   2005-04-16  3898  {
295f7324ff8d9e Arnaldo Carvalho de Melo 2005-08-09  3899struct 
inet_connection_sock *icsk = inet_csk(sk);
^1da177e4c3f41 Linus Torvalds   2005-04-16  3900struct tcp_sock 
*tp = tcp_sk(sk);
6fa251663069e0 Nikolay Borisov  2016-02-03  3901struct net *net 
= sock_net(sk);
^1da177e4c3f41 Linus Torvalds   2005-04-16  3902int val, len;

"len" is int.

[ snip ]
05255b823a6173 Eric Dumazet 2018-04-27  4146  #ifdef CONFIG_MMU
05255b823a6173 Eric Dumazet 2018-04-27  4147case 
TCP_ZEROCOPY_RECEIVE: {
7eeba1706eba6d Arjun Roy2021-01-20  4148struct 
scm_timestamping_internal tss;
e0fecb289ad3fd Arjun Roy2020-12-10  4149struct 
tcp_zerocopy_receive zc = {};
05255b823a6173 Eric Dumazet 2018-04-27  4150int err;
05255b823a6173 Eric Dumazet 2018-04-27  4151  
05255b823a6173 Eric Dumazet 2018-04-27  4152if 
(get_user(len, optlen))
05255b823a6173 Eric Dumazet 2018-04-27  4153
return -EFAULT;
c8856c05145490 Arjun Roy2020-02-14  4154if (len 
< offsetofend(struct tcp_zerocopy_receive, length))
05255b823a6173 Eric Dumazet 2018-04-27  4155
return -EINVAL;


The problem is that negative values of "len" are type promoted to high
positive values.  So the fix is to write this as:

if (len < 0 || len < offsetofend(struct tcp_zerocopy_receive, length))
return -EINVAL;

110912bdf28392 Arjun Roy2021-02-11  4156if 
(unlikely(len > sizeof(zc))) {
110912bdf28392 Arjun Roy2021-02-11  4157
err = check_zeroed_user(optval + sizeof(zc),
110912bdf28392 Arjun Roy2021-02-11 @4158
len - sizeof(zc));


Potentially "len - a negative value".


110912bdf28392 Arjun Roy2021-02-11  4159
if (err < 1)
110912bdf28392 Arjun Roy2021-02-11  4160
return err == 0 ? -EINVAL : err;
c8856c05145490 Arjun Roy2020-02-14  4161
len = sizeof(zc);
0b7f41f68710cc Arjun Roy2020-02-25  4162
if (put_user(len, optlen))
0b7f41f68710cc Arjun Roy2020-02-25  4163
return -EFAULT;
0b7f41f68710cc Arjun Roy2020-02-25  4164}

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


[PATCH] b43: N-PHY: Fix the update of coef for the PHY revision >= 3case

2021-02-15 Thread Colin King
From: Colin Ian King 

The documentation for the PHY update [1] states:

Loop 4 times with index i

If PHY Revision >= 3
Copy table[i] to coef[i]
Otherwise
Set coef[i] to 0

the copy of the table to coef is currently implemented the wrong way
around, table is being updated from uninitialized values in coeff.
Fix this by swapping the assignment around.

[1] https://bcm-v4.sipsolutions.net/802.11/PHY/N/RestoreCal/

Fixes: 2f258b74d13c ("b43: N-PHY: implement restoring general configuration")
Addresses-Coverity: ("Uninitialized scalar variable")
Signed-off-by: Colin Ian King 
---
 drivers/net/wireless/broadcom/b43/phy_n.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/broadcom/b43/phy_n.c 
b/drivers/net/wireless/broadcom/b43/phy_n.c
index b669dff24b6e..665b737fbb0d 100644
--- a/drivers/net/wireless/broadcom/b43/phy_n.c
+++ b/drivers/net/wireless/broadcom/b43/phy_n.c
@@ -5311,7 +5311,7 @@ static void b43_nphy_restore_cal(struct b43_wldev *dev)
 
for (i = 0; i < 4; i++) {
if (dev->phy.rev >= 3)
-   table[i] = coef[i];
+   coef[i] = table[i];
else
coef[i] = 0;
}
-- 
2.30.0



Re: [PATCH iproute2 5/6] man8/bridge.8: explain self vs master for "bridge fdb add"

2021-02-15 Thread Vladimir Oltean
On Mon, Feb 15, 2021 at 11:53:42AM +0100, Alexandra Winter wrote:
> Actually, I found your first (more verbose) proposal more helpful.

Sorry, I don't understand. Do you want me to copy the whole explanation
from bridge fdb add to bridge link set?

> >> Maybe I misunderstand this sentence, but I can do a 'bridge fdb add' 
> >> without 'self'
> >> on the bridge device. And the address shows up under 'bridge fdb show'.
> >> So what does mandatory mean here?
> >
> > It's right in the next sentence:
> >
> >> The flag is set by default if "master" is not specified.
> >
> > It's mandatory and implicit if "master" is not specified, ergo 'bridge
> > fdb add dev br0' will work because 'master' is not specified (it is
> > implicitly 'bridge fdb add dev br0 self'. But 'bridge fdb add dev br0
> > master' will fail, because the 'self' flag is no longer implicit (since
> > 'master' was specified) but mandatory and absent.
> >
> > I'm not sure what I can do to improve this.
> >
> Maybe the sentence under 'master':
> " If the specified
> +device is a master itself, such as a bridge, this flag is invalid."
> is sufficient to defien this situation. And no need to explain mandatory 
> implicit defaults
> in the first paragraph?

I don't understand this either. Could you paste here how you think this
paragraph should read?


Re: [PATCH 2/4] net: stmmac: Add Toshiba Visconti SoCs glue driver

2021-02-15 Thread Arnd Bergmann
On Mon, Feb 15, 2021 at 10:23 AM Leon Romanovsky  wrote:
> On Mon, Feb 15, 2021 at 04:28:09PM +0900, Nobuhiro Iwamatsu wrote:
> >
> > Sorry, I sent the wrong patchset that didn't fix this point out.
> >
> > > I asked it before, but never received an answer.
> >
> > I have received your point out and have sent an email with the content
> > to remove this line. But it may not have arrived yet...
> >
> > > Why did you use "def_bool y" and not "default y"? Isn't it supposed to be
> > > "depends on STMMAC_ETH"? And probably it shouldn't be set as a default as 
> > > "y".
> > >
> >
> > The reason why "def_bool y" was set is that the wrong fix was left when
> > debugging. Also, I don't think it is necessary to set "default y".
> > This is also incorrect because it says "bool" Toshiba Visconti DWMAC
> > support "". I change it to trustate in the new patch.
> >
> > And this driver is enabled when STMMAC_PLATFORM was Y. And STMMAC_PLATFORM
> > depends on STMMAC_ETH.
> > So I understand that STMMAC_ETH does not need to be dependents. Is this
> > understanding wrong?
>
> This is correct understanding, just need to clean other entries in that
> Kconfig that depends on STMMAC_ETH.

'tristate' with no default sounds right. I see that some platforms have a
default according to the platform, which also makes sense but isn't
required. What I would suggest though is a dependency on the platform,
to make it easier to disable the front-end based on which platforms
are enabled. This would end up as

config DWMAC_VISCONTI
tristate "Toshiba Visconti DWMAC support"
depends on ARCH_VISCONTI || COMPILE_TEST
depends on OF && COMMON_CLK # only add this line if it's
required for compilation
default ARCH_VISCONTI

  Arnd


AW: HSR/PRP sequence counter issue with Cisco Redbox

2021-02-15 Thread Wenzel, Marco
> On Wed, Jan 27, 2021 at 6:32 AM Wenzel, Marco  eberle.de> wrote:
> >
> > Hi,
> >
> > we have figured out an issue with the current PRP driver when trying to
> communicate with Cisco IE 2000 industrial Ethernet switches in Redbox
> mode. The Cisco always resets the HSR/PRP sequence counter to "1" at low
> traffic (<= 1 frame in 400 ms). It can be reproduced by a simple ICMP echo
> request with 1 s interval between a Linux box running with PRP and a VDAN
> behind the Cisco Redbox. The Linux box then always receives frames with
> sequence counter "1" and drops them. The behavior is not configurable at
> the Cisco Redbox.
> >
> > I fixed it by ignoring sequence counters with value "1" at the sequence
> counter check in hsr_register_frame_out ():
> >
> > diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c index
> > 5c97de459905..630c238e81f0 100644
> > --- a/net/hsr/hsr_framereg.c
> > +++ b/net/hsr/hsr_framereg.c
> > @@ -411,7 +411,7 @@ void hsr_register_frame_in(struct hsr_node *node,
> > struct hsr_port *port,  int hsr_register_frame_out(struct hsr_port *port,
> struct hsr_node *node,
> >u16 sequence_nr)  {
> > -   if (seq_nr_before_or_eq(sequence_nr, node->seq_out[port->type]))
> > +   if (seq_nr_before_or_eq(sequence_nr,
> > + node->seq_out[port->type]) && (sequence_nr != 1))
> > return 1;
> >
> > node->seq_out[port->type] = sequence_nr;
> >
> >
> > Do you think this could be a solution? Should this patch be officially 
> > applied
> in order to avoid other users running into these communication issues?
> 
> This isn't the correct way to solve the problem. IEC 62439-3 defines
> EntryForgetTime as "Time after which an entry is removed from the duplicate
> table" with a value of 400ms and states devices should usually be configured
> to keep entries in the table for a much shorter time. hsr_framereg.c needs to
> be reworked to handle this according to the specification.

Sorry for the delay but I did not have the time to take a closer look at the 
problem until now. 

My suggestion for the EntryForgetTime feature would be the following: A 
time_out element will be added to the hsr_node structure, which always stores 
the current time when entering hsr_register_frame_out(). If the last stored 
time is older than EntryForgetTime (400 ms) the sequence number check will be 
ignored.

diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c
index 5c97de459905..a97bffbd2581 100644
--- a/net/hsr/hsr_framereg.c
+++ b/net/hsr/hsr_framereg.c
@@ -164,8 +164,10 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
 * as initialization. (0 could trigger an spurious ring error warning).
 */
now = jiffies;
-   for (i = 0; i < HSR_PT_PORTS; i++)
+   for (i = 0; i < HSR_PT_PORTS; i++) {
new_node->time_in[i] = now;
+   new_node->time_out[i] = now;
+   }
for (i = 0; i < HSR_PT_PORTS; i++)
new_node->seq_out[i] = seq_out;
 
@@ -411,9 +413,12 @@ void hsr_register_frame_in(struct hsr_node *node, struct 
hsr_port *port,
 int hsr_register_frame_out(struct hsr_port *port, struct hsr_node *node,
   u16 sequence_nr)
 {
-   if (seq_nr_before_or_eq(sequence_nr, node->seq_out[port->type]))
+   if (seq_nr_before_or_eq(sequence_nr, node->seq_out[port->type]) &&
+time_is_after_jiffies(node->time_out[port->type] + 
msecs_to_jiffies(HSR_ENTRY_FORGET_TIME))) {
return 1;
+   }
 
+   node->time_out[port->type] = jiffies;
node->seq_out[port->type] = sequence_nr;
return 0;
 }
diff --git a/net/hsr/hsr_framereg.h b/net/hsr/hsr_framereg.h
index 86b43f539f2c..d9628e7a5f05 100644
--- a/net/hsr/hsr_framereg.h
+++ b/net/hsr/hsr_framereg.h
@@ -75,6 +75,7 @@ struct hsr_node {
enum hsr_port_type  addr_B_port;
unsigned long   time_in[HSR_PT_PORTS];
booltime_in_stale[HSR_PT_PORTS];
+   unsigned long   time_out[HSR_PT_PORTS];
/* if the node is a SAN */
boolsan_a;
boolsan_b;
diff --git a/net/hsr/hsr_main.h b/net/hsr/hsr_main.h
index 7dc92ce5a134..f79ca55d6986 100644
--- a/net/hsr/hsr_main.h
+++ b/net/hsr/hsr_main.h
@@ -21,6 +21,7 @@
 #define HSR_LIFE_CHECK_INTERVAL 2000 /* ms */
 #define HSR_NODE_FORGET_TIME   6 /* ms */
 #define HSR_ANNOUNCE_INTERVAL100 /* ms */
+#define HSR_ENTRY_FORGET_TIME400 /* ms */
 
 /* By how much may slave1 and slave2 timestamps of latest received frame from
  * each node differ before we notify of communication problem?


This approach works fine with the Cisco IE 2000 and I think it implements the 
correct way to handle sequence numbers as defined in IEC 62439-3.

Regards,
Marco Wenzel

> >
> > Thanks
> > Marco Wenzel
> 
> Regards,
> George McCollister


Re: [PATCH iproute2 5/6] man8/bridge.8: explain self vs master for "bridge fdb add"

2021-02-15 Thread Alexandra Winter



On 15.02.21 13:13, Vladimir Oltean wrote:
> On Mon, Feb 15, 2021 at 11:53:42AM +0100, Alexandra Winter wrote:
>> Actually, I found your first (more verbose) proposal more helpful.
> 
> Sorry, I don't understand. Do you want me to copy the whole explanation
> from bridge fdb add to bridge link set?
> 
 Maybe I misunderstand this sentence, but I can do a 'bridge fdb add' 
 without 'self'
 on the bridge device. And the address shows up under 'bridge fdb show'.
 So what does mandatory mean here?
>>>
>>> It's right in the next sentence:
>>>
 The flag is set by default if "master" is not specified.
>>>
>>> It's mandatory and implicit if "master" is not specified, ergo 'bridge
>>> fdb add dev br0' will work because 'master' is not specified (it is
>>> implicitly 'bridge fdb add dev br0 self'. But 'bridge fdb add dev br0
>>> master' will fail, because the 'self' flag is no longer implicit (since
>>> 'master' was specified) but mandatory and absent.
>>>
>>> I'm not sure what I can do to improve this.
>>>
>> Maybe the sentence under 'master':
>> " If the specified
>> +device is a master itself, such as a bridge, this flag is invalid."
>> is sufficient to defien this situation. And no need to explain mandatory 
>> implicit defaults
>> in the first paragraph?
> 
> I don't understand this either. Could you paste here how you think this
> paragraph should read?
> 
Sorry, I did not mean to cause confusion. Your original proposal:
 .B self
-- the address is associated with the port drivers fdb. Usually hardware
-  (default).
+- the operation is fulfilled directly by the driver for the specified network
+device. If the network device belongs to a master like a bridge, then the
+bridge is bypassed and not notified of this operation (and if the device does
+notify the bridge, it is driver-specific behavior and not mandated by this
+flag, check the driver for more details). The "bridge fdb add" command can also
+be used on the bridge device itself, and in this case, the added fdb entries
+will be locally terminated (not forwarded). In the latter case, the "self" flag
+is mandatory. The flag is set by default if "master" is not specified.
 .sp
 
 .B master
-- the address is associated with master devices fdb. Usually software.
+- if the specified network device is a port that belongs to a master device
+such as a bridge, the operation is fulfilled by the master device's driver,
+which may in turn notify the port driver too of the address. If the specified
+device is a master itself, such as a bridge, this flag is invalid.
 .sp


The above is fine with me and IMHO much better than it is today.
But if you ask me I would change it to:

 .B self
- the operation is fulfilled directly by the driver for the specified physical 
device. 
If the network device belongs to a master like a bridge, then the
bridge is bypassed and not notified of this operation (and if the device does
notify the bridge, it is driver-specific behavior and not mandated by this
flag, check the driver for more details). The "bridge fdb add" command can also
be used on the bridge device itself, and in this case, the added fdb entries
will be locally terminated (not forwarded). The flag is set by default if 
"master" 
is not specified.
 .sp
 
 .B master
- if the specified network device is a port that belongs to a master device
such as a software bridge, the operation is fulfilled by the master device's 
driver,
which may in turn notify the port driver too of the address. If the specified
device is a master itself, such as a bridge, this flag is invalid.
 .sp




Re: [net-next PATCH v5 15/15] net: dpaa2-mac: Add ACPI support for DPAA2 MAC driver

2021-02-15 Thread Calvin Johnson
On Mon, Feb 08, 2021 at 04:28:31PM +, Russell King - ARM Linux admin wrote:
> On Mon, Feb 08, 2021 at 08:42:44PM +0530, Calvin Johnson wrote:
> > Modify dpaa2_mac_connect() to support ACPI along with DT.
> > Modify dpaa2_mac_get_node() to get the dpmac fwnode from either
> > DT or ACPI.
> > 
> > Replace of_get_phy_mode with fwnode_get_phy_mode to get
> > phy-mode for a dpmac_node.
> > 
> > Use helper function phylink_fwnode_phy_connect() to find phy_dev and
> > connect to mac->phylink.
> > 
> > Signed-off-by: Calvin Johnson 
> 
> I don't think this does the full job.
> 
> >  static int dpaa2_pcs_create(struct dpaa2_mac *mac,
> > -   struct device_node *dpmac_node, int id)
> > +   struct fwnode_handle *dpmac_node,
> > +   int id)
> >  {
> > struct mdio_device *mdiodev;
> > -   struct device_node *node;
> > +   struct fwnode_handle *node;
> >  
> > -   node = of_parse_phandle(dpmac_node, "pcs-handle", 0);
> > -   if (!node) {
> > +   node = fwnode_find_reference(dpmac_node, "pcs-handle", 0);
> > +   if (IS_ERR(node)) {
> > /* do not error out on old DTS files */
> > netdev_warn(mac->net_dev, "pcs-handle node not found\n");
> > return 0;
> > }
> >  
> > -   if (!of_device_is_available(node)) {
> > +   if (!of_device_is_available(to_of_node(node))) {
> 
> If "node" is an ACPI node, then to_of_node() returns NULL, and
> of_device_is_available(NULL) is false. So, if we're using ACPI
> and we enter this path, we will always hit the error below:
> 
> > netdev_err(mac->net_dev, "pcs-handle node not available\n");
> > -   of_node_put(node);
> > +   of_node_put(to_of_node(node));
> > return -ENODEV;
> > }
> 
> > @@ -306,7 +321,7 @@ int dpaa2_mac_connect(struct dpaa2_mac *mac)
> >  * error out if the interface mode requests them and there is no PHY
> >  * to act upon them
> >  */
> > -   if (of_phy_is_fixed_link(dpmac_node) &&
> > +   if (of_phy_is_fixed_link(to_of_node(dpmac_node)) &&
> 
> If "dpmac_node" is an ACPI node, to_of_node() will return NULL, and
> of_phy_is_fixed_link() will oops.

I think of_phy_is_fixed_link() needs to be fixed. I'll add below fix.

--- a/drivers/net/mdio/of_mdio.c
+++ b/drivers/net/mdio/of_mdio.c
@@ -439,6 +439,9 @@ bool of_phy_is_fixed_link(struct device_node *np)
int len, err;
const char *managed;

+   if (!np)
+   return false;
+
/* New binding */
dn = of_get_child_by_name(np, "fixed-link");
if (dn) {

Regards
Calvin


Re: [PATCH net-next 09/12] net: dsa: tag_ocelot: create separate tagger for Seville

2021-02-15 Thread Dan Carpenter
Hi Vladimir,

url:
https://github.com/0day-ci/linux/commits/Vladimir-Oltean/PTP-for-DSA-tag_ocelot_8021q/20210213-081857
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 
3c5a2fd042d0bfac71a2dfb99515723d318df47b
config: i386-randconfig-m031-20210209 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 
Reported-by: Dan Carpenter 

smatch warnings:
net/dsa/tag_ocelot.c:59 ocelot_xmit() warn: should '(((1))) << (dp->index)' be 
a 64 bit type?
net/dsa/tag_ocelot.c:71 seville_xmit() warn: should '(((1))) << (dp->index)' be 
a 64 bit type?

vim +59 net/dsa/tag_ocelot.c

9d88a16c0fc930 Vladimir Oltean 2021-02-13  52  static struct sk_buff 
*ocelot_xmit(struct sk_buff *skb,
9d88a16c0fc930 Vladimir Oltean 2021-02-13  53  
struct net_device *netdev)
9d88a16c0fc930 Vladimir Oltean 2021-02-13  54  {
9d88a16c0fc930 Vladimir Oltean 2021-02-13  55   struct dsa_port *dp = 
dsa_slave_to_port(netdev);
9d88a16c0fc930 Vladimir Oltean 2021-02-13  56   void *injection;
9d88a16c0fc930 Vladimir Oltean 2021-02-13  57  
9d88a16c0fc930 Vladimir Oltean 2021-02-13  58   ocelot_xmit_common(skb, netdev, 
cpu_to_be32(0x888a), &injection);
9d88a16c0fc930 Vladimir Oltean 2021-02-13 @59   ocelot_ifh_set_dest(injection, 
BIT(dp->index));

db->index is less than db->num_ports which 32 or less but sometimes it
comes from the device tree so who knows.  The ocelot_ifh_set_dest()
function takes a u64 though and that suggests that BIT() should be
changed to BIT_ULL().

9d88a16c0fc930 Vladimir Oltean 2021-02-13  60  
9d88a16c0fc930 Vladimir Oltean 2021-02-13  61   return skb;
9d88a16c0fc930 Vladimir Oltean 2021-02-13  62  }
9d88a16c0fc930 Vladimir Oltean 2021-02-13  63  
9d88a16c0fc930 Vladimir Oltean 2021-02-13  64  static struct sk_buff 
*seville_xmit(struct sk_buff *skb,
9d88a16c0fc930 Vladimir Oltean 2021-02-13  65   
struct net_device *netdev)
9d88a16c0fc930 Vladimir Oltean 2021-02-13  66  {
9d88a16c0fc930 Vladimir Oltean 2021-02-13  67   struct dsa_port *dp = 
dsa_slave_to_port(netdev);
9d88a16c0fc930 Vladimir Oltean 2021-02-13  68   void *injection;
9d88a16c0fc930 Vladimir Oltean 2021-02-13  69  
9d88a16c0fc930 Vladimir Oltean 2021-02-13  70   ocelot_xmit_common(skb, netdev, 
cpu_to_be32(0x8885), &injection);
9d88a16c0fc930 Vladimir Oltean 2021-02-13 @71   seville_ifh_set_dest(injection, 
BIT(dp->index));

Same.

9d88a16c0fc930 Vladimir Oltean 2021-02-13  72  
8dce89aa5f3274 Vladimir Oltean 2019-11-14  73   return skb;
8dce89aa5f3274 Vladimir Oltean 2019-11-14  74  }

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [PATCH 1/2] lockdep: add lockdep_assert_not_held()

2021-02-15 Thread Johannes Berg
On Mon, 2021-02-15 at 11:44 +0100, Peter Zijlstra wrote:
> 
> I think something like so will work, but please double check.

Yeah, that looks better.

> +++ b/include/linux/lockdep.h
> @@ -294,11 +294,15 @@ extern void lock_unpin_lock(struct lockdep_map *lock, 
> struct pin_cookie);
>  
>  #define lockdep_depth(tsk)   (debug_locks ? (tsk)->lockdep_depth : 0)
>  
> -#define lockdep_assert_held(l)   do {\
> - WARN_ON(debug_locks && !lockdep_is_held(l));\
> +#define lockdep_assert_held(l)   do {
> \
> + WARN_ON(debug_locks && lockdep_is_held(l) == 0));   \
>   } while (0)

That doesn't really need to change? It's the same.

> -#define lockdep_assert_held_write(l) do {\
> +#define lockdep_assert_not_held(l)   do {\
> + WARN_ON(debug_locks && lockdep_is_held(l) == 1));   \
> + } while (0)
> +
> +#define lockdep_assert_held_write(l) do {\
>   WARN_ON(debug_locks && !lockdep_is_held_type(l, 0));\
>   } while (0)
>  
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index c1418b47f625..983ba206f7b2 100644
> --- a/kernel/locking/lockdep.
> +++ b/kernel/locking/lockdep.c
> @@ -5467,7 +5467,7 @@ noinstr int lock_is_held_type(const struct lockdep_map 
> *lock, int read)
>   int ret = 0;
>  
>   if (unlikely(!lockdep_enabled()))
> - return 1; /* avoid false negative lockdep_assert_held() */
> + return -1; /* avoid false negative lockdep_assert_held() */

Maybe add lockdep_assert_not_held() to the comment, to explain the -1
(vs non-zero)?

johannes



Re: [PATCH net v1 1/3] net: phy: mscc: adding LCPLL reset to VSC8514

2021-02-15 Thread Bjarni.Jonasson
On Fri, 2021-02-12 at 16:20 +0100, Andrew Lunn wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you
> know the content is safe
> 
> On Fri, Feb 12, 2021 at 03:06:41PM +0100, Bjarni Jonasson wrote:
> > +static u32 vsc85xx_csr_read(struct phy_device *phydev,
> > + enum csr_target target, u32 reg);
> > +static int vsc85xx_csr_write(struct phy_device *phydev,
> > +  enum csr_target target, u32 reg, u32
> > val);
> > +
> 
> Hi Bjarni
> 
> No forward definitions please. Move the code around so they are not
> required. Sometimes it is best to do such a move as a preparation
> patch.

Sure, I will remove them.

> 
> > @@ -1569,8 +1664,16 @@ static int vsc8514_config_pre_init(struct
> > phy_device *phydev)
> >   {0x16b2, 0x7000},
> >   {0x16b4, 0x0814},
> >   };
> > + struct device *dev = &phydev->mdio.dev;
> >   unsigned int i;
> >   u16 reg;
> > + int ret;
> 
> Hard to say from the limited context, but is reverse christmass tree
> being preserved here?

I will dobbelcheck.

> 
>   Andrew


Re: [PATCH net-next 09/12] net: dsa: tag_ocelot: create separate tagger for Seville

2021-02-15 Thread Vladimir Oltean
Hi Dan,

On Mon, Feb 15, 2021 at 04:00:04PM +0300, Dan Carpenter wrote:
> db->index is less than db->num_ports which 32 or less but sometimes it
> comes from the device tree so who knows.

The destination port mask is copied into a 12-bit field of the packet,
starting at bit offset 67 and ending at 56:

static inline void ocelot_ifh_set_dest(void *injection, u64 dest)
{
packing(injection, &dest, 67, 56, OCELOT_TAG_LEN, PACK, 0);
}

So this DSA tagging protocol supports at most 12 bits, which is clearly
less than 32. Attempting to send to a port number > 12 will cause the
packing() call to truncate way before there will be 32-bit truncation
due to type promotion of the BIT(port) argument towards u64.

> The ocelot_ifh_set_dest() function takes a u64 though and that
> suggests that BIT() should be changed to BIT_ULL().

I understand that you want to silence the warning, which fundamentally
comes from the packing() API which works with u64 values and nothing of
a smaller size. So I can send a patch which replaces BIT(port) with
BIT_ULL(port), even if in practice both are equally fine.


Re: [PATCH net v1 1/3] net: phy: mscc: adding LCPLL reset to VSC8514

2021-02-15 Thread Bjarni.Jonasson
On Fri, 2021-02-12 at 16:54 +0100, Andrew Lunn wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you
> know the content is safe
> 
> On Fri, Feb 12, 2021 at 03:06:41PM +0100, Bjarni Jonasson wrote:
> > At Power-On Reset, transients may cause the LCPLL to lock onto a
> > clock that is momentarily unstable. This is normally seen in QSGMII
> > setups where the higher speed 6G SerDes is being used.
> > This patch adds an initial LCPLL Reset to the PHY (first instance)
> > to avoid this issue.
> 
> Hi Bjarni
> 
> 
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> 
> These patches are rather large for stable, and not obviously correct.
> 
> There these problems hitting real users running stable kernels? Or is
> it so broken it never really worked?
> 
>Andrew

Correct, the current linux driver is unstable and has never really
worked properly.  Our in-house SDK driver already have these fixes and
the upstreamed version has been lagging behind.  And yes the patches
are big, we had to pull out the calibration from the 8051 into the
driver.


[PATCH net-next] net: mscc: ocelot: avoid type promotion when calling ocelot_ifh_set_dest

2021-02-15 Thread Vladimir Oltean
From: Vladimir Oltean 

Smatch is confused by the fact that a 32-bit BIT(port) macro is passed
as argument to the ocelot_ifh_set_dest function and warns:

ocelot_xmit() warn: should '(((1))) << (dp->index)' be a 64 bit type?
seville_xmit() warn: should '(((1))) << (dp->index)' be a 64 bit type?

The destination port mask is copied into a 12-bit field of the packet,
starting at bit offset 67 and ending at 56.

So this DSA tagging protocol supports at most 12 bits, which is clearly
less than 32. Attempting to send to a port number > 12 will cause the
packing() call to truncate way before there will be 32-bit truncation
due to type promotion of the BIT(port) argument towards u64.

Therefore, smatch's fears that BIT(port) will do the wrong thing and
cause unexpected truncation for "port" values >= 32 are unfounded.
Nonetheless, let's silence the warning by explicitly passing an u64
value to ocelot_ifh_set_dest, such that the compiler does not need to do
a questionable type promotion.

Reported-by: kernel test robot 
Reported-by: Dan Carpenter 
Signed-off-by: Vladimir Oltean 
---
 drivers/net/ethernet/mscc/ocelot.c | 2 +-
 net/dsa/tag_ocelot.c   | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mscc/ocelot.c 
b/drivers/net/ethernet/mscc/ocelot.c
index 8d97c731e953..5d13087c85d6 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -803,7 +803,7 @@ void ocelot_port_inject_frame(struct ocelot *ocelot, int 
port, int grp,
 QS_INJ_CTRL_SOF, QS_INJ_CTRL, grp);
 
ocelot_ifh_set_bypass(ifh, 1);
-   ocelot_ifh_set_dest(ifh, BIT(port));
+   ocelot_ifh_set_dest(ifh, BIT_ULL(port));
ocelot_ifh_set_tag_type(ifh, IFH_TAG_TYPE_C);
ocelot_ifh_set_vid(ifh, skb_vlan_tag_get(skb));
ocelot_ifh_set_rew_op(ifh, rew_op);
diff --git a/net/dsa/tag_ocelot.c b/net/dsa/tag_ocelot.c
index a7dd61c8e005..f9df9cac81c5 100644
--- a/net/dsa/tag_ocelot.c
+++ b/net/dsa/tag_ocelot.c
@@ -56,7 +56,7 @@ static struct sk_buff *ocelot_xmit(struct sk_buff *skb,
void *injection;
 
ocelot_xmit_common(skb, netdev, cpu_to_be32(0x888a), &injection);
-   ocelot_ifh_set_dest(injection, BIT(dp->index));
+   ocelot_ifh_set_dest(injection, BIT_ULL(dp->index));
 
return skb;
 }
@@ -68,7 +68,7 @@ static struct sk_buff *seville_xmit(struct sk_buff *skb,
void *injection;
 
ocelot_xmit_common(skb, netdev, cpu_to_be32(0x8885), &injection);
-   seville_ifh_set_dest(injection, BIT(dp->index));
+   seville_ifh_set_dest(injection, BIT_ULL(dp->index));
 
return skb;
 }
-- 
2.25.1



Re: [PATCH net v1 3/3] net: phy: mscc: coma mode disabled for VSC8514

2021-02-15 Thread Bjarni.Jonasson
On Fri, 2021-02-12 at 16:32 +, Vladimir Oltean wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you
> know the content is safe
> 
> On Fri, Feb 12, 2021 at 03:06:43PM +0100, Bjarni Jonasson wrote:
> > The 'coma mode' (configurable through sw or hw) provides an
> > optional feature that may be used to control when the PHYs become
> > active.
> > The typical usage is to synchronize the link-up time across
> > all PHY instances. This patch releases coma mode if not done by
> > hardware,
> > otherwise the phys will not link-up.
> > 
> > Signed-off-by: Steen Hegelund 
> > Signed-off-by: Bjarni Jonasson 
> > Fixes: e4f9ba642f0b ("net: phy: mscc: add support for VSC8514
> > PHY.")
> > ---
> >  drivers/net/phy/mscc/mscc_main.c | 13 +
> >  1 file changed, 13 insertions(+)
> > 
> > diff --git a/drivers/net/phy/mscc/mscc_main.c
> > b/drivers/net/phy/mscc/mscc_main.c
> > index 7546d9cc3abd..0600b592618b 100644
> > --- a/drivers/net/phy/mscc/mscc_main.c
> > +++ b/drivers/net/phy/mscc/mscc_main.c
> > @@ -1418,6 +1418,18 @@ static void vsc8584_get_base_addr(struct
> > phy_device *phydev)
> >   vsc8531->addr = addr;
> >  }
> > 
> > +static void vsc85xx_coma_mode_release(struct phy_device *phydev)
> > +{
> > + /* The coma mode (pin or reg) provides an optional feature
> > that
> > +  * may be used to control when the PHYs become active.
> > +  * Alternatively the COMA_MODE pin may be connected low
> > +  * so that the PHYs are fully active once out of reset.
> > +  */
> > + __phy_write(phydev, MSCC_EXT_PAGE_ACCESS,
> > MSCC_PHY_PAGE_EXTENDED_GPIO);
> > + __phy_write(phydev, MSCC_PHY_GPIO_CONTROL_2, 0x0600);
> > + __phy_write(phydev, MSCC_EXT_PAGE_ACCESS,
> > MSCC_PHY_PAGE_STANDARD);
> 
> Can you please do:
> phy_write_paged(phydev, MSCC_PHY_PAGE_EXTENDED_GPIO,
> MSCC_PHY_GPIO_CONTROL_2, 0x0600);
> 
> And can you please provide some definitions for what 0x0600 is?
> My reference manual says that:
> 
> Bit 13:
> COMA_MODE output enable (active low)
> Bit 12:
> COMA_MODE output data
> Bit 11:
> COMA_MODE input data
> Bit 10:
> Reserved
> Bit 9:
> Tri-state enable for LEDs
> 
> 0x600 is BIT(10) | BIT(9). But BIT(10) is reserved. Sure this is
> correct?

I can see this is unclear.  The code is actualy writing zero to bit 12
and 13.  Bit 9 and 10 are not interesting in this context.  I will
change it to use phy_modify_paged() bit 12 and 13.

> 
> > +}
> > +
> >  static int vsc8584_config_init(struct phy_device *phydev)
> >  {
> >   struct vsc8531_private *vsc8531 = phydev->priv;
> > @@ -2610,6 +2622,7 @@ static int vsc8514_config_init(struct
> > phy_device *phydev)
> >   ret = vsc8514_config_host_serdes(phydev);
> >   if (ret)
> >   goto err;
> > + vsc85xx_coma_mode_release(phydev);
> >   }
> > 
> >   phy_unlock_mdio_bus(phydev);
> > --
> > 2.17.1
> > 


Re: [PATCH net v1 1/3] net: phy: mscc: adding LCPLL reset to VSC8514

2021-02-15 Thread Bjarni.Jonasson
On Fri, 2021-02-12 at 10:53 -0800, Jakub Kicinski wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you
> know the content is safe
> 
> On Fri, 12 Feb 2021 15:06:41 +0100 Bjarni Jonasson wrote:
> > At Power-On Reset, transients may cause the LCPLL to lock onto a
> > clock that is momentarily unstable. This is normally seen in QSGMII
> > setups where the higher speed 6G SerDes is being used.
> > This patch adds an initial LCPLL Reset to the PHY (first instance)
> > to avoid this issue.
> > 
> > Signed-off-by: Steen Hegelund 
> > Signed-off-by: Bjarni Jonasson 
> > Fixes: e4f9ba642f0b ("net: phy: mscc: add support for VSC8514
> > PHY.")
> 
> Please make sure each commit builds cleanly with W=1 C=1.
> 
> This one appears to not build at all?

Sorry about that, I will make sure.


Re: [PATCH] net: dsa: mv88e6xxx: prevent 2500BASEX mode override

2021-02-15 Thread Marek Behun
On Mon, 15 Feb 2021 06:15:59 +
Nathan Rossi  wrote:

> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c 
> b/drivers/net/dsa/mv88e6xxx/chip.c
> index 54aa942eed..5c52906b29 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -650,6 +650,13 @@ static void mv88e6xxx_validate(struct dsa_switch *ds, 
> int port,
>   if (chip->info->ops->phylink_validate)
>   chip->info->ops->phylink_validate(chip, port, mask, state);
>  
> + /* Advertise 2500BASEX only if 1000BASEX is not configured, this
> +  * prevents phylink_helper_basex_speed from always overriding the
> +  * 1000BASEX mode since auto negotiation is always enabled.
> +  */
> + if (state->interface == PHY_INTERFACE_MODE_1000BASEX)
> + phylink_clear(mask, 2500baseX_Full);
> +

I don't quite like this. This problem should be either solved in
phylink_helper_basex_speed() or somewhere in the mv88e6xxx code, but near
the call to phylink_helper_basex_speed().

Putting a solution to the behaviour of phylink_helper_basex_speed() it
into the validate() method when phylink_helper_basex_speed() is called
from a different place will complicate debugging in the future. If
we start solving problems in this kind of way, the driver will become
totally unreadable, IMO.

Marek


Re: [PATCH v14 2/4] phy: Add media type and speed serdes configuration interfaces

2021-02-15 Thread Andrew Lunn
On Mon, Feb 15, 2021 at 05:25:10PM +0530, Kishon Vijay Abraham I wrote:
> Okay. Is it going to be some sort of manual negotiation where the
> Ethernet controller invokes set_speed with different speeds? Or the
> Ethernet controller will get the speed using some out of band mechanism
> and invokes set_speed once with the actual speed?

Hi Kishon

There are a few different mechanism possible.

The SFP has an EEPROM which contains lots of parameters. One is the
maximum baud rate the module supports. PHYLINK will combine this
information with the MAC capabilities to determine the default speed.

The users can select the mode the MAC works in, e.g. 1000BaseX vs
2500BaseX, via ethtool -s. Different modes needs different speeds.

Some copper PHYs will change there host side interface baud rate when
the media side interface changes mode. 10GBASE-X for 10G copper,
5GBase-X for 5G COPPER, 2500Base-X for 2.5G copper, and SGMII for
old school 10/100/1G Ethernet.

Mainline Linux has no support for it, but some 'vendor crap' will do a
manual negotiation, simply trying different speeds and see if the
SERDES establishes link. There is nothing standardised for this, as
far as i know.

Andrew


Re: [PATCH net-next 09/12] net: dsa: tag_ocelot: create separate tagger for Seville

2021-02-15 Thread Dan Carpenter
On Mon, Feb 15, 2021 at 03:19:31PM +0200, Vladimir Oltean wrote:
> Hi Dan,
> 
> On Mon, Feb 15, 2021 at 04:00:04PM +0300, Dan Carpenter wrote:
> > db->index is less than db->num_ports which 32 or less but sometimes it
> > comes from the device tree so who knows.
> 
> The destination port mask is copied into a 12-bit field of the packet,
> starting at bit offset 67 and ending at 56:
> 
> static inline void ocelot_ifh_set_dest(void *injection, u64 dest)
> {
>   packing(injection, &dest, 67, 56, OCELOT_TAG_LEN, PACK, 0);
> }
> 
> So this DSA tagging protocol supports at most 12 bits, which is clearly
> less than 32. Attempting to send to a port number > 12 will cause the
> packing() call to truncate way before there will be 32-bit truncation
> due to type promotion of the BIT(port) argument towards u64.
> 
> > The ocelot_ifh_set_dest() function takes a u64 though and that
> > suggests that BIT() should be changed to BIT_ULL().
> 
> I understand that you want to silence the warning, which fundamentally
> comes from the packing() API which works with u64 values and nothing of
> a smaller size. So I can send a patch which replaces BIT(port) with
> BIT_ULL(port), even if in practice both are equally fine.

I don't have a strong feeling about this...  Generally silencing
warnings just to make a checker happy is the wrong idea.

To be honest, I normally ignore these warnings.  But I have been looking
at them recently to try figure out if we could make it so it would only
generate a warning where "db->index" was known as possibly being in the
32-63 range.  So I looked at this one.

And now I see some ways that Smatch could have parsed this better...

regards,
dan carpenter



[PATCH v1 2/3] string: Move onoff() helper under string.h hood

2021-02-15 Thread Andy Shevchenko
We have already an implementation and a lot of code that can benefit
of the onoff() helper. Move it under string.h hood.

Signed-off-by: Andy Shevchenko 
---
 drivers/gpu/drm/i915/i915_utils.h | 5 -
 include/linux/string.h| 5 +
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_utils.h 
b/drivers/gpu/drm/i915/i915_utils.h
index e6da5a951132..d2ac357896d4 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -409,11 +409,6 @@ wait_remaining_ms_from_jiffies(unsigned long 
timestamp_jiffies, int to_wait_ms)
 #define MBps(x) KBps(1000 * (x))
 #define GBps(x) ((u64)1000 * MBps((x)))
 
-static inline const char *onoff(bool v)
-{
-   return v ? "on" : "off";
-}
-
 static inline const char *enableddisabled(bool v)
 {
return v ? "enabled" : "disabled";
diff --git a/include/linux/string.h b/include/linux/string.h
index fd946a5e18c8..2a0589e945d9 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -313,4 +313,9 @@ static inline const char *yesno(bool yes)
return yes ? "yes" : "no";
 }
 
+static inline const char *onoff(bool on)
+{
+   return on ? "on" : "off";
+}
+
 #endif /* _LINUX_STRING_H_ */
-- 
2.30.0



[PATCH v1 1/3] string: Consolidate yesno() helpers under string.h hood

2021-02-15 Thread Andy Shevchenko
We have already few similar implementation and a lot of code that can benefit
of the yesno() helper.  Consolidate yesno() helpers under string.h hood.

Signed-off-by: Andy Shevchenko 
---
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c|  6 +-
 drivers/gpu/drm/i915/i915_utils.h|  6 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c   | 12 +---
 include/linux/string.h   |  5 +
 4 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 360952129b6d..7fde4f90e513 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -23,6 +23,7 @@
  *
  */
 
+#include 
 #include 
 
 #include 
@@ -49,11 +50,6 @@ struct dmub_debugfs_trace_entry {
uint32_t param1;
 };
 
-static inline const char *yesno(bool v)
-{
-   return v ? "yes" : "no";
-}
-
 /* parse_write_buffer_into_params - Helper function to parse debugfs write 
buffer into an array
  *
  * Function takes in attributes passed to debugfs write entry
diff --git a/drivers/gpu/drm/i915/i915_utils.h 
b/drivers/gpu/drm/i915/i915_utils.h
index abd4dcd9f79c..e6da5a951132 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -27,6 +27,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -408,11 +409,6 @@ wait_remaining_ms_from_jiffies(unsigned long 
timestamp_jiffies, int to_wait_ms)
 #define MBps(x) KBps(1000 * (x))
 #define GBps(x) ((u64)1000 * MBps((x)))
 
-static inline const char *yesno(bool v)
-{
-   return v ? "yes" : "no";
-}
-
 static inline const char *onoff(bool v)
 {
return v ? "on" : "off";
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
index 7d49fd4edc9e..c857d73abbd7 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -34,6 +34,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -2015,17 +2016,6 @@ static const struct file_operations rss_debugfs_fops = {
 /* RSS Configuration.
  */
 
-/* Small utility function to return the strings "yes" or "no" if the supplied
- * argument is non-zero.
- */
-static const char *yesno(int x)
-{
-   static const char *yes = "yes";
-   static const char *no = "no";
-
-   return x ? yes : no;
-}
-
 static int rss_config_show(struct seq_file *seq, void *v)
 {
struct adapter *adapter = seq->private;
diff --git a/include/linux/string.h b/include/linux/string.h
index 9521d8cab18e..fd946a5e18c8 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -308,4 +308,9 @@ static __always_inline size_t str_has_prefix(const char 
*str, const char *prefix
return strncmp(str, prefix, len) == 0 ? len : 0;
 }
 
+static inline const char *yesno(bool yes)
+{
+   return yes ? "yes" : "no";
+}
+
 #endif /* _LINUX_STRING_H_ */
-- 
2.30.0



[PATCH v1 3/3] string: Move enableddisabled() helper under string.h hood

2021-02-15 Thread Andy Shevchenko
We have already an implementation and a lot of code that can benefit
of the enableddisabled() helper. Move it under string.h hood.

Signed-off-by: Andy Shevchenko 
---
 drivers/gpu/drm/i915/i915_utils.h | 5 -
 include/linux/string.h| 5 +
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_utils.h 
b/drivers/gpu/drm/i915/i915_utils.h
index d2ac357896d4..b05d72b4dd93 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -409,11 +409,6 @@ wait_remaining_ms_from_jiffies(unsigned long 
timestamp_jiffies, int to_wait_ms)
 #define MBps(x) KBps(1000 * (x))
 #define GBps(x) ((u64)1000 * MBps((x)))
 
-static inline const char *enableddisabled(bool v)
-{
-   return v ? "enabled" : "disabled";
-}
-
 void add_taint_for_CI(struct drm_i915_private *i915, unsigned int taint);
 static inline void __add_taint_for_CI(unsigned int taint)
 {
diff --git a/include/linux/string.h b/include/linux/string.h
index 2a0589e945d9..25f055aa4c31 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -318,4 +318,9 @@ static inline const char *onoff(bool on)
return on ? "on" : "off";
 }
 
+static inline const char *enableddisabled(bool enabled)
+{
+   return enabled ? "enabled" : "disabled";
+}
+
 #endif /* _LINUX_STRING_H_ */
-- 
2.30.0



Re: [PATCH v1 1/3] string: Consolidate yesno() helpers under string.h hood

2021-02-15 Thread Christian König

Am 15.02.21 um 15:21 schrieb Andy Shevchenko:

We have already few similar implementation and a lot of code that can benefit
of the yesno() helper.  Consolidate yesno() helpers under string.h hood.

Signed-off-by: Andy Shevchenko 


Looks like a good idea to me, feel free to add an Acked-by: Christian 
König  to the series.


But looking at the use cases for this, wouldn't it make more sense to 
teach kprintf some new format modifier for this?


Christian.


---
  .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c|  6 +-
  drivers/gpu/drm/i915/i915_utils.h|  6 +-
  drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c   | 12 +---
  include/linux/string.h   |  5 +
  4 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index 360952129b6d..7fde4f90e513 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -23,6 +23,7 @@
   *
   */
  
+#include 

  #include 
  
  #include 

@@ -49,11 +50,6 @@ struct dmub_debugfs_trace_entry {
uint32_t param1;
  };
  
-static inline const char *yesno(bool v)

-{
-   return v ? "yes" : "no";
-}
-
  /* parse_write_buffer_into_params - Helper function to parse debugfs write 
buffer into an array
   *
   * Function takes in attributes passed to debugfs write entry
diff --git a/drivers/gpu/drm/i915/i915_utils.h 
b/drivers/gpu/drm/i915/i915_utils.h
index abd4dcd9f79c..e6da5a951132 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -27,6 +27,7 @@
  
  #include 

  #include 
+#include 
  #include 
  #include 
  #include 
@@ -408,11 +409,6 @@ wait_remaining_ms_from_jiffies(unsigned long 
timestamp_jiffies, int to_wait_ms)
  #define MBps(x) KBps(1000 * (x))
  #define GBps(x) ((u64)1000 * MBps((x)))
  
-static inline const char *yesno(bool v)

-{
-   return v ? "yes" : "no";
-}
-
  static inline const char *onoff(bool v)
  {
return v ? "on" : "off";
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
index 7d49fd4edc9e..c857d73abbd7 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -34,6 +34,7 @@
  
  #include 

  #include 
+#include 
  #include 
  #include 
  #include 
@@ -2015,17 +2016,6 @@ static const struct file_operations rss_debugfs_fops = {
  /* RSS Configuration.
   */
  
-/* Small utility function to return the strings "yes" or "no" if the supplied

- * argument is non-zero.
- */
-static const char *yesno(int x)
-{
-   static const char *yes = "yes";
-   static const char *no = "no";
-
-   return x ? yes : no;
-}
-
  static int rss_config_show(struct seq_file *seq, void *v)
  {
struct adapter *adapter = seq->private;
diff --git a/include/linux/string.h b/include/linux/string.h
index 9521d8cab18e..fd946a5e18c8 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -308,4 +308,9 @@ static __always_inline size_t str_has_prefix(const char 
*str, const char *prefix
return strncmp(str, prefix, len) == 0 ? len : 0;
  }
  
+static inline const char *yesno(bool yes)

+{
+   return yes ? "yes" : "no";
+}
+
  #endif /* _LINUX_STRING_H_ */




commit 0f0aefd733f7 to linux-firmware effectively broke all of the setups with old kernels

2021-02-15 Thread Andy Shevchenko
Hi!

Seems the commit 0f0aefd733f7 to linux-firmware effectively broke all
of the setups with the old kernels. Firmware name is an ABI (!) and
replacing it like this will definitely break systems with older
kernels. Linux firmware package likely, but unfortunately, should
carry on both versions as long as it's needed. Alternative solution is
to provide the links during installation.

Btw, I haven't seen the driver change for that. Care to provide a
commit ID in upstream?

-- 
With Best Regards,
Andy Shevchenko


Re: [PATCH v1 1/3] string: Consolidate yesno() helpers under string.h hood

2021-02-15 Thread Jani Nikula
On Mon, 15 Feb 2021, Andy Shevchenko  wrote:
> We have already few similar implementation and a lot of code that can benefit
> of the yesno() helper.  Consolidate yesno() helpers under string.h hood.

Good luck. I gave up after just four versions. [1]

Acked-by: Jani Nikula 


BR,
Jani.


[1] http://lore.kernel.org/r/20191023131308.9420-1-jani.nik...@intel.com


>
> Signed-off-by: Andy Shevchenko 
> ---
>  .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c|  6 +-
>  drivers/gpu/drm/i915/i915_utils.h|  6 +-
>  drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c   | 12 +---
>  include/linux/string.h   |  5 +
>  4 files changed, 8 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> index 360952129b6d..7fde4f90e513 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> @@ -23,6 +23,7 @@
>   *
>   */
>  
> +#include 
>  #include 
>  
>  #include 
> @@ -49,11 +50,6 @@ struct dmub_debugfs_trace_entry {
>   uint32_t param1;
>  };
>  
> -static inline const char *yesno(bool v)
> -{
> - return v ? "yes" : "no";
> -}
> -
>  /* parse_write_buffer_into_params - Helper function to parse debugfs write 
> buffer into an array
>   *
>   * Function takes in attributes passed to debugfs write entry
> diff --git a/drivers/gpu/drm/i915/i915_utils.h 
> b/drivers/gpu/drm/i915/i915_utils.h
> index abd4dcd9f79c..e6da5a951132 100644
> --- a/drivers/gpu/drm/i915/i915_utils.h
> +++ b/drivers/gpu/drm/i915/i915_utils.h
> @@ -27,6 +27,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -408,11 +409,6 @@ wait_remaining_ms_from_jiffies(unsigned long 
> timestamp_jiffies, int to_wait_ms)
>  #define MBps(x) KBps(1000 * (x))
>  #define GBps(x) ((u64)1000 * MBps((x)))
>  
> -static inline const char *yesno(bool v)
> -{
> - return v ? "yes" : "no";
> -}
> -
>  static inline const char *onoff(bool v)
>  {
>   return v ? "on" : "off";
> diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c 
> b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
> index 7d49fd4edc9e..c857d73abbd7 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
> +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
> @@ -34,6 +34,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -2015,17 +2016,6 @@ static const struct file_operations rss_debugfs_fops = 
> {
>  /* RSS Configuration.
>   */
>  
> -/* Small utility function to return the strings "yes" or "no" if the supplied
> - * argument is non-zero.
> - */
> -static const char *yesno(int x)
> -{
> - static const char *yes = "yes";
> - static const char *no = "no";
> -
> - return x ? yes : no;
> -}
> -
>  static int rss_config_show(struct seq_file *seq, void *v)
>  {
>   struct adapter *adapter = seq->private;
> diff --git a/include/linux/string.h b/include/linux/string.h
> index 9521d8cab18e..fd946a5e18c8 100644
> --- a/include/linux/string.h
> +++ b/include/linux/string.h
> @@ -308,4 +308,9 @@ static __always_inline size_t str_has_prefix(const char 
> *str, const char *prefix
>   return strncmp(str, prefix, len) == 0 ? len : 0;
>  }
>  
> +static inline const char *yesno(bool yes)
> +{
> + return yes ? "yes" : "no";
> +}
> +
>  #endif /* _LINUX_STRING_H_ */

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [PATCH v1 1/3] string: Consolidate yesno() helpers under string.h hood

2021-02-15 Thread Andy Shevchenko
+Cc: Sakari and printk people

On Mon, Feb 15, 2021 at 4:28 PM Christian König
 wrote:
> Am 15.02.21 um 15:21 schrieb Andy Shevchenko:
> > We have already few similar implementation and a lot of code that can 
> > benefit
> > of the yesno() helper.  Consolidate yesno() helpers under string.h hood.
> >
> > Signed-off-by: Andy Shevchenko 
>
> Looks like a good idea to me, feel free to add an Acked-by: Christian
> König  to the series.

Thanks.

> But looking at the use cases for this, wouldn't it make more sense to
> teach kprintf some new format modifier for this?

As a next step? IIRC Sakari has at some point the series converted
yesno and Co. to something which I don't remember the details of.

Guys, what do you think?

-- 
With Best Regards,
Andy Shevchenko


Re: [PATCH net-next 09/12] net: dsa: tag_ocelot: create separate tagger for Seville

2021-02-15 Thread Vladimir Oltean
On Mon, Feb 15, 2021 at 05:15:21PM +0300, Dan Carpenter wrote:
> On Mon, Feb 15, 2021 at 03:19:31PM +0200, Vladimir Oltean wrote:
> > Hi Dan,
> >
> > On Mon, Feb 15, 2021 at 04:00:04PM +0300, Dan Carpenter wrote:
> > > db->index is less than db->num_ports which 32 or less but sometimes it
> > > comes from the device tree so who knows.
> >
> > The destination port mask is copied into a 12-bit field of the packet,
> > starting at bit offset 67 and ending at 56:
> >
> > static inline void ocelot_ifh_set_dest(void *injection, u64 dest)
> > {
> > packing(injection, &dest, 67, 56, OCELOT_TAG_LEN, PACK, 0);
> > }
> >
> > So this DSA tagging protocol supports at most 12 bits, which is clearly
> > less than 32. Attempting to send to a port number > 12 will cause the
> > packing() call to truncate way before there will be 32-bit truncation
> > due to type promotion of the BIT(port) argument towards u64.
> >
> > > The ocelot_ifh_set_dest() function takes a u64 though and that
> > > suggests that BIT() should be changed to BIT_ULL().
> >
> > I understand that you want to silence the warning, which fundamentally
> > comes from the packing() API which works with u64 values and nothing of
> > a smaller size. So I can send a patch which replaces BIT(port) with
> > BIT_ULL(port), even if in practice both are equally fine.
>
> I don't have a strong feeling about this...  Generally silencing
> warnings just to make a checker happy is the wrong idea.

In this case it is a harmless wrong idea.

> To be honest, I normally ignore these warnings.  But I have been looking
> at them recently to try figure out if we could make it so it would only
> generate a warning where "db->index" was known as possibly being in the
> 32-63 range.  So I looked at this one.
>
> And now I see some ways that Smatch could have parsed this better...

For DSA, the dp->index should be lower than ds->num_ports by
construction (see dsa_switch_touch_ports). In turn, ds->num_ports is set
to constant values smaller than 12 in felix_pci_probe() and in seville_probe().

For ocelot on the other hand, there is a restriction put in
mscc_ocelot_init_ports that the port must be <= ocelot->num_phys_ports,
which is set to "of_get_child_count(ports)". So there is indeed a
possible attack by device tree on the ocelot driver. The number of
physical ports does not depend on device tree
(arch/mips/boot/dts/mscc/ocelot.dtsi), but should be hardcoded to 11.
How many ports there are defined in DT doesn't change how many physical
ports there are. For example, the CPU port module is at index 11, and in
the code it is referenced as ocelot->ports[ocelot->num_phys_ports].
If num_phys_ports has any other value than 11, the driver malfunctions
because the index of the CPU port is misidentified. I'd rather fix this
if there was some way in which static analysis could then determine that
"port" is bounded by a constant smaller than the truncation threshold.

However, I'm not sure how to classify the severity of this problem.
There's a million of other ways in which the system can malfunction if
it is being "attacked by device tree".


Re: [PATCH] net: dsa: mv88e6xxx: prevent 2500BASEX mode override

2021-02-15 Thread Russell King - ARM Linux admin
On Mon, Feb 15, 2021 at 02:47:56PM +0100, Marek Behun wrote:
> On Mon, 15 Feb 2021 06:15:59 +
> Nathan Rossi  wrote:
> 
> > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c 
> > b/drivers/net/dsa/mv88e6xxx/chip.c
> > index 54aa942eed..5c52906b29 100644
> > --- a/drivers/net/dsa/mv88e6xxx/chip.c
> > +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> > @@ -650,6 +650,13 @@ static void mv88e6xxx_validate(struct dsa_switch *ds, 
> > int port,
> > if (chip->info->ops->phylink_validate)
> > chip->info->ops->phylink_validate(chip, port, mask, state);
> >  
> > +   /* Advertise 2500BASEX only if 1000BASEX is not configured, this
> > +* prevents phylink_helper_basex_speed from always overriding the
> > +* 1000BASEX mode since auto negotiation is always enabled.
> > +*/
> > +   if (state->interface == PHY_INTERFACE_MODE_1000BASEX)
> > +   phylink_clear(mask, 2500baseX_Full);
> > +
> 
> I don't quite like this. This problem should be either solved in
> phylink_helper_basex_speed() or somewhere in the mv88e6xxx code, but near
> the call to phylink_helper_basex_speed().
> 
> Putting a solution to the behaviour of phylink_helper_basex_speed() it
> into the validate() method when phylink_helper_basex_speed() is called
> from a different place will complicate debugging in the future. If
> we start solving problems in this kind of way, the driver will become
> totally unreadable, IMO.

If we can't switch between 1000base-X and 2500base-X, then we should
not be calling phylink_helper_basex_speed() - and only one of those
two capabilities should be set in the validation callback. I thought
there were DSA switches where we could program the CMODE to switch
between these two...

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [net-next] tcp: Sanitize CMSG flags and reserved args in tcp_zerocopy_receive.

2021-02-15 Thread David Ahern
On 2/15/21 5:03 AM, Dan Carpenter wrote:
> Hi Arjun,
> 
> url:
> https://github.com/0day-ci/linux/commits/Arjun-Roy/tcp-Sanitize-CMSG-flags-and-reserved-args-in-tcp_zerocopy_receive/20210212-052537
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 
> e4b62cf7559f2ef9a022de235e5a09a8d7ded520
> config: x86_64-randconfig-m001-20210209 (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
> Reported-by: Dan Carpenter 
> 
> smatch warnings:
> net/ipv4/tcp.c:4158 do_tcp_getsockopt() warn: check for integer overflow 'len'
> 
> vim +/len +4158 net/ipv4/tcp.c
> 
> 3fdadf7d27e3fb Dmitry Mishin2006-03-20  3896  static int 
> do_tcp_getsockopt(struct sock *sk, int level,
> 3fdadf7d27e3fb Dmitry Mishin2006-03-20  3897  int 
> optname, char __user *optval, int __user *optlen)
> ^1da177e4c3f41 Linus Torvalds   2005-04-16  3898  {
> 295f7324ff8d9e Arnaldo Carvalho de Melo 2005-08-09  3899  struct 
> inet_connection_sock *icsk = inet_csk(sk);
> ^1da177e4c3f41 Linus Torvalds   2005-04-16  3900  struct tcp_sock 
> *tp = tcp_sk(sk);
> 6fa251663069e0 Nikolay Borisov  2016-02-03  3901  struct net *net 
> = sock_net(sk);
> ^1da177e4c3f41 Linus Torvalds   2005-04-16  3902  int val, len;
> 
> "len" is int.
> 
> [ snip ]
> 05255b823a6173 Eric Dumazet 2018-04-27  4146  #ifdef CONFIG_MMU
> 05255b823a6173 Eric Dumazet 2018-04-27  4147  case 
> TCP_ZEROCOPY_RECEIVE: {
> 7eeba1706eba6d Arjun Roy2021-01-20  4148  struct 
> scm_timestamping_internal tss;
> e0fecb289ad3fd Arjun Roy2020-12-10  4149  struct 
> tcp_zerocopy_receive zc = {};
> 05255b823a6173 Eric Dumazet 2018-04-27  4150  int err;
> 05255b823a6173 Eric Dumazet 2018-04-27  4151  
> 05255b823a6173 Eric Dumazet 2018-04-27  4152  if 
> (get_user(len, optlen))
> 05255b823a6173 Eric Dumazet 2018-04-27  4153  
> return -EFAULT;
> c8856c05145490 Arjun Roy2020-02-14  4154  if (len 
> < offsetofend(struct tcp_zerocopy_receive, length))
> 05255b823a6173 Eric Dumazet 2018-04-27  4155  
> return -EINVAL;
> 
> 
> The problem is that negative values of "len" are type promoted to high
> positive values.  So the fix is to write this as:
> 
>   if (len < 0 || len < offsetofend(struct tcp_zerocopy_receive, length))
>   return -EINVAL;
> 
> 110912bdf28392 Arjun Roy2021-02-11  4156  if 
> (unlikely(len > sizeof(zc))) {
> 110912bdf28392 Arjun Roy2021-02-11  4157  
> err = check_zeroed_user(optval + sizeof(zc),
> 110912bdf28392 Arjun Roy2021-02-11 @4158  
> len - sizeof(zc));
>   
>   
> Potentially "len - a negative value".
> 
> 

get_user(len, optlen) is called multiple times in that function. len < 0
was checked after the first one at the top.

Also, maybe I am missing something here, but offsetofend can not return
a negative value, so this checks catches len < 0 as well:

if (len < offsetofend(struct tcp_zerocopy_receive, length))
return -EINVAL;




Re: [PATCH net-next 1/2] net: mvneta: Remove per-cpu queue mapping for Armada 3700

2021-02-15 Thread Maxime Chevallier
Hi Pali,

On Mon, 15 Feb 2021 00:50:58 +0100
Pali Rohár  wrote:

>> According to Errata #23 "The per-CPU GbE interrupt is limited to Core
>> 0", we can't use the per-cpu interrupt mechanism on the Armada 3700
>> familly.
>> 
>> This is correctly checked for RSS configuration, but the initial queue
>> mapping is still done by having the queues spread across all the CPUs in
>> the system, both in the init path and in the cpu_hotplug path.  
>
>Hello Maxime!
>
>This patch looks like a bug fix for Armada 3700 SoC. What about marking
>this commit with Fixes line? E.g.:
>
>Fixes: 2636ac3cc2b4 ("net: mvneta: Add network support for Armada 3700 
> SoC")

Yes you're correct, I'll add that to the V2 !

Thanks for the review,

Maxime


-- 
Maxime Chevallier, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com


Re: [PATCH net-next 2/2] net: mvneta: Implement mqprio support

2021-02-15 Thread Maxime Chevallier
Hi Andrew,

On Sat, 13 Feb 2021 20:45:25 +0100
Andrew Lunn  wrote:

>On Fri, Feb 12, 2021 at 04:12:20PM +0100, Maxime Chevallier wrote:
>> +static void mvneta_setup_rx_prio_map(struct mvneta_port *pp)
>> +{
>> +int i;
>> +u32 val = 0;  
>
>Hi Maxime
>
>Reverse Chrismtass tree please.

Ah yes sorry, I'll fix that in V2.

Thanks for the review,

Maxime

-- 
Maxime Chevallier, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com


Re: [PATCH] net: dsa: mv88e6xxx: prevent 2500BASEX mode override

2021-02-15 Thread Andrew Lunn
> If we can't switch between 1000base-X and 2500base-X, then we should
> not be calling phylink_helper_basex_speed() - and only one of those
> two capabilities should be set in the validation callback. I thought
> there were DSA switches where we could program the CMODE to switch
> between these two...

6390 family has a writable CMODE for ports 9 and 10, and you can
select various SERDES modes.

Nathan, what device are you using?

Andrew


Re: [net-next PATCH v5 15/15] net: dpaa2-mac: Add ACPI support for DPAA2 MAC driver

2021-02-15 Thread Andy Shevchenko
On Mon, Feb 15, 2021 at 2:33 PM Calvin Johnson
 wrote:
> On Mon, Feb 08, 2021 at 04:28:31PM +, Russell King - ARM Linux admin 
> wrote:

...

> I think of_phy_is_fixed_link() needs to be fixed. I'll add below fix.
>
> --- a/drivers/net/mdio/of_mdio.c
> +++ b/drivers/net/mdio/of_mdio.c
> @@ -439,6 +439,9 @@ bool of_phy_is_fixed_link(struct device_node *np)
> int len, err;
> const char *managed;
>
> +   if (!np)
> +   return false;

AFAICS this doesn't add anything: all of the of_* APIs should handle
OF nodes being NULL below.

> /* New binding */
> dn = of_get_child_by_name(np, "fixed-link");
> if (dn) {

-- 
With Best Regards,
Andy Shevchenko


Re: [PATCH 2/4] net: stmmac: Add Toshiba Visconti SoCs glue driver

2021-02-15 Thread Nobuhiro Iwamatsu
Hi,

On Mon, Feb 15, 2021 at 11:22:33AM +0200, Leon Romanovsky wrote:
> On Mon, Feb 15, 2021 at 04:28:09PM +0900, Nobuhiro Iwamatsu wrote:
> >
> > I have received your point out and have sent an email with the content
> > to remove this line. But it may not have arrived yet...
> >
> > > Why did you use "def_bool y" and not "default y"? Isn't it supposed to be
> > > "depends on STMMAC_ETH"? And probably it shouldn't be set as a default as 
> > > "y".
> > >
> >
> > The reason why "def_bool y" was set is that the wrong fix was left when
> > debugging. Also, I don't think it is necessary to set "default y".
> > This is also incorrect because it says "bool" Toshiba Visconti DWMAC
> > support "". I change it to trustate in the new patch.
> >
> > And this driver is enabled when STMMAC_PLATFORM was Y. And STMMAC_PLATFORM
> > depends on STMMAC_ETH.
> > So I understand that STMMAC_ETH does not need to be dependents. Is this
> > understanding wrong?
> 
> This is correct understanding, just need to clean other entries in that
> Kconfig that depends on STMMAC_ETH.

OK, thanks.

Best regards,
  Nobuhiro


Re: [net-next PATCH v5 15/15] net: dpaa2-mac: Add ACPI support for DPAA2 MAC driver

2021-02-15 Thread Andy Shevchenko
On Mon, Feb 15, 2021 at 5:13 PM Andy Shevchenko
 wrote:
>
> On Mon, Feb 15, 2021 at 2:33 PM Calvin Johnson
>  wrote:
> > On Mon, Feb 08, 2021 at 04:28:31PM +, Russell King - ARM Linux admin 
> > wrote:
>
> ...
>
> > I think of_phy_is_fixed_link() needs to be fixed. I'll add below fix.
> >
> > --- a/drivers/net/mdio/of_mdio.c
> > +++ b/drivers/net/mdio/of_mdio.c
> > @@ -439,6 +439,9 @@ bool of_phy_is_fixed_link(struct device_node *np)
> > int len, err;
> > const char *managed;
> >
> > +   if (!np)
> > +   return false;
>
> AFAICS this doesn't add anything: all of the of_* APIs should handle
> OF nodes being NULL below.
>
> > /* New binding */
> > dn = of_get_child_by_name(np, "fixed-link");
> > if (dn) {

Yes, of_get_next_child() and of_get_property() are NULL aware.

So, the check is redundant.

-- 
With Best Regards,
Andy Shevchenko


Re: [PATCH] net: dsa: mv88e6xxx: prevent 2500BASEX mode override

2021-02-15 Thread Marek Behun
On Mon, 15 Feb 2021 14:57:57 +
Russell King - ARM Linux admin  wrote:

> On Mon, Feb 15, 2021 at 02:47:56PM +0100, Marek Behun wrote:
> > On Mon, 15 Feb 2021 06:15:59 +
> > Nathan Rossi  wrote:
> >   
> > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c 
> > > b/drivers/net/dsa/mv88e6xxx/chip.c
> > > index 54aa942eed..5c52906b29 100644
> > > --- a/drivers/net/dsa/mv88e6xxx/chip.c
> > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> > > @@ -650,6 +650,13 @@ static void mv88e6xxx_validate(struct dsa_switch 
> > > *ds, int port,
> > >   if (chip->info->ops->phylink_validate)
> > >   chip->info->ops->phylink_validate(chip, port, mask, state);
> > >  
> > > + /* Advertise 2500BASEX only if 1000BASEX is not configured, this
> > > +  * prevents phylink_helper_basex_speed from always overriding the
> > > +  * 1000BASEX mode since auto negotiation is always enabled.
> > > +  */
> > > + if (state->interface == PHY_INTERFACE_MODE_1000BASEX)
> > > + phylink_clear(mask, 2500baseX_Full);
> > > +  
> > 
> > I don't quite like this. This problem should be either solved in
> > phylink_helper_basex_speed() or somewhere in the mv88e6xxx code, but near
> > the call to phylink_helper_basex_speed().
> > 
> > Putting a solution to the behaviour of phylink_helper_basex_speed() it
> > into the validate() method when phylink_helper_basex_speed() is called
> > from a different place will complicate debugging in the future. If
> > we start solving problems in this kind of way, the driver will become
> > totally unreadable, IMO.  
> 
> If we can't switch between 1000base-X and 2500base-X, then we should
> not be calling phylink_helper_basex_speed() - and only one of those
> two capabilities should be set in the validation callback. I thought
> there were DSA switches where we could program the CMODE to switch
> between these two...

There are. At least Peridot, Topaz and Amethyst support switching
between these modes. But only on some ports.

This problem happnes on Peridot X, I think.

On Peridot X there are
- port 0: RGMII
- ports 9-10: capable of 1, 2.5 and 10G SerDes (10G via
  XAUI/RXAUI, so multiple lanes)
- ports 1-8: with copper PHYs
  - some of these can instead be set to use the unused SerDes lanes
of ports 9-10, but only in 1000base-x mode

So the problem can happen if you set port 9 or 10 to only use one
SerDes lane, and use the spare lanes for the 1G ports.
On these ports 2500base-x is not supported, only 1000base-x (maybe
sgmii, I don't remember)

Marek


general protection fault in nl802154_add_llsec_key

2021-02-15 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:57baf8cc net: axienet: Handle deferred probe on clock prop..
git tree:   net
console output: https://syzkaller.appspot.com/x/log.txt?x=15c84ed2d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=8cb23303ddb9411f
dashboard link: https://syzkaller.appspot.com/bug?extid=ce4e062c2d51977ddc50
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1332a822d0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12760822d0

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ce4e062c2d51977dd...@syzkaller.appspotmail.com

general protection fault, probably for non-canonical address 
0xdc00:  [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x-0x0007]
CPU: 1 PID: 8444 Comm: syz-executor279 Not tainted 5.11.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
RIP: 0010:nla_len include/net/netlink.h:1148 [inline]
RIP: 0010:nla_parse_nested_deprecated include/net/netlink.h:1231 [inline]
RIP: 0010:nl802154_add_llsec_key+0x1f7/0x560 net/ieee802154/nl802154.c:1547
Code: 00 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 2f 03 00 00 48 8b 93 28 01 00 
00 48 b8 00 00 00 00 00 fc ff df 48 89 d1 48 c1 e9 03 <0f> b6 0c 01 48 89 d0 83 
e0 07 83 c0 01 38 c8 7c 08 84 c9 0f 85 c9
RSP: 0018:c90001adf4a0 EFLAGS: 00010246
RAX: dc00 RBX: 88802117c000 RCX: 
RDX:  RSI: 88892195 RDI: 88802117c128
RBP: 19200035be95 R08:  R09: 
R10: 87315ffa R11:  R12: 88801c222000
R13: 88801bb34bd0 R14: c90001adf8b0 R15: 
FS:  01dcd300() GS:8880b9d0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 26c8 CR3: 2146a000 CR4: 001506e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739
 genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
 genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 sys_sendmsg+0x6e8/0x810 net/socket.c:2345
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2399
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2432
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x43f8a9
Code: 28 c3 e8 5a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 
c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:7ffe56c91f68 EFLAGS: 0246 ORIG_RAX: 002e
RAX: ffda RBX: 004004a0 RCX: 0043f8a9
RDX:  RSI: 2a00 RDI: 0004
RBP: 00403310 R08: 004004a0 R09: 004004a0
R10: 0003 R11: 0246 R12: 004033a0
R13:  R14: 004ad018 R15: 004004a0
Modules linked in:
---[ end trace 1844a7d5c231fd88 ]---
RIP: 0010:nla_len include/net/netlink.h:1148 [inline]
RIP: 0010:nla_parse_nested_deprecated include/net/netlink.h:1231 [inline]
RIP: 0010:nl802154_add_llsec_key+0x1f7/0x560 net/ieee802154/nl802154.c:1547
Code: 00 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 2f 03 00 00 48 8b 93 28 01 00 
00 48 b8 00 00 00 00 00 fc ff df 48 89 d1 48 c1 e9 03 <0f> b6 0c 01 48 89 d0 83 
e0 07 83 c0 01 38 c8 7c 08 84 c9 0f 85 c9
RSP: 0018:c90001adf4a0 EFLAGS: 00010246
RAX: dc00 RBX: 88802117c000 RCX: 
RDX:  RSI: 88892195 RDI: 88802117c128
RBP: 19200035be95 R08:  R09: 
R10: 87315ffa R11:  R12: 88801c222000
R13: 88801bb34bd0 R14: c90001adf8b0 R15: 
FS:  01dcd300() GS:8880b9d0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 26c8 CR3: 2146a000 CR4: 001506e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot ca

Re: [PATCH 2/4] net: stmmac: Add Toshiba Visconti SoCs glue driver

2021-02-15 Thread Nobuhiro Iwamatsu
Hi,

On Mon, Feb 15, 2021 at 01:19:18PM +0100, Arnd Bergmann wrote:
> On Mon, Feb 15, 2021 at 10:23 AM Leon Romanovsky  wrote:
> > On Mon, Feb 15, 2021 at 04:28:09PM +0900, Nobuhiro Iwamatsu wrote:
> > >
> > > Sorry, I sent the wrong patchset that didn't fix this point out.
> > >
> > > > I asked it before, but never received an answer.
> > >
> > > I have received your point out and have sent an email with the content
> > > to remove this line. But it may not have arrived yet...
> > >
> > > > Why did you use "def_bool y" and not "default y"? Isn't it supposed to 
> > > > be
> > > > "depends on STMMAC_ETH"? And probably it shouldn't be set as a default 
> > > > as "y".
> > > >
> > >
> > > The reason why "def_bool y" was set is that the wrong fix was left when
> > > debugging. Also, I don't think it is necessary to set "default y".
> > > This is also incorrect because it says "bool" Toshiba Visconti DWMAC
> > > support "". I change it to trustate in the new patch.
> > >
> > > And this driver is enabled when STMMAC_PLATFORM was Y. And STMMAC_PLATFORM
> > > depends on STMMAC_ETH.
> > > So I understand that STMMAC_ETH does not need to be dependents. Is this
> > > understanding wrong?
> >
> > This is correct understanding, just need to clean other entries in that
> > Kconfig that depends on STMMAC_ETH.
> 
> 'tristate' with no default sounds right. I see that some platforms have a
> default according to the platform, which also makes sense but isn't
> required. What I would suggest though is a dependency on the platform,
> to make it easier to disable the front-end based on which platforms
> are enabled. This would end up as
> 
> config DWMAC_VISCONTI
> tristate "Toshiba Visconti DWMAC support"
> depends on ARCH_VISCONTI || COMPILE_TEST
> depends on OF && COMMON_CLK # only add this line if it's
> required for compilation
> default ARCH_VISCONTI
>

The fix at hand is the same as your suggestion.
Thank you for your comment.

>   Arnd
> 

Best regards,
  Nobuhiro


[net-next] net: mvpp2: Add TX flow control support for jumbo frames

2021-02-15 Thread stefanc
From: Stefan Chulski 

With MTU less than 1500B on all ports, the driver uses per CPU pool mode.
If one of the ports set to jumbo frame MTU size, all ports move
to shared pools mode.
Here, buffer manager TX Flow Control reconfigured on all ports.

Signed-off-by: Stefan Chulski 
---
 drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 26 
 1 file changed, 26 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c 
b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index 222e9a3..10c17d1 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -924,6 +924,25 @@ static void mvpp2_bm_pool_update_fc(struct mvpp2_port 
*port,
spin_unlock_irqrestore(&port->priv->mss_spinlock, flags);
 }
 
+/* disable/enable flow control for BM pool on all ports */
+static void mvpp2_bm_pool_update_priv_fc(struct mvpp2 *priv, bool en)
+{
+   struct mvpp2_port *port;
+   int i;
+
+   for (i = 0; i < priv->port_count; i++) {
+   port = priv->port_list[i];
+   if (port->priv->percpu_pools) {
+   for (i = 0; i < port->nrxqs; i++)
+   mvpp2_bm_pool_update_fc(port, 
&port->priv->bm_pools[i],
+   port->tx_fc & en);
+   } else {
+   mvpp2_bm_pool_update_fc(port, port->pool_long, 
port->tx_fc & en);
+   mvpp2_bm_pool_update_fc(port, port->pool_short, 
port->tx_fc & en);
+   }
+   }
+}
+
 static int mvpp2_enable_global_fc(struct mvpp2 *priv)
 {
int val, timeout = 0;
@@ -4913,6 +4932,7 @@ static int mvpp2_set_mac_address(struct net_device *dev, 
void *p)
  */
 static int mvpp2_bm_switch_buffers(struct mvpp2 *priv, bool percpu)
 {
+   bool change_percpu = (percpu != priv->percpu_pools);
int numbufs = MVPP2_BM_POOLS_NUM, i;
struct mvpp2_port *port = NULL;
bool status[MVPP2_MAX_PORTS];
@@ -4928,6 +4948,9 @@ static int mvpp2_bm_switch_buffers(struct mvpp2 *priv, 
bool percpu)
if (priv->percpu_pools)
numbufs = port->nrxqs * 2;
 
+   if (change_percpu)
+   mvpp2_bm_pool_update_priv_fc(priv, false);
+
for (i = 0; i < numbufs; i++)
mvpp2_bm_pool_destroy(port->dev->dev.parent, priv, 
&priv->bm_pools[i]);
 
@@ -4942,6 +4965,9 @@ static int mvpp2_bm_switch_buffers(struct mvpp2 *priv, 
bool percpu)
mvpp2_open(port->dev);
}
 
+   if (change_percpu)
+   mvpp2_bm_pool_update_priv_fc(priv, true);
+
return 0;
 }
 
-- 
1.9.1



selftests: tls: multi_chunk_sendfile: Test terminated by timeout (constant failure)

2021-02-15 Thread Paolo Pisati
Hi Pooja,

commit 0e6fbe39bdf71b4e665767bcbf53567a3e6d0623
Author: Pooja Trivedi 
Date:   Fri Jun 5 16:01:18 2020 +

net/tls(TLS_SW): Add selftest for 'chunked' sendfile test

your multi_chunk_sendfile test is constantly failing for me (x86_64 defconfig +
tools/testing/selftests/net/config + TLS) on 5.10.y:

...
#  RUN   tls.12.multi_chunk_sendfile ...

# multi_chunk_sendfile: Test terminated by timeout  

#  FAIL  tls.12.multi_chunk_sendfile
not ok 6 tls.12.multi_chunk_sendfile
...
#  RUN   tls.13.multi_chunk_sendfile ...
# multi_chunk_sendfile: Test terminated by timeout  
#  FAIL  tls.13.multi_chunk_sendfile
not ok 51 tls.13.multi_chunk_sendfile

i tried bumping up the timeout to an insane value, but that didn't change the
outcome - anything particular i should check?
-- 
bye,
p.


[PATCH v4 1/4] dt-bindings: net: Add DT bindings for Toshiba Visconti TMPV7700 SoC

2021-02-15 Thread Nobuhiro Iwamatsu
Add device tree bindings for ethernet controller of Toshiba Visconti
TMPV7700 SoC series.

Signed-off-by: Nobuhiro Iwamatsu 
---
 .../bindings/net/toshiba,visconti-dwmac.yaml  | 85 +++
 1 file changed, 85 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml

diff --git a/Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml 
b/Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml
new file mode 100644
index ..59724d18e6f3
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml
@@ -0,0 +1,85 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/net/toshiba,visconti-dwmac.yaml#";
+$schema: "http://devicetree.org/meta-schemas/core.yaml#";
+
+title: Toshiba Visconti DWMAC Ethernet controller
+
+maintainers:
+  - Nobuhiro Iwamatsu 
+
+select:
+  properties:
+compatible:
+  contains:
+enum:
+  - toshiba,visconti-dwmac
+  required:
+- compatible
+
+allOf:
+  - $ref: "snps,dwmac.yaml#"
+
+properties:
+  compatible:
+oneOf:
+  - items:
+  - enum:
+  - toshiba,visconti-dwmac
+  - const: snps,dwmac-4.20a
+
+  reg:
+maxItems: 1
+
+  clocks:
+items:
+  - description: main clock
+  - description: PHY reference clock
+
+  clock-names:
+items:
+  - const: stmmaceth
+  - const: phy_ref_clk
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - clock-names
+
+unevaluatedProperties: false
+
+examples:
+  - |
+#include 
+
+soc {
+#address-cells = <2>;
+#size-cells = <2>;
+
+piether: ethernet@2800 {
+compatible = "toshiba,visconti-dwmac", "snps,dwmac-4.20a";
+reg = <0 0x2800 0 0x1>;
+interrupts = ;
+interrupt-names = "macirq";
+clocks = <&clk300mhz>, <&clk125mhz>;
+clock-names = "stmmaceth", "phy_ref_clk";
+snps,txpbl = <4>;
+snps,rxpbl = <4>;
+snps,tso;
+phy-mode = "rgmii-id";
+phy-handle = <&phy0>;
+
+mdio0 {
+#address-cells = <0x1>;
+#size-cells = <0x0>;
+compatible = "snps,dwmac-mdio";
+
+phy0: ethernet-phy@1 {
+device_type = "ethernet-phy";
+reg = <0x1>;
+};
+};
+};
+};
-- 
2.30.0.rc2



[PATCH v4 0/4] net: stmmac: Add Toshiba Visconti SoCs glue driver

2021-02-15 Thread Nobuhiro Iwamatsu
Hi,

This series is the ethernet driver for Toshiba's ARM SoC, Visconti[0].
This provides DT binding documentation, device driver, MAINTAINER files,
and updates to DT files.

Best regards,
  Nobuhiro

[0]:
https://toshiba.semicon-storage.com/ap-en/semiconductor/product/image-recognition-processors-visconti.htmli

Updates:

  dt-bindings: net: Add DT bindings for Toshiba Visconti TMPV7700 SoC
v3 -> v4: No update. Resend the correct patch series.
v2 -> v3: Change to the correct dwmac version.
  Remove description of reg property.
  Update examples, drop dma-coherent, add snps,tso, fix phy-mode.
v1 -> v2: No update.

  net: stmmac: Add Toshiba Visconti SoCs glue driver
v3 -> v4: No update. Resend the correct patch series.

https://lore.kernel.org/netdev/20210215050655.2532-3-nobuhiro1.iwama...@toshiba.co.jp/
did not contain the following fixes:
  - Drop def_bool y in Kconfig
  - Change from bool to tristate in Kconfig option.
  - Add 'default ARCH_VISCONTI' to Kconfig option.

v2 -> v3: Remove code that is no longer needed by using compatible string.
  Drop def_bool y in Kconfig
  Change from bool to tristate in Kconfig option.
  Add 'default ARCH_VISCONTI' to Kconfig option.
v1 -> v2: Use reverse christmas tree ordering for local variable 
declarations.

  MAINTAINERS: Add entries for Toshiba Visconti ethernet controller
v3 -> v4: No update. Resend the correct patch series.
v2 -> v3: No update.
v1 -> v2: No update.

  arm: dts: visconti: Add DT support for Toshiba Visconti5 ethernet controller
v3 -> v4: No update
v2 -> v3: Add "snps,dwmac-4.20a" as compatible string.
  Add snps,tso.
v1 -> v2: No update.

Nobuhiro Iwamatsu (4):
  dt-bindings: net: Add DT bindings for Toshiba Visconti TMPV7700 SoC
  net: stmmac: Add Toshiba Visconti SoCs glue driver
  MAINTAINERS: Add entries for Toshiba Visconti ethernet controller
  arm: dts: visconti: Add DT support for Toshiba Visconti5 ethernet
controller

 .../bindings/net/toshiba,visconti-dwmac.yaml  |  85 ++
 MAINTAINERS   |   2 +
 .../boot/dts/toshiba/tmpv7708-rm-mbrc.dts |  18 ++
 arch/arm64/boot/dts/toshiba/tmpv7708.dtsi |  25 ++
 drivers/net/ethernet/stmicro/stmmac/Kconfig   |   8 +
 drivers/net/ethernet/stmicro/stmmac/Makefile  |   1 +
 .../ethernet/stmicro/stmmac/dwmac-visconti.c  | 285 ++
 7 files changed, 424 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml
 create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c

-- 
2.30.0.rc2



[PATCH v4 2/4] net: stmmac: Add Toshiba Visconti SoCs glue driver

2021-02-15 Thread Nobuhiro Iwamatsu
Add dwmac-visconti to the stmmac driver in Toshiba Visconti ARM SoCs.
This patch contains only the basic function of the device. There is no
clock control, PM, etc. yet. These will be added in the future.

Signed-off-by: Nobuhiro Iwamatsu 
---
 drivers/net/ethernet/stmicro/stmmac/Kconfig   |   8 +
 drivers/net/ethernet/stmicro/stmmac/Makefile  |   1 +
 .../ethernet/stmicro/stmmac/dwmac-visconti.c  | 285 ++
 3 files changed, 294 insertions(+)
 create mode 100644 drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c

diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig 
b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index 53f14c5a9e02..e675ba12fde2 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -219,6 +219,14 @@ config DWMAC_INTEL_PLAT
  This selects the Intel platform specific glue layer support for
  the stmmac device driver. This driver is used for the Intel Keem Bay
  SoC.
+
+config DWMAC_VISCONTI
+   tristate "Toshiba Visconti DWMAC support"
+   default ARCH_VISCONTI
+   depends on OF && COMMON_CLK && (ARCH_VISCONTI || COMPILE_TEST)
+   help
+ Support for ethernet controller on Visconti SoCs.
+
 endif
 
 config DWMAC_INTEL
diff --git a/drivers/net/ethernet/stmicro/stmmac/Makefile 
b/drivers/net/ethernet/stmicro/stmmac/Makefile
index 24e6145d4eae..366740ab9c5a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Makefile
+++ b/drivers/net/ethernet/stmicro/stmmac/Makefile
@@ -29,6 +29,7 @@ obj-$(CONFIG_DWMAC_DWC_QOS_ETH)   += dwmac-dwc-qos-eth.o
 obj-$(CONFIG_DWMAC_INTEL_PLAT) += dwmac-intel-plat.o
 obj-$(CONFIG_DWMAC_GENERIC)+= dwmac-generic.o
 obj-$(CONFIG_DWMAC_IMX8)   += dwmac-imx.o
+obj-$(CONFIG_DWMAC_VISCONTI)   += dwmac-visconti.o
 stmmac-platform-objs:= stmmac_platform.o
 dwmac-altr-socfpga-objs := altr_tse_pcs.o dwmac-socfpga.o
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c
new file mode 100644
index ..b7a0c57dfbfb
--- /dev/null
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c
@@ -0,0 +1,285 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Toshiba Visconti Ethernet Support
+ *
+ * (C) Copyright 2020 TOSHIBA CORPORATION
+ * (C) Copyright 2020 Toshiba Electronic Devices & Storage Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include "stmmac_platform.h"
+#include "dwmac4.h"
+
+#define REG_ETHER_CONTROL  0x52D4
+#define ETHER_ETH_CONTROL_RESET BIT(17)
+
+#define REG_ETHER_CLOCK_SEL0x52D0
+#define ETHER_CLK_SEL_TX_CLK_EN BIT(0)
+#define ETHER_CLK_SEL_RX_CLK_EN BIT(1)
+#define ETHER_CLK_SEL_RMII_CLK_EN BIT(2)
+#define ETHER_CLK_SEL_RMII_CLK_RST BIT(3)
+#define ETHER_CLK_SEL_DIV_SEL_2 BIT(4)
+#define ETHER_CLK_SEL_DIV_SEL_20 BIT(0)
+#define ETHER_CLK_SEL_FREQ_SEL_125M(BIT(9) | BIT(8))
+#define ETHER_CLK_SEL_FREQ_SEL_50M BIT(9)
+#define ETHER_CLK_SEL_FREQ_SEL_25M BIT(8)
+#define ETHER_CLK_SEL_FREQ_SEL_2P5MBIT(0)
+#define ETHER_CLK_SEL_TX_CLK_EXT_SEL_IN BIT(0)
+#define ETHER_CLK_SEL_TX_CLK_EXT_SEL_TXC BIT(10)
+#define ETHER_CLK_SEL_TX_CLK_EXT_SEL_DIV BIT(11)
+#define ETHER_CLK_SEL_RX_CLK_EXT_SEL_IN  BIT(0)
+#define ETHER_CLK_SEL_RX_CLK_EXT_SEL_RXC BIT(12)
+#define ETHER_CLK_SEL_RX_CLK_EXT_SEL_DIV BIT(13)
+#define ETHER_CLK_SEL_TX_CLK_O_TX_I BIT(0)
+#define ETHER_CLK_SEL_TX_CLK_O_RMII_I   BIT(14)
+#define ETHER_CLK_SEL_TX_O_E_N_IN   BIT(15)
+#define ETHER_CLK_SEL_RMII_CLK_SEL_IN   BIT(0)
+#define ETHER_CLK_SEL_RMII_CLK_SEL_RX_C BIT(16)
+
+#define ETHER_CLK_SEL_RX_TX_CLK_EN (ETHER_CLK_SEL_RX_CLK_EN | 
ETHER_CLK_SEL_TX_CLK_EN)
+
+#define ETHER_CONFIG_INTF_MII 0
+#define ETHER_CONFIG_INTF_RGMII BIT(0)
+#define ETHER_CONFIG_INTF_RMII BIT(2)
+
+struct visconti_eth {
+   void __iomem *reg;
+   u32 phy_intf_sel;
+   struct clk *phy_ref_clk;
+   spinlock_t lock; /* lock to protect register update */
+};
+
+static void visconti_eth_fix_mac_speed(void *priv, unsigned int speed)
+{
+   struct visconti_eth *dwmac = priv;
+   unsigned int val, clk_sel_val;
+   unsigned long flags;
+
+   spin_lock_irqsave(&dwmac->lock, flags);
+
+   /* adjust link */
+   val = readl(dwmac->reg + MAC_CTRL_REG);
+   val &= ~(GMAC_CONFIG_PS | GMAC_CONFIG_FES);
+
+   switch (speed) {
+   case SPEED_1000:
+   if (dwmac->phy_intf_sel == ETHER_CONFIG_INTF_RGMII)
+   clk_sel_val = ETHER_CLK_SEL_FREQ_SEL_125M;
+   break;
+   case SPEED_100:
+   if (dwmac->phy_intf_sel == ETHER_CONFIG_INTF_RGMII)
+   clk_sel_val = ETHER_CLK_SEL_FREQ_SEL_25M;
+   if (dwmac->phy_intf_sel == ETHER_CONFIG_INTF_RMII)
+   clk_sel_val = ETHER_CLK_SEL_DIV_SEL_2;
+   val |= GMAC_CONFIG_PS | GMAC_CONFIG_FES;
+   break;
+   case SPEED_10:
+   if (dwmac->phy_intf_sel == ETHER_CONFIG_INTF_

[PATCH v4 4/4] arm: dts: visconti: Add DT support for Toshiba Visconti5 ethernet controller

2021-02-15 Thread Nobuhiro Iwamatsu
Add the ethernet controller node in Toshiba Visconti5 SoC-specific DT file.
And enable this node in TMPV7708 RM main board's board-specific DT file.

Signed-off-by: Nobuhiro Iwamatsu 
---
 .../boot/dts/toshiba/tmpv7708-rm-mbrc.dts | 18 +
 arch/arm64/boot/dts/toshiba/tmpv7708.dtsi | 25 +++
 2 files changed, 43 insertions(+)

diff --git a/arch/arm64/boot/dts/toshiba/tmpv7708-rm-mbrc.dts 
b/arch/arm64/boot/dts/toshiba/tmpv7708-rm-mbrc.dts
index ed0bf7f13f54..48fa8776e36f 100644
--- a/arch/arm64/boot/dts/toshiba/tmpv7708-rm-mbrc.dts
+++ b/arch/arm64/boot/dts/toshiba/tmpv7708-rm-mbrc.dts
@@ -41,3 +41,21 @@ &uart1 {
clocks = <&uart_clk>;
clock-names = "apb_pclk";
 };
+
+&piether {
+   status = "okay";
+   phy-handle = <&phy0>;
+   phy-mode = "rgmii-id";
+   clocks = <&clk300mhz>, <&clk125mhz>;
+   clock-names = "stmmaceth", "phy_ref_clk";
+
+   mdio0 {
+   #address-cells = <1>;
+   #size-cells = <0>;
+   compatible = "snps,dwmac-mdio";
+   phy0: ethernet-phy@1 {
+   device_type = "ethernet-phy";
+   reg = <0x1>;
+   };
+   };
+};
diff --git a/arch/arm64/boot/dts/toshiba/tmpv7708.dtsi 
b/arch/arm64/boot/dts/toshiba/tmpv7708.dtsi
index 242f25f4e12a..3366786699fc 100644
--- a/arch/arm64/boot/dts/toshiba/tmpv7708.dtsi
+++ b/arch/arm64/boot/dts/toshiba/tmpv7708.dtsi
@@ -134,6 +134,20 @@ uart_clk: uart-clk {
#clock-cells = <0>;
};
 
+   clk125mhz: clk125mhz {
+   compatible = "fixed-clock";
+   clock-frequency = <12500>;
+   #clock-cells = <0>;
+   clock-output-names = "clk125mhz";
+   };
+
+   clk300mhz: clk300mhz {
+   compatible = "fixed-clock";
+   clock-frequency = <3>;
+   #clock-cells = <0>;
+   clock-output-names = "clk300mhz";
+   };
+
soc {
#address-cells = <2>;
#size-cells = <2>;
@@ -384,6 +398,17 @@ spi6: spi@28146000 {
#size-cells = <0>;
status = "disabled";
};
+
+   piether: ethernet@2800 {
+   compatible = "toshiba,visconti-dwmac", 
"snps,dwmac-4.20a";
+   reg = <0 0x2800 0 0x1>;
+   interrupts = ;
+   interrupt-names = "macirq";
+   snps,txpbl = <4>;
+   snps,rxpbl = <4>;
+   snps,tso;
+   status = "disabled";
+   };
};
 };
 
-- 
2.30.0.rc2



[PATCH v4 3/4] MAINTAINERS: Add entries for Toshiba Visconti ethernet controller

2021-02-15 Thread Nobuhiro Iwamatsu
Add entries for Toshiba Visconti ethernet controller binding and driver.

Signed-off-by: Nobuhiro Iwamatsu 
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index cbf4b94f89d4..6be4bdaabf32 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2641,8 +2641,10 @@ L:   linux-arm-ker...@lists.infradead.org (moderated 
for non-subscribers)
 S: Supported
 T: git 
git://git.kernel.org/pub/scm/linux/kernel/git/iwamatsu/linux-visconti.git
 F: Documentation/devicetree/bindings/arm/toshiba.yaml
+F: Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml
 F: Documentation/devicetree/bindings/pinctrl/toshiba,tmpv7700-pinctrl.yaml
 F: arch/arm64/boot/dts/toshiba/
+F: drivers/net/ethernet/stmicro/stmmac/dwmac-visconti.c
 F: drivers/pinctrl/visconti/
 N: visconti
 
-- 
2.30.0.rc2



Re: [PATCH] net: dsa: mv88e6xxx: prevent 2500BASEX mode override

2021-02-15 Thread Russell King - ARM Linux admin
On Mon, Feb 15, 2021 at 04:16:27PM +0100, Marek Behun wrote:
> On Mon, 15 Feb 2021 14:57:57 +
> Russell King - ARM Linux admin  wrote:
> 
> > On Mon, Feb 15, 2021 at 02:47:56PM +0100, Marek Behun wrote:
> > > On Mon, 15 Feb 2021 06:15:59 +
> > > Nathan Rossi  wrote:
> > >   
> > > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c 
> > > > b/drivers/net/dsa/mv88e6xxx/chip.c
> > > > index 54aa942eed..5c52906b29 100644
> > > > --- a/drivers/net/dsa/mv88e6xxx/chip.c
> > > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> > > > @@ -650,6 +650,13 @@ static void mv88e6xxx_validate(struct dsa_switch 
> > > > *ds, int port,
> > > > if (chip->info->ops->phylink_validate)
> > > > chip->info->ops->phylink_validate(chip, port, mask, 
> > > > state);
> > > >  
> > > > +   /* Advertise 2500BASEX only if 1000BASEX is not configured, this
> > > > +* prevents phylink_helper_basex_speed from always overriding 
> > > > the
> > > > +* 1000BASEX mode since auto negotiation is always enabled.
> > > > +*/
> > > > +   if (state->interface == PHY_INTERFACE_MODE_1000BASEX)
> > > > +   phylink_clear(mask, 2500baseX_Full);
> > > > +  
> > > 
> > > I don't quite like this. This problem should be either solved in
> > > phylink_helper_basex_speed() or somewhere in the mv88e6xxx code, but near
> > > the call to phylink_helper_basex_speed().
> > > 
> > > Putting a solution to the behaviour of phylink_helper_basex_speed() it
> > > into the validate() method when phylink_helper_basex_speed() is called
> > > from a different place will complicate debugging in the future. If
> > > we start solving problems in this kind of way, the driver will become
> > > totally unreadable, IMO.  
> > 
> > If we can't switch between 1000base-X and 2500base-X, then we should
> > not be calling phylink_helper_basex_speed() - and only one of those
> > two capabilities should be set in the validation callback. I thought
> > there were DSA switches where we could program the CMODE to switch
> > between these two...
> 
> There are. At least Peridot, Topaz and Amethyst support switching
> between these modes. But only on some ports.
> 
> This problem happnes on Peridot X, I think.
> 
> On Peridot X there are
> - port 0: RGMII
> - ports 9-10: capable of 1, 2.5 and 10G SerDes (10G via
>   XAUI/RXAUI, so multiple lanes)
> - ports 1-8: with copper PHYs
>   - some of these can instead be set to use the unused SerDes lanes
> of ports 9-10, but only in 1000base-x mode
> 
> So the problem can happen if you set port 9 or 10 to only use one
> SerDes lane, and use the spare lanes for the 1G ports.
> On these ports 2500base-x is not supported, only 1000base-x (maybe
> sgmii, I don't remember)

It sounds like the modes are not reporting correctly then before calling
phylink_helper_basex_speed(). If the port can only be used at 1000base-X,
then it should not be allowing 2500base-X to be set prior to calling
phylink_helper_basex_speed().

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH net-next 4/4] net: dsa: don't set skb->offload_fwd_mark when not offloading the bridge

2021-02-15 Thread George McCollister
On Sun, Feb 14, 2021 at 9:54 AM Vladimir Oltean  wrote:
[snip]
> diff --git a/net/dsa/tag_xrs700x.c b/net/dsa/tag_xrs700x.c
> index 858cdf9d2913..215ecceea89e 100644
> --- a/net/dsa/tag_xrs700x.c
> +++ b/net/dsa/tag_xrs700x.c
> @@ -45,8 +45,7 @@ static struct sk_buff *xrs700x_rcv(struct sk_buff *skb, 
> struct net_device *dev,
> if (pskb_trim_rcsum(skb, skb->len - 1))
> return NULL;
>
> -   /* Frame is forwarded by hardware, don't forward in software. */
> -   skb->offload_fwd_mark = 1;
> +   dsa_default_offload_fwd_mark(skb);

Does it make sense that the following would have worked prior to this
change? Is this only an issue for bridging between DSA ports when
offloading is supported? lan0 is a port an an xrs700x switch:

ip link set eth0 up
ip link del veth0
ip link add veth0 type veth peer name veth1

for eth in veth0 veth1 lan1; do
ip link set ${eth} up
done
ip link add br0 type bridge
ip link set veth1 master br0
ip link set lan1 master br0
ip link set br0 up

ip addr add 192.168.2.1/24 dev veth0

# ping host connected to physical LAN that lan0 is on
ping 192.168.2.249 (works!)

I was trying to come up with a way to test this change and expected
this would fail (and your patch) would fix it based on what you're
described.

-George

>
> return skb;
>  }
> --
> 2.25.1
>


[PATCH bpf-next V1 0/2] bpf: Updates for BPF-helper bpf_check_mtu

2021-02-15 Thread Jesper Dangaard Brouer
The FIB lookup example[1] show how the IP-header field tot_len
(iph->tot_len) is used as input to perform the MTU check. The recently
added MTU check helper bpf_check_mtu() should also support this type
of MTU check.

Lets add this feature before merge window, please. This is a followup
to 34b2021cc616 ("bpf: Add BPF-helper for MTU checking").

[1] samples/bpf/xdp_fwd_kern.c
---

Jesper Dangaard Brouer (2):
  bpf: BPF-helper for MTU checking add length input
  selftests/bpf: Tests using bpf_check_mtu BPF-helper input mtu_len param


 include/uapi/linux/bpf.h   |   17 ++--
 net/core/filter.c  |   12 ++-
 tools/testing/selftests/bpf/prog_tests/check_mtu.c |4 +
 tools/testing/selftests/bpf/progs/test_check_mtu.c |   92 
 4 files changed, 117 insertions(+), 8 deletions(-)

--



[PATCH bpf-next V1 2/2] selftests/bpf: Tests using bpf_check_mtu BPF-helper input mtu_len param

2021-02-15 Thread Jesper Dangaard Brouer
Add tests that use mtu_len as input parameter in BPF-helper
bpf_check_mtu().

The BPF-helper is avail from both XDP and TC context. Add two tests
per context, one that tests below MTU and one that exceeds the MTU.

Signed-off-by: Jesper Dangaard Brouer 
---
 tools/testing/selftests/bpf/prog_tests/check_mtu.c |4 +
 tools/testing/selftests/bpf/progs/test_check_mtu.c |   92 
 2 files changed, 96 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/check_mtu.c 
b/tools/testing/selftests/bpf/prog_tests/check_mtu.c
index 36af1c138faf..b62a39315336 100644
--- a/tools/testing/selftests/bpf/prog_tests/check_mtu.c
+++ b/tools/testing/selftests/bpf/prog_tests/check_mtu.c
@@ -128,6 +128,8 @@ static void test_check_mtu_xdp(__u32 mtu, __u32 ifindex)
test_check_mtu_run_xdp(skel, skel->progs.xdp_use_helper, mtu);
test_check_mtu_run_xdp(skel, skel->progs.xdp_exceed_mtu, mtu);
test_check_mtu_run_xdp(skel, skel->progs.xdp_minus_delta, mtu);
+   test_check_mtu_run_xdp(skel, skel->progs.xdp_input_len, mtu);
+   test_check_mtu_run_xdp(skel, skel->progs.xdp_input_len_exceed, mtu);
 
 cleanup:
test_check_mtu__destroy(skel);
@@ -187,6 +189,8 @@ static void test_check_mtu_tc(__u32 mtu, __u32 ifindex)
test_check_mtu_run_tc(skel, skel->progs.tc_exceed_mtu, mtu);
test_check_mtu_run_tc(skel, skel->progs.tc_exceed_mtu_da, mtu);
test_check_mtu_run_tc(skel, skel->progs.tc_minus_delta, mtu);
+   test_check_mtu_run_tc(skel, skel->progs.tc_input_len, mtu);
+   test_check_mtu_run_tc(skel, skel->progs.tc_input_len_exceed, mtu);
 cleanup:
test_check_mtu__destroy(skel);
 }
diff --git a/tools/testing/selftests/bpf/progs/test_check_mtu.c 
b/tools/testing/selftests/bpf/progs/test_check_mtu.c
index b7787b43f9db..c4a9bae96e75 100644
--- a/tools/testing/selftests/bpf/progs/test_check_mtu.c
+++ b/tools/testing/selftests/bpf/progs/test_check_mtu.c
@@ -105,6 +105,54 @@ int xdp_minus_delta(struct xdp_md *ctx)
return retval;
 }
 
+SEC("xdp")
+int xdp_input_len(struct xdp_md *ctx)
+{
+   int retval = XDP_PASS; /* Expected retval on successful test */
+   void *data_end = (void *)(long)ctx->data_end;
+   void *data = (void *)(long)ctx->data;
+   __u32 ifindex = GLOBAL_USER_IFINDEX;
+   __u32 data_len = data_end - data;
+
+   /* API allow user give length to check as input via mtu_len param,
+* resulting MTU value is still output in mtu_len param after call.
+*
+* Input len is L3, like MTU and iph->tot_len.
+* Remember XDP data_len is L2.
+*/
+   __u32 mtu_len = data_len - ETH_HLEN;
+
+   if (bpf_check_mtu(ctx, ifindex, &mtu_len, 0, 0))
+   retval = XDP_ABORTED;
+
+   global_bpf_mtu_xdp = mtu_len;
+   return retval;
+}
+
+SEC("xdp")
+int xdp_input_len_exceed(struct xdp_md *ctx)
+{
+   int retval = XDP_ABORTED; /* Fail */
+   __u32 ifindex = GLOBAL_USER_IFINDEX;
+   int err;
+
+   /* API allow user give length to check as input via mtu_len param,
+* resulting MTU value is still output in mtu_len param after call.
+*
+* Input length value is L3 size like MTU.
+*/
+   __u32 mtu_len = GLOBAL_USER_MTU;
+
+   mtu_len += 1; /* Exceed with 1 */
+
+   err = bpf_check_mtu(ctx, ifindex, &mtu_len, 0, 0);
+   if (err == BPF_MTU_CHK_RET_FRAG_NEEDED)
+   retval = XDP_PASS ; /* Success in exceeding MTU check */
+
+   global_bpf_mtu_xdp = mtu_len;
+   return retval;
+}
+
 SEC("classifier")
 int tc_use_helper(struct __sk_buff *ctx)
 {
@@ -196,3 +244,47 @@ int tc_minus_delta(struct __sk_buff *ctx)
global_bpf_mtu_xdp = mtu_len;
return retval;
 }
+
+SEC("classifier")
+int tc_input_len(struct __sk_buff *ctx)
+{
+   int retval = BPF_OK; /* Expected retval on successful test */
+   __u32 ifindex = GLOBAL_USER_IFINDEX;
+
+   /* API allow user give length to check as input via mtu_len param,
+* resulting MTU value is still output in mtu_len param after call.
+*
+* Input length value is L3 size.
+*/
+   __u32 mtu_len = GLOBAL_USER_MTU;
+
+   if (bpf_check_mtu(ctx, ifindex, &mtu_len, 0, 0))
+   retval = BPF_DROP;
+
+   global_bpf_mtu_xdp = mtu_len;
+   return retval;
+}
+
+SEC("classifier")
+int tc_input_len_exceed(struct __sk_buff *ctx)
+{
+   int retval = BPF_DROP; /* Fail */
+   __u32 ifindex = GLOBAL_USER_IFINDEX;
+   int err;
+
+   /* API allow user give length to check as input via mtu_len param,
+* resulting MTU value is still output in mtu_len param after call.
+*
+* Input length value is L3 size like MTU.
+*/
+   __u32 mtu_len = GLOBAL_USER_MTU;
+
+   mtu_len += 1; /* Exceed with 1 */
+
+   err = bpf_check_mtu(ctx, ifindex, &mtu_len, 0, 0);
+   if (err == BPF_MTU_CHK_RET_FRAG_NEEDED)
+   retval = BPF_OK; /* Success in ex

[PATCH bpf-next V1 1/2] bpf: BPF-helper for MTU checking add length input

2021-02-15 Thread Jesper Dangaard Brouer
The FIB lookup example[1] show how the IP-header field tot_len
(iph->tot_len) is used as input to perform the MTU check.

This patch extend the BPF-helper bpf_check_mtu() with the same ability
to provide the length as user parameter input, via mtu_len parameter.

[1] samples/bpf/xdp_fwd_kern.c

Signed-off-by: Jesper Dangaard Brouer 
---
 include/uapi/linux/bpf.h |   17 +++--
 net/core/filter.c|   12 ++--
 2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4c24daa43bac..9c8aa50dc8a5 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3850,8 +3850,7 @@ union bpf_attr {
  *
  * long bpf_check_mtu(void *ctx, u32 ifindex, u32 *mtu_len, s32 len_diff, u64 
flags)
  * Description
-
- * Check ctx packet size against exceeding MTU of net device (based
+ * Check packet size against exceeding MTU of net device (based
  * on *ifindex*).  This helper will likely be used in combination
  * with helpers that adjust/change the packet size.
  *
@@ -3868,6 +3867,14 @@ union bpf_attr {
  * against the current net device.  This is practical if this isn't
  * used prior to redirect.
  *
+ * On input *mtu_len* must be a valid pointer, else verifier will
+ * reject BPF program.  If the value *mtu_len* is initialized to
+ * zero then the ctx packet size is use.  When value *mtu_len* is
+ * provided as input this specify the L3 length that the MTU check
+ * is done against. Remeber XDP and TC length operate at L2, but
+ * this value is L3 as this correlate to MTU and IP-header tot_len
+ * values which are L3 (similar behavior as bpf_fib_lookup).
+ *
  * The Linux kernel route table can configure MTUs on a more
  * specific per route level, which is not provided by this helper.
  * For route level MTU checks use the **bpf_fib_lookup**\ ()
@@ -3892,11 +3899,9 @@ union bpf_attr {
  *
  * On return *mtu_len* pointer contains the MTU value of the net
  * device.  Remember the net device configured MTU is the L3 size,
- * which is returned here and XDP and TX length operate at L2.
+ * which is returned here and XDP and TC length operate at L2.
  * Helper take this into account for you, but remember when using
- * MTU value in your BPF-code.  On input *mtu_len* must be a valid
- * pointer and be initialized (to zero), else verifier will reject
- * BPF program.
+ * MTU value in your BPF-code.
  *
  * Return
  * * 0 on success, and populate MTU value in *mtu_len* pointer.
diff --git a/net/core/filter.c b/net/core/filter.c
index 7059cf604d94..fcc3bda85960 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5660,7 +5660,7 @@ BPF_CALL_5(bpf_skb_check_mtu, struct sk_buff *, skb,
if (unlikely(flags & ~(BPF_MTU_CHK_SEGS)))
return -EINVAL;
 
-   if (unlikely(flags & BPF_MTU_CHK_SEGS && len_diff))
+   if (unlikely(flags & BPF_MTU_CHK_SEGS && (len_diff || *mtu_len)))
return -EINVAL;
 
dev = __dev_via_ifindex(dev, ifindex);
@@ -5670,7 +5670,11 @@ BPF_CALL_5(bpf_skb_check_mtu, struct sk_buff *, skb,
mtu = READ_ONCE(dev->mtu);
 
dev_len = mtu + dev->hard_header_len;
-   skb_len = skb->len + len_diff; /* minus result pass check */
+
+   /* If set use *mtu_len as input, L3 as iph->tot_len (like fib_lookup) */
+   skb_len = *mtu_len ? *mtu_len + dev->hard_header_len : skb->len;
+
+   skb_len += len_diff; /* minus result pass check */
if (skb_len <= dev_len) {
ret = BPF_MTU_CHK_RET_SUCCESS;
goto out;
@@ -5715,6 +5719,10 @@ BPF_CALL_5(bpf_xdp_check_mtu, struct xdp_buff *, xdp,
/* Add L2-header as dev MTU is L3 size */
dev_len = mtu + dev->hard_header_len;
 
+   /* Use *mtu_len as input, L3 as iph->tot_len (like fib_lookup) */
+   if (*mtu_len)
+   xdp_len = *mtu_len + dev->hard_header_len;
+
xdp_len += len_diff; /* minus result pass check */
if (xdp_len > dev_len)
ret = BPF_MTU_CHK_RET_FRAG_NEEDED;




[PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk

2021-02-15 Thread Maciej Fijalkowski
Hi,

This set is another approach towards addressing the below issue:

// load xdp prog and xskmap and add entry to xskmap at idx 10
$ sudo ./xdpsock -i ens801f0 -t -q 10

// add entry to xskmap at idx 11
$ sudo ./xdpsock -i ens801f0 -t -q 11

terminate one of the processes and another one is unable to work due to
the fact that the XDP prog was unloaded from interface.

Previous attempt was, to put it mildly, a bit broken, as there was no
synchronization between updates to additional map, as Bjorn pointed out.
See 
https://lore.kernel.org/netdev/20190603131907.13395-5-maciej.fijalkow...@intel.com/

In the meantime bpf_link was introduced and it seems that it can address
the issue of refcounting the XDP prog on interface. More info on commit
messages.

Thanks.

Maciej Fijalkowski (3):
  libbpf: xsk: use bpf_link
  libbpf: clear map_info before each bpf_obj_get_info_by_fd
  samples: bpf: do not unload prog within xdpsock

 samples/bpf/xdpsock_user.c |  55 --
 tools/lib/bpf/xsk.c| 147 +++--
 2 files changed, 139 insertions(+), 63 deletions(-)

-- 
2.20.1



[PATCH bpf-next 1/3] libbpf: xsk: use bpf_link

2021-02-15 Thread Maciej Fijalkowski
Currently, if there are multiple xdpsock instances running on a single
interface and in case one of the instances is terminated, the rest of
them are left in an inoperable state due to the fact of unloaded XDP
prog from interface.

To address that, step away from setting bpf prog in favour of bpf_link.
This means that refcounting of BPF resources will be done automatically
by bpf_link itself.

When setting up BPF resources during xsk socket creation, check whether
bpf_link for a given ifindex already exists via set of calls to
bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
and comparing the ifindexes from bpf_link and xsk socket.

If there's no bpf_link yet, create one for a given XDP prog and unload
explicitly existing prog if XDP_FLAGS_UPDATE_IF_NOEXIST is not set.

If bpf_link is already at a given ifindex and underlying program is not
AF-XDP one, bail out or update the bpf_link's prog given the presence of
XDP_FLAGS_UPDATE_IF_NOEXIST.

Signed-off-by: Maciej Fijalkowski 
---
 tools/lib/bpf/xsk.c | 143 +---
 1 file changed, 122 insertions(+), 21 deletions(-)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 20500fb1f17e..5911868efa43 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "bpf.h"
 #include "libbpf.h"
@@ -70,6 +71,7 @@ struct xsk_ctx {
int ifindex;
struct list_head list;
int prog_fd;
+   int link_fd;
int xsks_map_fd;
char ifname[IFNAMSIZ];
 };
@@ -409,7 +411,7 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk)
static const int log_buf_size = 16 * 1024;
struct xsk_ctx *ctx = xsk->ctx;
char log_buf[log_buf_size];
-   int err, prog_fd;
+   int prog_fd;
 
/* This is the fallback C-program:
 * SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
@@ -499,14 +501,47 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk)
return prog_fd;
}
 
-   err = bpf_set_link_xdp_fd(xsk->ctx->ifindex, prog_fd,
- xsk->config.xdp_flags);
-   if (err) {
-   close(prog_fd);
-   return err;
+   ctx->prog_fd = prog_fd;
+   return 0;
+}
+
+static int xsk_create_bpf_link(struct xsk_socket *xsk)
+{
+   /* bpf_link only accepts XDP_FLAGS_MODES, but xsk->config.xdp_flags
+* might have set XDP_FLAGS_UPDATE_IF_NOEXIST
+*/
+   DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
+   .flags = (xsk->config.xdp_flags & XDP_FLAGS_MODES));
+   struct xsk_ctx *ctx = xsk->ctx;
+   __u32 prog_id;
+   int link_fd;
+   int err;
+
+   /* for !XDP_FLAGS_UPDATE_IF_NOEXIST, unload the program first, if any,
+* so that bpf_link can be attached
+*/
+   if (!(xsk->config.xdp_flags & XDP_FLAGS_UPDATE_IF_NOEXIST)) {
+   err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, 
xsk->config.xdp_flags);
+   if (err) {
+   pr_warn("getting XDP prog id failed\n");
+   return err;
+   }
+   if (prog_id) {
+   err = bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
+   if (err < 0) {
+   pr_warn("detaching XDP prog failed\n");
+   return err;
+   }
+   }
}
 
-   ctx->prog_fd = prog_fd;
+   link_fd = bpf_link_create(ctx->prog_fd, xsk->ctx->ifindex, BPF_XDP, 
&opts);
+   if (link_fd < 0) {
+   pr_warn("bpf_link_create failed: %s\n", strerror(errno));
+   return link_fd;
+   }
+
+   ctx->link_fd = link_fd;
return 0;
 }
 
@@ -625,8 +660,32 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
}
 
err = 0;
-   if (ctx->xsks_map_fd == -1)
+   if (ctx->xsks_map_fd != -1)
+   goto out_map_ids;
+
+   /* if prog load is forced and we found bpf_link on a given ifindex BUT
+* the underlying XDP prog is not an AF-XDP one, close old prog's fd,
+* load AF-XDP and update existing bpf_link
+*/
+   if (!(xsk->config.xdp_flags & XDP_FLAGS_UPDATE_IF_NOEXIST)) {
+   close(ctx->prog_fd);
+   err = xsk_create_bpf_maps(xsk);
+   if (err)
+   goto out_map_ids;
+
+   err = xsk_load_xdp_prog(xsk);
+   if (err) {
+   xsk_delete_bpf_maps(xsk);
+   goto out_map_ids;
+   }
+
+   /* if update failed, caller will close fds */
+   err = bpf_link_update(ctx->link_fd, ctx->prog_fd, 0);
+   if (err)
+   pr_warn("bpf_link_update failed: %s\n", 
strerror(errno));
+   } else {
err = -ENOENT;
+   }
 
 out_map_ids:
free(map

[PATCH net-next] octeontx2-pf: fix an off by one bug in otx2_get_fecparam()

2021-02-15 Thread Dan Carpenter
The "<= FEC_MAX_INDEX" comparison should be "< FEC_MAX_INDEX".

I did some cleanup in this function to hopefully make the code a bit
clearer.  There was no blank line after the declaration block.  The
closing curly brace on the fec[] declaration normally goes on a line
by itself.  And I removed the FEC_MAX_INDEX define and used
ARRAY_SIZE(fec) instead.

Fixes: d0cf9503e908 ("octeontx2-pf: ethtool fec mode support")
Signed-off-by: Dan Carpenter 
---
 .../net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c| 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c 
b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
index 237e5d3321d4..0eaf11107cb7 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c
@@ -955,16 +955,17 @@ static int otx2_get_fecparam(struct net_device *netdev,
ETHTOOL_FEC_OFF,
ETHTOOL_FEC_BASER,
ETHTOOL_FEC_RS,
-   ETHTOOL_FEC_BASER | ETHTOOL_FEC_RS};
-#define FEC_MAX_INDEX 4
-   if (pfvf->linfo.fec < FEC_MAX_INDEX)
+   ETHTOOL_FEC_BASER | ETHTOOL_FEC_RS,
+   };
+
+   if (pfvf->linfo.fec < ARRAY_SIZE(fec))
fecparam->active_fec = fec[pfvf->linfo.fec];
 
rsp = otx2_get_fwdata(pfvf);
if (IS_ERR(rsp))
return PTR_ERR(rsp);
 
-   if (rsp->fwdata.supported_fec <= FEC_MAX_INDEX) {
+   if (rsp->fwdata.supported_fec < ARRAY_SIZE(fec)) {
if (!rsp->fwdata.supported_fec)
fecparam->fec = ETHTOOL_FEC_NONE;
else
-- 
2.30.0



Re: [PATCH] net: dsa: mv88e6xxx: prevent 2500BASEX mode override

2021-02-15 Thread Marek Behun
On Mon, 15 Feb 2021 15:29:44 +
Russell King - ARM Linux admin  wrote:

> On Mon, Feb 15, 2021 at 04:16:27PM +0100, Marek Behun wrote:
> > On Mon, 15 Feb 2021 14:57:57 +
> > Russell King - ARM Linux admin  wrote:
> >   
> > > On Mon, Feb 15, 2021 at 02:47:56PM +0100, Marek Behun wrote:  
> > > > On Mon, 15 Feb 2021 06:15:59 +
> > > > Nathan Rossi  wrote:
> > > > 
> > > > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c 
> > > > > b/drivers/net/dsa/mv88e6xxx/chip.c
> > > > > index 54aa942eed..5c52906b29 100644
> > > > > --- a/drivers/net/dsa/mv88e6xxx/chip.c
> > > > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> > > > > @@ -650,6 +650,13 @@ static void mv88e6xxx_validate(struct dsa_switch 
> > > > > *ds, int port,
> > > > >   if (chip->info->ops->phylink_validate)
> > > > >   chip->info->ops->phylink_validate(chip, port, mask, 
> > > > > state);
> > > > >  
> > > > > + /* Advertise 2500BASEX only if 1000BASEX is not configured, this
> > > > > +  * prevents phylink_helper_basex_speed from always overriding 
> > > > > the
> > > > > +  * 1000BASEX mode since auto negotiation is always enabled.
> > > > > +  */
> > > > > + if (state->interface == PHY_INTERFACE_MODE_1000BASEX)
> > > > > + phylink_clear(mask, 2500baseX_Full);
> > > > > +
> > > > 
> > > > I don't quite like this. This problem should be either solved in
> > > > phylink_helper_basex_speed() or somewhere in the mv88e6xxx code, but 
> > > > near
> > > > the call to phylink_helper_basex_speed().
> > > > 
> > > > Putting a solution to the behaviour of phylink_helper_basex_speed() it
> > > > into the validate() method when phylink_helper_basex_speed() is called
> > > > from a different place will complicate debugging in the future. If
> > > > we start solving problems in this kind of way, the driver will become
> > > > totally unreadable, IMO.
> > > 
> > > If we can't switch between 1000base-X and 2500base-X, then we should
> > > not be calling phylink_helper_basex_speed() - and only one of those
> > > two capabilities should be set in the validation callback. I thought
> > > there were DSA switches where we could program the CMODE to switch
> > > between these two...  
> > 
> > There are. At least Peridot, Topaz and Amethyst support switching
> > between these modes. But only on some ports.
> > 
> > This problem happnes on Peridot X, I think.
> > 
> > On Peridot X there are
> > - port 0: RGMII
> > - ports 9-10: capable of 1, 2.5 and 10G SerDes (10G via
> >   XAUI/RXAUI, so multiple lanes)
> > - ports 1-8: with copper PHYs
> >   - some of these can instead be set to use the unused SerDes lanes
> > of ports 9-10, but only in 1000base-x mode
> > 
> > So the problem can happen if you set port 9 or 10 to only use one
> > SerDes lane, and use the spare lanes for the 1G ports.
> > On these ports 2500base-x is not supported, only 1000base-x (maybe
> > sgmii, I don't remember)  
> 
> It sounds like the modes are not reporting correctly then before calling
> phylink_helper_basex_speed(). If the port can only be used at 1000base-X,
> then it should not be allowing 2500base-X to be set prior to calling
> phylink_helper_basex_speed().
> 

Hmm. It doesn't seem that way. Ports 1-8 only set support for 1000baseT
and 1000baseX. And for lower modes if state->interface is not 8023z.

Nathan, what switch do you use and on which port does this happen?

Marek


Re: [PATCH 1/2] lockdep: add lockdep_assert_not_held()

2021-02-15 Thread Peter Zijlstra
On Mon, Feb 15, 2021 at 02:12:30PM +0100, Johannes Berg wrote:
> On Mon, 2021-02-15 at 11:44 +0100, Peter Zijlstra wrote:
> > 
> > I think something like so will work, but please double check.
> 
> Yeah, that looks better.
> 
> > +++ b/include/linux/lockdep.h
> > @@ -294,11 +294,15 @@ extern void lock_unpin_lock(struct lockdep_map *lock, 
> > struct pin_cookie);
> >  
> >  #define lockdep_depth(tsk) (debug_locks ? (tsk)->lockdep_depth : 0)
> >  
> > -#define lockdep_assert_held(l) do {\
> > -   WARN_ON(debug_locks && !lockdep_is_held(l));\
> > +#define lockdep_assert_held(l) do {
> > \
> > +   WARN_ON(debug_locks && lockdep_is_held(l) == 0));   \
> > } while (0)
> 
> That doesn't really need to change? It's the same.

Correct, but I found it more symmetric vs the not implementation below.

> > -#define lockdep_assert_held_write(l)   do {\
> > +#define lockdep_assert_not_held(l) do {\
> > +   WARN_ON(debug_locks && lockdep_is_held(l) == 1));   \
> > +   } while (0)
> > +
> > +#define lockdep_assert_held_write(l)   do {
> > \
> > WARN_ON(debug_locks && !lockdep_is_held_type(l, 0));\
> > } while (0)
> >  
> > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> > index c1418b47f625..983ba206f7b2 100644
> > --- a/kernel/locking/lockdep.
> > +++ b/kernel/locking/lockdep.c
> > @@ -5467,7 +5467,7 @@ noinstr int lock_is_held_type(const struct 
> > lockdep_map *lock, int read)
> > int ret = 0;
> >  
> > if (unlikely(!lockdep_enabled()))
> > -   return 1; /* avoid false negative lockdep_assert_held() */
> > +   return -1; /* avoid false negative lockdep_assert_held() */
> 
> Maybe add lockdep_assert_not_held() to the comment, to explain the -1
> (vs non-zero)?

Yeah, or frob a '*' in there.


Re: [PATCH v1 1/3] string: Consolidate yesno() helpers under string.h hood

2021-02-15 Thread Andy Shevchenko
On Mon, Feb 15, 2021 at 04:37:50PM +0200, Jani Nikula wrote:
> On Mon, 15 Feb 2021, Andy Shevchenko  
> wrote:
> > We have already few similar implementation and a lot of code that can 
> > benefit
> > of the yesno() helper.  Consolidate yesno() helpers under string.h hood.
> 
> Good luck. I gave up after just four versions. [1]

Thanks for a pointer! I like your version, but here we also discussing a
possibility to do something like %py[DOY]. It will consolidate all those RO or
whatever sections inside one data structure.

> Acked-by: Jani Nikula 
> 
> [1] http://lore.kernel.org/r/20191023131308.9420-1-jani.nik...@intel.com


-- 
With Best Regards,
Andy Shevchenko




Re: [net-next] tcp: Sanitize CMSG flags and reserved args in tcp_zerocopy_receive.

2021-02-15 Thread Dan Carpenter
On Mon, Feb 15, 2021 at 08:04:11AM -0700, David Ahern wrote:
> On 2/15/21 5:03 AM, Dan Carpenter wrote:
> > Hi Arjun,
> > 
> > url:
> > https://github.com/0day-ci/linux/commits/Arjun-Roy/tcp-Sanitize-CMSG-flags-and-reserved-args-in-tcp_zerocopy_receive/20210212-052537
> >  
> > base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git  
> > e4b62cf7559f2ef9a022de235e5a09a8d7ded520
> > config: x86_64-randconfig-m001-20210209 (attached as .config)
> > compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
> > 
> > If you fix the issue, kindly add following tag as appropriate
> > Reported-by: kernel test robot 
> > Reported-by: Dan Carpenter 
> > 
> > smatch warnings:
> > net/ipv4/tcp.c:4158 do_tcp_getsockopt() warn: check for integer overflow 
> > 'len'
> > 
> > vim +/len +4158 net/ipv4/tcp.c
> > 
> > 3fdadf7d27e3fb Dmitry Mishin2006-03-20  3896  static int 
> > do_tcp_getsockopt(struct sock *sk, int level,
> > 3fdadf7d27e3fb Dmitry Mishin2006-03-20  3897int 
> > optname, char __user *optval, int __user *optlen)
> > ^1da177e4c3f41 Linus Torvalds   2005-04-16  3898  {
> > 295f7324ff8d9e Arnaldo Carvalho de Melo 2005-08-09  3899struct 
> > inet_connection_sock *icsk = inet_csk(sk);
> > ^1da177e4c3f41 Linus Torvalds   2005-04-16  3900struct tcp_sock 
> > *tp = tcp_sk(sk);
> > 6fa251663069e0 Nikolay Borisov  2016-02-03  3901struct net *net 
> > = sock_net(sk);
> > ^1da177e4c3f41 Linus Torvalds   2005-04-16  3902int val, len;
> > 
> > "len" is int.
> > 
> > [ snip ]
> > 05255b823a6173 Eric Dumazet 2018-04-27  4146  #ifdef CONFIG_MMU
> > 05255b823a6173 Eric Dumazet 2018-04-27  4147case 
> > TCP_ZEROCOPY_RECEIVE: {
> > 7eeba1706eba6d Arjun Roy2021-01-20  4148struct 
> > scm_timestamping_internal tss;
> > e0fecb289ad3fd Arjun Roy2020-12-10  4149struct 
> > tcp_zerocopy_receive zc = {};
> > 05255b823a6173 Eric Dumazet 2018-04-27  4150int err;
> > 05255b823a6173 Eric Dumazet 2018-04-27  4151  
> > 05255b823a6173 Eric Dumazet 2018-04-27  4152if 
> > (get_user(len, optlen))
> > 05255b823a6173 Eric Dumazet 2018-04-27  4153
> > return -EFAULT;
> > c8856c05145490 Arjun Roy2020-02-14  4154if (len 
> > < offsetofend(struct tcp_zerocopy_receive, length))
> > 05255b823a6173 Eric Dumazet 2018-04-27  4155
> > return -EINVAL;
> > 
> > 
> > The problem is that negative values of "len" are type promoted to high
> > positive values.  So the fix is to write this as:
> > 
> > if (len < 0 || len < offsetofend(struct tcp_zerocopy_receive, length))
> > return -EINVAL;
> > 
> > 110912bdf28392 Arjun Roy2021-02-11  4156if 
> > (unlikely(len > sizeof(zc))) {
> > 110912bdf28392 Arjun Roy2021-02-11  4157
> > err = check_zeroed_user(optval + sizeof(zc),
> > 110912bdf28392 Arjun Roy2021-02-11 @4158
> > len - sizeof(zc));
> > 
> > 
> > Potentially "len - a negative value".
> > 
> > 
> 
> get_user(len, optlen) is called multiple times in that function. len < 0
> was checked after the first one at the top.
> 

What you're describing is a "Double Fetch" bug, where the attack is we
get some data from the user, and we verify it, then we get it from the
user a second time and trust it.  The problem is that the user modifies
it between the first and second get_user() call so it ends up being a
security vulnerability.

But I'm glad you pointed out the first get_user() because it has an
ancient, harmless pre git bug in it.

net/ipv4/tcp.c
  3888  static int do_tcp_getsockopt(struct sock *sk, int level,
  3889  int optname, char __user *optval, int __user *optlen)
  3890  {
  3891  struct inet_connection_sock *icsk = inet_csk(sk);
  3892  struct tcp_sock *tp = tcp_sk(sk);
  3893  struct net *net = sock_net(sk);
  3894  int val, len;
  3895  
  3896  if (get_user(len, optlen))
  3897  return -EFAULT;
  3898  
  3899  len = min_t(unsigned int, len, sizeof(int));
  3900  
  3901  if (len < 0)
^^^
This is impossible.  "len" has to be in the 0-4 range because of the
min_t() assignment.  It's harmless though and the condition should just
be removed.

  3902  return -EINVAL;
  3903  
  3904  switch (optname) {
  3905  case TCP_MAXSEG:

Anyway, I will create a new Smatch warning for this situation.

> Also, maybe I am missing something here, but offsetofend can not return
> a negative value, so this checks catches len < 0 as well:
> 
>   if (le

Re: [PATCH bpf-next 0/3] Introduce bpf_link in libbpf's xsk

2021-02-15 Thread Björn Töpel

On 2021-02-15 16:46, Maciej Fijalkowski wrote:

Hi,

This set is another approach towards addressing the below issue:

// load xdp prog and xskmap and add entry to xskmap at idx 10
$ sudo ./xdpsock -i ens801f0 -t -q 10

// add entry to xskmap at idx 11
$ sudo ./xdpsock -i ens801f0 -t -q 11

terminate one of the processes and another one is unable to work due to
the fact that the XDP prog was unloaded from interface.

Previous attempt was, to put it mildly, a bit broken, as there was no
synchronization between updates to additional map, as Bjorn pointed out.
See 
https://lore.kernel.org/netdev/20190603131907.13395-5-maciej.fijalkow...@intel.com/

In the meantime bpf_link was introduced and it seems that it can address
the issue of refcounting the XDP prog on interface. More info on commit
messages.



For the series:

Reviewed-by: Björn Töpel 
Acked-by: Björn Töpel 

Finally, bpf_link/scoped XDP support! Thanks a lot!


Björn




Thanks.

Maciej Fijalkowski (3):
   libbpf: xsk: use bpf_link
   libbpf: clear map_info before each bpf_obj_get_info_by_fd
   samples: bpf: do not unload prog within xdpsock

  samples/bpf/xdpsock_user.c |  55 --
  tools/lib/bpf/xsk.c| 147 +++--
  2 files changed, 139 insertions(+), 63 deletions(-)



[PATCH bpf-next 3/3] samples: bpf: do not unload prog within xdpsock

2021-02-15 Thread Maciej Fijalkowski
With the introduction of bpf_link in xsk's libbpf part, there's no
further need for explicit unload of prog on xdpsock's termination. When
process dies, the bpf_link's refcount will be decremented and resources
will be unloaded/freed under the hood in case when there are no more
active users.

While at it, don't dump stats on error path.

Signed-off-by: Maciej Fijalkowski 
---
 samples/bpf/xdpsock_user.c | 55 ++
 1 file changed, 14 insertions(+), 41 deletions(-)

diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index db0cb73513a5..96246313e342 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -96,7 +96,6 @@ static int opt_xsk_frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE;
 static int opt_timeout = 1000;
 static bool opt_need_wakeup = true;
 static u32 opt_num_xsks = 1;
-static u32 prog_id;
 static bool opt_busy_poll;
 static bool opt_reduced_cap;
 
@@ -462,59 +461,37 @@ static void *poller(void *arg)
return NULL;
 }
 
-static void remove_xdp_program(void)
+static void int_exit(int sig)
 {
-   u32 curr_prog_id = 0;
-   int cmd = CLOSE_CONN;
-
-   if (bpf_get_link_xdp_id(opt_ifindex, &curr_prog_id, opt_xdp_flags)) {
-   printf("bpf_get_link_xdp_id failed\n");
-   exit(EXIT_FAILURE);
-   }
-   if (prog_id == curr_prog_id)
-   bpf_set_link_xdp_fd(opt_ifindex, -1, opt_xdp_flags);
-   else if (!curr_prog_id)
-   printf("couldn't find a prog id on a given interface\n");
-   else
-   printf("program on interface changed, not removing\n");
-
-   if (opt_reduced_cap) {
-   if (write(sock, &cmd, sizeof(int)) < 0) {
-   fprintf(stderr, "Error writing into stream socket: %s", 
strerror(errno));
-   exit(EXIT_FAILURE);
-   }
-   }
+   benchmark_done = true;
 }
 
-static void int_exit(int sig)
+static void __exit_with_error(int error, const char *file, const char *func,
+ int line)
 {
-   benchmark_done = true;
+   fprintf(stderr, "%s:%s:%i: errno: %d/\"%s\"\n", file, func,
+   line, error, strerror(error));
+   exit(EXIT_FAILURE);
 }
 
+#define exit_with_error(error) __exit_with_error(error, __FILE__, __func__, 
__LINE__)
+
 static void xdpsock_cleanup(void)
 {
struct xsk_umem *umem = xsks[0]->umem->umem;
-   int i;
+   int i, cmd = CLOSE_CONN;
 
dump_stats();
for (i = 0; i < num_socks; i++)
xsk_socket__delete(xsks[i]->xsk);
(void)xsk_umem__delete(umem);
-   remove_xdp_program();
-}
 
-static void __exit_with_error(int error, const char *file, const char *func,
- int line)
-{
-   fprintf(stderr, "%s:%s:%i: errno: %d/\"%s\"\n", file, func,
-   line, error, strerror(error));
-   dump_stats();
-   remove_xdp_program();
-   exit(EXIT_FAILURE);
+   if (opt_reduced_cap) {
+   if (write(sock, &cmd, sizeof(int)) < 0)
+   exit_with_error(errno);
+   }
 }
 
-#define exit_with_error(error) __exit_with_error(error, __FILE__, __func__, \
-__LINE__)
 static void swap_mac_addresses(void *data)
 {
struct ether_header *eth = (struct ether_header *)data;
@@ -880,10 +857,6 @@ static struct xsk_socket_info *xsk_configure_socket(struct 
xsk_umem_info *umem,
if (ret)
exit_with_error(-ret);
 
-   ret = bpf_get_link_xdp_id(opt_ifindex, &prog_id, opt_xdp_flags);
-   if (ret)
-   exit_with_error(-ret);
-
xsk->app_stats.rx_empty_polls = 0;
xsk->app_stats.fill_fail_polls = 0;
xsk->app_stats.copy_tx_sendtos = 0;
-- 
2.20.1



Re: [PATCH 1/2] lockdep: add lockdep_assert_not_held()

2021-02-15 Thread Johannes Berg
On Mon, 2021-02-15 at 17:04 +0100, Peter Zijlstra wrote:
> On Mon, Feb 15, 2021 at 02:12:30PM +0100, Johannes Berg wrote:
> > On Mon, 2021-02-15 at 11:44 +0100, Peter Zijlstra wrote:
> > > I think something like so will work, but please double check.
> > 
> > Yeah, that looks better.
> > 
> > > +++ b/include/linux/lockdep.h
> > > @@ -294,11 +294,15 @@ extern void lock_unpin_lock(struct lockdep_map 
> > > *lock, struct pin_cookie);
> > >  
> > >  #define lockdep_depth(tsk)   (debug_locks ? (tsk)->lockdep_depth : 0)
> > >  
> > > -#define lockdep_assert_held(l)   do {\
> > > - WARN_ON(debug_locks && !lockdep_is_held(l));\
> > > +#define lockdep_assert_held(l)   do {
> > > \
> > > + WARN_ON(debug_locks && lockdep_is_held(l) == 0));   \
> > >   } while (0)
> > 
> > That doesn't really need to change? It's the same.
> 
> Correct, but I found it more symmetric vs the not implementation below.

Fair enough. One might argue that you should have an

enum lockdep_lock_state {
LOCK_STATE_NOT_HELD, /* 0 now */
LOCK_STATE_HELD, /* 1 now */
LOCK_STATE_UNKNOWN, /* -1 with your patch but might as well be 2 */
};

:)

johannes



Re: [PATCH] b43: N-PHY: Fix the update of coef for the PHY revision >= 3case

2021-02-15 Thread Larry Finger

On 2/15/21 6:05 AM, Colin King wrote:

From: Colin Ian King 

The documentation for the PHY update [1] states:

Loop 4 times with index i

 If PHY Revision >= 3
 Copy table[i] to coef[i]
 Otherwise
 Set coef[i] to 0

the copy of the table to coef is currently implemented the wrong way
around, table is being updated from uninitialized values in coeff.
Fix this by swapping the assignment around.

[1] https://bcm-v4.sipsolutions.net/802.11/PHY/N/RestoreCal/

Fixes: 2f258b74d13c ("b43: N-PHY: implement restoring general configuration")
Addresses-Coverity: ("Uninitialized scalar variable")
Signed-off-by: Colin Ian King 
---
  drivers/net/wireless/broadcom/b43/phy_n.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/broadcom/b43/phy_n.c 
b/drivers/net/wireless/broadcom/b43/phy_n.c
index b669dff24b6e..665b737fbb0d 100644
--- a/drivers/net/wireless/broadcom/b43/phy_n.c
+++ b/drivers/net/wireless/broadcom/b43/phy_n.c
@@ -5311,7 +5311,7 @@ static void b43_nphy_restore_cal(struct b43_wldev *dev)
  
  	for (i = 0; i < 4; i++) {

if (dev->phy.rev >= 3)
-   table[i] = coef[i];
+   coef[i] = table[i];
else
coef[i] = 0;
}



Acked-by: Larry Finger 

Good catch, thanks.

Larry



[PATCH][next] i40e: Fix uninitialized variable mfs_max

2021-02-15 Thread Colin King
From: Colin Ian King 

The variable mfs_max is not initialized and is being compared to find
the maximum value. Fix this by initializing it to 0.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 90bc8e003be2 ("i40e: Add hardware configuration for software based DCB")
Signed-off-by: Colin Ian King 
---
 drivers/net/ethernet/intel/i40e/i40e_dcb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_dcb.c 
b/drivers/net/ethernet/intel/i40e/i40e_dcb.c
index 7b73a279d46e..243b0d2b7b72 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_dcb.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_dcb.c
@@ -1636,7 +1636,7 @@ void i40e_dcb_hw_calculate_pool_sizes(struct i40e_hw *hw,
u32 total_pool_size = 0;
int shared_pool_size; /* Need signed variable */
u32 port_pb_size;
-   u32 mfs_max;
+   u32 mfs_max = 0;
u32 pcirtt;
u8 i;
 
-- 
2.30.0



[PATCH bpf-next 2/3] libbpf: clear map_info before each bpf_obj_get_info_by_fd

2021-02-15 Thread Maciej Fijalkowski
xsk_lookup_bpf_maps, based on prog_fd, looks whether current prog has a
reference to XSKMAP. BPF prog can include insns that work on various BPF
maps and this is covered by iterating through map_ids.

The bpf_map_info that is passed to bpf_obj_get_info_by_fd for filling
needs to be cleared at each iteration, so that it doesn't any outdated
fields and that is currently missing in the function of interest.

To fix that, zero-init map_info via memset before each
bpf_obj_get_info_by_fd call.

Also, since the area of this code is touched, in general strcmp is
considered harmful, so let's convert it to strncmp and provide the
length of the array name that we're looking for.

Signed-off-by: Maciej Fijalkowski 
---
 tools/lib/bpf/xsk.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index 5911868efa43..fb259c0bba93 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -616,6 +616,7 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
__u32 i, *map_ids, num_maps, prog_len = sizeof(struct bpf_prog_info);
__u32 map_len = sizeof(struct bpf_map_info);
struct bpf_prog_info prog_info = {};
+   const char *map_name = "xsks_map";
struct xsk_ctx *ctx = xsk->ctx;
struct bpf_map_info map_info;
int fd, err;
@@ -645,13 +646,14 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
if (fd < 0)
continue;
 
+   memset(&map_info, 0, map_len);
err = bpf_obj_get_info_by_fd(fd, &map_info, &map_len);
if (err) {
close(fd);
continue;
}
 
-   if (!strcmp(map_info.name, "xsks_map")) {
+   if (!strncmp(map_info.name, map_name, strlen(map_name))) {
ctx->xsks_map_fd = fd;
continue;
}
-- 
2.20.1



RE: [EXT] Re: Phylink flow control support on ports with MLO_AN_FIXED auto negotiation

2021-02-15 Thread Stefan Chulski
> > > I discussed it with Andrew earlier last year, and his response was:
> > >
> > >  DT configuration of pause for fixed link probably is sufficient. I
> > > don't  remember it ever been really discussed for DSA. It was a
> > > Melanox  discussion about limiting pause for the CPU. So I think it
> > > is safe to  not implement ethtool -A, at least until somebody has a
> > > real use case  for it.
> > >
> > > So I chose not to support it - no point supporting features that
> > > people aren't using. If you have a "real use case" then it can be added.
> >
> > This patch may be sufficient - I haven't fully considered all the
> > implications of changing this though.
> 
> Did you try this patch? What's the outcome?

For me patch worked as expected.

Thanks,
Stefan. 

> >
> >  drivers/net/phy/phylink.c | 9 +++--
> >  1 file changed, 3 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> > index 7e0fdc17c8ee..2ee0d4dcf9a5 100644
> > --- a/drivers/net/phy/phylink.c
> > +++ b/drivers/net/phy/phylink.c
> > @@ -673,7 +673,6 @@ static void phylink_resolve(struct work_struct *w)
> > switch (pl->cur_link_an_mode) {
> > case MLO_AN_PHY:
> > link_state = pl->phy_state;
> > -   phylink_apply_manual_flow(pl, &link_state);
> > mac_config = link_state.link;
> > break;
> >
> > @@ -698,11 +697,12 @@ static void phylink_resolve(struct work_struct
> *w)
> > link_state.pause = pl->phy_state.pause;
> > mac_config = true;
> > }
> > -   phylink_apply_manual_flow(pl, &link_state);
> > break;
> > }
> > }
> >
> > +   phylink_apply_manual_flow(pl, &link_state);
> > +
> > if (mac_config) {
> > if (link_state.interface != pl->link_config.interface) {
> > /* The interface has changed, force the link down
> and @@ -1639,9
> > +1639,6 @@ int phylink_ethtool_set_pauseparam(struct phylink *pl,
> >
> > ASSERT_RTNL();
> >
> > -   if (pl->cur_link_an_mode == MLO_AN_FIXED)
> > -   return -EOPNOTSUPP;
> > -
> > if (!phylink_test(pl->supported, Pause) &&
> > !phylink_test(pl->supported, Asym_Pause))
> > return -EOPNOTSUPP;
> > @@ -1684,7 +1681,7 @@ int phylink_ethtool_set_pauseparam(struct
> phylink *pl,
> > /* Update our in-band advertisement, triggering a renegotiation if
> >  * the advertisement changed.
> >  */
> > -   if (!pl->phydev)
> > +   if (!pl->phydev && pl->cur_link_an_mode != MLO_AN_FIXED)
> > phylink_change_inband_advert(pl);
> >
> > mutex_unlock(&pl->state_mutex);
> >
> > --
> > RMK's Patch system:
> > https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__www.armlinux.org.
> >
> uk_developer_patches_&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=DDQ
> 3dKwkTIxK
> >
> Al6_Bs7GMx4zhJArrXKN2mDMOXGh7lg&m=bLvAkwDrYioAER_dvXEqutiRiU
> W57bKfscMh
> > 71TGDxw&s=p5jgFDs7cxtpIE9LZ3ogOahGzpuKjG4PHOcPF6qXnXI&e=
> > FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
> 
> --
> RMK's Patch system: https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__www.armlinux.org.uk_developer_patches_&d=DwIBAg&c=nKjWec2b6
> R0mOyPaz7xtfQ&r=DDQ3dKwkTIxKAl6_Bs7GMx4zhJArrXKN2mDMOXGh7lg&
> m=bLvAkwDrYioAER_dvXEqutiRiUW57bKfscMh71TGDxw&s=p5jgFDs7cxtpIE9
> LZ3ogOahGzpuKjG4PHOcPF6qXnXI&e=
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


[PATCH] i40e: Fix incorrect use of ip6src due to copy-paste coding error

2021-02-15 Thread Colin King
From: Colin Ian King 

It appears that the call of ipv6_add_any for the destination address
is using ip6src instead of ip6dst, this looks like a copy-paste
coding error. Fix this by replacing ip6src with ip6dst.

Addresses-Coverity: ("Copy-paste error")
Fixes: efca91e89b67 ("i40e: Add flow director support for IPv6")
Signed-off-by: Colin Ian King 
---
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c 
b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index 8a4dd77a12da..a8a2b5f683a2 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -4250,7 +4250,7 @@ static int i40e_check_fdir_input_set(struct i40e_vsi *vsi,
(struct in6_addr *)&ipv6_full_mask))
new_mask |= I40E_L3_V6_DST_MASK;
else if (ipv6_addr_any((struct in6_addr *)
-  &tcp_ip6_spec->ip6src))
+  &tcp_ip6_spec->ip6dst))
new_mask &= ~I40E_L3_V6_DST_MASK;
else
return -EOPNOTSUPP;
-- 
2.30.0



Re: [EXT] Re: Phylink flow control support on ports with MLO_AN_FIXED auto negotiation

2021-02-15 Thread Russell King - ARM Linux admin
On Mon, Feb 15, 2021 at 04:19:19PM +, Stefan Chulski wrote:
> > > > I discussed it with Andrew earlier last year, and his response was:
> > > >
> > > >  DT configuration of pause for fixed link probably is sufficient. I
> > > > don't  remember it ever been really discussed for DSA. It was a
> > > > Melanox  discussion about limiting pause for the CPU. So I think it
> > > > is safe to  not implement ethtool -A, at least until somebody has a
> > > > real use case  for it.
> > > >
> > > > So I chose not to support it - no point supporting features that
> > > > people aren't using. If you have a "real use case" then it can be added.
> > >
> > > This patch may be sufficient - I haven't fully considered all the
> > > implications of changing this though.
> > 
> > Did you try this patch? What's the outcome?
> 
> For me patch worked as expected.

Great, thanks. Andrew, do you have any further opinions on this
subject, or shall I sent a proper patch for net-next?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: HSR/PRP sequence counter issue with Cisco Redbox

2021-02-15 Thread George McCollister
On Mon, Feb 15, 2021 at 6:30 AM Wenzel, Marco  wrote:
>
> > On Wed, Jan 27, 2021 at 6:32 AM Wenzel, Marco  > eberle.de> wrote:
> > >
> > > Hi,
> > >
> > > we have figured out an issue with the current PRP driver when trying to
> > communicate with Cisco IE 2000 industrial Ethernet switches in Redbox
> > mode. The Cisco always resets the HSR/PRP sequence counter to "1" at low
> > traffic (<= 1 frame in 400 ms). It can be reproduced by a simple ICMP echo
> > request with 1 s interval between a Linux box running with PRP and a VDAN
> > behind the Cisco Redbox. The Linux box then always receives frames with
> > sequence counter "1" and drops them. The behavior is not configurable at
> > the Cisco Redbox.
> > >
> > > I fixed it by ignoring sequence counters with value "1" at the sequence
> > counter check in hsr_register_frame_out ():
> > >
> > > diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c index
> > > 5c97de459905..630c238e81f0 100644
> > > --- a/net/hsr/hsr_framereg.c
> > > +++ b/net/hsr/hsr_framereg.c
> > > @@ -411,7 +411,7 @@ void hsr_register_frame_in(struct hsr_node *node,
> > > struct hsr_port *port,  int hsr_register_frame_out(struct hsr_port *port,
> > struct hsr_node *node,
> > >u16 sequence_nr)  {
> > > -   if (seq_nr_before_or_eq(sequence_nr, node->seq_out[port->type]))
> > > +   if (seq_nr_before_or_eq(sequence_nr,
> > > + node->seq_out[port->type]) && (sequence_nr != 1))
> > > return 1;
> > >
> > > node->seq_out[port->type] = sequence_nr;
> > >
> > >
> > > Do you think this could be a solution? Should this patch be officially 
> > > applied
> > in order to avoid other users running into these communication issues?
> >
> > This isn't the correct way to solve the problem. IEC 62439-3 defines
> > EntryForgetTime as "Time after which an entry is removed from the duplicate
> > table" with a value of 400ms and states devices should usually be configured
> > to keep entries in the table for a much shorter time. hsr_framereg.c needs 
> > to
> > be reworked to handle this according to the specification.
>
> Sorry for the delay but I did not have the time to take a closer look at the 
> problem until now.
>
> My suggestion for the EntryForgetTime feature would be the following: A 
> time_out element will be added to the hsr_node structure, which always stores 
> the current time when entering hsr_register_frame_out(). If the last stored 
> time is older than EntryForgetTime (400 ms) the sequence number check will be 
> ignored.
>
> diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c
> index 5c97de459905..a97bffbd2581 100644
> --- a/net/hsr/hsr_framereg.c
> +++ b/net/hsr/hsr_framereg.c
> @@ -164,8 +164,10 @@ static struct hsr_node *hsr_add_node(struct hsr_priv 
> *hsr,
>  * as initialization. (0 could trigger an spurious ring error 
> warning).
>  */
> now = jiffies;
> -   for (i = 0; i < HSR_PT_PORTS; i++)
> +   for (i = 0; i < HSR_PT_PORTS; i++) {
> new_node->time_in[i] = now;
> +   new_node->time_out[i] = now;
> +   }
> for (i = 0; i < HSR_PT_PORTS; i++)
> new_node->seq_out[i] = seq_out;
>
> @@ -411,9 +413,12 @@ void hsr_register_frame_in(struct hsr_node *node, struct 
> hsr_port *port,
>  int hsr_register_frame_out(struct hsr_port *port, struct hsr_node *node,
>u16 sequence_nr)
>  {
> -   if (seq_nr_before_or_eq(sequence_nr, node->seq_out[port->type]))
> +   if (seq_nr_before_or_eq(sequence_nr, node->seq_out[port->type]) &&
> +time_is_after_jiffies(node->time_out[port->type] + 
> msecs_to_jiffies(HSR_ENTRY_FORGET_TIME))) {
> return 1;
> +   }
>
> +   node->time_out[port->type] = jiffies;
> node->seq_out[port->type] = sequence_nr;
> return 0;
>  }
> diff --git a/net/hsr/hsr_framereg.h b/net/hsr/hsr_framereg.h
> index 86b43f539f2c..d9628e7a5f05 100644
> --- a/net/hsr/hsr_framereg.h
> +++ b/net/hsr/hsr_framereg.h
> @@ -75,6 +75,7 @@ struct hsr_node {
> enum hsr_port_type  addr_B_port;
> unsigned long   time_in[HSR_PT_PORTS];
> booltime_in_stale[HSR_PT_PORTS];
> +   unsigned long   time_out[HSR_PT_PORTS];
> /* if the node is a SAN */
> boolsan_a;
> boolsan_b;
> diff --git a/net/hsr/hsr_main.h b/net/hsr/hsr_main.h
> index 7dc92ce5a134..f79ca55d6986 100644
> --- a/net/hsr/hsr_main.h
> +++ b/net/hsr/hsr_main.h
> @@ -21,6 +21,7 @@
>  #define HSR_LIFE_CHECK_INTERVAL 2000 /* ms */
>  #define HSR_NODE_FORGET_TIME   6 /* ms */
>  #define HSR_ANNOUNCE_INTERVAL100 /* ms */
> +#define HSR_ENTRY_FORGET_TIME400 /* ms */
>
>  /* By how much may slave1 and slave2 timestamps of latest received frame from
>   * each node differ before we notify of communication problem?
>
>
> This app

re: octeontx2-af: cn10k: MAC internal loopback support

2021-02-15 Thread Colin Ian King
Hi,

Static analysis on linux-next today using Coverity found an issue in the
following commit:

commit 3ad3f8f93c81f81d6e28b2e286b03669cc1fb3b0
Author: Hariprasad Kelam 
Date:   Thu Feb 11 21:28:34 2021 +0530

octeontx2-af: cn10k: MAC internal loopback support

The analysis is as follows:

723 static int rvu_cgx_config_intlbk(struct rvu *rvu, u16 pcifunc, bool en)
724 {
725struct mac_ops *mac_ops;

   1. var_decl: Declaring variable lmac_id without initializer.

726u8 cgx_id, lmac_id;
727

   2. Condition !is_cgx_config_permitted(rvu, pcifunc), taking false branch.

728if (!is_cgx_config_permitted(rvu, pcifunc))
729return -EPERM;
730

Uninitialized scalar variable (UNINIT)

731mac_ops = get_mac_ops(rvu_cgx_pdata(cgx_id, rvu));
732

   Uninitialized scalar variable (UNINIT)
   3. uninit_use_in_call: Using uninitialized value lmac_id when calling
*mac_ops->mac_lmac_intl_lbk.

733return mac_ops->mac_lmac_intl_lbk(rvu_cgx_pdata(cgx_id, rvu),
734  lmac_id, en);
735 }

Variables cgx_id and lmac_id are no longer being initialized and garbage
values are being passed into function calls.  Originally, these
variables were being initialized with a call to rvu_get_cgx_lmac_id()
but that has now been removed.

Colin


Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link

2021-02-15 Thread Toke Høiland-Jørgensen
Maciej Fijalkowski  writes:

> Currently, if there are multiple xdpsock instances running on a single
> interface and in case one of the instances is terminated, the rest of
> them are left in an inoperable state due to the fact of unloaded XDP
> prog from interface.
>
> To address that, step away from setting bpf prog in favour of bpf_link.
> This means that refcounting of BPF resources will be done automatically
> by bpf_link itself.
>
> When setting up BPF resources during xsk socket creation, check whether
> bpf_link for a given ifindex already exists via set of calls to
> bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
> and comparing the ifindexes from bpf_link and xsk socket.

One consideration here is that bpf_link_get_fd_by_id() is a privileged
operation (privileged as in CAP_SYS_ADMIN), so this has the side effect
of making AF_XDP privileged as well. Is that the intention?

Another is that the AF_XDP code is in the process of moving to libxdp
(see in-progress PR [0]), and this approach won't carry over as-is to
that model, because libxdp has to pin the bpf_link fds.

However, in libxdp we can solve the original problem in a different way,
and in fact I already suggested to Magnus that we should do this (see
[1]); so one way forward could be to address it during the merge in
libxdp? It should be possible to address the original issue (two
instances of xdpsock breaking each other when they exit), but
applications will still need to do an explicit unload operation before
exiting (i.e., the automatic detach on bpf_link fd closure will take
more work, and likely require extending the bpf_link kernel support)...

-Toke

[0] https://github.com/xdp-project/xdp-tools/pull/92
[1] https://github.com/xdp-project/xdp-tools/pull/92#discussion_r576204719



Re: [PATCH net-next RFC v3] net: hdlc_x25: Queue outgoing LAPB frames

2021-02-15 Thread Xie He
On Mon, Feb 15, 2021 at 1:25 AM Leon Romanovsky  wrote:
>
> > + /* When transmitting data:
> > +  * first we'll remove a pseudo header of 1 byte,
> > +  * then the LAPB module will prepend an LAPB header of at most 3 
> > bytes.
> > +  */
> > + dev->needed_headroom = 3 - 1;
>
> 3 - 1 = 2
>
> Thanks

Actually this is intentional. It makes the numbers more meaningful.

The compiler should automatically generate the "2" so there would be
no runtime penalty.


Re: [PATCH net-next 4/4] net: dsa: don't set skb->offload_fwd_mark when not offloading the bridge

2021-02-15 Thread Vladimir Oltean
Hi George,

On Mon, Feb 15, 2021 at 09:48:38AM -0600, George McCollister wrote:
> On Sun, Feb 14, 2021 at 9:54 AM Vladimir Oltean  wrote:
> [snip]
> > diff --git a/net/dsa/tag_xrs700x.c b/net/dsa/tag_xrs700x.c
> > index 858cdf9d2913..215ecceea89e 100644
> > --- a/net/dsa/tag_xrs700x.c
> > +++ b/net/dsa/tag_xrs700x.c
> > @@ -45,8 +45,7 @@ static struct sk_buff *xrs700x_rcv(struct sk_buff *skb, 
> > struct net_device *dev,
> > if (pskb_trim_rcsum(skb, skb->len - 1))
> > return NULL;
> >
> > -   /* Frame is forwarded by hardware, don't forward in software. */
> > -   skb->offload_fwd_mark = 1;
> > +   dsa_default_offload_fwd_mark(skb);
>
> Does it make sense that the following would have worked prior to this
> change? Is this only an issue for bridging between DSA ports when
> offloading is supported? lan0 is a port an an xrs700x switch:
>
> ip link set eth0 up
> ip link del veth0
> ip link add veth0 type veth peer name veth1
>
> for eth in veth0 veth1 lan1; do
> ip link set ${eth} up
> done
> ip link add br0 type bridge
> ip link set veth1 master br0
> ip link set lan1 master br0
> ip link set br0 up
>
> ip addr add 192.168.2.1/24 dev veth0
>
> # ping host connected to physical LAN that lan0 is on
> ping 192.168.2.249 (works!)
>
> I was trying to come up with a way to test this change and expected
> this would fail (and your patch) would fix it based on what you're
> described.

No, the configuration you've shown should be supported and functional
already (as you've noticed, in fact). I call it 'bridging with a
foreign interface', where a foreign interface is a bridge port that has
a different switchdev mark compared to the DSA switch. A switchdev mark
is a number assigned to every bridge port by nbp_switchdev_mark_set,
based on the "physical switch id"*.

There is a simple rule with switchdev: on reception of an skb, the bridge
checks if it was marked as 'already forwarded in hardware' (checks if
skb->offload_fwd_mark == 1), and if it is, it puts a mark of its own on
that skb, with the switchdev mark of the ingress port. Then during
forwarding, it enforces that the egress port must have a different switchdev
mark than the ingress one (this is done in nbp_switchdev_allowed_egress).

The veth driver does not implement any sort of method for retrieving a
physical switch id (neither devlink nor .ndo_get_port_parent_id),
therefore the bridge assigns it a switchdev mark of 0, and packets
coming from it will always have skb->offload_fwd_mark = 0. So there
aren't any restrictions.

Problems appear as soon as software bridging is attempted between two
interfaces that have the same switchdev mark. If skb->offload_fwd_mark=1,
the bridge will say that forwarding was already performed in hw, so it
will deny it in sw.
The issue is that a bond0 (or hsr0) upper of lan0 will be assigned the
same switchdev mark as lan0 itself, because the function that assigns
switchdev marks to bridge ports, nbp_switchdev_mark_set, recurses
through that port's lower interfaces until it finds something that
implements devlink.

What I tested is actually pretty laughable and a far cry from a
real-life scenario: I commented out the .port_bridge_join and
.port_bridge_leave methods of a driver and made sure that forwarding
between ports still works regardless of what uppers they have
(even that used not to). But this bypasses the switchdev mark checks in
nbp_switchdev_allowed_egress because the skb->offload_fwd_mark=0 now.
This is an important prerequisite for seamless operation, true, but it
isn't quite what we want. For one thing, we may have a topology like
this:

 +-- br0 -+
/   / |\
   /   /  | \
  /   /   |  \
 /   /|   \
/   / |\
   /| |   bond0
  / | |  /\
 swp0  swp1  swp2  swp3  swp4

where it is desirable that the presence of swp3 and swp4 under a
non-offloaded LAG does not preclude us from doing hardware bridging
beteen swp0, swp1 and swp2.

But this creates an impossible paradox if we continue in the way that I
started in this patch.

When the CPU receives a packet from swp0 (say, due to flooding), the
tagger must set skb->offload_fwd_mark to something.

If we set it to 0, then the bridge will forward it towards swp1, swp2
and bond0. But the switch has already forwarded it towards swp1 and
swp2 (not to bond0, remember, that isn't offloaded, so as far as the
switch is concerned, ports swp3 and swp4 are not looking up the FDB, and
the entire bond0 is a destination that is strictly behind the CPU). But
we don't want duplicated traffic towards swp1 and swp2, so it's not ok
to set skb->offload_fwd_mark = 0.

If we set it to 1, then the bridge will not forward the skb towards the
ports with the same switchdev mark, i.e. not to swp1, swp2 and bond0.
Towards swp1 and swp2 that's ok, but towards bond0? It should have
forwarded the skb there.

An actual solution to this problem, which has nothing to do with my
se

Re: [PATCH bpf-next 1/3] libbpf: xsk: use bpf_link

2021-02-15 Thread Björn Töpel

On 2021-02-15 18:07, Toke Høiland-Jørgensen wrote:

Maciej Fijalkowski  writes:


Currently, if there are multiple xdpsock instances running on a single
interface and in case one of the instances is terminated, the rest of
them are left in an inoperable state due to the fact of unloaded XDP
prog from interface.

To address that, step away from setting bpf prog in favour of bpf_link.
This means that refcounting of BPF resources will be done automatically
by bpf_link itself.

When setting up BPF resources during xsk socket creation, check whether
bpf_link for a given ifindex already exists via set of calls to
bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
and comparing the ifindexes from bpf_link and xsk socket.


One consideration here is that bpf_link_get_fd_by_id() is a privileged
operation (privileged as in CAP_SYS_ADMIN), so this has the side effect
of making AF_XDP privileged as well. Is that the intention?



We're already using, e.g., bpf_map_get_fd_by_id() which has that
as well. So we're assuming that for XDP setup already!


Another is that the AF_XDP code is in the process of moving to libxdp
(see in-progress PR [0]), and this approach won't carry over as-is to
that model, because libxdp has to pin the bpf_link fds.



I was assuming there were two modes of operations for AF_XDP in libxdp.
One which is with the multi-program support (which AFAIK is why the
pinning is required), and one "like the current libbpf" one. For the
latter Maciej's series would be a good fit, no?


However, in libxdp we can solve the original problem in a different way,
and in fact I already suggested to Magnus that we should do this (see
[1]); so one way forward could be to address it during the merge in
libxdp? It should be possible to address the original issue (two
instances of xdpsock breaking each other when they exit), but
applications will still need to do an explicit unload operation before
exiting (i.e., the automatic detach on bpf_link fd closure will take
more work, and likely require extending the bpf_link kernel support)...



I'd say it's depending on the libbpf 1.0/libxdp merge timeframe. If
we're months ahead, then I'd really like to see this in libbpf until the
merge. However, I'll leave that for Magnus/you to decide!

Bottom line; I'd *really* like bpf_link behavior (process scoped) for
AF_XDP sooner than later! ;-)


Thanks for the input!
Björn



-Toke

[0] https://github.com/xdp-project/xdp-tools/pull/92
[1] https://github.com/xdp-project/xdp-tools/pull/92#discussion_r576204719



[net-next PATCH] octeontx2-af: cn10k: Fixes CN10K RPM reference issue

2021-02-15 Thread Geetha sowjanya
This patch fixes references to uninitialized variables and
debugfs entry name for CN10K platform and HW_TSO flag check. 

Signed-off-by: Geetha sowjanya 
Signed-off-by: Sunil Goutham 

This patch fixes the bug introduced by the commit
3ad3f8f93c81 ("octeontx2-af: cn10k: MAC internal loopback support".
These changes are not yet merged into net branch, hence submitting
to net-next.

---
 drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c   |  2 ++
 .../net/ethernet/marvell/octeontx2/af/rvu_debugfs.c   |  2 +-
 .../net/ethernet/marvell/octeontx2/nic/otx2_txrx.c| 11 ++-
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c 
b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c
index 3a1809c28e83..e668e482383a 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c
@@ -722,12 +722,14 @@ u32 rvu_cgx_get_fifolen(struct rvu *rvu)
 
 static int rvu_cgx_config_intlbk(struct rvu *rvu, u16 pcifunc, bool en)
 {
+   int pf = rvu_get_pf(pcifunc);
struct mac_ops *mac_ops;
u8 cgx_id, lmac_id;
 
if (!is_cgx_config_permitted(rvu, pcifunc))
return -EPERM;
 
+   rvu_get_cgx_lmac_id(rvu->pf2cgxlmac_map[pf], &cgx_id, &lmac_id);
mac_ops = get_mac_ops(rvu_cgx_pdata(cgx_id, rvu));
 
return mac_ops->mac_lmac_intl_lbk(rvu_cgx_pdata(cgx_id, rvu),
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c 
b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
index 48a84c65804c..094124b695dc 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
@@ -2432,7 +2432,7 @@ void rvu_dbg_init(struct rvu *rvu)
debugfs_create_file("rvu_pf_cgx_map", 0444, rvu->rvu_dbg.root,
rvu, &rvu_dbg_rvu_pf_cgx_map_fops);
else
-   debugfs_create_file("rvu_pf_cgx_map", 0444, rvu->rvu_dbg.root,
+   debugfs_create_file("rvu_pf_rpm_map", 0444, rvu->rvu_dbg.root,
rvu, &rvu_dbg_rvu_pf_cgx_map_fops);
 
 create:
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c 
b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
index 3f778fc054b5..22ec03a618b1 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
@@ -816,22 +816,23 @@ static bool is_hw_tso_supported(struct otx2_nic *pfvf,
 {
int payload_len, last_seg_size;
 
+   if (test_bit(HW_TSO, &pfvf->hw.cap_flag))
+   return true;
+
+   /* On 96xx A0, HW TSO not supported */
+   if (!is_96xx_B0(pfvf->pdev))
+   return false;
 
/* HW has an issue due to which when the payload of the last LSO
 * segment is shorter than 16 bytes, some header fields may not
 * be correctly modified, hence don't offload such TSO segments.
 */
-   if (!is_96xx_B0(pfvf->pdev))
-   return true;
 
payload_len = skb->len - (skb_transport_offset(skb) + tcp_hdrlen(skb));
last_seg_size = payload_len % skb_shinfo(skb)->gso_size;
if (last_seg_size && last_seg_size < 16)
return false;
 
-   if (!test_bit(HW_TSO, &pfvf->hw.cap_flag))
-   return false;
-
return true;
 }
 
-- 
2.17.1



Re: [EXT] re: octeontx2-af: cn10k: MAC internal loopback support

2021-02-15 Thread Geethasowjanya Akula
Hi Colin,

I have submitted the patch fixing the reported issue to net-next branch.

Thank you,
Geetha.


From: Colin Ian King 
Sent: Monday, February 15, 2021 10:22 PM
To: Hariprasad Kelam
Cc: Sunil Kovvuri Goutham; Linu Cherian; Geethasowjanya Akula; Jerin Jacob 
Kollanukkaran; Hariprasad Kelam; Subbaraya Sundeep Bhatta; 
netdev@vger.kernel.org; linux-ker...@vger.kernel.org
Subject: [EXT] re: octeontx2-af: cn10k: MAC internal loopback support

External Email

--
Hi,

Static analysis on linux-next today using Coverity found an issue in the
following commit:

commit 3ad3f8f93c81f81d6e28b2e286b03669cc1fb3b0
Author: Hariprasad Kelam 
Date:   Thu Feb 11 21:28:34 2021 +0530

octeontx2-af: cn10k: MAC internal loopback support

The analysis is as follows:

723 static int rvu_cgx_config_intlbk(struct rvu *rvu, u16 pcifunc, bool en)
724 {
725struct mac_ops *mac_ops;

   1. var_decl: Declaring variable lmac_id without initializer.

726u8 cgx_id, lmac_id;
727

   2. Condition !is_cgx_config_permitted(rvu, pcifunc), taking false branch.

728if (!is_cgx_config_permitted(rvu, pcifunc))
729return -EPERM;
730

Uninitialized scalar variable (UNINIT)

731mac_ops = get_mac_ops(rvu_cgx_pdata(cgx_id, rvu));
732

   Uninitialized scalar variable (UNINIT)
   3. uninit_use_in_call: Using uninitialized value lmac_id when calling
*mac_ops->mac_lmac_intl_lbk.

733return mac_ops->mac_lmac_intl_lbk(rvu_cgx_pdata(cgx_id, rvu),
734  lmac_id, en);
735 }

Variables cgx_id and lmac_id are no longer being initialized and garbage
values are being passed into function calls.  Originally, these
variables were being initialized with a call to rvu_get_cgx_lmac_id()
but that has now been removed.

Colin


Re: commit 0f0aefd733f7 to linux-firmware effectively broke all of the setups with old kernels

2021-02-15 Thread Peter Robinson
> Seems the commit 0f0aefd733f7 to linux-firmware effectively broke all
> of the setups with the old kernels. Firmware name is an ABI (!) and
> replacing it like this will definitely break systems with older
> kernels. Linux firmware package likely, but unfortunately, should
> carry on both versions as long as it's needed. Alternative solution is
> to provide the links during installation.

It does provide the links using the copy-firmware.sh and the details in WHENCE.

The alternative is to leave firmwares in place with CVEs.


Re: [PATCH net-next v2 1/7] mld: convert from timer to delayed work

2021-02-15 Thread Cong Wang
On Sun, Feb 14, 2021 at 2:56 AM Taehee Yoo  wrote:
>
>
>
> On 21. 2. 14. 오전 4:07, Cong Wang wrote:
>  > On Sat, Feb 13, 2021 at 9:51 AM Taehee Yoo  wrote:
>  >> -static void mld_dad_start_timer(struct inet6_dev *idev, unsigned
> long delay)
>  >> +static void mld_dad_start_work(struct inet6_dev *idev, unsigned
> long delay)
>  >>   {
>  >>  unsigned long tv = prandom_u32() % delay;
>  >>
>  >> -   if (!mod_timer(&idev->mc_dad_timer, jiffies+tv+2))
>  >> +   if (!mod_delayed_work(mld_wq, &idev->mc_dad_work,
> msecs_to_jiffies(tv + 2)))
>  >
>  > IIUC, before this patch 'delay' is in jiffies, after this patch it is
> in msecs?
>  >
>
> Ah, I understand, It's my mistake.
> I didn't change the behavior of 'delay' in this patchset.
> So, 'delay' is still in jiffies, not msecs.
> Therefore, msecs_to_jiffies() should not be used in this patchset.
> I will send a v3 patch, which doesn't use msecs_to_jiffies().
> Thanks!
>
> By the way, I think the 'delay' is from the
> unsolicited_report_interval() and it just return value of
> idev->cnf.mldv{1 | 2}_unsolicited_report_interval.
> I think this value is msecs, not jiffies.
> So, It should be converted to use msecs_to_jiffies(), I think.
> How do you think about it?

Hmm? I think it is in jiffies:

.mldv1_unsolicited_report_interval = 10 * HZ,
.mldv2_unsolicited_report_interval = HZ,


>
>  > [...]
>  >
>  >> -static void mld_dad_timer_expire(struct timer_list *t)
>  >> +static void mld_dad_work(struct work_struct *work)
>  >>   {
>  >> -   struct inet6_dev *idev = from_timer(idev, t, mc_dad_timer);
>  >> +   struct inet6_dev *idev = container_of(to_delayed_work(work),
>  >> + struct inet6_dev,
>  >> + mc_dad_work);
>  >>
>  >> +   rtnl_lock();
>  >
>  > Any reason why we need RTNL after converting the timer to
>  > delayed work?
>  >
>
> For the moment, RTNL is not needed.
> But the Resources, which are used by delayed_work will be protected by
> RTNL instead of other locks.
> So, It just pre-adds RTNL and the following patches will delete other locks.

Sounds like this change does not belong to this patch. ;) If so,
please move it to where ever more appropriate.

Thanks.


Re: [net-next PATCH] octeontx2-af: cn10k: Fixes CN10K RPM reference issue

2021-02-15 Thread Sunil Kovvuri
On Mon, Feb 15, 2021 at 11:27 PM Geetha sowjanya  wrote:
>
> This patch fixes references to uninitialized variables and
> debugfs entry name for CN10K platform and HW_TSO flag check.
>
> Signed-off-by: Geetha sowjanya 
> Signed-off-by: Sunil Goutham 
>
> This patch fixes the bug introduced by the commit
> 3ad3f8f93c81 ("octeontx2-af: cn10k: MAC internal loopback support".
> These changes are not yet merged into net branch, hence submitting
> to net-next.
>
> ---
>  drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c   |  2 ++
>  .../net/ethernet/marvell/octeontx2/af/rvu_debugfs.c   |  2 +-
>  .../net/ethernet/marvell/octeontx2/nic/otx2_txrx.c| 11 ++-
>  3 files changed, 9 insertions(+), 6 deletions(-)
>

> diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c 
> b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
> index 3f778fc054b5..22ec03a618b1 100644
> --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
> +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
> @@ -816,22 +816,23 @@ static bool is_hw_tso_supported(struct otx2_nic *pfvf,
>  {
> int payload_len, last_seg_size;
>
> +   if (test_bit(HW_TSO, &pfvf->hw.cap_flag))
> +   return true;
> +
> +   /* On 96xx A0, HW TSO not supported */
> +   if (!is_96xx_B0(pfvf->pdev))
> +   return false;
>
> /* HW has an issue due to which when the payload of the last LSO
>  * segment is shorter than 16 bytes, some header fields may not
>  * be correctly modified, hence don't offload such TSO segments.
>  */
> -   if (!is_96xx_B0(pfvf->pdev))
> -   return true;
>
> payload_len = skb->len - (skb_transport_offset(skb) + 
> tcp_hdrlen(skb));
> last_seg_size = payload_len % skb_shinfo(skb)->gso_size;
> if (last_seg_size && last_seg_size < 16)
> return false;
>
> -   if (!test_bit(HW_TSO, &pfvf->hw.cap_flag))
> -   return false;
> -
> return true;
>  }

The HW_TSO flag should not be set for B0 silicon as well, otherwise
the checks related to
HW issue mentioned above will not come into effect.

Thanks,
Sunil.


Re: [EXT] Re: Phylink flow control support on ports with MLO_AN_FIXED auto negotiation

2021-02-15 Thread Andrew Lunn
On Mon, Feb 15, 2021 at 04:19:19PM +, Stefan Chulski wrote:
> > > > I discussed it with Andrew earlier last year, and his response was:
> > > >
> > > >  DT configuration of pause for fixed link probably is sufficient. I
> > > > don't  remember it ever been really discussed for DSA. It was a
> > > > Melanox  discussion about limiting pause for the CPU. So I think it
> > > > is safe to  not implement ethtool -A, at least until somebody has a
> > > > real use case  for it.
> > > >
> > > > So I chose not to support it - no point supporting features that
> > > > people aren't using. If you have a "real use case" then it can be added.
> > >
> > > This patch may be sufficient - I haven't fully considered all the
> > > implications of changing this though.
> > 
> > Did you try this patch? What's the outcome?
> 
> For me patch worked as expected.

Hi Stefan

Russell's patch allows it, but i would be interested in knows why you
actually need it. What is your use case for changing this on the fly?

 Andrew


Re: commit 0f0aefd733f7 to linux-firmware effectively broke all of the setups with old kernels

2021-02-15 Thread Andy Shevchenko
On Mon, Feb 15, 2021 at 8:03 PM Peter Robinson  wrote:
>
> > Seems the commit 0f0aefd733f7 to linux-firmware effectively broke all
> > of the setups with the old kernels. Firmware name is an ABI (!) and
> > replacing it like this will definitely break systems with older
> > kernels. Linux firmware package likely, but unfortunately, should
> > carry on both versions as long as it's needed. Alternative solution is
> > to provide the links during installation.
>
> It does provide the links using the copy-firmware.sh and the details in 
> WHENCE.
>
> The alternative is to leave firmwares in place with CVEs.

Good, thanks, I haven't looked into that script.


-- 
With Best Regards,
Andy Shevchenko


RE: [Patch bpf-next v3 1/5] bpf: clean up sockmap related Kconfigs

2021-02-15 Thread John Fastabend
Cong Wang wrote:
> From: Cong Wang 
> 
> As suggested by John, clean up sockmap related Kconfigs:
> 
> Reduce the scope of CONFIG_BPF_STREAM_PARSER down to TCP stream
> parser, to reflect its name.
> 
> Make the rest sockmap code simply depend on CONFIG_BPF_SYSCALL.
> And leave CONFIG_NET_SOCK_MSG untouched, as it is used by
> non-sockmap cases.
> 
> Cc: John Fastabend 
> Cc: Daniel Borkmann 
> Cc: Jakub Sitnicki 
> Cc: Lorenz Bauer 
> Signed-off-by: Cong Wang 
> ---

Thanks for doing this.

Acked-by: John Fastabend 


[PATCH AUTOSEL 5.10 2/6] NET: usb: qmi_wwan: Adding support for Cinterion MV31

2021-02-15 Thread Sasha Levin
From: Christoph Schemmel 

[ Upstream commit a4dc7eee9106a9d2a6e08b442db19677aa9699c7 ]

Adding support for Cinterion MV31 with PID 0x00B7.

T:  Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 11 Spd=5000 MxCh= 0
D:  Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs=  1
P:  Vendor=1e2d ProdID=00b7 Rev=04.14
S:  Manufacturer=Cinterion
S:  Product=Cinterion USB Mobile Broadband
S:  SerialNumber=b3246eed
C:  #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=896mA
I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option

Signed-off-by: Christoph Schemmel 
Link: 
https://lore.kernel.org/r/20210202084523.4371-1-christoph.schem...@gmail.com
Signed-off-by: Jakub Kicinski 
Signed-off-by: Sasha Levin 
---
 drivers/net/usb/qmi_wwan.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index ce73df4c137ea..b223536e07bed 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1332,6 +1332,7 @@ static const struct usb_device_id products[] = {
{QMI_FIXED_INTF(0x1e2d, 0x0082, 5)},/* Cinterion PHxx,PXxx (2 
RmNet) */
{QMI_FIXED_INTF(0x1e2d, 0x0083, 4)},/* Cinterion PHxx,PXxx (1 RmNet 
+ USB Audio)*/
{QMI_QUIRK_SET_DTR(0x1e2d, 0x00b0, 4)}, /* Cinterion CLS8 */
+   {QMI_FIXED_INTF(0x1e2d, 0x00b7, 0)},/* Cinterion MV31 RmNet */
{QMI_FIXED_INTF(0x413c, 0x81a2, 8)},/* Dell Wireless 5806 Gobi(TM) 
4G LTE Mobile Broadband Card */
{QMI_FIXED_INTF(0x413c, 0x81a3, 8)},/* Dell Wireless 5570 HSPA+ 
(42Mbps) Mobile Broadband Card */
{QMI_FIXED_INTF(0x413c, 0x81a4, 8)},/* Dell Wireless 5570e HSPA+ 
(42Mbps) Mobile Broadband Card */
-- 
2.27.0



[PATCH AUTOSEL 5.10 3/6] cxgb4: Add new T6 PCI device id 0x6092

2021-02-15 Thread Sasha Levin
From: Raju Rangoju 

[ Upstream commit 3401e4aa43a540881cc97190afead650e709c418 ]

Signed-off-by: Raju Rangoju 
Link: https://lore.kernel.org/r/20210202182511.8109-1-ra...@chelsio.com
Signed-off-by: Jakub Kicinski 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h 
b/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h
index 0c5373462cedb..0b1b5f9c67d47 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_pci_id_tbl.h
@@ -219,6 +219,7 @@ CH_PCI_DEVICE_ID_TABLE_DEFINE_BEGIN
CH_PCI_ID_TABLE_FENTRY(0x6089), /* Custom T62100-KR */
CH_PCI_ID_TABLE_FENTRY(0x608a), /* Custom T62100-CR */
CH_PCI_ID_TABLE_FENTRY(0x608b), /* Custom T6225-CR */
+   CH_PCI_ID_TABLE_FENTRY(0x6092), /* Custom T62100-CR-LOM */
 CH_PCI_DEVICE_ID_TABLE_DEFINE_END;
 
 #endif /* __T4_PCI_ID_TBL_H__ */
-- 
2.27.0



[PATCH AUTOSEL 5.4 1/4] NET: usb: qmi_wwan: Adding support for Cinterion MV31

2021-02-15 Thread Sasha Levin
From: Christoph Schemmel 

[ Upstream commit a4dc7eee9106a9d2a6e08b442db19677aa9699c7 ]

Adding support for Cinterion MV31 with PID 0x00B7.

T:  Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 11 Spd=5000 MxCh= 0
D:  Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs=  1
P:  Vendor=1e2d ProdID=00b7 Rev=04.14
S:  Manufacturer=Cinterion
S:  Product=Cinterion USB Mobile Broadband
S:  SerialNumber=b3246eed
C:  #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=896mA
I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option

Signed-off-by: Christoph Schemmel 
Link: 
https://lore.kernel.org/r/20210202084523.4371-1-christoph.schem...@gmail.com
Signed-off-by: Jakub Kicinski 
Signed-off-by: Sasha Levin 
---
 drivers/net/usb/qmi_wwan.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 72a3a5dc51319..5a1d21aae2a9e 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1354,6 +1354,7 @@ static const struct usb_device_id products[] = {
{QMI_FIXED_INTF(0x1e2d, 0x0082, 5)},/* Cinterion PHxx,PXxx (2 
RmNet) */
{QMI_FIXED_INTF(0x1e2d, 0x0083, 4)},/* Cinterion PHxx,PXxx (1 RmNet 
+ USB Audio)*/
{QMI_QUIRK_SET_DTR(0x1e2d, 0x00b0, 4)}, /* Cinterion CLS8 */
+   {QMI_FIXED_INTF(0x1e2d, 0x00b7, 0)},/* Cinterion MV31 RmNet */
{QMI_FIXED_INTF(0x413c, 0x81a2, 8)},/* Dell Wireless 5806 Gobi(TM) 
4G LTE Mobile Broadband Card */
{QMI_FIXED_INTF(0x413c, 0x81a3, 8)},/* Dell Wireless 5570 HSPA+ 
(42Mbps) Mobile Broadband Card */
{QMI_FIXED_INTF(0x413c, 0x81a4, 8)},/* Dell Wireless 5570e HSPA+ 
(42Mbps) Mobile Broadband Card */
-- 
2.27.0



  1   2   3   >