date:20201230

[PATCH net-next] net/mlxfw: Use kzalloc for allocating only one thing

2020-12-30 Thread Zheng Yongjun

Use kzalloc rather than kcalloc(1,...)

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// 
@@
@@

- kcalloc(1,
+ kzalloc(
  ...)
// 

Signed-off-by: Zheng Yongjun 
---
 drivers/net/ethernet/mellanox/mlxfw/mlxfw_mfa2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxfw/mlxfw_mfa2.c 
b/drivers/net/ethernet/mellanox/mlxfw/mlxfw_mfa2.c
index 5d9ddf36fb4e..e6f677e42007 100644
--- a/drivers/net/ethernet/mellanox/mlxfw/mlxfw_mfa2.c
+++ b/drivers/net/ethernet/mellanox/mlxfw/mlxfw_mfa2.c
@@ -267,7 +267,7 @@ struct mlxfw_mfa2_file *mlxfw_mfa2_file_init(const struct 
firmware *fw)
const void *first_tlv_ptr;
const void *cb_top_ptr;
 
-   mfa2_file = kcalloc(1, sizeof(*mfa2_file), GFP_KERNEL);
+   mfa2_file = kzalloc(sizeof(*mfa2_file), GFP_KERNEL);
if (!mfa2_file)
return ERR_PTR(-ENOMEM);
 
-- 
2.22.0

Re: [PATCH rfc 2/3] virtio-net: support receive timestamp

2020-12-30 Thread Jason Wang




On 2020/12/29 下午10:20, Willem de Bruijn wrote:

On Tue, Dec 29, 2020 at 4:23 AM Jason Wang  wrote:



- Original Message -

On Mon, Dec 28, 2020 at 7:55 PM Michael S. Tsirkin  wrote:

On Mon, Dec 28, 2020 at 02:30:31PM -0500, Willem de Bruijn wrote:

On Mon, Dec 28, 2020 at 12:29 PM Michael S. Tsirkin 
wrote:

On Mon, Dec 28, 2020 at 11:22:32AM -0500, Willem de Bruijn wrote:

From: Willem de Bruijn 

Add optional PTP hardware timestamp offload for virtio-net.

Accurate RTT measurement requires timestamps close to the wire.
Introduce virtio feature VIRTIO_NET_F_RX_TSTAMP. If negotiated, the
virtio-net header is expanded with room for a timestamp. A host may
pass receive timestamps for all or some packets. A timestamp is valid
if non-zero.

The timestamp straddles (virtual) hardware domains. Like PTP, use
international atomic time (CLOCK_TAI) as global clock base. It is
guest responsibility to sync with host, e.g., through kvm-clock.

Signed-off-by: Willem de Bruijn 
diff --git a/include/uapi/linux/virtio_net.h
b/include/uapi/linux/virtio_net.h
index f6881b5b77ee..0ffe2eeebd4a 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -57,6 +57,7 @@
* Steering */
  #define VIRTIO_NET_F_CTRL_MAC_ADDR 23/* Set MAC address */

+#define VIRTIO_NET_F_RX_TSTAMP 55/* Host sends TAI
receive time */
  #define VIRTIO_NET_F_TX_HASH   56/* Guest sends hash report */
  #define VIRTIO_NET_F_HASH_REPORT  57 /* Supports hash report */
  #define VIRTIO_NET_F_RSS   60/* Supports RSS RX steering */
@@ -182,6 +183,17 @@ struct virtio_net_hdr_v1_hash {
   };
  };

+struct virtio_net_hdr_v12 {
+ struct virtio_net_hdr_v1 hdr;
+ struct {
+ __le32 value;
+ __le16 report;
+ __le16 flow_state;
+ } hash;
+ __virtio32 reserved;
+ __virtio64 tstamp;
+};
+
  #ifndef VIRTIO_NET_NO_LEGACY
  /* This header comes first in the scatter-gather list.
   * For legacy virtio, if VIRTIO_F_ANY_LAYOUT is not negotiated, it
   must


So it looks like VIRTIO_NET_F_RX_TSTAMP should depend on both
VIRTIO_NET_F_RX_TSTAMP and VIRTIO_NET_F_HASH_REPORT then?

Do you mean VIRTIO_NET_F_TX_TSTAMP depends on VIRTIO_NET_F_RX_TSTAMP?

I think if either is enabled we need to enable the extended layout.
Regardless of whether TX_HASH or HASH_REPORT are enabled. If they are
not, then those fields are ignored.

I mean do we waste the 8 bytes for hash if TSTAMP is set but HASH is not?
If TSTAMP depends on HASH then point is moot.

True, but the two features really are independent.

I did consider using configurable metadata layout depending on
features negotiated. If there are tons of optional extensions, that
makes sense. But it is more complex and parsing error prone. With a
handful of options each of a few bytes, that did not seem worth the
cost to me at this point.

Consider NFV workloads(64B) packet. Most fields of current vnet header
is even a burdern.

Such workloads will not negotiate these features, I imagine.

I think this could only become an issue if a small packet workload
would want to negotiate one optional feature, without the others.



Yes.



Even then, the actual cost of a few untouched bytes is negligible, as
long as the headers don't span cache-lines.



Right. It looks to me with hash report support the current header has 
exceeds 32 bytes which is the cacheline size for some archs. And it can 
be imagined that after two or more features it could be larger than 64 
bytes.




  I expect it to be cheaper
than having to parse a dynamic layout.



The only overhead is probably analyzing "type" which should be cheap I 
guess.






It might be the time to introduce configurable or
self-descriptive vnet header.

I did briefly toy with a type-val encoding. Not type-val-len, as all
options have fixed length (so far), so length can be left implicit.
But as each feature is negotiated before hand, the type can be left
implicit too. Leaving just the data elements tightly packed. As long
as order is defined.



Right, so in this case there should be even no overhead.





And importantly, such a mode can always be added later as a separate
VIRTIO_NET_F_PACKED_HEADER option.

What will do feature provide?

The tightly packed data elements. Strip out the elements for non
negotiated features. So if either tstamp option is negotiated, but
hash is not, exclude the 64b of hash fields. Note that I'm not
actually arguing for this (at this point).



I second for this.

Thanks

Re: [RFC v2 09/13] vduse: Add support for processing vhost iotlb message

2020-12-30 Thread Jason Wang




On 2020/12/30 下午3:09, Yongji Xie wrote:

On Wed, Dec 30, 2020 at 2:11 PM Jason Wang  wrote:


On 2020/12/29 下午6:26, Yongji Xie wrote:

On Tue, Dec 29, 2020 at 5:11 PM Jason Wang  wrote:


- Original Message -

On Mon, Dec 28, 2020 at 4:43 PM Jason Wang  wrote:

On 2020/12/28 下午4:14, Yongji Xie wrote:

I see. So all the above two questions are because VHOST_IOTLB_INVALIDATE
is expected to be synchronous. This need to be solved by tweaking the
current VDUSE API or we can re-visit to go with descriptors relaying
first.


Actually all vdpa related operations are synchronous in current
implementation. The ops.set_map/dma_map/dma_unmap should not return
until the VDUSE_UPDATE_IOTLB/VDUSE_INVALIDATE_IOTLB message is replied
by userspace. Could it solve this problem?

I was thinking whether or not we need to generate IOTLB_INVALIDATE
message to VDUSE during dma_unmap (vduse_dev_unmap_page).

If we don't, we're probably fine.


It seems not feasible. This message will be also used in the
virtio-vdpa case to notify userspace to unmap some pages during
consistent dma unmapping. Maybe we can document it to make sure the
users can handle the message correctly.

Just to make sure I understand your point.

Do you mean you plan to notify the unmap of 1) streaming DMA or 2)
coherent DMA?

For 1) you probably need a workqueue to do that since dma unmap can
be done in irq or bh context. And if usrspace does't do the unmap, it
can still access the bounce buffer (if you don't zap pte)?


I plan to do it in the coherent DMA case.


Any reason for treating coherent DMA differently?


Now the memory of the bounce buffer is allocated page by page in the
page fault handler. So it can't be used in coherent DMA mapping case
which needs some memory with contiguous virtual addresses. I can use
vmalloc() to do allocation for the bounce buffer instead. But it might
cause some memory waste. Any suggestion?



I may miss something. But I don't see a relationship between the 
IOTLB_UNMAP and vmalloc().






It's true that userspace can
access the dma buffer if userspace doesn't do the unmap. But the dma
pages would not be freed and reused unless user space called munmap()
for them.


I wonder whether or not we could recycle IOVA in this case to avoid the
IOTLB_UMAP message.


We can achieve that if we use vmalloc() to do allocation for the
bounce buffer which can be used in coherent DMA mapping case. But
looks like we still have no way to avoid the IOTLB_UMAP message in
vhost-vdpa case.



I think that's fine. For virtio-vdpa, from VDUSE userspace perspective, 
it works like a driver that is using SWIOTLB in this case.


Thanks




Thanks,
Yongji

Re: [PATCH rfc 2/3] virtio-net: support receive timestamp

2020-12-30 Thread Jason Wang




On 2020/12/29 上午8:57, Willem de Bruijn wrote:

On Mon, Dec 28, 2020 at 5:59 PM Jakub Kicinski  wrote:

On Mon, 28 Dec 2020 11:22:32 -0500 Willem de Bruijn wrote:

From: Willem de Bruijn 

Add optional PTP hardware timestamp offload for virtio-net.

Accurate RTT measurement requires timestamps close to the wire.
Introduce virtio feature VIRTIO_NET_F_RX_TSTAMP. If negotiated, the
virtio-net header is expanded with room for a timestamp. A host may
pass receive timestamps for all or some packets. A timestamp is valid
if non-zero.

The timestamp straddles (virtual) hardware domains. Like PTP, use
international atomic time (CLOCK_TAI) as global clock base. It is
guest responsibility to sync with host, e.g., through kvm-clock.

Would this not be confusing to some user space SW to have a NIC with
no PHC deliver HW stamps?

I'd CC Richard on this, unless you already discussed with him offline.

Thanks, good point. I should have included Richard.

There is a well understood method for synchronizing guest and host
clock in KVM using ptp_kvm. For virtual environments without NIC
hardware offload, the when host timestamps in software, this suffices.

Syncing host with NIC is assumed if the host advertises the feature
and implements using real hardware timestamps.



Or it could be useful for virtio hardware when there's no KVM that 
provides PTP.


Thanks

[PATCH] Allow user to set metric on default route learned via Router Advertisement.

2020-12-30 Thread Praveen Chaudhary

Allow user to set metric on default route learned via Router Advertisement.
Not: RFC 4191 does not say anything for metric for IPv6 default route.

Fix:
For IPv4, default route is learned via DHCPv4 and user is allowed to change
metric using config etc/network/interfaces. But for IPv6, default route can
be learned via RA, for which, currently a fixed metric value 1024 is used.

Ideally, user should be able to configure metric on default route for IPv6
similar to IPv4. This fix adds sysctl for the same.

Logs:

For IPv4:


Config in etc/network/interfaces

```
auto eth0
iface eth0 inet dhcp
metric 4261413864
```

IPv4 Kernel Route Table:

```
$ sudo route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric RefUse Iface
0.0.0.0 172.11.44.1 0.0.0.0 UG-33553432 00 eth0
```

FRR Table, if default route is learned via routing protocol too.

```
# show ip route 
Codes: K - kernel route, C - connected, S - static, R - RIP,
   O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
   T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
   > - selected route, * - FIB route

S>* 0.0.0.0/0 [20/0] is directly connected, eth0, 00:00:03
K   0.0.0.0/0 [254/1000] via 172.21.47.1, eth0, 6d08h51m
```


i.e. User can prefer Default Router learned via Routing Protocol, 
Similar behavior is not possible for IPv6, without this fix.



After fix [for IPv6]:

```
sudo sysctl -w 
net.ipv6.conf.eth0.net.ipv6.conf.eth0.accept_ra_defrtr_metric=0x770003e9
```

IP monitor:

```
default via fe80::be16:65ff:feb3:ce8e dev eth0 proto ra metric 1996489705  pref 
high
```

Kernel IPv6 routing table

```
DestinationNext Hop   Flag Met Ref Use If
::/0   fe80::be16:65ff:feb3:ce8e  UGDAe 1996489705 0
 0 eth0
```

FRR Routing Table, if default route is learned via routing protocol.

# show ipv6 route 
Codes: K - kernel route, C - connected, S - static, R - RIPng,
   O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
   v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
   > - selected route, * - FIB route

S>* ::/0 [20/0] is directly connected, eth0, 00:00:06
K   ::/0 [119/1001] via fe80::be16:65ff:feb3:ce8e, eth0, 6d07h43m


Praveen Chaudhary (1):
  Allow user to set metric on default route learned via Router
Advertisement.

 Documentation/networking/ip-sysctl.rst |  8 
 include/linux/ipv6.h   |  1 +
 include/net/ip6_route.h|  3 ++-
 include/uapi/linux/ipv6.h  |  1 +
 include/uapi/linux/sysctl.h|  1 +
 net/ipv6/addrconf.c| 10 ++
 net/ipv6/ndisc.c   | 15 +++
 net/ipv6/route.c   |  8 +---
 8 files changed, 39 insertions(+), 8 deletions(-)

-- 
2.7.4

[PATCH] Allow user to set metric on default route learned via Router Advertisement.

2020-12-30 Thread Praveen Chaudhary

Allow user to set metric on default route learned via Router Advertisement.
Not: RFC 4191 does not say anything for metric for IPv6 default route.

Fix:
For IPv4, default route is learned via DHCPv4 and user is allowed to change
metric using config etc/network/interfaces. But for IPv6, default route can
be learned via RA, for which, currently a fixed metric value 1024 is used.

Ideally, user should be able to configure metric on default route for IPv6
similar to IPv4. This fix adds sysctl for the same.

Logs:

For IPv4:


Config in etc/network/interfaces

```
auto eth0
iface eth0 inet dhcp
metric 4261413864
```

IPv4 Kernel Route Table:

```
$ sudo route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric RefUse Iface
0.0.0.0 172.11.44.1 0.0.0.0 UG-33553432 00 eth0
```

FRR Table, if default route is learned via routing protocol too.

```
# show ip route 
Codes: K - kernel route, C - connected, S - static, R - RIP,
   O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
   T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
   > - selected route, * - FIB route

S>* 0.0.0.0/0 [20/0] is directly connected, eth0, 00:00:03
K   0.0.0.0/0 [254/1000] via 172.21.47.1, eth0, 6d08h51m
```


i.e. User can prefer Default Router learned via Routing Protocol, 
Similar behavior is not possible for IPv6, without this fix.



After fix [for IPv6]:

```
sudo sysctl -w 
net.ipv6.conf.eth0.net.ipv6.conf.eth0.accept_ra_defrtr_metric=0x770003e9
```

IP monitor:

```
default via fe80::be16:65ff:feb3:ce8e dev eth0 proto ra metric 1996489705  pref 
high
```

Kernel IPv6 routing table

```
DestinationNext Hop   Flag Met Ref Use If
::/0   fe80::be16:65ff:feb3:ce8e  UGDAe 1996489705 0
 0 eth0
```

FRR Routing Table, if default route is learned via routing protocol.

# show ipv6 route 
Codes: K - kernel route, C - connected, S - static, R - RIPng,
   O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
   v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
   > - selected route, * - FIB route

S>* ::/0 [20/0] is directly connected, eth0, 00:00:06
K   ::/0 [119/1001] via fe80::be16:65ff:feb3:ce8e, eth0, 6d07h43m


Praveen Chaudhary (1):
  Allow user to set metric on default route learned via Router
Advertisement.

 Documentation/networking/ip-sysctl.rst |  8 
 include/linux/ipv6.h   |  1 +
 include/net/ip6_route.h|  3 ++-
 include/uapi/linux/ipv6.h  |  1 +
 include/uapi/linux/sysctl.h|  1 +
 net/ipv6/addrconf.c| 10 ++
 net/ipv6/ndisc.c   | 15 +++
 net/ipv6/route.c   |  8 +---
 8 files changed, 39 insertions(+), 8 deletions(-)

-- 
2.7.4

[PATCH] Allow user to set metric on default route learned via Router Advertisement.

2020-12-30 Thread Praveen Chaudhary

Fix:
For IPv4, default route is learned via DHCPv4 and user is allowed to change
metric using config etc/network/interfaces. But for IPv6, default route can
be learned via RA, for which, currently a fixed metric value 1024 is used.

Ideally, user should be able to configure metric on default route for IPv6
similar to IPv4. This fix adds sysctl for the same.

Signed-off-by: Praveen Chaudhary
Signed-off-by: Zhenggen Xu
---
 Documentation/networking/ip-sysctl.rst |  8 
 include/linux/ipv6.h   |  1 +
 include/net/ip6_route.h|  3 ++-
 include/uapi/linux/ipv6.h  |  1 +
 include/uapi/linux/sysctl.h|  1 +
 net/ipv6/addrconf.c| 10 ++
 net/ipv6/ndisc.c   | 15 +++
 net/ipv6/route.c   |  8 +---
 8 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/Documentation/networking/ip-sysctl.rst 
b/Documentation/networking/ip-sysctl.rst
index dd2b12a..073c1f3 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1871,6 +1871,14 @@ accept_ra_defrtr - BOOLEAN
- enabled if accept_ra is enabled.
- disabled if accept_ra is disabled.
 
+accept_ra_defrtr_metric - INTEGER
+   Metric for default router learned in Router Advertisement.
+
+   Functional default:
+
+   * 0 if accept_ra_defrtr is enabled.
+   * Ignored, if accept_ra_defrtr is enabled.
+
 accept_ra_from_local - BOOLEAN
Accept RA with source-address that is found on local machine
if the RA is otherwise proper and able to be accepted.
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index dda61d1..19af90c 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -31,6 +31,7 @@ struct ipv6_devconf {
__s32   max_desync_factor;
__s32   max_addresses;
__s32   accept_ra_defrtr;
+   __s32   accept_ra_defrtr_metric;
__s32   accept_ra_min_hop_limit;
__s32   accept_ra_pinfo;
__s32   ignore_routes_with_linkdown;
diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
index 2a52777..a470bda 100644
--- a/include/net/ip6_route.h
+++ b/include/net/ip6_route.h
@@ -174,7 +174,8 @@ struct fib6_info *rt6_get_dflt_router(struct net *net,
 struct net_device *dev);
 struct fib6_info *rt6_add_dflt_router(struct net *net,
 const struct in6_addr *gwaddr,
-struct net_device *dev, unsigned int pref);
+struct net_device *dev, unsigned int pref,
+unsigned int defrtr_usr_metric);
 
 void rt6_purge_dflt_routers(struct net *net);
 
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index 13e8751..945de5d 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -189,6 +189,7 @@ enum {
DEVCONF_ACCEPT_RA_RT_INFO_MIN_PLEN,
DEVCONF_NDISC_TCLASS,
DEVCONF_RPL_SEG_ENABLED,
+   DEVCONF_ACCEPT_RA_DEFRTR_METRIC,
DEVCONF_MAX
 };
 
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
index 458179d..5e79c19 100644
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -571,6 +571,7 @@ enum {
NET_IPV6_ACCEPT_SOURCE_ROUTE=25,
NET_IPV6_ACCEPT_RA_FROM_LOCAL=26,
NET_IPV6_ACCEPT_RA_RT_INFO_MIN_PLEN=27,
+   NET_IPV6_ACCEPT_RA_DEFRTR_METRIC=28,
__NET_IPV6_MAX
 };
 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index eff2cac..702ec4a 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -205,6 +205,7 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
.max_desync_factor  = MAX_DESYNC_FACTOR,
.max_addresses  = IPV6_MAX_ADDRESSES,
.accept_ra_defrtr   = 1,
+   .accept_ra_defrtr_metric = 0,
.accept_ra_from_local   = 0,
.accept_ra_min_hop_limit= 1,
.accept_ra_pinfo= 1,
@@ -260,6 +261,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly 
= {
.max_desync_factor  = MAX_DESYNC_FACTOR,
.max_addresses  = IPV6_MAX_ADDRESSES,
.accept_ra_defrtr   = 1,
+   .accept_ra_defrtr_metric = 0,
.accept_ra_from_local   = 0,
.accept_ra_min_hop_limit= 1,
.accept_ra_pinfo= 1,
@@ -5475,6 +5477,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf 
*cnf,
array[DEVCONF_MAX_DESYNC_FACTOR] = cnf->max_desync_factor;
array[DEVCONF_MAX_ADDRESSES] = cnf->max_addresses;
array[DEVCONF_ACCEPT_RA_DEFRTR] = cnf->accept_ra_defrtr;
+   array[DEVCONF_ACCEPT_RA_DEFRTR_METRIC] = cnf->accept_ra_defrtr_metric;
array[DEVCONF_ACCEPT_RA_MIN_HOP_LIMIT] = cnf->accept_ra_min_hop_limit;
array[DEVCONF_ACCEPT_RA_PINFO] = cnf->acc

Re: BTFIDS: FAILED unresolved symbol udp6_sock

2020-12-30 Thread Jiri Olsa

On Tue, Dec 29, 2020 at 11:28:35PM +, Qais Yousef wrote:
> Hi Jiri
> 
> On 12/29/20 18:34, Jiri Olsa wrote:
> > On Tue, Dec 29, 2020 at 03:13:52PM +, Qais Yousef wrote:
> > > Hi
> > > 
> > > When I enable CONFIG_DEBUG_INFO_BTF I get the following error in the 
> > > BTFIDS
> > > stage
> > > 
> > >   FAILED unresolved symbol udp6_sock
> > > 
> > > I cross compile for arm64. My .config is attached.
> > > 
> > > I managed to reproduce the problem on v5.9 and v5.10. Plus 5.11-rc1.
> > > 
> > > Have you seen this before? I couldn't find a specific report about this
> > > problem.
> > > 
> > > Let me know if you need more info.
> > 
> > hi,
> > this looks like symptom of the gcc DWARF bug we were
> > dealing with recently:
> > 
> >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
> >   
> > https://lore.kernel.org/lkml/CAE1WUT75gu9G62Q9uAALGN6vLX=o7vz9uhqtvwnbuv81dgm...@mail.gmail.com/#r
> > 
> > what pahole/gcc version are you using?
> 
> I'm on gcc 9.3.0
> 
>   aarch64-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> 
> I was on pahole v1.17. I moved to v1.19 but I still see the same problem.

I can reproduce with your .config, but make 'defconfig' works,
so I guess it's some config option issue, I'll check later today

jirka

Re: Registering IRQ for MT7530 internal PHYs

2020-12-30 Thread DENG Qingfang

Hi Heiner,
Thanks for your reply.

On Wed, Dec 30, 2020 at 3:39 PM Heiner Kallweit  wrote:
> I don't think that's the best option.

I'm well aware of that.

> You may want to add a PHY driver for your chip. Supposedly it
> supports at least PHY suspend/resume. You can use the RTL8366RB
> PHY driver as template.

There's no MediaTek PHY driver yet. Do we really need a new one just
for the interrupts?

> > + dev_info_ratelimited(priv->dev, "interrupt status: 0x%08x\n", val);
> > + dev_info_ratelimited(priv->dev, "interrupt enable: 0x%08x\n", 
> > mt7530_read(priv, MT7530_SYS_INT_EN));
> > +
> This is debug code to be removed in the final version?

Yes.

> > + for (phy = 0; phy < MT7530_NUM_PHYS; phy++) {
> > + if (val & BIT(phy)) {
> > + unsigned int child_irq;
> > +
> > + child_irq = irq_find_mapping(priv->irq_domain, phy);
> > + handle_nested_irq(child_irq);
> > + handled = true;
> > + }
> > + }
> > +
> > + return handled ? IRQ_HANDLED : IRQ_NONE;
>
> IRQ_RETVAL() could be used here.

Good to know :)

>
> > +}
> > +
> > +static void mt7530_irq_mask(struct irq_data *d)
> > +{
> > + struct mt7530_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > + priv->irq_enable &= ~BIT(d->hwirq);
>
> Here you don't actually do something. HW doesn't support masking
> interrupt generation for a port?

priv->irq_enable will be written to MT7530_SYS_INT_EN in
mt7530_irq_bus_sync_unlock. You can think of it as an inverted mask.

Re: Registering IRQ for MT7530 internal PHYs

2020-12-30 Thread Heiner Kallweit

On 30.12.2020 10:07, DENG Qingfang wrote:
> Hi Heiner,
> Thanks for your reply.
> 
> On Wed, Dec 30, 2020 at 3:39 PM Heiner Kallweit  wrote:
>> I don't think that's the best option.
> 
> I'm well aware of that.
> 
>> You may want to add a PHY driver for your chip. Supposedly it
>> supports at least PHY suspend/resume. You can use the RTL8366RB
>> PHY driver as template.
> 
> There's no MediaTek PHY driver yet. Do we really need a new one just
> for the interrupts?
> 
Not only for the interrupts. The genphy driver e.g. doesn't support
PHY suspend/resume. And the PHY driver needs basically no code,
just set the proper callbacks.

>>> + dev_info_ratelimited(priv->dev, "interrupt status: 0x%08x\n", val);
>>> + dev_info_ratelimited(priv->dev, "interrupt enable: 0x%08x\n", 
>>> mt7530_read(priv, MT7530_SYS_INT_EN));
>>> +
>> This is debug code to be removed in the final version?
> 
> Yes.
> 
>>> + for (phy = 0; phy < MT7530_NUM_PHYS; phy++) {
>>> + if (val & BIT(phy)) {
>>> + unsigned int child_irq;
>>> +
>>> + child_irq = irq_find_mapping(priv->irq_domain, phy);
>>> + handle_nested_irq(child_irq);
>>> + handled = true;
>>> + }
>>> + }
>>> +
>>> + return handled ? IRQ_HANDLED : IRQ_NONE;
>>
>> IRQ_RETVAL() could be used here.
> 
> Good to know :)
> 
>>
>>> +}
>>> +
>>> +static void mt7530_irq_mask(struct irq_data *d)
>>> +{
>>> + struct mt7530_priv *priv = irq_data_get_irq_chip_data(d);
>>> +
>>> + priv->irq_enable &= ~BIT(d->hwirq);
>>
>> Here you don't actually do something. HW doesn't support masking
>> interrupt generation for a port?
> 
> priv->irq_enable will be written to MT7530_SYS_INT_EN in
> mt7530_irq_bus_sync_unlock. You can think of it as an inverted mask.
>

[PATCH v3 net-next] net: kcm: Replace fput with sockfd_put

2020-12-30 Thread Zheng Yongjun

The function sockfd_lookup uses fget on the value that is stored in
the file field of the returned structure, so fput should ultimately be
applied to this value.  This can be done directly, but it seems better
to use the specific macro sockfd_put, which does the same thing.

Perform a source code refactoring by using the following semantic patch.

// 
@@
expression s;
@@

   s = sockfd_lookup(...)
   ...
+ sockfd_put(s);
- fput(s->file);
// 

Signed-off-by: Zheng Yongjun 
---
 net/kcm/kcmsock.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index 56dad9565bc9..a9eb616f5521 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -1496,7 +1496,7 @@ static int kcm_attach_ioctl(struct socket *sock, struct 
kcm_attach *info)
 
return 0;
 out:
-   fput(csock->file);
+   sockfd_put(csock);
return err;
 }
 
@@ -1644,7 +1644,7 @@ static int kcm_unattach_ioctl(struct socket *sock, struct 
kcm_unattach *info)
spin_unlock_bh(&mux->lock);
 
 out:
-   fput(csock->file);
+   sockfd_put(csock);
return err;
 }
 
-- 
2.22.0

Re: [PATCH] mlx4: style: replace zero-length array with flexible-array member.

2020-12-30 Thread Tariq Toukan





On 12/30/2020 8:28 AM, YANG LI wrote:

There is a regular need in the kernel to provide a way to declare
having a dynamically sized set of trailing elements in a structure.
Kernel code should always use "flexible array members"[1] for these
cases. The older style of one-element or zero-length arrays should
no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9/process/
 deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: YANG LI 
Reported-by: Abaci 
---
  drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h 
b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index e8ed2319..4029a8b 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -314,7 +314,7 @@ struct mlx4_en_tx_ring {
  
  struct mlx4_en_rx_desc {

/* actual number of entries depends on rx ring stride */
-   struct mlx4_wqe_data_seg data[0];
+   struct mlx4_wqe_data_seg data[];
  };
  
  struct mlx4_en_rx_ring {




Reviewed-by: Tariq Toukan 

Thanks.

Re: Registering IRQ for MT7530 internal PHYs

2020-12-30 Thread Marc Zyngier


Hi Qingfang,

On 2020-12-30 04:22, DENG Qingfang wrote:

Hi,

I added MT7530 IRQ support and registered its internal PHYs to IRQ.
It works but my patch used two hacks.

1. Removed phy_drv_supports_irq check, because config_intr and
handle_interrupt are not set for Generic PHY.

2. Allocated ds->slave_mii_bus before calling ds->ops->setup, because
we cannot call dsa_slave_mii_bus_init which is private.

Any better ideas?

Regards,
Qingfang

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index a67cac15a724..d59a8c50ede3 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1639,6 +1640,125 @@ mtk_get_tag_protocol(struct dsa_switch *ds, int 
port,

}
 }

+static irqreturn_t
+mt7530_irq(int irq, void *data)
+{
+   struct mt7530_priv *priv = data;
+   bool handled = false;
+   int phy;
+   u32 val;
+
+   val = mt7530_read(priv, MT7530_SYS_INT_STS);
+   mt7530_write(priv, MT7530_SYS_INT_STS, val);


If that is an ack operation, it should be dealt with as such in
an irqchip callback instead of being open-coded here.


+
+   dev_info_ratelimited(priv->dev, "interrupt status: 0x%08x\n", val);
+   dev_info_ratelimited(priv->dev, "interrupt enable: 0x%08x\n",
mt7530_read(priv, MT7530_SYS_INT_EN));
+


I don't think printing these from an interrupt handler is a good idea.
Use the debug primitives if you really want these messages.


+   for (phy = 0; phy < MT7530_NUM_PHYS; phy++) {
+   if (val & BIT(phy)) {
+   unsigned int child_irq;
+
+   child_irq = irq_find_mapping(priv->irq_domain, phy);
+   handle_nested_irq(child_irq);
+   handled = true;
+   }
+   }
+
+   return handled ? IRQ_HANDLED : IRQ_NONE;
+}


What is the reason for not implementing this as a chained interrupt 
flow?



+
+static void mt7530_irq_mask(struct irq_data *d)
+{
+   struct mt7530_priv *priv = irq_data_get_irq_chip_data(d);
+
+   priv->irq_enable &= ~BIT(d->hwirq);
+}
+
+static void mt7530_irq_unmask(struct irq_data *d)
+{
+   struct mt7530_priv *priv = irq_data_get_irq_chip_data(d);
+
+   priv->irq_enable |= BIT(d->hwirq);
+}
+
+static void mt7530_irq_bus_lock(struct irq_data *d)
+{
+   struct mt7530_priv *priv = irq_data_get_irq_chip_data(d);
+
+   mutex_lock(&priv->reg_mutex);


Are you always guaranteed to be in a thread context? I guess that
is the case, given that you request a threaded interrupt, but
it would be worth documenting.


+}
+
+static void mt7530_irq_bus_sync_unlock(struct irq_data *d)
+{
+   struct mt7530_priv *priv = irq_data_get_irq_chip_data(d);
+
+   mt7530_write(priv, MT7530_SYS_INT_EN, priv->irq_enable);
+   mutex_unlock(&priv->reg_mutex);
+}
+
+static struct irq_chip mt7530_irq_chip = {
+   .name = MT7530_NAME,
+   .irq_mask = mt7530_irq_mask,
+   .irq_unmask = mt7530_irq_unmask,
+   .irq_bus_lock = mt7530_irq_bus_lock,
+   .irq_bus_sync_unlock = mt7530_irq_bus_sync_unlock,
+};
+
+static int
+mt7530_irq_map(struct irq_domain *domain, unsigned int irq,
+  irq_hw_number_t hwirq)
+{
+   irq_set_chip_data(irq, domain->host_data);
+   irq_set_chip_and_handler(irq, &mt7530_irq_chip, handle_simple_irq);
+   irq_set_noprobe(irq);
+
+   return 0;
+}
+
+static const struct irq_domain_ops mt7530_irq_domain_ops = {
+   .map = mt7530_irq_map,
+   .xlate = irq_domain_xlate_onecell,
+};
+
+static void
+mt7530_irq_init(struct mt7530_priv *priv)
+{
+   struct mii_bus *bus = priv->ds->slave_mii_bus;
+   struct device *dev = priv->dev;
+   struct device_node *np = dev->of_node;
+   int parent_irq;
+   int phy, ret;
+
+   parent_irq = of_irq_get(np, 0);
+   if (parent_irq <= 0) {
+   dev_err(dev, "failed to get parent IRQ: %d\n", parent_irq);
+   return;


It seems odd not to propagate the error, since I assume the device will
not be functional.


+   }
+
+   mt7530_set(priv, MT7530_TOP_SIG_CTRL, TOP_SIG_CTRL_NORMAL);
+   ret = devm_request_threaded_irq(dev, parent_irq, NULL, mt7530_irq,
+   IRQF_ONESHOT, MT7530_NAME, priv);
+   if (ret) {
+   dev_err(dev, "failed to request IRQ: %d\n", ret);
+   return;
+   }
+
+   priv->irq_domain = irq_domain_add_linear(np, MT7530_NUM_PHYS,
+   &mt7530_irq_domain_ops, priv);


The creation order is... interesting. You have a handler ready to fire,
nothing seems to initialise the HW state, and the priv data structure
is not fully populated. If any interrupt is pending at this stage,
you have a panic in your hands.


+   if (!priv->irq_domain) {
+   dev_err(dev, "failed to create IRQ domain\n");
+   return;
+   }
+
+

Re: [PATCH 07/21] vdpa: multiple address spaces support

2020-12-30 Thread Eli Cohen

On Wed, Dec 30, 2020 at 12:04:30PM +0800, Jason Wang wrote:
> 
> On 2020/12/29 下午3:28, Eli Cohen wrote:
> > > @@ -43,6 +43,8 @@ struct vdpa_vq_state {
> > >* @index: device index
> > >* @features_valid: were features initialized? for legacy guests
> > >* @nvqs: the number of virtqueues
> > > + * @ngroups: the number of virtqueue groups
> > > + * @nas: the number of address spaces
> > I am not sure these can be categorised as part of the state of the VQ.
> > It's more of a property so maybe we can have a callback to get the
> > properties of the VQ?
> 
> 
> Or maybe there's a misunderstanding of the patch.
> 

Yes, I misinterpreted the hunk. No issue here.

> Those two attributes belongs to vdpa_device instead of vdpa_vq_state
> actually.
> 
> Thanks
>

[PATCH] selftests: xfrm: fix test return value override issue in xfrm_policy.sh

2020-12-30 Thread Po-Hsu Lin

When running this xfrm_policy.sh test script, even with some cases
marked as FAIL, the overall test result will still be PASS:

$ sudo ./xfrm_policy.sh
PASS: policy before exception matches
FAIL: expected ping to .254 to fail (exceptions)
PASS: direct policy matches (exceptions)
PASS: policy matches (exceptions)
FAIL: expected ping to .254 to fail (exceptions and block policies)
PASS: direct policy matches (exceptions and block policies)
PASS: policy matches (exceptions and block policies)
FAIL: expected ping to .254 to fail (exceptions and block policies after hresh 
changes)
PASS: direct policy matches (exceptions and block policies after hresh changes)
PASS: policy matches (exceptions and block policies after hresh changes)
FAIL: expected ping to .254 to fail (exceptions and block policies after 
hthresh change in ns3)
PASS: direct policy matches (exceptions and block policies after hthresh change 
in ns3)
PASS: policy matches (exceptions and block policies after hthresh change in ns3)
FAIL: expected ping to .254 to fail (exceptions and block policies after htresh 
change to normal)
PASS: direct policy matches (exceptions and block policies after htresh change 
to normal)
PASS: policy matches (exceptions and block policies after htresh change to 
normal)
PASS: policies with repeated htresh change
$ echo $?
0

This is because the $lret in check_xfrm() is not a local variable.
Therefore when a test failed in check_exceptions(), the non-zero $lret
will later get reset to 0 when the next test calls check_xfrm().

With this fix, the final return value will be 1. Make it easier for
testers to spot this failure.

Fixes: 39aa6928d462d0 ("xfrm: policy: fix netlink/pf_key policy lookups")
Signed-off-by: Po-Hsu Lin 
---
 tools/testing/selftests/net/xfrm_policy.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/xfrm_policy.sh 
b/tools/testing/selftests/net/xfrm_policy.sh
index 7a1bf94..5922941 100755
--- a/tools/testing/selftests/net/xfrm_policy.sh
+++ b/tools/testing/selftests/net/xfrm_policy.sh
@@ -202,7 +202,7 @@ check_xfrm() {
# 1: iptables -m policy rule count != 0
rval=$1
ip=$2
-   lret=0
+   local lret=0
 
ip netns exec ns1 ping -q -c 1 10.0.2.$ip > /dev/null
 
-- 
2.7.4

Re: [PATCH 12/21] vhost-vdpa: introduce uAPI to get the number of virtqueue groups

2020-12-30 Thread Eli Cohen

On Wed, Dec 16, 2020 at 02:48:09PM +0800, Jason Wang wrote:
> Follows the vDPA support for multiple address spaces, this patch
> introduce uAPI for the userspace to know the number of virtqueue
> groups supported by the vDPA device.
> 
> Signed-off-by: Jason Wang 
> ---
>  drivers/vhost/vdpa.c   | 4 
>  include/uapi/linux/vhost.h | 3 +++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 060d5b5b7e64..1ba5901b28e7 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -536,6 +536,10 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
>   case VHOST_VDPA_GET_VRING_NUM:
>   r = vhost_vdpa_get_vring_num(v, argp);
>   break;
> + case VHOST_VDPA_GET_GROUP_NUM:
> + r = copy_to_user(argp, &v->vdpa->ngroups,
> +  sizeof(v->vdpa->ngroups));
> + break;

Is this and other ioctls already supported in qemu?

>   case VHOST_SET_LOG_BASE:
>   case VHOST_SET_LOG_FD:
>   r = -ENOIOCTLCMD;
> diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> index 59c6c0fbaba1..8a4e6e426bbf 100644
> --- a/include/uapi/linux/vhost.h
> +++ b/include/uapi/linux/vhost.h
> @@ -145,4 +145,7 @@
>  /* Get the valid iova range */
>  #define VHOST_VDPA_GET_IOVA_RANGE_IOR(VHOST_VIRTIO, 0x78, \
>struct vhost_vdpa_iova_range)
> +/* Get the number of virtqueue groups. */
> +#define VHOST_VDPA_GET_GROUP_NUM _IOR(VHOST_VIRTIO, 0x79, unsigned int)
> +
>  #endif
> -- 
> 2.25.1
>

Re: Re: [RFC v2 09/13] vduse: Add support for processing vhost iotlb message

2020-12-30 Thread Yongji Xie

On Wed, Dec 30, 2020 at 4:41 PM Jason Wang  wrote:
>
>
> On 2020/12/30 下午3:09, Yongji Xie wrote:
> > On Wed, Dec 30, 2020 at 2:11 PM Jason Wang  wrote:
> >>
> >> On 2020/12/29 下午6:26, Yongji Xie wrote:
> >>> On Tue, Dec 29, 2020 at 5:11 PM Jason Wang  wrote:
> 
>  - Original Message -
> > On Mon, Dec 28, 2020 at 4:43 PM Jason Wang  wrote:
> >> On 2020/12/28 下午4:14, Yongji Xie wrote:
>  I see. So all the above two questions are because 
>  VHOST_IOTLB_INVALIDATE
>  is expected to be synchronous. This need to be solved by tweaking the
>  current VDUSE API or we can re-visit to go with descriptors relaying
>  first.
> 
> >>> Actually all vdpa related operations are synchronous in current
> >>> implementation. The ops.set_map/dma_map/dma_unmap should not return
> >>> until the VDUSE_UPDATE_IOTLB/VDUSE_INVALIDATE_IOTLB message is replied
> >>> by userspace. Could it solve this problem?
> >> I was thinking whether or not we need to generate IOTLB_INVALIDATE
> >> message to VDUSE during dma_unmap (vduse_dev_unmap_page).
> >>
> >> If we don't, we're probably fine.
> >>
> > It seems not feasible. This message will be also used in the
> > virtio-vdpa case to notify userspace to unmap some pages during
> > consistent dma unmapping. Maybe we can document it to make sure the
> > users can handle the message correctly.
>  Just to make sure I understand your point.
> 
>  Do you mean you plan to notify the unmap of 1) streaming DMA or 2)
>  coherent DMA?
> 
>  For 1) you probably need a workqueue to do that since dma unmap can
>  be done in irq or bh context. And if usrspace does't do the unmap, it
>  can still access the bounce buffer (if you don't zap pte)?
> 
> >>> I plan to do it in the coherent DMA case.
> >>
> >> Any reason for treating coherent DMA differently?
> >>
> > Now the memory of the bounce buffer is allocated page by page in the
> > page fault handler. So it can't be used in coherent DMA mapping case
> > which needs some memory with contiguous virtual addresses. I can use
> > vmalloc() to do allocation for the bounce buffer instead. But it might
> > cause some memory waste. Any suggestion?
>
>
> I may miss something. But I don't see a relationship between the
> IOTLB_UNMAP and vmalloc().
>

In the vmalloc() case, the coherent DMA page will be taken from the
memory allocated by vmalloc(). So IOTLB_UNMAP is not needed anymore
during coherent DMA unmapping because those vmalloc'ed memory which
has been mapped into userspace address space during initialization can
be reused. And userspace should not unmap the region until we destroy
the device.

>
> >
> >>> It's true that userspace can
> >>> access the dma buffer if userspace doesn't do the unmap. But the dma
> >>> pages would not be freed and reused unless user space called munmap()
> >>> for them.
> >>
> >> I wonder whether or not we could recycle IOVA in this case to avoid the
> >> IOTLB_UMAP message.
> >>
> > We can achieve that if we use vmalloc() to do allocation for the
> > bounce buffer which can be used in coherent DMA mapping case. But
> > looks like we still have no way to avoid the IOTLB_UMAP message in
> > vhost-vdpa case.
>
>
> I think that's fine. For virtio-vdpa, from VDUSE userspace perspective,
> it works like a driver that is using SWIOTLB in this case.
>

OK, will do it in v3.

Thanks,
Yongji

[PATCH v2] net: macb: Correct usage of MACB_CAPS_CLK_HW_CHG flag

2020-12-30 Thread Charles Keepax

A new flag MACB_CAPS_CLK_HW_CHG was added and all callers of
macb_set_tx_clk were gated on the presence of this flag.

-   if (!clk)
+ if (!bp->tx_clk || !(bp->caps & MACB_CAPS_CLK_HW_CHG))

However the flag was not added to anything other than the new
sama7g5_gem, turning that function call into a no op for all other
systems. This breaks the networking on Zynq.

The commit message adding this states: a new capability so that
macb_set_tx_clock() to not be called for IPs having this
capability

This strongly implies that present of the flag was intended to skip
the function not absence of the flag. Update the if statement to
this effect, which repairs the existing users.

Fixes: daafa1d33cc9 ("net: macb: add capability to not set the clock rate")
Suggested-by: Andrew Lunn 
Signed-off-by: Charles Keepax 
---

Changes since v1:
 - Updated flag semantics to skip function, as appears to have been
   intended by the initial commit.

Thanks,
Charles

 drivers/net/ethernet/cadence/macb_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c 
b/drivers/net/ethernet/cadence/macb_main.c
index d5d910916c2e8..814a5b10141d1 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -467,7 +467,7 @@ static void macb_set_tx_clk(struct macb *bp, int speed)
 {
long ferr, rate, rate_rounded;
 
-   if (!bp->tx_clk || !(bp->caps & MACB_CAPS_CLK_HW_CHG))
+   if (!bp->tx_clk || (bp->caps & MACB_CAPS_CLK_HW_CHG))
return;
 
switch (speed) {
-- 
2.11.0

[PATCH net] macvlan: fix null pointer dereference in macvlan_changelink_sources()

2020-12-30 Thread wangyunjian

From: Yunjian Wang 

Currently pointer data is dereferenced when declaring addr before
pointer data is null checked. This could lead to a null pointer
dereference. Fix this by checking if pointer data is null first.

Fixes: 79cf79abce71 ("macvlan: add source mode")
Signed-off-by: Yunjian Wang 
---
 drivers/net/macvlan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index fb51329f8964..e412fd6b6798 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -1356,7 +1356,7 @@ static int macvlan_changelink_sources(struct macvlan_dev 
*vlan, u32 mode,
struct nlattr *nla, *head;
struct macvlan_source_entry *entry;
 
-   if (data[IFLA_MACVLAN_MACADDR])
+   if (data && data[IFLA_MACVLAN_MACADDR])
addr = nla_data(data[IFLA_MACVLAN_MACADDR]);
 
if (mode == MACVLAN_MACADDR_ADD) {
-- 
2.23.0

Re: [PATCH net-next] net/mlxfw: Use kzalloc for allocating only one thing

2020-12-30 Thread Ido Schimmel

On Wed, Dec 30, 2020 at 04:18:35PM +0800, Zheng Yongjun wrote:
> Use kzalloc rather than kcalloc(1,...)
> 
> The semantic patch that makes this change is as follows:
> (http://coccinelle.lip6.fr/)
> 
> // 
> @@
> @@
> 
> - kcalloc(1,
> + kzalloc(
>   ...)
> // 
> 
> Signed-off-by: Zheng Yongjun 

Reviewed-by: Ido Schimmel 

Thanks

Re: [PATCH] CDC-NCM: remove "connected" log message

2020-12-30 Thread Oliver Neukum

Am Dienstag, den 29.12.2020, 11:50 -0800 schrieb Roland Dreier:
> > I looked at them again and found that there is a way to get
> > the same effect that will make maintenance easier in the long run.
> > Could I send them to you later this week for testing?
> 
> Yes, please.  I have a good test setup now so I can easily try out patches.

Thank you,

here we go.

Regards
Oliver

From 7dfb5f35933ebbe12076e41606b178fa7d8d2e7b Mon Sep 17 00:00:00 2001
From: Oliver Neukum 
Date: Tue, 1 Dec 2020 11:31:15 +0100
Subject: [PATCH 1/2] usbnet: add method for reporting speed without MDIO

The old method for reporting network speed upwards
assumed that a device uses MDIO and uses the generic phy
functions based on that.
Add a a primitive internal version not making the assumption
reporting back directly what the status operations record.

Signed-off-by: Oliver Neukum 
---
 drivers/net/usb/usbnet.c   | 30 +-
 include/linux/usb/usbnet.h |  7 ++-
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index 1447da1d5729..bcd17f6d6de6 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -944,7 +944,7 @@ EXPORT_SYMBOL_GPL(usbnet_open);
  * they'll probably want to use this base set.
  */
 
-int usbnet_get_link_ksettings(struct net_device *net,
+int usbnet_get_link_ksettings_mdio(struct net_device *net,
 			  struct ethtool_link_ksettings *cmd)
 {
 	struct usbnet *dev = netdev_priv(net);
@@ -956,6 +956,32 @@ int usbnet_get_link_ksettings(struct net_device *net,
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(usbnet_get_link_ksettings_mdio);
+
+int usbnet_get_link_ksettings(struct net_device *net,
+	struct ethtool_link_ksettings *cmd)
+{
+	struct usbnet *dev = netdev_priv(net);
+
+	/* the assumption that speed is equal on tx and rx
+	 * is deeply engrained into the networking layer.
+	 * For wireless stuff it is not true.
+	 * We assume that rxspeed matters more.
+	 */
+	if (dev->rxspeed != SPEED_UNKNOWN)
+		cmd->base.speed = dev->rxspeed / 100;
+	else if (dev->txspeed != SPEED_UNKNOWN)
+		cmd->base.speed = dev->txspeed / 100;
+	/* if a minidriver does not record speed we try to
+	 * fall back on MDIO
+	 */
+	else if (!dev->mii.mdio_read)
+		cmd->base.speed = SPEED_UNKNOWN;
+	else
+		mii_ethtool_get_link_ksettings(&dev->mii, cmd);
+
+	return 0;
+}
 EXPORT_SYMBOL_GPL(usbnet_get_link_ksettings);
 
 int usbnet_set_link_ksettings(struct net_device *net,
@@ -1661,6 +1687,8 @@ usbnet_probe (struct usb_interface *udev, const struct usb_device_id *prod)
 	dev->intf = udev;
 	dev->driver_info = info;
 	dev->driver_name = name;
+	dev->rxspeed = -1; /* unknown or handled by MII */
+	dev->txspeed = -1;
 
 	net->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
 	if (!net->tstats)
diff --git a/include/linux/usb/usbnet.h b/include/linux/usb/usbnet.h
index 88a7673894d5..f748c758f82a 100644
--- a/include/linux/usb/usbnet.h
+++ b/include/linux/usb/usbnet.h
@@ -53,6 +53,8 @@ struct usbnet {
 	u32			hard_mtu;	/* count any extra framing */
 	size_t			rx_urb_size;	/* size for rx urbs */
 	struct mii_if_info	mii;
+	int			rxspeed;	/* if MII is not used */
+	int			txspeed;	/* if MII is not used */
 
 	/* various kinds of pending driver work */
 	struct sk_buff_head	rxq;
@@ -267,8 +269,11 @@ extern void usbnet_purge_paused_rxq(struct usbnet *);
 
 extern int usbnet_get_link_ksettings(struct net_device *net,
  struct ethtool_link_ksettings *cmd);
-extern int usbnet_set_link_ksettings(struct net_device *net,
+extern int usbnet_set_link_ksettings_mdio(struct net_device *net,
  const struct ethtool_link_ksettings *cmd);
+/* Legacy - to be used if you really need an error to be returned */
+extern int usbnet_set_link_ksettings(struct net_device *net,
+	const struct ethtool_link_ksettings *cmd);
 extern u32 usbnet_get_link(struct net_device *net);
 extern u32 usbnet_get_msglevel(struct net_device *);
 extern void usbnet_set_msglevel(struct net_device *, u32);
-- 
2.26.2

From 447850c0e90aef5a8bb569d3888c76032a82c7c7 Mon Sep 17 00:00:00 2001
From: Oliver Neukum 
Date: Tue, 1 Dec 2020 11:33:38 +0100
Subject: [PATCH 2/2] CDC-NCM: record speed in status method

The driver has a status method for receiving speed updates.
The framework, however, had support functions only for devices
that reported their speed upon an explicit query over a MDIO
interface.
CDC_NCM however gets direct notifications from the device.
As new support functions have become available, we shall now
record such notifications and tell the usbnet framework
to make direct use of them without going through the PHY layer.

Signed-off-by: Oliver Neukum 
---
 drivers/net/usb/cdc_ncm.c | 17 ++---
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index 2bac57d5e8d5..78cb3da8de0b 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -18

[PATCH net] selftests: mlxsw: Set headroom size of correct port

2020-12-30 Thread Ido Schimmel

From: Ido Schimmel 

The test was setting the headroom size of the wrong port. This was not
visible because of a firmware bug that canceled this bug.

Set the headroom size of the correct port, so that the test will pass
with both old and new firmware versions.

Fixes: bfa804784e32 ("selftests: mlxsw: Add a PFC test")
Signed-off-by: Ido Schimmel 
Reviewed-by: Petr Machata 
---
 tools/testing/selftests/drivers/net/mlxsw/qos_pfc.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/drivers/net/mlxsw/qos_pfc.sh 
b/tools/testing/selftests/drivers/net/mlxsw/qos_pfc.sh
index 4d900bc1f76c..5c7700212f75 100755
--- a/tools/testing/selftests/drivers/net/mlxsw/qos_pfc.sh
+++ b/tools/testing/selftests/drivers/net/mlxsw/qos_pfc.sh
@@ -230,7 +230,7 @@ switch_create()
__mlnx_qos -i $swp4 --pfc=0,1,0,0,0,0,0,0 >/dev/null
# PG0 will get autoconfigured to Xoff, give PG1 arbitrarily 100K, which
# is (-2*MTU) about 80K of delay provision.
-   __mlnx_qos -i $swp3 --buffer_size=0,$_100KB,0,0,0,0,0,0 >/dev/null
+   __mlnx_qos -i $swp4 --buffer_size=0,$_100KB,0,0,0,0,0,0 >/dev/null
 
# bridges
# ---
-- 
2.29.2

Re: [PATCH] xfrm: Fix wraparound in xfrm_policy_addr_delta()

2020-12-30 Thread Visa Hankala

On Tue, Dec 29, 2020 at 05:01:27PM +0100, Florian Westphal wrote:
> Visa Hankala  wrote:
> > Use three-way comparison for address elements to avoid integer
> > wraparound in the result of xfrm_policy_addr_delta().
> > 
> > This ensures that the search trees are built and traversed correctly
> > when the difference between compared address elements is larger
> > than INT_MAX.
> 
> Please provide an update to tools/testing/selftests/net/xfrm_policy.sh
> that shows that this is a problem.

I will do that in the next revision.

> > switch (family) {
> > case AF_INET:
> > -   if (sizeof(long) == 4 && prefixlen == 0)
> > -   return ntohl(a->a4) - ntohl(b->a4);
> > -   return (ntohl(a->a4) & ((~0UL << (32 - prefixlen -
> > -  (ntohl(b->a4) & ((~0UL << (32 - prefixlen;
> > +   mask = ~0U << (32 - prefixlen);
> > +   ma = ntohl(a->a4) & mask;
> > +   mb = ntohl(b->a4) & mask;
> 
> This is suspicious.  Is prefixlen == 0 impossible?
> 
> If not, then after patch
> mask = ~0U << 32;
> 
> ... and function returns 0.

prefixlen == 0 is possible. However, I now realize that left shift
by 32 should be avoided with 32-bit integers.

With prefixlen == 0, there is only one equivalence class, so
returning 0 seems reasonable to me.

Is there a reason why the function has treated /0 prefix as /32
with IPv4? IPv6 does not have this treatment.

[PATCH net] net: ipv6: Validate GSO SKB before finish IPv6 processing

2020-12-30 Thread Aya Levin

There are cases where GSO segment's length exceeds the egress MTU.
If so:
 - Consume the SKB and its segments.
 - Issue an ICMP packet with 'Packet Too Big' message containing the
   MTU, allowing the source host to reduce its Path MTU appropriately.

Note: These cases are handled in the same manner in IPv4 output finish.
This patch aligns the behavior of IPv6 and IPv4.

Fixes: 9e50849054a4 ("netfilter: ipv6: move POSTROUTING invocation before 
fragmentation")
Signed-off-by: Aya Levin 
Reviewed-by: Tariq Toukan 
---
 net/ipv6/ip6_output.c | 41 -
 1 file changed, 40 insertions(+), 1 deletion(-)

Hi,

Please queue to -stable >= v2.6.35.

Daniel Axtens, does this solve the issue referred in your commit?
8914a595110a bnx2x: disable GSO where gso_size is too big for hardware

Thanks,
Aya

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 749ad72386b2..36466f2211bd 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -125,8 +125,43 @@ static int ip6_finish_output2(struct net *net, struct sock 
*sk, struct sk_buff *
return -EINVAL;
 }
 
+static int
+ip6_finish_output_gso_slowpath_drop(struct net *net, struct sock *sk,
+   struct sk_buff *skb, unsigned int mtu)
+{
+   struct sk_buff *segs, *nskb;
+   netdev_features_t features;
+   int ret;
+
+   /* Please see corresponding comment in ip_finish_output_gso
+* describing the cases where GSO segment length exceeds the
+* egress MTU.
+*/
+   features = netif_skb_features(skb);
+   segs = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK);
+   if (IS_ERR_OR_NULL(segs)) {
+   kfree_skb(skb);
+   return -ENOMEM;
+   }
+
+   consume_skb(skb);
+
+   skb_list_walk_safe(segs, segs, nskb) {
+   int err;
+
+   skb_mark_not_on_list(segs);
+   err = ip6_fragment(net, sk, segs, ip6_finish_output2);
+   if (err && ret == 0)
+   ret = err;
+   }
+
+   return ret;
+}
+
 static int __ip6_finish_output(struct net *net, struct sock *sk, struct 
sk_buff *skb)
 {
+   unsigned int mtu;
+
 #if defined(CONFIG_NETFILTER) && defined(CONFIG_XFRM)
/* Policy lookup after SNAT yielded a new policy */
if (skb_dst(skb)->xfrm) {
@@ -135,7 +170,11 @@ static int __ip6_finish_output(struct net *net, struct 
sock *sk, struct sk_buff
}
 #endif
 
-   if ((skb->len > ip6_skb_dst_mtu(skb) && !skb_is_gso(skb)) ||
+   mtu = ip6_skb_dst_mtu(skb);
+   if (skb_is_gso(skb) && !skb_gso_validate_network_len(skb, mtu))
+   return ip6_finish_output_gso_slowpath_drop(net, sk, skb, mtu);
+
+   if ((skb->len > mtu && !skb_is_gso(skb)) ||
dst_allfrag(skb_dst(skb)) ||
(IP6CB(skb)->frag_max_size && skb->len > IP6CB(skb)->frag_max_size))
return ip6_fragment(net, sk, skb, ip6_finish_output2);
-- 
2.14.1

Re: [PATCH rfc 2/3] virtio-net: support receive timestamp

2020-12-30 Thread Richard Cochran

On Mon, Dec 28, 2020 at 07:57:20PM -0500, Willem de Bruijn wrote:
> There is a well understood method for synchronizing guest and host
> clock in KVM using ptp_kvm. For virtual environments without NIC
> hardware offload, the when host timestamps in software, this suffices.
> 
> Syncing host with NIC is assumed if the host advertises the feature
> and implements using real hardware timestamps.

Sounds reasonable.

Thanks,
Richard

Re: [PATCH rfc 3/3] virtio-net: support transmit timestamp

2020-12-30 Thread Richard Cochran

On Mon, Dec 28, 2020 at 11:22:33AM -0500, Willem de Bruijn wrote:
> From: Willem de Bruijn 
> 
> Add optional delivery time (SO_TXTIME) offload for virtio-net.
> 
> The Linux TCP/IP stack tries to avoid bursty transmission and network
> congestion through pacing: computing an skb delivery time based on
> congestion information. Userspace protocol implementations can achieve
> the same with SO_TXTIME. This may also reduce scheduling jitter and
> improve RTT estimation.

This description is clear, but the Subject line is confusing.  It made
me wonder whether this series is somehow about host/guest synchronization
(but your comments do explain that that isn't the case).

How about this instead?

   virtio-net: support future packet transmit time

Thanks,
Richard

Re: [PATCH] xfrm: Fix wraparound in xfrm_policy_addr_delta()

2020-12-30 Thread Florian Westphal

Visa Hankala  wrote:
> On Tue, Dec 29, 2020 at 05:01:27PM +0100, Florian Westphal wrote:
> > This is suspicious.  Is prefixlen == 0 impossible?
> > 
> > If not, then after patch
> > mask = ~0U << 32;
> > 
> > ... and function returns 0.
> 
> With prefixlen == 0, there is only one equivalence class, so
> returning 0 seems reasonable to me.

Right, that seems reasonable indeed.

> Is there a reason why the function has treated /0 prefix as /32
> with IPv4? IPv6 does not have this treatment.

Not that I recall, looks like a bug.

[PATCH 2/2] net: ks8851: Register MDIO bus and the internal PHY

2020-12-30 Thread Marek Vasut

The KS8851 has a reduced internal PHY, which is accessible through its
registers at offset 0xe4. The PHY is compatible with KS886x PHY present
in Micrel switches, except the PHY ID Low/High registers are swapped.
Register MDIO bus so this PHY can be detected and probed by phylib.

Signed-off-by: Marek Vasut 
Cc: Andrew Lunn 
Cc: Heiner Kallweit 
Cc: Lukas Wunner 
To: netdev@vger.kernel.org
---
 drivers/net/ethernet/micrel/ks8851.h|  1 +
 drivers/net/ethernet/micrel/ks8851_common.c | 69 -
 2 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/micrel/ks8851.h 
b/drivers/net/ethernet/micrel/ks8851.h
index 2b319e451121..9bed9e024d81 100644
--- a/drivers/net/ethernet/micrel/ks8851.h
+++ b/drivers/net/ethernet/micrel/ks8851.h
@@ -403,6 +403,7 @@ struct ks8851_net {
struct regulator*vdd_reg;
struct regulator*vdd_io;
int gpio;
+   struct mii_bus  *mii_bus;
 
void(*lock)(struct ks8851_net *ks,
unsigned long *flags);
diff --git a/drivers/net/ethernet/micrel/ks8851_common.c 
b/drivers/net/ethernet/micrel/ks8851_common.c
index 6fc7483aea03..c4f5cb9768ef 100644
--- a/drivers/net/ethernet/micrel/ks8851_common.c
+++ b/drivers/net/ethernet/micrel/ks8851_common.c
@@ -23,6 +23,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 #include "ks8851.h"
@@ -983,6 +984,24 @@ static void ks8851_phy_write(struct net_device *dev,
}
 }
 
+static int ks8851_mdio_read(struct mii_bus *bus, int phy_id, int reg)
+{
+   struct ks8851_net *ks = bus->priv;
+
+   if (phy_id != 0)
+   return 0x;
+
+   return ks8851_phy_read(ks->netdev, phy_id, reg);
+}
+
+static int ks8851_mdio_write(struct mii_bus *bus, int phy_id, int reg, u16 val)
+{
+   struct ks8851_net *ks = bus->priv;
+
+   ks8851_phy_write(ks->netdev, phy_id, reg, val);
+   return 0;
+}
+
 /**
  * ks8851_read_selftest - read the selftest memory info.
  * @ks: The device state
@@ -1046,6 +1065,42 @@ int ks8851_resume(struct device *dev)
 }
 #endif
 
+static int ks8851_register_mdiobus(struct ks8851_net *ks, struct device *dev)
+{
+   struct mii_bus *mii_bus;
+   int ret;
+
+   mii_bus = mdiobus_alloc();
+   if (!mii_bus)
+   return -ENOMEM;
+
+   mii_bus->name = "ks8851_eth_mii";
+   mii_bus->read = ks8851_mdio_read;
+   mii_bus->write = ks8851_mdio_write;
+   mii_bus->priv = ks;
+   mii_bus->parent = dev;
+   mii_bus->phy_mask = ~BIT(0);
+   snprintf(mii_bus->id, MII_BUS_ID_SIZE, "%s", dev_name(dev));
+
+   ret = mdiobus_register(mii_bus);
+   if (ret)
+   goto err_mdiobus_register;
+
+   ks->mii_bus = mii_bus;
+
+   return 0;
+
+err_mdiobus_register:
+   mdiobus_free(mii_bus);
+   return ret;
+}
+
+static void ks8851_unregister_mdiobus(struct ks8851_net *ks)
+{
+   mdiobus_unregister(ks->mii_bus);
+   mdiobus_free(ks->mii_bus);
+}
+
 int ks8851_probe_common(struct net_device *netdev, struct device *dev,
int msg_en)
 {
@@ -1104,6 +1159,8 @@ int ks8851_probe_common(struct net_device *netdev, struct 
device *dev,
 
INIT_WORK(&ks->rxctrl_work, ks8851_rxctrl_work);
 
+   SET_NETDEV_DEV(netdev, dev);
+
/* setup EEPROM state */
ks->eeprom.data = ks;
ks->eeprom.width = PCI_EEPROM_WIDTH_93C46;
@@ -1120,6 +1177,10 @@ int ks8851_probe_common(struct net_device *netdev, 
struct device *dev,
 
dev_info(dev, "message enable is %d\n", msg_en);
 
+   ret = ks8851_register_mdiobus(ks, dev);
+   if (ret)
+   goto err_mdio;
+
/* set the default message enable */
ks->msg_enable = netif_msg_init(msg_en, NETIF_MSG_DRV |
NETIF_MSG_PROBE |
@@ -1128,7 +1189,6 @@ int ks8851_probe_common(struct net_device *netdev, struct 
device *dev,
skb_queue_head_init(&ks->txq);
 
netdev->ethtool_ops = &ks8851_ethtool_ops;
-   SET_NETDEV_DEV(netdev, dev);
 
dev_set_drvdata(dev, ks);
 
@@ -1156,7 +1216,7 @@ int ks8851_probe_common(struct net_device *netdev, struct 
device *dev,
ret = register_netdev(netdev);
if (ret) {
dev_err(dev, "failed to register network device\n");
-   goto err_netdev;
+   goto err_id;
}
 
netdev_info(netdev, "revision %d, MAC %pM, IRQ %d, %s EEPROM\n",
@@ -1165,8 +1225,9 @@ int ks8851_probe_common(struct net_device *netdev, struct 
device *dev,
 
return 0;
 
-err_netdev:
 err_id:
+   ks8851_unregister_mdiobus(ks);
+err_mdio:
if (gpio_is_valid(gpio))
gpio_set_value(gpio, 0);
regulator_disable(ks->vdd_reg);
@@ -1180,6 +1241,8 @@ int ks8851_remove_common(struct device *dev)
 {
struct ks8851_net *priv = dev_get_drvdata(dev);
 
+   ks8851_unregister_mdiobus(pri

[PATCH 1/2] net: phy: micrel: Add KS8851 PHY support

2020-12-30 Thread Marek Vasut

The KS8851 has a reduced internal PHY, which is accessible through its
registers at offset 0xe4. The PHY is compatible with KS886x PHY present
in Micrel switches, except the PHY ID Low/High registers are swapped.
Add PHY ID for this KS8851 PHY and use custom PHY ID mask due to the swap.

Signed-off-by: Marek Vasut 
Cc: Andrew Lunn 
Cc: Heiner Kallweit 
Cc: Lukas Wunner 
To: netdev@vger.kernel.org
---
 drivers/net/phy/micrel.c   | 9 +
 include/linux/micrel_phy.h | 2 ++
 2 files changed, 11 insertions(+)

diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index 54e0d75203da..ca6da128e37a 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -1386,6 +1386,14 @@ static struct phy_driver ksphy_driver[] = {
.read_status= ksz8873mll_read_status,
.suspend= genphy_suspend,
.resume = genphy_resume,
+}, {
+   .phy_id = PHY_ID_KSZ8851,
+   .phy_id_mask= 0xfff000ff,
+   .name   = "Micrel KSZ8851 Ethernet MAC",
+   /* PHY_BASIC_FEATURES */
+   .config_init= kszphy_config_init,
+   .suspend= genphy_suspend,
+   .resume = genphy_resume,
 }, {
.phy_id = PHY_ID_KSZ886X,
.phy_id_mask= MICREL_PHY_ID_MASK,
@@ -1432,6 +1440,7 @@ static struct mdio_device_id __maybe_unused micrel_tbl[] 
= {
{ PHY_ID_KSZ8061, MICREL_PHY_ID_MASK },
{ PHY_ID_KSZ8081, MICREL_PHY_ID_MASK },
{ PHY_ID_KSZ8873MLL, MICREL_PHY_ID_MASK },
+   { PHY_ID_KSZ8851, 0xfff000ff },
{ PHY_ID_KSZ886X, MICREL_PHY_ID_MASK },
{ PHY_ID_LAN8814, MICREL_PHY_ID_MASK },
{ }
diff --git a/include/linux/micrel_phy.h b/include/linux/micrel_phy.h
index 416ee6dd2574..1c26e4ac0dc9 100644
--- a/include/linux/micrel_phy.h
+++ b/include/linux/micrel_phy.h
@@ -29,6 +29,8 @@
 #define PHY_ID_KSZ9131 0x00221640
 #define PHY_ID_LAN8814 0x00221660
 
+/* The PHY ID Low/High registers are swapped on KSZ8851 */
+#define PHY_ID_KSZ8851 0x14300022
 #define PHY_ID_KSZ886X 0x00221430
 #define PHY_ID_KSZ8863 0x00221435
 
-- 
2.29.2

Re: [PATCH] selftests: xfrm: fix test return value override issue in xfrm_policy.sh

2020-12-30 Thread Florian Westphal

Po-Hsu Lin  wrote:
> When running this xfrm_policy.sh test script, even with some cases
> marked as FAIL, the overall test result will still be PASS:
> 
> $ sudo ./xfrm_policy.sh
> PASS: policy before exception matches
> FAIL: expected ping to .254 to fail (exceptions)
> PASS: direct policy matches (exceptions)
> PASS: policy matches (exceptions)
> FAIL: expected ping to .254 to fail (exceptions and block policies)
> PASS: direct policy matches (exceptions and block policies)
> PASS: policy matches (exceptions and block policies)
> FAIL: expected ping to .254 to fail (exceptions and block policies after 
> hresh changes)
> PASS: direct policy matches (exceptions and block policies after hresh 
> changes)
> PASS: policy matches (exceptions and block policies after hresh changes)
> FAIL: expected ping to .254 to fail (exceptions and block policies after 
> hthresh change in ns3)
> PASS: direct policy matches (exceptions and block policies after hthresh 
> change in ns3)
> PASS: policy matches (exceptions and block policies after hthresh change in 
> ns3)
> FAIL: expected ping to .254 to fail (exceptions and block policies after 
> htresh change to normal)
> PASS: direct policy matches (exceptions and block policies after htresh 
> change to normal)
> PASS: policy matches (exceptions and block policies after htresh change to 
> normal)
> PASS: policies with repeated htresh change
> $ echo $?
> 0
> 
> This is because the $lret in check_xfrm() is not a local variable.

Acked-by: Florian Westphal

Announcing Netdev 0x15

2020-12-30 Thread Jamal Hadi Salim




While we bid goodbye to 2020, here's some good hopeful news for 2021.

cheers,
jamal


 Forwarded Message 
Subject: [NetDev-People] Announcing Netdev 0x15
Date: Sun, 27 Dec 2020 15:54:31 -0700
From: Tom Herbert via people 
Reply-To: Tom Herbert 
To: peo...@netdevconf.info

Hello Netdev'ers,

On behalf of Netdev Society, I am very pleased to announce the Netdev
0x15 conference! The tentative dates are July 20-23, 2021 and the
location will be San Francisco, California, USA.

Our current plan is to hold an in person conference assuming that the
conditions around Covid-19 have improved, however if conditions
warrant we do have the option of holding a virtual conference again.
We will try to make a definitive announcement about this at least six
weeks before the conference date. Additionally, it seems likely that
even if we do hold the conference in person there will be a hybrid
component to accommodate remote participants that wish to opt out of
attending in person.

The call for papers and workshops will be made early next year, so stay 
tuned!


Thanks,
Tom
___
people mailing list
peo...@netdevconf.org
https://lists.netdevconf.info/cgi-bin/mailman/listinfo/people

Re: BTFIDS: FAILED unresolved symbol udp6_sock

2020-12-30 Thread Jiri Olsa

On Wed, Dec 30, 2020 at 02:28:02PM +0100, Jiri Olsa wrote:
> On Wed, Dec 30, 2020 at 10:03:37AM +0100, Jiri Olsa wrote:
> > On Tue, Dec 29, 2020 at 11:28:35PM +, Qais Yousef wrote:
> > > Hi Jiri
> > > 
> > > On 12/29/20 18:34, Jiri Olsa wrote:
> > > > On Tue, Dec 29, 2020 at 03:13:52PM +, Qais Yousef wrote:
> > > > > Hi
> > > > > 
> > > > > When I enable CONFIG_DEBUG_INFO_BTF I get the following error in the 
> > > > > BTFIDS
> > > > > stage
> > > > > 
> > > > >   FAILED unresolved symbol udp6_sock
> > > > > 
> > > > > I cross compile for arm64. My .config is attached.
> > > > > 
> > > > > I managed to reproduce the problem on v5.9 and v5.10. Plus 5.11-rc1.
> > > > > 
> > > > > Have you seen this before? I couldn't find a specific report about 
> > > > > this
> > > > > problem.
> > > > > 
> > > > > Let me know if you need more info.
> > > > 
> > > > hi,
> > > > this looks like symptom of the gcc DWARF bug we were
> > > > dealing with recently:
> > > > 
> > > >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
> > > >   
> > > > https://lore.kernel.org/lkml/CAE1WUT75gu9G62Q9uAALGN6vLX=o7vz9uhqtvwnbuv81dgm...@mail.gmail.com/#r
> > > > 
> > > > what pahole/gcc version are you using?
> > > 
> > > I'm on gcc 9.3.0
> > > 
> > >   aarch64-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> > > 
> > > I was on pahole v1.17. I moved to v1.19 but I still see the same problem.
> > 
> > I can reproduce with your .config, but make 'defconfig' works,
> > so I guess it's some config option issue, I'll check later today
> 
> so your .config has
>   CONFIG_CRYPTO_DEV_BCM_SPU=y
> 
> and that defines 'struct device_private' which
> clashes with the same struct defined in drivers/base/base.h
> 
> so several networking structs will be doubled, like net_device:
> 
>   $ bpftool btf dump file ../vmlinux.config | grep net_device\' | grep 
> STRUCT
>   [2731] STRUCT 'net_device' size=2240 vlen=133
>   [113981] STRUCT 'net_device' size=2240 vlen=133
> 
> each is using different 'struct device_private' when it's unwinded
> 
> and that will confuse BTFIDS logic, becase we have multiple structs
> with the same name, and we can't be sure which one to pick
> 
> perhaps we should check on this in pahole and warn earlier with
> better error message.. I'll check, but I'm not sure if pahole can
> survive another hastab ;-)
> 
> Andrii, any ideas on this? ;-)
> 
> easy fix is the patch below that renames the bcm's structs,
> it makes the kernel to compile.. but of course the new name
> is probably wrong and we should push this through that code
> authors

also another quick fix is to switch it to module

jirka

Re: BTFIDS: FAILED unresolved symbol udp6_sock

2020-12-30 Thread Jiri Olsa

On Wed, Dec 30, 2020 at 10:03:37AM +0100, Jiri Olsa wrote:
> On Tue, Dec 29, 2020 at 11:28:35PM +, Qais Yousef wrote:
> > Hi Jiri
> > 
> > On 12/29/20 18:34, Jiri Olsa wrote:
> > > On Tue, Dec 29, 2020 at 03:13:52PM +, Qais Yousef wrote:
> > > > Hi
> > > > 
> > > > When I enable CONFIG_DEBUG_INFO_BTF I get the following error in the 
> > > > BTFIDS
> > > > stage
> > > > 
> > > > FAILED unresolved symbol udp6_sock
> > > > 
> > > > I cross compile for arm64. My .config is attached.
> > > > 
> > > > I managed to reproduce the problem on v5.9 and v5.10. Plus 5.11-rc1.
> > > > 
> > > > Have you seen this before? I couldn't find a specific report about this
> > > > problem.
> > > > 
> > > > Let me know if you need more info.
> > > 
> > > hi,
> > > this looks like symptom of the gcc DWARF bug we were
> > > dealing with recently:
> > > 
> > >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
> > >   
> > > https://lore.kernel.org/lkml/CAE1WUT75gu9G62Q9uAALGN6vLX=o7vz9uhqtvwnbuv81dgm...@mail.gmail.com/#r
> > > 
> > > what pahole/gcc version are you using?
> > 
> > I'm on gcc 9.3.0
> > 
> > aarch64-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> > 
> > I was on pahole v1.17. I moved to v1.19 but I still see the same problem.
> 
> I can reproduce with your .config, but make 'defconfig' works,
> so I guess it's some config option issue, I'll check later today

so your .config has
  CONFIG_CRYPTO_DEV_BCM_SPU=y

and that defines 'struct device_private' which
clashes with the same struct defined in drivers/base/base.h

so several networking structs will be doubled, like net_device:

$ bpftool btf dump file ../vmlinux.config | grep net_device\' | grep 
STRUCT
[2731] STRUCT 'net_device' size=2240 vlen=133
[113981] STRUCT 'net_device' size=2240 vlen=133

each is using different 'struct device_private' when it's unwinded

and that will confuse BTFIDS logic, becase we have multiple structs
with the same name, and we can't be sure which one to pick

perhaps we should check on this in pahole and warn earlier with
better error message.. I'll check, but I'm not sure if pahole can
survive another hastab ;-)

Andrii, any ideas on this? ;-)

easy fix is the patch below that renames the bcm's structs,
it makes the kernel to compile.. but of course the new name
is probably wrong and we should push this through that code
authors

jirka


---
diff --git a/drivers/crypto/bcm/cipher.c b/drivers/crypto/bcm/cipher.c
index 30390a7324b2..0e5537838ef3 100644
--- a/drivers/crypto/bcm/cipher.c
+++ b/drivers/crypto/bcm/cipher.c
@@ -42,7 +42,7 @@
 
 /* = Device Structure == */
 
-struct device_private iproc_priv;
+struct bcm_device_private iproc_priv;
 
 /*  Parameters = */
 
diff --git a/drivers/crypto/bcm/cipher.h b/drivers/crypto/bcm/cipher.h
index 0ad5892b445d..71281a3bdbdc 100644
--- a/drivers/crypto/bcm/cipher.h
+++ b/drivers/crypto/bcm/cipher.h
@@ -420,7 +420,7 @@ struct spu_hw {
u32 num_chan;
 };
 
-struct device_private {
+struct bcm_device_private {
struct platform_device *pdev;
 
struct spu_hw spu;
@@ -467,6 +467,6 @@ struct device_private {
struct mbox_chan **mbox;
 };
 
-extern struct device_private iproc_priv;
+extern struct bcm_device_private iproc_priv;
 
 #endif
diff --git a/drivers/crypto/bcm/util.c b/drivers/crypto/bcm/util.c
index 2b304fc78059..77aeedb84055 100644
--- a/drivers/crypto/bcm/util.c
+++ b/drivers/crypto/bcm/util.c
@@ -348,7 +348,7 @@ char *spu_alg_name(enum spu_cipher_alg alg, enum 
spu_cipher_mode mode)
 static ssize_t spu_debugfs_read(struct file *filp, char __user *ubuf,
size_t count, loff_t *offp)
 {
-   struct device_private *ipriv;
+   struct bcm_device_private *ipriv;
char *buf;
ssize_t ret, out_offset, out_count;
int i;

Re: [PATCH v2 2/2] docs: networking: packet_mmap: fix old config reference

2020-12-30 Thread Baruch Siach

Hi Ulises,

On Tue, Dec 29 2020, Ulises Alonso wrote:
> Can you also replace the sentence
>
>  "(like libpcap always does)."
>
> with
>
> "(which old libpcap versions used do)."

Are you sure this change is correct? The text says:

  ... it requires two if you want to get packet's timestamp
  (like libpcap always does).

I think libpcap still reads packets timestamps, though it most likely
uses the newer interface for that.

Maybe we should just drop the libpcap reference here?

Thanks for reviewing the patch,
baruch

> On Tue, Dec 29, 2020 at 10:11 AM Baruch Siach  wrote:
>
>> Before commit 889b8f964f2f ("packet: Kill CONFIG_PACKET_MMAP.") there
>> used to be a CONFIG_PACKET_MMAP config symbol that depended on
>> CONFIG_PACKET. The text still implies that PACKET_MMAP can be disabled.
>> Remove that from the text, as well as reference to old kernel versions.
>>
>> Also, drop reference to broken link to information for pre 2.6.5
>> kernels.
>>
>> Make a slight working improvement (s/In/On/) while at it.
>>
>> Signed-off-by: Baruch Siach 
>> ---
>> v2: Address comments from Jakub Kicinski and Willem de Bruijn
>>
>>   * Don't change PACKET_MMAP
>>
>>   * Remove mention of specific kernel versions
>>
>>   * Don't reflow paragraphs
>>
>>   * s/In/On/
>> ---
>>  Documentation/networking/packet_mmap.rst | 9 -
>>  1 file changed, 4 insertions(+), 5 deletions(-)
>>
>> diff --git a/Documentation/networking/packet_mmap.rst
>> b/Documentation/networking/packet_mmap.rst
>> index f3646c80b019..500ef60b1b82 100644
>> --- a/Documentation/networking/packet_mmap.rst
>> +++ b/Documentation/networking/packet_mmap.rst
>> @@ -8,7 +8,7 @@ Abstract
>>  
>>
>>  This file documents the mmap() facility available with the PACKET
>> -socket interface on 2.4/2.6/3.x kernels. This type of sockets is used for
>> +socket interface. This type of sockets is used for
>>
>>  i) capture network traffic with utilities like tcpdump,
>>  ii) transmit network traffic, or any other that needs raw
>> @@ -25,12 +25,12 @@ Please send your comments to
>>  Why use PACKET_MMAP
>>  ===
>>
>> -In Linux 2.4/2.6/3.x if PACKET_MMAP is not enabled, the capture process
>> is very
>> +Non PACKET_MMAP capture process (plain AF_PACKET) is very
>>  inefficient. It uses very limited buffers and requires one system call to
>>  capture each packet, it requires two if you want to get packet's timestamp
>>  (like libpcap always does).
>>
>> -In the other hand PACKET_MMAP is very efficient. PACKET_MMAP provides a
>> size
>> +On the other hand PACKET_MMAP is very efficient. PACKET_MMAP provides a
>> size
>>  configurable circular buffer mapped in user space that can be used to
>> either
>>  send or receive packets. This way reading packets just needs to wait for
>> them,
>>  most of the time there is no need to issue a single system call.
>> Concerning
>> @@ -252,8 +252,7 @@ PACKET_MMAP setting constraints
>>
>>  In kernel versions prior to 2.4.26 (for the 2.4 branch) and 2.6.5 (2.6
>> branch),
>>  the PACKET_MMAP buffer could hold only 32768 frames in a 32 bit
>> architecture or
>> -16384 in a 64 bit architecture. For information on these kernel versions
>> -see
>> http://pusa.uv.es/~ulisses/packet_mmap/packet_mmap.pre-2.4.26_2.6.5.txt
>> +16384 in a 64 bit architecture.
>>
>>  Block size limit
>>  
>> --
>> 2.29.2

-- 
 ~. .~   Tk Open Systems
=}ooO--U--Ooo{=
   - bar...@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il -

Re: [PATCH] ibmvnic: fix: NULL pointer dereference.

2020-12-30 Thread Michal Suchánek

On Wed, Dec 30, 2020 at 03:23:14PM +0800, YANG LI wrote:
> The error is due to dereference a null pointer in function
> reset_one_sub_crq_queue():
> 
> if (!scrq) {
> netdev_dbg(adapter->netdev,
>"Invalid scrq reset. irq (%d) or msgs(%p).\n",
>   scrq->irq, scrq->msgs);
>   return -EINVAL;
> }
> 
> If the expression is true, scrq must be a null pointer and cannot
> dereference.
> 
> Signed-off-by: YANG LI 
> Reported-by: Abaci 
Fixes: 9281cf2d5840 ("ibmvnic: avoid memset null scrq msgs")
> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
> b/drivers/net/ethernet/ibm/ibmvnic.c
> index f302504..d7472be 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2981,9 +2981,7 @@ static int reset_one_sub_crq_queue(struct 
> ibmvnic_adapter *adapter,
>   int rc;
>  
>   if (!scrq) {
> - netdev_dbg(adapter->netdev,
> -"Invalid scrq reset. irq (%d) or msgs (%p).\n",
> -scrq->irq, scrq->msgs);
> + netdev_dbg(adapter->netdev, "Invalid scrq reset.\n");
>   return -EINVAL;
>   }
>  
> -- 
> 1.8.3.1
>

Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type

2020-12-30 Thread Vladyslav Tarasiuk




On 29-Dec-20 18:25, Andrew Lunn wrote:

Hi Andrew,

Following this conversation, I wrote some pseudocode checking if I'm on
right path here.
Please review:

struct eeprom_page {
     u8 page_number;
     u8 bank_number;
     u16 offset;
     u16 data_length;
     u8 *data;
}

I'm wondering about offset and data_length, in this context. I would
expect you always ask the kernel for the full page, not part of
it. Even when user space asks for just part of a page. That keeps you
cache management simpler.

As far as I know, there may be bytes, which may change on read.
For example, clear on read values in CMIS 4.0.
Then, retrieving whole page every time may be incorrect.
So I kept these for cases, when user asks for specific few bytes.

But maybe some indicator of low/high is
needed, since many pages are actually 1/2 pages?

I was planning to use offset and data_length fields to indicate the
available page. For example, high page will have offset 128 and
data_length 128.

The other thing to consider is SFF-8472 and its use of two different
i2c addresses, A0h and A2h. These are different to pages and banks.


I wasn't aware of that. It complicates things a bit, should we add a
parameter of i2c address? So in this case page 0 will be with i2c
address A0h. And if user needs page 0 from i2c address A2h, he will
specify it in command line. And for other specs, this parameter will
not be supported.


print_human_readable()
{
     spec_id = cache_get_page(0, 0, 0, 128)->data[0];
     switch (spec_id) {
     case sff_:
     print_sff_();
     case cmis_y:
     print_cmis_y();
     default:
     print_hex();
     }
}

You want to keep as much of the existing ethtool code as you can, but
the basic idea looks O.K.

Yes, under print_sff_(), for example, I meant using existing functions.
Either as is, or refactoring according to cache requirements.

getmodule_reply_cb()
{
     if (offset || hex || bank_number || page number)
     print_hex();
     else
     // if _human_readable() decoder needs more than page 00, it
will
     // fetch it on demand
     print_human_readable();
}

Things get interesting here. Say this is page 0, and
print_human_readable() finds a bit indicating page 1 is valid. So it
requests page 1. We go recursive. While deep down in
print_human_readable(), we send the next netlink message and call
getmodule_reply_cb() when the answer appears. I've had problems with
some of the netlink code not liking recursive calls.

So i suggest you try to find a different structure for the code. Try
to complete the netlink call before doing the decoding. So add the
page to the cache and then return. Do the decode after
nlsock_sendmsg() has returned.

I'm thinking about a standard-specific function, which will prefetch pages
needed by print_human_readable(). It will check the standard ID, and go 
request

pages and add them to the cache. Then, decoder kicks with already cached
pages. This will eliminate recursive netlink calls.

Driver
It is required to implement get_module_eeprom_page() ndo, where it queries
its EEPROM and copies to u8 *data array allocated by the kernel previously.
The ndo has the following prototype:
int get_module_eeprom_page(struct net_device *, u16 offset, u16 length,
    u8 page_number, u8 bank_number, u8 *data);


I would include extack here, so we can get better error messages.


OK, I will add extack.

Thanks,
Vlad

Re: BTFIDS: FAILED unresolved symbol udp6_sock

2020-12-30 Thread Arnaldo Carvalho de Melo

Em Wed, Dec 30, 2020 at 02:28:52PM +0100, Jiri Olsa escreveu:
> On Wed, Dec 30, 2020 at 02:28:02PM +0100, Jiri Olsa wrote:
> > On Wed, Dec 30, 2020 at 10:03:37AM +0100, Jiri Olsa wrote:
> > > On Tue, Dec 29, 2020 at 11:28:35PM +, Qais Yousef wrote:
> > > > Hi Jiri
> > > > 
> > > > On 12/29/20 18:34, Jiri Olsa wrote:
> > > > > On Tue, Dec 29, 2020 at 03:13:52PM +, Qais Yousef wrote:
> > > > > > Hi
> > > > > > 
> > > > > > When I enable CONFIG_DEBUG_INFO_BTF I get the following error in 
> > > > > > the BTFIDS
> > > > > > stage
> > > > > > 
> > > > > > FAILED unresolved symbol udp6_sock
> > > > > > 
> > > > > > I cross compile for arm64. My .config is attached.
> > > > > > 
> > > > > > I managed to reproduce the problem on v5.9 and v5.10. Plus 5.11-rc1.
> > > > > > 
> > > > > > Have you seen this before? I couldn't find a specific report about 
> > > > > > this
> > > > > > problem.
> > > > > > 
> > > > > > Let me know if you need more info.
> > > > > 
> > > > > hi,
> > > > > this looks like symptom of the gcc DWARF bug we were
> > > > > dealing with recently:
> > > > > 
> > > > >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
> > > > >   
> > > > > https://lore.kernel.org/lkml/CAE1WUT75gu9G62Q9uAALGN6vLX=o7vz9uhqtvwnbuv81dgm...@mail.gmail.com/#r
> > > > > 
> > > > > what pahole/gcc version are you using?
> > > > 
> > > > I'm on gcc 9.3.0
> > > > 
> > > > aarch64-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> > > > 
> > > > I was on pahole v1.17. I moved to v1.19 but I still see the same 
> > > > problem.

There are some changes post v1.19 in the git repo:

[acme@five pahole]$ git log --oneline v1.19..
b688e35970600c15 (HEAD -> master) btf_encoder: fix skipping per-CPU variables 
at offset 0
8c009d6ce762dfc9 btf_encoder: fix BTF variable generation for kernel modules
b94e97e015a94e6b dwarves: Fix compilation on 32-bit architectures
17df51c700248f02 btf_encoder: Detect kernel module ftrace addresses
06ca639505fc56c6 btf_encoder: Use address size based on ELF's class
aff60970d16b909e btf_encoder: Factor filter_functions function
1e6a3fed6e52d365 (quaco/master) rpm: Fix changelog date
[acme@five pahole]$

But I think these won't matter in this case :-\

- Arnaldo

> > > I can reproduce with your .config, but make 'defconfig' works,
> > > so I guess it's some config option issue, I'll check later today
> > 
> > so your .config has
> >   CONFIG_CRYPTO_DEV_BCM_SPU=y
> > 
> > and that defines 'struct device_private' which
> > clashes with the same struct defined in drivers/base/base.h
> > 
> > so several networking structs will be doubled, like net_device:
> > 
> > $ bpftool btf dump file ../vmlinux.config | grep net_device\' | grep 
> > STRUCT
> > [2731] STRUCT 'net_device' size=2240 vlen=133
> > [113981] STRUCT 'net_device' size=2240 vlen=133
> > 
> > each is using different 'struct device_private' when it's unwinded
> > 
> > and that will confuse BTFIDS logic, becase we have multiple structs
> > with the same name, and we can't be sure which one to pick
> > 
> > perhaps we should check on this in pahole and warn earlier with
> > better error message.. I'll check, but I'm not sure if pahole can
> > survive another hastab ;-)
> > 
> > Andrii, any ideas on this? ;-)
> > 
> > easy fix is the patch below that renames the bcm's structs,
> > it makes the kernel to compile.. but of course the new name
> > is probably wrong and we should push this through that code
> > authors
> 
> also another quick fix is to switch it to module
> 
> jirka
> 

-- 

- Arnaldo

[PATCH v1] af/rvu_cgx: Fix missing check bugs in rvu_cgx.c

2020-12-30 Thread wangyingjie55

From: Yingjie Wang 

In rvu_mbox_handler_cgx_mac_addr_get() and rvu_mbox_handler_cgx_mac_addr_set(),
the msg is expected only from PFs that are mapped to CGX LMACs.
It should be checked before mapping, so we add the is_cgx_config_permitted() in 
the functions.

Signed-off-by: Yingjie Wang 
---
 drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c 
b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c
index d298b93..6c6b411 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_cgx.c
@@ -469,6 +469,9 @@ int rvu_mbox_handler_cgx_mac_addr_set(struct rvu *rvu,
int pf = rvu_get_pf(req->hdr.pcifunc);
u8 cgx_id, lmac_id;
 
+   if (!is_cgx_config_permitted(rvu, req->hdr.pcifunc))
+   return -EPERM;
+
rvu_get_cgx_lmac_id(rvu->pf2cgxlmac_map[pf], &cgx_id, &lmac_id);
 
cgx_lmac_addr_set(cgx_id, lmac_id, req->mac_addr);
@@ -485,6 +488,9 @@ int rvu_mbox_handler_cgx_mac_addr_get(struct rvu *rvu,
int rc = 0, i;
u64 cfg;
 
+   if (!is_cgx_config_permitted(rvu, req->hdr.pcifunc))
+   return -EPERM;
+
rvu_get_cgx_lmac_id(rvu->pf2cgxlmac_map[pf], &cgx_id, &lmac_id);
 
rsp->hdr.rc = rc;
-- 
2.7.4

[PATCH] mt76: fix enum conversion warning

2020-12-30 Thread Arnd Bergmann

From: Arnd Bergmann 

A recent patch changed some enum values, but not the type
declaration for the assignment:

drivers/net/wireless/mediatek/mt76/mt7615/mcu.c:238:9: error: implicit 
conversion from enumeration type 'enum mt76_mcuq_id' to different enumeration 
type 'enum mt76_txq_id' [-Werror,-Wenum-conversion]
qid = MT_MCUQ_WM;
~ ^~
drivers/net/wireless/mediatek/mt76/mt7615/mcu.c:240:9: error: implicit 
conversion from enumeration type 'enum mt76_mcuq_id' to different enumeration 
type 'enum mt76_txq_id' [-Werror,-Wenum-conversion]
qid = MT_MCUQ_FWDL;
~ ^~~~

Change the type to match again.

Fixes: e637763b606b ("mt76: move mcu queues to mt76_dev q_mcu array")
Signed-off-by: Arnd Bergmann 
---
 drivers/net/wireless/mediatek/mt76/mt7615/mcu.c | 2 +-
 drivers/net/wireless/mediatek/mt76/mt7915/mcu.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c 
b/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c
index a44b7766dec6..c13547841a4e 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7615/mcu.c
@@ -231,7 +231,7 @@ mt7615_mcu_send_message(struct mt76_dev *mdev, struct 
sk_buff *skb,
int cmd, int *seq)
 {
struct mt7615_dev *dev = container_of(mdev, struct mt7615_dev, mt76);
-   enum mt76_txq_id qid;
+   enum mt76_mcuq_id qid;
 
mt7615_mcu_fill_msg(dev, skb, cmd, seq);
if (test_bit(MT76_STATE_MCU_RUNNING, &dev->mphy.state))
diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c 
b/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
index 5fdd1a6d32ee..22753fbb05e5 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
@@ -256,7 +256,7 @@ mt7915_mcu_send_message(struct mt76_dev *mdev, struct 
sk_buff *skb,
struct mt7915_dev *dev = container_of(mdev, struct mt7915_dev, mt76);
struct mt7915_mcu_txd *mcu_txd;
u8 seq, pkt_fmt, qidx;
-   enum mt76_txq_id txq;
+   enum mt76_mcuq_id txq;
__le32 *txd;
u32 val;
 
-- 
2.29.2

Re: BTFIDS: FAILED unresolved symbol udp6_sock

2020-12-30 Thread Jiri Olsa

On Wed, Dec 30, 2020 at 11:19:36AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, Dec 30, 2020 at 02:28:52PM +0100, Jiri Olsa escreveu:
> > On Wed, Dec 30, 2020 at 02:28:02PM +0100, Jiri Olsa wrote:
> > > On Wed, Dec 30, 2020 at 10:03:37AM +0100, Jiri Olsa wrote:
> > > > On Tue, Dec 29, 2020 at 11:28:35PM +, Qais Yousef wrote:
> > > > > Hi Jiri
> > > > > 
> > > > > On 12/29/20 18:34, Jiri Olsa wrote:
> > > > > > On Tue, Dec 29, 2020 at 03:13:52PM +, Qais Yousef wrote:
> > > > > > > Hi
> > > > > > > 
> > > > > > > When I enable CONFIG_DEBUG_INFO_BTF I get the following error in 
> > > > > > > the BTFIDS
> > > > > > > stage
> > > > > > > 
> > > > > > >   FAILED unresolved symbol udp6_sock
> > > > > > > 
> > > > > > > I cross compile for arm64. My .config is attached.
> > > > > > > 
> > > > > > > I managed to reproduce the problem on v5.9 and v5.10. Plus 
> > > > > > > 5.11-rc1.
> > > > > > > 
> > > > > > > Have you seen this before? I couldn't find a specific report 
> > > > > > > about this
> > > > > > > problem.
> > > > > > > 
> > > > > > > Let me know if you need more info.
> > > > > > 
> > > > > > hi,
> > > > > > this looks like symptom of the gcc DWARF bug we were
> > > > > > dealing with recently:
> > > > > > 
> > > > > >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
> > > > > >   
> > > > > > https://lore.kernel.org/lkml/CAE1WUT75gu9G62Q9uAALGN6vLX=o7vz9uhqtvwnbuv81dgm...@mail.gmail.com/#r
> > > > > > 
> > > > > > what pahole/gcc version are you using?
> > > > > 
> > > > > I'm on gcc 9.3.0
> > > > > 
> > > > >   aarch64-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> > > > > 
> > > > > I was on pahole v1.17. I moved to v1.19 but I still see the same 
> > > > > problem.
> 
> There are some changes post v1.19 in the git repo:
> 
> [acme@five pahole]$ git log --oneline v1.19..
> b688e35970600c15 (HEAD -> master) btf_encoder: fix skipping per-CPU variables 
> at offset 0
> 8c009d6ce762dfc9 btf_encoder: fix BTF variable generation for kernel modules
> b94e97e015a94e6b dwarves: Fix compilation on 32-bit architectures
> 17df51c700248f02 btf_encoder: Detect kernel module ftrace addresses
> 06ca639505fc56c6 btf_encoder: Use address size based on ELF's class
> aff60970d16b909e btf_encoder: Factor filter_functions function
> 1e6a3fed6e52d365 (quaco/master) rpm: Fix changelog date
> [acme@five pahole]$
> 
> But I think these won't matter in this case :-\

yep, it did not.. I used the latest dwarves code

jirka

Re: Registering IRQ for MT7530 internal PHYs

2020-12-30 Thread Andrew Lunn

> > +static void mt7530_irq_bus_lock(struct irq_data *d)
> > +{
> > +   struct mt7530_priv *priv = irq_data_get_irq_chip_data(d);
> > +
> > +   mutex_lock(&priv->reg_mutex);
> 
> Are you always guaranteed to be in a thread context? I guess that
> is the case, given that you request a threaded interrupt, but
> it would be worth documenting.

Hi Marc

These Ethernet switches are often connected by MDIO, SPI or I2C busses
to the SoC. So in order to access switch registers over these busses,
sleeping is required. So yes, threaded interrupts are required.

 Andrew

Re: Registering IRQ for MT7530 internal PHYs

2020-12-30 Thread Andrew Lunn

> 2. Allocated ds->slave_mii_bus before calling ds->ops->setup, because
> we cannot call dsa_slave_mii_bus_init which is private.
> 
> Any better ideas?

Do what mv88e6xxx does, allocate the MDIO bus inside the switch
driver.

Andrew

Re: Registering IRQ for MT7530 internal PHYs

2020-12-30 Thread Andrew Lunn

On Wed, Dec 30, 2020 at 09:42:09AM +, Marc Zyngier wrote:
> > +static irqreturn_t
> > +mt7530_irq(int irq, void *data)
> > +{
> > +   struct mt7530_priv *priv = data;
> > +   bool handled = false;
> > +   int phy;
> > +   u32 val;
> > +
> > +   val = mt7530_read(priv, MT7530_SYS_INT_STS);
> > +   mt7530_write(priv, MT7530_SYS_INT_STS, val);
> 
> If that is an ack operation, it should be dealt with as such in
> an irqchip callback instead of being open-coded here.

Hi Qingfang

Does the PHY itself have interrupt control and status registers?

My experience with the Marvell Switch and its embedded PHYs is that
the PHYs are just the same as the discrete PHYs. There are bits to
enable different interrupts, and there are status bits indicating what
event caused the interrupt. Clearing the interrupt in the PHY clears
the interrupt in the switch interrupt controller. So in the mv88e6xxx
interrupt code, you see i do a read of the switch interrupt controller
status register, but i don't write to it as you have done.

   Andrew

Re: [PATCH rfc 3/3] virtio-net: support transmit timestamp

2020-12-30 Thread Willem de Bruijn

On Wed, Dec 30, 2020 at 7:38 AM Richard Cochran
 wrote:
>
> On Mon, Dec 28, 2020 at 11:22:33AM -0500, Willem de Bruijn wrote:
> > From: Willem de Bruijn 
> >
> > Add optional delivery time (SO_TXTIME) offload for virtio-net.
> >
> > The Linux TCP/IP stack tries to avoid bursty transmission and network
> > congestion through pacing: computing an skb delivery time based on
> > congestion information. Userspace protocol implementations can achieve
> > the same with SO_TXTIME. This may also reduce scheduling jitter and
> > improve RTT estimation.
>
> This description is clear, but the Subject line is confusing.  It made
> me wonder whether this series is somehow about host/guest synchronization
> (but your comments do explain that that isn't the case).
>
> How about this instead?
>
>virtio-net: support future packet transmit time

Yes, that's clearer. As is, this could easily be mistaken for
SOF_TIMESTAMPING_TX_*, which it is not. Will update, thanks.

Related terms already in use are SO_TXTIME, Earliest Delivery Time
(EDT) and Earliest TxTime First (ETF).

I should probably also s/TX_TSTAMP/TX_TIME/  in the code for the same reason.

Re: BTFIDS: FAILED unresolved symbol udp6_sock

2020-12-30 Thread Qais Yousef

On 12/30/20 14:28, Jiri Olsa wrote:
> On Wed, Dec 30, 2020 at 02:28:02PM +0100, Jiri Olsa wrote:
> > On Wed, Dec 30, 2020 at 10:03:37AM +0100, Jiri Olsa wrote:
> > > On Tue, Dec 29, 2020 at 11:28:35PM +, Qais Yousef wrote:
> > > > Hi Jiri
> > > > 
> > > > On 12/29/20 18:34, Jiri Olsa wrote:
> > > > > On Tue, Dec 29, 2020 at 03:13:52PM +, Qais Yousef wrote:
> > > > > > Hi
> > > > > > 
> > > > > > When I enable CONFIG_DEBUG_INFO_BTF I get the following error in 
> > > > > > the BTFIDS
> > > > > > stage
> > > > > > 
> > > > > > FAILED unresolved symbol udp6_sock
> > > > > > 
> > > > > > I cross compile for arm64. My .config is attached.
> > > > > > 
> > > > > > I managed to reproduce the problem on v5.9 and v5.10. Plus 5.11-rc1.
> > > > > > 
> > > > > > Have you seen this before? I couldn't find a specific report about 
> > > > > > this
> > > > > > problem.
> > > > > > 
> > > > > > Let me know if you need more info.
> > > > > 
> > > > > hi,
> > > > > this looks like symptom of the gcc DWARF bug we were
> > > > > dealing with recently:
> > > > > 
> > > > >   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060
> > > > >   
> > > > > https://lore.kernel.org/lkml/CAE1WUT75gu9G62Q9uAALGN6vLX=o7vz9uhqtvwnbuv81dgm...@mail.gmail.com/#r
> > > > > 
> > > > > what pahole/gcc version are you using?
> > > > 
> > > > I'm on gcc 9.3.0
> > > > 
> > > > aarch64-linux-gnu-gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> > > > 
> > > > I was on pahole v1.17. I moved to v1.19 but I still see the same 
> > > > problem.
> > > 
> > > I can reproduce with your .config, but make 'defconfig' works,
> > > so I guess it's some config option issue, I'll check later today

My .config was a defconfig + CONFIG_DEBUG_INFO_BTF + convert all modules to
built-in.

> > 
> > so your .config has
> >   CONFIG_CRYPTO_DEV_BCM_SPU=y

Ah, how random. I removed this config and indeed it does work then, thanks!

> > 
> > and that defines 'struct device_private' which
> > clashes with the same struct defined in drivers/base/base.h
> > 
> > so several networking structs will be doubled, like net_device:
> > 
> > $ bpftool btf dump file ../vmlinux.config | grep net_device\' | grep 
> > STRUCT
> > [2731] STRUCT 'net_device' size=2240 vlen=133
> > [113981] STRUCT 'net_device' size=2240 vlen=133
> > 
> > each is using different 'struct device_private' when it's unwinded
> > 
> > and that will confuse BTFIDS logic, becase we have multiple structs
> > with the same name, and we can't be sure which one to pick

We can't tell which object/subsystem the struct come from?

Or maybe we can introduce some annotation to help BTFIDS to pick the right one?

> > 
> > perhaps we should check on this in pahole and warn earlier with
> > better error message.. I'll check, but I'm not sure if pahole can
> > survive another hastab ;-)
> > 
> > Andrii, any ideas on this? ;-)
> > 
> > easy fix is the patch below that renames the bcm's structs,
> > it makes the kernel to compile.. but of course the new name
> > is probably wrong and we should push this through that code
> > authors
> 
> also another quick fix is to switch it to module

This works too.

Thanks for your speedy response.

Cheers

--
Qais Yousef

[PATCH net,stable] net: usb: qmi_wwan: add Quectel EM160R-GL

2020-12-30 Thread Bjørn Mork

New modem using ff/ff/30 for QCDM, ff/00/00 for  AT and NMEA,
and ff/ff/ff for RMNET/QMI.

T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=5000 MxCh= 0
D: Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1
P: Vendor=2c7c ProdID=0620 Rev= 4.09
S: Manufacturer=Quectel
S: Product=EM160R-GL
S: SerialNumber=e31cedc1
C:* #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=896mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=(none)
E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
E: Ad=88(I) Atr=03(Int.) MxPS= 8 Ivl=32ms
E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms

Signed-off-by: Bjørn Mork 
---
 drivers/net/usb/qmi_wwan.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index d166c321ee9b..af19513a9f75 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1013,6 +1013,7 @@ static const struct usb_device_id products[] = {
{QMI_MATCH_FF_FF_FF(0x2c7c, 0x0125)},   /* Quectel EC25, EC20 R2.0  
Mini PCIe */
{QMI_MATCH_FF_FF_FF(0x2c7c, 0x0306)},   /* Quectel EP06/EG06/EM06 */
{QMI_MATCH_FF_FF_FF(0x2c7c, 0x0512)},   /* Quectel EG12/EM12 */
+   {QMI_MATCH_FF_FF_FF(0x2c7c, 0x0620)},   /* Quectel EM160R-GL */
{QMI_MATCH_FF_FF_FF(0x2c7c, 0x0800)},   /* Quectel RM500Q-GL */
 
/* 3. Combined interface devices matching on interface number */
-- 
2.29.2

Re: [PATCH net-next v2 0/2] Add support for DSFP transceiver type

2020-12-30 Thread Andrew Lunn

On Wed, Dec 30, 2020 at 03:55:02PM +0200, Vladyslav Tarasiuk wrote:
> 
> On 29-Dec-20 18:25, Andrew Lunn wrote:
> > > Hi Andrew,
> > > 
> > > Following this conversation, I wrote some pseudocode checking if I'm on
> > > right path here.
> > > Please review:
> > > 
> > > struct eeprom_page {
> > >      u8 page_number;
> > >      u8 bank_number;
> > >      u16 offset;
> > >      u16 data_length;
> > >      u8 *data;
> > > }
> > I'm wondering about offset and data_length, in this context. I would
> > expect you always ask the kernel for the full page, not part of
> > it. Even when user space asks for just part of a page. That keeps you
> > cache management simpler.
> As far as I know, there may be bytes, which may change on read.
> For example, clear on read values in CMIS 4.0.

Ah, i did not know there were such bits. I will go read the spec. But
it should not really matter. If the SFP driver is interested in these
bits, it will have to intercept the read and act on the values.

> I wasn't aware of that. It complicates things a bit, should we add a
> parameter of i2c address? So in this case page 0 will be with i2c
> address A0h. And if user needs page 0 from i2c address A2h, he will
> specify it in command line.

Not on the command line. You should be able to determine from reading
page 0 at A0h is the diagnostics are at A2h or a page of A0h. That is
the whole point of this API, we decode the first page, and that tells
us what other pages should be available. So adding the i2c address to
the netlink message would be sensible. And i would not be too
surprised if there are SFPs with proprietary registers on other
addresses, which could be interesting to dump, if you can get access
to the needed datasheets.

   Andrew

Re: [PATCH 1/2] net: phy: micrel: Add KS8851 PHY support

2020-12-30 Thread Andrew Lunn

On Wed, Dec 30, 2020 at 01:53:57PM +0100, Marek Vasut wrote:
> The KS8851 has a reduced internal PHY, which is accessible through its
> registers at offset 0xe4. The PHY is compatible with KS886x PHY present
> in Micrel switches, except the PHY ID Low/High registers are swapped.

Can you intercept the reads in the KS8851 driver and swap them back
again? The mv88e6xxx driver does something similar. The mv88e6393
family of switches have PHYs with the Marvell OUI but no device ID. So
the code traps these reads and provides an ID.

Andrew

[PATCH 0/4] net: sfp: add support for GPON RTL8672/RTL9601C and Ubiquiti U-Fiber

2020-12-30 Thread Pali Rohár

This patch series add generic workaround for reading EEPROM content from
broken GPON SFP modules based on Realtek RTL8672/RTL9601C chips and add
another workarounds for GPON SFP module Ubiquiti U-Fiber Instant.

GPON SFP modules based on Realtek RTL8672/RTL9601C chips do not have a
real EEPROM but rather EEPROM emulator which is broken and needs special
hack for reading its content.

SFP module detection is done based on EEPROM content. But we obviously
cannot read EEPROM correctly if we do not know what type of connected
SFP module... And to have this chicken and egg problem more complicated,
GPON vendors generally put garbage into their EEPROM content so even
with knowing EEPROM content we do not know what kind of broken SFP is
connected... Workaround for Realtek RTL8672/RTL9601C based modules is
therefore done based on broken EEPROM reading characteristic.

This patch series also available in my git branch sfp-rtl8672:
https://git.kernel.org/pub/scm/linux/kernel/git/pali/linux.git/log/?h=sfp-rtl8672

Pali Rohár (4):
  net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips
  net: sfp: allow to use also SFP modules which are detected as SFF
  net: sfp: assume that LOS is not implemented if both LOS normal and
inverted is set
  net: sfp: add mode quirk for GPON module Ubiquiti U-Fiber Instant

 drivers/net/phy/sfp-bus.c |  15 +
 drivers/net/phy/sfp.c | 117 ++
 2 files changed, 83 insertions(+), 49 deletions(-)

-- 
2.20.1

[PATCH 3/4] net: sfp: assume that LOS is not implemented if both LOS normal and inverted is set

2020-12-30 Thread Pali Rohár

Some GPON SFP modules (e.g. Ubiquiti U-Fiber Instant) have set both
SFP_OPTIONS_LOS_INVERTED and SFP_OPTIONS_LOS_NORMAL bits in their EEPROM.

Such combination of bits is meaningless so assume that LOS signal is not
implemented.

This patch fixes link carrier for GPON SFP module Ubiquiti U-Fiber Instant.

Co-developed-by: Russell King 
Signed-off-by: Russell King 
Signed-off-by: Pali Rohár 
---
 drivers/net/phy/sfp.c | 36 ++--
 1 file changed, 22 insertions(+), 14 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 73f3ecf15260..d47485ed239c 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1475,15 +1475,19 @@ static void sfp_sm_link_down(struct sfp *sfp)
 
 static void sfp_sm_link_check_los(struct sfp *sfp)
 {
-   unsigned int los = sfp->state & SFP_F_LOS;
+   const __be16 los_inverted = cpu_to_be16(SFP_OPTIONS_LOS_INVERTED);
+   const __be16 los_normal = cpu_to_be16(SFP_OPTIONS_LOS_NORMAL);
+   __be16 los_options = sfp->id.ext.options & (los_inverted | los_normal);
+   bool los = false;
 
/* If neither SFP_OPTIONS_LOS_INVERTED nor SFP_OPTIONS_LOS_NORMAL
-* are set, we assume that no LOS signal is available.
+* are set, we assume that no LOS signal is available. If both are
+* set, we assume LOS is not implemented (and is meaningless.)
 */
-   if (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_INVERTED))
-   los ^= SFP_F_LOS;
-   else if (!(sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_NORMAL)))
-   los = 0;
+   if (los_options == los_inverted)
+   los = !(sfp->state & SFP_F_LOS);
+   else if (los_options == los_normal)
+   los = !!(sfp->state & SFP_F_LOS);
 
if (los)
sfp_sm_next(sfp, SFP_S_WAIT_LOS, 0);
@@ -1493,18 +1497,22 @@ static void sfp_sm_link_check_los(struct sfp *sfp)
 
 static bool sfp_los_event_active(struct sfp *sfp, unsigned int event)
 {
-   return (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_INVERTED) &&
-   event == SFP_E_LOS_LOW) ||
-  (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_NORMAL) &&
-   event == SFP_E_LOS_HIGH);
+   const __be16 los_inverted = cpu_to_be16(SFP_OPTIONS_LOS_INVERTED);
+   const __be16 los_normal = cpu_to_be16(SFP_OPTIONS_LOS_NORMAL);
+   __be16 los_options = sfp->id.ext.options & (los_inverted | los_normal);
+
+   return (los_options == los_inverted && event == SFP_E_LOS_LOW) ||
+  (los_options == los_normal && event == SFP_E_LOS_HIGH);
 }
 
 static bool sfp_los_event_inactive(struct sfp *sfp, unsigned int event)
 {
-   return (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_INVERTED) &&
-   event == SFP_E_LOS_HIGH) ||
-  (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_NORMAL) &&
-   event == SFP_E_LOS_LOW);
+   const __be16 los_inverted = cpu_to_be16(SFP_OPTIONS_LOS_INVERTED);
+   const __be16 los_normal = cpu_to_be16(SFP_OPTIONS_LOS_NORMAL);
+   __be16 los_options = sfp->id.ext.options & (los_inverted | los_normal);
+
+   return (los_options == los_inverted && event == SFP_E_LOS_HIGH) ||
+  (los_options == los_normal && event == SFP_E_LOS_LOW);
 }
 
 static void sfp_sm_fault(struct sfp *sfp, unsigned int next_state, bool warn)
-- 
2.20.1

[PATCH 2/4] net: sfp: allow to use also SFP modules which are detected as SFF

2020-12-30 Thread Pali Rohár

Some GPON SFP modules (e.g. Ubiquiti U-Fiber Instant) have set SFF phys_id
in their EEPROM. Kernel SFP subsystem currently does not allow to use
modules detected as SFF.

This change extends check for SFP modules so also those with SFF phys_id
are allowed. With this change also GPON SFP module Ubiquiti U-Fiber Instant
is recognized.

Signed-off-by: Pali Rohár 
---
 drivers/net/phy/sfp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 490e78a72dd6..73f3ecf15260 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -273,7 +273,8 @@ static const struct sff_data sff_data = {
 
 static bool sfp_module_supported(const struct sfp_eeprom_id *id)
 {
-   return id->base.phys_id == SFF8024_ID_SFP &&
+   return (id->base.phys_id == SFF8024_ID_SFP ||
+   id->base.phys_id == SFF8024_ID_SFF_8472) &&
   id->base.phys_ext_id == SFP_PHYS_EXT_ID_SFP;
 }
 
-- 
2.20.1

[PATCH 1/4] net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips

2020-12-30 Thread Pali Rohár

Workaround for GPON SFP modules based on VSOL V2801F brand was added in
commit 0d035bed2a4a ("net: sfp: VSOL V2801F / CarlitoxxPro CPGOS03-0490
v2.0 workaround"). But it works only for ids explicitly added to the list.
As there are more rebraded VSOL V2801F modules and OEM vendors are putting
into vendor name random strings we cannot build workaround based on ids.

Moreover issue which that commit tried to workaround is generic not only to
VSOL based modules, but rather to all GPON modules based on Realtek RTL8672
and RTL9601C chips.

They can be found for example in following GPON modules:
* V-SOL V2801F
* C-Data FD511GX-RM0
* OPTON GP801R
* BAUDCOM BD-1234-SFM
* CPGOS03-0490 v2.0
* Ubiquiti U-Fiber Instant
* EXOT EGS1

Those Realtek chips have broken EEPROM emulator which for N-byte read
operation returns just one byte of EEPROM data followed by N-1 zeros.

So introduce a new function sfp_id_needs_byte_io() which detects SFP
modules with these Realtek chips which have broken EEPROM emulator based on
N-1 zeros and switch to 1 byte EEPROM reading operation which workaround
this issue.

This patch fixes reading EEPROM content from SFP modules based on Realtek
RTL8672 and RTL9601C chips.

Fixes: 0d035bed2a4a ("net: sfp: VSOL V2801F / CarlitoxxPro CPGOS03-0490 v2.0 
workaround")
Co-developed-by: Russell King 
Signed-off-by: Russell King 
Signed-off-by: Pali Rohár 
---
 drivers/net/phy/sfp.c | 78 ---
 1 file changed, 44 insertions(+), 34 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 91d74c1a920a..490e78a72dd6 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -336,19 +336,11 @@ static int sfp_i2c_read(struct sfp *sfp, bool a2, u8 
dev_addr, void *buf,
size_t len)
 {
struct i2c_msg msgs[2];
-   size_t block_size;
+   u8 bus_addr = a2 ? 0x51 : 0x50;
+   size_t block_size = sfp->i2c_block_size;
size_t this_len;
-   u8 bus_addr;
int ret;
 
-   if (a2) {
-   block_size = 16;
-   bus_addr = 0x51;
-   } else {
-   block_size = sfp->i2c_block_size;
-   bus_addr = 0x50;
-   }
-
msgs[0].addr = bus_addr;
msgs[0].flags = 0;
msgs[0].len = 1;
@@ -1642,26 +1634,30 @@ static int sfp_sm_mod_hpower(struct sfp *sfp, bool 
enable)
return 0;
 }
 
-/* Some modules (Nokia 3FE46541AA) lock up if byte 0x51 is read as a
- * single read. Switch back to reading 16 byte blocks unless we have
- * a CarlitoxxPro module (rebranded VSOL V2801F). Even more annoyingly,
- * some VSOL V2801F have the vendor name changed to OEM.
+/* GPON modules based on Realtek RTL8672 and RTL9601C chips (e.g. V-SOL
+ * V2801F, CarlitoxxPro CPGOS03-0490, Ubiquiti U-Fiber Instant, ...) do
+ * not support multibyte reads from the EEPROM. Each multi-byte read
+ * operation returns just one byte of EEPROM followed by zeros. There is
+ * no way to identify which modules are using Realtek RTL8672 and RTL9601C
+ * chips. Moreover every OEM of V-SOL V2801F module puts its own vendor
+ * name and vendor id into EEPROM, so there is even no way to detect if
+ * module is V-SOL V2801F. Therefore check for those zeros in the read
+ * data and then based on check switch to reading EEPROM to one byte
+ * at a time.
  */
-static int sfp_quirk_i2c_block_size(const struct sfp_eeprom_base *base)
+static bool sfp_id_needs_byte_io(struct sfp *sfp, void *buf, size_t len)
 {
-   if (!memcmp(base->vendor_name, "VSOL", 16))
-   return 1;
-   if (!memcmp(base->vendor_name, "OEM ", 16) &&
-   !memcmp(base->vendor_pn,   "V2801F  ", 16))
-   return 1;
+   size_t i, block_size = sfp->i2c_block_size;
 
-   /* Some modules can't cope with long reads */
-   return 16;
-}
+   /* Already using byte IO */
+   if (block_size == 1)
+   return false;
 
-static void sfp_quirks_base(struct sfp *sfp, const struct sfp_eeprom_base 
*base)
-{
-   sfp->i2c_block_size = sfp_quirk_i2c_block_size(base);
+   for (i = 1; i < len; i += block_size) {
+   if (memchr_inv(buf + i, '\0', block_size - 1))
+   return false;
+   }
+   return true;
 }
 
 static int sfp_cotsworks_fixup_check(struct sfp *sfp, struct sfp_eeprom_id *id)
@@ -1705,11 +1701,7 @@ static int sfp_sm_mod_probe(struct sfp *sfp, bool report)
u8 check;
int ret;
 
-   /* Some modules (CarlitoxxPro CPGOS03-0490) do not support multibyte
-* reads from the EEPROM, so start by reading the base identifying
-* information one byte at a time.
-*/
-   sfp->i2c_block_size = 1;
+   sfp->i2c_block_size = 16;
 
ret = sfp_read(sfp, false, 0, &id.base, sizeof(id.base));
if (ret < 0) {
@@ -1723,6 +1715,27 @@ static int sfp_sm_mod_probe(struct sfp *sfp, bool report)
return -EAGAIN;
}
 
+

[PATCH 4/4] net: sfp: add mode quirk for GPON module Ubiquiti U-Fiber Instant

2020-12-30 Thread Pali Rohár

SFP GPON module Ubiquiti U-Fiber Instant has in its EEPROM stored nonsense
information. It claims that support all transceiver types including 10G
Ethernet which is not truth. So clear all claimed modes and set only one
mode which module supports: 1000baseX_Full.

This change finally allows to detect and use SFP GPON module Ubiquiti
U-Fiber Instant on Linux system.

EEPROM content of this SFP module is (where XX is serial number):

00: 02 04 0b ff ff ff ff ff ff ff ff 03 0c 00 14 c8?????.??
10: 00 00 00 00 55 42 4e 54 20 20 20 20 20 20 20 20UBNT
20: 20 20 20 20 00 18 e8 29 55 46 2d 49 4e 53 54 41.??)UF-INSTA
30: 4e 54 20 20 20 20 20 20 34 20 20 20 05 1e 00 36NT  4   ??.6
40: 00 06 00 00 55 42 4e 54 XX XX XX XX XX XX XX XX.?..UBNT
50: 20 20 20 20 31 34 30 31 32 33 20 20 60 80 02 41140123  `??A

Signed-off-by: Pali Rohár 
---
 drivers/net/phy/sfp-bus.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c
index 20b91f5dfc6e..4cf874fb5c5b 100644
--- a/drivers/net/phy/sfp-bus.c
+++ b/drivers/net/phy/sfp-bus.c
@@ -44,6 +44,17 @@ static void sfp_quirk_2500basex(const struct sfp_eeprom_id 
*id,
phylink_set(modes, 2500baseX_Full);
 }
 
+static void sfp_quirk_ubnt_uf_instant(const struct sfp_eeprom_id *id,
+ unsigned long *modes)
+{
+   /* Ubiquiti U-Fiber Instant module claims that support all transceiver
+* types including 10G Ethernet which is not truth. So clear all claimed
+* modes and set only one mode which module supports: 1000baseX_Full.
+*/
+   phylink_zero(modes);
+   phylink_set(modes, 1000baseX_Full);
+}
+
 static const struct sfp_quirk sfp_quirks[] = {
{
// Alcatel Lucent G-010S-P can operate at 2500base-X, but
@@ -63,6 +74,10 @@ static const struct sfp_quirk sfp_quirks[] = {
.vendor = "HUAWEI",
.part = "MA5671A",
.modes = sfp_quirk_2500basex,
+   }, {
+   .vendor = "UBNT",
+   .part = "UF-INSTANT",
+   .modes = sfp_quirk_ubnt_uf_instant,
},
 };
 
-- 
2.20.1

Re: [PATCH 2/2] net: ks8851: Register MDIO bus and the internal PHY

2020-12-30 Thread Andrew Lunn

> +static int ks8851_mdio_read(struct mii_bus *bus, int phy_id, int reg)
> +{
> + struct ks8851_net *ks = bus->priv;
> +
> + if (phy_id != 0)
> + return 0x;
> +

Please check for C45 and return -EOPNOTSUPP.

   Andrew

Re: [PATCH 1/4] net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips

2020-12-30 Thread Russell King - ARM Linux admin

On Wed, Dec 30, 2020 at 04:47:52PM +0100, Pali Rohár wrote:
> Workaround for GPON SFP modules based on VSOL V2801F brand was added in
> commit 0d035bed2a4a ("net: sfp: VSOL V2801F / CarlitoxxPro CPGOS03-0490
> v2.0 workaround"). But it works only for ids explicitly added to the list.
> As there are more rebraded VSOL V2801F modules and OEM vendors are putting
> into vendor name random strings we cannot build workaround based on ids.
> 
> Moreover issue which that commit tried to workaround is generic not only to
> VSOL based modules, but rather to all GPON modules based on Realtek RTL8672
> and RTL9601C chips.
> 
> They can be found for example in following GPON modules:
> * V-SOL V2801F
> * C-Data FD511GX-RM0
> * OPTON GP801R
> * BAUDCOM BD-1234-SFM
> * CPGOS03-0490 v2.0
> * Ubiquiti U-Fiber Instant
> * EXOT EGS1
> 
> Those Realtek chips have broken EEPROM emulator which for N-byte read
> operation returns just one byte of EEPROM data followed by N-1 zeros.
> 
> So introduce a new function sfp_id_needs_byte_io() which detects SFP
> modules with these Realtek chips which have broken EEPROM emulator based on
> N-1 zeros and switch to 1 byte EEPROM reading operation which workaround
> this issue.
> 
> This patch fixes reading EEPROM content from SFP modules based on Realtek
> RTL8672 and RTL9601C chips.
> 
> Fixes: 0d035bed2a4a ("net: sfp: VSOL V2801F / CarlitoxxPro CPGOS03-0490 v2.0 
> workaround")
> Co-developed-by: Russell King 
> Signed-off-by: Russell King 
> Signed-off-by: Pali Rohár 
> ---
>  drivers/net/phy/sfp.c | 78 ---
>  1 file changed, 44 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
> index 91d74c1a920a..490e78a72dd6 100644
> --- a/drivers/net/phy/sfp.c
> +++ b/drivers/net/phy/sfp.c
> @@ -336,19 +336,11 @@ static int sfp_i2c_read(struct sfp *sfp, bool a2, u8 
> dev_addr, void *buf,
>   size_t len)
>  {
>   struct i2c_msg msgs[2];
> - size_t block_size;
> + u8 bus_addr = a2 ? 0x51 : 0x50;
> + size_t block_size = sfp->i2c_block_size;
>   size_t this_len;
> - u8 bus_addr;
>   int ret;
>  
> - if (a2) {
> - block_size = 16;
> - bus_addr = 0x51;
> - } else {
> - block_size = sfp->i2c_block_size;
> - bus_addr = 0x50;
> - }
> -

NAK. You are undoing something that is definitely needed. The
diagnostics must be read with sequential reads to be able to properly
read the 16-bit values.

The rest of the patch is fine; it's a shame the entire thing has
been spoilt by this extra addition that was not in the patch we had
been discussing off-list.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 2/4] net: sfp: allow to use also SFP modules which are detected as SFF

2020-12-30 Thread Russell King - ARM Linux admin

On Wed, Dec 30, 2020 at 04:47:53PM +0100, Pali Rohár wrote:
> Some GPON SFP modules (e.g. Ubiquiti U-Fiber Instant) have set SFF phys_id
> in their EEPROM. Kernel SFP subsystem currently does not allow to use
> modules detected as SFF.
> 
> This change extends check for SFP modules so also those with SFF phys_id
> are allowed. With this change also GPON SFP module Ubiquiti U-Fiber Instant
> is recognized.

I really don't want to do this for every single module out there.
It's likely that Ubiquiti do this crap as a vendor lock-in measure.
Let's make it specific to Ubiquiti modules _only_ until such time
that we know better.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 3/4] net: sfp: assume that LOS is not implemented if both LOS normal and inverted is set

2020-12-30 Thread Russell King - ARM Linux admin

On Wed, Dec 30, 2020 at 04:47:54PM +0100, Pali Rohár wrote:
> Some GPON SFP modules (e.g. Ubiquiti U-Fiber Instant) have set both
> SFP_OPTIONS_LOS_INVERTED and SFP_OPTIONS_LOS_NORMAL bits in their EEPROM.
> 
> Such combination of bits is meaningless so assume that LOS signal is not
> implemented.
> 
> This patch fixes link carrier for GPON SFP module Ubiquiti U-Fiber Instant.
> 
> Co-developed-by: Russell King 
> Signed-off-by: Russell King 

No, this is not co-developed. The patch content is exactly what _I_
sent you, only the commit description is your own.

> Signed-off-by: Pali Rohár 
> ---
>  drivers/net/phy/sfp.c | 36 ++--
>  1 file changed, 22 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
> index 73f3ecf15260..d47485ed239c 100644
> --- a/drivers/net/phy/sfp.c
> +++ b/drivers/net/phy/sfp.c
> @@ -1475,15 +1475,19 @@ static void sfp_sm_link_down(struct sfp *sfp)
>  
>  static void sfp_sm_link_check_los(struct sfp *sfp)
>  {
> - unsigned int los = sfp->state & SFP_F_LOS;
> + const __be16 los_inverted = cpu_to_be16(SFP_OPTIONS_LOS_INVERTED);
> + const __be16 los_normal = cpu_to_be16(SFP_OPTIONS_LOS_NORMAL);
> + __be16 los_options = sfp->id.ext.options & (los_inverted | los_normal);
> + bool los = false;
>  
>   /* If neither SFP_OPTIONS_LOS_INVERTED nor SFP_OPTIONS_LOS_NORMAL
> -  * are set, we assume that no LOS signal is available.
> +  * are set, we assume that no LOS signal is available. If both are
> +  * set, we assume LOS is not implemented (and is meaningless.)
>*/
> - if (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_INVERTED))
> - los ^= SFP_F_LOS;
> - else if (!(sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_NORMAL)))
> - los = 0;
> + if (los_options == los_inverted)
> + los = !(sfp->state & SFP_F_LOS);
> + else if (los_options == los_normal)
> + los = !!(sfp->state & SFP_F_LOS);
>  
>   if (los)
>   sfp_sm_next(sfp, SFP_S_WAIT_LOS, 0);
> @@ -1493,18 +1497,22 @@ static void sfp_sm_link_check_los(struct sfp *sfp)
>  
>  static bool sfp_los_event_active(struct sfp *sfp, unsigned int event)
>  {
> - return (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_INVERTED) &&
> - event == SFP_E_LOS_LOW) ||
> -(sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_NORMAL) &&
> - event == SFP_E_LOS_HIGH);
> + const __be16 los_inverted = cpu_to_be16(SFP_OPTIONS_LOS_INVERTED);
> + const __be16 los_normal = cpu_to_be16(SFP_OPTIONS_LOS_NORMAL);
> + __be16 los_options = sfp->id.ext.options & (los_inverted | los_normal);
> +
> + return (los_options == los_inverted && event == SFP_E_LOS_LOW) ||
> +(los_options == los_normal && event == SFP_E_LOS_HIGH);
>  }
>  
>  static bool sfp_los_event_inactive(struct sfp *sfp, unsigned int event)
>  {
> - return (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_INVERTED) &&
> - event == SFP_E_LOS_HIGH) ||
> -(sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_NORMAL) &&
> - event == SFP_E_LOS_LOW);
> + const __be16 los_inverted = cpu_to_be16(SFP_OPTIONS_LOS_INVERTED);
> + const __be16 los_normal = cpu_to_be16(SFP_OPTIONS_LOS_NORMAL);
> + __be16 los_options = sfp->id.ext.options & (los_inverted | los_normal);
> +
> + return (los_options == los_inverted && event == SFP_E_LOS_HIGH) ||
> +(los_options == los_normal && event == SFP_E_LOS_LOW);
>  }
>  
>  static void sfp_sm_fault(struct sfp *sfp, unsigned int next_state, bool warn)
> -- 
> 2.20.1
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

[PATCH v2] xfrm: Fix wraparound in xfrm_policy_addr_delta()

2020-12-30 Thread Visa Hankala

Use three-way comparison for address components to avoid integer
wraparound in the result of xfrm_policy_addr_delta(). This ensures
that the search trees are built and traversed correctly.

Treat IPv4 and IPv6 similarly by returning 0 when prefixlen == 0.
Prefix /0 has only one equivalence class.

Fixes: 9cf545ebd591d ("xfrm: policy: store inexact policies in a tree ordered 
by destination address")
Signed-off-by: Visa Hankala 
---
 net/xfrm/xfrm_policy.c | 26 +
 tools/testing/selftests/net/xfrm_policy.sh | 43 ++
 2 files changed, 61 insertions(+), 8 deletions(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index d622c2548d22..68258ff6a30b 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -793,15 +793,22 @@ static int xfrm_policy_addr_delta(const xfrm_address_t *a,
  const xfrm_address_t *b,
  u8 prefixlen, u16 family)
 {
+   u32 ma, mb, mask;
unsigned int pdw, pbi;
int delta = 0;
 
switch (family) {
case AF_INET:
-   if (sizeof(long) == 4 && prefixlen == 0)
-   return ntohl(a->a4) - ntohl(b->a4);
-   return (ntohl(a->a4) & ((~0UL << (32 - prefixlen -
-  (ntohl(b->a4) & ((~0UL << (32 - prefixlen;
+   if (prefixlen == 0)
+   return 0;
+   mask = ~0U << (32 - prefixlen);
+   ma = ntohl(a->a4) & mask;
+   mb = ntohl(b->a4) & mask;
+   if (ma < mb)
+   delta = -1;
+   else if (ma > mb)
+   delta = 1;
+   break;
case AF_INET6:
pdw = prefixlen >> 5;
pbi = prefixlen & 0x1f;
@@ -812,10 +819,13 @@ static int xfrm_policy_addr_delta(const xfrm_address_t *a,
return delta;
}
if (pbi) {
-   u32 mask = ~0u << (32 - pbi);
-
-   delta = (ntohl(a->a6[pdw]) & mask) -
-   (ntohl(b->a6[pdw]) & mask);
+   mask = ~0U << (32 - pbi);
+   ma = ntohl(a->a6[pdw]) & mask;
+   mb = ntohl(b->a6[pdw]) & mask;
+   if (ma < mb)
+   delta = -1;
+   else if (ma > mb)
+   delta = 1;
}
break;
default:
diff --git a/tools/testing/selftests/net/xfrm_policy.sh 
b/tools/testing/selftests/net/xfrm_policy.sh
index 7a1bf94c5bd3..9e2dffb670d5 100755
--- a/tools/testing/selftests/net/xfrm_policy.sh
+++ b/tools/testing/selftests/net/xfrm_policy.sh
@@ -287,6 +287,47 @@ check_hthresh_repeat()
return 0
 }
 
+# insert non-overlapping policies in a random order and check that
+# all of them can be fetched using the traffic selectors.
+check_random_order()
+{
+   local ns=$1
+   local log=$2
+
+   for i in $(seq 100); do
+   ip -net $ns xfrm policy flush
+   for j in $(seq 0 16 255 | sort -R); do
+   ip -net $ns xfrm policy add dst $j.0.0.0/24 dir out 
priority 10 action allow
+   done
+   for j in $(seq 0 16 255); do
+   if ! ip -net $ns xfrm policy get dst $j.0.0.0/24 dir 
out > /dev/null; then
+   echo "FAIL: $log" 1>&2
+   return 1
+   fi
+   done
+   done
+
+   for i in $(seq 100); do
+   ip -net $ns xfrm policy flush
+   for j in $(seq 0 16 255 | sort -R); do
+   local addr=$(printf "e000::%02x00::/56" $j)
+   ip -net $ns xfrm policy add dst $addr dir out priority 
10 action allow
+   done
+   for j in $(seq 0 16 255); do
+   local addr=$(printf "e000::%02x00::/56" $j)
+   if ! ip -net $ns xfrm policy get dst $addr dir out > 
/dev/null; then
+   echo "FAIL: $log" 1>&2
+   return 1
+   fi
+   done
+   done
+
+   ip -net $ns xfrm policy flush
+
+   echo "PASS: $log"
+   return 0
+}
+
 #check for needed privileges
 if [ "$(id -u)" -ne 0 ];then
echo "SKIP: Need root privileges"
@@ -438,6 +479,8 @@ check_exceptions "exceptions and block policies after 
htresh change to normal"
 
 check_hthresh_repeat "policies with repeated htresh change"
 
+check_random_order ns3 "policies inserted in random order"
+
 for i in 1 2 3 4;do ip netns del ns$i;done
 
 exit $ret

Re: Registering IRQ for MT7530 internal PHYs

2020-12-30 Thread Florian Fainelli

On 12/30/2020 1:12 AM, Heiner Kallweit wrote:
> On 30.12.2020 10:07, DENG Qingfang wrote:
>> Hi Heiner,
>> Thanks for your reply.
>>
>> On Wed, Dec 30, 2020 at 3:39 PM Heiner Kallweit  wrote:
>>> I don't think that's the best option.
>>
>> I'm well aware of that.
>>
>>> You may want to add a PHY driver for your chip. Supposedly it
>>> supports at least PHY suspend/resume. You can use the RTL8366RB
>>> PHY driver as template.
>>
>> There's no MediaTek PHY driver yet. Do we really need a new one just
>> for the interrupts?
>>
> Not only for the interrupts. The genphy driver e.g. doesn't support
> PHY suspend/resume. And the PHY driver needs basically no code,
> just set the proper callbacks.

That statement about not supporting suspend/resume is not exactly true,
the generic "1g" PHY driver only implements suspend/resume through the
use of the standard BMCR power down bit, but not anything more
complicated than that.

Interrupt handling within the PHY itself is not defined by the existing
standard registers and will typically not reside in a standard register
space either, so just for that reason you do need a custom PHY driver.
There are other advantages if you need to expose additional PHY features
down the road like PHY counters, energy detection, automatic power down etc.

I don't believe we will see discrete/standalone Mediatek PHY chips, but
if that happens, then you would already have a framework for supporting
them.
-- 
Florian

Re: Registering IRQ for MT7530 internal PHYs

2020-12-30 Thread Florian Fainelli




On 12/30/2020 7:16 AM, Andrew Lunn wrote:
>> 2. Allocated ds->slave_mii_bus before calling ds->ops->setup, because
>> we cannot call dsa_slave_mii_bus_init which is private.
>>
>> Any better ideas?
> 
> Do what mv88e6xxx does, allocate the MDIO bus inside the switch
> driver.

Yes, exactly, or you could add additional hooks to allow intercepting
the initialization and de-initialization of the ds->slave_mii_bus,
something like this:

https://github.com/ffainelli/linux/commit/758da087a819cd1a284de074ea7d8eae9f875f0b

which was part of a larger series adding threaded IRQ support to the b53
driver:

https://github.com/ffainelli/linux/commits/b53-irq
-- 
Florian

Re: [PATCH] Allow user to set metric on default route learned via Router Advertisement.

2020-12-30 Thread Andrew Lunn

On Wed, Dec 30, 2020 at 12:48:32AM -0800, Praveen Chaudhary wrote:
> Allow user to set metric on default route learned via Router Advertisement.
> Not: RFC 4191 does not say anything for metric for IPv6 default route.

Hi Praveen

Please take a look at

https://www.kernel.org/doc/html/latest/process/submitting-patches.html

and

https://www.kernel.org/doc/html/latest/networking/netdev-FAQ.html

In particular, you should be using git format-patch, which will
correctly number your patches, make use of a version number inside the
[PATCH v0 net-next] subject part, and please indicate which tree this
is for.

   Andrew

Re: [PATCH v2] net: macb: Correct usage of MACB_CAPS_CLK_HW_CHG flag

2020-12-30 Thread Andrew Lunn

On Wed, Dec 30, 2020 at 10:33:09AM +, Charles Keepax wrote:
> A new flag MACB_CAPS_CLK_HW_CHG was added and all callers of
> macb_set_tx_clk were gated on the presence of this flag.
> 
> -   if (!clk)
> + if (!bp->tx_clk || !(bp->caps & MACB_CAPS_CLK_HW_CHG))
> 
> However the flag was not added to anything other than the new
> sama7g5_gem, turning that function call into a no op for all other
> systems. This breaks the networking on Zynq.
> 
> The commit message adding this states: a new capability so that
> macb_set_tx_clock() to not be called for IPs having this
> capability
> 
> This strongly implies that present of the flag was intended to skip
> the function not absence of the flag. Update the if statement to
> this effect, which repairs the existing users.
> 
> Fixes: daafa1d33cc9 ("net: macb: add capability to not set the clock rate")
> Suggested-by: Andrew Lunn 
> Signed-off-by: Charles Keepax 

Reviewed-by: Andrew Lunn 

Hi Charles

Since this is a fix, this should be based on the net tree. And you
indicate this in the subject line with [PATCH net]. If this applies
cleanly to net, Dave will probably just accept it, but please keep
this in mind for any more submissions you make to netdev.

Andrew

Re: Registering IRQ for MT7530 internal PHYs

2020-12-30 Thread Heiner Kallweit

On 30.12.2020 17:15, Florian Fainelli wrote:
> 
> 
> On 12/30/2020 1:12 AM, Heiner Kallweit wrote:
>> On 30.12.2020 10:07, DENG Qingfang wrote:
>>> Hi Heiner,
>>> Thanks for your reply.
>>>
>>> On Wed, Dec 30, 2020 at 3:39 PM Heiner Kallweit  
>>> wrote:
 I don't think that's the best option.
>>>
>>> I'm well aware of that.
>>>
 You may want to add a PHY driver for your chip. Supposedly it
 supports at least PHY suspend/resume. You can use the RTL8366RB
 PHY driver as template.
>>>
>>> There's no MediaTek PHY driver yet. Do we really need a new one just
>>> for the interrupts?
>>>
>> Not only for the interrupts. The genphy driver e.g. doesn't support
>> PHY suspend/resume. And the PHY driver needs basically no code,
>> just set the proper callbacks.
> 
> That statement about not supporting suspend/resume is not exactly true,
> the generic "1g" PHY driver only implements suspend/resume through the
> use of the standard BMCR power down bit, but not anything more
> complicated than that.
> 
Oh, right. Somehow I had in the back of my mind that the genphy driver
has no suspend/resume callbacks set.

> Interrupt handling within the PHY itself is not defined by the existing
> standard registers and will typically not reside in a standard register
> space either, so just for that reason you do need a custom PHY driver.
> There are other advantages if you need to expose additional PHY features
> down the road like PHY counters, energy detection, automatic power down etc.
> 
> I don't believe we will see discrete/standalone Mediatek PHY chips, but
> if that happens, then you would already have a framework for supporting
> them.
>

Re: [PATCH 1/4] net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips

2020-12-30 Thread Pali Rohár

On Wednesday 30 December 2020 16:10:37 Russell King - ARM Linux admin wrote:
> On Wed, Dec 30, 2020 at 04:47:52PM +0100, Pali Rohár wrote:
> > Workaround for GPON SFP modules based on VSOL V2801F brand was added in
> > commit 0d035bed2a4a ("net: sfp: VSOL V2801F / CarlitoxxPro CPGOS03-0490
> > v2.0 workaround"). But it works only for ids explicitly added to the list.
> > As there are more rebraded VSOL V2801F modules and OEM vendors are putting
> > into vendor name random strings we cannot build workaround based on ids.
> > 
> > Moreover issue which that commit tried to workaround is generic not only to
> > VSOL based modules, but rather to all GPON modules based on Realtek RTL8672
> > and RTL9601C chips.
> > 
> > They can be found for example in following GPON modules:
> > * V-SOL V2801F
> > * C-Data FD511GX-RM0
> > * OPTON GP801R
> > * BAUDCOM BD-1234-SFM
> > * CPGOS03-0490 v2.0
> > * Ubiquiti U-Fiber Instant
> > * EXOT EGS1
> > 
> > Those Realtek chips have broken EEPROM emulator which for N-byte read
> > operation returns just one byte of EEPROM data followed by N-1 zeros.
> > 
> > So introduce a new function sfp_id_needs_byte_io() which detects SFP
> > modules with these Realtek chips which have broken EEPROM emulator based on
> > N-1 zeros and switch to 1 byte EEPROM reading operation which workaround
> > this issue.
> > 
> > This patch fixes reading EEPROM content from SFP modules based on Realtek
> > RTL8672 and RTL9601C chips.
> > 
> > Fixes: 0d035bed2a4a ("net: sfp: VSOL V2801F / CarlitoxxPro CPGOS03-0490 
> > v2.0 workaround")
> > Co-developed-by: Russell King 
> > Signed-off-by: Russell King 
> > Signed-off-by: Pali Rohár 
> > ---
> >  drivers/net/phy/sfp.c | 78 ---
> >  1 file changed, 44 insertions(+), 34 deletions(-)
> > 
> > diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
> > index 91d74c1a920a..490e78a72dd6 100644
> > --- a/drivers/net/phy/sfp.c
> > +++ b/drivers/net/phy/sfp.c
> > @@ -336,19 +336,11 @@ static int sfp_i2c_read(struct sfp *sfp, bool a2, u8 
> > dev_addr, void *buf,
> > size_t len)
> >  {
> > struct i2c_msg msgs[2];
> > -   size_t block_size;
> > +   u8 bus_addr = a2 ? 0x51 : 0x50;
> > +   size_t block_size = sfp->i2c_block_size;
> > size_t this_len;
> > -   u8 bus_addr;
> > int ret;
> >  
> > -   if (a2) {
> > -   block_size = 16;
> > -   bus_addr = 0x51;
> > -   } else {
> > -   block_size = sfp->i2c_block_size;
> > -   bus_addr = 0x50;
> > -   }
> > -
> 
> NAK. You are undoing something that is definitely needed. The
> diagnostics must be read with sequential reads to be able to properly
> read the 16-bit values.

This change is really required for those Realtek chips. I thought that
it is obvious that from *both* addresses 0x50 and 0x51 can be read only
one byte at the same time. Reading 2 bytes (for be16 value) cannot be
really done by one i2 transfer, it must be done in two.

Therefore I had to "undo" this change as it is breaking support for all
those Realtek SFPs, including VSOL. That is why I also put Fixes tag
into commit message as it is really fixing reading for VSOL SFP from
address 0x51.

I understand that you want to read __be16 val in sfp_hwmon_read_sensor()
function via one i2c transfer to have "consistent" value, but it is
impossible on these Realtek chips.

I have already tested that sfp_hwmon_read_sensor() function see just
zero on second byte when is trying to use 2-byte read via one i2c
transfer.

When it use two i2c transfers, one for low 8bits and one for high 8bits
then reading is working.

> The rest of the patch is fine; it's a shame the entire thing has
> been spoilt by this extra addition that was not in the patch we had
> been discussing off-list.

Sorry for that. I really thought that it is obvious that this change
needs to be undone and read must always use byte transfer if we detect
these broken Realtek chips.

Re: [PATCH 3/4] net: sfp: assume that LOS is not implemented if both LOS normal and inverted is set

2020-12-30 Thread Pali Rohár

On Wednesday 30 December 2020 16:13:10 Russell King - ARM Linux admin wrote:
> On Wed, Dec 30, 2020 at 04:47:54PM +0100, Pali Rohár wrote:
> > Some GPON SFP modules (e.g. Ubiquiti U-Fiber Instant) have set both
> > SFP_OPTIONS_LOS_INVERTED and SFP_OPTIONS_LOS_NORMAL bits in their EEPROM.
> > 
> > Such combination of bits is meaningless so assume that LOS signal is not
> > implemented.
> > 
> > This patch fixes link carrier for GPON SFP module Ubiquiti U-Fiber Instant.
> > 
> > Co-developed-by: Russell King 
> > Signed-off-by: Russell King 
> 
> No, this is not co-developed. The patch content is exactly what _I_
> sent you, only the commit description is your own.

Sorry, in this case I misunderstood usage of this Co-developed-by tag.
I will remove it in next iteration of patches.

> > Signed-off-by: Pali Rohár 
> > ---
> >  drivers/net/phy/sfp.c | 36 ++--
> >  1 file changed, 22 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
> > index 73f3ecf15260..d47485ed239c 100644
> > --- a/drivers/net/phy/sfp.c
> > +++ b/drivers/net/phy/sfp.c
> > @@ -1475,15 +1475,19 @@ static void sfp_sm_link_down(struct sfp *sfp)
> >  
> >  static void sfp_sm_link_check_los(struct sfp *sfp)
> >  {
> > -   unsigned int los = sfp->state & SFP_F_LOS;
> > +   const __be16 los_inverted = cpu_to_be16(SFP_OPTIONS_LOS_INVERTED);
> > +   const __be16 los_normal = cpu_to_be16(SFP_OPTIONS_LOS_NORMAL);
> > +   __be16 los_options = sfp->id.ext.options & (los_inverted | los_normal);
> > +   bool los = false;
> >  
> > /* If neither SFP_OPTIONS_LOS_INVERTED nor SFP_OPTIONS_LOS_NORMAL
> > -* are set, we assume that no LOS signal is available.
> > +* are set, we assume that no LOS signal is available. If both are
> > +* set, we assume LOS is not implemented (and is meaningless.)
> >  */
> > -   if (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_INVERTED))
> > -   los ^= SFP_F_LOS;
> > -   else if (!(sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_NORMAL)))
> > -   los = 0;
> > +   if (los_options == los_inverted)
> > +   los = !(sfp->state & SFP_F_LOS);
> > +   else if (los_options == los_normal)
> > +   los = !!(sfp->state & SFP_F_LOS);
> >  
> > if (los)
> > sfp_sm_next(sfp, SFP_S_WAIT_LOS, 0);
> > @@ -1493,18 +1497,22 @@ static void sfp_sm_link_check_los(struct sfp *sfp)
> >  
> >  static bool sfp_los_event_active(struct sfp *sfp, unsigned int event)
> >  {
> > -   return (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_INVERTED) &&
> > -   event == SFP_E_LOS_LOW) ||
> > -  (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_NORMAL) &&
> > -   event == SFP_E_LOS_HIGH);
> > +   const __be16 los_inverted = cpu_to_be16(SFP_OPTIONS_LOS_INVERTED);
> > +   const __be16 los_normal = cpu_to_be16(SFP_OPTIONS_LOS_NORMAL);
> > +   __be16 los_options = sfp->id.ext.options & (los_inverted | los_normal);
> > +
> > +   return (los_options == los_inverted && event == SFP_E_LOS_LOW) ||
> > +  (los_options == los_normal && event == SFP_E_LOS_HIGH);
> >  }
> >  
> >  static bool sfp_los_event_inactive(struct sfp *sfp, unsigned int event)
> >  {
> > -   return (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_INVERTED) &&
> > -   event == SFP_E_LOS_HIGH) ||
> > -  (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_NORMAL) &&
> > -   event == SFP_E_LOS_LOW);
> > +   const __be16 los_inverted = cpu_to_be16(SFP_OPTIONS_LOS_INVERTED);
> > +   const __be16 los_normal = cpu_to_be16(SFP_OPTIONS_LOS_NORMAL);
> > +   __be16 los_options = sfp->id.ext.options & (los_inverted | los_normal);
> > +
> > +   return (los_options == los_inverted && event == SFP_E_LOS_HIGH) ||
> > +  (los_options == los_normal && event == SFP_E_LOS_LOW);
> >  }
> >  
> >  static void sfp_sm_fault(struct sfp *sfp, unsigned int next_state, bool 
> > warn)
> > -- 
> > 2.20.1
> > 
> > 
> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 1/4] net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips

2020-12-30 Thread Russell King - ARM Linux admin

On Wed, Dec 30, 2020 at 05:56:34PM +0100, Pali Rohár wrote:
> This change is really required for those Realtek chips. I thought that
> it is obvious that from *both* addresses 0x50 and 0x51 can be read only
> one byte at the same time. Reading 2 bytes (for be16 value) cannot be
> really done by one i2 transfer, it must be done in two.

Then these modules are even more broken than first throught, and
quite simply it is pointless supporting the diagnostics on them
because we can never read the values in an atomic way.

It's also a violation of the SFF-8472 that _requires_ multi-byte reads
to read these 16 byte values atomically. Reading them with individual
byte reads results in a non-atomic read, and the 16-bit value can not
be trusted to be correct.

That is really not optional, no matter what any manufacturer says - if
they claim the SFP MSAs allows it, they're quite simply talking out of
a donkey's backside and you should dispose of the module in biohazard
packaging. :)

So no, I hadn't understood this from your emails, and as I say above,
if this is the case, then we quite simply disable diagnostics on these
modules since they are _highly_ noncompliant.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 3/4] net: sfp: assume that LOS is not implemented if both LOS normal and inverted is set

2020-12-30 Thread Russell King - ARM Linux admin

On Wed, Dec 30, 2020 at 05:57:58PM +0100, Pali Rohár wrote:
> On Wednesday 30 December 2020 16:13:10 Russell King - ARM Linux admin wrote:
> > On Wed, Dec 30, 2020 at 04:47:54PM +0100, Pali Rohár wrote:
> > > Some GPON SFP modules (e.g. Ubiquiti U-Fiber Instant) have set both
> > > SFP_OPTIONS_LOS_INVERTED and SFP_OPTIONS_LOS_NORMAL bits in their EEPROM.
> > > 
> > > Such combination of bits is meaningless so assume that LOS signal is not
> > > implemented.
> > > 
> > > This patch fixes link carrier for GPON SFP module Ubiquiti U-Fiber 
> > > Instant.
> > > 
> > > Co-developed-by: Russell King 
> > > Signed-off-by: Russell King 
> > 
> > No, this is not co-developed. The patch content is exactly what _I_
> > sent you, only the commit description is your own.
> 
> Sorry, in this case I misunderstood usage of this Co-developed-by tag.
> I will remove it in next iteration of patches.

You need to mark me as the author of the code at the very least...

> > > Signed-off-by: Pali Rohár 
> > > ---
> > >  drivers/net/phy/sfp.c | 36 ++--
> > >  1 file changed, 22 insertions(+), 14 deletions(-)
> > > 
> > > diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
> > > index 73f3ecf15260..d47485ed239c 100644
> > > --- a/drivers/net/phy/sfp.c
> > > +++ b/drivers/net/phy/sfp.c
> > > @@ -1475,15 +1475,19 @@ static void sfp_sm_link_down(struct sfp *sfp)
> > >  
> > >  static void sfp_sm_link_check_los(struct sfp *sfp)
> > >  {
> > > - unsigned int los = sfp->state & SFP_F_LOS;
> > > + const __be16 los_inverted = cpu_to_be16(SFP_OPTIONS_LOS_INVERTED);
> > > + const __be16 los_normal = cpu_to_be16(SFP_OPTIONS_LOS_NORMAL);
> > > + __be16 los_options = sfp->id.ext.options & (los_inverted | los_normal);
> > > + bool los = false;
> > >  
> > >   /* If neither SFP_OPTIONS_LOS_INVERTED nor SFP_OPTIONS_LOS_NORMAL
> > > -  * are set, we assume that no LOS signal is available.
> > > +  * are set, we assume that no LOS signal is available. If both are
> > > +  * set, we assume LOS is not implemented (and is meaningless.)
> > >*/
> > > - if (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_INVERTED))
> > > - los ^= SFP_F_LOS;
> > > - else if (!(sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_NORMAL)))
> > > - los = 0;
> > > + if (los_options == los_inverted)
> > > + los = !(sfp->state & SFP_F_LOS);
> > > + else if (los_options == los_normal)
> > > + los = !!(sfp->state & SFP_F_LOS);
> > >  
> > >   if (los)
> > >   sfp_sm_next(sfp, SFP_S_WAIT_LOS, 0);
> > > @@ -1493,18 +1497,22 @@ static void sfp_sm_link_check_los(struct sfp *sfp)
> > >  
> > >  static bool sfp_los_event_active(struct sfp *sfp, unsigned int event)
> > >  {
> > > - return (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_INVERTED) &&
> > > - event == SFP_E_LOS_LOW) ||
> > > -(sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_NORMAL) &&
> > > - event == SFP_E_LOS_HIGH);
> > > + const __be16 los_inverted = cpu_to_be16(SFP_OPTIONS_LOS_INVERTED);
> > > + const __be16 los_normal = cpu_to_be16(SFP_OPTIONS_LOS_NORMAL);
> > > + __be16 los_options = sfp->id.ext.options & (los_inverted | los_normal);
> > > +
> > > + return (los_options == los_inverted && event == SFP_E_LOS_LOW) ||
> > > +(los_options == los_normal && event == SFP_E_LOS_HIGH);
> > >  }
> > >  
> > >  static bool sfp_los_event_inactive(struct sfp *sfp, unsigned int event)
> > >  {
> > > - return (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_INVERTED) &&
> > > - event == SFP_E_LOS_HIGH) ||
> > > -(sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_LOS_NORMAL) &&
> > > - event == SFP_E_LOS_LOW);
> > > + const __be16 los_inverted = cpu_to_be16(SFP_OPTIONS_LOS_INVERTED);
> > > + const __be16 los_normal = cpu_to_be16(SFP_OPTIONS_LOS_NORMAL);
> > > + __be16 los_options = sfp->id.ext.options & (los_inverted | los_normal);
> > > +
> > > + return (los_options == los_inverted && event == SFP_E_LOS_HIGH) ||
> > > +(los_options == los_normal && event == SFP_E_LOS_LOW);
> > >  }
> > >  
> > >  static void sfp_sm_fault(struct sfp *sfp, unsigned int next_state, bool 
> > > warn)
> > > -- 
> > > 2.20.1
> > > 
> > > 
> > 
> > -- 
> > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> > FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 2/4] net: sfp: allow to use also SFP modules which are detected as SFF

2020-12-30 Thread Pali Rohár

On Wednesday 30 December 2020 16:11:51 Russell King - ARM Linux admin wrote:
> On Wed, Dec 30, 2020 at 04:47:53PM +0100, Pali Rohár wrote:
> > Some GPON SFP modules (e.g. Ubiquiti U-Fiber Instant) have set SFF phys_id
> > in their EEPROM. Kernel SFP subsystem currently does not allow to use
> > modules detected as SFF.
> > 
> > This change extends check for SFP modules so also those with SFF phys_id
> > are allowed. With this change also GPON SFP module Ubiquiti U-Fiber Instant
> > is recognized.
> 
> I really don't want to do this for every single module out there.
> It's likely that Ubiquiti do this crap as a vendor lock-in measure.
> Let's make it specific to Ubiquiti modules _only_ until such time
> that we know better.

Ok. This module_supported() function is called in sfp_sm_mod_probe()
function. Current code is:

/* Check whether we support this module */
if (!sfp->type->module_supported(&id)) {
dev_err(sfp->dev,
"module is not supported - phys id 0x%02x 0x%02x\n",
sfp->id.base.phys_id, sfp->id.base.phys_ext_id);
return -EINVAL;
}

Do you want to change code to something like this?

/* Check whether we support this module */
if (!sfp->type->module_supported(&id) &&
(memcmp(id.base.vendor_name, "UBNT", 16) ||
 memcmp(id.base.vendor_pn, "UF-INSTANT  ", 16)))
dev_err(sfp->dev,
"module is not supported - phys id 0x%02x 0x%02x\n",
sfp->id.base.phys_id, sfp->id.base.phys_ext_id);
return -EINVAL;
}

Or do you have a better idea how to skip that module_supported check for
this UBNT SFP?

Re: [PATCH 1/4] net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips

2020-12-30 Thread Andrew Lunn

On Wed, Dec 30, 2020 at 05:05:46PM +, Russell King - ARM Linux admin wrote:
> On Wed, Dec 30, 2020 at 05:56:34PM +0100, Pali Rohár wrote:
> > This change is really required for those Realtek chips. I thought that
> > it is obvious that from *both* addresses 0x50 and 0x51 can be read only
> > one byte at the same time. Reading 2 bytes (for be16 value) cannot be
> > really done by one i2 transfer, it must be done in two.
> 
> Then these modules are even more broken than first throught, and
> quite simply it is pointless supporting the diagnostics on them
> because we can never read the values in an atomic way.
> 
> It's also a violation of the SFF-8472 that _requires_ multi-byte reads
> to read these 16 byte values atomically. Reading them with individual
> byte reads results in a non-atomic read, and the 16-bit value can not
> be trusted to be correct.

Hi Pali

I have to agree with Russell here. I would rather have no diagnostics
than untrustable diagnostics.

The only way this is going to be accepted is if the manufacture says
that reading the first byte of a word snapshots the second byte as
well in an atomic way and returns that snapshot on the second
read. But i highly doubt that happens, given how bad these SFPs are.

  Andrew

Re: [PATCH 3/4] net: sfp: assume that LOS is not implemented if both LOS normal and inverted is set

2020-12-30 Thread Andrew Lunn

On Wed, Dec 30, 2020 at 05:06:23PM +, Russell King - ARM Linux admin wrote:
> On Wed, Dec 30, 2020 at 05:57:58PM +0100, Pali Rohár wrote:
> > On Wednesday 30 December 2020 16:13:10 Russell King - ARM Linux admin wrote:
> > > On Wed, Dec 30, 2020 at 04:47:54PM +0100, Pali Rohár wrote:
> > > > Some GPON SFP modules (e.g. Ubiquiti U-Fiber Instant) have set both
> > > > SFP_OPTIONS_LOS_INVERTED and SFP_OPTIONS_LOS_NORMAL bits in their 
> > > > EEPROM.
> > > > 
> > > > Such combination of bits is meaningless so assume that LOS signal is not
> > > > implemented.
> > > > 
> > > > This patch fixes link carrier for GPON SFP module Ubiquiti U-Fiber 
> > > > Instant.
> > > > 
> > > > Co-developed-by: Russell King 
> > > > Signed-off-by: Russell King 
> > > 
> > > No, this is not co-developed. The patch content is exactly what _I_
> > > sent you, only the commit description is your own.
> > 
> > Sorry, in this case I misunderstood usage of this Co-developed-by tag.
> > I will remove it in next iteration of patches.
> 
> You need to mark me as the author of the code at the very least...

Hi Pali

You also need to keep your own Signed-off-by, since the patch is
coming through you.

So basically, git commit --am --author="Russell King 
"
and then two Signed-off-by: lines.

   Andrew

Re: [PATCH 2/4] net: sfp: allow to use also SFP modules which are detected as SFF

2020-12-30 Thread Marek Behún

On Wed, 30 Dec 2020 18:06:52 +0100
Pali Rohár  wrote:

>   if (!sfp->type->module_supported(&id) &&
>   (memcmp(id.base.vendor_name, "UBNT", 16) ||
>memcmp(id.base.vendor_pn, "UF-INSTANT  ", 16)))

I would rather add a quirk member (bitfield) to the sfp structure and do
something like this

if (!sfp->type->module_supported(&id) &&
!(sfp->quirks & SFP_QUIRK_BAD_PHYS_ID))

or maybe put this check into the module_supported method.

Marek

Re: [PATCH 1/4] net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips

2020-12-30 Thread Pali Rohár

On Wednesday 30 December 2020 17:05:46 Russell King - ARM Linux admin wrote:
> On Wed, Dec 30, 2020 at 05:56:34PM +0100, Pali Rohár wrote:
> > This change is really required for those Realtek chips. I thought that
> > it is obvious that from *both* addresses 0x50 and 0x51 can be read only
> > one byte at the same time. Reading 2 bytes (for be16 value) cannot be
> > really done by one i2 transfer, it must be done in two.
> 
> Then these modules are even more broken than first throught, and
> quite simply it is pointless supporting the diagnostics on them
> because we can never read the values in an atomic way.

They are broken in a way that neither holy water help them...

But from diagnostic 0x51 address we can read at least 8bit registers in
atomic way :-)

> It's also a violation of the SFF-8472 that _requires_ multi-byte reads
> to read these 16 byte values atomically. Reading them with individual
> byte reads results in a non-atomic read, and the 16-bit value can not
> be trusted to be correct.
> 
> That is really not optional, no matter what any manufacturer says - if
> they claim the SFP MSAs allows it, they're quite simply talking out of
> a donkey's backside and you should dispose of the module in biohazard
> packaging. :)
> 
> So no, I hadn't understood this from your emails, and as I say above,
> if this is the case, then we quite simply disable diagnostics on these
> modules since they are _highly_ noncompliant.

We have just two options:

Disable 2 (and more) bytes reads from 0x51 address and therefore disable
sfp_hwmon_read_sensor() function.

Or allow 2 bytes non-atomic reads and allow at least semi-correct values
for hwmon. I guess that upper 8bits would not change between two single
byte i2c transfers too much (when they are done immediately one by one).

But in any case we really need to set read operation for this eeprom to
one byte.

Re: [PATCH 3/4] net: sfp: assume that LOS is not implemented if both LOS normal and inverted is set

2020-12-30 Thread Pali Rohár

On Wednesday 30 December 2020 18:17:41 Andrew Lunn wrote:
> On Wed, Dec 30, 2020 at 05:06:23PM +, Russell King - ARM Linux admin 
> wrote:
> > On Wed, Dec 30, 2020 at 05:57:58PM +0100, Pali Rohár wrote:
> > > On Wednesday 30 December 2020 16:13:10 Russell King - ARM Linux admin 
> > > wrote:
> > > > On Wed, Dec 30, 2020 at 04:47:54PM +0100, Pali Rohár wrote:
> > > > > Some GPON SFP modules (e.g. Ubiquiti U-Fiber Instant) have set both
> > > > > SFP_OPTIONS_LOS_INVERTED and SFP_OPTIONS_LOS_NORMAL bits in their 
> > > > > EEPROM.
> > > > > 
> > > > > Such combination of bits is meaningless so assume that LOS signal is 
> > > > > not
> > > > > implemented.
> > > > > 
> > > > > This patch fixes link carrier for GPON SFP module Ubiquiti U-Fiber 
> > > > > Instant.
> > > > > 
> > > > > Co-developed-by: Russell King 
> > > > > Signed-off-by: Russell King 
> > > > 
> > > > No, this is not co-developed. The patch content is exactly what _I_
> > > > sent you, only the commit description is your own.
> > > 
> > > Sorry, in this case I misunderstood usage of this Co-developed-by tag.
> > > I will remove it in next iteration of patches.
> > 
> > You need to mark me as the author of the code at the very least...
> 
> Hi Pali
> 
> You also need to keep your own Signed-off-by, since the patch is
> coming through you.
> 
> So basically, git commit --am --author="Russell King 
> "
> and then two Signed-off-by: lines.

Got it, thank you!

Re: [PATCH 1/4] net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips

2020-12-30 Thread Pali Rohár

On Wednesday 30 December 2020 18:13:15 Andrew Lunn wrote:
> On Wed, Dec 30, 2020 at 05:05:46PM +, Russell King - ARM Linux admin 
> wrote:
> > On Wed, Dec 30, 2020 at 05:56:34PM +0100, Pali Rohár wrote:
> > > This change is really required for those Realtek chips. I thought that
> > > it is obvious that from *both* addresses 0x50 and 0x51 can be read only
> > > one byte at the same time. Reading 2 bytes (for be16 value) cannot be
> > > really done by one i2 transfer, it must be done in two.
> > 
> > Then these modules are even more broken than first throught, and
> > quite simply it is pointless supporting the diagnostics on them
> > because we can never read the values in an atomic way.
> > 
> > It's also a violation of the SFF-8472 that _requires_ multi-byte reads
> > to read these 16 byte values atomically. Reading them with individual
> > byte reads results in a non-atomic read, and the 16-bit value can not
> > be trusted to be correct.
> 
> Hi Pali
> 
> I have to agree with Russell here. I would rather have no diagnostics
> than untrustable diagnostics.

Ok!

So should we completely skip hwmon_device_register_with_info() call
if (i2c_block_size < 2) ?

> The only way this is going to be accepted is if the manufacture says
> that reading the first byte of a word snapshots the second byte as
> well in an atomic way and returns that snapshot on the second
> read. But i highly doubt that happens, given how bad these SFPs are.

I do not think that manufacture says something. I think that they even
do not know that their Realtek chips are completely broken.

I can imagine that vendor just says: it is working in our branded boxes
with SFP cages and if it does not work in your kernel then problem is
with your custom kernel and we do not care about 3rd parties.

Re: [PATCH v2] btf: support ints larger than 128 bits

2020-12-30 Thread Yonghong Song





On 12/19/20 8:36 AM, Sean Young wrote:

clang supports arbitrary length ints using the _ExtInt extension. This
can be useful to hold very large values, e.g. 256 bit or 512 bit types.

Larger types (e.g. 1024 bits) are possible but I am unaware of a use
case for these.

This requires the _ExtInt extension enabled in clang, which is under
review.

Link: https://clang.llvm.org/docs/LanguageExtensions.html#extended-integer-types
Link: https://reviews.llvm.org/D93103

Signed-off-by: Sean Young 
---
changes since v2:
  - added tests as suggested by Yonghong Song
  - added kernel pretty-printer

  Documentation/bpf/btf.rst |   4 +-
  include/uapi/linux/btf.h  |   2 +-
  kernel/bpf/btf.c  |  54 +-
  tools/bpf/bpftool/btf_dumper.c|  40 ++
  tools/include/uapi/linux/btf.h|   2 +-
  tools/lib/bpf/btf.c   |   2 +-
  tools/testing/selftests/bpf/Makefile  |   3 +-
  tools/testing/selftests/bpf/prog_tests/btf.c  |   3 +-
  .../selftests/bpf/progs/test_btf_extint.c |  50 ++
  tools/testing/selftests/bpf/test_extint.py| 535 ++


For easier review, maybe you can break this patch into a patch series 
like below?

  patch 1 (kernel related changes and doc)
  kernel/bpf/btf.c, include/uapi/linux/btf.h,
  tools/include/uapi/linux/btf.h
  Documentation/bpf/btf.rst
  patch 2 (libbpf support)
  tools/lib/bpf/btf.c
  patch 3 (bpftool support)
  tools/bpf/bpftool/btf_dumper.c
  patch 4 (testing)
  rest files


  10 files changed, 679 insertions(+), 16 deletions(-)
  create mode 100644 tools/testing/selftests/bpf/progs/test_btf_extint.c
  create mode 100755 tools/testing/selftests/bpf/test_extint.py

diff --git a/Documentation/bpf/btf.rst b/Documentation/bpf/btf.rst
index 44dc789de2b4..784f1743dbc7 100644
--- a/Documentation/bpf/btf.rst
+++ b/Documentation/bpf/btf.rst
@@ -132,7 +132,7 @@ The following sections detail encoding of each kind.
  
#define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f00) >> 24)

#define BTF_INT_OFFSET(VAL) (((VAL) & 0x00ff) >> 16)
-  #define BTF_INT_BITS(VAL)   ((VAL)  & 0x00ff)
+  #define BTF_INT_BITS(VAL)   ((VAL)  & 0x03ff)
  
  The ``BTF_INT_ENCODING`` has the following attributes::
  
@@ -147,7 +147,7 @@ pretty print. At most one encoding can be specified for the int type.

  The ``BTF_INT_BITS()`` specifies the number of actual bits held by this int
  type. For example, a 4-bit bitfield encodes ``BTF_INT_BITS()`` equals to 4.
  The ``btf_type.size * 8`` must be equal to or greater than ``BTF_INT_BITS()``
-for the type. The maximum value of ``BTF_INT_BITS()`` is 128.
+for the type. The maximum value of ``BTF_INT_BITS()`` is 512.
  
  The ``BTF_INT_OFFSET()`` specifies the starting bit offset to calculate values

  for this int. For example, a bitfield struct member has:
diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
index 5a667107ad2c..1696fd02b302 100644
--- a/include/uapi/linux/btf.h
+++ b/include/uapi/linux/btf.h
@@ -84,7 +84,7 @@ struct btf_type {
   */
  #define BTF_INT_ENCODING(VAL) (((VAL) & 0x0f00) >> 24)
  #define BTF_INT_OFFSET(VAL)   (((VAL) & 0x00ff) >> 16)
-#define BTF_INT_BITS(VAL)  ((VAL)  & 0x00ff)
+#define BTF_INT_BITS(VAL)  ((VAL)  & 0x03ff)
  
  /* Attributes stored in the BTF_INT_ENCODING */

  #define BTF_INT_SIGNED(1 << 0)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 8d6bdb4f4d61..44bc17207e9b 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -166,7 +166,8 @@
   *
   */
  
-#define BITS_PER_U128 (sizeof(u64) * BITS_PER_BYTE * 2)

+#define BITS_PER_U128 128
+#define BITS_PER_U512 512
  #define BITS_PER_BYTE_MASK (BITS_PER_BYTE - 1)
  #define BITS_PER_BYTE_MASKED(bits) ((bits) & BITS_PER_BYTE_MASK)
  #define BITS_ROUNDDOWN_BYTES(bits) ((bits) >> 3)
@@ -1907,9 +1908,9 @@ static int btf_int_check_member(struct btf_verifier_env 
*env,
nr_copy_bits = BTF_INT_BITS(int_data) +
BITS_PER_BYTE_MASKED(struct_bits_off);
  
-	if (nr_copy_bits > BITS_PER_U128) {

+   if (nr_copy_bits > BITS_PER_U512) {
btf_verifier_log_member(env, struct_type, member,
-   "nr_copy_bits exceeds 128");
+   "nr_copy_bits exceeds 512");
return -EINVAL;
}
  
@@ -1963,9 +1964,9 @@ static int btf_int_check_kflag_member(struct btf_verifier_env *env,
  
  	bytes_offset = BITS_ROUNDDOWN_BYTES(struct_bits_off);

nr_copy_bits = nr_bits + BITS_PER_BYTE_MASKED(struct_bits_off);
-   if (nr_copy_bits > BITS_PER_U128) {
+   if (nr_copy_bits > BITS_PER_U512) {
btf_verifier_log_member(env, struct_type, member,
-   "nr_copy_bits exceeds 128");
+   "nr_copy_bits exceeds 512");
return -EINVAL;
}

[PATCH net] r8169: work around power-saving bug on some chip versions

2020-12-30 Thread Heiner Kallweit

A user reported failing network with RTL8168dp (a quite rare chip
version). Realtek confirmed that few chip versions suffer from a PLL
power-down hw bug.

Fixes: 07df5bd874f0 ("r8169: power down chip in probe")
Signed-off-by: Heiner Kallweit 
---
Note that since the original change source file r8169.c was renamed
to r8169_main.c. Also the name of the affected functions has changed
from r8168_pll_power_down/up to rtl_pll_power_down/up.
---
 drivers/net/ethernet/realtek/r8169_main.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c 
b/drivers/net/ethernet/realtek/r8169_main.c
index 46d8510b2..a569abe7f 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -2207,7 +2207,8 @@ static void rtl_pll_power_down(struct rtl8169_private *tp)
}
 
switch (tp->mac_version) {
-   case RTL_GIGA_MAC_VER_25 ... RTL_GIGA_MAC_VER_33:
+   case RTL_GIGA_MAC_VER_25 ... RTL_GIGA_MAC_VER_26:
+   case RTL_GIGA_MAC_VER_32 ... RTL_GIGA_MAC_VER_33:
case RTL_GIGA_MAC_VER_37:
case RTL_GIGA_MAC_VER_39:
case RTL_GIGA_MAC_VER_43:
@@ -2233,7 +2234,8 @@ static void rtl_pll_power_down(struct rtl8169_private *tp)
 static void rtl_pll_power_up(struct rtl8169_private *tp)
 {
switch (tp->mac_version) {
-   case RTL_GIGA_MAC_VER_25 ... RTL_GIGA_MAC_VER_33:
+   case RTL_GIGA_MAC_VER_25 ... RTL_GIGA_MAC_VER_26:
+   case RTL_GIGA_MAC_VER_32 ... RTL_GIGA_MAC_VER_33:
case RTL_GIGA_MAC_VER_37:
case RTL_GIGA_MAC_VER_39:
case RTL_GIGA_MAC_VER_43:
-- 
2.29.2

Re: [PATCH v2] xfrm: Fix wraparound in xfrm_policy_addr_delta()

2020-12-30 Thread Florian Westphal

Visa Hankala  wrote:
> Use three-way comparison for address components to avoid integer
> wraparound in the result of xfrm_policy_addr_delta(). This ensures
> that the search trees are built and traversed correctly.
> 
> Treat IPv4 and IPv6 similarly by returning 0 when prefixlen == 0.
> Prefix /0 has only one equivalence class.

Acked-by: Florian Westphal

Re: [PATCH 1/4] net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips

2020-12-30 Thread Russell King - ARM Linux admin

On Wed, Dec 30, 2020 at 06:43:07PM +0100, Pali Rohár wrote:
> On Wednesday 30 December 2020 18:13:15 Andrew Lunn wrote:
> > Hi Pali
> > 
> > I have to agree with Russell here. I would rather have no diagnostics
> > than untrustable diagnostics.
> 
> Ok!
> 
> So should we completely skip hwmon_device_register_with_info() call
> if (i2c_block_size < 2) ?

I don't think that alone is sufficient - there's also the matter of
ethtool -m which will dump that information as well, and we don't want
to offer it to userspace in an unreliable form.

For reference, here is what SFF-8472 which defines the diagnostics, says
about this:

  To guarantee coherency of the diagnostic monitoring data, the host is
  required to retrieve any multi-byte fields from the diagnostic
  monitoring data structure (IE: Rx Power MSB - byte 104 in A2h, Rx Power
  LSB - byte 105 in A2h) by the use of a single two-byte read sequence
  across the two-wire interface interface.

  The transceiver is required to ensure that any multi-byte fields which
  are updated with diagnostic monitoring data (e.g. Rx Power MSB - byte
  104 in A2h, Rx Power LSB - byte 105 in A2h) must have this update done
  in a fashion which guarantees coherency and consistency of the data. In
  other words, the update of a multi-byte field by the transceiver must
  not occur such that a partially updated multi-byte field can be
  transferred to the host. Also, the transceiver shall not update a
  multi-byte field within the structure during the transfer of that
  multi-byte field to the host, such that partially updated data would be
  transferred to the host.

The first paragraph is extremely definitive in how these fields shall
be read atomically - by a _single_ two-byte read sequence. From what
you are telling us, these modules do not support that. Therefore, by
definition, they do *not* support proper and reliable reporting of
diagnostic data, and are non-conformant with the SFP MSAs.

So, they are basically broken, and the diagnostics can't be used to
retrieve data that can be said to be useful.

> I do not think that manufacture says something. I think that they even
> do not know that their Realtek chips are completely broken.

Oh, they do know. I had a response from CarlitoxxPro passed to me, which
was:

  That is a behavior related on how your router/switch try to read the
  EEPROM, as described in the datasheet of the GPON ONU SFP, the EEPROM
  can be read in Sequential Single-Byte mode, not in Multi-byte mode as
  you router do, basically, your router is trying to read the full a0h
  table in a single call, and retrieve a null response. that is normal
  and not affect the operations of the GPON ONU SFP, because these
  values are informational only. so the Software for your router should
  be able to read in Single-Byte mode to read the content of the EEPROM
  in concordance to SFF-8431

which totally misses the point that it is /not/ up to the module to
choose whether multi-byte reads are supported or not. If they bothered
to gain a proper understanding of the MSAs, they would have noticed that
the device on 0xA0 is required to behave as an Atmel AT24Cxx EEPROM.
The following from INF-8074i, which is the very first definition of the
SFP form factor modules:

  The SFP serial ID provides access to sophisticated identification
  information that describes the transceiver's capabilities, standard
  interfaces, manufacturer, and other information. The serial interface
  uses the 2-wire serial CMOS E2PROM protocol defined for the ATMEL
  AT24C01A/02/04 family of components.

As they took less than one working day to provide the above response, I
suspect they know full well that there's a problem - and it likely
affects other "routers" as well.

They're also confused about their SFF specifications. SFF-8431 is: "SFP+
10 Gb/s and Low Speed Electrical Interface" which is not the correct
specification for a 1Gbps module.

> I can imagine that vendor just says: it is working in our branded boxes
> with SFP cages and if it does not work in your kernel then problem is
> with your custom kernel and we do not care about 3rd parties.

Which shows why it's pointless producing an EEPROM validation tool that
runs under Linux (as has been your suggestion). They won't use it,
since their testing only goes as far as "does it work in our product?"

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

[RFC PATCH v3 10/12] net: group skb_shinfo zerocopy related bits together.

2020-12-30 Thread Jonathan Lemon

From: Jonathan Lemon 

In preparation for expanded zerocopy (TX and RX), move
the zerocopy related bits out of tx_flags into their own
flag word.

Signed-off-by: Jonathan Lemon 
---
 drivers/net/tap.c   |  3 +--
 drivers/net/tun.c   |  3 +--
 drivers/net/xen-netback/interface.c |  4 +--
 include/linux/skbuff.h  | 38 -
 net/core/skbuff.c   |  9 +++
 net/ipv4/tcp.c  |  2 +-
 net/kcm/kcmsock.c   |  4 +--
 7 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 3f51f3766d18..f7a19d9b7c27 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -723,8 +723,7 @@ static ssize_t tap_get_user(struct tap_queue *q, void 
*msg_control,
/* copy skb_ubuf_info for callback when skb has no error */
if (zerocopy) {
skb_shinfo(skb)->destructor_arg = msg_control;
-   skb_shinfo(skb)->tx_flags |= SKBTX_DEV_ZEROCOPY;
-   skb_shinfo(skb)->tx_flags |= SKBTX_SHARED_FRAG;
+   skb_shinfo(skb)->flags |= SKBFL_ZEROCOPY_FRAG;
} else if (msg_control) {
struct ubuf_info *uarg = msg_control;
uarg->callback(NULL, uarg, false);
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index b64e2cc85751..dd9edbd72ae8 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1815,8 +1815,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, 
struct tun_file *tfile,
/* copy skb_ubuf_info for callback when skb has no error */
if (zerocopy) {
skb_shinfo(skb)->destructor_arg = msg_control;
-   skb_shinfo(skb)->tx_flags |= SKBTX_DEV_ZEROCOPY;
-   skb_shinfo(skb)->tx_flags |= SKBTX_SHARED_FRAG;
+   skb_shinfo(skb)->flags |= SKBFL_ZEROCOPY_FRAG;
} else if (msg_control) {
struct ubuf_info *uarg = msg_control;
uarg->callback(NULL, uarg, false);
diff --git a/drivers/net/xen-netback/interface.c 
b/drivers/net/xen-netback/interface.c
index acb786d8b1d8..08b0e3d0b7eb 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -47,7 +47,7 @@
 /* Number of bytes allowed on the internal guest Rx queue. */
 #define XENVIF_RX_QUEUE_BYTES (XEN_NETIF_RX_RING_SIZE/2 * PAGE_SIZE)
 
-/* This function is used to set SKBTX_DEV_ZEROCOPY as well as
+/* This function is used to set SKBFL_ZEROCOPY_ENABLE as well as
  * increasing the inflight counter. We need to increase the inflight
  * counter because core driver calls into xenvif_zerocopy_callback
  * which calls xenvif_skb_zerocopy_complete.
@@ -55,7 +55,7 @@
 void xenvif_skb_zerocopy_prepare(struct xenvif_queue *queue,
 struct sk_buff *skb)
 {
-   skb_shinfo(skb)->tx_flags |= SKBTX_DEV_ZEROCOPY;
+   skb_shinfo(skb)->flags |= SKBFL_ZEROCOPY_ENABLE;
atomic_inc(&queue->inflight_packets);
 }
 
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 3c34c75ab004..66fde9a7b851 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -430,28 +430,32 @@ enum {
/* device driver is going to provide hardware time stamp */
SKBTX_IN_PROGRESS = 1 << 2,
 
-   /* device driver supports TX zero-copy buffers */
-   SKBTX_DEV_ZEROCOPY = 1 << 3,
-
/* generate wifi status information (where possible) */
SKBTX_WIFI_STATUS = 1 << 4,
 
-   /* This indicates at least one fragment might be overwritten
-* (as in vmsplice(), sendfile() ...)
-* If we need to compute a TX checksum, we'll need to copy
-* all frags to avoid possible bad checksum
-*/
-   SKBTX_SHARED_FRAG = 1 << 5,
-
/* generate software time stamp when entering packet scheduling */
SKBTX_SCHED_TSTAMP = 1 << 6,
 };
 
-#define SKBTX_ZEROCOPY_FRAG(SKBTX_DEV_ZEROCOPY | SKBTX_SHARED_FRAG)
 #define SKBTX_ANY_SW_TSTAMP(SKBTX_SW_TSTAMP| \
 SKBTX_SCHED_TSTAMP)
 #define SKBTX_ANY_TSTAMP   (SKBTX_HW_TSTAMP | SKBTX_ANY_SW_TSTAMP)
 
+/* Definitions for flags in struct skb_shared_info */
+enum {
+   /* use zcopy routines */
+   SKBFL_ZEROCOPY_ENABLE = BIT(0),
+
+   /* This indicates at least one fragment might be overwritten
+* (as in vmsplice(), sendfile() ...)
+* If we need to compute a TX checksum, we'll need to copy
+* all frags to avoid possible bad checksum
+*/
+   SKBFL_SHARED_FRAG = BIT(1),
+};
+
+#define SKBFL_ZEROCOPY_FRAG(SKBFL_ZEROCOPY_ENABLE | SKBFL_SHARED_FRAG)
+
 /*
  * The callback notifies userspace to release buffers when skb DMA is done in
  * lower device, the skb last reference should be 0 when calling this.
@@ -506,7 +510,7 @@ int skb_zerocopy_iter_stream(struct sock *sk, struct 
sk_buff *skb,
  * the end of the header data, ie. at skb->end.
  */
 struct skb_shared_info {
-   __u8

[RFC PATCH v3 05/12] skbuff: replace sock_zerocopy_get with skb_zcopy_get

2020-12-30 Thread Jonathan Lemon

From: Jonathan Lemon 

Rename the get routines for consistency.

Signed-off-by: Jonathan Lemon 
---
 include/linux/skbuff.h | 12 ++--
 net/core/skbuff.c  |  2 +-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a6c86839035b..5b8a53ab51fd 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -491,11 +491,6 @@ struct ubuf_info *sock_zerocopy_alloc(struct sock *sk, 
size_t size);
 struct ubuf_info *sock_zerocopy_realloc(struct sock *sk, size_t size,
struct ubuf_info *uarg);
 
-static inline void sock_zerocopy_get(struct ubuf_info *uarg)
-{
-   refcount_inc(&uarg->refcnt);
-}
-
 void sock_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref);
 
 void sock_zerocopy_callback(struct ubuf_info *uarg, bool success);
@@ -1441,6 +1436,11 @@ static inline struct ubuf_info *skb_zcopy(struct sk_buff 
*skb)
return is_zcopy ? skb_uarg(skb) : NULL;
 }
 
+static inline void skb_zcopy_get(struct ubuf_info *uarg)
+{
+   refcount_inc(&uarg->refcnt);
+}
+
 static inline void skb_zcopy_set(struct sk_buff *skb, struct ubuf_info *uarg,
 bool *have_ref)
 {
@@ -1448,7 +1448,7 @@ static inline void skb_zcopy_set(struct sk_buff *skb, 
struct ubuf_info *uarg,
if (unlikely(have_ref && *have_ref))
*have_ref = false;
else
-   sock_zerocopy_get(uarg);
+   skb_zcopy_get(uarg);
skb_shinfo(skb)->destructor_arg = uarg;
skb_shinfo(skb)->tx_flags |= SKBTX_ZEROCOPY_FRAG;
}
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 0e028825367a..00f195908e79 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1163,7 +1163,7 @@ struct ubuf_info *sock_zerocopy_realloc(struct sock *sk, 
size_t size,
 
/* no extra ref when appending to datagram (MSG_MORE) */
if (sk->sk_type == SOCK_STREAM)
-   sock_zerocopy_get(uarg);
+   skb_zcopy_get(uarg);
 
return uarg;
}
-- 
2.24.1

[RFC PATCH v3 06/12] skbuff: Add skb parameter to the ubuf zerocopy callback

2020-12-30 Thread Jonathan Lemon

From: Jonathan Lemon 

Add an optional skb parameter to the zerocopy callback parameter,
which is passed down from skb_zcopy_clear().  This gives access
to the original skb, which is needed for upcoming RX zero-copy
error handling.

Signed-off-by: Jonathan Lemon 
---
 drivers/net/tap.c |  2 +-
 drivers/net/tun.c |  2 +-
 drivers/net/xen-netback/common.h  |  3 ++-
 drivers/net/xen-netback/netback.c |  5 +++--
 drivers/vhost/net.c   |  3 ++-
 include/linux/skbuff.h| 17 -
 net/core/skbuff.c |  3 ++-
 7 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index 1f4bdd94407a..3f51f3766d18 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -727,7 +727,7 @@ static ssize_t tap_get_user(struct tap_queue *q, void 
*msg_control,
skb_shinfo(skb)->tx_flags |= SKBTX_SHARED_FRAG;
} else if (msg_control) {
struct ubuf_info *uarg = msg_control;
-   uarg->callback(uarg, false);
+   uarg->callback(NULL, uarg, false);
}
 
if (tap) {
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index fbed05ae7b0f..b64e2cc85751 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1819,7 +1819,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, 
struct tun_file *tfile,
skb_shinfo(skb)->tx_flags |= SKBTX_SHARED_FRAG;
} else if (msg_control) {
struct ubuf_info *uarg = msg_control;
-   uarg->callback(uarg, false);
+   uarg->callback(NULL, uarg, false);
}
 
skb_reset_network_header(skb);
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 8ee24e351bdc..4a16d6e33c09 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -399,7 +399,8 @@ void xenvif_rx_queue_tail(struct xenvif_queue *queue, 
struct sk_buff *skb);
 void xenvif_carrier_on(struct xenvif *vif);
 
 /* Callback from stack when TX packet can be released */
-void xenvif_zerocopy_callback(struct ubuf_info *ubuf, bool zerocopy_success);
+void xenvif_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *ubuf,
+ bool zerocopy_success);
 
 /* Unmap a pending page and release it back to the guest */
 void xenvif_idx_unmap(struct xenvif_queue *queue, u16 pending_idx);
diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index bc3421d14576..19a27dce79d2 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1091,7 +1091,7 @@ static int xenvif_handle_frag_list(struct xenvif_queue 
*queue, struct sk_buff *s
uarg = skb_shinfo(skb)->destructor_arg;
/* increase inflight counter to offset decrement in callback */
atomic_inc(&queue->inflight_packets);
-   uarg->callback(uarg, true);
+   uarg->callback(NULL, uarg, true);
skb_shinfo(skb)->destructor_arg = NULL;
 
/* Fill the skb with the new (local) frags. */
@@ -1228,7 +1228,8 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
return work_done;
 }
 
-void xenvif_zerocopy_callback(struct ubuf_info *ubuf, bool zerocopy_success)
+void xenvif_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *ubuf,
+ bool zerocopy_success)
 {
unsigned long flags;
pending_ring_idx_t index;
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 531a00d703cd..bf28d0b75c1b 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -381,7 +381,8 @@ static void vhost_zerocopy_signal_used(struct vhost_net 
*net,
}
 }
 
-static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
+static void vhost_zerocopy_callback(struct sk_buff *skb,
+   struct ubuf_info *ubuf, bool success)
 {
struct vhost_net_ubuf_ref *ubufs = ubuf->ctx;
struct vhost_virtqueue *vq = ubufs->vq;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5b8a53ab51fd..b23c3b4b3209 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -461,7 +461,8 @@ enum {
  * The desc field is used to track userspace buffer index.
  */
 struct ubuf_info {
-   void (*callback)(struct ubuf_info *, bool zerocopy_success);
+   void (*callback)(struct sk_buff *, struct ubuf_info *,
+bool zerocopy_success);
union {
struct {
unsigned long desc;
@@ -493,7 +494,8 @@ struct ubuf_info *sock_zerocopy_realloc(struct sock *sk, 
size_t size,
 
 void sock_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref);
 
-void sock_zerocopy_callback(struct ubuf_info *uarg, bool success);
+void sock_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *uarg,
+   bool success);
 
 int skb_zerocopy_iter_dgram(struct sk_buff *skb, struct msghdr *msg, int len);
 int

[RFC PATCH v3 03/12] skbuff: Push status and refcounts into sock_zerocopy_callback

2020-12-30 Thread Jonathan Lemon

From: Jonathan Lemon 

Before this change, the caller of sock_zerocopy_callback would
need to save the zerocopy status, decrement and check the refcount,
and then call the callback function - the callback was only invoked
when the refcount reached zero.

Now, the caller just passes the status into the callback function,
which saves the status and handles its own refcounts.

This makes the behavior of the sock_zerocopy_callback identical
to the tpacket and vhost callbacks.

Signed-off-by: Jonathan Lemon 
---
 include/linux/skbuff.h |  3 ---
 net/core/skbuff.c  | 14 +++---
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 3ca8d7c7b30c..52e96c35f5af 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1479,9 +1479,6 @@ static inline void skb_zcopy_clear(struct sk_buff *skb, 
bool zerocopy)
if (uarg) {
if (skb_zcopy_is_nouarg(skb)) {
/* no notification callback */
-   } else if (uarg->callback == sock_zerocopy_callback) {
-   uarg->zerocopy = uarg->zerocopy && zerocopy;
-   sock_zerocopy_put(uarg);
} else {
uarg->callback(uarg, zerocopy);
}
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index d88963f47f7d..8c18940723ff 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1194,7 +1194,7 @@ static bool skb_zerocopy_notify_extend(struct sk_buff 
*skb, u32 lo, u16 len)
return true;
 }
 
-void sock_zerocopy_callback(struct ubuf_info *uarg, bool success)
+static void __sock_zerocopy_callback(struct ubuf_info *uarg)
 {
struct sk_buff *tail, *skb = skb_from_uarg(uarg);
struct sock_exterr_skb *serr;
@@ -1222,7 +1222,7 @@ void sock_zerocopy_callback(struct ubuf_info *uarg, bool 
success)
serr->ee.ee_origin = SO_EE_ORIGIN_ZEROCOPY;
serr->ee.ee_data = hi;
serr->ee.ee_info = lo;
-   if (!success)
+   if (!uarg->zerocopy)
serr->ee.ee_code |= SO_EE_CODE_ZEROCOPY_COPIED;
 
q = &sk->sk_error_queue;
@@ -1241,11 +1241,19 @@ void sock_zerocopy_callback(struct ubuf_info *uarg, 
bool success)
consume_skb(skb);
sock_put(sk);
 }
+
+void sock_zerocopy_callback(struct ubuf_info *uarg, bool success)
+{
+   uarg->zerocopy = uarg->zerocopy & success;
+
+   if (refcount_dec_and_test(&uarg->refcnt))
+   __sock_zerocopy_callback(uarg);
+}
 EXPORT_SYMBOL_GPL(sock_zerocopy_callback);
 
 void sock_zerocopy_put(struct ubuf_info *uarg)
 {
-   if (uarg && refcount_dec_and_test(&uarg->refcnt))
+   if (uarg)
uarg->callback(uarg, uarg->zerocopy);
 }
 EXPORT_SYMBOL_GPL(sock_zerocopy_put);
-- 
2.24.1

[RFC PATCH v3 01/12] skbuff: remove unused skb_zcopy_abort function

2020-12-30 Thread Jonathan Lemon

From: Jonathan Lemon 

skb_zcopy_abort() has no in-tree consumers, remove it.

Signed-off-by: Jonathan Lemon 
Acked-by: Willem de Bruijn 
---
 include/linux/skbuff.h | 11 ---
 1 file changed, 11 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 333bcdc39635..3ca8d7c7b30c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1490,17 +1490,6 @@ static inline void skb_zcopy_clear(struct sk_buff *skb, 
bool zerocopy)
}
 }
 
-/* Abort a zerocopy operation and revert zckey on error in send syscall */
-static inline void skb_zcopy_abort(struct sk_buff *skb)
-{
-   struct ubuf_info *uarg = skb_zcopy(skb);
-
-   if (uarg) {
-   sock_zerocopy_put_abort(uarg, false);
-   skb_shinfo(skb)->tx_flags &= ~SKBTX_ZEROCOPY_FRAG;
-   }
-}
-
 static inline void skb_mark_not_on_list(struct sk_buff *skb)
 {
skb->next = NULL;
-- 
2.24.1

[RFC PATCH v3 07/12] skbuff: Call sock_zerocopy_put_abort from skb_zcopy_put_abort

2020-12-30 Thread Jonathan Lemon

From: Jonathan Lemon 

The sock_zerocopy_put_abort function contains logic which is
specific to the current zerocopy implementation.  Add a wrapper
which checks the callback and dispatches apppropriately.

Signed-off-by: Jonathan Lemon 
---
 include/linux/skbuff.h | 10 ++
 net/core/skbuff.c  | 12 +---
 net/ipv4/ip_output.c   |  3 +--
 net/ipv4/tcp.c |  2 +-
 net/ipv6/ip6_output.c  |  3 +--
 5 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index b23c3b4b3209..9f7393167f0a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1478,6 +1478,16 @@ static inline void skb_zcopy_put(struct ubuf_info *uarg)
uarg->callback(NULL, uarg, true);
 }
 
+static inline void skb_zcopy_put_abort(struct ubuf_info *uarg, bool have_uref)
+{
+   if (uarg) {
+   if (uarg->callback == sock_zerocopy_callback)
+   sock_zerocopy_put_abort(uarg, have_uref);
+   else if (have_uref)
+   skb_zcopy_put(uarg);
+   }
+}
+
 /* Release a reference on a zerocopy structure */
 static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success)
 {
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 89130b21d9f0..5b9cd528d6a6 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1254,15 +1254,13 @@ EXPORT_SYMBOL_GPL(sock_zerocopy_callback);
 
 void sock_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref)
 {
-   if (uarg) {
-   struct sock *sk = skb_from_uarg(uarg)->sk;
+   struct sock *sk = skb_from_uarg(uarg)->sk;
 
-   atomic_dec(&sk->sk_zckey);
-   uarg->len--;
+   atomic_dec(&sk->sk_zckey);
+   uarg->len--;
 
-   if (have_uref)
-   skb_zcopy_put(uarg);
-   }
+   if (have_uref)
+   sock_zerocopy_callback(NULL, uarg, true);
 }
 EXPORT_SYMBOL_GPL(sock_zerocopy_put_abort);
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 879b76ae4435..65f2299fd682 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1230,8 +1230,7 @@ static int __ip_append_data(struct sock *sk,
 error_efault:
err = -EFAULT;
 error:
-   if (uarg)
-   sock_zerocopy_put_abort(uarg, extra_uref);
+   skb_zcopy_put_abort(uarg, extra_uref);
cork->length -= length;
IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTDISCARDS);
refcount_add(wmem_alloc_delta, &sk->sk_wmem_alloc);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 298a1fae841c..fb58215972ba 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1440,7 +1440,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr 
*msg, size_t size)
if (copied + copied_syn)
goto out;
 out_err:
-   sock_zerocopy_put_abort(uarg, true);
+   skb_zcopy_put_abort(uarg, true);
err = sk_stream_error(sk, flags, err);
/* make sure we wake any epoll edge trigger waiter */
if (unlikely(tcp_rtx_and_write_queues_empty(sk) && err == -EAGAIN)) {
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 749ad72386b2..c8c87891533a 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1715,8 +1715,7 @@ static int __ip6_append_data(struct sock *sk,
 error_efault:
err = -EFAULT;
 error:
-   if (uarg)
-   sock_zerocopy_put_abort(uarg, extra_uref);
+   skb_zcopy_put_abort(uarg, extra_uref);
cork->length -= length;
IP6_INC_STATS(sock_net(sk), rt->rt6i_idev, IPSTATS_MIB_OUTDISCARDS);
refcount_add(wmem_alloc_delta, &sk->sk_wmem_alloc);
-- 
2.24.1

Re: [PATCH 2/4] net: sfp: allow to use also SFP modules which are detected as SFF

2020-12-30 Thread Russell King - ARM Linux admin

On Wed, Dec 30, 2020 at 06:27:07PM +0100, Marek Behún wrote:
> On Wed, 30 Dec 2020 18:06:52 +0100
> Pali Rohár  wrote:
> 
> > if (!sfp->type->module_supported(&id) &&
> > (memcmp(id.base.vendor_name, "UBNT", 16) ||
> >  memcmp(id.base.vendor_pn, "UF-INSTANT  ", 16)))
> 
> I would rather add a quirk member (bitfield) to the sfp structure and do
> something like this
> 
> if (!sfp->type->module_supported(&id) &&
> !(sfp->quirks & SFP_QUIRK_BAD_PHYS_ID))
> 
> or maybe put this check into the module_supported method.

Sorry, definitely not. If you've ever looked at the SDHCI driver with
its multiple "quirks" bitfields, doing this is a recipe for creating
a very horrid hard to understand mess.

What you suggest just results in yet more complexity.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

[PATCH iproute2] build: Fix link errors on some systems

2020-12-30 Thread Roi Dayan

Since moving get_rate() and get_size() from tc to lib, on some
systems we fail to link because of missing the math lib.
Move the link flag from tc makefile to the main makefile.

../lib/libutil.a(utils.o): In function `get_rate':
utils.c:(.text+0x10dc): undefined reference to `floor'
../lib/libutil.a(utils.o): In function `get_size':
utils.c:(.text+0x1394): undefined reference to `floor'
../lib/libutil.a(json_print.o): In function `sprint_size':
json_print.c:(.text+0x14c0): undefined reference to `rint'
json_print.c:(.text+0x14f4): undefined reference to `rint'
json_print.c:(.text+0x157c): undefined reference to `rint'

Fixes: f3be0e6366ac ("lib: Move get_rate(), get_rate64() from tc here")
Fixes: 44396bdfcc0a ("lib: Move get_size() from tc here")
Fixes: adbe5de96662 ("lib: Move sprint_size() from tc here, add print_size()")
Signed-off-by: Roi Dayan 
---
 Makefile| 1 +
 tc/Makefile | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index e64c65992585..2a604ec58905 100644
--- a/Makefile
+++ b/Makefile
@@ -59,6 +59,7 @@ SUBDIRS=lib ip tc bridge misc netem genl tipc devlink rdma 
dcb man
 
 LIBNETLINK=../lib/libutil.a ../lib/libnetlink.a
 LDLIBS += $(LIBNETLINK)
+LDFLAGS += -lm
 
 all: config.mk
@set -e; \
diff --git a/tc/Makefile b/tc/Makefile
index 5a517af20b7c..8d91900716c1 100644
--- a/tc/Makefile
+++ b/tc/Makefile
@@ -110,7 +110,7 @@ ifneq ($(TC_CONFIG_NO_XT),y)
 endif
 
 TCOBJ += $(TCMODULES)
-LDLIBS += -L. -lm
+LDLIBS += -L.
 
 ifeq ($(SHARED_LIBS),y)
 LDLIBS += -ldl
-- 
1.8.3.1

[RFC PATCH v3 04/12] skbuff: replace sock_zerocopy_put() with skb_zcopy_put()

2020-12-30 Thread Jonathan Lemon

From: Jonathan Lemon 

Replace sock_zerocopy_put with the generic skb_zcopy_put()
function.  Pass 'true' as the success argument, as this
is identical to no change.

Signed-off-by: Jonathan Lemon 
---
 include/linux/skbuff.h | 7 ++-
 net/core/skbuff.c  | 9 +
 net/ipv4/tcp.c | 2 +-
 3 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 52e96c35f5af..a6c86839035b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -496,7 +496,6 @@ static inline void sock_zerocopy_get(struct ubuf_info *uarg)
refcount_inc(&uarg->refcnt);
 }
 
-void sock_zerocopy_put(struct ubuf_info *uarg);
 void sock_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref);
 
 void sock_zerocopy_callback(struct ubuf_info *uarg, bool success);
@@ -1471,6 +1470,12 @@ static inline void *skb_zcopy_get_nouarg(struct sk_buff 
*skb)
return (void *)((uintptr_t) skb_shinfo(skb)->destructor_arg & ~0x1UL);
 }
 
+static inline void skb_zcopy_put(struct ubuf_info *uarg)
+{
+   if (uarg)
+   uarg->callback(uarg, true);
+}
+
 /* Release a reference on a zerocopy structure */
 static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy)
 {
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 8c18940723ff..0e028825367a 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1251,13 +1251,6 @@ void sock_zerocopy_callback(struct ubuf_info *uarg, bool 
success)
 }
 EXPORT_SYMBOL_GPL(sock_zerocopy_callback);
 
-void sock_zerocopy_put(struct ubuf_info *uarg)
-{
-   if (uarg)
-   uarg->callback(uarg, uarg->zerocopy);
-}
-EXPORT_SYMBOL_GPL(sock_zerocopy_put);
-
 void sock_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref)
 {
if (uarg) {
@@ -1267,7 +1260,7 @@ void sock_zerocopy_put_abort(struct ubuf_info *uarg, bool 
have_uref)
uarg->len--;
 
if (have_uref)
-   sock_zerocopy_put(uarg);
+   skb_zcopy_put(uarg);
}
 }
 EXPORT_SYMBOL_GPL(sock_zerocopy_put_abort);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index ed42d2193c5c..298a1fae841c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1429,7 +1429,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr 
*msg, size_t size)
tcp_push(sk, flags, mss_now, tp->nonagle, size_goal);
}
 out_nopush:
-   sock_zerocopy_put(uarg);
+   skb_zcopy_put(uarg);
return copied + copied_syn;
 
 do_error:
-- 
2.24.1

[RFC PATCH v3 00/12] Generic zcopy_* functions

2020-12-30 Thread Jonathan Lemon

From: Jonathan Lemon 

This is set of cleanup patches for zerocopy which are intended
to allow a introduction of a different zerocopy implementation.

The top level API will use the skb_zcopy_*() functions, while
the current TCP specific zerocopy ends up using msg_zerocopy_*()
calls.

There should be no functional changes from these patches.

v2->v3:
 Rename zc_flags to 'flags'.  Use SKBFL_xxx naming, similar
 to the SKBTX_xx naming.  Leave zerocopy_success naming alone.
 Reorder patches.

v1->v2:
 Break changes to skb_zcopy_put into 3 patches, in order to
 make it easier to follow the changes.  Add Willem's suggestion
 about renaming sock_zerocopy_

Patch 1: remove dead function
Patch 2: simplify sock_zerocopy_put
Patch 3: push status/refcounts into sock_zerocopy_callback
Patch 4: replace sock_zerocopy_put with skb_zcopy_put
Patch 5: rename sock_zerocopy_get
Patch 6:
  Add an optional skb parameter to callback, allowing access to
  the attached skb from the callback.
Patch 7:
  Add skb_zcopy_put_abort, and move zerocopy logic into the
  callback function.  There unfortunately is still a check
  against the callback type here.
Patch 8: Relocate skb_zcopy_clear() in skb_release_data()
Patch 9: rename sock_zerocopy_ to msg_zerocopy_
Patch 10:
  Move zerocopy bits from tx_flags into flags for clarity.
  These bits will be used in the RX path in the future.
Patch 11:
  Set the skb flags from the ubuf being attached, instead
  of a fixed value, allowing different initialization types.
Patch 12: Replace open-coded assignments with skb_zcopy_init()

Jonathan Lemon (12):
  skbuff: remove unused skb_zcopy_abort function
  skbuff: simplify sock_zerocopy_put
  skbuff: Push status and refcounts into sock_zerocopy_callback
  skbuff: replace sock_zerocopy_put() with skb_zcopy_put()
  skbuff: replace sock_zerocopy_get with skb_zcopy_get
  skbuff: Add skb parameter to the ubuf zerocopy callback
  skbuff: Call sock_zerocopy_put_abort from skb_zcopy_put_abort
  skbuff: Call skb_zcopy_clear() before unref'ing fragments
  skbuff: rename sock_zerocopy_* to msg_zerocopy_*
  net: group skb_shinfo zerocopy related bits together.
  skbuff: add flags to ubuf_info for ubuf setup
  tap/tun: add skb_zcopy_init() helper for initialization.

 drivers/net/tap.c   |   6 +-
 drivers/net/tun.c   |   6 +-
 drivers/net/xen-netback/common.h|   3 +-
 drivers/net/xen-netback/interface.c |   4 +-
 drivers/net/xen-netback/netback.c   |   5 +-
 drivers/vhost/net.c |   4 +-
 include/linux/skbuff.h  | 106 +++-
 net/core/skbuff.c   |  65 -
 net/ipv4/ip_output.c|   5 +-
 net/ipv4/tcp.c  |   8 +--
 net/ipv6/ip6_output.c   |   5 +-
 net/kcm/kcmsock.c   |   4 +-
 12 files changed, 113 insertions(+), 108 deletions(-)

-- 
2.24.1

[RFC PATCH v3 12/12] tap/tun: add skb_zcopy_init() helper for initialization.

2020-12-30 Thread Jonathan Lemon

From: Jonathan Lemon 

Replace direct assignments with skb_zcopy_init() for zerocopy
cases where a new skb is initialized, without changing the
reference counts.

Signed-off-by: Jonathan Lemon 
---
 drivers/net/tap.c  | 3 +--
 drivers/net/tun.c  | 3 +--
 drivers/vhost/net.c| 1 +
 include/linux/skbuff.h | 9 +++--
 4 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index f7a19d9b7c27..3c652c8ac5ba 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -722,8 +722,7 @@ static ssize_t tap_get_user(struct tap_queue *q, void 
*msg_control,
tap = rcu_dereference(q->tap);
/* copy skb_ubuf_info for callback when skb has no error */
if (zerocopy) {
-   skb_shinfo(skb)->destructor_arg = msg_control;
-   skb_shinfo(skb)->flags |= SKBFL_ZEROCOPY_FRAG;
+   skb_zcopy_init(skb, msg_control);
} else if (msg_control) {
struct ubuf_info *uarg = msg_control;
uarg->callback(NULL, uarg, false);
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index dd9edbd72ae8..7414e0584729 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1814,8 +1814,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, 
struct tun_file *tfile,
 
/* copy skb_ubuf_info for callback when skb has no error */
if (zerocopy) {
-   skb_shinfo(skb)->destructor_arg = msg_control;
-   skb_shinfo(skb)->flags |= SKBFL_ZEROCOPY_FRAG;
+   skb_zcopy_init(skb, msg_control);
} else if (msg_control) {
struct ubuf_info *uarg = msg_control;
uarg->callback(NULL, uarg, false);
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index bf28d0b75c1b..5c722c4179a9 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -904,6 +904,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, 
struct socket *sock)
ubuf->callback = vhost_zerocopy_callback;
ubuf->ctx = nvq->ubufs;
ubuf->desc = nvq->upend_idx;
+   ubuf->flags = SKBFL_ZEROCOPY_FRAG;
refcount_set(&ubuf->refcnt, 1);
msg.msg_control = &ctl;
ctl.type = TUN_MSG_UBUF;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 58010df9d183..3ec8b83aca3e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1448,6 +1448,12 @@ static inline void skb_zcopy_get(struct ubuf_info *uarg)
refcount_inc(&uarg->refcnt);
 }
 
+static inline void skb_zcopy_init(struct sk_buff *skb, struct ubuf_info *uarg)
+{
+   skb_shinfo(skb)->destructor_arg = uarg;
+   skb_shinfo(skb)->flags |= uarg->flags;
+}
+
 static inline void skb_zcopy_set(struct sk_buff *skb, struct ubuf_info *uarg,
 bool *have_ref)
 {
@@ -1456,8 +1462,7 @@ static inline void skb_zcopy_set(struct sk_buff *skb, 
struct ubuf_info *uarg,
*have_ref = false;
else
skb_zcopy_get(uarg);
-   skb_shinfo(skb)->destructor_arg = uarg;
-   skb_shinfo(skb)->flags |= uarg->flags;
+   skb_zcopy_init(skb, uarg);
}
 }
 
-- 
2.24.1

[RFC PATCH v3 11/12] skbuff: add flags to ubuf_info for ubuf setup

2020-12-30 Thread Jonathan Lemon

From: Jonathan Lemon 

Currently, when an ubuf is attached to a new skb, the shared
flags word is initialized to a fixed value.  Instead of doing
this, set the default flags in the ubuf, and have new skbs
inherit from this default.

This is needed when setting up different zerocopy types.

Signed-off-by: Jonathan Lemon 
---
 include/linux/skbuff.h | 3 ++-
 net/core/skbuff.c  | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 66fde9a7b851..58010df9d183 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -480,6 +480,7 @@ struct ubuf_info {
};
};
refcount_t refcnt;
+   u8 flags;
 
struct mmpin {
struct user_struct *user;
@@ -1456,7 +1457,7 @@ static inline void skb_zcopy_set(struct sk_buff *skb, 
struct ubuf_info *uarg,
else
skb_zcopy_get(uarg);
skb_shinfo(skb)->destructor_arg = uarg;
-   skb_shinfo(skb)->flags |= SKBFL_ZEROCOPY_FRAG;
+   skb_shinfo(skb)->flags |= uarg->flags;
}
 }
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 124e4752afb6..ee288af095f0 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1119,6 +1119,7 @@ struct ubuf_info *msg_zerocopy_alloc(struct sock *sk, 
size_t size)
uarg->len = 1;
uarg->bytelen = size;
uarg->zerocopy = 1;
+   uarg->flags = SKBFL_ZEROCOPY_FRAG;
refcount_set(&uarg->refcnt, 1);
sock_hold(sk);
 
-- 
2.24.1

[RFC PATCH v3 09/12] skbuff: rename sock_zerocopy_* to msg_zerocopy_*

2020-12-30 Thread Jonathan Lemon

From: Jonathan Lemon 

At Willem's suggestion, rename the sock_zerocopy_* functions
so that they match the MSG_ZEROCOPY flag, which makes it clear
they are specific to this zerocopy implementation.

Signed-off-by: Jonathan Lemon 
Acked-by: Willem de Bruijn 
---
 include/linux/skbuff.h | 16 
 net/core/skbuff.c  | 28 ++--
 net/ipv4/ip_output.c   |  2 +-
 net/ipv4/tcp.c |  2 +-
 net/ipv6/ip6_output.c  |  2 +-
 5 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 9f7393167f0a..3c34c75ab004 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -488,13 +488,13 @@ struct ubuf_info {
 int mm_account_pinned_pages(struct mmpin *mmp, size_t size);
 void mm_unaccount_pinned_pages(struct mmpin *mmp);
 
-struct ubuf_info *sock_zerocopy_alloc(struct sock *sk, size_t size);
-struct ubuf_info *sock_zerocopy_realloc(struct sock *sk, size_t size,
-   struct ubuf_info *uarg);
+struct ubuf_info *msg_zerocopy_alloc(struct sock *sk, size_t size);
+struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size,
+  struct ubuf_info *uarg);
 
-void sock_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref);
+void msg_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref);
 
-void sock_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *uarg,
+void msg_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *uarg,
bool success);
 
 int skb_zerocopy_iter_dgram(struct sk_buff *skb, struct msghdr *msg, int len);
@@ -1481,8 +1481,8 @@ static inline void skb_zcopy_put(struct ubuf_info *uarg)
 static inline void skb_zcopy_put_abort(struct ubuf_info *uarg, bool have_uref)
 {
if (uarg) {
-   if (uarg->callback == sock_zerocopy_callback)
-   sock_zerocopy_put_abort(uarg, have_uref);
+   if (uarg->callback == msg_zerocopy_callback)
+   msg_zerocopy_put_abort(uarg, have_uref);
else if (have_uref)
skb_zcopy_put(uarg);
}
@@ -2776,7 +2776,7 @@ static inline int skb_orphan_frags(struct sk_buff *skb, 
gfp_t gfp_mask)
if (likely(!skb_zcopy(skb)))
return 0;
if (!skb_zcopy_is_nouarg(skb) &&
-   skb_uarg(skb)->callback == sock_zerocopy_callback)
+   skb_uarg(skb)->callback == msg_zerocopy_callback)
return 0;
return skb_copy_ubufs(skb, gfp_mask);
 }
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6d031ed99182..d520168accf9 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1094,7 +1094,7 @@ void mm_unaccount_pinned_pages(struct mmpin *mmp)
 }
 EXPORT_SYMBOL_GPL(mm_unaccount_pinned_pages);
 
-struct ubuf_info *sock_zerocopy_alloc(struct sock *sk, size_t size)
+struct ubuf_info *msg_zerocopy_alloc(struct sock *sk, size_t size)
 {
struct ubuf_info *uarg;
struct sk_buff *skb;
@@ -1114,7 +1114,7 @@ struct ubuf_info *sock_zerocopy_alloc(struct sock *sk, 
size_t size)
return NULL;
}
 
-   uarg->callback = sock_zerocopy_callback;
+   uarg->callback = msg_zerocopy_callback;
uarg->id = ((u32)atomic_inc_return(&sk->sk_zckey)) - 1;
uarg->len = 1;
uarg->bytelen = size;
@@ -1124,15 +1124,15 @@ struct ubuf_info *sock_zerocopy_alloc(struct sock *sk, 
size_t size)
 
return uarg;
 }
-EXPORT_SYMBOL_GPL(sock_zerocopy_alloc);
+EXPORT_SYMBOL_GPL(msg_zerocopy_alloc);
 
 static inline struct sk_buff *skb_from_uarg(struct ubuf_info *uarg)
 {
return container_of((void *)uarg, struct sk_buff, cb);
 }
 
-struct ubuf_info *sock_zerocopy_realloc(struct sock *sk, size_t size,
-   struct ubuf_info *uarg)
+struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size,
+  struct ubuf_info *uarg)
 {
if (uarg) {
const u32 byte_limit = 1 << 19; /* limit to a few TSO */
@@ -1171,9 +1171,9 @@ struct ubuf_info *sock_zerocopy_realloc(struct sock *sk, 
size_t size,
}
 
 new_alloc:
-   return sock_zerocopy_alloc(sk, size);
+   return msg_zerocopy_alloc(sk, size);
 }
-EXPORT_SYMBOL_GPL(sock_zerocopy_realloc);
+EXPORT_SYMBOL_GPL(msg_zerocopy_realloc);
 
 static bool skb_zerocopy_notify_extend(struct sk_buff *skb, u32 lo, u16 len)
 {
@@ -1195,7 +1195,7 @@ static bool skb_zerocopy_notify_extend(struct sk_buff 
*skb, u32 lo, u16 len)
return true;
 }
 
-static void __sock_zerocopy_callback(struct ubuf_info *uarg)
+static void __msg_zerocopy_callback(struct ubuf_info *uarg)
 {
struct sk_buff *tail, *skb = skb_from_uarg(uarg);
struct sock_exterr_skb *serr;
@@ -1243,17 +1243,17 @@ static void __sock_zerocopy_callback(struct ubuf_info 
*uarg)
sock_put(sk);
 }
 
-void sock_zerocopy_callback(struct sk_buff *skb, st

[RFC PATCH v3 08/12] skbuff: Call skb_zcopy_clear() before unref'ing fragments

2020-12-30 Thread Jonathan Lemon

From: Jonathan Lemon 

RX zerocopy fragment pages which are not allocated from the
system page pool require special handling.  Give the callback
in skb_zcopy_clear() a chance to process them first.

Signed-off-by: Jonathan Lemon 
---
 net/core/skbuff.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 5b9cd528d6a6..6d031ed99182 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -605,13 +605,14 @@ static void skb_release_data(struct sk_buff *skb)
  &shinfo->dataref))
return;
 
+   skb_zcopy_clear(skb, true);
+
for (i = 0; i < shinfo->nr_frags; i++)
__skb_frag_unref(&shinfo->frags[i]);
 
if (shinfo->frag_list)
kfree_skb_list(shinfo->frag_list);
 
-   skb_zcopy_clear(skb, true);
skb_free_head(skb);
 }
 
-- 
2.24.1

[RFC PATCH v3 02/12] skbuff: simplify sock_zerocopy_put

2020-12-30 Thread Jonathan Lemon

From: Jonathan Lemon 

All 'struct ubuf_info' users should have a callback defined
as of commit 0a4a060bb204 ("sock: fix zerocopy_success regression
with msg_zerocopy").

Remove the dead code path to consume_skb(), which makes
assumptions about how the structure was allocated.

Signed-off-by: Jonathan Lemon 
Acked-by: Willem de Bruijn 
---
 net/core/skbuff.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index f62cae3f75d8..d88963f47f7d 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1245,12 +1245,8 @@ EXPORT_SYMBOL_GPL(sock_zerocopy_callback);
 
 void sock_zerocopy_put(struct ubuf_info *uarg)
 {
-   if (uarg && refcount_dec_and_test(&uarg->refcnt)) {
-   if (uarg->callback)
-   uarg->callback(uarg, uarg->zerocopy);
-   else
-   consume_skb(skb_from_uarg(uarg));
-   }
+   if (uarg && refcount_dec_and_test(&uarg->refcnt))
+   uarg->callback(uarg, uarg->zerocopy);
 }
 EXPORT_SYMBOL_GPL(sock_zerocopy_put);
 
-- 
2.24.1

Re: [PATCH 1/4] net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips

2020-12-30 Thread Russell King - ARM Linux admin

On Wed, Dec 30, 2020 at 06:31:52PM +0100, Pali Rohár wrote:
> On Wednesday 30 December 2020 17:05:46 Russell King - ARM Linux admin wrote:
> > On Wed, Dec 30, 2020 at 05:56:34PM +0100, Pali Rohár wrote:
> > > This change is really required for those Realtek chips. I thought that
> > > it is obvious that from *both* addresses 0x50 and 0x51 can be read only
> > > one byte at the same time. Reading 2 bytes (for be16 value) cannot be
> > > really done by one i2 transfer, it must be done in two.
> > 
> > Then these modules are even more broken than first throught, and
> > quite simply it is pointless supporting the diagnostics on them
> > because we can never read the values in an atomic way.
> 
> They are broken in a way that neither holy water help them...
> 
> But from diagnostic 0x51 address we can read at least 8bit registers in
> atomic way :-)

... which doesn't fit the requirements.

> > It's also a violation of the SFF-8472 that _requires_ multi-byte reads
> > to read these 16 byte values atomically. Reading them with individual
> > byte reads results in a non-atomic read, and the 16-bit value can not
> > be trusted to be correct.
> > 
> > That is really not optional, no matter what any manufacturer says - if
> > they claim the SFP MSAs allows it, they're quite simply talking out of
> > a donkey's backside and you should dispose of the module in biohazard
> > packaging. :)
> > 
> > So no, I hadn't understood this from your emails, and as I say above,
> > if this is the case, then we quite simply disable diagnostics on these
> > modules since they are _highly_ noncompliant.
> 
> We have just two options:
> 
> Disable 2 (and more) bytes reads from 0x51 address and therefore disable
> sfp_hwmon_read_sensor() function.
> 
> Or allow 2 bytes non-atomic reads and allow at least semi-correct values
> for hwmon. I guess that upper 8bits would not change between two single
> byte i2c transfers too much (when they are done immediately one by one).

So when you read the temperature, and the MSB reads as the next higher
value than the LSB, causing an error of 256, or vice versa causing an
error of -256, which when scaled according to the factors causes a big
error, that's acceptable.

No, it isn't. If the data can't be read reliably, the data is useless.

Consider a system that implements userspace monitoring for modules and
checks the current values against pre-set thresholds - it suddenly gets
a value that is outside of its alarm threshold due to this. It raises a
false alarm. This is not good.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 1/4] net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips

2020-12-30 Thread Andrew Lunn

> Ok!
> 
> So should we completely skip hwmon_device_register_with_info() call
> if (i2c_block_size < 2) ?

Yep. That would be a nice simple test.

But does ethtool -m still export the second page? That is also
unreliable.

Andrew

general protection fault in dst_dev_put (6)

2020-12-30 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:40f78232 Merge tag 'pci-v5.11-fixes-1' of git://git.kernel..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1731027750
kernel config:  https://syzkaller.appspot.com/x/.config?x=9798d6898fec8a72
dashboard link: https://syzkaller.appspot.com/bug?extid=0f1d7ae54b1e86cb1b67
compiler:   gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+0f1d7ae54b1e86cb1...@syzkaller.appspotmail.com

general protection fault, probably for non-canonical address 
0xdc01:  [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0008-0x000f]
CPU: 0 PID: 8482 Comm: syz-fuzzer Not tainted 5.10.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
RIP: 0010:dst_dev_put+0x1d/0x220 net/core/dst.c:156
Code: a0 b1 fa e9 53 ff ff ff cc cc cc cc cc 41 54 55 53 48 89 fb e8 64 b2 6e 
fa 48 89 da 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 8b 01 
00 00 48 8d 7b 3a 4c 8b 23 48 b8 00 00 00
RSP: 0018:c9007db8 EFLAGS: 00010203
RAX: dc00 RBX: 000f RCX: 0100
RDX: 0001 RSI: 87049d6c RDI: 000f
RBP:  R08:  R09: 
R10: 87926f33 R11: 0001 R12: 0007
R13: fbfff1af7aec R14: 607f4607e398 R15: 000f
FS:  00c0002a0e90() GS:8880b9c0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 00c000b37000 CR3: 19e32000 CR4: 001526f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 
 rt_fibinfo_free_cpus.part.0+0x118/0x180 net/ipv4/fib_semantics.c:202
 rt_fibinfo_free_cpus net/ipv4/fib_semantics.c:194 [inline]
 fib_nh_common_release+0xfb/0x300 net/ipv4/fib_semantics.c:215
 fib6_info_destroy_rcu+0x187/0x210 net/ipv6/ip6_fib.c:174
 rcu_do_batch kernel/rcu/tree.c:2489 [inline]
 rcu_core+0x75d/0xf80 kernel/rcu/tree.c:2723
 __do_softirq+0x2bc/0xa77 kernel/softirq.c:343
 asm_call_irq_on_stack+0xf/0x20
 
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
 do_softirq_own_stack+0xaa/0xd0 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:226 [inline]
 __irq_exit_rcu+0x17f/0x200 kernel/softirq.c:420
 irq_exit_rcu+0x5/0x20 kernel/softirq.c:432
 sysvec_apic_timer_interrupt+0x4d/0x100 arch/x86/kernel/apic/apic.c:1096
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:628
RIP: 0010:stack_trace_consume_entry+0xd5/0x160 kernel/stacktrace.c:92
Code: 0f 85 92 00 00 00 8d 45 01 89 43 10 48 8b 03 48 8d 2c e8 48 b8 00 00 00 
00 00 fc ff df 48 89 ea 48 c1 ea 03 80 3c 02 00 75 5c <48> 89 75 00 8b 43 08 39 
43 10 0f 92 c0 48 83 c4 08 5b 5d c3 83 e8
RSP: 0018:c900011ef5d8 EFLAGS: 0246
RAX: dc00 RBX: c900011ef6b0 RCX: 
RDX: 19200023deeb RSI: 86f8c6ef RDI: c900011ef6bc
RBP: c900011ef758 R08: 8e6efcd6 R09: 0001
R10: f5200023deca R11: 0001 R12: c900011ef6b0
R13:  R14: 88802805c380 R15: 0280
 arch_stack_walk+0x6d/0xe0 arch/x86/kernel/stacktrace.c:27
 stack_trace_save+0x8c/0xc0 kernel/stacktrace.c:121
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:46 [inline]
 set_alloc_info mm/kasan/common.c:401 [inline]
 kasan_kmalloc.constprop.0+0x7f/0xa0 mm/kasan/common.c:429
 __kmalloc_reserve net/core/skbuff.c:142 [inline]
 __alloc_skb+0xae/0x5c0 net/core/skbuff.c:210
 alloc_skb_fclone include/linux/skbuff.h:1149 [inline]
 sk_stream_alloc_skb+0x109/0xc30 net/ipv4/tcp.c:888
 tcp_sendmsg_locked+0xc00/0x2da0 net/ipv4/tcp.c:1310
 tcp_sendmsg+0x2b/0x40 net/ipv4/tcp.c:1459
 inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:817
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 sock_write_iter+0x289/0x3c0 net/socket.c:999
 call_write_iter include/linux/fs.h:1901 [inline]
 new_sync_write+0x426/0x650 fs/read_write.c:518
 vfs_write+0x7dd/0xa80 fs/read_write.c:605
 ksys_write+0x1ee/0x250 fs/read_write.c:658
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x4b117b
Code: ff e9 69 ff ff ff cc cc cc cc cc cc cc cc cc e8 9b c2 f8 ff 48 8b 7c 24 
10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 
48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
RSP: 002b:00c000575480 EFLAGS: 0206 ORIG_RAX: 0001
RAX: ffda RBX: 00c20800 RCX: 004b117b
RDX: 00f0 RSI: 00c0a200 RDI: 0006
RBP: 00c00

KASAN: use-after-free Read in vlan_dev_real_dev

2020-12-30 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:1f45dc22 ibmvnic: continue fatal error reset after passive..
git tree:   net
console output: https://syzkaller.appspot.com/x/log.txt?x=10dbe3ff50
kernel config:  https://syzkaller.appspot.com/x/.config?x=9c6feb14e07e1220
dashboard link: https://syzkaller.appspot.com/bug?extid=944af49d74bdd9175f4b
compiler:   gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+944af49d74bdd9175...@syzkaller.appspotmail.com

==
BUG: KASAN: use-after-free in is_vlan_dev include/linux/if_vlan.h:74 [inline]
BUG: KASAN: use-after-free in vlan_dev_real_dev+0xf9/0x120 
net/8021q/vlan_core.c:105
Read of size 4 at addr 8880263c222c by task kworker/u4:0/8

CPU: 1 PID: 8 Comm: kworker/u4:0 Not tainted 5.10.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Workqueue: gid-cache-wq netdevice_event_work_handler
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 print_address_description.constprop.0.cold+0xae/0x4c8 mm/kasan/report.c:385
 __kasan_report mm/kasan/report.c:545 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562
 is_vlan_dev include/linux/if_vlan.h:74 [inline]
 vlan_dev_real_dev+0xf9/0x120 net/8021q/vlan_core.c:105
 rdma_vlan_dev_real_dev include/rdma/ib_addr.h:267 [inline]
 is_eth_port_of_netdev_filter.part.0+0xe7/0x300 
drivers/infiniband/core/roce_gid_mgmt.c:157
 is_eth_port_of_netdev_filter+0x28/0x40 
drivers/infiniband/core/roce_gid_mgmt.c:153
 ib_enum_roce_netdev+0x18f/0x2b0 drivers/infiniband/core/device.c:2287
 ib_enum_all_roce_netdevs+0xbd/0x130 drivers/infiniband/core/device.c:2316
 netdevice_event_work_handler+0x9c/0x1b0 
drivers/infiniband/core/roce_gid_mgmt.c:626
 process_one_work+0x98d/0x1630 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

Allocated by task 19697:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
 kasan_set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:461
 kmalloc_node include/linux/slab.h:575 [inline]
 kvmalloc_node+0x61/0xf0 mm/util.c:575
 kvmalloc include/linux/mm.h:773 [inline]
 kvzalloc include/linux/mm.h:781 [inline]
 alloc_netdev_mqs+0x97/0xea0 net/core/dev.c:10540
 rtnl_create_link+0x219/0xad0 net/core/rtnetlink.c:3171
 __rtnl_newlink+0xf9b/0x1750 net/core/rtnetlink.c:3433
 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3502
 rtnetlink_rcv_msg+0x498/0xb80 net/core/rtnetlink.c:5564
 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2494
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x907/0xe40 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:672
 sys_sendmsg+0x6e8/0x810 net/socket.c:2345
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2399
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2432
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 19689:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
 kasan_set_track+0x1c/0x30 mm/kasan/common.c:56
 kasan_set_free_info+0x1b/0x30 mm/kasan/generic.c:352
 __kasan_slab_free+0x102/0x140 mm/kasan/common.c:422
 slab_free_hook mm/slub.c:1544 [inline]
 slab_free_freelist_hook+0x5d/0x150 mm/slub.c:1577
 slab_free mm/slub.c:3140 [inline]
 kfree+0xdb/0x3c0 mm/slub.c:4122
 kvfree+0x42/0x50 mm/util.c:604
 device_release+0x9f/0x240 drivers/base/core.c:1962
 kobject_cleanup lib/kobject.c:705 [inline]
 kobject_release lib/kobject.c:736 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put+0x1c8/0x540 lib/kobject.c:753
 put_device+0x1b/0x30 drivers/base/core.c:3190
 free_netdev+0x3d7/0x500 net/core/dev.c:10660
 ppp_destroy_interface+0x2ab/0x340 drivers/net/ppp/ppp_generic.c:3358
 ppp_release+0x1bf/0x240 drivers/net/ppp/ppp_generic.c:410
 __fput+0x283/0x920 fs/file_table.c:280
 task_work_run+0xdd/0x190 kernel/task_work.c:140
 tracehook_notify_resume include/linux/tracehook.h:189 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:174 [inline]
 exit_to_user_mode_prepare+0x1f0/0x200 kernel/entry/common.c:201
 __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
 syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:302
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The buggy address belongs to the object at 8880263c2000
 which belongs to the cache kmalloc-4k of size 4096
The buggy address is located 556 bytes inside of
 4096-byte region [8880263c2000, 8880263c3000)
The buggy address belongs to the page:
page:52fa6739 refcount:1 mapcount:0 mapping:0

Re: [PATCH 1/4] net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips

2020-12-30 Thread Russell King - ARM Linux admin

On Wed, Dec 30, 2020 at 08:30:52PM +0100, Andrew Lunn wrote:
> > Ok!
> > 
> > So should we completely skip hwmon_device_register_with_info() call
> > if (i2c_block_size < 2) ?
> 
> Yep. That would be a nice simple test.
> 
> But does ethtool -m still export the second page? That is also
> unreliable.

It will do, but we need to check how ethtool decides to request/dump
that information - we probably need to make sfp_module_info()
report that it's a ETH_MODULE_SFF_8079 style EEPROM, not the
ETH_MODULE_SFF_8472 style.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Re: [PATCH 1/4] net: sfp: add workaround for Realtek RTL8672 and RTL9601C chips

2020-12-30 Thread Andrew Lunn

> Which shows why it's pointless producing an EEPROM validation tool that
> runs under Linux (as has been your suggestion). They won't use it,
> since their testing only goes as far as "does it work in our product?"

Yes, i would need SNIA to push for conformance testing, hold
plug-testing events, and marketing that only devices which pass the
conformance test are allowed to use the SNIA logo, etc.

They do seem to have conformance testing for some of their storage
standards, but nothing for SFPs.

   Andrew

Re: [PATCH 2/3] vhost/vsock: support for SOCK_SEQPACKET socket.

2020-12-30 Thread Michael S. Tsirkin

On Tue, Dec 29, 2020 at 02:06:33PM +0300, Arseny Krasnov wrote:
>   This patch simply adds transport ops and removes
> ignore of non-stream type of packets.
> 
> Signed-off-by: Arseny Krasnov 

How is this supposed to work? virtio vsock at the moment
has byte level end to end credit accounting at the
protocol level. I suspect some protocol changes involving
more than this tweak would
be needed to properly support anything that isn't a stream.

> ---
>  drivers/vhost/vsock.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> index a483cec31d5c..4a36ef1c52d0 100644
> --- a/drivers/vhost/vsock.c
> +++ b/drivers/vhost/vsock.c
> @@ -346,8 +346,7 @@ vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq,
>   return NULL;
>   }
>  
> - if (le16_to_cpu(pkt->hdr.type) == VIRTIO_VSOCK_TYPE_STREAM)
> - pkt->len = le32_to_cpu(pkt->hdr.len);
> + pkt->len = le32_to_cpu(pkt->hdr.len);
>  
>   /* No payload */
>   if (!pkt->len)
> @@ -416,6 +415,9 @@ static struct virtio_transport vhost_transport = {
>   .stream_is_active = virtio_transport_stream_is_active,
>   .stream_allow = virtio_transport_stream_allow,
>  
> + .seqpacket_seq_send_len   = 
> virtio_transport_seqpacket_seq_send_len,
> + .seqpacket_seq_get_len= 
> virtio_transport_seqpacket_seq_get_len,
> +
>   .notify_poll_in   = virtio_transport_notify_poll_in,
>   .notify_poll_out  = virtio_transport_notify_poll_out,
>   .notify_recv_init = virtio_transport_notify_recv_init,
> -- 
> 2.25.1

1 2 >

1 - 100 of 122 matches

Mail list logo