date:20201011

Re: [PATCH net-next 1/2] can-isotp: implement cleanups / improvements from review

2020-10-11 Thread Marc Kleine-Budde

On 10/10/20 10:49 PM, Oliver Hartkopp wrote:
> As pointed out by Jakub Kicinski here:
> https://marc.info/?l=linux-can&m=160229286216008
> this patch addresses the remarked issues:
> 
> - remove empty lines in comment
> - remove default=y for CAN_ISOTP in Kconfig
> - make use of pr_notice_once()
> - use GFP_KERNEL instead of gfp_any() in soft hrtimer context
> 
> The version strings in the CAN subsystem are removed by a separate patch.
> 
> Signed-off-by: Oliver Hartkopp 
> ---
>  include/uapi/linux/can/isotp.h |  4 +---
>  net/can/Kconfig|  3 ++-
>  net/can/isotp.c| 14 +++---
>  3 files changed, 10 insertions(+), 11 deletions(-)
> 
> diff --git a/include/uapi/linux/can/isotp.h b/include/uapi/linux/can/isotp.h
> index 553006509f4e..accf0efa46f4 100644
> --- a/include/uapi/linux/can/isotp.h
> +++ b/include/uapi/linux/can/isotp.h
> @@ -151,8 +151,7 @@ struct can_isotp_ll_options {
>  #define CAN_ISOTP_DEFAULT_LL_TX_DL   CAN_MAX_DLEN
>  #define CAN_ISOTP_DEFAULT_LL_TX_FLAGS0
>  
> -/*
> - * Remark on CAN_ISOTP_DEFAULT_RECV_* values:
> +/* Remark on CAN_ISOTP_DEFAULT_RECV_* values:

I think for uapi headers we use the default commenting style, not the netdev 
one.

>   *
>   * We can strongly assume, that the Linux Kernel implementation of
>   * CAN_ISOTP is capable to run with BS=0, STmin=0 and WFTmax=0.
> @@ -160,7 +159,6 @@ struct can_isotp_ll_options {
>   * these default settings can be changed via sockopts.
>   * For that reason the STmin value is intentionally _not_ checked for
>   * consistency and copied directly into the flow control (FC) frame.
> - *
>   */
>  
>  #endif /* !_UAPI_CAN_ISOTP_H */
> diff --git a/net/can/Kconfig b/net/can/Kconfig
> index 021fe03a8ed6..224e5e0283a9 100644
> --- a/net/can/Kconfig
> +++ b/net/can/Kconfig
> @@ -57,7 +57,6 @@ source "net/can/j1939/Kconfig"
>  
>  config CAN_ISOTP
>   tristate "ISO 15765-2:2016 CAN transport protocol"
> - default y
>   help
> CAN Transport Protocols offer support for segmented Point-to-Point
> communication between CAN nodes via two defined CAN Identifiers.
> @@ -67,6 +66,8 @@ config CAN_ISOTP
> vehicle diagnosis (UDS, ISO 14229) or IP-over-CAN traffic.
> This protocol driver implements data transfers according to
> ISO 15765-2:2016 for 'classic' CAN and CAN FD frame types.
> +   If you want to perform automotive vehicle diagnostic services (UDS),
> +   say 'y'.
>  
>  source "drivers/net/can/Kconfig"
>  
> diff --git a/net/can/isotp.c b/net/can/isotp.c
> index e6ff032b5426..bc3a722c200b 100644
> --- a/net/can/isotp.c
> +++ b/net/can/isotp.c
> @@ -222,8 +222,8 @@ static int isotp_send_fc(struct sock *sk, int ae, u8 
> flowstatus)
>  
>   can_send_ret = can_send(nskb, 1);
>   if (can_send_ret)
> - printk_once(KERN_NOTICE "can-isotp: %s: can_send_ret %d\n",
> - __func__, can_send_ret);
> + pr_notice_once("can-isotp: %s: can_send_ret %d\n",
> +__func__, can_send_ret);

please define a pr_fmt for the "can-isotp: " prefix.

>  
>   dev_put(dev);
>  
> @@ -769,7 +769,7 @@ static enum hrtimer_restart isotp_tx_timer_handler(struct 
> hrtimer *hrtimer)
>  
>  isotp_tx_burst:
>   skb = alloc_skb(so->ll.mtu + sizeof(struct can_skb_priv),
> - gfp_any());
> + GFP_KERNEL);
>   if (!skb) {
>   dev_put(dev);
>   break;
> @@ -798,8 +798,8 @@ static enum hrtimer_restart isotp_tx_timer_handler(struct 
> hrtimer *hrtimer)
>  
>   can_send_ret = can_send(skb, 1);
>   if (can_send_ret)
> - printk_once(KERN_NOTICE "can-isotp: %s: can_send_ret 
> %d\n",
> - __func__, can_send_ret);
> + pr_notice_once("can-isotp: %s: can_send_ret %d\n",
> +__func__, can_send_ret);
>  
>   if (so->tx.idx >= so->tx.len) {
>   /* we are done */
> @@ -942,8 +942,8 @@ static int isotp_sendmsg(struct socket *sock, struct 
> msghdr *msg, size_t size)
>   err = can_send(skb, 1);
>   dev_put(dev);
>   if (err) {
> - printk_once(KERN_NOTICE "can-isotp: %s: can_send_ret %d\n",
> - __func__, err);
> + pr_notice_once("can-isotp: %s: can_send_ret %d\n",
> +__func__, err);
>   return err;
>   }
>  
> 

Marc

-- 
Pengutronix e.K. | Marc Kleine-Budde   |
Embedded Linux   | https://www.pengutronix.de  |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917- |



signature.asc
Description: OpenPGP digital signature

Re: pull-request: can 2020-10-08

2020-10-11 Thread Marc Kleine-Budde

On 10/10/20 7:32 PM, Jakub Kicinski wrote:
> On Thu,  8 Oct 2020 23:40:19 +0200 Marc Kleine-Budde wrote:
>> The first patch is part of my pull request 
>> "linux-can-fixes-for-5.9-20201006",
>> so consider that one obsolete and take this instead.
>>
>> The first patch is by Lucas Stach and fixes m_can driver by removing an
>> erroneous call to m_can_class_suspend() in runtime suspend. Which causes the
>> pinctrl state to get stuck on the "sleep" state, which breaks all CAN
>> functionality on SoCs where this state is defined.
>>
>> The last two patches target the j1939 protocol: Cong Wang fixes a syzbot
>> finding of an uninitialized variable in the j1939 transport protocol. I
>> contribute a patch, that fixes the initialization of a same uninitialized
>> variable in a different function.
> 
> Pulled, thanks!
> 
> Since we missed 5.9 would you like me to queue these up for stable?

Yes.

tnx,
Marc

-- 
Pengutronix e.K. | Marc Kleine-Budde   |
Embedded Linux   | https://www.pengutronix.de  |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917- |



signature.asc
Description: OpenPGP digital signature

Re: [PATCH net-next 2/2] can: remove obsolete version strings

2020-10-11 Thread Marc Kleine-Budde

On 10/10/20 10:49 PM, Oliver Hartkopp wrote:
> As pointed out by Jakub Kicinski here:
> https://marc.info/?l=linux-can&m=160229286216008

Better use https://lore.kernel.org for this. The URL is constructed by adding
"/r/$MESSAGE_ID":


http://lore.kernel.org/r/20201009175751.5c540...@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com

> this patch removes the obsolete version information of the different
> CAN protocols and the AF_CAN core module.
> 
> Signed-off-by: Oliver Hartkopp 
Marc

-- 
Pengutronix e.K. | Marc Kleine-Budde   |
Embedded Linux   | https://www.pengutronix.de  |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917- |



signature.asc
Description: OpenPGP digital signature

KASAN: use-after-free Read in sco_chan_del

2020-10-11 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:a804ab08 Add linux-next specific files for 20201006
git tree:   linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1073270b90
kernel config:  https://syzkaller.appspot.com/x/.config?x=26c1b4cc4a62ccb
dashboard link: https://syzkaller.appspot.com/bug?extid=1df6a63e69a359c8b517
compiler:   gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+1df6a63e69a359c8b...@syzkaller.appspotmail.com

==
BUG: KASAN: use-after-free in hci_conn_drop 
include/net/bluetooth/hci_core.h:1145 [inline]
BUG: KASAN: use-after-free in hci_conn_drop 
include/net/bluetooth/hci_core.h:1115 [inline]
BUG: KASAN: use-after-free in sco_chan_del+0x400/0x430 net/bluetooth/sco.c:149
Read of size 8 at addr 88804d29c918 by task syz-executor.2/27575

CPU: 0 PID: 27575 Comm: syz-executor.2 Not tainted 
5.9.0-rc8-next-20201006-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x198/0x1fb lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xae/0x497 mm/kasan/report.c:385
 __kasan_report mm/kasan/report.c:545 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562
 hci_conn_drop include/net/bluetooth/hci_core.h:1145 [inline]
 hci_conn_drop include/net/bluetooth/hci_core.h:1115 [inline]
 sco_chan_del+0x400/0x430 net/bluetooth/sco.c:149
 __sco_sock_close+0x16e/0x5b0 net/bluetooth/sco.c:434
 sco_sock_close net/bluetooth/sco.c:448 [inline]
 sco_sock_release+0x69/0x290 net/bluetooth/sco.c:1059
 __sock_release+0xcd/0x280 net/socket.c:596
 sock_close+0x18/0x20 net/socket.c:1277
 __fput+0x285/0x920 fs/file_table.c:281
 task_work_run+0xdd/0x190 kernel/task_work.c:141
 get_signal+0xd89/0x1f00 kernel/signal.c:2561
 arch_do_signal+0x82/0x2470 arch/x86/kernel/signal.c:811
 exit_to_user_mode_loop kernel/entry/common.c:161 [inline]
 exit_to_user_mode_prepare+0x194/0x1f0 kernel/entry/common.c:192
 syscall_exit_to_user_mode+0x7a/0x2c0 kernel/entry/common.c:267
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45de29
Code: 0d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 
db b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:7fabeda51c78 EFLAGS: 0246 ORIG_RAX: 002a
RAX: fffc RBX: 2200 RCX: 0045de29
RDX: 0008 RSI: 2080 RDI: 0006
RBP: 0118c158 R08:  R09: 
R10:  R11: 0246 R12: 0118c124
R13: 7ffdb3c9529f R14: 7fabeda529c0 R15: 0118c124

Allocated by task 27575:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
 kasan_set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:461
 kmem_cache_alloc_trace+0x1a0/0x480 mm/slab.c:3552
 kmalloc include/linux/slab.h:554 [inline]
 kzalloc include/linux/slab.h:666 [inline]
 hci_conn_add+0x53/0x1330 net/bluetooth/hci_conn.c:525
 hci_connect_sco+0x356/0x860 net/bluetooth/hci_conn.c:1283
 sco_connect net/bluetooth/sco.c:241 [inline]
 sco_sock_connect+0x308/0x980 net/bluetooth/sco.c:588
 __sys_connect_file+0x155/0x1a0 net/socket.c:1852
 __sys_connect+0x161/0x190 net/socket.c:1869
 __do_sys_connect net/socket.c:1879 [inline]
 __se_sys_connect net/socket.c:1876 [inline]
 __x64_sys_connect+0x6f/0xb0 net/socket.c:1876
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 26665:
 kasan_save_stack+0x1b/0x40 mm/kasan/common.c:48
 kasan_set_track+0x1c/0x30 mm/kasan/common.c:56
 kasan_set_free_info+0x1b/0x30 mm/kasan/generic.c:355
 __kasan_slab_free+0xd8/0x120 mm/kasan/common.c:422
 __cache_free mm/slab.c:3420 [inline]
 kfree+0x10e/0x2a0 mm/slab.c:3758
 device_release+0x9f/0x240 drivers/base/core.c:1808
 kobject_cleanup lib/kobject.c:705 [inline]
 kobject_release lib/kobject.c:736 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put+0x171/0x270 lib/kobject.c:753
 put_device+0x1b/0x30 drivers/base/core.c:3037
 hci_conn_del+0x27e/0x6a0 net/bluetooth/hci_conn.c:645
 hci_conn_hash_flush+0x189/0x220 net/bluetooth/hci_conn.c:1558
 hci_dev_do_close+0x5c6/0x1080 net/bluetooth/hci_core.c:1770
 hci_unregister_dev+0x214/0xe90 net/bluetooth/hci_core.c:3827
 vhci_release+0x70/0xe0 drivers/bluetooth/hci_vhci.c:340
 __fput+0x285/0x920 fs/file_table.c:281
 task_work_run+0xdd/0x190 kernel/task_work.c:141
 exit_task_work include/linux/task_work.h:25 [inline]
 do_exit+0xb23/0x2930 kernel/exit.c:806
 do_group_exit+0x125/0x310 kernel/exit.c:903
 get_signal+0x428/0x1f00 kernel/signal.c:2757
 arch_do_signal+0x82/0x2470 arch/x86/kernel/signal.c:811
 exit

Re: [PATCH nf-next v3 3/3] netfilter: Introduce egress hook

2020-10-11 Thread Lukas Wunner

On Tue, Sep 08, 2020 at 02:55:36PM +0200, Daniel Borkmann wrote:
> I would strongly prefer something where nf integrates into existing
> tc hook, not only due to the hook reuse which would be better,
> but also to allow for a more flexible interaction between tc/BPF
> use cases
[...]
> one option to move forward [...] overall rework of ingress/egress
> side to be a more flexible pipeline (think of cont/ok actions
> as with tc filters or stackable LSMs to process & delegate).

Interaction between netfilter and tc is facilitated by skb->mark.
Both netfilter and tc are able to set and match by way of the mark.
E.g. a netfilter hook may set the mark and tc may later perform an
action if a matching mark is found.

Because the placement of netfilter and tc hooks in the data path
has been unchanged for decades, we must assume that users depend
on their order for setting and matching the mark.

Thus, reworking the data path in the way you suggest (a flexible
pipeline) must not change the order of the hooks.  It would have
to be a fixed pipeline.  But what's the benefit then compared to
separate netfilter and tc hooks which are patched in at runtime
and become NOPs if not used?  (Which is what the present series is
aiming for.)

> to name one example... consider two different entities in the system
> setting up the two, that is, one adding rules for nf ingress/egress
> on the phys device for host fw and the other one for routing traffic
> into/from containers at the tc layer, then traffic going into host ns
> will hit nf ingress and on egress side the nf egress part; however,
> traffic going to containers via existing tc redirect will not see the
> nf ingress as expected but would on reverse path incorrectly
> hit the nf egress one which is /not/ the case for dev_queue_xmit() today.

Using tc to bounce ingress traffic into a container -- is that actually
a thing or is it a hypothetical example?  I think at least Docker uses
plain vanilla routing and bridging to move packets in and out of
containers.

However you're right that if tc *is* used to redirect ingress packets
to a container veth, then the data path would look like:

host tc -> container tc -> container nft

Whereas the egress data path would look like:

container nft -> container tc -> host nft -> host tc

But I argue that the egress data path is actually correct because the
host must be able to firewall packets coming out of the container
in case the container has been compromised.

> And if you check a typical DHCP client that is present on major
> modern distros like systemd-networkd's DHCP client then they
> already implement filtering of malicious packets via BPF at
> socket layer including checking for cookies in the DHCP header
> that are set by the application itself to prevent spoofing [0].
> 
> [0] 
> https://github.com/systemd/systemd/blob/master/src/libsystemd-network/dhcp-network.c#L28

That's an *ingress* filter so that user space only receives DHCP
packets and nothing else.

We're talking about the ability to filter *egress* DHCP packets
(among others) at the kernel level to guard against unwanted
packets coming from user space.

Thanks,

Lukas

Re: [PATCH net-next 1/2] dt-bindings: net: dsa: b53: Add YAML bindings

2020-10-11 Thread Kurt Kanzenbach

On Sat Oct 10 2020, Florian Fainelli wrote:
> On 10/10/2020 9:46 AM, Kurt Kanzenbach wrote:
>> Convert the b53 DSA device tree bindings to YAML in order to allow
>> for automatic checking and such.
>> 
>> Suggested-by: Florian Fainelli 
>> Signed-off-by: Kurt Kanzenbach 
>
> Thanks for making this change, there are quite a few warnings that are 
> going to show up because the binding was defined in a way that it would 
> define chip compatible strings, which not all DTS files are using.

Oh, I didn't know there is a second make command for doing the actual
check against the dtbs. I've just used `make dt_binding_check'.

So, it seems like a lot of the errors are caused by the include files
such as

[linux]/arch/arm/boot/dts/bcm5301x.dtsi

srab: srab@18007000 {
compatible = "brcm,bcm5301x-srab";
reg = <0x18007000 0x1000>;

status = "disabled";

/* ports are defined in board DTS */
};

The nodename should be "switch" not "srab" as enforced by
dsa.yaml. Furthermore, some DTS files are not adding the chip specific
compatible strings and the ports leading to more errors.

There are also some minor errors regarding the reg-names and such for
specific instances.

How should we proceed? Adding the missing compatible strings and ports
to the DTS files? Or adjusting the include files?

> I don't know if Rob would be comfortable with taking this until we
> resolve all warnings first.

Probably not. We should fix the existing device trees first.

Thanks,
Kurt

signature.asc
Description: PGP signature

Re: [PATCH nf-next v3 3/3] netfilter: Introduce egress hook

2020-10-11 Thread Lukas Wunner

On Tue, Sep 15, 2020 at 12:02:03AM +0200, Daniel Borkmann wrote:
> today it is possible and
> perfectly fine to e.g. redirect to a host-facing veth from tc ingress which
> then goes into container. Only traffic that goes up the host stack is seen
> by nf ingress hook in that case. Likewise, reply traffic can be redirected
> from host-facing veth to phys dev for xmit w/o any netfilter interference.
> This means netfilter in host ns really only sees traffic to/from host as
> intended. This is fine today, however, if 3rd party entities (e.g. distro
> side) start pushing down rules on the two nf hooks, then these use cases will
> break on the egress one due to this asymmetric layering violation. Hence my
> ask that this needs to be configurable from a control plane perspective so
> that both use cases can live next to each other w/o breakage. Most trivial
> one I can think of is (aside from the fact to refactor the hooks and improve
> their performance) a flag e.g. for skb that can be set from tc/BPF layer to
> bypass the nf hooks. Basically a flexible opt-in so that existing use-cases
> can be retained w/o breakage.

I argue that being able to filter traffic coming out of the container
is desirable because why should the host trust the software running
in the container to never send malicious packets.

As for the flag you're asking for, it already exists in the form of
skb->mark.  Just let tc set the mark when the packet exits the container
and add a netfilter rule to accept packets carrying that mark.  Or do
not set any netfilter egress rules at all to disable the egress
filtering and avoid the performance impact it would imply.

Thanks,

Lukas

[PATCH] Add documentation of fiter syntax to ss manpage

2020-10-11 Thread Thayne

Since the documentation currently referenced in the manpage no longer
exists.
---
 man/man8/ss.8 | 75 ++-
 1 file changed, 74 insertions(+), 1 deletion(-)

diff --git a/man/man8/ss.8 b/man/man8/ss.8
index 3b2559ff..f9e629f6 100644
--- a/man/man8/ss.8
+++ b/man/man8/ss.8
@@ -401,7 +401,7 @@ Read filter information from FILE.  Each line of
FILE is interpreted
 like single command line option. If FILE is - stdin is used.
 .TP
 .B FILTER := [ state STATE-FILTER ] [ EXPRESSION ]
-Please take a look at the official documentation for details regarding filters.
+See below an explanation of STATE-FILTER and EXPRESSION.

 .SH STATE-FILTER

@@ -437,6 +437,79 @@ states except for
 - opposite to
 .B bucket

+.SH EXPRESSION
+
+The following simple expressions are supported:
+
+.TP
+.RB { \ src \ | \ dst \ } \ = \
\fR[\fIFAMILY\fB:\fR][\fIADDRESS\fR][\fB:\fIPORT\fR]
+Matches if the source or destination matches the host condition.
+Providing FAMILY is equivalent to passing the family with the -f option.
+ADDRESS and PORT are the family specific address (or hostname) and port (or
+service name) to match against. At least one of ADDRESS and PORT should be
+provided. Additionally, "*" may be used as a wildcard for either ADDRESS or
+PORT. Note that for some families, PORT is meaningless.
+
+For inet and inet6 addresses, if the address is numeric (not a hostname) a
+bitmask can be provided in CIDR notation (ex. 127.0.0.0/16) to match a range of
+addresses. If the address is provided as a hostname, all addresses returned by
+DNS for that hostname will match. The inet or inet6 address may be enclosed in
+"[" and "]".
+.TP
+.RB { \ sport \ | \ dport \ } "\fI OP \fR[\fIFAMILY\fB:\fR][\fB:\fR]\fIPORT"
+Matches if the source or destination port matches the comparison with the
+supplied port.  FAMILY and PORT are the same as above.  OP can  be any of "=",
+"!=", "<", ">", "<=", or ">=".
+.TP
+.BR dev \ { \ = \ | \ != \ }  \fI\ DEV
+Matches if it is on the specified device (or not). The device can be specified
+either by name or by index.
+.TP
+.BR fwmark \ { \ = \ | \ != \ } \ \fIMARK-MASK
+Matches if the firewall mark matches the supplied mask (or not). The
mask should
+be specified as an integer value optionally followed by a "/" and an integer
+mask. The integer may be hex-encoded if it begins with "0x" or "0X".
+.TP
+.BR cgroup \ { \ = \ | \ != \ } \ \fICGROUP
+Matches if it is part of the cgroup (or not).
+.B CGROUP
+should be the path for the desired cgroup.
+.TP
+.B autobound
+Matches if the local port is automatically bound (randomly assigned).
+
+.PP
+Each operator has equivalent aliases as follows:
+.IP
+"=" can be replaced with "==" or "eq"
+.IP
+"!=" can be replaced with "ne" or "neq"
+.IP
+">" can be replaced with "gt"
+.IP
+"<" can be replaced with "lt"
+.IP
+">=" can be replaced with "ge" or "geq"
+.IP
+"<=" can be replaced with "le" or "leq"
+
+Subexpressions can be combined into more complex expressions in the following
+ways:
+.TP
+.BI not \ EXPRESSION
+Negate the EXPRESSION. "!" can be used in place of of "not".
+.TP
+\fI EXPRESSION EXPRESSION \fR| \fIEXPRESSION \fBand \fIEXPRESSION
+Match only if both expressions match. "&" or "&&" can be used in
place of "and".
+.TP
+.IB EXPRESSION \ or \ EXPRESSION
+Match if either expression matches. "|" or "||" can be used in place of "or".
+.TP
+.BI ( \ EXPRESSION \ )
+Group EXPRESSION to change precedence of the above operators. The default
+precedence is "not", "and", "or".
+
+
 .SH USAGE EXAMPLES
 .TP
 .B ss -t -a
-- 
2.28.0

[PATCH net-next 1/3] macb: add RM9200's interrupt flag TBRE

2020-10-11 Thread Willy Tarreau

Transmit Buffer Register Empty replaces TXERR on RM9200 and signals the
sender may try to send again becase the last queued frame is no longer
in queue (being transmitted or already transmitted).

Cc: Nicolas Ferre 
Cc: Claudiu Beznea 
Cc: Daniel Palmer 
Signed-off-by: Willy Tarreau 
---
 drivers/net/ethernet/cadence/macb.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/cadence/macb.h 
b/drivers/net/ethernet/cadence/macb.h
index 4f1b41569260..49d347429de8 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -365,6 +365,8 @@
 #define MACB_ISR_RLE_SIZE  1
 #define MACB_TXERR_OFFSET  6 /* EN TX frame corrupt from error interrupt */
 #define MACB_TXERR_SIZE1
+#define MACB_RM9200_TBRE_OFFSET6 /* EN may send new frame interrupt 
(RM9200) */
+#define MACB_RM9200_TBRE_SIZE  1
 #define MACB_TCOMP_OFFSET  7 /* Enable transmit complete interrupt */
 #define MACB_TCOMP_SIZE1
 #define MACB_ISR_LINK_OFFSET   9 /* Enable link change interrupt */
-- 
2.28.0

[PATCH net-next 3/3] macb: support the two tx descriptors on at91rm9200

2020-10-11 Thread Willy Tarreau

The at91rm9200 variant used by a few chips including the MSC313 supports
two Tx descriptors (one frame being serialized and another one queued).
However the driver only implemented a single one, which adds a dead time
after each transfer to receive and process the interrupt and wake the
queue up, preventing from reaching line rate.

This patch implements a very basic 2-deep queue to address this limitation.
The tests run on a Breadbee board equipped with an MSC313E show that at
1 GHz, HTTP traffic on medium-sized objects (45kB) was limited to exactly
50 Mbps before this patch, and jumped to 76 Mbps with this patch. And tests
on a single TCP stream with an MTU of 576 jump from 10kpps to 15kpps. With
1500 byte packets it's now possible to reach line rate versus 75 Mbps
before.

Cc: Nicolas Ferre 
Cc: Claudiu Beznea 
Cc: Daniel Palmer 
Signed-off-by: Willy Tarreau 
---
 drivers/net/ethernet/cadence/macb.h  |  2 ++
 drivers/net/ethernet/cadence/macb_main.c | 46 +++-
 2 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.h 
b/drivers/net/ethernet/cadence/macb.h
index e87db95fb0f6..f8133003981f 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -1208,6 +1208,8 @@ struct macb {
 
/* AT91RM9200 transmit queue (1 on wire + 1 queued) */
struct macb_tx_skb  rm9200_txq[2];
+   unsigned intrm9200_tx_tail;
+   unsigned intrm9200_tx_len;
unsigned intmax_tx_length;
 
u64 ethtool_stats[GEM_STATS_LEN + QUEUE_STATS_LEN * 
MACB_MAX_QUEUES];
diff --git a/drivers/net/ethernet/cadence/macb_main.c 
b/drivers/net/ethernet/cadence/macb_main.c
index ca6e5456906a..6ff8e4b0b95d 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -3909,6 +3909,7 @@ static int at91ether_start(struct macb *lp)
 MACB_BIT(ISR_TUND) |
 MACB_BIT(ISR_RLE)  |
 MACB_BIT(TCOMP)|
+MACB_BIT(RM9200_TBRE)  |
 MACB_BIT(ISR_ROVR) |
 MACB_BIT(HRESP));
 
@@ -3925,6 +3926,7 @@ static void at91ether_stop(struct macb *lp)
 MACB_BIT(ISR_TUND) |
 MACB_BIT(ISR_RLE)  |
 MACB_BIT(TCOMP)|
+MACB_BIT(RM9200_TBRE)  |
 MACB_BIT(ISR_ROVR) |
 MACB_BIT(HRESP));
 
@@ -3994,11 +3996,10 @@ static netdev_tx_t at91ether_start_xmit(struct sk_buff 
*skb,
struct net_device *dev)
 {
struct macb *lp = netdev_priv(dev);
+   unsigned long flags;
 
-   if (macb_readl(lp, TSR) & MACB_BIT(RM9200_BNQ)) {
-   int desc = 0;
-
-   netif_stop_queue(dev);
+   if (lp->rm9200_tx_len < 2) {
+   int desc = lp->rm9200_tx_tail;
 
/* Store packet information (to free when Tx completed) */
lp->rm9200_txq[desc].skb = skb;
@@ -4012,6 +4013,15 @@ static netdev_tx_t at91ether_start_xmit(struct sk_buff 
*skb,
return NETDEV_TX_OK;
}
 
+   spin_lock_irqsave(&lp->lock, flags);
+
+   lp->rm9200_tx_tail = (desc + 1) & 1;
+   lp->rm9200_tx_len++;
+   if (lp->rm9200_tx_len > 1)
+   netif_stop_queue(dev);
+
+   spin_unlock_irqrestore(&lp->lock, flags);
+
/* Set address of the data in the Transmit Address register */
macb_writel(lp, TAR, lp->rm9200_txq[desc].mapping);
/* Set length of the packet in the Transmit Control register */
@@ -4077,6 +4087,8 @@ static irqreturn_t at91ether_interrupt(int irq, void 
*dev_id)
struct macb *lp = netdev_priv(dev);
u32 intstatus, ctl;
unsigned int desc;
+   unsigned int qlen;
+   u32 tsr;
 
/* MAC Interrupt Status register indicates what interrupts are pending.
 * It is automatically cleared once read.
@@ -4088,21 +4100,39 @@ static irqreturn_t at91ether_interrupt(int irq, void 
*dev_id)
at91ether_rx(dev);
 
/* Transmit complete */
-   if (intstatus & MACB_BIT(TCOMP)) {
+   if (intstatus & (MACB_BIT(TCOMP) | MACB_BIT(RM9200_TBRE))) {
/* The TCOM bit is set even if the transmission failed */
if (intstatus & (MACB_BIT(ISR_TUND) | MACB_BIT(ISR_RLE)))
dev->stats.tx_errors++;
 
-   desc = 0;
-   if (lp->rm9200_txq[desc].skb) {
+   spin_lock(&lp->lock);
+
+   tsr = macb_readl(lp, TSR);
+
+   /* we have three possibilities here:
+*   - all pending packets transmitted (TGO, implies BNQ)
+

[PATCH net-next 0/3] macb: support the 2-deep Tx queue on at91

2020-10-11 Thread Willy Tarreau

Hi,

while running some tests on my Breadbee board, I noticed poor network
Tx performance. I had a look at the driver (macb, at91ether variant)
and noticed that at91ether_start_xmit() immediately stops the queue
after sending a frame and waits for the interrupt to restart the queue,
causing a dead time after each packet is sent.

The AT91RM9200 datasheet states that the controller supports two frames,
one being sent and the other one being queued, so I performed minimal
changes to support this. The transmit performance on my board has
increased by 50% on medium-sized packets (HTTP traffic), and with large
packets I can now reach line rate.

Since this driver is shared by various platforms, I tried my best to
isolate and limit the changes as much as possible and I think it's pretty
reasonable as-is. I've run extensive tests and couldn't meet any
unexpected situation (no stall, overflow nor lockup).

There are 3 patches in this series. The first one adds the missing
interrupt flag for RM9200 (TBRE, indicating the tx buffer is willing
to take a new packet). The second one replaces the single skb with a
2-array and uses only index 0. It does no other change, this is just
to prepare the code for the third one. The third one implements the
queue. Packets are added at the tail of the queue, the queue is
stopped at 2 packets and the interrupt releases 0, 1 or 2 depending
on what the transmit status register reports.

Thanks,
Willy

Willy Tarreau (3):
  macb: add RM9200's interrupt flag TBRE
  macb: prepare at91 to use a 2-frame TX queue
  macb: support the two tx descriptors on at91rm9200

 drivers/net/ethernet/cadence/macb.h  | 10 ++--
 drivers/net/ethernet/cadence/macb_main.c | 66 ++--
 2 files changed, 56 insertions(+), 20 deletions(-)

-- 
2.28.0

[PATCH net-next 2/3] macb: prepare at91 to use a 2-frame TX queue

2020-10-11 Thread Willy Tarreau

The RM9200 supports one frame being sent while another one is waiting in
queue. This avoids the dead time that follows the emission of a frame
and which prevents one from reaching line speed.

Right now the driver supports only a single skb, so we'll first replace
the rm9200-specific skb info with an array of two macb_tx_skb (already
used by other drivers). This patch only moves the skb_length to
txq[0].size and skb_physaddr to skb[0].mapping but doesn't perform any
other change. It already uses [desc] in order to minimize future changes.

Cc: Nicolas Ferre 
Cc: Claudiu Beznea 
Cc: Daniel Palmer 
Signed-off-by: Willy Tarreau 
---
 drivers/net/ethernet/cadence/macb.h  |  6 ++---
 drivers/net/ethernet/cadence/macb_main.c | 28 ++--
 2 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/cadence/macb.h 
b/drivers/net/ethernet/cadence/macb.h
index 49d347429de8..e87db95fb0f6 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -1206,10 +1206,8 @@ struct macb {
 
phy_interface_t phy_interface;
 
-   /* AT91RM9200 transmit */
-   struct sk_buff *skb;/* holds skb until xmit 
interrupt completes */
-   dma_addr_t skb_physaddr;/* phys addr from 
pci_map_single */
-   int skb_length; /* saved skb length for 
pci_unmap_single */
+   /* AT91RM9200 transmit queue (1 on wire + 1 queued) */
+   struct macb_tx_skb  rm9200_txq[2];
unsigned intmax_tx_length;
 
u64 ethtool_stats[GEM_STATS_LEN + QUEUE_STATS_LEN * 
MACB_MAX_QUEUES];
diff --git a/drivers/net/ethernet/cadence/macb_main.c 
b/drivers/net/ethernet/cadence/macb_main.c
index 9179f7b0b900..ca6e5456906a 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -3996,14 +3996,16 @@ static netdev_tx_t at91ether_start_xmit(struct sk_buff 
*skb,
struct macb *lp = netdev_priv(dev);
 
if (macb_readl(lp, TSR) & MACB_BIT(RM9200_BNQ)) {
+   int desc = 0;
+
netif_stop_queue(dev);
 
/* Store packet information (to free when Tx completed) */
-   lp->skb = skb;
-   lp->skb_length = skb->len;
-   lp->skb_physaddr = dma_map_single(&lp->pdev->dev, skb->data,
- skb->len, DMA_TO_DEVICE);
-   if (dma_mapping_error(&lp->pdev->dev, lp->skb_physaddr)) {
+   lp->rm9200_txq[desc].skb = skb;
+   lp->rm9200_txq[desc].size = skb->len;
+   lp->rm9200_txq[desc].mapping = dma_map_single(&lp->pdev->dev, 
skb->data,
+ skb->len, 
DMA_TO_DEVICE);
+   if (dma_mapping_error(&lp->pdev->dev, 
lp->rm9200_txq[desc].mapping)) {
dev_kfree_skb_any(skb);
dev->stats.tx_dropped++;
netdev_err(dev, "%s: DMA mapping error\n", __func__);
@@ -4011,7 +4013,7 @@ static netdev_tx_t at91ether_start_xmit(struct sk_buff 
*skb,
}
 
/* Set address of the data in the Transmit Address register */
-   macb_writel(lp, TAR, lp->skb_physaddr);
+   macb_writel(lp, TAR, lp->rm9200_txq[desc].mapping);
/* Set length of the packet in the Transmit Control register */
macb_writel(lp, TCR, skb->len);
 
@@ -4074,6 +4076,7 @@ static irqreturn_t at91ether_interrupt(int irq, void 
*dev_id)
struct net_device *dev = dev_id;
struct macb *lp = netdev_priv(dev);
u32 intstatus, ctl;
+   unsigned int desc;
 
/* MAC Interrupt Status register indicates what interrupts are pending.
 * It is automatically cleared once read.
@@ -4090,13 +4093,14 @@ static irqreturn_t at91ether_interrupt(int irq, void 
*dev_id)
if (intstatus & (MACB_BIT(ISR_TUND) | MACB_BIT(ISR_RLE)))
dev->stats.tx_errors++;
 
-   if (lp->skb) {
-   dev_consume_skb_irq(lp->skb);
-   lp->skb = NULL;
-   dma_unmap_single(&lp->pdev->dev, lp->skb_physaddr,
-lp->skb_length, DMA_TO_DEVICE);
+   desc = 0;
+   if (lp->rm9200_txq[desc].skb) {
+   dev_consume_skb_irq(lp->rm9200_txq[desc].skb);
+   lp->rm9200_txq[desc].skb = NULL;
+   dma_unmap_single(&lp->pdev->dev, 
lp->rm9200_txq[desc].mapping,
+lp->rm9200_txq[desc].size, 
DMA_TO_DEVICE);
dev->stats.tx_packets++;
-   dev->stats.tx_bytes += lp->skb_length;
+   dev->stats.tx_bytes += lp->rm9200_txq[desc].size;
}
netif_wake_queue(dev);

[PATCH net-next] net: mscc: ocelot: remove duplicate ocelot_port_dev_check

2020-10-11 Thread Vladimir Oltean

A helper for checking whether a net_device belongs to mscc_ocelot
already existed and did not need to be rewritten. Use it.

Fixes: 319e4dd11a20 ("net: mscc: ocelot: introduce conversion helpers between 
port and netdev")
Signed-off-by: Vladimir Oltean 
---
 drivers/net/ethernet/mscc/ocelot_net.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mscc/ocelot_net.c 
b/drivers/net/ethernet/mscc/ocelot_net.c
index d3c03942546d..b34da11acf65 100644
--- a/drivers/net/ethernet/mscc/ocelot_net.c
+++ b/drivers/net/ethernet/mscc/ocelot_net.c
@@ -669,7 +669,8 @@ struct net_device *ocelot_port_to_netdev(struct ocelot 
*ocelot, int port)
return priv->dev;
 }
 
-static bool ocelot_port_dev_check(const struct net_device *dev)
+/* Checks if the net_device instance given to us originates from our driver */
+static bool ocelot_netdevice_dev_check(const struct net_device *dev)
 {
return dev->netdev_ops == &ocelot_port_netdev_ops;
 }
@@ -678,7 +679,7 @@ int ocelot_netdev_to_port(struct net_device *dev)
 {
struct ocelot_port_private *priv;
 
-   if (!dev || !ocelot_port_dev_check(dev))
+   if (!dev || !ocelot_netdevice_dev_check(dev))
return -EINVAL;
 
priv = netdev_priv(dev);
@@ -907,12 +908,6 @@ static int ocelot_port_obj_del(struct net_device *dev,
return ret;
 }
 
-/* Checks if the net_device instance given to us originate from our driver. */
-static bool ocelot_netdevice_dev_check(const struct net_device *dev)
-{
-   return dev->netdev_ops == &ocelot_port_netdev_ops;
-}
-
 static int ocelot_netdevice_port_event(struct net_device *dev,
   unsigned long event,
   struct netdev_notifier_changeupper_info 
*info)
-- 
2.25.1

Re: [PATCH bpf-next v6 2/6] bpf: add redirect_peer helper

2020-10-11 Thread Jesper Dangaard Brouer

On Sun, 11 Oct 2020 01:40:02 +0200
Daniel Borkmann  wrote:

> Add an efficient ingress to ingress netns switch that can be used out of tc 
> BPF
> programs in order to redirect traffic from host ns ingress into a container
> veth device ingress without having to go via CPU backlog queue [0]. For local
> containers this can also be utilized and path via CPU backlog queue only needs
> to be taken once, not twice. On a high level this borrows from ipvlan which 
> does
> similar switch in __netif_receive_skb_core() and then iterates via 
> another_round.
> This helps to reduce latency for mentioned use cases.
> 
> Pod to remote pod with redirect(), TCP_RR [1]:
> 
>   # percpu_netperf 10.217.1.33
>   RT_LATENCY: 122.450 (per CPU: 122.666   
>   122.401 122.333 122.401 )
> MEAN_LATENCY: 121.210 (per CPU: 121.100   
>   121.260 121.320 121.160 )
>   STDDEV_LATENCY: 120.040 (per CPU: 119.420   
>   119.910 125.460 115.370 )
>  MIN_LATENCY:  46.500 (per CPU:  47.000   
>47.000  47.000  45.000 )
>  P50_LATENCY: 118.500 (per CPU: 118.000   
>   119.000 118.000 119.000 )
>  P90_LATENCY: 127.500 (per CPU: 127.000   
>   128.000 127.000 128.000 )
>  P99_LATENCY: 130.750 (per CPU: 131.000   
>   131.000 129.000 132.000 )
> 
> TRANSACTION_RATE:   32666.400 (per CPU:8152.200   
>  8169.8428174.4398169.897 )
> 
> Pod to remote pod with redirect_peer(), TCP_RR:
> 
>   # percpu_netperf 10.217.1.33
>   RT_LATENCY:  44.449 (per CPU:  43.767   
>43.127  45.279  45.622 )
> MEAN_LATENCY:  45.065 (per CPU:  44.030   
>45.530  45.190  45.510 )
>   STDDEV_LATENCY:  84.823 (per CPU:  66.770   
>97.290  84.380  90.850 )
>  MIN_LATENCY:  33.500 (per CPU:  33.000   
>33.000  34.000  34.000 )
>  P50_LATENCY:  43.250 (per CPU:  43.000   
>43.000  43.000  44.000 )
>  P90_LATENCY:  46.750 (per CPU:  46.000   
>47.000  47.000  47.000 )
>  P99_LATENCY:  52.750 (per CPU:  51.000   
>54.000  53.000  53.000 )
> 
> TRANSACTION_RATE:   90039.500 (per CPU:   22848.186   
> 23187.089   22085.077   21919.130 )

This is awesome results and great work Daniel! :-)

I wonder if we can also support this from XDP, which can also native
redirect into veth.  Originally I though we could add the peer netdev
in the devmap, but AFAIK Toke showed me that this was not possible.


>   [0] 
> https://linuxplumbersconf.org/event/7/contributions/674/attachments/568/1002/plumbers_2020_cilium_load_balancer.pdf
>   [1] https://github.com/borkmann/netperf_scripts/blob/master/percpu_netperf
> 
> Signed-off-by: Daniel Borkmann 
> ---
>  drivers/net/veth.c |  9 ++
>  include/linux/netdevice.h  |  4 +++
>  include/uapi/linux/bpf.h   | 17 +++
>  net/core/dev.c | 15 --
>  net/core/filter.c  | 54 +-
>  tools/include/uapi/linux/bpf.h | 17 +++
>  6 files changed, 106 insertions(+), 10 deletions(-)
> 
[...]
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 9d55bf5d1a65..7dd015823593 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4930,7 +4930,7 @@ EXPORT_SYMBOL_GPL(br_fdb_test_addr_hook);
>  
>  static inline struct sk_buff *
>  sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int 
> *ret,
> -struct net_device *orig_dev)
> +struct net_device *orig_dev, bool *another)
>  {
>  #ifdef CONFIG_NET_CLS_ACT
>   struct mini_Qdisc *miniq = rcu_dereference_bh(skb->dev->miniq_ingress);
> @@ -4974,7 +4974,11 @@ sch_handle_ingress(struct sk_buff *skb, struct 
> packet_type **pt_prev, int *ret,
>* redirecting to another netdev
>*/
>   __skb_push(skb, skb->mac_len);
> - skb_do_redirect(skb);
> + if (skb_do_redirect(skb) == -EAGAIN) {
> + __skb_pull(skb, skb->mac_len);
> + *another = true;
> + break;
> + }
>   return NULL;
>   case TC_ACT_CONSUMED:
>   return NULL;
> @@ -5163,7 +5167,12 @@ static int __netif_receive_skb_core(struct sk_buff 
> **pskb, bool pfmemalloc,
>  skip_taps:
>  #ifdef CONFIG_NET_INGRESS
>   if (static_branch_unlikely

Re: [PATCH] net: stmmac: Don't call _irqoff() with hardirqs enabled

2020-10-11 Thread Heiner Kallweit

On 10.10.2020 17:22, Jakub Kicinski wrote:
> On Sat, 10 Oct 2020 15:08:15 +0200 Heiner Kallweit wrote:
>> On 09.10.2020 18:06, Heiner Kallweit wrote:
>>> On 09.10.2020 17:58, Jakub Kicinski wrote:  
 On Fri, 9 Oct 2020 16:54:06 +0200 Heiner Kallweit wrote:  
> I'm thinking about a __napi_schedule version that disables hard irq's
> conditionally, based on variable force_irqthreads, exported by the irq
> subsystem. This would allow to behave correctly with threadirqs set,
> whilst not loosing the _irqoff benefit with threadirqs unset.
> Let me come up with a proposal.  

 I think you'd need to make napi_schedule_irqoff() behave like that,
 right?  Are there any uses of napi_schedule_irqoff() that are disabling
 irqs and not just running from an irq handler?
  
>>> Right, the best approach depends on the answer to the latter question.
>>> I didn't check this yet, therefore I described the least intrusive approach.
>>>   
>>
>> With some help from coccinelle I identified the following functions that
>> call napi_schedule_irqoff() or __napi_schedule_irqoff() and do not run
>> from an irq handler (at least not at the first glance).
>>
>> dpaa2_caam_fqdan_cb
>> qede_simd_fp_handler
>> mlx4_en_rx_irq
>> mlx4_en_tx_irq
> 
> Don't know the others but FWIW the mlx4 ones run from an IRQ handler,
> AFAICT:
> 
> static irqreturn_t mlx4_interrupt(int irq, void *dev_ptr)
> static irqreturn_t mlx4_msi_x_interrupt(int irq, void *eq_ptr)
>   mlx4_eq_int()
> mlx4_cq_completion
>   cq->comp()
> 
>> qeth_qdio_poll
>> netvsc_channel_cb
>> napi_watchdog
> 
> This one runs from a hrtimer, which I believe will be a hard irq
> context on anything but RT. I could be wrong.
> 

A similar discussion can be found e.g. here:
https://lore.kernel.org/netdev/20191126222013.1904785-1-bige...@linutronix.de/
However I don't see any actual outcome.

[PATCH net-next v2 2/2] can: remove obsolete version strings

2020-10-11 Thread Oliver Hartkopp

As pointed out by Jakub Kicinski here:
http://lore.kernel.org/r/20201009175751.5c540...@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com
this patch removes the obsolete version information of the different
CAN protocols and the AF_CAN core module.

Signed-off-by: Oliver Hartkopp 
---
 include/linux/can/core.h |  7 ---
 include/net/netns/can.h  |  1 -
 net/can/af_can.c |  2 +-
 net/can/bcm.c|  4 +---
 net/can/gw.c |  4 +---
 net/can/isotp.c  |  4 +---
 net/can/proc.c   | 12 
 net/can/raw.c|  4 +---
 8 files changed, 5 insertions(+), 33 deletions(-)

diff --git a/include/linux/can/core.h b/include/linux/can/core.h
index 7da9f1f82e8e..5fb8d0e3f9c1 100644
--- a/include/linux/can/core.h
+++ b/include/linux/can/core.h
@@ -18,13 +18,6 @@
 #include 
 #include 
 
-#define CAN_VERSION "20170425"
-
-/* increment this number each time you change some user-space interface */
-#define CAN_ABI_VERSION "9"
-
-#define CAN_VERSION_STRING "rev " CAN_VERSION " abi " CAN_ABI_VERSION
-
 #define DNAME(dev) ((dev) ? (dev)->name : "any")
 
 /**
diff --git a/include/net/netns/can.h b/include/net/netns/can.h
index b6ab7d1530d7..52fbd8291a96 100644
--- a/include/net/netns/can.h
+++ b/include/net/netns/can.h
@@ -15,7 +15,6 @@ struct can_rcv_lists_stats;
 struct netns_can {
 #if IS_ENABLED(CONFIG_PROC_FS)
struct proc_dir_entry *proc_dir;
-   struct proc_dir_entry *pde_version;
struct proc_dir_entry *pde_stats;
struct proc_dir_entry *pde_reset_stats;
struct proc_dir_entry *pde_rcvlist_all;
diff --git a/net/can/af_can.c b/net/can/af_can.c
index b7d0f6500893..6373ab9c5507 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -875,7 +875,7 @@ static __init int can_init(void)
 offsetof(struct can_frame, data) !=
 offsetof(struct canfd_frame, data));
 
-   pr_info("can: controller area network core (" CAN_VERSION_STRING ")\n");
+   pr_info("can: controller area network core\n");
 
rcv_cache = kmem_cache_create("can_receiver", sizeof(struct receiver),
  0, 0, NULL);
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 4253915800e6..0e5c37be4a2b 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -81,8 +81,6 @@
 (CAN_EFF_MASK | CAN_EFF_FLAG | CAN_RTR_FLAG) : \
 (CAN_SFF_MASK | CAN_EFF_FLAG | CAN_RTR_FLAG))
 
-#define CAN_BCM_VERSION "20170425"
-
 MODULE_DESCRIPTION("PF_CAN broadcast manager protocol");
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Oliver Hartkopp ");
@@ -1696,7 +1694,7 @@ static int __init bcm_module_init(void)
 {
int err;
 
-   pr_info("can: broadcast manager protocol (rev " CAN_BCM_VERSION " 
t)\n");
+   pr_info("can: broadcast manager protocol\n");
 
err = can_proto_register(&bcm_can_proto);
if (err < 0) {
diff --git a/net/can/gw.c b/net/can/gw.c
index 49b4e3d91ad6..6b790b6ff8d2 100644
--- a/net/can/gw.c
+++ b/net/can/gw.c
@@ -59,7 +59,6 @@
 #include 
 #include 
 
-#define CAN_GW_VERSION "20190810"
 #define CAN_GW_NAME "can-gw"
 
 MODULE_DESCRIPTION("PF_CAN netlink gateway");
@@ -1194,8 +1193,7 @@ static __init int cgw_module_init(void)
/* sanitize given module parameter */
max_hops = clamp_t(unsigned int, max_hops, CGW_MIN_HOPS, CGW_MAX_HOPS);
 
-   pr_info("can: netlink gateway (rev " CAN_GW_VERSION ") max_hops=%d\n",
-   max_hops);
+   pr_info("can: netlink gateway - max_hops=%d\n", max_hops);
 
ret = register_pernet_subsys(&cangw_pernet_ops);
if (ret)
diff --git a/net/can/isotp.c b/net/can/isotp.c
index 22187669c5c9..132ca6c471d6 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -72,8 +72,6 @@
 #include 
 #include 
 
-#define CAN_ISOTP_VERSION "20200928"
-
 MODULE_DESCRIPTION("PF_CAN isotp 15765-2:2016 protocol");
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Oliver Hartkopp ");
@@ -1409,7 +1407,7 @@ static __init int isotp_module_init(void)
 {
int err;
 
-   pr_info("isotp protocol (rev " CAN_ISOTP_VERSION ")\n");
+   pr_info("isotp protocol\n");
 
err = can_proto_register(&isotp_can_proto);
if (err < 0)
diff --git a/net/can/proc.c b/net/can/proc.c
index a4eb06c9eb70..550928b8b8a2 100644
--- a/net/can/proc.c
+++ b/net/can/proc.c
@@ -54,7 +54,6 @@
  * proc filenames for the PF_CAN core
  */
 
-#define CAN_PROC_VERSION "version"
 #define CAN_PROC_STATS   "stats"
 #define CAN_PROC_RESET_STATS "reset_stats"
 #define CAN_PROC_RCVLIST_ALL "rcvlist_all"
@@ -293,12 +292,6 @@ static int can_reset_stats_proc_show(struct seq_file *m, 
void *v)
return 0;
 }
 
-static int can_version_proc_show(struct seq_file *m, void *v)
-{
-   seq_printf(m, "%s\n", CAN_VERSION_STRING);
-   return 0;
-}
-
 static inline void can_rcvlist_proc_show_one(struct seq_file *m, int idx,
 struct net_device *dev,

[PATCH net-next v2 1/2] can-isotp: implement cleanups / improvements from review

2020-10-11 Thread Oliver Hartkopp

As pointed out by Jakub Kicinski here:
http://lore.kernel.org/r/20201009175751.5c540...@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com
this patch addresses the remarked issues:

- remove empty line in comment
- remove default=y for CAN_ISOTP in Kconfig
- make use of pr_notice_once()
- use GFP_KERNEL instead of gfp_any() in soft hrtimer context
- make use of pr_fmt() [suggested my Marc Kleine-Budde]

The version strings in the CAN subsystem are removed by a separate patch.

Signed-off-by: Oliver Hartkopp 
---
 include/uapi/linux/can/isotp.h |  1 -
 net/can/Kconfig|  3 ++-
 net/can/isotp.c| 19 ++-
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/include/uapi/linux/can/isotp.h b/include/uapi/linux/can/isotp.h
index 553006509f4e..7793b26aa154 100644
--- a/include/uapi/linux/can/isotp.h
+++ b/include/uapi/linux/can/isotp.h
@@ -160,7 +160,6 @@ struct can_isotp_ll_options {
  * these default settings can be changed via sockopts.
  * For that reason the STmin value is intentionally _not_ checked for
  * consistency and copied directly into the flow control (FC) frame.
- *
  */
 
 #endif /* !_UAPI_CAN_ISOTP_H */
diff --git a/net/can/Kconfig b/net/can/Kconfig
index 021fe03a8ed6..224e5e0283a9 100644
--- a/net/can/Kconfig
+++ b/net/can/Kconfig
@@ -57,7 +57,6 @@ source "net/can/j1939/Kconfig"
 
 config CAN_ISOTP
tristate "ISO 15765-2:2016 CAN transport protocol"
-   default y
help
  CAN Transport Protocols offer support for segmented Point-to-Point
  communication between CAN nodes via two defined CAN Identifiers.
@@ -67,6 +66,8 @@ config CAN_ISOTP
  vehicle diagnosis (UDS, ISO 14229) or IP-over-CAN traffic.
  This protocol driver implements data transfers according to
  ISO 15765-2:2016 for 'classic' CAN and CAN FD frame types.
+ If you want to perform automotive vehicle diagnostic services (UDS),
+ say 'y'.
 
 source "drivers/net/can/Kconfig"
 
diff --git a/net/can/isotp.c b/net/can/isotp.c
index e6ff032b5426..22187669c5c9 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -79,6 +79,8 @@ MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Oliver Hartkopp ");
 MODULE_ALIAS("can-proto-6");
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #define SINGLE_MASK(id) (((id) & CAN_EFF_FLAG) ? \
 (CAN_EFF_MASK | CAN_EFF_FLAG | CAN_RTR_FLAG) : \
 (CAN_SFF_MASK | CAN_EFF_FLAG | CAN_RTR_FLAG))
@@ -222,8 +224,8 @@ static int isotp_send_fc(struct sock *sk, int ae, u8 
flowstatus)
 
can_send_ret = can_send(nskb, 1);
if (can_send_ret)
-   printk_once(KERN_NOTICE "can-isotp: %s: can_send_ret %d\n",
-   __func__, can_send_ret);
+   pr_notice_once("%s: can_send_ret %d\n",
+  __func__, can_send_ret);
 
dev_put(dev);
 
@@ -769,7 +771,7 @@ static enum hrtimer_restart isotp_tx_timer_handler(struct 
hrtimer *hrtimer)
 
 isotp_tx_burst:
skb = alloc_skb(so->ll.mtu + sizeof(struct can_skb_priv),
-   gfp_any());
+   GFP_KERNEL);
if (!skb) {
dev_put(dev);
break;
@@ -798,8 +800,8 @@ static enum hrtimer_restart isotp_tx_timer_handler(struct 
hrtimer *hrtimer)
 
can_send_ret = can_send(skb, 1);
if (can_send_ret)
-   printk_once(KERN_NOTICE "can-isotp: %s: can_send_ret 
%d\n",
-   __func__, can_send_ret);
+   pr_notice_once("%s: can_send_ret %d\n",
+  __func__, can_send_ret);
 
if (so->tx.idx >= so->tx.len) {
/* we are done */
@@ -942,8 +944,7 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr 
*msg, size_t size)
err = can_send(skb, 1);
dev_put(dev);
if (err) {
-   printk_once(KERN_NOTICE "can-isotp: %s: can_send_ret %d\n",
-   __func__, err);
+   pr_notice_once("%s: can_send_ret %d\n", __func__, err);
return err;
}
 
@@ -1408,11 +1409,11 @@ static __init int isotp_module_init(void)
 {
int err;
 
-   pr_info("can: isotp protocol (rev " CAN_ISOTP_VERSION ")\n");
+   pr_info("isotp protocol (rev " CAN_ISOTP_VERSION ")\n");
 
err = can_proto_register(&isotp_can_proto);
if (err < 0)
-   pr_err("can: registration of isotp protocol failed\n");
+   pr_err("registration of isotp protocol failed\n");
 
return err;
 }
-- 
2.28.0

[PATCH net-next 0/9] bnxt_en: Updates for net-next.

2020-10-11 Thread Michael Chan

This series contains these main changes:

1. Change of default message level to enable more logging.
2. Some cleanups related to processing async events from firmware.
3. Allow online ethtool selftest on multi-function PFs.
4. Return stored firmware version information to devlink.

Michael Chan (4):
  bnxt_en: Set driver default message level.
  bnxt_en: Simplify bnxt_async_event_process().
  bnxt_en: Log event_data1 and event_data2 when handling RESET_NOTIFY
event.
  bnxt_en: Log unknown link speed appropriately.

Vasundhara Volam (5):
  bnxt_en: Return -EROFS to user space, if NVM writes are not permitted.
  bnxt_en: Enable online self tests for multi-host/NPAR mode.
  bnxt_en: Add bnxt_hwrm_nvm_get_dev_info() to query NVM info.
  bnxt_en: Refactor bnxt_dl_info_get().
  bnxt_en: Add stored FW version info to devlink info_get cb.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c |  37 +++--
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |   1 +
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c | 153 --
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.h |   6 +
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  24 ++-
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.h |   2 +
 6 files changed, 162 insertions(+), 61 deletions(-)

-- 
2.18.1



smime.p7s
Description: S/MIME Cryptographic Signature

[PATCH net-next 1/9] bnxt_en: Return -EROFS to user space, if NVM writes are not permitted.

2020-10-11 Thread Michael Chan

From: Vasundhara Volam 

If NVRAM resources are locked, NVM writes are not permitted. In such
scenarios, firmware returns HWRM_ERR_CODE_RESOURCE_LOCKED error to
firmware commands.

Reviewed-by: Edwin Peer 
Signed-off-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 5e4b7fbeef06..d4402a2cd07f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4325,6 +4325,8 @@ static int bnxt_hwrm_to_stderr(u32 hwrm_err)
switch (hwrm_err) {
case HWRM_ERR_CODE_SUCCESS:
return 0;
+   case HWRM_ERR_CODE_RESOURCE_LOCKED:
+   return -EROFS;
case HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED:
return -EACCES;
case HWRM_ERR_CODE_RESOURCE_ALLOC_ERROR:
-- 
2.18.1



smime.p7s
Description: S/MIME Cryptographic Signature

[PATCH net-next 9/9] bnxt_en: Add stored FW version info to devlink info_get cb.

2020-10-11 Thread Michael Chan

From: Vasundhara Volam 

This patch adds FW versions stored in the flash to devlink info_get
callback.  Return the correct fw.psid running version using the
newly added bp->nvm_cfg_ver.

Reviewed-by: Andy Gospodarek 
Signed-off-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c | 45 ++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
index 65e8da1f7a01..7631c2d4c19e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
@@ -408,6 +408,7 @@ static int bnxt_dl_info_put(struct bnxt *bp, struct 
devlink_info_req *req,
 static int bnxt_dl_info_get(struct devlink *dl, struct devlink_info_req *req,
struct netlink_ext_ack *extack)
 {
+   struct hwrm_nvm_get_dev_info_output nvm_dev_info;
struct bnxt *bp = bnxt_get_bp_from_dl(dl);
union devlink_param_value nvm_cfg_ver;
struct hwrm_ver_get_output *ver_resp;
@@ -455,6 +456,12 @@ static int bnxt_dl_info_get(struct devlink *dl, struct 
devlink_info_req *req,
if (rc)
return rc;
 
+   rc = bnxt_dl_info_put(bp, req, BNXT_VERSION_RUNNING,
+ DEVLINK_INFO_VERSION_GENERIC_FW_PSID,
+ bp->nvm_cfg_ver);
+   if (rc)
+   return rc;
+
rc = bnxt_dl_info_put(bp, req, BNXT_VERSION_RUNNING,
  DEVLINK_INFO_VERSION_GENERIC_FW,
  ver_resp->active_pkg_name);
@@ -466,7 +473,7 @@ static int bnxt_dl_info_get(struct devlink *dl, struct 
devlink_info_req *req,
 
sprintf(buf, "%X.%X.%X", (ver >> 16) & 0xF, (ver >> 8) & 0xF,
ver & 0xF);
-   rc = bnxt_dl_info_put(bp, req, BNXT_VERSION_RUNNING,
+   rc = bnxt_dl_info_put(bp, req, BNXT_VERSION_STORED,
  DEVLINK_INFO_VERSION_GENERIC_FW_PSID,
  buf);
if (rc)
@@ -516,7 +523,43 @@ static int bnxt_dl_info_get(struct devlink *dl, struct 
devlink_info_req *req,
 
rc = bnxt_dl_info_put(bp, req, BNXT_VERSION_RUNNING,
  DEVLINK_INFO_VERSION_GENERIC_FW_ROCE, roce_ver);
+   if (rc)
+   return rc;
+
+   rc = bnxt_hwrm_nvm_get_dev_info(bp, &nvm_dev_info);
+   if (rc)
+   return rc;
+
+   if (!(nvm_dev_info.flags & NVM_GET_DEV_INFO_RESP_FLAGS_FW_VER_VALID))
+   return 0;
+
+   rc = bnxt_dl_info_put(bp, req, BNXT_VERSION_STORED,
+ DEVLINK_INFO_VERSION_GENERIC_FW,
+ nvm_dev_info.pkg_name);
+   if (rc)
+   return rc;
+
+   snprintf(mgmt_ver, FW_VER_STR_LEN, "%d.%d.%d.%d",
+nvm_dev_info.hwrm_fw_major, nvm_dev_info.hwrm_fw_minor,
+nvm_dev_info.hwrm_fw_build, nvm_dev_info.hwrm_fw_patch);
+   rc = bnxt_dl_info_put(bp, req, BNXT_VERSION_STORED,
+ DEVLINK_INFO_VERSION_GENERIC_FW_MGMT, mgmt_ver);
+   if (rc)
+   return rc;
+
+   snprintf(ncsi_ver, FW_VER_STR_LEN, "%d.%d.%d.%d",
+nvm_dev_info.mgmt_fw_major, nvm_dev_info.mgmt_fw_minor,
+nvm_dev_info.mgmt_fw_build, nvm_dev_info.mgmt_fw_patch);
+   rc = bnxt_dl_info_put(bp, req, BNXT_VERSION_STORED,
+ DEVLINK_INFO_VERSION_GENERIC_FW_NCSI, ncsi_ver);
+   if (rc)
+   return rc;
 
+   snprintf(roce_ver, FW_VER_STR_LEN, "%d.%d.%d.%d",
+nvm_dev_info.roce_fw_major, nvm_dev_info.roce_fw_minor,
+nvm_dev_info.roce_fw_build, nvm_dev_info.roce_fw_patch);
+   rc = bnxt_dl_info_put(bp, req, BNXT_VERSION_STORED,
+ DEVLINK_INFO_VERSION_GENERIC_FW_ROCE, roce_ver);
return rc;
 }
 
-- 
2.18.1



smime.p7s
Description: S/MIME Cryptographic Signature

[PATCH net-next 6/9] bnxt_en: Log unknown link speed appropriately.

2020-10-11 Thread Michael Chan

If the VF virtual link is set to always enabled, the speed may be
unknown when the physical link is down.  The driver currently logs
the link speed as 4294967295 Mbps which is SPEED_UNKNOWN.  Modify
the link up log message as "speed unknown" which makes more sense.

Reviewed-by: Vasundhara Volam 
Reviewed-by: Edwin Peer 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index ca35ec06abf0..2d6ecf1adbc5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -8902,6 +8902,11 @@ static void bnxt_report_link(struct bnxt *bp)
u16 fec;
 
netif_carrier_on(bp->dev);
+   speed = bnxt_fw_to_ethtool_speed(bp->link_info.link_speed);
+   if (speed == SPEED_UNKNOWN) {
+   netdev_info(bp->dev, "NIC Link is Up, speed unknown\n");
+   return;
+   }
if (bp->link_info.duplex == BNXT_LINK_DUPLEX_FULL)
duplex = "full";
else
@@ -8914,7 +8919,6 @@ static void bnxt_report_link(struct bnxt *bp)
flow_ctrl = "ON - receive";
else
flow_ctrl = "none";
-   speed = bnxt_fw_to_ethtool_speed(bp->link_info.link_speed);
netdev_info(bp->dev, "NIC Link is Up, %u Mbps %s duplex, Flow 
control: %s\n",
speed, duplex, flow_ctrl);
if (bp->flags & BNXT_FLAG_EEE_CAP)
-- 
2.18.1



smime.p7s
Description: S/MIME Cryptographic Signature

[PATCH net-next 7/9] bnxt_en: Add bnxt_hwrm_nvm_get_dev_info() to query NVM info.

2020-10-11 Thread Michael Chan

From: Vasundhara Volam 

Add a new bnxt_hwrm_nvm_get_dev_info() to query firmware version
information via NVM_GET_DEV_INFO firmware command.  Use it to
get the running version of the NVM configuration information.

This new function will also be used in subsequent patches to get the
stored firmware versions.

Reviewed-by: Andy Gospodarek 
Signed-off-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c| 12 
 drivers/net/ethernet/broadcom/bnxt/bnxt.h|  1 +
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c| 16 
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.h|  2 ++
 4 files changed, 31 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 2d6ecf1adbc5..55d96e3a69da 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -7562,6 +7562,16 @@ static int bnxt_hwrm_func_reset(struct bnxt *bp)
return hwrm_send_message(bp, &req, sizeof(req), HWRM_RESET_TIMEOUT);
 }
 
+static void bnxt_nvm_cfg_ver_get(struct bnxt *bp)
+{
+   struct hwrm_nvm_get_dev_info_output nvm_info;
+
+   if (!bnxt_hwrm_nvm_get_dev_info(bp, &nvm_info))
+   snprintf(bp->nvm_cfg_ver, FW_VER_STR_LEN, "%d.%d.%d",
+nvm_info.nvm_cfg_ver_maj, nvm_info.nvm_cfg_ver_min,
+nvm_info.nvm_cfg_ver_upd);
+}
+
 static int bnxt_hwrm_queue_qportcfg(struct bnxt *bp)
 {
int rc = 0;
@@ -11223,6 +11233,8 @@ static int bnxt_fw_init_one_p1(struct bnxt *bp)
if (rc)
return rc;
}
+   bnxt_nvm_cfg_ver_get(bp);
+
rc = bnxt_hwrm_func_reset(bp);
if (rc)
return -ENODEV;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index b208ff7c5d14..21ef1c21f602 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1856,6 +1856,7 @@ struct bnxt {
 #define PHY_VER_STR_LEN (FW_VER_STR_LEN - BC_HWRM_STR_LEN)
charfw_ver_str[FW_VER_STR_LEN];
charhwrm_ver_supp[FW_VER_STR_LEN];
+   charnvm_cfg_ver[FW_VER_STR_LEN];
u64 fw_ver_code;
 #define BNXT_FW_VER_CODE(maj, min, bld, rsv)   \
((u64)(maj) << 48 | (u64)(min) << 32 | (u64)(bld) << 16 | (rsv))
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index dcbb7b70d60a..53687bc7fcf5 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2072,6 +2072,22 @@ static u32 bnxt_get_link(struct net_device *dev)
return bp->link_info.link_up;
 }
 
+int bnxt_hwrm_nvm_get_dev_info(struct bnxt *bp,
+  struct hwrm_nvm_get_dev_info_output 
*nvm_dev_info)
+{
+   struct hwrm_nvm_get_dev_info_output *resp = bp->hwrm_cmd_resp_addr;
+   struct hwrm_nvm_get_dev_info_input req = {0};
+   int rc;
+
+   bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_NVM_GET_DEV_INFO, -1, -1);
+   mutex_lock(&bp->hwrm_cmd_lock);
+   rc = _hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
+   if (!rc)
+   memcpy(nvm_dev_info, resp, sizeof(*resp));
+   mutex_unlock(&bp->hwrm_cmd_lock);
+   return rc;
+}
+
 static void bnxt_print_admin_err(struct bnxt *bp)
 {
netdev_info(bp->dev, "PF does not have admin privileges to flash or 
reset the device\n");
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
index 34f44ddfad79..fa6fbde52bea 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.h
@@ -92,6 +92,8 @@ u32 bnxt_get_rxfh_indir_size(struct net_device *dev);
 u32 _bnxt_fw_to_ethtool_adv_spds(u16, u8);
 u32 bnxt_fw_to_ethtool_speed(u16);
 u16 bnxt_get_fw_auto_link_speeds(u32);
+int bnxt_hwrm_nvm_get_dev_info(struct bnxt *bp,
+  struct hwrm_nvm_get_dev_info_output 
*nvm_dev_info);
 int bnxt_flash_package_from_file(struct net_device *dev, const char *filename,
 u32 install_type);
 void bnxt_ethtool_init(struct bnxt *bp);
-- 
2.18.1



smime.p7s
Description: S/MIME Cryptographic Signature

[PATCH net-next 2/9] bnxt_en: Enable online self tests for multi-host/NPAR mode.

2020-10-11 Thread Michael Chan

From: Vasundhara Volam 

Online self tests are not disruptive and can be run in NPAR mode
and in multi-host NIC as well.

Reviewed-by: Edwin Peer 
Signed-off-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 52b4ea6ef8c9..dcbb7b70d60a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -3298,7 +3298,7 @@ static void bnxt_self_test(struct net_device *dev, struct 
ethtool_test *etest,
u8 test_mask = 0;
int rc = 0, i;
 
-   if (!bp->num_tests || !BNXT_SINGLE_PF(bp))
+   if (!bp->num_tests || !BNXT_PF(bp))
return;
memset(buf, 0, sizeof(u64) * bp->num_tests);
if (!netif_running(dev)) {
@@ -3311,9 +3311,9 @@ static void bnxt_self_test(struct net_device *dev, struct 
ethtool_test *etest,
do_ext_lpbk = true;
 
if (etest->flags & ETH_TEST_FL_OFFLINE) {
-   if (bp->pf.active_vfs) {
+   if (bp->pf.active_vfs || !BNXT_SINGLE_PF(bp)) {
etest->flags |= ETH_TEST_FL_FAILED;
-   netdev_warn(dev, "Offline tests cannot be run with 
active VFs\n");
+   netdev_warn(dev, "Offline tests cannot be run with 
active VFs or on shared PF\n");
return;
}
offline = true;
@@ -3829,7 +3829,7 @@ void bnxt_ethtool_init(struct bnxt *bp)
bnxt_get_pkgver(dev);
 
bp->num_tests = 0;
-   if (bp->hwrm_spec_code < 0x10704 || !BNXT_SINGLE_PF(bp))
+   if (bp->hwrm_spec_code < 0x10704 || !BNXT_PF(bp))
return;
 
bnxt_hwrm_cmd_hdr_init(bp, &req, HWRM_SELFTEST_QLIST, -1, -1);
-- 
2.18.1



smime.p7s
Description: S/MIME Cryptographic Signature

[PATCH net-next 8/9] bnxt_en: Refactor bnxt_dl_info_get().

2020-10-11 Thread Michael Chan

From: Vasundhara Volam 

Add a new function bnxt_dl_info_put() to simplify the code, as there
are more stored firmware version fields to be added in the next patch.

Also, rename fw_ver variable name to ncsi_ver for better naming while
copying to devlink info_get cb.

Reviewed-by: Pavan Chebbi 
Reviewed-by: Andy Gospodarek 
Signed-off-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c | 112 ++
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.h |   6 +
 2 files changed, 70 insertions(+), 48 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
index d436134bdc40..65e8da1f7a01 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
@@ -382,6 +382,29 @@ static int bnxt_hwrm_get_nvm_cfg_ver(struct bnxt *bp,
return rc;
 }
 
+static int bnxt_dl_info_put(struct bnxt *bp, struct devlink_info_req *req,
+   enum bnxt_dl_version_type type, const char *key,
+   char *buf)
+{
+   if (!strlen(buf))
+   return 0;
+
+   if ((bp->flags & BNXT_FLAG_CHIP_P5) &&
+   (!strcmp(key, DEVLINK_INFO_VERSION_GENERIC_FW_NCSI) ||
+!strcmp(key, DEVLINK_INFO_VERSION_GENERIC_FW_ROCE)))
+   return 0;
+
+   switch (type) {
+   case BNXT_VERSION_FIXED:
+   return devlink_info_version_fixed_put(req, key, buf);
+   case BNXT_VERSION_RUNNING:
+   return devlink_info_version_running_put(req, key, buf);
+   case BNXT_VERSION_STORED:
+   return devlink_info_version_stored_put(req, key, buf);
+   }
+   return 0;
+}
+
 static int bnxt_dl_info_get(struct devlink *dl, struct devlink_info_req *req,
struct netlink_ext_ack *extack)
 {
@@ -390,7 +413,7 @@ static int bnxt_dl_info_get(struct devlink *dl, struct 
devlink_info_req *req,
struct hwrm_ver_get_output *ver_resp;
char mgmt_ver[FW_VER_STR_LEN];
char roce_ver[FW_VER_STR_LEN];
-   char fw_ver[FW_VER_STR_LEN];
+   char ncsi_ver[FW_VER_STR_LEN];
char buf[32];
int rc;
 
@@ -398,10 +421,11 @@ static int bnxt_dl_info_get(struct devlink *dl, struct 
devlink_info_req *req,
if (rc)
return rc;
 
-   if (strlen(bp->board_partno)) {
-   rc = devlink_info_version_fixed_put(req,
-   DEVLINK_INFO_VERSION_GENERIC_BOARD_ID,
-   bp->board_partno);
+   if (BNXT_PF(bp) && (bp->flags & BNXT_FLAG_DSN_VALID)) {
+   sprintf(buf, "%02X-%02X-%02X-%02X-%02X-%02X-%02X-%02X",
+   bp->dsn[7], bp->dsn[6], bp->dsn[5], bp->dsn[4],
+   bp->dsn[3], bp->dsn[2], bp->dsn[1], bp->dsn[0]);
+   rc = devlink_info_serial_number_put(req, buf);
if (rc)
return rc;
}
@@ -412,54 +436,49 @@ static int bnxt_dl_info_get(struct devlink *dl, struct 
devlink_info_req *req,
return rc;
}
 
+   rc = bnxt_dl_info_put(bp, req, BNXT_VERSION_FIXED,
+ DEVLINK_INFO_VERSION_GENERIC_BOARD_ID,
+ bp->board_partno);
+   if (rc)
+   return rc;
+
sprintf(buf, "%X", bp->chip_num);
-   rc = devlink_info_version_fixed_put(req,
-   DEVLINK_INFO_VERSION_GENERIC_ASIC_ID, buf);
+   rc = bnxt_dl_info_put(bp, req, BNXT_VERSION_FIXED,
+ DEVLINK_INFO_VERSION_GENERIC_ASIC_ID, buf);
if (rc)
return rc;
 
ver_resp = &bp->ver_resp;
sprintf(buf, "%X", ver_resp->chip_rev);
-   rc = devlink_info_version_fixed_put(req,
-   DEVLINK_INFO_VERSION_GENERIC_ASIC_REV, buf);
+   rc = bnxt_dl_info_put(bp, req, BNXT_VERSION_FIXED,
+ DEVLINK_INFO_VERSION_GENERIC_ASIC_REV, buf);
if (rc)
return rc;
 
-   if (BNXT_PF(bp)) {
-   sprintf(buf, "%02X-%02X-%02X-%02X-%02X-%02X-%02X-%02X",
-   bp->dsn[7], bp->dsn[6], bp->dsn[5], bp->dsn[4],
-   bp->dsn[3], bp->dsn[2], bp->dsn[1], bp->dsn[0]);
-   rc = devlink_info_serial_number_put(req, buf);
-   if (rc)
-   return rc;
-   }
-
-   if (strlen(ver_resp->active_pkg_name)) {
-   rc =
-   devlink_info_version_running_put(req,
-   DEVLINK_INFO_VERSION_GENERIC_FW,
-   ver_resp->active_pkg_name);
-   if (rc)
-   return rc;
-   }
+   rc = bnxt_dl_info_put(bp, req, BNXT_VERSION_RUNNING,
+ DEVLINK_INFO_VERSION_GENERIC_FW,
+ ver_resp->active_pkg_name);
+

[PATCH net-next 5/9] bnxt_en: Log event_data1 and event_data2 when handling RESET_NOTIFY event.

2020-10-11 Thread Michael Chan

Log these values that contain useful firmware state information.

Reviewed-by: Edwin Peer 
Reviewed-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index f093cdd61426..ca35ec06abf0 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2032,6 +2032,9 @@ static int bnxt_async_event_process(struct bnxt *bp,
set_bit(BNXT_RESET_TASK_SILENT_SP_EVENT, &bp->sp_event);
break;
case ASYNC_EVENT_CMPL_EVENT_ID_RESET_NOTIFY:
+   if (netif_msg_hw(bp))
+   netdev_warn(bp->dev, "Received RESET_NOTIFY event, 
data1: 0x%x, data2: 0x%x\n",
+   data1, data2);
if (!bp->fw_health)
goto async_event_process_exit;
 
-- 
2.18.1



smime.p7s
Description: S/MIME Cryptographic Signature

[PATCH net-next 4/9] bnxt_en: Simplify bnxt_async_event_process().

2020-10-11 Thread Michael Chan

event_data1 and event_data2 are used when processing most events.
Store these in local variables at the beginning of the function to
simplify many of the case statements.

Reviewed-by: Edwin Peer 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 0da6a1138eee..f093cdd61426 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -1980,11 +1980,12 @@ static int bnxt_async_event_process(struct bnxt *bp,
struct hwrm_async_event_cmpl *cmpl)
 {
u16 event_id = le16_to_cpu(cmpl->event_id);
+   u32 data1 = le32_to_cpu(cmpl->event_data1);
+   u32 data2 = le32_to_cpu(cmpl->event_data2);
 
/* TODO CHIMP_FW: Define event id's for link change, error etc */
switch (event_id) {
case ASYNC_EVENT_CMPL_EVENT_ID_LINK_SPEED_CFG_CHANGE: {
-   u32 data1 = le32_to_cpu(cmpl->event_data1);
struct bnxt_link_info *link_info = &bp->link_info;
 
if (BNXT_VF(bp))
@@ -2014,7 +2015,6 @@ static int bnxt_async_event_process(struct bnxt *bp,
set_bit(BNXT_HWRM_PF_UNLOAD_SP_EVENT, &bp->sp_event);
break;
case ASYNC_EVENT_CMPL_EVENT_ID_PORT_CONN_NOT_ALLOWED: {
-   u32 data1 = le32_to_cpu(cmpl->event_data1);
u16 port_id = BNXT_GET_EVENT_PORT(data1);
 
if (BNXT_VF(bp))
@@ -2031,9 +2031,7 @@ static int bnxt_async_event_process(struct bnxt *bp,
goto async_event_process_exit;
set_bit(BNXT_RESET_TASK_SILENT_SP_EVENT, &bp->sp_event);
break;
-   case ASYNC_EVENT_CMPL_EVENT_ID_RESET_NOTIFY: {
-   u32 data1 = le32_to_cpu(cmpl->event_data1);
-
+   case ASYNC_EVENT_CMPL_EVENT_ID_RESET_NOTIFY:
if (!bp->fw_health)
goto async_event_process_exit;
 
@@ -2053,10 +2051,8 @@ static int bnxt_async_event_process(struct bnxt *bp,
}
set_bit(BNXT_FW_RESET_NOTIFY_SP_EVENT, &bp->sp_event);
break;
-   }
case ASYNC_EVENT_CMPL_EVENT_ID_ERROR_RECOVERY: {
struct bnxt_fw_health *fw_health = bp->fw_health;
-   u32 data1 = le32_to_cpu(cmpl->event_data1);
 
if (!fw_health)
goto async_event_process_exit;
@@ -2084,8 +2080,6 @@ static int bnxt_async_event_process(struct bnxt *bp,
goto async_event_process_exit;
}
case ASYNC_EVENT_CMPL_EVENT_ID_RING_MONITOR_MSG: {
-   u32 data1 = le32_to_cpu(cmpl->event_data1);
-   u32 data2 = le32_to_cpu(cmpl->event_data2);
struct bnxt_rx_ring_info *rxr;
u16 grp_idx;
 
-- 
2.18.1



smime.p7s
Description: S/MIME Cryptographic Signature

[PATCH net-next 3/9] bnxt_en: Set driver default message level.

2020-10-11 Thread Michael Chan

Currently, bp->msg_enable has default value of 0.  It is more useful
to have the commonly used NETIF_MSG_DRV and NETIF_MSG_HW enabled by
default.

Reviewed-by: Edwin Peer 
Reviewed-by: Vasundhara Volam 
Signed-off-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c 
b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index d4402a2cd07f..0da6a1138eee 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -69,6 +69,7 @@
 #include "bnxt_debugfs.h"
 
 #define BNXT_TX_TIMEOUT(5 * HZ)
+#define BNXT_DEF_MSG_ENABLE(NETIF_MSG_DRV | NETIF_MSG_HW)
 
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("Broadcom BCM573xx network driver");
@@ -12510,6 +12511,7 @@ static int bnxt_init_one(struct pci_dev *pdev, const 
struct pci_device_id *ent)
return -ENOMEM;
 
bp = netdev_priv(dev);
+   bp->msg_enable = BNXT_DEF_MSG_ENABLE;
bnxt_set_max_func_irqs(bp, max_irqs);
 
if (bnxt_vf_pciid(ent->driver_data))
-- 
2.18.1



smime.p7s
Description: S/MIME Cryptographic Signature

Re: [PATCH v2 0/3] [PATCH v2 0/3] [PATCH v2 0/3] net, mac80211, kernel: enable KCOV remote coverage collection for 802.11 frame handling

2020-10-11 Thread Johannes Berg

On 11 October 2020 12:37:29 CEST, Andrey Konovalov  
wrote:
>I initially hesitated to do that, as it would multiply the number of
>kcov callbacks. But perhaps you're right and a clean API look
>outweighs the rest. I will do this in v3.

Yeah, OK, dunno. You can always make it an inline calling the "full" API so 
after compiling it's equivalent. But if course that still has the two APIs. It 
just seemed to the common case wouldn't worry really (have to) about these 
things, especially if you plan on changing it again later. 

johannes
-- 
Sent from my phone.

[PATCH 5/5] net/tls: use semicolons rather than commas to separate statements

2020-10-11 Thread Julia Lawall

Replace commas with semicolons.  Commas introduce unnecessary
variability in the code structure and are hard to see.  What is done
is essentially described by the following Coccinelle semantic patch
(http://coccinelle.lip6.fr/):

// 
@@ expression e1,e2; @@
e1
-,
+;
e2
... when any
// 

Signed-off-by: Julia Lawall 

---
 net/tls/tls_main.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 002b0859fed5..8d93cea99f2c 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -869,7 +869,7 @@ static int __init tls_register(void)
 
tls_sw_proto_ops = inet_stream_ops;
tls_sw_proto_ops.splice_read = tls_sw_splice_read;
-   tls_sw_proto_ops.sendpage_locked   = tls_sw_sendpage_locked,
+   tls_sw_proto_ops.sendpage_locked   = tls_sw_sendpage_locked;
 
tls_device_init();
tcp_register_ulp(&tcp_tls_ulp_ops);

[PATCH 0/5] net: use semicolons rather than commas to separate statements

2020-10-11 Thread Julia Lawall

These patches replace commas by semicolons.  Commas introduce
unnecessary variability in the code structure and are hard to see.
This was done using the Coccinelle semantic patch
(http://coccinelle.lip6.fr/) shown below.

This semantic patch ensures that commas inside for loop headers will
not be transformed.  It also doesn't touch macro definitions.

Coccinelle ensures that braces are added as needed when a
single-statement branch turns into a multi-statement one.

This semantic patch has a few false positives, for variable
delcarations such as:

LIST_HEAD(x), *y;

The semantic patch could be improved to avoid these, but for the
moment they have been removed manually (2 occurrences).

// 
@initialize:ocaml@
@@

let infunction p =
  (* avoid macros *)
  (List.hd p).current_element <> "something_else"

let combined p1 p2 =
  (List.hd p1).line_end = (List.hd p2).line ||
  (((List.hd p1).line_end < (List.hd p2).line) &&
   ((List.hd p1).col < (List.hd p2).col))

@bad@
statement S;
declaration d;
position p;
@@

S@p
d

// special cases where newlines are needed (hope for no more than 5)
@@
expression e1,e2;
statement S;
position p != bad.p;
position p1;
position p2 :
script:ocaml(p1) { infunction p1 && combined p1 p2 };
@@

- e1@p1,@S@p e2@p2;
+ e1; e2;

@@
expression e1,e2;
statement S;
position p != bad.p;
position p1;
position p2 :
script:ocaml(p1) { infunction p1 && combined p1 p2 };
@@

- e1@p1,@S@p e2@p2;
+ e1; e2;

@@
expression e1,e2;
statement S;
position p != bad.p;
position p1;
position p2 :
script:ocaml(p1) { infunction p1 && combined p1 p2 };
@@

- e1@p1,@S@p e2@p2;
+ e1; e2;

@@
expression e1,e2;
statement S;
position p != bad.p;
position p1;
position p2 :
script:ocaml(p1) { infunction p1 && combined p1 p2 };
@@

- e1@p1,@S@p e2@p2;
+ e1; e2;

@@
expression e1,e2;
statement S;
position p != bad.p;
position p1;
position p2 :
script:ocaml(p1) { infunction p1 && combined p1 p2 };
@@

- e1@p1,@S@p e2@p2;
+ e1; e2;

@r@
expression e1,e2;
statement S;
position p != bad.p;
@@

e1 ,@S@p e2;

@@
expression e1,e2;
position p1;
position p2 :
script:ocaml(p1) { infunction p1 && not(combined p1 p2) };
statement S;
position r.p;
@@

e1@p1
-,@S@p
+;
e2@p2
... when any
// 

---

 net/ipv4/tcp_input.c   |3 ++-
 net/ipv4/tcp_vegas.c   |8 
 net/ipv6/calipso.c |2 +-
 net/mac80211/debugfs_sta.c |2 +-
 net/rxrpc/recvmsg.c|2 +-
 net/tls/tls_main.c |2 +-
 6 files changed, 10 insertions(+), 9 deletions(-)

[PATCH 1/5] rxrpc: use semicolons rather than commas to separate statements

2020-10-11 Thread Julia Lawall

Replace commas with semicolons.  Commas introduce unnecessary
variability in the code structure and are hard to see.  What is done
is essentially described by the following Coccinelle semantic patch
(http://coccinelle.lip6.fr/):

// 
@@ expression e1,e2; @@
e1
-,
+;
e2
... when any
// 

Signed-off-by: Julia Lawall 

---
 net/rxrpc/recvmsg.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c
index c4684dde1f16..acc4660cfa5b 100644
--- a/net/rxrpc/recvmsg.c
+++ b/net/rxrpc/recvmsg.c
@@ -69,7 +69,7 @@ bool __rxrpc_set_call_completion(struct rxrpc_call *call,
if (call->state < RXRPC_CALL_COMPLETE) {
call->abort_code = abort_code;
call->error = error;
-   call->completion = compl,
+   call->completion = compl;
call->state = RXRPC_CALL_COMPLETE;
trace_rxrpc_call_complete(call);
wake_up(&call->waitq);

[PATCH 2/5] mac80211: use semicolons rather than commas to separate statements

2020-10-11 Thread Julia Lawall

Replace commas with semicolons.  Commas introduce unnecessary
variability in the code structure and are hard to see.  What is done
is essentially described by the following Coccinelle semantic patch
(http://coccinelle.lip6.fr/):

// 
@@ expression e1,e2; @@
e1
-,
+;
e2
... when any
// 

Signed-off-by: Julia Lawall 

---
 net/mac80211/debugfs_sta.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac80211/debugfs_sta.c b/net/mac80211/debugfs_sta.c
index 829dcad69c2c..6a51b8b58f9e 100644
--- a/net/mac80211/debugfs_sta.c
+++ b/net/mac80211/debugfs_sta.c
@@ -274,7 +274,7 @@ static ssize_t sta_aql_read(struct file *file, char __user 
*userbuf,
"Q limit[low/high]: VO: %u/%u VI: %u/%u BE: %u/%u BK: %u/%u\n",
q_depth[0], q_depth[1], q_depth[2], q_depth[3],
q_limit_l[0], q_limit_h[0], q_limit_l[1], q_limit_h[1],
-   q_limit_l[2], q_limit_h[2], q_limit_l[3], q_limit_h[3]),
+   q_limit_l[2], q_limit_h[2], q_limit_l[3], q_limit_h[3]);
 
rv = simple_read_from_buffer(userbuf, count, ppos, buf, p - buf);
kfree(buf);

[PATCH 3/5] tcp: use semicolons rather than commas to separate statements

2020-10-11 Thread Julia Lawall

Replace commas with semicolons.  Commas introduce unnecessary
variability in the code structure and are hard to see.  What is done
is essentially described by the following Coccinelle semantic patch
(http://coccinelle.lip6.fr/):

// 
@@ expression e1,e2; @@
e1
-,
+;
e2
... when any
// 

Signed-off-by: Julia Lawall 

---
 net/ipv4/tcp_input.c |3 ++-
 net/ipv4/tcp_vegas.c |8 
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 02d0e2fb77c0..7ff9be52c061 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4418,7 +4418,8 @@ static void tcp_sack_maybe_coalesce(struct tcp_sock *tp)
sp[i] = sp[i + 1];
continue;
}
-   this_sack++, swalk++;
+   this_sack++;
+   swalk++;
}
 }
 
diff --git a/net/ipv4/tcp_vegas.c b/net/ipv4/tcp_vegas.c
index 3f51e781562a..c8003c8aad2c 100644
--- a/net/ipv4/tcp_vegas.c
+++ b/net/ipv4/tcp_vegas.c
@@ -293,10 +293,10 @@ size_t tcp_vegas_get_info(struct sock *sk, u32 ext, int 
*attr,
const struct vegas *ca = inet_csk_ca(sk);
 
if (ext & (1 << (INET_DIAG_VEGASINFO - 1))) {
-   info->vegas.tcpv_enabled = ca->doing_vegas_now,
-   info->vegas.tcpv_rttcnt = ca->cntRTT,
-   info->vegas.tcpv_rtt = ca->baseRTT,
-   info->vegas.tcpv_minrtt = ca->minRTT,
+   info->vegas.tcpv_enabled = ca->doing_vegas_now;
+   info->vegas.tcpv_rttcnt = ca->cntRTT;
+   info->vegas.tcpv_rtt = ca->baseRTT;
+   info->vegas.tcpv_minrtt = ca->minRTT;
 
*attr = INET_DIAG_VEGASINFO;
return sizeof(struct tcpvegas_info);

[PATCH 4/5] net/ipv6: use semicolons rather than commas to separate statements

2020-10-11 Thread Julia Lawall

Replace commas with semicolons.  Commas introduce unnecessary
variability in the code structure and are hard to see.  What is done
is essentially described by the following Coccinelle semantic patch
(http://coccinelle.lip6.fr/):

// 
@@ expression e1,e2; @@
e1
-,
+;
e2
... when any
// 

Signed-off-by: Julia Lawall 

---
 net/ipv6/calipso.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/calipso.c b/net/ipv6/calipso.c
index 8d3f66c310db..78f766019b7e 100644
--- a/net/ipv6/calipso.c
+++ b/net/ipv6/calipso.c
@@ -761,7 +761,7 @@ static int calipso_genopt(unsigned char *buf, u32 start, 
u32 buf_len,
calipso[1] = len - 2;
*(__be32 *)(calipso + 2) = htonl(doi_def->doi);
calipso[6] = (len - CALIPSO_HDR_LEN) / 4;
-   calipso[7] = secattr->attr.mls.lvl,
+   calipso[7] = secattr->attr.mls.lvl;
crc = ~crc_ccitt(0x, calipso, len);
calipso[8] = crc & 0xff;
calipso[9] = (crc >> 8) & 0xff;

[PATCH] net: korina: free array used for rx/tx descriptors

2020-10-11 Thread Valentin Vidic

Memory was not freed when driver is unloaded from the kernel.

Signed-off-by: Valentin Vidic 
---
 drivers/net/ethernet/korina.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
index 03e034918d14..99146145f020 100644
--- a/drivers/net/ethernet/korina.c
+++ b/drivers/net/ethernet/korina.c
@@ -1133,6 +1133,7 @@ static int korina_remove(struct platform_device *pdev)
iounmap(lp->eth_regs);
iounmap(lp->rx_dma_regs);
iounmap(lp->tx_dma_regs);
+   kfree(lp->td_ring);
 
unregister_netdev(bif->dev);
free_netdev(bif->dev);
-- 
2.20.1

Re: [PATCH net-next v6 2/7] net: dsa: Add DSA driver for Hirschmann Hellcreek switches

2020-10-11 Thread Kurt Kanzenbach

On Tue Oct 06 2020, Vladimir Oltean wrote:
> On Tue, Oct 06, 2020 at 03:23:36PM +0200, Kurt Kanzenbach wrote:
>> So you're saying private VLANs can be used but the user or the other
>> kernel modules shouldn't be allowed to use them to simplify the
>> implementation?  Makes sense to me.
>
> It would be interesting to see if you could simply turn off VLAN
> awareness in standalone mode, and still use unique pvids per port.

That doesn't work, just tested. When VLAN awareness is disabled,
everything is switched regardless of VLAN tags and table. Therefore, the
implementation could look like this:

 * bridge without filtering:
   * vlan_awareness=0
   * drop private vlans
 * bridge with vlan filtering:
   * vlan_awareness=1
   * drop private vlans
 * standalone:
   * vlan_awareness=1
   * use private vlans
   * forbid other users to use private vlans to allow
 configure_vlans_while_not_filtering behavior in .vlan_prepare()
   * forbid use of lan0. and lan1. in .port_prechangeupper()

So, this should work, or?

Thanks,
Kurt

signature.asc
Description: PGP signature

Re: [PATCH net-next v2 1/2] Makefile.extrawarn: Add symbol for W=1 warnings for today

2020-10-11 Thread Masahiro Yamada

On Tue, Oct 6, 2020 at 6:08 AM Andrew Lunn  wrote:
>
> > It depends a lot on what portion of the kernel gets enabled for W=1.
> >
> > As long as it's only drivers that are actively maintained, and they
> > make up a fairly small portion of all code, it should not be a problem
> > to find someone to fix useful warnings.
>
> Well, drivers/net/ethernet is around 1.5M LOC. The tree as a whole is
> just short of 23M LOC. So i guess that is a small portion of all the
> code.
>
> Andrew

I am not a big fan of KBUILD_CFLAGS_W1_
since it is ugly.

I'd like to start with adding individual flags
like drivers/gpu/drm/i915/Makefile, and see
how difficult it would be to maintain it.

One drawback of your approach is that
you cannot set KBUILD_CFLAGS_W1_20200930
until you eliminate all the warnings in the
sub-directory in interest.
(i.e. all or nothing approach)

At best, you can only work out from 'old -> new' order
because KBUILD_CFLAGS_W1_20200326 is a suer-set of
KBUILD_CFLAGS_W1_20190907, which is a suer-set of
KBUILD_CFLAGS_W1_20190617 ...

If you add flags individually, you can start with
low-hanging fruits, or ones with higher priority
as Arnd mentions about -Wmissing-{declaration,prototypes}.

For example, you might be able to set
'subdir-ccflags-y += -Wmissing-declarations'
to drivers/net/Makefile, while
'subdir-ccflags-y += -Wunused-but-set-variable'
stays in drivers/net/ethernet/Makefile.

--
Best Regards
Masahiro Yamada

Re: [PATCH] net: stmmac: Don't call _irqoff() with hardirqs enabled

2020-10-11 Thread Heiner Kallweit

On 10.10.2020 17:22, Jakub Kicinski wrote:
> On Sat, 10 Oct 2020 15:08:15 +0200 Heiner Kallweit wrote:
>> On 09.10.2020 18:06, Heiner Kallweit wrote:
>>> On 09.10.2020 17:58, Jakub Kicinski wrote:  
 On Fri, 9 Oct 2020 16:54:06 +0200 Heiner Kallweit wrote:  
> I'm thinking about a __napi_schedule version that disables hard irq's
> conditionally, based on variable force_irqthreads, exported by the irq
> subsystem. This would allow to behave correctly with threadirqs set,
> whilst not loosing the _irqoff benefit with threadirqs unset.
> Let me come up with a proposal.  

 I think you'd need to make napi_schedule_irqoff() behave like that,
 right?  Are there any uses of napi_schedule_irqoff() that are disabling
 irqs and not just running from an irq handler?
  
>>> Right, the best approach depends on the answer to the latter question.
>>> I didn't check this yet, therefore I described the least intrusive approach.
>>>   
>>
>> With some help from coccinelle I identified the following functions that
>> call napi_schedule_irqoff() or __napi_schedule_irqoff() and do not run
>> from an irq handler (at least not at the first glance).
>>
>> dpaa2_caam_fqdan_cb
>> qede_simd_fp_handler
>> mlx4_en_rx_irq
>> mlx4_en_tx_irq
> 
> Don't know the others but FWIW the mlx4 ones run from an IRQ handler,
> AFAICT:
> 
> static irqreturn_t mlx4_interrupt(int irq, void *dev_ptr)
> static irqreturn_t mlx4_msi_x_interrupt(int irq, void *eq_ptr)
>   mlx4_eq_int()
> mlx4_cq_completion
>   cq->comp()
> 
>> qeth_qdio_poll
>> netvsc_channel_cb
>> napi_watchdog
> 
> This one runs from a hrtimer, which I believe will be a hard irq
> context on anything but RT. I could be wrong.
> 

Typically forced irq threading will not be enabled, therefore going
back to use napi_schedule() in drivers in most cases will cause
losing the benefit of the irqoff version. Something like the following
should be better. Only small drawback I see is that in case of forced
irq threading hrtimers will still run in hardirq context and we lose
the benefit of the irqoff version in napi_watchdog().

diff --git a/net/core/dev.c b/net/core/dev.c
index a146bac84..7d18560b2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6393,7 +6393,11 @@ EXPORT_SYMBOL(napi_schedule_prep);
  */
 void __napi_schedule_irqoff(struct napi_struct *n)
 {
-   napi_schedule(this_cpu_ptr(&softnet_data), n);
+   /* hard irqs may not be masked in case of forced irq threading */
+   if (force_irqthreads)
+   __napi_schedule(n);
+   else
+   napi_schedule(this_cpu_ptr(&softnet_data), n);
 }
 EXPORT_SYMBOL(__napi_schedule_irqoff);
 
-- 
2.28.0

Re: [PATCH] mm: proc: add Sock to /proc/meminfo

2020-10-11 Thread Mike Rapoport

On Sat, Oct 10, 2020 at 06:38:54PM +0800, Muchun Song wrote:
> The amount of memory allocated to sockets buffer can become significant.
> However, we do not display the amount of memory consumed by sockets
> buffer. In this case, knowing where the memory is consumed by the kernel
> is very difficult. On our server with 500GB RAM, sometimes we can see
> 25GB disappear through /proc/meminfo. After our analysis, we found the
> following memory allocation path which consumes the memory with page_owner
> enabled.
 
I have a high lelel question.
There is accounting of the socket memory for memcg that gets called from
the networking layer. Did you check if the same call sites can be used
for the system-wide accounting as well?

>   849698 times:
>   Page allocated via order 3, mask 
> 0x4052c0(GFP_NOWAIT|__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP)
>__alloc_pages_nodemask+0x11d/0x290
>skb_page_frag_refill+0x68/0xf0
>sk_page_frag_refill+0x19/0x70
>tcp_sendmsg_locked+0x2f4/0xd10
>tcp_sendmsg+0x29/0xa0
>sock_sendmsg+0x30/0x40
>sock_write_iter+0x8f/0x100
>__vfs_write+0x10b/0x190
>vfs_write+0xb0/0x190
>ksys_write+0x5a/0xd0
>do_syscall_64+0x5d/0x110
>entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Signed-off-by: Muchun Song 
> ---
>  drivers/base/node.c  |  2 ++
>  drivers/net/virtio_net.c |  3 +--

Is virtio-net the only dirver that requred an update?

>  fs/proc/meminfo.c|  1 +
>  include/linux/mmzone.h   |  1 +
>  include/linux/skbuff.h   | 43 ++--
>  kernel/exit.c|  3 +--
>  mm/page_alloc.c  |  7 +--
>  mm/vmstat.c  |  1 +
>  net/core/sock.c  |  8 
>  net/ipv4/tcp.c   |  3 +--
>  net/xfrm/xfrm_state.c|  3 +--
>  11 files changed, 59 insertions(+), 16 deletions(-)
>

Re: [Patch net] ip_gre: set dev->hard_header_len properly

2020-10-11 Thread Xie He

On Sat, Oct 10, 2020 at 11:58 AM Cong Wang  wrote:
>
> Well, AF_PACKET RAW socket is supposed to construct L2 headers
> from the user buffer, and for a tunnel device these headers are indeed its
> L2. If users do not want to do this, they can switch to DGRAM anyway.
>
> I know how inconvenient it is to construct a GRE tunnel header, I guess
> this is why all other tunnel devices do not provide a header_ops::create.
> GRE tunnel is not a special case, this is why I agree on removing its
> ->create() although it does look like all tunnel devices should let users
> construct headers by definition of RAW.
>
> Of course, it may be too late to change this behavior even if we really
> want, users may already assume there is always no tunnel header needed
> to construct.

Sorry. I was re-thinking about whether we should expose certain
headers in header_ops or not. Now my thoughts are a little bit
different, so I want to re-reply to your email.

1.

The first reason why I think we should not expose the GRE header in
header_ops, is that in this way we can keep the un-exposed headers of
GRE devices consistent with those of GRETAP devices. (This is just
like the consistency between "universal TUN devices" and "universal
TAP devices".) This is similar to what I wrote in my previous replies.

However, after thinking I think this only applies to GRE/GRETAP but
not to some other tunnel devices. For example, for PPP, we can't
actually transmit Ethernet frames directly over PPP (but need to
encapsulate them in BCP over PPP), so there's no way to keep the
un-exposed headers of PPP devices consistent with those of "PPP TAP"
(PPP bridging) devices anyway.

2.

Second, I think the headers before the GRE header (the outer IP header
and the "encap" header) should actually belong to the underlying
network being tunneled through, and should not be viewed as L2 headers
of the tunneling network. As for the GRE header, it might be better
viewed as a "bridge" between the tunneled network and the tunneling
network, and belongs neither of the two, so it should not be viewed as
the L2 header of the tunneling network either.

However, for PPP tunnels, I think the PPP header clearly belongs to
the tunneling network's L2 header (because PPP can run directly over a
hardware link, and even if it is used for a tunnel, it needs a
specific underlying network's protocol, such as SSH or PPPoE, to
transport). So I think it is better to expose the PPP header in
header_ops.

Dear Friend

2020-10-11 Thread mrs chantal

  Hello Friend.I am Mrs.CHANTAL I am sending this brief
letter to solicit your partnership to transfer $7.2 Million US
Dollars.I shall send you more information and procedures when I receive
positive response From you. Please send me a message in My private
email address is ( mrschant...@gmail.com )

Best Regards

Mrs.Chantal

Re: [PATCH net-next v6 2/7] net: dsa: Add DSA driver for Hirschmann Hellcreek switches

2020-10-11 Thread Vladimir Oltean

On Sun, Oct 11, 2020 at 02:29:08PM +0200, Kurt Kanzenbach wrote:
> On Tue Oct 06 2020, Vladimir Oltean wrote:
> > It would be interesting to see if you could simply turn off VLAN
> > awareness in standalone mode, and still use unique pvids per port.
>
> That doesn't work, just tested. When VLAN awareness is disabled,
> everything is switched regardless of VLAN tags and table.

That's strange, do you happen to know where things are going wrong?
I would expect:
- port VLAN awareness is disabled, so any packet is classified to the
  port-based VLAN
- the port-based VLAN is a private VLAN whose membership includes only
  that port, plus the CPU port
- the switch does not forward packets towards a port that is not member
  of the packets' classified VLAN
When VLAN awareness is disabled, are you able to cause packet drops by
deleting the pvid of the ingress port? Therefore, can you confirm that
lan1 is not a member of lan0's pvid, but the switch still forwards the
packets to it?

> Therefore, the implementation could look like this:
>
>  * bridge without filtering:
>* vlan_awareness=0
>* drop private vlans
>  * bridge with vlan filtering:
>* vlan_awareness=1
>* drop private vlans
>  * standalone:
>* vlan_awareness=1
>* use private vlans
>* forbid other users to use private vlans to allow
>  configure_vlans_while_not_filtering behavior in .vlan_prepare()
>* forbid use of lan0. and lan1. in .port_prechangeupper()
>
> So, this should work, or?

Yes, this is an alternative that could work.

[PATCH] net/af_unix: Remove unused old_pid variable

2020-10-11 Thread Or Cohen

Commit 109f6e39fa07c48f5801 ("af_unix: Allow SO_PEERCRED
to work across namespaces.") introduced the old_pid variable
in unix_listen, but it's never used.
Remove the declaration and the call to put_pid.

Signed-off-by: Or Cohen 
---
 net/unix/af_unix.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 3385a7a0b231..26d3bf81186f 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -613,7 +613,6 @@ static int unix_listen(struct socket *sock, int backlog)
int err;
struct sock *sk = sock->sk;
struct unix_sock *u = unix_sk(sk);
-   struct pid *old_pid = NULL;
 
err = -EOPNOTSUPP;
if (sock->type != SOCK_STREAM && sock->type != SOCK_SEQPACKET)
@@ -634,7 +633,6 @@ static int unix_listen(struct socket *sock, int backlog)
 
 out_unlock:
unix_state_unlock(sk);
-   put_pid(old_pid);
 out:
return err;
 }
-- 
2.17.1

Re: [PATCH net-next v2 1/2] can-isotp: implement cleanups / improvements from review

2020-10-11 Thread Jakub Kicinski

On Sun, 11 Oct 2020 11:24:07 +0200 Oliver Hartkopp wrote:
> diff --git a/net/can/isotp.c b/net/can/isotp.c
> index e6ff032b5426..22187669c5c9 100644
> --- a/net/can/isotp.c
> +++ b/net/can/isotp.c
> @@ -79,6 +79,8 @@ MODULE_LICENSE("Dual BSD/GPL");
>  MODULE_AUTHOR("Oliver Hartkopp ");
>  MODULE_ALIAS("can-proto-6");
>  
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

You need to move this before the includes:

net/can/isotp.c:82: warning: "pr_fmt" redefined
   82 | #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
  | 
In file included from ../include/linux/kernel.h:15,
 from ../include/linux/list.h:9,
 from ../include/linux/module.h:12,
 from ../net/can/isotp.c:56:
include/linux/printk.h:297: note: this is the location of the previous 
definition
  297 | #define pr_fmt(fmt) fmt
  | 
net/can/isotp.c:82:9: warning: preprocessor token pr_fmt redefined
net/can/isotp.c: note: in included file (through ../include/linux/kernel.h, 
../include/linux/list.h, ../include/linux/module.h):
include/linux/printk.h:297:9: this was the original definition

Re: [PATCH net-next v2 1/2] can-isotp: implement cleanups / improvements from review

2020-10-11 Thread Jakub Kicinski

On Sun, 11 Oct 2020 11:24:07 +0200 Oliver Hartkopp wrote:
> @@ -769,7 +771,7 @@ static enum hrtimer_restart isotp_tx_timer_handler(struct 
> hrtimer *hrtimer)
>  
>  isotp_tx_burst:
>   skb = alloc_skb(so->ll.mtu + sizeof(struct can_skb_priv),
> - gfp_any());
> + GFP_KERNEL);

hrtimer will need GFP_ATOMIC

Re: [PATCH net-next 1/2] dt-bindings: net: dsa: b53: Add YAML bindings

2020-10-11 Thread Florian Fainelli





On 10/11/2020 1:32 AM, Kurt Kanzenbach wrote:

On Sat Oct 10 2020, Florian Fainelli wrote:

On 10/10/2020 9:46 AM, Kurt Kanzenbach wrote:

Convert the b53 DSA device tree bindings to YAML in order to allow
for automatic checking and such.

Suggested-by: Florian Fainelli 
Signed-off-by: Kurt Kanzenbach 


Thanks for making this change, there are quite a few warnings that are
going to show up because the binding was defined in a way that it would
define chip compatible strings, which not all DTS files are using.


Oh, I didn't know there is a second make command for doing the actual
check against the dtbs. I've just used `make dt_binding_check'.

So, it seems like a lot of the errors are caused by the include files
such as

[linux]/arch/arm/boot/dts/bcm5301x.dtsi

srab: srab@18007000 {
compatible = "brcm,bcm5301x-srab";
reg = <0x18007000 0x1000>;

status = "disabled";

/* ports are defined in board DTS */
};

The nodename should be "switch" not "srab" as enforced by
dsa.yaml. Furthermore, some DTS files are not adding the chip specific
compatible strings and the ports leading to more errors.

There are also some minor errors regarding the reg-names and such for
specific instances.

How should we proceed? Adding the missing compatible strings and ports
to the DTS files? Or adjusting the include files?


The include is correct as it provides the fallback family string which 
is what the driver will be looking for unless we do not provide a chip 
compatible. The various DTS should be updated to contain both the chip 
compatible and the fallback family (brcm,bcm5301x-srab) string, I will 
update the various DTS and submit these for review later next week.


Then we could imagine me taking this YAML change through the Broadcom 
ARM SoC pull requests that way no new regressions are introduced.


Sounds good?
--
Florian

[PATCH] rtw88: fix fw_fifo_addr check

2020-10-11 Thread trix

From: Tom Rix 

The clang build reports this warning

fw.c:1485:21: warning: address of array 'rtwdev->chip->fw_fifo_addr'
  will always evaluate to 'true'
if (!rtwdev->chip->fw_fifo_addr) {

fw_fifo_addr is an array in rtw_chip_info so it is always
nonzero.  A better check is if the first element of the array is
nonzero.  In the cases where fw_fifo_addr is initialized by rtw88b
and rtw88c, the first array element is 0x780.

Signed-off-by: Tom Rix 
---
 drivers/net/wireless/realtek/rtw88/fw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtw88/fw.c 
b/drivers/net/wireless/realtek/rtw88/fw.c
index 042015bc8055..b2fd87834f23 100644
--- a/drivers/net/wireless/realtek/rtw88/fw.c
+++ b/drivers/net/wireless/realtek/rtw88/fw.c
@@ -1482,7 +1482,7 @@ static bool rtw_fw_dump_check_size(struct rtw_dev *rtwdev,
 int rtw_fw_dump_fifo(struct rtw_dev *rtwdev, u8 fifo_sel, u32 addr, u32 size,
 u32 *buffer)
 {
-   if (!rtwdev->chip->fw_fifo_addr) {
+   if (!rtwdev->chip->fw_fifo_addr[0]) {
rtw_dbg(rtwdev, RTW_DBG_FW, "chip not support dump fw fifo\n");
return -ENOTSUPP;
}
-- 
2.18.1

Re: [PATCH] net: stmmac: Don't call _irqoff() with hardirqs enabled

2020-10-11 Thread Jakub Kicinski

On Sun, 11 Oct 2020 15:42:24 +0200 Heiner Kallweit wrote:
> On 10.10.2020 17:22, Jakub Kicinski wrote:
> > On Sat, 10 Oct 2020 15:08:15 +0200 Heiner Kallweit wrote:  
> >> On 09.10.2020 18:06, Heiner Kallweit wrote:  
> >>> On 09.10.2020 17:58, Jakub Kicinski wrote:
>  On Fri, 9 Oct 2020 16:54:06 +0200 Heiner Kallweit wrote:
> > I'm thinking about a __napi_schedule version that disables hard irq's
> > conditionally, based on variable force_irqthreads, exported by the irq
> > subsystem. This would allow to behave correctly with threadirqs set,
> > whilst not loosing the _irqoff benefit with threadirqs unset.
> > Let me come up with a proposal.
> 
>  I think you'd need to make napi_schedule_irqoff() behave like that,
>  right?  Are there any uses of napi_schedule_irqoff() that are disabling
>  irqs and not just running from an irq handler?
> 
> >>> Right, the best approach depends on the answer to the latter question.
> >>> I didn't check this yet, therefore I described the least intrusive 
> >>> approach.
> >>> 
> >>
> >> With some help from coccinelle I identified the following functions that
> >> call napi_schedule_irqoff() or __napi_schedule_irqoff() and do not run
> >> from an irq handler (at least not at the first glance).
> >>
> >> dpaa2_caam_fqdan_cb
> >> qede_simd_fp_handler
> >> mlx4_en_rx_irq
> >> mlx4_en_tx_irq  
> > 
> > Don't know the others but FWIW the mlx4 ones run from an IRQ handler,
> > AFAICT:
> > 
> > static irqreturn_t mlx4_interrupt(int irq, void *dev_ptr)
> > static irqreturn_t mlx4_msi_x_interrupt(int irq, void *eq_ptr)
> >   mlx4_eq_int()
> > mlx4_cq_completion
> >   cq->comp()
> >   
> >> qeth_qdio_poll
> >> netvsc_channel_cb
> >> napi_watchdog  
> > 
> > This one runs from a hrtimer, which I believe will be a hard irq
> > context on anything but RT. I could be wrong.
> >   
> 
> Typically forced irq threading will not be enabled, therefore going
> back to use napi_schedule() in drivers in most cases will cause
> losing the benefit of the irqoff version. Something like the following
> should be better. Only small drawback I see is that in case of forced
> irq threading hrtimers will still run in hardirq context and we lose
> the benefit of the irqoff version in napi_watchdog().
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index a146bac84..7d18560b2 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6393,7 +6393,11 @@ EXPORT_SYMBOL(napi_schedule_prep);
>   */
>  void __napi_schedule_irqoff(struct napi_struct *n)
>  {
> - napi_schedule(this_cpu_ptr(&softnet_data), n);
> + /* hard irqs may not be masked in case of forced irq threading */
> + if (force_irqthreads)
> + __napi_schedule(n);
> + else
> + napi_schedule(this_cpu_ptr(&softnet_data), n);
>  }
>  EXPORT_SYMBOL(__napi_schedule_irqoff);

Does

if (force_irqthreads)
local_irq_save(flags);
napi_schedule(this_cpu_ptr(&softnet_data), n);
if (force_irqthreads)
local_irq_restore(flags);

not produce more concise assembly?

Re: [External] Re: [PATCH] mm: proc: add Sock to /proc/meminfo

2020-10-11 Thread Muchun Song

On Sun, Oct 11, 2020 at 9:53 PM Mike Rapoport  wrote:
>
> On Sat, Oct 10, 2020 at 06:38:54PM +0800, Muchun Song wrote:
> > The amount of memory allocated to sockets buffer can become significant.
> > However, we do not display the amount of memory consumed by sockets
> > buffer. In this case, knowing where the memory is consumed by the kernel
> > is very difficult. On our server with 500GB RAM, sometimes we can see
> > 25GB disappear through /proc/meminfo. After our analysis, we found the
> > following memory allocation path which consumes the memory with page_owner
> > enabled.
>
> I have a high lelel question.
> There is accounting of the socket memory for memcg that gets called from
> the networking layer. Did you check if the same call sites can be used
> for the system-wide accounting as well?

I also think about this. But we did not pass the `struct page` parameter to
the sock accounting memcg API. So we did not know the NUMA node
which allocated the socket buffer memory and cannot do node-level
statistics. In addition, there is another problem. If the user sends a 4096-byte
message, we only charge one page to the memcg but the system allocates 8
pages. So if we reuse the same call sites for the system-wide accounting,
the statistical count we get is always smaller than the actual situation.

>
> >   849698 times:
> >   Page allocated via order 3, mask 
> > 0x4052c0(GFP_NOWAIT|__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP)
> >__alloc_pages_nodemask+0x11d/0x290
> >skb_page_frag_refill+0x68/0xf0
> >sk_page_frag_refill+0x19/0x70
> >tcp_sendmsg_locked+0x2f4/0xd10
> >tcp_sendmsg+0x29/0xa0
> >sock_sendmsg+0x30/0x40
> >sock_write_iter+0x8f/0x100
> >__vfs_write+0x10b/0x190
> >vfs_write+0xb0/0x190
> >ksys_write+0x5a/0xd0
> >do_syscall_64+0x5d/0x110
> >entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > Signed-off-by: Muchun Song 
> > ---
> >  drivers/base/node.c  |  2 ++
> >  drivers/net/virtio_net.c |  3 +--
>
> Is virtio-net the only dirver that requred an update?

Yeah, only virtio-net needs an update. Because only it uses the
skb_page_frag_refill() API.

>
> >  fs/proc/meminfo.c|  1 +
> >  include/linux/mmzone.h   |  1 +
> >  include/linux/skbuff.h   | 43 ++--
> >  kernel/exit.c|  3 +--
> >  mm/page_alloc.c  |  7 +--
> >  mm/vmstat.c  |  1 +
> >  net/core/sock.c  |  8 
> >  net/ipv4/tcp.c   |  3 +--
> >  net/xfrm/xfrm_state.c|  3 +--
> >  11 files changed, 59 insertions(+), 16 deletions(-)
> >



-- 
Yours,
Muchun

Re: [PATCH] net: stmmac: Don't call _irqoff() with hardirqs enabled

2020-10-11 Thread Jakub Kicinski

On Sun, 11 Oct 2020 11:24:41 +0200 Heiner Kallweit wrote:
> >> qeth_qdio_poll
> >> netvsc_channel_cb
> >> napi_watchdog  
> > 
> > This one runs from a hrtimer, which I believe will be a hard irq
> > context on anything but RT. I could be wrong.
> >   
> 
> A similar discussion can be found e.g. here:
> https://lore.kernel.org/netdev/20191126222013.1904785-1-bige...@linutronix.de/
> However I don't see any actual outcome.

Interesting, hopefully Eric will chime in. I think the hrtimer issue
was solved. But I'm not actually seeing a lockdep_assert_irqs_disabled()
in __raise_softirq_irqoff() in net, so IDK what that's for?

In any case if NAPI thinks it has irqs off while they're not, and
interacts with other parts of the kernel we may be in for a game of
whack-a-mole. 

Perhaps a way around touching force_irqthreads directly in net/ would 
be some form of a helper like "theaded_local_irq_save" or such that'd
disable IRQs only if force_irqthreads == 1? Is that cheating? :)

Re: [PATCH net-next v2] ne2k: Enable RW-Bugfix

2020-10-11 Thread Jakub Kicinski

On Sat, 10 Oct 2020 18:17:11 +0200 Armin Wolf wrote:
> Correct a typo in ne.c and ne2k-pci.c which
> prevented activation of the RW-Bugfix.
> Also enable the RW-Bugfix by default since
> not doing so could (according to the Datasheet)
> cause the system to lock up with some chips.
> 
> Signed-off-by: Armin Wolf 
> ---
> v2 changes:
> - change NE8390_RW_BUGFIX to NE_RW_BUGFIX

I'm not applying this, sorry.

It's been this way since Linux 1.2 and you're giving no information
about why the change is needed in practice.

Re: [Linux-kernel-mentees] [PATCH net] ethtool: strset: Fix out of bound read in strset_parse_request()

2020-10-11 Thread Jakub Kicinski

On Sun, 11 Oct 2020 02:39:29 +0530 Anmol Karn wrote:
> Flag ``ETHTOOL_A_STRSET_COUNTS_ONLY`` tells the kernel to only return the 
> string 
> counts of the sets, but, when req_info->counts_only tries to read the 
> tb[ETHTOOL_A_STRSET_COUNTS_ONLY] it gets out of bound. 
> 
> - net/ethtool/strset.c
> The bug seems to trigger in this line:
> 
> req_info->counts_only = tb[ETHTOOL_A_STRSET_COUNTS_ONLY];
> 
> Fix it by NULL checking for req_info->counts_only while 
> reading from tb[ETHTOOL_A_STRSET_COUNTS_ONLY].
> 
> Reported-by: syzbot+9d1389df89299fa36...@syzkaller.appspotmail.com 
> Link: 
> https://syzkaller.appspot.com/bug?id=730deff8fe9954a5e317924d9acff98d9c64a770 
> Signed-off-by: Anmol Karn 

I think the correct fix for this was already applied to net-next as:

 commit db972e532518 ("ethtool: strset: allow ETHTOOL_A_STRSET_COUNTS_ONLY 
attr")

Re: [PATCH bpf-next v6 2/6] bpf: add redirect_peer helper

2020-10-11 Thread Daniel Borkmann


On 10/11/20 11:22 AM, Jesper Dangaard Brouer wrote:

On Sun, 11 Oct 2020 01:40:02 +0200
Daniel Borkmann  wrote:


Add an efficient ingress to ingress netns switch that can be used out of tc BPF
programs in order to redirect traffic from host ns ingress into a container
veth device ingress without having to go via CPU backlog queue [0]. For local
containers this can also be utilized and path via CPU backlog queue only needs
to be taken once, not twice. On a high level this borrows from ipvlan which does
similar switch in __netif_receive_skb_core() and then iterates via 
another_round.
This helps to reduce latency for mentioned use cases.

Pod to remote pod with redirect(), TCP_RR [1]:

   # percpu_netperf 10.217.1.33
   RT_LATENCY: 122.450 (per CPU: 122.666
 122.401 122.333 122.401 )
 MEAN_LATENCY: 121.210 (per CPU: 121.100
 121.260 121.320 121.160 )
   STDDEV_LATENCY: 120.040 (per CPU: 119.420
 119.910 125.460 115.370 )
  MIN_LATENCY:  46.500 (per CPU:  47.000
  47.000  47.000  45.000 )
  P50_LATENCY: 118.500 (per CPU: 118.000
 119.000 118.000 119.000 )
  P90_LATENCY: 127.500 (per CPU: 127.000
 128.000 127.000 128.000 )
  P99_LATENCY: 130.750 (per CPU: 131.000
 131.000 129.000 132.000 )

 TRANSACTION_RATE:   32666.400 (per CPU:8152.200
8169.8428174.4398169.897 )

Pod to remote pod with redirect_peer(), TCP_RR:

   # percpu_netperf 10.217.1.33
   RT_LATENCY:  44.449 (per CPU:  43.767
  43.127  45.279  45.622 )
 MEAN_LATENCY:  45.065 (per CPU:  44.030
  45.530  45.190  45.510 )
   STDDEV_LATENCY:  84.823 (per CPU:  66.770
  97.290  84.380  90.850 )
  MIN_LATENCY:  33.500 (per CPU:  33.000
  33.000  34.000  34.000 )
  P50_LATENCY:  43.250 (per CPU:  43.000
  43.000  43.000  44.000 )
  P90_LATENCY:  46.750 (per CPU:  46.000
  47.000  47.000  47.000 )
  P99_LATENCY:  52.750 (per CPU:  51.000
  54.000  53.000  53.000 )

 TRANSACTION_RATE:   90039.500 (per CPU:   22848.186   
23187.089   22085.077   21919.130 )


This is awesome results and great work Daniel! :-)

I wonder if we can also support this from XDP, which can also native
redirect into veth.  Originally I though we could add the peer netdev
in the devmap, but AFAIK Toke showed me that this was not possible.


I think it should be possible with similar principle. What was the limitation
that you ran into with devmap for XDP?


   [0] 
https://linuxplumbersconf.org/event/7/contributions/674/attachments/568/1002/plumbers_2020_cilium_load_balancer.pdf
   [1] https://github.com/borkmann/netperf_scripts/blob/master/percpu_netperf

Signed-off-by: Daniel Borkmann 
---
  drivers/net/veth.c |  9 ++
  include/linux/netdevice.h  |  4 +++
  include/uapi/linux/bpf.h   | 17 +++
  net/core/dev.c | 15 --
  net/core/filter.c  | 54 +-
  tools/include/uapi/linux/bpf.h | 17 +++
  6 files changed, 106 insertions(+), 10 deletions(-)


[...]

diff --git a/net/core/dev.c b/net/core/dev.c
index 9d55bf5d1a65..7dd015823593 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4930,7 +4930,7 @@ EXPORT_SYMBOL_GPL(br_fdb_test_addr_hook);
  
  static inline struct sk_buff *

  sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int 
*ret,
-  struct net_device *orig_dev)
+  struct net_device *orig_dev, bool *another)
  {
  #ifdef CONFIG_NET_CLS_ACT
struct mini_Qdisc *miniq = rcu_dereference_bh(skb->dev->miniq_ingress);
@@ -4974,7 +4974,11 @@ sch_handle_ingress(struct sk_buff *skb, struct 
packet_type **pt_prev, int *ret,
 * redirecting to another netdev
 */
__skb_push(skb, skb->mac_len);
-   skb_do_redirect(skb);
+   if (skb_do_redirect(skb) == -EAGAIN) {
+   __skb_pull(skb, skb->mac_len);
+   *another = true;
+   break;
+   }
return NULL;
case TC_ACT_CONSUMED:
return NULL;
@@ -5163,7 +5167,12 @@ static int __netif_receive_skb_core(struct sk_buff 
**pskb, bool pfmemalloc,
  skip_taps:
  #ifdef CONFIG_NE

[PATCH v2] net: usb: rtl8150: don't incorrectly assign random MAC addresses

2020-10-11 Thread Anant Thazhemadam

In set_ethernet_addr(), if get_registers() succeeds, the ethernet address
that was read must be copied over. Otherwise, a random ethernet address
must be assigned.

get_registers() returns 0 if successful, and negative error number
otherwise. However, in set_ethernet_addr(), this return value is
incorrectly checked.

Since this return value will never be equal to sizeof(node_id), a
random MAC address will always be generated and assigned to the
device; even in cases when get_registers() is successful.

Correctly modifying the condition that checks if get_registers() was
successful or not fixes this problem, and copies the ethernet address
appropriately.

Fixes: f45a4248ea4c ("net: usb: rtl8150: set random MAC address when 
set_ethernet_addr() fails")
Signed-off-by: Anant Thazhemadam 
---
Changes in v2:
* Fixed the format of the Fixes tag
* Modified the commit message to better describe the issue being 
  fixed

+CCing Stephen and linux-next, since the commit fixed isn't in the networking
tree, but is present in linux-next.

 drivers/net/usb/rtl8150.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/rtl8150.c b/drivers/net/usb/rtl8150.c
index f020401adf04..bf8a60533f3e 100644
--- a/drivers/net/usb/rtl8150.c
+++ b/drivers/net/usb/rtl8150.c
@@ -261,7 +261,7 @@ static void set_ethernet_addr(rtl8150_t *dev)
 
ret = get_registers(dev, IDR, sizeof(node_id), node_id);
 
-   if (ret == sizeof(node_id)) {
+   if (!ret) {
ether_addr_copy(dev->netdev->dev_addr, node_id);
} else {
eth_hw_addr_random(dev->netdev);
-- 
2.25.1

Re: [PATCH bpf-next v6 0/6] Follow-up BPF helper improvements

2020-10-11 Thread patchwork-bot+netdevbpf

Hello:

This series was applied to bpf/bpf-next.git (refs/heads/master):

On Sun, 11 Oct 2020 01:40:00 +0200 you wrote:
> This series addresses most of the feedback [0] that was to be followed
> up from the last series, that is, UAPI helper comment improvements and
> getting rid of the ifindex obj file hacks in the selftest by using a
> BPF map instead. The __sk_buff data/data_end pointer work, I'm planning
> to do in a later round as well as the mem*() BPF improvements we have
> in Cilium for libbpf. Next, the series adds two features, i) a helper
> called redirect_peer() to improve latency on netns switch, and ii) to
> allow map in map with dynamic inner array map sizes. Selftests for each
> are added as well. For details, please check individual patches, thanks!
> 
> [...]

Here is the summary with links:
  - [bpf-next,v6,1/6] bpf: improve bpf_redirect_neigh helper description
https://git.kernel.org/bpf/bpf-next/c/dd2ce6a5373c
  - [bpf-next,v6,2/6] bpf: add redirect_peer helper
https://git.kernel.org/bpf/bpf-next/c/9aa1206e8f48
  - [bpf-next,v6,3/6] bpf: allow for map-in-map with dynamic inner array map 
entries
https://git.kernel.org/bpf/bpf-next/c/4a8f87e60f6d
  - [bpf-next,v6,4/6] bpf, selftests: add test for different array inner map 
size
https://git.kernel.org/bpf/bpf-next/c/6775dab73bdc
  - [bpf-next,v6,5/6] bpf, selftests: make redirect_neigh test more extensible
https://git.kernel.org/bpf/bpf-next/c/57a73fe7c198
  - [bpf-next,v6,6/6] bpf, selftests: add redirect_peer selftest
https://git.kernel.org/bpf/bpf-next/c/9f4c53ca23a2

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

Re: [PATCH v2] net: usb: rtl8150: don't incorrectly assign random MAC addresses

2020-10-11 Thread Jakub Kicinski

On Sun, 11 Oct 2020 23:00:30 +0530 Anant Thazhemadam wrote:
> In set_ethernet_addr(), if get_registers() succeeds, the ethernet address
> that was read must be copied over. Otherwise, a random ethernet address
> must be assigned.
> 
> get_registers() returns 0 if successful, and negative error number
> otherwise. However, in set_ethernet_addr(), this return value is
> incorrectly checked.
> 
> Since this return value will never be equal to sizeof(node_id), a
> random MAC address will always be generated and assigned to the
> device; even in cases when get_registers() is successful.
> 
> Correctly modifying the condition that checks if get_registers() was
> successful or not fixes this problem, and copies the ethernet address
> appropriately.
> 
> Fixes: f45a4248ea4c ("net: usb: rtl8150: set random MAC address when 
> set_ethernet_addr() fails")
> Signed-off-by: Anant Thazhemadam 

This patch is a fix to a conflict resolution in linux-next.

linux-next is not a "real" tree, it's an integration tree used to
figure out conflicts early.

We had let Stephen know about the problem already. Please wait one
week, and if the problem is still present resend this.

Thank you.

Re: [PATCH net-next v2 1/2] can-isotp: implement cleanups / improvements from review

2020-10-11 Thread kernel test robot

Hi Oliver,

I love your patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:
https://github.com/0day-ci/linux/commits/Oliver-Hartkopp/can-isotp-implement-cleanups-improvements-from-review/20201012-55
base:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 
df41c19abbeaae6e24c8e31cff5fdb48632be380
config: xtensa-allyesconfig (attached as .config)
compiler: xtensa-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/693c9b4e8c5710e7b536ec9d6a969a7d7343a100
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Oliver-Hartkopp/can-isotp-implement-cleanups-improvements-from-review/20201012-55
git checkout 693c9b4e8c5710e7b536ec9d6a969a7d7343a100
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=xtensa 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

>> net/can/isotp.c:82: warning: "pr_fmt" redefined
  82 | #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 | 
   In file included from include/linux/kernel.h:15,
from include/linux/list.h:9,
from include/linux/module.h:12,
from net/can/isotp.c:56:
   include/linux/printk.h:297: note: this is the location of the previous 
definition
 297 | #define pr_fmt(fmt) fmt
 | 

vim +/pr_fmt +82 net/can/isotp.c

81  
  > 82  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
83  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip

Re: [PATCH net-next v2 0/4] enetc: Migrate to PHYLINK and PCS_LYNX

2020-10-11 Thread Jakub Kicinski

On Wed,  7 Oct 2020 12:48:19 +0300 Claudiu Manoil wrote:
> Transitioning the enetc driver from phylib to phylink.
> Offloading the serdes configuration to the PCS_LYNX
> module is a mandatory part of this transition. Aiming
> for a cleaner, more maintainable design, and better
> code reuse.
> The first 2 patches are clean up prerequisites.
> 
> Tested on a p1028rdb board.
> 
> v2: validate() explicitly rejects now all interface modes not
> supported by the driver instead of relying on the device tree
> to provide only supported interfaces, and dropped redundant
> activation of pcs_poll (addressing Ioana's findings)

Applied, thank you!

Re: [PATCH net-next 0/3] Offload tc-vlan mangle to mscc_ocelot switch

2020-10-11 Thread Jakub Kicinski

On Thu,  8 Oct 2020 14:56:57 +0300 Vladimir Oltean wrote:
> This series offloads one more action to the VCAP IS1 ingress TCAM, which
> is to change the classified VLAN for packets, according to the VCAP IS1
> keys (VLAN, source MAC, source IP, EtherType, etc).

Applied, thanks!

Re: [ PATCH v1 2/2] ibmveth: Identify ingress large send packets.

2020-10-11 Thread Jakub Kicinski

On Sat, 10 Oct 2020 12:51:30 -0400 Willem de Bruijn wrote:
> > > @@ -1385,7 +1386,17 @@ static int ibmveth_poll(struct napi_struct *napi, 
> > > int budget)
> > > skb_put(skb, length);
> > > skb->protocol = eth_type_trans(skb, netdev);
> > >
> > > -   if (length > netdev->mtu + ETH_HLEN) {
> > > +   /* PHYP without PLSO support places a -1 in the ip
> > > +* checksum for large send frames.
> > > +*/
> > > +   if (be16_to_cpu(skb->protocol) == ETH_P_IP) {

You can byteswap the constant, so its done at compilation time.

> > > +   struct iphdr *iph = (struct iphdr 
> > > *)skb->data;
> > > +
> > > +   iph_check = iph->check;  
> >
> > Check against truncated/bad packets.  
> 
> .. unless I missed context. Other code in this driver seems to peek in
> the network and transport layer headers without additional geometry
> and integrity checks, too.

Good catch, even if we trust the hypervisor to only forward valid
frames this needs to be at least mentioned in the commit message.

Also please add Fixes tags to both patches.

Re: [PATCH v2] net: usb: rtl8150: don't incorrectly assign random MAC addresses

2020-10-11 Thread Joe Perches

On Sun, 2020-10-11 at 10:59 -0700, Jakub Kicinski wrote:
> On Sun, 11 Oct 2020 23:00:30 +0530 Anant Thazhemadam wrote:
> > In set_ethernet_addr(), if get_registers() succeeds, the ethernet address
> > that was read must be copied over. Otherwise, a random ethernet address
> > must be assigned.
> > 
> > get_registers() returns 0 if successful, and negative error number
> > otherwise. However, in set_ethernet_addr(), this return value is
> > incorrectly checked.
> > 
> > Since this return value will never be equal to sizeof(node_id), a
> > random MAC address will always be generated and assigned to the
> > device; even in cases when get_registers() is successful.
> > 
> > Correctly modifying the condition that checks if get_registers() was
> > successful or not fixes this problem, and copies the ethernet address
> > appropriately.

There are many unchecked uses of set_registers and get_registers
 in this file.

If failures are really expected, then it might be better to fix
them up too.

$ git grep -w '[gs]et_registers' drivers/net/usb/rtl8150.c
drivers/net/usb/rtl8150.c:static int get_registers(rtl8150_t * dev, u16 indx, 
u16 size, void *data)
drivers/net/usb/rtl8150.c:static int set_registers(rtl8150_t * dev, u16 indx, 
u16 size, const void *data)
drivers/net/usb/rtl8150.c:  set_registers(dev, PHYADD, sizeof(data), data);
drivers/net/usb/rtl8150.c:  set_registers(dev, PHYCNT, 1, &tmp);
drivers/net/usb/rtl8150.c:  get_registers(dev, PHYCNT, 1, data);
drivers/net/usb/rtl8150.c:  get_registers(dev, PHYDAT, 2, data);
drivers/net/usb/rtl8150.c:  set_registers(dev, PHYADD, sizeof(data), data);
drivers/net/usb/rtl8150.c:  set_registers(dev, PHYCNT, 1, &tmp);
drivers/net/usb/rtl8150.c:  get_registers(dev, PHYCNT, 1, data);
drivers/net/usb/rtl8150.c:  ret = get_registers(dev, IDR, sizeof(node_id), 
node_id);
drivers/net/usb/rtl8150.c:  set_registers(dev, IDR, netdev->addr_len, 
netdev->dev_addr);
drivers/net/usb/rtl8150.c:  get_registers(dev, CR, 1, &cr);
drivers/net/usb/rtl8150.c:  set_registers(dev, CR, 1, &cr);
drivers/net/usb/rtl8150.c:  set_registers(dev, IDR_EEPROM + (i * 
2), 2,
drivers/net/usb/rtl8150.c:  set_registers(dev, CR, 1, &cr);
drivers/net/usb/rtl8150.c:  set_registers(dev, CR, 1, &data);
drivers/net/usb/rtl8150.c:  get_registers(dev, CR, 1, &data);
drivers/net/usb/rtl8150.c:  set_registers(dev, RCR, 1, &rcr);
drivers/net/usb/rtl8150.c:  set_registers(dev, TCR, 1, &tcr);
drivers/net/usb/rtl8150.c:  set_registers(dev, CR, 1, &cr);
drivers/net/usb/rtl8150.c:  get_registers(dev, MSR, 1, &msr);
drivers/net/usb/rtl8150.c:  get_registers(dev, CR, 1, &cr);
drivers/net/usb/rtl8150.c:  set_registers(dev, CR, 1, &cr);
drivers/net/usb/rtl8150.c:  get_registers(dev, CSCR, 2, &tmp);
drivers/net/usb/rtl8150.c:  set_registers(dev, IDR, 6, netdev->dev_addr);
drivers/net/usb/rtl8150.c:  get_registers(dev, BMCR, 2, &bmcr);
drivers/net/usb/rtl8150.c:  get_registers(dev, ANLP, 2, &lpa);

Re: [PATCH net-next] mlx4: handle non-napi callers to napi_poll

2020-10-11 Thread Jakub Kicinski

On Thu, 8 Oct 2020 11:45:26 -0700 Jonathan Lemon wrote:
> From: Jonathan Lemon 
> 
> netcons calls napi_poll with a budget of 0 to transmit packets.
> Handle this by:
>  - skipping RX processing
>  - do not try to recycle TX packets to the RX cache
> 
> Signed-off-by: Jonathan Lemon 

Tariq, Saeed - how does this look to you?

Re: [PATCH] net: korina: free array used for rx/tx descriptors

2020-10-11 Thread Willem de Bruijn

On Sun, Oct 11, 2020 at 7:46 AM Valentin Vidic
 wrote:
>
> Memory was not freed when driver is unloaded from the kernel.
>
> Signed-off-by: Valentin Vidic 

Makes sense.

Fixes: ef11291bcd5f ("Add support the Korina (IDT RC32434) Ethernet MAC")

Slightly off-topic, but I don't fully fathom what goes on with this
pointer straight after the initial kmalloc.

lp->td_ring = (struct dma_desc *)KSEG1ADDR(lp->td_ring);

> ---
>  drivers/net/ethernet/korina.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/korina.c b/drivers/net/ethernet/korina.c
> index 03e034918d14..99146145f020 100644
> --- a/drivers/net/ethernet/korina.c
> +++ b/drivers/net/ethernet/korina.c
> @@ -1133,6 +1133,7 @@ static int korina_remove(struct platform_device *pdev)
> iounmap(lp->eth_regs);
> iounmap(lp->rx_dma_regs);
> iounmap(lp->tx_dma_regs);
> +   kfree(lp->td_ring);
>
> unregister_netdev(bif->dev);
> free_netdev(bif->dev);

In general it is nice to release in reverse of acquire. But the driver
already does not follow this practice.

Re: [PATCH] mm: proc: add Sock to /proc/meminfo

2020-10-11 Thread Cong Wang

On Sat, Oct 10, 2020 at 3:39 AM Muchun Song  wrote:
>
> The amount of memory allocated to sockets buffer can become significant.
> However, we do not display the amount of memory consumed by sockets
> buffer. In this case, knowing where the memory is consumed by the kernel

We do it via `ss -m`. Is it not sufficient? And if not, why not adding it there
rather than /proc/meminfo?

>  static inline void __skb_frag_unref(skb_frag_t *frag)
>  {
> -   put_page(skb_frag_page(frag));
> +   struct page *page = skb_frag_page(frag);
> +
> +   if (put_page_testzero(page)) {
> +   dec_sock_node_page_state(page);
> +   __put_page(page);
> +   }
>  }

You mix socket page frag with skb frag at least, not sure this is exactly
what you want, because clearly skb page frags are frequently used
by network drivers rather than sockets.

Also, which one matches this dec_sock_node_page_state()? Clearly
not skb_fill_page_desc() or __skb_frag_ref().

Thanks.

Re: [PATCH v2 0/3] [PATCH v2 0/3] [PATCH v2 0/3] net, mac80211, kernel: enable KCOV remote coverage collection for 802.11 frame handling

2020-10-11 Thread Johannes Berg

On Fri, 2020-10-09 at 17:01 +, Aleksandr Nogikh wrote:
> From: Aleksandr Nogikh 
> 
> This patch series enables remote KCOV coverage collection during
> 802.11 frames processing. These changes make it possible to perform
> coverage-guided fuzzing in search of remotely triggerable bugs.

Btw, it occurred to me that I don't know at all - is this related to
syzkaller? Or is there some other fuzzing you're working on? Can we get
the bug reports from it if it's different? :)

Also, unrelated to that (but I see Dmitry CC'ed), I started wondering if
it'd be helpful to have an easier raw 802.11 inject path on top of say
hwsim0; I noticed some syzbot reports where it created raw sockets, but
that only gets you into the *data* plane of the wifi stack, not into the
*management* plane. Theoretically you could add a monitor interface, but
right now the wifi setup (according to the current docs on github) is
using two IBSS interfaces.

Perhaps an inject path on the mac80211-hwsim "hwsim0" interface would be
something to consider? Or simply adding a third radio that's in
"monitor" mode, so that a raw socket bound to *that* interface can
inject with a radiotap header followed by an 802.11 frame, getting to
arbitrary frame handling code, not just data frames.

Any thoughts?

johannes

Re: [PATCH] net: Add mhi-net driver

2020-10-11 Thread Jakub Kicinski

On Fri,  9 Oct 2020 22:33:31 +0200 Loic Poulain wrote:
> This patch adds a new network driver implementing MHI transport for
> network packets. Packets can be in any format, though QMAP (rmnet)
> is the usual protocol (flow control + PDN mux).
> 
> It support two MHI devices, IP_HW0 which is, the path to the IPA
> (IP accelerator) on qcom modem, And IP_SW0 which is the software
> driven IP path (to modem CPU).
> 
> Signed-off-by: Loic Poulain 

> +static int mhi_ndo_xmit(struct sk_buff *skb, struct net_device *ndev)
> +{
> + struct mhi_net_dev *mhi_netdev = netdev_priv(ndev);
> + struct mhi_device *mdev = mhi_netdev->mdev;
> + int err;
> +
> + /* Only support for single buffer transfer for now */
> + err = skb_linearize(skb);

Since you don't advertise NETIF_F_SG you shouldn't have to call this,
no?

> + if (unlikely(err)) {
> + kfree_skb(skb);
> + mhi_netdev->stats.tx_dropped++;
> + return NETDEV_TX_OK;
> + }
> +
> + skb_tx_timestamp(skb);
> +
> + /* mhi_queue_skb is not thread-safe, but xmit is serialized by the
> +  * network core. Once MHI core will be thread save, migrate to
> +  * NETIF_F_LLTX support.
> +  */
> + err = mhi_queue_skb(mdev, DMA_TO_DEVICE, skb, skb->len, MHI_EOT);
> + if (err) {
> + netdev_err(ndev, "mhi_queue_skb err %d\n", err);

This needs to be at least rate limited. 

> + netif_stop_queue(ndev);

What's going to start the queue if it's a transient errors and not
NETDEV_TX_BUSY?

> + return (err == -ENOMEM) ? NETDEV_TX_BUSY : err;

You should drop the packet if it's not NETDEV_TX_BUSY, and return
NETDEV_TX_OK. Don't return negative errors from ndo_xmit. 

> + }
> +
> + return NETDEV_TX_OK;
> +}
> +
> +static void mhi_ndo_get_stats64(struct net_device *ndev,
> + struct rtnl_link_stats64 *stats)
> +{
> + struct mhi_net_dev *mhi_netdev = netdev_priv(ndev);
> +
> + stats->rx_packets = mhi_netdev->stats.rx_packets;
> + stats->rx_bytes = mhi_netdev->stats.rx_bytes;
> + stats->rx_errors = mhi_netdev->stats.rx_errors;
> + stats->rx_dropped = mhi_netdev->stats.rx_dropped;
> + stats->tx_packets = mhi_netdev->stats.tx_packets;
> + stats->tx_bytes = mhi_netdev->stats.tx_bytes;
> + stats->tx_errors = mhi_netdev->stats.tx_errors;
> + stats->tx_dropped = mhi_netdev->stats.tx_dropped;
> +}

Can you use 

> +static void mhi_net_dl_callback(struct mhi_device *mhi_dev,
> + struct mhi_result *mhi_res)
> +{
> + struct mhi_net_dev *mhi_netdev = dev_get_drvdata(&mhi_dev->dev);
> + struct sk_buff *skb = mhi_res->buf_addr;
> +
> + atomic_dec(&mhi_netdev->stats.rx_queued);
> +
> + if (mhi_res->transaction_status) {
> + mhi_netdev->stats.rx_errors++;
> + kfree_skb(skb);
> + } else {
> + mhi_netdev->stats.rx_packets++;
> + mhi_netdev->stats.rx_bytes += mhi_res->bytes_xferd;
> +
> + skb->protocol = htons(ETH_P_MAP);
> + skb_put(skb, mhi_res->bytes_xferd);
> + netif_rx(skb);
> + }
> +
> + schedule_delayed_work(&mhi_netdev->rx_refill, 0);

Scheduling a work to replace every single RX buffer looks quite 
inefficient. Any chance you can do batching? I assume mhi_queue_skb()
has to be able to sleep?

> +static void mhi_net_rx_refill_work(struct work_struct *work)
> +{
> + struct mhi_net_dev *mhi_netdev = container_of(work, struct mhi_net_dev,
> +   rx_refill.work);
> + struct net_device *ndev = mhi_netdev->ndev;
> + struct mhi_device *mdev = mhi_netdev->mdev;
> + struct sk_buff *skb;
> + int err;
> +
> + if (!netif_running(ndev))
> + return;

How can this happen? You cancel the work from ndo_stop.

> + do {
> + skb = netdev_alloc_skb(ndev, READ_ONCE(ndev->mtu));
> + if (unlikely(!skb)) {
> + /* If we are starved of RX buffers, retry later */
> + if (!atomic_read(&mhi_netdev->stats.rx_queued))
> + schedule_delayed_work(&mhi_netdev->rx_refill, 
> HZ / 2);
> + break;
> + }
> +
> + err = mhi_queue_skb(mdev, DMA_FROM_DEVICE, skb, ndev->mtu,
> + MHI_EOT);
> + if (err) {
> + atomic_dec(&mhi_netdev->stats.rx_queued);

This can never fail with an empty ring? No need to potentially
reschedule the work here?

> + kfree_skb(skb);
> + break;
> + }
> +
> + atomic_inc(&mhi_netdev->stats.rx_queued);
> +
> + } while (1);
> +}
> +
> +static int mhi_net_probe(struct mhi_device *mhi_dev,
> +  const struct mhi_device_id *id)
> +{
> + const char *netname = (char *)id->driver_data;
> + struct mhi_net_dev *mhi_netdev;
> + st

Re: [Linux-kernel-mentees] [PATCH net] ethtool: strset: Fix out of bound read in strset_parse_request()

2020-10-11 Thread Anmol karn

Hello sir,
On Sun, Oct 11, 2020 at 10:24 PM Jakub Kicinski  wrote:
>
> On Sun, 11 Oct 2020 02:39:29 +0530 Anmol Karn wrote:
> > Flag ``ETHTOOL_A_STRSET_COUNTS_ONLY`` tells the kernel to only return the 
> > string
> > counts of the sets, but, when req_info->counts_only tries to read the
> > tb[ETHTOOL_A_STRSET_COUNTS_ONLY] it gets out of bound.
> >
> > - net/ethtool/strset.c
> > The bug seems to trigger in this line:
> >
> > req_info->counts_only = tb[ETHTOOL_A_STRSET_COUNTS_ONLY];
> >
> > Fix it by NULL checking for req_info->counts_only while
> > reading from tb[ETHTOOL_A_STRSET_COUNTS_ONLY].
> >
> > Reported-by: syzbot+9d1389df89299fa36...@syzkaller.appspotmail.com
> > Link: 
> > https://syzkaller.appspot.com/bug?id=730deff8fe9954a5e317924d9acff98d9c64a770
> > Signed-off-by: Anmol Karn 
>
> I think the correct fix for this was already applied to net-next as:
>
>  commit db972e532518 ("ethtool: strset: allow ETHTOOL_A_STRSET_COUNTS_ONLY 
> attr")

I am glad that it's fixed now.

Thanks,
Anmol

[Patch net v2] ip_gre: set dev->hard_header_len and dev->needed_headroom properly

2020-10-11 Thread Cong Wang

GRE tunnel has its own header_ops, ipgre_header_ops, and sets it
conditionally. When it is set, it assumes the outer IP header is
already created before ipgre_xmit().

This is not true when we send packets through a raw packet socket,
where L2 headers are supposed to be constructed by user. Packet
socket calls dev_validate_header() to validate the header. But
GRE tunnel does not set dev->hard_header_len, so that check can
be simply bypassed, therefore uninit memory could be passed down
to ipgre_xmit(). Similar for dev->needed_headroom.

dev->hard_header_len is supposed to be the length of the header
created by dev->header_ops->create(), so it should be used whenever
header_ops is set, and dev->needed_headroom should be used when it
is not set.

Reported-and-tested-by: syzbot+4a2c52677a8a1aa28...@syzkaller.appspotmail.com
Cc: Xie He 
Cc: William Tu 
Cc: Willem de Bruijn 
Signed-off-by: Cong Wang 
---
Note, there are some other suspicious use of dev->hard_header_len in
the same file, but let's leave them to a separate patch if really
needed.

 net/ipv4/ip_gre.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 4e31f23e4117..82fee0010353 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -626,8 +626,7 @@ static netdev_tx_t ipgre_xmit(struct sk_buff *skb,
 
if (dev->header_ops) {
/* Need space for new headers */
-   if (skb_cow_head(skb, dev->needed_headroom -
- (tunnel->hlen + sizeof(struct iphdr
+   if (skb_cow_head(skb, dev->hard_header_len))
goto free_skb;
 
tnl_params = (const struct iphdr *)skb->data;
@@ -748,7 +747,11 @@ static void ipgre_link_update(struct net_device *dev, bool 
set_mtu)
len = tunnel->tun_hlen - len;
tunnel->hlen = tunnel->hlen + len;
 
-   dev->needed_headroom = dev->needed_headroom + len;
+   if (dev->header_ops)
+   dev->hard_header_len += len;
+   else
+   dev->needed_headroom += len;
+
if (set_mtu)
dev->mtu = max_t(int, dev->mtu - len, 68);
 
@@ -944,6 +947,7 @@ static void __gre_tunnel_init(struct net_device *dev)
tunnel->parms.iph.protocol = IPPROTO_GRE;
 
tunnel->hlen = tunnel->tun_hlen + tunnel->encap_hlen;
+   dev->needed_headroom = tunnel->hlen + sizeof(tunnel->parms.iph);
 
dev->features   |= GRE_FEATURES;
dev->hw_features|= GRE_FEATURES;
@@ -987,10 +991,14 @@ static int ipgre_tunnel_init(struct net_device *dev)
return -EINVAL;
dev->flags = IFF_BROADCAST;
dev->header_ops = &ipgre_header_ops;
+   dev->hard_header_len = tunnel->hlen + sizeof(*iph);
+   dev->needed_headroom = 0;
}
 #endif
} else if (!tunnel->collect_md) {
dev->header_ops = &ipgre_header_ops;
+   dev->hard_header_len = tunnel->hlen + sizeof(*iph);
+   dev->needed_headroom = 0;
}
 
return ip_tunnel_init(dev);
-- 
2.28.0

Re: BUG: unable to handle kernel paging request in tcf_action_dump_terse

2020-10-11 Thread Cong Wang

#syz fix: net_sched: check error pointer in tcf_dump_walker()

Re: [PATCH] net: Avoid allocing memory on memoryless numa node

2020-10-11 Thread Jakub Kicinski

On Sun, 11 Oct 2020 12:11:40 +0800 Xianting Tian wrote:
> In architecture like powerpc, we can have cpus without any local memory
> attached to it. In such cases the node does not have real memory.
> 
> Use local_memory_node(), which is guaranteed to have memory.
> local_memory_node is a noop in other architectures that does not support
> memoryless nodes.
> 
> Signed-off-by: Xianting Tian 
> ---
>  net/core/dev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 266073e30..dcb4533ef 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2590,7 +2590,7 @@ static struct xps_map *expand_xps_map(struct xps_map 
> *map, int attr_index,
>   new_map = kzalloc(XPS_MAP_SIZE(alloc_len), GFP_KERNEL);
>   else
>   new_map = kzalloc_node(XPS_MAP_SIZE(alloc_len), GFP_KERNEL,
> -cpu_to_node(attr_index));
> +
> local_memory_node(cpu_to_node(attr_index)));
>   if (!new_map)
>   return NULL;
>  

Are we going to patch all kmalloc_node() callers now to apply
local_memory_node()?  Can't the allocator take care of this?

Re: [PATCH net-next] cxgb4/ch_ipsec: Replace the module name to ch_ipsec from chcr

2020-10-11 Thread Jakub Kicinski

On Thu,  8 Oct 2020 19:30:15 +0530 Ayush Sawal wrote:
> @@ -140,8 +141,8 @@ static int ch_ipsec_uld_state_change(void *handle, enum 
> cxgb4_state new_state)
>   return 0;
>  }
>  
> -static inline int chcr_ipsec_setauthsize(struct xfrm_state *x,
> -  struct ipsec_sa_entry *sa_entry)
> +static inline int ch_ipsec_setauthsize(struct xfrm_state *x,
> +struct ipsec_sa_entry *sa_entry)
>  {
>   int hmac_ctrl;
>   int authsize = x->aead->alg_icv_len / 8;
> @@ -164,8 +165,8 @@ static inline int chcr_ipsec_setauthsize(struct 
> xfrm_state *x,
>   return hmac_ctrl;
>  }
>  
> -static inline int chcr_ipsec_setkey(struct xfrm_state *x,
> - struct ipsec_sa_entry *sa_entry)
> +static inline int ch_ipsec_setkey(struct xfrm_state *x,
> +   struct ipsec_sa_entry *sa_entry)

Please remove the inline keywords while at it, and let the compiler
decide what to inline.

>  {
>   int keylen = (x->aead->alg_key_len + 7) / 8;
>   unsigned char *key = x->aead->alg_key;

>   if (x->props.aalgo != SADB_AALG_NONE) {
> - pr_debug("CHCR: Cannot offload authenticated xfrm states\n");
> + pr_debug("CH_IPSEC: Cannot offload authenticated xfrm 
> states\n");
>   return -EINVAL;
>   }
>   if (x->props.calgo != SADB_X_CALG_NONE) {
> - pr_debug("CHCR: Cannot offload compressed xfrm states\n");
> + pr_debug("CH_IPSEC: Cannot offload compressed xfrm states\n");
>   return -EINVAL;
>   }
>   if (x->props.family != AF_INET &&
>   x->props.family != AF_INET6) {
> - pr_debug("CHCR: Only IPv4/6 xfrm state offloaded\n");
> + pr_debug("CH_IPSEC: Only IPv4/6 xfrm state offloaded\n");
>   return -EINVAL;
>   }
>   if (x->props.mode != XFRM_MODE_TRANSPORT &&
>   x->props.mode != XFRM_MODE_TUNNEL) {
> - pr_debug("CHCR: Only transport and tunnel xfrm offload\n");
> + pr_debug("CH_IPSEC: Only transport and tunnel xfrm offload\n");
>   return -EINVAL;
>   }
>   if (x->id.proto != IPPROTO_ESP) {
> - pr_debug("CHCR: Only ESP xfrm state offloaded\n");
> + pr_debug("CH_IPSEC: Only ESP xfrm state offloaded\n");
>   return -EINVAL;
>   }
>   if (x->encap) {
> - pr_debug("CHCR: Encapsulated xfrm state not offloaded\n");
> + pr_debug("CH_IPSEC: Encapsulated xfrm state not offloaded\n");
>   return -EINVAL;
>   }
>   if (!x->aead) {
> - pr_debug("CHCR: Cannot offload xfrm states without aead\n");
> + pr_debug("CH_IPSEC: Cannot offload xfrm states without aead\n");

Why is this printing the "CH_IPSEC: " prefix if you already have:

+#define pr_fmt(fmt) "ch_ipsec: " fmt

?

>   return -EINVAL;
>   }
>   if (x->aead->alg_icv_len != 128 &&
>   x->aead->alg_icv_len != 96) {
> - pr_debug("CHCR: Cannot offload xfrm states with AEAD ICV length 
> other than 96b & 128b\n");
> + pr_debug("CH_IPSEC: Cannot offload xfrm states with AEAD ICV 
> length other than 96b & 128b\n");
>   return -EINVAL;
>   }
>   if ((x->aead->alg_key_len != 128 + 32) &&
>   (x->aead->alg_key_len != 256 + 32)) {
> - pr_debug("CHCR: Cannot offload xfrm states with AEAD key length 
> other than 128/256 bit\n");
> + pr_debug("CH_IPSEC: Cannot offload xfrm states with AEAD key 
> length other than 128/256 bit\n");
>   return -EINVAL;
>   }
>   if (x->tfcpad) {
> - pr_debug("CHCR: Cannot offload xfrm states with tfc padding\n");
> + pr_debug("CH_IPSEC: Cannot offload xfrm states with tfc 
> padding\n");
>   return -EINVAL;
>   }
>   if (!x->geniv) {
> - pr_debug("CHCR: Cannot offload xfrm states without geniv\n");
> + pr_debug("CH_IPSEC: Cannot offload xfrm states without 
> geniv\n");
>   return -EINVAL;
>   }
>   if (strcmp(x->geniv, "seqiv")) {
> - pr_debug("CHCR: Cannot offload xfrm states with geniv other 
> than seqiv\n");
> + pr_debug("CH_IPSEC: Cannot offload xfrm states with geniv other 
> than seqiv\n");
>   return -EINVAL;
>   }

> @@ -763,7 +764,7 @@ out_free:   dev_kfree_skb_any(skb);
>   before = (u64 *)pos;
>   end = (u64 *)pos + flits;
>   /* Setup IPSec CPL */
> - pos = (void *)chcr_crypto_wreq(skb, dev, (void *)pos,
> + pos = (void *)ch_ipsec_crypto_wreq(skb, dev, (void *)pos,
>  credits, sa_entry);

The continuation line needs to be adjusted to match the position of
opening parenthesis.

>   if (before > (u64 *)pos) {
>   left = (u8 *)end - (u8 *)q->q.stat;

Re: [PATCH v2] net: usb: rtl8150: don't incorrectly assign random MAC addresses

2020-10-11 Thread Petko Manolov

On 20-10-11 11:33:00, Joe Perches wrote:
> On Sun, 2020-10-11 at 10:59 -0700, Jakub Kicinski wrote:
> > On Sun, 11 Oct 2020 23:00:30 +0530 Anant Thazhemadam wrote:
> > > In set_ethernet_addr(), if get_registers() succeeds, the ethernet address
> > > that was read must be copied over. Otherwise, a random ethernet address
> > > must be assigned.
> > > 
> > > get_registers() returns 0 if successful, and negative error number
> > > otherwise. However, in set_ethernet_addr(), this return value is
> > > incorrectly checked.
> > > 
> > > Since this return value will never be equal to sizeof(node_id), a
> > > random MAC address will always be generated and assigned to the
> > > device; even in cases when get_registers() is successful.
> > > 
> > > Correctly modifying the condition that checks if get_registers() was
> > > successful or not fixes this problem, and copies the ethernet address
> > > appropriately.
> 
> There are many unchecked uses of set_registers and get_registers
>  in this file.
> 
> If failures are really expected, then it might be better to fix
> them up too.

Checking the return value of each get/set_registers() is going to be a PITA and
not very helpful.  Doing so when setting the MAC address _does_ make sense as in
that case it is not a hard error.

In almost all other occasions if usb_control_msg_send/recv() return an error i'd
rather dump an error message (from within get/set_registers()) and let the user
decide whether to get rid of this adapter or start debugging it.


cheers,
Petko


> $ git grep -w '[gs]et_registers' drivers/net/usb/rtl8150.c
> drivers/net/usb/rtl8150.c:static int get_registers(rtl8150_t * dev, u16 indx, 
> u16 size, void *data)
> drivers/net/usb/rtl8150.c:static int set_registers(rtl8150_t * dev, u16 indx, 
> u16 size, const void *data)
> drivers/net/usb/rtl8150.c:  set_registers(dev, PHYADD, sizeof(data), 
> data);
> drivers/net/usb/rtl8150.c:  set_registers(dev, PHYCNT, 1, &tmp);
> drivers/net/usb/rtl8150.c:  get_registers(dev, PHYCNT, 1, data);
> drivers/net/usb/rtl8150.c:  get_registers(dev, PHYDAT, 2, data);
> drivers/net/usb/rtl8150.c:  set_registers(dev, PHYADD, sizeof(data), 
> data);
> drivers/net/usb/rtl8150.c:  set_registers(dev, PHYCNT, 1, &tmp);
> drivers/net/usb/rtl8150.c:  get_registers(dev, PHYCNT, 1, data);
> drivers/net/usb/rtl8150.c:  ret = get_registers(dev, IDR, 
> sizeof(node_id), node_id);
> drivers/net/usb/rtl8150.c:  set_registers(dev, IDR, netdev->addr_len, 
> netdev->dev_addr);
> drivers/net/usb/rtl8150.c:  get_registers(dev, CR, 1, &cr);
> drivers/net/usb/rtl8150.c:  set_registers(dev, CR, 1, &cr);
> drivers/net/usb/rtl8150.c:  set_registers(dev, IDR_EEPROM + (i * 
> 2), 2,
> drivers/net/usb/rtl8150.c:  set_registers(dev, CR, 1, &cr);
> drivers/net/usb/rtl8150.c:  set_registers(dev, CR, 1, &data);
> drivers/net/usb/rtl8150.c:  get_registers(dev, CR, 1, &data);
> drivers/net/usb/rtl8150.c:  set_registers(dev, RCR, 1, &rcr);
> drivers/net/usb/rtl8150.c:  set_registers(dev, TCR, 1, &tcr);
> drivers/net/usb/rtl8150.c:  set_registers(dev, CR, 1, &cr);
> drivers/net/usb/rtl8150.c:  get_registers(dev, MSR, 1, &msr);
> drivers/net/usb/rtl8150.c:  get_registers(dev, CR, 1, &cr);
> drivers/net/usb/rtl8150.c:  set_registers(dev, CR, 1, &cr);
> drivers/net/usb/rtl8150.c:  get_registers(dev, CSCR, 2, &tmp);
> drivers/net/usb/rtl8150.c:  set_registers(dev, IDR, 6, netdev->dev_addr);
> drivers/net/usb/rtl8150.c:  get_registers(dev, BMCR, 2, &bmcr);
> drivers/net/usb/rtl8150.c:  get_registers(dev, ANLP, 2, &lpa);
> 
> 
>

Re: [PATCH net-next 3/9] bnxt_en: Set driver default message level.

2020-10-11 Thread Jakub Kicinski

On Sun, 11 Oct 2020 06:22:55 -0400 Michael Chan wrote:
> Currently, bp->msg_enable has default value of 0.  It is more useful
> to have the commonly used NETIF_MSG_DRV and NETIF_MSG_HW enabled by
> default.
> 
> Reviewed-by: Edwin Peer 
> Reviewed-by: Vasundhara Volam 
> Signed-off-by: Michael Chan 

This will add whole bunch of output for "RX buffer error 4000[45]", no?
That one needs to switch to a silent reset first, I'd think.

[PATCH net-next 00/12] net: add and use function dev_fetch_sw_netstats for fetching pcpu_sw_netstats

2020-10-11 Thread Heiner Kallweit

In several places the same code is used to populate rtnl_link_stats64
fields with data from pcpu_sw_netstats. Therefore factor out this code
to a new function dev_fetch_sw_netstats().

Heiner Kallweit (12):
  net: core: add function dev_fetch_sw_netstats for fetching
pcpu_sw_netstats
  IB/hfi1: use new function dev_fetch_sw_netstats
  net: macsec: use new function dev_fetch_sw_netstats
  net: usb: qmi_wwan: use new function dev_fetch_sw_netstats
  net: usbnet: use new function dev_fetch_sw_netstats
  qtnfmac: use new function dev_fetch_sw_netstats
  net: bridge: use new function dev_fetch_sw_netstats
  net: dsa: use new function dev_fetch_sw_netstats
  iptunnel: use new function dev_fetch_sw_netstats
  mac80211: use new function dev_fetch_sw_netstats
  net: openvswitch: use new function dev_fetch_sw_netstats
  xfrm: use new function dev_fetch_sw_netstats

 drivers/infiniband/hw/hfi1/ipoib_main.c   | 34 +-
 drivers/net/macsec.c  | 25 +
 drivers/net/usb/qmi_wwan.c| 24 +
 drivers/net/usb/usbnet.c  | 24 +
 drivers/net/wireless/quantenna/qtnfmac/core.c | 27 +-
 include/linux/netdevice.h |  2 ++
 net/bridge/br_device.c| 21 +--
 net/core/dev.c| 36 +++
 net/dsa/slave.c   | 21 +--
 net/ipv4/ip_tunnel_core.c | 23 +---
 net/mac80211/iface.c  | 23 +---
 net/openvswitch/vport-internal_dev.c  | 20 +--
 net/xfrm/xfrm_interface.c | 22 +---
 13 files changed, 49 insertions(+), 253 deletions(-)

-- 
2.28.0

[PATCH net-next 01/12] net: core: add function dev_fetch_sw_netstats for fetching pcpu_sw_netstats

2020-10-11 Thread Heiner Kallweit

In several places the same code is used to populate rtnl_link_stats64
fields with data from pcpu_sw_netstats. Therefore factor out this code
to a new function dev_fetch_sw_netstats().

Signed-off-by: Heiner Kallweit 
---
 include/linux/netdevice.h |  2 ++
 net/core/dev.c| 36 
 2 files changed, 38 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a0df43b13..ca4736349 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4495,6 +4495,8 @@ struct rtnl_link_stats64 *dev_get_stats(struct net_device 
*dev,
struct rtnl_link_stats64 *storage);
 void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64,
 const struct net_device_stats *netdev_stats);
+void dev_fetch_sw_netstats(struct rtnl_link_stats64 *s,
+  struct pcpu_sw_netstats __percpu *netstats);
 
 extern int netdev_max_backlog;
 extern int netdev_tstamp_prequeue;
diff --git a/net/core/dev.c b/net/core/dev.c
index 7d18560b2..ba91bf16b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10323,6 +10323,42 @@ struct rtnl_link_stats64 *dev_get_stats(struct 
net_device *dev,
 }
 EXPORT_SYMBOL(dev_get_stats);
 
+/**
+ * dev_fetch_sw_netstats - get per-cpu network device statistics
+ * @s: place to store stats
+ * @netstats: per-cpu network stats to read from
+ *
+ * Read per-cpu network statistics and populate the related fields in s.
+ */
+void dev_fetch_sw_netstats(struct rtnl_link_stats64 *s,
+  struct pcpu_sw_netstats __percpu *netstats)
+{
+   int cpu;
+
+   if (IS_ERR_OR_NULL(netstats))
+   return;
+
+   for_each_possible_cpu(cpu) {
+   struct pcpu_sw_netstats *stats, tmp;
+   unsigned int start;
+
+   stats = per_cpu_ptr(netstats, cpu);
+   do {
+   start = u64_stats_fetch_begin_irq(&stats->syncp);
+   tmp.rx_packets = stats->rx_packets;
+   tmp.rx_bytes   = stats->rx_bytes;
+   tmp.tx_packets = stats->tx_packets;
+   tmp.tx_bytes   = stats->tx_bytes;
+   } while (u64_stats_fetch_retry_irq(&stats->syncp, start));
+
+   s->rx_packets += tmp.rx_packets;
+   s->rx_bytes   += tmp.rx_bytes;
+   s->tx_packets += tmp.tx_packets;
+   s->tx_bytes   += tmp.tx_bytes;
+   }
+}
+EXPORT_SYMBOL(dev_fetch_sw_netstats);
+
 struct netdev_queue *dev_ingress_queue_create(struct net_device *dev)
 {
struct netdev_queue *queue = dev_ingress_queue(dev);
-- 
2.28.0

[PATCH net-next 11/12] net: openvswitch: use new function dev_fetch_sw_netstats

2020-10-11 Thread Heiner Kallweit

Simplify the code by using new function dev_fetch_sw_netstats().

Signed-off-by: Heiner Kallweit 
---
 net/openvswitch/vport-internal_dev.c | 20 +---
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/net/openvswitch/vport-internal_dev.c 
b/net/openvswitch/vport-internal_dev.c
index d8fe66eea..1e30d8df3 100644
--- a/net/openvswitch/vport-internal_dev.c
+++ b/net/openvswitch/vport-internal_dev.c
@@ -86,31 +86,13 @@ static void internal_dev_destructor(struct net_device *dev)
 static void
 internal_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
 {
-   int i;
-
memset(stats, 0, sizeof(*stats));
stats->rx_errors  = dev->stats.rx_errors;
stats->tx_errors  = dev->stats.tx_errors;
stats->tx_dropped = dev->stats.tx_dropped;
stats->rx_dropped = dev->stats.rx_dropped;
 
-   for_each_possible_cpu(i) {
-   const struct pcpu_sw_netstats *percpu_stats;
-   struct pcpu_sw_netstats local_stats;
-   unsigned int start;
-
-   percpu_stats = per_cpu_ptr(dev->tstats, i);
-
-   do {
-   start = u64_stats_fetch_begin_irq(&percpu_stats->syncp);
-   local_stats = *percpu_stats;
-   } while (u64_stats_fetch_retry_irq(&percpu_stats->syncp, 
start));
-
-   stats->rx_bytes += local_stats.rx_bytes;
-   stats->rx_packets   += local_stats.rx_packets;
-   stats->tx_bytes += local_stats.tx_bytes;
-   stats->tx_packets   += local_stats.tx_packets;
-   }
+   dev_fetch_sw_netstats(stats, dev->tstats);
 }
 
 static const struct net_device_ops internal_dev_netdev_ops = {
-- 
2.28.0

[PATCH net-next 08/12] net: dsa: use new function dev_fetch_sw_netstats

2020-10-11 Thread Heiner Kallweit

Simplify the code by using new function dev_fetch_sw_netstats().

Signed-off-by: Heiner Kallweit 
---
 net/dsa/slave.c | 21 +
 1 file changed, 1 insertion(+), 20 deletions(-)

diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index e7c1d62fd..3bc5ca40c 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1221,28 +1221,9 @@ static void dsa_slave_get_stats64(struct net_device *dev,
  struct rtnl_link_stats64 *stats)
 {
struct dsa_slave_priv *p = netdev_priv(dev);
-   struct pcpu_sw_netstats *s;
-   unsigned int start;
-   int i;
 
netdev_stats_to_stats64(stats, &dev->stats);
-   for_each_possible_cpu(i) {
-   u64 tx_packets, tx_bytes, rx_packets, rx_bytes;
-
-   s = per_cpu_ptr(p->stats64, i);
-   do {
-   start = u64_stats_fetch_begin_irq(&s->syncp);
-   tx_packets = s->tx_packets;
-   tx_bytes = s->tx_bytes;
-   rx_packets = s->rx_packets;
-   rx_bytes = s->rx_bytes;
-   } while (u64_stats_fetch_retry_irq(&s->syncp, start));
-
-   stats->tx_packets += tx_packets;
-   stats->tx_bytes += tx_bytes;
-   stats->rx_packets += rx_packets;
-   stats->rx_bytes += rx_bytes;
-   }
+   dev_fetch_sw_netstats(stats, p->stats64);
 }
 
 static int dsa_slave_get_rxnfc(struct net_device *dev,
-- 
2.28.0

[PATCH net-next 07/12] net: bridge: use new function dev_fetch_sw_netstats

2020-10-11 Thread Heiner Kallweit

Simplify the code by using new function dev_fetch_sw_netstats().

Signed-off-by: Heiner Kallweit 
---
 net/bridge/br_device.c | 21 +
 1 file changed, 1 insertion(+), 20 deletions(-)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 9a2fb4aa1..6f742fee8 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -206,27 +206,8 @@ static void br_get_stats64(struct net_device *dev,
   struct rtnl_link_stats64 *stats)
 {
struct net_bridge *br = netdev_priv(dev);
-   struct pcpu_sw_netstats tmp, sum = { 0 };
-   unsigned int cpu;
-
-   for_each_possible_cpu(cpu) {
-   unsigned int start;
-   const struct pcpu_sw_netstats *bstats
-   = per_cpu_ptr(br->stats, cpu);
-   do {
-   start = u64_stats_fetch_begin_irq(&bstats->syncp);
-   memcpy(&tmp, bstats, sizeof(tmp));
-   } while (u64_stats_fetch_retry_irq(&bstats->syncp, start));
-   sum.tx_bytes   += tmp.tx_bytes;
-   sum.tx_packets += tmp.tx_packets;
-   sum.rx_bytes   += tmp.rx_bytes;
-   sum.rx_packets += tmp.rx_packets;
-   }
 
-   stats->tx_bytes   = sum.tx_bytes;
-   stats->tx_packets = sum.tx_packets;
-   stats->rx_bytes   = sum.rx_bytes;
-   stats->rx_packets = sum.rx_packets;
+   dev_fetch_sw_netstats(stats, br->stats);
 }
 
 static int br_change_mtu(struct net_device *dev, int new_mtu)
-- 
2.28.0

[PATCH net-next 04/12] net: usb: qmi_wwan: use new function dev_fetch_sw_netstats

2020-10-11 Thread Heiner Kallweit

Simplify the code by using new function dev_fetch_sw_netstats().

Signed-off-by: Heiner Kallweit 
---
 drivers/net/usb/qmi_wwan.c | 24 +---
 1 file changed, 1 insertion(+), 23 deletions(-)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 5ca1356b8..a322f5187 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -126,31 +126,9 @@ static void qmimux_get_stats64(struct net_device *net,
   struct rtnl_link_stats64 *stats)
 {
struct qmimux_priv *priv = netdev_priv(net);
-   unsigned int start;
-   int cpu;
 
netdev_stats_to_stats64(stats, &net->stats);
-
-   for_each_possible_cpu(cpu) {
-   struct pcpu_sw_netstats *stats64;
-   u64 rx_packets, rx_bytes;
-   u64 tx_packets, tx_bytes;
-
-   stats64 = per_cpu_ptr(priv->stats64, cpu);
-
-   do {
-   start = u64_stats_fetch_begin_irq(&stats64->syncp);
-   rx_packets = stats64->rx_packets;
-   rx_bytes = stats64->rx_bytes;
-   tx_packets = stats64->tx_packets;
-   tx_bytes = stats64->tx_bytes;
-   } while (u64_stats_fetch_retry_irq(&stats64->syncp, start));
-
-   stats->rx_packets += rx_packets;
-   stats->rx_bytes += rx_bytes;
-   stats->tx_packets += tx_packets;
-   stats->tx_bytes += tx_bytes;
-   }
+   dev_fetch_sw_netstats(stats, priv->stats64);
 }
 
 static const struct net_device_ops qmimux_netdev_ops = {
-- 
2.28.0

[PATCH net-next 12/12] xfrm: use new function dev_fetch_sw_netstats

2020-10-11 Thread Heiner Kallweit

Simplify the code by using new function dev_fetch_sw_netstats().

Signed-off-by: Heiner Kallweit 
---
 net/xfrm/xfrm_interface.c | 22 +-
 1 file changed, 1 insertion(+), 21 deletions(-)

diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index 5b120936d..aa4cdcf69 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -541,27 +541,7 @@ static int xfrmi_update(struct xfrm_if *xi, struct 
xfrm_if_parms *p)
 static void xfrmi_get_stats64(struct net_device *dev,
   struct rtnl_link_stats64 *s)
 {
-   int cpu;
-
-   for_each_possible_cpu(cpu) {
-   struct pcpu_sw_netstats *stats;
-   struct pcpu_sw_netstats tmp;
-   int start;
-
-   stats = per_cpu_ptr(dev->tstats, cpu);
-   do {
-   start = u64_stats_fetch_begin_irq(&stats->syncp);
-   tmp.rx_packets = stats->rx_packets;
-   tmp.rx_bytes   = stats->rx_bytes;
-   tmp.tx_packets = stats->tx_packets;
-   tmp.tx_bytes   = stats->tx_bytes;
-   } while (u64_stats_fetch_retry_irq(&stats->syncp, start));
-
-   s->rx_packets += tmp.rx_packets;
-   s->rx_bytes   += tmp.rx_bytes;
-   s->tx_packets += tmp.tx_packets;
-   s->tx_bytes   += tmp.tx_bytes;
-   }
+   dev_fetch_sw_netstats(s, dev->tstats);
 
s->rx_dropped = dev->stats.rx_dropped;
s->tx_dropped = dev->stats.tx_dropped;
-- 
2.28.0

[PATCH net-next 05/12] net: usbnet: use new function dev_fetch_sw_netstats

2020-10-11 Thread Heiner Kallweit

Simplify the code by using new function dev_fetch_sw_netstats().

Signed-off-by: Heiner Kallweit 
---
 drivers/net/usb/usbnet.c | 24 +---
 1 file changed, 1 insertion(+), 23 deletions(-)

diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index 963d260d1..6062dc278 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -983,31 +983,9 @@ EXPORT_SYMBOL_GPL(usbnet_set_link_ksettings);
 void usbnet_get_stats64(struct net_device *net, struct rtnl_link_stats64 
*stats)
 {
struct usbnet *dev = netdev_priv(net);
-   unsigned int start;
-   int cpu;
 
netdev_stats_to_stats64(stats, &net->stats);
-
-   for_each_possible_cpu(cpu) {
-   struct pcpu_sw_netstats *stats64;
-   u64 rx_packets, rx_bytes;
-   u64 tx_packets, tx_bytes;
-
-   stats64 = per_cpu_ptr(dev->stats64, cpu);
-
-   do {
-   start = u64_stats_fetch_begin_irq(&stats64->syncp);
-   rx_packets = stats64->rx_packets;
-   rx_bytes = stats64->rx_bytes;
-   tx_packets = stats64->tx_packets;
-   tx_bytes = stats64->tx_bytes;
-   } while (u64_stats_fetch_retry_irq(&stats64->syncp, start));
-
-   stats->rx_packets += rx_packets;
-   stats->rx_bytes += rx_bytes;
-   stats->tx_packets += tx_packets;
-   stats->tx_bytes += tx_bytes;
-   }
+   dev_fetch_sw_netstats(stats, dev->stats64);
 }
 EXPORT_SYMBOL_GPL(usbnet_get_stats64);
 
-- 
2.28.0

[PATCH net-next 10/12] mac80211: use new function dev_fetch_sw_netstats

2020-10-11 Thread Heiner Kallweit

Simplify the code by using new function dev_fetch_sw_netstats().

Signed-off-by: Heiner Kallweit 
---
 net/mac80211/iface.c | 23 +--
 1 file changed, 1 insertion(+), 22 deletions(-)

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 240862a74..1be775979 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -709,28 +709,7 @@ static u16 ieee80211_netdev_select_queue(struct net_device 
*dev,
 static void
 ieee80211_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
 {
-   int i;
-
-   for_each_possible_cpu(i) {
-   const struct pcpu_sw_netstats *tstats;
-   u64 rx_packets, rx_bytes, tx_packets, tx_bytes;
-   unsigned int start;
-
-   tstats = per_cpu_ptr(dev->tstats, i);
-
-   do {
-   start = u64_stats_fetch_begin_irq(&tstats->syncp);
-   rx_packets = tstats->rx_packets;
-   tx_packets = tstats->tx_packets;
-   rx_bytes = tstats->rx_bytes;
-   tx_bytes = tstats->tx_bytes;
-   } while (u64_stats_fetch_retry_irq(&tstats->syncp, start));
-
-   stats->rx_packets += rx_packets;
-   stats->tx_packets += tx_packets;
-   stats->rx_bytes   += rx_bytes;
-   stats->tx_bytes   += tx_bytes;
-   }
+   dev_fetch_sw_netstats(stats, dev->tstats);
 }
 
 static const struct net_device_ops ieee80211_dataif_ops = {
-- 
2.28.0

[PATCH net-next 06/12] qtnfmac: use new function dev_fetch_sw_netstats

2020-10-11 Thread Heiner Kallweit

Simplify the code by using new function dev_fetch_sw_netstats().

Signed-off-by: Heiner Kallweit 
---
 drivers/net/wireless/quantenna/qtnfmac/core.c | 27 +--
 1 file changed, 1 insertion(+), 26 deletions(-)

diff --git a/drivers/net/wireless/quantenna/qtnfmac/core.c 
b/drivers/net/wireless/quantenna/qtnfmac/core.c
index 374074dc7..7f66cf608 100644
--- a/drivers/net/wireless/quantenna/qtnfmac/core.c
+++ b/drivers/net/wireless/quantenna/qtnfmac/core.c
@@ -139,34 +139,9 @@ static void qtnf_netdev_get_stats64(struct net_device 
*ndev,
struct rtnl_link_stats64 *stats)
 {
struct qtnf_vif *vif = qtnf_netdev_get_priv(ndev);
-   unsigned int start;
-   int cpu;
 
netdev_stats_to_stats64(stats, &ndev->stats);
-
-   if (!vif->stats64)
-   return;
-
-   for_each_possible_cpu(cpu) {
-   struct pcpu_sw_netstats *stats64;
-   u64 rx_packets, rx_bytes;
-   u64 tx_packets, tx_bytes;
-
-   stats64 = per_cpu_ptr(vif->stats64, cpu);
-
-   do {
-   start = u64_stats_fetch_begin_irq(&stats64->syncp);
-   rx_packets = stats64->rx_packets;
-   rx_bytes = stats64->rx_bytes;
-   tx_packets = stats64->tx_packets;
-   tx_bytes = stats64->tx_bytes;
-   } while (u64_stats_fetch_retry_irq(&stats64->syncp, start));
-
-   stats->rx_packets += rx_packets;
-   stats->rx_bytes += rx_bytes;
-   stats->tx_packets += tx_packets;
-   stats->tx_bytes += tx_bytes;
-   }
+   dev_fetch_sw_netstats(stats, vif->stats64);
 }
 
 /* Netdev handler for transmission timeout.
-- 
2.28.0

[PATCH net-next 03/12] net: macsec: use new function dev_fetch_sw_netstats

2020-10-11 Thread Heiner Kallweit

Simplify the code by using new function dev_fetch_sw_netstats().

Signed-off-by: Heiner Kallweit 
---
 drivers/net/macsec.c | 25 +
 1 file changed, 1 insertion(+), 24 deletions(-)

diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index 2b0c8f01d..e74483279 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -3647,30 +3647,7 @@ static int macsec_change_mtu(struct net_device *dev, int 
new_mtu)
 static void macsec_get_stats64(struct net_device *dev,
   struct rtnl_link_stats64 *s)
 {
-   int cpu;
-
-   if (!dev->tstats)
-   return;
-
-   for_each_possible_cpu(cpu) {
-   struct pcpu_sw_netstats *stats;
-   struct pcpu_sw_netstats tmp;
-   int start;
-
-   stats = per_cpu_ptr(dev->tstats, cpu);
-   do {
-   start = u64_stats_fetch_begin_irq(&stats->syncp);
-   tmp.rx_packets = stats->rx_packets;
-   tmp.rx_bytes   = stats->rx_bytes;
-   tmp.tx_packets = stats->tx_packets;
-   tmp.tx_bytes   = stats->tx_bytes;
-   } while (u64_stats_fetch_retry_irq(&stats->syncp, start));
-
-   s->rx_packets += tmp.rx_packets;
-   s->rx_bytes   += tmp.rx_bytes;
-   s->tx_packets += tmp.tx_packets;
-   s->tx_bytes   += tmp.tx_bytes;
-   }
+   dev_fetch_sw_netstats(s, dev->tstats);
 
s->rx_dropped = dev->stats.rx_dropped;
s->tx_dropped = dev->stats.tx_dropped;
-- 
2.28.0

[PATCH net-next 02/12] IB/hfi1: use new function dev_fetch_sw_netstats

2020-10-11 Thread Heiner Kallweit

Simplify the code by using new function dev_fetch_sw_netstats().

Signed-off-by: Heiner Kallweit 
---
 drivers/infiniband/hw/hfi1/ipoib_main.c | 34 +
 1 file changed, 1 insertion(+), 33 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/ipoib_main.c 
b/drivers/infiniband/hw/hfi1/ipoib_main.c
index 014351ebb..9f71b9d70 100644
--- a/drivers/infiniband/hw/hfi1/ipoib_main.c
+++ b/drivers/infiniband/hw/hfi1/ipoib_main.c
@@ -97,41 +97,9 @@ static void hfi1_ipoib_dev_get_stats64(struct net_device 
*dev,
   struct rtnl_link_stats64 *storage)
 {
struct hfi1_ipoib_dev_priv *priv = hfi1_ipoib_priv(dev);
-   u64 rx_packets = 0ull;
-   u64 rx_bytes = 0ull;
-   u64 tx_packets = 0ull;
-   u64 tx_bytes = 0ull;
-   int i;
 
netdev_stats_to_stats64(storage, &dev->stats);
-
-   for_each_possible_cpu(i) {
-   const struct pcpu_sw_netstats *stats;
-   unsigned int start;
-   u64 trx_packets;
-   u64 trx_bytes;
-   u64 ttx_packets;
-   u64 ttx_bytes;
-
-   stats = per_cpu_ptr(priv->netstats, i);
-   do {
-   start = u64_stats_fetch_begin_irq(&stats->syncp);
-   trx_packets = stats->rx_packets;
-   trx_bytes = stats->rx_bytes;
-   ttx_packets = stats->tx_packets;
-   ttx_bytes = stats->tx_bytes;
-   } while (u64_stats_fetch_retry_irq(&stats->syncp, start));
-
-   rx_packets += trx_packets;
-   rx_bytes += trx_bytes;
-   tx_packets += ttx_packets;
-   tx_bytes += ttx_bytes;
-   }
-
-   storage->rx_packets += rx_packets;
-   storage->rx_bytes += rx_bytes;
-   storage->tx_packets += tx_packets;
-   storage->tx_bytes += tx_bytes;
+   dev_fetch_sw_netstats(storage, priv->netstats);
 }
 
 static const struct net_device_ops hfi1_ipoib_netdev_ops = {
-- 
2.28.0

[PATCH net-next 09/12] iptunnel: use new function dev_fetch_sw_netstats

2020-10-11 Thread Heiner Kallweit

Simplify the code by using new function dev_fetch_sw_netstats().

Signed-off-by: Heiner Kallweit 
---
 net/ipv4/ip_tunnel_core.c | 23 +--
 1 file changed, 1 insertion(+), 22 deletions(-)

diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index b2ea1a8c5..25f1caf5a 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -433,29 +433,8 @@ EXPORT_SYMBOL(skb_tunnel_check_pmtu);
 void ip_tunnel_get_stats64(struct net_device *dev,
   struct rtnl_link_stats64 *tot)
 {
-   int i;
-
netdev_stats_to_stats64(tot, &dev->stats);
-
-   for_each_possible_cpu(i) {
-   const struct pcpu_sw_netstats *tstats =
-  per_cpu_ptr(dev->tstats, i);
-   u64 rx_packets, rx_bytes, tx_packets, tx_bytes;
-   unsigned int start;
-
-   do {
-   start = u64_stats_fetch_begin_irq(&tstats->syncp);
-   rx_packets = tstats->rx_packets;
-   tx_packets = tstats->tx_packets;
-   rx_bytes = tstats->rx_bytes;
-   tx_bytes = tstats->tx_bytes;
-   } while (u64_stats_fetch_retry_irq(&tstats->syncp, start));
-
-   tot->rx_packets += rx_packets;
-   tot->tx_packets += tx_packets;
-   tot->rx_bytes   += rx_bytes;
-   tot->tx_bytes   += tx_bytes;
-   }
+   dev_fetch_sw_netstats(tot, dev->tstats);
 }
 EXPORT_SYMBOL_GPL(ip_tunnel_get_stats64);
 
-- 
2.28.0

Re: [PATCH net-next 01/12] net: core: add function dev_fetch_sw_netstats for fetching pcpu_sw_netstats

2020-10-11 Thread Stephen Hemminger

On Sun, 11 Oct 2020 21:36:43 +0200
Heiner Kallweit  wrote:

> +void dev_fetch_sw_netstats(struct rtnl_link_stats64 *s,
> +struct pcpu_sw_netstats __percpu *netstats)

netstats is unmodified, should it be const?

> +{
> + int cpu;
> +
> + if (IS_ERR_OR_NULL(netstats))
> + return;

Any code calling this with a null pointer is broken/buggy, please don't
ignore that.

[PATCH bpf-next] bpf: Migrate from patchwork.ozlabs.org to patchwork.kernel.org.

2020-10-11 Thread Alexei Starovoitov

From: Alexei Starovoitov 

Move the bpf/bpf-next patch processing queue to patchwork.kernel.org.

Signed-off-by: Alexei Starovoitov 
---

Hi All BPF developers,

we've migrated bpf patchwork to kernel.org.
Please monitor the bpf queue at
https://patchwork.kernel.org/project/netdevbpf/list/?delegate=121173
The new home provides:
- automatic marking of patches as 'accepted' as soon as they land in 
bpf/bpf-next trees.
- automatic email notification when patches are pushed into bpf/bpf-next trees.
- automatic marking as 'superseded' when patch series are respun.
- automatic marking with bpf or netdev delegate based on
  [PATCH bpf|bpf-next|net|net-next] subject and via file-based pattern matching.
- Fast UI with low latency for those in US and Europe.
- Patches sent to netdev@vger and/or bpf@vger appear in one place.

Thank you, Konstantin, for making it happen.

The patches will still appear at ozlabs.org until it stops receiving emails.
All bpf patches at ozlabs will be marked as 'not applicable'.
The ozlabs instance is deprecated. It will be used as an archive.
---
 Documentation/bpf/bpf_devel_QA.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/bpf/bpf_devel_QA.rst 
b/Documentation/bpf/bpf_devel_QA.rst
index 75a0dca5f295..5b613d2a5f1a 100644
--- a/Documentation/bpf/bpf_devel_QA.rst
+++ b/Documentation/bpf/bpf_devel_QA.rst
@@ -60,13 +60,13 @@ Q: Where can I find patches currently under discussion for 
BPF subsystem?
 A: All patches that are Cc'ed to netdev are queued for review under netdev
 patchwork project:
 
-  http://patchwork.ozlabs.org/project/netdev/list/
+  https://patchwork.kernel.org/project/netdevbpf/list/
 
 Those patches which target BPF, are assigned to a 'bpf' delegate for
 further processing from BPF maintainers. The current queue with
 patches under review can be found at:
 
-  https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147
+  https://patchwork.kernel.org/project/netdevbpf/list/?delegate=121173
 
 Once the patches have been reviewed by the BPF community as a whole
 and approved by the BPF maintainers, their status in patchwork will be
-- 
2.23.0

Re: [PATCH bpf-next v2] bpf_fib_lookup: optionally skip neighbour lookup

2020-10-11 Thread Daniel Borkmann


On 10/10/20 3:42 PM, Toke Høiland-Jørgensen wrote:

Daniel Borkmann  writes:

On 10/9/20 11:28 PM, David Ahern wrote:

On 10/9/20 11:42 AM, Toke Høiland-Jørgensen wrote:

David Ahern  writes:

On 10/9/20 3:13 AM, Toke Høiland-Jørgensen wrote:

The bpf_fib_lookup() helper performs a neighbour lookup for the destination
IP and returns BPF_FIB_LKUP_NO_NEIGH if this fails, with the expectation
that the BPF program will pass the packet up the stack in this case.
However, with the addition of bpf_redirect_neigh() that can be used instead
to perform the neighbour lookup, at the cost of a bit of duplicated work.

For that we still need the target ifindex, and since bpf_fib_lookup()
already has that at the time it performs the neighbour lookup, there is
really no reason why it can't just return it in any case. So let's just
always return the ifindex, and also add a flag that lets the caller turn
off the neighbour lookup entirely in bpf_fib_lookup().


seems really odd to do the fib lookup only to skip the neighbor lookup
and defer to a second helper to do a second fib lookup and send out.

The better back-to-back calls is to return the ifindex and gateway on
successful fib lookup regardless of valid neighbor. If the call to
bpf_redirect_neigh is needed, it can have a flag to skip the fib lookup
and just redirect to the given nexthop address + ifindex. ie.,
bpf_redirect_neigh only does neighbor handling in this case.


Hmm, yeah, I guess it would make sense to cache and reuse the lookup -
maybe stick it in bpf_redirect_info()? However, given the imminent


That is not needed.


opening of the merge window, I don't see this landing before then. So
I'm going to respin this patch with just the original change to always
return the ifindex, then we can revisit the flags/reuse of the fib
lookup later.


What I am suggesting is a change in API to bpf_redirect_neigh which
should be done now, before the merge window, before it comes a locked
API. Right now, bpf_redirect_neigh does a lookup to get the nexthop. It
should take the gateway as an input argument. If set, then the lookup is
not done - only the neighbor redirect.


Sounds like a reasonable extension, agree. API freeze is not merge win, but
final v5.10 tag in this case as it always has been. In case it's not in time,
we can simply just move flags to arg3 and add a reserved param as arg2 which
must be zero (and thus indicate to perform the lookup as-is). Later we could
extend to pass params similar as in fib_lookup helper for the gw.


Right, I can take a look at this next week. Feel free to merge (v3 of)
this patch now, that change will be needed in any case I think...


Ok, sounds reasonable, done. Lets fix the remaining one as David suggested until
rc1, at latest rc2 time frame. I'll be mostly offline next week during the day,
but happy to help till that deadline as well.

Thanks,
Daniel

Re: [PATCH 4/5] net/ipv6: use semicolons rather than commas to separate statements

2020-10-11 Thread Paul Moore

On Sun, Oct 11, 2020 at 7:18 AM Julia Lawall  wrote:
>
> Replace commas with semicolons.  Commas introduce unnecessary
> variability in the code structure and are hard to see.  What is done
> is essentially described by the following Coccinelle semantic patch
> (http://coccinelle.lip6.fr/):
>
> // 
> @@ expression e1,e2; @@
> e1
> -,
> +;
> e2
> ... when any
> // 
>
> Signed-off-by: Julia Lawall 
>
> ---
>  net/ipv6/calipso.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks Julia.

Acked-by: Paul Moore 

> diff --git a/net/ipv6/calipso.c b/net/ipv6/calipso.c
> index 8d3f66c310db..78f766019b7e 100644
> --- a/net/ipv6/calipso.c
> +++ b/net/ipv6/calipso.c
> @@ -761,7 +761,7 @@ static int calipso_genopt(unsigned char *buf, u32 start, 
> u32 buf_len,
> calipso[1] = len - 2;
> *(__be32 *)(calipso + 2) = htonl(doi_def->doi);
> calipso[6] = (len - CALIPSO_HDR_LEN) / 4;
> -   calipso[7] = secattr->attr.mls.lvl,
> +   calipso[7] = secattr->attr.mls.lvl;
> crc = ~crc_ccitt(0x, calipso, len);
> calipso[8] = crc & 0xff;
> calipso[9] = (crc >> 8) & 0xff;

-- 
paul moore
www.paul-moore.com

Re: [PATCH bpf-next] bpf: Migrate from patchwork.ozlabs.org to patchwork.kernel.org.

2020-10-11 Thread patchwork-bot+netdevbpf

Hello:

This patch was applied to bpf/bpf-next.git (refs/heads/master):

On Sun, 11 Oct 2020 13:01:49 -0700 you wrote:
> From: Alexei Starovoitov 
> 
> Move the bpf/bpf-next patch processing queue to patchwork.kernel.org.
> 
> Signed-off-by: Alexei Starovoitov 
> ---
> 
> [...]

Here is the summary with links:
  - [bpf-next] bpf: Migrate from patchwork.ozlabs.org to patchwork.kernel.org.
https://git.kernel.org/bpf/bpf-next/c/ebb034b15bfa

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

[PATCH] ptp: ptp_clockmatrix: initialize variables

2020-10-11 Thread trix

From: Tom Rix 

Clang static analysis reports this representative problem

ptp_clockmatrix.c:1852:2: warning: 5th function call argument
  is an uninitialized value
snprintf(idtcm->version, sizeof(idtcm->version), "%u.%u.%u",
^~~~

idtcm_display_version_info() calls several idtcm_read_*'s without
checking a return status.  Initialize the read variables so if a
read fails, a garbage value is not displayed.

Signed-off-by: Tom Rix 
---
 drivers/ptp/ptp_clockmatrix.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/ptp/ptp_clockmatrix.c b/drivers/ptp/ptp_clockmatrix.c
index e020faff7da5..47e5e62da5d2 100644
--- a/drivers/ptp/ptp_clockmatrix.c
+++ b/drivers/ptp/ptp_clockmatrix.c
@@ -1832,12 +1832,12 @@ static int idtcm_enable_tod(struct idtcm_channel 
*channel)
 
 static void idtcm_display_version_info(struct idtcm *idtcm)
 {
-   u8 major;
-   u8 minor;
-   u8 hotfix;
-   u16 product_id;
-   u8 hw_rev_id;
-   u8 config_select;
+   u8 major = 0;
+   u8 minor = 0;
+   u8 hotfix = 0;
+   u16 product_id = 0;
+   u8 hw_rev_id = 0;
+   u8 config_select = 0;
char *fmt = "%d.%d.%d, Id: 0x%04x  HW Rev: %d  OTP Config Select: %d\n";
 
idtcm_read_major_release(idtcm, &major);
-- 
2.18.1

Re: [PATCH v2] net: usb: rtl8150: don't incorrectly assign random MAC addresses

2020-10-11 Thread Joe Perches

On Sun, 2020-10-11 at 22:31 +0300, Petko Manolov wrote:
> On 20-10-11 11:33:00, Joe Perches wrote:
> > On Sun, 2020-10-11 at 10:59 -0700, Jakub Kicinski wrote:
> > > On Sun, 11 Oct 2020 23:00:30 +0530 Anant Thazhemadam wrote:
> > > > In set_ethernet_addr(), if get_registers() succeeds, the ethernet 
> > > > address
> > > > that was read must be copied over. Otherwise, a random ethernet address
> > > > must be assigned.
> > > > 
> > > > get_registers() returns 0 if successful, and negative error number
> > > > otherwise. However, in set_ethernet_addr(), this return value is
> > > > incorrectly checked.
> > > > 
> > > > Since this return value will never be equal to sizeof(node_id), a
> > > > random MAC address will always be generated and assigned to the
> > > > device; even in cases when get_registers() is successful.
> > > > 
> > > > Correctly modifying the condition that checks if get_registers() was
> > > > successful or not fixes this problem, and copies the ethernet address
> > > > appropriately.
> > 
> > There are many unchecked uses of set_registers and get_registers
> >  in this file.
> > 
> > If failures are really expected, then it might be better to fix
> > them up too.
> 
> Checking the return value of each get/set_registers() is going to be a PITA 
> and
> not very helpful.  Doing so when setting the MAC address _does_ make sense as 
> in
> that case it is not a hard error.
> 
> In almost all other occasions if usb_control_msg_send/recv() return an error 
> i'd
> rather dump an error message (from within get/set_registers()) and let the 
> user
> decide whether to get rid of this adapter or start debugging it.

Your code, your choices...

Consider using _once or _ratelimited output too.

Re: [PATCH net-next 08/12] net: dsa: use new function dev_fetch_sw_netstats

2020-10-11 Thread Vladimir Oltean

On Sun, Oct 11, 2020 at 09:41:27PM +0200, Heiner Kallweit wrote:
> Simplify the code by using new function dev_fetch_sw_netstats().
> 
> Signed-off-by: Heiner Kallweit 
> ---

Tested-by: Vladimir Oltean

Re: [PATCH net-next 01/12] net: core: add function dev_fetch_sw_netstats for fetching pcpu_sw_netstats

2020-10-11 Thread Heiner Kallweit

On 11.10.2020 21:54, Stephen Hemminger wrote:
> On Sun, 11 Oct 2020 21:36:43 +0200
> Heiner Kallweit  wrote:
> 
>> +void dev_fetch_sw_netstats(struct rtnl_link_stats64 *s,
>> +   struct pcpu_sw_netstats __percpu *netstats)
> 
> netstats is unmodified, should it be const?
> 
>> +{
>> +int cpu;
>> +
>> +if (IS_ERR_OR_NULL(netstats))
>> +return;
> 
> Any code calling this with a null pointer is broken/buggy, please don't
> ignore that.
> 
Thanks, I'll consider both points in a v2.

Re: [Patch net v2] ip_gre: set dev->hard_header_len and dev->needed_headroom properly

2020-10-11 Thread Xie He

On Sun, Oct 11, 2020 at 12:11 PM Cong Wang  wrote:
>
> GRE tunnel has its own header_ops, ipgre_header_ops, and sets it
> conditionally. When it is set, it assumes the outer IP header is
> already created before ipgre_xmit().
>
> This is not true when we send packets through a raw packet socket,
> where L2 headers are supposed to be constructed by user. Packet
> socket calls dev_validate_header() to validate the header. But
> GRE tunnel does not set dev->hard_header_len, so that check can
> be simply bypassed, therefore uninit memory could be passed down
> to ipgre_xmit(). Similar for dev->needed_headroom.
>
> dev->hard_header_len is supposed to be the length of the header
> created by dev->header_ops->create(), so it should be used whenever
> header_ops is set, and dev->needed_headroom should be used when it
> is not set.

Hi, thanks for attempting to fix this tunnel. Are we still considering
removing header_ops->create?

As said in my email sent previously today, I want to remove
header_ops->create because 1) this keeps the un-exposed headers of GRE
devices consistent with those of GRETAP devices, and 2) I think the
GRE header (and the headers before the GRE header) is not actually the
L2 header of the tunnel (the Wikipedia page for "Generic Routing
Encapsulation" doesn't consider this protocol to be at L2 either).

I'm not sure if you still agree to remove header_ops->create. Do you
still agree but think it'd be better to do that in a separate patch?

Removing header_ops->create would simplify the fixing of the issue you
are trying to fix, too, because that way we would no longer need to
use header_ops or hard_header_len. Also, I'm worried that changing
hard_header_len (or needed_headroom) in ipgre_link_update would have
racing issues. If we remove header_ops, we no longer need to use
hard_header_len and we can just set needed_headroom to the maximum
value, so that we no longer need to update them in ipgre_link_update.

Re: [Patch net v2] ip_gre: set dev->hard_header_len and dev->needed_headroom properly

2020-10-11 Thread Willem de Bruijn

On Sun, Oct 11, 2020 at 3:11 PM Cong Wang  wrote:
>
> GRE tunnel has its own header_ops, ipgre_header_ops, and sets it
> conditionally. When it is set, it assumes the outer IP header is
> already created before ipgre_xmit().
>
> This is not true when we send packets through a raw packet socket,
> where L2 headers are supposed to be constructed by user. Packet
> socket calls dev_validate_header() to validate the header. But
> GRE tunnel does not set dev->hard_header_len, so that check can
> be simply bypassed, therefore uninit memory could be passed down
> to ipgre_xmit(). Similar for dev->needed_headroom.
>
> dev->hard_header_len is supposed to be the length of the header
> created by dev->header_ops->create(), so it should be used whenever
> header_ops is set, and dev->needed_headroom should be used when it
> is not set.
>
> Reported-and-tested-by: syzbot+4a2c52677a8a1aa28...@syzkaller.appspotmail.com
> Cc: Xie He 
> Cc: William Tu 
> Cc: Willem de Bruijn 
> Signed-off-by: Cong Wang 
> ---
> Note, there are some other suspicious use of dev->hard_header_len in
> the same file, but let's leave them to a separate patch if really
> needed.

There is agreement that hard_header_len should be the length of link
layer headers visible to the upper layers, needed_headroom the
additional room required for headers that are not exposed, i.e., those
pushed inside ndo_start_xmit.

The link layer header length also has to agree with the interface
hardware type (ARPHRD_..).

Tunnel devices have not always been consistent in this, but today
"bare" ip tunnel devices without additional headers (ipip, sit, ..) do
match this and advertise 0 byte hard_header_len. Bareudp, vxlan and
geneve also conform to this. Known exception that probably needs to be
addressed is sit, which still advertises LL_MAX_HEADER and so has
exposed quite a few syzkaller issues. Side note, it is not entirely
clear to me what sets ARPHRD_TUNNEL et al apart from ARPHRD_NONE and
why they are needed.

GRE devices advertise ARPHRD_IPGRE and GRETAP advertise ARPHRD_ETHER.
The second makes sense, as it appears as an Ethernet device. The first
should match "bare" ip tunnel devices, if following the above logic.
Indeed, this is what commit e271c7b4420d ("gre: do not keep the GRE
header around in collect medata mode") implements. It changes
dev->type to ARPHRD_NONE in collect_md mode.

Some of the inconsistency comes from the various modes of the GRE
driver. Which brings us to ipgre_header_ops. It is set only in two
special cases.

Commit 6a5f44d7a048 ("[IPV4] ip_gre: sendto/recvfrom NBMA address")
added ipgre_header_ops.parse to be able to receive the inner ip source
address with PF_PACKET recvfrom. And apparently relies on
ipgre_header_ops.create to be able to set an address, which implies
SOCK_DGRAM.

The other special case, CONFIG_NET_IPGRE_BROADCAST, predates git. Its
implementation starts with the beautiful comment "/* Nice toy.
Unfortunately, useless in real life :-)". From the rest of that
detailed comment, it is not clear to me why it would need to expose
the headers. The example does not use packet sockets.

A packet socket cannot know devices details such as which configurable
mode a device may be in. And different modes conflict with the basic
rule that for a given well defined link layer type, i.e., dev->type,
header length can be expected to be consistent. In an ideal world
these exceptions would not exist, therefore.

Unfortunately, this is legacy behavior that will have to continue to
be supported.

I agree that then swapping dev->needed_headroom for
dev->hard_header_len at init is then a good approach.

Underlying functions like ip_tunnel_xmit can modify needed_headroom at
runtime, not sure how that affects runtime changes in
ipgre_link_update:

"
max_headroom = LL_RESERVED_SPACE(rt->dst.dev) + sizeof(struct iphdr)
+ rt->dst.header_len + ip_encap_hlen(&tunnel->encap);
if (max_headroom > dev->needed_headroom)
dev->needed_headroom = max_headroom;
"

>  net/ipv4/ip_gre.c | 14 +++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
> index 4e31f23e4117..82fee0010353 100644
> --- a/net/ipv4/ip_gre.c
> +++ b/net/ipv4/ip_gre.c
> @@ -626,8 +626,7 @@ static netdev_tx_t ipgre_xmit(struct sk_buff *skb,
>
> if (dev->header_ops) {
> /* Need space for new headers */
> -   if (skb_cow_head(skb, dev->needed_headroom -
> - (tunnel->hlen + sizeof(struct iphdr
> +   if (skb_cow_head(skb, dev->hard_header_len))
> goto free_skb;
>
> tnl_params = (const struct iphdr *)skb->data;
> @@ -748,7 +747,11 @@ static void ipgre_link_update(struct net_device *dev, 
> bool set_mtu)
> len = tunnel->tun_hlen - len;
> tunnel->hlen = tunnel->hlen + len;
>
> -   dev->needed_headroom = dev->needed_headroom + len;
> +

Re: [Patch net v2] ip_gre: set dev->hard_header_len and dev->needed_headroom properly

2020-10-11 Thread Willem de Bruijn

On Sun, Oct 11, 2020 at 4:42 PM Xie He  wrote:
>
> On Sun, Oct 11, 2020 at 12:11 PM Cong Wang  wrote:
> >
> > GRE tunnel has its own header_ops, ipgre_header_ops, and sets it
> > conditionally. When it is set, it assumes the outer IP header is
> > already created before ipgre_xmit().
> >
> > This is not true when we send packets through a raw packet socket,
> > where L2 headers are supposed to be constructed by user. Packet
> > socket calls dev_validate_header() to validate the header. But
> > GRE tunnel does not set dev->hard_header_len, so that check can
> > be simply bypassed, therefore uninit memory could be passed down
> > to ipgre_xmit(). Similar for dev->needed_headroom.
> >
> > dev->hard_header_len is supposed to be the length of the header
> > created by dev->header_ops->create(), so it should be used whenever
> > header_ops is set, and dev->needed_headroom should be used when it
> > is not set.
>
> Hi, thanks for attempting to fix this tunnel. Are we still considering
> removing header_ops->create?
>
> As said in my email sent previously today, I want to remove
> header_ops->create because 1) this keeps the un-exposed headers of GRE
> devices consistent with those of GRETAP devices, and 2) I think the
> GRE header (and the headers before the GRE header) is not actually the
> L2 header of the tunnel (the Wikipedia page for "Generic Routing
> Encapsulation" doesn't consider this protocol to be at L2 either).
>
> I'm not sure if you still agree to remove header_ops->create. Do you
> still agree but think it'd be better to do that in a separate patch?
>
> Removing header_ops->create would simplify the fixing of the issue you
> are trying to fix, too, because that way we would no longer need to
> use header_ops or hard_header_len. Also, I'm worried that changing
> hard_header_len (or needed_headroom) in ipgre_link_update would have
> racing issues. If we remove header_ops, we no longer need to use
> hard_header_len and we can just set needed_headroom to the maximum
> value, so that we no longer need to update them in ipgre_link_update.

Our messages crossed.

It seems there are legacy expectations that sendto/recvfrom packet
sockets allow writing/reading the outer IP address, as of commit
6a5f44d7a048 ("[IPV4] ip_gre: sendto/recvfrom NBMA address"). That is
the express purpose of that commit.

The behavior is inconsistent with other tunnels, as you also point
out, and probably only rarely used if at all. I would love to get rid
of it, but given that we cannot be certain that it is unused, I'm afraid
that we have to continue to support this special case.

Re: [PATCH net-next 3/9] bnxt_en: Set driver default message level.

2020-10-11 Thread Michael Chan

On Sun, Oct 11, 2020 at 12:34 PM Jakub Kicinski  wrote:
>
> On Sun, 11 Oct 2020 06:22:55 -0400 Michael Chan wrote:
> > Currently, bp->msg_enable has default value of 0.  It is more useful
> > to have the commonly used NETIF_MSG_DRV and NETIF_MSG_HW enabled by
> > default.
> >
> > Reviewed-by: Edwin Peer 
> > Reviewed-by: Vasundhara Volam 
> > Signed-off-by: Michael Chan 
>
> This will add whole bunch of output for "RX buffer error 4000[45]", no?
> That one needs to switch to a silent reset first, I'd think.

The last round of net-next patches have made changes to reduce this
noise already.  If the firmware supports the new  RING_MONITOR scheme,
There will be no more messages.  The reset will be counted in a new
ethtool -S counter only.

If the firmware is older and does not support the new scheme, we've
changed the logging to warn_once() in addition to the ethtool -S
counter.  There is no point in warning more than once.

smime.p7s
Description: S/MIME Cryptographic Signature

Re: [PATCH] net: korina: free array used for rx/tx descriptors

2020-10-11 Thread Valentin Vidić

On Sun, Oct 11, 2020 at 02:37:33PM -0400, Willem de Bruijn wrote:
> Slightly off-topic, but I don't fully fathom what goes on with this
> pointer straight after the initial kmalloc.
> 
> lp->td_ring = (struct dma_desc *)KSEG1ADDR(lp->td_ring);

KSEG1ADDR should rewrite the memory address into the uncached region
for memory mapped I/O. Not sure if this would case problems for kfree
since there is another kfree on the fail path:

probe_err_register:
kfree(lp->td_ring);

-- 
Valentin

1 2 >

1 - 100 of 166 matches

Mail list logo