[RFC PATCH bpf-next 3/3] tools/bpf: add a selftest for bpf_send_signal() helper

2019-04-30 Thread Yonghong Song
The test covered both nmi and tracepoint perf events. $ ./test_send_signal_user test_send_signal (tracepoint): OK test_send_signal (perf_event): OK Signed-off-by: Yonghong Song --- tools/testing/selftests/bpf/Makefile | 5 +- tools/testing/selftests/bpf/bpf_helpers.h | 2 +

[RFC PATCH bpf-next 0/3] implement bpf_send_signal() helper

2019-04-30 Thread Yonghong Song
Currently, a bpf program can already collect stack traces when certain events happen (e.g., cache miss counter or cpu clock counter overflows). These stack traces can be used for performance analysis. For jitted programs, e.g., hhvm (jitted php), it is very hard to get the true stack trace in the bpf

[RFC PATCH bpf-next 2/3] tools/bpf: sync bpf uapi header bpf.h

2019-04-30 Thread Yonghong Song
sync bpf uapi header bpf.h to tools directory. Signed-off-by: Yonghong Song --- tools/include/uapi/linux/bpf.h | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 72336bac7573..e3e824848335 1006

[RFC PATCH bpf-next 1/3] bpf: implement bpf_send_signal() helper

2019-04-30 Thread Yonghong Song
This patch tries to solve the following specific use case. Currently, a bpf program can already collect stack traces when certain events happen (e.g., cache miss counter or cpu clock counter overflows). These stack traces can be used for performance analysis. For jitted programs, e.g., hhvm (jitted

[PATCH 5/5] net: Rename skb_frag page to bv_page

2019-04-30 Thread Matthew Wilcox
From: "Matthew Wilcox (Oracle)" One step closer to turning the skb_frag_t into a bio_vec. Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/skbuff.h | 12 +--- net/core/skbuff.c | 2 +- 2 files changed, 6 insertions(+), 8 deletions(-) diff --git a/include/linux/skbuff.h b

[PATCH 3/5] net: Include bvec.h in skbuff.h

2019-04-30 Thread Matthew Wilcox
From: "Matthew Wilcox (Oracle)" Add the dependency now, even though we're not using the bio_vec yet. Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/skbuff.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 9c6193a57241..bc416e5

[PATCH 1/5] net: Increase the size of skb_frag_t

2019-04-30 Thread Matthew Wilcox
From: "Matthew Wilcox (Oracle)" To increase commonality between block and net, we are going to replace the skb_frag_t with the bio_vec. This patch increases the size of skb_frag_t on 32-bit machines from 8 bytes to 12 bytes. The size is unchanged on 64-bit machines. Signed-off-by: Matthew Wilc
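The size change described above can be illustrated with a small layout model. This is only a sketch under stated assumptions: the struct names are hypothetical, the pointer is modelled with fixed-width integers so the numbers do not depend on the build machine, and the assumed "before" layout on 32-bit uses 16-bit offset and size fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 4-byte `struct page *` on a 32-bit machine. */
typedef uint32_t page_ptr32_t;

/* Assumed 32-bit layout before the patch: 4 + 2 + 2 bytes. */
struct frag_before32 {
    page_ptr32_t page;
    uint16_t page_offset;
    uint16_t size;
};

/* Assumed 32-bit layout after the patch: 4 + 4 + 4 bytes. */
struct frag_after32 {
    page_ptr32_t page;
    uint32_t page_offset;
    uint32_t size;
};

/* Assumed 64-bit layout, which already used 32-bit fields: 8 + 4 + 4 bytes. */
struct frag_64 {
    uint64_t page;
    uint32_t page_offset;
    uint32_t size;
};

/* Checks the sizes quoted in the patch description against the model. */
void check_frag_sizes(void)
{
    assert(sizeof(struct frag_before32) == 8);
    assert(sizeof(struct frag_after32) == 12);
    assert(sizeof(struct frag_64) == 16);
}

/* Extra bytes per fragment on the modelled 32-bit machine. */
size_t frag_growth_32bit(void)
{
    return sizeof(struct frag_after32) - sizeof(struct frag_before32);
}
```

In this model each fragment costs 4 more bytes on 32-bit, while the 64-bit layout is untouched, which matches the numbers in the description.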

[PATCH 2/5] net: Reorder the contents of skb_frag_t

2019-04-30 Thread Matthew Wilcox
From: "Matthew Wilcox (Oracle)" Match the layout of bio_vec. Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/skbuff.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 23f05c64aa31..9c6193a57241 100644 --- a/inclu

[PATCH 0/5] Beginnings of skb_frag -> bio_vec conversion

2019-04-30 Thread Matthew Wilcox
From: "Matthew Wilcox (Oracle)" It turns out there are a lot of accessors for the skb_frag, which would make this conversion really easy if some drivers didn't bypass them. This is what I've done so far; my laptop's not really beefy enough to cope with changing skbuff.h too often ;-) This would be

[PATCH 4/5] net: Use skb accessors for skb->page

2019-04-30 Thread Matthew Wilcox
From: "Matthew Wilcox (Oracle)" In preparation for renaming skb->page, use the fine accessors which already exist. Signed-off-by: Matthew Wilcox (Oracle) --- drivers/hsi/clients/ssi_protocol.c | 3 ++- drivers/net/ethernet/cavium/liquidio/lio_main.c| 2 +- drivers/net/ether

Re: [PATCH net] ipv6: fix races in ip6_dst_destroy()

2019-04-30 Thread David Miller
From: Eric Dumazet Date: Sun, 28 Apr 2019 12:22:25 -0700 > We had many syzbot reports that seem to be caused by use-after-free > of struct fib6_info. > > ip6_dst_destroy(), fib6_drop_pcpu_from() and rt6_remove_exception() > are writers vs rt->from, and use non consistent synchronization among >

Re: [PATCH bpf 1/2] libbpf: fix invalid munmap call

2019-04-30 Thread William Tu
On Tue, Apr 30, 2019 at 5:46 AM Björn Töpel wrote: > > From: Björn Töpel > > When unmapping the AF_XDP memory regions used for the rings, an > invalid address was passed to the munmap() calls. Instead of passing > the beginning of the memory region, the descriptor region was passed > to munmap. >
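The class of bug described in the quoted message can be sketched as follows. This is not the libbpf code: `struct ring_map`, its fields, and both functions are hypothetical, and an anonymous mapping stands in for the AF_XDP ring mapping. The point is only that the pointer handed to munmap() must be the one mmap() returned, not a pointer to the descriptors inside the region.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical bookkeeping for one mapped ring. */
struct ring_map {
    void  *base;   /* address returned by mmap(): start of the region */
    size_t len;    /* length passed to mmap() */
    void  *descs;  /* descriptor area, located at some offset inside the region */
};

/* Map a region and remember both the base and the descriptor pointer. */
int ring_map_create(struct ring_map *m, size_t len, size_t desc_off)
{
    void *p;

    assert(m != NULL && desc_off < len);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    m->base = p;
    m->len = len;
    m->descs = (char *)p + desc_off;
    return 0;
}

/* Unmap using the base address; passing m->descs here would be the bug. */
int ring_map_destroy(struct ring_map *m)
{
    assert(m != NULL);
    return munmap(m->base, m->len);
}
```

Keeping the base address next to the derived descriptor pointer makes it hard to pass the wrong one at teardown.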

Re: [PATCH bpf 2/2] libbpf: proper XSKMAP cleanup

2019-04-30 Thread William Tu
On Tue, Apr 30, 2019 at 5:46 AM Björn Töpel wrote: > > From: Björn Töpel > > The bpf_map_update_elem() function, when used on an XSKMAP, will fail > if a valid AF_XDP socket is not passed as the value. Therefore, this > function cannot be used to clear the XSKMAP. Instead, the > bpf_map_delete_ele

Re: [PATCH net-next 0/3] r8169: improve eri function handling

2019-04-30 Thread David Miller
From: Heiner Kallweit Date: Sun, 28 Apr 2019 11:09:36 +0200 > This series aims at improving and simplifying the eri functions. > No functional change intended. Series applied, thanks.

[PATCH mlx5-next] net/mlx5: Fix broken hca cap offset

2019-04-30 Thread Saeed Mahameed
The cited commit broke the offsets of the hca cap struct; fix it. While at it, clean up whitespace introduced by the same commit. Fixes: b169e64a2444 ("net/mlx5: Geneve, Add flow table capabilities for Geneve decap with TLV options") Reported-by: Qian Cai Cc: Yevgeny Kliteynik Signed-off-by: Saee

Re: [PATCH net-next 0/4] Convert mv88e6060 to mdio device

2019-04-30 Thread David Miller
From: Andrew Lunn Date: Sun, 28 Apr 2019 02:56:20 +0200 > This patchset builds upon the previous patches to mv88e6060. It adds > support for probing the switch as an MDIO device and then removes the > legacy probe method. Since this is the last device supporting legacy > probe, this allows legacy

Re: [PATCH net-next 00/13] Improvements to DSA core VLAN manipulation

2019-04-30 Thread David Miller
From: Vladimir Oltean Date: Sun, 28 Apr 2019 21:45:41 +0300 > In preparation of submitting the NXP SJA1105 driver, the Broadcom b53 > and Mediatek mt7530 drivers have been found to apply some VLAN > workarounds that are needed in the new driver as well. > > Therefore this patchset is mostly simp

[bpf-next PATCH v3 4/4] bpf: sockmap, only stop/flush strp if it was enabled at some point

2019-04-30 Thread John Fastabend
If we try to call strp_done on a parser that has never been initialized (for example, because the sockmap user is only using the TX side), we get the following error: [ 883.422081] WARNING: CPU: 1 PID: 208 at kernel/workqueue.c:3030 __flush_work+0x1ca/0x1e0 ... [ 883.422095] Workqueue: events s

[bpf-next PATCH v3 1/4] bpf: tls, implement unhash to avoid transition out of ESTABLISHED

2019-04-30 Thread John Fastabend
It is possible (via shutdown()) for TCP socks to go through TCP_CLOSE state via tcp_disconnect() without calling into close callback. This would allow a kTLS enabled socket to exist outside of ESTABLISHED state which is not supported. Solve this the same way we solved the sock{map|hash} case by ad

[bpf-next PATCH v3 2/4] bpf: sockmap remove duplicate queue free

2019-04-30 Thread John Fastabend
In tcp bpf remove we free the cork list and purge the ingress msg list. However, we do this before the ref count reaches zero, so it could be possible that some other access is in progress. In this case (tcp close and/or tcp_unhash) we happen to also hold the sock lock so no path exists but let's fix it ot

[bpf-next PATCH v3 3/4] bpf: sockmap fix msg->sg.size account on ingress skb

2019-04-30 Thread John Fastabend
When converting a skb to msg->sg we forget to set the size after the latest ktls/tls code conversion. This path can be reached by doing a redir into ingress path from BPF skb sock recv hook. Then trying to read the size fails. Fix this by setting the size. Fixes: 604326b41a6fb ("bpf, sockmap: co

[bpf-next PATCH v3 0/4] sockmap/ktls fixes

2019-04-30 Thread John Fastabend
Series of fixes for sockmap and ktls, see patches for descriptions. v2: fix build issue for CONFIG_TLS_DEVICE and fixup couple comments from Jakub v3: fix issue where release could call unhash resulting in a use after free. Now we detach the ulp pointer before calling into destroy or

[PATCH v2] ss: add option to print socket information on one line

2019-04-30 Thread Josh Hunt
Multi-line output in ss makes it difficult to search for things with grep. This new option will make it easier to find sockets matching certain criteria with simple grep commands. Example without option: $ ss -emoitn State Recv-Q Send-Q Local Address:Port Peer Address:Port ESTAB

Re: Bug#927825: arm: mvneta driver used on Armada XP GP boards does not receive packets (regression from 4.9)

2019-04-30 Thread Aurelien Jarno
On 2019-04-30 10:12, Uwe Kleine-König wrote: > [Adding the mvebu guys and netdev to Cc] > > Hello, > > On Thu, Apr 25, 2019 at 09:17:32PM +0200, Aurelien Jarno wrote: > > On 2019-04-25 14:50, Aurelien Jarno wrote: > > > On 2019-04-23 22:16, Aurelien Jarno wrote: > > > > Source: linux > > > > Vers

[PATCH net-next] net: dsa: mv88e6xxx: Pass interrupt number in platform data

2019-04-30 Thread Andrew Lunn
Allow an interrupt number to be passed in the platform data. The driver will then use it if not zero, otherwise it will poll for interrupts. Signed-off-by: Andrew Lunn --- drivers/net/dsa/mv88e6xxx/chip.c| 13 + include/linux/platform_data/mv88e6xxx.h | 1 + 2 files changed,
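The rule stated above (use the interrupt if the platform data carries a non-zero number, otherwise poll) can be sketched in a few lines. The struct, enum, and function names here are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical platform data: irq == 0 means "no interrupt supplied". */
struct switch_pdata {
    int irq;
};

enum irq_mode {
    IRQ_MODE_POLL = 0,
    IRQ_MODE_INTERRUPT = 1
};

/* Pick interrupt-driven operation only when an interrupt number was given. */
enum irq_mode pick_irq_mode(const struct switch_pdata *pd)
{
    assert(pd != NULL);
    return pd->irq != 0 ? IRQ_MODE_INTERRUPT : IRQ_MODE_POLL;
}
```

Treating zero as "not set" means platform data that never initializes the field keeps the polling behaviour.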

[PATCH net-next 2/2] net: dsa :mv88e6xxx: Disable unused ports

2019-04-30 Thread Andrew Lunn
If the NO_CPU strap is set, the switch starts in 'dumb hub' mode, with all ports enabled. Ports which are then actively used are reconfigured as required when the driver starts. However, unused ports are left alone. Change this to disable them, and turn off any SERDES interface. This could save some

[PATCH net-next 0/2] mv88e6xxx: Disable ports to save power

2019-04-30 Thread Andrew Lunn
Save some power by disabling ports. The first patch fully disables a port when it is runtime disabled. The second disables any ports which are not used at all. Depending on configuration strapping, this can lower the temperature of an idle switch a few degrees. Andrew Lunn (2): net: dsa: mv88e6

[PATCH net-next 1/2] net: dsa: mv88e6xxx: Set STP disable state in port_disable

2019-04-30 Thread Andrew Lunn
When requested to disable a port, set the port STP state to disabled. This fully disables the port and should save some power. Signed-off-by: Andrew Lunn --- drivers/net/dsa/mv88e6xxx/chip.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa

[net-next 13/15] net/mlx5: E-Switch, Use getter to access all vport array

2019-04-30 Thread Saeed Mahameed
From: Bodong Wang Some functions issue vport commands and access the vport array using vport_index/vport_num interchangeably, which is OK for VF vports. However, this creates a potential bug if those vports are not VFs (e.g., uplink, sf), where their vport_index does not equal vport_num. Prepare code to

[net-next 10/15] net/mlx5: Remove unused mlx5_query_nic_vport_vlans

2019-04-30 Thread Saeed Mahameed
From: Bodong Wang mlx5_query_nic_vport_vlans() is not used anymore. Hence remove it. This patch doesn't change any functionality. Signed-off-by: Bodong Wang Reviewed-by: Parav Pandit Signed-off-by: Saeed Mahameed --- .../net/ethernet/mellanox/mlx5/core/vport.c | 61 --- inc

[net-next 12/15] net/mlx5: Use available mlx5_vport struct

2019-04-30 Thread Saeed Mahameed
From: Parav Pandit Several functions need to access mlx5_vport and vport_num. When these functions are called, the caller already has mlx5_vport* available. Hence pass such mlx5_vport pointer. This is a preparation patch to add error checks to mlx5_eswitch_get_vport() and to return error status. By do

[net-next 14/15] net/mlx5: E-Switch, Fix the check of legal vport

2019-04-30 Thread Saeed Mahameed
From: Bodong Wang The check of legal vport is to ensure the vport number falls between 0 and total number of vports. Along with the introduction of uplink rep, enabled vports are not consecutive any more. Therefore, rely on the eswitch vport getter function to check if it's a valid vport. As the

[net-next 06/15] ethtool: Add SFF-8436 and SFF-8636 max EEPROM length definitions

2019-04-30 Thread Saeed Mahameed
From: Erez Alfasi Added max EEPROM length defines for ethtool usage: #define ETH_MODULE_SFF_8636_MAX_LEN 640 #define ETH_MODULE_SFF_8436_MAX_LEN 640 These definitions exist in ethtool and are used to determine the EEPROM data length when reading high pages as well. For example, SFF-8

[net-next 15/15] net/mlx5: E-Switch, Use atomic rep state to serialize state change

2019-04-30 Thread Saeed Mahameed
From: Bodong Wang When the state of rep was introduced, it was also designed to prevent duplicate unloading of the same rep. Considering the following two flows when an eswitch manager is at switchdev mode with n VF reps loaded. +--+---
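The message is cut off before it describes the two flows, so the following is only a generic sketch of how an atomic state can keep two callers from unloading the same rep twice. The state names, `struct rep`, and `rep_unload()` are hypothetical, and C11 atomics stand in for the kernel's atomic helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rep states. */
enum rep_state {
    REP_UNREGISTERED = 0,
    REP_REGISTERED   = 1,
    REP_LOADED       = 2
};

struct rep {
    atomic_int state;
    int unload_count;  /* how many times the unload work actually ran */
};

/*
 * Move LOADED -> REGISTERED atomically. Only the caller that wins the
 * compare-and-swap performs the unload; any other caller sees that the
 * state is no longer LOADED and backs off.
 */
bool rep_unload(struct rep *r)
{
    int expected = REP_LOADED;

    assert(r != NULL);
    if (!atomic_compare_exchange_strong(&r->state, &expected, REP_REGISTERED))
        return false;
    r->unload_count++;
    return true;
}
```

Because the state transition and the ownership decision are a single atomic step, a second unload request for the same rep becomes a no-op.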

[net-next 11/15] net/mlx5: Reuse mlx5_esw_for_each_vf_vport macro in two files

2019-04-30 Thread Saeed Mahameed
From: Parav Pandit Currently mlx5_esw_for_each_vf_vport iterates over mlx5_vport entries in eswitch.c. The same macro in eswitch_offloads.c iterates over vport numbers. Instead of duplicating macro names, to avoid confusion and to reuse the same macro in both files, move it to eswit

[net-next 09/15] net/mlx5e: remove meaningless CFLAGS_tracepoint.o

2019-04-30 Thread Saeed Mahameed
From: Masahiro Yamada CFLAGS_tracepoint.o specifies CFLAGS for compiling tracepoint.c but it does not exist under drivers/net/ethernet/mellanox/mlx5/core/. CFLAGS_tracepoint.o is unused. Signed-off-by: Masahiro Yamada Signed-off-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/

[net-next 07/15] net/mlx5e: ethtool, Add support for EEPROM high pages query

2019-04-30 Thread Saeed Mahameed
From: Erez Alfasi Add the support to read additional EEPROM information from high pages. Information for modules such as SFF-8436 and SFF-8636: 1) Application select table 2) User writable EEPROM 3) Thresholds and alarms Signed-off-by: Erez Alfasi Signed-off-by: Saeed Mahameed --- .../ethe

[net-next 03/15] net/mlx5e: ACLs for priority tag mode

2019-04-30 Thread Saeed Mahameed
From: Eli Britstein Current ConnectX HW is unable to perform VLAN pop in TX path and VLAN push on RX path. As a workaround, untagged packets are tagged with VID 0x000 allowing pop/push actions to be exchanged with VLAN rewrite actions. Use the ingress ACL table, preceding the FDB, to push VLAN 0x

[net-next 05/15] net/mlx5e: Return error when trying to insert existing flower filter

2019-04-30 Thread Saeed Mahameed
From: Vlad Buslov With unlocked TC it is possible to have spurious deletes and inserts of the same filter. The TC layer needs drivers to always return an error when flow insertion fails in order to correctly calculate "in_hw_count" for each filter. Fix mlx5e_configure_flower() to return -EEXIST when TC tri

[net-next 08/15] net/mlx5e: Put the common XDP code into a function

2019-04-30 Thread Saeed Mahameed
From: Maxim Mikityanskiy The same code that returns XDP frames and releases pages is used both in mlx5e_poll_xdpsq_cq and mlx5e_free_xdpsq_descs. Create a function that cleans up an MPWQE. Signed-off-by: Maxim Mikityanskiy Signed-off-by: Saeed Mahameed --- .../net/ethernet/mellanox/mlx5/core/

[net-next 04/15] net/mlx5e: Replace TC VLAN pop with VLAN 0 rewrite in prio tag mode

2019-04-30 Thread Saeed Mahameed
From: Eli Britstein Current ConnectX HW is unable to perform VLAN pop in TX path and VLAN push on RX path. To work around that limitation, untagged packets are tagged with VLAN ID 0x000 (priority tag) and pop/push actions are replaced by VLAN re-write actions (which are supported by the HW). Replac

[net-next 01/15] net/mlx5e: Take common TIR context settings into a function

2019-04-30 Thread Saeed Mahameed
From: Tariq Toukan Many TIR context settings are common to different TIR types, take them into a common function. Signed-off-by: Tariq Toukan Reviewed-by: Aya Levin Signed-off-by: Saeed Mahameed --- .../net/ethernet/mellanox/mlx5/core/en_main.c | 49 --- 1 file changed, 21 in

[net-next 02/15] net/mlx5e: Turn on HW tunnel offload in all TIRs

2019-04-30 Thread Saeed Mahameed
From: Tariq Toukan Hardware requires that all TIRs that steer traffic to the same RQ share an identical tunneled_offload_en value. For that, the tunneled_offload_en bit should be set/unset (according to the HW capability) for all TIRs, not only the ones dedicated for tunneled (inner) traffic

[pull request][net-next 00/15] Mellanox, mlx5 updates 2019-04-30

2019-04-30 Thread Saeed Mahameed
Hi Dave, This series provides misc updates to mlx5 driver. There is one patch of this series that is touching outside mlx5 driver: ethtool.h: Add SFF-8436 and SFF-8636 max EEPROM length definitions Added max EEPROM length defines for ethtool usage: #define ETH_MODULE_SFF_8636_MAX_LEN 640

Re: [PATCH iproute2-next] ss: add option to print socket information on one line

2019-04-30 Thread Josh Hunt
On 4/30/19 12:41 PM, David Ahern wrote: On 4/30/19 12:55 PM, Josh Hunt wrote: Actually, David can you clarify what you meant by "use 'oneline' as the long option without the '-'."? for your patch: 1,$s/one-line/oneline/ ip has -oneline which is most likely used as 'ip -o'. having ss with --on

Re: [PATCH iproute2-next] ss: add option to print socket information on one line

2019-04-30 Thread David Ahern
On 4/30/19 12:55 PM, Josh Hunt wrote: > Actually, David can you clarify what you meant by "use 'oneline' as the > long option without the '-'."? for your patch: 1,$s/one-line/oneline/ ip has -oneline which is most likely used as 'ip -o'. having ss with --one-line vs --oneline is at least consiste

Re: [PATCH bpf-next 1/6] tools: bpftool: add --log-libbpf option to get debug info from libbpf

2019-04-30 Thread Jakub Kicinski
On Tue, 30 Apr 2019 08:31:53 -0700, Y Song wrote: > On Tue, Apr 30, 2019 at 2:34 AM Quentin Monnet > wrote: > > > > Hi Yonghong, > > > > 2019-04-29 16:32 UTC-0700 ~ Y Song > > > On Mon, Apr 29, 2019 at 2:53 AM Quentin Monnet > > > wrote: > > >> > > >> libbpf has three levels of priority for

Re: [PATCH iproute2-next] ss: add option to print socket information on one line

2019-04-30 Thread Josh Hunt
On 4/30/19 11:31 AM, Josh Hunt wrote: On 4/30/19 11:30 AM, David Ahern wrote: On 4/25/19 3:21 PM, Josh Hunt wrote: @@ -4877,6 +4903,7 @@ static void _usage(FILE *dest)   "\n"   "   -K, --kill  forcibly close sockets, display what was closed\n"   "   -H, --no-header Suppress header

[Patch net-next] net: add a generic tracepoint for TX queue timeout

2019-04-30 Thread Cong Wang
Although devlink health report does a nice job on reporting TX timeout and other NIC errors, unfortunately it requires drivers to support it but currently only mlx5 has implemented it. Before other drivers could catch up, it is useful to have a generic tracepoint to monitor this kind of TX timeout.

Re: [PATCH iproute2-next] ss: add option to print socket information on one line

2019-04-30 Thread Josh Hunt
On 4/30/19 11:30 AM, David Ahern wrote: On 4/25/19 3:21 PM, Josh Hunt wrote: @@ -4877,6 +4903,7 @@ static void _usage(FILE *dest) "\n" " -K, --kill forcibly close sockets, display what was closed\n" " -H, --no-header Suppress header line\n" +" -O, --one-line socket'

Re: [PATCH iproute2-next] ss: add option to print socket information on one line

2019-04-30 Thread David Ahern
On 4/25/19 3:21 PM, Josh Hunt wrote: > @@ -4877,6 +4903,7 @@ static void _usage(FILE *dest) > "\n" > " -K, --kill forcibly close sockets, display what was closed\n" > " -H, --no-header Suppress header line\n" > +" -O, --one-line socket's data printed on a single line\n" >

Re: [PATCH net] ipv6: A few fixes on dereferencing rt->from

2019-04-30 Thread Wei Wang
On Tue, Apr 30, 2019 at 10:45 AM Martin KaFai Lau wrote: > > It is a followup after the fix in > commit 9c69a1320515 ("route: Avoid crash from dereferencing NULL rt->from") > > rt6_do_redirect(): > 1. NULL checking is needed on rt->from because a parallel >fib6_info delete could happen that se

Re: [PATCH net-next 4/6] xsk: Extend channels to support combined XSK/non-XSK traffic

2019-04-30 Thread Björn Töpel
On 2019-04-30 20:11, Maxim Mikityanskiy wrote: I'm going to respin this series, adding the mlx5 patches and addressing the other comments. If you feel like there are more things to discuss regarding this patch, please move on to the v2. Very cool, thanks for working on this Maxim! It's Labor D

Re: [PATCH iproute2-next v2] devlink: Increase column size for larger shared buffers

2019-04-30 Thread David Ahern
On 4/30/19 2:42 AM, Ido Schimmel wrote: > From: Ido Schimmel > > With current number of spaces the output is mangled if the shared buffer > is congested. > ... > > v2: > * Increase number of spaces to make the change more future-proof > > Signed-off-by: Ido Schimmel > Reported-by: Alex Kush

[PATCH bpf-next v2 07/16] net/mlx5e: Replace deprecated PCI_DMA_TODEVICE

2019-04-30 Thread Maxim Mikityanskiy
The PCI API for DMA is deprecated, and PCI_DMA_TODEVICE is just defined to DMA_TO_DEVICE for backward compatibility. Just use DMA_TO_DEVICE. Signed-off-by: Maxim Mikityanskiy Reviewed-by: Tariq Toukan Acked-by: Saeed Mahameed --- drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c | 2 +- 1 file

[PATCH bpf-next v2 11/16] net/mlx5e: Share the XDP SQ for XDP_TX between RQs

2019-04-30 Thread Maxim Mikityanskiy
Put the XDP SQ that is used for XDP_TX into the channel. It used to be a part of the RQ, but with introduction of AF_XDP there will be one more RQ that could share the same XDP SQ. This patch is a preparation for that change. Separate XDP_TX statistics per RQ were implemented in one of the previou

[PATCH bpf-next v2 09/16] net/mlx5e: Allow ICO SQ to be used by multiple RQs

2019-04-30 Thread Maxim Mikityanskiy
Prepare for the creation of the XSK RQ, which will require posting UMRs, too. The same ICO SQ will be used for both RQs and also to trigger interrupts by posting NOPs. UMR WQEs can't be reused any more. The optimization introduced in commit ab966d7e4ff98 ("net/mlx5e: RX, Recycle buffer of UMR WQEs") is reve

[PATCH bpf-next v2 15/16] net/mlx5e: Move queue param structs to en/params.h

2019-04-30 Thread Maxim Mikityanskiy
structs mlx5e_{rq,sq,cq,channel}_param are going to be used in the upcoming XSK RX and TX patches. Move them to a header file to make them accessible from other C files. Signed-off-by: Maxim Mikityanskiy Reviewed-by: Tariq Toukan Acked-by: Saeed Mahameed --- .../ethernet/mellanox/mlx5/core/en/

[PATCH bpf-next v2 13/16] net/mlx5e: Consider XSK in XDP MTU limit calculation

2019-04-30 Thread Maxim Mikityanskiy
Use the existing mlx5e_get_linear_rq_headroom function to calculate the headroom for mlx5e_xdp_max_mtu. This function takes the XSK headroom into consideration, which will be used in the following patches. Signed-off-by: Maxim Mikityanskiy Reviewed-by: Tariq Toukan Acked-by: Saeed Mahameed ---

[PATCH bpf-next v2 14/16] net/mlx5e: Encapsulate open/close queues into a function

2019-04-30 Thread Maxim Mikityanskiy
Create new functions mlx5e_{open,close}_queues to encapsulate opening and closing RQs and SQs, and call the new functions from mlx5e_{open,close}_channel. It simplifies the existing functions a bit and prepares them for the upcoming AF_XDP changes. Signed-off-by: Maxim Mikityanskiy Reviewed-by: T

[PATCH bpf-next v2 08/16] net/mlx5e: Calculate linear RX frag size considering XSK

2019-04-30 Thread Maxim Mikityanskiy
Additional conditions introduced: - XSK implies XDP. - Headroom includes the XSK headroom if it exists. - No space is reserved for struct shared_skb_info in XSK mode. - Fragment size smaller than the XSK chunk size is not allowed. A new auxiliary function mlx5e_get_linear_rq_headroom with the sup

[PATCH bpf-next v2 10/16] net/mlx5e: Refactor struct mlx5e_xdp_info

2019-04-30 Thread Maxim Mikityanskiy
Currently, struct mlx5e_xdp_info has some issues that have to be cleaned up before the upcoming AF_XDP support makes things too complicated and messy. This structure is used both when sending the packet and on completion. Moreover, the cleanup procedure on completion depends on the origin of the pa

[PATCH bpf-next v2 12/16] net/mlx5e: XDP_TX from UMEM support

2019-04-30 Thread Maxim Mikityanskiy
When an XDP program returns XDP_TX, and the RQ is XSK-enabled, it requires careful handling, because convert_to_xdp_frame creates a new page and copies the data there, while our driver expects the xdp_frame to point to the same memory as the xdp_buff. Handle this case separately: map the page, and

[PATCH bpf-next v2 06/16] xsk: Return the whole xdp_desc from xsk_umem_consume_tx

2019-04-30 Thread Maxim Mikityanskiy
Some drivers want to access the data transmitted in order to implement acceleration features of the NICs. It is also useful in AF_XDP TX flow. Change the xsk_umem_consume_tx API to return the whole xdp_desc, that contains the data pointer, length and DMA address, instead of only the latter two. Ad

[PATCH bpf-next v2 05/16] xsk: Change the default frame size to 4096 and allow controlling it

2019-04-30 Thread Maxim Mikityanskiy
The typical XDP memory scheme is one packet per page. Change the AF_XDP frame size in libbpf to 4096, which is the page size on x86, to allow libbpf to be used with the drivers with the packet-per-page scheme. Add a command line option -f to xdpsock to allow specifying a custom frame size. Signed

[PATCH bpf-next v2 04/16] xsk: Extend channels to support combined XSK/non-XSK traffic

2019-04-30 Thread Maxim Mikityanskiy
Currently, the drivers that implement AF_XDP zero-copy support (e.g., i40e) switch the channel into a different mode when an XSK is opened. It causes some issues that have to be taken into account. For example, RSS needs to be reconfigured to skip the XSK-enabled channels, or the XDP program should

[PATCH bpf-next v2 03/16] libbpf: Support getsockopt XDP_OPTIONS

2019-04-30 Thread Maxim Mikityanskiy
Query XDP_OPTIONS in libbpf to determine if the zero-copy mode is active or not. Signed-off-by: Maxim Mikityanskiy Reviewed-by: Tariq Toukan Acked-by: Saeed Mahameed --- tools/lib/bpf/xsk.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk

[PATCH bpf-next v2 02/16] xsk: Add getsockopt XDP_OPTIONS

2019-04-30 Thread Maxim Mikityanskiy
Make it possible for the application to determine whether the AF_XDP socket is running in zero-copy mode. To achieve this, add a new getsockopt option XDP_OPTIONS that returns flags. The only flag supported for now is the zero-copy mode indicator. Signed-off-by: Maxim Mikityanskiy Reviewed-by: Ta

[PATCH bpf-next v2 00/16] AF_XDP infrastructure improvements and mlx5e support

2019-04-30 Thread Maxim Mikityanskiy
This series contains improvements to the AF_XDP kernel infrastructure and AF_XDP support in mlx5e. The infrastructure improvements are required for mlx5e, but some of them also benefit all drivers, and some can be useful for other drivers that want to implement AF_XDP. The performance testing w

[PATCH bpf-next v2 01/16] xsk: Add API to check for available entries in FQ

2019-04-30 Thread Maxim Mikityanskiy
Add a function that checks whether the Fill Ring has the specified number of descriptors available. It will be useful for mlx5e, which wants to check in advance whether it can allocate a bulk of RX descriptors, to get the best performance. Signed-off-by: Maxim Mikityanskiy Reviewed-by: Tariq Touka
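A check of this kind can be sketched on a generic single-producer/single-consumer ring. This is not the function added by the patch: `struct ring` and `ring_has_entries()` are hypothetical names, and the free-running 32-bit counters are an assumption about how such a ring might be indexed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring with free-running producer and consumer counters. */
struct ring {
    uint32_t producer;  /* total entries ever produced */
    uint32_t consumer;  /* total entries ever consumed */
};

/*
 * Return true if at least `cnt` entries are waiting to be consumed.
 * Unsigned subtraction gives the right answer even after the counters
 * wrap around 2^32.
 */
bool ring_has_entries(const struct ring *q, uint32_t cnt)
{
    uint32_t avail;

    assert(q != NULL);
    avail = q->producer - q->consumer;
    return avail >= cnt;
}
```

A driver could call such a check once before allocating a batch of descriptors instead of discovering partway through that the ring has run dry.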

Re: [PATCH net-next 4/6] xsk: Extend channels to support combined XSK/non-XSK traffic

2019-04-30 Thread Maxim Mikityanskiy
On 2019-04-26 23:26, Björn Töpel wrote: > On 2019-04-26 13:42, Maxim Mikityanskiy wrote: >> Currently, the drivers that implement AF_XDP zero-copy support (e.g., >> i40e) switch the channel into a different mode when an XSK is opened. It >> causes some issues that have to be taken into account. For

Re: [PATCH net-next 4/6] xsk: Extend channels to support combined XSK/non-XSK traffic

2019-04-30 Thread Maxim Mikityanskiy
On 2019-04-26 22:11, Jakub Kicinski wrote: > On Fri, 26 Apr 2019 11:42:37 +, Maxim Mikityanskiy wrote: >> Currently, the drivers that implement AF_XDP zero-copy support (e.g., >> i40e) switch the channel into a different mode when an XSK is opened. It >> causes some issues that have to be taken

Re: [PATCH net] selftests: fib_rule_tests: Fix icmp proto with ipv6

2019-04-30 Thread David Ahern
On 4/29/19 8:37 PM, Hangbin Liu wrote: > Another issue is that the IPv4 rule 'from iif' check test failed while IPv6 > passed. I haven't found out the reason yet. > > # ip -netns testns rule add from 192.51.100.3 iif dummy0 table 100 > # ip -netns testns route get 192.51.100.2 from 192.51.100.3 iif du

[PATCH net] ipv6: A few fixes on dereferencing rt->from

2019-04-30 Thread Martin KaFai Lau
It is a followup after the fix in commit 9c69a1320515 ("route: Avoid crash from dereferencing NULL rt->from") rt6_do_redirect(): 1. NULL checking is needed on rt->from because a parallel fib6_info delete could happen that sets rt->from to NULL. (e.g. rt6_remove_exception() and fib6_drop_pcpu

[PATCH iproute2-next v2] tc: add support for plug qdisc

2019-04-30 Thread Paolo Abeni
sch_plug can be used to perform functional qdisc unit tests, explicitly controlling the queuing behaviour from user-space. Plug support has been lacking since its introduction in 2012. This change introduces basic support, to control the tc status. v1 -> v2: - use the SPDX identifier Signed-off-by: Paolo A

Re: [PATCH net] l2ip: fix possible use-after-free

2019-04-30 Thread David Miller
From: Eric Dumazet Date: Tue, 30 Apr 2019 06:27:58 -0700 > Before taking a refcount on a rcu protected structure, > we need to make sure the refcount is not zero. > > syzbot reported : ... > Fixes: 54652eb12c1b ("l2tp: hold tunnel while looking up sessions in > l2tp_netlink") > Signed-off-by:
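The rule quoted above (do not take a reference on an RCU-protected object whose count has already dropped to zero) can be sketched with C11 atomics. This is a user-space illustration only: `struct tunnel` and `tunnel_get_unless_zero()` are hypothetical names and stand in for whatever refcount helper the actual fix uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object found during a lockless lookup. */
struct tunnel {
    atomic_uint refcnt;
};

/*
 * Take a reference only if the count is still non-zero. A count of zero
 * means the object is already on its way to being freed, so the lookup
 * must treat it as not found.
 */
bool tunnel_get_unless_zero(struct tunnel *t)
{
    unsigned int old;

    assert(t != NULL);
    old = atomic_load(&t->refcnt);
    while (old != 0) {
        /* On failure, `old` is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&t->refcnt, &old, old + 1))
            return true;
    }
    return false;
}
```

An unconditional increment would resurrect a dying object and lead to the kind of use-after-free the subject line mentions; the conditional increment closes that window.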

Re: [PATCH bpf 0/2] libbpf: fixes for AF_XDP teardown

2019-04-30 Thread Jonathan Lemon
On 30 Apr 2019, at 5:45, Björn Töpel wrote: > William found two bugs, when doing socket teardown within the same > process. > > The first issue was an invalid munmap call, and the second one was an > invalid XSKMAP cleanup. Both resulted in that the process kept > references to the socket, whic

Re: [PATCH bpf-next 1/6] tools: bpftool: add --log-libbpf option to get debug info from libbpf

2019-04-30 Thread Y Song
On Tue, Apr 30, 2019 at 2:34 AM Quentin Monnet wrote: > > Hi Yonghong, > > 2019-04-29 16:32 UTC-0700 ~ Y Song > > On Mon, Apr 29, 2019 at 2:53 AM Quentin Monnet > > wrote: > >> > >> libbpf has three levels of priority for output: warn, info, debug. By > >> default, debug output is not printed to

Re: [PATCH] net: dsa: bcm_sf2: fix buffer overflow doing set_rxnfc

2019-04-30 Thread David Miller
From: Dan Carpenter Date: Tue, 30 Apr 2019 13:44:19 +0300 > The "fs->location" is a u32 that comes from the user in ethtool_set_rxnfc(). > We can't pass unclamped values to test_bit() or it results in an out of > bounds access beyond the end of the bitmap. > > Fixes: 7318166cacad ("net: dsa: bcm
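The quoted fix boils down to range-checking a user-supplied index before using it to address a bitmap. A minimal sketch, assuming a hypothetical table size `NUM_RULES` and hypothetical helper names (the real driver's limits and function names are not shown in the excerpt):

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_RULES     128u  /* hypothetical number of rule slots */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS  ((NUM_RULES + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct rule_table {
    unsigned long used[BITMAP_WORDS];  /* one bit per rule slot */
};

/* Plain bit test; the caller must guarantee nr < NUM_RULES. */
static int bit_is_set(const unsigned long *map, uint32_t nr)
{
    return (int)((map[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL);
}

/*
 * `location` is a u32 that comes straight from user space, so reject
 * out-of-range values before touching the bitmap.
 */
int rule_is_used(const struct rule_table *t, uint32_t location)
{
    assert(t != NULL);
    if (location >= NUM_RULES)
        return -EINVAL;
    return bit_is_set(t->used, location);
}
```

Without the range check, a large `location` would index far past the end of `used`, which is the out-of-bounds access the message describes.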

Re: [PATCH iproute2-next] tc: add support for plug qdisc

2019-04-30 Thread Paolo Abeni
On Mon, 2019-04-29 at 10:16 -0700, Stephen Hemminger wrote: > On Fri, 26 Apr 2019 10:47:52 +0200 > Paolo Abeni wrote: > > The problem here is that the sch_plug qdisc does not implement the > > dump() qdisc_op, so this callback has nothing to dump. > > > > Must I patch sch_plug first? > > > > Tha

Re: [PATCH v4 net-next 1/3] ipv4: Move cached routes to fib_nh_common

2019-04-30 Thread Ido Schimmel
On Tue, Apr 30, 2019 at 07:45:48AM -0700, David Ahern wrote: > From: David Ahern > > While the cached routes, nh_pcpu_rth_output and nh_rth_input, are IPv4 > specific, a later patch wants to make them accessible for IPv6 nexthops > with IPv4 routes using a fib6_nh. Move the cached routes from fib

[PATCH v4 net-next 2/3] ipv4: Pass fib_nh_common to rt_cache_route

2019-04-30 Thread David Ahern
From: David Ahern Now that the cached routes are in fib_nh_common, pass it to rt_cache_route and simplify its callers. For rt_set_nexthop, the tclassid becomes the last user of fib_nh so move the container_of under the #ifdef CONFIG_IP_ROUTE_CLASSID. Signed-off-by: David Ahern Reviewed-by: Ido

[PATCH v4 net-next 3/3] ipv4: Move exception bucket to nh_common

2019-04-30 Thread David Ahern
From: David Ahern Similar to the cached routes, make IPv4 exceptions accessible when using an IPv6 nexthop struct with IPv4 routes. Simplify the exception functions by passing in fib_nh_common since that is all it needs, and then cleanup the call sites that have extraneous fib_nh conversions. As

[PATCH v4 net-next 0/3] ipv4: Move location of pcpu route cache and exceptions

2019-04-30 Thread David Ahern
From: David Ahern This series moves IPv4 pcpu cached routes from fib_nh to fib_nh_common to make the caches available for IPv6 nexthops (fib6_nh) with IPv4 routes. This allows a fib6_nh struct to be used with both IPv4 and IPv6 routes. v4 - fixed memleak if encap_type is not set as noticed b

[PATCH v4 net-next 1/3] ipv4: Move cached routes to fib_nh_common

2019-04-30 Thread David Ahern
From: David Ahern While the cached routes, nh_pcpu_rth_output and nh_rth_input, are IPv4 specific, a later patch wants to make them accessible for IPv6 nexthops with IPv4 routes using a fib6_nh. Move the cached routes from fib_nh to fib_nh_common and update references. Initialization of the cach

Re: [PATCH v3 net-next 1/3] ipv4: Move cached routes to fib_nh_common

2019-04-30 Thread David Ahern
On 4/30/19 12:40 AM, Ido Schimmel wrote: > On Mon, Apr 29, 2019 at 09:16:17AM -0700, David Ahern wrote: >> /* Release a nexthop info record */ >> @@ -491,9 +491,15 @@ int fib_nh_common_init(struct fib_nh_common *nhc, >> struct nlattr *encap, >> u16 encap_type, void *cfg, gfp_t

Re: [PATCH] netlink: limit recursion depth in policy validation

2019-04-30 Thread David Miller
From: Johannes Berg Date: Tue, 30 Apr 2019 08:58:10 +0200 > If you prefer to have the safeguard in net even if it shouldn't be > needed now, let me know and I'll make a version that applies there, but > note that will invariably cause conflicts with all the other changes in > lib/nlattr.c. No, t

Re: pull request (net-next): ipsec-next 2019-04-30

2019-04-30 Thread David Miller
From: Steffen Klassert Date: Tue, 30 Apr 2019 08:37:09 +0200 > 1) A lot of work to remove indirections from the xfrm code. > From Florian Westphal. > > 2) Support ESP offload in combination with gso partial. > From Boris Pismenny. > > 3) Remove some duplicated code from vti4. > From Je

[PATCH net] l2ip: fix possible use-after-free

2019-04-30 Thread Eric Dumazet
Before taking a refcount on a rcu protected structure, we need to make sure the refcount is not zero. syzbot reported : refcount_t: increment on 0; use-after-free. WARNING: CPU: 1 PID: 23533 at lib/refcount.c:156 refcount_inc_checked lib/refcount.c:156 [inline] WARNING: CPU: 1 PID: 23533 at lib/
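
The rule stated in this message (only take a reference if the count is still non-zero) can be illustrated in user space. The sketch below is not the patch itself, whose diff is not visible in this excerpt. It uses C11 atomics, and `struct obj` and `obj_get_unless_zero()` are hypothetical names chosen for illustration. In the kernel, `refcount_inc_not_zero()` provides this behaviour.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object that readers find via an
 * RCU-protected lookup while its last reference may be going away. */
struct obj {
    atomic_uint refcnt;
};

/* Take a reference only if the count is still non-zero.
 * Returns true on success; false means the object is already being
 * torn down and the caller must treat the lookup as a miss. */
static bool obj_get_unless_zero(struct obj *o)
{
    unsigned int old = atomic_load(&o->refcnt);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value,
         * so the loop re-checks for zero before retrying. */
        if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
            return true;
    }
    return false;
}
```

A caller would use the object only when `obj_get_unless_zero()` returns true. An unconditional increment on a count that has already reached zero is what produces the "increment on 0; use-after-free" warning quoted above.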

selftests/bpf/test_tag takes ~30 minutes?

2019-04-30 Thread Michael Ellerman
Hi Daniel, I'm running selftests/bpf/test_tag and it's taking roughly half an hour to complete, is that expected? I don't really grok what the test is doing TBH, but it does appear to be doing it 5 times :) for (i = 0; i < 5; i++) { do_test(&tests, 2, -1, bpf_gen_imm_

Re: pull request (net): ipsec 2019-04-30

2019-04-30 Thread David Miller
From: Steffen Klassert Date: Tue, 30 Apr 2019 07:30:18 +0200 ... > Please pull or let me know if there are problems. Pulled, thanks Steffen.

[PATCH bpf 2/2] libbpf: proper XSKMAP cleanup

2019-04-30 Thread Björn Töpel
From: Björn Töpel The bpf_map_update_elem() function, when used on an XSKMAP, will fail if the value passed is not a valid AF_XDP socket. Therefore, this function cannot be used to clear the XSKMAP. Instead, the bpf_map_delete_elem() function should be used for that. This patch also simplifies

[PATCH bpf 1/2] libbpf: fix invalid munmap call

2019-04-30 Thread Björn Töpel
From: Björn Töpel When unmapping the AF_XDP memory regions used for the rings, an invalid address was passed to the munmap() calls. Instead of passing the beginning of the memory region, the descriptor region was passed to munmap. When the userspace application tried to tear down an AF_XDP socke
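
The class of bug described here can be sketched outside libbpf. In the example below, a mapping is created with `mmap()`, a pointer into the middle of it is kept for the descriptor area, and teardown unmaps from the saved base address and not from the interior pointer. `struct ring_map`, `ring_map_create()` and `ring_map_destroy()` are hypothetical names, and an anonymous mapping stands in for the AF_XDP ring mapping. This is an illustration of the idea, not the libbpf code.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical bookkeeping for one mapped ring. */
struct ring_map {
    void *base;   /* address returned by mmap(), needed for munmap() */
    size_t len;   /* length of the whole mapping */
    void *descs;  /* interior pointer: base + descriptor offset */
};

static int ring_map_create(struct ring_map *r, size_t len, size_t desc_off)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;

    r->base = p;
    r->len = len;
    r->descs = (char *)p + desc_off;
    return 0;
}

static int ring_map_destroy(struct ring_map *r)
{
    /* Unmap from the start of the mapping. Passing r->descs here
     * would hand munmap() an address inside the region, which is
     * the mistake the message describes. */
    int ret = munmap(r->base, r->len);

    if (ret == 0) {
        r->base = NULL;
        r->descs = NULL;
        r->len = 0;
    }
    return ret;
}
```

Keeping the base address and length together with the interior pointer makes it harder to pass the wrong one at teardown.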

[PATCH bpf 0/2] libbpf: fixes for AF_XDP teardown

2019-04-30 Thread Björn Töpel
William found two bugs when doing socket teardown within the same process. The first issue was an invalid munmap call, and the second one was an invalid XSKMAP cleanup. Both resulted in the process keeping references to the socket, which was not correctly cleaned up. When a new socket was creat

Re: [PATCH 4.9 stable 0/5] net: ip6 defrag: backport fixes

2019-04-30 Thread Greg Kroah-Hartman
On Fri, Apr 26, 2019 at 08:41:03AM -0700, Peter Oskolkov wrote: > This is a backport of a 5.1rc patchset: > https://patchwork.ozlabs.org/cover/1029418/ > > Which was backported into 4.19: > https://patchwork.ozlabs.org/cover/1081619/ > > and into 4.14: > https://patchwork.ozlabs.org/cover/1

Re: [PATCH net-next RFC] Dump SW SQ context as part of tx reporter

2019-04-30 Thread Aya Levin
On 4/30/2019 3:54 AM, Jakub Kicinski wrote: > On Mon, 29 Apr 2019 17:17:39 +0300, Aya Levin wrote: >> In order to offline translate the raw memory into a human readable >> format, the user can use some out-of-kernel scripts which receives as an >> input the following: >> - Object raw memory >> -

Re: [PATCH net-next RFC] Dump SW SQ context as part of tx reporter

2019-04-30 Thread Aya Levin
On 4/29/2019 9:32 PM, Saeed Mahameed wrote: > On Mon, 2019-04-29 at 17:17 +0300, Aya Levin wrote: >> TX reporter reports an error on two scenarios: >> - TX timeout on a specific tx queue >> - TX completion error on a specific send queue >> Prior to this patch, no dump data was supported by the tx

[PATCH] net: dsa: bcm_sf2: fix buffer overflow doing set_rxnfc

2019-04-30 Thread Dan Carpenter
The "fs->location" is a u32 that comes from the user in ethtool_set_rxnfc(). We can't pass unclamped values to test_bit() or it results in an out of bounds access beyond the end of the bitmap. Fixes: 7318166cacad ("net: dsa: bcm_sf2: Add support for ethtool::rxnfc") Signed-off-by: Dan Carpenter -
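
The bounds check this message calls for can be shown with a small user-space sketch: a user-supplied 32-bit index is validated against the bitmap size before any bit is read. `NUM_RULES`, `struct rule_table`, `bit_is_set()` and `rule_in_use()` are hypothetical names. The driver's actual limit and the kernel's `test_bit()` are not reproduced here.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_RULES 128 /* hypothetical number of rule slots */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((NUM_RULES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical table tracking which rule slots are in use. */
struct rule_table {
    unsigned long used[BITMAP_WORDS];
};

/* Plain bit test; the caller must guarantee nr < NUM_RULES. */
static bool bit_is_set(const unsigned long *map, uint32_t nr)
{
    return (map[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL;
}

/* Returns 1 if the slot is in use, 0 if it is free, and -1 if the
 * user-supplied location is outside the bitmap. */
static int rule_in_use(const struct rule_table *t, uint32_t location)
{
    if (location >= NUM_RULES)
        return -1; /* reject before touching the bitmap */

    return bit_is_set(t->used, location) ? 1 : 0;
}
```

Without the range check, a large `location` would index past the end of `used[]`, which is the out-of-bounds access the message describes.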

RE: [net-next 01/12] i40e: replace switch-statement to speed-up retpoline-enabled builds

2019-04-30 Thread David Laight
From: Josh Elsasser > Sent: 29 April 2019 21:02 > On Apr 29, 2019, at 12:16 PM, Jeff Kirsher > wrote: > > > From: Björn Töpel > > > > GCC will generate jump tables for switch-statements with more than 5 > > case statements. An entry into the jump table is an indirect call, > > which means that
