Currently, all netlink protocol handlers for updating rules, actions and
qdiscs are protected with single global rtnl lock which removes any
possibility for parallelism. This patch set is a third step to remove
rtnl lock dependency from TC rules update path.
Recently, new rtnl registration flag RT
Add reference counter to tcf proto. Use it to manage tcf proto life cycle
in cls API.
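
The life-cycle idea in the message above can be sketched in user space. This is a minimal illustration using C11 atomics, not the kernel code: struct proto, proto_create(), proto_get() and proto_put() are hypothetical names chosen for this sketch, and the real tcf proto structure and helpers may look different.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a classifier instance (not struct tcf_proto). */
struct proto {
	atomic_uint refcnt;	/* number of users currently holding the object */
	unsigned int prio;
};

/* Allocate an object that starts with one reference, owned by the caller. */
static struct proto *proto_create(unsigned int prio)
{
	struct proto *tp = malloc(sizeof(*tp));

	if (!tp)
		return NULL;
	atomic_init(&tp->refcnt, 1);
	tp->prio = prio;
	return tp;
}

/* Take an additional reference; the caller must already hold one. */
static void proto_get(struct proto *tp)
{
	unsigned int old;

	old = atomic_fetch_add_explicit(&tp->refcnt, 1, memory_order_relaxed);
	assert(old > 0);
	(void)old;
}

/* Drop a reference; returns true if this call freed the object. */
static bool proto_put(struct proto *tp)
{
	unsigned int old;

	old = atomic_fetch_sub_explicit(&tp->refcnt, 1, memory_order_acq_rel);
	assert(old > 0);
	if (old == 1) {
		free(tp);
		return true;
	}
	return false;
}
```

The object is freed by whichever user drops the last reference, so no single global lock has to decide when destruction is safe.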
Implement helper get/put functions for tcf proto and use them to modify cls
API to always take reference to tcf proto while using it. This change
allows to concurrently modify proto, instead of relying on rtnl lo
Implement unique insertion function to atomically attach tcf proto to chain
after verifying that no other tcf proto with specified priority exists.
Implement delete function that verifies that tp is actually empty before
deleting it. Use these functions to refactor cls API to account for
concurrent
Add optional tp->ops->put() API to be implemented for filter reference
counting. This new function is called by cls API to release filter
reference for filters returned by tp->ops->change() or tp->ops->get()
functions. Implement tfilter_put() helper to call tp->ops->put() only for
classifiers that
All users of chain->filters_chain rely on rtnl lock and assume that no new
classifier instances are added when traversing the list. Use
tcf_get_next_proto() to traverse filters list without relying on rtnl
mutex. This function iterates over classifiers by taking reference to
current iterator classi
Extend tcf_chain with 'flushing' flag. Use the flag to prevent insertion of
new classifier instances when chain flushing is in progress in order to
prevent resource leak when tcf_proto is created by unlocked users
concurrently.
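
A rough user-space sketch of the 'flushing' idea described above, combined with a unique-priority insert. All types and names here (struct chain, struct proto, chain_insert_unique()) are hypothetical simplifications, the spinlock is only a stand-in for whatever lock protects the chain, and returning -EEXIST for a duplicate priority is this sketch's own choice rather than something taken from the patch.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplified types; the real tcf_chain/tcf_proto differ. */
struct proto {
	unsigned int prio;
	struct proto *next;
};

struct chain {
	atomic_flag lock;	/* tiny spinlock standing in for the chain lock */
	bool flushing;		/* set while the chain is being flushed */
	struct proto *filters;	/* singly linked list, ascending prio */
};

static void chain_init(struct chain *c)
{
	atomic_flag_clear(&c->lock);
	c->flushing = false;
	c->filters = NULL;
}

static void chain_lock(struct chain *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		; /* spin until the lock is free */
}

static void chain_unlock(struct chain *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/*
 * Insert tp unless the chain is being flushed or the priority is taken.
 * Returns 0 on success, -EAGAIN while flushing, -EEXIST on a duplicate.
 */
static int chain_insert_unique(struct chain *c, struct proto *tp)
{
	struct proto **pp;
	int err = 0;

	chain_lock(c);
	if (c->flushing) {
		err = -EAGAIN;
		goto out;
	}
	/* Find the first entry whose priority is not lower than tp's. */
	for (pp = &c->filters; *pp && (*pp)->prio < tp->prio; pp = &(*pp)->next)
		;
	if (*pp && (*pp)->prio == tp->prio) {
		err = -EEXIST;
		goto out;
	}
	tp->next = *pp;
	*pp = tp;
out:
	chain_unlock(c);
	return err;
}
```

Because the flag check and the list update happen under the same lock, an insert cannot slip in between the start of a flush and its completion.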
Return EAGAIN error from tcf_chain_tp_insert_unique() to restart
tc_ne
Register netlink protocol handlers for message types RTM_NEWTFILTER,
RTM_DELTFILTER, RTM_GETTFILTER as unlocked. Set rtnl_held variable that
tracks rtnl mutex state to be false by default.
Modify tcf_block_release() to release rtnl lock if it was taken before.
Move code that releases block and qdi
As a preparation for registering rules update netlink handlers as unlocked,
conditionally take rtnl in following cases:
- Parent qdisc doesn't support unlocked execution.
- Requested classifier type doesn't support unlocked execution.
- User requested to flush whole chain using old filter update AP
All users of block->chain_list rely on rtnl lock and assume that no new
chains are added when traversing the list. Use tcf_get_next_chain() to
traverse chain list without relying on rtnl mutex. This function iterates
over chains by taking reference to current iterator chain only and doesn't
assume
Extend Qdisc_class_ops with flags. Create enum to hold possible class ops
flag values. Add first class ops flags value QDISC_CLASS_OPS_DOIT_UNLOCKED
to indicate that class ops functions can be called without taking rtnl
lock.
Signed-off-by: Vlad Buslov
Acked-by: Jiri Pirko
---
include/net/sch_g
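
The flag described in the message above could be consulted roughly as follows. This is a sketch under assumptions: only the name QDISC_CLASS_OPS_DOIT_UNLOCKED comes from the patch text, while the enum name, the reduced struct class_ops and the helper class_ops_need_rtnl() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Flag values for class ops; the enum name is an assumption of this sketch. */
enum qdisc_class_ops_flags {
	QDISC_CLASS_OPS_DOIT_UNLOCKED = 1 << 0,
};

/* Hypothetical, heavily reduced stand-in for struct Qdisc_class_ops. */
struct class_ops {
	unsigned int flags;
};

/* Decide whether the caller must take the global lock before using ops. */
static bool class_ops_need_rtnl(const struct class_ops *ops)
{
	/* No ops, or ops without the flag: fall back to the locked path. */
	return !ops || !(ops->flags & QDISC_CLASS_OPS_DOIT_UNLOCKED);
}
```

Defaulting to the locked path keeps every qdisc that has not been audited for unlocked execution on its existing behaviour.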
Currently, tcf_block doesn't use any synchronization mechanisms to protect
code that manages lifetime of its chains. block->chain_list and multiple
variables in tcf_chain that control its lifetime assume external
synchronization provided by global rtnl lock. Converting chain reference
counting to a
Add 'rtnl_held' flag to tcf proto change, delete, destroy, dump, walk
functions to track rtnl lock status. This allows classifiers to release
rtnl lock when necessary and to pass rtnl lock status to extensions and
driver offload callbacks.
Add flags field to tcf proto ops. Add flag value to indica
In order to remove dependency on rtnl lock, use block->lock to protect
chain0 struct from concurrent modification. Rearrange code in chain0
callback add and del functions to only access chain0 when block->lock is
held.
Signed-off-by: Vlad Buslov
Acked-by: Jiri Pirko
---
net/sched/cls_api.c | 17
When cls API is called without protection of rtnl lock, parallel
modification of chain is possible, which means that chain template can be
changed concurrently in certain circumstances. For example, when chain is
'deleted' by new user-space chain API, the chain might continue to be used
if it is re
Actions API is already updated to not rely on rtnl lock for
synchronization. However, it needs to be provided with rtnl status when
called from classifiers API in order to be able to correctly release the
lock when loading kernel module.
Extend extension validation function with 'rtnl_held' flag wh
As a part of the effort to remove dependency on rtnl lock, cls API is being
converted to use fine-grained locking mechanisms instead of global rtnl
lock. However, chain_head_change callback for ingress Qdisc is a sleeping
function and cannot be executed while holding a spinlock.
Extend cls API wit
Always lock chain when accessing filter_chain list, instead of relying on
rtnl lock. Dereference filter_chain with tcf_chain_dereference() lockdep
macro to verify that all users of chain_list have the lock taken.
Rearrange tp insert/remove code in tc_new_tfilter/tc_del_tfilter to execute
all neces
In order to remove dependency on rtnl lock, modify chain API to use
block->lock to protect chain from concurrent modification. Rearrange
tc_ctl_chain() code to call tcf_chain_hold() while holding block->lock.
Signed-off-by: Vlad Buslov
Acked-by: Jiri Pirko
---
net/sched/cls_api.c | 36 +
Hello,
We are seeing a problem with AF_PACKET when used along with the
veth interfaces. SCP complains that message authentication code is
incorrect.
I was browsing the code and I see that veth_xmit calls dev_forward_skb
which does a skb_scrub_packet, which in turn calls the skb destructor fun
Sat, Nov 10, 2018 at 06:21:26AM CET, jakub.kicin...@netronome.com wrote:
>From: John Hurley
>
>Currently drivers can register to receive TC block bind/unbind callbacks
>by implementing the setup_tc ndo in any of their given netdevs. However,
>drivers may also be interested in binds to higher level
> > return pp;
> > }
>
> What if 'pp' is NULL?
>
> Aside from that, this replaces a lookup with 2 atomic ops, and only when
> such lookup is amortized on multiple aggregated packets: I'm unsure if
> it's worthy and I don't understand how that improves RR tests (where
> the socket
On Sat, Nov 10, 2018 at 1:29 AM Eric Dumazet wrote:
>
>
>
> On 11/08/2018 10:21 PM, Li RongQing wrote:
> > GRO for UDP needs to look up the socket twice, first in gro receive and
> > then in gro complete; storing the sock in the skb avoids the second
> > lookup and can give a small performance boost
>
On Sun, 11 Nov 2018 09:55:35 -0800 (PST), David Miller wrote:
> From: Jakub Kicinski
> Date: Fri, 9 Nov 2018 21:21:25 -0800
>
> > John says:
> >
> > This patchset introduces an alternative to egdev offload by allowing a
> > driver to register for block updates when an external device (e.g. tunn
On 11/11/2018 04:40 PM, 배석진 wrote:
>> Also we might note that flow dissector itself is buggy as
>> found by Soukjin Bae ( https://patchwork.ozlabs.org/patch/994601/ )
>>
>> I will send a v2 of his patch with a different changelog.
>>
>> Defrag is fixed [1] but the bug in flow dissector is ad
On 2018/11/9 7:23, Andrew Lunn wrote:
> I'm just trying to ensure whatever is defined is flexible enough that
> we really can later support everything which DT does. We have PHYs on
> MDIO busses, inside switches, which are on MDIO busses, which are
> inside Ethernet interfaces, etc.
>
> An MDIO bu
Driver assigns DMAE channel 0 for FW as part of START_RAMROD command. FW
uses this channel for DMAE operations (e.g., TIME_SYNC implementation).
Driver also uses the same channel 0 for DMAE operations for some of the PFs
(e.g., PF0 on Port0). This could lead to concurrent access to the DMAE
channel
This patch implements SAIL[1] based routing table lookup for XDP. I
however made some changes from the original proposal (details are
described in the patch). These changes decreased the memory consumption
from 21.94 MB to 4.97 MB for my example routing table with 400K
routes.
This patch can perfor
> Also we might note that flow dissector itself is buggy as
> found by Soukjin Bae ( https://patchwork.ozlabs.org/patch/994601/ )
>
> I will send a v2 of his patch with a different changelog.
>
> Defrag is fixed [1] but the bug in flow dissector is adding
> extra work and hash inconsistencies.
>
From: Heiner Kallweit
Date: Sun, 11 Nov 2018 20:31:21 +0100
> The PCI vendor id of U.S. Robotics isn't defined in pci_ids.h so far,
> only ISDN driver w6692 has a private definition. Move the definition
> to pci_ids.h and use it in the r8169 driver too.
>
> Signed-off-by: Heiner Kallweit
> ---
From: Eric Dumazet
Date: Sun, 11 Nov 2018 09:11:31 -0800
> Similar to 80ba92fa1a92 ("codel: add ce_threshold attribute")
>
> After EDT adoption, it became easier to implement DCTCP-like CE marking.
>
> In many cases, queues are not building in the network fabric but on
> the hosts themselves.
>
From: Eric Dumazet
Date: Sun, 11 Nov 2018 07:34:28 -0800
> FQ pacing guarantees that paced packets queued by one flow do not
> add head-of-line blocking for other flows.
>
> After TCP GSO conversion, increasing limit_output_bytes to 1 MB is safe,
> since this maps to 16 skbs at most in qdisc or
From: Eric Dumazet
Date: Sun, 11 Nov 2018 06:41:28 -0800
> This series makes tcp_tso_should_defer() a bit smarter :
>
> 1) MSG_EOR gives a hint to TCP to not defer some skbs
>
> 2) Second patch takes into account that head tstamp
> can be in the future.
>
> 3) Third patch uses existing high
These checks are relevant during development / testing only,
therefore switch to lockdep_assert_held and friends.
Signed-off-by: Heiner Kallweit
---
drivers/net/phy/mdio_bus.c | 4 ++--
drivers/net/phy/phy.c | 2 +-
drivers/net/phy/phy_device.c | 2 +-
3 files changed, 4 insertions(+),
Move IRQ configuration for IP101A/G from config_init to config_intr
callback. Reasons:
1. This allows phylib to disable interrupts if needed.
2. Icplus was the only driver supporting interrupts w/o defining a
config_intr callback. Now we can add a phylib plausibility check
disabling interrup
The PCI vendor id of U.S. Robotics isn't defined in pci_ids.h so far,
only ISDN driver w6692 has a private definition. Move the definition
to pci_ids.h and use it in the r8169 driver too.
Signed-off-by: Heiner Kallweit
---
v2:
- The original patch caused a build failure in w6692 driver because
On Sun, Nov 11, 2018 at 9:41 AM Eric Dumazet wrote:
>
> tcp_tso_should_defer() first heuristic is to not defer
> if last send is "old enough".
>
> Its current implementation uses jiffies and its low granularity.
>
> TSO autodefer performance should not rely on kernel HZ :/
>
> After EDT conversion
On Sun, Nov 11, 2018 at 9:41 AM Eric Dumazet wrote:
>
> tcp_tso_should_defer() last step tries to check if the probable
> next ACK packet is coming in less than half rtt.
>
> Problem is that the head->tstamp might be in the future,
> so we need to use signed arithmetic to avoid overflows.
>
> Sig
On Sun, Nov 11, 2018 at 9:41 AM Eric Dumazet wrote:
>
> Applications using MSG_EOR are giving a strong hint to TCP stack :
>
> Subsequent sendmsg() can not append more bytes to skbs having
> the EOR mark.
>
> Do not try to TSO defer such skbs, there is really no hope.
>
> Signed-off-by: Eric Duma
On Sun, Nov 11, 2018 at 12:11 PM Eric Dumazet wrote:
>
> Similar to 80ba92fa1a92 ("codel: add ce_threshold attribute")
>
> After EDT adoption, it became easier to implement DCTCP-like CE marking.
>
> In many cases, queues are not building in the network fabric but on
> the hosts themselves.
>
> If
From: Heiner Kallweit
Date: Sun, 11 Nov 2018 11:50:08 +0100
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 69f0abe1b..1fac231fe 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -2359,6 +2359,10 @@
>
> #define PCI_VENDOR_ID_SYNOPSYS
From: Denis Bolotin
Date: Sun, 11 Nov 2018 17:05:00 +0200
> diff --git a/drivers/net/ethernet/qlogic/qed/qed_int.c
> b/drivers/net/ethernet/qlogic/qed/qed_int.c
> index 0f0aba7..aa7504a 100644
> --- a/drivers/net/ethernet/qlogic/qed/qed_int.c
> +++ b/drivers/net/ethernet/qlogic/qed/qed_int.c
> @
From: Eric Dumazet
Date: Sat, 10 Nov 2018 16:22:29 -0800
> If sch_fq is used at ingress, skbs that might have been
> timestamped by net_timestamp_set() if a packet capture
> is requesting timestamps could be delayed by arbitrary
> amount of time, since sch_fq time base is MONOTONIC.
>
> Fix this
From: Andrew Lunn
Date: Sun, 11 Nov 2018 00:50:39 +0100
> We already have a workaround for a couple of switches whose internal
> PHYs only have the Marvell OUI, but no model number. We detect such
> PHYs and give them the 6390 ID as the model number. However the
> mv88e6161 has two SERDES interfac
From: Andrew Lunn
Date: Sun, 11 Nov 2018 00:41:10 +0100
> The mv88e6161 would sometime fail to probe with a timeout waiting for
> the switch to complete an operation. This operation is supposed to
> clear the statistics counters. However, due to a read/modify/write,
> without the needed mask, the
From: Andrew Lunn
Date: Sun, 11 Nov 2018 00:32:13 +0100
> Currently the SERDES interfaces for ports 9 and 10 on the mv88e6390x
> are supported, allowing up to 10G. However, when unused, these SERDES
> interfaces can be used by some of the lower ports for 1000Base-X.
>
> The tricky bit here is ord
From: Andrew Lunn
Date: Sat, 10 Nov 2018 23:43:32 +0100
> This is the last part in converting phylib to make use of a linux
> bitmap, not a u32, to represent links modes. This will allow support
> for PHYs > 1Gbps, which need to use link modes represented by a bit >
> 32.
>
> A number of MAC and
From: Heiner Kallweit
Date: Sat, 10 Nov 2018 23:40:50 +0100
> Both states aren't used. Most likely they result from an idea that
> never materialized. So remove them.
>
> Signed-off-by: Heiner Kallweit
Applied.
From: yupeng
Date: Sat, 10 Nov 2018 13:38:12 -0800
> The snmp_counter.rst explains the meanings of snmp counters. It also
> provides a set of experiments (only 1 for this initial patch),
> combines the experiments' results and the snmp counters'
> meanings. This is an initial patch, only explains
From: Jakub Kicinski
Date: Fri, 9 Nov 2018 21:21:25 -0800
> John says:
>
> This patchset introduces an alternative to egdev offload by allowing a
> driver to register for block updates when an external device (e.g. tunnel
> netdev) is bound to a TC block. Drivers can track new netdevs or regist
From: Heiner Kallweit
Date: Sat, 10 Nov 2018 00:37:46 +0100
> Add macros for PHYID matching to be used in PHY driver configs.
> By using these macros some boilerplate code can be avoided.
>
> Use them initially in the Realtek PHY drivers.
Series applied.
From: Heiner Kallweit
Date: Fri, 9 Nov 2018 18:54:49 +0100
> After the recent changes to the state machine phylib can be further
> simplified (w/o having to make any assumptions).
Nothing exemplifies understanding of a piece of code like a patch
series like this, nice work.
Series applied.
Similar to 80ba92fa1a92 ("codel: add ce_threshold attribute")
After EDT adoption, it became easier to implement DCTCP-like CE marking.
In many cases, queues are not building in the network fabric but on
the hosts themselves.
If packets leaving fq missed their Earliest Departure Time by XXX usec,
FQ pacing guarantees that paced packets queued by one flow do not
add head-of-line blocking for other flows.
After TCP GSO conversion, increasing limit_output_bytes to 1 MB is safe,
since this maps to 16 skbs at most in qdisc or device queues.
(or slightly more if some drivers lower {gso_max_segs|
The value of "sb_index" is written by the hardware. Reading its value and
writing it to "index" must finish before checking the loop condition.
Signed-off-by: Denis Bolotin
Signed-off-by: Michal Kalderon
---
drivers/net/ethernet/qlogic/qed/qed_int.c | 2 ++
1 file changed, 2 insertions(+)
diff
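
The ordering requirement in the message above (finish reading "sb_index" into "index" before the loop condition is checked) can be illustrated with a user-space analogue. This sketch uses C11 atomics and fences instead of kernel barriers; struct status_block, struct snapshot, the attn_bits payload field and read_status() are hypothetical and are not taken from the qed driver.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical status block updated by another agent (hardware in the patch). */
struct status_block {
	_Atomic uint16_t sb_index;
	_Atomic uint32_t attn_bits;
};

struct snapshot {
	uint16_t index;
	uint32_t attn_bits;
};

/* Read index and payload, retrying until the index did not move meanwhile. */
static struct snapshot read_status(struct status_block *sb)
{
	struct snapshot s;

	do {
		/* Acquire load: the payload read below cannot move above it. */
		s.index = atomic_load_explicit(&sb->sb_index,
					       memory_order_acquire);
		s.attn_bits = atomic_load_explicit(&sb->attn_bits,
						   memory_order_relaxed);
		/* Keep the re-check in the loop condition after both reads. */
		atomic_thread_fence(memory_order_acquire);
	} while (s.index != atomic_load_explicit(&sb->sb_index,
						 memory_order_relaxed));
	return s;
}
```

Without the ordering, the re-read in the loop condition could be satisfied from a value observed before the local copy was taken, and the loop would exit with a stale index.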
Release PTT before entering error flow.
Signed-off-by: Denis Bolotin
Signed-off-by: Michal Kalderon
---
drivers/net/ethernet/qlogic/qed/qed_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c
b/drivers/net/ethernet/qlogic/qed/qe
The TC received from APP TLV is stored in offload_tc, and should not be
set by protocols which did not receive an APP TLV. Fixed the condition
when overriding the offload_tc.
Signed-off-by: Denis Bolotin
Signed-off-by: Michal Kalderon
---
drivers/net/ethernet/qlogic/qed/qed_dcbx.c | 14 +++-
From: Michal Kalderon
Certain flows need to access the rdma-info structure, for example dcbx
update flows. In some cases there can be a race between the allocation or
deallocation of the structure which was done in roce start / roce stop and
an asynchronous dcbx event that tries to access the st
Hi Dave,
This patch series fixes several unrelated bugs across the driver.
Please consider applying to net.
Thanks,
Denis
Denis Bolotin (3):
qed: Fix PTT leak in qed_drain()
qed: Fix overriding offload_tc by protocols without APP TLV
qed: Fix reading wrong value in loop condition
Michal K
Applications using MSG_EOR are giving a strong hint to TCP stack :
Subsequent sendmsg() can not append more bytes to skbs having
the EOR mark.
Do not try to TSO defer such skbs, there is really no hope.
Signed-off-by: Eric Dumazet
Acked-by: Soheil Hassas Yeganeh
---
net/ipv4/tcp_output.c | 4
tcp_tso_should_defer() first heuristic is to not defer
if last send is "old enough".
Its current implementation uses jiffies and its low granularity.
TSO autodefer performance should not rely on kernel HZ :/
After EDT conversion, we have state variables in nanoseconds that
can allow us to proper
tcp_tso_should_defer() last step tries to check if the probable
next ACK packet is coming in less than half rtt.
Problem is that the head->tstamp might be in the future,
so we need to use signed arithmetic to avoid overflows.
Signed-off-by: Eric Dumazet
Acked-by: Soheil Hassas Yeganeh
---
net
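
The overflow concern in the message above can be shown with a small stand-alone example. This is a generic sketch, not the tcp_tso_should_defer() code: the function names and the 'limit' parameter are hypothetical, and the cast of a wrapped unsigned value to int64_t relies on the two's complement behaviour that common compilers provide.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Unsigned comparison: if 'stamp' is slightly in the future, now - stamp
 * wraps around to a huge value and the stamp looks very old.
 */
static bool older_than_unsigned(uint64_t now, uint64_t stamp, uint64_t limit)
{
	return now - stamp > limit;
}

/*
 * Signed comparison: the wrapped difference is reinterpreted as a negative
 * number, so a stamp in the future is never reported as older than 'limit'.
 */
static bool older_than_signed(uint64_t now, uint64_t stamp, uint64_t limit)
{
	return (int64_t)(now - stamp - limit) > 0;
}
```

With now = 1000, stamp = 1005 and limit = 100, the unsigned variant reports the stamp as older than the limit while the signed variant does not.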
This series makes tcp_tso_should_defer() a bit smarter :
1) MSG_EOR gives a hint to TCP to not defer some skbs
2) Second patch takes into account that head tstamp
can be in the future.
3) Third patch uses existing high resolution state variables
to have a more precise heuristic.
Eric Duma
Hi Stefano,
On Sun, Nov 11, 2018 at 12:50:39PM +0100, Stefano Brivio wrote:
> On Sat, 10 Nov 2018 22:48:44 +0100
> Phil Sutter wrote:
>
> > On Sat, Nov 10, 2018 at 10:21:59AM +0100, Stefano Brivio wrote:
> >
> > > @@ -12,37 +12,37 @@ export TCPDIAG_FILE="$(dirname $0)/ss1.dump"
> > > ts_log "[T
Hi Phil,
On Sat, 10 Nov 2018 22:48:44 +0100
Phil Sutter wrote:
> On Sat, Nov 10, 2018 at 10:21:59AM +0100, Stefano Brivio wrote:
>
> > @@ -12,37 +12,37 @@ export TCPDIAG_FILE="$(dirname $0)/ss1.dump"
> > ts_log "[Testing ssfilter]"
> >
> > ts_ss "$0" "Match dport = 22" -Htna dport = 22
> > -
The PCI vendor id of U.S. Robotics isn't defined in pci_ids.h so far,
only ISDN driver w6692 has a private definition. Move the definition
to pci_ids.h and use it also in the r8169 driver.
Signed-off-by: Heiner Kallweit
---
v2:
- The original patch caused a build failure in w6692 driver because
On 11.11.2018 at 09:03, Jesper Dangaard Brouer wrote:
On Sat, 10 Nov 2018 23:19:50 +0100
Paweł Staszewski wrote:
On 10.11.2018 at 23:06, Jesper Dangaard Brouer wrote:
On Sat, 10 Nov 2018 20:56:02 +0100
Paweł Staszewski wrote:
On 10.11.2018 at 20:49, Paweł Staszewski wrote:
W
Hi Heiner,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on net-next/master]
url:
https://github.com/0day-ci/linux/commits/Heiner-Kallweit/r8169-add-USR-PCI-vendor-id/2018-13
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC
On Sat, 10 Nov 2018 22:53:53 +0100 Paweł Staszewski
wrote:
> Now im messing with ring configuration for connectx5 nics.
> And after reading that paper:
> https://netdevconf.org/2.1/slides/apr6/network-performance/04-amir-RX_and_TX_bulking_v2.pdf
>
Do notice that some of the ideas in that slid
Hello!
On 11.11.2018 3:55, David Miller wrote:
Eliminate the assumption that SKBs and SKB list heads can
be cast to eachother in SKB list handling code.
Each other? My spellchecker trips here.
This change also appears to fix a bug since the list->next pointer is
sampled outside of holding
Hello!
On 11.11.2018 2:50, Andrew Lunn wrote:
We already have a workaround for a couple of switches whose internal
PHYs only have the Marvell OUI, but no model number. We detect such
PHYs and give them the 6390 ID as the model number. However the
mv88e6161 has two SERDES interfaces in the same a
On Sat, 10 Nov 2018 23:19:50 +0100
Paweł Staszewski wrote:
> On 10.11.2018 at 23:06, Jesper Dangaard Brouer wrote:
> > On Sat, 10 Nov 2018 20:56:02 +0100
> > Paweł Staszewski wrote:
> >
> >> On 10.11.2018 at 20:49, Paweł Staszewski wrote:
> >>>
> >>> On 10.11.2018 at 20:34, Jesper D