From: Ilpo Järvinen
1) Don't early return when sack doesn't fit. AccECN code will be
placed after this fragment so no early returns please.
2) Make sure opts->num_sack_blocks is not left undefined. E.g.,
tcp_current_mss() does not memset its opts struct to zero.
AccECN code checks if SA
On Mon, 14 Apr 2025 12:11:36 +0200 Steffen Klassert wrote:
> I'm still a bit skeptical about the bonding offloads itself as
> mentioned here:
>
> https://lore.kernel.org/all/zsbkdzvjvf3gi...@gauss3.secunet.de/
So am I, FWIW.
> but I'm OK with this particular patchset.
>
> How should we merge thi
From: Ilpo Järvinen
Armed with ceb delta from option, delivered bytes, and
delivered packets it is possible to estimate how many times
ACE field wrapped.
This calculation is necessary only if more than one wrap
is possible. Without SACK, delivered bytes and packets are
not always trustworthy in
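A minimal sketch of the wrap ambiguity described in this message, assuming a 3-bit ACE counter that wraps every 8 CE marks. The names ACE_MASK, ace_delta() and ace_safe_delta() are hypothetical and this is not the code from the patch. It ignores the byte counters carried in the option and only shows the packet-count bound: the largest CE count that fits within the newly delivered packets and still matches the observed ACE delta.

```c
#include <stdint.h>

/* Assumption: the ACE field is a 3-bit counter, so it wraps modulo 8. */
#define ACE_MASK 0x7u

/* CE delta visible in the ACE field alone (wraps are not visible here). */
static uint32_t ace_delta(uint32_t prev_ace, uint32_t new_ace)
{
	return (new_ace - prev_ace) & ACE_MASK;
}

/*
 * Conservative estimate: the largest value <= delivered_pkts that is
 * congruent to delta modulo 8. With at most 7 newly delivered packets
 * no wrap is possible, so the plain delta is returned.
 */
static uint32_t ace_safe_delta(uint32_t delivered_pkts, uint32_t delta)
{
	if (delivered_pkts <= ACE_MASK)
		return delta;
	return delivered_pkts - ((delivered_pkts - delta) & ACE_MASK);
}
```

With 20 newly delivered packets and an observed delta of 3, the candidates are 3, 11 and 19 CE marks; the sketch picks 19 as the cautious choice.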
From: Chia-Yu Chang
Hello,
Please find the v3:
v3 (14-Apr-2025)
- Fix patch apply issue in v2 (Jakub Kicinski)
v2 (18-Mar-2025)
- Add one missing patch from previous AccECN protocol preparation patch series
to this patch series
The full patch series can be found in
https://github.com/L4STeam/
From: Ilpo Järvinen
As SACK blocks tend to eat all option space when there are
many holes, it is useful to compromise on sending many SACK
blocks in every ACK and try to fit AccECN option there
by reducing the number of SACK blocks. But never go below
two SACK blocks because of AccECN option.
A
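The trade-off described in this message can be sketched as follows. This is a simplified illustration, not the patch itself: plan_options(), struct opt_plan and the constant names are hypothetical, and the sizes assume an aligned SACK option costs 4 bytes of base plus 8 bytes per block.

```c
#include <stdbool.h>

/* Assumed sizes: aligned SACK base (kind, length, padding) and per block. */
#define SACK_BASE_ALIGNED      4
#define SACK_PERBLOCK          8
/* Never drop below this many SACK blocks just to make room for AccECN. */
#define MIN_SACK_FOR_ACCECN    2

struct opt_plan {
	int sack_blocks;   /* SACK blocks that will be sent */
	bool accecn_fits;  /* whether the AccECN option got its space */
};

static int sack_len(int blocks)
{
	return blocks ? SACK_BASE_ALIGNED + blocks * SACK_PERBLOCK : 0;
}

static struct opt_plan plan_options(int space, int sack_blocks, int accecn_len)
{
	struct opt_plan plan = { sack_blocks, false };

	/* First clamp SACK to what fits at all, without returning early. */
	while (plan.sack_blocks > 0 && sack_len(plan.sack_blocks) > space)
		plan.sack_blocks--;

	/* Then give up SACK blocks for AccECN, but stop at the floor. */
	for (;;) {
		if (space - sack_len(plan.sack_blocks) >= accecn_len) {
			plan.accecn_fits = true;
			break;
		}
		if (plan.sack_blocks <= MIN_SACK_FOR_ACCECN)
			break;
		plan.sack_blocks--;
	}
	return plan;
}
```

For example, with 28 bytes of option space, three pending SACK blocks and an 8-byte AccECN option, the sketch sends two SACK blocks plus the option; with a 12-byte option it keeps both SACK blocks and skips the option.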
From: Chia-Yu Chang
AccECN option may fail in various ways, handle these:
- Remove option from SYN/ACK rexmits to handle blackholes
- If no option arrives in SYN/ACK, assume the option is not usable
- If an option arrives later, re-enable it
- If option is zeroed, disable AccECN option processin
From: Ilpo Järvinen
Instead of sending the option in every ACK, limit sending to
those ACKs where the option is necessary:
- Handshake
- "Change-triggered ACK" + the ACK following it. The
2nd ACK is necessary to unambiguously indicate which
of the ECN byte counters is increasing. The first
KVM exposes the sanitised ID registers to guests. Currently these ignore
the ID_AA64PFR1_EL1.MTE_frac field, meaning guests always see a value of
zero.
This is a problem for platforms without the MTE_ASYNC feature where
ID_AA64PFR1_EL1.MTE==0x2 and ID_AA64PFR1_EL1.MTE_frac==0xf. KVM forces
MTE_fra
From: Chia-Yu Chang
AccECN option may fail in various ways, handle these:
- Remove option from SYN/ACK rexmits to handle blackholes
- If no option arrives in SYN/ACK, assume the option is not usable
- If an option arrives later, re-enable it
- If option is zeroed, disable AccECN option processin
On Mon, Apr 14, 2025 at 06:02:45PM +, David Binderman wrote:
> Hello there,
>
> Static analyser cppcheck says:
>
> linux-6.15-rc2/tools/testing/selftests/kvm/lib/arm64/processor.c:107:2:
> style: int result is returned as long value. If the return value is long to
> avoid loss of informatio
From: Ilpo Järvinen
1) Don't early return when sack doesn't fit. AccECN code will be
placed after this fragment so no early returns please.
2) Make sure opts->num_sack_blocks is not left undefined. E.g.,
tcp_current_mss() does not memset its opts struct to zero.
AccECN code checks if SA
Hello there,
Static analyser cppcheck says:
linux-6.15-rc2/tools/testing/selftests/kvm/lib/arm64/processor.c:107:2: style:
int result is returned as long value. If the return value is long to avoid loss
of information, then you have loss of information. [truncLongCastReturn]
Source code is
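A generic illustration of the pattern this cppcheck warning targets, not the code from processor.c: entries() is a hypothetical function. When an expression is computed in int and only then widened for the return, any overflow has already happened; computing in the wide type from the start avoids it.

```c
#include <stdint.h>

/*
 * Pattern that triggers the warning (shown as a comment only):
 *
 *     uint64_t entries(unsigned int shift) { return 1 << shift; }
 *
 * The shift is evaluated as int, so large shift values overflow
 * before the result is converted to the 64-bit return type.
 */

/* Widening the operand first keeps the whole computation in 64 bits. */
static uint64_t entries(unsigned int shift)
{
	return UINT64_C(1) << shift;
}
```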
On Thu, Apr 03, 2025 at 05:00:14PM +0200, Sabrina Dubroca wrote:
> 2025-04-03, 13:09:02 +, Hangbin Liu wrote:
> > Hi Sabrina,
> > On Thu, Apr 03, 2025 at 12:28:54PM +0200, Sabrina Dubroca wrote:
> > > Hello Hangbin,
> > >
> > > 2025-04-03, 08:58:55 +, Hangbin Liu wrote:
> > > > When settin
From: Ilpo Järvinen
These counters track IP ECN field payload byte sums for all
arriving (acceptable) packets. The AccECN option (added by
a later patch in the series) echoes these counters back to
sender side.
Signed-off-by: Ilpo Järvinen
Signed-off-by: Neal Cardwell
Signed-off-by: Chia-Yu Ch
From: Ilpo Järvinen
This change implements Accurate ECN without negotiation and
AccECN Option (that will be added by later changes). Based on
AccECN specifications:
https://tools.ietf.org/id/draft-ietf-tcpm-accurate-ecn-28.txt
Accurate ECN allows feeding back the number of CE (congestion
exper
From: Ilpo Järvinen
The following patch will use tcp_ecn_mode_accecn(),
TCP_ACCECN_CEP_INIT_OFFSET, TCP_ACCECN_CEP_ACE_MASK in
__tcp_fast_path_on() to make new flag for AccECN.
No functional changes.
Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
include/net/tcp.h | 54 +++
From: Ilpo Järvinen
As SACK blocks tend to eat all option space when there are
many holes, it is useful to compromise on sending many SACK
blocks in every ACK and try to fit AccECN option there
by reducing the number of SACK blocks. But never go below
two SACK blocks because of AccECN option.
A
From: Ilpo Järvinen
Prepare for AccECN, which needs access here to the IP ECN
field value, available only after INET_ECN_xmit().
No functional changes.
Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
Reviewed-by: Eric Dumazet
---
net/ipv4/tcp_output.c | 5 +++--
1 file c
From: Ilpo Järvinen
The following patch will use tcp_ecn_mode_accecn(),
TCP_ACCECN_CEP_INIT_OFFSET, TCP_ACCECN_CEP_ACE_MASK in
__tcp_fast_path_on() to make new flag for AccECN.
No functional changes.
Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
include/net/tcp.h | 54 +++
From: Ilpo Järvinen
This change implements Accurate ECN without negotiation and
AccECN Option (that will be added by later changes). Based on
AccECN specifications:
https://tools.ietf.org/id/draft-ietf-tcpm-accurate-ecn-28.txt
Accurate ECN allows feeding back the number of CE (congestion
exper
From: Chia-Yu Chang
DUALPI2 AQM is a combination of the DUALQ Coupled-AQM with a PI2
base-AQM. The PI2 AQM is in turn both an extension and a simplification
of the PIE AQM. PI2 makes quite some PIE heuristics unnecessary, while
being able to control scalable congestion controls like TCP-Prague.
W
From: Chia-Yu Chang
Hello,
Please find DUALPI2 iproute2 patch v4.
v5 (25-Mar-25)
- Use matches() to replace current strcmp() (Stephen Hemminger)
- Use general parse_percent() for handling scaled percentage values (Stephen
Hemminger)
- Add print function for JSON of dualpi2 stats (Stephen
From: Chia-Yu Chang
Hello,
Please find the reposted DualPI2 patch v10.
v10
- Remove leftover include in include/linux/netdevice.h and anonymous struct in
sch_dualpi2.c (Paolo Abeni)
- Use kfree_skb_reason() and add SKB_DROP_REASON_DUALPI2_STEP_DROP drop reason
(Paolo Abeni)
- Split DualPI
From: Ilpo Järvinen
Add newly acked pkts EWMA. When ACK thinning occurs, select
between safer and unsafe cep delta in AccECN processing based
on it. If the packets ACKed per ACK tends to be large, don't
conservatively assume ACE field overflow.
Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Y
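A rough sketch of the idea in this message, under stated assumptions: the fixed-point precision, the EWMA weight, the threshold and all names (EWMA_PREC, EWMA_WEIGHT_SHIFT, ACK_THINNING_THRESH, pkts_acked_ewma_update(), pick_cep_delta()) are hypothetical and not taken from the patch.

```c
#include <stdint.h>

/* Assumed tuning: 6 fractional bits, new samples weighted 1/8. */
#define EWMA_PREC            6
#define EWMA_WEIGHT_SHIFT    3
/* Assumed threshold: more than 4 packets per ACK counts as "large". */
#define ACK_THINNING_THRESH  4u

/* Update a fixed-point EWMA of newly acked packets per ACK. */
static uint32_t pkts_acked_ewma_update(uint32_t ewma_fp, uint32_t pkts)
{
	uint32_t sample = pkts << EWMA_PREC;

	if (!ewma_fp)               /* first sample seeds the average */
		return sample;
	return ewma_fp - (ewma_fp >> EWMA_WEIGHT_SHIFT) +
	       (sample >> EWMA_WEIGHT_SHIFT);
}

/*
 * If ACKs usually cover many packets, a large ACK is normal and the
 * plain delta is used; otherwise fall back to the conservative
 * delta that assumes the ACE field may have overflowed.
 */
static uint32_t pick_cep_delta(uint32_t ewma_fp, uint32_t delta,
			       uint32_t safe_delta)
{
	if (ewma_fp > (ACK_THINNING_THRESH << EWMA_PREC))
		return delta;
	return safe_delta;
}
```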
From: Ilpo Järvinen
Add the heuristic algorithm from draft-11 Appendix A.2.2 to
mitigate false ACE field overflows.
Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
include/net/tcp.h| 1 +
net/ipv4/tcp_input.c | 18 --
2 files changed, 17 insertions(+), 2
From: Ilpo Järvinen
Instead of sending the option in every ACK, limit sending to
those ACKs where the option is necessary:
- Handshake
- "Change-triggered ACK" + the ACK following it. The
2nd ACK is necessary to unambiguously indicate which
of the ECN byte counters is increasing. The first
From: Ilpo Järvinen
The Accurate ECN allows echoing back the sum of bytes for
each IP ECN field value in the received packets using
AccECN option. This change implements AccECN option tx & rx
side processing without option send control related features
that are added by a later change.
Based on
From: Ilpo Järvinen
AccECN byte counter estimation requires delivered bytes
which can be calculated while processing SACK blocks and
cumulative ACK. The delivered bytes will be used to estimate
the byte counters between AccECN option (on ACKs w/o the
option).
Non-SACK calculation is quite annoyi
From: Ilpo Järvinen
There is some wasted space in the option usage due to padding
of 32-bit fields. AccECN option can take advantage of those
few bytes as its tail is often consuming just a few odd bytes.
Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
net/ipv4/tcp_output.c | 22
From: Ilpo Järvinen
Prepare for AccECN, which needs access here to the IP ECN
field value, available only after INET_ECN_xmit().
No functional changes.
Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
Reviewed-by: Eric Dumazet
---
net/ipv4/tcp_output.c | 5 +++--
1 file c
From: Ilpo Järvinen
Accurate ECN negotiation parts based on the specification:
https://tools.ietf.org/id/draft-ietf-tcpm-accurate-ecn-28.txt
Accurate ECN is negotiated using ECE, CWR and AE flags in the
TCP header. TCP falls back into using RFC3168 ECN if one of the
ends supports only RFC3168-
From: Chia-Yu Chang
Hello,
Please find the v3:
v3 (14-Apr-2025)
- Fix patch apply issue in v2 (Jakub Kicinski)
v2 (18-Mar-2025)
- Add one missing patch from previous AccECN protocol preparation patch series
to this patch series
The full patch series can be found in
https://github.com/L4STeam/
From: Koen De Schepper
DualPI2 provides L4S-type low latency & loss to traffic that uses a
scalable congestion controller (e.g. TCP-Prague, DCTCP) without
degrading the performance of 'classic' traffic (e.g. Reno,
Cubic etc.). It is to be the reference implementation of IETF RFC9332
DualQ Coupled
From: Chia-Yu Chang
DualPI2 is the reference implementation of IETF RFC9332 DualQ Coupled
AQM (https://datatracker.ietf.org/doc/html/rfc9332) providing two
queues called low latency (L-queue) and classic (C-queue). By default,
it enqueues non-ECN and ECT(0) packets into the C-queue and ECT(1) and
From: Chia-Yu Chang
The configuration and statistics dump of the DualPI2 Qdisc provides
information related to both queues, such as packet numbers and queuing
delays in the L-queue and C-queue, as well as general information such as
probability value, WRR credits, memory usage, packet marking cou
From: Chia-Yu Chang
Update configuration of tc-tests and preload DualPI2 module for self-tests,
and add following self-test cases for DualPI2:
Test a4c7: Create DualPI2 with default setting
Test 2130: Create DualPI2 with typical_rtt and max_rtt
Test 90c1: Create DualPI2 with max_rtt
Test
From: Chia-Yu Chang
Introduce the specification of tc qdisc DualPI2 stats and attributes,
which is the reference implementation of IETF RFC9332 DualQ Coupled AQM
(https://datatracker.ietf.org/doc/html/rfc9332) providing two different
queues: low latency queue (L-queue) and classic queue (C-queue)
When MTE is supported but MTE_ASYMM is not (ID_AA64PFR1_EL1.MTE == 2)
ID_AA64PFR1_EL1.MTE_frac == 0xF indicates MTE_ASYNC is unsupported
and MTE_frac == 0 indicates it is supported.
As MTE_frac was previously unconditionally read as 0 from the guest
and user-space, check that using SET_ONE_REG to
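The field encoding described in this message can be sketched as a small decoder. This is an illustration only: the helper names are hypothetical, and the bit positions (MTE at bits [11:8], MTE_frac at bits [43:40] of ID_AA64PFR1_EL1) are stated here as an assumption to be checked against the Arm architecture reference. The function covers only the MTE == 2 case discussed in the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field positions within ID_AA64PFR1_EL1. */
#define PFR1_MTE_SHIFT       8
#define PFR1_MTE_FRAC_SHIFT  40
#define ID_FIELD_MASK        0xfULL

/* Extract one 4-bit ID register field. */
static unsigned int id_field(uint64_t reg, unsigned int shift)
{
	return (unsigned int)((reg >> shift) & ID_FIELD_MASK);
}

/*
 * For MTE == 2: MTE_frac == 0 means MTE_ASYNC is supported,
 * MTE_frac == 0xf means it is not.
 */
static bool mte2_async_supported(uint64_t pfr1)
{
	return id_field(pfr1, PFR1_MTE_SHIFT) == 2 &&
	       id_field(pfr1, PFR1_MTE_FRAC_SHIFT) == 0;
}
```

A register value with MTE == 2 and MTE_frac masked to zero decodes as "async supported", which is why unconditionally reading the field as 0 misreports hosts where it is really 0xf.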
If MTE_frac is masked out unconditionally then the guest will always
see ID_AA64PFR1_EL1_MTE_frac as 0. However, a value of 0 when
ID_AA64PFR1_EL1_MTE is 2 indicates that MTE_ASYNC is supported. Hence, for
a host with ID_AA64PFR1_EL1_MTE==2 and ID_AA64PFR1_EL1_MTE_frac==0xf
(MTE_ASYNC unsupported)
The ID_AA64PFR1_EL1.MTE_frac field is currently hidden from KVM.
However, when ID_AA64PFR1_EL1.MTE==2, ID_AA64PFR1_EL1.MTE_frac==0
indicates that MTE_ASYNC is supported. On a host with
ID_AA64PFR1_EL1.MTE==2 but without MTE_ASYNC support a guest with the
MTE capability enabled will incorrectly see
On Fri, Apr 11, 2025 at 10:49:52AM +0300, Cosmin Ratiu wrote:
> This patch series was motivated by fixing a few bugs in the bonding
> driver related to xfrm state migration on device failover.
>
> struct xfrm_dev_offload has two net_device pointers: dev and real_dev.
> The first one is the device
41 matches