Re: [Int-area] IP parcels

Templin (US), Fred L Mon, 20 Dec 2021 15:11:33 -0800

Tom, a modern reassembly implementation no longer waits for the MSL for all
fragments to arrive; either they all get there after a very small inter-fragment
delay, or you send an immediate FRAGREP and possibly also a PTB soft error,
then quickly declare the reassembly dead if that doesn’t help. And, you make
sure to inspect the IDs of received fragments before admitting them into the
reassembly cache so you don’t end up caching garbage that will just have to
be discarded later.
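
As a rough illustration of the behaviour described above, here is a minimal
Python sketch of a reassembly cache with a short inter-fragment timeout and an
ID check on admission. It is only a sketch under stated assumptions: the class
and parameter names (ReassemblyCache, gap_timeout, id_window) are hypothetical,
the admission rule (a fixed window anchored at the first ID seen from a source)
is an assumed policy, and the FRAGREP / PTB exchange is not modelled at all.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, int]  # (source address, fragment ID)


@dataclass
class Partial:
    last_seen: float                      # arrival time of the latest fragment
    total_len: Optional[int] = None       # known once the final fragment arrives
    parts: Dict[int, bytes] = field(default_factory=dict)  # offset -> payload


class ReassemblyCache:
    """Hypothetical cache: short gap timeout, IDs checked before admission."""

    def __init__(self, gap_timeout: float = 0.05, id_window: int = 64):
        self.gap_timeout = gap_timeout    # max tolerated inter-fragment delay (s)
        self.id_window = id_window        # how far ahead of the anchor an ID may be
        self.first_id: Dict[str, int] = {}
        self.pending: Dict[Key, Partial] = {}

    def admit(self, src: str, ident: int) -> bool:
        # Assumed policy: the first ID seen from a source anchors a fixed
        # window; a real implementation would slide it as IDs complete.
        base = self.first_id.setdefault(src, ident)
        return (ident - base) % 2**32 < self.id_window

    def add(self, src: str, ident: int, offset: int, payload: bytes,
            more: bool, now: float) -> Tuple[str, Optional[bytes]]:
        if not self.admit(src, ident):
            return "rejected", None       # never enters the cache
        p = self.pending.setdefault((src, ident), Partial(last_seen=now))
        p.parts[offset] = payload         # overlapping fragments are not handled
        p.last_seen = now
        if not more:
            p.total_len = offset + len(payload)
        if p.total_len is not None and sum(map(len, p.parts.values())) == p.total_len:
            del self.pending[(src, ident)]
            return "complete", b"".join(p.parts[o] for o in sorted(p.parts))
        return "pending", None

    def expire(self, now: float) -> List[Key]:
        # Drop reassemblies whose inter-fragment gap exceeded the timeout.
        # This is where a retransmission request or soft error would be sent.
        dead = [k for k, p in self.pending.items()
                if now - p.last_seen > self.gap_timeout]
        for k in dead:
            del self.pending[k]
        return dead
```

The point of the sketch is only that state is held for tens of milliseconds
rather than for an MSL, and that fragments with implausible IDs never consume
cache memory in the first place.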

Fred

From: Tom Herbert [mailto:t...@herbertland.com]
Sent: Monday, December 20, 2021 1:06 PM
To: Templin (US), Fred L <fred.l.temp...@boeing.com>
Cc: to...@strayalpha.com; int-area@ietf.org
Subject: Re: IP parcels

On Mon, Dec 20, 2021 at 12:03 PM Templin (US), Fred L
<fred.l.temp...@boeing.com> wrote:
Tom, sorry I will try to use my words more carefully; I am using GSO/GRO also
for a UDP-based transport protocol – not QUIC but something similar. I like
GSO/GRO very much; I am glad the service is available and I want to see it
continue. But, my understanding of the services is that they leverage the IP ID
field in whole IPv4 packets that are not eligible for fragmentation and those
are limitations I am seeking to improve on.

I want to enable a facility similar to GSO/GRO that works for both IPv4 and IPv6
packets and allows for lower layers to fragment if necessary. And, I want to use
a well-behaved 32-bit IPv6 ID instead of the 16-bit IPv4 one where the use is
not well defined when DF=1.

There has been a lot of work in this area. For instance, you might want to take 
a look at https://www.youtube.com/watch?v=ccUeG1dAhbw

About reassembly, that would only happen on the end systems themselves or on
a very capable device that is very close to the end systems; I would not want
for a high-speed core router to have to reassemble.

Even so, an intermediate device close to the end system still has to provide
service to more than one host. Reassembly requires memory to store fragments. A
host needs enough memory to service all of its own flows, but an intermediate
node needs that amount of memory multiplied by the number of hosts it serves in
order to perform reassembly. This is a fundamental scaling problem of stateful
services in the network: inevitably, network nodes cannot scale to the number
of users or flows that require service. In the best case, when resources are not
available the network won't attempt the stateful operation and will just forward
the packet unimpeded (which is fine because hosts will never rely on this class
of optimization). In the worst case, the network will take a detrimental action
such as forcibly breaking a connection (e.g. this is what can happen when a NAT
evicts a TCP connection because it has run out of memory). IMO, maintaining
state in the network is a bad, albeit unfortunately prevalent, idea.
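
The scaling argument above can be put into numbers with a small Python sketch.
The figures used (100 concurrent flows per host, one 64KB datagram buffered per
flow, 10,000 hosts behind the intermediate node) are purely illustrative
assumptions and do not come from the thread, and the function name is
hypothetical.

```python
def reassembly_memory(flows_per_host: int, bytes_per_flow: int, hosts: int = 1) -> int:
    """Worst-case reassembly buffer size in bytes.

    A host buffers for its own flows only; an intermediate node has to
    buffer for every host it serves, so its requirement is multiplied
    by the number of hosts.
    """
    return hosts * flows_per_host * bytes_per_flow


# Illustrative (assumed) numbers, not measurements.
host_bytes = reassembly_memory(flows_per_host=100, bytes_per_flow=65535)
middlebox_bytes = reassembly_memory(flows_per_host=100, bytes_per_flow=65535,
                                    hosts=10_000)

print(f"host:      {host_bytes / 1e6:.1f} MB")
print(f"middlebox: {middlebox_bytes / 1e9:.1f} GB")
```

Whatever numbers are plugged in, the intermediate node's requirement grows
linearly with the number of hosts served while each host's own requirement
stays fixed.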

Tom

Again, GSO/GRO is nice work and much respect is due to those who made it 
possible.

Fred

From: Tom Herbert [mailto:t...@herbertland.com]
Sent: Monday, December 20, 2021 9:20 AM
To: Templin (US), Fred L <fred.l.temp...@boeing.com>
Cc: to...@strayalpha.com; int-area@ietf.org
Subject: Re: [Int-area] [EXTERNAL] Re: IP parcels

The world is not just TCP anymore. QUIC and other UDP-based transports have
already shown performance increases using facilities like GSO/GRO which are
essentially a short term and non-standard implementation of what parcels
promise to do in the long term.

Fred,

Can you explain why GSO/GRO aren't sufficient and are only short term
solutions? We've been using these for almost twenty years now with good effect.
These are widely deployed with TCP; TSO works well to offload transmit, and LRO
is defined and is in much better shape to offload RX now that programmable
devices are emerging. For TCP it's hard to see how IP parcels would help
significantly, but even for UDP we now have UDP GSO, sendmmsg, and recvmmsg that
mitigate the cost of system calls and interrupts to which the draft refers. The
reason these aren't standards in IETF is that they're implementation
techniques and not protocols (although I will point out that
GSO/GRO/sendmmsg/recvmmsg are in all Linux devices so that effectively makes
them a de facto implementation standard).
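
For readers who have not used UDP GSO, the following Python sketch shows the
general idea on Linux: the application hands the kernel one large buffer plus a
segment size, and the kernel emits one datagram per segment from a single system
call. It is a sketch under assumptions: UDP_SEGMENT is defined by hand with the
value 103 from Linux's <linux/udp.h>, the helper name send_segmented is
hypothetical, and when the option is unavailable the code simply falls back to
one sendto() per segment.

```python
import socket

# Assumption: Linux socket option number from <linux/udp.h>, defined by hand
# here so the sketch does not depend on the socket module exporting it.
UDP_SEGMENT = 103


def send_segmented(sock: socket.socket, data: bytes, seg_size: int, addr) -> str:
    """Send data as seg_size-byte datagrams (the last one may be shorter).

    Tries UDP GSO first (one system call for the whole buffer) and falls
    back to one sendto() per segment if the kernel refuses.
    Returns "gso" or "loop" to show which path was taken.
    """
    try:
        sock.setsockopt(socket.IPPROTO_UDP, UDP_SEGMENT, seg_size)
        sock.sendto(data, addr)           # kernel splits into seg_size pieces
        return "gso"
    except OSError:
        # No GSO support (non-Linux, old kernel, or buffer too large).
        for i in range(0, len(data), seg_size):
            sock.sendto(data[i:i + seg_size], addr)
        return "loop"


if __name__ == "__main__":
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))
    rx.settimeout(2.0)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    print(send_segmented(tx, b"x" * 3000, 1000, rx.getsockname()))
    tx.close()
    rx.close()
```

Either way the receiver sees ordinary UDP datagrams; the saving is in system
calls and per-packet stack traversals on the sender, which is the cost the
draft refers to.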

I am also concerned about the idea that intermediate devices would perform
reassembly. This has a whole bunch of implications: middleboxes would no longer
be work conserving, and there seems to be an implicit requirement that the
device be in the path of every packet in a parcel (i.e. even in the case of the
last hop performing reassembly). Also, simply as a matter of resources and
capabilities, hosts are in a much better position to perform tasks like
reassembly. I don't readily see that having intermediate devices perform
reassembly would be a win for hosts, and even if it were, host implementations
would still need the capability to perform reassembly themselves since they
will never rely on the network to always do it for them.

Tom

Thanks - Fred

From: to...@strayalpha.com [mailto:to...@strayalpha.com]
Sent: Sunday, December 19, 2021 11:53 AM
To: Templin (US), Fred L <fred.l.temp...@boeing.com>
Cc: int-area@ietf.org; Wes Eddy <w...@mti-systems.com>
Subject: Re: [Int-area] IP parcels

Hi, Fred (et al.),

On Dec 19, 2021, at 10:21 AM, Templin (US), Fred L
<fred.l.temp...@boeing.com> wrote:

Joe, your insistence on using html makes it impossible to respond to all of
your points inline which is the reason for my top-posts.

I use MacOS mail, IOS mail, and Thunderbird on Windows, all using default 
configurations, FWIW. I appear to be able to post inside everyone else’s 
responses. I don’t know if the IETF’s mailers are munging formats, though.

I’ve made my position clear. However:

- You still haven’t shown any evidence that end systems need to do all this 
extra work so they can somehow run faster, nor that this will be noticeably 
faster than large (i.e., 20-60KB) IPv4 packets.

- You still haven’t shown any reason why this is feasible; in fact, below you 
add the idea of on-path fragmentation, which is largely deprecated because 
fragments won’t traverse tunnels (in your case, notably for single chunks 
larger than 64KB). Nevermind that the fragmentation is both expensive and 
slow-path at routers.

- You have claimed that both routers and transports will somehow adopt this 
when we can’t even get reasonably large MTUs that already fit within IPv4 
across heterogeneous enterprises.

IPv4 is over; even if you don’t think so, any way forward with larger packets 
starts with:
               a) getting ~64KB IP packets across the net
               b) after (a), prove that >64KB are needed based on the IPv6
                  jumbo approach

Any way forward with a lot of small packets inside one large one (where both
chunks and total length are less than 64K) starts by proving there’s a need and
by fixing how it interacts with TCP’s inherent burstiness and loss correlation.

Only THEN will this issue be worth more discussion.

Joe

Parcels that contain a single segment whether 64K or considerably less are
still sent as (singleton) parcels and not ordinary packets. That way, nodes in
the network can know that it is OK to encapsulate and fragment since by
asserting its interest in receiving parcels the destination has also subscribed
to being able to reassemble up to a full 64K.

Parcels do not set (Payload Length / Total Length) to 0; they set it to the
length of the first element of the parcel (which is also the length of each
non-final element of the parcel). What happens then is that network equipment
will see a unit with an L3 length that may be considerably shorter than the L2
length. You are right that legacy routers might not like this (or, they might
truncate the packet according to L3 length), and so for paths that might
traverse legacy routers the first-hop node that recognizes parcels instead
encapsulates the parcel in an IPv4 or IPv6 header then performs (source)
fragmentation if necessary. These IP fragments will then travel through legacy
routers just fine.
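
To make the length rule above concrete, here is a minimal Python sketch of
packing and unpacking a parcel body. It models only the rule stated in this
message (the L3 length field carries the length of the first segment, every
non-final segment has that same length, and the final one may be shorter). The
function names, the 65535-byte bound, and the absence of any headers or option
encoding are assumptions for illustration, not the draft's wire format.

```python
from typing import List, Tuple

# Assumption: a segment length has to fit in a 16-bit length field.
MAX_SEG_LEN = 65535


def build_parcel(segments: List[bytes]) -> Tuple[int, bytes]:
    """Return (l3_length, body) for a list of transport segments.

    l3_length is the length of the first segment, not of the whole body.
    """
    if not segments:
        raise ValueError("a parcel needs at least one segment")
    seg_len = len(segments[0])
    if not 0 < seg_len <= MAX_SEG_LEN:
        raise ValueError("segment length out of range")
    if any(len(s) != seg_len for s in segments[:-1]):
        raise ValueError("non-final segments must all have the same length")
    if not 0 < len(segments[-1]) <= seg_len:
        raise ValueError("final segment must be non-empty and no longer than the others")
    return seg_len, b"".join(segments)


def open_parcel(l3_length: int, body: bytes) -> List[bytes]:
    """Split a parcel body back into segments using the L3 length field."""
    if l3_length <= 0:
        raise ValueError("l3_length must be positive")
    return [body[i:i + l3_length] for i in range(0, len(body), l3_length)]


if __name__ == "__main__":
    l3, body = build_parcel([b"a" * 1000, b"b" * 1000, b"c" * 300])
    # The L3 length covers one segment; the body (what L2 carries) is longer.
    print(l3, len(body), [len(s) for s in open_parcel(l3, body)])
```

A node that only looks at the L3 length would see 1000 bytes here while the
link-layer frame carries 2300, which is the mismatch that legacy routers may
mishandle and the reason for the encapsulation step described above.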

About RFC793bis, you and Wes Eddy know far more about its status than I do; I
only noted that this is something with TCP implications and so made mention of
it in case there is still room for a few more engine tweaks while the hood is
still open.

About IPv4, I am currently running IPv4 edge networks with IPv4-in-IPv6 tunnel
endpoints connected to an IPv6 transit network and it works really well. End
systems get to use smaller addresses and smaller headers, and they can talk to
remote correspondents using IPv4 as if they were all on the same IPv4 network.
So yes, I think we might still want to consider IPv4 for edge networks like
that.

About getting 64K packets across, only the edge networks or end systems see
them as large packets; in the core they are typically broken up into something
much smaller by ingress nodes that apply segmentation/fragmentation. We don’t
need the core to move to jumbo links; we only need that at the edges. ATM taught
us that.

About our “nail”, end systems get to see larger packets/parcels and get to take
advantage of the reduced interrupts and system call overhead they provide. That
is what makes it worthwhile.

Fred

From: to...@strayalpha.com [mailto:to...@strayalpha.com]
Sent: Saturday, December 18, 2021 8:13 PM
To: Templin (US), Fred L <fred.l.temp...@boeing.com>
Cc: int-area@ietf.org; Wes Eddy <w...@mti-systems.com>
Subject: Re: [Int-area] IP parcels

Hi, Fred,

If you have one segment that’s less than 64K, you don’t need the parcel option
at all.

If you have something longer than 64K, either as a single segment or multiple 
smaller segments, by setting total length to 0, you end up being dropped by 
legacy routers, which either ignore options they don’t understand or drop 
packets with options they don’t support.

RFC793bis does talk about IPv6 jumbos, but this new work is out of scope for 
RFC793bis - furthermore, it’s too late. It has passed WGLC, IETF LC, and is 
currently in IESG review for publication.

You also haven’t addressed why the IETF should be taking up this *new* work for 
IPv4, which I thought also had been considered ineligible.

But overall, again, what’s the point? We can’t even get 64K IP packets through 
the Internet; making them larger doesn’t make that easier or more likely. Such 
large sizes are of diminishing benefit; routers already forward at 40Gbps per 
link for minimal packets and end systems have other problems that this 
exacerbates.

This seems a lot like a huge hammer in search of a nail. Where’s the nail?

Joe

—
Joe Touch, temporal epistemologist
www.strayalpha.com

On Dec 18, 2021, at 7:18 PM, Templin (US), Fred L
<fred.l.temp...@boeing.com> wrote:

Joe, I never said that I wanted to restrict this to small transport segments;
in fact, I want just the opposite – I want large segments. A perfectly legal
parcel is one which includes 1 ~64KB segment - another legal parcel is one
which includes 64 of them! If you want bigger segments than that, then true
jumbos are necessary and this spec does not preclude that.

About RFC793(bis), I see there is a section on Jumbos, and IP parcels are (sort
of) an application of Jumbos.

Fred

From: to...@strayalpha.com [mailto:to...@strayalpha.com]
Sent: Saturday, December 18, 2021 4:57 PM
To: Templin (US), Fred L <fred.l.temp...@boeing.com>
Cc: int-area@ietf.org; Wes Eddy <w...@mti-systems.com>
Subject: [EXTERNAL] Re: [Int-area] IP parcels

Hi, Fred,

Regarding 793bis, new ideas are out of scope. It’s supposed to be a roll-in of 
existing items only.

Nevermind the problems below, which “TCP will find a way” doesn’t magically fix.

The problem is this:
- end systems need to send larger transport segments (not just IP segments)
- if they can do that, they should, filling up to the largest IP payload

Having an IP packet have the opportunity to include lots of small transport 
packets assumes transport packets either work better or faster when they’re 
small. It’s the opposite.

Joe

—
Joe Touch, temporal epistemologist
www.strayalpha.com

On Dec 18, 2021, at 4:42 PM, Templin (US), Fred L
<fred.l.temp...@boeing.com> wrote:

Joe, TCP will find a way to adapt – it always has. I also see that TCP is
currently undergoing a second edition revision so the timing seems right to
consider IP parcels in the analysis. I am Cc’ing the second edition editor for
his information – Wesley, please consider this new concept called “IP Parcels”
as it relates to your document.

Here is the latest draft version – it expands on the “Motivation” section and
adds a number of important features such as a new “Parcels Permitted” TCP
option:

https://datatracker.ietf.org/doc/draft-templin-intarea-parcels/

Fred

From: to...@strayalpha.com [mailto:to...@strayalpha.com]
Sent: Friday, December 17, 2021 6:01 PM
To: Templin (US), Fred L <fred.l.temp...@boeing.com>
Cc: int-area@ietf.org
Subject: Re: [Int-area] IP parcels

Hi, Fred,

I’m first concerned at the use of an IP option at all, due to the problems with 
*any* options forcing processing to slow-path.

From TCP’s viewpoint, it seems like you’ve just created a nightmare for SACK 
and ECN, basically because you will encourage drops of large bursts of packets.

This will also increase the burstiness of TCP, i.e., rather than letting the
ACKs support pacing.

Any part of the system that currently coalesces TCP packets is likely to 
generate errors here, because they might see only the first TCP segment.

However, AFAICT the most significant consideration is that the issue with
per-packet performance is at the TCP and UDP layers, not as much at the IP
layer.

So what problem is this trying to solve?

Joe
—
Joe Touch, temporal epistemologist
www.strayalpha.com

On Dec 17, 2021, at 5:06 PM, Templin (US), Fred L
<fred.l.temp...@boeing.com> wrote:

Here's one that should help with shipping, just in time for Christmas. Thanks
to everyone for the past and future list exchanges.

Fred

-----Original Message-----
From: I-D-Announce [mailto:i-d-announce-boun...@ietf.org] On Behalf Of
internet-dra...@ietf.org
Sent: Friday, December 17, 2021 5:00 PM
To: i-d-annou...@ietf.org
Subject: I-D Action: draft-templin-intarea-parcels-00.txt

A New Internet-Draft is available from the on-line Internet-Drafts directories.

       Title           : IP Parcels
       Author          : Fred L. Templin
       Filename        : draft-templin-intarea-parcels-00.txt
       Pages           : 8
       Date            : 2021-12-17

Abstract:
  IP packets (both IPv4 and IPv6) are understood to contain a unit of
  data which becomes the retransmission unit in case of loss.  Upper
  layer protocols such as the Transmission Control Protocol (TCP)
  prepare data units known as "segments", with traditional arrangements
  including a single segment per packet.  This document presents a new
  construct known as the "IP Parcel" which permits a single packet to
  carry multiple segments.  The parcel can be opened at middleboxes on
  the path with the included segments broken out into individual
  packets, then rejoined into one or more repackaged parcels to be
  forwarded further toward the final destination.  Reordering of
  segments within parcels is unimportant; what matters is that the
  number of parcels delivered to the final destination should be kept
  to a minimum, and that loss or receipt of individual segments (and
  not parcel size) determines the retransmission unit.

The IETF datatracker status page for this draft is:
https://datatracker.ietf.org/doc/draft-templin-intarea-parcels/

There is also an htmlized version available at:
https://datatracker.ietf.org/doc/html/draft-templin-intarea-parcels-00

Internet-Drafts are also available by rsync at rsync.ietf.org::internet-drafts

_______________________________________________
I-D-Announce mailing list
i-d-annou...@ietf.org
https://www.ietf.org/mailman/listinfo/i-d-announce
Internet-Draft directories: http://www.ietf.org/shadow.html
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt

_______________________________________________
Int-area mailing list
Int-area@ietf.org
https://www.ietf.org/mailman/listinfo/int-area
