Thank you Jack, Bill, and Ricky for your comments! (And everyone after!)

Jack:
"> amount (bytes, datagrams) presumed lost and re-transmitted by the sender"

I would consider those lost packets, recovered through time complexity, e.g. retransmission with TCP (and that retransmission may itself require another retransmission, and so on). The re-transmitted data is then counted towards overhead. The UDP equivalent would be redundancy, i.e. paying for lost-data recovery with space complexity. If the payload cannot be recovered at all, the whole payload should be counted as waste, on top of the packet loss itself.

"> amount (bytes, datagrams) discarded at the receiver because they were already received"

From my perspective this is simply overhead. Any extra cost in space complexity over the original payload is inefficiency and should be counted as such.

"> amount (bytes, datagrams) discarded at the receiver because they arrived too late to be useful"

This one is trickier, because it takes a stance on the use case. Better would be to characterize the time distribution as percentiles at specific useful intervals, e.g. RTT multiples or 100 ms, 500 ms, 1000 ms, to give a rough estimate of how much time has to be paid for recovery when a feedback loop is involved.

"> With such data, it would be possible to measure things like "useful throughput", i.e., the data successfully delivered from source to destination which was actually useful for the associated user's application."

Here we have a few different components at play:
1. Use-case required bandwidth vs. use-case consumed bandwidth.
2. Link saturation, as a non-use-case-specific measure: how much of the available path are we able to utilize effectively, assuming the use case demands at least that much, e.g. a file transfer, where there is no real-time budget from the use-case perspective.

Bill:

"> All of that can probably be derived from sufficiently finely-grained TCP data. i.e. if you had a PCAP of a TCP flow that constituted the measurement, you’d be able to derive all of the above."
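As a side note before getting to Bill's points: the overhead accounting in my reply to Jack above (retransmissions paid for in time, redundancy paid for in space, duplicates discarded at the receiver) boils down to a single ratio, roughly sketched below. All names and numbers here are purely illustrative, not from any existing measurement tool:

```python
# Rough sketch of the overhead accounting discussed above.
# Function and parameter names are illustrative only.

def transfer_efficiency(payload_bytes, retransmitted_bytes,
                        redundancy_bytes, duplicate_bytes):
    """Ratio of useful payload to everything actually sent.

    Retransmissions (TCP-style recovery, paid in time) and
    redundancy (FEC-style recovery, paid in space) both count
    as overhead, as do duplicates discarded at the receiver
    because the data was already received.
    """
    total_sent = (payload_bytes + retransmitted_bytes +
                  redundancy_bytes + duplicate_bytes)
    return payload_bytes / total_sent

# Made-up example: 1 MB payload, 50 kB retransmitted,
# 100 kB FEC redundancy, 10 kB duplicates.
eff = transfer_efficiency(1_000_000, 50_000, 100_000, 10_000)
print(f"transfer efficiency: {eff:.3f}")
```

Every extra bit over the original payload lowers the ratio, regardless of whether it was spent on the time budget or the space budget.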
What I would say cannot be derived is the behavior of the transport across these combinations. Very interesting, and matching what I have also experienced, was this: https://www.youtube.com/watch?v=XHls8PvCVws&t=319s

Especially the second talk, about measurements at fine-grained payload resolution: you see a very specific pattern with TCP where increasing the payload size introduces extra RTTs. Thus I think what is done in e.g. iperf is not an ideal method, as there is just one payload size (?), and this result shows that the payload size itself will determine your results. So we need more dimensions in our tests.

"> Bandwidth: The data transfer capacity available on the test path. Presumably the goal of a TCP transaction measurement would be to enable this calculation."

In the light of the previous video: "we know X is available; how well was it utilized?"

"> Transfer Efficiency: The ratio of useful payload data to the overhead data. This is a how-its-used rather than a property-of-the-network. If there are network-inherent overheads, they’re likely to be not directly visible to endpoints, only inferable, and might require external knowledge of the network. So, I’d put this out-of-scope."

I think differently: this is an extremely important factor, and it is only getting more important. How much waste is there relative to useful data? Every single extra bit is waste. Whether it is redundant data for recovery on transports without a feedback loop, or just ACKs driving retransmission, the goal is the same: we need to reconstruct the payload, either by paying in the time budget (waiting for retransmission) or in the space budget (adding enough redundancy to recreate the payload regardless of what is lost).

"> RTT is measurable. If Latency is RTT minus processing delay on the remote end, I’m not sure it’s really measurable, per se, without the remote end being able to accurately clock itself, or an independent vantage point adjacent to the remote end.
This is the old one-way-delay measurement problem in different guise, I think. Anyway, I think RTT is easy and necessary, and I think latency is difficult and probably an anchor not worth attaching to anything we want to see done in the near term. Latency jitter likewise."

Yes, this is difficult without time synchronization. That is why it has to be done under laboratory conditions, or measured glass-to-glass across a relay. It is very tricky to make sure the timings are correct so that we get no garbage measurements! To truthfully understand performance across a gradient of use cases, we need to be able to distinguish the baseline network fluctuation (e.g. Starlink) from the delays our transport adds on top. Thus I would really like to keep these two separate, as that lets us distinguish whether we spent an extra 300 ms on a few retransmission loops, or picked up an extra 5 ms of latency simply as error from the gradient in the baseline.

"> This seems like it can be derived from a PCAP, but doesn’t really constitute an independent measurement."

Agreed. If we condition the network for a lab test, it would be good to be able to derive the conditioned loss, to make sure we know what we are doing.

"> Energy Efficiency: The amount of energy consumed to achieve the test result. Not measurable."

We should try! See the attached picture about QUIC from the previous email; these things matter when we have constrained devices in IoT, in space, on drones, etc. We can get artificially good performance if we simply burn an enormous amount of energy. We should try to measure the total energy spent on the overall transmission. Even if difficult, we should absolutely worry about the cost not only in bandwidth and time but also in energy, as a consequence of the computational complexity of some of the approaches at our disposal, even for error correction.

"> Did I overlook something? Out-of-order delivery is the fourth classical quality criterion.
There are folks who argue that it doesn’t matter anymore, and others who (more compellingly, to my mind) argue that it’s at least as relevant as ever."

Good point. Is this a concern for receive-buffer bloat, if we need a jitter buffer and pay in time complexity to recover misordered data? My perspective is mostly on the application layer, so forgive me if I missed your point.

"Thus, for an actual measurement suite:
- A TCP transaction
…from which we can observe:
- Loss
- RTT (which I’ll just call “Latency” because that’s what people have called it in the past)
- out-of-order delivery
- Jitter in the above three, if the transaction continues long enough
…and we can calculate:
- Goodput"

I see. Yes, and my comment would be to do all of that for TCP/LTP/UDP/QUIC... and whatnot, so that we finally have a good overview of where we stand as humanity in our ability to transfer bits for an arbitrary use case, with a quadrant of data intensity on the X axis and event rate on the Y axis, followed by environmental factors such as latency and packet loss.

"In addition to these, I think it’s necessary to also associate a traceroute (and, if available and reliable, a reverse-path traceroute) in order that it be clear what was measured, and a timestamp, and a digital signature over the whole thing, so we can know who’s attesting to the measurement."

Yes, especially if we do tests in the wilderness: fully agreed! Overall, all tests should be documented so accurately that they can be reproduced within error margins. If not, it is not really science or even engineering, just artisans with rules of thumb!

Best regards,
Sauli

On 12/7/23, Ricky Mok via Starlink <starlink@lists.bufferbloat.net> wrote:
> How about applications? youtube and netflix?
>
> (TPC of this conference this year)
>
> Ricky
>
> On 12/6/23 18:22, Bill Woodcock via Starlink wrote:
>>
>>> On Dec 6, 2023, at 22:46, Sauli Kiviranta via Nnagain
>>> <nnag...@lists.bufferbloat.net> wrote:
>>> What would be a comprehensive measurement?
>>> Should cover all/most relevant areas?
>>
>> It’s easy to specify a suite of measurements which is too heavy to be
>> easily implemented or supported on the network. Also, as you point out,
>> many things can be derived from raw data, so don’t necessarily require
>> additional specific measurements.
>>
>>> Payload Size: The size of data being transmitted.
>>> Event Rate: The frequency at which payloads are transmitted.
>>> Bitrate: The combination of rate and size transferred in a given test.
>>> Throughput: The data transfer capability achieved on the test path.
>>
>> All of that can probably be derived from sufficiently finely-grained TCP
>> data. i.e. if you had a PCAP of a TCP flow that constituted the
>> measurement, you’d be able to derive all of the above.
>>
>>> Bandwidth: The data transfer capacity available on the test path.
>>
>> Presumably the goal of a TCP transaction measurement would be to enable
>> this calculation.
>>
>>> Transfer Efficiency: The ratio of useful payload data to the overhead
>>> data.
>>
>> This is a how-its-used rather than a property-of-the-network. If there
>> are network-inherent overheads, they’re likely to be not directly visible
>> to endpoints, only inferable, and might require external knowledge of the
>> network. So, I’d put this out-of-scope.
>>
>>> Round-Trip Time (RTT): The ping delay time to the target server and back.
>>> RTT Jitter: The variation in the delay of round-trip time.
>>> Latency: The transmission delay time to the target server and back.
>>> Latency Jitter: The variation in delay of latency.
>>
>> RTT is measurable. If Latency is RTT minus processing delay on the remote
>> end, I’m not sure it’s really measurable, per se, without the remote end
>> being able to accurately clock itself, or an independent vantage point
>> adjacent to the remote end. This is the old one-way-delay measurement
>> problem in different guise, I think. Anyway, I think RTT is easy and
>> necessary, and I think latency is difficult and probably an anchor not
>> worth attaching to anything we want to see done in the near term. Latency
>> jitter likewise.
>>
>>> Bit Error Rate: The corrupted bits as a percentage of the total
>>> transmitted data.
>>
>> This seems like it can be derived from a PCAP, but doesn’t really
>> constitute an independent measurement.
>>
>>> Packet Loss: The percentage of packets lost that needed to be recovered.
>>
>> Yep.
>>
>>> Energy Efficiency: The amount of energy consumed to achieve the test
>>> result.
>>
>> Not measurable.
>>
>>> Did I overlook something?
>>
>> Out-of-order delivery is the fourth classical quality criterion. There
>> are folks who argue that it doesn’t matter anymore, and others who (more
>> compellingly, to my mind) argue that it’s at least as relevant as ever.
>>
>> Thus, for an actual measurement suite:
>>
>> - A TCP transaction
>>
>> …from which we can observe:
>>
>> - Loss
>> - RTT (which I’ll just call “Latency” because that’s what people have
>> called it in the past)
>> - out-of-order delivery
>> - Jitter in the above three, if the transaction continues long enough
>>
>> …and we can calculate:
>>
>> - Goodput
>>
>> In addition to these, I think it’s necessary to also associate a
>> traceroute (and, if available and reliable, a reverse-path traceroute) in
>> order that it be clear what was measured, and a timestamp, and a digital
>> signature over the whole thing, so we can know who’s attesting to the
>> measurement.
>>
>> -Bill
>>
>> _______________________________________________
>> Starlink mailing list
>> Starlink@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/starlink
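P.S. To make the baseline-versus-transport separation I argued for above a bit more concrete, here is a minimal sketch. The sample values are made up, and the nearest-rank percentile helper is deliberately simplistic; a real analysis would use many more samples and a proper estimator:

```python
# Sketch: separate the baseline path fluctuation (e.g. idle pings over
# Starlink) from the delay our transport adds on top of it.
# All numbers below are invented for illustration.

def percentile(samples, p):
    """Simplistic nearest-rank percentile (0 <= p <= 100)."""
    s = sorted(samples)
    k = min(len(s) - 1, int(round(p / 100 * (len(s) - 1))))
    return s[k]

def transport_added_delay(baseline_rtts_ms, loaded_rtts_ms, p=99):
    """Delay at percentile p attributable to the transport:
    whatever exceeds the path's own baseline fluctuation."""
    return percentile(loaded_rtts_ms, p) - percentile(baseline_rtts_ms, p)

baseline = [40, 42, 45, 41, 60, 43]   # idle-path gradient
loaded = [44, 48, 350, 46, 65, 47]    # includes a retransmission loop
print(f"extra delay at p99: {transport_added_delay(baseline, loaded)} ms")
```

Comparing the two distributions percentile by percentile is what lets us say "that 300 ms was retransmission loops" rather than blaming the path.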