Hi Panos,

Great questions; we’ll aim to share some numbers next week.
Regarding the Chrome 10% budget, that’s taken from here (linked from the Cloudflare blog post, but the link didn’t survive the copy/paste to this list):
https://dadrian.io/blog/posts/pqc-signatures-2024/ <https://dadrian.io/blog/posts/pqc-signatures-2024/#fnref:3>

Best,
Luke

Luke Valenta
Systems Engineer - Research

On Thu, Nov 7, 2024 at 10:54 AM Kampanakis, Panos <kpanos=40amazon....@dmarc.ietf.org> wrote:

> Hi Bas,
>
> That is interesting and surprising, thank you.
>
> I am mostly interested in the ~63% of non-resumed sessions that would be
> affected by 10-15KB of auth data. It looks like your data showed that each
> QUIC conn transfers about 4.7KB, which is very surprising to me. It seems
> very low.
>
> In experiments I am running here for top web servers, I see lots of conns
> which transfer hundreds of KB even over QUIC in cached browser sessions.
> This aligns with the average from your blog, 551*0.6=~330KB, but not
> the median of 4.7KB. Hundreds of KB also aligns with the p50 per page /
> conns per page in
> <https://httparchive.org/reports/page-weight?lens=top1k&start=2024_05_01&end=latest&view=list>.
> Of course browsers cache a lot of things like JavaScript, images etc., so
> they don’t transfer all resources, which could explain the median. But
> still, based on anecdotal experience looking at top visited servers, I am
> noticing many small transfers and just a few that transfer larger HTML, CSS
> etc. on every page even in cached browser sessions.
>
> I am curious about the 4.7KB and the 15.8% of conns transferring >100KB in
> your blog. Like you say in your blog, if the 95th percentile includes
> very large transfers that would skew the diff between the median and the
> average. But I am wondering if there is another explanation. In my
> experiments I see a lot of 302 and 301 redirects which transfer minimal
> data. Some pages have a lot of those.
> If you have many of them, then your
> median will get skewed as it fills up with very small data transfers that
> basically don’t do anything. In essence, we could have 10 pages which
> transfer 100KB each for one of their resources and have another 9
> connections per page that are HTTP redirects or transfer 0.1KB. That would
> make us think that 90% of the connections for those 10 pages will be
> blazing fast, but the 100KB resource in each page will take a good amount
> of time in a slow network.
>
> To validate this theory, what would your data show if you queried for the
> % of conns that transfer <0.5KB or <1KB? If that is a lot, then there are
> many small conns that skew the median downwards. Or what if you run the
> query to exclude the very heavy conns and the very light (HTTP 301, 302
> etc.)? For example, if you ran a report on the conns transferring
> 1KB<data<80th percentile KB, what would be the median for that? That would
> tell us if the too small and too big conns skew the median.
>
> Btw, I am curious also about
>
> > Chrome is more cautious and set 10% as their target for maximum TLS
> > handshake time regression.
>
> Is this public somewhere? There is no immediate link between TLS handshake
> and any of the Core Web Vitals metrics or the CrUX metrics other than the
> TTFB. Even for the TTFB, 10% in the handshake does not mean 10% TTFB; the
> TTFB is affected much less. I am wondering if we should start expecting the
> TLS handshake to slowly become a tracked web performance metric.
>
> *From:* Bas Westerbaan <bas=40cloudflare....@dmarc.ietf.org>
> *Sent:* Thursday, November 7, 2024 9:07 AM
> *To:* <tls@ietf.org> <tls@ietf.org>; p...@ietf.org
> *Subject:* [EXTERNAL] [TLS] Bytes server -> client
>
> Hi all,
>
> Just wanted to highlight a blog post we just published.
> https://blog.cloudflare.com/another-look-at-pq-signatures/ At the end we
> share some statistics that may be of interest:
>
> On average, around 15 million TLS connections are established with
> Cloudflare per second. Upgrading each to ML-DSA would take 1.8Tbps, which
> is 0.6% of our current total network capacity. No problem so far. The
> question is how these extra bytes affect performance.
>
> Back in 2021, we ran a large-scale experiment to measure the impact of big
> post-quantum certificate chains on connections to Cloudflare’s network over
> the open Internet. There were two important results. First, we saw a steep
> increase in the rate of client and middlebox failures when we added more
> than 10kB to existing certificate chains. Secondly, when adding less than
> 9kB, the slowdown in TLS handshake time would be approximately 15%. We felt
> the latter is workable, but far from ideal: such a slowdown is noticeable
> and people might hold off deploying post-quantum certificates before it’s
> too late.
>
> Chrome is more cautious and set 10% as their target for maximum TLS
> handshake time regression. They report that deploying post-quantum key
> agreement has already incurred a 4% slowdown in TLS handshake time, for the
> extra 1.1kB from server-to-client and 1.2kB from client-to-server. That
> slowdown is proportionally larger than the 15% we found for 9kB, but that
> could be explained by slower upload speeds than download speeds.
>
> There has been pushback against the focus on TLS handshake times. One
> argument is that session resumption alleviates the need for sending the
> certificates again. A second argument is that the data required to visit a
> typical website dwarfs the additional bytes for post-quantum certificates.
> One example is this 2024 publication, where Amazon researchers have
> simulated the impact of large post-quantum certificates on data-heavy TLS
> connections.
> They argue that typical connections transfer multiple requests
> and hundreds of kilobytes, and for those the TLS handshake slowdown
> disappears in the margin.
>
> Are session resumption and hundreds of kilobytes over a connection typical
> though? We’d like to share what we see. We focus on QUIC connections, which
> are likely initiated by browsers or browser-like clients. Of all QUIC
> connections with Cloudflare that carry at least one HTTP request, 37% are
> resumptions, meaning that key material from a previous TLS connection is
> reused, avoiding the need to transmit certificates. The median number of
> bytes transferred from server-to-client over a resumed QUIC connection is
> 4.4kB, while the average is 395kB. For non-resumptions the median is 7.8kB
> and average is 551kB. This vast difference between median and average
> indicates that a small fraction of data-heavy connections skew the average.
> In fact, only 15.8% of all QUIC connections transfer more than 100kB.
>
> The median certificate chain today (with compression) is 3.2kB. That means
> that almost 40% of all data transferred from server to client on more than
> half of the non-resumed QUIC connections are just for the certificates, and
> this only gets worse with post-quantum algorithms. For the majority of QUIC
> connections, using ML-DSA as a drop-in replacement for classical signatures
> would more than double the number of transmitted bytes over the lifetime of
> the connection.
>
> It sounds quite bad if the vast majority of data transferred for a typical
> connection is just for the post-quantum certificates. It’s still only a
> proxy for what is actually important: the effect on metrics relevant to the
> end-user, such as the browsing experience (e.g. largest contentful paint)
> and the amount of data those certificates take from a user’s monthly data
> cap. We will continue to investigate and get a better understanding of the
> impact.
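
As a rough sanity check on the figures quoted above, here is a minimal Python sketch of the arithmetic. It only uses numbers stated in the post (15 million connections per second, 1.8Tbps, a 3.2kB median chain, a 7.8kB median non-resumed transfer). The function names are made up for illustration, and the per-connection ML-DSA overhead is derived from the bandwidth figure rather than stated in the post, so treat it as an assumption.

```python
# Figures quoted from the blog post.
CONNS_PER_SEC = 15e6           # TLS connections established per second
EXTRA_BANDWIDTH_BPS = 1.8e12   # extra bits per second if all used ML-DSA
MEDIAN_CHAIN_KB = 3.2          # median compressed certificate chain today
MEDIAN_NON_RESUMED_KB = 7.8    # median server-to-client bytes, non-resumed


def implied_extra_bytes_per_conn(bandwidth_bps: float, conns_per_sec: float) -> float:
    """Extra bytes per connection implied by an aggregate bandwidth figure."""
    return bandwidth_bps / 8 / conns_per_sec


def share_of_transfer(part_kb: float, total_kb: float) -> float:
    """Fraction of a transfer taken up by one component."""
    return part_kb / total_kb


# Assumption: the 1.8Tbps is spread evenly over all 15M connections/s.
extra_bytes = implied_extra_bytes_per_conn(EXTRA_BANDWIDTH_BPS, CONNS_PER_SEC)

# Share of the median non-resumed transfer taken by today's chain.
cert_share = share_of_transfer(MEDIAN_CHAIN_KB, MEDIAN_NON_RESUMED_KB)

# Assumption: the median connection simply grows by the implied overhead.
growth = (MEDIAN_NON_RESUMED_KB + extra_bytes / 1000) / MEDIAN_NON_RESUMED_KB

print(f"implied extra bytes per connection: {extra_bytes:.0f}")
print(f"certificate share of median transfer: {cert_share:.0%}")
print(f"growth factor of median transfer: {growth:.2f}x")
```

Under these assumptions the implied overhead is about 15kB per connection, today's chain is roughly 41% of the median non-resumed transfer, and the median would grow by a factor of almost three, which is consistent with the "more than double" claim.
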
>
> Best,
>
> Bas
> _______________________________________________
> TLS mailing list -- tls@ietf.org
> To unsubscribe send an email to tls-le...@ietf.org
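
The thought experiment in Panos's quoted message can be sketched in a few lines of Python. This is only an illustration with made-up data: it assumes 10 pages, each opening one 100KB connection and nine 0.1KB connections (for example redirects), and a hypothetical 1KB cutoff for "very light" connections.

```python
from statistics import mean, median

PAGES = 10                # hypothetical number of pages
BIG_CONN_KB = 100.0       # one large transfer per page
SMALL_CONN_KB = 0.1       # redirects and other tiny transfers
SMALL_CONNS_PER_PAGE = 9
LIGHT_CUTOFF_KB = 1.0     # hypothetical threshold for "very light" conns

# Per-connection transfer sizes across all pages (toy data, not measurements).
conn_sizes_kb = ([BIG_CONN_KB] + [SMALL_CONN_KB] * SMALL_CONNS_PER_PAGE) * PAGES

# Median and mean over every connection.
median_all = median(conn_sizes_kb)
mean_all = mean(conn_sizes_kb)

# Fraction of connections below the cutoff (the "% of conns <1KB" query).
share_light = sum(1 for s in conn_sizes_kb if s < LIGHT_CUTOFF_KB) / len(conn_sizes_kb)

# Median after excluding the very light connections.
median_without_light = median(s for s in conn_sizes_kb if s >= LIGHT_CUTOFF_KB)

print(f"median over all conns: {median_all}KB")
print(f"mean over all conns: {mean_all:.2f}KB")
print(f"share of conns under {LIGHT_CUTOFF_KB}KB: {share_light:.0%}")
print(f"median excluding light conns: {median_without_light}KB")
```

With this toy data the median is 0.1KB while the mean is about 10KB, and dropping connections under 1KB moves the median to 100KB. Whether real traffic looks anything like this is what the requested numbers would show.
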