Hi Panos,

Great questions—we’ll aim to share some numbers next week.

Regarding the Chrome 10% budget, that’s taken from here (linked from the
Cloudflare blog post, but the link didn’t survive the copy/paste to this
list):
https://dadrian.io/blog/posts/pqc-signatures-2024/#fnref:3

Best,
Luke

Luke Valenta
Systems Engineer - Research


On Thu, Nov 7, 2024 at 10:54 AM Kampanakis, Panos <kpanos=
40amazon....@dmarc.ietf.org> wrote:

> Hi Bas,
>
>
>
> That is interesting and surprising, thank you.
>
>
>
> I am mostly interested in the ~63% of non-resumed sessions that would be
> affected by 10-15KB of auth data. It looks like your data showed that each
> QUIC conn transfers about 4.7KB which is very surprising to me. It seems
> very low.
>
>
>
> In the experiments I am running here against top web servers, I see lots of
> conns which transfer hundreds of KB even over QUIC in cached browser
> sessions. This aligns with the average from your blog (551*0.6=~330KB), but
> not with the median of 4.7KB. Hundreds of KB also aligns with the p50 per
> page / conns per page in
> https://httparchive.org/reports/page-weight?lens=top1k&start=2024_05_01&end=latest&view=list
> . Of course browsers cache a lot of things like JavaScript, images, etc., so
> they don’t transfer all resources, which could explain the median. But
> still, based on anecdotal experience looking at top visited servers, I am
> noticing many small transfers and just a few that transfer larger HTML, CSS,
> etc. on every page, even in cached browser sessions.
>
>
>
> I am curious about the 4.7KB and the 15.8% of conns transferring >100KB in
> your blog. Like you say in your blog, if the 95th percentile includes
> very large transfers that would skew the diff between the median and the
> average. But I am wondering if there is another explanation. In my
> experiments I see a lot of 302 and 301 redirects which transfer minimal
> data. Some pages have a lot of those. If you have many of them, then your
> median will get skewed as it fills up with very small data transfers that
> basically don’t do anything. In essence, we could have 10 pages which
> transfer 100KB each for one of their resources and have another 9 that are
> HTTP Redirects or transfer 0.1KB. That would make us think that 90% of the
> 10 pages will be blazing fast, but the 100KB resource in each page will
> take a good amount of time in a slow network.
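
The redirect scenario above can be put into numbers with a small sketch. The sizes are the ones from the example (one 100KB resource plus nine 0.1KB transfers per page, ten pages); treating each transfer as its own connection is an assumption made only for illustration, and the helper name is hypothetical.

```python
from statistics import mean, median

def page_transfers_kb() -> list[float]:
    """One page in the example: one 100KB resource and nine 0.1KB transfers."""
    return [100.0] + [0.1] * 9

# Ten such pages, with every transfer counted as a separate connection
# (an illustrative assumption, not something measured).
all_conns_kb = [kb for _ in range(10) for kb in page_transfers_kb()]

median_kb = median(all_conns_kb)
mean_kb = mean(all_conns_kb)
share_tiny = sum(kb < 1.0 for kb in all_conns_kb) / len(all_conns_kb)

print(f"median per connection: {median_kb:.2f} KB")
print(f"mean per connection:   {mean_kb:.2f} KB")
print(f"share below 1KB:       {share_tiny:.0%}")
```

In this toy data set the median sits at the size of the redirects while every page still has to fetch its 100KB resource, which is the skew the paragraph describes.
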
>
>
>
> To validate this theory, what would your data show if you queried for the
> % of conns that transfer <0.5KB or <1KB? If that is a lot, then there are many
> small conns that skew the median downwards. Or what if you run the query to
> exclude the very heavy conns and the very light (HTTP 301, 302 etc)? For
> example if you ran a report on the conns transferring 1KB<data<80th
> percentile KB, what would be the median for that? That would tell us if the
> too small and too big conns skew the median.
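
A minimal sketch of the kind of query suggested above: the share of connections under a size threshold, and the median of the band between 1KB and the 80th percentile. The sample sizes are made up purely for illustration, and both function names are hypothetical; real data would come from the connection logs.

```python
from statistics import median, quantiles

def share_below(sizes_kb: list[float], threshold_kb: float) -> float:
    """Fraction of connections that transferred less than threshold_kb."""
    return sum(kb < threshold_kb for kb in sizes_kb) / len(sizes_kb)

def banded_median(sizes_kb: list[float], floor_kb: float = 1.0,
                  upper_percentile: int = 80) -> float:
    """Median of connections strictly between floor_kb and the given percentile."""
    # quantiles(n=100) returns 99 cut points; index p-1 is the p-th percentile.
    cut_points = quantiles(sizes_kb, n=100, method="inclusive")
    upper_kb = cut_points[upper_percentile - 1]
    band = [kb for kb in sizes_kb if floor_kb < kb < upper_kb]
    if not band:
        raise ValueError("no connections fall inside the requested band")
    return median(band)

# Hypothetical per-connection transfer sizes in KB (not real measurements).
sample_kb = [0.1] * 8 + [0.5] * 2 + [4, 8, 20, 60, 120, 200, 300, 800, 2000, 5000]

print("share below 1KB:", share_below(sample_kb, 1.0))
print("raw median (KB):", median(sample_kb))
print("median of 1KB..p80 band (KB):", banded_median(sample_kb))
```

If the banded median comes out far above the raw median, that would suggest the very small transfers are pulling the raw median down.
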
>
>
>
> Btw, I am curious also about
>
> > Chrome is more cautious and set 10% as their target for maximum TLS
> handshake time regression.
>
> Is this public somewhere? There is no immediate link between TLS handshake
> and any of the Core Web Vitals metrics or the CrUX metrics other than the
> TTFB. Even for the TTFB, 10% in the handshake does not mean 10% TTFB; the
> TTFB is affected much less. I am wondering if we should start expecting the
> TLS handshake to slowly become a tracked web performance metric.
>
>
>
>
>
> *From:* Bas Westerbaan <bas=40cloudflare....@dmarc.ietf.org>
> *Sent:* Thursday, November 7, 2024 9:07 AM
> *To:* <tls@ietf.org> <tls@ietf.org>; p...@ietf.org
> *Subject:* [EXTERNAL] [TLS] Bytes server -> client
>
>
>
> Hi all,
>
>
>
> Just wanted to highlight a blog post we just published.
> https://blog.cloudflare.com/another-look-at-pq-signatures/  At the end we
> share some statistics that may be of interest:
>
>
>
> On average, around 15 million TLS connections are established with
> Cloudflare per second. Upgrading each to ML-DSA would take 1.8Tbps, which
> is 0.6% of our current total network capacity. No problem so far. The
> question is how these extra bytes affect performance.
>
>
>
> Back in 2021, we ran a large-scale experiment to measure the impact of big
> post-quantum certificate chains on connections to Cloudflare’s network over
> the open Internet. There were two important results. First, we saw a steep
> increase in the rate of client and middlebox failures when we added more
> than 10kB to existing certificate chains. Secondly, when adding less than
> 9kB, the slowdown in TLS handshake time would be approximately 15%. We felt
> the latter is workable, but far from ideal: such a slowdown is noticeable
> and people might hold off deploying post-quantum certificates until it’s
> too late.
>
>
>
> Chrome is more cautious and set 10% as their target for maximum TLS
> handshake time regression. They report that deploying post-quantum key
> agreement has already incurred a 4% slowdown in TLS handshake time, for the
> extra 1.1kB from server-to-client and 1.2kB from client-to-server. That
> slowdown is proportionally larger than the 15% we found for 9kB, but that
> could be explained by slower upload speeds than download speeds.
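
To make "proportionally larger" concrete, here is a rough per-kB comparison using only the numbers quoted above. It assumes the slowdown scales linearly with the added bytes, which neither experiment claims, and it uses 9kB for the 2021 result even though that figure was stated as an upper bound ("less than 9kB"), so 15/9 is a floor for that rate. The function name is hypothetical.

```python
def slowdown_per_kb(slowdown_percent: float, extra_kb: float) -> float:
    """Handshake slowdown in percentage points per added kB (linear assumption)."""
    return slowdown_percent / extra_kb

# 2021 certificate experiment: about 15% for (at most) 9kB server-to-client.
cert_rate = slowdown_per_kb(15.0, 9.0)

# Chrome's post-quantum key agreement: 4% for 1.1kB down and 1.2kB up.
kex_rate_download_only = slowdown_per_kb(4.0, 1.1)
kex_rate_both_directions = slowdown_per_kb(4.0, 1.1 + 1.2)

print(f"certificates:             {cert_rate:.2f} %/kB")
print(f"key agreement (download): {kex_rate_download_only:.2f} %/kB")
print(f"key agreement (both):     {kex_rate_both_directions:.2f} %/kB")
```

Counting only the server-to-client bytes, the key agreement rate is about twice the certificate rate; counting both directions, the two are close, which fits the suggestion that the upload bytes account for much of the difference.
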
>
>
> There has been pushback against the focus on TLS handshake times. One
> argument is that session resumption alleviates the need for sending the
> certificates again. A second argument is that the data required to visit a
> typical website dwarfs the additional bytes for post-quantum certificates.
> One example is this 2024 publication, where Amazon researchers have
> simulated the impact of large post-quantum certificates on data-heavy TLS
> connections. They argue that typical connections transfer multiple requests
> and hundreds of kilobytes, and for those the TLS handshake slowdown
> disappears in the margin.
>
>
>
> Are session resumption and hundreds of kilobytes over a connection typical
> though? We’d like to share what we see. We focus on QUIC connections, which
> are likely initiated by browsers or browser-like clients. Of all QUIC
> connections with Cloudflare that carry at least one HTTP request, 37% are
> resumptions, meaning that key material from a previous TLS connection is
> reused, avoiding the need to transmit certificates. The median number of
> bytes transferred from server-to-client over a resumed QUIC connection is
> 4.4kB, while the average is 395kB. For non-resumptions the median is 7.8kB
> and average is 551kB. This vast difference between median and average
> indicates that a small fraction of data-heavy connections skew the average.
> In fact, only 15.8% of all QUIC connections transfer more than 100kB.
>
>
> The median certificate chain today (with compression) is 3.2kB. That means
> that almost 40% of all data transferred from server to client on more than
> half of the non-resumed QUIC connections are just for the certificates, and
> this only gets worse with post-quantum algorithms. For the majority of QUIC
> connections, using ML-DSA as a drop-in replacement for classical signatures
> would more than double the number of transmitted bytes over the lifetime of
> the connection.
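
The two claims above can be checked with simple arithmetic. The 7.8kB and 3.2kB medians come from the text; the 15kB of additional ML-DSA data is an assumption (it is not stated in this paragraph), and dividing one median by the other is itself only an approximation of a per-connection share. The function names are hypothetical.

```python
MEDIAN_NON_RESUMED_KB = 7.8    # median server-to-client bytes, non-resumed QUIC
MEDIAN_CERT_CHAIN_KB = 3.2     # median compressed certificate chain
ASSUMED_MLDSA_EXTRA_KB = 15.0  # assumption: extra bytes for an ML-DSA chain

def certificate_share(cert_kb: float, total_kb: float) -> float:
    """Fraction of the transferred bytes taken up by the certificate chain."""
    return cert_kb / total_kb

def growth_factor(total_kb: float, extra_kb: float) -> float:
    """How many times larger the transfer becomes after adding extra_kb."""
    return (total_kb + extra_kb) / total_kb

share = certificate_share(MEDIAN_CERT_CHAIN_KB, MEDIAN_NON_RESUMED_KB)
factor = growth_factor(MEDIAN_NON_RESUMED_KB, ASSUMED_MLDSA_EXTRA_KB)

print(f"certificate share of median transfer: {share:.0%}")
print(f"growth factor with ML-DSA:            {factor:.2f}x")
```

With these inputs the certificate chain is roughly 40% of the median non-resumed transfer, and the assumed extra bytes would nearly triple it, consistent with "more than double".
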
>
>
>
> It sounds quite bad if the vast majority of data transferred for a typical
> connection is just for the post-quantum certificates. It’s still only a
> proxy for what is actually important: the effect on metrics relevant to the
> end-user, such as the browsing experience (e.g. largest contentful paint)
> and the amount of data those certificates take from a user’s monthly data
> cap. We will continue to investigate and get a better understanding of the
> impact.
>
>
>
> Best,
>
>
>
>  Bas
> _______________________________________________
> TLS mailing list -- tls@ietf.org
> To unsubscribe send an email to tls-le...@ietf.org
>
