> This vast difference between median and average indicates that a small
> fraction of data-heavy connections skew the average.

Hmmm. Why not describe this as "a large number of short sessions skew
the median, making the median fail to reflect total data usage"?

The total cost of all sessions is the _average_ cost per session times
the number of sessions. The total cryptographic cost of all sessions is
the average cryptographic cost per session times the number of sessions.

Example:

   * A news site sends you a 20MB video in 1 big session, plus 99 tiny
     sessions each with 0.01MB. The total data it's sending is about
     21MB (20.99MB, to be exact).

   * If you add 0.01MB to each session for crypto, then you're adding
     1MB across the 100 sessions. That's not much compared to 21MB.

   * In terms of averages, the average data per session is 0.21MB, and
     you're adding 0.01MB on top of that. Same 1/21 ratio (although if
     you don't know the total number of sessions then you can't compare
     this to other expenditures).

   * The _median_ data per session is just 0.01MB. This is wildly
     misleading: it completely misses the big video, while incorrectly
     making the crypto sound as if it's doubling costs.
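
For anyone who wants to check the arithmetic, here is a minimal Python
sketch of the example above. The variable names are made up for
illustration, and the fixed 0.01MB of crypto per session is just the
assumption from the example, not a measurement.

```python
from statistics import mean, median

# Session sizes in MB, as in the example: one 20MB video session
# plus 99 tiny sessions of 0.01MB each.
sessions = [20.0] + [0.01] * 99

# Assumed fixed crypto overhead per session, in MB (from the example).
crypto_per_session = 0.01

total_data = sum(sessions)                         # all data sent
total_crypto = crypto_per_session * len(sessions)  # all crypto added

avg_session = mean(sessions)
med_session = median(sessions)

# Relative cost of crypto, computed three ways.
ratio_total = total_crypto / total_data          # what is actually paid
ratio_avg = crypto_per_session / avg_session     # same ratio, via the average
ratio_med = crypto_per_session / med_session     # what the median suggests

print(f"total data:    {total_data:.2f} MB")
print(f"total crypto:  {total_crypto:.2f} MB")
print(f"average/session: {avg_session:.4f} MB, median/session: {med_session:.4f} MB")
print(f"crypto vs total:   {ratio_total:.1%}")
print(f"crypto vs average: {ratio_avg:.1%}")
print(f"crypto vs median:  {ratio_med:.1%}")
```

The ratio against the total and the ratio against the average agree
(both are total crypto over total data, with the session count
cancelling out), while the ratio against the median comes out at 100%.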

To be clear, I do recommend looking at more of the distribution than
just the average. Seeing variations opens up possibilities such as (1)
being able to convince people to use stronger crypto for the longer
sessions, and (2) batching the short sessions to reduce their overhead
(not just for crypto), in cases where aggregate overhead is an issue.

---D. J. Bernstein

_______________________________________________
TLS mailing list -- tls@ietf.org
To unsubscribe send an email to tls-le...@ietf.org
