On Tue, Mar 26, 2019 at 6:21 PM Sniffen, Brian <bsnif...@akamai.com> wrote:

> Brotli has a dictionary built into the algorithm. I believe that is indeed
> being used, as it's a part of Brotli. I think the earlier email was saying
> no external certificate-specific dictionary used.
>
>
> Brotli 1.0.3 and 1.0.5 each changed the standard dictionary, didn’t they?
> Perhaps I am misreading https://github.com/google/brotli/releases , but
> *even if* the Brotli maintainers are careful, I expect many less careful
> entities to version their compression schemes internally, without updating
> the codepoint.
>

I'm not personally familiar with Brotli's history, but looking through that
link, no, I don't believe they changed the standard dictionary. The standard
dictionary is fixed in RFC 7932
<https://tools.ietf.org/html/rfc7932#appendix-A>. So is the rest of the
decoding algorithm <https://tools.ietf.org/html/rfc7932#section-10>.
Looking through the commits, 1.0.5
<https://github.com/google/brotli/compare/v1.0.4...v1.0.5> is just talking
about a tool to extract the dictionary from the RFC
<https://github.com/google/brotli/commit/a4581c158ecad55360aa54f09a08bcc1790c560b>.
1.0.3 <https://github.com/google/brotli/compare/v1.0.2...v1.0.3> appears
to be talking about adding a separate variant
<https://github.com/google/brotli/commit/35e69fc7cf9421ab04ffc9d52cb36d07fa12984a>,
which is not what the code point in the spec refers to. The spec says code
point brotli(2) corresponds to the algorithm defined in RFC 7932. If you use
another algorithm, even a related one, it is not brotli(2). Maybe it's
large_window_brotli(123) or brotli(2) plus a separate large_window(456)
extension, but that will be reflected in the transcript.
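
A minimal sketch of that last point, assuming a simplified message layout
(2-byte algorithm code point, 3-byte uncompressed length, 3-byte length-prefixed
payload) and SHA-256 as the transcript hash. The helper names and the stand-in
handshake bytes are hypothetical; brotli(2) and the made-up
large_window_brotli(123) are the code points mentioned above. Because the code
point is part of the bytes that get hashed, two peers that disagree on the
algorithm cannot end up with the same transcript:

```python
import hashlib
import struct

BROTLI = 2                 # code point named above
LARGE_WINDOW_BROTLI = 123  # hypothetical code point, as in the text above


def compressed_certificate(algorithm, uncompressed_length, payload):
    """Serialize a simplified (assumed) compressed-certificate message."""
    return (
        struct.pack("!H", algorithm)               # 2-byte code point
        + uncompressed_length.to_bytes(3, "big")   # 3-byte length
        + len(payload).to_bytes(3, "big")          # 3-byte length prefix
        + payload
    )


def transcript_hash(messages):
    """Hash the concatenation of all handshake messages, in order."""
    h = hashlib.sha256()
    for m in messages:
        h.update(m)
    return h.digest()


payload = b"\x1b\x2a\x00"  # stand-in bytes, not real Brotli output
earlier = [b"stand-in client hello", b"stand-in server hello"]

t1 = transcript_hash(earlier + [compressed_certificate(BROTLI, 1000, payload)])
t2 = transcript_hash(
    earlier + [compressed_certificate(LARGE_WINDOW_BROTLI, 1000, payload)]
)

# Same payload bytes, different code point: the transcripts differ.
assert t1 != t2
```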

Even without an RFC, not making wire-incompatible changes to a
serialization format is a rather fundamental requirement. Your decoding
needs to match, or you have not implemented the same thing and will not
interop. The spec could call that out, I suppose, but this is usually
considered redundant.

Compression is just a serialization format that attempts to use fewer bytes
for hopefully likely inputs (and thus more bytes for other, hopefully
unlikely, inputs). Maybe it is a family of formats parameterized by a
dictionary, but then someone needs to pick an instantiation. Individual
instantiations are still serialization formats, with the usual expected
rules. Either different instantiations fail to interop, or the dictionary
didn't do anything.
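
As a rough illustration of a dictionary-parameterized family, here is a sketch
using Python's zlib preset dictionaries rather than Brotli (an analogy only; the
dictionaries, helper names, and message are made up for the example). Each
dictionary picks out one instantiation, and a decoder holding a different
dictionary cannot read the other's output:

```python
import zlib


def compress_with(dictionary, data):
    """Compress data with zlib, primed with a preset dictionary."""
    c = zlib.compressobj(zdict=dictionary)
    return c.compress(data) + c.flush()


def decompress_with(dictionary, blob):
    """Decompress a zlib stream that was produced with a preset dictionary."""
    d = zlib.decompressobj(zdict=dictionary)
    return d.decompress(blob) + d.flush()


# Two made-up dictionaries, i.e. two instantiations of the same family.
DICT_A = b"-----BEGIN CERTIFICATE-----"
DICT_B = b"-----BEGIN PUBLIC KEY-----"
message = b"-----BEGIN CERTIFICATE-----example payload"

blob = compress_with(DICT_A, message)

# The matching instantiation round-trips.
assert decompress_with(DICT_A, blob) == message

# A different instantiation does not interop: zlib rejects the stream.
try:
    decompress_with(DICT_B, blob)
    interop = True
except zlib.error:
    interop = False

assert interop is False
```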

> I don't think "no information flow from the algorithm" is particularly
> well-defined. The output of course takes information flow from the
> algorithm as the algorithm is being run. One could change Brotli's
> dictionary from an array lookup to a series of ifs, etc., without changing
> the function.
>
> The transcript encoding must be injective, but we inherently have that
> requirement: the receiver needs to decompress it! The transcript includes
> all inputs to the receiver, notably the compression algorithm code point.
>
>
> No, the time of the transaction is a silent input.  I’m worried about
> extremely persistent adversaries, including those who can update some of
> the involved software in apparently-innocent ways.
>

(Some aspects of time actually are part of the transcript by way of the
random values used for freshness.)

If your adversary can change your software, I don't think this is the part
you need to worry about. :-P

But, yeah, if you make wire-incompatible changes to your implementation
over time and introduce ambiguity, the transcript may not work as intended.
It will also presumably fail to interop. This is equally true for code points
and length prefixes as for compression formats.
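
A small sketch of what ambiguity means here, assuming a toy encoding with
2-byte big-endian length prefixes (the function names are hypothetical and this
is not the TLS wire format). Without the prefixes, two different field lists
serialize to the same bytes; with them, the encoding is injective and the
receiver can parse it back:

```python
def encode_plain(fields):
    """Concatenate fields with no framing; boundaries are lost."""
    return b"".join(fields)


def encode_prefixed(fields):
    """Prefix each field with its 2-byte big-endian length."""
    out = b""
    for f in fields:
        out += len(f).to_bytes(2, "big") + f
    return out


def decode_prefixed(blob):
    """Invert encode_prefixed by reading each length, then that many bytes."""
    fields, i = [], 0
    while i < len(blob):
        n = int.from_bytes(blob[i:i + 2], "big")
        fields.append(blob[i + 2:i + 2 + n])
        i += 2 + n
    return fields


a = [b"ab", b"c"]
b = [b"a", b"bc"]

# Ambiguous: different inputs, identical bytes.
assert encode_plain(a) == encode_plain(b)

# Unambiguous: different inputs, different bytes, and decoding recovers them.
assert encode_prefixed(a) != encode_prefixed(b)
assert decode_prefixed(encode_prefixed(a)) == a
```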

David


> -Brian
>
> Were Brotli's static dictionary changed, it would no longer be Brotli. It
> would perhaps be Brotli2 and would want a separate codepoint. To that end,
> I think the discussion on hash table lookups similarly forgot this
> decompressibility requirement. Let's define my_fancy_algorithm to be:
>
> func compress(input) {
>   if input == some_particular_cert {
>     return "0x00"
>   }
>   return "0x01" + input
> }
>
> This is silly, but still fine because the codepoint for my_fancy_algorithm
> is in the transcript. It would even be fine if my_fancy_algorithm relied on
> a separately-negotiated dictionary extension. The sender inherently must
> unambiguously communicate the dictionary name. That ends up in the
> transcript. (This is the same logic behind other uses of the handshake
> transcript. Blindly stuffing the handshake bytes into the transcript lets
> us align functional and security requirements.)
>
> David
>
>
_______________________________________________
TLS mailing list
TLS@ietf.org
https://www.ietf.org/mailman/listinfo/tls
