On Tue, Mar 26, 2019 at 6:21 PM Sniffen, Brian <bsnif...@akamai.com> wrote:
> Brotli has a dictionary built into the algorithm. I believe that is indeed
> being used, as it's a part of Brotli. I think the earlier email was saying
> no external certificate-specific dictionary used.
>
>
> Brotli 1.0.3 and 1.0.5 each changed the standard dictionary, didn’t they?
> Perhaps I am misreading https://github.com/google/brotli/releases , but
> *even if* the Brotli maintainers are careful, I expect many less careful
> entities to version their compression schemes internally, without updating
> the codepoint.
>

I'm not personally familiar with Brotli's history, but looking through that link, no, I don't believe they changed the standard dictionary. The standard dictionary is fixed in RFC 7932 <https://tools.ietf.org/html/rfc7932#appendix-A>. So is the rest of the decoding algorithm <https://tools.ietf.org/html/rfc7932#section-10>. Looking through the commits, 1.0.5 <https://github.com/google/brotli/compare/v1.0.4...v1.0.5> is just talking about a tool to extract the dictionary from the RFC <https://github.com/google/brotli/commit/a4581c158ecad55360aa54f09a08bcc1790c560b>. 1.0.3 <https://github.com/google/brotli/compare/v1.0.2...v1.0.3> appears to be talking about adding a separate variant <https://github.com/google/brotli/commit/35e69fc7cf9421ab04ffc9d52cb36d07fa12984a>, which is not what the code point in the spec refers to.

The spec says codepoint brotli(2) corresponds to the algorithm defined in RFC 7932. If you use another algorithm, even a related one, it is not brotli(2). Maybe it's large_window_brotli(123) or brotli(2) plus a separate large_window(456) extension, but that will be reflected in the transcript.

Even without an RFC, not making wire-incompatible changes to a serialization format is a rather fundamental requirement. Your decoding needs to match, or you have not implemented the same thing and will not interop. The spec can go call that out I suppose, but this is usually considered redundant.
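
To make the "decoding needs to match" point concrete, here is a minimal sketch in Python. It uses zlib with a preset dictionary as a stand-in for Brotli, purely because zlib ships with Python's standard library and Brotli does not; the two dictionaries and the helper names are made up for illustration and are not part of any spec.

```python
import zlib

# Two hypothetical shared dictionaries, standing in for two different
# "instantiations" of a dictionary-parameterized compression format.
DICT_A = b"-----BEGIN CERTIFICATE-----example.com"
DICT_B = b"some other shared dictionary"


def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    """Compress data in zlib format using the given preset dictionary."""
    compressor = zlib.compressobj(zdict=zdict)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    """Decompress blob, supplying the preset dictionary the decoder should use.

    Raises zlib.error if the stream was produced with a different dictionary.
    """
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(blob) + decompressor.flush()
```

A sender using DICT_A and a receiver using DICT_B fail to decode at all, so a silent change of dictionary shows up as an interop failure rather than as an ambiguous transcript.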
Compression is just a serialization format that attempts to use fewer bytes for hopefully likely inputs (and thus more bytes for other, hopefully unlikely, inputs). Maybe it is a family of formats parameterized by a dictionary, but then someone needs to pick an instantiation. Individual instantiations are still serialization formats, with the usual expected rules. Different instantiations will not interop, or the dictionary didn't do anything.

> I don't think "no information flow from the algorithm" is particularly
> well-defined. The output of course takes information flow from the
> algorithm as the algorithm is being run. One could replace Brotli's
> dictionary from array lookups to a series of ifs, etc., without changing
> the function.
>
> The transcript encoding must be injective, but we inherently have that
> requirement: the receiver needs to decompress it! The transcript includes
> all inputs to the receiver, notably the compression algorithm code point.
>
>
> No, the time of the transaction is a silent input. I’m worried about
> extremely persistent adversaries, including those who can update some of
> the involved software in apparently-innocent ways.
>

(Some aspects of time actually are part of the transcript by way of the random values used for freshness.) If your adversary can change your software, I don't think this is the part you need to worry about. :-P

But, yeah, if you make wire-incompatible changes to your implementation over time and introduce ambiguity, the transcript may not work as intended. It will also presumably fail to interop. This is equally true for code points and length prefixes as for compression formats.

David

> -Brian
>
> Were Brotli's static dictionary changed, it would no longer be Brotli. It
> would perhaps be Brotli2 and would want a separate codepoint. To that end,
> I think the discussion on hash table lookups similarly forgot this
> decompressibility requirement.
> Let's define my_fancy_algorithm to be:
>
> func compress(input) {
>     if input == some_particular_cert {
>         return "0x00"
>     }
>     return "0x01" + input
> }
>
> This is silly, but still fine because the codepoint for my_fancy_algorithm
> is in the transcript. It would even be fine if my_fancy_algorithm relied on
> a separately-negotiated dictionary extension. The sender inherently must
> unambiguously communicate the dictionary name. That ends up in the
> transcript. (This is the same logic behind other uses of the handshake
> transcript. Blindly stuffing the handshake bytes into the transcript lets
> us align functional and security requirements.)
>
> David
>
>
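
For anyone who wants to poke at the quoted my_fancy_algorithm, here is a runnable Python sketch of it together with the matching decoder. Everything here is an assumption made for illustration: the "0x00" and "0x01" strings are read as one-byte tags, SOME_PARTICULAR_CERT is placeholder data, and transcript_hash is a simplified stand-in (codepoint, length, bytes) rather than the real TLS transcript computation.

```python
import hashlib

# Placeholder for the one certificate the toy algorithm special-cases.
SOME_PARTICULAR_CERT = b"example certificate bytes"


def compress(data: bytes) -> bytes:
    """Toy encoder: one tag byte, then the input unless it is the special cert."""
    if data == SOME_PARTICULAR_CERT:
        return b"\x00"
    return b"\x01" + data


def decompress(blob: bytes) -> bytes:
    """Inverse of compress; rejects anything compress could not have produced."""
    if blob == b"\x00":
        return SOME_PARTICULAR_CERT
    if blob[:1] == b"\x01":
        return blob[1:]
    raise ValueError("not a valid my_fancy_algorithm encoding")


def transcript_hash(codepoint: int, blob: bytes) -> bytes:
    """Simplified transcript: hash the algorithm codepoint, a length, and the bytes."""
    h = hashlib.sha256()
    h.update(codepoint.to_bytes(2, "big"))
    h.update(len(blob).to_bytes(3, "big"))
    h.update(blob)
    return h.digest()
```

Because decompress inverts compress, each accepted encoding maps back to exactly one input, and because the codepoint is hashed alongside the bytes, the same bytes sent under a different codepoint give a different transcript.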
_______________________________________________
TLS mailing list
TLS@ietf.org
https://www.ietf.org/mailman/listinfo/tls