On Mon, May 20, 2024 at 2:42 PM Jacob Champion
<jacob.champ...@enterprisedb.com> wrote:
>
> I mean... you said it, not me. I'm trying not to rain on the parade
> too much, because compression is clearly very valuable. But it makes
> me really uncomfortable that we're reintroducing the compression
> oracle (especially over the authentication exchange, which is
> generally more secret than the rest of the traffic).

As currently implemented, the compression only applies to
CopyData/DataRow/Query messages, none of which should be involved in
authentication, unless I've really missed something in my
understanding.

> Right, I think it's reasonable to let a sufficiently
> determined/informed user lift the guardrails, but first we have to
> choose to put guardrails in place... and then we have to somehow
> sufficiently inform the users when it's okay to lift them.

My thought would be that compression should be opt-in on the client
side, with documentation around the potential security pitfalls. (I
could be convinced it should be opt-in on the server side, but overall
I think opt-in on the client side generally protects against footguns
without excessively getting in the way, and if an attacker controls
the client, they can just get the information they want directly; they
don't need compression side channels to get it.)

> But for SQL, where's the dividing line between attacker-chosen and
> attacker-sought? To me, it seems like only the user knows; the server
> has no clue. I think that puts us "lower" in Alyssa's model than HTTP
> is.
>
> As Andrey points out, there was prior work done that started to take
> this into account. I haven't reviewed it to see how good it is -- and
> I think there are probably many use cases in which queries and tables
> contain both private and attacker-controlled information -- but if we
> agree that they have to be separated, then the strategy can at least
> be improved upon.

Within SQL-level things, I don't think we can reasonably differentiate
between private and attacker-controlled information at the
libpq/server level. We can reasonably differentiate between message
types that *definitely* are private and ones that could have either or
both kinds of data in them, but that's not nearly as useful.

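To make the oracle concern concrete, here is a rough Python sketch. It
is not the patch's code: zlib stands in for whatever algorithm gets
negotiated, and the function names and sample values are made up for
illustration. The only protocol facts it relies on are the type bytes
for Query ('Q'), DataRow ('D'), and CopyData ('d'). It shows a
type-based gate like the one described above, and then how the size of
a later message depends on earlier data when one compression stream is
shared, versus when the stream is restarted.

```python
import zlib

# Type bytes of the messages said above to be compressed:
# CopyData ('d'), DataRow ('D'), Query ('Q').
COMPRESSIBLE_TYPES = {b"d", b"D", b"Q"}


def should_compress(msg_type: bytes) -> bool:
    """Hypothetical gate: everything else (e.g. auth 'R'/'p') stays uncompressed."""
    return msg_type in COMPRESSIBLE_TYPES


def shared_stream_sizes(messages):
    """Compress every message through ONE zlib stream.

    Returns the number of bytes emitted for each message, which is what
    an observer of (encrypted) traffic lengths would see.
    """
    comp = zlib.compressobj()
    sizes = []
    for payload in messages:
        out = comp.compress(payload) + comp.flush(zlib.Z_SYNC_FLUSH)
        sizes.append(len(out))
    return sizes


def reset_stream_sizes(messages):
    """Same as above, but start a fresh stream for every message."""
    sizes = []
    for payload in messages:
        comp = zlib.compressobj()  # no history carried over
        out = comp.compress(payload) + comp.flush(zlib.Z_SYNC_FLUSH)
        sizes.append(len(out))
    return sizes


# Illustrative values only: a "private" payload followed by an
# attacker-influenced payload that either repeats it or does not.
SECRET = b"token=9f8e7d6c5b4a39281706f5e4d3c2b1a0"
OTHER_SECRET = b"token=00112233445566778899aabbccddeeff"
RIGHT_GUESS = b"token=9f8e7d6c5b4a39281706f5e4d3c2b1a0"
WRONG_GUESS = b"token=0a1b2c3d4e5f60718293a4b5c6d7e8f9"

if __name__ == "__main__":
    print("shared, right guess:", shared_stream_sizes([SECRET, RIGHT_GUESS])[1])
    print("shared, wrong guess:", shared_stream_sizes([SECRET, WRONG_GUESS])[1])
    print("reset,  right guess:", reset_stream_sizes([SECRET, RIGHT_GUESS])[1])
    print("reset,  wrong guess:", reset_stream_sizes([SECRET, WRONG_GUESS])[1])
```

On the shared stream, the guess that repeats the earlier payload comes
out smaller than the one that does not, and that size difference is
the signal. With a fresh stream per message, the size of the guess no
longer depends on what was sent before it. Restarting on every message
is just the extreme case for illustration and costs compression ratio;
a client-triggered reset would sit somewhere in between.
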
I think not compressing auth-related packets plus giving a mechanism
to reset the compression stream for clients (plus guidance on the
tradeoffs involved in turning on compression) is about as good as we
can get. That said, I *think* the feature can reasonably be
reviewed/committed without the reset functionality as long as the
compressed data already has the mechanism built in (as it does) to
signal when a decompressor should restart its streaming. The actual
signaling protocol mechanism and the necessary libpq API can happen in
follow-on work.

-- 
Jacob Burroughs | Staff Software Engineer
E: jburrou...@instructure.com