On Wed, Apr 6, 2022 at 7:18 AM Filip Janus <fja...@redhat.com> wrote:
> A few months ago a group of researchers published a paper about an LZ77
> vulnerability[1], and it also affects PGLZ. From my point of view, it
> could be a really dangerous issue for some kinds of applications. If I
> understand it correctly, there is a possibility of leaking approx. 24
> bytes of secret data per hour (but it depends on the HW configuration).
>
> I understand that there is no simple and easy solution. But I would
> like to know your opinion on this. Or if you have any plan on how to
> deal with this?
I hadn't heard of this before. It seems to be a real vulnerability in
PGLZ. Fortunately, the attack relies on conditions that may not always
be present, and the rate of data leakage is pretty slow. Some threats
of this kind are going to need to be addressed outside the database.
For example, you could rate-limit attempts to access your web
application to make it harder to accumulate enough accesses to get any
meaningful data leakage, and you could store highly secret data in a
different place than you store data that the user has the ability to
modify. It sounds like even just putting those things in separate
jsonb columns rather than the same one would block this particular
attack. A user could also choose to disable compression for a certain
column entirely if they're worried about this kind of thing (see the
sketch below).

However, there are new attacks all the time, and it's going to be
really hard to block them all. Variable latency is extremely difficult
to avoid, because pretty much every piece of code anyone writes has if
statements and loops that can iterate a different number of times on
different inputs, and then there are CPU effects like caching and
branch prediction that add to the problem. There are tons of attacks
like this, and even if we could somehow, by magic, secure PostgreSQL
against this one completely, there will be lots more in the future. I
think it's inevitable that there will be more and more papers
demonstrating that a determined attacker can leak information out of
system A by very carefully measuring the latency of operation X under
different conditions, and there is no real solution to that problem in
general.

One thing that we could do internally to PostgreSQL is add more
possible TOAST compression algorithms. In addition to PGLZ, which the
attack in the paper targets, we now have LZ4 as an option. That's
probably vulnerable too, and probably zstd is as well, but if a
state-of-the-art algorithm emerges that somehow isn't vulnerable, we
can consider adding support for it.
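To make the column-separation idea above concrete, here's a minimal
sketch against a hypothetical table (the table and column names are
made up for illustration). SET STORAGE EXTERNAL forces a column to be
stored out of line but uncompressed, which takes PGLZ out of the
picture for that column, at the cost of more disk space:

    -- Hypothetical schema: keep attacker-modifiable data and secrets
    -- in separate columns, so they are never compressed together.
    CREATE TABLE user_profile (
        id          bigint PRIMARY KEY,
        user_blob   jsonb,  -- data the user can modify
        secret_blob jsonb   -- highly secret data
    );

    -- Disable compression for the sensitive column: EXTERNAL means
    -- out-of-line storage without compression. Note this only affects
    -- newly stored values, not existing ones.
    ALTER TABLE user_profile ALTER COLUMN secret_blob SET STORAGE EXTERNAL;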
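And picking a different TOAST compression method per column is already
possible on v14 or newer, if the server was built with --with-lz4
(again using the hypothetical table above):

    -- Switch one column's TOAST compression method from pglz to lz4;
    -- only newly stored values use the new method.
    ALTER TABLE user_profile ALTER COLUMN user_blob SET COMPRESSION lz4;

    -- Or change the default for columns that don't specify a method.
    SET default_toast_compression = 'lz4';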
I don't think that as a project we really ought to be in the business
of trying to design our own compression algorithms. PGLZ does a good
job for something that was written by a PostgreSQL hacker, and many
years ago at that, but, not surprisingly, people who spend all day
thinking about compression are really, really good at it. We should
leave it up to them to figure out whether there's something to be done
here, and if the answer is yes, then we can consider adopting whatever
they come up with. Personally, I don't quite see how such a thing
would be possible, but I'm not a compression expert.

One last thought: I don't think it's right to suppose that every
security vulnerability is the result of some design flaw, or that
every security vulnerability must be patched. Imagine, for example,
that someone posted a paper showing that they could break into your
house. Your reaction to that paper would probably depend on how they
did it. If it turns out that the lock you have on your front door will
unlock if you give it a hard bump with your fist, you'd probably want
to replace the lock with one that doesn't have that design flaw. But
if the paper showed that they could break into your house by breaking
one of the windows with a crowbar, would you replace all of those
windows with solid steel? Most people understand that a window is
likely to be made of a more breakable substance than whatever
surrounds it, because it has an additional design constraint: it has
to let light pass through. We accept that as a trade-off when we
choose to live in a house rather than a bunker.

In the same way, without denying that there's a real vulnerability
here, I don't think that anyone who understands a little bit about how
compression and decompression work would expect decompression to take
the same amount of time on every input. Pretty much every compression
algorithm has a mode where incompressible data is copied through byte
for byte, and other modes that take advantage of repeated byte
sequences. It's only reasonable to suppose that those various code
paths are not all going to run at the same speed, and nobody would
want them to: it would mean slowing down the fast paths through the
code to the speed of the slow paths, and because decompression speed
is so important, that sounds like something most people would not
want.

Do you have any suggestions on what we should do here?

--
Robert Haas
EDB: http://www.enterprisedb.com