On 11/08/2025 09:58, Jelte Fennema-Nio wrote:
On Fri, 8 Aug 2025 at 17:50, Heikki Linnakangas <hlinn...@iki.fi> wrote:
I think any of the other options would be
better. There's no guarantee that more data will ever arrive, the
connection might be used just to wait for the notification.

Yeah, I think that's the important thing to realize here. The "try
again later" only makes sense if we need more data to try again. If we
don't then we now start waiting on data that might never come.

Here's a new set of patches, to disconnect on OOM instead of hanging or silently discarding messages:

v4-0001-libpq-Don-t-hang-on-out-of-memory.patch

This is a very minimal fix for the "hang on OOM" issue. It changes the behavior to silently skip over the message, i.e. discard the async notification, or continue without cancellation key.

I think we should backpatch this.

v4-0002-libpq-Handle-OOMs-by-disconnecting-instead-of-sil.patch

This changes the behavior of all of the problematic places where we silently discarded things on OOM to disconnect instead. That is, when processing a Notify, BackendKeyData, or ParameterStatus message.

Now that I look at this again, we should probably forget about patch #1 and commit and backpatch this straight away. It's a little more changes than patch #1, but not that much.

Note that there are still places where we discard things on OOM, like when parsing an error 'E' message. Those are a little iffy too, but they're less problematic because you still get an error, and the error is clearly associated with a query, unlike these Notify, BackendData and ParameterStatus messages.

v4-0003-libpq-Be-strict-about-cancel-key-lengths.patch

Your patch to add more checks, rebased over the first two.

- Heikki



Reply via email to