Dear hackers,

I'm the maintainer of ruby-pg, the Ruby interface to the PostgreSQL database. This binding uses the asynchronous API of libpq by default in order to integrate with Ruby's IO wait and scheduling mechanisms.

This works well with the vanilla PostgreSQL server, but it leads to starvation with other servers that speak PostgreSQL wire protocol 3. This is because the current behavior of the libpq async interface relies on SSL records being no larger than 8kB.

The following servers were reported to starve with ruby-pg:

* AWS RDS Aurora Serverless [1]
* YugabyteDB [2]
* CockroachDB [3]

They block indefinitely on certain message sizes sent from the backend to the libpq frontend. The problem is best described in [4]. A reproduction Docker composition is provided by YugabyteDB at [2].

To fix this issue, the attached patch calls pqReadData() repeatedly in PQconsumeInput() until no buffered SSL data is left to be read. Another solution could be to process the buffered SSL read bytes in PQisBusy() instead of PQconsumeInput().

The synchronous libpq API isn't affected, since it already supports arbitrary SSL record sizes. That's why I think the asynchronous API should support larger SSL record sizes as well.

Regards, Lars


[1] https://github.com/ged/ruby-pg/issues/325
[2] https://github.com/ged/ruby-pg/issues/588
[3] https://github.com/ged/ruby-pg/issues/583
[4] https://github.com/ged/ruby-pg/issues/325#issuecomment-737561270

From ab793829a4ce473f1cc2bbc0e2a6f6753553255d Mon Sep 17 00:00:00 2001
From: Lars Kanis <l...@greiz-reinsdorf.de>
Date: Sun, 8 Sep 2024 13:59:05 +0200
Subject: [PATCH] libpq: Process buffered SSL read bytes to support records
 >8kB on async API

The async API of libpq doesn't support SSL record sizes >8kB so far.
This size isn't exceeded by vanilla PostgreSQL, but it is by other
products using the PostgreSQL wire protocol 3.
PQconsumeInput() reads all data readable from the socket, so that the
read condition is cleared.
But it doesn't process all the data that is pending on the SSL layer.
A subsequent call to PQisBusy() doesn't process it either, so the client
is directed to wait for more readable data on the socket.
But this data never arrives, so that the connection blocks indefinitely.

To fix this issue call pqReadData() repeatedly until there is no buffered
SSL data left to be read.

The synchronous libpq API isn't affected, since it supports arbitrary SSL
record sizes already.
---
 src/interfaces/libpq/fe-exec.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 0d224a852..637894ee1 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -2006,6 +2006,19 @@ PQconsumeInput(PGconn *conn)
 	if (pqReadData(conn) < 0)
 		return 0;
 
+#ifdef USE_SSL
+	/*
+	 * Ensure that all buffered read bytes in the SSL library are processed,
+	 * which might not be the case if the SSL record size exceeds 8kB.
+	 * Otherwise parseInput cannot process the data.
+	 */
+	while (conn->ssl_in_use && pgtls_read_pending(conn))
+	{
+		if (pqReadData(conn) < 0)
+			return 0;
+	}
+#endif
+
 	/* Parsing of the data waits till later. */
 	return 1;
 }
-- 
2.43.0
