[grpc-io] Ping rate limiting is too aggressive

'Damien Neil' via grpc.io Mon, 02 Dec 2024 14:19:09 -0800

(I'd file this as an issue, but so far as I can tell this spans all gRPC 
implementations and I can't figure out which GitHub tracker to use in that 
case.)

gRPC servers set a fairly aggressive limit on the number of pings clients
can send. The algorithm is detailed here:
https://github.com/grpc/proposal/blob/master/A8-client-side-keepalive.md#server-enforcement

In essence, a client can send two PINGs per HEADERS or DATA frame sent by
the server. Any more, and the server closes the connection with an
ENHANCE_YOUR_CALM error. Technically, it's two pings per 5 minutes or per 2
hours, depending on whether there's an outstanding call. However, the
client can't tell whether there's an outstanding call, because its PING
frame can race with the server finishing a call, and even 5 minutes is
essentially forever in computer terms. So it's effectively two PINGs per
HEADERS/DATA frame sent by the server.
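
To make the effective rule concrete, here is a minimal sketch in Go. It is
a toy model of the behavior described above, not code from any gRPC
implementation: pingLimiter, onPing, and onServerSend are hypothetical
names, and it assumes every PING arrives within the 5 minute window, so
every PING counts as a strike.

```go
package main

import "fmt"

// maxPingStrikes is the number of PINGs tolerated between server-sent
// HEADERS/DATA frames in this simplified model (two, per the text above).
const maxPingStrikes = 2

// pingLimiter is a hypothetical model of the server's effective behavior.
// It ignores the 5 minute / 2 hour timers and treats every PING as a strike.
type pingLimiter struct {
	strikes int
}

// onServerSend models the server sending a HEADERS or DATA frame,
// which resets the strike count.
func (l *pingLimiter) onServerSend() {
	l.strikes = 0
}

// onPing models the server receiving a PING. It returns false when the
// server would close the connection with ENHANCE_YOUR_CALM.
func (l *pingLimiter) onPing() bool {
	l.strikes++
	return l.strikes <= maxPingStrikes
}

func main() {
	var l pingLimiter
	fmt.Println(l.onPing()) // true
	fmt.Println(l.onPing()) // true
	l.onServerSend()        // server sends HEADERS or DATA
	fmt.Println(l.onPing()) // true
	fmt.Println(l.onPing()) // true
	fmt.Println(l.onPing()) // false: third PING with no server frame
}
```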

I learned of this in https://go.dev/issue/70575, which is an issue filed
against Go's HTTP/2 client, caused by a new health check we'd added: When a
request times out or is canceled, we send a RST_STREAM frame for it.
Servers don't respond to RST_STREAM, so we bundle the RST_STREAM with a
PING frame to confirm that the server is still alive and responsive. In the
event many requests are canceled at once, we send only one PING for the
batch.

This triggers gRPC servers' rate limiting when several requests are
canceled in quick succession.

Unfortunately, there's no good way for the client to avoid this. Consider
the case where we send three requests, one minute apart, and cancel each
request before the server begins responding. The server never sends a
HEADERS or DATA frame, so nothing resets its count of pings, and the third
PING triggers the rate limit: the server closes the connection.

I think that gRPC servers should reset the ping strike count when they
*receive* a HEADERS or DATA frame. This limits clients to at most two pings
per real frame sent, and essentially places pings under the umbrella of
whatever rate limiting is being applied to HEADERS/DATA. PING frames should
be cheap to process compared to HEADERS/DATA, so limiting them to a small
multiple of the more expensive frames renders them ineffective as a DoS
vector.

This approach would ensure that a client waiting for a response to a
request may always send at least one PING frame to confirm that the server
is still alive.
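
As a rough illustration, the sketch below replays the three-request case
under both rules. It is a toy model in Go, not code from any gRPC server:
the event type and the survives function are hypothetical names, every
PING is assumed to count as a strike (pings are one minute apart), and the
proposed rule is assumed to reset the count on received HEADERS/DATA in
addition to the existing reset on sent frames.

```go
package main

import "fmt"

// event is a hypothetical, simplified view of frames seen by the server.
type event int

const (
	clientFrame event = iota // server receives HEADERS or DATA from the client
	clientPing               // server receives a PING (sent alongside RST_STREAM)
	serverFrame              // server sends HEADERS or DATA
)

// survives replays events against the simplified strike rule and reports
// whether the connection stays open. With resetOnReceive set, frames
// received from the client also reset the strike count (the proposal).
func survives(events []event, resetOnReceive bool) bool {
	const maxStrikes = 2
	strikes := 0
	for _, e := range events {
		switch e {
		case serverFrame:
			strikes = 0
		case clientFrame:
			if resetOnReceive {
				strikes = 0
			}
		case clientPing:
			strikes++
			if strikes > maxStrikes {
				return false // server closes with ENHANCE_YOUR_CALM
			}
		}
	}
	return true
}

func main() {
	// Three requests, each canceled before the server responds.
	scenario := []event{
		clientFrame, clientPing,
		clientFrame, clientPing,
		clientFrame, clientPing,
	}
	fmt.Println("current rule:", survives(scenario, false))
	fmt.Println("proposed rule:", survives(scenario, true))
}
```

In this model the connection is closed under the current rule and stays
open under the proposed one, while a client that sends only PINGs is still
cut off after two.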

- Damien

