(I'd file this as an issue, but so far as I can tell this spans all gRPC 
implementations and I can't figure out which GitHub tracker to use in that 
case.)

gRPC servers set a fairly aggressive limit on the number of pings clients 
can send. The algorithm is detailed here:
https://github.com/grpc/proposal/blob/master/A8-client-side-keepalive.md#server-enforcement

In essence, a client can send two PINGs per HEADERS or DATA frame sent by 
the server. Any more, and the server closes the connection with an 
ENHANCE_YOUR_CALM error. Technically, it's two pings per 5 minutes or 2 
hours, depending on whether there's an outstanding call. However, the 
client can't tell whether there's an outstanding call, because its PING 
frame can race with the server finishing a call, and even 5 minutes is 
essentially forever in computer terms. So it's effectively two PINGs per 
HEADERS/DATA frame sent by the server.

I learned of this in https://go.dev/issue/70575, which is an issue filed 
against Go's HTTP/2 client, caused by a new health check we'd added: When a 
request times out or is canceled, we send a RST_STREAM frame for it. 
Servers don't respond to RST_STREAM, so we bundle the RST_STREAM with a 
PING frame to confirm that the server is still alive and responsive. In the 
event many requests are canceled at once, we send only one PING for the 
batch.

This triggers gRPC servers' rate limiting when several requests are 
canceled in quick succession.

Unfortunately, there's no good way for the client to avoid this: Consider 
the case where we send three requests, one minute apart, and cancel each 
request before the server begins responding. The third PING triggers the 
rate limit, and the server closes the connection.
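To make the scenario concrete, here is a minimal sketch in Go. It models only 
the effective rule described above (two PINGs per HEADERS or DATA frame sent 
by the server) and ignores the 5 minute / 2 hour window, since the three 
requests fall well inside it. The server type and the sentFrame, recvPing, and 
closedAfter names are hypothetical, made up for illustration; this is not code 
from any gRPC implementation.

```go
package main

import "fmt"

// maxPings is the effective budget described in this post: two PINGs per
// HEADERS or DATA frame sent by the server. (Assumption: the time window
// is left out of this model.)
const maxPings = 2

// server is a hypothetical, simplified model of one connection's ping
// accounting on the server side.
type server struct {
	pings  int  // PINGs received since the server last sent HEADERS or DATA
	closed bool // true once the server would close with ENHANCE_YOUR_CALM
}

// sentFrame records that the server sent a HEADERS or DATA frame,
// which resets the count.
func (s *server) sentFrame() { s.pings = 0 }

// recvPing records a PING from the client and closes the connection
// once the budget is exceeded.
func (s *server) recvPing() {
	s.pings++
	if s.pings > maxPings {
		s.closed = true
	}
}

// closedAfter simulates n requests, each canceled before the server
// begins responding. It returns the number of the request whose PING
// closes the connection, or 0 if the connection stays open.
func closedAfter(n int) int {
	s := &server{}
	for i := 1; i <= n; i++ {
		// The client sends the request, then RST_STREAM bundled with
		// a PING. The server never responds, so sentFrame is never
		// called and the count is never reset.
		s.recvPing()
		if s.closed {
			return i
		}
	}
	return 0
}

func main() {
	fmt.Println(closedAfter(3)) // prints 3: the third PING exceeds the budget
}
```

Nothing the client does in this model resets the count; only a frame sent by 
the server does.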

I think that gRPC servers should reset the ping strike count when they 
*receive* a HEADERS or DATA frame. This limits clients to at most two pings 
per HEADERS or DATA frame they send, and essentially places pings under the 
umbrella of whatever rate limiting is being applied to HEADERS/DATA. PING 
frames should be cheap to process compared to HEADERS/DATA, so limiting 
them to a small multiple of the more expensive frames renders them 
ineffective as a DoS vector.

This approach would ensure that a client waiting for a response to a 
request may always send at least one PING frame to confirm that the server 
is still alive.
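A rough sketch of the proposed change, in Go, under the same simplification 
used throughout this post (a budget of two PINGs, with the time window left 
out). The server type, the resetOnReceive flag, and the survives helper are 
hypothetical names for illustration only, not part of any gRPC implementation.

```go
package main

import "fmt"

// maxPings is the assumed budget: two PINGs before the server closes
// the connection.
const maxPings = 2

// frame is a client-to-server frame kind in this simplified model.
type frame int

const (
	headers frame = iota // stands in for a HEADERS or DATA frame
	ping                 // a PING frame
)

// server is a hypothetical model of the server's ping accounting.
type server struct {
	resetOnReceive bool // the proposed behavior: reset on received HEADERS/DATA
	pings          int  // PINGs counted since the last reset
	closed         bool // true once the server would close the connection
}

// recv processes one frame received from the client.
func (s *server) recv(f frame) {
	switch f {
	case headers:
		if s.resetOnReceive {
			s.pings = 0
		}
	case ping:
		s.pings++
		if s.pings > maxPings {
			s.closed = true
		}
	}
}

// survives reports whether the connection stays open across n requests
// that are each canceled before the server sends anything.
func survives(resetOnReceive bool, n int) bool {
	s := &server{resetOnReceive: resetOnReceive}
	for i := 0; i < n; i++ {
		s.recv(headers) // the client opens a request
		s.recv(ping)    // the client cancels: RST_STREAM bundled with a PING
	}
	return !s.closed
}

func main() {
	fmt.Println(survives(false, 3)) // prints false: the third PING closes the connection
	fmt.Println(survives(true, 3))  // prints true: each request resets the count
}
```

With the reset in place, a client can never send more than two PINGs without 
also sending a HEADERS or DATA frame, so ping volume stays bounded by request 
volume.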

- Damien
