On Mon, Dec 2, 2024 at 2:19 PM 'Damien Neil' via grpc.io <
grpc-io@googlegroups.com> wrote:

> I learned of this in https://go.dev/issue/70575, which is an issue filed
> against Go's HTTP/2 client, caused by a new health check we'd added: When a
> request times out or is canceled, we send a RST_STREAM frame for it.
> Servers don't respond to RST_STREAM, so we bundle the RST_STREAM with a
> PING frame to confirm that the server is still alive and responsive. In the
> event many requests are canceled at once, we send only one PING for the
> batch.
>

Our keepalive does something similar, but is time-based. If it has been X
amount of time since the last receipt, then a PING checking the connection
is fair. The problem is only the "aggressive" PING rate by the client. The
client is doing exactly what the server was wanting to prevent:
"overzealous" connection checking. I do think it is more appropriate to
base it on a connection-level time instead of a per-request time, although
you probably don't have a connection-level time to auto-tune to, whereas
you do get feedback from requests timing out.

I'm wary of tying keepalive checks to resets/deadlines, as those are
load-shedding operations and people can have aggressive deadlines or cancel
aggressively as a matter of course. In addition, TCP_USER_TIMEOUT with
the RST_STREAM gets you a lot of the same value without requiring
additional ACK packets.

Note that I do think the 5 minutes is too large, but that's all I was able
to get agreement for. Compared to 2 hours it is short... I really wanted a
bit shy of 1 minute, as 1 minute is the magic inactivity timeout for many
home NATs and some cloud LBs.

> I think that gRPC servers should reset the ping strike count when they
> *receive* a HEADERS or DATA frame.
>

I'm biased against the idea as that's the rough behavior of a certain
server, and it was useless and nothing but a pain. HEADERS and DATA really
have nothing to do with monitoring the connection, so it seems strange to
let the client choose when to reset the counter. For BDP monitoring, we
need it to be reset when the server sends DATA, so that PINGs can be used
to adjust the client's receive window size. And I know of an
implementation that sent unnecessary frames just to reset the counter so
it could send PINGs.

I question whether that gets you what you need. If you start three
requests at the same time with timeouts of 1s, 2s, 3s, then you'll still
run afoul of the limit.
