On 9 May 2017, at 0:02, Masakazu Kitajo wrote:
Hi,
Don't worry guys. I'm working on it.
I merged the PR to have a draining (shedding) logic for HTTP/2 (Double
GOAWAY frames), and now I'm generalizing the flag so that we can invoke
draining on both HTTP/1 and HTTP/2. I'm going to add a "Connection: close"
header automatically for HTTP/1. So, "proxy.config.stop.shutdown_timeout"
will be available for both versions. The PR was step 1, and this will be
step 2.
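To make the two protocol reactions concrete, here is a minimal sketch of how one shared draining flag could drive both HTTP/1 and HTTP/2. It is written in Python for brevity, not in the C++ that traffic_server uses, and every name in it (Http2Session, drain_http1_response, and so on) is hypothetical rather than taken from the merged PR. The double-GOAWAY sequence follows the graceful shutdown pattern suggested in RFC 7540 section 6.8.

```python
from dataclasses import dataclass, field

# Largest HTTP/2 stream identifier (2^31 - 1). RFC 7540 section 6.8 suggests
# using it in the first, advisory GOAWAY of a graceful shutdown.
MAX_STREAM_ID = 2**31 - 1


@dataclass
class Http2Session:
    """Hypothetical stand-in for an HTTP/2 connection; not an ATS type."""
    last_stream_id: int
    sent_frames: list = field(default_factory=list)

    def send_goaway(self, last_stream_id: int) -> None:
        # Record the frame instead of writing to a socket.
        self.sent_frames.append(("GOAWAY", last_stream_id))


def drain_http1_response(headers: dict, draining: bool) -> dict:
    """Return a copy of the response headers, adding "Connection: close"
    while the server is draining."""
    out = dict(headers)
    if draining:
        out["Connection"] = "close"
    return out


def drain_http2_first(session: Http2Session) -> None:
    """First GOAWAY: warn the client without refusing in-flight streams."""
    session.send_goaway(MAX_STREAM_ID)


def drain_http2_second(session: Http2Session) -> None:
    """Second GOAWAY, sent some time later: name the last stream that was
    actually processed."""
    session.send_goaway(session.last_stream_id)
```

The delay between the two GOAWAY frames is left to the caller here; how long the real implementation waits is not stated in this thread.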
“proxy.config.stop.shutdown_timeout” is still a mis-feature. What
really matters for shutting down is how much traffic you are still
serving, which is expressed by
“proxy.config.restart.active_client_threshold”. I think you might
also want to express a maximum time you are willing to wait, but just
having a fixed time is not what we should be doing.
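The stop condition described here can be sketched as a small pure function. This is only an illustration in Python: the function name, the parameter names, and the choice of "less than or equal to" for the threshold comparison are assumptions, not ATS behavior.

```python
from typing import Optional


def should_stop(active_clients: int, threshold: int,
                elapsed_s: float, max_wait_s: Optional[float] = None) -> bool:
    """Decide whether a draining server may stop now.

    The primary condition is traffic: stop once the number of active
    clients has fallen to the threshold. The optional max_wait_s is only
    an upper bound on how long we are willing to wait.
    """
    if active_clients <= threshold:
        return True
    # Without a maximum wait, keep draining until traffic drops.
    return max_wait_s is not None and elapsed_s >= max_wait_s
```

With max_wait_s left unset, the decision depends on traffic alone, which is the behavior a fixed timeout cannot express.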
"traffic_ctl server stop --drain" is also in my mind. Once we have one
generalized way for draining in traffic_server, we can invoke it from
anywhere we want (e.g. from traffic_manager, TS API, metric). Currently it
is invoked using the flag, but eventually, we should be able to decouple
draining and shutdown, and should also be able to remove the flag and
sleep().
Doing these all at once is heavy and the PR would be big, and it slows
review and progress, IMO. You might want full-featured graceful shutdown
but this is a reason why I didn't ask to do it.
I’m fine with (and even prefer) doing this in pieces, but we need to
agree on the direction and that the end result will be what we want.
Thanks,
Masakazu
On Tue, May 9, 2017 at 2:40 PM, CrazyCow <zhangzizhong0...@gmail.com> wrote:
I don't disagree. The reasons I chose this way are:
1. We are using other stuff in my team instead of traffic_ctl to manage the
process and do the upgrade.
2. traffic_ctl --drain can only support HTTP and it can only be used when
restarting ATS. That makes it hardly useful in our use case.
2017-05-08 22:16 GMT-07:00 James Peach <jpe...@apache.org>:
On 8 May 2017, at 21:48, Miles Libbey wrote:
We'd also like a bit more fine-grained control in the process -- we
frequently want to perform maintenance on a server (upgrading ATS;
upgrading the OS; performing hardware changes, etc.) after draining but
before restarting ATS. I suppose this would mean allowing the --drain
option to apply to traffic_ctl server stop.
Yes, I agree that draining makes sense for stop as well as restart.
miles
On Mon, May 8, 2017 at 8:19 PM, James Peach <jpe...@apache.org> wrote:
This patch adds another separate shutdown mechanism that only works for
HTTP/2. I think that we really ought to have a single, well-defined graceful
shutdown that works for all protocols.
In HTTP/1.1 it works like this:
- You (possibly dynamically) set proxy.config.restart.active_client_threshold
- You run “traffic_ctl server restart --drain”
Then, once client connections have been drained to the threshold,
traffic_server restarts. Note that this assumes that you have some
additional orchestration that triggers header_rewrite to inject a
“Connection: close” header, and tells the GSLB to stop sending new
connections.
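The two operator steps above could be scripted roughly as follows. This Python sketch only builds the command lines and does not run them. It assumes the threshold is set dynamically with "traffic_ctl config set"; the helper name and the example value are hypothetical, and the header_rewrite and GSLB orchestration is out of scope.

```python
THRESHOLD_RECORD = "proxy.config.restart.active_client_threshold"


def drain_restart_commands(threshold: int) -> list:
    """Build (but do not execute) the two traffic_ctl invocations.

    Assumption: the record is set at runtime via "traffic_ctl config set".
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    return [
        # Step 1: set the active-client threshold.
        ["traffic_ctl", "config", "set", THRESHOLD_RECORD, str(threshold)],
        # Step 2: ask for a restart once connections drain to the threshold.
        ["traffic_ctl", "server", "restart", "--drain"],
    ]
```

Each inner list can be passed to subprocess.run() as an argument vector on a host where traffic_ctl is installed.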
After this HTTP/2 change, there is a new graceful shutdown path:
- You (at startup only) set proxy.config.stop.shutdown_timeout
- You send a signal to traffic_server
This flips a global variable to the “drain” state, then sleeps in the signal
handler until the timeout is reached. In HTTP/2 only, any new connections
will be accepted and then immediately closed.
To rationalize these disparate approaches, I suggest that we go back to the
traffic_ctl methodology, and enhance it so that it sends a message to
traffic_server that puts it into a known draining state. This should be
published in a metric and should be reversible so you can abort the drain.
The metric can be observed by HTTP/1.1 and HTTP/2 to take appropriate
action. We should also add a new setting
“proxy.config.restart.active_client_timeout” (or something like that) to
handle the maximum time to wait for traffic to drain.
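One way to picture the proposed draining state is the sketch below: a single object that can enter and leave the drain, exposes a 0/1 value suitable for publishing as a metric, and decides when a restart may proceed. It is written in Python for illustration only. DrainController and its methods are hypothetical, and the threshold and timeout arguments merely mirror the settings named in this message.

```python
import time
from typing import Callable, Optional


class DrainController:
    """Hypothetical process-wide drain state; not an existing ATS class."""

    def __init__(self, threshold: int, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.threshold = threshold    # cf. active_client_threshold
        self.timeout_s = timeout_s    # cf. the proposed active_client_timeout
        self._clock = clock
        self._started_at: Optional[float] = None

    def start(self) -> None:
        """Enter the draining state; repeated calls keep the first start time."""
        if self._started_at is None:
            self._started_at = self._clock()

    def abort(self) -> None:
        """Leave the draining state, making the drain reversible."""
        self._started_at = None

    @property
    def draining_metric(self) -> int:
        """Value to publish: 1 while draining, otherwise 0."""
        return 0 if self._started_at is None else 1

    def ready_to_restart(self, active_clients: int) -> bool:
        """True once traffic has drained to the threshold or the timeout hit."""
        if self._started_at is None:
            return False
        if active_clients <= self.threshold:
            return True
        return self._clock() - self._started_at >= self.timeout_s
```

Protocol handlers would read draining_metric (or the underlying state) and react in their own way, while the restart path polls ready_to_restart(). Injecting the clock keeps the timeout logic testable without sleeping.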
I’m not sure whether it is a good idea to close connections in HTTP/2 while
we are in draining state. If there is a desire for this, I would like it to
be configurable (defaulting to off).
On 8 May 2017, at 18:17, Masakazu Kitajo wrote:
Merged #1710.
--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/trafficserver/pull/1710#event-1073784726