On 9 May 2017, at 0:02, Masakazu Kitajo wrote:
Hi,
Don't worry guys. I'm working on it.
I merged the PR to have a draining (shedding) logic for HTTP/2 (Double
GOAWAY frames), and now I'm generalizing the flag so that we can
invoke
draining on both HTTP/1 and HTTP/2. I'm going to add "Connection:
close"
header automatically for HTTP/1. So,
"proxy.config.stop.shutdown_timeout"
will be available for both versions. The PR was step 1, and it will be
step
2.
I think that this should be "proxy.config.restart.shutdown_timeout" or
something like that. It is best to keep configuration variable in
smaller hierarchies to improve discoverability. I definitely think you
should add a new configuration if you want to keep the code that closes
connections while draining.
"traffic_ctl server stop --drain" is also in my mind. Once we have one
generalized way for draining in traffic_server, we can invoke it from
anywhere we want (e.g. from traffic_manager, TS API, metric).
Currently it
is invoked using the flag, but eventually, we should be able to
decouple
draining and shutdown, and should also be able to remove the flag and
sleep().
I guess the sleeps will back up every time you get a signal?
Doing these all at once is heavy and the PR would be big, and it slows
review and progress, IMO. You might want full featured graceful
shutdown
but this is a reason why I didn't ask to do it.
Ok, thanks for clarifying.
Thanks,
Masakazu
On Tue, May 9, 2017 at 2:40 PM, CrazyCow <zhangzizhong0...@gmail.com>
wrote:
I don't disagree. The reasons I chose this way are:
1. We are using other stuff in my team instead of traffic_ctl to
manage the
process and do the upgrade.
2. traffic_ctl --drain can only support HTTP and it can only be used
when
restarting ATS. That makes it hardly useful in our use case.
2017-05-08 22:16 GMT-07:00 James Peach <jpe...@apache.org>:
On 8 May 2017, at 21:48, Miles Libbey wrote:
We'd also like a bit more fine grained control in the process -- we
frequently want to perform maintenance on a server (upgrading ATS;
upgrading the OS; performing hardware changes, etc) after draining
but
before restarting ATS. I suppose this would mean allowing the
--drain
option to apply to traffic_Ctl server stop.
Yes I agree that draining make sense for stop as well as restart.
miles
On Mon, May 8, 2017 at 8:19 PM, James Peach <jpe...@apache.org>
wrote:
This patch adds another separate shutdown mechanism that only
works for
HTTP/2. I think that we really ought to have a single,
well-defined
graceful
shutdown that works for all protocols.
In HTTP/1.1 it works like this:
- You (possibly dynamically) set
proxy.config.restart.active_client_threshold
- You run “traffic_ctl server restart —drain”
Then, once client connections have been drained to the threshold,
traffic_server restarts. Note that this assumes that you have some
additional orchestration that triggers header_rewrite to inject a
“Connection: close” header, and tell the GSLB to stop sending
new
connections.
After this HTTP/2 change, there is a new graceful shutdown path
- You (at startup only) set
proxy.config.stop.shutdown_timeout
- You send a signal to traffic_server
This flips a global variable to the “drain” state, then sleeps
in the
signal
handler until the timeout is reached. In HTTP/2 only, any new
connections
will be accepted and then immediately closed.
To rationalize these disparate approaches, I suggest that we go
back to
the
traffic_ctl methodology, and enhance it so that it sends a message
to
traffic_server that puts it into a known draining state. This
should be
published in a metric and should be reversible so you can abort
the
drain.
The metric can be observed by HTTP/1.1 and HTTP/2 to take
appropriate
action. We should also add a new setting
“proxy.config.restart.active_client_timeout” (or something
like that)
to
handle the maximum time to wait for traffic to drain.
I’m not sure whether it is a good idea to close connections in
HTTP/2
while
we are in draining state. If there is a desire for this, I would
like
it
to
be configurable (defaulting to off).
On 8 May 2017, at 18:17, Masakazu Kitajo wrote:
Merged #1710.
--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/trafficserver/pull/1710#event-1073784726