Re: [apache/trafficserver] HTTP2 drain (#1710)

Masakazu Kitajo Tue, 09 May 2017 08:36:31 -0700

On Tue, May 9, 2017 at 11:12 PM, James Peach <jpe...@apache.org> wrote:


>
>
> On 9 May 2017, at 0:02, Masakazu Kitajo wrote:
>
> Hi,
>>
>> Don't worry guys. I'm working on it.
>>
>> I merged the PR to have a draining (shedding) logic for HTTP/2 (Double
>> GOAWAY frames), and now I'm generalizing the flag so that we can invoke
>> draining on both HTTP/1 and HTTP/2. I'm going to add "Connection: close"
>> header automatically for HTTP/1. So, "proxy.config.stop.shutdown_timeout"
>> will be available for both versions. The PR was step 1, and it will be
>> step
>> 2.
>>
>
> I think that this should be "proxy.config.restart.shutdown_timeout" or
> something like that. It is best to keep configuration variable in smaller
> hierarchies to improve discoverability. I definitely think you should add a
> new configuration if you want to keep the code that closes connections
> while draining.


I agree that keeping configuration variable in smaller hierarchies, but I
don't see the difference between stop.shutdown_timeout and
restart.shutdown_timeout. Sending "Connection: close" and sending two
GOAWAY frames are the same thing. They both indicate "no more transactions
on this session", which suggest a client to close the connection.


>
>
> "traffic_ctl server stop --drain" is also in my mind. Once we have one
>> generalized way for draining in traffic_server, we can invoke it from
>> anywhere we want (e.g. from traffic_manager, TS API, metric). Currently it
>> is invoked using the flag, but eventually, we should be able to decouple
>> draining and shutdown, and should also be able to remove the flag and
>> sleep().
>>
>
> I guess the sleeps will back up every time you get a signal?


You mean the sleep() will be called every time? We should avoid that,
currently it may happen though.


>
>
> Doing these all at once is heavy and the PR would be big, and it slows
>> review and progress, IMO. You might want full featured graceful shutdown
>> but this is a reason why I didn't ask to do it.
>>
>
> Ok, thanks for clarifying.
>
>
>
>> Thanks,
>> Masakazu
>>
>>
>>
>> On Tue, May 9, 2017 at 2:40 PM, CrazyCow <zhangzizhong0...@gmail.com>
>> wrote:
>>
>> I don't disagree. The reasons I chose this way are:
>>> 1. We are using other stuff in my team instead of traffic_ctl to manage
>>> the
>>> process and do the upgrade.
>>> 2. traffic_ctl --drain can only support HTTP and it can only be used when
>>> restarting ATS. That makes it hardly useful in our use case.
>>>
>>> 2017-05-08 22:16 GMT-07:00 James Peach <jpe...@apache.org>:
>>>
>>>
>>>>
>>>> On 8 May 2017, at 21:48, Miles Libbey wrote:
>>>>
>>>> We'd also like a bit more fine grained control in the process -- we
>>>>
>>>>> frequently want to perform maintenance on a server (upgrading ATS;
>>>>> upgrading the OS; performing hardware changes, etc) after draining but
>>>>> before restarting ATS. I suppose this would mean allowing the --drain
>>>>> option to apply to traffic_Ctl server stop.
>>>>>
>>>>>
>>>> Yes I agree that draining make sense for stop as well as restart.
>>>>
>>>>
>>>>
>>>> miles
>>>>>
>>>>> On Mon, May 8, 2017 at 8:19 PM, James Peach <jpe...@apache.org> wrote:
>>>>>
>>>>> This patch adds another separate shutdown mechanism that only works for
>>>>>> HTTP/2. I think that we really ought to have a single, well-defined
>>>>>> graceful
>>>>>> shutdown that works for all protocols.
>>>>>>
>>>>>> In HTTP/1.1 it works like this:
>>>>>>         - You (possibly dynamically) set
>>>>>> proxy.config.restart.active_client_threshold
>>>>>>         - You run “traffic_ctl server restart —drain”
>>>>>>
>>>>>> Then, once client connections have been drained to the threshold,
>>>>>> traffic_server restarts. Note that this assumes that you have some
>>>>>> additional orchestration that triggers header_rewrite to inject a
>>>>>> “Connection: close” header, and tell the GSLB to stop sending new
>>>>>> connections.
>>>>>>
>>>>>> After this HTTP/2 change, there is a new graceful shutdown path
>>>>>>         - You (at startup only) set proxy.config.stop.shutdown_tim
>>>>>> eout
>>>>>>         - You send a signal to traffic_server
>>>>>>
>>>>>> This flips a global variable to the “drain” state, then sleeps in the
>>>>>> signal
>>>>>> handler until the timeout is reached. In HTTP/2 only, any new
>>>>>>
>>>>> connections
>>>
>>>> will be accepted and then immediately closed.
>>>>>>
>>>>>> To rationalize these disparate approaches, I suggest that we go back
>>>>>> to
>>>>>> the
>>>>>> traffic_ctl methodology, and enhance it so that it sends a message to
>>>>>> traffic_server that puts it into a known draining state. This should
>>>>>> be
>>>>>> published in a metric and should be reversible so you can abort the
>>>>>> drain.
>>>>>> The metric can be observed by HTTP/1.1 and HTTP/2 to take appropriate
>>>>>> action. We should also add a new setting
>>>>>> “proxy.config.restart.active_client_timeout” (or something like that)
>>>>>>
>>>>> to
>>>
>>>> handle the maximum time to wait for traffic to drain.
>>>>>>
>>>>>> I’m not sure whether it is a good idea to close connections in HTTP/2
>>>>>> while
>>>>>> we are in draining state. If there is a desire for this, I would like
>>>>>>
>>>>> it
>>>
>>>> to
>>>>>> be configurable (defaulting to off).
>>>>>>
>>>>>> On 8 May 2017, at 18:17, Masakazu Kitajo wrote:
>>>>>>
>>>>>> Merged #1710.
>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> You are receiving this because you are subscribed to this thread.
>>>>>>> Reply to this email directly or view it on GitHub:
>>>>>>> https://github.com/apache/trafficserver/pull/1710#event-1073784726
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>

Re: [apache/trafficserver] HTTP2 drain (#1710)

Reply via email to