Re: IEP-83 Thin Client Keepalive (heartbeat)

Pavel Tupitsyn Mon, 07 Feb 2022 06:24:03 -0800

Maksim, agree. Let's not be too clever and only log a warning.

On Mon, Feb 7, 2022 at 5:23 PM Pavel Tupitsyn <[email protected]> wrote:


> Ivan, idleTimeout already exists, I don't think we should change the way
> it works (or did I misunderstand you?)
>
> Of course, enabling heartbeats means that otherwise idle clients will no
> longer be disconnected by the server.
> I think we should cross-link those properties in the documentation and
> explain this behavior.
>
> On Mon, Feb 7, 2022 at 4:39 PM Ivan Daschinsky <[email protected]>
> wrote:
>
>> >>3. Already implemented: when ClientConnectorConfiguration#idleTimeout is
>> not zero, server disconnects idle clients
>> >>
>> But I suppose it would be great to have:
>> 1. If client supports keep alive, use idleTimeout
>> 2. If not, do not use it.
>>
>> But I am not sure if it is correct or not.
>>
>> пн, 7 февр. 2022 г. в 16:01, Maksim Timonin <[email protected]>:
>>
>> > I believe explicit is better than implicit :) Also in case of dynamic
>> > calculation of timeout, it can change dynamically, for example
>> restarting a
>> > cluster with different configuration should reconfigure clients too.
>> Looks
>> > complicated.
>> >
>> > My vote for WARN + javadocs with mention of this issue.
>> >
>> > On Mon, Feb 7, 2022 at 3:51 PM Pavel Tupitsyn <[email protected]>
>> > wrote:
>> >
>> > > > WDYT, should we add a WARN message for clients that configure
>> > > > keepAliveTimeout greater than idleTimeout on the server side?
>> > >
>> > > I think we should either log a WARN, or retrieve idleTimeout from
>> server
>> > > and configure heartbeatTimeout accordingly (e.g. divide by 2).
>> > > Thoughts?
>> > >
>> > > On Mon, Feb 7, 2022 at 3:14 PM Maksim Timonin <
>> [email protected]>
>> > > wrote:
>> > >
>> > > > Hi Pavel,
>> > > >
>> > > > Thanks for the links. Yes, I forgot that the flag of changed
>> topology
>> > is
>> > > > lazy. Also I missed that the keepAlive setting is configured on the
>> > > client
>> > > > side (alternatively to idleTimeout that is on the server side).
>> > > >
>> > > > Now I understand, this feature can be helpful then. Every client can
>> > > > configure itself in case it's possible to be idle sometimes, and
>> choose
>> > > > an appropriate timeout by itself too. And by default the feature
>> should
>> > > be
>> > > > disabled.
>> > > >
>> > > > WDYT, should we add a WARN message for clients that configure
>> > > > keepAliveTimeout greater than idleTimeout on the server side?
>> > > >
>> > > >
>> > > >
>> > > > On Mon, Feb 7, 2022 at 1:05 PM Pavel Tupitsyn <[email protected]
>> >
>> > > > wrote:
>> > > >
>> > > > > Ivan,
>> > > > >
>> > > > > I suggest the following:
>> > > > >
>> > > > > 1. Server sends KEEP_ALIVE feature flag, which means it accepts
>> > > > > OP_KEEP_ALIVE empty message
>> > > > > 2. Client sends OP_KEEP_ALIVE when the connection is idle for a
>> > > > > certain period of time
>> > > > > 3. Already implemented: when
>> ClientConnectorConfiguration#idleTimeout
>> > > is
>> > > > > not zero, server disconnects idle clients
>> > > > >
>> > > > > This way we don't need server->client keepalives, as you correctly
>> > > noted.
>> > > > >
>> > > > > On Mon, Feb 7, 2022 at 12:43 PM Ivan Daschinsky <
>> [email protected]
>> > >
>> > > > > wrote:
>> > > > >
>> > > > > > Pavel, I suppose that ideally:
>> > > > > > 1. Client send in handshake flag, that it supports KEEP_ALIVE
>> > feature
>> > > > and
>> > > > > > server takes it into account.
>> > > > > > 2. Each request of client can be considered as keep-alive ping.
>> > > > > > 3. Client send failure should be processed using retry policy.
>> > > > > > 4. Server should not send keep-alive packets, it is redundant,
>> but
>> > > > server
>> > > > > > should track requests from client and if there is no requests
>> from
>> > > > client
>> > > > > > with KEEP_ALIVE feature,
>> > > > > > automatically close connection and free resources.
>> > > > > >
>> > > > > > Similar approach is used in zookeeper clients.
>> > > > > >
>> > > > > > пн, 7 февр. 2022 г. в 12:24, Pavel Tupitsyn <
>> [email protected]
>> > >:
>> > > > > >
>> > > > > > > Ivan,
>> > > > > > >
>> > > > > > > Ideally, the check should come from both sides.
>> > > > > > > - Client periodically sends keepalive to server
>> > > > > > > - Server periodically sends keepalive to client
>> > > > > > >
>> > > > > > > Feature flags will be added accordingly, so it is not
>> necessary
>> > to
>> > > > > > > implement this in all thin clients.
>> > > > > > >
>> > > > > > > On Mon, Feb 7, 2022 at 11:43 AM Ivan Daschinsky <
>> > > [email protected]
>> > > > >
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > I suppose it is great idea, but this functionality can be
>> hard
>> > to
>> > > > > > > implement
>> > > > > > > > for some platforms. I.e. sync python client or php (there
>> is no
>> > > > real
>> > > > > > > > multithreading for python (GIL) and php is single threaded
>> by
>> > > > > design).
>> > > > > > > But
>> > > > > > > > for async clients it is not very hard to implement.
>> > Nevertheless,
>> > > > > this
>> > > > > > > > feature should be optional, because of possible technical
>> > > > > limitations.
>> > > > > > > >
>> > > > > > > > Pavel, is this check mostly for client side? Or servers can
>> do
>> > > some
>> > > > > > > actions
>> > > > > > > > if there is no activity from thin client (i.e. closing
>> context
>> > > and
>> > > > > free
>> > > > > > > > resources such as queries' handles and so on?)
>> > > > > > > >
>> > > > > > > > пн, 7 февр. 2022 г. в 11:09, Pavel Tupitsyn <
>> > > [email protected]
>> > > > >:
>> > > > > > > >
>> > > > > > > > > Hi Maksim,
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > half-state is a possible situation when an Ignite node
>> goes
>> > > > down
>> > > > > or
>> > > > > > > > > somehow removes connection to a thin client
>> > > > > > > > >
>> > > > > > > > > Half-open state is also possible when, for example, an
>> > > > intermediate
>> > > > > > > > router
>> > > > > > > > > is rebooted [1].
>> > > > > > > > >
>> > > > > > > > > This is what we seem to have encountered with one of our
>> > > > customers
>> > > > > -
>> > > > > > > they
>> > > > > > > > > have a stable cluster, and long-living (multiple days)
>> thin
>> > > > client
>> > > > > > > > > connections which can be idle for some time.
>> > > > > > > > > And only when we send some data on such an idle
>> connection do
>> > > we
>> > > > > > > discover
>> > > > > > > > > that it is broken.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > But with enabled (true by default) partitionAwareness
>> > feature
>> > > > > > clients
>> > > > > > > > can
>> > > > > > > > > be notified about topology changes
>> > > > > > > > >
>> > > > > > > > > Partition awareness is a "lazy" notification in a form of
>> a
>> > > > > response
>> > > > > > > > > message flag [2].
>> > > > > > > > > You won't get one on an idle connection.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > the connections are removed on the server side by client
>> > idle
>> > > > > > timeout
>> > > > > > > > >
>> > > > > > > > > Idle timeout is disabled by default.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > is it OK to keep such connections alive for a long time
>> > > > > > > > >
>> > > > > > > > > I think it is up to the user.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > > in the case of partition awareness features it can lead
>> to
>> > > > > wasting
>> > > > > > > TCP
>> > > > > > > > > sockets on Ignite nodes, can't it
>> > > > > > > > >
>> > > > > > > > > Can you please elaborate?
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > [1]
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://blog.stephencleary.com/2009/05/detection-of-half-open-dropped.html
>> > > > > > > > > [2]
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-23%3A+Best+Effort+Affinity+for+Thin+Clients
>> > > > > > > > >
>> > > > > > > > > On Fri, Feb 4, 2022 at 4:01 PM Maksim Timonin <
>> > > > > > [email protected]
>> > > > > > > >
>> > > > > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > > Hi Pavel,
>> > > > > > > > > >
>> > > > > > > > > > Thanks for starting this thread! Can I ask some
>> questions
>> > > here
>> > > > to
>> > > > > > get
>> > > > > > > > the
>> > > > > > > > > > feature more clearly?
>> > > > > > > > > >
>> > > > > > > > > > As I understand it correctly, half-state is a possible
>> > > > situation
>> > > > > > when
>> > > > > > > > an
>> > > > > > > > > > Ignite node goes down or somehow removes connection to a
>> > thin
>> > > > > > client.
>> > > > > > > > But
>> > > > > > > > > > with enabled (true by default) partitionAwareness
>> feature
>> > > > clients
>> > > > > > can
>> > > > > > > > be
>> > > > > > > > > > notified about topology changes. So, there are possible
>> > > cases:
>> > > > > > > > > > 1. ThinClient connects to a single node.
>> > > > > > > > > > 2. Ignite node removes connection from itself.
>> > > > > > > > > >
>> > > > > > > > > > I like the idea for the case with a single node, as it
>> > helps
>> > > > fail
>> > > > > > > fast.
>> > > > > > > > > > But is it OK to connect a client to a single node only?
>> > > > > > > > > >
>> > > > > > > > > > For the second one: you mention that a case for the
>> second
>> > > > option
>> > > > > > is
>> > > > > > > > > > "Long-living and mostly idle connections are especially
>> > > > > susceptible
>> > > > > > > to
>> > > > > > > > > this
>> > > > > > > > > > behavior". If I understand correctly the connections are
>> > > > removed
>> > > > > on
>> > > > > > > the
>> > > > > > > > > > server side by client idle timeout. Can we just
>> configure
>> > the
>> > > > > idle
>> > > > > > > > > timeout
>> > > > > > > > > > for cases where we really need keeping alive idle
>> > > connections?
>> > > > > Are
>> > > > > > > > there
>> > > > > > > > > > any other cases with unexpectedly dropped connections?
>> > > > > > > > > >
>> > > > > > > > > > I'm wondering is it OK to keep such connections alive
>> for a
>> > > > long
>> > > > > > > time?
>> > > > > > > > > > Also in the case of partition awareness features it can
>> > lead
>> > > to
>> > > > > > > wasting
>> > > > > > > > > TCP
>> > > > > > > > > > sockets on Ignite nodes, can't it?
>> > > > > > > > > >
>> > > > > > > > > > Thanks!
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > On Thu, Feb 3, 2022 at 2:24 PM Pavel Tupitsyn <
>> > > > > > [email protected]>
>> > > > > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > >> Igniters,
>> > > > > > > > > >>
>> > > > > > > > > >> Please review the proposal to add heartbeat messages to
>> > the
>> > > > thin
>> > > > > > > > client
>> > > > > > > > > >> protocol (both 2.x and 3.x) and let me know your
>> thoughts:
>> > > > > > > > > >>
>> > > > > > > > > >>
>> > > > > > > > > >>
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-83+Thin+Client+Keepalive
>> > > > > > > > > >>
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > --
>> > > > > > > > Sincerely yours, Ivan Daschinskiy
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Sincerely yours, Ivan Daschinskiy
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>>
>> --
>> Sincerely yours, Ivan Daschinskiy
>>
>

Re: IEP-83 Thin Client Keepalive (heartbeat)

Reply via email to