> WDYT, should we add a WARN message for clients that configure > keepAliveTimeout greater than idleTimeout on the server side?
I think we should either log a WARN, or retrieve idleTimeout from server and configure heartbeatTimeout accordingly (e.g. divide by 2). Thoughts? On Mon, Feb 7, 2022 at 3:14 PM Maksim Timonin <timoninma...@apache.org> wrote: > Hi Pavel, > > Thanks for the links. Yes, I forgot that the flag of changed topology is > lazy. Also I missed that the keepAlive setting is configured on the client > side (alternatively to idleTimeout that is on the server side). > > Now I understand, this feature can be helpful then. Every client can > configure itself in case it's possible to be idle sometimes, and choose > an appropriate timeout by itself too. And by default the feature should be > disabled. > > WDYT, should we add a WARN message for clients that configure > keepAliveTimeout greater than idleTimeout on the server side? > > > > On Mon, Feb 7, 2022 at 1:05 PM Pavel Tupitsyn <ptupit...@apache.org> > wrote: > > > Ivan, > > > > I suggest the following: > > > > 1. Server sends KEEP_ALIVE feature flag, which means it accepts > > OP_KEEP_ALIVE empty message > > 2. Client sends OP_KEEP_ALIVE when the connection is idle for a > > certain period of time > > 3. Already implemented: when ClientConnectorConfiguration#idleTimeout is > > not zero, server disconnects idle clients > > > > This way we don't need server->client keepalives, as you correctly noted. > > > > On Mon, Feb 7, 2022 at 12:43 PM Ivan Daschinsky <ivanda...@gmail.com> > > wrote: > > > > > Pavel, I suppose that ideally: > > > 1. Client send in handshake flag, that it supports KEEP_ALIVE feature > and > > > server takes it into account. > > > 2. Each request of client can be considered as keep-alive ping. > > > 3. Client send failure should be processed using retry policy. > > > 4. Server should not send keep-alive packets, it is redundant, but > server > > > should track requests from client and if there is no requests from > client > > > with KEEP_ALIVE feature, > > > automatically close connection and free resources. > > > > > > Similar approach is used in zookeeper clients. > > > > > > пн, 7 февр. 2022 г. в 12:24, Pavel Tupitsyn <ptupit...@apache.org>: > > > > > > > Ivan, > > > > > > > > Ideally, the check should come from both sides. > > > > - Client periodically sends keepalive to server > > > > - Server periodically sends keepalive to client > > > > > > > > Feature flags will be added accordingly, so it is not necessary to > > > > implement this in all thin clients. > > > > > > > > On Mon, Feb 7, 2022 at 11:43 AM Ivan Daschinsky <ivanda...@gmail.com > > > > > > wrote: > > > > > > > > > I suppose it is great idea, but this functionality can be hard to > > > > implement > > > > > for some platforms. I.e. sync python client or php (there is no > real > > > > > multithreading for python (GIL) and php is single threaded by > > design). > > > > But > > > > > for async clients it is not very hard to implement. Nevertheless, > > this > > > > > feature should be optional, because of possible technical > > limitations. > > > > > > > > > > Pavel, is this check mostly for client side? Or servers can do some > > > > actions > > > > > if there is no activity from thin client (i.e. closing context and > > free > > > > > resources such as queries' handles and so on?) > > > > > > > > > > пн, 7 февр. 2022 г. в 11:09, Pavel Tupitsyn <ptupit...@apache.org > >: > > > > > > > > > > > Hi Maksim, > > > > > > > > > > > > > > > > > > > half-state is a possible situation when an Ignite node goes > down > > or > > > > > > somehow removes connection to a thin client > > > > > > > > > > > > Half-open state is also possible when, for example, an > intermediate > > > > > router > > > > > > is rebooted [1]. > > > > > > > > > > > > This is what we seem to have encountered with one of our > customers > > - > > > > they > > > > > > have a stable cluster, and long-living (multiple days) thin > client > > > > > > connections which can be idle for some time. > > > > > > And only when we send some data on such an idle connection do we > > > > discover > > > > > > that it is broken. > > > > > > > > > > > > > > > > > > > But with enabled (true by default) partitionAwareness feature > > > clients > > > > > can > > > > > > be notified about topology changes > > > > > > > > > > > > Partition awareness is a "lazy" notification in a form of a > > response > > > > > > message flag [2]. > > > > > > You won't get one on an idle connection. > > > > > > > > > > > > > > > > > > > the connections are removed on the server side by client idle > > > timeout > > > > > > > > > > > > Idle timeout is disabled by default. > > > > > > > > > > > > > > > > > > > is it OK to keep such connections alive for a long time > > > > > > > > > > > > I think it is up to the user. > > > > > > > > > > > > > > > > > > > in the case of partition awareness features it can lead to > > wasting > > > > TCP > > > > > > sockets on Ignite nodes, can't it > > > > > > > > > > > > Can you please elaborate? > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > https://blog.stephencleary.com/2009/05/detection-of-half-open-dropped.html > > > > > > [2] > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-23%3A+Best+Effort+Affinity+for+Thin+Clients > > > > > > > > > > > > On Fri, Feb 4, 2022 at 4:01 PM Maksim Timonin < > > > timoninma...@apache.org > > > > > > > > > > > wrote: > > > > > > > > > > > > > Hi Pavel, > > > > > > > > > > > > > > Thanks for starting this thread! Can I ask some questions here > to > > > get > > > > > the > > > > > > > feature more clearly? > > > > > > > > > > > > > > As I understand it correctly, half-state is a possible > situation > > > when > > > > > an > > > > > > > Ignite node goes down or somehow removes connection to a thin > > > client. > > > > > But > > > > > > > with enabled (true by default) partitionAwareness feature > clients > > > can > > > > > be > > > > > > > notified about topology changes. So, there are possible cases: > > > > > > > 1. ThinClient connects to a single node. > > > > > > > 2. Ignite node removes connection from itself. > > > > > > > > > > > > > > I like the idea for the case with a single node, as it helps > fail > > > > fast. > > > > > > > But is it OK to connect a client to a single node only? > > > > > > > > > > > > > > For the second one: you mention that a case for the second > option > > > is > > > > > > > "Long-living and mostly idle connections are especially > > susceptible > > > > to > > > > > > this > > > > > > > behavior". If I understand correctly the connections are > removed > > on > > > > the > > > > > > > server side by client idle timeout. Can we just configure the > > idle > > > > > > timeout > > > > > > > for cases where we really need keeping alive idle connections? > > Are > > > > > there > > > > > > > any other cases with unexpectedly dropped connections? > > > > > > > > > > > > > > I'm wondering is it OK to keep such connections alive for a > long > > > > time? > > > > > > > Also in the case of partition awareness features it can lead to > > > > wasting > > > > > > TCP > > > > > > > sockets on Ignite nodes, can't it? > > > > > > > > > > > > > > Thanks! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Feb 3, 2022 at 2:24 PM Pavel Tupitsyn < > > > ptupit...@apache.org> > > > > > > > wrote: > > > > > > > > > > > > > >> Igniters, > > > > > > >> > > > > > > >> Please review the proposal to add heartbeat messages to the > thin > > > > > client > > > > > > >> protocol (both 2.x and 3.x) and let me know your thoughts: > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-83+Thin+Client+Keepalive > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Sincerely yours, Ivan Daschinskiy > > > > > > > > > > > > > > > > > > -- > > > Sincerely yours, Ivan Daschinskiy > > > > > >