Maksim, agree. Let's not be too clever and only log a warning. On Mon, Feb 7, 2022 at 5:23 PM Pavel Tupitsyn <ptupit...@apache.org> wrote:
> Ivan, idleTimeout already exists, I don't think we should change the way > it works (or did I misunderstand you?) > > Of course, enabling heartbeats means that otherwise idle clients will no > longer be disconnected by the server. > I think we should cross-link those properties in the documentation and > explain this behavior. > > On Mon, Feb 7, 2022 at 4:39 PM Ivan Daschinsky <ivanda...@gmail.com> > wrote: > >> >>3. Already implemented: when ClientConnectorConfiguration#idleTimeout is >> not zero, server disconnects idle clients >> >> >> But I suppose it would be great to have: >> 1. If client supports keep alive, use idleTimeout >> 2. If not, do not use it. >> >> But I am not sure if it is correct or not. >> >> пн, 7 февр. 2022 г. в 16:01, Maksim Timonin <timoninma...@apache.org>: >> >> > I believe explicit is better than implicit :) Also in case of dynamic >> > calculation of timeout, it can change dynamically, for example >> restarting a >> > cluster with different configuration should reconfigure clients too. >> Looks >> > complicated. >> > >> > My vote for WARN + javadocs with mention of this issue. >> > >> > On Mon, Feb 7, 2022 at 3:51 PM Pavel Tupitsyn <ptupit...@apache.org> >> > wrote: >> > >> > > > WDYT, should we add a WARN message for clients that configure >> > > > keepAliveTimeout greater than idleTimeout on the server side? >> > > >> > > I think we should either log a WARN, or retrieve idleTimeout from >> server >> > > and configure heartbeatTimeout accordingly (e.g. divide by 2). >> > > Thoughts? >> > > >> > > On Mon, Feb 7, 2022 at 3:14 PM Maksim Timonin < >> timoninma...@apache.org> >> > > wrote: >> > > >> > > > Hi Pavel, >> > > > >> > > > Thanks for the links. Yes, I forgot that the flag of changed >> topology >> > is >> > > > lazy. Also I missed that the keepAlive setting is configured on the >> > > client >> > > > side (alternatively to idleTimeout that is on the server side). >> > > > >> > > > Now I understand, this feature can be helpful then. Every client can >> > > > configure itself in case it's possible to be idle sometimes, and >> choose >> > > > an appropriate timeout by itself too. And by default the feature >> should >> > > be >> > > > disabled. >> > > > >> > > > WDYT, should we add a WARN message for clients that configure >> > > > keepAliveTimeout greater than idleTimeout on the server side? >> > > > >> > > > >> > > > >> > > > On Mon, Feb 7, 2022 at 1:05 PM Pavel Tupitsyn <ptupit...@apache.org >> > >> > > > wrote: >> > > > >> > > > > Ivan, >> > > > > >> > > > > I suggest the following: >> > > > > >> > > > > 1. Server sends KEEP_ALIVE feature flag, which means it accepts >> > > > > OP_KEEP_ALIVE empty message >> > > > > 2. Client sends OP_KEEP_ALIVE when the connection is idle for a >> > > > > certain period of time >> > > > > 3. Already implemented: when >> ClientConnectorConfiguration#idleTimeout >> > > is >> > > > > not zero, server disconnects idle clients >> > > > > >> > > > > This way we don't need server->client keepalives, as you correctly >> > > noted. >> > > > > >> > > > > On Mon, Feb 7, 2022 at 12:43 PM Ivan Daschinsky < >> ivanda...@gmail.com >> > > >> > > > > wrote: >> > > > > >> > > > > > Pavel, I suppose that ideally: >> > > > > > 1. Client send in handshake flag, that it supports KEEP_ALIVE >> > feature >> > > > and >> > > > > > server takes it into account. >> > > > > > 2. Each request of client can be considered as keep-alive ping. >> > > > > > 3. Client send failure should be processed using retry policy. >> > > > > > 4. Server should not send keep-alive packets, it is redundant, >> but >> > > > server >> > > > > > should track requests from client and if there is no requests >> from >> > > > client >> > > > > > with KEEP_ALIVE feature, >> > > > > > automatically close connection and free resources. >> > > > > > >> > > > > > Similar approach is used in zookeeper clients. >> > > > > > >> > > > > > пн, 7 февр. 2022 г. в 12:24, Pavel Tupitsyn < >> ptupit...@apache.org >> > >: >> > > > > > >> > > > > > > Ivan, >> > > > > > > >> > > > > > > Ideally, the check should come from both sides. >> > > > > > > - Client periodically sends keepalive to server >> > > > > > > - Server periodically sends keepalive to client >> > > > > > > >> > > > > > > Feature flags will be added accordingly, so it is not >> necessary >> > to >> > > > > > > implement this in all thin clients. >> > > > > > > >> > > > > > > On Mon, Feb 7, 2022 at 11:43 AM Ivan Daschinsky < >> > > ivanda...@gmail.com >> > > > > >> > > > > > > wrote: >> > > > > > > >> > > > > > > > I suppose it is great idea, but this functionality can be >> hard >> > to >> > > > > > > implement >> > > > > > > > for some platforms. I.e. sync python client or php (there >> is no >> > > > real >> > > > > > > > multithreading for python (GIL) and php is single threaded >> by >> > > > > design). >> > > > > > > But >> > > > > > > > for async clients it is not very hard to implement. >> > Nevertheless, >> > > > > this >> > > > > > > > feature should be optional, because of possible technical >> > > > > limitations. >> > > > > > > > >> > > > > > > > Pavel, is this check mostly for client side? Or servers can >> do >> > > some >> > > > > > > actions >> > > > > > > > if there is no activity from thin client (i.e. closing >> context >> > > and >> > > > > free >> > > > > > > > resources such as queries' handles and so on?) >> > > > > > > > >> > > > > > > > пн, 7 февр. 2022 г. в 11:09, Pavel Tupitsyn < >> > > ptupit...@apache.org >> > > > >: >> > > > > > > > >> > > > > > > > > Hi Maksim, >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > > half-state is a possible situation when an Ignite node >> goes >> > > > down >> > > > > or >> > > > > > > > > somehow removes connection to a thin client >> > > > > > > > > >> > > > > > > > > Half-open state is also possible when, for example, an >> > > > intermediate >> > > > > > > > router >> > > > > > > > > is rebooted [1]. >> > > > > > > > > >> > > > > > > > > This is what we seem to have encountered with one of our >> > > > customers >> > > > > - >> > > > > > > they >> > > > > > > > > have a stable cluster, and long-living (multiple days) >> thin >> > > > client >> > > > > > > > > connections which can be idle for some time. >> > > > > > > > > And only when we send some data on such an idle >> connection do >> > > we >> > > > > > > discover >> > > > > > > > > that it is broken. >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > > But with enabled (true by default) partitionAwareness >> > feature >> > > > > > clients >> > > > > > > > can >> > > > > > > > > be notified about topology changes >> > > > > > > > > >> > > > > > > > > Partition awareness is a "lazy" notification in a form of >> a >> > > > > response >> > > > > > > > > message flag [2]. >> > > > > > > > > You won't get one on an idle connection. >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > > the connections are removed on the server side by client >> > idle >> > > > > > timeout >> > > > > > > > > >> > > > > > > > > Idle timeout is disabled by default. >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > > is it OK to keep such connections alive for a long time >> > > > > > > > > >> > > > > > > > > I think it is up to the user. >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > > in the case of partition awareness features it can lead >> to >> > > > > wasting >> > > > > > > TCP >> > > > > > > > > sockets on Ignite nodes, can't it >> > > > > > > > > >> > > > > > > > > Can you please elaborate? >> > > > > > > > > >> > > > > > > > > >> > > > > > > > > [1] >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://blog.stephencleary.com/2009/05/detection-of-half-open-dropped.html >> > > > > > > > > [2] >> > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/IGNITE/IEP-23%3A+Best+Effort+Affinity+for+Thin+Clients >> > > > > > > > > >> > > > > > > > > On Fri, Feb 4, 2022 at 4:01 PM Maksim Timonin < >> > > > > > timoninma...@apache.org >> > > > > > > > >> > > > > > > > > wrote: >> > > > > > > > > >> > > > > > > > > > Hi Pavel, >> > > > > > > > > > >> > > > > > > > > > Thanks for starting this thread! Can I ask some >> questions >> > > here >> > > > to >> > > > > > get >> > > > > > > > the >> > > > > > > > > > feature more clearly? >> > > > > > > > > > >> > > > > > > > > > As I understand it correctly, half-state is a possible >> > > > situation >> > > > > > when >> > > > > > > > an >> > > > > > > > > > Ignite node goes down or somehow removes connection to a >> > thin >> > > > > > client. >> > > > > > > > But >> > > > > > > > > > with enabled (true by default) partitionAwareness >> feature >> > > > clients >> > > > > > can >> > > > > > > > be >> > > > > > > > > > notified about topology changes. So, there are possible >> > > cases: >> > > > > > > > > > 1. ThinClient connects to a single node. >> > > > > > > > > > 2. Ignite node removes connection from itself. >> > > > > > > > > > >> > > > > > > > > > I like the idea for the case with a single node, as it >> > helps >> > > > fail >> > > > > > > fast. >> > > > > > > > > > But is it OK to connect a client to a single node only? >> > > > > > > > > > >> > > > > > > > > > For the second one: you mention that a case for the >> second >> > > > option >> > > > > > is >> > > > > > > > > > "Long-living and mostly idle connections are especially >> > > > > susceptible >> > > > > > > to >> > > > > > > > > this >> > > > > > > > > > behavior". If I understand correctly the connections are >> > > > removed >> > > > > on >> > > > > > > the >> > > > > > > > > > server side by client idle timeout. Can we just >> configure >> > the >> > > > > idle >> > > > > > > > > timeout >> > > > > > > > > > for cases where we really need keeping alive idle >> > > connections? >> > > > > Are >> > > > > > > > there >> > > > > > > > > > any other cases with unexpectedly dropped connections? >> > > > > > > > > > >> > > > > > > > > > I'm wondering is it OK to keep such connections alive >> for a >> > > > long >> > > > > > > time? >> > > > > > > > > > Also in the case of partition awareness features it can >> > lead >> > > to >> > > > > > > wasting >> > > > > > > > > TCP >> > > > > > > > > > sockets on Ignite nodes, can't it? >> > > > > > > > > > >> > > > > > > > > > Thanks! >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > On Thu, Feb 3, 2022 at 2:24 PM Pavel Tupitsyn < >> > > > > > ptupit...@apache.org> >> > > > > > > > > > wrote: >> > > > > > > > > > >> > > > > > > > > >> Igniters, >> > > > > > > > > >> >> > > > > > > > > >> Please review the proposal to add heartbeat messages to >> > the >> > > > thin >> > > > > > > > client >> > > > > > > > > >> protocol (both 2.x and 3.x) and let me know your >> thoughts: >> > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/IGNITE/IEP-83+Thin+Client+Keepalive >> > > > > > > > > >> >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > > -- >> > > > > > > > Sincerely yours, Ivan Daschinskiy >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > > >> > > > > > -- >> > > > > > Sincerely yours, Ivan Daschinskiy >> > > > > > >> > > > > >> > > > >> > > >> > >> >> >> -- >> Sincerely yours, Ivan Daschinskiy >> >