What about "keepAlive", "keepAliveInterval" then? It looks more common and matches the IEP title :)
On Tue, Feb 15, 2022 at 5:54 PM Pavel Tupitsyn <ptupit...@apache.org> wrote: > To summarize, we add two properties to the ClientConfiguration: > bool heartbeatsEnabled = true; > long defaultHeartbeatInterval = 60_000; // Default 1 minute, used > > Logic: > if (heartbeatsEnabled) { > heartbeatInterval = serverIdleTimeout > 0 ? serverIdleTimeout / 3 : > defaultHeartbeatInterval; > } > > > Thoughts, objections? > > On Tue, Feb 15, 2022 at 4:32 PM Ivan Daschinsky <ivanda...@gmail.com> > wrote: > > > Pavel, sorry, i've made mistake. But current behaviour is ok for me. This > > timeout cannot be change on server side runtime. But we can simplify > > protocol just use one opcode and message > > > > вт, 15 февр. 2022 г., 14:54 Ivan Daschinsky <ivanda...@gmail.com>: > > > > > > Idle timeout can't change, why send it back with every heartbeat > > > response? > > > May be I am wrong, but from code I see this behaviour. But if I am > wrong, > > > this is ok behaviour for me. > > > > > > > > > > > > вт, 15 февр. 2022 г. в 14:00, Pavel Tupitsyn <ptupit...@apache.org>: > > > > > >> Ivan, I mostly agree with your proposal, except this point: > > >> > > >> > Response to heartbeat request -- is idle timeout > > >> Idle timeout can't change, why send it back with every heartbeat > > response? > > >> > > >> > possible cases with cluster restart, upgrade > > >> In those cases, a new connection will be established, and we'll > retrieve > > >> the new timeout after the handshake. > > >> > > >> > > >> On Tue, Feb 15, 2022 at 12:04 PM Maksim Timonin < > > timoninma...@apache.org> > > >> wrote: > > >> > > >> > Hi Ivan, > > >> > > > >> > Cases you described sound reasonable to me. Then the client should > > just > > >> set > > >> > up the `keepAlive` flag, and it just works. > > >> > > > >> > So, there are 3 branches: > > >> > 1. Users don't configure keepAlive at all. > > >> > 2. Users configure keepAliveHeartbeatInterval (long, ms). > > >> > 3. Users configure keepAlive (boolean). > > >> > > > >> > AFAIU, Pavel's proposal is about covering the second case only. But > > >> > actually the 2nd and 3rd aren't conflicted with each other.I think > for > > >> both > > >> > branches, a cluster should respond with idleTimeout value on every > > keep > > >> > alive client request. Because there are possible cases with cluster > > >> > restart, upgrade, etc. Clients should check every response and in > case > > >> of > > >> > changed idleTimeout. For 2nd case write a WARN message, and for 3rd > - > > >> > reconfigure themself in case of changed idleTimeout. > > >> > > > >> > > > >> > > > >> > > > >> > On Tue, Feb 15, 2022 at 9:51 AM Ivan Daschinsky < > ivanda...@gmail.com> > > >> > wrote: > > >> > > > >> > > Regarding discussion here [1] > > >> > > > > >> > > I suppose that this feature, despite the fact that initial > intention > > >> of > > >> > > Pavel was different, can drastically > > >> > > improve the usage pattern of thin clients and give a lot of > > >> opportunities > > >> > > if the following is done: > > >> > > > > >> > > 1. GridNioServer has a great feature -- idle timeout. If a server > > did > > >> > not > > >> > > receive any from a client -- it will be kicked off. > > >> > > But there are some scenarios that make the use of this feature > > >> > > impossible: > > >> > > a. Multiple workers waiting for batch tasks and relatively low > > >> requests > > >> > > rate -- this services will be often kicked off and must reconnect. > > >> > > In order to prevent this behaviour, the user must implement a kind > > of > > >> > > heartbeating by himself. > > >> > > b. Quite often user may want to implement leader-follower pattern > > for > > >> > > services for HA, so followers also will be considered as idle. > > Kicking > > >> > off > > >> > > these followers > > >> > > is not acceptable, so user should also implement heartbeating by > > >> > himself. > > >> > > > > >> > > My proposition is: > > >> > > 1. Add two flags -- enable/disable heartbeats, and very optional > > >> > heartbeat > > >> > > timeout. Set enable to true by default, timeout to default > heartbeat > > >> > > timeout. > > >> > > 2. If server and client both support this feature, and heartbeats > > are > > >> not > > >> > > explicitly disabled on client side: > > >> > > 3. Response to heartbeat request -- is idle timeout. If idle > timeout > > >> is > > >> > set > > >> > > on the server side , set heartbeat timeout to one-third of it, > > instead > > >> > set > > >> > > to default or specified value. > > >> > > > > >> > > Pros: > > >> > > 1. Easy to set up -- just flag on client side and just set timeout > > on > > >> > > server side. > > >> > > 2. Hard to configure improperly, i.e set heartbeat timeout not > short > > >> > enough > > >> > > in order to prevent kicking out by server. > > >> > > 3. If the user just wants heartbeats without setting idle timeout > -- > > >> > > heartbeats are by default on and with reasonable timeout. > > >> > > > > >> > > Cons: > > >> > > 1. If someone will rely on old behavior and just wants to drop his > > >> > clients > > >> > > on timeout -- this will not work without reconfiguring, he should > > >> disable > > >> > > heartbeats. > > >> > > But I cannot even imagine that someone will find this behaviour > > >> > desirable. > > >> > > I strongly believe that this behaviour prevents users from using > > >> > > idleTimeout on server side. > > >> > > > > >> > > [1] -- > > >> https://github.com/apache/ignite/pull/9817#discussion_r805628955 > > >> > > > > >> > > пт, 11 февр. 2022 г. в 10:58, Pavel Tupitsyn < > ptupit...@apache.org > > >: > > >> > > > > >> > > > I've prepared a PR, please have a look: > > >> > > > https://github.com/apache/ignite/pull/9817 > > >> > > > > > >> > > > On Mon, Feb 7, 2022 at 6:37 PM Ivan Daschinsky < > > ivanda...@gmail.com > > >> > > > >> > > > wrote: > > >> > > > > > >> > > > > I see potential in this feature, especially if we use > something > > >> like > > >> > > > > continuous query. Stale clients can consume a lot of resources > > >> and it > > >> > > is > > >> > > > > worth kick these clients out. > > >> > > > > > > >> > > > > пн, 7 февр. 2022 г. в 18:25, Pavel Tupitsyn < > > ptupit...@apache.org > > >> >: > > >> > > > > > > >> > > > > > > If we use new approach, we can reduce this timeout. But > this > > >> can > > >> > > > affect > > >> > > > > > old clients. > > >> > > > > > > > >> > > > > > idleTimeout is disabled by default, we are not going to > change > > >> > this. > > >> > > > > > > > >> > > > > > > Also, let's think about that sending heartbeats and > interval > > >> of > > >> > > > sending > > >> > > > > > > heartbeats could be calculated on the server side (i.e. > one > > >> third > > >> > > of > > >> > > > > idle > > >> > > > > > > timeout) and sent to the client during handshake. > > >> > > > > > > Also we can introduce something like a negotiation > mechanism > > >> as > > >> > in > > >> > > > > > > zookeeper. > > >> > > > > > > > >> > > > > > I tend to agree with Maksim here, let's keep it simple and > > >> > explicit. > > >> > > > > > Log a warning, but don't do anything clever. > > >> > > > > > > > >> > > > > > On Mon, Feb 7, 2022 at 6:15 PM Ivan Daschinsky < > > >> > ivanda...@gmail.com> > > >> > > > > > wrote: > > >> > > > > > > > >> > > > > > > >> idleTimeout already exists, I don't think we should > > change > > >> the > > >> > > way > > >> > > > > it > > >> > > > > > > works (or did I misunderstand you?) > > >> > > > > > > If we use new approach, we can reduce this timeout. But > this > > >> can > > >> > > > affect > > >> > > > > > old > > >> > > > > > > clients. > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > Also, let's think about that sending heartbeats and > interval > > >> of > > >> > > > sending > > >> > > > > > > heartbeats could be calculated on the server side (i.e. > one > > >> third > > >> > > of > > >> > > > > idle > > >> > > > > > > timeout) and sent to the client > > >> > > > > > > during handshake. > > >> > > > > > > Also we can introduce something like a negotiation > mechanism > > >> as > > >> > in > > >> > > > > > > zookeeper. > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > пн, 7 февр. 2022 г. в 18:05, Pavel Tupitsyn < > > >> > ptupit...@apache.org > > >> > > >: > > >> > > > > > > > > >> > > > > > > > Igor, > > >> > > > > > > > > > >> > > > > > > > > Maybe clients should pass this information on to the > > >> > handshake. > > >> > > > > > > > > > >> > > > > > > > Do you think we should log a mismatched timeout warning > on > > >> the > > >> > > > > server, > > >> > > > > > > not > > >> > > > > > > > on the client? > > >> > > > > > > > Or should we do both? > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > I've updated the proposal with OP_GET_IDLE_TIMEOUT and > > some > > >> > other > > >> > > > > > details > > >> > > > > > > > discussed above. > > >> > > > > > > > > > >> > > > > > > > On Mon, Feb 7, 2022 at 5:42 PM Igor Sapego < > > >> isap...@apache.org > > >> > > > > >> > > > > wrote: > > >> > > > > > > > > > >> > > > > > > > > Feature seems useful for me as it makes connection > > >> management > > >> > > > more > > >> > > > > > > robust > > >> > > > > > > > > and > > >> > > > > > > > > predictable. > > >> > > > > > > > > > > >> > > > > > > > > I agree with Pavel, that we should print warning when > > >> > heartbeat > > >> > > > > > period > > >> > > > > > > is > > >> > > > > > > > > larger than > > >> > > > > > > > > idle timeout, but I see a problem here as idle timeout > > is > > >> > > > > configured > > >> > > > > > on > > >> > > > > > > > > server and is not > > >> > > > > > > > > known to clients, while heartbeats configured on > clients > > >> and > > >> > > > their > > >> > > > > > > period > > >> > > > > > > > > is not known > > >> > > > > > > > > to the server. Maybe clients should pass this > > information > > >> on > > >> > to > > >> > > > the > > >> > > > > > > > > handshake. > > >> > > > > > > > > > > >> > > > > > > > > Regarding Python and PHP clients - can not we use some > > >> kind > > >> > of > > >> > > > > timers > > >> > > > > > > to > > >> > > > > > > > > implement > > >> > > > > > > > > this feature? > > >> > > > > > > > > > > >> > > > > > > > > Best Regards, > > >> > > > > > > > > Igor > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > On Mon, Feb 7, 2022 at 5:23 PM Pavel Tupitsyn < > > >> > > > > ptupit...@apache.org> > > >> > > > > > > > > wrote: > > >> > > > > > > > > > > >> > > > > > > > > > Maksim, agree. Let's not be too clever and only log > a > > >> > > warning. > > >> > > > > > > > > > > > >> > > > > > > > > > On Mon, Feb 7, 2022 at 5:23 PM Pavel Tupitsyn < > > >> > > > > > ptupit...@apache.org> > > >> > > > > > > > > > wrote: > > >> > > > > > > > > > > > >> > > > > > > > > > > Ivan, idleTimeout already exists, I don't think we > > >> should > > >> > > > > change > > >> > > > > > > the > > >> > > > > > > > > way > > >> > > > > > > > > > > it works (or did I misunderstand you?) > > >> > > > > > > > > > > > > >> > > > > > > > > > > Of course, enabling heartbeats means that > otherwise > > >> idle > > >> > > > > clients > > >> > > > > > > will > > >> > > > > > > > > no > > >> > > > > > > > > > > longer be disconnected by the server. > > >> > > > > > > > > > > I think we should cross-link those properties in > the > > >> > > > > > documentation > > >> > > > > > > > and > > >> > > > > > > > > > > explain this behavior. > > >> > > > > > > > > > > > > >> > > > > > > > > > > On Mon, Feb 7, 2022 at 4:39 PM Ivan Daschinsky < > > >> > > > > > > ivanda...@gmail.com> > > >> > > > > > > > > > > wrote: > > >> > > > > > > > > > > > > >> > > > > > > > > > >> >>3. Already implemented: when > > >> > > > > > > > > ClientConnectorConfiguration#idleTimeout > > >> > > > > > > > > > is > > >> > > > > > > > > > >> not zero, server disconnects idle clients > > >> > > > > > > > > > >> >> > > >> > > > > > > > > > >> But I suppose it would be great to have: > > >> > > > > > > > > > >> 1. If client supports keep alive, use idleTimeout > > >> > > > > > > > > > >> 2. If not, do not use it. > > >> > > > > > > > > > >> > > >> > > > > > > > > > >> But I am not sure if it is correct or not. > > >> > > > > > > > > > >> > > >> > > > > > > > > > >> пн, 7 февр. 2022 г. в 16:01, Maksim Timonin < > > >> > > > > > > > timoninma...@apache.org > > >> > > > > > > > > >: > > >> > > > > > > > > > >> > > >> > > > > > > > > > >> > I believe explicit is better than implicit :) > > Also > > >> in > > >> > > case > > >> > > > > of > > >> > > > > > > > > dynamic > > >> > > > > > > > > > >> > calculation of timeout, it can change > > dynamically, > > >> for > > >> > > > > example > > >> > > > > > > > > > >> restarting a > > >> > > > > > > > > > >> > cluster with different configuration should > > >> > reconfigure > > >> > > > > > clients > > >> > > > > > > > too. > > >> > > > > > > > > > >> Looks > > >> > > > > > > > > > >> > complicated. > > >> > > > > > > > > > >> > > > >> > > > > > > > > > >> > My vote for WARN + javadocs with mention of > this > > >> > issue. > > >> > > > > > > > > > >> > > > >> > > > > > > > > > >> > On Mon, Feb 7, 2022 at 3:51 PM Pavel Tupitsyn < > > >> > > > > > > > ptupit...@apache.org > > >> > > > > > > > > > > > >> > > > > > > > > > >> > wrote: > > >> > > > > > > > > > >> > > > >> > > > > > > > > > >> > > > WDYT, should we add a WARN message for > > clients > > >> > that > > >> > > > > > > configure > > >> > > > > > > > > > >> > > > keepAliveTimeout greater than idleTimeout > on > > >> the > > >> > > > server > > >> > > > > > > side? > > >> > > > > > > > > > >> > > > > >> > > > > > > > > > >> > > I think we should either log a WARN, or > > retrieve > > >> > > > > idleTimeout > > >> > > > > > > > from > > >> > > > > > > > > > >> server > > >> > > > > > > > > > >> > > and configure heartbeatTimeout accordingly > > (e.g. > > >> > > divide > > >> > > > by > > >> > > > > > 2). > > >> > > > > > > > > > >> > > Thoughts? > > >> > > > > > > > > > >> > > > > >> > > > > > > > > > >> > > On Mon, Feb 7, 2022 at 3:14 PM Maksim > Timonin < > > >> > > > > > > > > > >> timoninma...@apache.org> > > >> > > > > > > > > > >> > > wrote: > > >> > > > > > > > > > >> > > > > >> > > > > > > > > > >> > > > Hi Pavel, > > >> > > > > > > > > > >> > > > > > >> > > > > > > > > > >> > > > Thanks for the links. Yes, I forgot that > the > > >> flag > > >> > of > > >> > > > > > changed > > >> > > > > > > > > > >> topology > > >> > > > > > > > > > >> > is > > >> > > > > > > > > > >> > > > lazy. Also I missed that the keepAlive > > setting > > >> is > > >> > > > > > configured > > >> > > > > > > > on > > >> > > > > > > > > > the > > >> > > > > > > > > > >> > > client > > >> > > > > > > > > > >> > > > side (alternatively to idleTimeout that is > on > > >> the > > >> > > > server > > >> > > > > > > > side). > > >> > > > > > > > > > >> > > > > > >> > > > > > > > > > >> > > > Now I understand, this feature can be > helpful > > >> > then. > > >> > > > > Every > > >> > > > > > > > client > > >> > > > > > > > > > can > > >> > > > > > > > > > >> > > > configure itself in case it's possible to > be > > >> idle > > >> > > > > > sometimes, > > >> > > > > > > > and > > >> > > > > > > > > > >> choose > > >> > > > > > > > > > >> > > > an appropriate timeout by itself too. And > by > > >> > default > > >> > > > the > > >> > > > > > > > feature > > >> > > > > > > > > > >> should > > >> > > > > > > > > > >> > > be > > >> > > > > > > > > > >> > > > disabled. > > >> > > > > > > > > > >> > > > > > >> > > > > > > > > > >> > > > WDYT, should we add a WARN message for > > clients > > >> > that > > >> > > > > > > configure > > >> > > > > > > > > > >> > > > keepAliveTimeout greater than idleTimeout > on > > >> the > > >> > > > server > > >> > > > > > > side? > > >> > > > > > > > > > >> > > > > > >> > > > > > > > > > >> > > > > > >> > > > > > > > > > >> > > > > > >> > > > > > > > > > >> > > > On Mon, Feb 7, 2022 at 1:05 PM Pavel > > Tupitsyn < > > >> > > > > > > > > > ptupit...@apache.org > > >> > > > > > > > > > >> > > > >> > > > > > > > > > >> > > > wrote: > > >> > > > > > > > > > >> > > > > > >> > > > > > > > > > >> > > > > Ivan, > > >> > > > > > > > > > >> > > > > > > >> > > > > > > > > > >> > > > > I suggest the following: > > >> > > > > > > > > > >> > > > > > > >> > > > > > > > > > >> > > > > 1. Server sends KEEP_ALIVE feature flag, > > >> which > > >> > > means > > >> > > > > it > > >> > > > > > > > > accepts > > >> > > > > > > > > > >> > > > > OP_KEEP_ALIVE empty message > > >> > > > > > > > > > >> > > > > 2. Client sends OP_KEEP_ALIVE when the > > >> > connection > > >> > > is > > >> > > > > > idle > > >> > > > > > > > for > > >> > > > > > > > > a > > >> > > > > > > > > > >> > > > > certain period of time > > >> > > > > > > > > > >> > > > > 3. Already implemented: when > > >> > > > > > > > > > >> ClientConnectorConfiguration#idleTimeout > > >> > > > > > > > > > >> > > is > > >> > > > > > > > > > >> > > > > not zero, server disconnects idle clients > > >> > > > > > > > > > >> > > > > > > >> > > > > > > > > > >> > > > > This way we don't need server->client > > >> > keepalives, > > >> > > as > > >> > > > > you > > >> > > > > > > > > > correctly > > >> > > > > > > > > > >> > > noted. > > >> > > > > > > > > > >> > > > > > > >> > > > > > > > > > >> > > > > On Mon, Feb 7, 2022 at 12:43 PM Ivan > > >> Daschinsky > > >> > < > > >> > > > > > > > > > >> ivanda...@gmail.com > > >> > > > > > > > > > >> > > > > >> > > > > > > > > > >> > > > > wrote: > > >> > > > > > > > > > >> > > > > > > >> > > > > > > > > > >> > > > > > Pavel, I suppose that ideally: > > >> > > > > > > > > > >> > > > > > 1. Client send in handshake flag, that > it > > >> > > supports > > >> > > > > > > > > KEEP_ALIVE > > >> > > > > > > > > > >> > feature > > >> > > > > > > > > > >> > > > and > > >> > > > > > > > > > >> > > > > > server takes it into account. > > >> > > > > > > > > > >> > > > > > 2. Each request of client can be > > >> considered as > > >> > > > > > > keep-alive > > >> > > > > > > > > > ping. > > >> > > > > > > > > > >> > > > > > 3. Client send failure should be > > processed > > >> > using > > >> > > > > retry > > >> > > > > > > > > policy. > > >> > > > > > > > > > >> > > > > > 4. Server should not send keep-alive > > >> packets, > > >> > it > > >> > > > is > > >> > > > > > > > > redundant, > > >> > > > > > > > > > >> but > > >> > > > > > > > > > >> > > > server > > >> > > > > > > > > > >> > > > > > should track requests from client and > if > > >> there > > >> > > is > > >> > > > no > > >> > > > > > > > > requests > > >> > > > > > > > > > >> from > > >> > > > > > > > > > >> > > > client > > >> > > > > > > > > > >> > > > > > with KEEP_ALIVE feature, > > >> > > > > > > > > > >> > > > > > automatically close connection and free > > >> > > resources. > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > > > > >> > > > > > Similar approach is used in zookeeper > > >> clients. > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > > > > >> > > > > > пн, 7 февр. 2022 г. в 12:24, Pavel > > >> Tupitsyn < > > >> > > > > > > > > > >> ptupit...@apache.org > > >> > > > > > > > > > >> > >: > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > > > > >> > > > > > > Ivan, > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > > >> > > > > > > Ideally, the check should come from > > both > > >> > > sides. > > >> > > > > > > > > > >> > > > > > > - Client periodically sends keepalive > > to > > >> > > server > > >> > > > > > > > > > >> > > > > > > - Server periodically sends keepalive > > to > > >> > > client > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > > >> > > > > > > Feature flags will be added > > accordingly, > > >> so > > >> > it > > >> > > > is > > >> > > > > > not > > >> > > > > > > > > > >> necessary > > >> > > > > > > > > > >> > to > > >> > > > > > > > > > >> > > > > > > implement this in all thin clients. > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > > >> > > > > > > On Mon, Feb 7, 2022 at 11:43 AM Ivan > > >> > > Daschinsky > > >> > > > < > > >> > > > > > > > > > >> > > ivanda...@gmail.com > > >> > > > > > > > > > >> > > > > > > >> > > > > > > > > > >> > > > > > > wrote: > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > > >> > > > > > > > I suppose it is great idea, but > this > > >> > > > > functionality > > >> > > > > > > can > > >> > > > > > > > > be > > >> > > > > > > > > > >> hard > > >> > > > > > > > > > >> > to > > >> > > > > > > > > > >> > > > > > > implement > > >> > > > > > > > > > >> > > > > > > > for some platforms. I.e. sync > python > > >> > client > > >> > > or > > >> > > > > php > > >> > > > > > > > > (there > > >> > > > > > > > > > >> is no > > >> > > > > > > > > > >> > > > real > > >> > > > > > > > > > >> > > > > > > > multithreading for python (GIL) and > > >> php is > > >> > > > > single > > >> > > > > > > > > threaded > > >> > > > > > > > > > >> by > > >> > > > > > > > > > >> > > > > design). > > >> > > > > > > > > > >> > > > > > > But > > >> > > > > > > > > > >> > > > > > > > for async clients it is not very > hard > > >> to > > >> > > > > > implement. > > >> > > > > > > > > > >> > Nevertheless, > > >> > > > > > > > > > >> > > > > this > > >> > > > > > > > > > >> > > > > > > > feature should be optional, because > > of > > >> > > > possible > > >> > > > > > > > > technical > > >> > > > > > > > > > >> > > > > limitations. > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > Pavel, is this check mostly for > > client > > >> > side? > > >> > > > Or > > >> > > > > > > > servers > > >> > > > > > > > > > can > > >> > > > > > > > > > >> do > > >> > > > > > > > > > >> > > some > > >> > > > > > > > > > >> > > > > > > actions > > >> > > > > > > > > > >> > > > > > > > if there is no activity from thin > > >> client > > >> > > (i.e. > > >> > > > > > > closing > > >> > > > > > > > > > >> context > > >> > > > > > > > > > >> > > and > > >> > > > > > > > > > >> > > > > free > > >> > > > > > > > > > >> > > > > > > > resources such as queries' handles > > and > > >> so > > >> > > on?) > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > пн, 7 февр. 2022 г. в 11:09, Pavel > > >> > Tupitsyn > > >> > > < > > >> > > > > > > > > > >> > > ptupit...@apache.org > > >> > > > > > > > > > >> > > > >: > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > Hi Maksim, > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > half-state is a possible > > situation > > >> > when > > >> > > an > > >> > > > > > > Ignite > > >> > > > > > > > > node > > >> > > > > > > > > > >> goes > > >> > > > > > > > > > >> > > > down > > >> > > > > > > > > > >> > > > > or > > >> > > > > > > > > > >> > > > > > > > > somehow removes connection to a > > thin > > >> > > client > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > Half-open state is also possible > > >> when, > > >> > for > > >> > > > > > > example, > > >> > > > > > > > an > > >> > > > > > > > > > >> > > > intermediate > > >> > > > > > > > > > >> > > > > > > > router > > >> > > > > > > > > > >> > > > > > > > > is rebooted [1]. > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > This is what we seem to have > > >> encountered > > >> > > > with > > >> > > > > > one > > >> > > > > > > of > > >> > > > > > > > > our > > >> > > > > > > > > > >> > > > customers > > >> > > > > > > > > > >> > > > > - > > >> > > > > > > > > > >> > > > > > > they > > >> > > > > > > > > > >> > > > > > > > > have a stable cluster, and > > >> long-living > > >> > > > > (multiple > > >> > > > > > > > days) > > >> > > > > > > > > > >> thin > > >> > > > > > > > > > >> > > > client > > >> > > > > > > > > > >> > > > > > > > > connections which can be idle for > > >> some > > >> > > time. > > >> > > > > > > > > > >> > > > > > > > > And only when we send some data > on > > >> such > > >> > an > > >> > > > > idle > > >> > > > > > > > > > >> connection do > > >> > > > > > > > > > >> > > we > > >> > > > > > > > > > >> > > > > > > discover > > >> > > > > > > > > > >> > > > > > > > > that it is broken. > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > But with enabled (true by > > default) > > >> > > > > > > > > partitionAwareness > > >> > > > > > > > > > >> > feature > > >> > > > > > > > > > >> > > > > > clients > > >> > > > > > > > > > >> > > > > > > > can > > >> > > > > > > > > > >> > > > > > > > > be notified about topology > changes > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > Partition awareness is a "lazy" > > >> > > notification > > >> > > > > in > > >> > > > > > a > > >> > > > > > > > form > > >> > > > > > > > > > of > > >> > > > > > > > > > >> a > > >> > > > > > > > > > >> > > > > response > > >> > > > > > > > > > >> > > > > > > > > message flag [2]. > > >> > > > > > > > > > >> > > > > > > > > You won't get one on an idle > > >> connection. > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > the connections are removed on > > the > > >> > > server > > >> > > > > side > > >> > > > > > > by > > >> > > > > > > > > > client > > >> > > > > > > > > > >> > idle > > >> > > > > > > > > > >> > > > > > timeout > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > Idle timeout is disabled by > > default. > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > is it OK to keep such > connections > > >> > alive > > >> > > > for > > >> > > > > a > > >> > > > > > > long > > >> > > > > > > > > > time > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > I think it is up to the user. > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > in the case of partition > > awareness > > >> > > > features > > >> > > > > it > > >> > > > > > > can > > >> > > > > > > > > > lead > > >> > > > > > > > > > >> to > > >> > > > > > > > > > >> > > > > wasting > > >> > > > > > > > > > >> > > > > > > TCP > > >> > > > > > > > > > >> > > > > > > > > sockets on Ignite nodes, can't it > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > Can you please elaborate? > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > [1] > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > > > > >> > > > > > > >> > > > > > > > > > >> > > > > > >> > > > > > > > > > >> > > > > >> > > > > > > > > > >> > > > >> > > > > > > > > > >> > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > > https://blog.stephencleary.com/2009/05/detection-of-half-open-dropped.html > > >> > > > > > > > > > >> > > > > > > > > [2] > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > > > > >> > > > > > > >> > > > > > > > > > >> > > > > > >> > > > > > > > > > >> > > > > >> > > > > > > > > > >> > > > >> > > > > > > > > > >> > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-23%3A+Best+Effort+Affinity+for+Thin+Clients > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > On Fri, Feb 4, 2022 at 4:01 PM > > Maksim > > >> > > > Timonin > > >> > > > > < > > >> > > > > > > > > > >> > > > > > timoninma...@apache.org > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > wrote: > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > Hi Pavel, > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > Thanks for starting this > thread! > > >> Can I > > >> > > ask > > >> > > > > > some > > >> > > > > > > > > > >> questions > > >> > > > > > > > > > >> > > here > > >> > > > > > > > > > >> > > > to > > >> > > > > > > > > > >> > > > > > get > > >> > > > > > > > > > >> > > > > > > > the > > >> > > > > > > > > > >> > > > > > > > > > feature more clearly? > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > As I understand it correctly, > > >> > half-state > > >> > > > is > > >> > > > > a > > >> > > > > > > > > possible > > >> > > > > > > > > > >> > > > situation > > >> > > > > > > > > > >> > > > > > when > > >> > > > > > > > > > >> > > > > > > > an > > >> > > > > > > > > > >> > > > > > > > > > Ignite node goes down or > somehow > > >> > removes > > >> > > > > > > > connection > > >> > > > > > > > > > to a > > >> > > > > > > > > > >> > thin > > >> > > > > > > > > > >> > > > > > client. > > >> > > > > > > > > > >> > > > > > > > But > > >> > > > > > > > > > >> > > > > > > > > > with enabled (true by default) > > >> > > > > > > partitionAwareness > > >> > > > > > > > > > >> feature > > >> > > > > > > > > > >> > > > clients > > >> > > > > > > > > > >> > > > > > can > > >> > > > > > > > > > >> > > > > > > > be > > >> > > > > > > > > > >> > > > > > > > > > notified about topology > changes. > > >> So, > > >> > > there > > >> > > > > are > > >> > > > > > > > > > possible > > >> > > > > > > > > > >> > > cases: > > >> > > > > > > > > > >> > > > > > > > > > 1. ThinClient connects to a > > single > > >> > node. > > >> > > > > > > > > > >> > > > > > > > > > 2. Ignite node removes > connection > > >> from > > >> > > > > itself. > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > I like the idea for the case > > with a > > >> > > single > > >> > > > > > node, > > >> > > > > > > > as > > >> > > > > > > > > it > > >> > > > > > > > > > >> > helps > > >> > > > > > > > > > >> > > > fail > > >> > > > > > > > > > >> > > > > > > fast. > > >> > > > > > > > > > >> > > > > > > > > > But is it OK to connect a > client > > >> to a > > >> > > > single > > >> > > > > > > node > > >> > > > > > > > > > only? > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > For the second one: you mention > > >> that a > > >> > > > case > > >> > > > > > for > > >> > > > > > > > the > > >> > > > > > > > > > >> second > > >> > > > > > > > > > >> > > > option > > >> > > > > > > > > > >> > > > > > is > > >> > > > > > > > > > >> > > > > > > > > > "Long-living and mostly idle > > >> > connections > > >> > > > are > > >> > > > > > > > > > especially > > >> > > > > > > > > > >> > > > > susceptible > > >> > > > > > > > > > >> > > > > > > to > > >> > > > > > > > > > >> > > > > > > > > this > > >> > > > > > > > > > >> > > > > > > > > > behavior". If I understand > > >> correctly > > >> > the > > >> > > > > > > > connections > > >> > > > > > > > > > are > > >> > > > > > > > > > >> > > > removed > > >> > > > > > > > > > >> > > > > on > > >> > > > > > > > > > >> > > > > > > the > > >> > > > > > > > > > >> > > > > > > > > > server side by client idle > > timeout. > > >> > Can > > >> > > we > > >> > > > > > just > > >> > > > > > > > > > >> configure > > >> > > > > > > > > > >> > the > > >> > > > > > > > > > >> > > > > idle > > >> > > > > > > > > > >> > > > > > > > > timeout > > >> > > > > > > > > > >> > > > > > > > > > for cases where we really need > > >> keeping > > >> > > > alive > > >> > > > > > > idle > > >> > > > > > > > > > >> > > connections? > > >> > > > > > > > > > >> > > > > Are > > >> > > > > > > > > > >> > > > > > > > there > > >> > > > > > > > > > >> > > > > > > > > > any other cases with > unexpectedly > > >> > > dropped > > >> > > > > > > > > connections? > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > I'm wondering is it OK to keep > > such > > >> > > > > > connections > > >> > > > > > > > > alive > > >> > > > > > > > > > >> for a > > >> > > > > > > > > > >> > > > long > > >> > > > > > > > > > >> > > > > > > time? > > >> > > > > > > > > > >> > > > > > > > > > Also in the case of partition > > >> > awareness > > >> > > > > > features > > >> > > > > > > > it > > >> > > > > > > > > > can > > >> > > > > > > > > > >> > lead > > >> > > > > > > > > > >> > > to > > >> > > > > > > > > > >> > > > > > > wasting > > >> > > > > > > > > > >> > > > > > > > > TCP > > >> > > > > > > > > > >> > > > > > > > > > sockets on Ignite nodes, can't > > it? > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > Thanks! > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > On Thu, Feb 3, 2022 at 2:24 PM > > >> Pavel > > >> > > > > Tupitsyn > > >> > > > > > < > > >> > > > > > > > > > >> > > > > > ptupit...@apache.org> > > >> > > > > > > > > > >> > > > > > > > > > wrote: > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> Igniters, > > >> > > > > > > > > > >> > > > > > > > > >> > > >> > > > > > > > > > >> > > > > > > > > >> Please review the proposal to > > add > > >> > > > heartbeat > > >> > > > > > > > > messages > > >> > > > > > > > > > to > > >> > > > > > > > > > >> > the > > >> > > > > > > > > > >> > > > thin > > >> > > > > > > > > > >> > > > > > > > client > > >> > > > > > > > > > >> > > > > > > > > >> protocol (both 2.x and 3.x) > and > > >> let > > >> > me > > >> > > > know > > >> > > > > > > your > > >> > > > > > > > > > >> thoughts: > > >> > > > > > > > > > >> > > > > > > > > >> > > >> > > > > > > > > > >> > > > > > > > > >> > > >> > > > > > > > > > >> > > > > > > > > >> > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > > > > >> > > > > > > >> > > > > > > > > > >> > > > > > >> > > > > > > > > > >> > > > > >> > > > > > > > > > >> > > > >> > > > > > > > > > >> > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-83+Thin+Client+Keepalive > > >> > > > > > > > > > >> > > > > > > > > >> > > >> > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > -- > > >> > > > > > > > > > >> > > > > > > > Sincerely yours, Ivan Daschinskiy > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > > > > >> > > > > > -- > > >> > > > > > > > > > >> > > > > > Sincerely yours, Ivan Daschinskiy > > >> > > > > > > > > > >> > > > > > > > >> > > > > > > > > > >> > > > > > > >> > > > > > > > > > >> > > > > > >> > > > > > > > > > >> > > > > >> > > > > > > > > > >> > > > >> > > > > > > > > > >> > > >> > > > > > > > > > >> > > >> > > > > > > > > > >> -- > > >> > > > > > > > > > >> Sincerely yours, Ivan Daschinskiy > > >> > > > > > > > > > >> > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > -- > > >> > > > > > > Sincerely yours, Ivan Daschinskiy > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > > >> > > > > -- > > >> > > > > Sincerely yours, Ivan Daschinskiy > > >> > > > > > > >> > > > > > >> > > > > >> > > > > >> > > -- > > >> > > Sincerely yours, Ivan Daschinskiy > > >> > > > > >> > > > >> > > > > > > > > > -- > > > Sincerely yours, Ivan Daschinskiy > > > > > >