I believe there is no proactive check (such as waiting for a socket failure) on the client side; the check happens when a request is sent. If the request fails, the connection end-point is put into a suspect/dead-proxy state and handled from there.
But I do see CqListener callbacks that get invoked when the connection is lost. I am not sure whether that applies only to the server-to-client connection or whether we also have it for the client-to-server connection.

-Anil.

On Wed, Jul 27, 2016 at 9:41 AM, Anthony Baker <aba...@pivotal.io> wrote:

> I’m not convinced that this case couldn’t be further optimized. If the
> socket connection from the client to the dead server was terminated,
> shouldn’t we ignore it when sending the next invocation?
>
> Anthony
>
> On Jul 26, 2016, at 12:02 PM, Michael Stolz <mst...@pivotal.io> wrote:
>
> Yep, that lazy discovery on the client side is part of what makes it
> possible to have very large numbers of connected clients on Geode. Without
> that, every client would need to be notified even if they are just lying
> dormant.
>
> --
> Mike Stolz
> Principal Engineer, GemFire Product Manager
> Mobile: 631-835-4771
>
> On Tue, Jul 26, 2016 at 10:09 AM, Dan Smith <dsm...@pivotal.io> wrote:
>
>> Hi Olivier,
>>
>> Only the peers receive broadcasts about membership changes, not the
>> clients. The client lazily discovers metadata from the locators and servers
>> so I think what you are seeing is actually expected. In this case the
>> client might still think it has a connection to the killed server, and
>> would not actually discover that the connection is dead until it tries to
>> send the new function to that server.
>>
>> -Dan
>>
>> On Tue, Jul 26, 2016 at 1:44 AM, Olivier Mallassi <olivier.malla...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I would need your help to better understand the behavior I have observed
>>> (regarding function execution with node failure)
>>>
>>> - I have a function (optimizeForWrite=true, hasResult=true, isHA=true)
>>>   that is executed (onRegion(mypartitionedRegion)) every two minutes
>>>   (poll frequency has been increased for test)
>>> - then, just after an execution of the function I kill -9 one of the
>>>   members (member-timeout=1)
>>> - then, the function is executed again (around 2 min later). In that
>>>   case, the function is executed twice (on the remaining members).
>>>   In that case, context.isPossibleDuplicate() returns true so that I
>>>   just exit the function
>>>
>>>
>>> if (functionContext.isPossibleDuplicate()) {
>>>     logger.warning(....
>>>     //exit
>>>     functionContext.getResultSender().lastResult(null);
>>> }
>>>
>>>
>>> The function being HA, this is the expected behavior.
>>>
>>> Yet, what I do not understand is that it seems the "node failure" is
>>> detected only when the function is executed, whereas the node failure has
>>> already been broadcast (Membership cluster). Can someone give me more
>>> insights on this? Is this a misconfig between client / locator so that
>>> clients are still not aware of the node failure?
>>>
>>>
>>> Many thx.
>>>
>>> oliv/
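
The lazy discovery described in this thread can be illustrated with a small, self-contained Java sketch. It is not Geode code and does not use any Geode API: SketchServer, LazyFailoverSketch, send and deadEndpoints are hypothetical names invented for illustration, and the model is simplified to one target server per request. It only shows the idea that the client learns a server is dead when a send fails, flags the retry as a possible duplicate, and skips the known-dead end-point afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a cache server; not a Geode class. */
class SketchServer {
    final String name;
    boolean alive = true; // set to false to simulate "kill -9"
    final List<String> executions = new ArrayList<>();

    SketchServer(String name) {
        this.name = name;
    }

    /** Runs the "function"; throws if the process is gone, like a broken socket. */
    String execute(String request, boolean possibleDuplicate) {
        if (!alive) {
            throw new IllegalStateException("connection to " + name + " is dead");
        }
        executions.add(request + (possibleDuplicate ? " (possible duplicate)" : ""));
        return name;
    }
}

/** Hypothetical client that learns about dead servers only when a send fails. */
class LazyFailoverSketch {
    private final List<SketchServer> servers;
    private final Set<String> deadEndpoints = new HashSet<>();

    LazyFailoverSketch(List<SketchServer> servers) {
        this.servers = servers;
    }

    Set<String> deadEndpoints() {
        return deadEndpoints;
    }

    /** Tries servers in order; a failed attempt marks the end-point dead and flags the retry. */
    String send(String request) {
        boolean possibleDuplicate = false;
        for (SketchServer server : servers) {
            if (deadEndpoints.contains(server.name)) {
                continue; // already known to be dead: do not try it again
            }
            try {
                return server.execute(request, possibleDuplicate);
            } catch (IllegalStateException e) {
                deadEndpoints.add(server.name); // failure discovered lazily, on send
                possibleDuplicate = true;       // the retry may repeat work already done
            }
        }
        throw new IllegalStateException("no live servers");
    }

    public static void main(String[] args) {
        SketchServer a = new SketchServer("server-a");
        SketchServer b = new SketchServer("server-b");
        LazyFailoverSketch client = new LazyFailoverSketch(Arrays.asList(a, b));

        System.out.println("run-1 handled by " + client.send("run-1"));
        a.alive = false; // the server dies between two invocations
        System.out.println("dead end-points known before next send: " + client.deadEndpoints());
        System.out.println("run-2 handled by " + client.send("run-2"));
        System.out.println("dead end-points known after next send: " + client.deadEndpoints());
        System.out.println("server-b saw: " + b.executions);
    }
}
```

In this sketch nothing tells the client about the failure in between the two calls, which mirrors the behavior Olivier observed; whether and how a real client could be notified earlier is the open question raised by Anthony and Anil above.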