I’m not convinced that this case couldn’t be further optimized.  If the socket 
connection from the client to the dead server was terminated, shouldn’t we 
ignore it when sending the next invocation?
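
To make the scenario concrete, here is a toy Java model of the lazy discovery 
Dan describes below. It is only a sketch under stated assumptions, not how 
Geode is implemented: the Server and Client classes, their methods, and the 
single-target retry are all hypothetical, and the only thing taken from the 
thread is that the client learns about the dead server when it next sends, 
and that the retried call is flagged as a possible duplicate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: every class and method here is hypothetical, not Geode's API.
final class LazyDiscoverySketch {

    /** Stand-in for a cache server that may have been killed. */
    static final class Server {
        final String name;
        boolean alive = true;

        Server(String name) { this.name = name; }

        String execute(boolean possibleDuplicate) {
            if (!alive) {
                throw new IllegalStateException("connection to " + name + " is dead");
            }
            return name + " executed (possibleDuplicate=" + possibleDuplicate + ")";
        }
    }

    /** Stand-in for a client that receives no membership broadcasts. */
    static final class Client {
        private final List<Server> believedAlive;

        Client(List<Server> servers) { this.believedAlive = new ArrayList<>(servers); }

        int knownServers() { return believedAlive.size(); }

        String invoke() {
            boolean possibleDuplicate = false;
            while (!believedAlive.isEmpty()) {
                Server target = believedAlive.get(0);
                try {
                    return target.execute(possibleDuplicate);
                } catch (IllegalStateException deadConnection) {
                    // The failure is discovered only now, at send time.
                    believedAlive.remove(0);
                    // The retry may repeat work, so it is flagged.
                    possibleDuplicate = true;
                }
            }
            throw new IllegalStateException("no servers left");
        }
    }

    static List<String> run() {
        Server a = new Server("server-a");
        Server b = new Server("server-b");
        Client client = new Client(Arrays.asList(a, b));
        List<String> log = new ArrayList<>();
        log.add(client.invoke());                        // normal invocation
        a.alive = false;                                 // "kill -9"; client is not told
        log.add("known servers: " + client.knownServers());
        log.add(client.invoke());                        // failure found here, then retried
        log.add("known servers: " + client.knownServers());
        return log;
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```

In this model the client still lists both servers after the kill, and only 
the second invoke() prunes the dead one. The optimization I am asking about 
would amount to pruning as soon as the socket is known to be closed.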

Anthony

> On Jul 26, 2016, at 12:02 PM, Michael Stolz <mst...@pivotal.io> wrote:
> 
> Yep, that lazy discovery on the client side is part of what makes it possible 
> to have very large numbers of connected clients on Geode. Without that, every 
> client would need to be notified even if they are just lying dormant.
> 
> --
> Mike Stolz
> Principal Engineer, GemFire Product Manager
> Mobile: 631-835-4771
> 
> On Tue, Jul 26, 2016 at 10:09 AM, Dan Smith <dsm...@pivotal.io> wrote:
> Hi Olivier,
> 
> Only the peers receive broadcasts about membership changes, not the clients. 
> The client lazily discovers metadata from the locators and servers so I think 
> what you are seeing is actually expected. In this case the client might still 
> think it has a connection to the killed server, and would not actually 
> discover that the connection is dead until it tries to send the new function 
> to that server.
> 
> -Dan
> 
> On Tue, Jul 26, 2016 at 1:44 AM, Olivier Mallassi <olivier.malla...@gmail.com> wrote:
> Hi all,
> 
> I need your help to better understand the behavior I have observed 
> regarding function execution with node failure.
> 
> - I have a function (optimizeForWrite=true, hasResult=true, isHA=true) that 
> is executed (onRegion(mypartitionedRegion)) every two minutes (poll frequency 
> has been increased for test)
> - then, just after an execution of the function, I kill -9 one of the members 
> (member-timeout=1)
> - then, the function is executed again (around 2 min later). In that case, 
> the function is executed twice (on the remaining members).
> In that case, functionContext.isPossibleDuplicate() returns true, so I just 
> exit the function:
> 
> 
> if (functionContext.isPossibleDuplicate()) {
>     logger.warning(....
>     // Possible duplicate after failover: send an empty last result and exit.
>     functionContext.getResultSender().lastResult(null);
>     return;
> }
> 
> The function being HA, this is the expected behavior.
> 
> Yet, what I do not understand is that the "node failure" seems to be detected 
> only when the function is executed, whereas the node failure has already been 
> broadcast (cluster membership). Can someone give me more insight into this? 
> Is this a misconfiguration between client and locator, so that clients are 
> still not aware of the node failure?
> 
> 
> Many thx.
> 
> oliv/
> 
> 
