Re: Function Execution and HA

Olivier Mallassi Thu, 28 Jul 2016 00:00:17 -0700

That makes sense. It also looks like the function is executed twice by the same
processor (I assume that means thread); at least the log shows the same
Processor id and tid.


In my case, with isHA=true, it implies I send the results twice to my
client. The client may be able to deal with that, but what would be the
easiest way of ignoring the second (retry) execution?
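
If the client ends up dealing with it, a minimal sketch of what that could look like is below. It assumes each result carries a poll id and some result key; both are hypothetical names for illustration and do not come from the Geode API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical client-side helper, not part of Geode.
// Assumption: every result can be identified by (pollId, resultKey).
final class DuplicateResultFilter {
    // Identifiers of results that were already accepted.
    private final Set<String> seen = new HashSet<>();

    /** Returns true the first time a (pollId, resultKey) pair is offered, false on repeats. */
    synchronized boolean firstTime(long pollId, String resultKey) {
        return seen.add(pollId + ":" + resultKey);
    }
}
```

The client would call firstTime(...) before processing a result and drop it when the call returns false. The set grows without bound in this sketch, so a long-running client would need to evict old poll ids.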

1/ If I do something like

if (context.isPossibleDuplicate()) {
    logger.warning(....)
    context.getResultSender().lastResult(null);
}

to force an "exit", I end up with a FunctionException (from memory, "member...
did not send last result"; I can get the exact stack trace if you want).

2/ If I configure the function with isHA=false, the function is not
re-executed (as expected).
In my case, that could be OK (although I need to do more testing), but if
a node goes down while executing the function, the current
function execution will fail.
It will then be re-executed in the next polling loop (in my use case).
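
To make the control flow of option 1/ explicit, here is a minimal sketch. The two interfaces are stand-ins that mirror only the methods mentioned in this thread (isPossibleDuplicate, getResultSender, lastResult); they are not the real Geode types, and the payload is hypothetical. I have not verified whether this avoids the FunctionException described above; it only shows the guard closing the result stream exactly once and then returning.

```java
// Stand-ins for the Geode methods used in this thread; NOT the real interfaces.
interface ResultSenderSketch {
    void lastResult(Object lastResult);
}

interface FunctionContextSketch {
    boolean isPossibleDuplicate();
    ResultSenderSketch getResultSender();
}

final class PollingFunctionSketch {
    void execute(FunctionContextSketch context) {
        ResultSenderSketch sender = context.getResultSender();
        if (context.isPossibleDuplicate()) {
            // Retry of an earlier attempt: close the result stream and stop.
            sender.lastResult(null);
            return; // without this, the normal path below would run as well
        }
        // Normal path: send the (hypothetical) payload as the last result.
        sender.lastResult("payload");
    }
}

// Test helper: records how often lastResult was called and with what value.
final class RecordingSender implements ResultSenderSketch {
    int lastResultCalls = 0;
    Object last = "unset";

    @Override
    public void lastResult(Object lastResult) {
        lastResultCalls++;
        last = lastResult;
    }
}

// Test helper: a context with a fixed duplicate flag.
final class FixedContext implements FunctionContextSketch {
    private final boolean duplicate;
    private final ResultSenderSketch sender;

    FixedContext(boolean duplicate, ResultSenderSketch sender) {
        this.duplicate = duplicate;
        this.sender = sender;
    }

    @Override
    public boolean isPossibleDuplicate() { return duplicate; }

    @Override
    public ResultSenderSketch getResultSender() { return sender; }
}
```

With the real types the same shape would apply: one lastResult call per execution path, followed by a return on the duplicate path.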

Any thoughts?

regards

On Thu, Jul 28, 2016 at 12:29 AM, Anilkumar Gingade <aging...@pivotal.io>
wrote:

> I believe there is no check (wait for socket failure) on client side; the
> check is done with the request being sent....And if there is any failure
> with request, the connection end-point is put into suspect/dead-proxy state
> and handled....
>
> But I do see CqListener callbacks that get invoked when the
> connection is lost; I am not sure whether that is only for the
> server-to-client connection or also for the client-to-server connection.
>
> -Anil.
>
>
>
>
>
>
> On Wed, Jul 27, 2016 at 9:41 AM, Anthony Baker <aba...@pivotal.io> wrote:
>
>> I’m not convinced that this case couldn’t be further optimized.  If the
>> socket connection from the client to the dead server was terminated,
>> shouldn’t we ignore it when sending the next invocation?
>>
>> Anthony
>>
>> On Jul 26, 2016, at 12:02 PM, Michael Stolz <mst...@pivotal.io> wrote:
>>
>> Yep that lazy discovery on the client side is part of what makes it
>> possible to have very large numbers of connected clients on Geode. Without
>> that, every client would need to be notified even if they are just lying
>> dormant.
>>
>> --
>> Mike Stolz
>> Principal Engineer, GemFire Product Manager
>> Mobile: 631-835-4771
>>
>> On Tue, Jul 26, 2016 at 10:09 AM, Dan Smith <dsm...@pivotal.io> wrote:
>>
>>> Hi Olivier,
>>>
>>> Only the peers receive broadcasts about membership changes, not the
>>> clients. The client lazily discovers metadata from the locators and servers
>>> so I think what you are seeing is actually expected. In this case the
>>> client might still think it has a connection to the killed server, and
>>> would not actually discover that the connection is dead until it tries to
>>> send the new function to that server.
>>>
>>> -Dan
>>>
>>> On Tue, Jul 26, 2016 at 1:44 AM, Olivier Mallassi <
>>> olivier.malla...@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I would need your help to better understand the behavior I have
>>>> observed (regarding function execution with node failure)
>>>>
>>>> - I have a function (optimizeForWrite=true, hasResult=true, isHA=true)
>>>> that is executed (onRegion(mypartitionedRegion)) every two minutes
>>>> (poll frequency has been increased for test)
>>>> - then, just after an execution of the function, I kill -9 one of the
>>>> members (member-timeout=1)
>>>> - then, the function is executed again (around 2 min later). In that
>>>> case, the function is executed twice (on the remaining members).
>>>> In that case, context.isPossibleDuplicate() returns true, so I just
>>>> exit the function
>>>>
>>>>
>>>> if (functionContext.isPossibleDuplicate()) {
>>>>     logger.warning(....
>>>>     //exit
>>>>     functionContext.getResultSender().lastResult(null);
>>>> }
>>>>
>>>>
>>>> The function being HA, this is the expected behavior.
>>>>
>>>> Yet, what I do not understand is that it seems the "node failure" is
>>>> detected only when the function is executed, whereas the node failure has
>>>> already been broadcast (Membership cluster). Can someone give me more
>>>> insight into this? Is this a misconfiguration between client and locator,
>>>> so that clients are still not aware of the node failure?
>>>>
>>>>
>>>> Many thx.
>>>>
>>>> oliv/
>>>>
>>>
>>>
>>
>>
>
