Hi Mikhail, proposed changes make sense to me.
I left some comments to the pr.
Thank you!

On Wed, May 6, 2020 at 2:28 PM Mikhail Petrov <pmgheap....@gmail.com> wrote:

> Hello, Igniters.
>
> I am working on IGNITE-12894 - [1]. It seems that it has the root cause
> which is similar to the problem described in this thread.
>
> To solve these problems, I propose to change the behavior of the
> IgniteServiceProcessor#serviceTopology if the timeout argument is 0.
> At the moment, IgniteServiceProcessor#serviceTopology returns the
> topology immediately, regardless of whether it was initialized or not in
> this case. I propose to wait for the service topology to be initialized
> if the requested service is already registered on local node, but the
> full message was not received from the coordinator yet.
>
> So the final behavior of IgniteServices#serviceProxy() will be:
> 1. If the timeout is specified - it waits for the topology over a
> specified timeout even if the requested service was not registered yet.
> As in current implementation.
>
> 2. If the timeout is not specified - if service was not registered it
> fails immediately, else it is waiting for the topology initialization
> (full message from the coordinator) if needed.
>
> Here is PR with the implementation of the described proposal - [2].
>
> WDYT?
>
> [1] - https://issues.apache.org/jira/browse/IGNITE-12894
> [2] - https://github.com/apache/ignite/pull/7771
>
> On 30.12.2019 13:03, Alexey Goncharuk wrote:
> > Agree, sounds like a plan, thanks for taking over!
> >
> > пн, 30 дек. 2019 г. в 13:00, Vyacheslav Daradur <daradu...@gmail.com>:
> >
> >> Alexey,
> >>
> >> I would not make it default in the current implementation.
> >>
> >> Waiting of proxies on non-deployment-initiator nodes should be
> >> improved - additional checks are required:
> >> 1) We should not wait if requested service has not been submitted to
> >> deploy (when there is no info about such service)
> >> 2) If service deployment failed - getting proxy should be failed or
> >> interrupted as well (do not wait for all available timeout)
> >>
> >> Let's schedule this improvement to next release, I'll try to find a
> >> time to implement it.
> >>
> >> What do you think?
> >>
> >> On Mon, Dec 30, 2019 at 12:05 PM Alexey Goncharuk
> >> <alexey.goncha...@gmail.com> wrote:
> >>> Vyacheslav, thanks for the explanation, makes sense to me.
> >>>
> >>> I was thinking though, should we make the behavior with the timeout
> >> default
> >>> for all proxies?
> >>>
> >>> Just my opinion - I think for a user it would be hard to control which
> >> node
> >>> deploys the service, especially if multiple nodes deploy it
> concurrently.
> >>> Most likely users will end up always calling the second option of the
> >> proxy
> >>> (with the timeout), so, perhaps, make it default?
> >>>
> >>> вс, 29 дек. 2019 г. в 21:05, Vyacheslav Daradur <daradu...@gmail.com>:
> >>>
> >>>> Alexey,
> >>>>
> >>>> I've prepared pr [1] to show our proxy invocation guarantees and to
> >>>> avoid misunderstanding.
> >>>>
> >>>> Please, let me know if you think that we should improve our guaranties
> >>>> in some cases.
> >>>>
> >>>> [1] https://github.com/apache/ignite/pull/7213
> >>>>
> >>>> On Tue, Dec 24, 2019 at 7:27 PM Vyacheslav Daradur <
> >> daradu...@gmail.com>
> >>>> wrote:
> >>>>>> even the local deployment looks broken: if a compute job
> >>>>>> is sent to a remote node after the service deployment
> >>>>> This is a different case and covered by retries:
> >>>>> * If you deploy a service from node A to node B, then take a proxy
> >>>>> from node A (deployment initiator) it should NOT fail even if node B
> >>>>> has not received yet a message that deployment finished successfully,
> >>>>> because of proxy invocation retries.
> >>>>>
> >>>>> Look like It's better to describe all these cases on the wiki.
> >>>>>
> >>>>>> Should we schedule this ticket for the further work on Services
> >> IEP?
> >>>>> If it is a frequent use-case we definitely should implement it.
> >>>>>
> >>>>>
> >>>>> On Tue, Dec 24, 2019 at 6:55 PM Alexey Goncharuk
> >>>>> <alexey.goncha...@gmail.com> wrote:
> >>>>>> Ok, got it.
> >>>>>>
> >>>>>> I agree that this is consistent with the old behavior, but this is
> >> the
> >>>> kind
> >>>>>> of errors we wanted to get rid of when we started the IEP. From the
> >>>>>> user perspective, even the local deployment looks broken: if a
> >> compute
> >>>> job
> >>>>>> is sent to a remote node after the service deployment, the job
> >>>> execution
> >>>>>> may fail due to this error.
> >>>>>>
> >>>>>> Should we schedule this ticket for the further work on Services
> >> IEP?
> >>>>>> вт, 24 дек. 2019 г. в 18:49, Vyacheslav Daradur <
> >> daradu...@gmail.com>:
> >>>>>>> Not sure that "user fallback" is the right definition, it is not
> >> new
> >>>>>>> behaviour in comparison with legacy implementation.
> >>>>>>>
> >>>>>>> Our synchronous deployment provides guaranties for a deployment
> >>>>>>> initiator to be able to start work with service immediately after
> >>>>>>> deployment finished successfully.
> >>>>>>> For not the deployment initiator we can't provide such guarantees
> >>>> now,
> >>>>>>> because of unknown deployment result and possibly fail.
> >>>>>>>
> >>>>>>> In this case, a reasonable timeout might be an acceptable
> >> solution.
> >>>>>>> We can improve guaranties in future releases, but there is an
> >> open
> >>>>>>> question:
> >>>>>>> - how long taking of proxy should wait? - deployment of "heavy"
> >>>>>>> service may take a while
> >>>>>>>
> >>>>>>> On Tue, Dec 24, 2019 at 6:19 PM Alexey Goncharuk
> >>>>>>> <alexey.goncha...@gmail.com> wrote:
> >>>>>>>> What should be the user fallback in this case? Retry
> >> infinitely? Is
> >>>>>>> there a
> >>>>>>>> way to wait for the proper deployment?
> >>>>>>>>
> >>>>>>>> вт, 24 дек. 2019 г. в 12:41, Vyacheslav Daradur <
> >>>> daradu...@gmail.com>:
> >>>>>>>>> I’ll take a look at the end of the week.
> >>>>>>>>>
> >>>>>>>>> There is one more use-case:
> >>>>>>>>> * if you initiate deployment from node A, but getting proxy
> >> on
> >>>> node B
> >>>>>>>>> (which isn’t deployment initiator) to call service on node A
> >> -
> >>>> it may
> >>>>>>> fail
> >>>>>>>>> with "service not found", this is expected behaviour because
> >> we
> >>>> didn't
> >>>>>>>>> provide such guarantees.
> >>>>>>>>>
> >>>>>>>>> API of getting proxy with timeout should be used in this
> >> case:
> >>>>>>>>> T serviceProxy(String name, Class<? super T> svcItf, boolean
> >>>> sticky,
> >>>>>>> long
> >>>>>>>>> timeout)
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> вт, 24 дек. 2019 г. в 12:11, Alexey Goncharuk <
> >>>>>>> alexey.goncha...@gmail.com
> >>>>>>>>>> :
> >>>>>>>>>> Well, this is exactly the case. The service is deployed
> >> from
> >>>> node A,
> >>>>>>> the
> >>>>>>>>>> proxy is created on node B, and "service not found"
> >> exception
> >>>> gets
> >>>>>>> thrown
> >>>>>>>>>> to a user anyway. Perhaps, the retry happens too fast?
> >>>>>>>>>>
> >>>>>>>>>> Created a ticket [1].
> >>>>>>>>>>
> >>>>>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-12490
> >>>>>>>>>>
> >>>>>>>>>> пн, 23 дек. 2019 г. в 22:08, Vyacheslav Daradur <
> >>>> daradu...@gmail.com
> >>>>>>>> :
> >>>>>>>>>>> Hi, Alexey
> >>>>>>>>>>>
> >>>>>>>>>>> Please attach a reproducer to the ticket.
> >>>>>>>>>>>
> >>>>>>>>>>> As far as I remember we have the following behaviour for
> >> the
> >>>>>>> proxies:
> >>>>>>>>>>> Let's assume you have deployed service from node A, then:
> >>>>>>>>>>> * if you invoke service locally from node A - it is
> >>>> guaranteed to
> >>>>>>>>>>> service to be deployed and ready to work
> >>>>>>>>>>> * if you take a proxy from node A to remote node B right
> >>>> after
> >>>>>>> deploy
> >>>>>>>>>>> - there is might be a race between disco-spi (a message
> >> which
> >>>>>>> releases
> >>>>>>>>>>> deployed service)  and comm-spi (remote call works via
> >>>> Compute over
> >>>>>>>>>>> comm-spi), but it shouldn't affect end-users because the
> >>>> failed
> >>>>>>>>>>> request will be retried in this case
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Dec 23, 2019 at 6:55 PM Alexey Goncharuk
> >>>>>>>>>>> <alexey.goncha...@gmail.com> wrote:
> >>>>>>>>>>>> Nikolay,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Yes, I've rechecked, the new service processor is being
> >>>> used.
> >>>>>>> I'll
> >>>>>>>>>> file a
> >>>>>>>>>>>> bug shortly.
> >>>>>>>>>>>>
> >>>>>>>>>>>> пн, 23 дек. 2019 г. в 17:33, Николай Ижиков <
> >>>> nizhi...@apache.org
> >>>>>>>> :
> >>>>>>>>>>>>> Alexey, are you sure, you are testing new service
> >>>> framework?
> >>>>>>>>>>>>> Is yes - you definitely should file a bug.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> 23 дек. 2019 г., в 17:02, Alexey Goncharuk <
> >>>>>>>>>>> alexey.goncha...@gmail.com>
> >>>>>>>>>>>>> написал(а):
> >>>>>>>>>>>>>> Igniters,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I have a question based on one of my recent tests
> >>>> debugging.
> >>>>>>>>>>>>>> The test is related to Ignite services. I noticed
> >> that
> >>>>>>> sometimes
> >>>>>>>>> a
> >>>>>>>>>>> proxy
> >>>>>>>>>>>>>> invocation of a newly deployed service fails
> >> because
> >>>> the
> >>>>>>> service
> >>>>>>>>>>> cannot
> >>>>>>>>>>>>> be
> >>>>>>>>>>>>>> found. I managed to reduce the test to a simple
> >> "start
> >>>> two
> >>>>>>> nodes,
> >>>>>>>>>>> deploy
> >>>>>>>>>>>>> a
> >>>>>>>>>>>>>> service, create a proxy, invoke the proxy"
> >> scenario.
> >>>> The
> >>>>>>> proxy
> >>>>>>>>>>> invocation
> >>>>>>>>>>>>>> fails in about ~80% of runs.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> As far as I remember, the new discovery-based
> >> service
> >>>>>>> deployment
> >>>>>>>>>> was
> >>>>>>>>>>>>>> supposed to be synchronous, so not only non-proxy
> >>>> service
> >>>>>>>>> instances
> >>>>>>>>>>>>> should
> >>>>>>>>>>>>>> work, but the proxies as well. Was my understanding
> >>>> correct?
> >>>>>>>>>> Should I
> >>>>>>>>>>>>> file
> >>>>>>>>>>>>>> a bug for the observed behavior?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --AG
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Best Regards, Vyacheslav D.
> >>>>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best Regards, Vyacheslav D.
> >>>>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Best Regards, Vyacheslav D.
> >>>>
> >>>>
> >>>> --
> >>>> Best Regards, Vyacheslav D.
> >>>>
> >>
> >>
> >> --
> >> Best Regards, Vyacheslav D.
> >>
>


-- 
Best Regards,
Vyacheslav D.

Reply via email to