Agree, sounds like a plan, thanks for taking over!

Mon, Dec 30, 2019 at 13:00, Vyacheslav Daradur <daradu...@gmail.com>:
> Alexey,
>
> I would not make it default in the current implementation.
>
> Waiting for proxies on non-deployment-initiator nodes should be improved; additional checks are required:
> 1) We should not wait if the requested service has not been submitted for deployment (when there is no info about such a service)
> 2) If the service deployment failed, getting the proxy should fail or be interrupted as well (do not wait for the whole available timeout)
>
> Let's schedule this improvement for the next release; I'll try to find time to implement it.
>
> What do you think?
>
> On Mon, Dec 30, 2019 at 12:05 PM Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
> >
> > Vyacheslav, thanks for the explanation, makes sense to me.
> >
> > I was thinking though, should we make the behavior with the timeout the default for all proxies?
> >
> > Just my opinion - I think for a user it would be hard to control which node deploys the service, especially if multiple nodes deploy it concurrently. Most likely users will end up always calling the second option of the proxy (with the timeout), so, perhaps, make it default?
> >
> > Sun, Dec 29, 2019 at 21:05, Vyacheslav Daradur <daradu...@gmail.com>:
> >
> > > Alexey,
> > >
> > > I've prepared PR [1] to show our proxy invocation guarantees and to avoid misunderstanding.
> > >
> > > Please let me know if you think that we should improve our guarantees in some cases.
> > >
> > > [1] https://github.com/apache/ignite/pull/7213
> > >
> > > On Tue, Dec 24, 2019 at 7:27 PM Vyacheslav Daradur <daradu...@gmail.com> wrote:
> > > >
> > > > > even the local deployment looks broken: if a compute job is sent to a remote node after the service deployment
> > > >
> > > > This is a different case and is covered by retries:
> > > > * If you deploy a service from node A to node B and then take a proxy from node A (the deployment initiator), it should NOT fail even if node B has not yet received the message that deployment finished successfully, because of proxy invocation retries.
> > > >
> > > > Looks like it's better to describe all these cases on the wiki.
> > > >
> > > > > Should we schedule this ticket for the further work on Services IEP?
> > > >
> > > > If it is a frequent use case we definitely should implement it.
> > > >
> > > > On Tue, Dec 24, 2019 at 6:55 PM Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
> > > > >
> > > > > Ok, got it.
> > > > >
> > > > > I agree that this is consistent with the old behavior, but this is the kind of error we wanted to get rid of when we started the IEP. From the user perspective, even the local deployment looks broken: if a compute job is sent to a remote node after the service deployment, the job execution may fail due to this error.
> > > > >
> > > > > Should we schedule this ticket for the further work on Services IEP?
> > > > >
> > > > > Tue, Dec 24, 2019 at 18:49, Vyacheslav Daradur <daradu...@gmail.com>:
> > > > >
> > > > > > Not sure that "user fallback" is the right definition; it is not new behaviour in comparison with the legacy implementation.
> > > > > >
> > > > > > Our synchronous deployment guarantees that a deployment initiator is able to start working with the service immediately after deployment has finished successfully.
> > > > > > For a node that is not the deployment initiator we can't provide such guarantees now, because the deployment result is unknown and the deployment may fail.
> > > > > >
> > > > > > In this case, a reasonable timeout might be an acceptable solution.
> > > > > >
> > > > > > We can improve the guarantees in future releases, but there is an open question:
> > > > > > - how long should taking a proxy wait? - deployment of a "heavy" service may take a while
> > > > > >
> > > > > > On Tue, Dec 24, 2019 at 6:19 PM Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
> > > > > > >
> > > > > > > What should be the user fallback in this case? Retry infinitely? Is there a way to wait for the proper deployment?
> > > > > > >
> > > > > > > Tue, Dec 24, 2019 at 12:41, Vyacheslav Daradur <daradu...@gmail.com>:
> > > > > > >
> > > > > > > > I'll take a look at the end of the week.
> > > > > > > >
> > > > > > > > There is one more use case:
> > > > > > > > * if you initiate deployment from node A but get a proxy on node B (which isn't the deployment initiator) to call the service on node A, it may fail with "service not found"; this is expected behaviour because we didn't provide such guarantees.
> > > > > > > >
> > > > > > > > The API for getting a proxy with a timeout should be used in this case:
> > > > > > > > T serviceProxy(String name, Class<? super T> svcItf, boolean sticky, long timeout)
> > > > > > > >
> > > > > > > > Tue, Dec 24, 2019 at 12:11, Alexey Goncharuk <alexey.goncha...@gmail.com>:
> > > > > > > >
> > > > > > > > > Well, this is exactly the case.
> > > > > > > > > The service is deployed from node A, the proxy is created on node B, and the "service not found" exception gets thrown to the user anyway. Perhaps the retry happens too fast?
> > > > > > > > >
> > > > > > > > > Created a ticket [1].
> > > > > > > > >
> > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-12490
> > > > > > > > >
> > > > > > > > > Mon, Dec 23, 2019 at 22:08, Vyacheslav Daradur <daradu...@gmail.com>:
> > > > > > > > >
> > > > > > > > > > Hi, Alexey
> > > > > > > > > >
> > > > > > > > > > Please attach a reproducer to the ticket.
> > > > > > > > > >
> > > > > > > > > > As far as I remember we have the following behaviour for the proxies:
> > > > > > > > > >
> > > > > > > > > > Let's assume you have deployed a service from node A, then:
> > > > > > > > > > * if you invoke the service locally from node A, the service is guaranteed to be deployed and ready to work
> > > > > > > > > > * if you take a proxy from node A to remote node B right after deployment, there might be a race between disco-spi (a message which releases the deployed service) and comm-spi (the remote call works via Compute over comm-spi), but it shouldn't affect end users because the failed request will be retried in this case
> > > > > > > > > >
> > > > > > > > > > On Mon, Dec 23, 2019 at 6:55 PM Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Nikolay,
> > > > > > > > > > >
> > > > > > > > > > > Yes, I've rechecked, the new service processor is being used. I'll file a bug shortly.
> > > > > > > > > > >
> > > > > > > > > > > Mon, Dec 23, 2019 at 17:33, Николай Ижиков <nizhi...@apache.org>:
> > > > > > > > > > >
> > > > > > > > > > > > Alexey, are you sure you are testing the new service framework?
> > > > > > > > > > > >
> > > > > > > > > > > > If yes - you definitely should file a bug.
> > > > > > > > > > > >
> > > > > > > > > > > > On Dec 23, 2019, at 17:02, Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Igniters,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I have a question based on one of my recent test debugging sessions.
> > > > > > > > > > > > >
> > > > > > > > > > > > > The test is related to Ignite services. I noticed that sometimes a proxy invocation of a newly deployed service fails because the service cannot be found. I managed to reduce the test to a simple "start two nodes, deploy a service, create a proxy, invoke the proxy" scenario. The proxy invocation fails in about 80% of runs.
> > > > > > > > > > > > >
> > > > > > > > > > > > > As far as I remember, the new discovery-based service deployment was supposed to be synchronous, so not only non-proxy service instances should work, but the proxies as well. Was my understanding correct? Should I file a bug for the observed behavior?
> > > > > > > > > > > > >
> > > > > > > > > > > > > --AG
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Best Regards, Vyacheslav D.
> > > > > >
> > > > > > --
> > > > > > Best Regards, Vyacheslav D.
> > > >
> > > > --
> > > > Best Regards, Vyacheslav D.
> > >
> > > --
> > > Best Regards, Vyacheslav D.
>
> --
> Best Regards, Vyacheslav D.
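
For illustration only, the two additional checks proposed in the Dec 30 message (do not wait for a service that was never submitted for deployment, and stop waiting as soon as deployment fails) could look roughly like the plain-Java sketch below. This is not Ignite code: `DeploymentRegistrySketch`, `submitted`, `deployed`, `failed` and `awaitService` are hypothetical names made up for the example, and the sketch assumes each node can observe per-service deployment state, which the thread does not confirm.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical model of deployment state as seen by one node; not an Ignite API. */
class DeploymentRegistrySketch {
    /** One future per service name; an entry exists only after deployment was submitted. */
    private final Map<String, CompletableFuture<Object>> deployments = new ConcurrentHashMap<>();

    /** Records that deployment of the named service has been requested. */
    void submitted(String name) {
        deployments.putIfAbsent(name, new CompletableFuture<>());
    }

    /** Records a successful deployment and releases any waiters. */
    void deployed(String name, Object service) {
        deployments.computeIfAbsent(name, n -> new CompletableFuture<>()).complete(service);
    }

    /** Records a failed deployment and releases any waiters with the error. */
    void failed(String name, Throwable error) {
        deployments.computeIfAbsent(name, n -> new CompletableFuture<>()).completeExceptionally(error);
    }

    /** Waits up to timeoutMs for the service, failing fast in the two cases from the thread. */
    Object awaitService(String name, long timeoutMs) {
        CompletableFuture<Object> fut = deployments.get(name);

        // Check 1: nothing was submitted under this name, so waiting cannot help.
        if (fut == null)
            throw new IllegalStateException("Service not submitted for deployment: " + name);

        try {
            // Returns as soon as the deployment completes, or gives up after the timeout.
            return fut.get(timeoutMs, TimeUnit.MILLISECONDS);
        }
        catch (ExecutionException e) {
            // Check 2: deployment failed, so the wait ends immediately with the cause.
            throw new IllegalStateException("Service deployment failed: " + name, e.getCause());
        }
        catch (TimeoutException e) {
            throw new IllegalStateException("Timed out waiting for service: " + name, e);
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for service: " + name, e);
        }
    }
}
```

With this shape, only a deployment that is submitted but still in progress ever consumes the timeout; how a real node would learn about submissions and failures is left open here, as it is in the thread.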