Re: Service grid redesign

Denis Magda Thu, 05 Apr 2018 12:37:40 -0700

Val,

Sounds like a great solution. I'm totally for it.


--
Denis

On Thu, Apr 5, 2018 at 12:32 PM, Valentin Kulichenko <
[email protected]> wrote:

> Denis,
>
> This is why I'm suggesting to use DeploymentSpi for this. The way I see
> it, instead of deploying classes on the local classpath, the user can
> deploy them in the storage that the SPI points to. If a class is updated in
> the storage, Ignite detects this and automatically restarts the service.
> This is a very simple and straightforward approach that doesn't require a
> lot of changes on our side and allows us to reuse the existing
> implementation of DeploymentSpi.
>
> -Val
>
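
A minimal sketch of the behaviour Val describes above, assuming the storage can be modelled as a plain dictionary and change detection as a digest comparison. Python is used only for brevity; `ServiceHost`, `poll` and the dictionary storage are hypothetical names and are not part of DeploymentSpi or any Ignite API.

```python
import hashlib


class ServiceHost:
    """Toy model of a node that restarts a service when its class changes in storage."""

    def __init__(self, storage):
        # storage: dict mapping class name -> class bytes.
        # It stands in for whatever storage the SPI points to (assumption).
        self.storage = storage
        self.running = {}   # service name -> (class name, digest of the deployed class)
        self.restarts = []  # names of services restarted so far, in order

    def _digest(self, class_name):
        return hashlib.sha256(self.storage[class_name]).hexdigest()

    def deploy(self, service_name, class_name):
        self.running[service_name] = (class_name, self._digest(class_name))

    def poll(self):
        """Compare stored class digests with deployed ones; restart on mismatch."""
        for service_name, (class_name, deployed) in list(self.running.items()):
            current = self._digest(class_name)
            if current != deployed:
                # "Restart" here only records the event and remembers the new digest.
                self.running[service_name] = (class_name, current)
                self.restarts.append(service_name)
        return self.restarts
```

In this model, nodes that do not host the service never look at the storage, which is the property asked about in the quoted question below: only hosting nodes react to a class update.
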
> On Thu, Apr 5, 2018 at 12:13 PM, Denis Magda <[email protected]> wrote:
>
> > >
> > > There is no need to deserialize services on the coordinator. It should
> > > only be able to calculate the assignments.
> > > *LazyServiceConfiguration* should be used to deliver the service
> > > configurations, just like it is done right now.
> >
> >
> > Can that configuration be tweaked over time, requiring the class to be
> > updated on all the nodes (if, for instance, someone wants to deploy the
> > next version of a service)? Just want to be sure we don't need to restart
> > the cluster nodes (that won't be used for service deployments) on
> > service-related configuration changes.
> >
> > --
> > Denis
> >
> > On Thu, Apr 5, 2018 at 8:18 AM, Denis Mekhanikov <[email protected]>
> > wrote:
> >
> > > Denis,
> > > There is no need to deserialize services on the coordinator. It should
> > > only be able to calculate the assignments.
> > > *LazyServiceConfiguration* should be used to deliver the service
> > > configurations, just like it is done right now.
> > >
> > > Val,
> > > Usage of DeploymentSpi is a good idea, I didn't think about this
> > > possibility.
> > > This is a viable alternative to peer-class-loading, though not as
> > > user-friendly.
> > > But if peer-class-loading is that hard to implement, then I vote for
> > > DeploymentSpi.
> > > As far as I understand, it won't require us to make any additional
> > > changes in Ignite, but will make users think about using a proper
> > > DeploymentSpi.
> > > Please correct me if I'm wrong.
> > > It would be good, though, to add some examples of service redeployment
> > > when the implementation class changes.
> > >
> > > Denis
> > >
> > > Thu, Apr 5, 2018 at 2:33, Valentin Kulichenko <
> > > [email protected]>:
> > >
> > > > I don't think peer class loading is even possible for services. I
> > > > believe we should reuse DeploymentSpi [1] for versioning.
> > > >
> > > > [1] https://apacheignite.readme.io/docs/deployment-spi
> > > >
> > > > -Val
> > > >
> > > > On Wed, Apr 4, 2018 at 12:52 PM, Denis Magda <[email protected]>
> > > wrote:
> > > >
> > > > > Sorry, that was me who renamed the IEP to "Oil Change in Service
> > > > > Grid". I was writing this email after the renaming. I like that title
> > > > > more because it's fun and highlights what we intend to do: cleaning
> > > > > up our service grid engine and powering it up with new "liquid" (a
> > > > > new communication and deployment approach not available before).
> > > > >
> > > > > Denis
> > > > >
> > > > >
> > > > > > This message contains serialized service instance and its
> > > > > > configuration.
> > > > > > It is delivered to the coordinator node first, that calculates the
> > > > > > service deployment assignments and adds this information to the
> > > > > > message.
> > > > >
> > > > >
> > > > > I would consider using a NodeFilter first to decide where a service
> > > > > can potentially be deployed. Otherwise, we would require service
> > > > > classes to be on every node (every node might become a coordinator),
> > > > > which is not a desirable requirement.
> > > > >
> > > > >
> > > > > As for the peer-class-loading, I would back Dmitriy up here. Let's
> > > > > at least not focus on this task for now. We should design service
> > > > > versioning in the right way first and support it.
> > > > >
> > > > > --
> > > > > Denis
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Apr 4, 2018 at 12:20 PM, Dmitriy Setrakyan <
> > > > [email protected]>
> > > > > wrote:
> > > > >
> > > > > > Here is the correct link:
> > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-17%3A+Oil+Change+in+Service+Grid
> > > > > >
> > > > > > I have looked at the tickets there, and I believe that we should
> > > > > > not support peer-deployment for services. It is very hard and I do
> > > > > > not think we should even try.
> > > > > >
> > > > > > I am proposing closing this ticket as Won't Fix -
> > > > > > https://issues.apache.org/jira/browse/IGNITE-975
> > > > > >
> > > > > > D.
> > > > > >
> > > > > > On Wed, Apr 4, 2018 at 5:39 AM, Denis Mekhanikov <
> > > > [email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Vyacheslav,
> > > > > > >
> > > > > > > I've just posted my first draft of the IEP:
> > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-17%3A+Service+grid+improvements
> > > > > > > It's not finished yet, but you can get the idea from it.
> > > > > > > If you have any thoughts, please let me know and I'll add them
> > > > > > > to the IEP.
> > > > > > >
> > > > > > > Denis
> > > > > > >
> > > > > > > Wed, Apr 4, 2018 at 13:09, Vyacheslav Daradur <
> > > [email protected]
> > > > >:
> > > > > > >
> > > > > > > > Denis, thanks for the link.
> > > > > > > >
> > > > > > > > I looked through the task and I think that I understand your
> > > > > > > > redesign point now.
> > > > > > > >
> > > > > > > > Do you have a clear plan or IEP for the whole redesign?
> > > > > > > >
> > > > > > > > I'm interested in this component and I'd like to take part in
> > > > > > > > the development.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Apr 2, 2018 at 2:55 PM, Denis Mekhanikov <
> > > > > > [email protected]>
> > > > > > > > wrote:
> > > > > > > > > Vyacheslav,
> > > > > > > > >
> > > > > > > > > > The service deployment design based on a replicated utility
> > > > > > > > > > cache has proven to be unstable and deadlock-prone.
> > > > > > > > > > You can find a list of related JIRA issues in my previous
> > > > > > > > > > letter.
> > > > > > > > > >
> > > > > > > > > > The intention behind it is similar to the binary metadata
> > > > > > > > > > redesign that happened in the following ticket: IGNITE-4157
> > > > > > > > > > <https://issues.apache.org/jira/browse/IGNITE-4157>
> > > > > > > > > > This change in the service deployment procedure will
> > > > > > > > > > eliminate the need for another internal replicated cache
> > > > > > > > > > and make service deployment more reliable on an unstable
> > > > > > > > > > topology.
> > > > > > > > >
> > > > > > > > > Denis
> > > > > > > > >
> > > > > > > > > > Tue, Mar 27, 2018 at 23:21, Vyacheslav Daradur <
> > > > > [email protected]
> > > > > > >:
> > > > > > > > >
> > > > > > > > >> Hi, Denis Mekhanikov!
> > > > > > > > >>
> > > > > > > > > >> As far as I know, Ignite services are based on IgniteCache,
> > > > > > > > > >> and we have all its features. We can use listeners or
> > > > > > > > > >> continuous queries for deployment synchronization.
> > > > > > > > >>
> > > > > > > > > >> Why do you want to use the discovery layer for that?
> > > > > > > > >>
> > > > > > > > > >> One more thing: we can use a baseline approach for services.
> > > > > > > > > >> That means *IgniteService.deploy()* returns a ready-to-work
> > > > > > > > > >> service after deployment on the baseline nodes, and the
> > > > > > > > > >> service is deployed to other nodes on demand, for example
> > > > > > > > > >> when the load on the deployed service becomes high.
> > > > > > > > >>
> > > > > > > > > >> About versioning, maybe it makes sense to extend the public
> > > > > > > > > >> API: IgniteServices.service(name, *version*)?
> > > > > > > > > >>
> > > > > > > > > >> At the first deployment, we can compute the service's
> > > > > > > > > >> hashcode (just as an example) and store it. When a new
> > > > > > > > > >> deployment request arrives for a service with an existing
> > > > > > > > > >> name, we compute the new service's hashcode and compare the
> > > > > > > > > >> two. If the hashcodes differ, we deploy the new service as a
> > > > > > > > > >> service with a different version.
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >> On Fri, Mar 23, 2018 at 10:03 PM, Denis Magda <
> > > > [email protected]>
> > > > > > > > wrote:
> > > > > > > > >> > Denis,
> > > > > > > > >> >
> > > > > > > > > >> > Thanks for the extensive analysis. There is a lot of room
> > > > > > > > > >> > for optimizations on the service grid side.
> > > > > > > > >> >
> > > > > > > > >> > Yakov, Sam, Alex G.,
> > > > > > > > >> >
> > > > > > > > > >> > How do you like the idea of using the discovery protocol
> > > > > > > > > >> > for exchanging service grid system messages? Any pitfalls?
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> > --
> > > > > > > > >> > Denis
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> > On Fri, Mar 23, 2018 at 8:01 AM, Denis Mekhanikov <
> > > > > > > > [email protected]
> > > > > > > > >> >
> > > > > > > > >> > wrote:
> > > > > > > > >> >
> > > > > > > > >> >> Igniters,
> > > > > > > > >> >>
> > > > > > > > > >> >> I'd like to start a discussion on Ignite service grid
> > > > > > > > > >> >> redesign.
> > > > > > > > > >> >> We have a number of problems in our current architecture
> > > > > > > > > >> >> that have to be addressed.
> > > > > > > > >> >>
> > > > > > > > >> >> Here are the most severe ones:
> > > > > > > > >> >>
> > > > > > > > > >> >> One of them is the lack of a guarantee that a service is
> > > > > > > > > >> >> successfully deployed and ready for work by the time the
> > > > > > > > > >> >> *IgniteService.deploy*()* methods return.
> > > > > > > > > >> >> Furthermore, if an exception is thrown from the
> > > > > > > > > >> >> *Service.init()* method, the deploying side is not able to
> > > > > > > > > >> >> receive it, or even to understand that the service is in an
> > > > > > > > > >> >> unusable state.
> > > > > > > > > >> >> So you may end up in a situation where you deployed a
> > > > > > > > > >> >> service without receiving any errors, then called one of the
> > > > > > > > > >> >> service's methods and hung indefinitely on this invocation.
> > > > > > > > > >> >> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-3392
> > > > > > > > >> >>
> > > > > > > > > >> >> Another problem is locking during service deployment on an
> > > > > > > > > >> >> unstable topology.
> > > > > > > > > >> >> This issue is caused by missing updates in continuous query
> > > > > > > > > >> >> listeners on the internal cache.
> > > > > > > > > >> >> It is hard to reproduce, but it happens sometimes. We
> > > > > > > > > >> >> shouldn't allow deployment methods to hang without
> > > > > > > > > >> >> reporting anything.
> > > > > > > > > >> >> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-6259
> > > > > > > > >> >>
> > > > > > > > > >> >> I think we should change the deployment procedure to make
> > > > > > > > > >> >> it more reliable.
> > > > > > > > > >> >> Moving from operating over the internal replicated service
> > > > > > > > > >> >> cache to sending custom discovery events seems to be a good
> > > > > > > > > >> >> idea.
> > > > > > > > > >> >> Service deployment may trigger a discovery event that will
> > > > > > > > > >> >> make the chosen nodes deploy the service, and the same event
> > > > > > > > > >> >> will notify other nodes about the deployed service
> > > > > > > > > >> >> instances.
> > > > > > > > > >> >> It will eliminate the need for distributed transactions on
> > > > > > > > > >> >> the internal replicated system cache and make the service
> > > > > > > > > >> >> deployment protocol more transparent.
> > > > > > > > >> >>
> > > > > > > > > >> >> There are a few points that should be taken into account
> > > > > > > > > >> >> though.
> > > > > > > > >> >>
> > > > > > > > > >> >> First of all, we can't wait for services to be deployed
> > > > > > > > > >> >> and initialised in the discovery thread.
> > > > > > > > > >> >> So we need to make the notification about the service
> > > > > > > > > >> >> deployment result asynchronous, presumably over the
> > > > > > > > > >> >> communication protocol.
> > > > > > > > > >> >> I can think of a procedure similar to the current exchange
> > > > > > > > > >> >> protocol, where service deployment is initiated with an
> > > > > > > > > >> >> initial discovery message, followed by asynchronous
> > > > > > > > > >> >> notifications from the hosting servers over communication.
> > > > > > > > > >> >> And finally, one more discovery message will notify all
> > > > > > > > > >> >> nodes about the service deployment result and the location
> > > > > > > > > >> >> of the deployed service instances. The coordinator will be
> > > > > > > > > >> >> responsible for collecting the deployment results in this
> > > > > > > > > >> >> scheme.
> > > > > > > > >> >>
> > > > > > > > > >> >> Another problem is failover in the case when some nodes
> > > > > > > > > >> >> fail during deployment or further work.
> > > > > > > > > >> >> The following cases should be handled:
> > > > > > > > > >> >>
> > > > > > > > > >> >>    1. coordinator failure during deployment;
> > > > > > > > > >> >>    2. failure of nodes that were chosen to host the
> > > > > > > > > >> >>    service, during deployment;
> > > > > > > > > >> >>    3. failure of nodes that contain deployed services,
> > > > > > > > > >> >>    after the deployment.
> > > > > > > > > >> >>
> > > > > > > > > >> >> The first case may be resolved either by continuing the
> > > > > > > > > >> >> deployment with a new coordinator or by cancelling it.
> > > > > > > > > >> >> The second case will require another node to be chosen and
> > > > > > > > > >> >> notified. Maybe another discovery message will be needed.
> > > > > > > > > >> >> The third case will require redeployment, so the
> > > > > > > > > >> >> coordinator should track topology changes and redeploy
> > > > > > > > > >> >> failed services.
> > > > > > > > >> >>
> > > > > > > > > >> >> Another good improvement would be service versioning. This
> > > > > > > > > >> >> matter was already discussed in another thread:
> > > > > > > > > >> >> http://apache-ignite-developers.2346864.n4.nabble.com/Service-versioning-td20858.html
> > > > > > > > > >> >> Let's resume this discussion and state the final decision
> > > > > > > > > >> >> here.
> > > > > > > > > >> >> This feature is closely connected to peer class loading,
> > > > > > > > > >> >> which currently does not work for services.
> > > > > > > > > >> >> So service versioning should be implemented along with
> > > > > > > > > >> >> peer class loading.
> > > > > > > > > >> >> JIRA ticket for versioning:
> > > > > > > > > >> >> https://issues.apache.org/jira/browse/IGNITE-6069
> > > > > > > > > >> >> Peer class loading: https://issues.apache.org/jira/browse/IGNITE-975
> > > > > > > > >> >>
> > > > > > > > > >> >> Please share your thoughts. Constructive criticism is
> > > > > > > > > >> >> highly appreciated.
> > > > > > > > >> >>
> > > > > > > > >> >> Denis
> > > > > > > > >> >>
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >> --
> > > > > > > > >> Best Regards, Vyacheslav D.
> > > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Best Regards, Vyacheslav D.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
