I think it is about time we take another look at our service functionality. All the points you have raised sound reasonable to me.
On Fri, Mar 23, 2018 at 6:01 PM, Denis Mekhanikov <dmekhani...@gmail.com> wrote:

> Igniters,
>
> I'd like to start a discussion on the Ignite service grid redesign.
> We have a number of problems in our current architecture that have to be
> addressed.
>
> Here are the most severe ones:
>
> One of them is the lack of a guarantee that a service is successfully
> deployed and ready for work by the time the IgniteServices.deploy*()
> methods return.
> Furthermore, if an exception is thrown from the Service.init() method, the
> deploying side is not able to receive it, or even to learn that the
> service is in an unusable state.
> So you may end up in a situation where you deploy a service without
> receiving any errors, then call one of the service's methods and hang
> indefinitely on that invocation.
> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-3392
>
> Another problem is locking during service deployment on an unstable
> topology.
> This issue is caused by missing updates in continuous query listeners on
> the internal cache.
> It is hard to reproduce, but it happens sometimes. We shouldn't allow
> deployment methods to hang without reporting anything.
> JIRA ticket: https://issues.apache.org/jira/browse/IGNITE-6259
>
> I think we should change the deployment procedure to make it more
> reliable.
> Moving from operating on the internal replicated service cache to sending
> custom discovery events seems to be a good idea.
> Service deployment may trigger a discovery event that makes the chosen
> nodes deploy the service, and the same event will notify the other nodes
> about the deployed service instances.
> This will eliminate the need for distributed transactions on the internal
> replicated system cache and make the service deployment protocol more
> transparent.
>
> There are a few points that should be taken into account, though.
>
> First of all, we can't wait for services to be deployed and initialised in
> the discovery thread.
> So we need to make the notification about the service deployment result
> asynchronous, presumably over the communication protocol.
> I can think of a procedure similar to the current exchange protocol, where
> service deployment is initiated by an initial discovery message,
> followed by asynchronous notifications from the hosting servers over
> communication. Finally, one more discovery message will notify all
> nodes about the service deployment result and the location of the deployed
> service instances. The coordinator will be responsible for collecting the
> deployment results in this scheme.
>
> Another problem is failover when some nodes fail during deployment
> or further work.
> The following cases should be handled:
>
>    1. coordinator failure during deployment;
>    2. failure of nodes that were chosen to host the service, during
>    deployment;
>    3. failure of nodes that contain deployed services, after the
>    deployment.
>
> The first case may be resolved either by continuing the deployment with a
> new coordinator or by cancelling it.
> The second case will require another node to be chosen and notified. Maybe
> another discovery message will be needed.
> The third case will require redeployment, so the coordinator should track
> topology changes and redeploy failed services.
>
> Another good improvement would be service versioning. This matter was
> already discussed in another thread:
> http://apache-ignite-developers.2346864.n4.nabble.com/Service-versioning-td20858.html
> Let's resume this discussion and state the final decision here.
> This feature is closely connected to peer class loading, which currently
> does not work for services.
> So service versioning should be implemented along with peer class loading.
> JIRA ticket for versioning:
> https://issues.apache.org/jira/browse/IGNITE-6069
> Peer class loading: https://issues.apache.org/jira/browse/IGNITE-975
>
> Please share your thoughts. Constructive criticism is highly appreciated.
>
> Denis
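
To make the result-collection part of the quoted proposal a bit more concrete, here is a rough single-JVM sketch of how I read it: hosting nodes initialise the service asynchronously (not in the discovery thread), report success or failure, and the coordinator gathers the reports so that the deploying side actually sees an init() error. Every name in it (DeploymentSketch, SketchService, NodeResult, deployOnNode, collect, deploy) is made up for illustration and is not an Ignite API; discovery and communication messages are only simulated with CompletableFuture, and failover is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch only: none of these types exist in Apache Ignite.
final class DeploymentSketch {
    /** Stand-in for a service whose init() may throw. */
    interface SketchService {
        void init() throws Exception;
    }

    /** Report sent by one hosting node; error == null means success. */
    static final class NodeResult {
        final String nodeId;
        final Throwable error;

        NodeResult(String nodeId, Throwable error) {
            this.nodeId = nodeId;
            this.error = error;
        }
    }

    /** A hosting node initialises the service on a separate thread and reports the outcome. */
    static CompletableFuture<NodeResult> deployOnNode(String nodeId, SketchService svc) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                svc.init();
                return new NodeResult(nodeId, null);
            } catch (Exception e) {
                // The failure is reported back instead of being lost on the hosting node.
                return new NodeResult(nodeId, e);
            }
        });
    }

    /** The coordinator waits for every report and keeps the failures, keyed by node id. */
    static Map<String, Throwable> collect(List<CompletableFuture<NodeResult>> reports) {
        Map<String, Throwable> failures = new LinkedHashMap<>();
        for (CompletableFuture<NodeResult> report : reports) {
            NodeResult r = report.join();
            if (r.error != null)
                failures.put(r.nodeId, r.error);
        }
        return failures;
    }

    /** Deploying side: returns only after all nodes reported, throws if any init() failed. */
    static void deploy(Map<String, SketchService> assignments) {
        List<CompletableFuture<NodeResult>> reports = new ArrayList<>();
        assignments.forEach((nodeId, svc) -> reports.add(deployOnNode(nodeId, svc)));

        Map<String, Throwable> failures = collect(reports);
        if (!failures.isEmpty())
            throw new IllegalStateException("Service deployment failed on nodes " + failures.keySet());
    }
}
```

With this shape, a call such as DeploymentSketch.deploy(assignments) either returns once every chosen node has finished init(), or throws an exception naming the nodes that failed, which is the guarantee the first problem in the proposal asks for. A real implementation would of course need the coordinator and node failover cases listed above.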