> we do not plan to make cross-cluster calls into the services

If you are making local calls, I think there is no point in using Ignite services. Can you describe the use case - what are you trying to achieve?
On Tue, Jun 28, 2022 at 8:55 PM Arthur Naseef <artnas...@apache.org> wrote:

> Hello - I'm getting started with Ignite and looking seriously at using it
> for a specific use-case.
>
> Working on a Proof-Of-Concept (POC), I am finding a question related to
> performance, and wondering if the solution, using Ignite Services, is a
> good fit for the use-case.
>
> In my testing, I am getting the following timings:
>
> - Startup of 20,000 ignite services takes 30 seconds
> - Startup of 50,000 ignite services takes 250 seconds
> - The 2.5x increase from 20,000 to 50,000 yielded > 8x cost in startup
>   time (appears to be exponential growth)
>
> Watching the JVM during this time, I see the following:
>
> - Heap usage is not significant (do not see signs of GC)
> - CPU usage is only slightly increased - on the order of 20% total
>   (system has 12 cores/24 threads)
> - Network utilization is reasonable
> - Futex system call (measured with "strace -r") appears to be taking
>   the most time by far.
>
> The use-case involves the following:
>
> - Startup of up-to hundreds-of-thousands of services at cluster spin-up
> - Frequent, small adjustments to the services running over time
> - Need to rebalance when a new node joins the cluster, or an old one
>   leaves the cluster
> - Once the services are deployed, we do not plan to make cross-cluster
>   calls into the services (i.e. we do *not* plan to use ignite's
>   services().serviceProxy() on these)
> - Jobs don't look like a fit because these (1) are "long-running"
>   (actually periodically scheduled tasks) and (2) they need to redistribute
>   even after they start running
>
> This is starting to get long. I have more details to share. Here is the
> repo with the code being used to test, and a link to a wiki page with some
> of the details:
>
> https://github.com/opennms-forge/distributed-scheduling-poc/
>
> https://github.com/opennms-forge/distributed-scheduling-poc/wiki/Ignite-Startup-Performance
>
> Questions I have in mind:
>
> - Are services a good fit here? We expect to reach upwards of 500,000
>   services in a cluster with multiple nodes.
> - Any thoughts on tracking down the bottleneck and alleviating it? (I
>   have started taking timing measurements in the Ignite code)
>
> Stopping here - please ask questions and I'll gladly fill in details. Any
> tips are welcome, including ideas for tracking down just where the
> bottleneck exists.
>
> Art
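
As a rough way to quantify the growth in the timings quoted above (30 seconds for 20,000 services, 250 seconds for 50,000), the sketch below fits a power law t = c * n^k through the two measurements. The power-law model, the class name `ScalingEstimate`, and both method names are assumptions made for illustration; nothing here comes from Ignite or from the POC repository. Two data points cannot distinguish polynomial from exponential growth, so any projection from them is a loose guess.

```java
public class ScalingEstimate {

    /**
     * Exponent k of an ASSUMED power law t = c * n^k that passes through
     * the two measurements (n1, t1) and (n2, t2).
     */
    public static double exponent(double n1, double t1, double n2, double t2) {
        return Math.log(t2 / t1) / Math.log(n2 / n1);
    }

    /**
     * Extrapolates the time at size n from a known point (n2, t2),
     * under the same power-law assumption.
     */
    public static double project(double n2, double t2, double k, double n) {
        return t2 * Math.pow(n / n2, k);
    }

    public static void main(String[] args) {
        // Timings taken from the message above.
        double k = exponent(20_000, 30, 50_000, 250);
        System.out.printf("estimated exponent k = %.2f%n", k);

        // Hypothetical extrapolation to the 500,000 services mentioned above.
        System.out.printf("projected startup for 500,000 services: %.0f s%n",
                project(50_000, 250, k, 500_000));
    }
}
```

With these two timings the exponent comes out at roughly 2.3, which would be consistent with somewhat worse than quadratic growth. A third measurement (for example at 100,000 services) would help tell a power law apart from truly exponential growth.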