Re: Performance and large numbers of servers

Arthur Naseef Tue, 28 Jun 2022 12:27:44 -0700

Yes.  The "services" in our case will be schedules that periodically
perform fast operations.

For example, a service could be "ping this device every <x> seconds".
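
To make the "ping this device every <x> seconds" idea concrete, here is a
minimal local sketch using only the JDK's ScheduledExecutorService. It is not
Ignite code and not taken from the POC repository; the class and method names
(PeriodicPingSketch, schedulePing) are hypothetical, and the "ping" is just a
Runnable supplied by the caller.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only: a node-local scheduler for short periodic tasks.
final class PeriodicPingSketch {
    private final ScheduledExecutorService scheduler;

    PeriodicPingSketch(int threads) {
        this.scheduler = Executors.newScheduledThreadPool(threads);
    }

    // Runs `ping` once immediately and then once per period until the
    // returned future is cancelled or the scheduler is shut down.
    ScheduledFuture<?> schedulePing(Runnable ping, long period, TimeUnit unit) {
        Runnable guarded = () -> {
            try {
                ping.run();
            } catch (RuntimeException e) {
                // scheduleAtFixedRate cancels all later runs if a run throws,
                // so failures are caught here to keep the schedule alive.
                System.err.println("ping failed: " + e.getMessage());
            }
        };
        return scheduler.scheduleAtFixedRate(guarded, 0, period, unit);
    }

    // Stops all scheduled pings and releases the worker threads.
    void shutdown() {
        scheduler.shutdownNow();
    }
}
```

Cancelling the returned future stops a single ping schedule, which roughly
corresponds to the "frequent, small adjustments" described in the use-case.
Distribution across nodes and rebalancing are deliberately out of scope here.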

Art

On Tue, Jun 28, 2022 at 12:20 PM Pavel Tupitsyn <[email protected]>
wrote:

> > we do not plan to make cross-cluster calls into the services
>
> If you are making local calls, I think there is no point in using Ignite
> services.
> Can you describe the use case - what are you trying to achieve?
>
> On Tue, Jun 28, 2022 at 8:55 PM Arthur Naseef <[email protected]>
> wrote:
>
>> Hello - I'm getting started with Ignite and looking seriously at using it
>> for a specific use-case.
>>
>> While working on a Proof-Of-Concept (POC), I have run into a question
>> related to performance, and I am wondering whether Ignite Services are a
>> good fit for the use-case.
>>
>> In my testing, I am getting the following timings:
>>
>>    - Startup of 20,000 ignite services takes 30 seconds
>>    - Startup of 50,000 ignite services takes 250 seconds
>>    - The 2.5x increase from 20,000 to 50,000 yielded > 8x cost in
>>    startup time (growth appears to be superlinear)
>>
>> Watching the JVM during this time, I see the following:
>>
>>    - Heap usage is not significant (do not see signs of GC)
>>    - CPU usage is only slightly increased - on the order of 20% total
>>    (system has 12 cores/24 threads)
>>    - Network utilization is reasonable
>>    - Futex system call (measured with "strace -r") appears to be taking
>>    the most time by far.
>>
>> The use-case involves the following:
>>
>>    - Startup of up-to hundreds-of-thousands of services at cluster
>>    spin-up
>>    - Frequent, small adjustments to the services running over time
>>    - Need to rebalance when a new node joins the cluster, or an old one
>>    leaves the cluster
>>    - Once the services are deployed, we do not plan to make
>>    cross-cluster calls into the services (i.e. we do *not* plan to use
>>    ignite's services().serviceProxy() on these)
>>    - Jobs don't look like a fit because these tasks (1) are "long-running"
>>    (actually periodically scheduled tasks) and (2) need to redistribute
>>    even after they start running
>>
>> This is starting to get long.  I have more details to share.  Here is the
>> repo with the code being used to test, and a link to a wiki page with some
>> of the details:
>>
>> https://github.com/opennms-forge/distributed-scheduling-poc/
>>
>>
>> https://github.com/opennms-forge/distributed-scheduling-poc/wiki/Ignite-Startup-Performance
>>
>>
>> Questions I have in mind:
>>
>>    - Are services a good fit here?  We expect to reach upwards of
>>    500,000 services in a cluster with multiple nodes.
>>    - Any thoughts on tracking down the bottleneck and alleviating it?
>>    (I have started taking timing measurements in the Ignite code)
>>
>> Stopping here - please ask questions and I'll gladly fill in details.
>> Any tips are welcome, including ideas for tracking down just where the
>> bottleneck exists.
>>
>> Art
>>
>>
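
The two startup timings quoted above (20,000 services in 30 seconds, 50,000
services in 250 seconds) can be turned into a rough growth estimate. The sketch
below assumes a power-law model t = c * n^k, which is an assumption and not
something the measurements establish: two data points cannot distinguish
polynomial from exponential growth. Under that assumption the exponent comes
out at roughly 2.3. The class and method names are hypothetical.

```java
// Hypothetical helper: fits t = c * n^k through two (count, seconds) samples.
final class StartupGrowthEstimate {
    private StartupGrowthEstimate() {}

    // Exponent k such that t2 / t1 == (n2 / n1)^k.
    static double exponent(double n1, double t1, double n2, double t2) {
        return Math.log(t2 / t1) / Math.log(n2 / n1);
    }

    // Predicted time for n services, scaling from the sample (n1, t1)
    // with exponent k. Only as trustworthy as the power-law assumption.
    static double extrapolate(double n1, double t1, double k, double n) {
        return t1 * Math.pow(n / n1, k);
    }
}
```

For example, StartupGrowthEstimate.exponent(20_000, 30, 50_000, 250) gives the
exponent for the quoted timings. More measurements at intermediate service
counts would be needed before trusting any extrapolation to 500,000 services.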
