Hi Pavel,
We have six servers, which don't show any issue, and 40 client nodes
(the Ignite node is started with IgniteConfiguration.ClientMode = true).
The 40 client nodes are the ones where we are having the memory issue.
The _ignite.GetOrCreateNearCache call is executed on the client nodes. We also
tried using the following code, but the memory issue was the same:
var nearCacheCfg = new NearCacheConfiguration
{
    // Use LRU eviction policy to automatically evict entries whenever the
    // near cache exceeds MaxSize entries or MaxMemorySize bytes.
    EvictionPolicy = new LruEvictionPolicy
    {
        MaxSize = 5000, // 5000 elements
        MaxMemorySize = 500000000 // 500,000,000 bytes
    }
};
return _ignite.GetOrCreateCache<TKey, TValue>(
    new CacheConfiguration(cacheName), nearCacheCfg);
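
Because the lookup runs on every gRPC call, one variation is to resolve the near cache once per cache name and reuse the returned instance. The following is only a minimal sketch under stated assumptions: NearCacheProvider and its members are hypothetical names (not part of Ignite.NET or of our code base), each cache name is assumed to always be used with the same TKey/TValue pair, and whether this changes the memory behaviour at all is untested.

```csharp
using System.Collections.Concurrent;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Cache;
using Apache.Ignite.Core.Cache.Configuration;
using Apache.Ignite.Core.Cache.Eviction;

// Hypothetical helper: resolves each near cache once and hands out the
// same ICache instance on subsequent calls.
public sealed class NearCacheProvider
{
    private readonly IIgnite _ignite;

    // Keyed by cache name; values are ICache<TKey, TValue> instances stored as object.
    // Assumption: a given cache name is always requested with the same type arguments.
    private readonly ConcurrentDictionary<string, object> _caches =
        new ConcurrentDictionary<string, object>();

    public NearCacheProvider(IIgnite ignite)
    {
        _ignite = ignite;
    }

    public ICache<TKey, TValue> Get<TKey, TValue>(string cacheName)
    {
        // GetOrAdd may invoke the factory more than once under contention,
        // but only one result is kept in the dictionary.
        return (ICache<TKey, TValue>) _caches.GetOrAdd(cacheName, name =>
        {
            var nearCacheCfg = new NearCacheConfiguration
            {
                EvictionPolicy = new LruEvictionPolicy
                {
                    MaxSize = 5000,           // same limits as in the snippet above
                    MaxMemorySize = 500000000
                }
            };
            return _ignite.GetOrCreateNearCache<TKey, TValue>(name, nearCacheCfg);
        });
    }
}
```

With a single provider instance shared by the service, a request handler would call provider.Get<TKey, TValue>(cacheName) instead of calling _ignite.GetOrCreateNearCache directly.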
On Wed, Feb 5, 2020 at 10:02 AM Pavel Tupitsyn <[email protected]> wrote:
> Hi Eduard,
>
> Do you have any client nodes (IgniteConfiguration.ClientMode=true), or
> just servers?
>
> Is the following line executed on Ignite server node?
> _ignite.GetOrCreateNearCache
>
> On Wed, Feb 5, 2020 at 11:44 AM Eduard Llull <[email protected]> wrote:
>
>> Hi everyone,
>>
>> We have been using Ignite and Ignite.NET in recent months in a
>> project. We currently have six Ignite servers (started with ignite.sh) and
>> a bunch of thick clients split across two .NET Core applications deployed
>> on 30 servers.
>>
>> We store de-normalized data in the Ignite data grid: one of the .NET Core
>> applications puts data into the cache and the other application is a gRPC
>> service that just reads that data to compute a response. The data is split
>> across a dozen caches, which are created programmatically by the application
>> that writes into the caches.
>>
>> The caches are PARTITIONED and TRANSACTIONAL and the partitions have two
>> backups.
>>
>> It's been working fine so far but we identified that one particular cache
>> was the most read and to reduce network usage and improve response time of
>> the gRPC service we decided to use a near cache. That particular cache has
>> ~2300 entries, which occupy ~110MB of space, and the near cache is
>> configured with maxSize=5000 and maxMemorySize=500000000.
>>
>> [image: image.png]
>>
>> The embedded JVM in the gRPC .NET Core application is started with the
>> following parameters:
>> -Xmx=1024
>> -Xms=1024
>> -Djava.net.preferIPv4Stack=true
>> -Xrs
>> -XX:+AlwaysPreTouch
>> -XX:+UseG1GC
>> -XX:+ScavengeBeforeFullGC
>> -XX:+DisableExplicitGC
>> -DIGNITE_NO_SHUTDOWN_HOOK=true
>> -Dcom.sun.management.jmxremote
>> -Dcom.sun.management.jmxremote.ssl=false
>> -Dcom.sun.management.jmxremote.authenticate=false
>> -Dcom.sun.management.jmxremote.local.only=false
>> -Dcom.sun.management.jmxremote.port=12345
>>
>> If we don't use the near cache, on every gRPC call it receives the server
>> executes the following code to get the cache (this works fine):
>> return _ignite.GetCache<TKey, TValue>(cacheName);
>>
>> And if we want to use the near cache, that line is changed to:
>> var nearCacheCfg = new NearCacheConfiguration
>> {
>>     // Use LRU eviction policy to automatically evict entries whenever the
>>     // near cache exceeds MaxSize entries or MaxMemorySize bytes.
>>     EvictionPolicy = new LruEvictionPolicy
>>     {
>>         MaxSize = 5000, // 5000 elements
>>         MaxMemorySize = 500000000 // 500,000,000 bytes
>>     }
>> };
>> return _ignite.GetOrCreateNearCache<TKey, TValue>(cacheName, nearCacheCfg);
>>
>> But since we added the near cache the application memory usage never
>> stabilizes: without the near cache the application uses ~2.5GB of RAM on
>> every server, but when we use the near cache, the application memory usage
>> never stops growing.
>>
>> This is the memory usage of one of the servers with the gRPC application.
>> [image: image.png]
>>
>> In the graph above, the version with the near cache was deployed on
>> February 3rd at 17:00. At 01:30 on February 4th the server started
>> swapping and at around 7:45 the application crashed. This is a detail:
>> [image: image.png]
>>
>> I would very much like to create a reproducer, but it looks like it would
>> take a very long time to reproduce the issue, as the gRPC application
>> needs several hours to use all the memory. Taking into account that every
>> server with the gRPC application receives around 90 requests per second,
>> the memory leak, if it exists, is very slow.
>>
>> Does anybody have any idea where the problem can be or how to find it?
>>
>>
>> Thank you very much
>>
>>