Hello again,
I've spent another day on this issue looking at the configuration, and
with a cluster larger than the configured backup count Ignite did the work.
After that I went back to testing a single node with a minimal
configuration and came to the point where the only way to make Ignite
survive the load was setting the ExpiryPolicy to a very low value (10s).
To me it seems that Ignite behavior is to preserve all entries in memory
at any cost, even if there is a risk of running into OOM. Is that true?
I would like to change this behavior and make sure that entries do not
expire because of time as long as there is memory available to store
them. They should be evicted by appropriate EvictionPolicy only if
memory fills up.
To me it looks like the DataStoreSettings do not have any impact in this
regard. At least setting page eviction to LRU does not change it.
Please let me know if I am doing something wrong, as I cannot get
Ignite to work stably, even with the basic objectives I outlined
in my earlier mail.
Kind regards,
Łukasz
On 27.01.2023 00:58, Łukasz Dywicki wrote:
Dear all,
I am looking into using Apache Ignite to cache the results of an
expensive computation.
Objectives are basic:
- Keep most of "hot" data in memory
- Offload cold part to cache store
- Keep memory utilization under control (evict entries as needed)
While it sounds basic, it doesn't seem to fit Ignite defaults.
What I am testing now is the behavior with large objects which can grow up
to 10 MB (serialized) or 25 MB (JSON representation). Usually objects
will stay far below that threshold, but we can't rely on that.
I began testing various configurations of Ignite in order to facilitate
offloading of memory contents to a database. So far I have been stuck for
two days on Ignite, or the application itself, running out of memory after
processing several such large objects. While I know that storing a 10 MB
blob in a database is not the best idea, I have to test that behavior too.
By observing the database contents I see that the number of entries there
grows, but cache entries do not seem to be evicted. When I try to enable
eviction, it requires on-heap caching to be switched on, and it still fails
with the LRU eviction policy.
So far I ended up with a named cache and default region configured as
below:
```
IgniteConfiguration igniteConfiguration = new IgniteConfiguration();

// Default data region: in-memory only, 256 MB initial / 512 MB max,
// page eviction starting at 75% fill.
igniteConfiguration.setDataStorageConfiguration(new DataStorageConfiguration()
    .setDefaultDataRegionConfiguration(new DataRegionConfiguration()
        .setPersistenceEnabled(false)
        .setInitialSize(256L * 1024 * 1024)
        .setMaxSize(512L * 1024 * 1024)
        .setPageEvictionMode(DataPageEvictionMode.RANDOM_LRU)
        .setSwapPath(null)
        .setEvictionThreshold(0.75)
    )
    .setPageSize(DataStorageConfiguration.MAX_PAGE_SIZE)
);

// Explicit type arguments: a diamond followed by chained setters would be
// inferred as CacheConfiguration<Object, Object>.
CacheConfiguration<SessionKey, ExpensiveObject> expensiveCache =
    new CacheConfiguration<SessionKey, ExpensiveObject>()
        .setName(CACHE_NAME)
        .setBackups(2)
        .setAtomicityMode(CacheAtomicityMode.ATOMIC)
        // Read/write-through to the JDBC blob store.
        .setCacheStoreFactory(cacheJdbcBlobStoreFactory)
        .setWriteThrough(true)
        .setReadThrough(true)
        // On-heap cache with LRU eviction capped at 1024 entries.
        .setOnheapCacheEnabled(true)
        .setEvictionPolicyFactory(new LruEvictionPolicyFactory<>(1024));

igniteConfiguration.setCacheConfiguration(expensiveCache);
```
What I observe is the following: the cache keeps writing data into the
database, but it does not remove old entries fast enough to prevent a crash.
JVM parameters I use are fairly basic:
-Xms1g -Xmx1g -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+ScavengeBeforeFullGC
-XX:+DisableExplicitGC
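To put the numbers above side by side, here is a back-of-the-envelope
sketch. It is only an estimate under my own assumptions: every entry is at
the 10 MB worst case, MB is treated as MiB, per-entry and page overhead is
ignored, and the LRU limit of 1024 is taken as an entry count. The class
and method names are made up for illustration and are not part of Ignite.

```java
public class MemoryBudget {
    static final long MIB = 1024L * 1024;

    // Worst-case bytes held on-heap if the LRU policy only starts evicting
    // after maxEntries entries, each of maxEntryBytes (assumed worst case).
    static long worstCaseOnHeapBytes(int maxEntries, long maxEntryBytes) {
        return (long) maxEntries * maxEntryBytes;
    }

    // Number of worst-case entries that fit in the data region before the
    // fill level reaches the eviction threshold (overhead ignored).
    static long entriesBeforePageEviction(long regionMaxBytes,
                                          double evictionThreshold,
                                          long maxEntryBytes) {
        return (long) (regionMaxBytes * evictionThreshold) / maxEntryBytes;
    }

    public static void main(String[] args) {
        long heap = 1024 * MIB;      // -Xmx1g
        long region = 512 * MIB;     // setMaxSize(512L * 1024 * 1024)
        long entry = 10 * MIB;       // assumed worst-case serialized object

        long onHeap = worstCaseOnHeapBytes(1024, entry);
        System.out.println("Worst-case on-heap cache (MiB): " + onHeap / MIB);
        System.out.println("Heap (MiB): " + heap / MIB);
        System.out.println("Entries before page eviction: "
            + entriesBeforePageEviction(region, 0.75, entry));
    }
}
```

Under these assumptions an entry limit of 1024 allows the on-heap cache to
grow to about ten times the 1 GB heap before LRU eviction starts, while the
512 MB region reaches its 0.75 threshold after roughly 38 worst-case
entries. I am not sure this is the actual cause, but it may explain why the
eviction policy does not kick in before the crash.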
The store mechanism is the JDBC blob store. The exceptions I get occur
in Ignite itself, in processing (application code writing to the cache), or
in the communication thread used to feed the cache. I collected one case here:
https://gist.github.com/splatch/b5ec9134cd9df19bc62f007dd17a19a1
The error message in the linked gist advises enabling persistence (which I
did via the cache store!), increasing the memory limit (which I don't want
to do), or enabling an eviction/expiry policy (which somehow misbehaves).
To me it looks like the self-defense mechanisms Ignite has are being
tricked, causing the whole application to crash.
Can you please advise me which settings to tune, and how, in order to make
Ignite more stable under such load?
Kind regards,
Łukasz