I thought the reason flushSize can be set several times higher than the batch size is that, in a cluster, data nodes flush in parallel. For example, take a cluster with 10 nodes, flushSize = 10240, thread count = 2, and batch size = 512. Each node would then flush with 2 threads, and each thread would flush in batches of 512.
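To make the numbers in my example concrete, here is a rough sketch of the arithmetic as I understand it. It is not Ignite code: the class and method names are made up for illustration, it assumes every node keeps its own write-behind buffer and its own flush threads (which is exactly the part I would like confirmed), and the 1.5 critical-size factor is the default Matt mentions further down in this thread.

```java
// Back-of-the-envelope model of the example above. NOT Ignite code:
// all names here are hypothetical, and the per-node buffer/thread model
// is an assumption, not confirmed behaviour.
final class WriteBehindSketch {

    /** Number of full batches that fit into a buffer holding flushSize entries. */
    static int batchesPerFullBuffer(int flushSize, int batchSize) {
        return flushSize / batchSize;
    }

    /** Flush threads across the cluster, assuming each node runs its own flushers. */
    static int clusterFlushThreads(int nodes, int threadsPerNode) {
        return nodes * threadsPerNode;
    }

    /** Entries written in one round where every flush thread writes one batch. */
    static int entriesPerParallelRound(int nodes, int threadsPerNode, int batchSize) {
        return clusterFlushThreads(nodes, threadsPerNode) * batchSize;
    }

    /** Buffer size at which writes reportedly turn synchronous (flushSize * 1.5). */
    static int criticalSize(int flushSize) {
        return (int) (flushSize * 1.5);
    }

    public static void main(String[] args) {
        int nodes = 10, flushSize = 10240, threadsPerNode = 2, batchSize = 512;

        System.out.println("batches per full buffer (per node): "
            + batchesPerFullBuffer(flushSize, batchSize));
        System.out.println("flush threads in the cluster:       "
            + clusterFlushThreads(nodes, threadsPerNode));
        System.out.println("entries per parallel flush round:   "
            + entriesPerParallelRound(nodes, threadsPerNode, batchSize));
        System.out.println("critical size (per node):           "
            + criticalSize(flushSize));
    }
}
```

With these values a full buffer holds 20 batches per node, so if the 2 flush threads share the work evenly, each would write about 10 batches of 512 to drain it.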
Could someone confirm or clarify my understanding? Thank you!

On Wed, May 3, 2017 at 12:16 AM, Matt <[email protected]> wrote:

> In fact, I don't see why you would need both batchSize and flushSize. If
> I got it right, only the min of them would be used by Ignite to know when
> to flush, why do we have both in the first place?
>
> In case they're both necessary for a reason I'm not seeing, I still wonder
> if the default values should be batchSize > flushSize as I think or not.
>
> On Wed, May 3, 2017 at 3:26 AM, Matt <[email protected]> wrote:
>
>> I'm writing to confirm I managed to fix my problem by fine tuning the
>> config params for the write behind cache until the performance was fine. I
>> still see single element inserts from time to time, but just a few of them
>> every now and then not like before. You should definitely avoid synchronous
>> single elements insertions, I hope that changes in future versions.
>>
>> Regarding writeBehindBatchSize and writeBehindFlushSize, I don't see the
>> point of setting both values when batchSize < flushSize (default values are
>> 512 and 10240 respectively). If I'm not wrong, the cache is flushed
>> whenever the its size is equal to min(batchSize, flushSize). Since
>> batchSize is less than flushSize, flushSize is never really used and the
>> size of the flush is controlled by the size of the cache itself only.
>>
>> That is how it works by default, on the other hand if we swap their
>> values (ie, batchSize=10240 and flushSize=512) the behavior would be the
>> same (Ignite would call writeAll() with 512 elements each time), but the
>> number of elements flushed would be controlled by the correct variable (ie,
>> flushSize).
>>
>> Were the default values supposed to be the other way around or am I
>> missing something?
>>
>> On Tue, May 2, 2017 at 9:13 PM, Denis Magda <[email protected]> wrote:
>>
>>> Matt,
>>>
>>> Cross-posting to the dev list.
>>>
>>> Yes, Ignite switches to the synchronous mode once the buffer is
>>> exhausted. However, I do agree that it would be a right solution to flush
>>> multiple entries rather than one in the synchronous mode. *Igniters*, I was
>>> sure we had a ticket for that optimization but unable to find it. Does
>>> anybody know the ticket name/number?
>>>
>>> To omit the performance degradation you have to tweak the following
>>> parameters so that the write-behind store can keep up with you updates:
>>> - setWriteBehindFlushThreadCount
>>> - setWriteBehindFlushFrequency
>>> - setWriteBehindBatchSize
>>> - setWriteBehindFlushSize
>>>
>>> Usually it helped all the times to Apache Ignite users.
>>>
>>> > QUESTION 2
>>> >
>>> > I've read on the docs that using ATOMIC mode (default mode) is better
>>> > for performance, but I'm not getting why. If I'm not wrong using
>>> > TRANSACTIONAL mode would cause the CacheStore to reuse connections (not
>>> > call openConnection(autocommit=true) on each writeAll()).
>>> >
>>> > Shouldn't it be better to use transactional mode?
>>>
>>> Transactional mode enables 2 phase commit protocol:
>>> https://apacheignite.readme.io/docs/transactions#two-phase-commit-2pc
>>>
>>> This is why atomic operations are swifter in general.
>>>
>>> —
>>> Denis
>>>
>>> > On May 2, 2017, at 10:40 AM, Matt <[email protected]> wrote:
>>> >
>>> > No, only with inserts, I haven't tried removing at this rate yet but
>>> > it may have the same problem.
>>> >
>>> > I'm debugging Ignite internal code and I may be onto something. The
>>> > thing is Ignite has a cacheMaxSize (aka, WriteBehindFlushSize) and
>>> > cacheCriticalSize (which by default is cacheMaxSize*1.5). When the cache
>>> > reaches that size Ignite starts writing elements SYNCHRONOUSLY, as you
>>> > can see in [1].
>>> >
>>> > I think this makes things worse since only one single value is flushed
>>> > at a time, it becomes much slower forcing Ignite to do more synchronous
>>> > writes.
>>> >
>>> > Anyway, I'm still not sure why the cache reaches that level when the
>>> > database is clearly able to keep up with the insertions. I'll check if
>>> > it has to do with the number of open connections or what.
>>> >
>>> > Any insight on this is very welcome!
>>> >
>>> > [1] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/store/GridCacheWriteBehindStore.java#L620
>>> >
>>> > On Tue, May 2, 2017 at 2:17 PM, Jessie Lin <[email protected]> wrote:
>>> > I noticed that behavior when any cache.remove operation is involved. I
>>> > keep putting stuff in cache seems to be working properly.
>>> >
>>> > Do you use remove operation?
>>> >
>>> > On Tue, May 2, 2017 at 9:57 AM, Matt <[email protected]> wrote:
>>> > I'm stuck with that. No matter what config I use (flush size, write
>>> > threads, etc) this is the behavior I always get. It's as if Ignite
>>> > internal buffer is full and it's trying to write and get rid of the
>>> > oldest (one) element only.
>>> >
>>> > Any idea people? What is your CacheStore configuration to avoid this?
>>> >
>>> > On Tue, May 2, 2017 at 11:50 AM, Jessie Lin <[email protected]> wrote:
>>> > Hello Matt, thank you for posting. I've noticed similar behavior.
>>> >
>>> > Would be curious to see the response from the engineering team.
>>> >
>>> > Best,
>>> > Jessie
>>> >
>>> > On Tue, May 2, 2017 at 1:03 AM, Matt <[email protected]> wrote:
>>> > Hi all,
>>> >
>>> > I have two questions for you!
>>> >
>>> > QUESTION 1
>>> >
>>> > I'm following the example in [1] (a mix between "jdbc transactional"
>>> > and "jdbc bulk operations") and I've enabled write behind, however after
>>> > the first 10k-20k insertions the performance drops *dramatically*.
>>> >
>>> > Based on prints I've added to the CacheStore, I've noticed what Ignite
>>> > is doing is this:
>>> >
>>> > - writeAll called with 512 elements (Ignites buffers elements, that's
>>> >   good)
>>> > - openConnection with autocommit=true is called each time inside
>>> >   writeAll (since session is not stored in atomic mode)
>>> > - writeAll is called with 512 elements a few dozen times, each time it
>>> >   opens a new JDBC connection as mentioned above
>>> > - ...
>>> > - writeAll called with ONE element (for some reason Ignite stops
>>> >   buffering elements)
>>> > - writeAll is called with ONE element from here on, each time it opens
>>> >   a new JDBC connection as mentioned above
>>> > - ...
>>> >
>>> > Things to note:
>>> >
>>> > - All config values are the defaults ones except for write through and
>>> >   write behind which are both enabled.
>>> > - I'm running this as a server node (only one node on the cluster, the
>>> >   application itself).
>>> > - I see the problem even with a big heap (ie, Ignite is not nearly out
>>> >   of memory).
>>> > - I'm using PostgreSQL for this test (it's fine ingesting around 40k
>>> >   rows per second on this computer, so that shouldn't be a problem)
>>> >
>>> > What is causing Ignite to stop buffering elements after calling
>>> > writeAll() a few dozen times?
>>> >
>>> > QUESTION 2
>>> >
>>> > I've read on the docs that using ATOMIC mode (default mode) is better
>>> > for performance, but I'm not getting why. If I'm not wrong using
>>> > TRANSACTIONAL mode would cause the CacheStore to reuse connections (not
>>> > call openConnection(autocommit=true) on each writeAll()).
>>> >
>>> > Shouldn't it be better to use transactional mode?
>>> >
>>> > Regards,
>>> > Matt
>>> >
>>> > [1] https://apacheignite.readme.io/docs/persistent-store#section-cachestore-example
