Hi again,
Any advice? We're really struggling with cluster stability - we had to
turn on throttling on our side before sending data to the DataStreamer,
but the problems still occur.
In the DataStreamer javadoc I found this:
perNodeParallelOperations(int) - sometimes data may be added to the
data streamer via addData(Object, Object) method faster than it can be
put in cache. In this case, new buffered stream messages are sent to
remote nodes before responses from previous ones are received. This
could cause unlimited heap memory utilization growth on local and
remote nodes. To control memory utilization, this setting limits
maximum allowed number of parallel buffered stream messages that are
being processed on remote nodes. If this number is exceeded, then
addData(Object, Object) method will block to control memory
utilization. Default is equal to CPU count on remote node multiply by
DFLT_PARALLEL_OPS_MULTIPLIER.
This could be our case - we see unbounded heap memory growth, and in the
heap histogram I see GridDhtAtomicSingleUpdateRequest and
GridNearAtomicUpdateResponse instances.
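For context, this is roughly how we drive the streamer during the customer
update - a minimal sketch where the cache name, key/value types and the loop
are only illustrative, but the three streamer settings match the ones from my
first mail below:

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class CustomerStreamerSketch {
    public static void main(String[] args) {
        // "ignite-client.xml" is a placeholder for our Spring config file.
        try (Ignite ignite = Ignition.start("ignite-client.xml");
             IgniteDataStreamer<Long, String> streamer = ignite.dataStreamer("customers")) {
            streamer.perNodeParallelOperations(5); // max batches in flight per node
            streamer.perNodeBufferSize(500);       // entries per batch
            streamer.autoFlushFrequency(1000);     // flush at least once per second
            for (long id = 0; id < 20_000_000L; id++) {
                // Per the javadoc above, addData() should block here once
                // 5 batches per node are still waiting for responses.
                streamer.addData(id, "customer-" + id);
            }
        }
    }
}

So the only back-pressure we rely on today is addData() blocking once those
5 batches per node are in flight.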
How can we check whether "data may be added... faster than it can be put
in cache"? Is there a metric exposed via JMX that keeps growing when this
happens?
Regarding the HDD, I managed to run hdparm:
/dev/sda1:
Timing cached reads: 15036 MB in 2.00 seconds = 7525.21 MB/sec
Timing buffered disk reads: 2664 MB in 3.00 seconds = 887.36 MB/sec
Regards,
Piotr
On 2021/10/06 12:26:07, Piotr Jagielski wrote:
> OK, I managed to take a larger heap histogram - attached.
>
> Also, I found WARN messages about long-running cache futures:
>
> 2021-10-06 14:15:29 WARN First 10 long running cache futures [total=5986879]
> 2021-10-06 14:15:29 WARN >>> Future [startTime=14:00:24.385, curTime=14:15:28.372, fut=GridDhtAtomicSingleUpdateFuture [allUpdated=true, super=GridDhtAtomicAbstractUpdateFuture [futId=436214273, resCnt=0, addedReader=false, dhtRes=TransformMapView {44055f7f-02d5-42bf-bcbe-2e78b45b7954=[res=false, size=1, nearSize=0]}]]]
> 2021-10-06 14:15:29 WARN >>> Future [startTime=14:00:24.385, curTime=14:15:28.372, fut=GridDhtAtomicSingleUpdateFuture [allUpdated=true, super=GridDhtAtomicAbstractUpdateFuture [futId=436214275, resCnt=0, addedReader=false, dhtRes=TransformMapView {44055f7f-02d5-42bf-bcbe-2e78b45b7954=[res=false, size=1, nearSize=0]}]]]
> 2021-10-06 14:15:29 WARN >>> Future [startTime=14:00:24.385, curTime=14:15:28.372, fut=GridDhtAtomicSingleUpdateFuture [allUpdated=true, super=GridDhtAtomicAbstractUpdateFuture [futId=436214277, resCnt=0, addedReader=false, dhtRes=TransformMapView {44055f7f-02d5-42bf-bcbe-2e78b45b7954=[res=false, size=1, nearSize=0]}]]]
> 2021-10-06 14:15:29 WARN >>> Future [startTime=14:00:24.385, curTime=14:15:28.372, fut=GridDhtAtomicSingleUpdateFuture [allUpdated=true, super=GridDhtAtomicAbstractUpdateFuture [futId=436214279, resCnt=0, addedReader=false, dhtRes=TransformMapView {44055f7f-02d5-42bf-bcbe-2e78b45b7954=[res=false, size=1, nearSize=0]}]]]
> 2021-10-06 14:15:29 WARN >>> Future [startTime=14:00:24.385, curTime=14:15:28.372, fut=GridDhtAtomicSingleUpdateFuture [allUpdated=true, super=GridDhtAtomicAbstractUpdateFuture [futId=436214281, resCnt=0, addedReader=false, dhtRes=TransformMapView {44055f7f-02d5-42bf-bcbe-2e78b45b7954=[res=false, size=1, nearSize=0]}]]]
> 2021-10-06 14:15:29 WARN >>> Future [startTime=14:00:24.385, curTime=14:15:28.372, fut=GridDhtAtomicSingleUpdateFuture [allUpdated=true, super=GridDhtAtomicAbstractUpdateFuture [futId=436214283, resCnt=0, addedReader=false, dhtRes=TransformMapView {44055f7f-02d5-42bf-bcbe-2e78b45b7954=[res=false, size=1, nearSize=0]}]]]
> 2021-10-06 14:15:29 WARN >>> Future [startTime=14:00:24.385, curTime=14:15:28.372, fut=GridDhtAtomicSingleUpdateFuture [allUpdated=true, super=GridDhtAtomicAbstractUpdateFuture [futId=436214285, resCnt=0, addedReader=false, dhtRes=TransformMapView {44055f7f-02d5-42bf-bcbe-2e78b45b7954=[res=false, size=1, nearSize=0]}]]]
> 2021-10-06 14:15:29 WARN >>> Future [startTime=14:00:24.385, curTime=14:15:28.372, fut=GridDhtAtomicSingleUpdateFuture [allUpdated=true, super=GridDhtAtomicAbstractUpdateFuture [futId=436214287, resCnt=0, addedReader=false, dhtRes=TransformMapView {44055f7f-02d5-42bf-bcbe-2e78b45b7954=[res=false, size=1, nearSize=0]}]]]
> 2021-10-06 14:15:29 WARN >>> Future [startTime=14:00:24.385, curTime=14:15:28.372, fut=GridDhtAtomicSingleUpdateFuture [allUpdated=true, super=GridDhtAtomicAbstractUpdateFuture [futId=436214289, resCnt=0, addedReader=false, dhtRes=TransformMapView {44055f7f-02d5-42bf-bcbe-2e78b45b7954=[res=false, size=1, nearSize=0]}]]]
> 2021-10-06 14:15:29 WARN >>> Future [startTime=14:00:24.385, curTime=14:15:28.372, fut=GridDhtAtomicSingleUpdateFuture [allUpdated=true, super=GridDhtAtomicAbstractUpdateFuture [futId=436214291, resCnt=0, addedReader=false, dhtRes=TransformMapView {44055f7f-02d5-42bf-bcbe-2e78b45b7954=[res=false, size=1, nearSize=0]}]]]
>
>
> On 2021/10/06 11:00:15, Piotr Jagielski wrote:
> > Hi,
> >
> > Thanks for the quick answer.
> >
> > I've attached the config and logs with thread dumps. I can take a heap
> > histogram when we experience the problems again - for now we have
> > disabled the update process to keep the cluster stable.
> >
> > Regarding the HDD - this could be a good point; I can see that
> > lastCheckpointFsyncDuration (from the JMX stats) is the slow part:
> >
> > Maybe
> > https://ignite.apache.org/docs/latest/persistence/persistence-tuning#pages-writes-throttling
> > would be a good idea?
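If I read that page correctly, the switch it refers to is
DataStorageConfiguration.setWriteThrottlingEnabled(true). A sketch of where it
would go, purely illustrative - our real configuration is the attached XML and
persistence is already enabled there:

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class WriteThrottlingSketch {
    public static void main(String[] args) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        // Pages-write throttling from the persistence-tuning page linked above;
        // it only has an effect when native persistence is enabled.
        storageCfg.setWriteThrottlingEnabled(true);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);

        Ignition.start(cfg);
    }
}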
> >
> > Regards
> >
> > On 2021/10/06 10:09:03, Anton Kurbanov wrote:
> > > Hello Piotr,
> > >
> > > Please share the configuration and the logs, preferably with thread
> > > dumps taken at the time the SYSTEM_WORKER_BLOCKED message pops up.
> > >
> > > It is difficult to diagnose an issue from metrics alone because, for
> > > example, the checkpoint itself has several phases, each of which can be
> > > long for a different reason. If the fsync phase is slow, for instance,
> > > the likely cause is a slow disk (probably an HDD).
> > >
> > > A few heap histograms would also be very helpful for identifying which
> > > objects are alive in the heap and what their GC roots are, which in
> > > turn helps identify the component holding on to these objects.
> > >
> > > Best regards,
> > > Anton
> > >
> > > On Wed, Oct 6, 2021 at 12:57, Piotr Jagielski wrote:
> > >
> > > > Hi,
> > > >
> > > > We experience stability problems on our Ignite cluster (2.10) under
> > > > heavy load. Our cluster has 3 nodes, each with 8 CPUs and 32 GB RAM.
> > > >
> > > > We mainly use 2 persistent caches:
> > > > - aggregates - only updates, around 6K records/sec, ~70 mln records
> > > > total, stored mostly on disk (dataRegion maxSize = 4GB)
> > > > - customers - mainly reads via the JDBC thin client, plus a massive
> > > > update of all records once a day (~20 mln records) at about
> > > > 60K records/sec, stored off-heap (maxSize = 8GB)
> > > >
> > > > For updates we use DataStreamer with:
> > > > - perNodeParallelOperations = 5
> > > > - perNodeBufferSize = 500
> > > > - autoFlushFrequency = 1000 millis
> > > >
> > > > Under normal load (only aggregate updates) the cluster behaves
> > > > normally; the problems happen only during the massive customer cache
> > > > updates. We observe:
> > > > - Heap starvation (we have Xms4g / Xmx8g)
> > > > - Long GC pauses (up to 5 seconds)
> > > > - SYSTEM_WORKER_BLOCKED logs
> > > > - Long checkpoint write times (up to 20 seconds)
> > > > - An increasing outbound message queue (> 100 entries)
> > > >
> > > > For now, we have increased walSegmentSize to 256MB - are there any
> > > > other options we could adjust? Maybe something from this list:
> > > > https://ignite.apache.org/docs/latest/persistence/persistence-tuning?
> > > > Is the data streamer simply too fast for the cluster?
> > > >
> > > > I can provide more logs/configuration if needed.
> > > >
> > > > Regards,
> > > > Piotr
> > > >
> > > >
> > >
> >
>