Ivan Bessonov created IGNITE-24877:
--------------------------------------

             Summary: Dirty pages count calculation is wrong
                 Key: IGNITE-24877
                 URL: https://issues.apache.org/jira/browse/IGNITE-24877
             Project: Ignite
          Issue Type: Bug
            Reporter: Ivan Bessonov

In the checkpointer, we have separate entities for "dirty pages" and "dirty 
partitions". We mark a partition dirty by calling {{markPartitionAsDirty}}.

Bad things happen if some pages are updated but their partition was not 
explicitly marked as dirty. In that case we write more pages to the storage 
than we initially calculated. The reason is simple: every delta file also has 
a meta page, which is rarely marked dirty explicitly.

This might happen in tests. The probability of it happening in a real 
deployment is low, because we constantly update safe time in the replicator's 
state machine. Nonetheless, this is a ticking time bomb.

The reason why it's dangerous is throttling. Throttling estimates the time 
until checkpoint completion by comparing the dirty pages count with the 
written pages count. If the latter is larger, we get a negative time estimate, 
which breaks everything.
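
To illustrate the failure mode, here is a minimal sketch of a linear 
time-to-completion estimate. It is not Ignite's actual throttling code: the 
class name, the method name and the formula are assumptions made for this 
example only.

```java
// Hypothetical sketch, not taken from the Ignite code base.
final class CheckpointEtaSketch {
    /**
     * Extrapolates the remaining checkpoint time linearly from the progress so far.
     * Assumes writtenPages > 0 and elapsedNanos > 0.
     */
    static long estimateRemainingNanos(long dirtyPages, long writtenPages, long elapsedNanos) {
        // Goes negative as soon as more pages are written than were counted up front.
        long remainingPages = dirtyPages - writtenPages;

        return remainingPages * elapsedNanos / writtenPages;
    }

    public static void main(String[] args) {
        // 100 pages counted, 50 written after one second: a positive estimate.
        System.out.println(estimateRemainingNanos(100, 50, 1_000_000_000L));

        // Unaccounted meta pages push the written count to 104: a negative estimate.
        System.out.println(estimateRemainingNanos(100, 104, 1_000_000_000L));
    }
}
```

Clamping the result at zero would hide the symptom, but the estimate would 
still be based on a wrong total.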

Rather than accounting for bugs, I suggest having a proper dirty pages count 
calculation, by accounting for all dirty partitions even if they were not 
explicitly marked as dirty.
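
A rough sketch of what such a calculation could look like is given below. 
The data structures, names and the "one meta page per dirty partition" rule 
are assumptions for illustration, they do not mirror the actual checkpointer 
classes.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not taken from the Ignite code base.
final class DirtyPagesCountSketch {
    /** Counts one meta page only for partitions that were explicitly marked as dirty. */
    static long countWithMarkedPartitionsOnly(
            Map<Integer, Set<Long>> dirtyPagesByPartition,
            Set<Integer> markedPartitions
    ) {
        return totalPages(dirtyPagesByPartition) + markedPartitions.size();
    }

    /** Counts one meta page for every partition that is marked or has at least one dirty page. */
    static long countWithAllDirtyPartitions(
            Map<Integer, Set<Long>> dirtyPagesByPartition,
            Set<Integer> markedPartitions
    ) {
        Set<Integer> dirtyPartitions = new HashSet<>(markedPartitions);

        for (Map.Entry<Integer, Set<Long>> entry : dirtyPagesByPartition.entrySet()) {
            if (!entry.getValue().isEmpty()) {
                dirtyPartitions.add(entry.getKey());
            }
        }

        return totalPages(dirtyPagesByPartition) + dirtyPartitions.size();
    }

    /** Sums the dirty pages over all partitions, meta pages excluded (assumption). */
    private static long totalPages(Map<Integer, Set<Long>> dirtyPagesByPartition) {
        return dirtyPagesByPartition.values().stream().mapToLong(Set::size).sum();
    }
}
```

With partitions 1 and 2 both holding dirty pages but only partition 1 marked, 
the first method misses the meta page of partition 2, while the second one 
includes it.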



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
