Re: Mark dirty for DataPage: small changes in huge objects.

Ivan Rakov Mon, 08 Oct 2018 07:51:59 -0700

Huge +1.

The page dirty flag is set in the body of PageMemoryImpl#writeUnlockPage. The caller passes the boolean flag "markDirty=true" if it assumes that the page content may have changed (so the dirty flag is set even if the page content remained intact). Instead, we could dump the page content to a thread-local buffer after a successful write lock and compare it bytewise with the new content on write unlock.
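To make the idea concrete, here is a minimal sketch of snapshot-and-compare on a single page. It is not Ignite code: the class SnapshotPage and its methods writeLock, writeUnlock, isDirty and clearDirty are hypothetical names, the page is a plain byte array, and real locking is left out.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a page; this is NOT Ignite's PageMemoryImpl. */
final class SnapshotPage {
    /** Scratch buffer per thread. Simplification: one buffer means a thread
     *  can snapshot only one page at a time (no nested write locks). */
    private static final ThreadLocal<byte[]> SNAPSHOT = new ThreadLocal<>();

    private final byte[] content;
    private boolean dirty;

    SnapshotPage(int pageSize) {
        content = new byte[pageSize];
    }

    /** Copies the current content into the thread-local buffer, then hands out the page. */
    byte[] writeLock() {
        byte[] snap = SNAPSHOT.get();
        if (snap == null || snap.length != content.length) {
            snap = new byte[content.length];
            SNAPSHOT.set(snap);
        }
        System.arraycopy(content, 0, snap, 0, content.length);
        return content;
    }

    /** Marks the page dirty only if its bytes differ from the snapshot. */
    void writeUnlock() {
        if (!Arrays.equals(content, SNAPSHOT.get()))
            dirty = true;
    }

    boolean isDirty() {
        return dirty;
    }

    /** Models a checkpoint having written the page. */
    void clearDirty() {
        dirty = false;
    }
}
```

With this sketch, a writer that takes the lock and rewrites the same bytes leaves the page clean, while any real byte change sets the flag. The cost is one page-sized copy on lock and one comparison on unlock.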

I believe this logic should be introduced as a separate data storage mode, as it has both positive and negative effects.


Positive:

Small updates of large entries will produce far fewer dirty pages. It can dramatically boost the performance of updates, especially when an SQL update of a single field is performed over large objects.
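As a rough illustration of the possible gain, the sketch below counts how many page-sized chunks actually differ when only a leading 8-byte timestamp changes in a large entry. Everything here is an assumption for illustration: the class DirtyPageEstimate, the flat layout (timestamp first, then the payload), the 4096-byte chunk size, and the 80,000-byte payload. Page headers and entry fragmentation in real durable memory are ignored.

```java
import java.nio.ByteBuffer;

/** Hypothetical estimate helper; the layout and sizes are assumptions, not Ignite's format. */
final class DirtyPageEstimate {
    /** Number of page-sized chunks needed to hold totalBytes. */
    static int pageCount(int totalBytes, int pageSize) {
        return (totalBytes + pageSize - 1) / pageSize;
    }

    /** Flat layout assumed for illustration: 8-byte timestamp followed by the payload. */
    static byte[] serialize(long ts, byte[] builds) {
        return ByteBuffer.allocate(Long.BYTES + builds.length)
            .putLong(ts)
            .put(builds)
            .array();
    }

    /** Counts chunks whose bytes differ between two images of equal length. */
    static int changedPages(byte[] before, byte[] after, int pageSize) {
        if (before.length != after.length)
            throw new IllegalArgumentException("images must have equal length");

        int changed = 0;
        for (int off = 0; off < before.length; off += pageSize) {
            int end = Math.min(off + pageSize, before.length);
            for (int i = off; i < end; i++) {
                if (before[i] != after[i]) {
                    changed++;
                    break; // one differing byte is enough for this chunk
                }
            }
        }
        return changed;
    }
}
```

Under these assumptions, an entry with an 80,000-byte payload spans 20 chunks of 4096 bytes, and changing only the timestamp touches one of them, so a bytewise comparison would mark 1 chunk dirty instead of 20.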


Negative:

CPU consumption and latency will increase, since we'll need some time to copy and compare page content. Still, a lack of disk IOPS hits us much more often than a lack of CPU; benchmarks will show whether the impact is perceptible.


Let's file a ticket for this task unless there are any objections.

Best Regards,
Ivan Rakov

On 08.10.2018 16:18, Dmitriy Pavlov wrote:

Hi Igniters,

I'd like to share a case which was implemented in the previous version of
TC Bot. It is a kind of REST responses cache <RestParms, Response>:
Response {
   Long tsRefreshed; // timestamp of the last call to real service
   List<Build> builds; // a huge list of builds, most times it is not changed
}

It seems the timestamp (ts) offset within the entry's pages is constant, and
the timestamp takes 8 bytes. The builds data will require a number of pages in
durable memory, probably >10-20 pages.

So if REST (real service) responds with the same builds content only TS is
updated. After that, I did cache.put(restParms, reponse).

So my question is: will such an update, which affects only 1 field, mark
1 page dirty or all 20? Judging by the checkpoint volume, I feel that we
mark all pages as dirty even if the content is not modified. If so, I would
like to suggest a slight change to Ignite: for data pages, mark as dirty only
those pages whose content was actually modified.

I understand that the previous implementation in the Bot was quite naive (it
has since been changed), but still, what if we check for modifications by
mem-compare before marking? Mark dirty now seems to cause additional data
to be flushed to disk on the next checkpoint.

I would appreciate if Native Persistence Experts can help me to find a
place in the code, where such updates are performed? (Maybe I miss
something).

Sincerely,
Dmitriy Pavlov
