> On 1 Jan 2021, at 23:05, Andrey Borodin <x4...@yandex-team.ru> wrote:
>
> I've found this thread in CF looking for something to review.
We discussed the patches with Noah offlist; I'm resending a summary to the list.
There are two patches:
1. slru-truncate-modulo-v5.patch
2. slru-truncate-t-insurance-v4.patch
It would be a bit easier to review if the patches were versioned together (both
as v5), because the 2nd patch applies on top of the 1st. Also, the 2nd patch has
a problem when applied with git (in async.c).
The first patch fixes a bug where SLRU segments can be truncated too early
around the wraparound point.
The basic idea of the patch is: if we want to delete a range, we must be
eligible to delete both its beginning and its ending.
So to test whether a page is deletable, it checks that the page's first and last
xids (or other SLRU units) are of no interest. To test whether a segment is
deletable, it checks whether its first and last pages can be deleted.
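To illustrate the rule, here is a minimal self-contained sketch (my own
illustration, not code from the patch; XIDS_PER_PAGE, the cutoff argument and
the function names are invented):

#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define XIDS_PER_PAGE 2048

/* wrap-around comparison in the spirit of TransactionIdPrecedes() */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* a page is deletable only if both its first and its last xid are old */
static bool
page_is_deletable(uint32_t page, TransactionId cutoff)
{
    TransactionId first_xid = (TransactionId) page * XIDS_PER_PAGE;
    TransactionId last_xid = first_xid + XIDS_PER_PAGE - 1;

    return xid_precedes(first_xid, cutoff) && xid_precedes(last_xid, cutoff);
}

/* a segment is deletable only if both its first and its last page are */
static bool
segment_is_deletable(uint32_t first_page, uint32_t last_page, TransactionId cutoff)
{
    return page_is_deletable(first_page, cutoff) &&
           page_is_deletable(last_page, cutoff);
}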
The patch adds tests in an unusual manner: they are implemented as assert
functions. The tests are fast, checking only basic and edge cases, but they will
not run if Postgres is built without asserts.
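By "assert functions" I mean something in this style (an invented illustration
in PostgreSQL style, not the patch's actual test; SomePagePrecedes is a stand-in
name):

#ifdef USE_ASSERT_CHECKING
/* self-test compiled and executed only in assert-enabled builds */
static void
SomePagePrecedesUnitTests(void)
{
    Assert(SomePagePrecedes(1, 2));     /* basic case: lower page precedes higher */
    Assert(!SomePagePrecedes(2, 2));    /* edge case: a page does not precede itself */
}
#endif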
I'm a little suspicious of the implementation of the *PagePrecedes(int page1,
int page2) functions. Consider the following example from the patch:
static bool
CommitTsPagePrecedes(int page1, int page2)
{
    TransactionId xid1;
    TransactionId xid2;

    xid1 = ((TransactionId) page1) * COMMIT_TS_XACTS_PER_PAGE;
    xid1 += FirstNormalTransactionId + 1;
    xid2 = ((TransactionId) page2) * COMMIT_TS_XACTS_PER_PAGE;
    xid2 += FirstNormalTransactionId + 1;

    return (TransactionIdPrecedes(xid1, xid2) &&
            TransactionIdPrecedes(xid1, xid2 + COMMIT_TS_XACTS_PER_PAGE - 1));
}
We add FirstNormalTransactionId to the xids so that they are not special xids.
We add COMMIT_TS_XACTS_PER_PAGE - 1 to xid2 to shift it to the end of its page.
But because of the += FirstNormalTransactionId we are shifted slightly past the
page boundary and risk that xid2 + COMMIT_TS_XACTS_PER_PAGE - 1 becomes
FrozenTransactionId (FirstNormalTransactionId - 1). Hence the extra +1 added to
all values in scope. While the logic is correct, the coding is hard to follow.
Maybe we could just use
page1_first_normal_xid = ((TransactionId) page1) * COMMIT_TS_XACTS_PER_PAGE
    + FirstNormalTransactionId;
page2_first_normal_xid = ((TransactionId) page2) * COMMIT_TS_XACTS_PER_PAGE
    + FirstNormalTransactionId;
page2_last_xid = ((TransactionId) page2) * COMMIT_TS_XACTS_PER_PAGE
    + COMMIT_TS_XACTS_PER_PAGE - 1;
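With those names the comparison itself could then, I think, be written as:

    return (TransactionIdPrecedes(page1_first_normal_xid, page2_first_normal_xid) &&
            TransactionIdPrecedes(page1_first_normal_xid, page2_last_xid));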
But I'm not insisting on this.
The following comment is not correct for 1kB and 4kB pages:
+ * At every supported BLCKSZ, (1 << 31) % COMMIT_TS_XACTS_PER_PAGE == 128.
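(If I compute correctly: with SizeOfCommitTimestampEntry = 10 bytes, BLCKSZ =
4096 gives COMMIT_TS_XACTS_PER_PAGE = 409 and (1 << 31) % 409 == 109, and
BLCKSZ = 1024 gives 102 and a remainder of 26; only 2kB and larger block sizes
yield 128.)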
None of the notes above block the fix; I just wanted to let the committer know
about them. I think it's very important to have this bug fixed.
The second patch converts the boolean *PagePrecedes() functions to *PageDiff()s
and adds logic to avoid deleting pages in suspicious cases. This logic depends
on the scale of the value returned by the diff: it expects the value to wrap
around within the INT_MIN..INT_MAX range. Thus it prevents page deletion if
page_diff <= INT_MIN / 2 (too far from the current cleaning point, which of
course should never happen in normal cases).
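To make the threshold idea concrete, here is a minimal self-contained sketch of
how I read it (my own illustration, not the patch's code; names are invented,
and the distance is computed on xids for simplicity):

#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* modular subtraction; the cast reinterprets the wrapped result as signed,
 * so the value always lands in [INT_MIN, INT_MAX] */
static int32_t
xid_diff(TransactionId xid1, TransactionId xid2)
{
    return (int32_t) (xid1 - xid2);
}

/* truncate only in the ordinary "precedes" case, and refuse when the
 * distance is implausibly large (diff <= INT_MIN / 2) */
static bool
safe_to_truncate(TransactionId oldest_page_xid, TransactionId cutoff_xid)
{
    int32_t diff = xid_diff(oldest_page_xid, cutoff_xid);

    return diff < 0 && diff > INT_MIN / 2;
}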
There must be a comparison here, not just a nonzero test (using the int result
of PageDiff() directly in a boolean context only checks that it is nonzero):
- ctl->PagePrecedes(segpage, trunc->earliestExistingPage))
+ ctl->PageDiff(segpage, trunc->earliestExistingPage))
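(Presumably it should read something like
"ctl->PageDiff(segpage, trunc->earliestExistingPage) < 0".)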
This
int diff_max = ((QUEUE_MAX_PAGE + 1) / 2) - 1,
seems to be functionally equivalent to
int diff_max = ((QUEUE_MAX_PAGE - 1) / 2),
(with C integer division the two expressions agree for any positive QUEUE_MAX_PAGE).
What I like about the patch is that it removes all the trickery described above
around "+ FirstNormalTransactionId + 1".
AFAICS the overall purpose of the 2nd patch is to help clusters corrupted by
other bugs avoid deleting SLRU segments.
I'm a little bit afraid that this kind of patch can hide bugs (while potentially
saving some users' data). That aside, the patch seems like a useful precaution.
Maybe we could emit scary warnings if the SLRU segments do not form a contiguous
range?
Thanks!
Best regards, Andrey Borodin.