On 09/03/2024 22:41, Melanie Plageman wrote:
On Wed, Mar 6, 2024 at 7:59 AM Heikki Linnakangas <hlinn...@iki.fi> wrote:
Does GlobalVisTestIsRemovableXid() handle FrozenTransactionId correctly?
Okay, so I thought a lot about this, and I don't understand how
GlobalVisTestIsRemovableXid() would not handle FrozenTransactionId
correctly.
vacrel->cutoffs.OldestXmin is computed initially from
GetOldestNonRemovableTransactionId() which uses ComputeXidHorizons().
GlobalVisState is updated by ComputeXidHorizons() (when it is
updated).
I do see that the comment above GlobalVisTestIsRemovableXid() says
* It is crucial that this only gets called for xids from a source that
* protects against xid wraparounds (e.g. from a table and thus protected by
* relfrozenxid).
and then in
* Convert 32 bit argument to FullTransactionId. We can do so safely
* because we know the xid has to, at the very least, be between
* [oldestXid, nextXid), i.e. within 2 billion of xid.
I'm not sure what oldestXid is here.
It's true that I don't see GlobalVisTestIsRemovableXid() being called
anywhere else with an xmin as an input. I think that hints that it is
not safe with FrozenTransactionId. But I don't see what could go
wrong.
Maybe it has something to do with converting it to a FullTransactionId?
FullTransactionIdFromU64(U64FromFullTransactionId(rel) + (int32)
(xid - rel_xid));
Sorry, I couldn't quite figure it out :(
I just tested it. No, GlobalVisTestIsRemovableXid does not work for
FrozenTransactionId. I just tested it with state->definitely_needed ==
{0, 4000000000} and xid == FrozenTransactionid, and it incorrectly
returned 'false'. It treats FrozenTransactionId as if was a regular xid '2'.
The XLOG_HEAP2_FREEZE_PAGE record is a little smaller than
XLOG_HEAP2_PRUNE. But we could optimize the XLOG_HEAP2_PRUNE format for
the case that there's no pruning, just freezing. The record format
(xl_heap_prune) looks pretty complex as it is, I think it could be made
both more compact and more clear with some refactoring.
I'm happy to change up xl_heap_prune format. In its current form,
according to pahole, it has no holes and just 3 bytes of padding at
the end.
One way we could make it smaller is by moving the isCatalogRel member
to directly after snapshotConflictHorizon and then adding a flags
field and defining flags to indicate whether or not other members
exist at all. We could set bits for HAS_FREEZE_PLANS, HAS_REDIRECTED,
HAS_UNUSED, HAS_DEAD. Then I would remove the non-optional uint16
nredirected, nunused, nplans, ndead and just put the number of
redirected/unused/etc at the beginning of the arrays containing the
offsets.
Sounds good.
Then I could write various macros for accessing them. That
would make it smaller, but it definitely wouldn't make it less complex
(IMO).
I don't know, it might turn out not that complex. If you define the
formats of each of those "sub-record types" as explicit structs, per
attached sketch, you won't need so many macros. Some care is still
needed with alignment though.
--
Heikki Linnakangas
Neon (https://neon.tech)
/*
* XXX: This one combined record type replaces existing
* XLOG_HEAP2_PRUNE, XLOG_HEAP2_VACUUM, and XLOG_HEAP2_FREEZE_PAGE record types
*/
/*
* This is what we need to know about page pruning and freezing, both during
* VACUUM and during opportunistic pruning.
*
* If XLPH_HAS_REDIRECTIONS, XLHP_HAS_DEAD_ITEMS or XLHP_HAS_FREEZE_PLANS is
* set, acquires a full cleanup lock. Otherwise an ordinary exclusive lock is
* enough (VACUUM's second heap pass generates such records).
*
* The data for block reference 0 contains "sub-records" depending on which
* of the XLPH_HAS_* flags are set. See xlhp_* struct definitions below.
*/
typedef struct xl_heap_prune
{
TransactionId snapshotConflictHorizon;
uint8 flags;
} xl_heap_prune;
#define XLHP_IS_CATALOG_REL 0x01 /* to handle
recovery conflict during logical
* decoding on standby */
#define XLHP_HAS_REDIRECTIONS 0x02
#define XLHP_HAS_DEAD_ITEMS 0x04
#define XLHP_HAS_NOW_UNUSED_ITEMS 0x08
#define XLHP_HAS_FREEZE_PLANS 0x10
#define SizeOfHeapPrune (offsetof(xl_heap_prune, flags) + sizeof(uint8))
/*
* Sub-record contained in block reference 0 of a prune record if
* XLHP_HAS_REDIRECTIONS is set
*/
typedef struct xl_heap_prune_redirections
{
uint16 nitems;
struct
{
OffsetNumber from;
OffsetNumber to;
} items[FLEXIBLE_ARRAY_MEMBER];
} xl_heap_prune_redirections;
/*
* Sub-record contained in block reference 0 of a prune record if
* XLHP_HAS_DEAD_ITEMS is set
*/
typedef struct xlhp_dead_items
{
uint16 nitems;
OffsetNumber items[FLEXIBLE_ARRAY_MEMBER];
} xlhp_dead_items;
/*
* Sub-record contained in block reference 0 of a prune record if
* XLHP_HAS_NOW_UNUSED_ITEMS is set
*/
typedef struct xlhp_now_unused_items
{
uint16 nitems;
OffsetNumber items[FLEXIBLE_ARRAY_MEMBER];
} xlhp_now_unused_items;
/*
* This struct represents a 'freeze plan', which describes how to freeze a
* group of one or more heap tuples (appears in xlhp_freeze sub-record)
*/
#define XLH_FREEZE_XVAC 0x02
#define XLH_INVALID_XVAC 0x04
typedef struct xlhp_freeze_plan
{
TransactionId xmax;
uint16 t_infomask2;
uint16 t_infomask;
uint8 frzflags;
/* Length of individual page offset numbers array for this plan */
uint16 ntuples;
OffsetNumber items[FLEXIBLE_ARRAY_MEMBER];
} xlhp_freeze_plan;
/*
* This is what we need to know about a block being frozen during vacuum
*
* Backup block 0's data contains an array of xl_heap_freeze_plan structs
* (with nplans elements), followed by one or more page offset number arrays.
* Each such page offset number array corresponds to a single freeze plan
* (REDO routine freezes corresponding heap tuples using freeze plan).
*/
typedef struct xlhp_freeze
{
uint16 nplans;
/* xlhp_freeze_plan structs follow */
} xlhp_freeze;