Re: Make unlogged table resets detectable

Heikki Linnakangas Thu, 03 Jun 2021 23:42:40 -0700

On 03/06/2021 23:04, Jeff Davis wrote:

One problem with unlogged tables is that the application has no way to
tell if they were reset, or they just happen to be empty.


This can be a problem with sharding, where you might have different
shards of an unlogged table on different servers. If one server
crashes, you'll be missing only one shard of the data, which may appear
inconsistent. In that case, you'd like the application (or sharding
solution) to be able to detect that one shard was lost, and TRUNCATE
those that remain to get back to a reasonable state.

It would be easy enough for the init fork to have a single page with a
flag set. That way, when the main fork is replaced with the init fork,
other code could detect that a reset happened.

I'd suggest using a counter rather than a flag. With a flag, if oneclient clears the flag to acknowledge that a truncation happened, othersmight miss it. See also ABA problem.

When detected, depending on a GUC, the behavior could be to auto-
truncate it (to get the current silent behavior), or refuse to perform
the operation (except an explicit TRUNCATE), or issue a
warning/log/notice.

TRUNCATE isn't quite what happens when an unlogged table isre-initialized. It changes the relfilenode, resets stats, and requires amore strict lock. So I don't think repurposing TRUNCATE forre-initializing a table is a good idea. There's also potential for arace condition, if two connections see that a table needsre-initialization, and issue "TRUNCATE + INSERT" concurrently. One ofthe INSERTs will be lost.


A warning or notice is easy to miss.

The biggest challenge would be: when should we detect that the reset
has happened? There might be a lot of entry points. Another idea would
be to just have a SQL function that the application could call whenever
it needs to know.


Yeah, a SQL function to get the current "reset counter" would be nice.

- Heikki

Re: Make unlogged table resets detectable

Reply via email to