On 03/06/2021 23:04, Jeff Davis wrote:
One problem with unlogged tables is that the application has no way to
tell if they were reset, or they just happen to be empty.

This can be a problem with sharding, where you might have different
shards of an unlogged table on different servers. If one server
crashes, you'll be missing only one shard of the data, which may appear
inconsistent. In that case, you'd like the application (or sharding
solution) to be able to detect that one shard was lost, and TRUNCATE
those that remain to get back to a reasonable state.

It would be easy enough for the init fork to have a single page with a
flag set. That way, when the main fork is replaced with the init fork,
other code could detect that a reset happened.

I'd suggest using a counter rather than a flag. With a flag, if one client clears the flag to acknowledge that a truncation happened, others might miss it. See also ABA problem.

When detected, depending on a GUC, the behavior could be to auto-
truncate it (to get the current silent behavior), or refuse to perform
the operation (except an explicit TRUNCATE), or issue a
warning/log/notice.

TRUNCATE isn't quite what happens when an unlogged table is re-initialized. It changes the relfilenode, resets stats, and requires a more strict lock. So I don't think repurposing TRUNCATE for re-initializing a table is a good idea. There's also potential for a race condition, if two connections see that a table needs re-initialization, and issue "TRUNCATE + INSERT" concurrently. One of the INSERTs will be lost.

A warning or notice is easy to miss.

The biggest challenge would be: when should we detect that the reset
has happened? There might be a lot of entry points. Another idea would
be to just have a SQL function that the application could call whenever
it needs to know.

Yeah, a SQL function to get the current "reset counter" would be nice.

- Heikki


Reply via email to