On 2021/09/01 12:15, Andres Freund wrote:
Hi,

On 2021-09-01 11:34:34 +0900, Fujii Masao wrote:
On 2021/09/01 0:53, Andres Freund wrote:
Of course, we need to be careful to not weaken WAL validity checking too
much. How about the following:

If we're "aborting" a continued record, we set XLP_FIRST_IS_ABORTED_PARTIAL on
the page at which we do so (i.e. the page after the valid end of the WAL).

When do you expect that XLP_FIRST_IS_ABORTED_PARTIAL is set? It's set
when recovery finds a partially-flushed segment-spanning record?
But maybe we cannot do that (i.e., cannot overwrite the page) because
the page that the flag is set in might have already been archived. No?

I was imagining that XLP_FIRST_IS_ABORTED_PARTIAL would be set in the "tail
end" of a partial record. I.e. if there's a partial record starting in the
successfully archived segment A, but the end of the record, in B, has not been
written to disk before a crash, we'd set XLP_FIRST_IS_ABORTED_PARTIAL at the
end of the valid data in B. Which could not have been archived yet, or we'd
not have a partial record.  So we should never need to set the flag on an
already archived page.

Thanks for clarifying that! Unless I'm misunderstanding, when recovery ends
at a partially-flushed segment-spanning record, it sets
XLP_FIRST_IS_ABORTED_PARTIAL in the next segment before it starts writing
new WAL data there. Therefore either XLP_FIRST_IS_CONTRECORD or
XLP_FIRST_IS_ABORTED_PARTIAL must be set in the next segment following
a partially-flushed segment-spanning record. If neither is set,
the validation code in recovery should report an error.
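
To make the rule above concrete, here is a rough sketch of the check, not
actual PostgreSQL code. XLP_FIRST_IS_CONTRECORD is the existing page-header
flag; the bit value chosen for the proposed XLP_FIRST_IS_ABORTED_PARTIAL, the
result enum, and the function name are all assumptions made up for
illustration. The case where both flags are set is not discussed in this
thread; the sketch simply gives the continuation flag precedence.

```c
#include <stdint.h>

/* Existing page-header flag (see xlog_internal.h). */
#define XLP_FIRST_IS_CONTRECORD       0x0001
/* Proposed flag; this bit value is an assumption for illustration only. */
#define XLP_FIRST_IS_ABORTED_PARTIAL  0x0008

/* Hypothetical result type for the check. */
typedef enum
{
    CONT_RECORD_CONTINUES,      /* rest of the spanning record follows */
    CONT_RECORD_ABORTED,        /* partial record was abandoned; new WAL follows */
    CONT_RECORD_INVALID         /* neither flag set: report an error */
} ContCheckResult;

/*
 * Hypothetical helper: given xlp_info of the first page after a record that
 * spans a segment boundary, decide how recovery should proceed.
 */
static ContCheckResult
check_page_after_partial_record(uint16_t xlp_info)
{
    if (xlp_info & XLP_FIRST_IS_CONTRECORD)
        return CONT_RECORD_CONTINUES;

    if (xlp_info & XLP_FIRST_IS_ABORTED_PARTIAL)
        return CONT_RECORD_ABORTED;

    /* Neither flag is set, so the WAL is inconsistent at this point. */
    return CONT_RECORD_INVALID;
}
```

With this shape, the existing "missing contrecord" error path would only be
taken for CONT_RECORD_INVALID, so the validity checking is relaxed for
exactly one additional, explicitly marked case.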

Yes, this design looks good to me.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

