On 2020/01/31 23:47, Alvaro Herrera wrote:
On 2020-Jan-31, Fujii Masao wrote:
On 2020/01/31 22:40, Alvaro Herrera wrote:
On 2020-Jan-31, Fujii Masao wrote:

You're thinking to apply this change to the back branches? Sorry
if my understanding is not right. But I don't think that back-patch
is ok because it changes the documented existing behavior
of pg_last_xact_replay_timestamp(). So it looks like the behavior
change not a bug fix.

Yeah, I am thinking in backpatching it.  The documented behavior is
already not what the code does.

Maybe you thought this because getRecordTimestamp() extracts the
timestamp from even WAL record of a restore point? That is, you're
concerned about that pg_last_xact_replay_timestamp() returns the
timestamp of not only commit/abort record but also restore point one.
Right?

right.

As far as I read the code, this problem doesn't occur because
SetLatestXTime() is called only for commit/abort records, in
recoveryStopsAfter(). No?

... uh, wow, you're right about that too.  IMO this is extremely
fragile, easy to break, and under-documented.

Yeah, it's worth improving the code.

But you're right, there's
no bug there at present.

  Do you have a situation where this
change would break something?  If so, can you please explain what it is?

For example, use the return value of pg_last_xact_replay_timestamp()
(and also the timestamp in the log message output at the end of
recovery) as a HINT when setting recovery_target_time later.

Hmm.

I'm not sure how you would use it in that way.  I mean, I understand how
it *can* be used that way, but it seems too fragile to be done in
practice, in a scenario that's not just laboratory games.

Use it to compare with the timestamp retrieved from the master server,
in order to monitor the replication delay.

That's precisely the use case that I'm aiming at.  The timestamp
currently is not useful because this usage breaks when the primary is
inactive (no COMMIT records occur).  During such periods of inactivity,
CHECKPOINT records would keep the "last xtime" current.  This has
actually happened in a production setting, it's not a thought
experiment.

I've heard that someone periodically generates dummy tiny
transactions (say, every minute), as a band-aid solution,
to avoid inactive primary. Of course, this is not a perfect solution.

The idea that I proposed previously was to introduce
pg_last_xact_insert_timestamp() [1] into core. This function returns
the timestamp of commit / abort records in *primary* side.
So we can retrieve that timestamp from the primary (e.g., by using dblink)
and compare its result with pg_last_xact_replay_timestamp() to
calculate the delay in the standby.

Another idea is to include the commit / abort timestamp in
primary-keepalive-message that periodially sent from the primary
to the standby. Then if we introduce the function returning
that timestamp, in the standby side, we can easily compare
the commit / abort timestamps taken from both primary and
standby, in the standby.

[1] 
https://www.postgresql.org/message-id/cahgqgwf3zjfunej5ka683ku5rqubtswtqfq7g1x0g34o+jx...@mail.gmail.com

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters


Reply via email to