On 2020/04/18 16:01, Noah Misch wrote:
On Fri, Apr 17, 2020 at 05:06:29PM +0900, Kyotaro Horiguchi wrote:
At Fri, 17 Apr 2020 17:00:15 +0900 (JST), Kyotaro Horiguchi
<horikyota....@gmail.com> wrote in
By the way, if latch is consumed in WalSndLoop, succeeding call to
WalSndWaitForWal cannot be woke-up by the latch-set. Doesn't that
cause missing wakeups? (in other words, overlooking of wakeup latch).
- Since the only source other than timeout of walsender wakeup is latch,
- we should avoid wasteful consuming of latch. (It is the same issue
- with [1]).
+ Since walsender is wokeup by LSN advancement via latch, we should
+ avoid wasteful consuming of latch. (It is the same issue with [1]).
If wakeup signal is not remembered on walsender (like
InterruptPending), WalSndPhysical cannot enter a sleep with
confidence.
No; per latch.h, "What must be avoided is placing any checks for asynchronous
events after WaitLatch and before ResetLatch, as that creates a race
condition." In other words, the thing to avoid is calling ResetLatch()
without next examining all pending work that a latch would signal. Each
walsender.c WaitLatch call does follow the rules.
On Sat, Apr 18, 2020 at 12:29:58AM +0900, Fujii Masao wrote:
On 2020/04/17 14:41, Noah Misch wrote:
1. Make XLogSendPhysical() more like XLogSendLogical(), calling
WalSndWaitForWal() when no WAL is available. A quick version of this
passes tests, but I'll need to audit WalSndWaitForWal() for things that are
wrong for physical replication.
(1) makes even physical replication walsender sleep in two places and
which seems to make the code for physical replication complicated
more than necessary. I'd like to avoid (1) if possible.
Good point.
2. Make XLogSendLogical() more like XLogSendPhysical(), returning when
insufficient WAL is available. This complicates the xlogreader.h API to
pass back "wait for this XLogRecPtr", and we'd then persist enough state to
resume decoding. This doesn't have any advantages to make up for those.
3. Don't avoid waiting in WalSndLoop(); instead, fix the stall by copying the
WalSndKeepalive() call from WalSndWaitForWal() to WalSndLoop(). This risks
further drift between the two wait sites; on the other hand, one could
refactor later to help avoid that.
Since the additional call of WalSndKeepalive() is necessary only for
logical replication, it should be copied to, e.g., XLogSendLogical(),
instead of WalSndLoop()? For example, when XLogSendLogical() sets
WalSndCaughtUp to true, it should call WalSndKeepalive()?
We'd send a keepalive even when pq_flush_if_writable() can't empty the output
buffer. That could be acceptable, but it's not ideal.
The root problem seems that when WAL record that's no-opes for
logical rep is processed, keep alive message has not sent immediately,
in spite of that we want pg_stat_replication to be updated promptly.
The degree of promptness should be predictable, at least. If we removed the
WalSndKeepalive() from WalSndWaitForWal(), pg_stat_replication updates would
not be prompt, but they would be predictable. I do, however, think prompt
updates are worthwhile.
(3) seems to try to address this problem straightly and looks better to me.
4. Keep the WalSndLoop() wait, but condition it on !logical. This is the
minimal fix, but it crudely punches through the abstraction between
WalSndLoop() and its WalSndSendDataCallback.
(4) also looks good because it's simple, if we can redesign those
functions in good shape.
Let's do that. I'm attaching the replacement implementation and the revert of
v1.
Thanks for the patch! Though referencing XLogSendLogical inside WalSndLoop()
might be a bit ugly,, I'm fine with this change because it's simple and easier
to understand.
+ * Block if we have unsent data. XXX For logical replication,
let
+ * WalSndWaitForWal(), handle any other blocking; idle
receivers need
+ * its additional actions. For physical replication, also
block if
+ * caught up; its send_data does not block.
It might be better to s/WalSndWaitForWal()/send_data()? Because not only
WalSndWaitForWal() but also WalSndWriteData() seems to handle the blocking.
WalSndWriteData() is called also under send_data, i.e., XLogSendLogical().
frame #2: 0x0000000106bcfa84
postgres`WalSndWriteData(ctx=0x00007fb2a4812d20, lsn=22608080, xid=488,
last_write=false) at walsender.c:1247:2
frame #3: 0x0000000106b98295
postgres`OutputPluginWrite(ctx=0x00007fb2a4812d20, last_write=false) at
logical.c:540:2
frame #4: 0x00000001073fe9b8
pgoutput.so`send_relation_and_attrs(relation=0x00000001073ba2c0,
ctx=0x00007fb2a4812d20) at pgoutput.c:353:2
frame #5: 0x00000001073fe7a0
pgoutput.so`maybe_send_schema(ctx=0x00007fb2a4812d20,
relation=0x00000001073ba2c0, relentry=0x00007fb2a483aa60) at pgoutput.c:315:2
frame #6: 0x00000001073fd4c0
pgoutput.so`pgoutput_change(ctx=0x00007fb2a4812d20, txn=0x00007fb2a502e428,
relation=0x00000001073ba2c0, change=0x00007fb2a5030428) at pgoutput.c:394:2
frame #7: 0x0000000106b99094
postgres`change_cb_wrapper(cache=0x00007fb2a482ed20, txn=0x00007fb2a502e428,
relation=0x00000001073ba2c0, change=0x00007fb2a5030428) at logical.c:753:2
frame #8: 0x0000000106ba2200
postgres`ReorderBufferCommit(rb=0x00007fb2a482ed20, xid=488,
commit_lsn=22621088, end_lsn=22621136, commit_time=640675460323211,
origin_id=0, origin_lsn=0) at reorderbuffer.c:1653:7
frame #9: 0x0000000106b93c10 postgres`DecodeCommit(ctx=0x00007fb2a4812d20,
buf=0x00007ffee954c0f8, parsed=0x00007ffee954bf90, xid=488) at decode.c:637:2
frame #10: 0x0000000106b92fa9 postgres`DecodeXactOp(ctx=0x00007fb2a4812d20,
buf=0x00007ffee954c0f8) at decode.c:245:5
frame #11: 0x0000000106b92aee
postgres`LogicalDecodingProcessRecord(ctx=0x00007fb2a4812d20,
record=0x00007fb2a4812fe0) at decode.c:114:4
frame #12: 0x0000000106bd2a16 postgres`XLogSendLogical at walsender.c:2863:3
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION