On 2020-Jun-17, Kyotaro Horiguchi wrote: > @@ -9524,7 +9533,7 @@ GetWALAvailability(XLogRecPtr targetLSN) > * the first WAL segment file since startup, which causes the status > being > * wrong under certain abnormal conditions but that doesn't actually > harm. > */ > - oldestSeg = XLogGetLastRemovedSegno() + 1; > + oldestSeg = last_removed_seg + 1; > > /* calculate oldest segment by max_wal_size and wal_keep_segments */ > XLByteToSeg(currpos, currSeg, wal_segment_size);
This hunk should have updated the comment two lines above. However: > @@ -272,6 +273,14 @@ pg_get_replication_slots(PG_FUNCTION_ARGS) > rsinfo->setResult = tupstore; > rsinfo->setDesc = tupdesc; > > + /* > + * Remember the last removed segment at this point for the consistency > in > + * this table. Since there's no interlock between slot data and > + * checkpointer, the segment can be removed in-between, but that doesn't > + * make any practical difference. > + */ > + last_removed_seg = XLogGetLastRemovedSegno(); I am mystified as to why you added this change. I understand that your point here is to make all slots reported their state as compared to the same LSN, but why do it like that? If a segment is removed in between, it could mean that the view reports more lies than it would if we update the segno for each slot. I mean, suppose two slots are lagging behind and one is reported as 'extended' because when we compute it it's still in range; then a segment is removed. With your coding, we'll report both as extended, but with the original coding, we'll report the new one as lost. By the time the user reads the result, they'll read one incorrect report with the original code, and two incorrect reports with your code. So ... yes it might be more consistent, but what does that buy the user? OTOH it makes GetWALAvailability gain a new argument, which we have to document. > + /* > + * However segments required by the slot has been lost, if walsender is > + * active the walsender can read into the first reserved slot. > + */ > + if (slot_is_active) > + return WALAVAIL_BEING_REMOVED; I don't understand this comment; can you please clarify what you mean? I admit I don't like this slot_is_active argument you propose to add to GetWALAvailability either; previously the function can be called with an LSN coming from anywhere, not just a slot; the new argument implies that the LSN comes from a slot. (Your proposed patch doesn't document this one either.) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services