On 2020-Apr-28, Kyotaro Horiguchi wrote: > At Mon, 27 Apr 2020 18:33:42 -0400, Alvaro Herrera <alvhe...@2ndquadrant.com> > wrote in > > On 2020-Apr-08, Kyotaro Horiguchi wrote: > > > > > At Wed, 08 Apr 2020 09:37:10 +0900 (JST), Kyotaro Horiguchi > > > <horikyota....@gmail.com> wrote in
> > Thanks for the fix! I propose two changes: > > > > 1. reword the error like this: > > > > ERROR: replication slot "regression_slot3" cannot be advanced > > DETAIL: This slot has never previously reserved WAL, or has been > > invalidated > > Agreed to describe what is failed rather than the cause. However, > logical replications slots are always "previously reserved" at > creation. Bah, of course. I was thinking in making the equivalent messages all identical in all callsites, but maybe they should be different when slots are logical. I'll go over them again. > > 2. use the same error in one other place, to wit > > pg_logical_slot_get_changes() and pg_replication_slot_advance(). I > > made the DETAIL part the same in all places, but the ERROR line is > > adjusted to what each callsite is doing. > > I do think that this change in test_decoding is a bit unpleasant: > > > > -ERROR: cannot use physical replication slot for logical decoding > > +ERROR: cannot get changes from replication slot "repl" > > > > The test is > > -- check that we're detecting a streaming rep slot used for logical > > decoding > > SELECT 'init' FROM pg_create_physical_replication_slot('repl'); > > SELECT data FROM pg_logical_slot_get_changes('repl', NULL, NULL, > > 'include-xids', '0', 'skip-empty-xacts', '1'); > > The message may be understood as "No change has been made since > restart_lsn". Does something like the following work? > > ERROR: replication slot "repl" is not usable to get changes That wording seems okay, but my specific point for this error message is that we were trying to use a physical slot to get logical changes; so the fact that the slot has been invalidated is secondary and we should complain about the *type* of slot rather than the restart_lsn. > By the way there are some other messages that doesn't render the > symptom but the cause. > > "cannot use physical replication slot for logical decoding" > "replication slot \"%s\" was not created in this database" > > Don't they need the same amendment? Maybe, but I don't want to start rewording every single message in uses of replication slots ... I prefer to only modify the ones related to the problem at hand. > > > > On the other hand, physical replication doesn't break by invlidation. > > > > [...] > > Anyway I think the current patch can be applied as is -- and if we want > > physical replication to have some other behavior, we can patch for that > > afterwards. > > Agreed here. The false-invalidation doesn't lead to any serious > consequences. But does it? What happens, for example, if we have a slot used to get a pg_basebackup, then time passes before starting to stream from it and is invalidated? I think this "works fine" (meaning that once we try to stream from the slot to replay at the restored base backup, we will raise an error immediately), but I haven't tried. The worst situation would be producing a corrupt replica. I don't think this is possible. The ideal behavior I think would be that pg_basebackup aborts immediately when the slot is invalidated, to avoid wasting more time producing a doomed backup. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services