Hi Hackers. As discovered elsewhere [ak0125] there is a potential race condition in the pg_replication_origin_drop API
The current code of pg_replication_origin_drop looks like: ==== roident = replorigin_by_name(name, false); Assert(OidIsValid(roident)); replorigin_drop(roident, true); ==== Users cannot deliberately drop a non-existent origin (replorigin_by_name passes missing_ok = false) but there is still a small window where concurrent processes may be able to call replorigin_drop for the same valid roident. Locking within replorigin_drop guards against concurrent drops so the 1st execution will succeed, but then the 2nd execution would give internal cache error: elog(ERROR, "cache lookup failed for replication origin with oid %u", roident); Some ideas to fix this include: 1. Do nothing except write a comment about this in the code. The internal ERROR is not ideal for a user API there is no great harm done. 2. Change the behavior of replorigin_drop to be like replorigin_drop_IF_EXISTS, so the 2nd execution of this race would silently do nothing when it finds the roident is already gone. 3. Same as 2, but make the NOP behavior more explicit by introducing a new "missing_ok" parameter for replorigin_drop. Thoughts? ---- [ak0125] https://www.postgresql.org/message-id/CAA4eK1%2ByeLwBCkTvTdPM-hSk1fr6jT8KJc362CN8zrGztq_JqQ%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia