Hi, On Wed, 27 Jan 2021 11:25:11 +0100 Denis Laxalde <denis.laxa...@dalibo.com> wrote:
> Andres Freund a écrit : > > On 2021-01-21 16:23:58 +0100, Denis Laxalde wrote: > > > We found an issue in pg_upgrade on a cluster with a third-party > > > background worker. The upgrade goes fine, but the new cluster is then in > > > an inconsistent state. The background worker comes from the PoWA > > > extension but the issue does not appear to related to this particular > > > code. > > > > Well, it does imply that that backgrounder did something, as the pure > > existence of a bgworker shouldn't affect anything. Presumably the issue is > > that the bgworker actually does transactional writes, which causes problems > > because the xids / multixacts from early during pg_upgrade won't actually > > be valid after we do pg_resetxlog etc. Indeed, it does some writes. As soon as the powa bgworker starts, it takes "snapshots" of pg_stat_statements state and record them in a local table. To avoid concurrent run, it takes a lock on some of its local rows using SELECT FOR UPDATE, hence the mxid consumption. The inconsistency occurs at least at two place: * the datminmxid and relminmxid fields pg_dump(all)'ed and restored in the new cluster. * the multixid fields in the controlfile read during the check phase and restored later using pg_resetxlog. > > > As a solution, it seems that, for similar reasons that we restrict > > > socket access to prevent accidental connections (from commit > > > f763b77193), we should also prevent background workers to start at this > > > step. > > > > I think that'd have quite the potential for negative impact - [...] > > Thank you for those insights! +1 > > I wonder if we could > > > > a) set default_transaction_read_only to true, and explicitly change it > > in the sessions that need that. According to Denis' tests discussed off-list, it works fine in regard with the powa bgworker, albeit some complaints in logs. However, I wonder how fragile it could be as bgworker could easily manipulate either the GUC or "BEGIN READ WRITE". I realize this is really uncommon practices, but bgworker code from third parties might be surprising. > > b) when in binary upgrade mode / -b, error out on all wal writes in > > sessions that don't explicitly set a session-level GUC to allow > > writes. It feels safer because more specific to the subject. But I wonder if the benefice worth adding some (limited?) complexity and a small/quick conditional block in a very hot code path for a very limited use case. What about c) where the bgworker are not loaded by default during binary upgrade mode / -b unless they explicitly set a bgw_flags (BGWORKER_BINARY_UPGRADE ?) when they are required during pg_upgrade? Regards,