On Thu, Aug 4, 2022 at 9:41 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
>
> On Thu, Aug 4, 2022 at 12:18 AM Justin Pryzby <pry...@telsasoft.com> wrote:
> >
> > On Wed, Aug 03, 2022 at 11:26:43AM -0700, Andres Freund wrote:
> > > Hm. This looks more like an issue of DROP DATABASE not being
> > > interruptible. I suspect this isn't actually related to STRATEGY
> > > wal_log and could likely be reproduced in older versions too.
> >
> > I couldn't reproduce it with file_copy, but my recipe isn't exactly
> > reliable. That may just mean that it's easier to hit now.
>
> I think this looks like a problem with drop db but IMHO you are seeing
> this behavior only when a database is created using WAL LOG because in
> this strategy we are using buffers to write the destination database
> pages and some of the dirty buffers and sync requests might still be
> pending. And now when we try to drop the database it drops all the
> dirty buffers and all pending sync requests and then before it
> actually removes the directory it gets interrupted and now you see the
> database directory on disk which is partially corrupted. See below
> sequence of drop database
>
> dropdb()
> {
>     ...
>     DropDatabaseBuffers(db_id);
>     ...
>     ForgetDatabaseSyncRequests(db_id);
>     ...
>     RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
>
>     WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_SMGRRELEASE));
>     -- Inside this it can process the cancel query and get interrupted
>     remove_dbtablespaces(db_id);
>     ..
> }
>
> I reproduced the same error by inducing error just before
> WaitForProcSignalBarrier.
>
> postgres[14968]=# CREATE DATABASE a STRATEGY WAL_LOG ; drop database a;
> CREATE DATABASE
> ERROR: XX000: test error
> LOCATION: dropdb, dbcommands.c:1684
> postgres[14968]=# \c a
> connection to server on socket "/tmp/.s.PGSQL.5432" failed: PANIC: could not open critical system index 2662
> Previous connection kept
> postgres[14968]=#
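
To make the ordering problem in the sequence above easier to see, here is a minimal toy model in C. It is only a sketch and not PostgreSQL code: ToyDatabase, toy_dropdb() and toy_is_corrupt() are hypothetical names made up for illustration, and the model assumes that an interrupted drop leaves the pg_database entry in place, as the \c attempt in the reproduction suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef struct ToyDatabase
{
    int  dirty_buffers;     /* pages that live only in shared buffers */
    int  pending_syncs;     /* queued sync requests */
    int  lost_pages;        /* pages discarded before reaching disk */
    bool directory_exists;  /* database directory still on disk */
    bool catalog_entry;     /* pg_database row still present */
} ToyDatabase;

/*
 * Mimics the order of operations in the dropdb() outline above.
 * Returns true if the drop ran to completion, false if it was
 * interrupted while waiting for the barrier.
 */
bool
toy_dropdb(ToyDatabase *db, bool interrupt_at_barrier)
{
    assert(db != NULL);

    /* Stands in for DropDatabaseBuffers(): dirty pages are thrown away. */
    db->lost_pages += db->dirty_buffers;
    db->dirty_buffers = 0;

    /* Stands in for ForgetDatabaseSyncRequests(). */
    db->pending_syncs = 0;

    /* Stands in for a cancel processed inside WaitForProcSignalBarrier(). */
    if (interrupt_at_barrier)
        return false;       /* directory and catalog entry are left behind */

    /* Stands in for remove_dbtablespaces() and removal of the catalog row. */
    db->directory_exists = false;
    db->catalog_entry = false;
    return true;
}

/* The database can still be looked up, but some of its pages are gone. */
bool
toy_is_corrupt(const ToyDatabase *db)
{
    return db->catalog_entry && db->lost_pages > 0;
}
```

In this model, a database that starts with dirty_buffers > 0 (the WAL_LOG case described above) and is interrupted at the barrier keeps its directory and catalog entry while having lost pages, so toy_is_corrupt() returns true. The same call with interrupt_at_barrier set to false leaves nothing behind.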

So basically, from this we can say it is entirely a problem with DROP
DATABASE. I can produce all sorts of broken states by interrupting
DROP DATABASE:

1. If we created some tables and inserted data, and the drop database
then got cancelled, the database directory might still be there but
those objects are lost.

2. Or the database directory can be removed and the command cancelled
before the pg_database entry is deleted. Then you also end up with a
corrupted database, regardless of whether it was created with WAL_LOG
or FILE_COPY.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com