On Tue, Mar 29, 2022 at 11:55:05AM -0400, Robert Haas wrote:
> On Mon, Mar 28, 2022 at 3:08 PM Robert Haas <robertmh...@gmail.com> wrote:
> > smgrcreate() as we would for most WAL records or whether it should be
> > adopting the new system introduced by
> > 49d9cfc68bf4e0d32a948fe72d5a0ef7f464944e. I wrote about this concern
> > over here:
> >
> > http://postgr.es/m/CA+TgmoYcUPL+WOJL2ZzhH=zmrhj0iOQ=icfm0suyqbbqzea...@mail.gmail.com
> >
> > But apart from that question your adaptations here look reasonable to me.
>
> That commit having been reverted, I committed v6 instead. Let's see
> what breaks...

There's a crash:

2022-07-31 01:22:51.437 CDT client backend[13362] [unknown] PANIC: could not open critical system index 2662

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007efe27999801 in __GI_abort () at abort.c:79
#2  0x00005583891941dc in errfinish (filename=<optimized out>, filename@entry=0x558389420437 "relcache.c", lineno=lineno@entry=4328,
    funcname=funcname@entry=0x558389421680 <__func__.33178> "load_critical_index") at elog.c:675
#3  0x00005583891713ef in load_critical_index (indexoid=indexoid@entry=2662, heapoid=heapoid@entry=1259) at relcache.c:4328
#4  0x0000558389172667 in RelationCacheInitializePhase3 () at relcache.c:4103
#5  0x00005583891b93a4 in InitPostgres (in_dbname=in_dbname@entry=0x55838a50d468 "a", dboid=dboid@entry=0, username=username@entry=0x55838a50d448 "pryzbyj",
    useroid=useroid@entry=0, load_session_libraries=<optimized out>, override_allow_connections=override_allow_connections@entry=false, out_dbname=0x0) at postinit.c:1087
#6  0x0000558388daa7bb in PostgresMain (dbname=0x55838a50d468 "a", username=username@entry=0x55838a50d448 "pryzbyj") at postgres.c:4081
#7  0x0000558388b9f423 in BackendRun (port=port@entry=0x55838a505dd0) at postmaster.c:4490
#8  0x0000558388ba6e07 in BackendStartup (port=port@entry=0x55838a505dd0) at postmaster.c:4218
#9  0x0000558388ba747f in ServerLoop () at postmaster.c:1808
#10 0x0000558388ba8f93 in PostmasterMain (argc=7, argv=<optimized out>) at postmaster.c:1480
#11 0x0000558388840e1f in main (argc=7, argv=0x55838a4dc000) at main.c:197

while :; do psql -qh /tmp postgres -c "DROP DATABASE a" -c "CREATE DATABASE a TEMPLATE postgres STRATEGY wal_log"; done

# Run this for a few loops and then ^C or hold down ^C until it stops,
# and then connect to postgres and try to connect to 'a':

postgres=# \c a
2022-07-31 01:22:51.437 CDT client backend[13362] [unknown] PANIC: could not open critical system index 2662

Unfortunately, that isn't very consistent, and you have to run it
a bunch of times... I don't know if it's an issue of any significance that
CREATE DATABASE / ^C leaves behind a broken database, but it is an issue that
the cluster crashes.

While struggling to reproduce that problem, I also hit this warning, which may
or may not be the same issue. I added an abort() after WARNING in aset.c to
get a backtrace.

WARNING: problem in alloc set PortalContext: bogus aset link in block 0x55a63f2f9d60, chunk 0x55a63f2fb138

Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f81144f1801 in __GI_abort () at abort.c:79
#2  0x000055a63c834c5d in AllocSetCheck (context=context@entry=0x55a63f26fea0) at aset.c:1491
#3  0x000055a63c835b09 in AllocSetDelete (context=0x55a63f26fea0) at aset.c:638
#4  0x000055a63c854322 in MemoryContextDelete (context=0x55a63f26fea0) at mcxt.c:252
#5  0x000055a63c8591d5 in PortalDrop (portal=portal@entry=0x55a63f2bb7a0, isTopCommit=isTopCommit@entry=false) at portalmem.c:596
#6  0x000055a63c3e4a7b in exec_simple_query (query_string=query_string@entry=0x55a63f24db90 "CREATE DATABASE a TEMPLATE postgres STRATEGY wal_log ;") at postgres.c:1253
#7  0x000055a63c3e7fc1 in PostgresMain (dbname=<optimized out>, username=username@entry=0x55a63f279448 "pryzbyj") at postgres.c:4505
#8  0x000055a63c1dc423 in BackendRun (port=port@entry=0x55a63f271dd0) at postmaster.c:4490
#9  0x000055a63c1e3e07 in BackendStartup (port=port@entry=0x55a63f271dd0) at postmaster.c:4218
#10 0x000055a63c1e447f in ServerLoop () at postmaster.c:1808
#11 0x000055a63c1e5f93 in PostmasterMain (argc=7, argv=<optimized out>) at postmaster.c:1480
#12 0x000055a63be7de1f in main (argc=7, argv=0x55a63f248000) at main.c:197

I reproduced that by running this a couple dozen times in an interactive psql.
It doesn't seem to affect STRATEGY=file_copy.

SET statement_timeout=0;
DROP DATABASE a;
SET statement_timeout='60ms';
CREATE DATABASE a TEMPLATE postgres STRATEGY wal_log ;
\c a
\c postgres

Also, if I understand correctly, this patch seems to assume that nobody is
connected to the source database. But what's actually enforced is just that
nobody *else* is connected. Is it any issue that the current DB can be used as
a source? Anyway, both of the above problems are reproducible using a
different database.

|postgres=# CREATE DATABASE new TEMPLATE postgres STRATEGY wal_log;
|CREATE DATABASE

--
Justin