On Tue, Mar 29, 2022 at 11:55:05AM -0400, Robert Haas wrote:
> On Mon, Mar 28, 2022 at 3:08 PM Robert Haas <robertmh...@gmail.com> wrote:
> > smgrcreate() as we would for most WAL records or whether it should be
> > adopting the new system introduced by
> > 49d9cfc68bf4e0d32a948fe72d5a0ef7f464944e. I wrote about this concern
> > over here:
> >
> > http://postgr.es/m/CA+TgmoYcUPL+WOJL2ZzhH=zmrhj0iOQ=icfm0suyqbbqzea...@mail.gmail.com
> >
> > But apart from that question your adaptations here look reasonable to me.
> 
> That commit having been reverted, I committed v6 instead. Let's see
> what breaks...

There's a crash

2022-07-31 01:22:51.437 CDT client backend[13362] [unknown] PANIC:  could not 
open critical system index 2662

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007efe27999801 in __GI_abort () at abort.c:79
#2  0x00005583891941dc in errfinish (filename=<optimized out>, 
filename@entry=0x558389420437 "relcache.c", lineno=lineno@entry=4328,
    funcname=funcname@entry=0x558389421680 <__func__.33178> 
"load_critical_index") at elog.c:675
#3  0x00005583891713ef in load_critical_index (indexoid=indexoid@entry=2662, 
heapoid=heapoid@entry=1259) at relcache.c:4328
#4  0x0000558389172667 in RelationCacheInitializePhase3 () at relcache.c:4103
#5  0x00005583891b93a4 in InitPostgres 
(in_dbname=in_dbname@entry=0x55838a50d468 "a", dboid=dboid@entry=0, 
username=username@entry=0x55838a50d448 "pryzbyj", useroid=useroid@entry=0,
    load_session_libraries=<optimized out>, 
override_allow_connections=override_allow_connections@entry=false, 
out_dbname=0x0) at postinit.c:1087
#6  0x0000558388daa7bb in PostgresMain (dbname=0x55838a50d468 "a", 
username=username@entry=0x55838a50d448 "pryzbyj") at postgres.c:4081
#7  0x0000558388b9f423 in BackendRun (port=port@entry=0x55838a505dd0) at 
postmaster.c:4490
#8  0x0000558388ba6e07 in BackendStartup (port=port@entry=0x55838a505dd0) at 
postmaster.c:4218
#9  0x0000558388ba747f in ServerLoop () at postmaster.c:1808
#10 0x0000558388ba8f93 in PostmasterMain (argc=7, argv=<optimized out>) at 
postmaster.c:1480
#11 0x0000558388840e1f in main (argc=7, argv=0x55838a4dc000) at main.c:197

while :; do psql -qh /tmp postgres -c "DROP DATABASE a" -c "CREATE DATABASE a 
TEMPLATE postgres STRATEGY wal_log"; done
# Run this for a few loops and then ^C or hold down ^C until it stops,
# and then connect to postgres and try to connect to 'a':
postgres=# \c a
2022-07-31 01:22:51.437 CDT client backend[13362] [unknown] PANIC:  could not 
open critical system index 2662

Unfortunately, that isn't very consistent, and you have have to run it a bunch
of times...

I don't know if it's an issue of any significance that CREATE DATABASE / ^C
leaves behind a broken database, but it is an issue that the cluster crashes.

While struggling to reproduce that problem, I also hit this warning, which may
or may not be the same.  I added an abort() after WARNING in aset.c to get a
backtrace.

WARNING:  problem in alloc set PortalContext: bogus aset link in block 
0x55a63f2f9d60, chunk 0x55a63f2fb138

Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51      ../sysdeps/unix/sysv/linux/raise.c: No existe el archivo o el 
directorio.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f81144f1801 in __GI_abort () at abort.c:79
#2  0x000055a63c834c5d in AllocSetCheck (context=context@entry=0x55a63f26fea0) 
at aset.c:1491
#3  0x000055a63c835b09 in AllocSetDelete (context=0x55a63f26fea0) at aset.c:638
#4  0x000055a63c854322 in MemoryContextDelete (context=0x55a63f26fea0) at 
mcxt.c:252
#5  0x000055a63c8591d5 in PortalDrop (portal=portal@entry=0x55a63f2bb7a0, 
isTopCommit=isTopCommit@entry=false) at portalmem.c:596
#6  0x000055a63c3e4a7b in exec_simple_query 
(query_string=query_string@entry=0x55a63f24db90 "CREATE DATABASE a TEMPLATE 
postgres STRATEGY wal_log ;") at postgres.c:1253
#7  0x000055a63c3e7fc1 in PostgresMain (dbname=<optimized out>, 
username=username@entry=0x55a63f279448 "pryzbyj") at postgres.c:4505
#8  0x000055a63c1dc423 in BackendRun (port=port@entry=0x55a63f271dd0) at 
postmaster.c:4490
#9  0x000055a63c1e3e07 in BackendStartup (port=port@entry=0x55a63f271dd0) at 
postmaster.c:4218
#10 0x000055a63c1e447f in ServerLoop () at postmaster.c:1808
#11 0x000055a63c1e5f93 in PostmasterMain (argc=7, argv=<optimized out>) at 
postmaster.c:1480
#12 0x000055a63be7de1f in main (argc=7, argv=0x55a63f248000) at main.c:197

I reproduced that by running this a couple dozen times in an interactive psql.
It doesn't seem to affect STRATEGY=file_copy.

SET statement_timeout=0; DROP DATABASE a; SET statement_timeout='60ms'; CREATE 
DATABASE a TEMPLATE postgres STRATEGY wal_log ; \c a \c postgres

Also, if I understand correctly, this patch seems to assume that nobody is
connected to the source database.  But what's actually enforced is just that
nobody *else* is connected.  Is it any issue that the current DB can be used as
a source?  Anyway, both of the above problems are reproducible using a
different database.

|postgres=# CREATE DATABASE new TEMPLATE postgres STRATEGY wal_log;
|CREATE DATABASE

-- 
Justin


Reply via email to