Hi Greg,

  

  When you get different xmin values in the active snapshot and the transaction 
snapshot, there may still be no coredump: there may be no tuple with an xmin 
between ActiveSnapshot->xmin and TransactionSnapshot->xmin that needs to be 
scanned in the parallel process.

  

  That said, it is very likely that ActiveSnapshot->xmin precedes 
TransactionSnapshot->xmin.

  

  For this coredump, we need both parallelism and snapshot overflow. If the 
snapshot is not overflowed, you cannot get the coredump, because the crash comes 
from the MVCC visibility check during a parallel scan when the snapshot's subxid 
array has overflowed.

  

  Did you use pgbench with the sub_120.sql script that I provided in the 
attachment?

  The default PGPROC_MAX_CACHED_SUBXIDS is 64, but sub_120.sql uses 120 
subtransactions per transaction, which is much larger than 64. So any snapshot 
taken while it runs must be overflowed; I really don't know why your snapshot is 
not overflowed.

  Did you increase PGPROC_MAX_CACHED_SUBXIDS? Please don't change any code; for 
now let's use the unmodified PG 13.2 sources.

  I have checked the code on the master branch, and there is no change to this 
mechanism, so this issue should still exist there.

  

  

  Thanks

  Pengcheng

 

-----Original Message-----
From: Greg Nancarrow <gregn4...@gmail.com> 
Sent: 14 May 2021 16:47
To: Pengchengliu <pengcheng...@tju.edu.cn>
Cc: Andres Freund <and...@anarazel.de>; PostgreSQL-development 
<pgsql-hack...@postgresql.org>
Subject: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump

 

On Fri, May 14, 2021 at 12:25 PM Pengchengliu <pengcheng...@tju.edu.cn> wrote:
>
> Hi Greg,
>
>   Thanks for your reply and test.
>
>   When the main process gets the transaction snapshot in
> InitializeParallelDSM->GetTransactionSnapshot, the transaction snapshot xmin
> very likely follows the active snapshot xmin.
>
>   With gdb it is easy to verify. Create the env as below:
>
>   1, Use PG13.2 (3fb4c75e857adee3da4386e947ba58a75f3e74b7), init a cluster
> database.
>
>   2, Append postgres.conf as below:
>
>   max_connections = 2000
>   parallel_setup_cost=0
>   parallel_tuple_cost=0
>   min_parallel_table_scan_size=0
>   max_parallel_workers_per_gather=8
>   max_parallel_workers = 128
>
>   3, Start the cluster database. Use the init_test.sql script in the
> attachment to create some test tables.
>
>   4, Use the sub_120.sql script in the attachment with pgbench:
>
>   pgbench -d postgres -p 33550 -n -r -f sub_120.sql -c 200 -j 200 -T 1800
>
>   Then you can log in to the database and use GDB to verify it.
>
>   1, First use explain to make sure forced parallelism is OK.
>

>   postgres=# explain (verbose,analyze) select count(*) from contend1;
>                                            QUERY PLAN
> ---------------------------------------------------------------------------------------------------
>  Finalize Aggregate  (cost=12006.11..12006.12 rows=1 width=8) (actual time=1075.214..1075.449 rows=1 loops=1)
>    Output: count(*)
>    ->  Gather  (cost=12006.08..12006.09 rows=8 width=8) (actual time=1075.198..1075.433 rows=1 loops=1)
>          Output: (PARTIAL count(*))
>          Workers Planned: 8
>          Workers Launched: 0
>          ->  Partial Aggregate  (cost=12006.08..12006.09 rows=1 width=8) (actual time=1074.674..1074.676 rows=1 loops=1)
>                Output: PARTIAL count(*)
>                ->  Parallel Seq Scan on public.contend1  (cost=0.00..11690.06 rows=126406 width=0) (actual time=0.008..587.454 rows=1010200 loops=1)
>                      Output: id, val, c2, c3, c4, c5, c6, c7, c8, c9, c10, crt_time
>  Planning Time: 0.123 ms
>  Execution Time: 1075.588 ms
>
>   postgres=# select pg_backend_pid();
>  pg_backend_pid
> ----------------
>         2182678

>
>   2, Use gdb to debug our backend process. Add a breakpoint at
> parallel.c:219 and continue.
>
>   gdb -q -p 2182678
>   ...
>   (gdb) b parallel.c:219
> Breakpoint 1 at 0x55d085: file
> /home/liupc/build/build_postgres2/../../devel/postgres2/src/backend/access/transam/parallel.c,
> line 219.
>   (gdb) c
> Continuing.
>
>   3, In the psql client, execute the explain command from step 1 again.
>      After the breakpoint is hit in gdb, wait a moment, then execute "next".
>      Check active_snapshot and transaction_snapshot in gdb:
> active_snapshot->xmin is 158987 and transaction_snapshot->xmin is 162160.
>
> When I tested with gdb, sometimes active_snapshot was the same as
> transaction_snapshot. If so, try multiple times, and wait longer before
> executing "next".
>

>    Breakpoint 1, InitializeParallelDSM (pcxt=0x2d53670)
>     at /home/liupc/build/build_postgres2/../../devel/postgres2/src/backend/access/transam/parallel.c:219
> 219             Snapshot        transaction_snapshot = GetTransactionSnapshot();
> (gdb) n
> 220             Snapshot        active_snapshot = GetActiveSnapshot();
> (gdb)
> 223             oldcontext = MemoryContextSwitchTo(TopTransactionContext);
> (gdb) p *transaction_snapshot
> $1 = {snapshot_type = SNAPSHOT_MVCC, xmin = 162160, xmax = 183011, xip = 0x2d50d10, xcnt = 179, subxip = 0x148a9cddf010,
>   subxcnt = 0, suboverflowed = true, takenDuringRecovery = false, copied = false, curcid = 0, speculativeToken = 0,
>   active_count = 0, regd_count = 0, ph_node = {first_child = 0x0, next_sibling = 0x0, prev_or_parent = 0x0}, whenTaken = 0, lsn = 0}
> (gdb) p *active_snapshot
> $2 = {snapshot_type = SNAPSHOT_MVCC, xmin = 158987, xmax = 173138, xip = 0x2d53288, xcnt = 178, subxip = 0x0, subxcnt = 0,
>   suboverflowed = true, takenDuringRecovery = false, copied = true, curcid = 0, speculativeToken = 0, active_count = 1,
>   regd_count = 2, ph_node = {first_child = 0x0, next_sibling = 0x0, prev_or_parent = 0x2d52e48}, whenTaken = 0, lsn = 0}
> (gdb)

 

Hi Pengcheng,

 

I followed all your steps.

However, I seem to get different behavior in my environment.

99% of the time, the xmin and xmax of the active_snapshot and 
transaction_snapshot are the same (regardless of how long I wait at different 
points after the breakpoint is hit). I've had one or two instances where the 
xmax values differ. I managed to catch just one case where there were different 
xmin and xmax values in the snapshots, but this occurred just prior to the 
pgbench client completing and terminating, and when I continued in the 
debugger, there was no crash/coredump.

 

However, I think I've spotted something potentially important to this issue:

For me, almost always "suboverflowed = false" in the snapshots (except in that 
one case, right at the end of the pgbench run), yet in your gdb example 
"suboverflowed = true" in both of the snapshots (i.e. the snapshot subxip array 
has overflowed). I'm guessing that this may be related to the coredump issue, 
but I'm not exactly sure how it has happened, and why it seemingly isn't being 
handled correctly and causes that Assert to fire in your case.

Can you try to find out how the snapshot's suboverflowed flag is being set in 
your case, since you are seeing it readily in your examined snapshots? I think 
there are only a few places where it can be set to "true" (e.g. 
procarray.c:1641).

Also, does increasing PGPROC_MAX_CACHED_SUBXIDS avoid, or delay, the problem 
for you? It's currently defined as 64.

I notice that there's been some changes related to snapshot data handling and 
subxid overflow since 13.2, so I'm wondering whether your coredump issue can be 
reproduced with the latest code?

 

Regards,

Greg Nancarrow

Fujitsu Australia

Attachment: test_script.tar
Description: Unix tar archive
