Ășt 14. 4. 2020 v 5:59 odesĂlatel Rajkumar Raghuwanshi < rajkumar.raghuwan...@enterprisedb.com> napsal:
> Hi, > > I have observed row_number() is giving different results when query > executed in parallel. is this expected w.r.t parallel execution. > > CREATE TABLE tbl1 (c1 INT) partition by list (c1); > CREATE TABLE tbl1_p1 partition of tbl1 FOR VALUES IN (10); > CREATE TABLE tbl1_p2 partition of tbl1 FOR VALUES IN (20); > CREATE TABLE tbl1_p3 partition of tbl1 FOR VALUES IN (30); > > CREATE TABLE tbl2 (c1 INT, c2 INT,c3 INT) partition by list (c1); > CREATE TABLE tbl2_p1 partition of tbl2 FOR VALUES IN (1); > CREATE TABLE tbl2_p2 partition of tbl2 FOR VALUES IN (2); > CREATE TABLE tbl2_p3 partition of tbl2 FOR VALUES IN (3); > CREATE TABLE tbl2_p4 partition of tbl2 FOR VALUES IN (4); > CREATE TABLE tbl2_p5 partition of tbl2 FOR VALUES IN (5); > > INSERT INTO tbl1 VALUES (10),(20),(30); > > INSERT INTO tbl2 VALUES > (1,100,20),(2,200,10),(3,100,20),(4,100,30),(5,100,10); > > postgres=# explain select e.c2, row_number() over () from tbl1 d, tbl2 e > where d.c1=e.c3; > QUERY PLAN > > > --------------------------------------------------------------------------------------- > WindowAgg (cost=1520.35..12287.73 rows=390150 width=12) > -> Merge Join (cost=1520.35..7410.85 rows=390150 width=4) > Merge Cond: (d.c1 = e.c3) > -> Sort (cost=638.22..657.35 rows=7650 width=4) > Sort Key: d.c1 > -> Append (cost=0.00..144.75 rows=7650 width=4) > -> Seq Scan on tbl1_p1 d_1 (cost=0.00..35.50 > rows=2550 width=4) > -> Seq Scan on tbl1_p2 d_2 (cost=0.00..35.50 > rows=2550 width=4) > -> Seq Scan on tbl1_p3 d_3 (cost=0.00..35.50 > rows=2550 width=4) > -> Sort (cost=882.13..907.63 rows=10200 width=8) > Sort Key: e.c3 > -> Append (cost=0.00..203.00 rows=10200 width=8) > -> Seq Scan on tbl2_p1 e_1 (cost=0.00..30.40 > rows=2040 width=8) > -> Seq Scan on tbl2_p2 e_2 (cost=0.00..30.40 > rows=2040 width=8) > -> Seq Scan on tbl2_p3 e_3 (cost=0.00..30.40 > rows=2040 width=8) > -> Seq Scan on tbl2_p4 e_4 (cost=0.00..30.40 > rows=2040 width=8) > -> Seq Scan on tbl2_p5 e_5 (cost=0.00..30.40 > rows=2040 width=8) > (17 rows) > > postgres=# select e.c2, row_number() over () from tbl1 d, tbl2 e where > d.c1=e.c3; > c2 | row_number > -----+------------ > *200 | 1* > 100 | 2 > 100 | 3 > 100 | 4 > 100 | 5 > (5 rows) > > postgres=# > postgres=# set parallel_setup_cost = 0; > SET > postgres=# set parallel_tuple_cost = 0; > SET > postgres=# > postgres=# explain select e.c2, row_number() over () from tbl1 d, tbl2 e > where d.c1=e.c3; > QUERY PLAN > > > ------------------------------------------------------------------------------------------------------ > WindowAgg (cost=130.75..7521.21 rows=390150 width=12) > -> Gather (cost=130.75..2644.34 rows=390150 width=4) > Workers Planned: 2 > -> Parallel Hash Join (cost=130.75..2644.34 rows=162562 width=4) > Hash Cond: (e.c3 = d.c1) > -> Parallel Append (cost=0.00..131.25 rows=4250 width=8) > -> Parallel Seq Scan on tbl2_p1 e_1 > (cost=0.00..22.00 rows=1200 width=8) > -> Parallel Seq Scan on tbl2_p2 e_2 > (cost=0.00..22.00 rows=1200 width=8) > -> Parallel Seq Scan on tbl2_p3 e_3 > (cost=0.00..22.00 rows=1200 width=8) > -> Parallel Seq Scan on tbl2_p4 e_4 > (cost=0.00..22.00 rows=1200 width=8) > -> Parallel Seq Scan on tbl2_p5 e_5 > (cost=0.00..22.00 rows=1200 width=8) > -> Parallel Hash (cost=90.93..90.93 rows=3186 width=4) > -> Parallel Append (cost=0.00..90.93 rows=3186 > width=4) > -> Parallel Seq Scan on tbl1_p1 d_1 > (cost=0.00..25.00 rows=1500 width=4) > -> Parallel Seq Scan on tbl1_p2 d_2 > (cost=0.00..25.00 rows=1500 width=4) > -> Parallel Seq Scan on tbl1_p3 d_3 > (cost=0.00..25.00 rows=1500 width=4) > (16 rows) > > postgres=# select e.c2, row_number() over () from tbl1 d, tbl2 e where > d.c1=e.c3; > c2 | row_number > -----+------------ > 100 | 1 > 100 | 2 > 100 | 3 > *200 | 4* > 100 | 5 > (5 rows) > there are not ORDER BY clause, so order is not defined - paralel hash join surely doesn't ensure a order. I think so this behave is expected. Regards Pavel > Thanks & Regards, > Rajkumar Raghuwanshi >