Re: Wrong results from Parallel Hash Full Join

Melanie Plageman Wed, 19 Apr 2023 17:43:43 -0700

On Wed, Apr 19, 2023 at 3:20 PM Andres Freund <[email protected]> wrote:

> Hi,
>
> On 2023-04-19 12:16:24 -0500, Justin Pryzby wrote:
> > On Wed, Apr 19, 2023 at 11:17:04AM -0400, Melanie Plageman wrote:
> > > Ultimately this is probably fine. If we wanted to modify one of the
> > > existing tests to cover the multi-batch case, changing the select
> > > count(*) to a select * would do the trick. I imagine we wouldn't want
> to
> > > do this because of the excessive output this would produce. I wondered
> > > if there was a pattern in the tests for getting around this.
> >
> > You could use explain (ANALYZE).  But the output is machine-dependant in
> > various ways (which is why the tests use "explain analyze so rarely).
>
> I think with sufficient options it's not machine specific. We have a bunch
> of
>  EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) ..
> in our tests.
>

Cool. Yea, so ultimately these options are almost enough but memory
usage changes from execution to execution. There are some tests which do
regexp_replace() on the memory usage part of the EXPLAIN ANALYZE output
to allow us to still compare the plans. However, I figured if I was
already going to go to the trouble of using regexp_replace(), I might as
well write a function that returns the "Actual Rows" field from the
EXPLAIN ANALYZE output.

The attached patch does that. I admittedly mostly copy-pasted the
plpgsql function from similar examples in other tests, and I suspect it
may be overkill and also poorly written.

The nice thing about this approach is that we can modify some of the
existing tests in join_hash.sql to use this function and cover the code
to reset the matchbit for serial hashjoin, single batch parallel
hashjoin, and all batches of parallel multi-batch hashjoin without any
additional queries. (I'll leave testing match bit resetting with the
skew hashtable and match bit resetting in case of a rescan for another
day.)

I was able to delete the tests added in 558c9d75fe, as they became
redundant.

I wonder if any other tests are in need of an EXPLAIN (ANALYZE,
MEMORY_USAGE OFF) option? Perhaps it is quite unusual to only require a
deterministic field like 'Actual Rows'. If we had that option we could
also remove the extra EXPLAIN invocations before the actual query
executions.

- Melanie

From a8b1d29546634afcf5c1cf63e889a2155ac2917b Mon Sep 17 00:00:00 2001
From: Melanie Plageman <[email protected]>
Date: Wed, 19 Apr 2023 20:09:37 -0400
Subject: [PATCH v1] Test multi-batch PHJ match bit initialization

558c9d75 fixed a bug with initialization of the hash join match status
bit and added test coverage for serial hash join and single batch
parallel hash join. Add a test for multi-batch parallel hash join.

To test this with the existing relations in the join_hash test file,
this commit modifies some of the test queries to use SELECT * instead of
SELECT COUNT(*), which was needed to ensure that the tuples being
inserted into the hashtable had not already been made into virtual
tuples and lost their HOT status bit.

These modifications made the original tests introduced in 558c9d75
redundant.

Discussion: https://postgr.es/m/ZEAh6CewmARbDVuN%40telsasoft.com
---
 src/test/regress/expected/join_hash.out | 134 +++++++++++-------------
 src/test/regress/sql/join_hash.sql      |  72 +++++++------
 2 files changed, 100 insertions(+), 106 deletions(-)

diff --git a/src/test/regress/expected/join_hash.out b/src/test/regress/expected/join_hash.out
index e892e7cccb..5be0a4274b 100644
--- a/src/test/regress/expected/join_hash.out
+++ b/src/test/regress/expected/join_hash.out
@@ -49,10 +49,35 @@ begin
   end loop;
 end;
 $$;
+-- Return the number of actual rows returned by a query. This is useful when a
+-- count(*) would not execute the desired codepath but a SELECT * would produce
+-- an excessive number of rows.
+create or replace function hj_actual_rows(query text)
+returns jsonb language plpgsql
+as
+$$
+declare
+  elements jsonb;
+begin
+  execute 'explain (analyze, costs off, summary off, timing off, format ''json'') ' || query into strict elements;
+  if jsonb_array_length(elements) > 1 then
+    raise exception 'Cannot return actual rows from more than one plan';
+  end if;
+  return elements->0->'Plan'->'Actual Rows';
+end;
+$$;
 -- Make a simple relation with well distributed keys and correctly
 -- estimated size.
-create table simple as
-  select generate_series(1, 20000) AS id, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
+create table simple (id int, value text);
+insert into simple values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa');
+-- Hash join reuses the HOT status bit to indicate match status. This can only
+-- be guaranteed to produce correct results if all the hash join tuple match
+-- bits are reset before reuse. This is done upon loading them into the
+-- hashtable. To test this, update simple, creating a HOT tuple. If this status
+-- bit isn't cleared, we won't correctly emit the NULL-extended unmatching
+-- tuple in full hash join.
+update simple set id = 1;
+insert into simple select generate_series(2, 20000), 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
 alter table simple set (parallel_workers = 2);
 analyze simple;
 -- Make a relation whose size we will under-estimate.  We want stats
@@ -273,6 +298,7 @@ set local max_parallel_workers_per_gather = 2;
 set local work_mem = '192kB';
 set local hash_mem_multiplier = 1.0;
 set local enable_parallel_hash = on;
+set local parallel_tuple_cost = 0;
 explain (costs off)
   select count(*) from simple r join simple s using (id);
                          QUERY PLAN                          
@@ -305,10 +331,11 @@ $$);
 (1 row)
 
 -- parallel full multi-batch hash join
-select count(*) from simple r full outer join simple s using (id);
- count 
--------
- 20000
+select hj_actual_rows(
+  'select * from simple r full outer join simple s on (r.id = 0 - s.id)');
+ hj_actual_rows 
+----------------
+ 40000
 (1 row)
 
 rollback to settings;
@@ -844,20 +871,20 @@ rollback to settings;
 savepoint settings;
 set local max_parallel_workers_per_gather = 0;
 explain (costs off)
-     select  count(*) from simple r full outer join simple s on (r.id = 0 - s.id);
-               QUERY PLAN               
-----------------------------------------
- Aggregate
-   ->  Hash Full Join
-         Hash Cond: ((0 - s.id) = r.id)
-         ->  Seq Scan on simple s
-         ->  Hash
-               ->  Seq Scan on simple r
-(6 rows)
+  select * from simple r full outer join simple s on (r.id = 0 - s.id);
+            QUERY PLAN            
+----------------------------------
+ Hash Full Join
+   Hash Cond: ((0 - s.id) = r.id)
+   ->  Seq Scan on simple s
+   ->  Hash
+         ->  Seq Scan on simple r
+(5 rows)
 
-select  count(*) from simple r full outer join simple s on (r.id = 0 - s.id);
- count 
--------
+select hj_actual_rows(
+  'select * from simple r full outer join simple s on (r.id = 0 - s.id)');
+ hj_actual_rows 
+----------------
  40000
 (1 row)
 
@@ -888,24 +915,24 @@ rollback to settings;
 -- parallelism is possible with parallel-aware full hash join
 savepoint settings;
 set local max_parallel_workers_per_gather = 2;
+set local parallel_tuple_cost = 0;
 explain (costs off)
-     select  count(*) from simple r full outer join simple s on (r.id = 0 - s.id);
-                         QUERY PLAN                          
--------------------------------------------------------------
- Finalize Aggregate
-   ->  Gather
-         Workers Planned: 2
-         ->  Partial Aggregate
-               ->  Parallel Hash Full Join
-                     Hash Cond: ((0 - s.id) = r.id)
-                     ->  Parallel Seq Scan on simple s
-                     ->  Parallel Hash
-                           ->  Parallel Seq Scan on simple r
-(9 rows)
+  select * from simple r full outer join simple s on (r.id = 0 - s.id);
+                   QUERY PLAN                    
+-------------------------------------------------
+ Gather
+   Workers Planned: 2
+   ->  Parallel Hash Full Join
+         Hash Cond: ((0 - s.id) = r.id)
+         ->  Parallel Seq Scan on simple s
+         ->  Parallel Hash
+               ->  Parallel Seq Scan on simple r
+(7 rows)
 
-select  count(*) from simple r full outer join simple s on (r.id = 0 - s.id);
- count 
--------
+select hj_actual_rows(
+  'select * from simple r full outer join simple s on (r.id = 0 - s.id)');
+ hj_actual_rows 
+----------------
  40000
 (1 row)
 
@@ -955,43 +982,6 @@ $$);
 (1 row)
 
 rollback to settings;
--- Hash join reuses the HOT status bit to indicate match status. This can only
--- be guaranteed to produce correct results if all the hash join tuple match
--- bits are reset before reuse. This is done upon loading them into the
--- hashtable.
-SAVEPOINT settings;
-SET enable_parallel_hash = on;
-SET min_parallel_table_scan_size = 0;
-SET parallel_setup_cost = 0;
-SET parallel_tuple_cost = 0;
-CREATE TABLE hjtest_matchbits_t1(id int);
-CREATE TABLE hjtest_matchbits_t2(id int);
-INSERT INTO hjtest_matchbits_t1 VALUES (1);
-INSERT INTO hjtest_matchbits_t2 VALUES (2);
--- Update should create a HOT tuple. If this status bit isn't cleared, we won't
--- correctly emit the NULL-extended unmatching tuple in full hash join.
-UPDATE hjtest_matchbits_t2 set id = 2;
-SELECT * FROM hjtest_matchbits_t1 t1 FULL JOIN hjtest_matchbits_t2 t2 ON t1.id = t2.id;
- id | id 
-----+----
-  1 |   
-    |  2
-(2 rows)
-
--- Test serial full hash join.
--- Resetting parallel_setup_cost should force a serial plan.
--- Just to be safe, however, set enable_parallel_hash to off, as parallel full
--- hash joins are only supported with shared hashtables.
-RESET parallel_setup_cost;
-SET enable_parallel_hash = off;
-SELECT * FROM hjtest_matchbits_t1 t1 FULL JOIN hjtest_matchbits_t2 t2 ON t1.id = t2.id;
- id | id 
-----+----
-  1 |   
-    |  2
-(2 rows)
-
-ROLLBACK TO settings;
 rollback;
 -- Verify that hash key expressions reference the correct
 -- nodes. Hashjoin's hashkeys need to reference its outer plan, Hash's
diff --git a/src/test/regress/sql/join_hash.sql b/src/test/regress/sql/join_hash.sql
index 06bab7a4c7..c9454c302c 100644
--- a/src/test/regress/sql/join_hash.sql
+++ b/src/test/regress/sql/join_hash.sql
@@ -53,10 +53,36 @@ begin
 end;
 $$;
 
+-- Return the number of actual rows returned by a query. This is useful when a
+-- count(*) would not execute the desired codepath but a SELECT * would produce
+-- an excessive number of rows.
+create or replace function hj_actual_rows(query text)
+returns jsonb language plpgsql
+as
+$$
+declare
+  elements jsonb;
+begin
+  execute 'explain (analyze, costs off, summary off, timing off, format ''json'') ' || query into strict elements;
+  if jsonb_array_length(elements) > 1 then
+    raise exception 'Cannot return actual rows from more than one plan';
+  end if;
+  return elements->0->'Plan'->'Actual Rows';
+end;
+$$;
+
 -- Make a simple relation with well distributed keys and correctly
 -- estimated size.
-create table simple as
-  select generate_series(1, 20000) AS id, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
+create table simple (id int, value text);
+insert into simple values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa');
+-- Hash join reuses the HOT status bit to indicate match status. This can only
+-- be guaranteed to produce correct results if all the hash join tuple match
+-- bits are reset before reuse. This is done upon loading them into the
+-- hashtable. To test this, update simple, creating a HOT tuple. If this status
+-- bit isn't cleared, we won't correctly emit the NULL-extended unmatching
+-- tuple in full hash join.
+update simple set id = 1;
+insert into simple select generate_series(2, 20000), 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
 alter table simple set (parallel_workers = 2);
 analyze simple;
 
@@ -179,6 +205,7 @@ set local max_parallel_workers_per_gather = 2;
 set local work_mem = '192kB';
 set local hash_mem_multiplier = 1.0;
 set local enable_parallel_hash = on;
+set local parallel_tuple_cost = 0;
 explain (costs off)
   select count(*) from simple r join simple s using (id);
 select count(*) from simple r join simple s using (id);
@@ -188,7 +215,8 @@ $$
   select count(*) from simple r join simple s using (id);
 $$);
 -- parallel full multi-batch hash join
-select count(*) from simple r full outer join simple s using (id);
+select hj_actual_rows(
+  'select * from simple r full outer join simple s on (r.id = 0 - s.id)');
 rollback to settings;
 
 -- The "bad" case: during execution we need to increase number of
@@ -460,8 +488,9 @@ rollback to settings;
 savepoint settings;
 set local max_parallel_workers_per_gather = 0;
 explain (costs off)
-     select  count(*) from simple r full outer join simple s on (r.id = 0 - s.id);
-select  count(*) from simple r full outer join simple s on (r.id = 0 - s.id);
+  select * from simple r full outer join simple s on (r.id = 0 - s.id);
+select hj_actual_rows(
+  'select * from simple r full outer join simple s on (r.id = 0 - s.id)');
 rollback to settings;
 
 -- parallelism not possible with parallel-oblivious full hash join
@@ -476,9 +505,11 @@ rollback to settings;
 -- parallelism is possible with parallel-aware full hash join
 savepoint settings;
 set local max_parallel_workers_per_gather = 2;
+set local parallel_tuple_cost = 0;
 explain (costs off)
-     select  count(*) from simple r full outer join simple s on (r.id = 0 - s.id);
-select  count(*) from simple r full outer join simple s on (r.id = 0 - s.id);
+  select * from simple r full outer join simple s on (r.id = 0 - s.id);
+select hj_actual_rows(
+  'select * from simple r full outer join simple s on (r.id = 0 - s.id)');
 rollback to settings;
 
 
@@ -506,33 +537,6 @@ $$
 $$);
 rollback to settings;
 
-
--- Hash join reuses the HOT status bit to indicate match status. This can only
--- be guaranteed to produce correct results if all the hash join tuple match
--- bits are reset before reuse. This is done upon loading them into the
--- hashtable.
-SAVEPOINT settings;
-SET enable_parallel_hash = on;
-SET min_parallel_table_scan_size = 0;
-SET parallel_setup_cost = 0;
-SET parallel_tuple_cost = 0;
-CREATE TABLE hjtest_matchbits_t1(id int);
-CREATE TABLE hjtest_matchbits_t2(id int);
-INSERT INTO hjtest_matchbits_t1 VALUES (1);
-INSERT INTO hjtest_matchbits_t2 VALUES (2);
--- Update should create a HOT tuple. If this status bit isn't cleared, we won't
--- correctly emit the NULL-extended unmatching tuple in full hash join.
-UPDATE hjtest_matchbits_t2 set id = 2;
-SELECT * FROM hjtest_matchbits_t1 t1 FULL JOIN hjtest_matchbits_t2 t2 ON t1.id = t2.id;
--- Test serial full hash join.
--- Resetting parallel_setup_cost should force a serial plan.
--- Just to be safe, however, set enable_parallel_hash to off, as parallel full
--- hash joins are only supported with shared hashtables.
-RESET parallel_setup_cost;
-SET enable_parallel_hash = off;
-SELECT * FROM hjtest_matchbits_t1 t1 FULL JOIN hjtest_matchbits_t2 t2 ON t1.id = t2.id;
-ROLLBACK TO settings;
-
 rollback;
 
 -- Verify that hash key expressions reference the correct
-- 
2.37.2

Re: Wrong results from Parallel Hash Full Join

Reply via email to