On Mon, Mar 15, 2021 at 8:25 PM Bharath Rupireddy <bharath.rupireddyforpostg...@gmail.com> wrote: > > > The problem with a case like REFRESH MATERIALIZED VIEW is that there's > > > nothing to prevent something that gets run in the course of the query > > > from trying to access the view (and the heavyweight lock won't prevent > > > that, due to group locking). That's probably a stupid thing to do, > > > but it can't be allowed to break the world. The other cases are safe > > > from that particular problem because the table doesn't exist yet. > > Please correct me if my understanding of the above comment (from the > commit e9baa5e9) is wrong - even if the leader opens the matview > relation in exclusive mode, because of group locking(in case we allow > parallel workers to feed in the data to the new heap that gets created > for RMV, see ExecRefreshMatView->make_new_heap), can other sessions > still access the matview relation with older data? > > I performed below testing to prove myself wrong for the above understanding: > session 1: > 1) added few rows to the table t1 on which the mv1 is defined; > 2) refresh materialized view mv1; > > session 2: > 1) select count(*) from mv1; ---> this query is blocked until > session 1's step (2) is completed and gives the latest result even if > the underlying data-generating query runs select part in parallel. > > Is there anything I'm missing here?
I think he was talking about things like functions that try to access the mv from inside the same query, in a worker. I haven't figured out exactly which hazards he meant. I thought about wrong-relfilenode hazards and combocid hazards, but considering the way this thing always inserts into a fresh table before performing merge or swap steps later, I don't yet see why this is different from any other insert-from-select-with-gather. With the script below you can reach this error in the leader: postgres=# refresh materialized view CONCURRENTLY mv; ERROR: cannot start commands during a parallel operation CONTEXT: SQL function "crazy" But that's reachable on master too, even if you change crazy() to do "select 42 from pg_class limit 1" instead of reading mv, when performing a CREATE M V without NO DATA. :-( Without parallel leader participation it runs to completion. ===8<=== set parallel_leader_participation = off; set parallel_setup_cost = 0; set min_parallel_table_scan_size = 0; set parallel_tuple_cost = 0; drop table if exists t cascade; create table t (i int); insert into t select generate_series(1, 100000); create materialized view mv as select 42::int i; create or replace function crazy() returns int as $$ select i from mv limit 1; $$ language sql parallel safe; drop materialized view mv; create materialized view mv as select i + crazy() i from t with no data; create unique index on mv(i); refresh materialized view mv; refresh materialized view concurrently mv; begin; refresh materialized view mv; refresh materialized view mv; commit; begin; refresh materialized view concurrently mv; refresh materialized view concurrently mv; commit; ===8<=== PS, off-topic observation made while trying to think of ways to break your patch: I noticed that REFRESH CONCURRENTLY spends a lot of time in refresh_by_match_merge()'s big FULL JOIN. That is separate from the view query that you're parallelising here, and is used to perform the merge between a temporary table and the target MV table. I hacked the code a bit so that it wasn't scanning a temporary table (unparallelisable), and tried out the Parallel Hash Full Join patch which I intend to commit soon. This allowed REFRESH CONCURRENTLY to complete much faster. Huzzah! Unfortunately that query also does ORDER BY tid; I guess we could remove that to skip a Sort and use Gather instead of the more expensive Gather Merge, and hopefully soon a pushed-down Insert.