On 6/12/19 10:29 AM, Jamison, Kirk wrote: > >> From a user POW, the main issue with relation truncation is that it can block >> queries on standby server during truncation replay. >> >> It could be interesting if you can test this case and give results of your >> path. >> Maybe by performing read queries on standby server and counting wait_event >> with pg_wait_sampling? > > Thanks for the suggestion. I tried using the extension pg_wait_sampling, > But I wasn't sure that I could replicate the problem of blocked queries on > standby server. > Could you advise? > Here's what I did for now, similar to my previous test with hot standby setup, > but with additional read queries of wait events on standby server. > > 128MB shared_buffers > SELECT create_tables(10000); > SELECT insert_tables(10000); > SELECT delfrom_tables(10000); > > [Before VACUUM] > Standby: SELECT the following view from pg_stat_waitaccum > > wait_event_type | wait_event | calls | microsec > -----------------+-----------------+-------+---------- > Client | ClientRead | 2 | 20887759 > IO | DataFileRead | 175 | 2788 > IO | RelationMapRead | 4 | 26 > IO | SLRURead | 2 | 38 > > Primary: Execute VACUUM (induces relation truncates) > > [After VACUUM] > Standby: > wait_event_type | wait_event | calls | microsec > -----------------+-----------------+-------+---------- > Client | ClientRead | 7 | 77662067 > IO | DataFileRead | 284 | 4523 > IO | RelationMapRead | 10 | 51 > IO | SLRURead | 3 | 57 >
(Sorry for the delay, I forgot to answer you) As far as I remember, you should see "relation" wait events (type lock) on standby server. This is due to startup process acquiring AccessExclusiveLock for the truncation and other backend waiting to acquire a lock to read the table. On primary server, vacuum is able to cancel truncation: /* * We need full exclusive lock on the relation in order to do * truncation. If we can't get it, give up rather than waiting --- we * don't want to block other backends, and we don't want to deadlock * (which is quite possible considering we already hold a lower-grade * lock). */ vacrelstats->lock_waiter_detected = false; lock_retry = 0; while (true) { if (ConditionalLockRelation(onerel, AccessExclusiveLock)) break; /* * Check for interrupts while trying to (re-)acquire the exclusive * lock. */ CHECK_FOR_INTERRUPTS(); if (++lock_retry > (VACUUM_TRUNCATE_LOCK_TIMEOUT / VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL)) { /* * We failed to establish the lock in the specified number of * retries. This means we give up truncating. */ vacrelstats->lock_waiter_detected = true; ereport(elevel, (errmsg("\"%s\": stopping truncate due to conflicting lock request", RelationGetRelationName(onerel)))); return; } pg_usleep(VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL * 1000L); } To maximize chances to reproduce we can use big shared_buffers. But I am afraid it is not easy to perform reproducible tests to compare results. Unfortunately I don't have servers to perform tests. Regards,
signature.asc
Description: OpenPGP digital signature