On Wed, Aug 31, 2016 at 8:07 AM, Chinh Nguyen <chinh...@gmail.com> wrote:

> Hello All,
>
> We recently tried to reindex on a heavy used database cluster (master
> + multiple hot-standby async replicas, all taking traffic) and
> replicas were effectively blocked for 10 minutes resulting in drop of
> a lot of read traffic. We reindex by create new index concurrently,
> then rename old and new index and drop old index afterwards.
> After we execute the renaming on master, we found a quick burst of
> exclusive locks as expected:
>
> Process 34482 waits for AccessShareLock on relation 4153657 of
> database 16420; blocked by process 18953.
> Process 18953: ALTER INDEX public.old_index RENAME TO temp_index;
>
> But all hot-standby replicas started to see queries locking up for 10
> minutes, sample log from a replica is below:
>
> LOG: process 41040 still waiting for AccessShareLock on relation
> 4153657 of database 16420 after 1000.072 ms
>
> We have in config:
>
> max_standby_streaming_delay = 10min
> lock_timeout = 10s
>
> Any idea what happened with the replicas? Why queries on replica
> locked for so long vs sub-second on master?
>

I think, it is not the lock release wait time, it is the delay in WAL
replay on standby from master due to the parameter
"max_standby_streaming_delay".
If you have read queries running on standby, WAL replay waits for
"max_standby_streaming_delay" duration (which is 10 mins in your case)
before timing out queries on replica.


Regards,
Venkata B N

Fujitsu Australia

Reply via email to