On Fri, Nov 9, 2018 at 2:02 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> On Thu, Nov 8, 2018 at 9:30 PM Kyotaro HORIGUCHI
> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
> >
> > Hello.
> >
> > At Wed, 7 Nov 2018 19:31:00 +0900, Masahiko Sawada
> > <sawada.m...@gmail.com> wrote in
> > <CAD21AoASCq808+iqcFoVuLu-+i8kon=6wN3+sY=evkgm-56...@mail.gmail.com>
> > > On Tue, Nov 6, 2018 at 9:16 PM Kyotaro HORIGUCHI
> > > <horiguchi.kyot...@lab.ntt.co.jp> wrote:
> > > > InitializeMaxBackends()
> > > >     MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
> > > > -       max_worker_processes;
> > > > +       max_worker_processes + replication_reserved_connections;
> > > >
> > > > This means a walsender doesn't consume a connection, which is
> > > > different from the current behavior. We should reserve a part of
> > > > MaxConnections for walsenders. (In PostmasterMain,
> > > > max_wal_senders is counted as a part of MaxConnections.)
> > >
> > > Yes. We can force replication_reserved_connections <= max_wal_senders,
> > > and then the connections reserved for replication would be a part of
> > > MaxConnections.
> > >
> > > >
> > > > +   if (am_walsender && replication_reserved_connections < max_wal_senders
> > > > +       && *procgloballist == NULL)
> > > > +       procgloballist = &ProcGlobal->freeProcs;
> > > >
> > > > Currently an excessive number of walsenders is rejected in
> > > > InitWalSenderSlot, which emits the following error:
> > > >
> > > > >           ereport(FATAL,
> > > > >                   (errcode(ERRCODE_TOO_MANY_CONNECTIONS),
> > > > >                    errmsg("number of requested standby connections "
> > > > >                           "exceeds max_wal_senders (currently %d)",
> > > > >                           max_wal_senders)));
> > > >
> > > > With this patch, if max_wal_senders =
> > > > replication_reserved_connections = 3 and a fourth walreceiver
> > > > comes, we will get "FATAL: sorry, too many clients already"
> > > > instead. That should be fixed.
> > > >
> > > > When r_r_conn = 2 and max_wal_senders = 3 and three
> > > > walsenders are active, in an extreme case where a new replication
> > > > connection comes in at the same time another is exiting, we could
> > > > end up using two normal slots even though one of the reserved
> > > > slots is vacant.
> > >
> > > Doesn't max_wal_senders prevent that case?
> >
> > Currently the variable doesn't work that way. We first accept the
> > connection request and search for a vacant slot in
> > InitWalSenderSlot, and reject the connection only if no room is
> > available there. Even with this patch, we don't count the exact
> > number of active walsenders (for performance reasons). If the
> > reserved slots are filled, there's no choice but to accept the
> > connection using a non-reserved slot when r_r_conn <
> > max_wal_senders. If one of the active walsenders goes away between
> > the time we allocate a non-reserved connection slot and the time
> > InitWalSenderSlot starts searching the walsnds[] array, the new
> > walsender ends up activated on the unreserved slot and one reserved
> > slot is left empty. So this is "an extreme case"; we could ignore it.
> >
> > I doubt that we should allow a setting where r_r_conn <
> > max_wal_senders, or even r_r_conn != max_wal_senders. We don't
> > have a problem like this if we don't allow those cases.
> >
> > > Wal senders can get a connection if we have more free procs than
> > > (MaxConnections - reserved for superusers).
> > > So I think for normal users the connection must be refused if
> > > (MaxConnections - (reserved for superuser and replication) > # of
> > > freeprocs), and for wal senders the connection must also be refused
> > > if (MaxConnections - (reserved for superuser) > # of freeprocs). I'm
> > > not sure we need such a trick in InitWalSenderSlot().
> >
> > (For clarity, I don't mean my previous patch is a good solution.)
> >
> > It works as long as we accept that some reserved slots can be left
> > unused even though some walsenders are using normal slots. (A
> > walsender on a reserved slot just exiting causes this, but the slot
> > is usually taken over by a walsender that comes later.)
> >
> > Another idea is that we acquire a walsnds[] slot before getting a
> > connection slot.
>
> After more thought, I'm inclined to agree to reserve max_wal_senders
> slots and not to have a replication_reserved_connections parameter.
>
> As for superuser_reserved_connections, it works so that we always
> reserve slots for superusers when the slots are almost full,
> regardless of who is using the other slots, including superusers
> themselves. But replication connections require different behaviour,
> since they have another limit (max_wal_senders). If we have
> replication_reserved_connections < max_wal_senders, we could end up
> with the same issue as originally reported on this thread. Therefore
> many users would set replication_reserved_connections =
> max_wal_senders.
>
> On the other hand, if we always reserve max_wal_senders slots, the
> slots available for normal backends will decrease in the next
> release, which requires users to re-configure max_connections. But
> this behaviour seems more natural than the current one, so I think
> the re-configuration is acceptable for users.

Maybe what we should do instead is not consider max_wal_senders a part
of the total number of connections, and instead size the things that
need to be sized by them by max_connections + max_wal_senders. That
seems more logical given how the parameters are named as well.

-- 
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
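
To make the admission rule Sawada describes upthread concrete, here is a
rough sketch of how it could look in InitPostgres(), modeled on the way
the existing superuser_reserved_connections check uses HaveNFreeProcs().
replication_reserved_connections here is the GUC proposed by the patch
under discussion, not something in core, and this is only an
illustration of the idea, not the patch itself:

    /*
     * Illustration only (not the actual patch): refuse a normal backend
     * once the remaining free PGPROC slots are needed for superusers or
     * replication, while a walsender only has to leave the superuser
     * reserve (ReservedBackends) untouched.  Assumes the proposed
     * replication_reserved_connections GUC.
     */
    if (am_walsender)
    {
        if (ReservedBackends > 0 && !HaveNFreeProcs(ReservedBackends))
            ereport(FATAL,
                    (errcode(ERRCODE_TOO_MANY_CONNECTIONS),
                     errmsg("remaining connection slots are reserved for "
                            "superuser connections")));
    }
    else if (!am_superuser &&
             (ReservedBackends + replication_reserved_connections) > 0 &&
             !HaveNFreeProcs(ReservedBackends + replication_reserved_connections))
        ereport(FATAL,
                (errcode(ERRCODE_TOO_MANY_CONNECTIONS),
                 errmsg("remaining connection slots are reserved for "
                        "superuser and replication connections")));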
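
If we instead stop counting walsenders against max_connections, as
suggested at the end of the mail, the InitializeMaxBackends()
computation quoted at the top of the thread would roughly just add
max_wal_senders to the sum rather than introducing a new
reserved-connections GUC. A minimal sketch of that idea (not a patch):

    /*
     * Sketch only: walsenders are no longer carved out of MaxConnections;
     * everything that is sized per backend is sized by the sum instead.
     */
    void
    InitializeMaxBackends(void)
    {
        Assert(MaxBackends == 0);

        /* the extra unit accounts for the autovacuum launcher */
        MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
            max_worker_processes + max_wal_senders;

        /* internal error because the values were all checked previously */
        if (MaxBackends > MAX_BACKENDS)
            elog(ERROR, "too many backends configured");
    }

Ordinary backends would then keep all of max_connections to themselves,
and replication connections would be limited only by max_wal_senders,
which is what the thread is after.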