On Tue, Nov 17, 2015 at 9:57 AM, Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote: > Hello, > > At Tue, 17 Nov 2015 01:09:57 +0900, Masahiko Sawada <sawada.m...@gmail.com> > wrote in <CAD21AoDhqGB=etbfqnkhxr8t53d+8qms4dpm5hvyq4ba2or...@mail.gmail.com> >> > - Notation >> > >> > synchronous_standby_names, and synchronous_replication_method as >> > a variable to provide other syntax is probably no argument >> > except its name. But I feel synchronous_standby_num looks bit >> > too specific. >> > >> > I'd like to propose if this doesn't reprise the argument on >> > notation for replication definitions:p >> > >> > The following two GUCs would be enough to bear future expansion >> > of notation syntax and/or method. >> > >> > synchronous_standby_names : as it is >> > >> > synchronous_replication_method: >> > >> > default is "1-priority", which means the same with the current >> > meaning. possible additional values so far would be, >> > >> > "n-priority": the format of s_s_names is "n, <name>, <name>, >> > <name>...", >> > where n is the number of required acknowledges. >> >> One question is that what is different between the leading "n" in >> s_s_names and the leading "n" of "n-priority"? > > Ah. Sorry for the ambiguous description. 'n' in s_s_names > representing an arbitrary integer number and that in "n-priority" > is literally an "n", meaning "a format with any number of > priority hosts" as a whole. As an instance, > > synchronous_replication_method = "n-priority" > synchronous_standby_names = "2, mercury, venus, earth, mars, jupiter" > > I added "n-" of "n-priority" to distinguish with "1-priority" so > if we won't provide "1-priority" for backward compatibility, > "priority" would be enough to represent the type. > > By the way, s_r_method is not essentially necessary but it would > be important to avoid complexity of autodetection of formats > including currently undefined ones.
Than you for your explanation, I understood that. It means that the format of s_s_names will be changed, which would be not good. So, how about the adding just s_r_method parameter and the number of required ACK is represented in the leading of s_r_method? For example, the following setting is same as above. synchronous_replication_method = "2-priority" synchronous_standby_names = "mercury, venus, earth, mars, jupiter" In quorum method, we can set; synchronous_replication_method = "2-quorum" synchronous_standby_names = "mercury, venus, earth, mars, jupiter" Thought? > > >> > "n-quorum": the format of s_s_names is the same as above, but >> > it is read in quorum context. > > The "n" of this is the same as above. > >> > These can be expanded, for example, as follows, but in future. >> > >> > "complex" : Michael's format. >> > "json" : JSON? >> > "json-ext": specify JSON in external file. >> > >> > Even after we have complex notations, I suppose that many use >> > cases are coverd by the first tree notations. >> >> I'm not sure it's desirable to implement the all kind of methods into core. >> I think it's better to extend replication in order to be more >> extensibility like adding hook function. >> And then other approach is implemented as a contrib module. > > I agree with you. I proposed the following internal design having > that in mind. > >> > - Internal design >> > >> > What should be done in SyncRepReleaseWaiters() is calculating a >> > pair of LSNs that can be regarded as synced and decide whether >> > *this* walsender have advanced the LSN pair, then trying to >> > release backends that wait for the LSNs *if* this walsender has >> > advanced them. >> > >> > From such point, the proposed patch will make redundant trials >> > to release backens. >> > >> > Addition to that, the patch looks to be a mixture of the current >> > implement and the new feature. These are for the same objective >> > so they cannot coexist each other, I think. As the result, codes >> > for both quorum/priority judgement appear at multiple level in >> > call tree. This would be an obstacle for future (possible) >> > expansion. >> > >> > So, I think this feature should be implemented as following, >> > >> > SyncRepInitConfig reads the configuration and stores the result >> > structure into elsewhere such like WalSnd->syncrepset_definition >> > instead of WalSnd->sync_standby_priority, which should be >> > removed. Nothing would be stored if the current wal sender is >> > not a member of the defined replication set. Storing a pointer >> > to matching function there would increase the flexibility but >> > such implement in contrast will make the code difficult to be >> > read.. (I often look for the entity of xlogreader->read_page() >> > ;) >> > >> > Then SyncRepSyncedLsnAdvancedTo() instead of >> > SyncRepGetSynchronousStandbys() returns an LSN pair that can be >> > regarded as 'synced' according to specified definition of >> > replication set and whether this walsender have advanced the >> > LSNs. >> > >> > Finally, SyncRepReleaseWaiters() uses it to release backends if >> > needed. >> > >> > The differences among quorum/priority or others are confined in >> > SyncRepSyncedLsnAdvancedTo(). As the result, >> > SyncRepReleaseWaiters would look as following. >> > >> > | SyncRepReleaseWaiters(void) >> > | { >> > | if (MyWalSnd->syncrepset_definition == NULL || ...) >> > | return; >> > | ... >> > | if (!SyncRepSyncedLsnAdvancedTo(&flush_pos, &write_pos)) >> > | { >> > | /* I haven't advanced the synced LSNs */ >> > | LWLockRelease(SyncRepLock); >> > | rerturn; >> > | } >> > | /* Set the lsn first so that when we wake backends they will relase... >> > >> > I'm not thought concretely about what SyncRepSyncedLsnAdvancedTo >> > does but perhaps yes we can:p in effective manner.. >> > >> > What do you think about this? >> >> I agree with this design. >> What SyncRepSyncedLsnAdvancedTo() does would be different for each >> method, so we can implement "n-priority" style multiple sync >> replication at first version. > > Maybe the first *additional* one if we decide to keep backward > compatibility, as the discussion above. > Regards, -- Masahiko Sawada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers