Hello, At Wed, 10 Feb 2016 02:57:54 +0900, Fujii Masao <masao.fu...@gmail.com> wrote in <cahgqgwhr1mnpagrmh9t0oy0onydkgaymcngvoe-1vlz8z9t...@mail.gmail.com> > On Wed, Feb 10, 2016 at 1:36 AM, Masahiko Sawada <sawada.m...@gmail.com> > wrote: > > On Tue, Feb 9, 2016 at 10:32 PM, Michael Paquier > > <michael.paqu...@gmail.com> wrote: > >> On Wed, Feb 3, 2016 at 7:33 AM, Robert Haas wrote: > >>> Also, to be frank, I think we ought to be putting more effort into > >>> another patch in this same area, specifically Thomas Munro's causal > >>> reads patch. I think a lot of people today are trying to use > >>> synchronous replication to build load-balancing clusters and avoid the > >>> problem where you write some data and then read back stale data from a > >>> standby server. Of course, our current synchronous replication > >>> facilities make no such guarantees - his patch does, and I think > >>> that's pretty important. I'm not saying that we shouldn't do this > >>> too, of course. > >> > >> Yeah, sure. Each one of those patches is trying to solve a different > >> problem where Postgres is deficient, here we'd like to be sure a > >> commit WAL record is correctly flushed on multiple standbys, while the > >> patch of Thomas is trying to ensure that there is no need to scan for > >> the replay position of a standby using some GUC parameters and a > >> validation/sanity layer in syncrep.c to do that. Surely the patch of > >> this thread has got more attention than Thomas', and both of them have > >> merits and try to address real problems. FWIW, the patch of Thomas is > >> a topic that I find rather interesting, and I am planning to look at > >> it as well, perhaps for next CF or even before that. We'll see how > >> other things move on. > > > > Attached first version dedicated language patch (document patch is not yet.) > > Thanks for the patch! Will review it. > > I think that it's time to write the documentation patch. > > Though I've not read the patch yet, I found that your patch > changed s_s_names so that it rejects non-alphabet character > like *, according to my simple test. It should accept any > application_name which we can use.
Thanks for the quick response. At a glance, I'd like to show you some random suggestions, mainly on writing conventions. === Running postgresql with s_s_names = '*', makes error as Fujii-san said. And it yeilds the following message. | $ postgres | FATAL: syntax error: unexpected character "*" Mmm.. It should be tough to find what has happened.. === check_synchronous_standby_names frees parsed SyncRepStandbyNames immediately but no reason is explained there. The following comment looks to be saying something related to this but it doesn't explain the reason to free. + /* + * Any additional validation of standby names should go here. + * + * Don't attempt to set WALSender priority because this is executed by + * postmaster at startup, not WALSender, so the application_name is not + * yet correctly set. + */ Addtion to that, I'd like to see a description like 'syncgroup_yyparse sets the global SyncRepStandbyNames as side effect' around it. === malloc/free are used in create_name_node and other functions to be used in scanner, but syncgroup_gram.y is said to use palloc/pfree. Maybe they should use the same memory allocation/freeing functions. === The variable name SyncRepStandbyNames holds the list of SyncGroupNode*. This is somewhat confusing. How about SyncRepStandbys? === +static void +SyncRepClearStandbyGroupList(SyncGroupNode *group) +{ + SyncGroupNode *n = group->member; The name 'n' is a bit confusing, I believe that the one-letter variables should be used following implicit (and ancient?) convention otherwise pretty short-term and obvious cases. name, or group_name instead might be better. There's similar usage of 'n' in other places. === + * Find active walsender position of WalSnd by name. Returns index of walsnds + * array if found, otherwise return -1. I didn't get what is 'walsender position' within this comment. And as the discussion upthread, there can be multiple walsenders with the same name. So this might be like this. > * Finds the first active synchronous walsender with given name > * in WalSndCtl->wansnds and returns the index of that. Returns > * -1 if not found. === + * Get both synced LSNS: write and flush, using its group function and check + * whether each LSN has advanced to, or not. This is question for all. Which to use synced, synched or synchronized? Maybe we should use non-abbreviated spellings unless the description become too long to make it hard to read. > * Return true if we have enough synchronized standbys and the 'safe' > * written and flushed LSNs, which are LSNs assured in all standbys > * considered should be synchronized. # Please rewrite me. === +SyncRepSyncedLsnAdvancedTo(XLogRecPtr *write_pos, XLogRecPtr *flush_pos) +{ + XLogRecPtr cur_write_pos; + XLogRecPtr cur_flush_pos; + bool ret; The name cur_*_pos are a bit confusing. They hold LSNs where all of standbys choosed as synchronized ones. So how about safe_*_pos? And 'ret' is not the return value of this function and it can have more specific name, such like... satisfied? or else.. === +SyncRepSyncedLsnAdvancedTo(XLogRecPtr *write_pos, XLogRecPtr *flush_pos) ... + /* Check whether each LSN has advanced to */ + if (ret) + { ... + return true; + } + + return false; This might be a kind of favor, It would be simple to be written with reverse-condition. === + SyncRepSyncedLsnAdvancedTo(XLogRecPtr *write_pos, XLogRecPtr *flush_pos) ... + ret = SyncRepStandbyNames->SyncRepGetSyncedLsnsFn(SyncRepStandbyNames, + &cur_write_pos, + &cur_flush_pos); ... + if (MyWalSnd->write >= cur_write_pos) I suppose SyncRepGetSyncedLsnsFn, or SyncRepGetSyncedLsnsPriority can return InvalidXLogRecPtr as cur_*_pos even when it returns true. And, I suppose comparison of LSN values with InvalidXLogRecPtr is not well-defined. Anyway the condition goes wrong when cur_write_pos = InvalidXLogRecPtr (but ret = true). === + * Obtain a array containing positions of standbys of specified group + * currently considered as synchronous up to wait_num of its group. + * Caller is respnsible for allocating the data obtained. # Anyone please reedit my rewriting below.. Perhaps my writing is # quite unreadable.. > * Return the positions of the first group->wait_num > * synchronized standbys in group->member list into > * sync_list. sync_list is assumed to have enough space for > * at least group->wait_num elements. === +bool +SyncRepGetSyncedLsnsPriority(SyncGroupNode *group, XLogRecPtr *write_pos, XLogRecPtr *flush_pos) +{ ... + for(n = group->member; n != NULL; n = n->next) group->member holds two or more items, so the name would be better to be group->members, or member_list. === + /* We already got enough synchronous standbys, return */ + if (num == group->wait_num) As convention for saftiness, this kind of comparison is to use inequality operators. > if (num >= group->wait_num) === At a glance, SyncRepGetSyncedLsnsPriority and SyncRepGetSyncStandbysPriority does almost the same thing and both runs loops over group members. Couldn't they run at once? regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers