On Fri, Dec 18, 2015 at 7:38 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote: [000-_multi_sync_replication_v3.patch]
Hi Masahiko, I haven't tested this version of the patch but I have some comments on the code. +/* Is this wal sender considerable one? */ +bool +SyncRepActiveListedWalSender(int num) Maybe "Is this wal sender managing a standby that is streaming and listed as a synchronous standby?" +/* + * Obtain three palloc'd arrays containing position of standbys currently + * considered as synchronous, and its length. + */ +int +SyncRepGetSyncStandbys(int *sync_standbys) This comment seems to be out of date. I would say "Populate a caller-supplied array which much have enough space for ... Returns ...". +/* + * Obtain standby currently considered as synchronous using + * '1-priority' method. + */ +int +SyncRepGetSyncStandbysOnePriority(int *sync_standbys) + ... code ... Why do we need a separate function and code path for this case? If you used SyncRepGetSyncStandbysPriority with a size of 1, should it not produce the same result in the same time complexity? +/* + * Obtain standby currently considered as synchronous using + * 'priority' method. + */ +int +SyncRepGetSyncStandbysPriority(int *sync_standbys) I would say something more descriptive, maybe like this: "Populates a caller-supplied buffer with the walsnds indexes of the highest priority active synchronous standbys, up to the a limit of 'synchronous_standby_num'. The order of the results is undefined. Returns the number of results actually written." If you got rid of SyncRepGetSyncStandbysOnePriority as suggested above, then this function could be renamed to SyncRepGetSyncStandbys. I think it would be a tiny bit nicer if it also took a Size n argument along with the output buffer pointer. As for the body of that function (which I won't paste here), it contains an algorithm to find the top K elements in an array of N elements. It does that with a linear search through the top K seen so far for each value in the input array, so its worst case is O(KN) comparisons. Some of the sorting gurus on this list might have something to say about that but my take is that it seems fine for the tiny values of K and N that we're dealing with here, and it's nice that it doesn't need any space other than the output buffer, unlike some other top-K algorithms which would win for larger inputs. + /* Found sync standby */ This comment would be clearer as "Found lowest priority standby, so replace it". + if (walsndloc->sync_standby_priority == priority && + walsnd->sync_standby_priority < priority) + sync_standbys[j] = i; In this case, couldn't you also update 'priority' directly, and break out of the loop immediately? Wouldn't "lowest_priority" be a better variable name than "priority"? It might be good to say "lowest" rather than "highest" in the nearby comments, to be consistent with other parts of the code including the function name (lower priority number means higher priority!). +/* + * Obtain currently synced LSN: write and flush, + * using '1-prioirty' method. s/prioirty/priority/ + */ +bool +SyncRepGetSyncLsnsOnePriority(XLogRecPtr *write_pos, XLogRecPtr *flush_pos) Similar to the earlier case, why have a special case for 1-priority? Wouldn't SyncRepGetSyncLsnsPriority produce the same result when is synchronous_standby_num == 1? +/* + * Obtain currently synced LSN: write and flush, + * using 'prioirty' method. s/prioirty/priority/ +SyncRepGetSyncLsnsPriority(XLogRecPtr *write_pos, XLogRecPtr *flush_pos) +{ + int *sync_standbys = NULL; + int num_sync; + int i; + XLogRecPtr synced_write = InvalidXLogRecPtr; + XLogRecPtr synced_flush = InvalidXLogRecPtr; + + sync_standbys = (int *) palloc(sizeof(int) * synchronous_standby_num); Would a fixed size buffer on the stack (of compile time constant size) be better than palloc/free in here and elsewhere? + /* + for (i = 0; i < num_sync; i++) + { + volatile WalSnd *walsndloc = &WalSndCtl->walsnds[sync_standbys[i]]; + if (walsndloc == MyWalSnd) + { + found = true; + break; + } + } + */ Dead code. + if (synchronous_replication_method == SYNC_REP_METHOD_1_PRIORITY) + synchronous_standby_num = 1; + else + synchronous_standby_num = pg_atoi(lfirst(list_head(elemlist)), sizeof(int), 0); Should we detect if synchronous_standby_num > the number of listed servers, which would be a nonsensical configuration? Should we also impose some other kind of constant limits, like must be >= 0 (I haven't tried but I wonder if -1 leads to very large palloc) and must be <= MAX_XXX (smallish sanity check number like 256, rather than the INT_MAX limit imposed by pg_atoi), so that we could use that constant to size stack buffers in the places where you currently palloc? Could 1-priority mode be inferred from the use of a non-number in the leading position, and if so, does the mode concept even need to exist, especially if SyncRepGetSyncLsnsOnePriority and SyncRepGetSyncStandbysOnePriority aren't really needed either way? Is there any difference in behaviour between the following configurations? (Sorry if that particular question has already been duked out in the long thread about GUCs.) synchronous_replication_method = 1-priority synchronous_standby_names = foo, bar synchronous_replication_method = priority synchronous_standby_names = 1, foo, bar (Apologies for the missing leading whitespace in patch fragments pasted above, it seems that my mail client has eaten it). -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers