Kyotaro Horiguchi <horikyota....@gmail.com> writes: > SyncRepGetSyncStandbysPriority() is runing holding SyncRepLock so > sync_standby_priority of any walsender can be changed while the > function is scanning welsenders. The issue is we already have > inconsistent walsender information before we enter the function. Thus > how many times we scan on the array doesn't make any difference.
*Yes it does*. The existing code can deliver entirely broken results if some walsender exits between where we examine the priorities and where we fetch the WAL pointers. While that doesn't seem to be the exact issue we're seeing in the buildfarm, it's still another obvious bug in this code. I will not accept a "fix" that doesn't fix that. > I think we need to do one of the followings. > A) prevent SyncRepGetSyncStandbysPriority from being entered while > walsender priority is inconsistent. > B) make SyncRepGetSyncStandbysPriority be tolerant of priority > inconsistency. > C) protect walsender priority array from beinig inconsistent. (B) seems like the only practical solution from here. We could probably arrange for synchronous update of the priorities when they change in response to a GUC change, but it doesn't seem to me to be practical to do that in response to walsender exit. You'd end up finding that an unexpected walsender exit results in panic'ing the system, which is no better than where we are now. It doesn't seem to me to be that hard to implement the desired semantics for synchronous_standby_names with inconsistent info. In FIRST mode you basically just need to take the N smallest priorities you see in the array, but without assuming there are no duplicates or holes. It might be a good idea to include ties at the end, that is if you see 1,2,2,4 or 1,3,3,4 and you want 2 sync standbys, include the first three of them in the calculation until the inconsistency is resolved. In ANY mode I don't see that inconsistent priorities matter at all. > If we accept to share variable-length information among processes, > sharing sync_standby_names or parsed SyncRepConfigData among processes > would work. Not sure that we really need more than what's being shared now, ie each process's last-known index in the sync_standby_names list. regards, tom lane