On 2020/04/18 0:31, Tom Lane wrote:
Kyotaro Horiguchi writes:
At Fri, 17 Apr 2020 16:03:34 +0900, Fujii Masao
wrote in
I agree that it might be worth considering the removal of am_sync for
the master branch or v14. But I think that it should not be
back-patched.
Ah! Agreed.
Yeah, tha
On Sun, 19 Apr 2020 at 01:00, Tom Lane wrote:
>
> Masahiko Sawada writes:
> > On Sat, 18 Apr 2020 at 00:31, Tom Lane wrote:
> >> + /* Quick out if not even configured to be synchronous */
> >> + if (SyncRepConfig == NULL)
> >> + return false;
>
> > I felt strange a bit that we do the a
Masahiko Sawada writes:
> On Sat, 18 Apr 2020 at 00:31, Tom Lane wrote:
>> + /* Quick out if not even configured to be synchronous */
>> + if (SyncRepConfig == NULL)
>> + return false;
> I felt strange a bit that we do the above check in
> SyncRepGetSyncRecPtr() because SyncRepReleaseW
On Sat, 18 Apr 2020 at 00:31, Tom Lane wrote:
>
> Kyotaro Horiguchi writes:
> > At Fri, 17 Apr 2020 16:03:34 +0900, Fujii Masao
> > wrote in
> >> I agree that it might be worth considering the removal of am_sync for
> >> the master branch or v14. But I think that it should not be
> >> back-patc
Kyotaro Horiguchi writes:
> At Fri, 17 Apr 2020 17:03:11 +0900, Masahiko Sawada
> wrote in
>> On Fri, 17 Apr 2020 at 14:58, Kyotaro Horiguchi
>> wrote:
>> Just for confirmation, since the new approach doesn't change that
>> walsenders reload new config at their convenient timing, it still can
Kyotaro Horiguchi writes:
> At Fri, 17 Apr 2020 16:03:34 +0900, Fujii Masao
> wrote in
>> I agree that it might be worth considering the removal of am_sync for
>> the master branch or v14. But I think that it should not be
>> back-patched.
> Ah! Agreed.
Yeah, that's not necessary to fix the b
At Fri, 17 Apr 2020 17:03:11 +0900, Masahiko Sawada
wrote in
> On Fri, 17 Apr 2020 at 14:58, Kyotaro Horiguchi
> wrote:
> > The attached is baed on syncrep-fixes-1.patch + am_sync elimination.
> >
>
> Just for confirmation, since the new approach doesn't change that
> walsenders reload new co
At Fri, 17 Apr 2020 16:03:34 +0900, Fujii Masao
wrote in
> I agree that it might be worth considering the removal of am_sync for
> the master branch or v14. But I think that it should not be
> back-patched.
Ah! Agreed.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, 17 Apr 2020 at 14:58, Kyotaro Horiguchi wrote:
>
> At Thu, 16 Apr 2020 11:39:06 -0400, Tom Lane wrote in
> > Kyotaro Horiguchi writes:
> > > [ syncrep-fixes-4.patch ]
> >
> > I agree that we could probably improve the clarity of this code with
> > further rewriting, but I'm still very op
On 2020/04/17 3:00, Tom Lane wrote:
Fujii Masao writes:
On 2020/04/14 22:52, Tom Lane wrote:
*Yes it does*. The existing code can deliver entirely broken results
if some walsender exits between where we examine the priorities and
where we fetch the WAL pointers.
IMO that the broken resul
On 2020/04/17 14:58, Kyotaro Horiguchi wrote:
At Thu, 16 Apr 2020 11:39:06 -0400, Tom Lane wrote in
Kyotaro Horiguchi writes:
[ syncrep-fixes-4.patch ]
I agree that we could probably improve the clarity of this code with
further rewriting, but I'm still very opposed to the idea of having
At Thu, 16 Apr 2020 11:39:06 -0400, Tom Lane wrote in
> Kyotaro Horiguchi writes:
> > [ syncrep-fixes-4.patch ]
>
> I agree that we could probably improve the clarity of this code with
> further rewriting, but I'm still very opposed to the idea of having
> callers know that the first num_sync a
Fujii Masao writes:
> On 2020/04/14 22:52, Tom Lane wrote:
>> *Yes it does*. The existing code can deliver entirely broken results
>> if some walsender exits between where we examine the priorities and
>> where we fetch the WAL pointers.
> IMO that the broken results can be delivered when walsen
On 2020/04/14 22:52, Tom Lane wrote:
Kyotaro Horiguchi writes:
SyncRepGetSyncStandbysPriority() is runing holding SyncRepLock so
sync_standby_priority of any walsender can be changed while the
function is scanning welsenders. The issue is we already have
inconsistent walsender information be
Kyotaro Horiguchi writes:
> [ syncrep-fixes-4.patch ]
I agree that we could probably improve the clarity of this code with
further rewriting, but I'm still very opposed to the idea of having
callers know that the first num_sync array elements are the active
ones. It's wrong (or at least differen
At Thu, 16 Apr 2020 16:48:28 +0900, Masahiko Sawada
wrote in
> This is just a notice; I'm reading your latest patch but it seems to
> include unrelated changes:
>
> $ git diff --stat
> src/backend/replication/syncrep.c | 475
> +
On Thu, 16 Apr 2020 at 16:22, Kyotaro Horiguchi wrote:
>
> At Wed, 15 Apr 2020 11:31:49 -0400, Tom Lane wrote in
> > Kyotaro Horiguchi writes:
> > > At Tue, 14 Apr 2020 16:32:40 -0400, Tom Lane wrote in
> > > + stby->is_sync_standby = true; /* might change below */
> >
> > > I'm une
At Wed, 15 Apr 2020 11:31:49 -0400, Tom Lane wrote in
> Kyotaro Horiguchi writes:
> > At Tue, 14 Apr 2020 16:32:40 -0400, Tom Lane wrote in
> > + stby->is_sync_standby = true; /* might change below */
>
> > I'm uneasy with that. In quorum mode all running standbys are marked
> >
Kyotaro Horiguchi writes:
> At Tue, 14 Apr 2020 16:32:40 -0400, Tom Lane wrote in
> + stby->is_sync_standby = true; /* might change below */
> I'm uneasy with that. In quorum mode all running standbys are marked
> as "sync" and that's bogus.
I don't follow that? The existing co
At Wed, 15 Apr 2020 13:01:02 +0900, Masahiko Sawada
wrote in
> On Tue, 14 Apr 2020 at 18:35, Kyotaro Horiguchi
> wrote:
> >
> > At Tue, 14 Apr 2020 13:06:14 +0900, Masahiko Sawada
> > wrote in
> > > On Tue, 14 Apr 2020 at 10:34, Tom Lane wrote:
> > > >
> > > > Kyotaro Horiguchi writes:
> >
At Wed, 15 Apr 2020 11:35:58 +0900 (JST), Kyotaro Horiguchi
wrote in
> I'm looking this more closer.
It looks to be in the right direction to me.
As mentioned in the previous mail, I removed is_sync_standby from
SycnStandbyData. But just doing that breaks pg_stat_get_wal_senders.
It is an exst
On Tue, 14 Apr 2020 at 18:35, Kyotaro Horiguchi wrote:
>
> At Tue, 14 Apr 2020 13:06:14 +0900, Masahiko Sawada
> wrote in
> > On Tue, 14 Apr 2020 at 10:34, Tom Lane wrote:
> > >
> > > Kyotaro Horiguchi writes:
> > > > At Sat, 11 Apr 2020 18:30:30 -0400, Tom Lane wrote
> > > > in
> > > >> Wha
On Tue, Apr 14, 2020 at 04:32:40PM -0400, Tom Lane wrote:
> I wrote:
> > It doesn't seem to me to be that hard to implement the desired
> > semantics for synchronous_standby_names with inconsistent info.
> > In FIRST mode you basically just need to take the N smallest
> > priorities you see in the
At Tue, 14 Apr 2020 16:32:40 -0400, Tom Lane wrote in
> I wrote:
> > It doesn't seem to me to be that hard to implement the desired
> > semantics for synchronous_standby_names with inconsistent info.
> > In FIRST mode you basically just need to take the N smallest
> > priorities you see in the ar
At Tue, 14 Apr 2020 09:52:42 -0400, Tom Lane wrote in
> Kyotaro Horiguchi writes:
> > SyncRepGetSyncStandbysPriority() is runing holding SyncRepLock so
> > sync_standby_priority of any walsender can be changed while the
> > function is scanning welsenders. The issue is we already have
> > incons
I wrote:
> It doesn't seem to me to be that hard to implement the desired
> semantics for synchronous_standby_names with inconsistent info.
> In FIRST mode you basically just need to take the N smallest
> priorities you see in the array, but without assuming there are no
> duplicates or holes. It
Kyotaro Horiguchi writes:
> SyncRepGetSyncStandbysPriority() is runing holding SyncRepLock so
> sync_standby_priority of any walsender can be changed while the
> function is scanning welsenders. The issue is we already have
> inconsistent walsender information before we enter the function. Thus
>
At Tue, 14 Apr 2020 13:06:14 +0900, Masahiko Sawada
wrote in
> On Tue, 14 Apr 2020 at 10:34, Tom Lane wrote:
> >
> > Kyotaro Horiguchi writes:
> > > At Sat, 11 Apr 2020 18:30:30 -0400, Tom Lane wrote in
> > >> What I think we should do about this is, essentially, to get rid of
> > >> SyncRepG
On Tue, 14 Apr 2020 at 10:34, Tom Lane wrote:
>
> Kyotaro Horiguchi writes:
> > At Sat, 11 Apr 2020 18:30:30 -0400, Tom Lane wrote in
> >> What I think we should do about this is, essentially, to get rid of
> >> SyncRepGetSyncStandbys. Instead, let's have each walsender advertise
> >> whether *
Kyotaro Horiguchi writes:
> At Sat, 11 Apr 2020 18:30:30 -0400, Tom Lane wrote in
>> What I think we should do about this is, essentially, to get rid of
>> SyncRepGetSyncStandbys. Instead, let's have each walsender advertise
>> whether *it* believes that it is a sync standby, based on its last
At Mon, 13 Apr 2020 15:31:01 +0900 (JST), Kyotaro Horiguchi
wrote in
> At Sat, 11 Apr 2020 18:30:30 -0400, Tom Lane wrote in
> > The point that I was trying to make originally is that it seems quite
> > insane to imagine that a walsender's sync_standby_priority value is
> > somehow more stable
At Sat, 11 Apr 2020 18:30:30 -0400, Tom Lane wrote in
> Masahiko Sawada writes:
> > On Tue, 31 Mar 2020 at 23:16, Masahiko Sawada
> > wrote:
> >>> Therefore, the band-aid fix seems to be to set the lowest priority to
> >>> very large number at the beginning of SyncRepGetSyncStandbysPriority().
Masahiko Sawada writes:
> On Tue, 31 Mar 2020 at 23:16, Masahiko Sawada
> wrote:
>>> Therefore, the band-aid fix seems to be to set the lowest priority to
>>> very large number at the beginning of SyncRepGetSyncStandbysPriority().
>> I think we can use max_wal_senders.
> Sorry, that's not true.
On Tue, 31 Mar 2020 at 23:16, Masahiko Sawada
wrote:
>
> On Fri, 27 Mar 2020 at 13:54, Fujii Masao wrote:
> >
> >
> >
> > On 2020/03/27 10:26, Tom Lane wrote:
> > > Twice in the past month [1][2], buildfarm member hoverfly has managed
> > > to reach the "unreachable" Assert(false) at the end of
>
On Fri, 27 Mar 2020 at 13:54, Fujii Masao wrote:
>
>
>
> On 2020/03/27 10:26, Tom Lane wrote:
> > Twice in the past month [1][2], buildfarm member hoverfly has managed
> > to reach the "unreachable" Assert(false) at the end of
> > SyncRepGetSyncStandbysPriority.
>
> When I search the past discussi
At Fri, 27 Mar 2020 13:54:25 +0900, Fujii Masao
wrote in
>
>
> On 2020/03/27 10:26, Tom Lane wrote:
> > Twice in the past month [1][2], buildfarm member hoverfly has managed
> > to reach the "unreachable" Assert(false) at the end of
> > SyncRepGetSyncStandbysPriority.
>
> When I search the pa
On 2020/03/27 10:26, Tom Lane wrote:
Twice in the past month [1][2], buildfarm member hoverfly has managed
to reach the "unreachable" Assert(false) at the end of
SyncRepGetSyncStandbysPriority.
When I search the past discussions, I found that Noah Misch reported
the same issue.
https://www.p
Twice in the past month [1][2], buildfarm member hoverfly has managed
to reach the "unreachable" Assert(false) at the end of
SyncRepGetSyncStandbysPriority.
What seems likely to me, after quickly eyeballing the code, is that
hoverfly is hitting the blatantly-obvious race condition in that function
38 matches
Mail list logo