On 8/25/21, 5:40 PM, "Kyotaro Horiguchi" <horikyota....@gmail.com> wrote:
> At Wed, 25 Aug 2021 18:18:59 +0000, "Bossart, Nathan" <bossa...@amazon.com> wrote in
>> Let's say we have the following situation (F = flush, E = earliest
>> registered boundary, and L = latest registered boundary), and let's
>> assume that each segment has a cross-segment record that ends in the
>> next segment.
>>
>>         F     E                                         L
>>         |-----|-----|-----|-----|-----|-----|-----|-----|
>>            1     2     3     4     5     6     7     8
>>
>> Then, we write out WAL to disk and create .ready files as needed.  If
>> we didn't flush beyond the latest registered boundary, the latest
>> registered boundary now becomes the earliest boundary.
>>
>>                           F                             E
>>         |-----|-----|-----|-----|-----|-----|-----|-----|
>>            1     2     3     4     5     6     7     8
>>
>> At this point, the earliest segment boundary past the flush point is
>> before the "earliest" boundary we are tracking.
>
> We know we have some cross-segment records in the region [E L] so we
> cannot add a .ready file if flush is in the region. I haven't looked
> at the latest patch (or I may misunderstand the discussion here) but I
> think we shouldn't move E before F exceeds previous (or in the first
> picture above) L.  Things are done that way in my ancient proposal in
> [1].

The strategy in place ensures that we track two things: a boundary that
doesn't change until the flush position passes it, and the latest
registered boundary.  I think it is important that any segment
boundary tracking mechanism does at least those two things.  I don't
see how we could do that if we didn't update E until F passed both E
and L.
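
To make the bookkeeping concrete, here is a rough Python sketch of the
update rule as I described it above.  It is not the patch itself:
BoundaryTracker, register, on_flush, and the integer positions are all
made up for illustration, and a boundary is modeled as (segment number,
end position of that segment's cross-segment record).

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical model: (segment number, end position of its cross-segment record)
Boundary = Tuple[int, int]


@dataclass
class BoundaryTracker:
    earliest: Optional[Boundary] = None  # E: fixed until the flush position passes it
    latest: Optional[Boundary] = None    # L: most recently registered boundary

    def register(self, seg: int, end_pos: int) -> None:
        """Record a cross-segment record that starts in `seg` and ends at `end_pos`."""
        boundary = (seg, end_pos)
        if self.earliest is None:
            self.earliest = self.latest = boundary
        elif seg > self.latest[0]:
            # E is deliberately left alone; only L advances.
            self.latest = boundary

    def on_flush(self, flush_pos: int) -> Optional[int]:
        """Return the highest segment that may get a .ready file, or None."""
        if self.earliest is None or flush_pos < self.earliest[1]:
            return None
        if flush_pos >= self.latest[1]:
            # Flushed past everything we track: notify up to L and reset.
            seg = self.latest[0]
            self.earliest = self.latest = None
        else:
            # Flushed past E but not L: notify up to E, and L becomes the new E.
            seg = self.earliest[0]
            self.earliest = self.latest
        return seg


tracker = BoundaryTracker()
for seg in range(1, 9):
    # Assumed layout: 16-unit segments, each record ends 4 units into the next one.
    tracker.register(seg, seg * 16 + 4)

print(tracker.on_flush(56))   # 1    (F passed E but not L; E becomes the old L)
print(tracker.on_flush(100))  # None (F has not reached the new E at 132)
print(tracker.on_flush(132))  # 8    (F reached L; both are cleared)
```

The second call shows the situation in the second picture: the records
for segments 2 and 3 are already fully flushed, but nothing more is
marked ready until F reaches the boundary that used to be L.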

Nathan
