Hi Yubiao,

First of all, thanks for your attention and questions. To answer your three questions:

1.
> Does the merge take place in memory or in BK?

The snapshots are merged in BK. For details, see the *### Merge snapshot* section.

2.
> How do we ensure the atomicity of the two writes, I suggest adding a check

We do not guarantee atomicity. The position of a snapshot generally does not change, so the previous index remains valid. If the index write fails after a snapshot is written, the only consequence is that this snapshot write fails. Nothing worse happens, and no dirty data is introduced by compression.

3.
> Clean up unused aborts data

Snapshot cleanup is described in *#### take snapshot, ##### How*. Index cleanup is done automatically by the compressor. I will add this to the *### Snapshot index topic* section.
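
A minimal sketch of the behaviour described in point 2, assuming that recovery only reads snapshots referenced by the index. The class, its fields, and the in-memory lists are hypothetical stand-ins for the snapshot topic and the index topic; this is not the Pulsar implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model: one list stands in for the snapshot topic,
// one set stands in for the index topic.
class SnapshotIndexSketch {
    final List<String> snapshotTopic = new ArrayList<>();   // position = list index
    final Set<Integer> indexedPositions = new HashSet<>();  // positions recorded in the index
    boolean failNextIndexWrite = false;                     // test hook to simulate a failed index write

    /** Writes the snapshot first, then the index. The two writes are not atomic. */
    boolean writeSnapshot(String data) {
        snapshotTopic.add(data);
        int position = snapshotTopic.size() - 1;
        if (failNextIndexWrite) {
            failNextIndexWrite = false;
            // The snapshot entry exists but is not indexed,
            // so the whole write is reported as failed.
            return false;
        }
        indexedPositions.add(position);
        return true;
    }

    /** Assumption: recovery only follows the index, so unindexed snapshots are ignored. */
    List<String> recover() {
        List<String> valid = new ArrayList<>();
        for (int i = 0; i < snapshotTopic.size(); i++) {
            if (indexedPositions.contains(i)) {
                valid.add(snapshotTopic.get(i));
            }
        }
        return valid;
    }
}
```

In this model a failed index write leaves an orphan entry in the snapshot topic, but earlier index entries still point at valid positions and the orphan is never read back.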

Yours sincerely,
Xiangying Meng

On Mon, Aug 15, 2022 at 3:56 PM Yubiao Feng <yubiao.f...@streamnative.io.invalid> wrote:

> Hi Xiangying
>
> I think Multiple-snapshots for TB is a good idea. And I have these
> questions:
>
> > The number of the transactions in a snapshot can be configured, and we
> > hope it is small, then we can merge the small snapshots into a large
> > snapshot when it reaches a configured number.
>
> Does the merge take place in memory or in BK?
>
> - If we merge small-snapshot in memory, can we just use large-snapshot?
> - If we merge small-snapshot in BK, how to do it?
>
> > The index is written after each multiple-snapshot is written.
>
> Snapshot and index are stored in different topics, right?
>
> How do we ensure the atomicity of the two writes, I suggest adding a check
> mechanism that snapshot not recorded in the index is invalid.
>
> > #### Clean up unused aborts data
>
> Now, this section only has instructions for clear snapshots.
> I think we should add this: how to delete/override the index data.
>
> Thanks
> Yubiao Feng
>
> On Thu, Aug 4, 2022 at 10:27 AM Xiangying Meng <xiangy...@apache.org>
> wrote:
>
> > Hi, Pulsar community,
> > I'd like to start a discussion about transaction multiple-snapshot.
> > In order to get rid of the capacity limitation of the bookkeeper entry, we
> > plan to use multiple snapshots. More details can be found here
> > <https://github.com/apache/pulsar/issues/16913>.
> >
> > Yours sincerely,
> > Xiangying Meng