It's a good idea to have that tool, but I have some questions about it.

> The process is as follows:
> 1. Submit the bookie node to be taken offline;
> 2. Traverse the ledgers on the offline bookie, and persist each ledger
> and its corresponding offline bookie nodes to the zookeeper path
> ledgers/offline_ledgers/ledgerId;
> 3. Get the ledger to be migrated;
> 4. Traverse all fragments of the ledger, and filter out the fragments
> that contain a replica on the offline bookie;
> 5. Copy the data for each such fragment;
> 6. When all of a ledger's fragments have been copied, delete the
> corresponding ledgers/offline_ledgers/ledgerId node;
> 7. When all ledgerId nodes under ledgers/offline_ledgers have been
> deleted, the data migration is complete, and the bookies can be stopped
> and taken offline in batches.
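For concreteness, steps 2 and 7 above boil down to creating and deleting
marker znodes. A minimal sketch with the standard ZooKeeper Java client
follows; the path layout, payload format, and class name are my own
assumptions for illustration, not necessarily what the BP implements:

import java.nio.charset.StandardCharsets;
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class OfflineLedgerMarker {
    private final ZooKeeper zk;

    public OfflineLedgerMarker(ZooKeeper zk) {
        this.zk = zk;
    }

    // Step 2: record that ledgerId has replicas on the offline bookies.
    // Payload format is an illustrative assumption; parent znodes are
    // assumed to already exist.
    public void mark(long ledgerId, List<String> offlineBookies)
            throws KeeperException, InterruptedException {
        String path = "/ledgers/offline_ledgers/" + ledgerId;
        byte[] payload = String.join(",", offlineBookies)
                .getBytes(StandardCharsets.UTF_8);
        zk.create(path, payload, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT);
    }

    // Step 7: clear the marker once every fragment has been copied.
    public void unmark(long ledgerId)
            throws KeeperException, InterruptedException {
        zk.delete("/ledgers/offline_ledgers/" + ledgerId, -1);
    }
}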

Does the migration process run in the recovery service, or does it run
standalone?

It looks like most of the work is the same as in the AutoRecovery
service: mark the ledger, then do the copy. So we can reuse the
AutoRecovery service to copy the data. The only thing we need to do is,
when a bookie goes offline, mark all the ledgers on that bookie as
under-replicated, and then let the AutoRecovery service copy them; see
the sketch below.
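A rough sketch of that approach, assuming the ledgers on the offline
bookie have already been enumerated (e.g. via BookKeeperAdmin).
markLedgerUnderreplicated is the same hook the Auditor uses when it
detects a lost bookie; treat the surrounding details as illustrative:

import java.util.Collection;
import org.apache.bookkeeper.meta.LedgerUnderreplicationManager;

public class OfflineViaAutoRecovery {
    // Mark every ledger stored on the offline bookie as under-replicated;
    // the AutoRecovery workers then re-replicate them exactly as they
    // would for a crashed bookie, honoring the placement policy.
    static void markAll(LedgerUnderreplicationManager underreplicationManager,
                        Collection<Long> ledgersOnBookie,
                        String offlineBookieAddress) throws Exception {
        for (long ledgerId : ledgersOnBookie) {
            underreplicationManager.markLedgerUnderreplicated(
                    ledgerId, offlineBookieAddress);
        }
    }
}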

I remember that AutoRecovery takes the placement policy into account,
while it looks like the migration does not. For example, if you
configured the RackAware placement policy and the new bookie chosen by
the migration is not in an appropriate rack, then after the data is
migrated, AutoRecovery will copy it again to make the placement
consistent with the policy. Reusing AutoRecovery avoids this.
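For reference, this is how the rack-aware policy is typically enabled on
a client configuration; a sketch only, and the assumption here is that
AutoRecovery's replication client is built from a configuration carrying
the same policy, so the replicas it writes already satisfy the rack
constraints:

import org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy;
import org.apache.bookkeeper.conf.ClientConfiguration;

public class RackAwareConfigExample {
    public static void main(String[] args) {
        ClientConfiguration conf = new ClientConfiguration();
        // Replicas written through a client using this configuration are
        // placed subject to the rack constraints, so copies made by
        // AutoRecovery do not need to be re-copied afterwards.
        conf.setEnsemblePlacementPolicy(RackawareEnsemblePlacementPolicy.class);
    }
}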

Thanks,
Yong



On Tue, 13 Sept 2022 at 18:32, Enrico Olivelli <eolive...@gmail.com> wrote:

> Thanks for your answers.
>
> I support this BP.
> I have left some comments on the PR, there is some work to be done,
> but you are on your way
>
> When there is consensus about this BP you have to start a VOTE
>
> Thanks
>
> Enrico
>
> On Fri, Sep 9, 2022 at 16:06 lordcheng10
> <1572139...@qq.com.invalid> wrote:
> >
> > Sorry for not describing the function of this BP clearly; this BP is not
> the same as your proposal.
> >
> >
> > Because the current bookkeeper does not have the ability to migrate data
> and can only perform data recovery, this BP mainly provides a data
> migration tool.
> >
> >
> > When bookkeeper has a data migration tool, the following scenarios can
> be solved:
> > 1. Taking bookie nodes offline:
> > As mentioned above, once bookkeeper has this data migration tool, taking
> a bookie offline takes only two steps, and the time required is greatly
> reduced:
> > a. Execute the data migration command:
> > bin/bookkeeper shell replicasMigration --bookieIds bookie1,bookie2
> --ledgerIds ALL --readOnly true
> > b. When the data migration is completed, stop all the bookie nodes to be
> taken offline;
> >
> >
> > 2. Expanding the bookie nodes to improve the read speed of historical
> data:
> > a. When a client consumes historical data from a few days ago, we would
> like to increase the read speed of that data by adding bookie nodes.
> > b. When we add a new bookie node to the cluster, the new node can only
> serve reads and writes of new data, and cannot improve the read speed of
> historical data.
> > c. With bookkeeper's data migration tool, we can migrate some historical
> data to the new node, let it serve part of the historical reads, and so
> improve the read speed of historical data.
>
