On Thu, Mar 8, 2018 at 8:07 AM, Venkateswara Rao Jujjuri <jujj...@gmail.com>
wrote:

> On Thu, Mar 8, 2018 at 2:38 AM, Ivan Kelly <iv...@apache.org> wrote:
>
> > > Given that RackAwareEnsemble policy defaults to finding a replacement
> > > bookie within the same rack, when a bookie is lost in a rack, the entire
> > > cluster will be replicating to the same 'rack'. This puts a lot of
> > > pressure on the rack and also takes a longer time to bring up the
> > > replication levels.
> >
> > I agree this has potential to be problematic.
> >
> > Perhaps we should provide a switch to RackAwareEnsemble,
> > 'preferReplaceInSameRack'.
> >
> > > I would think the right fix is to bring back the targetBookie concept
> > > (with a configuration parameter) and add a placement check predicate on
> > > top of it. When this is configured, each bookie picks up the work, checks
> > > if the ensemble placement policy gets satisfied, if so replicate it, if
> > > not move on.
> >
> > I don't think adding a predicate argument (I guess a
> > BiPredicate<Set<BookieSocketAddress>, BookieSocketAddress>?) to the
> > recover bookie call makes sense. There is already a way to customize
> > this behaviour, by passing in an EnsemblePlacementPolicy on
> > Configuration of the client. The behaviour you want can be achieved by
> > taking one of the current EnsemblePlacementPolicies and overriding
> > replaceBookie, though I guess that's not very user-friendly. However,
> > even if it was user-friendly, how would we make it easy for users to
> > supply a placement policy, or even a predicate as you suggested, to
> > the autorecovery daemon?
> >
>
> In the old model, if the bookie is writable
> AND is not part of the ensemble, replicate to the local (target) bookie.
> My proposal is to add another AND condition:
>
> if the bookie is writable AND not part of the ensemble AND satisfies the
> ensemble placement policy,
> write to the local (target) bookie.
>

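To make the proposed condition concrete, here is a minimal Java sketch of the
three-way check. Everything in it is hypothetical and for illustration only:
the class LocalTargetCheck, both method names, and the use of plain strings
for bookie addresses and racks are assumptions, not BookKeeper code. The
distinctRacks predicate is only a stand-in for whatever the configured
placement policy would actually enforce.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.BiPredicate;

// Hypothetical helper, not part of BookKeeper.
final class LocalTargetCheck {
    private LocalTargetCheck() {}

    /**
     * Returns true only when all three conditions hold: the local bookie is
     * writable, it is not already among the surviving ensemble members, and
     * adding it keeps the placement predicate satisfied.
     */
    static boolean shouldReplicateLocally(
            String localBookie,
            boolean localWritable,
            Set<String> survivors,
            BiPredicate<Set<String>, String> placementOk) {
        return localWritable
                && !survivors.contains(localBookie)
                && placementOk.test(survivors, localBookie);
    }

    /**
     * Stand-in placement predicate (an assumption): the candidate is accepted
     * only if its rack is not already used by a surviving ensemble member.
     */
    static BiPredicate<Set<String>, String> distinctRacks(Map<String, String> rackOf) {
        return (survivors, candidate) -> {
            Set<String> usedRacks = new HashSet<>();
            for (String bookie : survivors) {
                usedRacks.add(rackOf.get(bookie));
            }
            return !usedRacks.contains(rackOf.get(candidate));
        };
    }
}
```

With survivors on racks r1 and r2, a writable local bookie on r3 would pick up
the work, while one on r1 would move on to the next ledger.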

I think this is a good change to take, but we need to differentiate two cases:

1) If bookie recovery is running as a separate daemon, we don't need to make
any changes here.

2) If bookie recovery is running along with the bookie, we construct an
ensemble placement policy that wraps the rack-aware/region-aware one
and overrides the predicate to take the local bookie into account.
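
A rough Java sketch of what I mean by wrapping in case 2. This is a sketch
under stated assumptions, not the real API: ReplacementPolicy is a simplified,
hypothetical interface (it is not BookKeeper's EnsemblePlacementPolicy),
DistinctRackPolicy is a toy stand-in for the rack-aware policy, and bookies and
racks are plain strings. The wrapper excludes every bookie except the local
one, so the wrapped policy either returns the local bookie or finds nothing.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;

// All names below are hypothetical; none of this is BookKeeper code.
final class LocalBookiePlacement {
    private LocalBookiePlacement() {}

    /** Simplified policy interface, assumed for this sketch. */
    interface ReplacementPolicy {
        Optional<String> replaceBookie(Set<String> ensemble, String failed, Set<String> excluded);
    }

    /** Toy rack-aware stand-in: picks a bookie whose rack no survivor uses. */
    static final class DistinctRackPolicy implements ReplacementPolicy {
        private final TreeMap<String, String> rackOf; // bookie -> rack, sorted for determinism

        DistinctRackPolicy(Map<String, String> rackOf) {
            this.rackOf = new TreeMap<>(rackOf);
        }

        @Override
        public Optional<String> replaceBookie(Set<String> ensemble, String failed, Set<String> excluded) {
            Set<String> usedRacks = new HashSet<>();
            for (String bookie : ensemble) {
                if (!bookie.equals(failed)) {
                    usedRacks.add(rackOf.get(bookie));
                }
            }
            for (Map.Entry<String, String> entry : rackOf.entrySet()) {
                String candidate = entry.getKey();
                if (!ensemble.contains(candidate)
                        && !excluded.contains(candidate)
                        && !usedRacks.contains(entry.getValue())) {
                    return Optional.of(candidate);
                }
            }
            return Optional.empty();
        }
    }

    /** Wrapper that lets the delegate choose only the local bookie. */
    static final class LocalOnlyPolicy implements ReplacementPolicy {
        private final ReplacementPolicy delegate;
        private final String localBookie;
        private final Set<String> knownBookies;

        LocalOnlyPolicy(ReplacementPolicy delegate, String localBookie, Set<String> knownBookies) {
            this.delegate = delegate;
            this.localBookie = localBookie;
            this.knownBookies = new HashSet<>(knownBookies);
        }

        @Override
        public Optional<String> replaceBookie(Set<String> ensemble, String failed, Set<String> excluded) {
            if (ensemble.contains(localBookie) || excluded.contains(localBookie)) {
                return Optional.empty(); // local bookie is not eligible; move on
            }
            // Exclude everything except the local bookie, then defer to the
            // wrapped policy so its rack constraints still apply.
            Set<String> allButLocal = new HashSet<>(knownBookies);
            allButLocal.addAll(excluded);
            allButLocal.remove(localBookie);
            return delegate.replaceBookie(ensemble, failed, allButLocal);
        }
    }
}
```

An empty result here simply means "this worker should skip the ledger", which
is the move-on behaviour JV described.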


However, there is a problem with this "predicate" model: it can
potentially churn the metadata storage, since some bookies will be
doing nothing but polling the under-replicated ledger list. So a long-term
direction is to change how the auditor distributes replication tasks to
replication workers.



>
> Thanks,
> JV
>
>
> > -Ivan
> >
>
>
>
> --
> Jvrao
> ---
> First they ignore you, then they laugh at you, then they fight you, then
> you win. - Mahatma Gandhi
>
