On Fri, Jun 1, 2018, 16:03 Venkateswara Rao Jujjuri <jujj...@gmail.com>
wrote:

> @Enrico
> Let me understand the issue: Bookies are up and running but ZK doesn't show
> bookies on the list.
>
> Do you see whether the session expired or not? Or are the bookies hung in some way?
>
> We have seen multiple situations in this area:
> 1. Bookie process is around, but zk session lost. Almost all the time we ran
> into this when the bookie was hung in some sense.
>

I suspect this is my case, but dumping the stack of all threads shows a
normal bookie without activity.
No errors are reported in the logs.

> 2. Bookies are down, but ZK still shows the Bookie in RW list. We suspect
> this is a ZK bug which got fixed in later releases.
>

No, the bookie is not in the list of available/readonly bookies.


> 3. Bookies are hung at disk, but were able to keep up the ZK session.
>

Not this case, because no thread is performing I/O.

>
> Does your case fit into any of these situations? Or do you believe that the
> Bookie is healthy and up but lost ZK session?
>

I believe the bookie is healthy; I don't remember whether we tried the
'bookie sanity' check.

> If so, how did you validate that your Bookie is healthy?
>

Normal dump of stack trace
Port bound, listening
No errors in logs
Process alive
Additional http monitoring interface (custom internal monitoring system) up
and reporting no error

But clients can't see the bookie, even with bookkeeper shell list bookies.

Unfortunately these reports come from my customers' sites, so it is
difficult to get feedback and perform tests.
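
For reference, the self-check idea quoted further down in this thread (a
periodic check that asks whether the bookie appears in the expected
available/readonly list) could look roughly like the Java sketch below. This is
only an illustration under assumptions: the class RegistrationSelfCheck, its
Status enum and the two suppliers are hypothetical names, not BookKeeper API.
In a real bookie the suppliers would be backed by whatever the registration
layer exposes.

```java
import java.util.Set;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch, not BookKeeper API: checks whether this bookie's id
// is present in the writable or read-only registration list.
final class RegistrationSelfCheck {
    enum Status { WRITABLE, READONLY, MISSING }

    private final String bookieId;
    private final Supplier<Set<String>> writableBookies;  // assumed lookup
    private final Supplier<Set<String>> readOnlyBookies;  // assumed lookup

    RegistrationSelfCheck(String bookieId,
                          Supplier<Set<String>> writableBookies,
                          Supplier<Set<String>> readOnlyBookies) {
        this.bookieId = bookieId;
        this.writableBookies = writableBookies;
        this.readOnlyBookies = readOnlyBookies;
    }

    // One-shot check against the current registration lists.
    Status check() {
        if (writableBookies.get().contains(bookieId)) {
            return Status.WRITABLE;
        }
        if (readOnlyBookies.get().contains(bookieId)) {
            return Status.READONLY;
        }
        return Status.MISSING;
    }

    // Runs the check periodically and calls onMissing when the bookie is
    // not listed anywhere (for example to log, alert, or re-register).
    ScheduledFuture<?> schedule(ScheduledExecutorService executor,
                                long periodSeconds,
                                Runnable onMissing) {
        return executor.scheduleWithFixedDelay(() -> {
            try {
                if (check() == Status.MISSING) {
                    onMissing.run();
                }
            } catch (RuntimeException e) {
                // A failed lookup must not cancel the periodic task.
                System.err.println("registration self-check failed: " + e);
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }
}
```

A bookie that is alive, bound to its port and idle, but absent from both
lists, would show up as MISSING here, which is exactly the state described
above.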

Enrico


> JV
>
>
> On Fri, Jun 1, 2018 at 12:11 AM, Enrico Olivelli <eolive...@gmail.com>
> wrote:
>
> > On Fri, Jun 1, 2018, 08:49 Sijie Guo <guosi...@gmail.com> wrote:
> >
> > > I don't think there are any zk changes between 4.6.2 and 4.7.0. Are you
> > > sure the upgrade fixes the problem?
> > >
> >
> > I have checked several times and it seems to me that every zk fix in
> > 4.7.0 has been cherry picked to 4.6.2.
> > All I can say is that with the upgrade the issue does not appear. Maybe
> > it is too early to say that it is working.
> >
> > I will send news
> >
> > Enrico
> >
> >
> >
> > > - Sijie
> > >
> > > On Thu, May 31, 2018 at 11:30 PM, Enrico Olivelli <eolive...@gmail.com>
> > > wrote:
> > >
> > > > It seems that all the sites reporting this kind of problem are ONLY
> > > > on 4.6.2.
> > > >
> > > > After an upgrade to 4.7.0 apparently the problem disappears.
> > > >
> > > > I will send news next week
> > > >
> > > > Enrico
> > > >
> > > > On Sun, May 20, 2018, 19:18 Enrico Olivelli <eolive...@gmail.com>
> > > > wrote:
> > > >
> > > > > My guess is that it is related to using zk ACLs.
> > > > > I have no evidence.
> > > > >
> > > > > Enrico
> > > > >
> > > > >
> > > > > On Tue, May 15, 2018, 14:09 Enrico Olivelli <eolive...@gmail.com>
> > > > > wrote:
> > > > >
> > > > >> On Tue, May 15, 2018 at 14:04 Sijie Guo <guosi...@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >>> On Tue, May 15, 2018 at 4:45 AM, Enrico Olivelli <eolive...@gmail.com>
> > > > >>> wrote:
> > > > >>>
> > > > >>> > Hi,
> > > > >>> > for quite some time we have been seeing Bookies in staging
> > > > >>> > environments which disappear from ZK but apparently are still up
> > > > >>> > and running.
> > > > >>> >
> > > > >>> > I have not dug deeply into this problem, but at first glance it
> > > > >>> > should be related to ZK session expiration: those machines are
> > > > >>> > sometimes heavily loaded and it is not surprising that the ZK
> > > > >>> > session expires.
> > > > >>> >
> > > > >>>
> > > > >>> There should already be logic for re-registration after the session
> > > > >>> expires, no?
> > > > >>>
> > > > >>
> > > > >> Yes. The fact is that in the last month we have been seeing this
> > > > >> strange behaviour.
> > > > >> I don't know if it could be a regression on 4.7.
> > > > >> I have no reports from production sites, but in production we have
> > > > >> dedicated machines for bookies.
> > > > >>
> > > > >>
> > > > >>>
> > > > >>> ZooKeeper stats should always show whether a bookie is able to
> > > > >>> connect to zookeeper. That would probably tell you what happens.
> > > > >>>
> > > > >>
> > > > >> I will check, thank you for your suggestion.
> > > > >>
> > > > >> Enrico
> > > > >>
> > > > >>
> > > > >>
> > > > >>>
> > > > >>>
> > > > >>> >
> > > > >>> > Apart from searching for a bug, I wonder whether an automatic
> > > > >>> > self check of the bookie would be useful, something like a
> > > > >>> > periodic check which asks the Registration Manager if the bookie
> > > > >>> > is listed in the expected bookie list (readonly/available....)
> > > > >>> >
> > > > >>> > This would be useful even when we are not using ZK, now that we
> > > > >>> > have this great abstraction over ZK
> > > > >>> >
> > > > >>> > Thoughts ?
> > > > >>> >
> > > > >>> > Enrico
> > > > >>> >
> > > > >>>
> > > > >> --
> > > > >
> > > > >
> > > > > -- Enrico Olivelli
> > > > >
> > > > --
> > > >
> > > >
> > > > -- Enrico Olivelli
> > > >
> > >
> > --
> >
> >
> > -- Enrico Olivelli
> >
>
>
>
> --
> Jvrao
> ---
> First they ignore you, then they laugh at you, then they fight you, then
> you win. - Mahatma Gandhi
>
-- 


-- Enrico Olivelli
