Thank you Ivan!
I hope I did not mess up the dump by including the ZK ports. We are not using
standard ports, and those 3 machines also host the 3-node ZK ensemble
that supports BK and all the other parts of the application.

So one explanation would be that something is connecting to the bookie, and
this puts the bookie into a corrupted state by double-releasing a
ByteBuf?

Enrico

On Fri, 9 Mar 2018, 18:23 Ivan Kelly <iv...@apache.org> wrote:

> I need to sign off for the day. I've done some analysis of a tcpdump
> Enrico sent me out of band (it may contain sensitive info, so best not
> to post it on a public forum).
>
> I've attached a dump of just the first bit of each header. The format is
> <sequence in dump> <whether a request or response>(<remote port>)
> <hexdump of payload>
>
> There are definitely corrupt packets coming from somewhere. Search for
> lines with CORRUPT.
>
> <snip>
> 0247 -  req (049546) - 00:00:00:08:ff:ff:ff:fe:00:00:00:0b    CORRUPT
> </snip>
>
> It's not clear whether these originate at a valid client or not.
> They trigger corrupt responses from the server, which I guess is the
> double free manifesting itself. Strangely, the corrupt message seems to
> have a lot of data in common with what seems like an OK message (it's
> clearer in a fixed-width font).
>
> <snip>
> 0248 -  resp(049720) - 00:00:00:54:00:03:00:89:00:00:02:86:00:07:e2:b1:00:00:00:00:00:00:02:86:00:05:e9:76:00:00
> 0249 -  resp(049546) - 00:00:00:10:ff:ff:ff:fe:00:00:02:86:00:07:e2:b1:00:00:00:00    CORRUPT
> </snip>
>
> There's also some other weird traffic. Correct BK protobuf traffic
> should be <4 bytes len>:00:03:....
> There seems to be other traffic that is being accepted on the same
> port but looks like ZK traffic.
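
[Editor's sketch, not part of Ivan's message.] The framing rule above (a 4-byte length followed by 00:03) can be checked mechanically against the dump lines. The snippet below is a minimal sketch in Python. It assumes each dump line has the layout Ivan describes, and the function names `parse_line` and `looks_like_bk_protobuf` are hypothetical helpers, not part of BookKeeper or any tool used in the thread. It only tests the two bytes after the length field and makes no claim about what they mean.

```python
import re

# Assumed layout of one dump line, taken from the description in the thread:
#   <sequence> - <req|resp>(<remote port>) - <hexdump of payload> [CORRUPT]
LINE_RE = re.compile(r"^(\d+)\s*-\s*(req|resp)\s*\((\d+)\)\s*-\s*([0-9a-f:]+)")


def parse_line(line):
    """Return (sequence, kind, port, payload bytes), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    seq, kind, port, hexdump = m.groups()
    payload = bytes.fromhex(hexdump.replace(":", ""))
    return int(seq), kind, int(port), payload


def looks_like_bk_protobuf(payload):
    """Apply the rule quoted above: 4 length bytes, then 00:03."""
    if len(payload) < 6:
        return False  # too short to contain the length and the two marker bytes
    return payload[4:6] == b"\x00\x03"


if __name__ == "__main__":
    sample = [
        "0247 -  req (049546) - 00:00:00:08:ff:ff:ff:fe:00:00:00:0b    CORRUPT",
        "0248 -  resp(049720) - 00:00:00:54:00:03:00:89:00:00:02:86:00:07:e2:b1",
    ]
    for line in sample:
        seq, kind, port, payload = parse_line(line)
        verdict = "ok" if looks_like_bk_protobuf(payload) else "suspect"
        print(seq, kind, port, verdict)
```

Running the same check over the full dump would flag every frame whose header does not follow the expected pattern, including the ZK-looking traffic, without relying on the hand-added CORRUPT markers.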
>
> Anyhow, I'll dig more on Monday.
>
> -Ivan
>
>
> On Fri, Mar 9, 2018 at 3:27 PM, Ivan Kelly <iv...@apache.org> wrote:
> > On Fri, Mar 9, 2018 at 3:20 PM, Enrico Olivelli <eolive...@gmail.com> wrote:
> >> Bookies
> >> 10.168.10.117:1822 -> bad bookie with 4.1.21
> >> 10.168.10.116:1822 -> bookie with 4.1.12
> >> 10.168.10.118:1281 -> bookie with 4.1.12
> >>
> >> 10.168.10.117 is also the client machine, on which I run the 4.1.21
> >> client (a different process from the bookie)
> > Oh. This dump won't have the stream we need then, as that will be on
> > loopback. Try adding "-i any" to the tcpdump. Sorry, I didn't realize
> > your clients and servers are colocated.
> >
> > -Ivan
>
-- 


-- Enrico Olivelli
