2019-06-21 21:17:03 UTC - Sinan Bir: @Sinan Bir has joined the channel
----
2019-06-21 23:33:18 UTC - Fredrick P Eisele: I have encountered what appears to
be a race condition in Pulsar.
The Pulsar cluster is running on OpenStack. The configuration consists of three
ZooKeeper virtual machines and three Pulsar+BookKeeper virtual machines.
I run a command such as the following:
./bin/pulsar-admin brokers list use
Sometimes I get what I expect:
pulsar01:8080
pulsar02:8080
pulsar03:8080
And sometimes I get:
null
Reason: javax.ws.rs.ProcessingException: Connection refused:
localhost/127.0.0.1:8080
----
2019-06-21 23:37:14 UTC - Fredrick P Eisele: If I run with `strace -f -e
trace=network ./bin/pulsar-admin brokers list use` I get the following:
Successful:
excerpt
strace: Process conn:02 attached
[pid conn:02 connect(288, {sa_family=AF_INET, sin_port=htons(8080),
sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in
progress)
[pid conn:02 getsockopt(288, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
[pid conn:02 getsockname(288, {sa_family=AF_INET, sin_port=htons(59512),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
[pid conn:02 getsockname(288, {sa_family=AF_INET, sin_port=htons(59512),
sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
The failure mode retries a few times before giving up.
[pid conn:02 socketpair(AF_UNIX, SOCK_STREAM, 0, [289, 290]) = 0
[pid conn:02 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x8}
---
Failure:
excerpt
strace: Process conn:02 attached
[pid conn:02 connect(288, {sa_family=AF_INET, sin_port=htons(8080),
sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in
progress)
[pid conn:02 getsockopt(288, SOL_SOCKET, SO_ERROR, [111], [4]) = 0
[pid conn:02 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x8}
---
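For reference, the connect / getsockopt sequence in the traces above looks like the usual non-blocking connect pattern: connect() returns EINPROGRESS, and the real outcome is read later from SO_ERROR (0 on success, 111 on failure here). Below is a minimal Python sketch of that pattern, assuming Linux. The `probe` function name and the timeout value are made up for illustration; this is not how pulsar-admin itself is implemented.

```python
import errno
import select
import socket


def probe(host, port, timeout=2.0):
    """Hypothetical helper: non-blocking connect, then read SO_ERROR.

    Returns 0 if the connection was established, otherwise an errno value
    such as errno.ECONNREFUSED.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setblocking(False)
        # On a non-blocking socket this normally returns EINPROGRESS
        # instead of waiting for the handshake to finish.
        rc = sock.connect_ex((host, port))
        if rc not in (0, errno.EINPROGRESS):
            return rc  # failed immediately
        # The socket becomes writable once the attempt has completed,
        # whether it succeeded or failed.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            return errno.ETIMEDOUT
        # SO_ERROR holds the final result of the connection attempt.
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    finally:
        sock.close()


if __name__ == "__main__":
    result = probe("127.0.0.1", 8080)
    print(result, errno.errorcode.get(result, "OK" if result == 0 else "unknown"))
```

Running this a few times against 127.0.0.1:8080 on the affected host might help show whether the refusal is intermittent at the socket level, independent of the admin client.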
----
2019-06-21 23:42:44 UTC - Fredrick P Eisele: The 111 error code is ECONNREFUSED
per /usr/include/asm-generic/errno.h
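A quick way to double-check an errno value without reading the header is Python's `errno` module. This is only a sketch: the `describe` helper is a made-up name, and the numeric values are platform-specific (111 maps to ECONNREFUSED on Linux, other systems may use a different number).

```python
import errno
import os


def describe(code):
    """Hypothetical helper: map an errno number to (symbolic name, message)."""
    name = errno.errorcode.get(code, "unknown")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 111 is the SO_ERROR value seen in the failing trace.
    print(describe(111))
```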
----