Here's an update on my progress. I reproduced Dave's observations with
the following steps:

1. Download and unpack 3.0.0 RC 1.

2. Add these two settings to the conf/standalone.conf:
exposingBrokerEntryMetadataToClientEnabled=true
brokerEntryMetadataInterceptors=org.apache.pulsar.common.intercept.AppendIndexMetadataInterceptor,org.apache.pulsar.common.intercept.AppendBrokerTimestampMetadataInterceptor

3. Run bin/pulsar standalone

4. Run curl http://localhost:8080/admin/v2/brokers/health

5. View the following log:

2023-04-13T16:48:39,256-0500
[broker-topic-workers-OrderedExecutor-7-0] ERROR
org.apache.pulsar.common.protocol.Commands -
[PersistentSubscription{topic=persistent://pulsar/standalone/localhost:8080/healthcheck,
name=healthCheck-012d2e05-371f-4ddc-a666-daf9fa243d87}] [-1] Failed to
parse message metadata
java.lang.IndexOutOfBoundsException: readerIndex(100) + length(2)
exceeds writerIndex(100): UnpooledDuplicatedByteBuf(ridx: 100, widx:
100, cap: 100, unwrapped: CompositeByteBuf(ridx: 100, widx: 100, cap:
100, components=2))
    at 
io.netty.buffer.AbstractByteBuf.checkReadableBytes0(AbstractByteBuf.java:1442)
~[io.netty-netty-buffer-4.1.89.Final.jar:4.1.89.Final]
    at io.netty.buffer.AbstractByteBuf.readShort(AbstractByteBuf.java:749)
~[io.netty-netty-buffer-4.1.89.Final.jar:4.1.89.Final]
    at 
org.apache.pulsar.common.protocol.Commands.skipBrokerEntryMetadataIfExist(Commands.java:1692)
~[org.apache.pulsar-pulsar-common-3.0.0.jar:3.0.0]
    at 
org.apache.pulsar.common.protocol.Commands.parseMessageMetadata(Commands.java:452)
~[org.apache.pulsar-pulsar-common-3.0.0.jar:3.0.0]
    at 
org.apache.pulsar.common.protocol.Commands.parseMessageMetadata(Commands.java:445)
~[org.apache.pulsar-pulsar-common-3.0.0.jar:3.0.0]
    at 
org.apache.pulsar.common.protocol.Commands.peekMessageMetadata(Commands.java:1899)
~[org.apache.pulsar-pulsar-common-3.0.0.jar:3.0.0]
    at 
org.apache.pulsar.common.protocol.Commands.peekAndCopyMessageMetadata(Commands.java:1918)
~[org.apache.pulsar-pulsar-common-3.0.0.jar:3.0.0]
    at 
org.apache.pulsar.broker.service.AbstractBaseDispatcher.filterEntriesForConsumer(AbstractBaseDispatcher.java:142)
~[org.apache.pulsar-pulsar-broker-3.0.0.jar:3.0.0]
    at 
org.apache.pulsar.broker.service.AbstractBaseDispatcher.filterEntriesForConsumer(AbstractBaseDispatcher.java:100)
~[org.apache.pulsar-pulsar-broker-3.0.0.jar:3.0.0]
    at 
org.apache.pulsar.broker.service.persistent.PersistentDispatcherSingleActiveConsumer.internalReadEntriesComplete(PersistentDispatcherSingleActiveConsumer.java:210)
~[org.apache.pulsar-pulsar-broker-3.0.0.jar:3.0.0]
    at 
org.apache.pulsar.broker.service.persistent.PersistentDispatcherSingleActiveConsumer.lambda$readEntriesComplete$1(PersistentDispatcherSingleActiveConsumer.java:151)
~[org.apache.pulsar-pulsar-broker-3.0.0.jar:3.0.0]
    at 
org.apache.bookkeeper.common.util.SingleThreadExecutor.safeRunTask(SingleThreadExecutor.java:137)
~[org.apache.bookkeeper-bookkeeper-common-4.16.0.jar:4.16.0]
    at 
org.apache.bookkeeper.common.util.SingleThreadExecutor.run(SingleThreadExecutor.java:113)
~[org.apache.bookkeeper-bookkeeper-common-4.16.0.jar:4.16.0]
    at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
~[io.netty-netty-common-4.1.89.Final.jar:4.1.89.Final]
    at java.lang.Thread.run(Thread.java:833) ~[?:?]

Now that we have a consistent way to reproduce it, I'll work on
understanding the root cause. I believe we run with these two settings
with other versions, so this feels like a regression. I cannot say for
sure yet.

Thanks,
Michael

On Thu, Apr 13, 2023 at 1:47 AM Enrico Olivelli <eolive...@gmail.com> wrote:
>
> Dave,
> IIUC (from offline discussion) those tests started to fail only recently,
> as we run those tests against the master branch of apache/pulsar repo.
>
> Does the beginning of failures match the upgrade to BK 4.16 ?
> The problem may be related to the handling of ByteBufs, IIRC there are a
> few changes in the latest
> BK release.
>
> Enrico
>
> Il giorno gio 13 apr 2023 alle ore 05:02 Hang Chen <chenh...@apache.org> ha
> scritto:
>
> > Hi Dave,
> >     Thanks for verifying this release candidate. I have setup the Pulsar
> > cluster with 3.0 on my laptop and test the produce and consume, it works
> > fine. We also deploy the it on k8s, the produce, consume and health check
> > also works fine. Would you please share your broker configurations?
> >
> > Thanks,
> > Hang
> >
> > Dave Fisher <wave4d...@comcast.net> 于2023年4月13日周四 06:49写道:
> >
> >> Hi -
> >>
> >> We do End to end testing of Pulsar versions on GCP and with Pulsar 3.0
> >> today we have the following problem.
> >>
> >> This image shows the broker deployments not coming up. This is a non-TLS
> >> setup.
> >>
> >> [image: Screen Shot 2023-04-12 at 1.09.59 PM.png]
> >>
> >> The Broker Pods have errors like this one:
> >>
> >> 2023-04-12T20:10:35,995+0000 [broker-topic-workers-OrderedExecutor-3-0] 
> >> ERROR org.apache.pulsar.common.protocol.Commands - 
> >> [PersistentSubscription{topic=persistent://pulsar/pulsar/10.236.0.4:8080/healthcheck,
> >>  name=healthCheck-04a8e22e-3e0b-4aa3-9141-d8970c1b0712}] [-1] Failed to 
> >> parse message metadata
> >> java.lang.IndexOutOfBoundsException: readerIndex(96) + length(2) exceeds 
> >> writerIndex(96): UnpooledDuplicatedByteBuf(ridx: 96, widx: 96, cap: 96, 
> >> unwrapped: CompositeByteBuf(ridx: 96, widx: 96, cap: 96, components=2))
> >>      at 
> >> io.netty.buffer.AbstractByteBuf.checkReadableBytes0(AbstractByteBuf.java:1442)
> >>  ~[io.netty-netty-buffer-4.1.89.Final.jar:4.1.89.Final]
> >>      at 
> >> io.netty.buffer.AbstractByteBuf.readShort(AbstractByteBuf.java:749) 
> >> ~[io.netty-netty-buffer-4.1.89.Final.jar:4.1.89.Final]
> >>      at 
> >> org.apache.pulsar.common.protocol.Commands.skipBrokerEntryMetadataIfExist(Commands.java:1692)
> >>  ~[org.apache.pulsar-pulsar-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> >>      at 
> >> org.apache.pulsar.common.protocol.Commands.parseMessageMetadata(Commands.java:452)
> >>  ~[org.apache.pulsar-pulsar-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> >>      at 
> >> org.apache.pulsar.common.protocol.Commands.parseMessageMetadata(Commands.java:445)
> >>  ~[org.apache.pulsar-pulsar-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> >>      at 
> >> org.apache.pulsar.common.protocol.Commands.peekMessageMetadata(Commands.java:1899)
> >>  ~[org.apache.pulsar-pulsar-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> >>      at 
> >> org.apache.pulsar.common.protocol.Commands.peekAndCopyMessageMetadata(Commands.java:1918)
> >>  ~[org.apache.pulsar-pulsar-common-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> >>      at 
> >> org.apache.pulsar.broker.service.AbstractBaseDispatcher.filterEntriesForConsumer(AbstractBaseDispatcher.java:142)
> >>  ~[org.apache.pulsar-pulsar-broker-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> >>      at 
> >> org.apache.pulsar.broker.service.AbstractBaseDispatcher.filterEntriesForConsumer(AbstractBaseDispatcher.java:100)
> >>  ~[org.apache.pulsar-pulsar-broker-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> >>      at 
> >> org.apache.pulsar.broker.service.persistent.PersistentDispatcherSingleActiveConsumer.internalReadEntriesComplete(PersistentDispatcherSingleActiveConsumer.java:210)
> >>  ~[org.apache.pulsar-pulsar-broker-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> >>      at 
> >> org.apache.pulsar.broker.service.persistent.PersistentDispatcherSingleActiveConsumer.lambda$readEntriesComplete$1(PersistentDispatcherSingleActiveConsumer.java:151)
> >>  ~[org.apache.pulsar-pulsar-broker-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> >>      at 
> >> org.apache.bookkeeper.common.util.SingleThreadExecutor.safeRunTask(SingleThreadExecutor.java:137)
> >>  ~[org.apache.bookkeeper-bookkeeper-common-4.16.0.jar:4.16.0]
> >>      at 
> >> org.apache.bookkeeper.common.util.SingleThreadExecutor.run(SingleThreadExecutor.java:113)
> >>  ~[org.apache.bookkeeper-bookkeeper-common-4.16.0.jar:4.16.0]
> >>      at 
> >> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> >>  ~[io.netty-netty-common-4.1.89.Final.jar:4.1.89.Final]
> >>      at java.lang.Thread.run(Thread.java:833) ~[?:?]
> >>
> >> We looked into this but aren’t sure what would cause this to suddenly be
> >> an issue. The candidate would be differences in the new version of BK …
> >>
> >> Best,
> >> Dave
> >>
> >

Reply via email to