For the sake of the VOTE, I am classifying this as a bug and plan to
cherry pick it to all active release branches.

Thanks,
Michael

On Tue, Oct 11, 2022 at 2:59 PM Michael Marshall <mmarsh...@apache.org> wrote:
>
> Hi All,
>
> I created PIP 212: https://github.com/apache/pulsar/issues/18012. It
> already has a PR https://github.com/apache/pulsar/pull/15640. It is a
> minor change that I view as more of a bug fix, but because it changes
> a default value, it needs a PIP.
>
> The main point for discussion is whether we can cherry pick this to
> active release branches. My view is that it should be cherry picked
> because the feature doesn't work out of the box.
>
> If there isn't much discussion, I'll start a vote later this week.
>
> Thanks,
> Michael
>
>
> ### Motivation
>
> The current Bookkeeper configuration defaults to using
> `org.apache.bookkeeper.net.ScriptBasedMapping` for the
> `DNSToSwitchMapping` implementation. However, this default
> configuration does not align with the Broker's default configuration,
> which is `org.apache.pulsar.zookeeper.ZkBookieRackAffinityMapping`. As
> such, the default configuration for a Pulsar cluster does not lead to
> ideal rack awareness when ledgers need to be recovered. The result is
> that a user can configure a cluster for rack awareness and the brokers
> will honor that configuration, but the autorecovery process will not
> because it does not have the correct bookkeeper cluster topology view.
>
> I propose we configure bookkeeper to use the broker's
> `ZkBookieRackAffinityMapping` class. That way, autorecovery will honor
> the operator's configured rack awareness policies out of the box.
>
> ### Goal
>
> Ensure consistent rack awareness policies.
>
> I propose this is a bug fix that requires patching all active versions
> of Pulsar.
>
> ### API Changes
>
> None
>
> ### Implementation
>
> See https://github.com/apache/pulsar/pull/15640.
>
> Add default value for `reppDnsResolverClass` to the
> `conf/bookkeeper.conf` configuration. This change effectively switches
> the default from `org.apache.bookkeeper.net.ScriptBasedMapping` to
> `org.apache.pulsar.zookeeper.ZkBookieRackAffinityMapping`.
>
> ### Alternatives
>
> The tradeoff is that a user relying on the `ScriptBasedMapping`
> default might accidentally get switched to using the
> `ZkBookieRackAffinityMapping` implementation. Given that
> `ScriptBasedMapping` doesn't work out of the box, and that the
> broker's default to `ZkBookieRackAffinityMapping`, I think this is an
> acceptable tradeoff.
>
> ### Anything else?
>
> I manually verified that the `ZkBookieRackAffinityMapping` works by
> running some tests in a minikube cluster deployed with the DataStax
> helm chart for Apache Pulsar. I set up 3 racks, 4 bookies, and a topic
> with a E=2, Qw=2, and Qa=2. I then verified that the autorecovery pod
> correctly discovered the racks and then identified when an ensemble
> was not following the rack placement policy after two bookies were
> removed. I documented my testing a bit more here:
> https://github.com/datastax/pulsar-helm-chart/pull/214.

Reply via email to