lhotari opened a new pull request, #24784:
URL: https://github.com/apache/pulsar/pull/24784

   ### Motivation
   
   There is "PIP-234: Support using shared thread pool across multiple Pulsar 
client instance", https://github.com/apache/pulsar/issues/19074, which never 
went forward. The intention is that multiple Pulsar clients could share 
resources. This is needed not only for thread pools, but also for the Netty DNS 
resolver cache, which is represented by 
`io.netty.resolver.dns.DnsAddressResolverGroup` in Netty.
   
   PIP-234 has been discussed in the past in these threads:
   * https://lists.apache.org/thread/5jw06hqlmwnrgvbn9lfom1vkwhwqwwd4
   * https://lists.apache.org/thread/5obfm17g58n3dnbzyxg57vokgmwyp6hx

   Since we don't want to expose Netty internals on the public API, a 
`PulsarClientGroup` API has been discussed earlier to abstract this.
   
   Before PIP-234 becomes a reality, it's useful to have an internal API for 
sharing Netty's DnsAddressResolverGroup across multiple PulsarClient instances.
   
   There's already an internal API for sharing instances: the 
Lombok-generated builder of `PulsarClientImpl`:
   
https://github.com/apache/pulsar/blob/a66e8068058664d65fe71d5d711a14a898840b46/pulsar-client/src/main/java/org/apache/pulsar/client/impl/PulsarClientImpl.java#L198-L203
   
   This PR adds a class `org.apache.pulsar.client.impl.DnsResolverGroupImpl` 
which could later be abstracted by an interface whenever we get to implement 
PIP-234's `PulsarClientGroup` abstraction.
   
   In addition, this PR contains changes to use a shared DnsResolverGroup for 
Pulsar broker clients.
   Resource sharing was added incrementally in the past in 
https://github.com/apache/pulsar/pull/12037, 
https://github.com/apache/pulsar/pull/13836, and 
https://github.com/apache/pulsar/pull/13839.
   
   The problem that this could solve is heavy load on the DNS server when a 
DNS entry expires and many clients look up the same entry at once. This was 
already addressed in the past for Pulsar Proxy in 
https://github.com/apache/pulsar/pull/15403.
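
   To illustrate the effect, here is a minimal, self-contained sketch of why a 
shared resolver cache reduces upstream DNS queries. It does not use Pulsar or 
Netty classes: `CachingResolver`, `Client`, and `countLookups` are hypothetical 
stand-ins, and the fake upstream lookup simply counts calls and returns a 
documentation address.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class SharedResolverSketch {
    /** Hypothetical stand-in for a caching DNS resolver; not a Pulsar or Netty class. */
    static final class CachingResolver {
        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final Function<String, String> upstream;

        CachingResolver(Function<String, String> upstream) {
            this.upstream = upstream;
        }

        String resolve(String host) {
            // The upstream lookup runs only when the host is not in this cache.
            return cache.computeIfAbsent(host, upstream);
        }
    }

    /** Hypothetical stand-in for a client that resolves the broker address when connecting. */
    static final class Client {
        private final CachingResolver resolver;

        Client(CachingResolver resolver) {
            this.resolver = resolver;
        }

        String connect(String host) {
            return resolver.resolve(host);
        }
    }

    /** Counts upstream queries made when the given number of clients connect to one host. */
    static int countLookups(int clients, boolean shared) {
        AtomicInteger upstreamQueries = new AtomicInteger();
        Function<String, String> fakeDns = host -> {
            upstreamQueries.incrementAndGet();
            return "192.0.2.10"; // documentation-only address (RFC 5737)
        };
        CachingResolver sharedResolver = new CachingResolver(fakeDns);
        for (int i = 0; i < clients; i++) {
            Client client = new Client(shared ? sharedResolver : new CachingResolver(fakeDns));
            client.connect("broker.example.com");
        }
        return upstreamQueries.get();
    }

    public static void main(String[] args) {
        System.out.println("per-client caches: " + countLookups(100, false)); // 100
        System.out.println("shared cache: " + countLookups(100, true));       // 1
    }
}
```

   With per-client caches every client misses independently after an entry 
expires, so the query count grows with the number of clients. With one shared 
cache a single lookup serves all of them.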
   
   Similar problems are present in Flink Pulsar use cases where each Pulsar 
sink and source creates its own Pulsar client instance. It would be useful to 
have PIP-234 available for addressing that with a public Pulsar client API. 
However, in the meantime, the internal API in this PR could be used as a 
workaround.
   
   It should also be noted that the Kubernetes default `ndots:5` configuration 
adds heavy load on the DNS server.
   This article explains [the ndots 5 
issue](https://pracucci.com/kubernetes-dns-resolution-ndots-options-and-why-it-may-affect-application-performances.html).
 The way to address it for the service URL is to add a trailing dot to 
make the DNS name [an absolute 
FQDN](https://www.f5.com/glossary/fqdn#:~:text=Trailing%20dot). There's also a 
Pulsar discussion at https://github.com/apache/pulsar/discussions/24030 with 
more information about unnecessary DNS lookups. Changing the service URL alone 
isn't sufficient: there isn't a direct feature to make the Pulsar broker return 
addresses in absolute FQDN DNS name format for Pulsar topic lookups. A 
similar problem exists for the Pulsar Proxy. I'll create a separate PR to 
address that.
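
   As an illustration of the trailing-dot approach for the service URL, here is 
a small sketch that appends a dot to each hostname in a service URL. The class 
`AbsoluteFqdnServiceUrl` and its methods are hypothetical helpers, not part of 
the Pulsar client. The sketch assumes that the hostnames in the URL are already 
fully qualified; a short name that relies on the resolver's search list would 
stop resolving if a dot were appended, so single-label names, IP literals, and 
names that already end with a dot are left unchanged.

```java
import java.util.ArrayList;
import java.util.List;

public class AbsoluteFqdnServiceUrl {
    /**
     * Hypothetical helper: appends a trailing dot to each multi-label hostname
     * in a service URL such as pulsar://host1:6650,host2:6650.
     */
    static String toAbsolute(String serviceUrl) {
        int schemeEnd = serviceUrl.indexOf("://");
        if (schemeEnd < 0) {
            throw new IllegalArgumentException("Missing scheme: " + serviceUrl);
        }
        String scheme = serviceUrl.substring(0, schemeEnd + 3);
        String rest = serviceUrl.substring(schemeEnd + 3);
        // Keep any path part untouched.
        int pathStart = rest.indexOf('/');
        String authority = pathStart < 0 ? rest : rest.substring(0, pathStart);
        String path = pathStart < 0 ? "" : rest.substring(pathStart);

        List<String> hosts = new ArrayList<>();
        for (String hostPort : authority.split(",")) {
            hosts.add(makeHostAbsolute(hostPort));
        }
        return scheme + String.join(",", hosts) + path;
    }

    static String makeHostAbsolute(String hostPort) {
        if (hostPort.startsWith("[")) {
            return hostPort; // bracketed IPv6 literal, nothing to resolve
        }
        int colon = hostPort.lastIndexOf(':');
        String host = colon < 0 ? hostPort : hostPort.substring(0, colon);
        String port = colon < 0 ? "" : hostPort.substring(colon);
        boolean singleLabel = !host.contains(".");
        if (singleLabel || host.endsWith(".") || isIpv4Literal(host)) {
            return hostPort; // leave short names, absolute names, and IPs as-is
        }
        return host + "." + port;
    }

    static boolean isIpv4Literal(String host) {
        return host.matches("\\d{1,3}(\\.\\d{1,3}){3}");
    }

    public static void main(String[] args) {
        System.out.println(toAbsolute("pulsar://pulsar-broker.pulsar.svc.cluster.local:6650"));
    }
}
```

   This only covers the initial service URL. As noted above, the addresses 
returned for topic lookups would still need separate handling.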
   
   ### Modifications
   
   - add internal DnsResolverGroupImpl abstraction for wrapping Netty's 
DnsAddressResolverGroup and for preparing for PIP-234
   - make it possible to instantiate PulsarClientImpl with a shared 
DnsResolverGroupImpl instance
   - use a shared DnsResolverGroupImpl instance for Pulsar broker clients
   
   ### Documentation
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   - [ ] `doc` <!-- Your PR contains doc changes. -->
   - [ ] `doc-required` <!-- Your PR changes impact docs and you will update 
later -->
   - [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
   - [ ] `doc-complete` <!-- Docs have been already added -->
   

