Communicating over the binary protocol is more scalable and performant than
HTTP. The admin API over HTTP has a long history of bottlenecks and
performance issues, which at times also affected lookup requests; that was
the reason we introduced lookup over the binary protocol as well. We have
multiple use cases that require fetching stats at a relatively high rate, and
we would definitely like to avoid doing that over HTTP, where it could become
a bottleneck for those applications or for others.
This PIP doesn't mention security, so let's not misinterpret the use cases.
Sometimes Pulsar is deployed behind a proxy, potentially an SNI routing
proxy, which can't be used as an HTTP proxy, and we would like to let users
access stats for a given topic using the same broker-service URL rather than
having separate HTTP endpoints. So this API addresses scale, performance, and
accessibility in Pulsar.

Thanks,
Rajan

On Thu, May 11, 2023 at 6:24 AM Asaf Mesika <asaf.mes...@gmail.com> wrote:

> Before I dive into the PIP, I have several questions on the background
> provided below:
>
>
> On Tue, May 9, 2023 at 9:08 AM Rajan Dhabalia <rdhaba...@apache.org>
> wrote:
>
> > Hi,
> >
> > Right now, Pulsar provides a topic's stats and stats-internal over the
> > HTTP admin API, and this stats data is used by user applications and also
> > by Pulsar internal components such as Pulsar Functions to derive certain
> > states of the applications.
> > For example, there are use cases where the application wants to check the
> > topic's backlog, the subscription's state (readPosition, list of
> > subscriptions), numberOfEntriesSinceFirstNotAckedMessage, etc. to
> > bootstrap the application or handle the application’s resiliency and
> > state dynamically. Applications can retrieve this stats information by
> > using the broker’s admin HTTP APIs.
> >
> > However, stats retrieval over the HTTP API doesn’t work well when users
> > need to access it at a higher scale: a large number of application nodes
> > calling it over HTTP can overload brokers, sometimes making a broker
> > unresponsive and degrading admin API performance. It also becomes
> > difficult when Pulsar is deployed in the cloud behind an SNI proxy and
> > applications also want to access large-scale stats information
> > periodically over different HTTP ports. Instead, it would be better if
> > applications could fetch stats over the same binary protocol for
> > scalability and accessibility reasons.
> >
>
> Why do you think using a binary protocol instead of HTTP would make it more
> performant to respond to multiple calls at once?
> Same question but for the security issue - why do you think the HTTP port
> of the admin API is harder to access than the binary protocol port?
>
>
>
>
> >
> > Therefore, there are multiple use cases where producer/consumer
> > applications need stats information for topics using the client library
> > over the binary protocol. Hence, this PIP introduces a client API for
> > producers and consumers to access topic stats/internal-stats information,
> > which can be used by applications as needed.
> >
> > Please visit and review the PIP:
> > https://github.com/apache/pulsar/issues/20265
> >
> >
> > Thanks,
> >
> > Rajan
> >
>
