I support this proposal.
I can't understand if this is still a VOTE thread.

If it is a VOTE thread then this is my +1 (binding)

Sorry for late reply

Enrico

Il Gio 3 Ago 2023, 21:43 Rajan Dhabalia <rdhaba...@apache.org> ha scritto:

> Hi Penghui.
>
> I think I tried to answer most of the questions and have enough usecases
> for this PIP as you also mentioned in your last email. Users can use this
> feature for any additional reasons like security, or convenience depending
> on their usecases. I think I have also answered all the concerns and
> questions as it's been going for a long time and I would like to see it
> moving forward and would like to see any suggestions or feedback for API or
> any changes in the proposal.
>
> Thanks,
> Rajan
>
> On Wed, Aug 2, 2023 at 7:06 PM Rajan Dhabalia <rdhaba...@apache.org>
> wrote:
>
> > >> If so, I think it could be a good reason for introducing binary
> > protocol support here. For the security sensitive users like financial
> > application. Usually they will try to reduce the dependencies (less
> > dependencies, less
> > CVEs and the exposed service endpoints. For example, the flink connector
> > also have pulsar-admin dependency but some of the users want to remove
> it.
> >
> > Yes, that could be another usecase to have such API available for users.
> > So, I would like to bump up this discussion again and see if we have any
> > other suggestions or concerns as we have multiple users who need it and
> > would like to move forward with this API to serve those usecases.
> >
> > Thanks,
> > Rajan
> >
> > On Mon, Jun 26, 2023 at 9:03 PM PengHui Li <peng...@apache.org> wrote:
> >
> >> Hi Rajan,
> >>
> >> Thanks for the explanation
> >>
> >> > This feature helps them to avoid multiple different extra
> >> efforts
> >>
> >> If I understand correctly. You want to say users don't want to
> >> add the pulsar-admin dependency or the the cluster don't want
> >> to expose the REST API to external (not the cluster admin or
> >> tenant admin)?
> >>
> >> If so, I think it could be a good reason for introducing binary
> >> protocol support here. For the security sensitive users like financial
> >> application.
> >>
> >> Usually they will try to reduce the dependencies (less dependencies,
> less
> >> CVEs)
> >> and the exposed service endpoints. For example, the flink connector
> >> also have pulsar-admin dependency but some of the users want to
> >> remove it.
> >>
> >> I want to say that from the perspective of improving performance,
> >> it may not be more convincing than the above reason.
> >>
> >> Thanks,
> >> Penghui
> >>
> >>
> >> On Mon, Jun 26, 2023 at 3:37 PM Rajan Dhabalia <rdhaba...@apache.org>
> >> wrote:
> >>
> >> > > I do not deny that binary protocol has performance advantages. But
> >> maybe
> >> > the bottleneck is not the protocol level for now.
> >> >
> >> > Well, sure. We had serious issue with performance in past over http
> but
> >> > this feature we would mainly like to introduce it now for the
> >> applications
> >> > to enhance api user accessibility experience where we have these
> >> multiple
> >> > usecases where applications with large number of topics and high
> fanout
> >> > consumers would like to fetch stats and stats-internal to retrieve
> >> various
> >> > metadata for application startup and managing application state based
> on
> >> > managed-ledge states. You can think of producer/consumer stats api
> >> which is
> >> > used by many usecases in different scenarios of monitoring or state
> >> > management. This feature helps them to avoid multiple different extra
> >> > efforts and performance considerations, which helps to give clean and
> >> easy
> >> > experience for their application.
> >> > I am open to hear about alternative of json string and response schema
> >> > definition but keeping similar response-schema as admin-api response
> for
> >> > stats/stats-internal helps to give consistent view of stats across all
> >> APIs
> >> > and transferring json format helps to skip transformation stats
> >> definition
> >> > and we don't have to make any wire protocol changes whenever we add
> any
> >> new
> >> > field or state in response which makes this protocol response agnostic
> >> and
> >> > do not require maintenance for any changes in stats. So, we can
> consider
> >> > any suggestion but we have multiple usecaes which need this API and we
> >> > should implement it in a way where it can provide a consistent
> >> definition
> >> > view across other APIs without maintenance efforts while making any
> stat
> >> > response changes.
> >> >
> >> > Thanks,
> >> > Rajan
> >> >
> >> > On Sat, Jun 24, 2023 at 5:51 PM PengHui Li <peng...@apache.org>
> wrote:
> >> >
> >> > > > >> Main reasons are performance (you will see improvement due to
> >> http
> >> > > overhead compare to tcp, limited threads on http-server side and
> >> > definitely
> >> > > performance compare to netty-tcp vs http-jetty server), user
> >> feasibility
> >> > > (same reason why we have producer/consumer stats using binary
> protocol
> >> > > where user can get the stats from the same producer/consumer entity
> >> > without
> >> > > having additional client creation for the stats), scale (allowing
> >> large
> >> > > number of requests over binary protocol compare to http is more
> >> > scalable).
> >> > >
> >> > > I do not deny that binary protocol has performance advantages.
> >> > > But maybe the bottleneck is not the protocol level for now.
> >> > >
> >> > > IMO, we should have a clear performance expectation for getting the
> >> topic
> >> > > stats
> >> > > If you want to achieve 10k of requests per second.
> >> > > Maybe the protocol is not the bottleneck. Of course, if you want to
> >> reach
> >> > > 100k or
> >> > > even millions of requests on a single server per second.
> >> > > Exactly, the binary protocol will help. I'm curious whether it is a
> >> real
> >> > > case.
> >> > > Unless you have to query topic stats every second and the broker
> owned
> >> > > 100k topics.
> >> > >
> >> > > And I also saw some benchmark results for the HTTP servers.
> >> > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/HTTPCOMPONENTS/HttpCoreBenchmark
> >> > > It's not refreshed, but it's instructive. Try other HTTP servers
> >> should
> >> > > also be considered.
> >> > >
> >> > > The binary protocol for topic stats is an option. But it also has
> >> > > disadvantages and negative effects.
> >> > >
> >> > > - Make the binary protocol more complicated
> >> > > - All the clients need to catch up with the new protocol
> >> > >
> >> > > > >> SNI is layer-4 tcp layer protocol and does not support over the
> >> > HTTP.
> >> > >
> >> > > But you can just have an HTTP proxy to handle the redirect?
> >> > > Why SNI is required for getting topic stats?
> >> > > I'm not sure about the whole story here.
> >> > >
> >> > > > Well JSON would be the better option to avoid data conversion and
> >> copy
> >> > > for
> >> > > complex stats/internal-stats complex data-structures at broker and
> >> client
> >> > > side, maintaining consistency between stats/internal-stats
> >> pojo-schema,
> >> > and
> >> > > accessing stats with consistent pojo definition.
> >> > >
> >> > > This is from the perspective of developers who are already very
> >> familiar
> >> > > with Pulsar.
> >> > > If you don't know it, just want to create a client based on Pulsar's
> >> API.
> >> > > How to do it? For public API, I think a clear definition should be
> the
> >> > > first consideration.
> >> > >
> >> > > Thanks,
> >> > > Penghui
> >> > >
> >> > >
> >> > > On Thu, Jun 22, 2023 at 9:13 AM Rajan Dhabalia <
> rdhaba...@apache.org>
> >> > > wrote:
> >> > >
> >> > > > Please find the response inline.
> >> > > >
> >> > > > On Wed, Jun 21, 2023 at 5:53 PM PengHui Li <peng...@apache.org>
> >> wrote:
> >> > > >
> >> > > > > > However, stats retrieval over HTTP API doesn’t work well in
> use
> >> > cases
> >> > > > > when users would like to access this API at a higher scale when
> a
> >> > large
> >> > > > > number of application nodes would like to use it over HTTP which
> >> > could
> >> > > > > overload brokers and sometimes makes broker irresponsive and
> >> impact
> >> > > admin
> >> > > > > API performance. It’s also become difficult when Pulsar is
> >> deployed
> >> > in
> >> > > > the
> >> > > > > cloud behind the SNI proxy and applications also want to access
> >> > > > large-scale
> >> > > > > stats information periodically over different HTTP port instead
> it
> >> > > would
> >> > > > be
> >> > > > > better if applications can fetch stats over on same binary
> >> protocol
> >> > for
> >> > > > > scalability and accessibility reasons.
> >> > > > >
> >> > > > > From the motivation of the proposal, resolving the performance
> >> issue
> >> > > with
> >> > > > > the REST API
> >> > > > > is the goal. Sorry, I haven't run a benchmark for the REST API,
> >> have
> >> > > you
> >> > > > > tested it or could
> >> > > > > you please figure out where is the bottleneck with the REST API
> >> > > solution?
> >> > > > > And if users use
> >> > > > > the new approach, what is the performance expectation here, get
> a
> >> 30%
> >> > > or
> >> > > > > 50% improvement?
> >> > > > >
> >> > > > >> Main reasons are performance (you will see improvement due to
> >> http
> >> > > > overhead compare to tcp, limited threads on http-server side and
> >> > > definitely
> >> > > > performance compare to netty-tcp vs http-jetty server), user
> >> > feasibility
> >> > > > (same reason why we have producer/consumer stats using binary
> >> protocol
> >> > > > where user can get the stats from the same producer/consumer
> entity
> >> > > without
> >> > > > having additional client creation for the stats), scale (allowing
> >> large
> >> > > > number of requests over binary protocol compare to http is more
> >> > > scalable).
> >> > > >
> >> > > >
> >> > > > > And I'm not sure why it is difficult to fetch the stats behind
> the
> >> > SNI
> >> > > > > proxy, could you please explain
> >> > > > > more? It will be helpful for the reviewers to understand the
> >> issue we
> >> > > > want
> >> > > > > to resolve.
> >> > > > >
> >> > > > >> SNI is layer-4 tcp layer protocol and does not support over the
> >> > http.
> >> > > >
> >> > > >
> >> > > > > For the client side API changes. You have InternalStatsOption
> as a
> >> > > param
> >> > > > of
> >> > > > > the method, but
> >> > > > > I don't see any definition for it. It should be enum? And why
> not
> >> > have
> >> > > a
> >> > > > > pojo for it? It will be
> >> > > > > more easier for users to know what should be set and what should
> >> not.
> >> > > > >
> >> > > > I have added that param map after our last discussion which will
> be
> >> > enum
> >> > > > to pass additional flags.
> >> > > >
> >> > > >
> >> > > > > For the response data. The JSON string is not good for
> >> compatibility.
> >> > > > And I
> >> > > > > saw the discussion
> >> > > > > in the DISCUSSION thread from you and Enrico. But it's not a
> >> > guaranteed
> >> > > > > solution from the API's
> >> > > > > perspective. And for any other clients, cpp, go, rust, they must
> >> > build
> >> > > > > their own pojo based on
> >> > > > > the REST API.
> >> > > > >
> >> > > > Well JSON would be the better option to avoid data conversion and
> >> copy
> >> > > for
> >> > > > complex stats/internal-stats complex data-structures at broker and
> >> > client
> >> > > > side, maintaining consistency between stats/internal-stats
> >> pojo-schema,
> >> > > and
> >> > > > accessing stats with consistent pojo definition.
> >> > > >
> >> > > >
> >> > > > > Thanks,
> >> > > > > Penghui
> >> > > > >
> >> > > > > On Tue, Jun 20, 2023 at 3:06 PM Rajan Dhabalia <
> >> rdhaba...@apache.org
> >> > >
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Hi,
> >> > > > > >
> >> > > > > > I would like to start VOTE for :
> >> > > > > > https://github.com/apache/pulsar/issues/20265
> >> > > > > >
> >> > > > > > Thanks,
> >> > > > > > Rajan
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>

Reply via email to