[ANNOUNCE] Apache Pulsar 2.9.3 released

2022-07-20 Thread mattison chao
The Apache Pulsar team is proud to announce Apache Pulsar version 2.9.3.

Pulsar is a highly scalable, low-latency messaging platform running on
commodity hardware. It provides simple pub-sub semantics over topics,
guaranteed at-least-once delivery of messages, automatic cursor management for
subscribers, and cross-datacenter replication.

For Pulsar release details and downloads, visit:
https://pulsar.apache.org/download

Release Notes are at: https://pulsar.apache.org/release-notes/#2100

We would like to thank the contributors who made this release possible.

Regards,

The Pulsar Team


[GitHub] [pulsar-helm-chart] Anmol057 commented on pull request #274: Bump Apache Pulsar 2.10.1

2022-07-20 Thread GitBox


Anmol057 commented on PR #274:
URL: 
https://github.com/apache/pulsar-helm-chart/pull/274#issuecomment-1190154668

   Hi, when can we expect 2.10.1 to be released?





[GitHub] [pulsar-helm-chart] Anmol057 commented on issue #267: helm chart for pulsar version 2.10.0

2022-07-20 Thread GitBox


Anmol057 commented on issue #267:
URL: 
https://github.com/apache/pulsar-helm-chart/issues/267#issuecomment-1190155738

   Hi, when can we expect the 2.10.0 version to be released?





[GitHub] [pulsar-site] SignorMercurio commented on pull request #147: Generate config docs from source code

2022-07-20 Thread GitBox


SignorMercurio commented on PR #147:
URL: https://github.com/apache/pulsar-site/pull/147#issuecomment-1190258286

   @Anonymitaet PTAL





[GitHub] [pulsar-helm-chart] MBcom commented on pull request #233: feat(certs): use actual v1 spec for certs

2022-07-20 Thread GitBox


MBcom commented on PR #233:
URL: 
https://github.com/apache/pulsar-helm-chart/pull/233#issuecomment-1190300067

   +1
   We were able to successfully upgrade to v2.9 after making these changes.





Re: PIP-187 Add API to analyse a subscription backlog and provide an accurate value

2022-07-20 Thread Asaf Mesika
I'm not sure I understand the context exactly:

You say that today we can only know the number of entries, hence we'll have an
inaccurate backlog count for a subscription, since:
1. One entry contains multiple messages (a batch message)
2. A subscription may have a filter, which requires reading the entire
backlog to compute the count

There are two things I don't understand:

1. We're adding an observability API for which you need to pay the full read
cost just to learn the count. I presume people would want to run this more
than once, so they will read the same data multiple times - why would a user
be willing to pay such a hefty price?
2. If the user needs to know an accurate backlog, can't they split their data
across a very large number of topics, so that they would know an accurate
backlog without the huge cost?

I have an idea, if that's ok:

What if you keep, as you said in your document, a metric counting
messages per filter upon write? When you update a filter / add a filter
by adding a new subscription, you can run code that reads from the
beginning of the subscription (first unacked message) to catch up and then
continues. This may be done async, so the metric will take some time to
catch up.
Amortized, this has less cost on the system overall compared to reading
all the messages multiple times to get a periodic size of the subscription.
Both solutions are expensive as opposed to doing nothing, of course. Both
have to be a well-documented, conscious choice.
WDYT?
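
A minimal sketch of the counter idea, in Java (all names here are hypothetical
and for illustration only - this is not an actual Pulsar API):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.LongAdder;

    // Hypothetical per-filter backlog counter, updated off the write path.
    public class PerFilterBacklogCounter {

        // filter id -> messages matching that filter and not yet acknowledged
        private final Map<String, LongAdder> backlog = new ConcurrentHashMap<>();

        // Called asynchronously for each published message, per filter.
        public void onPublished(String filterId, boolean matchesFilter) {
            if (matchesFilter) {
                backlog.computeIfAbsent(filterId, k -> new LongAdder()).increment();
            }
        }

        // Called when messages are acknowledged on the subscription.
        public void onAcknowledged(String filterId, long count) {
            backlog.computeIfAbsent(filterId, k -> new LongAdder()).add(-count);
        }

        // Approximate backlog; it may lag while the async catch-up is running.
        public long approximateBacklog(String filterId) {
            LongAdder adder = backlog.get(filterId);
            return adder == null ? 0 : adder.sum();
        }
    }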

Asaf


On Thu, Jul 14, 2022 at 10:34 AM Enrico Olivelli wrote:

> Hello,
> this is a PIP to implement a tool to analyse the subscription backlog
>
> Link: https://github.com/apache/pulsar/issues/16597
> Prototype: https://github.com/apache/pulsar/pull/16545
>
> Below you can find the proposal (I will amend the GH issue while we
> discuss, as usual)
>
> Enrico
>
> Motivation
>
> Currently there is no way to have an accurate backlog for a subscription:
>
> you have only the number of "entries", not messages
> server side filters (PIP-105) may filter out some messages
>
> Having the number of entries is sometimes not enough because with
> batch messages the amount of work on the Consumers is proportional to
> the number of messages, which may vary from entry to entry.
>
> Goal
>
> The idea of this patch is to provide a dedicated API (REST,
> pulsar-admin, and Java PulsarAdmin) to "analyse" a subscription and
> provide detailed information about what is expected to be delivered to
> Consumers.
>
> The operation will be quite expensive because we have to load the
> messages from storage and pass them to the filters, but due to the
> dynamic nature of Pulsar subscriptions there is no other way to have
> this value.
>
> One good strategy for monitoring/alerting is to set up alerts on the
> usual "stats" and use this new API to inspect the subscription more deeply,
> typically by issuing a manual command.
>
> API Changes
>
> internal ManagedCursor API:
>
> CompletableFuture scan(Predicate condition, long
> maxEntries, long timeOutMs);
>
> This method scans the Cursor from the lastMarkDelete position to the tail.
> There is a time limit and a maxEntries limit; these are needed in
> order to prevent huge (and useless) scans.
> The Predicate can stop the scan if it doesn't want to continue the
> processing for some reason.
>
> New REST API:
>
> @GET
> @Path("/{tenant}/{namespace}/{topic}/subscription/{subName}/analiseBacklog")
> @ApiOperation(value = "Analyse a subscription, by scanning all the
> unprocessed messages")
>
> public void analiseSubscriptionBacklog(
>         @Suspended final AsyncResponse asyncResponse,
>         @ApiParam(value = "Specify the tenant", required = true)
>         @PathParam("tenant") String tenant,
>         @ApiParam(value = "Specify the namespace", required = true)
>         @PathParam("namespace") String namespace,
>         @ApiParam(value = "Specify topic name", required = true)
>         @PathParam("topic") @Encoded String encodedTopic,
>         @ApiParam(value = "Subscription", required = true)
>         @PathParam("subName") String encodedSubName,
>         @ApiParam(value = "Is authentication required to perform
>         this operation")
>         @QueryParam("authoritative") @DefaultValue("false")
>         boolean authoritative) {
>
> API response model:
>
> public class AnaliseSubscriptionBacklogResult {
>     private long entries;
>     private long messages;
>
>     private long filterRejectedEntries;
>     private long filterAcceptedEntries;
>     private long filterRescheduledEntries;
>
>     private long filterRejectedMessages;
>     private long filterAcceptedMessages;
>     private long filterRescheduledMessages;
>
>     private boolean aborted;
> }
>
> The response contains "aborted=true" if the request has been aborted
> by some internal limitation, like a timeout or the scan hitting the max
> number of entries.
> We are not going to provide more details about the reason for the stop.
> It will 
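
For orientation, a hypothetical invocation of the proposed analysis API from
the Java PulsarAdmin client is sketched below. The method and result shapes
follow the prototype above and are assumptions, not a confirmed API; since the
scan is expensive, it is meant for occasional manual inspection, not polling.

    import org.apache.pulsar.client.admin.PulsarAdmin;

    public class AnalyseBacklogExample {
        public static void main(String[] args) throws Exception {
            PulsarAdmin admin = PulsarAdmin.builder()
                    .serviceHttpUrl("http://localhost:8080")
                    .build();

            // Hypothetical admin call mirroring the prototype REST endpoint;
            // it triggers a scan from the lastMarkDelete position to the tail.
            var result = admin.topics().analyzeSubscriptionBacklog(
                    "persistent://public/default/my-topic", "my-sub");

            System.out.println(result);
            admin.close();
        }
    }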

Re: [VOTE] PIP-187 Add API to analyze a subscription backlog and provide an accurate value

2022-07-20 Thread Asaf Mesika
Sorry to barge in on the vote - I forgot to send my reply to the discussion
2 days ago :)


On Tue, Jul 19, 2022 at 11:22 PM Nicolò Boschi  wrote:

> +1, thanks
>
> Nicolò Boschi
>
> On Tue, Jul 19, 2022 at 10:16 PM Christophe Bornet wrote:
>
> > +1
> >
> > On Tue, Jul 19, 2022 at 8:01 PM Andrey Yegorov <andrey.yego...@datastax.com> wrote:
> >
> > > +1
> > >
> > > On Tue, Jul 19, 2022 at 10:51 AM Dave Fisher  wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > I support this enhancement for when a user occasionally requires
> > > > accurate backlog stats. Once we bring this into service we can see if
> > > > further guardrails are required.
> > > >
> > > > Regards,
> > > > Dave
> > > >
> > > > > On Jul 19, 2022, at 10:02 AM, Enrico Olivelli wrote:
> > > > >
> > > > > This is the VOTE thread for PIP-187
> > > > >
> > > > > This is the GH issue:
> https://github.com/apache/pulsar/issues/16597
> > > > > This is the PR: https://github.com/apache/pulsar/pull/16545
> > > > >
> > > > > The vote is open for at least 48 hours
> > > > >
> > > > > Below you can find a copy of the text of the PIP
> > > > >
> > > > > Best regards
> > > > > Enrico
> > > > >
> > > > >
> > > > > Motivation
> > > > >
> > > > > Currently there is no way to have an accurate backlog for a subscription:
> > > > >
> > > > > you have only the number of "entries", not messages
> > > > > server side filters (PIP-105) may filter out some messages
> > > > >
> > > > > Having the number of entries is sometimes not enough because with
> > > > > batch messages the amount of work on the Consumers is proportional
> > > > > to the number of messages, which may vary from entry to entry.
> > > > >
> > > > > Goal
> > > > >
> > > > > The idea of this patch is to provide a dedicated API (REST,
> > > > > pulsar-admin, and Java PulsarAdmin) to "analyze" a subscription and
> > > > > provide detailed information about what is expected to be delivered
> > > > > to Consumers.
> > > > >
> > > > > The operation will be quite expensive because we have to load the
> > > > > messages from storage and pass them to the filters, but due to the
> > > > > dynamic nature of Pulsar subscriptions there is no other way to have
> > > > > this value.
> > > > >
> > > > > One good strategy for monitoring/alerting is to set up alerts on
> > > > > the usual "stats" and use this new API to inspect the subscription
> > > > > more deeply, typically by issuing a manual command.
> > > > >
> > > > > API Changes
> > > > >
> > > > > internal ManagedCursor API:
> > > > >
> > > > > CompletableFuture scan(Predicate condition, long
> > > > > maxEntries, long timeOutMs);
> > > > >
> > > > > This method scans the Cursor from the lastMarkDelete position to
> > > > > the tail.
> > > > > There is a time limit and a maxEntries limit; these are needed in
> > > > > order to prevent huge (and useless) scans.
> > > > > The Predicate can stop the scan if it doesn't want to continue the
> > > > > processing for some reason.
> > > > >
> > > > > New REST API:
> > > > >
> > > > > @GET
> > > > > @Path("/{tenant}/{namespace}/{topic}/subscription/{subName}/analyzeBacklog")
> > > > > @ApiOperation(value = "Analyze a subscription, by scanning all the
> > > > > unprocessed messages")
> > > > >
> > > > > public void analyzeSubscriptionBacklog(
> > > > >         @Suspended final AsyncResponse asyncResponse,
> > > > >         @ApiParam(value = "Specify the tenant", required = true)
> > > > >         @PathParam("tenant") String tenant,
> > > > >         @ApiParam(value = "Specify the namespace", required = true)
> > > > >         @PathParam("namespace") String namespace,
> > > > >         @ApiParam(value = "Specify topic name", required = true)
> > > > >         @PathParam("topic") @Encoded String encodedTopic,
> > > > >         @ApiParam(value = "Subscription", required = true)
> > > > >         @PathParam("subName") String encodedSubName,
> > > > >         @ApiParam(value = "Is authentication required to perform
> > > > >         this operation")
> > > > >         @QueryParam("authoritative") @DefaultValue("false")
> > > > >         boolean authoritative) {
> > > > >
> > > > > API response model:
> > > > >
> > > > > public class AnalyzeSubscriptionBacklogResult {
> > > > >     private long entries;
> > > > >     private long messages;
> > > > >
> > > > >     private long filterRejectedEntries;
> > > > >     private long filterAcceptedEntries;
> > > > >     private long filterRescheduledEntries;
> > > > >
> > > > >     private long filterRejectedMessages;
> > > > >     private long filterAcceptedMessages;
> > > > >     private long filterRescheduledMessages;
> > > > >
> > > > >     private boolean aborted;
> > > > > }
> > > > >
> > > > > The response contains "aborted=true" if the request has been
> > > > > aborted by some internal limitation, like a timeout or the scan
> > > > > hitting the max number of entries.
> > > > > 

[GitHub] [pulsar-helm-chart] MBcom opened a new issue, #278: Make included Prometheus server configurable

2022-07-20 Thread GitBox


MBcom opened a new issue, #278:
URL: https://github.com/apache/pulsar-helm-chart/issues/278

   **The Prometheus configuration is currently hardcoded. Extending it is not 
possible - e.g. connecting it to an external Alertmanager.**
   We want to connect the included Prometheus server to an external 
Alertmanager, which is currently not possible.
   
   **Describe the solution you'd like**
   Allow modifying the Prometheus configuration entirely in `values.yaml`.
   
   **Describe alternatives you've considered**
   We think that making only external Alertmanagers configurable would not 
cover all possible Prometheus configuration needs.
   
   **Additional context**
   -
   





[GitHub] [pulsar-helm-chart] MBcom opened a new issue, #279: Make included prometheus server configurable

2022-07-20 Thread GitBox


MBcom opened a new issue, #279:
URL: https://github.com/apache/pulsar-helm-chart/issues/279

   **The Prometheus configuration is currently hardcoded. Extending it is not 
possible - e.g. connecting it to an external Alertmanager.**
   We want to connect the included Prometheus server to an external 
Alertmanager, which is currently not possible.
   
   **Describe the solution you'd like**
   Allow modifying the Prometheus configuration entirely in `values.yaml`.
   
   **Describe alternatives you've considered**
   We think that making only external Alertmanagers configurable would not 
cover all possible Prometheus configuration needs.
   
   **Additional context**
   -
   





[GitHub] [pulsar-helm-chart] MBcom closed issue #279: Make included prometheus server configurable

2022-07-20 Thread GitBox


MBcom closed issue #279: Make included prometheus server configurable
URL: https://github.com/apache/pulsar-helm-chart/issues/279





[GitHub] [pulsar-helm-chart] MBcom opened a new pull request, #280: Make included prometheus server configurable

2022-07-20 Thread GitBox


MBcom opened a new pull request, #280:
URL: https://github.com/apache/pulsar-helm-chart/pull/280

   Fixes #278
   
   ### Motivation
   
   We want to connect the included Prometheus server to an external 
Alertmanager, which is currently not possible.
   
   ### Modifications
   
   We have moved the hard-coded configuration from 
`templates/prometheus-configmap.yaml` to `prometheus.configData` in 
`values.yaml`.
   
   ### Verifying this change
   
   - [ ] Make sure that the change passes the CI checks.
   





[GitHub] [pulsar-helm-chart] MBcom opened a new issue, #281: Allow setting nodePort Option for proxy service

2022-07-20 Thread GitBox


MBcom opened a new issue, #281:
URL: https://github.com/apache/pulsar-helm-chart/issues/281

   **Is your feature request related to a problem? Please describe.**
   We cannot set a specific nodePort for the proxy service. We want to pin 
a specific nodePort to make the chart redeployable in our infrastructure.
   
   **Describe the solution you'd like**
   Allow setting a specific nodePort in `values.yaml`.
   
   **Describe alternatives you've considered**
   -
   
   **Additional context**
   -
   





[GitHub] [pulsar-helm-chart] MBcom opened a new pull request, #282: Allow setting nodePort Option for proxy service

2022-07-20 Thread GitBox


MBcom opened a new pull request, #282:
URL: https://github.com/apache/pulsar-helm-chart/pull/282

   Fixes #281
   
   ### Motivation
   
   We cannot set a specific nodePort for the proxy service. We want to pin 
a specific nodePort to make the chart redeployable in our infrastructure.
   
   ### Modifications
   
   We add the `proxy.service.nodePort` option to the `values.yaml`. If set, it 
will be included in the proxy service definition 
(`templates/proxy-service.yaml`).
   
   ### Verifying this change
   
   - [ ] Make sure that the change passes the CI checks.
   





Re: PIP-187 Add API to analyse a subscription backlog and provide an accurate value

2022-07-20 Thread Enrico Olivelli
Asaf,

On Wed, Jul 20, 2022 at 3:40 PM Asaf Mesika wrote:
>
> I'm not sure I understand the context exactly:
>
> You say that today we can only know the number of entries, hence we'll have an
> inaccurate backlog count for a subscription, since:
> 1. One entry contains multiple messages (a batch message)
> 2. A subscription may have a filter, which requires reading the entire
> backlog to compute the count

correct

>
> There are two things I don't understand:
>
> 1. We're adding an observability API for which you need to pay the full read
> cost just to learn the count. I presume people would want to run this more
> than once, so they will read the same data multiple times - why would a user
> be willing to pay such a hefty price?

Sometimes it is, because processing a message may have a high cost.
So having 10 entries of 100 messages each does not correctly represent the
amount of work that must be done by the consumers,
and so the user may wish to have an exact count.

Having the filters adds more complexity because you cannot predict how
many entries will be filtered out


> 2. If the user needs to know an accurate backlog, can't they split their data
> across a very large number of topics, so that they would know an accurate
> backlog without the huge cost?

I can't understand why creating many topics would help.
Instead, with filters it is very likely that you have fewer topics
with many subscriptions with different filters.

As you don't know the filters while writing, you cannot route the
messages to specific topics; also, you would need to write the message to
potentially multiple topics, and that would be a huge write amplification
(think about a topic with 100 subscriptions).

>
> I have an idea, if that's ok:
>
> What if you keep, as you said in your document, a metric counting
> messages per filter upon write?
This is not possible, as described above.

> When you update a filter / add a filter
> by adding a new subscription, you can run code that reads from the
> beginning of the subscription (first unacked message) to catch up and then
> continues. This may be done async, so the metric will take some time to
> catch up.
> Amortized, this has less cost on the system overall compared to reading
> all the messages multiple times to get a periodic size of the subscription.
> Both solutions are expensive as opposed to doing nothing, of course. Both
> have to be a well-documented, conscious choice.
> WDYT?


Enrico
>
> Asaf
>
>
> On Thu, Jul 14, 2022 at 10:34 AM Enrico Olivelli 
> wrote:
>
> > Hello,
> > this is a PIP to implement a tool to analyse the subscription backlog
> >
> > Link: https://github.com/apache/pulsar/issues/16597
> > Prototype: https://github.com/apache/pulsar/pull/16545
> >
> > Below you can find the proposal (I will amend the GH issue while we
> > discuss, as usual)
> >
> > Enrico
> >
> > Motivation
> >
> > Currently there is no way to have an accurate backlog for a subscription:
> >
> > you have only the number of "entries", not messages
> > server side filters (PIP-105) may filter out some messages
> >
> > Having the number of entries is sometimes not enough because with
> > batch messages the amount of work on the Consumers is proportional to
> > the number of messages, which may vary from entry to entry.
> >
> > Goal
> >
> > The idea of this patch is to provide a dedicated API (REST,
> > pulsar-admin, and Java PulsarAdmin) to "analyse" a subscription and
> > provide detailed information about what is expected to be delivered to
> > Consumers.
> >
> > The operation will be quite expensive because we have to load the
> > messages from storage and pass them to the filters, but due to the
> > dynamic nature of Pulsar subscriptions there is no other way to have
> > this value.
> >
> > One good strategy for monitoring/alerting is to set up alerts on the
> > usual "stats" and use this new API to inspect the subscription more deeply,
> > typically by issuing a manual command.
> >
> > API Changes
> >
> > internal ManagedCursor API:
> >
> > CompletableFuture scan(Predicate condition, long
> > maxEntries, long timeOutMs);
> >
> > This method scans the Cursor from the lastMarkDelete position to the tail.
> > There is a time limit and a maxEntries limit; these are needed in
> > order to prevent huge (and useless) scans.
> > The Predicate can stop the scan if it doesn't want to continue the
> > processing for some reason.
> >
> > New REST API:
> >
> > @GET
> >
> > @Path("/{tenant}/{namespace}/{topic}/subscription/{subName}/analiseBacklog")
> > @ApiOperation(value = "Analyse a subscription, by scanning all the
> > unprocessed messages")
> >
> > public void analiseSubscriptionBacklog(
> >         @Suspended final AsyncResponse asyncResponse,
> >         @ApiParam(value = "Specify the tenant", required = true)
> >         @PathParam("tenant") String tenant,
> >         @ApiParam(value = "Specify the namespace", required = true)
> >         @PathParam("namespace") String namespace,
> >   

Re: PIP-187 Add API to analyse a subscription backlog and provide an accurate value

2022-07-20 Thread Asaf Mesika
On Wed, Jul 20, 2022 at 5:46 PM Enrico Olivelli  wrote:

> Asaf,
>
> On Wed, Jul 20, 2022 at 3:40 PM Asaf Mesika wrote:
> >
> > I'm not sure I understand the context exactly:
> >
> > You say that today we can only know the number of entries, hence we'll have
> > an inaccurate backlog count for a subscription, since:
> > 1. One entry contains multiple messages (a batch message)
> > 2. A subscription may have a filter, which requires reading the entire
> > backlog to compute the count
>
> correct
>
> >
> > There are two things I don't understand:
> >
> > 1. We're adding an observability API for which you need to pay the full read
> > cost just to learn the count. I presume people would want to run this more
> > than once, so they will read the same data multiple times - why would a user
> > be willing to pay such a hefty price?
>
> Sometimes it is, because processing a message may have a high cost.
> So having 10 entries of 100 messages each does not correctly represent the
> amount of work that must be done by the consumers,
> and so the user may wish to have an exact count.
>
> Having the filters adds more complexity because you cannot predict how
> many entries will be filtered out
>
>
So it's mainly serving that specific use case, where reading the entire
set of messages over and over (every interval) is an order of magnitude less
expensive than the processing itself.


> > 2. If the user needs to know an accurate backlog, can't they split their
> > data across a very large number of topics, so that they would know an
> > accurate backlog without the huge cost?
>
> I can't understand why creating many topics would help.
> Instead, with filters it is very likely that you have fewer topics
> with many subscriptions with different filters.
>
> As you don't know the filters while writing, you cannot route the
> messages to specific topics; also, you would need to write the message to
> potentially multiple topics, and that would be a huge write amplification
> (think about a topic with 100 subscriptions).
>
Yes, I hadn't thought about that.
What I was thinking is that those filters are mutually exclusive, and
therefore map to disjoint topics; but in your case, if you have 100 different
filters and they overlap, yes, it would be way more expensive to write them
100 times.

>
> > I have an idea, if that's ok:
> >
> > > What if you keep, as you said in your document, a metric counting
> > > messages per filter upon write?
> This is not possible, as described above.
>

You wrote above that:

---
you cannot know which subscriptions will be created in a topic
subscription can be created from the past (Earliest)
subscription filters may change over time: they are usually configured
using Subscription Properties, and those properties are dynamic
doing computations on the write path (like running filters) kills
latency and throughput

Use a client to clone the subscription and consume data.
This doesn't work because you have to transfer the data to the client,
and this is possibly a huge amount of work and a waste of resources.
---

What if we don't do it directly on the write path?
What if the topic owner creates an internal subscription, consumes the
messages, and updates a count per filter?
Thus, those computations will have less effect directly on the write path.

I'm trying to compare that cost of computation with consuming all the
messages, again and again, running the filter computation for them, every
interval (say 1 min).
The amount of computation in the latter would be more costly, no?
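
A rough sketch of what such an internal catch-up pass could look like, using
the regular Java client Reader API (the filter hook is hypothetical; a real
implementation would live broker-side and feed per-filter counters):

    import org.apache.pulsar.client.api.Message;
    import org.apache.pulsar.client.api.MessageId;
    import org.apache.pulsar.client.api.PulsarClient;
    import org.apache.pulsar.client.api.Reader;
    import org.apache.pulsar.client.api.Schema;

    public class FilterCatchUp {
        // Hypothetical stand-in for a PIP-105 server-side filter.
        interface MessageFilter {
            boolean accepts(Message<byte[]> msg);
        }

        // Reads from the earliest message to the current end of the topic and
        // counts how many messages the filter would deliver. This runs async
        // with respect to the write path, so the count lags until it catches up.
        static long catchUp(PulsarClient client, String topic, MessageFilter filter)
                throws Exception {
            long matched = 0;
            try (Reader<byte[]> reader = client.newReader(Schema.BYTES)
                    .topic(topic)
                    .startMessageId(MessageId.earliest)
                    .create()) {
                while (reader.hasMessageAvailable()) {
                    if (filter.accepts(reader.readNext())) {
                        matched++;
                    }
                }
            }
            return matched;
        }
    }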


> When you update the filter / add a filter
> > by adding a new subscription, you can run code that reads from the
> > beginning of the subscription (first unacked message) to catch up and
> then
> > continues. This may be done async, so the metric will take some time to
> > catch up.
> > Amortized, this has less cost on the system overall compared to reading
> > all the messages multiple times to get a periodic size of the subscription.
> > Both solutions are expensive as opposed to doing nothing, of course. Both
> > have to be a well-documented, conscious choice.
> > WDYT?
>
>
> Enrico
> >
> > Asaf
> >
> >
> > On Thu, Jul 14, 2022 at 10:34 AM Enrico Olivelli 
> > wrote:
> >
> > > Hello,
> > > this is a PIP to implement a tool to analyse the subscription backlog
> > >
> > > Link: https://github.com/apache/pulsar/issues/16597
> > > Prototype: https://github.com/apache/pulsar/pull/16545
> > >
> > > Below you can find the proposal (I will amend the GH issue while we
> > > discuss, as usual)
> > >
> > > Enrico
> > >
> > > Motivation
> > >
> > > Currently there is no way to have an accurate backlog for a subscription:
> > >
> > > you have only the number of "entries", not messages
> > > server side filters (PIP-105) may filter out some messages
> > >
> > > Having the number of entries is sometimes not enough because with
> > > batch messages the amount of work on the Consumers is proportional to
> > > the number of messages, that may vary from entry to entry.
> > >
> > > Goa

[DISCUSS] Alternatives to changing public protocol

2022-07-20 Thread Asaf Mesika
Hi,

We started discussing this in PIP-180, and Penghui recommended I move it to a
dedicated thread.

Pulsar has a public API in its binary protocol, which the clients use to
communicate with the brokers; in effect, it is the server's public API.

I believe the public API should not be changed for internal communication
purposes. PIP-180 gives a really good example: We would like to introduce a
new feature called Shadow Topic and would like to replicate messages from
the source topic to the Shadow Topic. It just so happens that the
replication mechanism uses the broker's public API to send messages to a
broker. The design would like to expand on that by adding a field to this
public API to serve that specific feature's needs (the field is not generic,
it's specifically named shadow_message_id).

I believe someone who tries to reason about Pulsar, and its architecture,
by looking at its public API should not encounter fields that will never be
relevant to them. Such fields make the public API harder to reason about
and understand.

The second problem is clients: Every such field will eventually trickle
down to the clients, which will need to ignore it. In my opinion, this
makes things harder for the clients' maintainers, especially when the
community's goal is to expand and have clients in many languages maintained
by the community.

The public API today already contains many fields which are only for
internal use. Here are a few that I found (please correct me if I'm wrong
here):

// Property set on replicated message,
// includes the source cluster name
optional string replicated_from = 5;

// Override namespace's replication
repeated string replicate_to = 7;

// Identify whether a message is a "marker" message used for
// internal metadata instead of application published data.
// Markers will generally not be propagated back to clients
optional int32 marker_type = 20;


I would like to discuss this with you and get your feedback on whether you
think it's correct to adopt a policy of avoiding changes to the public API
for internal purposes.

One alternative I was thinking about (I'm still fairly new, so I don't have
all the experience and context here) is creating an internal non-public
API, which will be used for internal communication: different proto,
different port.

Thanks for your time,

Asaf


Re: [DISCUSS] PIP-186: Introduce two phase deletion protocol based on system topic

2022-07-20 Thread Yan Zhao
Hi, Enrico. If we bind the topic per-tenant, then when the tenant is deleted,
or the tenant is not loaded anymore, the data in the tenant's system topic
can't be consumed until the tenant is next loaded.

On 2022/07/14 15:35:16 Enrico Olivelli wrote:
> This is very interesting.
> 
> I have only one concern.
> I think that we should at least use a per-tenant system topic, or,
> better, per-namespace.
> There is no need to create the deletion topic if there is nothing to delete.
> 
> I am used to dealing with Pulsar clusters in which Tenants are
> strictly isolated.
> Introducing another component that is not tenant-aware is kind of a
> problem (we already have such a problem with the Transaction
> Coordinator)
> 
> Enrico


[GitHub] [pulsar-site] merlimat merged pull request #151: Add Pulsar Summit San Francisco

2022-07-20 Thread GitBox


merlimat merged PR #151:
URL: https://github.com/apache/pulsar-site/pull/151





Re: [DISCUSS] PIP-192 New Pulsar Broker Load Balancer

2022-07-20 Thread Rajan Dhabalia
Hi,

I have gone through PIP but I don't see some basic information as part of
PIP:
1. What are the current issues in the current load balancer strategy?
2. Are there any performance or feature gaps in the current load balancer?
Please provide data or metrics to show the impact.
3. What exactly are we solving in the new load balancer?
4. Why can we not enhance the current load balancer?
5. Show improvement numbers, metrics, and impact with the new load balancer.

Many metrics, historical data, and impacts were analyzed when the current
load balancer was introduced. So, I am expecting similar information to be
documented and discussed before coming up with a new implementation and
merging it.

Also, here we have to clearly document that it will not impact the behavior
of existing load balancers now or in the future. Pulsar is used by many
companies and orgs, so deprecating and not maintaining existing components
is not acceptable under any circumstances.

Thanks,
Rajan



On Tue, Jul 19, 2022 at 10:15 PM Heesung Sohn
 wrote:

> Dear Pulsar dev community,
>
> We would like to open a discussion here about PIP-192: New Pulsar Broker
> Load Balancer.
>
> Regards,
> Heesung
>


Re: [DISCUSS] PIP-186: Introduce two phase deletion protocol based on system topic

2022-07-20 Thread Enrico Olivelli
On Wed, Jul 20, 2022 at 6:05 PM Yan Zhao wrote:

> Hi, Enrico. If we bind the topic per-tenant, then when the tenant is
> deleted, or the tenant is not loaded anymore, the data in the tenant's
> system topic can't be consumed until the tenant is next loaded.
>

This is a good point.

So let's go with the system topic.
Thanks

Are we adding a configuration flag?

Enrico



> On 2022/07/14 15:35:16 Enrico Olivelli wrote:
> > This is very interesting.
> >
> > I have only one concern.
> > I think that we should at least use a per-tenant system topic, or,
> > better, per-namespace.
> > There is no need to create the deletion topic if there is nothing to
> delete.
> >
> > I am used to dealing with Pulsar clusters in which Tenants are
> > strictly isolated.
> > Introducing another component that is not tenant-aware is kind of a
> > problem (we already have such a problem with the Transaction
> > Coordinator)
> >
> > Enrico
>


Re: [DISCUSS] PIP-192 New Pulsar Broker Load Balancer

2022-07-20 Thread Heesung Sohn
Hi Rajan,

Please find my answers inline.

Regards,
Heesung

On Wed, Jul 20, 2022 at 1:37 PM Rajan Dhabalia  wrote:

> Hi,
>
> I have gone through PIP but I don't see some basic information as part of
> PIP:
> 1. What are the current issues in the current load balancer strategy?
> 2. Are there any performance or feature gaps in the current load balancer?
> Please provide data or metrics to show the impact.
> 3. What exactly are we solving in the new load balancer?

As we linked in the motivation section, this proposal is based on the
"Pulsar Broker Load Balance Improvement Areas" doc shared by our
StreamNative team last month.


> 4. Why can we not enhance the current load balancer?
>

As the PIP changes almost every area (data models, event handlers,
cache/storage, logs/metrics),
creating a new load balancer and isolating the new code is safer and
cleaner.
Then, customers could safely enable/disable the new load balancer
via a configuration option before the old one is deprecated.


> 5. Show improvement numbers, metrics, and impact with the new load balancer.

> Many metrics, historical data, and impacts were analyzed when the current
> load balancer was introduced. So, I am expecting similar information to be
> documented and discussed before coming up with a new implementation and
> merging it.
>
>
 I agree. We will be running load balance performance tests (old vs. new)
when we validate the new load manager. We are at an early stage of getting
this PIP approved by the community.

> Also, here we have to clearly document that it will not impact the behavior
> of existing load balancers now or in the future. Pulsar is used by many
> companies and orgs, so deprecating and not maintaining existing components
> is not acceptable under any circumstances.
>
>
 I agree. We mentioned this in the PIP, as follows.

New Load Manager
...

   - It isolates the new code in the new classes without breaking the
   existing logic.


   - This new load manager will be disabled in the first releases until
   proven stable.



> Thanks,
> Rajan
>
>
>
> On Tue, Jul 19, 2022 at 10:15 PM Heesung Sohn
>  wrote:
>
> > Dear Pulsar dev community,
> >
> > We would like to open a discussion here about PIP-192: New Pulsar Broker
> > Load Balancer.
> >
> > Regards,
> > Heesung
> >
>


Re: [DISCUSS] PIP-192 New Pulsar Broker Load Balancer

2022-07-20 Thread Matteo Merli
On Wed, Jul 20, 2022 at 1:37 PM Rajan Dhabalia  wrote:

> Also, here we have to clearly document that it will not impact the behavior
> of existing load balancers now or in the future. Pulsar is used by many
> companies and orgs, so deprecating and not maintaining existing components
> is not acceptable under any circumstances.

This is exactly the reason why this is going to be implemented as a
new load manager instead of improving the existing
ModularLoadManagerImpl.

It gives the flexibility to start fresh without the existing baggage
of choices and try a significantly different approach.

The current ModularLoadManagerImpl will not go away. Once the new load
manager is ready and considered stable enough, there might be a
new discussion on whether to change the default implementation. Even
then, users will still be able to opt for the old load manager.


Re: [DISCUSS] PIP-180: Shadow Topic, an alternative way to support readonly topic ownership.

2022-07-20 Thread PengHui Li
Hi Haiting,

One question about the schema.
How can the consumer get the schema from the shadow topic during
consumption?
We should add this part to the proposal.

Thanks,
Penghui

On Mon, Jul 11, 2022 at 9:09 PM Asaf Mesika  wrote:

> On Thu, Jun 23, 2022 at 6:26 AM Haiting Jiang 
> wrote:
>
> > Hi Asaf,
> >
> > > I did a quick reading and I couldn't understand the gist of this
> change:
> > > > The shadow topic doesn't really have its own messages, or its own
> > > > ledgers, right? When it reads messages, it reads from the original
> > > > topic ledgers. So
> > > > the only thing you need to do is sync the "metadata" - the ledger list?
> >
> > Yes, mostly ledger id list and LAC of the last ledger.
>
>
> > > One question comes to mind here: Why not simply read the ledger
> > information
> > > from original topic, without copy?
> >
> > Yes, old ledger information will be read from the metadata store when
> > ShadowManagedLedger initializes. The replicator is only for new messages,
> > to reduce the consumption latency of subscriptions in the shadow topic.
> > And the reason we also replicate message data is to populate the entry
> > cache when the shadow topic has many active subscriptions.
> >
> > One optimization we can do: there is not much benefit in having the shadow
> > replicator replicate messages that are already in the backlog. We can come
> > up with some policy to reset the shadow replicator's cursor in a future PR.
> >
>
> I'm not sure I'm following you.
> What do you mean by old ledger information and new ledger information?
>
> What I'm trying to understand is: why do you need to copy the source topic
> metadata: Ledgers ID list and LAC of the last ledger? Why can't you just
> use the original topic metadata?
>
>
>
> >
> > > Another question - I couldn't understand why you need to change the
> > > protocol to introduce shadow message id. Can you please explain that to
> > me?
> > > Is CommandSend used only internally between Pulsar Clusters or used by
> a
> > > Pulsar Client?
> >
> > CommandSend is designed for the Pulsar producer client first, and
> > geo-replication reuses the producer client to replicate messages between
> > Pulsar clusters.
> >
> > The shadow message id contains the ledger id and entry id of this message.
> > When the shadow topic receives the message id, it is able to update
> > `lastConfirmedEntry` directly, so that subscriptions can consume this new
> > message.
> > Also, the shadow topic can tell if the message is from the shadow
> > replicator, and reject it otherwise.
> >
> >
> I think the flow of information is the part I don't understand.
>
> In the PIP you write "The message sync procedure of shadow topic is
> supported by shadow replication, which is very like geo-replication, with
> these differences:"
> What I don't understand is that you write that this is a read-only topic,
> so why replicate/sync messages?
>
> I managed to understand that you want to populate the BK entry cache of the
> topic ledgers in the shadow topic broker. Instead of reading from BK and
> storing it in the cache, you favor copying from the source topic broker
> cache memory to the shadow topic broker cache. Is this to save the
> bandwidth of BK? I presume the most recent messages of BK would be in
> memory anyway, no?
>
>
>
>
> > Thanks,
> > Haiting
> >
> > On 2022/06/22 15:57:11 Asaf Mesika wrote:
> > > Hi,
> > >
> > > I did a quick reading and I couldn't understand the gist of this
> change:
> > > The shadow topic doesn't really have its own messages, or its own
> > > ledgers, right? When it reads messages, it reads from the original
> > > topic ledgers. So
> > > the only thing you need to do is sync the "metadata" - the ledger list?
> > > One question comes to mind here: Why not simply read the ledger
> > information
> > > from original topic, without copy?
> > >
> > > Another question - I couldn't understand why you need to change the
> > > protocol to introduce shadow message id. Can you please explain that to
> > me?
> > > Is CommandSend used only internally between Pulsar Clusters or used by
> a
> > > Pulsar Client?
> > >
> > > Thanks,
> > >
> > > Asaf
> > >
> > > On Tue, Jun 21, 2022 at 11:00 AM Haiting Jiang <
> jianghait...@apache.org>
> > > wrote:
> > >
> > > > Hi Pulsar community:
> > > >
> > > > I open a pip to discuss "Shadow Topic, an alternative way to support
> > > > readonly topic ownership."
> > > >
> > > > Proposal Link: https://github.com/apache/pulsar/issues/16153
> > > >
> > > > ---
> > > >
> > > > ## Motivation
> > > >
> > > > The motivation is the same as PIP-63[1], with a new broadcast use
> case
> > of
> > > > supporting 100K subscriptions in a single topic.
> > > > 1. The bandwidth of a broker limits the number of subscriptions for a
> > > > single
> > > >topic.
> > > > 2. Subscriptions are competing for the network bandwidth on brokers.
> > > > Different
> > > >subscriptions might have different levels of severity.
> > > > 3. When synchronizing cross-city message reading, cross-city access

Re: [DISCUSS] Alternatives to changing public protocol

2022-07-20 Thread PengHui Li
Thanks for starting this proposal Asaf.

I'm trying to think more about this part today.

Currently, the public API protocol is defined in PulsarApi.proto [1]
And we have internal proto files such as MLDataFormats.proto [2],
SchemaRegistryFormat.proto [3], ResourceUsage.proto [4].

It looks like geo-replication is a relatively special case: the data
replication depends not only on the MessageMetadata but also on
topic lookup, create producer, close producer, etc. I think this would be
the challenge in having a separate proto for geo-replication.
Except for geo-replication, all the commands defined in PulsarApi.proto are
public APIs/fields.

Geo-replication is based on the Pulsar producer, so it can reuse all of
the producer's abilities.
But the replication needs some extra information.

[1]
https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/proto/PulsarApi.proto
[2]
https://github.com/apache/pulsar/blob/master/managed-ledger/src/main/proto/MLDataFormats.proto
[3]
https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/proto/SchemaRegistryFormat.proto
[4]
https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/proto/ResourceUsage.proto

On Wed, Jul 20, 2022 at 11:47 PM Asaf Mesika  wrote:

> Hi,
>
> We started discussing in PIP-180, which Penghui recommended I move to a
> dedicated thread.
>
> Pulsar has a public API in its binary protocol, which the clients use to
> communicate with it. Nonetheless, it is its public API to the server.
>
> I believe the public API should not be changed for internal communication
> purposes. PIP-180 gives a really good example: We would like to introduce a
> new feature called Shadow Topic and would like to replicate messages from
> the source topic to the Shadow topic. It just so happens to be that the
> replication mechanism uses the Broker public API to send messages to a
> broker. The design would like to expand on that by adding a field to this
> public API, to serve that specific feature needs (the field is not generic,
> it's specifically named shadow_message_id).
>
> I believe someone who tries to reason about Pulsar, and its architecture,
> by looking at its public API should not have any fields which will never be
> relevant to the reader.  It makes it hard to reason and understand the
> public API.
>
> The second problem is clients: Every such field will eventually trickle
> down to the clients, which will need to ignore that field. In my opinion,
> it makes it harder for the client's maintainers. Especially when the
> community goal is to expand and have many languages clients maintained by
> the community
>
> The public API today already contains many fields which are only for
> internal use. Here are a few that I found (please correct me if I'm wrong
> here):
>
> // Property set on replicated message,
> // includes the source cluster name
> optional string replicated_from = 5;
>
> // Override namespace's replication
> repeated string replicate_to = 7;
>
> // Identify whether a message is a "marker" message used for
> // internal metadata instead of application published data.
> // Markers will generally not be propagated back to clients
> optional int32 marker_type = 20;
>
>
> I would like to discuss that with you, get your feedback and whether you
> think it's correct to accept a decision to avoid changing the public API.
>
> One alternative I was thinking about (I'm still fairly new, so I don't have
> all the experience and context here) is creating an internal non-public
> API, which will be used for internal communication: different proto,
> different port.
>
> Thanks for your time,
>
> Asaf
>


Re: PIP-187 Add API to analyse a subscription backlog and provide an accurate value

2022-07-20 Thread PengHui Li
> What if the topic owner creates an internal subscription, consumes the
messages, and updates a count per filter.

I agree with this approach. If we need to scan the whole backlog to
calculate an accurate count on every operation, it's too expensive and
difficult to apply to a production environment. Keeping a counter for each
filter (subscription) and only re-scanning the data after the filter changes
will reduce a lot of overhead.

If we want to expose accurate backlogs in the Prometheus endpoint,
it's almost impossible.

Thanks,
Penghui

On Wed, Jul 20, 2022 at 11:23 PM Asaf Mesika  wrote:

> On Wed, Jul 20, 2022 at 5:46 PM Enrico Olivelli 
> wrote:
>
> > Asaf,
> >
> > Il giorno mer 20 lug 2022 alle ore 15:40 Asaf Mesika
> >  ha scritto:
> > >
> > > I'm not sure I understand the context exactly:
> > >
> > > You say today we can only know the number of entries, hence we'll have
> a
> > > wrong number of backlog for subscription since:
> > > 1. One entry contains multiple messages (batch message)
> > > 2. Subscription may contain a filter, which requires you to read the
> > entire
> > > backlog to know it
> >
> > correct
> >
> > >
> > > There are two things I don't understand:
> > >
> > > 1. We're adding an observability API, which you need to pay all the
> read
> > > cost just to know the count. I presume people would want to run this
> more
> > > than once. So they will read same data multiple times - why would a
> user
> > be
> > > willing to pay such a hefty price?
> >
> > sometimes it is the case, because processing a message may have a high
> > cost.
> > So having 10 entries of 100 messages is not correctly representing the
> > amount of work that must be done by the consumers
> > and so the user may wish to have an exact count.
> >
> > Having the filters adds more complexity because you cannot predict how
> > many entries will be filtered out
> >
> >
> > So it's mainly serving that specific use case of reading the entire
> messages over and over (every interval) is an order of magnitude less
> expensive than the processing it self.
>
>
> > > 2. If the user needs to know an accurate backlog, can't they use the
> > > ability to create a very large number of topics, thus they will know an
> > > accurate backlog without the huge cost?
> >
> > I can't understand why creating many topics will help.
> > instead with filters it is very likely that you have only fewer topics
> > with many subscriptions with different filters
> >
> > as you don't know the filters while writing you cannot route the
> > messages to some topic
> > also you would need to write the message to potentially multiple
> > topics, and that would be a huge write amplification
> > (think about a topic with 100 subscriptions)
> >
> > Yes, I haven't thought about that.
> What I was thinking is that those filters are mutually exclusive therefor
> topics, but in your case, if you have 100 different filters, and they
> overlap, yes it would be way more expensive to write them 100 times.
>
> >
> > > I have an idea, if that's ok:
> > >
> > > What if you can keep, as you said in your document, a metric counting
> > > messages per filter upon write.
> > This is not possible as described above
> >
>
> You wrote above that:
>
> ---
> you cannot know which subscriptions will be created in a topic
> subscription can be created from the past (Earliest)
> subscription filters may change over time: they are usually configured
> using Subscription Properties, and those properties are dynamic
> doing computations on the write path (like running filters) kills
> latency and throughput
>
> Use a client to clone the subscription and consume data.
> This doesn't work because you have to transfer the data to the client,
> and this is possibly a huge amount of work and a waste of resources.
> ---
>
> What if we don't do it directly on the write path.
> What if the topic owner creates an internal subscription, consumes the
> messages, and updates a count per filter.
> Thus, those computation will have less effect directly on the write path.
>
> I'm trying to compare that cost of compuations, with consuming all the
> messages, again and again, running filter computation for them, every
> interval (say 1min).
> The amount of computation in the latter would be more costly, no?
>
>
> > When you update the filter / add a filter
> > > by adding a new subscription, you can run code that reads from the
> > > beginning of the subscription (first unacked message) to catch up and
> > then
> > > continues. This may be done async, so the metric will take some time to
> > > catch up.
> > > Amortized, it has less cost on the system overall, if compared to
> reading
> > > all the messages multiple times to get a period size of the
> subscription.
> > > Both solutions are expensive as opposed to nothing of course. Both has
> to
> > > be a well documented conscious choice.
> > > WDYT?
> >
> >
> > Enrico
> > >
> > > Asaf
> > >
> > >
> > > On Thu, Jul 14, 2022 at 10:34 AM Enrico

Re: [DISCUSS] Apache Pulsar 2.11.0 Release

2022-07-20 Thread PengHui Li
Thanks for volunteering, Nicolò.

> So a plan could be to try to merge the work in progress targeted for 2.11
> by mid-August and then start the code freeze as described in the
> PIP.

So the target release date would be early September. One consideration:
Pulsar Summit San Francisco starts on August 18, 2022. I think maybe we can
start testing the master branch now and continue the in-progress tasks.
If we can have a major release before Pulsar Summit, it would be good news
for the Community.

Thanks.
Penghui

On Mon, Jul 18, 2022 at 4:06 PM Enrico Olivelli  wrote:

> Nicolò,
>
On Mon, Jul 18, 2022 at 10:00 AM Nicolò Boschi wrote:
>
> > Thanks Penghui for the reminder.
> > I'd like to also include PIP: 181 Pulsar shell if the time permits.
> >
> > I believe that is a good idea to start testing the code freeze proposed
> by
> > PIP-175 (https://github.com/apache/pulsar/issues/15966). Even if not
> > officially approved, we discussed it many times and agreed to the
> > usefulness of the code freezing.
> >
>
> Great idea!
>
> We should really try it
>
> > So a plan could be to try to merge the work in progress targeted for 2.11
> > by mid-August and then start the code freeze as described in the
> > PIP.
> >
> > Also, I volunteer to drive the release if nobody else is interested.
> >
>
>
> Thanks for volunteering
>
> Enrico
>
>
> > Thanks,
> > Nicolò Boschi
> >
> >
> > On Mon, Jul 18, 2022 at 6:59 AM Yunze wrote:
> >
> > > In addition to #16202, there is a follow-up PR to support the correct
> > > ACK implementation for chunked messages. It depends on #16202,
> > > but I think I can submit an initial PR this week and change the tests
> > > after #16202 is merged.
> > >
> > > Thanks,
> > > Yunze
> > >
> > >
> > >
> > >
> > > > On Jul 18, 2022, at 11:22 AM, PengHui Li wrote:
> > > >
> > > > Hi all,
> > > >
> > > > We released 2.10.0 three months ago. And there are many great changes
> > in
> > > > the master branch,
> > > > including new features and performance improvements.
> > > >
> > > > - PIP 74: apply client memory to consumer
> > > > https://github.com/apache/pulsar/pull/15216
> > > > - PIP 143: Support split bundles by specified boundaries
> > > > https://github.com/apache/pulsar/pull/13796
> > > > - PIP 145: regex subscription improvements
> > > > https://github.com/apache/pulsar/pull/16062
> > > > - PIP 160: transaction performance improvements (still in progress
> and
> > > > merged some PRs)
> > > > - PIP 161: new exclusive producer mode support
> > > > https://github.com/apache/pulsar/pull/15488
> > > > - PIP 182: Provide new load balance placement strategy implementation
> > for
> > > > ModularLoadManagerStrategy
> https://github.com/apache/pulsar/pull/16281
> > > > Add Pulsar Auth support for the Pulsar SQL
> > > > https://github.com/apache/pulsar/pull/15571
> > > >
> > > > And some features are blocked in the review stage, but they are
> > powerful
> > > > improvements for Pulsar
> > > >
> > > > PIP 37: Support chunking with Shared subscription
> > > > https://github.com/apache/pulsar/pull/16202
> > > > PIP-166: Function add MANUAL delivery semantics
> > > > https://github.com/apache/pulsar/pull/16279
> > > >
> > > > You can find the complete change list in 2.11.0 at
> > > >
> > >
> >
> https://github.com/apache/pulsar/pulls?q=is%3Apr+milestone%3A2.11.0+-label%3Arelease%2F2.10.1+-label%3Arelease%2F2.10.2
> > > >
> > > > And maybe I missed some important in-progress PRs; please let me know
> > > > if any should be a blocker for the 2.11.0 release.
> > > >
> > > > It's a good time to discuss the target date of the 2.11.0 release.
> > > > I think we can leave 2 weeks to complete the in-progress PRs and 2
> > > > weeks to accept bug fixes,
> > > > and target the 2.11.0 release in mid-August.
> > > >
> > > > Please let me know what you think.
> > > >
> > > > Thanks,
> > > > Penghui
> > >
> > >
> >
>