Hi Girish,

Very useful proposal. Would it be possible to enable comments on the Google Doc? It's pretty hard to comment on the doc since copying is also disabled.

In the scope definition 4.2: "The initial scope is to target unordered consumption flows. Even in the current world, there are challenges with normal partition scale up for ordered consumption based topics, so keeping the partition scale down out of scope for that as well."

If we don't care about ordered consumption and re-keying, I guess the feature isn't very hard to implement. Pulsar already contains the topic termination feature, which lets consumers keep consuming messages while publishers cannot publish any more. This is the "read-only topic" feature that could be used as one of the building blocks for decreasing the partition count of a topic.

For the final design, it would be great to have a design for ordered consumption flows. It might not be trivial to design. I happened to be at a local Kafka meetup a few months ago where this particular challenge was discussed in the context of Kafka: how painful it is to handle manually, and what problems can happen in production when large-scale streaming applications assume that a specific key is contained in a specific partition. There's a similar challenge when the number of partitions is increased, so this problem isn't specific to decreasing partitions.

In ordered consumption flows, there is most likely an ordering key, and a specific key is assigned to a specific partition. If the partition count changes, some rekeying/reassignment would have to happen. I guess the metadata accompanying the message key should record which partition count the producing application used when assigning the message to a specific partition. That information could be used in the consuming application to detect when the keying or the number of partitions changes.
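To make the metadata idea a bit more concrete, here is a rough sketch in plain Python with no Pulsar client involved. Everything in it is an assumption for illustration: the property name `producer-partition-count`, the `Message` and `RekeyDetector` types, and the CRC32 hash, which is only a stand-in for whatever hashing the real client uses. The producer stamps each message with the partition count it used for routing, and the consumer flags the first message where that count differs from the previous one.

```python
import zlib
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical property name, not an existing Pulsar convention.
PARTITION_COUNT_PROPERTY = "producer-partition-count"


def partition_for(key: str, partition_count: int) -> int:
    """Map an ordering key to a partition using a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % partition_count


@dataclass
class Message:
    key: str
    payload: str
    properties: Dict[str, str] = field(default_factory=dict)


def produce(key: str, payload: str, partition_count: int) -> Tuple[int, Message]:
    """Route a message and stamp it with the partition count used for routing."""
    msg = Message(key, payload, {PARTITION_COUNT_PROPERTY: str(partition_count)})
    return partition_for(key, partition_count), msg


class RekeyDetector:
    """Consumer-side helper that reports when the stamped partition count changes."""

    def __init__(self) -> None:
        self.last_seen: Optional[int] = None

    def observe(self, msg: Message) -> bool:
        count = int(msg.properties[PARTITION_COUNT_PROPERTY])
        changed = self.last_seen is not None and count != self.last_seen
        self.last_seen = count
        return changed


detector = RekeyDetector()
_, m1 = produce("order-42", "created", 8)
_, m2 = produce("order-42", "paid", 8)
_, m3 = produce("order-42", "shipped", 4)  # topic scaled down from 8 to 4
print([detector.observe(m) for m in (m1, m2, m3)])  # [False, False, True]
```

This only detects the change. What the consumer should do once it sees one (pause, drain the old partitions, rebuild per-key state) is exactly the hard part that an ordered-consumption design would have to answer.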
This feels like a generic problem, so perhaps we should look for existing ways to solve this challenge in streaming applications.

-Lari

On Wed, 24 Jan 2024 at 09:00, Girish Sharma <scrapmachi...@gmail.com> wrote:
>
> Bumping this up! Hoping this can be discussed so that I can rule out that
> this approach has any fatal flaws.
>
> Regards
>
> On Fri, Jan 19, 2024 at 11:58 AM Girish Sharma <scrapmachi...@gmail.com>
> wrote:
>
> > Hello everyone,
> >
> > As a true cloud native platform, which supports scale up and scale down, I
> > feel like there is a need to be able to reduce partition count in pulsar to
> > truly achieve a scale down after events like sales (akin to black friday,
> > etc) or huge temporary publish burst due to backfill.
> >
> > I looked through the archives (up to 2021) and did not find any prior
> > discussion on the same topic.
> >
> > I have given this an initial thought to figure out what it would need to
> > support such a feature in the lowest footprint possible. I am attaching the
> > document explaining the need, requirements and initial high level details
> > [0]. What I would like is to understand if the community also finds this
> > feature helpful and does the approach described in the document have some
> > fatal flaw? Summarizing the approach here as well:
> >
> >    - Introduce an ability to convert a normal topic object into a
> >    read-only topic via admin api and an additional partitioned-topic
> >    metadata property (just like shadow source, etc)
> >    - Add logic to block produce but allow new consumers and dispatch call
> >    based on this flag
> >    - Add logic in GC to clean out read only topics when all of their
> >    ledgers expire (TTL/retention)
> >
> > Goal is that there is no data movement involved and no impact on existing
> > partitions during this scale down.
> >
> > Looking forward to the discussion.
> >
> > [0]
> > https://docs.google.com/document/d/1sbGQSwDihQftIRsxAXg5Zm4uxKQ0kRk9HadKYRFTswI/edit?usp=sharing
> >
> > Regards
> > --
> > Girish Sharma
> >
>
>
> --
> Girish Sharma