Hello Lari, thanks for the comments. Replies inline.

On Wed, Jan 24, 2024 at 7:36 PM Lari Hotari <lhot...@apache.org> wrote:

> Hi Girish,
>
> Very useful proposal.
>
> Would it be possible to enable comments on the Google Doc? It's pretty
> hard to comment on the doc since copying is also disabled.
>

I've enabled them now. Thank you for going through the doc.


> In the scope definition 4.2,
> "The initial scope is to target unordered consumption flows. Even in
> the current world, there are challenges with normal partition scale up
> for ordered consumption based topics, so keeping the partition scale
> down out of scope for that as well."
>
> If we don't care about ordered consumption and re-keying, I guess the
> feature isn't very hard to implement.
> Pulsar already contains the topic termination feature which will let
> consumers consume messages while publishers cannot publish more
> messages. This is the "read-only topic" feature that could be used as
> one of the building blocks for implementing the decrease of the
> partition count for a topic.
>

Yes, a terminated topic is already very close to the read-only topic,
barring the grace period and possibly the ability to un-terminate a topic.
I will merge my read-only proposal with the existing terminate API/feature.


>
> For the final design, it would be great to have a design for ordered
> consumption flows. It might not be trivial to design it. I happened to
> be at a local Kafka meetup a few months ago and this particular
> challenge was discussed in the context of Kafka and how painful it is
> to handle manually and what problems could happen in production when
> large scale streaming applications assume that a specific key is
> contained in a specific partition.
>
> There's a similar challenge when the number of partitions is
> increased, so this problem isn't specific to decreasing partitions.
> In ordered consumption flows, there is most likely an ordering key and
> a specific key is assigned to a specific partition. If the partition
> count changes, there would have to be some rekeying/reassignment that
> happens.
>

I agree that this is an existing problem in both Kafka and Pulsar for
partition count scale up (and for scale down in Kafka via re-mapping). For
that reason, I've kept it out of scope. What I will ensure is that adding
this new partition scale-down feature does not increase the complexity or
difficulty of providing a seamless partition count change for ordered
consumption in the future.
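
To make the re-keying problem concrete, here is a minimal sketch. It assumes
a simple hash-modulo router; `partition_for` and `moved_keys` are hypothetical
helpers, and CRC32 is only a stand-in hash, not the hash Pulsar or Kafka
actually use. It counts how many ordering keys land on a different partition
when the partition count changes.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Hypothetical router: stable hash of the key modulo the partition count."""
    # CRC32 is an assumption for illustration, not the real client hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def moved_keys(keys, old_count: int, new_count: int):
    """Return the keys whose partition differs between the two counts."""
    return [
        k for k in keys
        if partition_for(k, old_count) != partition_for(k, new_count)
    ]


keys = [f"order-{i}" for i in range(1000)]
moved = moved_keys(keys, 4, 3)
print(f"{len(moved)} of {len(keys)} keys change partition going from 4 to 3")
```

Any key in `moved` would be published to a different partition after the
change, so a consumer that assumes "key X lives in partition N" loses its
ordering guarantee unless some re-keying or hand-over step is added. The same
happens for scale up, which is why I consider it a separate problem.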


-- 
Girish Sharma
