ted topic-> [A,B,C] // broker A,B,C consumed m1
>> >>>>>>>> t5: A-> own bundle // broker A knows that its assignment has been
>> >>>>>>> accepted,
>> >>>>>>>> so proceeding to own the bundle (meanwhile deferring lookup
>> >&
; assignment is
> >>>>>>>> running(meanwhile deferring lookup requests)
> >>>>>>>> t5: C -> defer client lookups // broker C knows that bundle
> >>>>>>> assignment is
> >>>>>>>> running(meanwhile deferring lookup requests)
> >>>>>>>>
> >>>>>>>> Analysis: The "post-filter + a single topic" can perform ok in
> >> this
> >>>>>>> case
> >>>>>>>> without the additional leader coordination and the secondary topic
> >>>>>>> because
> >>>>>>>> the early broadcast can inform all brokers and prevent them from
> >>>>>>> requesting
> >>>>>>>> other assignments for the same bundle.
> >>>>>>>>
> >>>>>>>> I think the post-filter option is initially not bad because:
> >>>>>>>>
> >>>>>>>> 1. it is safe in the worst case (in case the messages are not
> >>>>>>> correctly
> >>>>>>>> pre-filtered at the leader)
> >>>>>>>> 2. it performs ok because the early broadcast can prevent
> >>>>>>>> concurrent assignment requests.
> >>>>>>>> 3. initially less complex to implement (leaderless conflict
> >>>>>>> resolution and
> >>>>>>>> requires a single topic)
> >>>>>>>> 4. it is not a "one-way door" decision (we could add the
> >> pre-filter
> >>>>>>> logic
> >>>>>>>> as well later)
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Heesung
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Sat, Oct 29, 2022 at 1:03 PM Heesung Sohn <
> >>>>>>> heesung.s...@streamnative.io>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Michael,
> >>>>>>>>>
> >>>>>>>>> For the pre-prefilter(pre-compaction) option,
> >>>>>>>>> I think the leader requires a write-through cache to compact
> >>>>>>> messages
> >>>>>>>>> based on the latest states. Otherwise, the leader needs to wait
> >> for
> >>>>>>> the
> >>>>>>>>> last msg from the (compacted) topic before compacting the next
> >> msg
> >>>>>>> for the
> >>>>>>>>> same bundle.
> >>>>>>>>>
> >>>>>>>>> Pulsar guarantees "a single writer". However, for the worst-case
> >>>>>>>>> scenario(due to network partitions, bugs in zk or etcd leader
> >>>>>>> election,
> >>>>>>>>> bugs in bk, data corruption ), I think it is safe to place the
> >>>>>>> post-filter
> >>>>>>>>> on the consumer side(compaction and table views) as well in
> >> order to
> >>>>>>>>> validate the state changes.
> >>>>>>>>>
> >>>>>>>>> For the two-topic approach,
> >>>>>>>>> I think we lose a single linearized view. Could we clarify how
> >> to
> >>>>>>> handle
> >>>>>>>>> the following(edge cases and failure recovery)?
> >>>>>>>>> 0. Is the un-compacted topic a persistent topic or a
> >> non-persistent
> >>>>>>> topic?
> >>>>>>>>> 1. How does the leader recover state from the two topics?
> >>>>>>>>> 2. How do we handle the case when the leader fails before
> >> writing
> >>>>>>> messages
> >>>>>>>>> to the compacted topic
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Heesung
> >>>>>>>>>
> >>>>>>>>> On Fri, Oct 28, 2022 at 6:56 PM Michael Marshall <
> >>>>>>> mmarsh...@apache.org>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Sharing some more thoughts. We could alternatively use two
> >> topics
> >>>>>>>>>> instead of one. In this design, the first topic is the
> >> unfiltered
> >>>>>>>>>> write ahead log that represents many writers (brokers) trying
> >> to
> >>>>>>>>>> acquire ownership of bundles. The second topic is the
> >> distilled log
> >>>>>>>>>> that represents the "winners" or the "owners" of the bundles.
> >>>>>>> There is
> >>>>>>>>>> a single writer, the leader broker, that reads from the input
> >> topic
> >>>>>>>>>> and writes to the output topic. The first topic is normal and
> >> the
> >>>>>>>>>> second is compacted.
> >>>>>>>>>>
> >>>>>>>>>> The primary benefit in a two topic solution is that it is easy
> >> for
> >>>>>>> the
> >>>>>>>>>> leader broker to trade off ownership without needing to slow
> >> down
> >>>>>>>>>> writes to the input topic. The leader broker will start
> >> consuming
> >>>>>>> from
> >>>>>>>>>> the input topic when it has fully consumed the table view on
> >> the
> >>>>>>>>>> output topic. In general, I don't think consumers know when
> >> they
> >>>>>>> have
> >>>>>>>>>> "reached the end of a table view", but we should be able to
> >>>>>>> trivially
> >>>>>>>>>> figure this out if we are the topic's only writer and the
> >> topic and
> >>>>>>>>>> writer are collocated on the same broker.
> >>>>>>>>>>
> >>>>>>>>>> In that design, it might make sense to use something like the
> >>>>>>>>>> replication cursor to keep track of this consumer's state.
> >>>>>>>>>>
> >>>>>>>>>> - Michael
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Oct 28, 2022 at 5:12 PM Michael Marshall <
> >>>>>>> mmarsh...@apache.org>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for your proposal, Heesung.
> >>>>>>>>>>>
> >>>>>>>>>>> Fundamentally, we have the problems listed in this PIP
> >> because
> >>>>>>> we have
> >>>>>>>>>>> multiple writers instead of just one writer. Can we solve
> >> this
> >>>>>>> problem
> >>>>>>>>>>> by changing our write pattern? What if we use the leader
> >> broker
> >>>>>>> as the
> >>>>>>>>>>> single writer? That broker would intercept attempts to
> >> acquire
> >>>>>>>>>>> ownership on bundles and would grant ownership to the first
> >>>>>>> broker to
> >>>>>>>>>>> claim an unassigned bundle. It could "grant ownership" by
> >>>>>>> letting the
> >>>>>>>>>>> first write to claim an unassigned bundle get written to the
> >>>>>>> ownership
> >>>>>>>>>>> topic. When a bundle is already owned, the leader won't
> >> persist
> >>>>>>> that
> >>>>>>>>>>> event to the bookkeeper. In this design, the log becomes a
> >> true
> >>>>>>>>>>> ownership log, which will correctly work with the existing
> >> topic
> >>>>>>>>>>> compaction and table view solutions. My proposal essentially
> >>>>>>> moves the
> >>>>>>>>>>> conflict resolution to just before the write, and as a
> >>>>>>> consequence, it
> >>>>>>>>>>> greatly reduces the need for post processing of the event
> >> log.
> >>>>>>> One
> >>>>>>>>>>> trade off might be that the leader broker could slow down the
> >>>>>>> write
> >>>>>>>>>>> path, but given that the leader would just need to verify the
> >>>>>>> current
> >>>>>>>>>>> state of the bundle, I think it'd be performant enough.
> >>>>>>>>>>>
> >>>>>>>>>>> Additionally, we'd need the leader broker to be "caught up"
> >> on
> >>>>>>> bundle
> >>>>>>>>>>> ownership in order to grant ownership of topics, but unless
> >> I am
> >>>>>>>>>>> mistaken, that is already a requirement of the current PIP
> >> 192
> >>>>>>>>>>> paradigm.
> >>>>>>>>>>>
> >>>>>>>>>>> Below are some additional thoughts that will be relevant if
> >> we
> >>>>>>> move
> >>>>>>>>>>> forward with the design as it is currently proposed.
> >>>>>>>>>>>
> >>>>>>>>>>> I think it might be helpful to update the title to show that
> >> this
> >>>>>>>>>>> proposal will also affect table view as well. I didn't catch
> >>>>>>> that at
> >>>>>>>>>>> first.
> >>>>>>>>>>>
> >>>>>>>>>>> Do you have any documentation describing how the
> >>>>>>>>>>> TopicCompactionStrategy will determine which states are
> >> valid in
> >>>>>>> the
> >>>>>>>>>>> context of load balancing? I looked at
> >>>>>>>>>>> https://github.com/apache/pulsar/pull/18195, but I couldn't
> >>>>>>> seem to
> >>>>>>>>>>> find anything for it. That would help make this proposal less
> >>>>>>>>>>> abstract.
> >>>>>>>>>>>
> >>>>>>>>>>> The proposed API seems very tied to the needs of PIP 192. For
> >>>>>>> example,
> >>>>>>>>>>> `isValid` is not a term I associate with topic compaction.
> >> The
> >>>>>>>>>>> fundamental question for compaction is which value to keep
> >> (or
> >>>>>>> build a
> >>>>>>>>>>> new value). I think we might be able to simplify the API by
> >>>>>>> replacing
> >>>>>>>>>>> the "isValid", "isMergeEnabled", and "merge" methods with a
> >>>>>>> single
> >>>>>>>>>>> method that lets the implementation handle one or all tasks.
> >> That
> >>>>>>>>>>> would also remove the need to deserialize payloads multiple
> >>>>>>> times too.
> >>>>>>>>>>>
> >>>>>>>>>>> I also feel like mentioning that after working with the PIP
> >> 105
> >>>>>>> broker
> >>>>>>>>>>> side filtering, I think we should avoid running UDFs in the
> >>>>>>> broker as
> >>>>>>>>>>> much as possible. (I do not consider the load balancing
> >> logic to
> >>>>>>> be a
> >>>>>>>>>>> UDF here.) I think it would be worth not making this a user
> >>>>>>> facing
> >>>>>>>>>>> feature unless there is demand for real use cases.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks!
> >>>>>>>>>>> Michael
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, Oct 28, 2022 at 1:21 AM 丛搏
> >> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> +1(non-binding)
> >>>>>>>>>>>>
> >>>>>>>>>>>> thanks,
> >>>>>>>>>>>> bo
> >>>>>>>>>>>>
> >>>>>>>>>>>> Heesung Sohn
> >>>>>>> 于2022年10月19日周三
> >>>>>>>>>> 07:54写道:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi pulsar-dev community,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I raised a pip to discuss : PIP-215: Configurable Topic
> >>>>>>> Compaction
> >>>>>>>>>> Strategy
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> PIP link: https://github.com/apache/pulsar/issues/18099
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>> Heesung
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>
> >>
>
>
lter option is initially not bad because:
>>>>>>>>
>>>>>>>> 1. it is safe in the worst case (in case the messages are not
>>>>>>> correctly
>>>>>>>> pre-filtered at the leader)
>>>>>>>> 2.
gt; >>>> > > same bundle.
> > >>>> > >
> > >>>> > > Pulsar guarantees "a single writer". However, for the worst-case
> > >>>> > > scenario(due to network partitions, bugs in zk or etcd leader
> > >>>>
before writing
> >>>> messages
> >>>> > > to the compacted topic
> >>>> > >
> >>>> > > Regards,
> >>>> > > Heesung
> >>>> > >
> >>>> > > On Fri,
gt;> > >> a single writer, the leader broker, that reads from the input topic
>>>> > >> and writes to the output topic. The first topic is normal and the
>>>> > >> second is compacted.
>>>> > >>
>>>> > >> The primary benefit i
or to keep track of this consumer's state.
>>> > >>
>>> > >> - Michael
>>> > >>
>>> > >> On Fri, Oct 28, 2022 at 5:12 PM Michael Marshall <
>>> mmarsh...@apache.org>
>>> > >> wrote:
>>> > >> >
>>> > >> > Thanks for your proposal, Heesung.
>>> > >> >
>>> > >> > Fundamentally, we have the problems listed in this PIP because we
>>> have
>>> > >> > multiple writers instead of just one writer. Can we solve this
>>> problem
>>> > >> > by changing our write pattern? What if we use the leader broker
>>> as the
>>> > >> > single writer? That broker would intercept attempts to acquire
>>> > >> > ownership on bundles and would grant ownership to the first
>>> broker to
>>> > >> > claim an unassigned bundle. It could "grant ownership" by letting
>>> the
>>> > >> > first write to claim an unassigned bundle get written to the
>>> ownership
>>> > >> > topic. When a bundle is already owned, the leader won't persist
>>> that
>>> > >> > event to the bookkeeper. In this design, the log becomes a true
>>> > >> > ownership log, which will correctly work with the existing topic
>>> > >> > compaction and table view solutions. My proposal essentially
>>> moves the
>>> > >> > conflict resolution to just before the write, and as a
>>> consequence, it
>>> > >> > greatly reduces the need for post processing of the event log. One
>>> > >> > trade off might be that the leader broker could slow down the
>>> write
>>> > >> > path, but given that the leader would just need to verify the
>>> current
>>> > >> > state of the bundle, I think it'd be performant enough.
>>> > >> >
>>> > >> > Additionally, we'd need the leader broker to be "caught up" on
>>> bundle
>>> > >> > ownership in order to grant ownership of topics, but unless I am
>>> > >> > mistaken, that is already a requirement of the current PIP 192
>>> > >> > paradigm.
>>> > >> >
>>> > >> > Below are some additional thoughts that will be relevant if we
>>> move
>>> > >> > forward with the design as it is currently proposed.
>>> > >> >
>>> > >> > I think it might be helpful to update the title to show that this
>>> > >> > proposal will also affect table view as well. I didn't catch that
>>> at
>>> > >> > first.
>>> > >> >
>>> > >> > Do you have any documentation describing how the
>>> > >> > TopicCompactionStrategy will determine which states are valid in
>>> the
>>> > >> > context of load balancing? I looked at
>>> > >> > https://github.com/apache/pulsar/pull/18195, but I couldn't seem
>>> to
>>> > >> > find anything for it. That would help make this proposal less
>>> > >> > abstract.
>>> > >> >
>>> > >> > The proposed API seems very tied to the needs of PIP 192. For
>>> example,
>>> > >> > `isValid` is not a term I associate with topic compaction. The
>>> > >> > fundamental question for compaction is which value to keep (or
>>> build a
>>> > >> > new value). I think we might be able to simplify the API by
>>> replacing
>>> > >> > the "isValid", "isMergeEnabled", and "merge" methods with a single
>>> > >> > method that lets the implementation handle one or all tasks. That
>>> > >> > would also remove the need to deserialize payloads multiple times
>>> too.
>>> > >> >
>>> > >> > I also feel like mentioning that after working with the PIP 105
>>> broker
>>> > >> > side filtering, I think we should avoid running UDFs in the
>>> broker as
>>> > >> > much as possible. (I do not consider the load balancing logic to
>>> be a
>>> > >> > UDF here.) I think it would be worth not making this a user facing
>>> > >> > feature unless there is demand for real use cases.
>>> > >> >
>>> > >> > Thanks!
>>> > >> > Michael
>>> > >> >
>>> > >> > On Fri, Oct 28, 2022 at 1:21 AM 丛搏 wrote:
>>> > >> > >
>>> > >> > > +1(non-binding)
>>> > >> > >
>>> > >> > > thanks,
>>> > >> > > bo
>>> > >> > >
>>> > >> > > Heesung Sohn
>>> 于2022年10月19日周三
>>> > >> 07:54写道:
>>> > >> > > >
>>> > >> > > > Hi pulsar-dev community,
>>> > >> > > >
>>> > >> > > > I raised a pip to discuss : PIP-215: Configurable Topic
>>> Compaction
>>> > >> Strategy
>>> > >> > > >
>>> > >> > > > PIP link: https://github.com/apache/pulsar/issues/18099
>>> > >> > > >
>>> > >> > > > Regards,
>>> > >> > > > Heesung
>>> > >>
>>> > >
>>>
>>
> > >> > first write to claim an unassigned bundle get written to the
>> ownership
>> > >> > topic. When a bundle is already owned, the leader won't persist
>> that
>> > >> > event to the bookkeeper. In this design, the log becomes a true
>> > >> > ownership log, which will correctly work with the existing topic
>> > >> > compaction and table view solutions. My proposal essentially moves
>> the
>> > >> > conflict resolution to just before the write, and as a
>> consequence, it
>> > >> > greatly reduces the need for post processing of the event log. One
>> > >> > trade off might be that the leader broker could slow down the write
>> > >> > path, but given that the leader would just need to verify the
>> current
>> > >> > state of the bundle, I think it'd be performant enough.
>> > >> >
>> > >> > Additionally, we'd need the leader broker to be "caught up" on
>> bundle
>> > >> > ownership in order to grant ownership of topics, but unless I am
>> > >> > mistaken, that is already a requirement of the current PIP 192
>> > >> > paradigm.
>> > >> >
>> > >> > Below are some additional thoughts that will be relevant if we move
>> > >> > forward with the design as it is currently proposed.
>> > >> >
>> > >> > I think it might be helpful to update the title to show that this
>> > >> > proposal will also affect table view as well. I didn't catch that
>> at
>> > >> > first.
>> > >> >
>> > >> > Do you have any documentation describing how the
>> > >> > TopicCompactionStrategy will determine which states are valid in
>> the
>> > >> > context of load balancing? I looked at
>> > >> > https://github.com/apache/pulsar/pull/18195, but I couldn't seem
>> to
>> > >> > find anything for it. That would help make this proposal less
>> > >> > abstract.
>> > >> >
>> > >> > The proposed API seems very tied to the needs of PIP 192. For
>> example,
>> > >> > `isValid` is not a term I associate with topic compaction. The
>> > >> > fundamental question for compaction is which value to keep (or
>> build a
>> > >> > new value). I think we might be able to simplify the API by
>> replacing
>> > >> > the "isValid", "isMergeEnabled", and "merge" methods with a single
>> > >> > method that lets the implementation handle one or all tasks. That
>> > >> > would also remove the need to deserialize payloads multiple times
>> too.
>> > >> >
>> > >> > I also feel like mentioning that after working with the PIP 105
>> broker
>> > >> > side filtering, I think we should avoid running UDFs in the broker
>> as
>> > >> > much as possible. (I do not consider the load balancing logic to
>> be a
>> > >> > UDF here.) I think it would be worth not making this a user facing
>> > >> > feature unless there is demand for real use cases.
>> > >> >
>> > >> > Thanks!
>> > >> > Michael
>> > >> >
>> > >> > On Fri, Oct 28, 2022 at 1:21 AM 丛搏 wrote:
>> > >> > >
>> > >> > > +1(non-binding)
>> > >> > >
>> > >> > > thanks,
>> > >> > > bo
>> > >> > >
>> > >> > > Heesung Sohn
>> 于2022年10月19日周三
>> > >> 07:54写道:
>> > >> > > >
>> > >> > > > Hi pulsar-dev community,
>> > >> > > >
>> > >> > > > I raised a pip to discuss : PIP-215: Configurable Topic
>> Compaction
>> > >> Strategy
>> > >> > > >
>> > >> > > > PIP link: https://github.com/apache/pulsar/issues/18099
>> > >> > > >
>> > >> > > > Regards,
>> > >> > > > Heesung
>> > >>
>> > >
>>
>
> Additionally, we'd need the leader broker to be "caught up" on
> bundle
> > >> > ownership in order to grant ownership of topics, but unless I am
> > >> > mistaken, that is already a requirement of the current PIP 192
> > >> > paradigm.
> > >> >
> > >> > Below are some additional thoughts that will be relevant if we move
> > >> > forward with the design as it is currently proposed.
> > >> >
> > >> > I think it might be helpful to update the title to show that this
> > >> > proposal will also affect table view as well. I didn't catch that at
> > >> > first.
> > >> >
> > >> > Do you have any documentation describing how the
> > >> > TopicCompactionStrategy will determine which states are valid in the
> > >> > context of load balancing? I looked at
> > >> > https://github.com/apache/pulsar/pull/18195, but I couldn't seem to
> > >> > find anything for it. That would help make this proposal less
> > >> > abstract.
> > >> >
> > >> > The proposed API seems very tied to the needs of PIP 192. For
> example,
> > >> > `isValid` is not a term I associate with topic compaction. The
> > >> > fundamental question for compaction is which value to keep (or
> build a
> > >> > new value). I think we might be able to simplify the API by
> replacing
> > >> > the "isValid", "isMergeEnabled", and "merge" methods with a single
> > >> > method that lets the implementation handle one or all tasks. That
> > >> > would also remove the need to deserialize payloads multiple times
> too.
> > >> >
> > >> > I also feel like mentioning that after working with the PIP 105
> broker
> > >> > side filtering, I think we should avoid running UDFs in the broker
> as
> > >> > much as possible. (I do not consider the load balancing logic to be
> a
> > >> > UDF here.) I think it would be worth not making this a user facing
> > >> > feature unless there is demand for real use cases.
> > >> >
> > >> > Thanks!
> > >> > Michael
> > >> >
> > >> > On Fri, Oct 28, 2022 at 1:21 AM 丛搏 wrote:
> > >> > >
> > >> > > +1(non-binding)
> > >> > >
> > >> > > thanks,
> > >> > > bo
> > >> > >
> > >> > > Heesung Sohn
> 于2022年10月19日周三
> > >> 07:54写道:
> > >> > > >
> > >> > > > Hi pulsar-dev community,
> > >> > > >
> > >> > > > I raised a pip to discuss : PIP-215: Configurable Topic
> Compaction
> > >> Strategy
> > >> > > >
> > >> > > > PIP link: https://github.com/apache/pulsar/issues/18099
> > >> > > >
> > >> > > > Regards,
> > >> > > > Heesung
> > >>
> > >
>
opicCompactionStrategy will determine which states are valid in the
> >> > context of load balancing? I looked at
> >> > https://github.com/apache/pulsar/pull/18195, but I couldn't seem to
> >> > find anything for it. That would help make this
t be able to simplify the API by replacing
>> > the "isValid", "isMergeEnabled", and "merge" methods with a single
>> > method that lets the implementation handle one or all tasks. That
>> > would also remove the need to deserialize payloads multiple times too.
>> >
>> > I also feel like mentioning that after working with the PIP 105 broker
>> > side filtering, I think we should avoid running UDFs in the broker as
>> > much as possible. (I do not consider the load balancing logic to be a
>> > UDF here.) I think it would be worth not making this a user facing
>> > feature unless there is demand for real use cases.
>> >
>> > Thanks!
>> > Michael
>> >
>> > On Fri, Oct 28, 2022 at 1:21 AM 丛搏 wrote:
>> > >
>> > > +1(non-binding)
>> > >
>> > > thanks,
>> > > bo
>> > >
>> > > Heesung Sohn 于2022年10月19日周三
>> 07:54写道:
>> > > >
>> > > > Hi pulsar-dev community,
>> > > >
>> > > > I raised a pip to discuss : PIP-215: Configurable Topic Compaction
>> Strategy
>> > > >
>> > > > PIP link: https://github.com/apache/pulsar/issues/18099
>> > > >
>> > > > Regards,
>> > > > Heesung
>>
>
after working with the PIP 105 broker
> > side filtering, I think we should avoid running UDFs in the broker as
> > much as possible. (I do not consider the load balancing logic to be a
> > UDF here.) I think it would be worth not making this a user facing
> > feature unless there is demand for real use cases.
> >
> > Thanks!
> > Michael
> >
> > On Fri, Oct 28, 2022 at 1:21 AM 丛搏 wrote:
> > >
> > > +1(non-binding)
> > >
> > > thanks,
> > > bo
> > >
> > > Heesung Sohn 于2022年10月19日周三
> 07:54写道:
> > > >
> > > > Hi pulsar-dev community,
> > > >
> > > > I raised a pip to discuss : PIP-215: Configurable Topic Compaction
> Strategy
> > > >
> > > > PIP link: https://github.com/apache/pulsar/issues/18099
> > > >
> > > > Regards,
> > > > Heesung
>
th not making this a user facing
> feature unless there is demand for real use cases.
>
> Thanks!
> Michael
>
> On Fri, Oct 28, 2022 at 1:21 AM 丛搏 wrote:
> >
> > +1(non-binding)
> >
> > thanks,
> > bo
> >
> > Heesung Sohn 于2022年10月19日周三 07:54写道:
> > >
> > > Hi pulsar-dev community,
> > >
> > > I raised a pip to discuss : PIP-215: Configurable Topic Compaction
> > > Strategy
> > >
> > > PIP link: https://github.com/apache/pulsar/issues/18099
> > >
> > > Regards,
> > > Heesung
h not making this a user facing
feature unless there is demand for real use cases.
Thanks!
Michael
On Fri, Oct 28, 2022 at 1:21 AM 丛搏 wrote:
>
> +1(non-binding)
>
> thanks,
> bo
>
> Heesung Sohn 于2022年10月19日周三 07:54写道:
> >
> > Hi pulsar-dev community,
+1(non-binding)
thanks,
bo
Heesung Sohn 于2022年10月19日周三 07:54写道:
>
> Hi pulsar-dev community,
>
> I raised a pip to discuss : PIP-215: Configurable Topic Compaction Strategy
>
> PIP link: https://github.com/apache/pulsar/issues/18099
>
> Regards,
> Heesung
> > > states.
> >> > > > > Even if there are conflict state changes, only the first valid
> >> state
> >> > > > > change will be accepted(as explained in Conflict State
> >> > Resolution(Race
> >> > > > > Conditions section in the PIP)) in BSC.
> >> > > > >
> >> > > > > Also, another goal of this PIP-192 is to reduce client lookup
> >> > retries.
> >> > > In
> >> > > > > BSC, the client lookup response will be deferred(max x secs)
> until
> >> > the
> >> > > > > bundle state becomes finally "Owned".
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >> 2. bundle State Channel(BSC) owner depends on the leader
> broker,
> >> > this
> >> > > > >> also makes topic transfer strongly dependent on the leader.
> >> > > > >>
> >> > > > > BSC will use separate leader znodes to decide the owner brokers
> of
> >> > the
> >> > > > > internal BSC system topic.As described in this section in the
> >> > PIP-192,
> >> > > > > "Bundle State and Load Data TableView Scalability",
> >> > > > > We could use a partitioned topic(configurable) for this BSC
> system
> >> > > topic.
> >> > > > > Then, there could be a separate owner broker for each partition
> >> > > > > (e.g. zk leader znodes, /loadbalance/leader/part-1-owner,
> >> > part-2-owner,
> >> > > > > ..etc).
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >> 3. the code becomes more complex and harder to maintain
> >> > > > >>
> >> > > > >> What tradeoffs are the current implementations based on?
> >> > > > >>
> >> > > > >> Here are some Pros and Cons of BSC I can think of.
> >> > > > >
> >> > > > > Pros:
> >> > > > > - It supports more distributed load balance operations(bundle
> >> > > assignment)
> >> > > > > in a sequentially consistent manner
> >> > > > > - For really large clusters, by a partitioned system topic, BSC
> >> can
> >> > be
> >> > > > > more scalable than the current single-leader coordination
> >> solution.
> >> > > > > - The load balance commands(across brokers) are sent via event
> >> > > > > sourcing(more reliable/transparent/easy-to-track) instead of RPC
> >> with
> >> > > > > retries.
> >> > > > >
> >> > > > > Cons:
> >> > > > > - It is a new implementation and will require significant effort
> >> to
> >> > > > > stabilize the new implementation.
> >> > > > > (Based on our PoC code, I think the event sourcing handlers are
> >> > easier
> >> > > to
> >> > > > > understand and follow the logic.
> >> > > > > Also, this new load balancer will be pluggable(will be
> >> implemented in
> >> > > new
> >> > > > > classes), so it should not break the existing load balance
> logic.
> >> > > > > Users will be able to configure old/new broker load balancer.)
> >> > > > >
> >> > > > >
> >> > > > > Thank you for sharing your questions about PIP-192 here. But I
> >> think
> >> > > this
> >> > > > > PIP-215 is independent of PIP-192(though PIP-192 needs some of
> the
> >> > > > features
> >> > > > > in PIP-215).
> >> > > > >
> >> > > > > Thanks,
> >> > > > > Heesung
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >> Thanks,
> >> > > > >> bo
> >> > > > >>
> >> > > > >> Heesung Sohn
> >> 于2022年10月19日周三
> >> > > > >> 07:54写道:
> >> > > > >> >
> >> > > > >> > Hi pulsar-dev community,
> >> > > > >> >
> >> > > > >> > I raised a pip to discuss : PIP-215: Configurable Topic
> >> Compaction
> >> > > > >> Strategy
> >> > > > >> >
> >> > > > >> > PIP link: https://github.com/apache/pulsar/issues/18099
> >> > > > >> >
> >> > > > >> > Regards,
> >> > > > >> > Heesung
> >> > > > >>
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>
client lookup
>> > retries.
>> > > In
>> > > > > BSC, the client lookup response will be deferred(max x secs) until
>> > the
>> > > > > bundle state becomes finally "Owned".
>> > > > >
>> > > > >
>> > > > >
&
ity",
> > > > > We could use a partitioned topic(configurable) for this BSC system
> > > topic.
> > > > > Then, there could be a separate owner broker for each partition
> > > > > (e.g. zk leader znodes, /loadbalance/leader/part-1-owner,
> > part-2-owner,
> > > > > ..etc).
> > > > >
> > > > >
> > > > >
> > > > >> 3. the code becomes more complex and harder to maintain
> > > > >>
> > > > >> What tradeoffs are the current implementations based on?
> > > > >>
> > > > >> Here are some Pros and Cons of BSC I can think of.
> > > > >
> > > > > Pros:
> > > > > - It supports more distributed load balance operations(bundle
> > > assignment)
> > > > > in a sequentially consistent manner
> > > > > - For really large clusters, by a partitioned system topic, BSC can
> > be
> > > > > more scalable than the current single-leader coordination solution.
> > > > > - The load balance commands(across brokers) are sent via event
> > > > > sourcing(more reliable/transparent/easy-to-track) instead of RPC
> with
> > > > > retries.
> > > > >
> > > > > Cons:
> > > > > - It is a new implementation and will require significant effort to
> > > > > stabilize the new implementation.
> > > > > (Based on our PoC code, I think the event sourcing handlers are
> > easier
> > > to
> > > > > understand and follow the logic.
> > > > > Also, this new load balancer will be pluggable(will be implemented
> in
> > > new
> > > > > classes), so it should not break the existing load balance logic.
> > > > > Users will be able to configure old/new broker load balancer.)
> > > > >
> > > > >
> > > > > Thank you for sharing your questions about PIP-192 here. But I
> think
> > > this
> > > > > PIP-215 is independent of PIP-192(though PIP-192 needs some of the
> > > > features
> > > > > in PIP-215).
> > > > >
> > > > > Thanks,
> > > > > Heesung
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >> Thanks,
> > > > >> bo
> > > > >>
> > > > >> Heesung Sohn
> 于2022年10月19日周三
> > > > >> 07:54写道:
> > > > >> >
> > > > >> > Hi pulsar-dev community,
> > > > >> >
> > > > >> > I raised a pip to discuss : PIP-215: Configurable Topic
> Compaction
> > > > >> Strategy
> > > > >> >
> > > > >> > PIP link: https://github.com/apache/pulsar/issues/18099
> > > > >> >
> > > > >> > Regards,
> > > > >> > Heesung
> > > > >>
> > > > >
> > > >
> > >
> >
>
gt; > > > ..etc).
> > > >
> > > >
> > > >
> > > >> 3. the code becomes more complex and harder to maintain
> > > >>
> > > >> What tradeoffs are the current implementations based on?
> > > >>
> > > &
t; > > in a sequentially consistent manner
> > > - For really large clusters, by a partitioned system topic, BSC can be
> > > more scalable than the current single-leader coordination solution.
> > > - The load balance commands(across brokers) are sent via event
> &
de, I think the event sourcing handlers are easier to
> > understand and follow the logic.
> > Also, this new load balancer will be pluggable(will be implemented in new
> > classes), so it should not break the existing load balance logic.
> > Users will be able to configure old/new broker load balancer.)
> >
> >
> > Thank you for sharing your questions about PIP-192 here. But I think this
> > PIP-215 is independent of PIP-192(though PIP-192 needs some of the
> features
> > in PIP-215).
> >
> > Thanks,
> > Heesung
> >
> >
> >
> >
> >
> >> Thanks,
> >> bo
> >>
> >> Heesung Sohn 于2022年10月19日周三
> >> 07:54写道:
> >> >
> >> > Hi pulsar-dev community,
> >> >
> >> > I raised a pip to discuss : PIP-215: Configurable Topic Compaction
> >> Strategy
> >> >
> >> > PIP link: https://github.com/apache/pulsar/issues/18099
> >> >
> >> > Regards,
> >> > Heesung
> >>
> >
>
be able to configure old/new broker load balancer.)
>
>
> Thank you for sharing your questions about PIP-192 here. But I think this
> PIP-215 is independent of PIP-192(though PIP-192 needs some of the features
> in PIP-215).
>
> Thanks,
> Heesung
>
>
>
>
>
>> Thanks,
>> bo
>>
>> Heesung Sohn 于2022年10月19日周三
>> 07:54写道:
>> >
>> > Hi pulsar-dev community,
>> >
>> > I raised a pip to discuss : PIP-215: Configurable Topic Compaction
>> Strategy
>> >
>> > PIP link: https://github.com/apache/pulsar/issues/18099
>> >
>> > Regards,
>> > Heesung
>>
>
bo
>
> Heesung Sohn 于2022年10月19日周三
> 07:54写道:
> >
> > Hi pulsar-dev community,
> >
> > I raised a pip to discuss : PIP-215: Configurable Topic Compaction
> Strategy
> >
> > PIP link: https://github.com/apache/pulsar/issues/18099
> >
> > Regards,
> > Heesung
>
月19日周三 07:54写道:
>
> Hi pulsar-dev community,
>
> I raised a pip to discuss : PIP-215: Configurable Topic Compaction Strategy
>
> PIP link: https://github.com/apache/pulsar/issues/18099
>
> Regards,
> Heesung
Hi pulsar-dev community,
I raised a pip to discuss : PIP-215: Configurable Topic Compaction Strategy
PIP link: https://github.com/apache/pulsar/issues/18099
Regards,
Heesung
25 matches
Mail list logo