Hi Haiting,

> But please make sure we have to make it compatible with previous
implementations, like the `toString` method

Yeah, I agree, I will keep it compatible.

BTW, while I'm working on this, I found the MessageId implementations
are more complicated than I thought. The MessageIdImpl class must be a
POJO. Otherwise it cannot be passed into the admin API as the Entity. But I
still insists on returning an Optional<T> instead of T with default
value explained
in API docs.


Thanks,
Yunze

On Wed, Nov 9, 2022 at 3:02 PM Haiting Jiang <jianghait...@gmail.com> wrote:
>
> Overall, this makes sense to me.
> The current status of MessageId is a bit messy, especially for client
> developers and senior users who are interested in the implementation
> details.
> But please make sure we have to make it compatible with previous
> implementations, like the `toString` method, I bet someone has already
> done the parsing and got the ledgerId and entryId from it.
>
>
> Thanks,
> Haiting
>
> On Tue, Nov 8, 2022 at 6:25 PM Yunze Xu <y...@streamnative.io.invalid> wrote:
> >
> > Hi all,
> >
> > Currently we have the following 5 implementations of MessageId:
> >
> > - MessageIdImpl: (ledger id, entry id, partition index)
> >   - BatchMessageIdImpl: adds (batch index, batch size, acker), where
> >     acker is a wrapper of a BitSet.
> >   - ChunkMessageIdImpl: adds another MessageIdImpl that represents
> >     the first MessageIdImpl of a BitSet.
> >   - MultiMessageIdImpl: adds a map that maps the topic name to the
> >     MessageId.
> > - TopicMessageIdImpl: adds the topic name and the partition name
> >
> > These implementations are such a mess. For example, when users get a
> > MessageId from `Producer#send`:
> >
> > ```java
> > var id = producer.send("msg");
> > ```
> >
> > There is no getter to get some specific fields like ledger id. You can
> > only see a representation from `toString` method and got some output
> > like "0:0:-1:0". Maybe this design is to hidden some details, but if
> > users don't know the details like ledger id and entry id, how could
> > you know what does "0:0:-1:0" mean? What if `MessageId#toString`'s
> > implementation changed? Should it be treated as a breaking change?
> >
> > The original definition of the underlying MessageIdData is much more
> > clear:
> >
> > ```proto
> > message MessageIdData {
> >     required uint64 ledgerId = 1;
> >     required uint64 entryId  = 2;
> >     optional int32 partition = 3 [default = -1];
> >     optional int32 batch_index = 4 [default = -1];
> >     repeated int64 ack_set = 5;
> >     optional int32 batch_size = 6;
> >
> >     // For the chunk message id, we need to specify the first chunk message 
> > id.
> >     optional MessageIdData first_chunk_message_id = 7;
> > }
> > ```
> >
> > IMO, MessageId should be a wrapper of MessageIdData. It's more natural
> > to have an interface like:
> >
> > ```java
> > interface MessageId {
> >     long ledgerId();
> >     long entryId();
> >     Optional<Integer> partition();
> >     Optional<Integer> batchIndex();
> >     // ...
> > ```
> >
> > Additionally, there are many places that use only the triple of
> > (ledger id, entry id, batch index) as the key to represent the position.
> > Currently, they are done by adding a conversion from
> > BatchMessageIdImpl to MessageIdImpl. However, it's more intuitive to
> > write something like:
> >
> > ```java
> > class MessageIdPosition implements Comparable<MessageIdPosition> {
> >     private final MessageId messageId;
> >     // TODO: compare only the triple (ledger, entry, batch)
> > ```
> >
> > Therefore, I'm going to write a proposal to redesign the MessageId
> > interface only by adding some getters. Regarding the 5 existing
> > implementations, I think we can drop them because they are a part
> > of `pulsar-client`, not `pulsar-client-api`.
> >
> > Please feel free to share your points.
> >
> > Thanks,
> > Yunze

Reply via email to