Hi Enrico,

> We also need a way to represent this as a String or a byte[]

We already have the `toByteArray` method, right?

Thanks,
Yunze

On Tue, Nov 8, 2022 at 6:43 PM Enrico Olivelli <eolive...@gmail.com> wrote:
>
> Il giorno mar 8 nov 2022 alle ore 11:25 Yunze Xu
> <y...@streamnative.io.invalid> ha scritto:
> >
> > Hi all,
> >
> > Currently we have the following 5 implementations of MessageId:
> >
> > - MessageIdImpl: (ledger id, entry id, partition index)
> >   - BatchMessageIdImpl: adds (batch index, batch size, acker), where
> >     acker is a wrapper of a BitSet.
> >   - ChunkMessageIdImpl: adds another MessageIdImpl that represents
> >     the first MessageIdImpl of a BitSet.
> >   - MultiMessageIdImpl: adds a map that maps the topic name to the
> >     MessageId.
> > - TopicMessageIdImpl: adds the topic name and the partition name
> >
> > These implementations are such a mess. For example, when users get a
> > MessageId from `Producer#send`:
> >
> > ```java
> > var id = producer.send("msg");
> > ```
> >
> > There is no getter to get some specific fields like ledger id. You can
> > only see a representation from `toString` method and got some output
> > like "0:0:-1:0". Maybe this design is to hidden some details, but if
> > users don't know the details like ledger id and entry id, how could
> > you know what does "0:0:-1:0" mean? What if `MessageId#toString`'s
> > implementation changed? Should it be treated as a breaking change?
> >
> > The original definition of the underlying MessageIdData is much more
> > clear:
> >
> > ```proto
> > message MessageIdData {
> >     required uint64 ledgerId = 1;
> >     required uint64 entryId  = 2;
> >     optional int32 partition = 3 [default = -1];
> >     optional int32 batch_index = 4 [default = -1];
> >     repeated int64 ack_set = 5;
> >     optional int32 batch_size = 6;
> >
> >     // For the chunk message id, we need to specify the first chunk message 
> > id.
> >     optional MessageIdData first_chunk_message_id = 7;
> > }
> > ```
> >
> > IMO, MessageId should be a wrapper of MessageIdData. It's more natural
> > to have an interface like:
> >
> > ```java
> > interface MessageId {
> >     long ledgerId();
> >     long entryId();
> >     Optional<Integer> partition();
> >     Optional<Integer> batchIndex();
> >     // ...
> > ```
>
> This is very good for client applications.
> We also need a way to represent this as a String or a byte[], this way
> client applications have a standard way to store
> message offsets into an external system (for instance when you want to
> user the Reader API and keep track of the position by yourself)
>
> Enrico
>
> >
> > Additionally, there are many places that use only the triple of
> > (ledger id, entry id, batch index) as the key to represent the position.
> > Currently, they are done by adding a conversion from
> > BatchMessageIdImpl to MessageIdImpl. However, it's more intuitive to
> > write something like:
> >
> > ```java
> > class MessageIdPosition implements Comparable<MessageIdPosition> {
> >     private final MessageId messageId;
> >     // TODO: compare only the triple (ledger, entry, batch)
> > ```
> >
> > Therefore, I'm going to write a proposal to redesign the MessageId
> > interface only by adding some getters. Regarding the 5 existing
> > implementations, I think we can drop them because they are a part
> > of `pulsar-client`, not `pulsar-client-api`.
> >
> > Please feel free to share your points.
> >
> > Thanks,
> > Yunze

Reply via email to