Hi Enrico, > We also need a way to represent this as a String or a byte[]
We already have the `toByteArray` method, right? Thanks, Yunze On Tue, Nov 8, 2022 at 6:43 PM Enrico Olivelli <eolive...@gmail.com> wrote: > > Il giorno mar 8 nov 2022 alle ore 11:25 Yunze Xu > <y...@streamnative.io.invalid> ha scritto: > > > > Hi all, > > > > Currently we have the following 5 implementations of MessageId: > > > > - MessageIdImpl: (ledger id, entry id, partition index) > > - BatchMessageIdImpl: adds (batch index, batch size, acker), where > > acker is a wrapper of a BitSet. > > - ChunkMessageIdImpl: adds another MessageIdImpl that represents > > the first MessageIdImpl of a BitSet. > > - MultiMessageIdImpl: adds a map that maps the topic name to the > > MessageId. > > - TopicMessageIdImpl: adds the topic name and the partition name > > > > These implementations are such a mess. For example, when users get a > > MessageId from `Producer#send`: > > > > ```java > > var id = producer.send("msg"); > > ``` > > > > There is no getter to get some specific fields like ledger id. You can > > only see a representation from `toString` method and got some output > > like "0:0:-1:0". Maybe this design is to hidden some details, but if > > users don't know the details like ledger id and entry id, how could > > you know what does "0:0:-1:0" mean? What if `MessageId#toString`'s > > implementation changed? Should it be treated as a breaking change? > > > > The original definition of the underlying MessageIdData is much more > > clear: > > > > ```proto > > message MessageIdData { > > required uint64 ledgerId = 1; > > required uint64 entryId = 2; > > optional int32 partition = 3 [default = -1]; > > optional int32 batch_index = 4 [default = -1]; > > repeated int64 ack_set = 5; > > optional int32 batch_size = 6; > > > > // For the chunk message id, we need to specify the first chunk message > > id. > > optional MessageIdData first_chunk_message_id = 7; > > } > > ``` > > > > IMO, MessageId should be a wrapper of MessageIdData. It's more natural > > to have an interface like: > > > > ```java > > interface MessageId { > > long ledgerId(); > > long entryId(); > > Optional<Integer> partition(); > > Optional<Integer> batchIndex(); > > // ... > > ``` > > This is very good for client applications. > We also need a way to represent this as a String or a byte[], this way > client applications have a standard way to store > message offsets into an external system (for instance when you want to > user the Reader API and keep track of the position by yourself) > > Enrico > > > > > Additionally, there are many places that use only the triple of > > (ledger id, entry id, batch index) as the key to represent the position. > > Currently, they are done by adding a conversion from > > BatchMessageIdImpl to MessageIdImpl. However, it's more intuitive to > > write something like: > > > > ```java > > class MessageIdPosition implements Comparable<MessageIdPosition> { > > private final MessageId messageId; > > // TODO: compare only the triple (ledger, entry, batch) > > ``` > > > > Therefore, I'm going to write a proposal to redesign the MessageId > > interface only by adding some getters. Regarding the 5 existing > > implementations, I think we can drop them because they are a part > > of `pulsar-client`, not `pulsar-client-api`. > > > > Please feel free to share your points. > > > > Thanks, > > Yunze