Il giorno mar 8 nov 2022 alle ore 11:25 Yunze Xu <y...@streamnative.io.invalid> ha scritto: > > Hi all, > > Currently we have the following 5 implementations of MessageId: > > - MessageIdImpl: (ledger id, entry id, partition index) > - BatchMessageIdImpl: adds (batch index, batch size, acker), where > acker is a wrapper of a BitSet. > - ChunkMessageIdImpl: adds another MessageIdImpl that represents > the first MessageIdImpl of a BitSet. > - MultiMessageIdImpl: adds a map that maps the topic name to the > MessageId. > - TopicMessageIdImpl: adds the topic name and the partition name > > These implementations are such a mess. For example, when users get a > MessageId from `Producer#send`: > > ```java > var id = producer.send("msg"); > ``` > > There is no getter to get some specific fields like ledger id. You can > only see a representation from `toString` method and got some output > like "0:0:-1:0". Maybe this design is to hidden some details, but if > users don't know the details like ledger id and entry id, how could > you know what does "0:0:-1:0" mean? What if `MessageId#toString`'s > implementation changed? Should it be treated as a breaking change? > > The original definition of the underlying MessageIdData is much more > clear: > > ```proto > message MessageIdData { > required uint64 ledgerId = 1; > required uint64 entryId = 2; > optional int32 partition = 3 [default = -1]; > optional int32 batch_index = 4 [default = -1]; > repeated int64 ack_set = 5; > optional int32 batch_size = 6; > > // For the chunk message id, we need to specify the first chunk message > id. > optional MessageIdData first_chunk_message_id = 7; > } > ``` > > IMO, MessageId should be a wrapper of MessageIdData. It's more natural > to have an interface like: > > ```java > interface MessageId { > long ledgerId(); > long entryId(); > Optional<Integer> partition(); > Optional<Integer> batchIndex(); > // ... > ```
This is very good for client applications. We also need a way to represent this as a String or a byte[], this way client applications have a standard way to store message offsets into an external system (for instance when you want to user the Reader API and keep track of the position by yourself) Enrico > > Additionally, there are many places that use only the triple of > (ledger id, entry id, batch index) as the key to represent the position. > Currently, they are done by adding a conversion from > BatchMessageIdImpl to MessageIdImpl. However, it's more intuitive to > write something like: > > ```java > class MessageIdPosition implements Comparable<MessageIdPosition> { > private final MessageId messageId; > // TODO: compare only the triple (ledger, entry, batch) > ``` > > Therefore, I'm going to write a proposal to redesign the MessageId > interface only by adding some getters. Regarding the 5 existing > implementations, I think we can drop them because they are a part > of `pulsar-client`, not `pulsar-client-api`. > > Please feel free to share your points. > > Thanks, > Yunze