>Maybe this design is to hidden some details, but if users don't know the details like ledger id and entry id, how could you know what does "0:0:-1:0" mean?
Abstractions exist for a reason. Ledgerid and entryid are implementation details, and an application should not be interpreting that at all. -j On Tue, Nov 8, 2022 at 3:43 AM Yunze Xu <y...@streamnative.io.invalid> wrote: > I didn't look into these two methods at the moment. But I think it's > possible to > retain only the `fromByteArray`. > > Thanks, > Yunze > > On Tue, Nov 8, 2022 at 7:02 PM Enrico Olivelli <eolive...@gmail.com> > wrote: > > > > Il giorno mar 8 nov 2022 alle ore 11:52 Yunze Xu > > <y...@streamnative.io.invalid> ha scritto: > > > > > > Hi Enrico, > > > > > > > We also need a way to represent this as a String or a byte[] > > > > > > We already have the `toByteArray` method, right? > > > > Yes, correct. So we are fine. I forgot about it and I answered too > quickly. > > > > I am not sure if this can be in the scope of this initiative, but we > > should somehow get rid of > > stuff like "fromByteArrayWithTopic" vs "fromByteArray". > > > > Thanks > > Enrico > > > > > > > > Thanks, > > > Yunze > > > > > > On Tue, Nov 8, 2022 at 6:43 PM Enrico Olivelli <eolive...@gmail.com> > wrote: > > > > > > > > Il giorno mar 8 nov 2022 alle ore 11:25 Yunze Xu > > > > <y...@streamnative.io.invalid> ha scritto: > > > > > > > > > > Hi all, > > > > > > > > > > Currently we have the following 5 implementations of MessageId: > > > > > > > > > > - MessageIdImpl: (ledger id, entry id, partition index) > > > > > - BatchMessageIdImpl: adds (batch index, batch size, acker), > where > > > > > acker is a wrapper of a BitSet. > > > > > - ChunkMessageIdImpl: adds another MessageIdImpl that represents > > > > > the first MessageIdImpl of a BitSet. > > > > > - MultiMessageIdImpl: adds a map that maps the topic name to the > > > > > MessageId. > > > > > - TopicMessageIdImpl: adds the topic name and the partition name > > > > > > > > > > These implementations are such a mess. For example, when users get > a > > > > > MessageId from `Producer#send`: > > > > > > > > > > ```java > > > > > var id = producer.send("msg"); > > > > > ``` > > > > > > > > > > There is no getter to get some specific fields like ledger id. You > can > > > > > only see a representation from `toString` method and got some > output > > > > > like "0:0:-1:0". Maybe this design is to hidden some details, but > if > > > > > users don't know the details like ledger id and entry id, how could > > > > > you know what does "0:0:-1:0" mean? What if `MessageId#toString`'s > > > > > implementation changed? Should it be treated as a breaking change? > > > > > > > > > > The original definition of the underlying MessageIdData is much > more > > > > > clear: > > > > > > > > > > ```proto > > > > > message MessageIdData { > > > > > required uint64 ledgerId = 1; > > > > > required uint64 entryId = 2; > > > > > optional int32 partition = 3 [default = -1]; > > > > > optional int32 batch_index = 4 [default = -1]; > > > > > repeated int64 ack_set = 5; > > > > > optional int32 batch_size = 6; > > > > > > > > > > // For the chunk message id, we need to specify the first > chunk message id. > > > > > optional MessageIdData first_chunk_message_id = 7; > > > > > } > > > > > ``` > > > > > > > > > > IMO, MessageId should be a wrapper of MessageIdData. It's more > natural > > > > > to have an interface like: > > > > > > > > > > ```java > > > > > interface MessageId { > > > > > long ledgerId(); > > > > > long entryId(); > > > > > Optional<Integer> partition(); > > > > > Optional<Integer> batchIndex(); > > > > > // ... > > > > > ``` > > > > > > > > This is very good for client applications. > > > > We also need a way to represent this as a String or a byte[], this > way > > > > client applications have a standard way to store > > > > message offsets into an external system (for instance when you want > to > > > > user the Reader API and keep track of the position by yourself) > > > > > > > > Enrico > > > > > > > > > > > > > > Additionally, there are many places that use only the triple of > > > > > (ledger id, entry id, batch index) as the key to represent the > position. > > > > > Currently, they are done by adding a conversion from > > > > > BatchMessageIdImpl to MessageIdImpl. However, it's more intuitive > to > > > > > write something like: > > > > > > > > > > ```java > > > > > class MessageIdPosition implements Comparable<MessageIdPosition> { > > > > > private final MessageId messageId; > > > > > // TODO: compare only the triple (ledger, entry, batch) > > > > > ``` > > > > > > > > > > Therefore, I'm going to write a proposal to redesign the MessageId > > > > > interface only by adding some getters. Regarding the 5 existing > > > > > implementations, I think we can drop them because they are a part > > > > > of `pulsar-client`, not `pulsar-client-api`. > > > > > > > > > > Please feel free to share your points. > > > > > > > > > > Thanks, > > > > > Yunze >