Il giorno mar 8 nov 2022 alle ore 11:52 Yunze Xu
<y...@streamnative.io.invalid> ha scritto:
>
> Hi Enrico,
>
> > We also need a way to represent this as a String or a byte[]
>
> We already have the `toByteArray` method, right?

Yes, correct. So we are fine. I forgot about it and I answered too quickly.

I am not sure if this can be in the scope of this initiative, but we
should somehow get rid of
stuff like "fromByteArrayWithTopic" vs "fromByteArray".

Thanks
Enrico

>
> Thanks,
> Yunze
>
> On Tue, Nov 8, 2022 at 6:43 PM Enrico Olivelli <eolive...@gmail.com> wrote:
> >
> > Il giorno mar 8 nov 2022 alle ore 11:25 Yunze Xu
> > <y...@streamnative.io.invalid> ha scritto:
> > >
> > > Hi all,
> > >
> > > Currently we have the following 5 implementations of MessageId:
> > >
> > > - MessageIdImpl: (ledger id, entry id, partition index)
> > >   - BatchMessageIdImpl: adds (batch index, batch size, acker), where
> > >     acker is a wrapper of a BitSet.
> > >   - ChunkMessageIdImpl: adds another MessageIdImpl that represents
> > >     the first MessageIdImpl of a BitSet.
> > >   - MultiMessageIdImpl: adds a map that maps the topic name to the
> > >     MessageId.
> > > - TopicMessageIdImpl: adds the topic name and the partition name
> > >
> > > These implementations are such a mess. For example, when users get a
> > > MessageId from `Producer#send`:
> > >
> > > ```java
> > > var id = producer.send("msg");
> > > ```
> > >
> > > There is no getter to get some specific fields like ledger id. You can
> > > only see a representation from `toString` method and got some output
> > > like "0:0:-1:0". Maybe this design is to hidden some details, but if
> > > users don't know the details like ledger id and entry id, how could
> > > you know what does "0:0:-1:0" mean? What if `MessageId#toString`'s
> > > implementation changed? Should it be treated as a breaking change?
> > >
> > > The original definition of the underlying MessageIdData is much more
> > > clear:
> > >
> > > ```proto
> > > message MessageIdData {
> > >     required uint64 ledgerId = 1;
> > >     required uint64 entryId  = 2;
> > >     optional int32 partition = 3 [default = -1];
> > >     optional int32 batch_index = 4 [default = -1];
> > >     repeated int64 ack_set = 5;
> > >     optional int32 batch_size = 6;
> > >
> > >     // For the chunk message id, we need to specify the first chunk 
> > > message id.
> > >     optional MessageIdData first_chunk_message_id = 7;
> > > }
> > > ```
> > >
> > > IMO, MessageId should be a wrapper of MessageIdData. It's more natural
> > > to have an interface like:
> > >
> > > ```java
> > > interface MessageId {
> > >     long ledgerId();
> > >     long entryId();
> > >     Optional<Integer> partition();
> > >     Optional<Integer> batchIndex();
> > >     // ...
> > > ```
> >
> > This is very good for client applications.
> > We also need a way to represent this as a String or a byte[], this way
> > client applications have a standard way to store
> > message offsets into an external system (for instance when you want to
> > user the Reader API and keep track of the position by yourself)
> >
> > Enrico
> >
> > >
> > > Additionally, there are many places that use only the triple of
> > > (ledger id, entry id, batch index) as the key to represent the position.
> > > Currently, they are done by adding a conversion from
> > > BatchMessageIdImpl to MessageIdImpl. However, it's more intuitive to
> > > write something like:
> > >
> > > ```java
> > > class MessageIdPosition implements Comparable<MessageIdPosition> {
> > >     private final MessageId messageId;
> > >     // TODO: compare only the triple (ledger, entry, batch)
> > > ```
> > >
> > > Therefore, I'm going to write a proposal to redesign the MessageId
> > > interface only by adding some getters. Regarding the 5 existing
> > > implementations, I think we can drop them because they are a part
> > > of `pulsar-client`, not `pulsar-client-api`.
> > >
> > > Please feel free to share your points.
> > >
> > > Thanks,
> > > Yunze

Reply via email to