Il giorno mar 8 nov 2022 alle ore 11:25 Yunze Xu
<y...@streamnative.io.invalid> ha scritto:
>
> Hi all,
>
> Currently we have the following 5 implementations of MessageId:
>
> - MessageIdImpl: (ledger id, entry id, partition index)
>   - BatchMessageIdImpl: adds (batch index, batch size, acker), where
>     acker is a wrapper of a BitSet.
>   - ChunkMessageIdImpl: adds another MessageIdImpl that represents
>     the first MessageIdImpl of a BitSet.
>   - MultiMessageIdImpl: adds a map that maps the topic name to the
>     MessageId.
> - TopicMessageIdImpl: adds the topic name and the partition name
>
> These implementations are such a mess. For example, when users get a
> MessageId from `Producer#send`:
>
> ```java
> var id = producer.send("msg");
> ```
>
> There is no getter to get some specific fields like ledger id. You can
> only see a representation from `toString` method and got some output
> like "0:0:-1:0". Maybe this design is to hidden some details, but if
> users don't know the details like ledger id and entry id, how could
> you know what does "0:0:-1:0" mean? What if `MessageId#toString`'s
> implementation changed? Should it be treated as a breaking change?
>
> The original definition of the underlying MessageIdData is much more
> clear:
>
> ```proto
> message MessageIdData {
>     required uint64 ledgerId = 1;
>     required uint64 entryId  = 2;
>     optional int32 partition = 3 [default = -1];
>     optional int32 batch_index = 4 [default = -1];
>     repeated int64 ack_set = 5;
>     optional int32 batch_size = 6;
>
>     // For the chunk message id, we need to specify the first chunk message 
> id.
>     optional MessageIdData first_chunk_message_id = 7;
> }
> ```
>
> IMO, MessageId should be a wrapper of MessageIdData. It's more natural
> to have an interface like:
>
> ```java
> interface MessageId {
>     long ledgerId();
>     long entryId();
>     Optional<Integer> partition();
>     Optional<Integer> batchIndex();
>     // ...
> ```

This is very good for client applications.
We also need a way to represent this as a String or a byte[], this way
client applications have a standard way to store
message offsets into an external system (for instance when you want to
user the Reader API and keep track of the position by yourself)

Enrico

>
> Additionally, there are many places that use only the triple of
> (ledger id, entry id, batch index) as the key to represent the position.
> Currently, they are done by adding a conversion from
> BatchMessageIdImpl to MessageIdImpl. However, it's more intuitive to
> write something like:
>
> ```java
> class MessageIdPosition implements Comparable<MessageIdPosition> {
>     private final MessageId messageId;
>     // TODO: compare only the triple (ledger, entry, batch)
> ```
>
> Therefore, I'm going to write a proposal to redesign the MessageId
> interface only by adding some getters. Regarding the 5 existing
> implementations, I think we can drop them because they are a part
> of `pulsar-client`, not `pulsar-client-api`.
>
> Please feel free to share your points.
>
> Thanks,
> Yunze

Reply via email to