Hi all,

Currently we have the following 5 implementations of MessageId:

- MessageIdImpl: (ledger id, entry id, partition index)
  - BatchMessageIdImpl: adds (batch index, batch size, acker), where
    acker is a wrapper of a BitSet.
  - ChunkMessageIdImpl: adds another MessageIdImpl that represents
    the first MessageIdImpl of a BitSet.
  - MultiMessageIdImpl: adds a map that maps the topic name to the
    MessageId.
- TopicMessageIdImpl: adds the topic name and the partition name

These implementations are such a mess. For example, when users get a
MessageId from `Producer#send`:

```java
var id = producer.send("msg");
```

There is no getter to get some specific fields like ledger id. You can
only see a representation from `toString` method and got some output
like "0:0:-1:0". Maybe this design is to hidden some details, but if
users don't know the details like ledger id and entry id, how could
you know what does "0:0:-1:0" mean? What if `MessageId#toString`'s
implementation changed? Should it be treated as a breaking change?

The original definition of the underlying MessageIdData is much more
clear:

```proto
message MessageIdData {
    required uint64 ledgerId = 1;
    required uint64 entryId  = 2;
    optional int32 partition = 3 [default = -1];
    optional int32 batch_index = 4 [default = -1];
    repeated int64 ack_set = 5;
    optional int32 batch_size = 6;

    // For the chunk message id, we need to specify the first chunk message id.
    optional MessageIdData first_chunk_message_id = 7;
}
```

IMO, MessageId should be a wrapper of MessageIdData. It's more natural
to have an interface like:

```java
interface MessageId {
    long ledgerId();
    long entryId();
    Optional<Integer> partition();
    Optional<Integer> batchIndex();
    // ...
```

Additionally, there are many places that use only the triple of
(ledger id, entry id, batch index) as the key to represent the position.
Currently, they are done by adding a conversion from
BatchMessageIdImpl to MessageIdImpl. However, it's more intuitive to
write something like:

```java
class MessageIdPosition implements Comparable<MessageIdPosition> {
    private final MessageId messageId;
    // TODO: compare only the triple (ledger, entry, batch)
```

Therefore, I'm going to write a proposal to redesign the MessageId
interface only by adding some getters. Regarding the 5 existing
implementations, I think we can drop them because they are a part
of `pulsar-client`, not `pulsar-client-api`.

Please feel free to share your points.

Thanks,
Yunze

Reply via email to