Great discussion!
I agree with the conclusions!
IMO, we should not encourage users to touch the different implementations
of MessageId.
Instead, we should provide a `MessageIdUtils` in the client library (not
the interface)
with annotations `@InterfaceAudience.Public @InterfaceStability.Stable
Great discussion. I generally agree with the conclusions.
I'll add two points.
Several endpoints in the topics admin API require that message ids are
consistently serialized and deserialized over HTTP. For example, the
reset cursor call requires a message id in a correct format or
"latest" or "ea
I also changed my mind after I saw Flink's MessageIdUtils implementation.
Now it's clear to me that:
- For application users, the APIs in the pulsar-client-api module are
what they should use.
- For Pulsar ecosystem developers, the APIs in the pulsar-client
module are interfaces
So at the moment
Hi,
I was reading the email thread on why we want to change the MessageId interface:
https://lists.apache.org/thread/rdkqnkohbmkjjs61hvoqplhhngr0b0sd
>> Currently we have the following 5 implementations of MessageId:
>> These implementations are such a mess. For example, when users get a
MessageId from `
FWIW, the Flink Pulsar connector parses the message id internals in a hacky
way to get the next message id:
https://github.com/apache/flink/blob/421f057a7488fd64854a82424755f76b89561a0b/flink-connectors/flink-connector-pulsar/src/main/java/org/apache/flink/connector/pulsar/source/enumerator/cursor/MessageId
After reading Joe's comments I have changed my mind.
Actually it is better to not expose "ledgerId" and "entryId" to client
applications.
They are useless pieces of information.
And also if in the future we want to change the way we internally
address a message we will always have to support these
Hi Jiaqi,
> I don't think `toString` should be used in any serious case because it has
no standard.
I agree. But it's better to keep it unchanged. As I said in my previous reply, it
might be a de-facto standard because `toString()`-like methods are used
in logging, not only for debugging. For
Hi Joe,
I think the most controversial point is what a MessageId should be used for.
In your opinion, it should only be used as a comparable (opaque) object
that represents the position of a message [1]. My thinking was that
MessageId should be a wrapper of the MessageIdData in PulsarApi.p
Thanks, this is very inspiring to me.
But I have a different opinion on `toString`.
>>You can only see a representation from `toString` method and got some
output like "0:0:-1:0".
I don't think `toString` should be used in any serious case because it has
no standard. There is no constraint on ho
A MessageId is an identifier which identifies a message. How that id is
constructed, or what it contains should not matter to an application, and
an application should not assume anything about the implementation of that
id.
>What about the partition index? We have a `TopicMetadata` interface tha
Hi Haiting,
> But please make sure we keep it compatible with previous
implementations, like the `toString` method
Yeah, I agree, I will keep it compatible.
BTW, while I'm working on this, I found the MessageId implementations
are more complicated than I thought. The MessageIdImpl class
Overall, this makes sense to me.
The current status of MessageId is a bit messy, especially for client
developers and senior users who are interested in the implementation
details.
But please make sure we keep it compatible with previous
implementations, like the `toString` method, I bet so
Hi Joe,
Then what would we expect users to do with the MessageId? Should it only
be passed to Consumer#seek or ReaderBuilder#startMessageId?
What about the partition index? We have a `TopicMetadata` interface that returns
the number of partitions. If the partition is also "implementation
details"
>Maybe this design is meant to hide some details, but if
users don't know the details like ledger id and entry id, how could
you know what "0:0:-1:0" means?
Abstractions exist for a reason. Ledger id and entry id are implementation
details, and an application should not be interpreting them at all
I didn't look into these two methods at the moment. But I think it's possible to
retain only the `fromByteArray`.
Thanks,
Yunze
On Tue, Nov 8, 2022 at 7:02 PM Enrico Olivelli wrote:
>
> On Tue, Nov 8, 2022 at 11:52 AM Yunze Xu
> wrote:
> >
> > Hi Enrico,
> >
> > > We also need a wa
On Tue, Nov 8, 2022 at 11:52 AM Yunze Xu
wrote:
>
> Hi Enrico,
>
> > We also need a way to represent this as a String or a byte[]
>
> We already have the `toByteArray` method, right?
Yes, correct. So we are fine. I forgot about it and I answered too quickly.
I am not sure if this ca
Hi Enrico,
> We also need a way to represent this as a String or a byte[]
We already have the `toByteArray` method, right?
Thanks,
Yunze
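
As a rough illustration of the point above (a `byte[]` form is enough, and a String form can be derived from it), here is a minimal, self-contained sketch. `OpaqueId`, its three fields, and the fixed-width encoding are hypothetical stand-ins chosen for this example; they are not the Pulsar classes or Pulsar's wire format.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.Objects;

/** Hypothetical opaque id: callers only ever see the byte[] or String form. */
final class OpaqueId {
    private final long ledgerId;
    private final long entryId;
    private final int partitionIndex;

    OpaqueId(long ledgerId, long entryId, int partitionIndex) {
        this.ledgerId = ledgerId;
        this.entryId = entryId;
        this.partitionIndex = partitionIndex;
    }

    /** Assumed encoding for this sketch: two longs and one int, big-endian. */
    byte[] toByteArray() {
        return ByteBuffer.allocate(2 * Long.BYTES + Integer.BYTES)
                .putLong(ledgerId)
                .putLong(entryId)
                .putInt(partitionIndex)
                .array();
    }

    static OpaqueId fromByteArray(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        // Arguments are evaluated left to right, matching the write order above.
        return new OpaqueId(buf.getLong(), buf.getLong(), buf.getInt());
    }

    /** A String form (for example for use in a URL) derived from the bytes. */
    String toBase64() {
        return Base64.getUrlEncoder().encodeToString(toByteArray());
    }

    static OpaqueId fromBase64(String text) {
        return fromByteArray(Base64.getUrlDecoder().decode(text));
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof OpaqueId)) {
            return false;
        }
        OpaqueId other = (OpaqueId) o;
        return ledgerId == other.ledgerId
                && entryId == other.entryId
                && partitionIndex == other.partitionIndex;
    }

    @Override
    public int hashCode() {
        return Objects.hash(ledgerId, entryId, partitionIndex);
    }
}

class OpaqueIdDemo {
    public static void main(String[] args) {
        OpaqueId id = new OpaqueId(12L, 34L, -1);
        OpaqueId viaBytes = OpaqueId.fromByteArray(id.toByteArray());
        OpaqueId viaText = OpaqueId.fromBase64(id.toBase64());
        System.out.println(id.equals(viaBytes) && id.equals(viaText));
    }
}
```

With this shape, an application can store or transmit the id without ever reading the individual fields, which is the property the opaque-id argument in this thread relies on.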
On Tue, Nov 8, 2022 at 6:43 PM Enrico Olivelli wrote:
>
> On Tue, Nov 8, 2022 at 11:25 AM Yunze Xu
> wrote:
> >
> > Hi all,
> >
> > Currently w
On Tue, Nov 8, 2022 at 11:25 AM Yunze Xu
wrote:
>
> Hi all,
>
> Currently we have the following 5 implementations of MessageId:
>
> - MessageIdImpl: (ledger id, entry id, partition index)
> - BatchMessageIdImpl: adds (batch index, batch size, acker), where
> acker is a wrapper o
Hi all,
Currently we have the following 5 implementations of MessageId:
- MessageIdImpl: (ledger id, entry id, partition index)
- BatchMessageIdImpl: adds (batch index, batch size, acker), where
acker is a wrapper of a BitSet.
- ChunkMessageIdImpl: adds another MessageIdImpl that represen