Hi Asaf,

> (the field is not generic,
> it's specifically named shadow_message_id).

As Penghui suggested, the field has been renamed to `message_id` for
potential generic use. :)


> The second problem is clients: Every such field will eventually trickle
> down to the clients, which will need to ignore that field. In my opinion,
> it makes it harder for the client's maintainers. Especially when the
> community goal is to expand and have many languages clients maintained by
> the community

Our current client implementation is already quite complex. Compared with
that, ignoring a few fields does not seem particularly hard, as long as we
document them well, right?
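
To make the "ignoring a field" point concrete, here is a rough sketch of how
a decoder skips a field it does not know at the protobuf wire-format level.
This is not Pulsar client code: the helper names and field numbers 1, 2 and
99 are made up for illustration, and real clients normally get this behavior
from their generated protobuf code rather than writing it by hand.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def write_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_varint_field(field_number, value):
    """Encode a field with wire type 0 (varint)."""
    return write_varint((field_number << 3) | 0) + write_varint(value)


def encode_bytes_field(field_number, data):
    """Encode a field with wire type 2 (length-delimited)."""
    return write_varint((field_number << 3) | 2) + write_varint(len(data)) + data


def decode_known_fields(buf, known):
    """Return {field_number: value} for known fields; silently skip the rest."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        # An unknown field has already been consumed above; just do not keep it.
        if field_number in known:
            fields[field_number] = value
    return fields


# Hypothetical message: fields 1 and 2 are "public", field 99 is "internal".
msg = (encode_varint_field(1, 42)
       + encode_bytes_field(99, b"internal")
       + encode_bytes_field(2, b"payload"))
print(decode_known_fields(msg, known={1, 2}))
```

The decoder never needs to know what field 99 means; the wire type alone
tells it how many bytes to skip.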


> I believe someone who tries to reason about Pulsar, and its architecture,
> by looking at its public API should not have any fields which will never be
> relevant to the reader.  It makes it hard to reason and understand the
> public API.
> 

This design principle of keeping the public API clean is clear and easy to
understand, and I totally support it. But in the case of PIP-180 or
geo-replication, the replicator can be considered a special producer
client: it inherits the basic semantics of a normal producer and
extends its abilities to support some special internal usage.

Of course, in theory we could use a different protocol and a different port
strictly for inter-broker communication. But the side effects would be
more code, more machine resource usage, harder maintenance, and a longer time
to make the feature stable, compared with just extending the abilities of the
producer client.

If we run into a case where inter-broker communication is needed and it does
not fit the producer or consumer model, I think we should definitely consider
introducing a dedicated port and protocol.


Thanks,
Haiting

On 2022/07/20 15:47:16 Asaf Mesika wrote:
> Hi,
> 
> We started discussing in PIP-180, which Penghui recommended I move to a
> dedicated thread.
> 
> Pulsar has a public API in its binary protocol, which the clients use to
> communicate with it. Nonetheless, it is its public API to the server.
> 
> I believe the public API should not be changed for internal communication
> purposes. PIP-180 gives a really good example: We would like to introduce a
> new feature called Shadow Topic and would like to replicate messages from
> the source topic to the Shadow topic. It just so happens to be that the
> replication mechanism uses the Broker public API to send messages to a
> broker. The design would like to expand on that by adding a field to this
> public API, to serve that specific feature needs (the field is not generic,
> it's specifically named shadow_message_id).
> 
> I believe someone who tries to reason about Pulsar, and its architecture,
> by looking at its public API should not have any fields which will never be
> relevant to the reader.  It makes it hard to reason and understand the
> public API.
> 
> The second problem is clients: Every such field will eventually trickle
> down to the clients, which will need to ignore that field. In my opinion,
> it makes it harder for the client's maintainers. Especially when the
> community goal is to expand and have many languages clients maintained by
> the community
> 
> The public API today already contains many fields which are only for
> internal use. Here are a few that I found (please correct me if I'm wrong
> here):
> 
> // Property set on replicated message,
> // includes the source cluster name
> optional string replicated_from = 5;
> 
> // Override namespace's replication
> repeated string replicate_to    = 7;
> 
> // Identify whether a message is a "marker" message used for
> // internal metadata instead of application published data.
> // Markers will generally not be propagated back to clients
> optional int32 marker_type = 20;
> 
> 
> I would like to discuss that with you, get your feedback and whether you
> think it's correct to accept a decision to avoid changing the public API.
> 
> One alternative I was thinking about (I'm still fairly new, so I don't have
> all the experience and context here) is creating an internal non-public
> API, which will be used for internal communication: different proto,
> different port.
> 
> Thanks for your time,
> 
> Asaf
> 
