Hi Chris,

Thanks for the comments, replies inline:


On Sat, Mar 23, 2019 at 5:53 AM Chris Bartholomew
<c_bartholo...@yahoo.com.invalid> wrote:
>
> Hi,
>
> I have a few thoughts that are hopefully useful. I have some experience
> testing a similar feature for a proprietary messaging system and understand
> that system's design well; it is similar to this proposal.
>
> Would it make sense to have the snapshot frequency based on messages rather
> than time, or to allow both modes? Many messages can be processed in a
> second, which could create a lot of duplicates in the case of a failover.
> Using message counts would bound the number of duplicate messages and thus
> the number of messages that the end-user application would have to keep
> track of for de-duplication. It would impact performance compared to the
> timer solution, I assume, but would give the end user more control and
> might be worth the trade-off.

Sure, that's a good point, and it's perfectly possible to do. For many
features we have these dual constraints on messages and time. I just
wanted to start with the simplest implementation (and config options)
and then expand from there, since it's always easier to add more
options and knobs than to remove them later.
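
For illustration only, a dual bound could eventually look like a pair
of broker settings along these lines (the names are hypothetical, not
part of the current proposal):

    # Take a snapshot whenever either bound is reached, whichever
    # comes first (hypothetical knobs)
    replicatedSubscriptionsSnapshotFrequencyMillis=1000
    replicatedSubscriptionsSnapshotMaxMessages=1000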

> When there is a single consumer for the global subscription, the behavior is 
> clear and well documented here. What happens if multiple consumers are 
> consuming from the subscription at the same time on different clusters? 
> Presumably the implementation will have to handle the case where a message 
> has been consumed locally and then gets notified that the message has been 
> consumed remotely--maybe this will just work, I am not sure. In any case, 
> having multiple consumers on a global subscription is probably not what the 
> end user wants--they likely want a pure active-standby setup.
>
> Would it make sense to somehow mark a subscription as the active one, with 
> others being standby? That way, messages can only be consumed from one 
> subscription at a time, which makes sense in a disaster recovery scenario. This
> could be configurable by the operator to make the failover to the recovery 
> cluster controllable.

With this proposal, the behavior will be that:
 * If consumers are attached in multiple regions, they will see some
duplicates (e.g., a message is received in both regions)
 * Some messages might be acknowledged quickly in one region and
therefore automatically discarded in the other regions

To summarize:
 * It probably doesn't make sense to have consumers active in multiple
regions at the same time (on a replicated subscription)
 * Even if they are, it won't cause major harm (only some duplicates,
which the application can filter out; see the sketch below)
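
To make the duplicate handling concrete: an application that does
attach consumers in multiple regions could de-duplicate on its side,
for example keyed on the producer name plus the producer-assigned
sequence ID. A minimal sketch with the Java client (topic and
subscription names are placeholders; a real implementation would bound
the "seen" set):

    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;
    import org.apache.pulsar.client.api.Consumer;
    import org.apache.pulsar.client.api.Message;
    import org.apache.pulsar.client.api.PulsarClient;

    public class DedupConsumer {
        public static void main(String[] args) throws Exception {
            PulsarClient client = PulsarClient.builder()
                    .serviceUrl("pulsar://localhost:6650") // local region
                    .build();

            Consumer<byte[]> consumer = client.newConsumer()
                    .topic("my-geo-replicated-topic")
                    .subscriptionName("my-replicated-sub")
                    .subscribe();

            // Remember which messages were already processed, so the few
            // duplicates seen around a failover are dropped.
            Set<String> seen = ConcurrentHashMap.newKeySet();

            while (true) {
                Message<byte[]> msg = consumer.receive();
                String key = msg.getProducerName() + ":" + msg.getSequenceId();
                if (seen.add(key)) {
                    process(msg.getData()); // first occurrence: process it
                }
                consumer.acknowledge(msg);
            }
        }

        private static void process(byte[] payload) {
            // application logic goes here
        }
    }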

The main issue is that deciding which cluster is "active" would require
coordination across multiple regions. That goes against the main goal
of async replication, which is to isolate the clusters from each other
to the maximum extent possible, minimizing the chance of a single
malfunctioning cluster impacting the other clusters.

Even in the case of an operator tool, the failover decision would have
to be applied consistently across the multiple datacenters.

My take is that it's better to leave the consumer-cluster-failover
operation outside the scope of Pulsar itself, unless
we can find a way to do it in a precise, reliable and consistent way.
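
Just to illustrate what "outside the scope of Pulsar" could mean in
practice (everything here is hypothetical, not part of the proposal):
the operator's own coordination mechanism decides which region is
active, and the application in the standby region simply doesn't attach
its consumer until it is told to:

    import org.apache.pulsar.client.api.Consumer;
    import org.apache.pulsar.client.api.PulsarClient;

    public class StandbyGatedConsumer {
        public static void main(String[] args) throws Exception {
            // Hypothetical gate: a real deployment would query whatever
            // mechanism the operator runs (config service, DNS flag, ...).
            // Here it just reads an env var flipped during failover.
            if (!"true".equals(System.getenv("REGION_ACTIVE"))) {
                return; // standby region: don't attach any consumer
            }

            PulsarClient client = PulsarClient.builder()
                    .serviceUrl("pulsar://localhost:6650") // placeholder
                    .build();
            Consumer<byte[]> consumer = client.newConsumer()
                    .topic("my-geo-replicated-topic")
                    .subscriptionName("my-replicated-sub")
                    .subscribe();
            // ... consume as usual; the replicated subscription keeps the
            // other region's cursor close to this one's position, so a
            // later failover resumes near the same point.
        }
    }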

> Last thought is on message TTL. I assume that works similarly to actually
> consuming the message and will just work. You probably want to enable TTL
> on only one of the clusters and let it handle the expiry of messages for
> the whole group.

This change won't touch the current logic of TTL. Right now, messages
that have already expired won't be replicated, though that typically
only happens if some network issue between the clusters caused the
geo-replication to back up.

Under normal conditions, messages are replicated immediately after the
local publish, so the TTL is effectively applied in both clusters.
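
To make that last point concrete, TTL is configured as a namespace
policy; a minimal sketch with the Java admin client (the service URL,
namespace name and TTL value are placeholders):

    import org.apache.pulsar.client.admin.PulsarAdmin;

    public class SetMessageTtl {
        public static void main(String[] args) throws Exception {
            PulsarAdmin admin = PulsarAdmin.builder()
                    .serviceHttpUrl("http://localhost:8080") // placeholder
                    .build();
            // Expire unacknowledged messages after 1 hour (3600 seconds);
            // as noted above, under normal replication the TTL ends up
            // being applied in both clusters.
            admin.namespaces().setNamespaceMessageTTL("my-tenant/my-ns", 3600);
            admin.close();
        }
    }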
