On Mon, Apr 29, 2019 at 1:57 PM Joe F <joefranc...@gmail.com> wrote:
>
> I have suggestions for an alternate solution.
>
> If source message-ids were known for replicated messages, a composite
> cursor can be maintained for replicated subscriptions as an n-tuple.  Since
> messages are ordered from a source, it would be possible to restart from a
> known cursor n-tuple in any cluster by  a combination of cursor
> positioning  _and_ filtering

Knowing the source message id alone is not enough to establish the
order relationship across all the clusters. I think that would only
work in the 2 clusters scenario.

In general, for each message being written in region A, both locally
published or replicated from other regions, we'd have to associate the
highest (original from the source) message id. While that could be
easy in the simplest case (broker maintains hashmap of the highest
source message id from each region), it becomes more difficult to
consider failure cases in which we have to reconstruct that hashmap
from the log.

Also, that would require to modify each message before persisting it,
in the target region.

Finally, the N-tuple of message ids would have to either:
 * Have a mapping in the broker (local-message -> N-tuple)
 * Be pushed to consumers so that they will ack with the complete context

>
> A simple way to approximate this is for each cluster to insert its own
> ticker marks into the topic. A ticker carries the messsage id as the
> message body. The ticker mark can be inserted every 't ' time interval or
> every 'n' messages as needed.
>
> The n-tuple of the tickers from each cluster is a well-known state  that
> can be re-started anywhere by proper positioning and filtering
>
> That is a simpler solution for users to understand and trouble-shoot.  It
> would be resilient to cluster failures, and does NOT require all clusters
> to be up, to determine cursor position. No cross-cluster
> communication/ordering is needed.

I don't think this approach can remove the requirement of "all the
clusters to be up" because the information in A won't have any context
on what was exchanged between B and C.

One quick example:
 * Message c2 was replicated to B but not A. Then C goes offline.
 * A tuple will be something like (a3, b4, c1)
 * In region B, b4 was actually written *after* c2, but c2 doesn't get
sent from B to A, since it was already replicated itself.
 * If we do a failover from A to B considering a3 ~= b4 we would be
missing the message c2

Reply via email to