GitHub user StormLord07 edited a discussion: Guaranteeing reply delivery to the
same instance in KeyShared subscription without per-instance topics
Hello,
I’m trying to implement a request/response pattern between two programs using
Pulsar, but I want to avoid:
* Creating a large number of per-instance topics or subscriptions
* Duplicating messages to all instances
# Idea
* **Program A** may have multiple instances.
  * All share the same KeyShared subscription on a topic.
  * Each instance has its own logical “key” (e.g., `key_1`, `key_2`, …) and consumes only messages routed to it by message key hash range.
  * When sending a request, it uses a producer and sets a message property `reply_id = key_1` to indicate where to send the response.
* **Program B** also has multiple instances.
  * Consumes from the same request topic.
  * When replying, it publishes a message with `key = key_1`.
---
# Problem
If Program B publishes with `key = key_1`, can we guarantee that the message
will be routed back to the *same instance* of Program A that originally sent
the request?
Assumptions:
* Using KeyShared subscription mode
* Key stays consistent (`key_1` maps to the same consumer)
* Auto-split or sticky hash range policy is used
---
# Reasoning
* ConsumerExclusive: An app that is starting up is **not guaranteed** to find
the previous subscription freed (the broker may still consider the old consumer
name active). This forces us to create a new subscription each time, which
leads to topic/subscription bloat over time. Replication also tends to slow
things down a lot, since long-disconnected exclusive consumers sometimes stick
around and get replicated.
* If we try to use ConsumerShared, we risk the response being routed to an
instance of the program that is in the process of shutting down, resulting in
lost or undeliverable replies.
* ConsumerFailover seems like a good idea, but I am a bit fuzzy on how it
works. It seems even worse, since the message will most certainly be lost to
the old instance, which is still the master at that moment.
The other problem is that we can't know exactly which instance is being
restarted, but I may be wrong about that; I do not know exactly how they are
restarted.
---
Edit:
I’ve implemented STICKY consumers with a very narrow range. For example, I
calculate:
```c
int32_t range = (murmur3_32(<key>) % 65536) + 1;
```
Then I assign the range as `{range, range}`, ensuring that messages with the
same key always land on the same consumer.
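The slot calculation above can be made concrete with a self-contained sketch. It assumes `murmur3_32` is the standard 32-bit MurmurHash3 with seed 0 applied to the raw key bytes; that assumption and the helper name `sticky_slot` are not from the snippet above. Whether this matches the hash the broker applies should be checked against the client and broker versions in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t rotl32(uint32_t x, int r) { return (x << r) | (x >> (32 - r)); }

/* Standard MurmurHash3 (x86, 32-bit). Assumed to be the hash behind
   murmur3_32(<key>); blocks are read little-endian byte by byte. */
static uint32_t murmur3_32(const uint8_t *data, size_t len, uint32_t seed) {
    const uint32_t c1 = 0xcc9e2d51u, c2 = 0x1b873593u;
    uint32_t h = seed;
    size_t nblocks = len / 4;
    for (size_t i = 0; i < nblocks; i++) {
        const uint8_t *p = data + 4 * i;
        uint32_t k = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                     ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
        k *= c1; k = rotl32(k, 15); k *= c2;
        h ^= k; h = rotl32(h, 13); h = h * 5u + 0xe6546b64u;
    }
    /* Mix in the 1 to 3 trailing bytes, if any. */
    const uint8_t *tail = data + 4 * nblocks;
    uint32_t k = 0;
    switch (len & 3u) {
    case 3: k ^= (uint32_t)tail[2] << 16; /* fall through */
    case 2: k ^= (uint32_t)tail[1] << 8;  /* fall through */
    case 1: k ^= (uint32_t)tail[0];
            k *= c1; k = rotl32(k, 15); k *= c2; h ^= k;
    }
    /* Finalization: fold in the length and avalanche the bits. */
    h ^= (uint32_t)len;
    h ^= h >> 16; h *= 0x85ebca6bu;
    h ^= h >> 13; h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Hypothetical helper mirroring the one-liner above:
   (murmur3_32(key) % 65536) + 1, so the result lies in [1, 65536]. */
int32_t sticky_slot(const char *key) {
    assert(key != NULL);
    uint32_t h = murmur3_32((const uint8_t *)key, strlen(key), 0u);
    return (int32_t)(h % 65536u) + 1;
}
```

An instance would call `sticky_slot("key_1")` once at startup and register the sticky range `{slot, slot}`. Two instances whose keys produce the same slot would request the same range.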
However, as the number of program instances increases, we run into the birthday
paradox: the probability of collisions grows rapidly. When that happens, the
resulting bugs will be extremely difficult to diagnose.
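To put a number on the collision risk, here is a minimal sketch of the birthday calculation. It assumes each instance's slot is an independent, uniformly random pick out of 65536 values; the function name `collision_probability` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Probability that at least two of `instances` independent, uniformly
   random picks out of `slots` possible values land on the same value. */
double collision_probability(uint32_t instances, uint32_t slots) {
    assert(slots > 0);
    if (instances > slots) {
        return 1.0; /* pigeonhole: more instances than slots */
    }
    /* P(all distinct) = product over i = 1..n-1 of (1 - i/slots). */
    double p_unique = 1.0;
    for (uint32_t i = 1; i < instances; i++) {
        p_unique *= 1.0 - (double)i / (double)slots;
    }
    return 1.0 - p_unique;
}
```

Under these assumptions, `collision_probability(100, 65536)` comes out at roughly 7%, and the probability passes 50% at around 300 instances.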
# TLDR
The question is whether the KeyShared hash-range mapping is stable enough
between producer and consumer keys to make this safe, or whether we still need
per-instance topics/subscriptions for strict targeting. Or is there a
better/suggested way?
GitHub link: https://github.com/apache/pulsar/discussions/24616