To start, sorry if this breaks threading and/or ends up looking
hideous. I am new to using these mailing lists and was not subscribed,
so I can't seem to resume the thread from the latest message.

> > The proposal is two-fold: first, we would like to implement a new KeyShared
> > routing mechanism which tracks outstanding messages and their consumers,
> > routing messages with outstanding keys to the current consumer handling
> > that key, and any messages with new keys to any arbitrary consumer with
> > available permits (or perhaps the consumer with the most permits).
> > Basically a "first available consumer" routing strategy.  As far as my
> > naive first attempt goes, I initially decided to modify
> > the StickyKeyConsumerSelector::select call to accept an mledger.Position in
> > addition to the hash.  I also added a release call to notify the selector
> > of positions to consider no longer outstanding.  I could see this being
> > implemented any number of other ways as well (such as an entirely new
> > dispatcher), and would appreciate guidance should this proposal move
> > forward.
>
> If you are using the consistent hash consumer selector. You can try to
> add more replica points. But it also depends on your keys, if the number
> of messages for each key roughly the same and how many keys you have.
>
We are using the consistent hash selector today, and have tried several
different values for replica point count, but none up to 2000 has been
satisfactory.  As you can imagine, at our consumer count and rate of
consumer churn, increasing the point count beyond that also hurts
performance a decent bit.

As for our key distribution, I wouldn't call it perfectly even, but it
is even enough.  Internal metrics in our consumers show that the issue is
not strictly a "hot key": the hot consumers are still receiving a high
cardinality of keys, but perhaps that swath of keys is harder to work on
or more numerous at that time for some reason.  Hence the requirement to
be able to route work away from those busy workers ad hoc.

> I just thought about it(available permits-based selector) roughly. The
> available
> permits are unstable. But after the key is assigned to a consumer, the
> relationship will not change, right?
>
I'm not sure what you mean by "unstable"; could you clarify a bit?  In
implementing the POC, though, I have run into the issue that a large group
of messages is routed to consumers before any are sent and deducted from
the available permit counts.  A two-step process of grouping messages by
hash and then selecting a consumer during iteration/sending would suffice,
but perhaps this leans more towards being a new dispatcher?  Not sure yet.
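
A minimal sketch of the two-step process described above, for
illustration only.  The class and method names are hypothetical and are
not Pulsar APIs; messages are reduced to their key hashes, consumers to
names, and keys that already have messages in flight are ignored here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not part of Pulsar.
class TwoStepDispatchSketch {
    /**
     * @param hashes  key hash of each message in the batch, in arrival order
     * @param permits available permits per consumer before the batch
     * @return consumer name -> indexes of the messages it would be sent
     */
    static Map<String, List<Integer>> dispatch(List<Integer> hashes,
                                               Map<String, Integer> permits) {
        // Step 1: group message indexes by hash, keeping arrival order.
        Map<Integer, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < hashes.size(); i++) {
            groups.computeIfAbsent(hashes.get(i), h -> new ArrayList<>()).add(i);
        }

        // Step 2: pick a consumer per group and deduct permits right away,
        // so the next group sees the updated counts.
        Map<String, Integer> remaining = new LinkedHashMap<>(permits);
        Map<String, List<Integer>> out = new LinkedHashMap<>();
        for (List<Integer> group : groups.values()) {
            String best = null;
            int bestPermits = 0;
            for (Map.Entry<String, Integer> e : remaining.entrySet()) {
                if (e.getValue() > bestPermits) {
                    best = e.getKey();
                    bestPermits = e.getValue();
                }
            }
            if (best == null) {
                break; // no consumer has permits left; stop dispatching
            }
            // Send only as many messages as the consumer has permits for.
            int n = Math.min(group.size(), bestPermits);
            out.computeIfAbsent(best, c -> new ArrayList<>())
               .addAll(group.subList(0, n));
            remaining.put(best, bestPermits - n);
        }
        return out;
    }
}
```

With hashes [1, 2, 1, 3] and permits a=2, b=1, both messages for hash 1
go to a, the message for hash 2 goes to b, and the message for hash 3 is
left for a later pass because no permits remain.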

In response to your question about keys being assigned to consumers not
changing, no, they will be free to change as soon as that consumer no longer
has any unacknowledged messages with that key/hash.  This allows keys to
reroute to less busy consumers dynamically while still maintaining
serial/in-order message processing by key.
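
To make the release behaviour concrete, here is a rough sketch of a
selector that keeps a hash pinned to a consumer only while that consumer
has unacknowledged messages for it.  Everything in it is an assumption
for illustration: the names are hypothetical, it is not the POC code or
Pulsar's StickyKeyConsumerSelector, and it counts outstanding messages
per hash instead of tracking individual positions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not part of Pulsar.
class OutstandingKeySelector {
    // Current owner of a hash and how many unacked messages it holds for it.
    private static final class Assignment {
        final String consumer;
        int outstanding;
        Assignment(String consumer) { this.consumer = consumer; }
    }

    private final Map<Integer, Assignment> byHash = new HashMap<>();
    private final Map<String, Integer> permits = new HashMap<>();

    // Permits are assumed to be refreshed by the caller (e.g. on flow requests).
    void setPermits(String consumer, int available) {
        permits.put(consumer, available);
    }

    /** Returns the consumer for this hash, or null if it cannot be sent now. */
    String select(int hash) {
        Assignment a = byHash.get(hash);
        if (a == null) {
            // New key: any consumer will do, here the one with the most permits.
            String chosen = consumerWithMostPermits();
            if (chosen == null) {
                return null;
            }
            a = new Assignment(chosen);
            byHash.put(hash, a);
        } else if (permits.getOrDefault(a.consumer, 0) <= 0) {
            // Key is in flight on a busy consumer: hold it to keep ordering.
            return null;
        }
        a.outstanding++;
        permits.merge(a.consumer, -1, Integer::sum);
        return a.consumer;
    }

    /** Called on ack; once nothing is outstanding the hash is free to move. */
    void release(int hash) {
        Assignment a = byHash.get(hash);
        if (a != null && --a.outstanding == 0) {
            byHash.remove(hash);
        }
    }

    private String consumerWithMostPermits() {
        String best = null;
        int bestPermits = 0;
        for (Map.Entry<String, Integer> e : permits.entrySet()) {
            if (e.getValue() > bestPermits) {
                best = e.getKey();
                bestPermits = e.getValue();
            }
        }
        return best;
    }
}
```

While a hash has outstanding messages, select keeps returning the same
consumer even if another consumer gains more permits.  After release has
been called for every outstanding message, the next select for that hash
is free to pick a different, less busy consumer.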

I've also considered another potential issue, though I believe it exists
today: topic unloading.  Once the topic is reloaded, any messages in flight
will be redelivered to new consumers, yet without strict transactions,
existing consumers will likely continue to work on the messages they have
in progress at the time.  Perhaps this isn't worth worrying about, but I did
consider it and I don't have any good solutions in mind.

> > Second, I believe the ability to choose a KeyShared routing scheme and
> > perhaps the settings for that scheme should be configurable as a
> > namespace/topic policy and not just in the broker config.  I have not begun
> > work on implementing that at all yet, but would assume it is not too
> > complicated to do so (though the settings construct may be more freeform
> > than expected).
>
> Yes, that looks good.
>
> Thanks,
> Penghui

Thanks for your help so far,
-Tim Corbett
