Hi Sophie,
I am +1, for explicitly documenting that this list must be identical in
contents and order across all clients in the application in the KIP. And
later in the docs of the config. If we are too much concerned about this
and we think that we should explicitly check the order and content, we
could think about another encoding that is order independent or not
encode the tags at all.
Good point about upgrading and evolving. That seems related to the
encoding of the tags. If the encoding contained a bit more information,
we could use the intersection of the tags to evolve the tags of existing
applications. That means, we only use tags that are present on all
clients. Having an encoding that contains more information would also
give us the possibility to support removing, changing, and reordering of
tags.
I do not completely understand your question about the
"ClientTagsVersion". Could we not reuse the version of the subscription
data? What are the main benefits of introducing "ClientTagsVersion"?
Versioning a field of the protocol seems a bit too detailed for me, but
I could easily be missing something important here.
Regarding "standby.task.rack.aware.assignment.tags", we intentionally
tried to avoid the words "standby" and "replica" in this name to not
limit this config to standby tasks. In future, we may want to use the
same config also for active tasks. Admittedly, I do not yet know how we
would use this config for active tasks but I think it is better to keep
it generic as much as reasonable.
Best,
Bruno
On 15.03.21 23:07, Sophie Blee-Goldman wrote:
Hey Levani, thanks for the KIP! This looks great.
A few comments/questions about the "task.assignment.rack.awareness" config:
since this is used to determine the encoding of the client tags, we should
make sure to specify that this list must be identical in contents and order
across all clients in the application. Unfortunately there doesn't seem to
be a good way to actually enforce this, so we should call it out in the
config doc string at the least.
On that note, should it be possible for users to upgrade or evolve their
tags over time? For example if a user wants to leverage this feature for an
existing app, or add new tags to an app that already has some configured. I
think we would need to either enforce that you can only add new tags to the
end but never remove/change/reorder the existing ones, or else adopt a
similar strategy as to version probing and force all clients to remain on
the old protocol until everyone in the group has been updated to use the
new tags. It's fine with me if you'd prefer to leave this out of scope for
the time being, as long as we design this to be forwards-compatible as best
we can. Have you considered adding a "ClientTagsVersion" to the
SubscriptionInfo so that we're set up to extend this feature in the future?
In my experience so far, any time we *don't* version a protocol like this
we end up regretting it later.
Whatever you decide, it should be documented clearly -- in the KIP and in
the actual docs, ie upgrade guide and/or the config doc string -- so that
users know whether they can ever change the client tags on a running
application or not. (I think this is hinted at in the KIP, but not called
out explicitly)
By the way, I feel the "task.assignment.rack.awareness" config should have
the words "clients" and/or "tags" somewhere in it, otherwise it's a bit
unclear what it actually means. And maybe it should specify that it applies
to standby task placement only? Obviously we don't need to cover every
possible detail in the config name alone, but it could be a little more
specific. What about "standby.task.rack.aware.assignment.tags" or something
like that?
On Mon, Mar 15, 2021 at 2:12 PM Levani Kokhreidze <levani.co...@gmail.com>
wrote:
Hello all,
Bumping this thread as we are one binding vote short accepting this KIP.
Please let me know if you have any extra concerns and/or suggestions.
Regards,
Levani
On 12. Mar 2021, at 13:14, Levani Kokhreidze <levani.co...@gmail.com>
wrote:
Hi Guozhang,
Thanks for the feedback. I think it makes sense.
I updated the KIP with your proposal [1], it’s a nice optimisation.
I do agree that having the same configuration across Kafka Streams
instances is the reasonable requirement.
Best,
Levani
[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-708%3A+Rack+awareness+for+Kafka+Streams
<
https://cwiki.apache.org/confluence/display/KAFKA/KIP-708%3A+Rack+awareness+for+Kafka+Streams
On 12. Mar 2021, at 03:36, Guozhang Wang <wangg...@gmail.com <mailto:
wangg...@gmail.com>> wrote:
Hello Levani,
Thanks for the great write-up! I think this proposal makes sense,
though I
have one minor suggestion regarding the protocol format change: note the
subscription info is part of the group metadata message that we need to
write into the internal topic, and hence it's always better if we can
save
on the number of bytes written there. For this, I'm wondering if we can
encode the key part instead of writing raw bytes based on the
configurations, i.e.:
1. streams will look at the `task.assignment.rack.awareness` values, and
encode them in a deterministic manner, e.g. in your example zone = 0,
cluster = 1. This assumes that all instances will configure this value
in
the same way and then with a deterministic manner all instances will
have
the same encodings, which I think is a reasonable requirement.
2. the sent protocol would be "key => short, value => bytes" instead.
WDYT?
Otherwise, I'm +1 on the KIP!
Guozhang
On Thu, Mar 11, 2021 at 8:29 AM John Roesler <vvcep...@apache.org
<mailto:vvcep...@apache.org>> wrote:
Thanks for the KIP!
I'm +1 (binding)
-John
On Wed, 2021-03-10 at 13:13 +0200, Levani Kokhreidze wrote:
Hello all,
I’d like to start the voting on KIP-708 [1]
Best,
Levani
[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-708%3A+Rack+awareness+for+Kafka+Streams
<
https://cwiki.apache.org/confluence/display/KAFKA/KIP-708%3A+Rack+awareness+for+Kafka+Streams
--
-- Guozhang