On Tue, Jun 29, 2021, at 07:22, Magnus Edenhill wrote: > Den tors 17 juni 2021 kl 00:52 skrev Colin McCabe <cmcc...@apache.org>: > > A few critiques: > > > > - As I wrote above, I think this could benefit a lot by being split into > > several RPCs. A registration RPC, a report RPC, and an unregister RPC seem > > like logical choices. > > > > Responded to this in your previous mail, but in short I think a single > request is sufficient and keeps the implementation complexity / state down. >
Hi Magnus, I still suspect that trying to do everything with a single RPC is more complex than using multiple RPCs. Can you go into more detail about how the client learns what metrics it should send? This was the purpose of the "registration" step in my scheme above. It seems quite awkward to combine an RPC for reporting metrics with and RPC for finding out what metrics are configured to be reported. For example, how would you build a tool to check what metrics are configured to be reported? Does the tool have to report fake metrics, just because there's no other way to get back that information? Seems wrong. (It would be a bit like combining createTopics and listTopics for "simplicity") > > - I don't think the client should be able to choose its own UUID. This > > adds complexity and introduces a chance that clients will choose an ID that > > is not unique. We already have an ID that the client itself supplies > > (clientID) so there is no need to introduce another such ID. > > > > The CLIENT_INSTANCE_ID (which is a combination of the client.id and a UUID) > is actually generated by the receiving broker on first contact. > The need for a new unique semi-random id is outlined in the KIP, but in > short; the client.id is not unique, and we need something unique that still > is prefix-matchable to the client.id so that we can add subscriptions > either using prefix-matching of just the client.id (which may match one or > more client instances), and exact matching which will match a one specific > client instance. Hmm... the client id is already sent in every RPC as part of the header. It's not necessary to send it again as part of one of the other RPC fields, right? More generally, why does the client instance ID need to be prefix-matchable? That seems like an implementation detail of the metrics collection system used on the broker side. Maybe someone wants to group by things other than client IDs -- perhaps client versions, for instance. By the same argument, we should put the client version string in the client instance ID, since someone might want to group by that. Or maybe we should include the hostname, and the IP, and, and, and.... You see the issue here. I think we shouldn't get involved in this kind of decision -- if we just pass a UUID, the broker-side software can group it or prefix it however it wants internally. > > - In general the schema seems to have a bad case of string-itis. UUID, > > content type, and requested metrics are all strings. Since these messages > > will be sent very frequently, it's quite costly to use strings for all > > these things. We have a type for UUID, which uses 16 bytes -- let's use > > that type for client instance ID, rather than a string which will be much > > larger. Also, since we already send clientID in the message header, there > > is no need to include it again in the instance ID. > > > > As explained above we need the client.id in the CLIENT_INSTANCE_ID. And I > don't think the overhead of this one string per request is going to be much > of an issue, > typical metric push intervals are probably in the >60s range. > If this becomes a problem we could use a per-connection identifier that the > broker translates to the client instance id before pushing metrics upwards > in the system. > This is actually an interesting design question -- why not use a per-TCP-connection identifier, rather than a per-client-instance identifier? If we are grouping by other things anyway (clientID, principal, etc.) on the server side, do we need to maintain a per-process identifier rather than a per-connection one? > > > - I think it would also be nice to have an enum or something for > > AcceptedContentTypes, RequestedMetrics, etc. We know that new additions to > > these categories will require KIPs, so it should be straightforward for the > > project to just have an enum that allows us to communicate these as ints. > > > > I'm thinking this might be overly constraining. The broker doesn't parse or > handle the received metrics data itself but just pushes it to the metrics > plugin, using an enum would require a KIP and broker upgrade if the metrics > plugin > supports a newer version of OTLP. > It is probably better if we don't strictly control the metric format itself. > Unfortunately, we have to strictly control the metrics format, because otherwise clients can't implement it. I agree that we don't need to specify how the broker-side code works, since that is pluggable. It's also reasonable for the clients to have pluggable extensions as well, but this KIP won't be of much use if we don't at least define a basic set of metrics that most clients can understand how to send. The open source clients will not implement anything more than what is specified in the KIP (or at least the AK one won't...) > > > > - Can you talk about whether you are adding any new library dependencies > > to the Kafka client? It seems like you'd want to add opencensus / > > opentelemetry, if we are using that format here. > > > > Yeah, as we get closer to concensus more implementation specific details > will be added to the KIP. > I'm not sure if OpenCensus adds any value to this KIP, to be honest. Their primary focus was never on the format of the data being sent (in fact, the last time they checked, they left the format up to each OpenCensus implementation). That may have changed, but I think it still has limited usefulness to us, since we have our own format which we have to use anyway. > > > > > - Standard client resource labels: can we send these only in the > > registration RPC? > > > > These labels are part of the serialized OTLP data, which means it would > need to be unpacked and repacked (including compression) by the broker (or > metrics plugin), which I believe is more costly than sending them for each > request. > Hmm, that data is about 10 fields, most of which are strings. It certainly adds a lot of overhead to resend it each time. I don't follow the comment about unpacking and repacking -- since the client registered with the broker it already knows all this information, so there's nothing to unpack or repack, except from memory. If it's more convenient to serialize it once rather than multiple times, that is an implementation detail of the broker side plugin, which we are not specifying here anyway. best, Colin > Thanks, > Magnus > > > > > >