Thanks, Konstantine. I'm not sure it's within the scope of this PR to address the `worker.id` MDC parameter, especially because the behavior is not yet well defined as to what happens when the `client.id` is set to a non-unique value. I guess I'd prefer to limit the scope to just the additional connector-specific contexts.
Best regards, Randall On Mon, Apr 15, 2019 at 2:03 PM Konstantine Karantasis < konstant...@confluent.io> wrote: > Thanks for the quick updates on the KIP Randall! > > 3. Indeed. I thought I remembered some ambiguity in the numbering of tasks, > but looking closer I think we are fine, especially now that you mention > explicitly 0-based indexing for task ids. > 4. Fine with the example as is. > 5. Sounds good. > 6. "client.id" is set by default. It's just that it's not unique across > the > connect cluster by default. I think it's ok to use it as a default value > for "worker.id" in MDC. Regarding placement, I'd consider adding it > wherever there's the least redundancy in logging and also make it > optional/configurable based on the logger's conversion pattern. > Up to you if you think there's room for inclusion in this KIP. > > Konstantine > > > > On Fri, Apr 12, 2019 at 3:24 PM Randall Hauch <rha...@gmail.com> wrote: > > > Thanks for the review and feedback, Konstantine. > > > > 1. Great suggestion. I've updated the KIP to hopefully make it more clear > > that the uncommented line is unchanged from the existing Log4J > > configuration file. > > > > 2. Regarding including a `-` before the task number is acceptable if it > > makes it easier to, read and filter. I've updated the KIP and PR to > > incorporate your suggestion. > > > > 3. Task numbers do start at 0, as seen by the DistributedHerder code that > > creates the tasks ( > > > > > https://github.com/apache/kafka/blob/02221bd907a23041c95ce6446986bff631652b3a/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java#L608-L611 > > ). > > I've updated the KIP to highlight that task numbers are 0-based. As you > can > > see in the sample log included in the KIP, "task-0" corresponds to the > > first task, not to the connector. In fact, the following are examples of > > the log messages with the "worker" scope where the Connector > implementation > > is called: > > > > [2019-04-02 17:01:38,315] INFO [local-file-source|worker] Creating > > connector local-file-source of type FileStreamSource > > (org.apache.kafka.connect.runtime.Worker:227) > > [2019-04-02 17:01:38,317] INFO [local-file-source|worker] Instantiated > > connector local-file-source with version 2.3.0-SNAPSHOT of type class > > org.apache.kafka.connect.file.FileStreamSourceConnector > > (org.apache.kafka.connect.runtime.Worker:230) > > ... > > > > [2019-04-02 17:11:46,408] INFO [local-file-sink|worker] Stopping > connector > > local-file-sink (org.apache.kafka.connect.runtime.Worker:334) > > [2019-04-02 17:11:46,408] INFO [local-file-sink|worker] Stopped connector > > local-file-sink (org.apache.kafka.connect.runtime.Worker:350) > > > > > > The FileStreamSourceConnector class does not actually log any messages > > itself, but if it did those would be with the > "[local-file-source|worker]" > > context and would include the name of the class and line number in > > parentheses (instead of e.g., > > "org.apache.kafka.connect.runtime.Worker:230"). > > > > 4. I thought about doing a more complex example, but IMO the extra log > > lines made the *sample* in the KIP quite a bit longer harder to > understand. > > I thought it important to keep the KIP a bit more readable while showing > > which scope appear for the different log messages. A second task would > > essentially have the nearly the same log messages, just with a task > number > > in the scope. > > > > 5. The PR already changes the > `runtime/src/test/resources/log4j.properties` > > to include the connector context MDC parameter. Because that's not a > public > > API, I've not mentioned it in the KIP. > > > > 6. I guess I understand your goal - it's not always clear which worker is > > being referenced. However, I'm not sure whether it's that valuable to > > include the "client.id" (if set) just in the "worker" scope. That means > > that it's maybe more useful to introduce a second MDC parameter (e.g., > "%{ > > worker.id}") and optionally include that on all log messages. We'd have > to > > set the MDC context in the code in each thread, which isn't too much > > effort. The other advantage of this approach is that it doesn't have to > be > > configurable: you can control it via your own logging configuration file > > (rather than optionally including it in the "worker" scope on some of the > > log messages). Thoughts? What would the "%{worker.id}" MDC value be set > to > > if "client.id" is not set? > > > > Final notes: Removed the placeholders, and corrected the typos and > grammar. > > > > Thanks again for the detailed review! > > > > On Fri, Apr 12, 2019 at 2:05 PM Konstantine Karantasis < > > konstant...@confluent.io> wrote: > > > > > Thanks for the KIP Randall. > > > > > > It might not be obvious right away, but this is a great improvement > when > > > running Connect with multiple connectors or when debugging Connect, for > > > instance in integration tests or system tests. KIP looks good to me > > > overall, I have just a few comments below: > > > > > > 1. In the snippet of the config, by including an uncommented line, > maybe > > > it's not immediately clear that this line is an existing line in > > > connect-log4j.properties and not an addition. Should this be mentioned > > in a > > > separate code block or in a different way? > > > > > > 2. Currently when adding the taskId to a connector or connector task, > we > > > precede it with a dash (-). I feel this addition would make it easier > to > > > parse the taskId visually as well as with parsing tools (for example I > > can > > > use `-` as a delimiter easier than 'k' or even a multi-character > > delimiter > > > such as 'task'). Do you think it's an overhead? > > > > > > 3. I think a specific mention to the convention used on the base-index > > for > > > taskIds would be useful. For example, taskId equal to zero in most > cases > > > represents the (source/sink) connector, while taskId > 0 corresponds to > > > (source/sink) tasks. By reading the KIP, I assume that task0 (or task-0 > > if > > > you agree with my previous comment) will be the connector. Is this the > > > case? > > > > > > 4. Should the example include a snippet for taskId > 0? > > > > > > 5. Do you intend to enable connector MDC in tests? In this case how > would > > > you distinguish between multiple workers in integrations tests? Do you > > > intend to address this here, in the PR corresponding to this KIP or > > follow > > > up later? > > > > > > 6. Related to my previous comment (5). The property 'client.id' is > > already > > > used to identify a Connect worker. Have you thought of offering the > > ability > > > to identify the worker MDC if this property is set, keeping at the same > > > time with MDC the ability to not include it as identifier? > > > > > > Finally, I'd suggest removing text placeholders. I think it will be > > easier > > > to follow the KIP. Currently I see placeholder text below Contents and > in > > > Rejected alternatives. > > > > > > And my least favorite section, but I can't help it :) > > > Typos/grammar: > > > to understand what (is) happening within the worker, > > > Kafka Connect use(s) ... > > > quotes maybe not needed, text already verbatim in > > > `config/connect-log4j.properties` > > > > > > Best, > > > Konstantine > > > > > > On Tue, Apr 2, 2019 at 4:20 PM Randall Hauch <rha...@gmail.com> wrote: > > > > > > > I've been working on https://github.com/apache/kafka/pull/5743 for a > > > > while, > > > > but there were a number of comment, suggestions, and mild concerns on > > the > > > > PR. One of those comments was that maybe changing the Connect log > > content > > > > in this way probably warrants a KIP. So here it is: > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-449%3A+Add+connector+contexts+to+Connect+worker+logs > > > > > > > > I've also updated my PR reflect the KIP. Please reply with comments > > > and/or > > > > feedback. > > > > > > > > Best regards, > > > > > > > > Randall > > > > > > > > > >