Thanks, Konstantine. I'm not sure it's within the scope of this PR to
address the `worker.id` MDC parameter, especially because the behavior is
not yet well defined as to what happens when the `client.id` is set to a
non-unique value. I guess I'd prefer to limit the scope to just the
additional connector-specific contexts.

Best regards,

Randall

On Mon, Apr 15, 2019 at 2:03 PM Konstantine Karantasis <
konstant...@confluent.io> wrote:

> Thanks for the quick updates on the KIP Randall!
>
> 3. Indeed. I thought I remembered some ambiguity in the numbering of tasks,
> but looking closer I think we are fine, especially now that you mention
> explicitly 0-based indexing for task ids.
> 4. Fine with the example as is.
> 5. Sounds good.
> 6. "client.id" is set by default. It's just that it's not unique across
> the
> connect cluster by default. I think it's ok to use it as a default value
> for "worker.id" in MDC. Regarding placement, I'd consider adding it
> wherever there's the least redundancy in logging and also make it
> optional/configurable based on the logger's conversion pattern.
> Up to you if you think there's room for inclusion in this KIP.
>
> Konstantine
>
>
>
> On Fri, Apr 12, 2019 at 3:24 PM Randall Hauch <rha...@gmail.com> wrote:
>
> > Thanks for the review and feedback, Konstantine.
> >
> > 1. Great suggestion. I've updated the KIP to hopefully make it more clear
> > that the uncommented line is unchanged from the existing Log4J
> > configuration file.
> >
> > 2. Regarding including a `-` before the task number is acceptable if it
> > makes it easier to, read and filter. I've updated the KIP and PR to
> > incorporate your suggestion.
> >
> > 3. Task numbers do start at 0, as seen by the DistributedHerder code that
> > creates the tasks (
> >
> >
> https://github.com/apache/kafka/blob/02221bd907a23041c95ce6446986bff631652b3a/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/distributed/DistributedHerder.java#L608-L611
> > ).
> > I've updated the KIP to highlight that task numbers are 0-based. As you
> can
> > see in the sample log included in the KIP, "task-0" corresponds to the
> > first task, not to the connector. In fact, the following are examples of
> > the log messages with the "worker" scope where the Connector
> implementation
> > is called:
> >
> > [2019-04-02 17:01:38,315] INFO [local-file-source|worker] Creating
> > connector local-file-source of type FileStreamSource
> > (org.apache.kafka.connect.runtime.Worker:227)
> > [2019-04-02 17:01:38,317] INFO [local-file-source|worker] Instantiated
> > connector local-file-source with version 2.3.0-SNAPSHOT of type class
> > org.apache.kafka.connect.file.FileStreamSourceConnector
> > (org.apache.kafka.connect.runtime.Worker:230)
> > ...
> >
> > [2019-04-02 17:11:46,408] INFO [local-file-sink|worker] Stopping
> connector
> > local-file-sink (org.apache.kafka.connect.runtime.Worker:334)
> > [2019-04-02 17:11:46,408] INFO [local-file-sink|worker] Stopped connector
> > local-file-sink (org.apache.kafka.connect.runtime.Worker:350)
> >
> >
> > The FileStreamSourceConnector class does not actually log any messages
> > itself, but if it did  those would be with the
> "[local-file-source|worker]"
> > context and would include the name of the class and line number in
> > parentheses (instead of e.g.,
> > "org.apache.kafka.connect.runtime.Worker:230").
> >
> > 4. I thought about doing a more complex example, but IMO the extra log
> > lines made the *sample* in the KIP quite a bit longer harder to
> understand.
> > I thought it important to keep the KIP a bit more readable while showing
> > which scope appear for the different log messages. A second task would
> > essentially have the nearly the same log messages, just with a task
> number
> > in the scope.
> >
> > 5. The PR already changes the
> `runtime/src/test/resources/log4j.properties`
> > to include the connector context MDC parameter. Because that's not a
> public
> > API, I've not mentioned it in the KIP.
> >
> > 6. I guess I understand your goal - it's not always clear which worker is
> > being referenced. However, I'm not sure whether it's that valuable to
> > include the "client.id" (if set) just in the "worker" scope. That means
> > that it's maybe more useful to introduce a second MDC parameter (e.g.,
> "%{
> > worker.id}") and optionally include that on all log messages. We'd have
> to
> > set the MDC context in the code in each thread, which isn't too much
> > effort. The other advantage of this approach is that it doesn't have to
> be
> > configurable: you can control it via your own logging configuration file
> > (rather than optionally including it in the "worker" scope on some of the
> > log messages). Thoughts? What would the "%{worker.id}" MDC value be set
> to
> > if "client.id" is not set?
> >
> > Final notes: Removed the placeholders, and corrected the typos and
> grammar.
> >
> > Thanks again for the detailed review!
> >
> > On Fri, Apr 12, 2019 at 2:05 PM Konstantine Karantasis <
> > konstant...@confluent.io> wrote:
> >
> > > Thanks for the KIP Randall.
> > >
> > > It might not be obvious right away, but this is a great improvement
> when
> > > running Connect with multiple connectors or when debugging Connect, for
> > > instance in integration tests or system tests. KIP looks good to me
> > > overall, I have just a few comments below:
> > >
> > > 1. In the snippet of the config, by including an uncommented line,
> maybe
> > > it's not immediately clear that this line is an existing line in
> > > connect-log4j.properties and not an addition. Should this be mentioned
> > in a
> > > separate code block or in a different way?
> > >
> > > 2. Currently when adding the taskId to a connector or connector task,
> we
> > > precede it with a dash (-). I feel this addition would make it easier
> to
> > > parse the taskId visually as well as with parsing tools (for example I
> > can
> > > use `-` as a delimiter easier than 'k' or even a multi-character
> > delimiter
> > > such as 'task'). Do you think it's an overhead?
> > >
> > > 3. I think a specific mention to the convention used on the base-index
> > for
> > > taskIds would be useful. For example, taskId equal to zero in most
> cases
> > > represents the (source/sink) connector, while taskId > 0 corresponds to
> > > (source/sink) tasks. By reading the KIP, I assume that task0 (or task-0
> > if
> > > you agree with my previous comment) will be the connector. Is this the
> > > case?
> > >
> > > 4. Should the example include a snippet for taskId > 0?
> > >
> > > 5. Do you intend to enable connector MDC in tests? In this case how
> would
> > > you distinguish between multiple workers in integrations tests? Do you
> > > intend to address this here, in the PR corresponding to this KIP or
> > follow
> > > up later?
> > >
> > > 6. Related to my previous comment (5). The property 'client.id' is
> > already
> > > used to identify a Connect worker. Have you thought of offering the
> > ability
> > > to identify the worker MDC if this property is set, keeping at the same
> > > time with MDC the ability to not include it as identifier?
> > >
> > > Finally, I'd suggest removing text placeholders. I think it will be
> > easier
> > > to follow the KIP. Currently I see placeholder text below Contents and
> in
> > > Rejected alternatives.
> > >
> > > And my least favorite section, but I can't help it :)
> > > Typos/grammar:
> > > to understand what (is) happening within the worker,
> > > Kafka Connect use(s) ...
> > > quotes maybe not needed, text already verbatim in
> > > `config/connect-log4j.properties`
> > >
> > > Best,
> > > Konstantine
> > >
> > > On Tue, Apr 2, 2019 at 4:20 PM Randall Hauch <rha...@gmail.com> wrote:
> > >
> > > > I've been working on https://github.com/apache/kafka/pull/5743 for a
> > > > while,
> > > > but there were a number of comment, suggestions, and mild concerns on
> > the
> > > > PR. One of those comments was that maybe changing the Connect log
> > content
> > > > in this way probably warrants a KIP. So here it is:
> > > >
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-449%3A+Add+connector+contexts+to+Connect+worker+logs
> > > >
> > > > I've also updated my PR reflect the KIP. Please reply with comments
> > > and/or
> > > > feedback.
> > > >
> > > > Best regards,
> > > >
> > > > Randall
> > > >
> > >
> >
>

Reply via email to