Hi Chris, Thank you for your comments! I updated the KIP. I still need to add the example before/after log lines, will do that soon, but I addressed all the other points. 1. Added more details on thread renaming under Public Interfaces, removed pseudo code. 2. Removed the stale header - originally, client.id related changes were part of the KIP, and I failed removing all leftovers of that version. 3. Threads listed under Public Interfaces with current/proposed names. 4. Added a comment in the connect-log4j.properties, similar to the one introduced in KIP-449. We don't have a dedicated MM2 log4j config, not sure if we should introduce it here. 5. Clarified testing section - I think thread names should not be tested (they never were), but testing will focus on the new MDC context value.
Thanks, Daniel Chris Egerton <chr...@aiven.io.invalid> ezt írta (időpont: 2023. jún. 5., H, 16:46): > Hi Daniel, > > Thanks for the updates! A few more thoughts: > > 1. The "Proposed Changes" section seems a bit like a pseudocode > implementation of the KIP. We don't really need this level of detail; most > of the time, we're just looking for implementation details that don't > directly affect the user-facing changes proposed in the "Public Interfaces" > section but are worth mentioning because they (1) demonstrate how the > user-facing changes are made possible, (2) indirectly affect user-facing > behavior, or (3) go into more detail (like providing examples) about the > user-facing behavior. For this KIP, I think examples of user-facing > behavior (like before/after of thread names and log messages) and possibly > an official description of the scope of the changes (which threads are > going to be renamed and/or include the new MDC key, and which aren't?) are > all that we'd really need in this section; everything else is fairly clear > IMO. FWIW, the reason we want to discourage going into too much detail with > KIPs is that it can quickly devolve into code review over mailing list, > which can hold KIPs up for longer than necessary when the core design > changes they contain are already basically accepted by everyone. > > 2. The "MM2 distributed mode client.id and log change" header seems like > it > may no longer be accurate; the contents do not mention any changes to > client IDs. I might be missing something though; please correct me if I am. > > 3. Can you provide some before/after examples of what thread names and log > messages will look like? I'm wondering about the thread that the > distributed herder runs on, threads for connectors and tasks, and threads > for polling internal topics (which we currently manage with the > KafkaBasedLog class). It's fine if some of these are unchanged, I just want > to better understand the scope of the proposed changes and get an idea of > how they may appear to users. > > 4. There's no mention of changes to the default Log4j config files that we > ship. Is this intentional? I feel like we need some way for users to easily > discover this feature; if we're not going to touch our default Log4j config > files, is there another way that we can expect users to find out about the > new MDC key? > > 5. RE the "Test Plan" section: can you provide a little more detail of > which cases we'll be covering with the proposed unit tests? Will we be > verifying that the MDC context is set in various places? If so, which > places? And the same with thread names? (There doesn't have to be a ton of > detail, but a little more than "unit tests" would be nice 😄) > > Cheers, > > Chris > > On Mon, Jun 5, 2023 at 9:45 AM Dániel Urbán <urb.dani...@gmail.com> wrote: > > > I updated the KIP accordingly. Tried to come up with more generic terms > in > > the Connect code instead of referring to "flow" anywhere. > > Daniel > > > > Dániel Urbán <urb.dani...@gmail.com> ezt írta (időpont: 2023. jún. 5., > H, > > 14:49): > > > > > Hi Chris, > > > > > > Thank you for your comments! > > > > > > I agree that the toString based logging is not ideal, and I believe all > > > occurrences are within a proper logging context, so they can be > ignored. > > > If thread names can be changed unconditionally, I agree, using a new > MDC > > > key is the ideal solution. > > > > > > Will update the KIP accordingly. > > > > > > Thanks, > > > Daniel > > > > > > Chris Egerton <chr...@aiven.io.invalid> ezt írta (időpont: 2023. jún. > > 2., > > > P, 22:23): > > > > > >> Hi Daniel, > > >> > > >> Are there any cases of Object::toString being used by internal classes > > for > > >> logging context that can't be covered by MDC contexts? For example, > > >> anything logged by WorkerSinkTask (or any WorkerTask subclass) already > > has > > >> the MDC set for the task [1]. Since the Object::toString-prefixed > style > > of > > >> logging is a bit obsolete after the introduction of MDC contexts it'd > > feel > > >> a little strange to continue trying to accommodate it, especially if > the > > >> changes from this KIP are going to be opt-in regardless. > > >> > > >> As far as thread names go: unlike log statements, I don't believe > > changing > > >> them requires a KIP, and would be fine with merging a PR for that > > without > > >> worrying about compatibility. > > >> > > >> With that in mind, I'm still wondering if we can add a separate MDC > key > > >> for > > >> MM2 replication flows (perhaps "mirror.maker.flow") and > unconditionally > > >> add > > >> that to the MDC contexts for every thread that gets spun up by each > > >> DistributedHerder instance that MM2 creates. This would be different > > from > > >> the draft PR and KIP, which involve altering the content of the > existing > > >> "connector.context" MDC key. Users would opt in to the change by > > altering > > >> their Log4j config instead of altering their MM2 config, which matches > > >> precedent set with the introduction of the "connector.context" key. > > >> > > >> Thoughts? > > >> > > >> [1] - > > >> > > >> > > > https://github.com/apache/kafka/blob/146a6976aed0d9f90c70b6f21dca8b887cc34e71/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java#L255 > > >> > > >> Cheers, > > >> > > >> Chris > > >> > > >> On Tue, May 23, 2023 at 11:36 AM Dániel Urbán <urb.dani...@gmail.com> > > >> wrote: > > >> > > >> > Hi Chris, > > >> > > > >> > Thank you for your comments! > > >> > 1. This approach is OK for me. I thought that the "sample" configs > in > > >> the > > >> > repo do not fall into the same category as the default of a new > > config. > > >> > Adding a commented line instead should be ok, and the future opt-out > > >> change > > >> > sounds good to me. > > >> > > > >> > 2. Besides the MDC, there are 2 other places where the flow context > > >> > information could be added: > > >> > - Some of the Connect internal classes use their .toString methods > > when > > >> > logging (e.g. WorkerSinkTask has a line like this: log.info("{} > > >> Executing > > >> > sink task", this);). These toString implementations contain the > > >> connector > > >> > name, and nothing else, so in MM2 dedicated mode, adding the flow > > would > > >> > make these lines unique. > > >> > - Connect worker thread names. Currently, they contain the connector > > >> > name/task ID, but to make them unique, the flow should be added in > MM2 > > >> > distributed mode. > > >> > In my draft PR, I changed both of these, but for the sake of > backward > > >> > compatibility, I made the new toString/thread name dependent on the > > new, > > >> > suggested configuration flag. > > >> > If we go with a new MDC key, but we still want to change the > toString > > >> > methods and the thread names, we still might need an extra flag to > > turn > > >> on > > >> > the new behavior. > > >> > AFAIK the toString calls are all made inside a proper logging > context, > > >> so > > >> > changing the toString methods don't add much value. I think that the > > >> thread > > >> > name changes are useful, giving more context for log lines written > > >> outside > > >> > of a log context. > > >> > > > >> > In short, I think that MDC + thread name changes are necessary to > make > > >> MM2 > > >> > dedicated logs useful for diagnostics. If we keep both, then maybe > > >> having a > > >> > single config to control both at the same time is better than > having a > > >> new > > >> > MDC key (configured separately in the log pattern) and a config flag > > >> (set > > >> > separately in the properties file) for the thread name change. > > >> > > > >> > Thanks, > > >> > Daniel > > >> > > > >> > > > > > >