I updated the KIP accordingly. Tried to come up with more generic terms in the Connect code instead of referring to "flow" anywhere. Daniel
Dániel Urbán <urb.dani...@gmail.com> ezt írta (időpont: 2023. jún. 5., H, 14:49): > Hi Chris, > > Thank you for your comments! > > I agree that the toString based logging is not ideal, and I believe all > occurrences are within a proper logging context, so they can be ignored. > If thread names can be changed unconditionally, I agree, using a new MDC > key is the ideal solution. > > Will update the KIP accordingly. > > Thanks, > Daniel > > Chris Egerton <chr...@aiven.io.invalid> ezt írta (időpont: 2023. jún. 2., > P, 22:23): > >> Hi Daniel, >> >> Are there any cases of Object::toString being used by internal classes for >> logging context that can't be covered by MDC contexts? For example, >> anything logged by WorkerSinkTask (or any WorkerTask subclass) already has >> the MDC set for the task [1]. Since the Object::toString-prefixed style of >> logging is a bit obsolete after the introduction of MDC contexts it'd feel >> a little strange to continue trying to accommodate it, especially if the >> changes from this KIP are going to be opt-in regardless. >> >> As far as thread names go: unlike log statements, I don't believe changing >> them requires a KIP, and would be fine with merging a PR for that without >> worrying about compatibility. >> >> With that in mind, I'm still wondering if we can add a separate MDC key >> for >> MM2 replication flows (perhaps "mirror.maker.flow") and unconditionally >> add >> that to the MDC contexts for every thread that gets spun up by each >> DistributedHerder instance that MM2 creates. This would be different from >> the draft PR and KIP, which involve altering the content of the existing >> "connector.context" MDC key. Users would opt in to the change by altering >> their Log4j config instead of altering their MM2 config, which matches >> precedent set with the introduction of the "connector.context" key. >> >> Thoughts? >> >> [1] - >> >> https://github.com/apache/kafka/blob/146a6976aed0d9f90c70b6f21dca8b887cc34e71/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerTask.java#L255 >> >> Cheers, >> >> Chris >> >> On Tue, May 23, 2023 at 11:36 AM Dániel Urbán <urb.dani...@gmail.com> >> wrote: >> >> > Hi Chris, >> > >> > Thank you for your comments! >> > 1. This approach is OK for me. I thought that the "sample" configs in >> the >> > repo do not fall into the same category as the default of a new config. >> > Adding a commented line instead should be ok, and the future opt-out >> change >> > sounds good to me. >> > >> > 2. Besides the MDC, there are 2 other places where the flow context >> > information could be added: >> > - Some of the Connect internal classes use their .toString methods when >> > logging (e.g. WorkerSinkTask has a line like this: log.info("{} >> Executing >> > sink task", this);). These toString implementations contain the >> connector >> > name, and nothing else, so in MM2 dedicated mode, adding the flow would >> > make these lines unique. >> > - Connect worker thread names. Currently, they contain the connector >> > name/task ID, but to make them unique, the flow should be added in MM2 >> > distributed mode. >> > In my draft PR, I changed both of these, but for the sake of backward >> > compatibility, I made the new toString/thread name dependent on the new, >> > suggested configuration flag. >> > If we go with a new MDC key, but we still want to change the toString >> > methods and the thread names, we still might need an extra flag to turn >> on >> > the new behavior. >> > AFAIK the toString calls are all made inside a proper logging context, >> so >> > changing the toString methods don't add much value. I think that the >> thread >> > name changes are useful, giving more context for log lines written >> outside >> > of a log context. >> > >> > In short, I think that MDC + thread name changes are necessary to make >> MM2 >> > dedicated logs useful for diagnostics. If we keep both, then maybe >> having a >> > single config to control both at the same time is better than having a >> new >> > MDC key (configured separately in the log pattern) and a config flag >> (set >> > separately in the properties file) for the thread name change. >> > >> > Thanks, >> > Daniel >> > >> >