[jira] [Commented] (KAFKA-18066) Misleading/mismatched StreamThread id in logging

Peter Lee (Jira) Sat, 23 Nov 2024 09:43:34 -0800


    [ 
https://issues.apache.org/jira/browse/KAFKA-18066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900609#comment-17900609
 ]


Peter Lee commented on KAFKA-18066:
-----------------------------------

Hi [~ableegoldman],
I’m working on this and would like to hear your thoughts. :D

If we move the *creation* logic directly into the {{StreamThread}} 
constructor, the tests that rely on mocks become harder to refactor. For 
reference, see the [current test 
cases|https://github.com/peterxcli/kafka/blob/2519e4af0c19d2540093c283f14dfe4111a5a21e/streams/src/test/java/org/apache/kafka/streams/processor/internals/StreamThreadTest.java#L1391-L1461].

To address this, I’m considering splitting the initialization process as 
follows:
 # Keep the {{StreamThread}} constructor focused on mandatory, static 
dependencies:
{code:java}
final StreamThread streamThread = new StreamThread(
    time,
    config,
    adminClient,
    streamsMetrics,
    topologyMetadata,
    threadId,
    logContext,
    referenceContainer.assignmentErrorCode,
    referenceContainer.nextScheduledRebalanceMs,
    referenceContainer.nonFatalExceptionsToHandle,
    shutdownErrorHook,
    streamsUncaughtExceptionHandler,
    cache::resize
); {code}

 # Add an {{initializeComponents}} method for setting up additional components:
{code:java}
streamThread.initializeComponents(
    mainConsumer,
    restoreConsumer,
    changelogReader,
    originalReset,
    taskManager,
    stateUpdater
);{code}

However, this approach requires removing the {{final}} modifier from the 
fields set in {{initializeComponents}}. While it simplifies testing with 
mocks, it leaves those fields mutable after construction, which could be a 
concern.
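
To illustrate the trade-off, here is a minimal sketch of the two-phase idea with a guard that partly offsets losing {{final}}. The class {{TwoPhaseThread}}, the single {{taskManager}} field typed as {{Runnable}}, and the {{initialized}} flag are hypothetical stand-ins for illustration only, not the real {{StreamThread}} code:

```java
import java.util.Objects;

// Hypothetical stand-in for StreamThread; not the actual Kafka class.
class TwoPhaseThread extends Thread {
    // Static dependency: set once in the constructor, so it can stay final.
    private final String threadId;

    // Late-bound component: set in initializeComponents, so it cannot be final.
    private Runnable taskManager;
    private boolean initialized = false;

    TwoPhaseThread(final String threadId) {
        super(Objects.requireNonNull(threadId, "threadId"));
        this.threadId = threadId;
    }

    // Second phase: may be called exactly once, before the thread does any work.
    synchronized void initializeComponents(final Runnable taskManager) {
        if (initialized) {
            throw new IllegalStateException(threadId + " is already initialized");
        }
        this.taskManager = Objects.requireNonNull(taskManager, "taskManager");
        initialized = true;
    }

    @Override
    public void run() {
        final Runnable components;
        synchronized (this) {
            if (!initialized) {
                throw new IllegalStateException(threadId + " was started before initializeComponents");
            }
            components = taskManager;
        }
        components.run();
    }
}
```

The guard only turns misuse into a runtime failure; unlike {{final}}, it gives no compile-time guarantee, which is the mutability concern mentioned above.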

I’d appreciate any suggestions or insights! Thanks!

> Misleading/mismatched StreamThread id in logging
> ------------------------------------------------
>
>                 Key: KAFKA-18066
>                 URL: https://issues.apache.org/jira/browse/KAFKA-18066
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Assignee: Peter Lee
>            Priority: Minor
>              Labels: newbie, newbie++
>
> While debugging a test application I was confused to see a number of log 
> lines where the StreamThread name appeared twice but had a different thread 
> id/index in the same message. For example:
> {code:java}
> [INFO ] 2024-11-19 04:59:14.541 
> [e2e-963c5b74-0353-4253-bdf2-b71881d9d9f2-StreamThread-1] StreamThread - 
> stream-thread [e2e-963c5b74-0353-4253-bdf2-b71881d9d9f2-StreamThread-3] 
> Creating thread producer client{code}
> Generally you would expect that the actual Logger prefix (the first thread 
> name, in this case StreamThread-1) is the same as the LogContext prefix (the 
> second thread name, ie the StreamThread-3 in this example). I dug into it and 
> figured out that this happens for all of the messages logged during the 
> StreamThread#create method, ie before the new thread is actually created. 
> What happened was StreamThread-1 had actually died, and started up a new 
> thread (StreamThread-3) to replace itself before shutting down. So we were 
> logging things _about_ StreamThread-3, but _from_ StreamThread-1.
> While this doesn't necessarily harm anyone, it's quite confusing to see and 
> requires extensive knowledge of Streams to understand (a) that it's not a 
> bug, and (b) which thread the messages are actually referring to. It also 
> makes things harder to parse and read – for example I often filter logs on 
> the Logger prefix to gather everything related to a particular thread and eg 
> the clients it owns. The name of the currently executing thread is more 
> reliable and gathers everything whereas not every logger is configured with 
> the LogContext prefix (eg `stream-thread 
> [e2e-963c5b74-0353-4253-bdf2-b71881d9d9f2-StreamThread-3]`). 
> We should move things out of the static StreamThread#create method and into 
> the thread constructor to make the logging consistent and reliable.
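
The mismatch described in the quoted issue can be reproduced outside of Streams. The sketch below is a hypothetical demo (the class and method names are not from Kafka): code that executes before a thread is started reports the calling thread's name, while code executed inside the new thread's body reports the new thread's name.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo; none of these names come from Kafka Streams.
class ThreadNameDemo {
    // Mimics a static factory method: it executes on the caller's thread,
    // before the new thread has been started.
    static String nameSeenInFactory() {
        return Thread.currentThread().getName();
    }

    // Mimics work done in the thread body: it executes on the new thread.
    static String nameSeenInRun(final String newThreadName) {
        final AtomicReference<String> seen = new AtomicReference<>();
        final Thread thread = new Thread(
            () -> seen.set(Thread.currentThread().getName()),
            newThreadName
        );
        thread.start();
        try {
            thread.join(); // join() makes the value written by the new thread visible here
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for " + newThreadName, e);
        }
        return seen.get();
    }

    public static void main(final String[] args) {
        System.out.println("factory ran on: " + nameSeenInFactory());
        System.out.println("thread body ran on: " + nameSeenInRun("StreamThread-3"));
    }
}
```

A logger pattern that prints the executing thread's name behaves like {{nameSeenInFactory}} for anything logged before the new thread starts, which is why the two names in the log line above can differ.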



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
