Chris M. Hostetter created SOLR-15783: -----------------------------------------
Summary: MDC Values can leak between request threads, trace_id currently does Key: SOLR-15783 URL: https://issues.apache.org/jira/browse/SOLR-15783 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Reporter: Chris M. Hostetter Assignee: Chris M. Hostetter Currently, the only general "MDC Cleanup" code that happens from request to request is done by {{MDCLoggingContext.clear()}} and {{MDCLoggingContext.reset()}} ... but these methods (in addition to only being concerned with a static list of 5 specific MDC values) are currently designed and used in such a way that things like "trace_id" are only "reset" at the _begining_ of {{HttpSolrCall.call()}} ... which means any request specific log messages that happen before that point (like authn errors) will incorrectly refer to the trace_id of whatever request the thread _previously_ processed. As a trivial example of this, if you apply a patch like... {noformat} diff --git a/solr/core/src/java/org/apache/solr/client/solrj/embedded/JettySolrRunner.java b/solr/core/src/java/org/apache/solr/client/solrj/embedded/JettySolrRunner.java index 5f481c48e92..97c864617f4 100644 --- a/solr/core/src/java/org/apache/solr/client/solrj/embedded/JettySolrRunner.java +++ b/solr/core/src/java/org/apache/solr/client/solrj/embedded/JettySolrRunner.java @@ -109,7 +109,7 @@ public class JettySolrRunner { private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); - private static final int THREAD_POOL_MAX_THREADS = 10000; + private static final int THREAD_POOL_MAX_THREADS = 10; // nocommit // NOTE: needs to be larger than SolrHttpClient.threadPoolSweeperMaxIdleTime private static final int THREAD_POOL_MAX_IDLE_TIME_MS = 260000; diff --git a/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java b/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java index 28167270099..fa6bb9a4eb8 100644 --- a/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java +++ b/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java @@ -484,6 +484,9 @@ public class SolrDispatchFilter extends BaseSolrFilter { response.sendError(429, errorMessage); } + log.error("nocommit: fake log message before HttpSolrCall.call() as if auth error happened: MDC_claims={} ; real_traceId={}", + org.slf4j.MDC.get(org.apache.solr.logging.MDCLoggingContext.TRACE_ID), span.context().toTraceId()); + AtomicReference<HttpServletRequest> wrappedRequest = new AtomicReference<>(); if (!authenticateRequest(request, response, wrappedRequest)) { // the response and status code have already been sent return; {noformat} ...and then run {{./gradlew test --tests TestDistributedTracing -Ptests.verbose=true}} you can pick out any single jetty query thread and see something like... {noformat} 2> 4103 ERROR (qtp1310066330-70) [] o.a.s.s.SolrDispatchFilter nocommit: fake log message before HttpSolrCall.call() as if auth error happened: MDC_claims=null ; real_traceId=1 2> 6146 ERROR (qtp1310066330-70) [n:127.0.0.1:34135_solr t:1] o.a.s.s.SolrDispatchFilter nocommit: fake log message before HttpSolrCall.call() as if auth error happened: MDC_claims=1 ; real_traceId=22 2> 6661 ERROR (qtp1310066330-70) [n:127.0.0.1:34135_solr t:22] o.a.s.s.SolrDispatchFilter nocommit: fake log message before HttpSolrCall.call() as if auth error happened: MDC_claims=22 ; real_traceId=50 {noformat} (Note also that during authn, the nodeId isn't populated for a 'new' thread, it's only set if it was leaked from the last request processed) -- This message was sent by Atlassian Jira (v8.20.1#820001) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org