[ https://issues.apache.org/jira/browse/CAMEL-21302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891046#comment-17891046 ]
Freeman Yue Fang commented on CAMEL-21302:
------------------------------------------

Hi [~jpoth], [~davsclaus],

I have an update on this issue after reading the code in camel-tracing and camel-opentelemetry.

1. For the scenario where a Camel producer endpoint sends the request asynchronously and handles the response in a different thread, camel-tracing and camel-opentelemetry already have a mechanism to accommodate it. There is an ExchangeAsyncStartedEvent that calls into org.apache.camel.tracing.ActiveSpanManager when an asynchronous Camel producer endpoint (for example, a camel-cxf producer) sends its request asynchronously:
{code}
    /**
     * If the underlying span is active, closes its scope without ending the span. This method should be called after
     * async execution is started on the same thread on which span was activated. ExchangeAsyncStartedEvent is used to
     * notify about it.
     *
     * @param exchange The exchange
     */
    public static void endScope(Exchange exchange) {
        Holder holder = exchange.getProperty(ExchangePropertyKey.ACTIVE_SPAN, Holder.class);
        if (holder != null) {
            holder.closeScope();
        }
    }
{code}
Together, the ExchangeAsyncStartedEvent and the endScope method in org.apache.camel.tracing.ActiveSpanManager ensure that the producer CLIENT scope is closed on the expected thread.

2. Why is the issue exposed in AsyncCxfTest.java? The problem actually comes from the DirectProducer used in the test case.
The leaked context actually comes from the DirectProducer, not from the CxfRsProducer (it is the direct CLIENT span, not the cxfrs CLIENT span):
{code}
There must be no leaking span after test ==> expected: <{}> but was: <{opentelemetry-trace-span-key=SdkSpan{traceId=c29955d3b94b5673cc9ab23f43938a92, spanId=d6fa2ad9cdd6c636, parentSpanContext=ImmutableSpanContext{traceId=c29955d3b94b5673cc9ab23f43938a92, spanId=c0f0bed8f3615a2d, traceFlags=01, traceState=ArrayBasedTraceState{entries=[]}, remote=false, valid=true}, name=send, kind=CLIENT, attributes=AttributesMap{data={url.scheme=direct, camel.uri=direct://send, url.path=send, component=camel-direct}, capacity=128, totalAddedValues=4}, status=ImmutableStatusData{statusCode=UNSET, description=}, totalRecordedEvents=0, totalRecordedLinks=0, startEpochNanos=1729272514478877531, endEpochNanos=1729272514491415078}}>
{code}
Now take a look at the related code in DirectProducer.java:
{code}
        } else {
            // the consumer may be forced synchronous
            if (consumer.getEndpoint().isSynchronous()) {
                consumer.getProcessor().process(exchange);
                callback.done(true);
                return true;
            } else {
                return consumer.getAsyncProcessor().process(exchange, callback);
            }
        }
{code}
In the code above, the line
{code}
return consumer.getAsyncProcessor().process(exchange, callback);
{code}
calls the next DirectConsumer endpoint in the same thread, which creates the direct INTERNAL span/scope. This disturbs the scope/span stack, so the earlier camel-direct producer CLIENT scope can't be closed properly, and that causes the leak.

I just sent a PR:
https://github.com/apache/camel/pull/16008

In that PR
https://github.com/apache/camel/pull/16008/commits/b9bf81b24c8a3c24117cc5454cafac906e8dd136#diff-62012c8f6e9737b3fe210a2abbb42fee51e22315ad31b82caf9df4ba3c53e9f5
I added CurrentSpanTest.testDirectToDirectToAsync, which exposes the same problem even without a cxfrs endpoint involved.
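
To illustrate how a same-thread consumer scope can prevent the producer scope from being closed, here is a minimal sketch in plain Java. It is not Camel or OpenTelemetry code: ScopeLeakSketch, Scope and the span labels are hypothetical names, and it assumes a thread-local context storage that ignores close() on a scope that is no longer the current one. Under that assumption, closing the CLIENT scope while the INTERNAL scope is still open has no effect, and closing the INTERNAL scope afterwards puts the CLIENT span back as the current context.

```java
/**
 * Simplified, hypothetical model of thread-local tracing scopes.
 * Assumption: close() is ignored when the scope is not the current one.
 */
class ScopeLeakSketch {

    /** The "current span" of this thread; null means nothing is active. */
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    /** A scope remembers which span was current before it was opened. */
    static final class Scope implements AutoCloseable {
        private final String attached;
        private final String before;
        private boolean closed;

        Scope(String span) {
            this.attached = span;
            this.before = CURRENT.get();
            CURRENT.set(span);
        }

        /** Restores the previous span only if this scope is still the current one. */
        @Override
        public void close() {
            if (!closed && attached.equals(CURRENT.get())) {
                closed = true;
                CURRENT.set(before);
            }
            // Otherwise the call is silently ignored (the assumption stated above).
        }
    }

    /** Producer scope is closed while the consumer scope is still open on the same thread. */
    static String outOfOrderClose() {
        CURRENT.remove();
        Scope client = new Scope("CLIENT direct://send");     // direct producer
        Scope internal = new Scope("INTERNAL direct://send"); // direct consumer, same thread
        client.close();   // ignored: INTERNAL is the current span
        internal.close(); // restores CLIENT as the current span
        return CURRENT.get(); // the CLIENT span is left behind
    }

    /** Scopes are closed in strict reverse order of opening. */
    static String nestedClose() {
        CURRENT.remove();
        Scope client = new Scope("CLIENT direct://send");
        Scope internal = new Scope("INTERNAL direct://send");
        internal.close();
        client.close();
        return CURRENT.get(); // nothing is left behind
    }

    public static void main(String[] args) {
        System.out.println("out of order: " + outOfOrderClose());
        System.out.println("nested:       " + nestedClose());
    }
}
```

In this model the out-of-order case ends with the CLIENT span still attached to the thread, which matches the kind=CLIENT, camel.uri=direct://send span reported by the failing assertion, while the strictly nested case ends with an empty context.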

Best Regards
Freeman

> camel-opentelemetry context leak with direct async producer
> -----------------------------------------------------------
>
> Key: CAMEL-21302
> URL: https://issues.apache.org/jira/browse/CAMEL-21302
> Project: Camel
> Issue Type: Bug
> Components: camel-opentelemetry
> Reporter: John Poth
> Assignee: Freeman Yue Fang
> Priority: Major
>
> There seems to be a Otel context leak when using a CXF producer in async
> mode. This causes different requests to have the same _traceId._ As a
> workaround, setting _synchronous=true_ on the CXF producer resolves the
> issue. Here's a reproducer:
> {code:java}
>     @Override
>     protected RoutesBuilder createRouteBuilder() {
>         return new RouteBuilder() {
>             @Override
>             public void configure() {
>                 from("direct:start").routeId("myRoute")
>                         .to("direct:send")
>                         .end();
>                 from("direct:send")
>                         .log("message")
>                         .to("cxfrs:http://localhost:" + port1
>                             + "/rest/helloservice/sayHello?synchronous=false");
>                 // setting to 'true' resolves the issue
>                 restConfiguration()
>                         .port(port1);
>                 rest("/rest/helloservice")
>                         .post("/sayHello").routeId("rest-GET-say-hi")
>                         .to("direct:sayHi");
>                 from("direct:sayHi")
>                         .routeId("mock-GET-say-hi")
>                         .log("example")
>                         .to("mock:end");
>             }};
> {code}
>
> I've added the complete unit here:
> https://github.com/apache/camel/blob/7d83a62b8e442dc9ac6fd79b153192add940301e/components/camel-opentelemetry/src/test/java/org/apache/camel/opentelemetry/AsyncCxfTest.java

--
This message was sent by Atlassian Jira
(v8.20.10#820010)