geoffreyclaude opened a new pull request, #14547: URL: https://github.com/apache/datafusion/pull/14547
## Which issue does this PR close?

Relates to #9415. Does not fully close the issue, but moves forward with a prerequisite.

## Rationale for this change

This allows DataFusion to integrate with users of the [`tracing`](https://docs.rs/tracing/latest/tracing/) crate by propagating the trace context as users would expect, without investing in a full integration with the `tracing` ecosystem.

When the (new) `tracing` feature is enabled, all tasks spawned on new threads (e.g. those spawned during repartitioning or while reading/writing Parquet files) inherit the current tracing span. This makes it possible to propagate the trace context across thread boundaries, into external data sources or custom exec nodes, and to link all generated logs and spans to the expected trace context.

Previously, tasks spawned on new threads would lose the trace context, because it is thread-local and must be "manually" propagated to the new thread.

## What changes are included in this PR?

- Update the common runtime so that tasks spawned on new threads are instrumented with the current tracing span when the `tracing` feature is enabled, by wrapping `tokio::task::JoinSet` in a custom `JoinSet`.
- Add a new Cargo.toml feature (`tracing`) in the common-runtime crate, along with the necessary dependency updates.
- Provide an integration example in `datafusion-examples/examples/tracing.rs` that runs a SQL query over the `alltypes_tiny_pages_plain.parquet` file to demonstrate end-to-end propagation of the tracing context across multiple threads.
- Update the root `README.md` to reflect the availability and usage of the new `tracing` feature.

## Are these changes tested?

Yes. While there are no dedicated unit tests for this feature, the integration example in `datafusion-examples/examples/tracing.rs` serves as a comprehensive test. This example executes a query that triggers task spawns (such as through repartitioning and Parquet reading) and logs tracing output.
By reviewing the logs, one can verify that the tracing span context is correctly propagated end to end.

## Are there any user-facing changes?

No changes are expected for users who do not enable the `tracing` feature. The performance overhead *should* be nonexistent when the feature is disabled and negligible when enabled.
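
To illustrate the problem described in the rationale (thread-local context is lost on a new thread unless it is explicitly carried over), here is a minimal sketch that uses only the Rust standard library. It is an analogy, not the code from this PR: `CURRENT_SPAN`, `enter_span`, `current_span`, `spawn_in_current_span`, and `demo` are hypothetical names, and a plain `String` stands in for a real `tracing` span.

```rust
use std::cell::RefCell;
use std::thread;

thread_local! {
    // Hypothetical stand-in for the thread-local "current span" that a
    // tracing library keeps; each thread starts with no span.
    static CURRENT_SPAN: RefCell<Option<String>> = RefCell::new(None);
}

/// Marks `name` as the current span on the calling thread.
pub fn enter_span(name: &str) {
    CURRENT_SPAN.with(|s| *s.borrow_mut() = Some(name.to_string()));
}

/// Returns the current span of the calling thread, if any.
pub fn current_span() -> Option<String> {
    CURRENT_SPAN.with(|s| s.borrow().clone())
}

/// Spawns a thread that inherits the caller's span: the span is captured on
/// the spawning thread and re-installed on the new thread before `f` runs.
pub fn spawn_in_current_span<F, T>(f: F) -> thread::JoinHandle<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let parent = current_span(); // read on the spawning thread
    thread::spawn(move || {
        CURRENT_SPAN.with(|s| *s.borrow_mut() = parent);
        f()
    })
}

/// Returns what a plain spawn and a context-propagating spawn each observe.
pub fn demo() -> (Option<String>, Option<String>) {
    enter_span("query");
    // A plain spawn starts with an empty thread-local: the context is lost.
    let lost = thread::spawn(current_span).join().unwrap();
    // The wrapper carries the parent's span over to the new thread.
    let kept = spawn_in_current_span(current_span).join().unwrap();
    (lost, kept)
}
```

Calling `demo()` returns the span seen by a plain spawn and by the wrapper, in that order. The wrapper is the same idea, in miniature, as instrumenting every spawned task with the current span at a single spawn point, so callers do not have to remember to propagate it themselves.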