This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion.git
The following commit(s) were added to refs/heads/main by this push:
new 567ba75840 doc: Add an auto-generated dependency graph for internal
crates (#19280)
567ba75840 is described below
commit 567ba75840494170cbe7e50c695110d447426c8c
Author: Yongting You <[email protected]>
AuthorDate: Wed Jan 14 10:33:41 2026 +0800
doc: Add an auto-generated dependency graph for internal crates (#19280)
## Which issue does this PR close?
- Closes #.
## Rationale for this change
A dependency graph for the workspace member crates is often needed when
refactoring. I want it to be included in the docs, along with a script
that updates it automatically.
Here is the preview:
<img width="1203" height="951" alt="image"
src="https://github.com/user-attachments/assets/527c18fc-258e-465f-a150-f2aafe3e6db9"
/>
## What changes are included in this PR?
- adds a script that generates the dependency graph `deps.svg` and verifies
whether the existing one is up to date.
- adds a documentation page in the `Contributor Guide` to show this graph
- adds a CI job that checks whether the generated dependency graph is up to
date with the code.
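A minimal sketch of what an "is it up to date" check could look like, for readers who want to reproduce the idea locally. The helper name `check_up_to_date` and the way files are passed to it are hypothetical and not taken from this commit; the script in the diff below only generates the SVG, so the comparison step here is an illustrative assumption.

```shell
# Hypothetical helper (not part of this commit): compare a committed graph
# file against a freshly generated one, byte for byte.
check_up_to_date() {
  local committed="$1"
  local fresh="$2"

  # cmp -s prints nothing and only sets the exit status:
  # 0 when the files are identical, non-zero otherwise.
  if cmp -s "${committed}" "${fresh}"; then
    echo "up to date"
  else
    echo "stale: regenerate with docs/scripts/generate_dependency_graph.sh" >&2
    return 1
  fi
}
```

A CI step could call this after regenerating the graph into a temporary file. Rendered SVG output may differ between Graphviz versions, so comparing the intermediate DOT text is likely to be the more stable choice.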
## Are these changes tested?
I tested the dependency graph display locally; see the preview above.
Is it possible to see the preview from this PR's change online?
I also included a dummy crate in the initial commit to test whether CI
catches it and reports an understandable error message.
## Are there any user-facing changes?
No
---------
Co-authored-by: Martin Grigorov <[email protected]>
Co-authored-by: Jeffrey Vo <[email protected]>
---
.github/workflows/dependencies.yml | 2 +-
.github/workflows/docs.yaml | 6 +
.github/workflows/docs_pr.yaml | 7 +-
docs/.gitignore | 4 +
docs/README.md | 5 +
docs/build.sh | 9 +-
docs/scripts/generate_dependency_graph.sh | 97 +++++++++++
.../architecture/dependency-graph.md | 180 +++++++++++++++++++++
docs/source/index.rst | 1 +
9 files changed, 308 insertions(+), 3 deletions(-)
diff --git a/.github/workflows/dependencies.yml b/.github/workflows/dependencies.yml
index fef65870b6..f32eb7d2dd 100644
--- a/.github/workflows/dependencies.yml
+++ b/.github/workflows/dependencies.yml
@@ -66,4 +66,4 @@ jobs:
- name: Install cargo-machete
run: cargo install cargo-machete --version ^0.9 --locked
- name: Detect unused dependencies
- run: cargo machete --with-metadata
\ No newline at end of file
+ run: cargo machete --with-metadata
diff --git a/.github/workflows/docs.yaml b/.github/workflows/docs.yaml
index 3e2c48643c..b62055b13b 100644
--- a/.github/workflows/docs.yaml
+++ b/.github/workflows/docs.yaml
@@ -51,6 +51,12 @@ jobs:
python3 -m venv venv
source venv/bin/activate
pip install -r docs/requirements.txt
+ - name: Install dependency graph tooling
+ run: |
+ set -x
+ sudo apt-get update
+ sudo apt-get install -y graphviz
+ cargo install cargo-depgraph --version ^1.6 --locked
- name: Build docs
run: |
diff --git a/.github/workflows/docs_pr.yaml b/.github/workflows/docs_pr.yaml
index 81eeb4039b..784a33d4c5 100644
--- a/.github/workflows/docs_pr.yaml
+++ b/.github/workflows/docs_pr.yaml
@@ -54,10 +54,15 @@ jobs:
python3 -m venv venv
source venv/bin/activate
pip install -r docs/requirements.txt
+ - name: Install dependency graph tooling
+ run: |
+ set -x
+ sudo apt-get update
+ sudo apt-get install -y graphviz
+ cargo install cargo-depgraph --version ^1.6 --locked
- name: Build docs html and check for warnings
run: |
set -x
source venv/bin/activate
cd docs
./build.sh # fails on errors
-
diff --git a/docs/.gitignore b/docs/.gitignore
index a3adddc690..e73866cc0f 100644
--- a/docs/.gitignore
+++ b/docs/.gitignore
@@ -20,3 +20,7 @@ build/
venv/
.python-version
__pycache__/
+
+# Generated dependency graph artifacts (produced during docs CI)
+source/_static/data/deps.dot
+source/_static/data/deps.svg
diff --git a/docs/README.md b/docs/README.md
index c3d87ee8e8..0340a3b8bf 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -40,6 +40,11 @@ needing to create a virtual environment:
uv run --with-requirements requirements.txt bash build.sh
```
+The docs build regenerates the workspace dependency graph via
+`docs/scripts/generate_dependency_graph.sh`, so ensure `cargo`, `cargo-depgraph`
+(`cargo install cargo-depgraph --version ^1.6 --locked`), and Graphviz `dot`
+(`brew install graphviz` or `sudo apt-get install -y graphviz`) are available.
+
## Build & Preview
Run the provided script to build the HTML pages.
diff --git a/docs/build.sh b/docs/build.sh
index 9e4a118580..e12e3c1a5f 100755
--- a/docs/build.sh
+++ b/docs/build.sh
@@ -18,7 +18,14 @@
# under the License.
#
-set -e
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+cd "${SCRIPT_DIR}"
+
rm -rf build 2> /dev/null
+# Keep the workspace dependency graph in sync with the codebase.
+scripts/generate_dependency_graph.sh
+
make html
diff --git a/docs/scripts/generate_dependency_graph.sh b/docs/scripts/generate_dependency_graph.sh
new file mode 100755
index 0000000000..771f6f1932
--- /dev/null
+++ b/docs/scripts/generate_dependency_graph.sh
@@ -0,0 +1,97 @@
+#!/usr/bin/env bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# See `usage()` for details about this script.
+#
+# The key commands to generate the dependency graph SVG in this script are:
+# cargo depgraph ... | dot -Tsvg > deps.svg
+# See below for the exact command used.
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+REPO_DIR="$(cd "${SCRIPT_DIR}/../.." && pwd)"
+OUTPUT_DIR="${REPO_DIR}/docs/source/_static/data"
+SVG_OUTPUT="${OUTPUT_DIR}/deps.svg"
+
+usage() {
+ cat <<EOF
+Generate the workspace dependency graph SVG for the docs.
+
+'deps.svg' is embedded in the DataFusion docs (Contributor Guide → Architecture → Workspace Dependency Graph).
+
+Output:
+ SVG: ${SVG_OUTPUT}
+
+Usage: $(basename "$0")
+
+Options:
+ -h, --help Show this help message.
+EOF
+}
+
+while [[ $# -gt 0 ]]; do
+ case "$1" in
+ -h|--help)
+ usage
+ exit 0
+ ;;
+ *)
+ echo "Unknown option: $1" >&2
+ usage
+ exit 1
+ ;;
+ esac
+ shift
+done
+
+if ! command -v cargo >/dev/null 2>&1; then
+ echo "cargo is required to build the dependency graph." >&2
+ exit 1
+fi
+
+if ! command -v cargo-depgraph > /dev/null 2>&1; then
+ echo "cargo-depgraph is required (install with: cargo install cargo-depgraph)." >&2
+ exit 1
+fi
+
+if ! command -v dot >/dev/null 2>&1; then
+ echo "Graphviz 'dot' is required to render the SVG." >&2
+ exit 1
+fi
+
+mkdir -p "${OUTPUT_DIR}"
+
+(
+ cd "${REPO_DIR}"
+ # Ignore utility crates only used by internal scripts
+ cargo depgraph \
+ --workspace-only \
+ --all-deps \
+ --dedup-transitive-deps \
+ --exclude gen,gen-common \
+ | dot \
+ -Grankdir=TB \
+ -Gconcentrate=true \
+ -Goverlap=false \
+ -Tsvg \
+ > "${SVG_OUTPUT}"
+)
+
+echo "Wrote dependency graph SVG to ${SVG_OUTPUT}"
diff --git a/docs/source/contributor-guide/architecture/dependency-graph.md b/docs/source/contributor-guide/architecture/dependency-graph.md
new file mode 100644
index 0000000000..be3502f48b
--- /dev/null
+++ b/docs/source/contributor-guide/architecture/dependency-graph.md
@@ -0,0 +1,180 @@
+<!---
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+# Workspace Dependency Graph
+
+This page shows the dependency relationships between DataFusion's workspace
+crates. This only includes internal dependencies, external crates like `Arrow` are not included
+
+The dependency graph is auto-generated by `docs/scripts/generate_dependency_graph.sh` to ensure it stays up-to-date, and the script now runs automatically as part of `docs/build.sh`.
+
+## Dependency Graph for Workspace Crates
+
+<!--
+ Below is an embedded .svg file, with interactive functionalities like drag/zoom-in/etc.
+ -->
+
+```{raw} html
+<div id="workspace-deps-wrapper" style="border:1px solid #d4d4d8;border-radius:10px;overflow:hidden;background:#fff;">
+ <div id="workspace-deps-inline" style="min-height:760px;width:100%;background:#f8fafc;overflow:hidden;padding:0;margin:0;">
+```
+
+```{eval-rst}
+.. raw:: html
+ :file: ../../_static/data/deps.svg
+```
+
+```{raw} html
+ </div>
+ <div style="padding:10px 12px;background:#f1f5f9;border-top:1px solid #e5e7eb;display:flex;justify-content:space-between;align-items:center;flex-wrap:wrap;gap:8px;">
+ <span style="color:#334155;font-size:0.95rem;">Interactive SVG (pan, zoom, search)</span>
+ <div style="display:flex;align-items:center;gap:6px;">
+ <button id="workspace-deps-zoom-out" type="button" style="padding:6px 10px;border:1px solid #cbd5e1;border-radius:6px;background:#fff;color:#334155;cursor:pointer;">−</button>
+ <button id="workspace-deps-zoom-in" type="button" style="padding:6px 10px;border:1px solid #cbd5e1;border-radius:6px;background:#fff;color:#334155;cursor:pointer;">+</button>
+ </div>
+ <a href="../../_static/data/deps.svg" target="_blank" rel="noopener"
+ style="font-weight:600;color:#2563eb;text-decoration:none;">Open SVG ↗</a>
+ </div>
+</div>
+<script>
+ (function () {
+ const host = document.getElementById("workspace-deps-inline");
+ if (!host) {
+ return;
+ }
+
+ const svg = host.querySelector("svg");
+ if (!svg) {
+ host.textContent = "Unable to load dependency graph.";
+ host.style.display = "flex";
+ host.style.alignItems = "center";
+ host.style.justifyContent = "center";
+ host.style.background = "#f8fafc";
+ return;
+ }
+
+ svg.removeAttribute("width");
+ svg.removeAttribute("height");
+ svg.style.width = "100%";
+ svg.style.height = "100%";
+ svg.style.cursor = "grab";
+ svg.style.touchAction = "none";
+
+ const rawViewBox = (svg.getAttribute("viewBox") || "").split(/\s+/).map(Number);
+ if (rawViewBox.length !== 4 || rawViewBox.some((v) => Number.isNaN(v))) {
+ return;
+ }
+
+ const initial = {
+ x: rawViewBox[0],
+ y: rawViewBox[1],
+ width: rawViewBox[2],
+ height: rawViewBox[3],
+ };
+
+ const state = { ...initial };
+ const applyViewBox = () => {
+ svg.setAttribute("viewBox", `${state.x} ${state.y} ${state.width} ${state.height}`);
+ };
+
+ let isPanning = false;
+ let last = { x: 0, y: 0 };
+
+ svg.addEventListener("pointerdown", (event) => {
+ isPanning = true;
+ last = { x: event.clientX, y: event.clientY };
+ svg.setPointerCapture(event.pointerId);
+ svg.style.cursor = "grabbing";
+ });
+
+ const endPan = (event) => {
+ if (event && svg.hasPointerCapture(event.pointerId)) {
+ svg.releasePointerCapture(event.pointerId);
+ }
+ isPanning = false;
+ svg.style.cursor = "grab";
+ };
+
+ svg.addEventListener("pointerup", endPan);
+ svg.addEventListener("pointerleave", endPan);
+ svg.addEventListener("pointercancel", endPan);
+
+ const zoomBy = (factor) => {
+ const targetWidth = state.width * factor;
+ const targetHeight = state.height * factor;
+ const minSize = Math.max(initial.width * 0.05, 10);
+ const maxSize = initial.width * 20;
+ const clampedWidth = Math.min(Math.max(targetWidth, minSize), maxSize);
+ const clampedHeight = Math.min(Math.max(targetHeight, minSize), maxSize);
+
+ state.x += (state.width - clampedWidth) / 2;
+ state.y += (state.height - clampedHeight) / 2;
+ state.width = clampedWidth;
+ state.height = clampedHeight;
+ applyViewBox();
+ };
+
+ const normalizeDelta = (deltaY, deltaMode) => {
+ // Make trackpad/wheel zoom feel smooth across devices.
+ const multiplier = deltaMode === 1 ? 16 : deltaMode === 2 ? window.innerHeight : 1;
+ return deltaY * multiplier;
+ };
+
+ svg.addEventListener("pointermove", (event) => {
+ if (!isPanning) {
+ return;
+ }
+ const scaleX = state.width / svg.clientWidth;
+ const scaleY = state.height / svg.clientHeight;
+ state.x -= (event.clientX - last.x) * scaleX;
+ state.y -= (event.clientY - last.y) * scaleY;
+ last = { x: event.clientX, y: event.clientY };
+ applyViewBox();
+ });
+
+ svg.addEventListener("wheel", (event) => {
+ event.preventDefault();
+
+ const delta = normalizeDelta(event.deltaY, event.deltaMode);
+ const factor = Math.exp(delta * 0.0015); // smaller magnitude for smoother scrolling
+ zoomBy(factor);
+ }, { passive: false });
+
+ const zoomIn = document.getElementById("workspace-deps-zoom-in");
+ const zoomOut = document.getElementById("workspace-deps-zoom-out");
+ if (zoomIn) {
+ zoomIn.addEventListener("click", () => zoomBy(0.9));
+ }
+ if (zoomOut) {
+ zoomOut.addEventListener("click", () => zoomBy(1.1));
+ }
+ })();
+</script>
+```
+
+### Legend
+
+- black lines: normal dependency
+- blue lines: dev-dependency
+- green lines: build-dependency
+- dotted lines: optional dependency (could be removed by disabling a cargo feature)
+
+Transitive dependencies are intentionally ignored to keep the graph readable.
+
+The dependency graph is generated through `cargo depgraph` by `docs/scripts/generate_dependency_graph.sh`.
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 181d54a664..ae210de099 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -159,6 +159,7 @@ To get started, see
contributor-guide/communication
contributor-guide/development_environment
contributor-guide/architecture
+ contributor-guide/architecture/dependency-graph
contributor-guide/testing
contributor-guide/api-health
contributor-guide/howtos