This is an automated email from the ASF dual-hosted git repository.

skrawcz pushed a commit to branch update_references
in repository https://gitbox.apache.org/repos/asf/hamilton.git
commit 0a17d39039d9bcfc9fcfc32ee1be23c23561d2d4
Author: Stefan Krawczyk <[email protected]>
AuthorDate: Fri Jun 13 23:30:00 2025 -0700

    Adds ability to build and publish docs

    This does a few things:

    1. Adds .github workflow to build things
    2. Adds .asf.yaml that specifies that a site should be built and deployed.

    No real design decisions, other than this seems to work and is low effort.

    Squashed commits below:

    Adds to asf.yaml and attempts to write to docs to branch (+4 squashed commits)
    Squashed commits:
    [e44ca55e] Adds incubating reference
    [486c57fc] Adds pushing to asf-* branches

    Will see if this works.
    [4312df53] Fixes docs and creates PDF
    [919ccc22] Adds github workflow to build docs

    WIP commit.
---
 .github/workflows/sphinx-docs.yml                  | 146 +++++++++++++++++++++
 README.md                                          |   2 +-
 docs/code-comparisons/airflow.rst                  |   2 +-
 docs/code-comparisons/dagster.rst                  |   4 +-
 docs/code-comparisons/kedro.rst                    |   8 +-
 docs/code-comparisons/langchain.rst                |  16 +--
 docs/concepts/best-practices/function-naming.rst   |   2 +-
 .../best-practices/migrating-to-hamilton.rst       |   4 +-
 docs/get-started/index.rst                         |   2 +-
 docs/hamilton-ui/index.rst                         |   4 +-
 docs/hamilton-ui/ui.rst                            |   4 +-
 docs/how-tos/use-for-feature-engineering.rst       |   8 +-
 12 files changed, 174 insertions(+), 28 deletions(-)

diff --git a/.github/workflows/sphinx-docs.yml b/.github/workflows/sphinx-docs.yml
new file mode 100644
index 00000000..d05938f9
--- /dev/null
+++ b/.github/workflows/sphinx-docs.yml
@@ -0,0 +1,146 @@
+name: Build Sphinx Documentation
+
+on:
+  push:
+    branches: [ "main", "update_references"]
+    paths:
+      - 'docs/**'
+      - '.github/workflows/sphinx-docs.yml'
+  pull_request:
+    branches: [ "main", "update_references" ]
+    paths:
+      - 'docs/**'
+      - '.github/workflows/sphinx-docs.yml'
+  workflow_dispatch:
+
+concurrency:
+  group: "doc-pages"
+  cancel-in-progress: true
+
+jobs:
+  build-docs:
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python 3.11
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+          cache: 'pip'
+
+      - name: Install system dependencies
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y graphviz
+
+      - name: Upgrade pip and setuptools
+        run: |
+          python -m pip install --upgrade --no-cache-dir pip setuptools
+
+      - name: Install Sphinx and dependencies
+        run: |
+          python -m pip install --upgrade --no-cache-dir sphinx sphinx-rtd-theme sphinx-simplepdf
+          python -m pip install --upgrade --upgrade-strategy only-if-needed --no-cache-dir .[docs]
+
+      - name: Build Sphinx documentation
+        working-directory: ./docs
+        run: |
+          python -m sphinx -T -W --keep-going -b dirhtml -d _build/doctrees -D language=en . _build/html
+
+      - name: Build PDF documentation
+        working-directory: ./docs
+        run: |
+          # Build PDF using simplepdf
+          python -m sphinx -T -b simplepdf -d _build/doctrees -D language=en . _build/pdf
+
+      - name: Upload HTML artifact
+        uses: actions/upload-artifact@v4
+        with:
+          name: sphinx-docs-html
+          path: docs/_build/html/
+          retention-days: 5
+
+      - name: Upload PDF artifact
+        uses: actions/upload-artifact@v4
+        with:
+          name: sphinx-docs-pdf
+          path: docs/_build/pdf/
+          retention-days: 5
+
+      - name: Deploy documentation
+        working-directory: ./docs
+        run: |
+          # Set target branch based on current branch
+          if [ "${{ github.ref }}" = "refs/heads/main" ]; then
+            TARGET_BRANCH="asf-site"
+            echo "Deploying to production (asf-site) branch"
+          else
+            TARGET_BRANCH="asf-staging"
+            echo "Deploying to staging (asf-staging) branch"
+          fi
+
+          # Configure git
+          git config --global user.name "GitHub Actions"
+          git config --global user.email "[email protected]"
+
+          # Create a temporary directory
+          mkdir -p /tmp/gh-pages
+
+          # Store current directory
+          CURRENT_DIR=$(pwd)
+          ls -lsa $CURRENT_DIR
+
+          # Try to clone the repository with the target branch
+          if ! git clone --branch $TARGET_BRANCH --single-branch \
+              https://github.com/${{ github.repository }}.git /tmp/gh-pages 2>/dev/null; then
+            # If branch doesn't exist, initialize a new repository and create the branch
+            echo "Branch $TARGET_BRANCH doesn't exist. Creating it..."
+            rm -rf /tmp/gh-pages
+            mkdir -p /tmp/gh-pages
+            cd /tmp/gh-pages
+            git init
+            git config --local init.defaultBranch $TARGET_BRANCH
+            git checkout -b $TARGET_BRANCH
+            git remote add origin https://github.com/${{ github.repository }}.git
+            cd "$CURRENT_DIR"
+          else
+            echo "CD'ing into $CURRENT_DIR"
+            cd "$CURRENT_DIR"
+          fi
+
+          # Remove existing content directory if it exists
+          rm -rf /tmp/gh-pages/content
+
+          # # Ensure build directories exist
+          # mkdir -p "$CURRENT_DIR/_build/html"
+          # mkdir -p "$CURRENT_DIR/_build/pdf"
+
+          # Copy the built HTML documentation to the content directory
+          mkdir -p /tmp/gh-pages/content
+          cp -r "$CURRENT_DIR/_build/html/"* /tmp/gh-pages/content/ 2>/dev/null || echo "No HTML files to copy"
+
+          # Copy the PDF documentation to the content/pdf directory
+          mkdir -p /tmp/gh-pages/content/pdf
+          cp -r "$CURRENT_DIR/_build/pdf/Hamilton.pdf" /tmp/gh-pages/content/_static/ 2>/dev/null || echo "No PDF file to copy"
+
+          # Add, commit and push the changes
+          cd /tmp/gh-pages
+          git status
+          ls -lhsa content
+          # Create a README if it doesn't exist
+          if [ ! -f README.md ]; then
+            echo "# Documentation for $TARGET_BRANCH" > README.md
+            echo "This branch contains the built documentation." >> README.md
+          fi
+          git add -A
+          git status
+          # Check if there are changes to commit (including untracked files)
+          if [ -n "$(git status --porcelain)" ]; then
+            git commit -m "Deploy documentation from ${{ github.sha }}"
+            git push https://x-access-token:${{ github.token }}@github.com/${{ github.repository }}.git $TARGET_BRANCH
+            echo "Changes pushed to $TARGET_BRANCH branch"
+          else
+            echo "No changes to deploy"
+          fi
diff --git a/README.md b/README.md
index 5012c8f3..4fc75c31 100644
--- a/README.md
+++ b/README.md
@@ -30,7 +30,7 @@
 </div>
 <br></br>
-Apache Hamilton is a lightweight Python library for directed acyclic graphs (DAGs) of data transformations. Your DAG is **portable**; it runs anywhere Python runs, whether it's a script, notebook, Airflow pipeline, FastAPI server, etc. Your DAG is **expressive**; Apache Hamilton has extensive features to define and modify the execution of a DAG (e.g., data validation, experiment tracking, remote execution).
+Apache Hamilton (incubating) is a lightweight Python library for directed acyclic graphs (DAGs) of data transformations. Your DAG is **portable**; it runs anywhere Python runs, whether it's a script, notebook, Airflow pipeline, FastAPI server, etc. Your DAG is **expressive**; Apache Hamilton has extensive features to define and modify the execution of a DAG (e.g., data validation, experiment tracking, remote execution).
 To create a DAG, write regular Python functions that specify their dependencies with their parameters. As shown below, it results in readable code that can always be visualized. Apache Hamilton loads that definition and automatically builds the DAG for you!
diff --git a/docs/code-comparisons/airflow.rst b/docs/code-comparisons/airflow.rst
index fba602b2..48f1293c 100644
--- a/docs/code-comparisons/airflow.rst
+++ b/docs/code-comparisons/airflow.rst
@@ -33,7 +33,7 @@ executing pipelines. It is more complex to set up and run.
 Note: If you stuck th the `example_dag.py`, the Apache Hamilton pipeline could be used in the Airflow PythonOperator!
 Apache Hamilton:
-_________
+________________
 The below code here shows how you can define a simple data pipeline using Apache Hamilton. The pipeline consists of three functions that are executed in sequence. The pipeline is defined in a module called `pipeline.py`, and then executed in a separate script called `run.py`, which imports the pipeline module and executes it.
diff --git a/docs/code-comparisons/dagster.rst b/docs/code-comparisons/dagster.rst
index 97c02c25..a0dc7ce3 100644
--- a/docs/code-comparisons/dagster.rst
+++ b/docs/code-comparisons/dagster.rst
@@ -88,7 +88,7 @@ Dataflow definition
 :align: left
 +------------------------------------------------------------+----------------------------------------------------------+
- | Apache Hamilton | Dagster |
+ | Apache Hamilton | Dagster |
 +============================================================+==========================================================+
 | .. literalinclude:: _dagster_snippets/hamilton_dataflow.py | .. literalinclude:: _dagster_snippets/dagster_dataflow.py|
 | | |
@@ -125,7 +125,7 @@ Dataflow execution
 :align: left
 +-------------------------------------------------------------+------------------------------------------------------------+
- | Apache Hamilton | Dagster |
+ | Apache Hamilton | Dagster |
 +=============================================================+============================================================+
 | .. literalinclude:: _dagster_snippets/hamilton_execution.py | .. literalinclude:: _dagster_snippets/dagster_execution.py |
 | | |
diff --git a/docs/code-comparisons/kedro.rst b/docs/code-comparisons/kedro.rst
index a12166ae..7ab51c08 100644
--- a/docs/code-comparisons/kedro.rst
+++ b/docs/code-comparisons/kedro.rst
@@ -30,7 +30,7 @@ Imperative (``Kedro``) vs. declarative (``Apache Hamilton``) leads to significan
 :align: left
 +---------------------------------------------------------+------------------------------------------------------------+
- | Kedro (imperative) | Apache Hamilton (declarative) |
+ | Kedro (imperative) | Apache Hamilton (declarative) |
 +=========================================================+============================================================+
 | .. literalinclude:: _kedro_snippets/kedro_definition.py | .. literalinclude:: _kedro_snippets/hamilton_definition.py |
 | | |
@@ -50,7 +50,7 @@ With ``Apache Hamilton``, you pass the module containing all functions from **St
 :align: left
 +---------------------------------------------------------+------------------------------------------------------------+
- | Kedro (imperative) | Apache Hamilton (declarative) |
+ | Kedro (imperative) | Apache Hamilton (declarative) |
 +=========================================================+============================================================+
 | .. literalinclude:: _kedro_snippets/kedro_assemble.py | .. literalinclude:: _kedro_snippets/hamilton_assemble.py |
 | | |
@@ -84,7 +84,7 @@ For comparable side-by-side code, we can dig into ``Kedro`` and use the ``Sequen
 :align: left
 +---------------------------------------------------------+------------------------------------------------------------+
- | Kedro (imperative) | Apache Hamilton (declarative) |
+ | Kedro (imperative) | Apache Hamilton (declarative) |
 +=========================================================+============================================================+
 | .. literalinclude:: _kedro_snippets/kedro_execution.py | .. literalinclude:: _kedro_snippets/hamilton_execution.py |
 | | |
@@ -160,7 +160,7 @@ Kedro
 This provides guidance when building your first data pipeline, but it's also a lot to take in at once. As you'll see in the `project comparison on GitHub <https://github.com/apache/hamilton/tree/main/examples/kedro>`_, ``Kedro`` involves more files making it harder to navigate. Also, it's reliant on YAML which is `generally seen as an unreliable format <https://noyaml.com/>`_. If you have an existing data stack or favorite library, it might clash with ``Kedro``'s way of thing (e.g., you [...]
 Apache Hamilton
-~~~~~~~~
+w~~~~~~~~~~~~~~~
 ``Apache Hamilton`` attempts to get you started quickly. In fact, this page pretty much covered what you need to know:
diff --git a/docs/code-comparisons/langchain.rst b/docs/code-comparisons/langchain.rst
index ee27b518..6bff3018 100644
--- a/docs/code-comparisons/langchain.rst
+++ b/docs/code-comparisons/langchain.rst
@@ -25,7 +25,7 @@ A simple joke example
 :align: left
 +-----------------------------------------------------------+----------------------------------------------------------+-------------------------------------------------------------+
- | Apache Hamilton | Vanilla | LangChain |
+ | Apache Hamilton | Vanilla | LangChain |
 +===========================================================+==========================================================+=============================================================+
 | .. literalinclude:: langchain_snippets/hamilton_invoke.py | .. literalinclude:: langchain_snippets/vanilla_invoke.py | .. literalinclude:: langchain_snippets/lcel_invoke.py |
 | | | |
@@ -49,7 +49,7 @@ Note: you could use @config.when to include both streamed and non-streamed versi
 :align: left
 +-------------------------------------------------------------+------------------------------------------------------------+---------------------------------------------------------------+
- | Apache Hamilton | Vanilla | LangChain |
+ | Apache Hamilton | Vanilla | LangChain |
 +=============================================================+============================================================+===============================================================+
 | .. literalinclude:: langchain_snippets/hamilton_streamed.py | .. literalinclude:: langchain_snippets/vanilla_streamed.py | .. literalinclude:: langchain_snippets/lcel_streamed.py |
 | | | |
@@ -74,7 +74,7 @@ e.g. Ray, Dask, etc. We use multi-threading here.
 :align: left
 +-------------------------------------------------------------+------------------------------------------------------------+---------------------------------------------------------------+
- | Apache Hamilton | Vanilla | LangChain |
+ | Apache Hamilton | Vanilla | LangChain |
 +=============================================================+============================================================+===============================================================+
 | .. literalinclude:: langchain_snippets/hamilton_batch.py | .. literalinclude:: langchain_snippets/vanilla_batch.py | .. literalinclude:: langchain_snippets/lcel_batch.py |
 | | | |
@@ -99,7 +99,7 @@ is that you need to use the async Apache Hamilton Driver.
 :align: left
 +-------------------------------------------------------------+------------------------------------------------------------+---------------------------------------------------------------+
- | Apache Hamilton | Vanilla | LangChain |
+ | Apache Hamilton | Vanilla | LangChain |
 +=============================================================+============================================================+===============================================================+
 | .. literalinclude:: langchain_snippets/hamilton_async.py | .. literalinclude:: langchain_snippets/vanilla_async.py | .. literalinclude:: langchain_snippets/lcel_async.py |
 | | | |
@@ -125,7 +125,7 @@ that uses the different OpenAI model.
 :align: left
 +------------------------------------------------------------------+-----------------------------------------------------------------+---------------------------------------------------------------+
- | Apache Hamilton | Vanilla | LangChain |
+ | Apache Hamilton | Vanilla | LangChain |
 +==================================================================+=================================================================+===============================================================+
 | .. literalinclude:: langchain_snippets/hamilton_completion.py | .. literalinclude:: langchain_snippets/vanilla_completion.py | .. literalinclude:: langchain_snippets/lcel_completion.py |
 | | | |
@@ -152,7 +152,7 @@ to use Anthropic.
 :align: left
 +------------------------------------------------------------------+-----------------------------------------------------------------+---------------------------------------------------------------+
- | Apache Hamilton | Vanilla | LangChain |
+ | Apache Hamilton | Vanilla | LangChain |
 +==================================================================+=================================================================+===============================================================+
 | .. literalinclude:: langchain_snippets/hamilton_anthropic.py | .. literalinclude:: langchain_snippets/vanilla_anthropic.py | .. literalinclude:: langchain_snippets/lcel_anthropic.py |
 | | | |
@@ -178,7 +178,7 @@ printing.
 :align: left
 +------------------------------------------------------------------+-----------------------------------------------------------------+---------------------------------------------------------------+
- | Apache Hamilton | Vanilla | LangChain |
+ | Apache Hamilton | Vanilla | LangChain |
 +==================================================================+=================================================================+===============================================================+
 | .. literalinclude:: langchain_snippets/hamilton_logging.py | .. literalinclude:: langchain_snippets/vanilla_logging.py | .. literalinclude:: langchain_snippets/lcel_logging.py |
 | | | |
@@ -197,7 +197,7 @@ opinion it's better to be explicit about it.
 :align: left
 +------------------------------------------------------------------+-----------------------------------------------------------------+---------------------------------------------------------------+
- | Apache Hamilton | Vanilla | LangChain |
+ | Apache Hamilton | Vanilla | LangChain |
 +==================================================================+=================================================================+===============================================================+
 | .. literalinclude:: langchain_snippets/hamilton_fallbacks.py | .. literalinclude:: langchain_snippets/vanilla_fallbacks.py | .. literalinclude:: langchain_snippets/lcel_fallbacks.py |
 | | | |
diff --git a/docs/concepts/best-practices/function-naming.rst b/docs/concepts/best-practices/function-naming.rst
index cda48577..3255d4cd 100644
--- a/docs/concepts/best-practices/function-naming.rst
+++ b/docs/concepts/best-practices/function-naming.rst
@@ -12,7 +12,7 @@ You don't need to get this right the first time -- search and replace is really
 is something to converge thinking on!
 It enables you to define your Apache Hamilton dataflow
------------------------------------------------
+------------------------------------------------------
 As the name of a hamilton function defines the name of the created artifact, naming is vital to a readable, extensible hamilton codebase. Names must mean something:
diff --git a/docs/concepts/best-practices/migrating-to-hamilton.rst b/docs/concepts/best-practices/migrating-to-hamilton.rst
index e1384a73..3a656e1a 100644
--- a/docs/concepts/best-practices/migrating-to-hamilton.rst
+++ b/docs/concepts/best-practices/migrating-to-hamilton.rst
@@ -1,6 +1,6 @@
-=====================
+=============================
 Migrating to Apache Hamilton
-=====================
+=============================
 Here are two suggestions for helping you migrate to Apache Hamilton
diff --git a/docs/get-started/index.rst b/docs/get-started/index.rst
index 360d5ccc..e6db53cd 100644
--- a/docs/get-started/index.rst
+++ b/docs/get-started/index.rst
@@ -19,7 +19,7 @@ It allows you to:
 Get started with Apache Hamilton locally
--------------------------------------
+----------------------------------------
 The following section of the docs will teach you how to install Apache Hamilton and get started with your own project.
 .. toctree::
diff --git a/docs/hamilton-ui/index.rst b/docs/hamilton-ui/index.rst
index 8ebe9c35..84698730 100644
--- a/docs/hamilton-ui/index.rst
+++ b/docs/hamilton-ui/index.rst
@@ -1,6 +1,6 @@
-===========
+===================
 Apache Hamilton UI
-===========
+===================
 Reference
 ---------
diff --git a/docs/hamilton-ui/ui.rst b/docs/hamilton-ui/ui.rst
index e68b53a6..eeec6b2d 100644
--- a/docs/hamilton-ui/ui.rst
+++ b/docs/hamilton-ui/ui.rst
@@ -157,7 +157,7 @@ Then, navigate to the project page (dashboard/projects), in the running UI, and
 Remember the project ID -- you'll use it for the next steps.
 Existing Apache Hamilton Code
----------------------
+------------------------------------
 Add the following adapter to your code if you have existing Apache Hamilton code:
 .. code-block:: python
@@ -183,7 +183,7 @@ Then run your DAG, and follow the links in the logs! Note that the link is corre
 the local mode -- if you're on postgres it links to 8241 (but you'll want to follow it to 8241).
 I need some Apache Hamilton code to run
--------------------------------
+----------------------------------------------
 If you don't have Apache Hamilton code to run this with, you can run Apache Hamilton UI example under `examples/hamilton_ui <https://github.com/apache/hamilton/tree/main/examples/hamilton_ui>`_:
 .. code-block:: bash
diff --git a/docs/how-tos/use-for-feature-engineering.rst b/docs/how-tos/use-for-feature-engineering.rst
index da249c90..d9cb9244 100644
--- a/docs/how-tos/use-for-feature-engineering.rst
+++ b/docs/how-tos/use-for-feature-engineering.rst
@@ -52,7 +52,7 @@ Here is a sketch of the above pattern:
 Apache Hamilton Example
-^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^
 We do not provide a specific example here, since most of the examples in the examples folder fall under this category. Some examples to browse:
@@ -96,7 +96,7 @@ Here's a sketch of how you might use Apache Hamilton in conjunction with a Kafka
 Apache Hamilton Example
-^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^
 Currently we don't have a streaming example. But we are working on it. We direct users to look at the online example for now, since conceptually from a modularity stand point, things would be set up in a similar way.
@@ -120,7 +120,7 @@ the `@config.*` decorator, to help you segment your feature computation dataflow
 We skip showing a sketch of structure here, and invite you to look at the examples below.
 Apache Hamilton Example
-^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^
 We direct users to look at `Feature engineering in multiple contexts <https://github.com/apache/hamilton/tree/main/examples/feature_engineering/feature_engineering_multiple_contexts>`__ that currently describes two scenarios around how you could incorporate Apache Hamilton into an online web-service, and have it aligned with your batch offline processes. Note, these examples should give you the high level first principles
@@ -139,6 +139,6 @@ FAQ
 ----
 Q. Can I use Apache Hamilton for feature engineering with Feast?
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Yes, you can use Apache Hamilton with Feast. See our [Feast example](https://github.com/apache/hamilton/tree/main/examples/feast) and accompanying [blog post](https://blog.dagworks.io/p/featurization-integrating-hamilton). Typically people use Apache Hamilton on the offline side to compute features that then get pushed to Feast. For the online side it varies as to how to integrate the two.
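
The "Deploy documentation" step in sphinx-docs.yml picks its target branch by comparing `github.ref` against `refs/heads/main`. The sketch below pulls that comparison into a standalone shell function so it can be tried outside GitHub Actions. The function name `target_branch_for_ref` is hypothetical and not part of the commit; it only mirrors the if/else shown in the workflow.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): mirrors the branch
# selection in the "Deploy documentation" step of sphinx-docs.yml.
# refs/heads/main maps to asf-site; any other ref maps to asf-staging.
target_branch_for_ref() {
  if [ "$1" = "refs/heads/main" ]; then
    echo "asf-site"
  else
    echo "asf-staging"
  fi
}

# Example calls with a push to main and a push to the feature branch.
target_branch_for_ref "refs/heads/main"
target_branch_for_ref "refs/heads/update_references"
```

Under this logic, any ref other than `refs/heads/main` falls through to `asf-staging`. That appears to include `pull_request` runs, where `github.ref` has the form `refs/pull/<number>/merge`, since the deploy step as written has no condition limiting it to push events.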
