(hamilton) 03/14: Add documentation for `unpack_fields`

skrawcz Sun, 22 Jun 2025 23:56:09 -0700

This is an automated email from the ASF dual-hosted git repository.

skrawcz pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/hamilton.git


commit f723544c6643f9cdb6bb460db322a12e92ee534e
Author: Charles Swartz <[email protected]>
AuthorDate: Sun Apr 6 09:24:47 2025 -0400

    Add documentation for `unpack_fields`
---
 docs/concepts/function-modifiers.rst        | 29 +++++++++++++++++--
 docs/reference/decorators/index.rst         |  1 +
 docs/reference/decorators/unpack_fields.rst | 44 +++++++++++++++++++++++++++++
 3 files changed, 71 insertions(+), 3 deletions(-)

diff --git a/docs/concepts/function-modifiers.rst b/docs/concepts/function-modifiers.rst
index c3b38c50..4c4797b2 100644
--- a/docs/concepts/function-modifiers.rst
+++ b/docs/concepts/function-modifiers.rst
@@ -191,13 +191,36 @@ Sometimes, your node outputs multiple values that you would like to name and mak
 
     To add metadata to extracted nodes, use ``@tag_output``, which works just like ``@tag``.
 
+@unpack_fields
+~~~~~~~~~~~~~~
+
+A good example is splitting a dataset into training, validation, and test splits. We use ``@unpack_fields``, which requires specifying the names of the fields to extract. The function must return a tuple with at least as many elements as there are specified fields. Note that selecting a subset of the tuple or using an indeterminate tuple size is also possible.
+
+.. code-block:: python
+
+    from typing import Tuple
+    from hamilton.function_modifiers import unpack_fields
+
+    @unpack_fields("X_train", "X_validation", "X_test")
+    def dataset_splits(X: np.ndarray) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
+        """Randomly split data into train, validation, test"""
+        X_train, X_validation, X_test = random_split(X)
+        return X_train, X_validation, X_test
+
+.. image:: ./_function-modifiers/extract_fields.png
+    :height: 250px
+
+
+Now, ``X_train``, ``X_validation``, and ``X_test`` are available to other nodes and can be queried with ``.execute()``. However, since ``dataset_splits`` is itself a node, you can query it to obtain all splits in a single tuple!
+
 @extract_fields
 ~~~~~~~~~~~~~~~
 
-A good example is splitting a dataset into train, validation, and test splits. We will use ``@extract_fields``, which requires specifying in a dictionary the ``field_name: field_type`` of each field.
+Additionally, we can extract fields from an output dictionary using ``@extract_fields``. In this case, you must specify the dictionary keys and their types. The function must return a dictionary that contains, at a minimum, those keys specified in the decorator.
 
 .. code-block:: python
 
+    from typing import Dict
     from hamilton.function_modifiers import extract_fields
 
     @extract_fields(dict(  # don't forget the dictionary
@@ -205,7 +228,7 @@ A good example is splitting a dataset into train, validation, and test splits. W
         X_validation=np.ndarray,
         X_test=np.ndarray,
     ))
-    def dataset_splits(X: np.ndarray) -> dict:
+    def dataset_splits(X: np.ndarray) -> Dict:
         """Randomly split data into train, validation, test"""
         X_train, X_validation, X_test = random_split(X)
         return dict(
@@ -218,7 +241,7 @@ A good example is splitting a dataset into train, validation, and test splits. W
     :height: 250px
 
 
-Now, ``X_train``, ``X_validation``, and ``X_test`` are available to other nodes, and they can be queried by ``.execute()``. But, since ``dataset_splits`` is its own node, you can query it to get all splits in a dictionary!
+Again, ``X_train``, ``X_validation``, and ``X_test`` are now available to other nodes, or you can query the ``dataset_splits`` node to retrieve all splits in a dictionary.
 
 @extract_columns
 ~~~~~~~~~~~~~~~~
diff --git a/docs/reference/decorators/index.rst b/docs/reference/decorators/index.rst
index d5154e01..49bd8d43 100644
--- a/docs/reference/decorators/index.rst
+++ b/docs/reference/decorators/index.rst
@@ -21,6 +21,7 @@ Reference
    dataloader
    datasaver
    does
+   unpack_fields
    extract_columns
    extract_fields
    inject
diff --git a/docs/reference/decorators/unpack_fields.rst b/docs/reference/decorators/unpack_fields.rst
new file mode 100644
index 00000000..f91c3187
--- /dev/null
+++ b/docs/reference/decorators/unpack_fields.rst
@@ -0,0 +1,44 @@
+=======================
+unpack_fields
+=======================
+This decorator works on a function that outputs a tuple and unpacks its elements to make them individually available for consumption. Essentially, it expands the original function into n separate functions, each of which takes the original output tuple and, in return, outputs a specific field based on the index supplied to the ``unpack_fields`` decorator.
+
+.. code-block:: python
+
+    import numpy as np
+    from hamilton.function_modifiers import unpack_fields
+
+    @unpack_fields('X_train', 'X_test', 'y_train', 'y_test')
+    def train_test_split_func(
+        feature_matrix: np.ndarray,
+        target: np.ndarray,
+        test_size_fraction: float,
+        shuffle_train_test_split: bool,
+    ) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
+        ...  # Calculate the train-test split
+        return X_train, X_test, y_train, y_test
+
+
+The arguments to the decorator not only represent the names of the resulting fields but also determine their position in the output tuple. This means you can choose to unpack a subset of the fields or declare an indeterminate number of fields, as long as the number of requested fields does not exceed the number of elements in the output tuple.
+
+.. code-block:: python
+
+    import numpy as np
+    from hamilton.function_modifiers import unpack_fields
+
+    @unpack_fields('X_train', 'X_test', 'y_train', 'y_test')
+    def train_test_split_func(
+        feature_matrix: np.ndarray,
+        target: np.ndarray,
+        test_size_fraction: float,
+        shuffle_train_test_split: bool,
+    ) -> Tuple[np.ndarray, ...]:  # indeterminate number of fields
+        ...  # Calculate the train-test split
+        return X_train, X_test, y_train, y_test
+
+----
+
+**Reference Documentation**
+
+.. autoclass:: hamilton.function_modifiers.unpack_fields
+   :special-members: __init__
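
For readers of this commit, the positional behavior described in the new ``unpack_fields.rst`` page can be sketched in plain Python. This is only an illustration of the idea (one selector per field, indexed by the field's position in the decorator arguments) and is not Hamilton's implementation. The names ``expand_fields`` and ``dataset_splits`` below are hypothetical helpers made up for this sketch, and plain lists stand in for ``np.ndarray``.

```python
from typing import Any, Callable, Dict, List, Tuple


def expand_fields(*fields: str) -> Dict[str, Callable[[Tuple[Any, ...]], Any]]:
    """Build one selector per field name (hypothetical helper, not Hamilton API).

    The position of a name in ``fields`` is the tuple index it selects.
    """

    def make_selector(index: int) -> Callable[[Tuple[Any, ...]], Any]:
        # A factory function binds ``index`` per selector, avoiding the
        # late-binding pitfall of closures created directly in a loop.
        def select(output: Tuple[Any, ...]) -> Any:
            return output[index]

        return select

    return {name: make_selector(i) for i, name in enumerate(fields)}


def dataset_splits(values: List[int]) -> Tuple[List[int], ...]:
    """Hypothetical stand-in for a split function returning three parts."""
    return values[:6], values[6:8], values[8:]


# Request only the first two fields: a subset of the three-element tuple.
selectors = expand_fields("X_train", "X_validation")
output = dataset_splits(list(range(10)))
unpacked = {name: select(output) for name, select in selectors.items()}
```

Because names map to positions, requesting fewer names than the tuple has elements simply leaves the trailing elements unselected, which matches the "subset of the fields" behavior the documentation describes.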
