I’ve always hated the magic __getitem__ interface. My previous ideas include:
a. Make task.output an accessor (say XComCollection) instead of an XComArg.
This accessor type uses __getitem__ to let you specify a key. task.output[None]
references the default XCom (what you get right now with task.output).
b. Add a new function task.get_output(key) to access non-default XCom.
For both options, the current __getitem__ implementation on XComArg is removed
(changed to do dict access).
One thing to consider if we do this though is multiple_outputs=True. Currently
this flag combined with the __getitem__ implementation provides a streamlined
interface in task flow:
@task(multiple_outputs=True)
def up():
return {"a": [1, 2], "b": [3, 4, 5]} # Push XComs for this task under keys
"a" and "b"
def down(v):
print(v) # Prints [1, 2]
down(up()["a"])
but this would break if we change the __getitem__ implementation. Maybe the
correct solution would be to also remove multiple_outputs? I have not thought
this through (and probably should).
TP
> On 7 Feb 2025, at 01:29, Josh Fell <[email protected]> wrote:
>
> Hi all,
>
> Way back when, the `output` property on operators was introduced as a more
> "Pythonic" means to retrieve XComs from tasks via an XComArg object.
> Meaning a DAG author could use `my_task.output` as an equivalence to:
> `task_instance.xcom_pull(task_ids="my_task")`. With the use of the `output`
> property, there is an option to select specific XCom keys as well (i.e.
> `my_task.output["my_xcom_key"]` is equivalent to
> `task_instance.xcom_pull(task_ids="my_task", key="my_xcom_key").
>
> However, the use of `__getitem__` to change the XCom key doesn't allow
> direct retrieval of nested values within XComs but rather continues to
> update the key. Effectively `my_task.output["my_xcom_key"]["object_key"]`
> is _not_ equivalent to `task_instance.xcom_pull(task_ids="my_task",
> key="my_xcom_key")["object_key"]`. The former results in attempting to
> retrieve an XCom with a key of "object_key" and the latter retrieves the
> value tied to the "object_key" key of an XCom.
>
> There is quite an old issue related to this non-equivalence
> <https://github.com/apache/airflow/issues/16618> and I am bringing up the
> idea to modify the means to change the XCom key by using `__getattr__`
> instead as part of Airflow 3 (briefly mentioned in a comment
> <https://github.com/apache/airflow/issues/16618#issuecomment-876288276> as
> well). So the new, proposed method for changing XCom keys would be
> `my_task.output.my_xcom_key["object_key"]`.
>
> This change would bring parity between the `output` property and the
> classic `xcom_pull()` method. The obvious drawback is this would be a
> slight authoring change for existing DAGs that use the `output` property.
> Perhaps if the change could be automated in migration tooling the behavior
> change wouldn't be so impactful.
>
> What do you all think about this proposed update?
>
> Josh
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]