The GitHub Actions job "Tests" on airflow.git has failed.
Run started by GitHub user potiuk (triggered by potiuk).

Head commit for run:
204020a329d954dca14ef30ea7f72c25782da85b / Balthazar Rouberol <[email protected]>
KubernetesPodOperator: add base_container_name to the templated fields (#47864)

I would like to propose adding `base_container_name` to the
`KubernetesPodOperator` templated fields.

The rationale is that the base container name is part of the log lines
emitted by the KubernetesPodManager, which is a good opportunity to have
it give as much context as possible.

For example, in a Wikimedia DAG, we defined the following operators:

```python
class WikimediaDumpOperator(KubernetesPodOperator):
    """
    Base class for all types of wiki dumps run as Kubernetes Pods.
    """

    dump_type = "generic"

    def __init__(self, wiki: str, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.wiki = wiki

        # Name of the "dumps" container (default is base, which isn't super
        # telling)
        self.base_container_name = f"mediawiki-{self.dump_type}-dump"

        # name of the pod itself
        # made templated in https://github.com/apache/airflow/pull/46268
        self.name = f"{self.base_container_name}-{wiki}"

class WikimediaSqlXmlDumpsOperator(WikimediaDumpOperator):
    """Operator class running the sql/xml wiki dumps as Kubernetes Pods"""

    dump_type = "sql-xml"

class WikimediaWikidataDumpsOperator(WikimediaDumpOperator):
    """Operator class running the wikidata dumps as Kubernetes Pods"""

    dump_type = "wikidata"

```

Adding `base_container_name` to the templated fields would allow us to
rewrite the `WikimediaDumpOperator` to the following:

```python
class WikimediaDumpOperator(KubernetesPodOperator):
    """
    Base class for all types of wiki dumps run as Kubernetes Pods.
    """

    dump_type = "generic"

    def __init__(self, wiki: str, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.wiki = wiki
```

and we could invoke the operator as follows:

```python
WikimediaSqlXmlDumpsOperator(
    ...,
    base_container_name='mediawiki-{{ task.dump_type }}-dump',
    name='{{ task.base_container_name }}-{{ task.wiki }}',
    ...
)
```

The endgame would be to have our logs contain as much context as
possible while avoiding a mix of passing keyword args to the
constructor _and_ inferring some attributes _within_ the `__init__`
method itself.
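
To illustrate how the templated invocation resolves, here is a minimal,
standard-library-only sketch. It is not Airflow code: `FakeDumpOperator`,
`render_template_fields` as written here, the regex stand-in for Jinja, and
the `enwiki` value are all hypothetical. It also assumes that templated
fields are rendered in order and that `base_container_name` is rendered
before `name`, which is what lets `name` refer to the already-rendered
container name.

```python
import re

# Hypothetical stand-in for Jinja rendering: resolves only "{{ task.<attr> }}".
_TASK_ATTR = re.compile(r"\{\{\s*task\.(\w+)\s*\}\}")


class FakeDumpOperator:
    """Toy operator, not an Airflow class; it only mimics templated fields."""

    # Assumed ordering: base_container_name is rendered before name.
    template_fields = ("base_container_name", "name")
    dump_type = "sql-xml"

    def __init__(self, wiki: str, base_container_name: str, name: str):
        self.wiki = wiki
        self.base_container_name = base_container_name
        self.name = name


def render_template_fields(task: FakeDumpOperator) -> None:
    """Render each templated field in order, storing the result on the task."""
    for field in task.template_fields:
        raw = getattr(task, field)
        rendered = _TASK_ATTR.sub(lambda m: str(getattr(task, m.group(1))), raw)
        setattr(task, field, rendered)


task = FakeDumpOperator(
    wiki="enwiki",  # hypothetical wiki name, for illustration only
    base_container_name="mediawiki-{{ task.dump_type }}-dump",
    name="{{ task.base_container_name }}-{{ task.wiki }}",
)
render_template_fields(task)
print(task.base_container_name)
print(task.name)
```

If the two fields were rendered in the opposite order in this sketch, `name`
would pick up the still-unrendered `base_container_name` template string, so
the ordering assumption is worth checking against the real operator.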

Report URL: https://github.com/apache/airflow/actions/runs/14016767378

With regards,
GitHub Actions via GitBox

