I never conceived of the idea of reverting everything in such a radical fashion, because I never thought those actions would even be possible!
So having OTEL as the backbone of Airflow's metrics/logs/traces is actually a great idea, in my opinion. Since the OTEL collector can export the OTEL data to any external format it supports, we avoid introducing additional complexity. As for the statsd and dogstatsd capability, we should obviously maintain it for backward compatibility, but I do agree that OTEL should be the default choice.

On Sun, Mar 12, 2023 at 11:50 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> I think that is a good point but I have a conceptually different
> proposal. From the coding perspective it might be similar in terms of
> implementation and complexity, but long term it will much better
> reflect the way "Airflow as a Platform" is currently implemented.
>
> My proposal is very close to what was at the center of our recent
> discussions on the OpenLineage integration. I think we should revert
> the architecture.
>
> IMHO what we currently have as a common "Stats" interface should be
> deprecated, and the OpenTelemetry API should be THE common metrics API
> that Airflow uses as its "common interface" for metrics (and that
> anyone who would like to use Airflow metrics should implement). Why
> should we define our own "Stats" API and choose which implementation
> should handle it? We already have the OpenTelemetry API, and through
> OpenTelemetry configuration we can choose which collector to use in
> order to export the metrics.
>
> So just to rephrase it - our current Stats/Datadog Statsd
> implementations should merely be our own simple custom OpenTelemetry
> Collectors - they should collect the metrics which Airflow sends via
> the OTEL API and send them to Stats/DatadogStats in the same way
> current metrics are sent.
> But those should only be used for backwards compatibility reasons -
> we should not aim to implement a generic "fully-featured/reusable"
> statsd or DataDog Statsd collector - just provide a bare minimum that
> mimics current behaviour for backwards compatibility. That should be
> a considerably smaller task.
>
> If we do it this way, then it is rather simple, I think. An
> OpenTelemetry provider might or might not be a separate provider (we
> might not need one in the end) that provides the "open telemetry"
> functionality for Airflow core - initialization, configuration, some
> common code etc. This might well be in an `airflow.otel` package - no
> need for a separate provider there, I think.
>
> Then any external entity (including provider packages) might provide a
> collector implementation that will collect the metrics and export
> them. In our case (for backported Statsd/DatadogStatsd) those would
> be StatsD and DataDog providers that will provide a collector that
> might be configured as the collector used by Airflow's OpenTelemetry.
> But the assumption is that those who have already integrated with
> OpenTelemetry (Grafana, New Relic, Amazon, Google) should already
> have a collector that the users should be able to just configure and
> use. So, for example, I do not even expect anything in the AWS
> provider to collect the metrics from Airflow - there should be an
> existing CloudWatch OTEL collector that collects the metrics and
> sends them to CloudWatch. The most that should be - possibly - in the
> AWS provider is the documentation on how to enable the collector for
> CloudWatch and a dependency to pull in the CloudWatch collector
> package.
>
> I hope what I am writing makes sense :).
>
> J.
>
> On Thu, Mar 9, 2023 at 10:42 PM Ferruzzi, Dennis
> <ferru...@amazon.com.invalid> wrote:
> >
> > Hi folks. I am working on adding support for OpenTelemetry based on
> > AIP-49 and I think we have come to a point where it is worth
> > discussing options.
> >
> > Currently `airflow/stats.py`[1] contains classes for a base/NoStats
> > option as well as Statsd and DataDog, and I will be adding in another
> > option for OTel. I think it is getting to the point where we may want
> > to break these out like we do with provider packages and let the
> > users install only the metrics backend(s) they want.
> >
> > If we do, then where would they live? Should they fall in with the
> > service provider packages at `airflow/providers/{statsd | datadog |
> > otel}/`, or maybe a new location like
> > `airflow/stats/providers/{statsd | datadog | otel}`? If we make the
> > move, then we would also need to sort out how to handle the change.
> > Perhaps the provider packages for the existing options should be
> > bundled with core for now and later moved out to be fully separate
> > like all the other provider packages?
> >
> > I'd love to hear what you folks think.
> >
> > - ferruzzi
> >
> > [1] https://github.com/apache/airflow/blob/main/airflow/stats.py
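
As a rough illustration of the inversion proposed in the quoted message, here is a minimal, stdlib-only sketch: core code calls one metrics entry point, and StatsD/DogStatsD support is reduced to a thin backward-compatibility exporter plugged in behind it. This is not the OpenTelemetry API and not Airflow code. `Measurement`, `Meter`, `StatsdCompatExporter`, and `to_statsd_line` are hypothetical names invented for the example, and the metric names are only examples. The only thing taken as given is the usual StatsD line format (`name:value|type`) and the DogStatsD tag suffix (`|#key:value`).

```python
# Illustrative sketch only: all names below are hypothetical stand-ins,
# not Airflow's or OpenTelemetry's actual API.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Protocol


@dataclass(frozen=True)
class Measurement:
    name: str
    value: float
    kind: str  # "counter" or "gauge"
    tags: Dict[str, str] = field(default_factory=dict)


class Exporter(Protocol):
    def export(self, m: Measurement) -> None: ...


class Meter:
    """The single metrics entry point that core code would call."""

    def __init__(self, exporters: List[Exporter]) -> None:
        self._exporters = list(exporters)

    def incr(self, name: str, value: float = 1,
             tags: Optional[Dict[str, str]] = None) -> None:
        self._emit(Measurement(name, value, "counter", tags or {}))

    def gauge(self, name: str, value: float,
              tags: Optional[Dict[str, str]] = None) -> None:
        self._emit(Measurement(name, value, "gauge", tags or {}))

    def _emit(self, m: Measurement) -> None:
        # Every configured exporter sees every measurement.
        for exporter in self._exporters:
            exporter.export(m)


_STATSD_TYPES = {"counter": "c", "gauge": "g"}


def to_statsd_line(m: Measurement, dogstatsd: bool = False) -> str:
    """Format a measurement as a StatsD line, with DogStatsD tags if asked."""
    line = f"{m.name}:{m.value:g}|{_STATSD_TYPES[m.kind]}"
    if dogstatsd and m.tags:
        line += "|#" + ",".join(f"{k}:{v}" for k, v in sorted(m.tags.items()))
    return line


class StatsdCompatExporter:
    """Bare-minimum exporter that mimics the legacy StatsD output."""

    def __init__(self, send: Callable[[str], None],
                 dogstatsd: bool = False) -> None:
        self._send = send  # in practice this might wrap a UDP socket
        self._dogstatsd = dogstatsd

    def export(self, m: Measurement) -> None:
        self._send(to_statsd_line(m, self._dogstatsd))


# Usage: collect the formatted lines in a list instead of sending them.
sent: List[str] = []
meter = Meter([StatsdCompatExporter(sent.append, dogstatsd=True)])
meter.incr("ti_failures", tags={"dag_id": "example"})
meter.gauge("executor.open_slots", 32)
```

The point of the sketch is the direction of the dependency: the calling code knows only about `Meter`, while the StatsD-specific formatting lives entirely in an exporter that could ship separately and be swapped out through configuration.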