I think that is a good point, but I have a conceptually different proposal. From the coding perspective it might be similar in implementation effort and complexity, but long term it will much better reflect how "Airflow as a Platform" is currently implemented.
My proposal is very close to what was at the center of our recent discussions on the OpenLineage integration. I think we should reverse the architecture. IMHO what we currently have as a common "Stats" interface should be deprecated, and the OpenTelemetry API should be THE common metrics API that Airflow uses as its "common interface" for metrics (and the one that anyone who would like to use Airflow metrics should implement). Why should we define our own "Stats" API and choose which implementation should handle it? We already have the OpenTelemetry API, and through OpenTelemetry configuration we can choose which collector to use to export the metrics.

So, just to rephrase it: our current Statsd/DataDog Statsd implementations should become merely our own simple custom OpenTelemetry collectors. They should collect the metrics that Airflow sends via the OTEL API and send them to Statsd/DataDog Statsd in the same way current metrics are sent. But those should only be used for backwards compatibility. We should not aim to implement a generic, fully-featured, reusable Statsd or DataDog Statsd collector, just provide the bare minimum that mimics the current behaviour. That should be a considerably smaller task.

If we do it this way, then it is rather simple, I think. There might or might not be a separate OpenTelemetry provider. We might not need one in the end, because the "open telemetry" functionality (initialization, configuration, some common code, etc.) should be provided in Airflow core. This might well be in an `airflow.otel` package, with no need for a separate provider, I think. Then any external entity (including provider packages) might provide a collector implementation that collects the metrics and exports them. In our case (for the backported Statsd/DataDog Statsd) those would be the StatsD and DataDog providers, each providing a collector that can be configured as the collector used by Airflow's OpenTelemetry.
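To make the shape of this concrete, here is a minimal, stdlib-only Python sketch of the idea. It is not the OpenTelemetry API and not existing Airflow code: `CommonMetricsAPI`, `MetricsExporter` and `LegacyStatsdExporter` are names invented purely for illustration, and the `airflow` prefix is an assumption. The point is only that core code calls one common API, while a bare-minimum backwards-compatibility exporter turns the recorded counters into StatsD counter lines.

```python
# Illustrative sketch only: every class name here is hypothetical.
# This is NOT the OpenTelemetry API and NOT existing Airflow code.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Protocol


class MetricsExporter(Protocol):
    """Anything that can ship a batch of counter totals somewhere."""

    def export(self, counters: Dict[str, int]) -> None: ...


@dataclass
class CommonMetricsAPI:
    """Stand-in for the single metrics API that core code would call."""

    exporters: List[MetricsExporter] = field(default_factory=list)
    _counters: Dict[str, int] = field(default_factory=dict)

    def incr(self, name: str, value: int = 1) -> None:
        # Core code only records the measurement; it knows nothing about backends.
        self._counters[name] = self._counters.get(name, 0) + value

    def flush(self) -> None:
        # Hand the accumulated totals to every configured exporter, then reset.
        snapshot = dict(self._counters)
        self._counters.clear()
        for exporter in self.exporters:
            exporter.export(snapshot)


@dataclass
class LegacyStatsdExporter:
    """Bare-minimum backwards-compatibility exporter emitting StatsD counter lines."""

    send: Callable[[str], None]  # in practice, a wrapper around a UDP socket
    prefix: str = "airflow"      # assumed prefix, for illustration only

    def export(self, counters: Dict[str, int]) -> None:
        for name, value in sorted(counters.items()):
            # StatsD counter wire format: <name>:<value>|c
            self.send(f"{self.prefix}.{name}:{value}|c")


# Wire one exporter in; here "sending" just appends to a list.
sent: List[str] = []
metrics = CommonMetricsAPI(exporters=[LegacyStatsdExporter(send=sent.append)])
metrics.incr("ti.finish")
metrics.incr("ti.finish")
metrics.incr("scheduler.heartbeat")
metrics.flush()
print(sent)
```

In a real setup the role of `CommonMetricsAPI` would be played by the OpenTelemetry API and SDK, and `send` would write to a socket rather than a list. Swapping the backend then means configuring a different exporter, with no change to the code that records the metrics.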
But the assumption is that those who have already integrated with OpenTelemetry (Grafana, New Relic, Amazon, Google) should already have a collector that users can just configure and use. So, for example, I do not even expect anything in the AWS provider to collect the metrics from Airflow: there should be an existing CloudWatch OTEL collector that collects the metrics and sends them to CloudWatch. The most that should be in the AWS provider, possibly, is documentation on how to enable the collector for CloudWatch and a dependency that pulls in the CloudWatch collector package.

I hope what I am writing makes sense :).

J.

On Thu, Mar 9, 2023 at 10:42 PM Ferruzzi, Dennis <ferru...@amazon.com.invalid> wrote:
>
> Hi folks. I am working on adding support for OpenTelemetry based on AIP-49
> and I think we have come to a point where it is worth discussing options.
>
> Currently `airflow/stats.py`[1] contains classes for a base/NoStats option as
> well as Statsd and DataDog, and I will be adding in another option for OTel.
> I think it is getting to the point where we may want to break these out like
> we do with provider packages and let the users install only the metrics
> backend(s) they want.
>
> If we do, then where would they live? Should they fall in with the service
> provider packages at `airflow/providers/{statsd | datadog | otel}/`, or maybe
> a new location like `airflow/stats/providers/{statsd | datadog | otel}`? If
> we make the move, then we would also need to sort out how to handle the
> change. Perhaps the provider packages for the existing options should be
> bundled with core for now and later moved to fully-separated like all the
> other provide packages?
>
> I'd love to hear what you folks think.
> - ferruzzi
>
> [1] https://github.com/apache/airflow/blob/main/airflow/stats.py