potiuk commented on issue #20053:
URL: https://github.com/apache/airflow/issues/20053#issuecomment-986279158


   Hmm, I read the issue as a proposal to build a "shared service" inside of 
Airflow that various providers could use to report custom metrics. Am I wrong?
   
   If it is that first case, "provide a common way for all providers to report 
their telemetry", then this is pretty much what the OpenTelemetry integration 
would most likely provide. That is the scope of the POC: what is possible, how 
deeply we can integrate, and how OpenTelemetry might be used by different parts 
of Airflow: reporting low-level metrics of Airflow, reporting higher-level 
metrics of Airflow components (such as SQLAlchemy), reporting "logical metrics" 
of Airflow and, finally and possibly, reporting custom metrics by other parts 
of Airflow (including providers).
   
   Once we integrate OpenTelemetry into Airflow, it might become the de facto 
standard for all components (including providers), because it will become the 
common "telemetry" language that Airflow uses (at least I see it as something 
that we will be able to validate during the OpenTelemetry POC, so that we have 
a good idea of what's possible and how much it involves). The provider part is 
where the overlap might be significant, and I think this at least warrants a 
discussion on the devlist if the proposal goes beyond a POC.
   
   I do not yet know what it would mean to add a "common" Prometheus provider 
that BigQuery would use.
   
   Is it an actual "provider" that could be installed as one of the 70+ 
providers? Would the Google provider / BigQuery depend on it? Which versions? 
What will the dependencies be? Or is it a shared service which is part of 
Airflow itself, or is it just a set of library calls that do not even require 
configuration of Airflow as a whole? How "common" will this be? Which parts 
will be reusable? Will they really be reusable? Why do we need it at all - 
cannot this be implemented directly in BigQuery?
   
   There are many questions that, I think, warrant at the very least discussing 
the concept on the devlist. Not really a call, but explaining the concept in a 
mail to the devlist, where it will be discussed. Likely this should turn into 
an Airflow Improvement Proposal if it is going to turn into a kind of "shared 
service". This is why we are running the POC now, to be better prepared to 
answer many of those questions, but the idea is to come up with the AIP.
   
   This is usually how it works in Airflow when it comes to changes that go 
beyond one provider or fix, and (at least as I am reading the proposal) this is 
something that you would like to provide as a "common" interface for 
potentially many providers. That definitely warrants a devlist discussion.
   
   

