jceresini opened a new issue #21440:
URL: https://github.com/apache/airflow/issues/21440


   ### Apache Airflow Provider(s)
   
   google
   
   ### Versions of Apache Airflow Providers
   
   ```
   $ pip freeze | grep airflow-providers
   apache-airflow-providers-apache-beam==3.1.0
   apache-airflow-providers-ftp==2.0.1
   apache-airflow-providers-google==6.3.0
   apache-airflow-providers-http==2.0.2
   apache-airflow-providers-imap==2.1.0
   apache-airflow-providers-sqlite==2.0.1
   ```
   
   ### Apache Airflow version
   
   2.2.3 (latest released)
   
   ### Operating System
   
   macOS 11.6.1 (20G224)
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   The only Python packages explicitly installed were `apache-airflow-providers-google` and `apache-airflow-providers-apache-beam`.
   
   ### What happened
   
   ```
   [2022-02-08 16:44:23,610] {credentials_provider.py:330} INFO - Getting connection using `google.auth.default()` since no key file is defined for hook.
   [2022-02-08 16:44:25,409] {http.py:123} WARNING - Invalid JSON content from response: b'{\n  "error": {\n    "code": 403,\n    "message": "User must be authenticated when user project is provided",\n    "status": "PERMISSION_DENIED",\n    "details": [\n      {\n        "@type": "type.googleapis.com/google.rpc.ErrorInfo",\n        "reason": "USER_PROJECT_DENIED",\n        "domain": "googleapis.com",\n        "metadata": {\n          "service": "dataflow.googleapis.com",\n          "consumer": "projects/cd-np-test1"\n        }\n      }\n    ]\n  }\n}\n'
   ```
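   For readability, here is the same error payload unescaped and parsed with only the standard library (the JSON string below is copied verbatim from the WARNING line above):

   ```python
   import json

   # Error payload from the WARNING log line above, with the byte-string
   # escapes (\n) expanded back into real newlines.
   body = """
   {
     "error": {
       "code": 403,
       "message": "User must be authenticated when user project is provided",
       "status": "PERMISSION_DENIED",
       "details": [
         {
           "@type": "type.googleapis.com/google.rpc.ErrorInfo",
           "reason": "USER_PROJECT_DENIED",
           "domain": "googleapis.com",
           "metadata": {
             "service": "dataflow.googleapis.com",
             "consumer": "projects/cd-np-test1"
           }
         }
       ]
     }
   }
   """

   err = json.loads(body)["error"]
   # The request is denied for the user project, not for the job itself.
   print(err["status"], err["details"][0]["reason"], err["details"][0]["metadata"]["service"])
   # → PERMISSION_DENIED USER_PROJECT_DENIED dataflow.googleapis.com
   ```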
   
   That first log line is expected: the operator should call `google.auth.default()` to get my credentials. With some quick debugging I confirmed that it does, and that it picks up the correct credentials.
   
   After that, retrieving the discovery document fails with a 403. If I force the API client to use a cached copy of the discovery document, the task completes just fine. So authentication works for the calls that actually create the Dataflow job; for some reason only the Dataflow API's discovery document URL rejects the request. Note that you can normally access the discovery document with no auth at all: https://www.googleapis.com/discovery/v1/apis/dataflow/v1b3/rest
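   To substantiate that, a minimal stdlib sketch that fetches the document with no credentials at all (needs network access; assumes nothing beyond the URL above):

   ```python
   import json
   import urllib.request

   # Request the Dataflow discovery document without any Authorization header.
   URL = "https://www.googleapis.com/discovery/v1/apis/dataflow/v1b3/rest"
   with urllib.request.urlopen(URL) as resp:
       status = resp.status
       doc = json.load(resp)

   # An unauthenticated request succeeds, so the 403 above seems specific to
   # how the provider's API client fetches the document.
   print(status, doc["name"], doc["version"])
   ```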
   
   ### What you expected to happen
   
   I should be able to use local Application Default Credentials when running Airflow locally to trigger the Dataflow operator.
   
   ### How to reproduce
   
   I ran the following in a new venv:
   
   ```shell
   gcloud auth application-default login # successfully authenticated as my user
   
   python3 -m pip install apache-airflow-providers-google 
apache-airflow-providers-apache-beam
   export AIRFLOW__CORE__LOAD_EXAMPLES=false
   export AIRFLOW_HOME=$(pwd)/airflowhome
   airflow db init
   
   # Copy the python file into the ./airflowhome/dags (below)
   
   airflow tasks test test_dag dataflow_task $(date +%Y-%m-%dT%H:%M:%S)
   ```
   Contents of the python file:
   
   ```python
   from airflow.providers.google.cloud.operators.dataflow import (
       DataflowStartFlexTemplateOperator,
   )
   from datetime import datetime
   from airflow import DAG
   
   
   with DAG(dag_id="test_dag", schedule_interval=None, 
start_date=datetime.now()) as dag:
   
       dataflow_task = DataflowStartFlexTemplateOperator(
           task_id="dataflow_task",
           body={
               "launchParameter": {
                   "containerSpecGcsPath": "gs://foo/bar/baz.json",
                   "jobName": "test",
                   "environment": {
                       "serviceAccountEmail": 
"[email protected]",
                   },
               }
           },
           wait_until_finished=True,
           location="us-central1",
           project_id="some-project",
       )
   ```
   
   ### Anything else
   
   The `google-api-python-client` dependency is pinned to versions `<2.0`. In newer versions of the client, discovery documents are shipped with the package itself, so the runtime lookup is rarely needed. As long as upgrading that dependency doesn't break anything else, that might be a simple fix.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

