[
https://issues.apache.org/jira/browse/AIRFLOW-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexander Sokolov updated AIRFLOW-3425:
---------------------------------------
Priority: Minor (was: Major)
> GoogleCloudBaseHook doesn't set default scope when google_cloud_default
> connection is empty
> -------------------------------------------------------------------------------------------
>
> Key: AIRFLOW-3425
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3425
> Project: Apache Airflow
> Issue Type: Bug
> Components: gcp, hooks
> Affects Versions: 1.10.1
> Reporter: Alexander Sokolov
> Priority: Minor
>
> The task fails when using Application Default Credentials to authenticate to GCP.
> _If the environment variable {{GOOGLE_APPLICATION_CREDENTIALS}} is set to the
> path of a valid service account JSON private key file, then it is loaded and
> returned. The project ID returned is the project ID defined in the service
> account file if available (some older files do not contain project ID
> information)._
> {{[2018-11-30 11:18:35,038] \{models.py:1569} INFO - Executing
> <Task(FileToGoogleCloudStorageOperator): upload_transformed_data_to_gcs> on
> 2018-11-29T17:54:03.110575+00:00}}
> {{[2018-11-30 11:18:35,039] \{base_task_runner.py:124} INFO - Running:
> ['bash', '-c', 'airflow run transform_raw_data upload_transformed_data_to_gcs
> 2018-11-29T17:54:03.110575+00:00 --job_id 98 --raw -sd
> DAGS_FOLDER/transform_raw_data.py --cfg_path
> /var/folders/5j/n3y23gt93l19lmt1tf4mgpr80000gn/T/tmprbpmva2a']}}
> {{[2018-11-30 11:18:35,606] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs [2018-11-30 11:18:35,606]
> \{settings.py:174} INFO - setting.configure_orm(): Using pool settings.
> pool_size=5, pool_recycle=1800}}
> {{[2018-11-30 11:18:35,981] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs [2018-11-30 11:18:35,979]
> {__init__.py:51} INFO - Using executor LocalExecutor}}
> {{[2018-11-30 11:18:36,202] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs [2018-11-30 11:18:36,201]
> \{models.py:258} INFO - Filling up the DagBag from
> /Users/alex/airflow/dags/transform_raw_data.py}}
> {{[2018-11-30 11:18:38,391] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs [2018-11-30 11:18:38,390]
> \{base_hook.py:83} INFO - Using connection to: localhost}}
> {{[2018-11-30 11:18:43,427] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs [2018-11-30 11:18:43,426]
> \{cli.py:492} INFO - Running <TaskInstance:
> transform_raw_data.upload_transformed_data_to_gcs
> 2018-11-29T17:54:03.110575+00:00 [running]> on host laptop.local}}
> {{[2018-11-30 11:18:48,574] \{logging_mixin.py:95} INFO - [2018-11-30
> 11:18:48,574] \{gcp_api_base_hook.py:86} INFO - Getting connection using
> `google.auth.default()` since no key file is defined for hook.}}
> {{[2018-11-30 11:18:48,583] \{logging_mixin.py:95} INFO - [2018-11-30
> 11:18:48,583] \{discovery.py:267} INFO - URL being requested: GET
> [https://www.googleapis.com/discovery/v1/apis/storage/v1/rest]}}
> {{[2018-11-30 11:18:48,852] \{models.py:1736} ERROR - ('invalid_scope: Empty
> or missing scope not allowed.', '\{\n "error": "invalid_scope",\n
> "error_description": "Empty or missing scope not allowed."\n}')}}
> {{Traceback (most recent call last):}}
> {{ File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/airflow/models.py",
> line 1633, in _run_raw_task}}
> {{ result = task_copy.execute(context=context)}}
> {{ File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/airflow/contrib/operators/file_to_gcs.py",
> line 75, in execute}}
> {{ filename=self.src)}}
> {{ File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/airflow/contrib/hooks/gcs_hook.py",
> line 187, in upload}}
> {{ service = self.get_conn()}}
> {{ File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/airflow/contrib/hooks/gcs_hook.py",
> line 48, in get_conn}}
> {{ 'storage', 'v1', http=http_authorized, cache_discovery=False)}}
> {{ File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/googleapiclient/_helpers.py",
> line 130, in positional_wrapper}}
> {{ return wrapped(*args, **kwargs)}}
> {{ File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/googleapiclient/discovery.py",
> line 222, in build}}
> {{ requested_url, discovery_http, cache_discovery, cache)}}
> {{ File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/googleapiclient/discovery.py",
> line 269, in _retrieve_discovery_doc}}
> {{ resp, content = http.request(actual_url)}}
> {{ File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/google_auth_httplib2.py",
> line 187, in request}}
> {{ self._request, method, uri, request_headers)}}
> {{ File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/google/auth/credentials.py",
> line 122, in before_request}}
> {{ self.refresh(request)}}
> {{ File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/google/oauth2/service_account.py",
> line 322, in refresh}}
> {{ request, self._token_uri, assertion)}}
> {{ File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/google/oauth2/_client.py",
> line 145, in jwt_grant}}
> {{ response_data = _token_endpoint_request(request, token_uri, body)}}
> {{ File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/google/oauth2/_client.py",
> line 111, in _token_endpoint_request}}
> {{ _handle_error_response(response_body)}}
> {{ File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/google/oauth2/_client.py",
> line 61, in _handle_error_response}}
> {{ error_details, response_body)}}
> *{{google.auth.exceptions.RefreshError: ('invalid_scope: Empty or missing
> scope not allowed.', '\{\n "error": "invalid_scope",\n "error_description":
> "Empty or missing scope not allowed."\n}')}}*
> {{[2018-11-30 11:18:48,865] \{models.py:1764} INFO - Marking task as
> FAILED.}}
> {{[2018-11-30 11:18:48,899] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs
> /Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/psycopg2/__init__.py:144:
> UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in
> order to keep installing from binary please use "pip install psycopg2-binary"
> instead. For details see:
> <[http://initd.org/psycopg/docs/install.html#binary-install-from-pypi]>.}}
> {{[2018-11-30 11:18:48,899] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs """)}}
> {{[2018-11-30 11:18:48,904] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs Traceback (most recent call last):}}
> {{[2018-11-30 11:18:48,904] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/bin/airflow", line 32,
> in <module>}}
> {{[2018-11-30 11:18:48,905] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs args.func(args)}}
> {{[2018-11-30 11:18:48,905] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/airflow/utils/cli.py",
> line 74, in wrapper}}
> {{[2018-11-30 11:18:48,905] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs return f(*args, **kwargs)}}
> {{[2018-11-30 11:18:48,905] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/airflow/bin/cli.py",
> line 498, in run}}
> {{[2018-11-30 11:18:48,905] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs _run(args, dag, ti)}}
> {{[2018-11-30 11:18:48,905] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/airflow/bin/cli.py",
> line 402, in _run}}
> {{[2018-11-30 11:18:48,905] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs pool=args.pool,}}
> {{[2018-11-30 11:18:48,905] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/airflow/utils/db.py",
> line 74, in wrapper}}
> {{[2018-11-30 11:18:48,906] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs return func(*args, **kwargs)}}
> {{[2018-11-30 11:18:48,906] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/airflow/models.py",
> line 1633, in _run_raw_task}}
> {{[2018-11-30 11:18:48,906] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs result =
> task_copy.execute(context=context)}}
> {{[2018-11-30 11:18:48,906] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/airflow/contrib/operators/file_to_gcs.py",
> line 75, in execute}}
> {{[2018-11-30 11:18:48,906] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs filename=self.src)}}
> {{[2018-11-30 11:18:48,906] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/airflow/contrib/hooks/gcs_hook.py",
> line 187, in upload}}
> {{[2018-11-30 11:18:48,906] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs service = self.get_conn()}}
> {{[2018-11-30 11:18:48,907] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/airflow/contrib/hooks/gcs_hook.py",
> line 48, in get_conn}}
> {{[2018-11-30 11:18:48,907] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs 'storage', 'v1',
> http=http_authorized, cache_discovery=False)}}
> {{[2018-11-30 11:18:48,907] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/googleapiclient/_helpers.py",
> line 130, in positional_wrapper}}
> {{[2018-11-30 11:18:48,907] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs return wrapped(*args, **kwargs)}}
> {{[2018-11-30 11:18:48,907] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/googleapiclient/discovery.py",
> line 222, in build}}
> {{[2018-11-30 11:18:48,907] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs requested_url, discovery_http,
> cache_discovery, cache)}}
> {{[2018-11-30 11:18:48,908] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/googleapiclient/discovery.py",
> line 269, in _retrieve_discovery_doc}}
> {{[2018-11-30 11:18:48,908] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs resp, content =
> http.request(actual_url)}}
> {{[2018-11-30 11:18:48,908] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/google_auth_httplib2.py",
> line 187, in request}}
> {{[2018-11-30 11:18:48,908] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs self._request, method, uri,
> request_headers)}}
> {{[2018-11-30 11:18:48,908] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/google/auth/credentials.py",
> line 122, in before_request}}
> {{[2018-11-30 11:18:48,908] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs self.refresh(request)}}
> {{[2018-11-30 11:18:48,908] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/google/oauth2/service_account.py",
> line 322, in refresh}}
> {{[2018-11-30 11:18:48,908] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs request, self._token_uri,
> assertion)}}
> {{[2018-11-30 11:18:48,908] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/google/oauth2/_client.py",
> line 145, in jwt_grant}}
> {{[2018-11-30 11:18:48,908] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs response_data =
> _token_endpoint_request(request, token_uri, body)}}
> {{[2018-11-30 11:18:48,909] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/google/oauth2/_client.py",
> line 111, in _token_endpoint_request}}
> {{[2018-11-30 11:18:48,909] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs
> _handle_error_response(response_body)}}
> {{[2018-11-30 11:18:48,909] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs File
> "/Users/alex/.local/share/virtualenvs/project-X9xQGmcz/lib/python3.6/site-packages/google/oauth2/_client.py",
> line 61, in _handle_error_response}}
> {{[2018-11-30 11:18:48,909] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs error_details, response_body)}}
> {{[2018-11-30 11:18:48,909] \{base_task_runner.py:107} INFO - Job 98:
> Subtask upload_transformed_data_to_gcs *google.auth.exceptions.RefreshError:
> ('invalid_scope: Empty or missing scope not allowed.', '\{\n "error":
> "invalid_scope",\n "error_description": "Empty or missing scope not
> allowed."\n}')*}}
> {{[2018-11-30 11:18:50,180] \{logging_mixin.py:95} INFO - [2018-11-30
> 11:18:50,178] \{jobs.py:2612} INFO - Task exited with return code 1}}
> h3. Prerequisites:
> export GOOGLE_APPLICATION_CREDENTIALS=~/keys/google_service_account.json
> Keep the *_google_cloud_default_* connection empty.
>
> h3. Root cause:
> The connection's extra parameters are:
> {'extra__jdbc__drv_path': '', 'extra__jdbc__drv_clsname': '',
> 'extra__google_cloud_platform__project': '',
> 'extra__google_cloud_platform__key_path': '',
> 'extra__google_cloud_platform__keyfile_dict': '',
> 'extra__google_cloud_platform__scope': ''}
> The code at
> [https://github.com/apache/incubator-airflow/blob/master/airflow/contrib/hooks/gcp_api_base_hook.py#L78]
> checks whether the scope is {{None}}, but here it is an empty string, so the
> default scope is never applied.
>
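The root cause above can be illustrated with a minimal, self-contained sketch. This is not the actual hook code: the function names resolve_scopes_buggy and resolve_scopes_fixed are hypothetical, and the DEFAULT_SCOPES value is an assumption made for illustration. It only shows why an "is None" check lets an empty scope string through, while a truthiness check falls back to the default.

```python
# Assumed default scope, for illustration only.
DEFAULT_SCOPES = ("https://www.googleapis.com/auth/cloud-platform",)

SCOPE_KEY = "extra__google_cloud_platform__scope"


def resolve_scopes_buggy(extras):
    """Hypothetical check that only treats None as 'no scope given'."""
    scope = extras.get(SCOPE_KEY)
    if scope is not None:
        # An empty string passes the check and splits into [''].
        return [s.strip() for s in scope.split(",")]
    return list(DEFAULT_SCOPES)


def resolve_scopes_fixed(extras):
    """Hypothetical check that treats None and '' the same way."""
    scope = extras.get(SCOPE_KEY)
    if scope:
        # Drop blank entries so 'a, ,b' does not yield an empty scope.
        return [s.strip() for s in scope.split(",") if s.strip()]
    return list(DEFAULT_SCOPES)


if __name__ == "__main__":
    empty_connection = {SCOPE_KEY: ""}
    print(resolve_scopes_buggy(empty_connection))
    print(resolve_scopes_fixed(empty_connection))
```

With the empty connection from the report, the first function returns a list holding one empty scope, which matches the "Empty or missing scope not allowed" response, while the second returns the assumed default.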
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)