KCIV opened a new issue #21276:
URL: https://github.com/apache/airflow/issues/21276
### Apache Airflow version
2.2.3 (latest released)
### What happened
Calling requests.get inside an Airflow task fails with a remote connection
closure whenever the request takes more than about 5 minutes, and this limit
doesn't seem to be configurable anywhere. Even setting
requests.get(_url, timeout=None) causes the same remote connection closure.
I have tested the endpoint using Chrome, Firefox, Python on Linux (VM),
Python on Windows, and Swagger; none of them gets the same remote connection
failure.
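To compare the clients like for like, the call can be wrapped in a small timing helper and run both inside the Airflow task and outside it. The sketch below is only an illustration: `timed_call` is a hypothetical helper name, not part of Airflow or requests, and it simply records how long a call ran and whether it returned or raised.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Call fn and report how long it ran and how it ended.

    Returns (elapsed_seconds, result, error). error is None on success;
    result is None when fn raised.
    """
    start = time.monotonic()
    try:
        result, error = fn(*args, **kwargs), None
    except Exception as exc:  # keep the exception so it can be logged
        result, error = None, exc
    return time.monotonic() - start, result, error


# Hypothetical usage inside the task (requests and the URL come from this report):
# elapsed, resp, err = timed_call(requests.get, api + "/sleeptest", timeout=None)
# print(f"finished after {elapsed:.1f}s, response={resp!r}, error={err!r}")
```

If the elapsed time at failure is the same on every run inside the container but the call succeeds elsewhere, that points at something specific to the container's network path rather than at the endpoint.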
### What you expected to happen
Because this connection closure only happens when requests.get is called
inside Airflow and not in any other environment, I would expect Airflow to
have a configuration option for this timeout, or a setting to remove the
timeout (although that may be bad practice).
The Airflow logs point towards the requests library; however, it seems to be
an issue in how Airflow uses the library rather than in the library itself.
### How to reproduce
I have the following API endpoint set up using basic FastAPI:
```
import time
from fastapi import FastAPI
app = FastAPI()

@app.get('/sleeptest')
def sleep_test():
    time.sleep(600)  # hold the request open for 10 minutes
    return "completed"
```
This endpoint can be called from a browser, Swagger, Python (Linux VM) and
Python (Windows 10) using a simple requests.get(_url), and it successfully
returns "completed" after the 600 seconds.
However, when a task in Airflow is built like this:
```
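import requests
from airflow.decorators import task

@task()
def get_sleep_test(resulstofprior):
    # `api` holds the base URL of the service and is defined elsewhere in the DAG file
    returns = requests.get(api + "/sleeptest")
    print(returns)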
```
Airflow throws the following error after roughly 5 minutes every time (22:36:11 to 22:41:35 in the run below).
```
Reading local file:
/opt/airflow/logs/package/get_sleep_test/2022-02-01T21:21:02.173122+00:00/7.log
[2022-02-01, 22:36:11 UTC] {taskinstance.py:1032} INFO - Dependencies all
met for <TaskInstance: package.get_sleep_test
manual__2022-02-01T21:21:02.173122+00:00 [queued]>
[2022-02-01, 22:36:11 UTC] {taskinstance.py:1032} INFO - Dependencies all
met for <TaskInstance: package.get_sleep_test
manual__2022-02-01T21:21:02.173122+00:00 [queued]>
[2022-02-01, 22:36:11 UTC] {taskinstance.py:1238} INFO -
--------------------------------------------------------------------------------
[2022-02-01, 22:36:11 UTC] {taskinstance.py:1239} INFO - Starting attempt 7
of 7
[2022-02-01, 22:36:11 UTC] {taskinstance.py:1240} INFO -
--------------------------------------------------------------------------------
[2022-02-01, 22:36:11 UTC] {taskinstance.py:1259} INFO - Executing
<Task(_PythonDecoratedOperator): get_sleep_test> on 2022-02-01
21:21:02.173122+00:00
[2022-02-01, 22:36:11 UTC] {standard_task_runner.py:52} INFO - Started
process 2682 to run task
[2022-02-01, 22:36:11 UTC] {standard_task_runner.py:76} INFO - Running:
['***', 'tasks', 'run', 'package', 'get_sleep_test',
'manual__2022-02-01T21:21:02.173122+00:00', '--job-id', '58', '--raw',
'--subdir', 'DAGS_FOLDER/package.py', '--cfg-path', '/tmp/tmpc70kpyas',
'--error-file', '/tmp/tmp7dab7rsa']
[2022-02-01, 22:36:11 UTC] {standard_task_runner.py:77} INFO - Job 58:
Subtask get_sleep_test
[2022-02-01, 22:36:11 UTC] {logging_mixin.py:109} INFO - Running
<TaskInstance: package.get_sleep_test manual__2022-02-01T21:21:02.173122+00:00
[running]> on host 607ca202a787
[2022-02-01, 22:36:11 UTC] {taskinstance.py:1426} INFO - Exporting the
following env vars:
AIRFLOW_CTX_DAG_OWNER=***
AIRFLOW_CTX_DAG_ID=package
AIRFLOW_CTX_TASK_ID=get_sleep_test
AIRFLOW_CTX_EXECUTION_DATE=2022-02-01T21:21:02.173122+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2022-02-01T21:21:02.173122+00:00
[2022-02-01, 22:36:11 UTC] {logging_mixin.py:109} INFO - starting sleep test
[2022-02-01, 22:36:11 UTC] {logging_mixin.py:109} INFO - starting sleep test
[2022-02-01, 22:41:35 UTC] {taskinstance.py:1700} ERROR - Task failed with
exception
Traceback (most recent call last):
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 706, in urlopen
chunked=chunked,
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 445, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 440, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.7/http/client.py", line 1373, in getresponse
response.begin()
File "/usr/local/lib/python3.7/http/client.py", line 319, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.7/http/client.py", line 288, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File
"/home/airflow/.local/lib/python3.7/site-packages/requests/adapters.py", line
449, in send
timeout=timeout
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 756, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/util/retry.py", line
532, in increment
raise six.reraise(type(error), error, _stacktrace)
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/packages/six.py",
line 769, in reraise
raise value.with_traceback(tb)
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 706, in urlopen
chunked=chunked,
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 445, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 440, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.7/http/client.py", line 1373, in getresponse
response.begin()
File "/usr/local/lib/python3.7/http/client.py", line 319, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.7/http/client.py", line 288, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.',
RemoteDisconnected('Remote end closed connection without response'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py",
line 1329, in _run_raw_task
self._execute_task_with_callbacks(context)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py",
line 1455, in _execute_task_with_callbacks
result = self._execute_task(context, self.task)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py",
line 1506, in _execute_task
result = execute_callable(context=context)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/decorators/base.py",
line 134, in execute
return_value = super().execute(context)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/operators/python.py",
line 174, in execute
return_value = self.execute_callable()
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/operators/python.py",
line 185, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "/opt/airflow/dags/package.py", line 108, in get_sleep_test
returns = requests.get(api + "/sleeptest", timeout=None)
File "/home/airflow/.local/lib/python3.7/site-packages/requests/api.py",
line 75, in get
return request('get', url, params=params, **kwargs)
File "/home/airflow/.local/lib/python3.7/site-packages/requests/api.py",
line 61, in request
return session.request(method=method, url=url, **kwargs)
File
"/home/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line
542, in request
resp = self.send(prep, **send_kwargs)
File
"/home/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line
655, in send
r = adapter.send(request, **kwargs)
File
"/home/airflow/.local/lib/python3.7/site-packages/requests/adapters.py", line
498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.',
RemoteDisconnected('Remote end closed connection without response'))
[2022-02-01, 22:41:35 UTC] {taskinstance.py:1277} INFO - Marking task as
FAILED. dag_id=package, task_id=get_sleep_test, execution_date=20220201T212102,
start_date=20220201T223611, end_date=20220201T224135
[2022-02-01, 22:41:35 UTC] {standard_task_runner.py:92} ERROR - Failed to
execute job 58 for task get_sleep_test
Traceback (most recent call last):
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 706, in urlopen
chunked=chunked,
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 445, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 440, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.7/http/client.py", line 1373, in getresponse
response.begin()
File "/usr/local/lib/python3.7/http/client.py", line 319, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.7/http/client.py", line 288, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File
"/home/airflow/.local/lib/python3.7/site-packages/requests/adapters.py", line
449, in send
timeout=timeout
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 756, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/util/retry.py", line
532, in increment
raise six.reraise(type(error), error, _stacktrace)
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/packages/six.py",
line 769, in reraise
raise value.with_traceback(tb)
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 706, in urlopen
chunked=chunked,
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 445, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File
"/home/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py",
line 440, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.7/http/client.py", line 1373, in getresponse
response.begin()
File "/usr/local/lib/python3.7/http/client.py", line 319, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.7/http/client.py", line 288, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.',
RemoteDisconnected('Remote end closed connection without response'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/task/task_runner/standard_task_runner.py",
line 85, in _start_by_fork
args.func(args, dag=self.dag)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/cli/cli_parser.py",
line 48, in command
return func(*args, **kwargs)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/utils/cli.py", line
92, in wrapper
return f(*args, **kwargs)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/cli/commands/task_command.py",
line 298, in task_run
_run_task_by_selected_method(args, dag, ti)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/cli/commands/task_command.py",
line 107, in _run_task_by_selected_method
_run_raw_task(args, ti)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/cli/commands/task_command.py",
line 184, in _run_raw_task
error_file=args.error_file,
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/utils/session.py",
line 70, in wrapper
return func(*args, session=session, **kwargs)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py",
line 1329, in _run_raw_task
self._execute_task_with_callbacks(context)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py",
line 1455, in _execute_task_with_callbacks
result = self._execute_task(context, self.task)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/models/taskinstance.py",
line 1506, in _execute_task
result = execute_callable(context=context)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/decorators/base.py",
line 134, in execute
return_value = super().execute(context)
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/operators/python.py",
line 174, in execute
return_value = self.execute_callable()
File
"/home/airflow/.local/lib/python3.7/site-packages/airflow/operators/python.py",
line 185, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "/opt/airflow/dags/package.py", line 108, in get_sleep_test
returns = requests.get(api + "/sleeptest", timeout=None)
File "/home/airflow/.local/lib/python3.7/site-packages/requests/api.py",
line 75, in get
return request('get', url, params=params, **kwargs)
File "/home/airflow/.local/lib/python3.7/site-packages/requests/api.py",
line 61, in request
return session.request(method=method, url=url, **kwargs)
File
"/home/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line
542, in request
resp = self.send(prep, **send_kwargs)
File
"/home/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line
655, in send
r = adapter.send(request, **kwargs)
File
"/home/airflow/.local/lib/python3.7/site-packages/requests/adapters.py", line
498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.',
RemoteDisconnected('Remote end closed connection without response'))
[2022-02-01, 22:41:35 UTC] {local_task_job.py:154} INFO - Task exited with
return code 1
[2022-02-01, 22:41:35 UTC] {local_task_job.py:264} INFO - 0 downstream tasks
scheduled from follow-on schedule check
```
I have also tried making the endpoint async, using await with
asyncio.sleep(), but that did not fix the issue of the request being cut off
under Airflow.
I have tried countless timeout changes, both in the Airflow .env settings
and in the FastAPI settings, as well as ping, poke, and retry, with no change
in the failure point.
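Two details in the traceback narrow things down. `timeout=None` tells requests to wait indefinitely, and `RemoteDisconnected` is raised by `http.client` when the peer closes the socket before sending a status line. So the client itself is not timing out; something on the remote side of the socket is closing it. One hypothesis, not confirmed anywhere in this report, is that a layer between the container and the endpoint drops connections that sit idle for about five minutes. A way to probe that is to repeat the request with TCP keepalive enabled so the connection never looks idle. The sketch below uses only the standard library; `get_with_keepalive` and its default intervals are illustrative assumptions, and it handles plain `http://` URLs only.

```python
import http.client
import socket
from urllib.parse import urlsplit


def get_with_keepalive(url, idle_seconds=60, interval_seconds=30):
    """GET a plain-http URL with TCP keepalive enabled and no client timeout."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=None)
    try:
        conn.connect()
        # Ask the kernel to send keepalive probes on this socket.
        conn.sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
        # Fine-tuning knobs are platform specific (present on Linux), so guard them.
        if hasattr(socket, "TCP_KEEPIDLE"):
            conn.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
        if hasattr(socket, "TCP_KEEPINTVL"):
            conn.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_seconds)

        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


# Hypothetical usage inside the task:
# status, body = get_with_keepalive(api + "/sleeptest")
```

If this variant survives past the five-minute mark where requests.get does not, the closure is likely an idle-connection drop on the network path rather than an Airflow setting. The same socket options can probably be supplied to requests through a custom HTTPAdapter that forwards `socket_options` to urllib3's pool manager, but that is untested here.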
### Operating System
Windows 10
### Versions of Apache Airflow Providers
Running under Docker Desktop, using the compose file with image 2.2.3.
### Deployment
Docker-Compose
### Deployment details
Docker Engine v20.10.8.
The process is to run docker-compose up airflow-init from the Airflow project
directory, then docker-compose up to run Airflow.
### Anything else
Hopefully I have not made any grave mistakes in setting up this example; it
feels extremely bare-bones and easy to reproduce.
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)