george-zubrienko commented on issue #19038:
URL: https://github.com/apache/airflow/issues/19038#issuecomment-987970167
Seeing this issue a lot with 3 schedulers on Airflow 2.2.2, KubernetesExecutor
on k8s 1.21. The code is a PythonOperator calling a webservice. We do get some
logs on startup though. This is right before the job code starts:
```
[2021-12-07, 13:32:00 UTC] {taskinstance.py:1262} INFO - Executing <Task(PythonOperator): ...> on 2021-12-07 06:00:00+00:00
[2021-12-07, 13:32:00 UTC] {standard_task_runner.py:52} INFO - Started process 13 to run task
[2021-12-07, 13:32:00 UTC] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', ....]
[2021-12-07, 13:32:00 UTC] {standard_task_runner.py:77} INFO - Job 129472: Subtask ...
[2021-12-07, 13:32:05 UTC] {local_task_job.py:211} WARNING - State of this instance has been externally set to queued. Terminating instance.
[2021-12-07, 13:32:05 UTC] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 13
[2021-12-07, 13:32:07 UTC] {process_utils.py:66} INFO - Process psutil.Process(pid=13, status='terminated', exitcode=1, started='13:32:00') (13) terminated with exit code 1
```
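For anyone trying to spot the same pattern in their own task logs, here is a rough sketch that scans a log for the "externally set to" warning and reports how long after the first entry it appeared. It is not Airflow code: the regexes are an assumption based only on the line shape in the excerpt above, and `iter_entries` and `find_external_state_change` are hypothetical helper names.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the excerpt above:
# "[<date>, <time> UTC] {<file>:<line>} <LEVEL> - <message>"
LOG_LINE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2}, \d{2}:\d{2}:\d{2}) UTC\] "
    r"\{(?P<src>[^}]+)\} (?P<level>[A-Z]+) - (?P<msg>.*)"
)
EXTERNAL_STATE = re.compile(r"externally set to (\w+)")


def iter_entries(log_text):
    """Yield one string per log entry, re-joining hard-wrapped lines."""
    entry = None
    for line in log_text.splitlines():
        if line.startswith("["):
            if entry is not None:
                yield entry
            entry = line.strip()
        elif entry is not None and line.strip():
            # Continuation of a wrapped entry
            entry += " " + line.strip()
    if entry is not None:
        yield entry


def find_external_state_change(log_text):
    """Return (seconds_since_first_entry, state) for the first
    'externally set to <state>' message, or None if there is none."""
    first_ts = None
    for entry in iter_entries(log_text):
        m = LOG_LINE.match(entry)
        if not m:
            continue
        ts = datetime.strptime(m["ts"], "%Y-%m-%d, %H:%M:%S")
        if first_ts is None:
            first_ts = ts
        hit = EXTERNAL_STATE.search(m["msg"])
        if hit:
            return (ts - first_ts).total_seconds(), hit.group(1)
    return None


SAMPLE = """\
[2021-12-07, 13:32:00 UTC] {standard_task_runner.py:52} INFO - Started
process 13 to run task
[2021-12-07, 13:32:05 UTC] {local_task_job.py:211} WARNING - State of this
instance has been externally set to queued. Terminating instance.
"""

print(find_external_state_change(SAMPLE))
```

On the two sample entries this reports the state `queued` five seconds after the first entry, matching the timestamps in the log above.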
The next try for this task started running, but then:
```
[2021-12-07, 13:32:04 UTC] {chained.py:84} INFO - DefaultAzureCredential acquired a token from EnvironmentCredential
[2021-12-07, 13:32:06 UTC] {taskinstance.py:1411} ERROR - Received SIGTERM. Terminating subprocesses.
[2021-12-07, 13:32:06 UTC] {taskinstance.py:1703} ERROR - Task failed with exception
```
Retry 3 then finally goes through. We only see this issue on pipelines with a
high number of parallel tasks; in our case, 3 task pools with a total capacity
of 48 + 48 + 90 slots. Also, with more than one scheduler, the scheduler pods
sometimes print this to their logs:
```
sqlalchemy.exc.OperationalError: (psycopg2.errors.DeadlockDetected) deadlock detected
DETAIL: Process 1368 waits for ShareLock on transaction 30815670; blocked by process 1160.
Process 1160 waits for ShareLock on transaction 30815664; blocked by process 1368.
HINT: See server log for query details.
CONTEXT: while updating tuple (13513,3) in relation "task_instance"
```
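To make the deadlock message easier to read, here is a small sketch that turns the DETAIL text into a waits-for graph and looks for a cycle. It only illustrates what the message says (two backends each waiting on the other's transaction) and is not how PostgreSQL or Airflow detects deadlocks; the regex is an assumption based on the wording above, and `waits_for_graph` and `find_cycle` are hypothetical helper names.

```python
import re

# Assumed wording, copied from the DETAIL lines above
WAIT = re.compile(
    r"Process (\d+) waits for \w+ on transaction \d+; blocked by process (\d+)"
)


def waits_for_graph(detail):
    """Map each waiting pid to the set of pids blocking it."""
    # Collapse whitespace so hard-wrapped lines still match
    flat = " ".join(detail.split())
    graph = {}
    for waiter, blocker in WAIT.findall(flat):
        graph.setdefault(int(waiter), set()).add(int(blocker))
    return graph


def find_cycle(graph):
    """Return a list of pids forming a wait cycle, or None."""
    def visit(node, path):
        if node in path:
            # Reached a pid already on the current path: that is the cycle
            return path[path.index(node):]
        for nxt in graph.get(node, ()):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in graph:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


DETAIL = """\
DETAIL: Process 1368 waits for ShareLock on transaction 30815670; blocked
by process 1160.
Process 1160 waits for ShareLock on transaction 30815664; blocked by process
1368.
"""

graph = waits_for_graph(DETAIL)
print(graph)
print(find_cycle(graph))
```

Here process 1368 waits on 1160 and 1160 waits on 1368, so the two form a cycle. Both were updating rows in `task_instance`, per the CONTEXT line.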