zockette opened a new issue #19397:
URL: https://github.com/apache/airflow/issues/19397
### Apache Airflow version
2.2.1 (latest released)
### Operating System
RHEL 7.8
### Versions of Apache Airflow Providers
apache-airflow-providers-celery==2.1.0
apache-airflow-providers-ftp==2.0.1
apache-airflow-providers-http==2.0.1
apache-airflow-providers-imap==2.0.1
apache-airflow-providers-microsoft-azure==3.2.0
apache-airflow-providers-mysql==2.1.1
apache-airflow-providers-oracle==2.0.1
apache-airflow-providers-sqlite==2.0.1
### Deployment
Virtualenv installation
### Deployment details
Forced install of mysqlclient==1.4.6 because 2.x doesn't work with our setup
(it requires some extra libs which I can't freely install on the host).
### What happened
I've attempted to upgrade from 2.1.2 directly to 2.2.1 when running "airflow
db upgrade" and ran into many DDL issues some of which I couldn't solve.
I've also attempted to upgrade to 2.1.3 (ok), then 2.1.4 (ok) and then 2.2.0
and face the same issues as with 2.2.1.
1. Fail: "task_instance"
sqlalchemy.exc.DatabaseError: (mysql.connector.errors.DatabaseError)
1093 (HY000): You can't specify target table 'task_instance' for update in FROM
clause
[SQL: delete from task_instance
where (task_id, dag_id,
execution_date) IN (
select source.task_id,
source.dag_id, source.execution_date
from task_instance as
source
left join dag_run as dr
on (source.dag_id =
dr.dag_id and source.execution_date = dr.execution_date)
where dr.id is null
)
]
Fix:
delete from task_instance where
concat(task_instance.execution_date,task_instance.task_id,task_instance.dag_id)
in ( select plop from (SELECT
concat(task_instance.execution_date,task_instance.task_id,task_instance.dag_id)
plop FROM task_instance LEFT JOIN dag_run ON task_instance.dag_id =
dag_run.dag_id AND task_instance.execution_date = dag_run.execution_date WHERE
dag_run.run_id IS NULL) as trash);
2. Fail: "task_fail"
[2021-11-03 19:02:11,281] {db.py:815} ERROR - Automatic migration is
not available
[2021-11-03 19:02:11,282] {db.py:817} ERROR - The task_fail table has 2
rows without a corresponding dag_run row. You must manually correct this
problem (possibly by deleting the problem rows).
Fix:
select id from task_fail LEFT JOIN dag_run ON dag_run.dag_id =
task_fail.dag_id AND dag_run.execution_date = task_fail.execution_date WHERE
dag_run.run_id IS NULL; --id 92, 93
delete from task_fail where id in (##, ##);
3. Fail: "task_reschedule_ti_fkey"
sqlalchemy.exc.DatabaseError: (mysql.connector.errors.DatabaseError)
1215 (HY000): Cannot add foreign key constraint
[SQL: ALTER TABLE task_reschedule ADD CONSTRAINT
task_reschedule_ti_fkey FOREIGN KEY(dag_id, task_id, run_id) REFERENCES
task_instance (dag_id, task_id, run_id) ON DELETE CASCADE]
Fix:
ALTER TABLE task_reschedule MODIFY dag_id VARCHAR(250)
CHARACTER SET utf8 COLLATE utf8_bin; --didn't have a collation set.
4. Fail: "dag_id"
sqlalchemy.exc.ProgrammingError:
(mysql.connector.errors.ProgrammingError) 1091 (42000): Can't DROP 'dag_id';
check that column/key exists
[SQL: ALTER TABLE dag_run DROP INDEX dag_id]
Fix:
create index dag_id on dag_run(dag_id);
5. Fail: "dag_id_2"
sqlalchemy.exc.ProgrammingError:
(mysql.connector.errors.ProgrammingError) 1091 (42000): Can't DROP 'dag_id_2';
check that column/key exists
[SQL: ALTER TABLE dag_run DROP INDEX dag_id_2]
Fix:
create index dag_id_2 on dag_run(dag_id);
6. Fail: "dag_run_dag_id_execution_date_key"
Error on ALTER TABLE dag_run ADD CONSTRAINT
dag_run_dag_id_execution_date_key UNIQUE (dag_id, execution_date)]
Fix: None.
I gave up on 6 :(
### What you expected to happen
I hoped it would work 👍
### How to reproduce
I can reproduce at will on my setup by just running "airflow db upgrade"
against my 2.1.2 mysql db.
### Anything else
I'm guessing it could either be caused by mysqlclient==1.4.6 somehow not
being fully compatible with the upgrade queries or because something went wrong
and unnoticed the previous upgrade from 2.0.1 to 2.1.2.
Thanks for you help!
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]