zockette opened a new issue #19397:
URL: https://github.com/apache/airflow/issues/19397


   ### Apache Airflow version
   
   2.2.1 (latest released)
   
   ### Operating System
   
   RHEL 7.8
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-celery==2.1.0
   apache-airflow-providers-ftp==2.0.1
   apache-airflow-providers-http==2.0.1
   apache-airflow-providers-imap==2.0.1
   apache-airflow-providers-microsoft-azure==3.2.0
   apache-airflow-providers-mysql==2.1.1
   apache-airflow-providers-oracle==2.0.1
   apache-airflow-providers-sqlite==2.0.1
   
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   Forced install of mysqlclient==1.4.6 because 2.x doesn't work with our setup 
(it requires some extra libs which I can't freely install on the host).
   
   ### What happened
   
   I've attempted to upgrade from 2.1.2 directly to 2.2.1 when running "airflow 
db upgrade" and ran into many DDL issues some of which I couldn't solve.
   I've also attempted to upgrade to 2.1.3 (ok), then 2.1.4 (ok) and then 2.2.0 
and face the same issues as with 2.2.1.
   
   1. Fail: "task_instance"
        sqlalchemy.exc.DatabaseError: (mysql.connector.errors.DatabaseError) 
1093 (HY000): You can't specify target table 'task_instance' for update in FROM 
clause
                                [SQL: delete from task_instance
                                                where (task_id, dag_id, 
execution_date) IN (
                                                        select source.task_id, 
source.dag_id, source.execution_date
                                                        from task_instance as 
source
                                                        left join dag_run as dr
                                                        on (source.dag_id = 
dr.dag_id and source.execution_date = dr.execution_date)
                                                        where dr.id is null
                                                )
                                ]
        Fix:
                delete from task_instance where 
concat(task_instance.execution_date,task_instance.task_id,task_instance.dag_id) 
in ( select plop from (SELECT 
concat(task_instance.execution_date,task_instance.task_id,task_instance.dag_id) 
plop FROM task_instance LEFT JOIN dag_run ON task_instance.dag_id = 
dag_run.dag_id AND task_instance.execution_date = dag_run.execution_date WHERE 
dag_run.run_id IS NULL) as trash);
   2. Fail: "task_fail"
        [2021-11-03 19:02:11,281] {db.py:815} ERROR - Automatic migration is 
not available
        [2021-11-03 19:02:11,282] {db.py:817} ERROR - The task_fail table has 2 
rows without a corresponding dag_run row. You must manually correct this 
problem (possibly by deleting the problem rows).
        Fix: 
                select id from task_fail LEFT JOIN dag_run ON dag_run.dag_id = 
task_fail.dag_id AND dag_run.execution_date = task_fail.execution_date WHERE 
dag_run.run_id IS NULL; --id 92, 93
                delete from task_fail where id in (##, ##);
   3. Fail: "task_reschedule_ti_fkey"
        sqlalchemy.exc.DatabaseError: (mysql.connector.errors.DatabaseError) 
1215 (HY000): Cannot add foreign key constraint
        [SQL: ALTER TABLE task_reschedule ADD CONSTRAINT 
task_reschedule_ti_fkey FOREIGN KEY(dag_id, task_id, run_id) REFERENCES 
task_instance (dag_id, task_id, run_id) ON DELETE CASCADE]
        Fix:
                ALTER TABLE task_reschedule MODIFY dag_id VARCHAR(250) 
CHARACTER SET utf8 COLLATE utf8_bin; --didn't have a collation set.
   4. Fail: "dag_id"
                sqlalchemy.exc.ProgrammingError: 
(mysql.connector.errors.ProgrammingError) 1091 (42000): Can't DROP 'dag_id'; 
check that column/key exists
                [SQL: ALTER TABLE dag_run DROP INDEX dag_id]
        Fix:
                create index dag_id on dag_run(dag_id);
   5. Fail: "dag_id_2"
                sqlalchemy.exc.ProgrammingError: 
(mysql.connector.errors.ProgrammingError) 1091 (42000): Can't DROP 'dag_id_2'; 
check that column/key exists
                [SQL: ALTER TABLE dag_run DROP INDEX dag_id_2]
        Fix:
                create index dag_id_2 on dag_run(dag_id);
   
   6. Fail: "dag_run_dag_id_execution_date_key"
        Error on ALTER TABLE dag_run ADD CONSTRAINT 
dag_run_dag_id_execution_date_key UNIQUE (dag_id, execution_date)]
        Fix: None.
   
   I gave up on 6 :(
   
   ### What you expected to happen
   
   I hoped it would work 👍 
   
   
   
   ### How to reproduce
   
   I can reproduce at will on my setup by just running "airflow db upgrade" 
against my 2.1.2 mysql db.
   
   ### Anything else
   
   I'm guessing it could either be caused by mysqlclient==1.4.6 somehow not 
being fully compatible with the upgrade queries or because something went wrong 
and unnoticed the previous upgrade from 2.0.1 to 2.1.2.
   
   Thanks for you help!
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to