Re: [DISCUSS] indexes for API calls

2024-05-31 Thread Daniel Standish
Lots of good discussion here. We should create separate threads for the questions about (1) whether to keep or drop mysql / mssql / sqlite / mongodb / just-pipe-to-/dev/null/ and (2) to UUID or not to UUID and (3) database agnosticism a.k.a. an interface. But some responses... Using UUIDs was th

[RESULT][VOTE] May 2024 PR of the Month

2024-05-31 Thread Briana Okyere
Hey All, Congratulations to Daniel Standish on winning PR of the Month with 8 votes for PR #39336: Scheduler to handle incrementing of try_number. Well done all! PR #39336 will be featured in the May 2024 Newsletter, and thank you to all those who mad

Re: [DISCUSS] indexes for API calls

2024-05-31 Thread Andrey Anshin
> however it might or might not be affected. Similarly MariaDB, but MySQL does not seem to have proper UUID support, so we should really use UUID7 rather than UUID4 for such UUIDs in case we do not want to affect insert performance on MySQL. Some NITs here, I guess better to use deterministic UUID

Re: [VOTE] Airflow Providers prepared on May 30, 2024

2024-05-31 Thread Pankaj Koti
+1 (non-binding) Concurring with Wei! Best regards, *Pankaj Koti* Senior Software Engineer (Airflow OSS Engineering team) Location: Pune, Maharashtra, India Timezone: Indian Standard Time (IST) On Fri, May 31, 2024 at 8:55 AM Wei Lee wrote: > +1 (non-binding) > > Tested my changes and our ex

Re: [DISCUSS] Restore the SQL server backend

2024-05-31 Thread Wei Lee
I agree with Jed and the following comments. If my memory serves me right, this topic has been discussed a few times in the past. 5% doesn't seem very convincing. Even if it's biased, I'm still not persuaded that there are a large number of users that are worth the community's effort. And Jarek

Re: [DISCUSS] indexes for API calls

2024-05-31 Thread Vincent Beck
Interesting thread. I think what makes this discussion complex is that Airflow makes a lot of different queries (API, Scheduler, ...). I think it is even harder to keep track of all the different queries Airflow makes, and thus hard to figure out whether such an index is needed. Also, Airflow evolves (and

Re: [DISCUSS] indexes for API calls

2024-05-31 Thread Andrey Anshin
IMHO, blindly adding new indexes to the `dag_run` and `task_instance` tables will cause additional maintenance costs. There are already 8 indexes on each of these tables: SELECT pi.schemaname schema_name, pi.tablename table_name, count(*) num FROM pg_indexes pi WHERE pi.scheman

Re: [DISCUSS] Restore the SQL server backend

2024-05-31 Thread Elad Kalif
I agree with Jarek. I am a bit worried about the mental model of this proposal, as you are offering to deliver a feature but you are not offering to be a community member. I had a lot of frustration with the MsSQL backend tests; it really caused me pain as a contributor. According to your mental mod

Re: [DISCUSS] indexes for API calls

2024-05-31 Thread Pierre Jeambrun
Indeed, Jarek, I feel like this is another point in favor of sticking to "Postgres". As mentioned, maybe we were a little reckless when adding all these kinds of filters. If they are not often used and we rarely / never see performance GitHub issues on those, marking them as 'non optimised but here for

Re: [DISCUSS] indexes for API calls

2024-05-31 Thread Jarek Potiuk
And to be perfectly honest - if people (like me) hesitate on settling on architectural decisions because they are afraid that their changes might have unintended consequences, because we want to support all the different kinds of databases - this is one more reason we should stick to "Postgres onl

Re: [DISCUSS] indexes for API calls

2024-05-31 Thread Jarek Potiuk
Using UUIDs was the proposal for how we could bypass the limitation of MySQL for Airflow 2 when we discussed whether to do a "simple" version of team-prefix in dag id, or whether we want to mess with adding yet-another-field-to-indexes-that-are-already-too-long-for-mysql and it was based on the assumptio

Re: [DISCUSS] Restore the SQL server backend

2024-05-31 Thread Jarek Potiuk
> We also understand and are ready to address the concerns stated in the vote about support and resolving CI issues Hello James, Could you please explain how exactly you are planning to help a number of maintainers who are working on developing new features to make sure they know and realise unobv

Re: [DISCUSS] indexes for API calls

2024-05-31 Thread Daniel Standish
Yes, UUID is risky and problematic as a primary key. If you do it, you need to do it carefully / sequentially. But I think that we are not going with a UUID PK on any tables at this time. BUT I do want to add a UUID for every TI try that is not the PK but can be used as a more convenient identifier when tying to