Re: [DISCUSS] DB changes to support multi-team

Vincent Beck Thu, 12 Jun 2025 05:55:05 -0700

> The tl;dr of it is Strip prefix all of their IDs with the type, so that they 
> are easy to know about what type they are just by looking at the ID without 
> needing a DB query:


I agree, this is a very nice idea. We start having many different 
entities/resources using UUID as PK, having this prefix would help us to 
identity which resource we are talking about.

On 2025/06/12 09:33:17 Jarek Potiuk wrote:
> > Are per team, what really is the benefit of this approach? If I’m
> understanding this idea right then each team would have different workers,
> different schedulers, different API servers? So the only thing that is
> actually shared is the DB? Is anything else shared?
> 
> Actually the scheduler and API server are supposed to be shared - the idea
> is that only components that can potentially execute code coming from DAGs
> of the team are supposed to be "per-team". See the images in
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-67+Multi-team+deployment+of+Airflow+components,
> They need to be slightly updated with after-airflow-3 changes, but the
> "gray box" - which is shared, in "Proposed target architecture with
> multi-team setup" encompasses both Task SDK and Webserver (which currently
> is api_server).
> 
> > The tl;dr of it is Strip prefix all of their IDs with the type, so that
> they are easy to know about what type they are just by looking at the ID
> without needing a DB query:
> 
> Nice idea. I think we should follow it.
> 
> J.
> 
> On Thu, Jun 12, 2025 at 11:20 AM Ash Berlin-Taylor <a...@apache.org> wrote:
> 
> > Slightly off topic/slightly related, but I came across this recently
> > https://dev.to/stripe/designing-apis-for-humans-object-ids-3o5a
> >
> > The tl;dr of it is Strip prefix all of their IDs with the type, so that
> > they are easy to know about what type they are just by looking at the ID
> > without needing a DB query:
> >
> > > The above snippet is trying to retrieve a PaymentIntent from a connected
> > account, however without even looking at the code you can immediately spot
> > the error: a Customer ID (cus_) is being used instead of an Account ID
> > (acct_). Without prefixes this would be much harder to debug; if Stripe
> > used UUIDs instead then we’d have to look up the ID (probably in the Stripe
> > Dashboard) to find out what kind of object it is and if it’s even valid.
> >
> > What do we think about using something similar?
> >
> > > On 10 Jun 2025, at 14:44, Vincent Beck <vincb...@apache.org> wrote:
> > >
> > > Oh I think you meant as an alternative solution.
> > >
> > > I still do not like this solution because I feel like on the long run it
> > will bite us. Yes on the short term we would avoid massive migration on
> > most of the tables and we would not need to update a lot of requests but on
> > the long run this will cause issues like:
> > > - Performances. To get the list of DAGs from teamX, you would need to do
> > something like `WHERE dag_id like "teamX__%s"`
> > > - Edge cases. Old DAGs with "__" in their name would cause issues. e.g.
> > "my__dag" would be interpreted as DAG "dag" within team "my". This is just
> > an example but I can feel we would need to handle many different other edge
> > cases
> > > - Just a natural feeling with no real datapoint that this is something
> > that will cause us headaches later and ends up more complicated and less
> > maintainable than updating the DB schema.
> > >
> > > But this is only my personal opinion, maybe others think otherwise :)
> > >
> > > Vincent
> > >
> > >
> > > On 2025/06/10 13:32:49 Vincent Beck wrote:
> > >> For backward compatibility purposes I also think the default team is a
> > good idea. On the API side, if a team is not provided, then the default
> > team is assigned.
> > >>
> > >>> For newer versions, we should probably start adding `team_id__dag_id`
> > in
> > >>> the dag_id column.
> > >>> Fallback to "default" if not specified.
> > >>>
> > >>> For API:
> > >>>
> > >>> Internally resolve dag_id = team_id + "__" + original_dag_id.
> > >>> For old DAGs, just treat dag_id = original_dag_id with team "default".
> > >>> Replace all dagbag / related operations to split and use the
> > >>> original_dag_id.
> > >>
> > >> Unless I misunderstood you are proposing to update all `dag_id` columns
> > to include the team_id in it? If so, I really dont think including the
> > `team_id` in the `dag_id` is a good idea, and more importantly, it is not
> > needed. `dag_id` would no longer be PK and a unique constraint will be
> > created on the two columns (`dag_id`, `team_id`). Why do you want to have
> > the `team_id` as part of the `dag_id`?
> > >>
> > >> On 2025/06/10 05:45:48 Amogh Desai wrote:
> > >>> Hi All,
> > >>>
> > >>> From the perspective of migrating the task instance table queries to
> > use
> > >>> the `ti.id` in
> > >>>
> > >>>> One concern I have is that if team ID is introduced and naturally we
> > >>> want to have a dag_id uniqueness only enforced within a team (I assume
> > >>> this is a natural consequence?) then we have a very strong break in
> > API?
> > >>> Because all Dag related API calls use dag_id as identifier. I would
> > >>> dis-like to force to switch all user access to UUID as well as to force
> > >>> to pre-fix all calls with team_id. This would be rather a v3 of the
> > API.
> > >>> Do we have a plan how we make the API non breaking? (Also such path's
> > >>> are used in UI but there I'd see it not too critical if team_id is
> > added
> > >>> as prefix in a path)
> > >>>
> > >>> I concur with what Jens has to say here. It might be a very valid use
> > case
> > >>> to have
> > >>> dag_id be unique per team. But that construct should be achievable with
> > >>> unique on the
> > >>> (dag_id, team_id).
> > >>>
> > >>> Just an idea I want to throw around:
> > >>> I guess to avoid major breakage, at least for the time being, we should
> > >>> introduce a concept
> > >>> of "default" team. A team that belongs at the deployment level or the
> > >>> "starting point" when AF
> > >>> is installed.
> > >>>
> > >>> For newer versions, we should probably start adding `team_id__dag_id`
> > in
> > >>> the dag_id column.
> > >>> Fallback to "default" if not specified.
> > >>>
> > >>> For API:
> > >>>
> > >>> Internally resolve dag_id = team_id + "__" + original_dag_id.
> > >>> For old DAGs, just treat dag_id = original_dag_id with team "default".
> > >>> Replace all dagbag / related operations to split and use the
> > >>> original_dag_id.
> > >>>
> > >>> This will allow:
> > >>>
> > >>>
> > >>>   -
> > >>>
> > >>>   Old DAGs continue to work with their unprefixed dag_id.
> > >>>   -
> > >>>
> > >>>   New DAGs can safely use the same dag_ids but in different teams.
> > >>>   -
> > >>>
> > >>>   API stays stable: still /dags/{dag_id}.
> > >>>
> > >>>
> > >>> Thanks & Regards,
> > >>> Amogh Desai
> > >>>
> > >>>
> > >>> On Tue, Jun 10, 2025 at 3:52 AM Daniel Standish
> > >>> <daniel.stand...@astronomer.io.invalid> wrote:
> > >>>
> > >>>> re
> > >>>>>
> > >>>>> From the point of dag_id and Dag display name (same for tasks) I am
> > >>>>> rather requiring to keep them. Task ID and Dag ID is used in
> > technical
> > >>>>> terms and the display names are for humans and allow special
> > characters.
> > >>>>
> > >>>>
> > >>>> I don't really understand what the point of having a separate display
> > >>>> name.  I thought the reason we needed display name (instead of just
> > >>>> allowing unicode in dag id) was something to do with the fact that
> > dag id
> > >>>> was a PK.  If it's no longer PK, then that would be non-issue.  Yes
> > we'd
> > >>>> need to figure out a path for users to migrate / deprecate.  But it
> > seems
> > >>>> sorta pointless to have two fields when one would do.
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Mon, Jun 9, 2025 at 3:19 PM Daniel Standish <
> > >>>> daniel.stand...@astronomer.io> wrote:
> > >>>>
> > >>>>> An idea re backcompat.
> > >>>>>
> > >>>>> Can there be a default team?  Then, existing API routes can stay the
> > >>>> same,
> > >>>>> (though maybe deprecate).  But then you add new ones that take team
> > id.
> > >>>> Or
> > >>>>> possibly add as a parameter, and if omitted, you get the default.
> > >>>>>
> > >>>>> On Mon, Jun 9, 2025 at 11:53 AM Jens Scheffler
> > >>>> <j_scheff...@gmx.de.invalid>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Hi,
> > >>>>>>
> > >>>>>> As we have not made the migration to AF3 in our environment I can
> > not
> > >>>>>> speak about performance impact of UUID in TI Table, but I assume
> > even if
> > >>>>>> then the complexity is still lower than having a compound primary
> > key.
> > >>>>>>
> > >>>>>> So from DB perspective I see a very large DB migration coming as
> > almost
> > >>>>>> the whole DB needs to be re-written. Which is okay but need to be
> > taken
> > >>>>>> with care as migration will take a long downtime for large
> > instances.
> > >>>>>>
> > >>>>>> From the point of dag_id and Dag display name (same for tasks) I am
> > >>>>>> rather requiring to keep them. Task ID and Dag ID is used in
> > technical
> > >>>>>> terms and the display names are for humans and allow special
> > characters.
> > >>>>>>
> > >>>>>> One concern I have is that if team ID is introduced and naturally we
> > >>>>>> want to have a dag_id uniqueness only enforced within a team (I
> > assume
> > >>>>>> this is a natural consequence?) then we have a very strong break in
> > API?
> > >>>>>> Because all Dag related API calls use dag_id as identifier. I would
> > >>>>>> dis-like to force to switch all user access to UUID as well as to
> > force
> > >>>>>> to pre-fix all calls with team_id. This would be rather a v3 of the
> > API.
> > >>>>>> Do we have a plan how we make the API non breaking? (Also such
> > path's
> > >>>>>> are used in UI but there I'd see it not too critical if team_id is
> > added
> > >>>>>> as prefix in a path)
> > >>>>>>
> > >>>>>> Jens
> > >>>>>>
> > >>>>>> On 09.06.25 18:37, Jarek Potiuk wrote:
> > >>>>>>> I think it would be great to hear if there were any issues observed
> > >>>>>> (with
> > >>>>>>> either migration or performance) after we migrated task instance in
> > >>>>>> #43161
> > >>>>>>> and learning from that we could decide whether to use UUID as well
> > for
> > >>>>>> the
> > >>>>>>> dag table.
> > >>>>>>> But that would be my preference to use UUID7 - similarly as we did
> > in
> > >>>>>> TI.
> > >>>>>>>
> > >>>>>>>> If we are adding a surrogate key for dag, is there any longer a
> > >>>> reason
> > >>>>>> to
> > >>>>>>> have both dag_id and dag display name?
> > >>>>>>>
> > >>>>>>> I think the main reason is that we would have to implement merging
> > >>>>>> dag_id
> > >>>>>>> and display name (or rather replacing dag_id with display name) and
> > >>>> that
> > >>>>>>> would  also require adding UUID for the task table (and replacing
> > >>>>>>> task_display_name) for consistency.
> > >>>>>>>
> > >>>>>>> Also it means migration of existing dags to move "dag_display_name"
> > >>>> and
> > >>>>>>> "task_display_name" to be dag_id, task_id. Also if we merge these
> > two,
> > >>>>>> it
> > >>>>>>> means that users will have to change their API calls to use
> > different
> > >>>>>> ids
> > >>>>>>> to query their dags after rename.
> > >>>>>>>
> > >>>>>>> The original proposal from Vincent is transparent for DAG authors
> > and
> > >>>>>> API
> > >>>>>>> as I understand it.
> > >>>>>>>
> > >>>>>>> I think personally, even if we would like to get rid of
> > display_names
> > >>>>>>> (which I am not sure of), that should be a separate migration -
> > >>>>>> precisely
> > >>>>>>> because of increased complexity of the migration process and
> > impact on
> > >>>>>> DAG
> > >>>>>>> authors / APIs. Not impossible, but simply adds a different group
> > of
> > >>>>>> people
> > >>>>>>> that should be involved in the migration and external systems that
> > use
> > >>>>>>> Airflow APIs - which makes the migration less likely/more risky
> > for a
> > >>>>>>> number of users.
> > >>>>>>>
> > >>>>>>> J.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Mon, Jun 9, 2025 at 6:08 PM Daniel Standish
> > >>>>>>> <daniel.stand...@astronomer.io.invalid> wrote:
> > >>>>>>>
> > >>>>>>>> re
> > >>>>>>>>
> > >>>>>>>> * `dag`: Add `team_id` column and enforce a unique constraint on
> > >>>>>> (`dag_id`,
> > >>>>>>>>> `team_id`).
> > >>>>>>>>
> > >>>>>>>> If we are adding a surrogate key for dag, is there any longer a
> > >>>> reason
> > >>>>>> to
> > >>>>>>>> have both dag_id and dag display name?
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Mon, Jun 9, 2025 at 7:20 AM Beck, Vincent
> > >>>>>> <vincb...@amazon.com.invalid>
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi everyone,
> > >>>>>>>>>
> > >>>>>>>>> As part of the multi-team AIP effort ([AIP-67][1]), I’m planning
> > to
> > >>>>>> begin
> > >>>>>>>>> work on updating the database schema to support multiple teams.
> > >>>> Since
> > >>>>>>>> this
> > >>>>>>>>> is a significant and potentially disruptive change, I wanted to
> > >>>> first
> > >>>>>>>>> gather feedback on the proposed approach.
> > >>>>>>>>>
> > >>>>>>>>> ## Proposed plan
> > >>>>>>>>>
> > >>>>>>>>> 1. Introduce a UUID primary key on the `dag` table
> > >>>>>>>>>
> > >>>>>>>>> Replace the current `dag_id` primary key with a new `id` column
> > >>>>>>>> containing
> > >>>>>>>>> a generated UUID. This is similar to the change proposed in
> > #43161,
> > >>>>>> but
> > >>>>>>>>> applied to the `dag` table.
> > >>>>>>>>>
> > >>>>>>>>> 1. Update all foreign keys referencing `dag.dag_id`
> > >>>>>>>>>
> > >>>>>>>>> Update foreign keys across all related tables to reference `
> > dag.id`
> > >>>>>>>>> instead of `dag.dag_id`. Impacted tables include:
> > >>>>>>>>>
> > >>>>>>>>> - dag_schedule_asset_alias_reference
> > >>>>>>>>> - task_outlet_asset_reference
> > >>>>>>>>> - dag_schedule_asset_reference
> > >>>>>>>>> - asset_dag_run_queue
> > >>>>>>>>> - dag_version
> > >>>>>>>>> - dag_schedule_asset_uri_reference
> > >>>>>>>>> - dag_tag
> > >>>>>>>>> - dag_owner_attributes
> > >>>>>>>>> - dag_warning
> > >>>>>>>>> - dag_schedule_asset_name_reference
> > >>>>>>>>> - deadline
> > >>>>>>>>> - dag_code
> > >>>>>>>>> - serialized_dag
> > >>>>>>>>> - task_instance
> > >>>>>>>>> - dag_run
> > >>>>>>>>> - backfill
> > >>>>>>>>> - rendered_task_instance_fields
> > >>>>>>>>> - task_map
> > >>>>>>>>> - xcom
> > >>>>>>>>> - job
> > >>>>>>>>> - log
> > >>>>>>>>>
> > >>>>>>>>> 1. Add `team_id` column to tables
> > >>>>>>>>>
> > >>>>>>>>> * `dag`: Add `team_id` column and enforce a unique constraint on
> > >>>>>>>>> (`dag_id`, `team_id`).
> > >>>>>>>>> * `slot_pool`: Modify the unique constraint to be on (`pool`,
> > >>>>>> `team_id`)
> > >>>>>>>>> instead of `pool` alone.
> > >>>>>>>>> * `connection`: Modify the unique constraint to be on (`conn_id`,
> > >>>>>>>>> `team_id`) instead of `conn_id` alone.
> > >>>>>>>>> * `variable`: Modify the unique constraint to be on (`key`,
> > >>>> `team_id`)
> > >>>>>>>>> instead of `key` alone.
> > >>>>>>>>>
> > >>>>>>>>> I was also thinking adding the `team_id` column to the table
> > >>>>>>>>> `task_instance` for optimization/simplification purposes, to make
> > >>>>>> queries
> > >>>>>>>>> simpler/more optimized. The scheduler makes a lot of queries on
> > the
> > >>>>>> task
> > >>>>>>>>> instance level and having the `team_id` in this table would
> > simplify
> > >>>>>>>> them.
> > >>>>>>>>> We can always decide when working on the implementation to add
> > the
> > >>>>>> column
> > >>>>>>>>> `team_id` to other tables if we find out this would simplify
> > things.
> > >>>>>>>>>
> > >>>>>>>>> Note: Some have suggested allowing variables and connections to
> > be
> > >>>>>> shared
> > >>>>>>>>> across teams. Personally, I believe introducing the concept of
> > >>>>>>>>> shared/global resources would add unnecessary complexity and
> > >>>>>> potentially
> > >>>>>>>>> confuse users. That said, this can be revisited later. If we
> > decide
> > >>>> to
> > >>>>>>>>> support global/shared resources, we can introduce new tables to
> > >>>>>> support
> > >>>>>>>>> that model.
> > >>>>>>>>>
> > >>>>>>>>> ## Alternative Approach
> > >>>>>>>>>
> > >>>>>>>>> Instead of using UUIDs as primary keys, another option would be:
> > >>>>>>>>>
> > >>>>>>>>> * Change the primary key of `dag` to a composite key (`dag_id`,
> > >>>>>>>> `team_id`)
> > >>>>>>>>> * Update all foreign keys accordingly
> > >>>>>>>>>
> > >>>>>>>>> I’m personally not in favor of this approach, for the following
> > >>>>>> reasons:
> > >>>>>>>>>
> > >>>>>>>>> * It adds complexity to nearly all queries involving the `dag`
> > table
> > >>>>>>>>> * It may negatively affect database performance (though I’m not
> > a DB
> > >>>>>>>>> expert)
> > >>>>>>>>> * It requires specifying both `dag_id` and `team_id` to access a
> > DAG
> > >>>>>>>>> * We previously went down this path with `task_instance`, and
> > >>>>>> eventually
> > >>>>>>>>> moved to UUIDs to simplify things—this feels like a good
> > opportunity
> > >>>>>> to
> > >>>>>>>>> learn from that experience
> > >>>>>>>>>
> > >>>>>>>>> That said, I’m happy to discuss this further if others feel
> > >>>>>> differently.
> > >>>>>>>>>
> > >>>>>>>>> You can find more context and details on this topic in the
> > >>>> multi-team
> > >>>>>>>>> airflow project plan Google doc [2].
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>>
> > >>>>>>>>> Vincent
> > >>>>>>>>>
> > >>>>>>>>> [1]
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>
> > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-67+Multi-team+deployment+of+Airflow+components
> > >>>>>>>>> [2]
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>
> > https://docs.google.com/document/d/11rKo5D2QpT5NvMtDR1RZDjaih5jT5H-dt0aepkfmXSE/edit?tab=t.0#heading=h.4c16fc5qa1w8
> > >>>>>>
> > >>>>>>
> > ---------------------------------------------------------------------
> > >>>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > >>>>>> For additional commands, e-mail: dev-h...@airflow.apache.org
> > >>>>>>
> > >>>>>>
> > >>>>
> > >>>
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > >> For additional commands, e-mail: dev-h...@airflow.apache.org
> > >>
> > >>
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > > For additional commands, e-mail: dev-h...@airflow.apache.org
> > >
> >
> >
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org

Re: [DISCUSS] DB changes to support multi-team

Reply via email to