#37031: Improve writing migrations guide to adding unique fields on existing 
table
-------------------------------------+-------------------------------------
     Reporter:  Clifford Gama        |                    Owner:  Clifford
         Type:                       |  Gama
  Cleanup/optimization               |                   Status:  assigned
    Component:  Documentation        |                  Version:  dev
     Severity:  Normal               |               Resolution:
     Keywords:  migrations           |             Triage Stage:
                                     |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Clifford Gama):

 * summary:
     Improve "Writing migrations" how-to -- unique fields and
     ManyToManyField through models
     =>
     Improve writing migrations guide to adding unique fields on existing
     table
 * type:  Bug => Cleanup/optimization
 * version:  6.0 => dev


Old description:

> In the writing migrations docs there are two advanced migration scenarios
> that have inaccuracies and could be made clearer.
>
> **Migrations that add unique fields**
>
> 1. The current approach splits the work across three files: one to add
> the field, one to populate values, and one to restore the constraint. All
> three operations can be placed in a single migration, which is simpler to
> follow and has the added benefit of being atomic — removing the race
> condition that is currently warned about in the docs.
>
> 2. I think the section should mention that `Field.db_default` avoids this
> problem entirely by having the database generate a unique value per row.
> This is worth noting upfront so readers can choose the simpler path where
> their use case allows.
>
> 3. No mention of performant alternatives for large tables: The
> `RunPython` example iterates row by row with individual saves. For large
> tables this will be very slow. The docs should note that
> `QuerySet.bulk_update()` or RunSQL are worth considering in that case.
>
> **Changing a ManyToManyField to use a through model**
>
> 1. Inaccurate description of how Django handles this change: The section
> states that "the default migration will delete the existing table and
> create a new one". This is not accurate. Django
> [https://github.com/django/django/blob/d61f33f03b3177afdf1d76153014bad4107b1224/django/db/backends/base/schema.py#L894
> refuses to apply a migration] when `through=` is added/changed on an
> existing `ManyToManyField`.
>
> 2. The through model example does not accurately reflect the database:
> The current example uses `on_delete=DO_NOTHING` and
> `models.UniqueConstraint`, whereas Django's auto-generated through tables
> use `CASCADE` and `unique_together`. Since the state and database are not
> in sync, I think this could cause issues in later migrations.
>
> 3. The example can be simplified by setting `Meta.db_table` on the new
> through model to match the existing table name, eliminating the need for
> a `RunSQL` rename operation.
>
> The section also suggests using `sqlmigrate` or `dbshell` to find the
> existing table name, which is indirect. The simplest approach is to
> inspect `field.through._meta.db_table` before modifying the field.

New description:

 1. The current approach splits the work across three files: one to add the
 field, one to populate values, and one to restore the constraint. All
 three operations can be placed in a single migration, which is simpler to
 follow and has the added benefit of being atomic — removing the race
 condition that is currently warned about in the docs.

 2. I think the section should mention that `Field.db_default` avoids this
 problem entirely by having the database generate a unique value per row.
 This is worth noting upfront so readers can choose the simpler path where
 their use case allows.

 3. The section does not mention faster alternatives for large tables. The
 `RunPython` example iterates row by row and saves each instance
 individually, which is very slow on large tables. The docs should note that
 `QuerySet.bulk_update()` or `RunSQL` is worth considering in that case.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/37031#comment:3>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
