Hi Christian, Squashmigrations contains just the sort of optimiser you mention, complete with all the examples you mention (though because it's foreign-key-aware, it sometimes can't optimise everything away, as it can only see a single app).
It's not used at execution time, however, as part of the design of migrations is that everything gets executed the same everywhere; that said, it would not be infeasible to add a command line option or a third-party command that did do this, if the project was willing to potentially end up with different column orders in tables and so on. There's nothing blocking it from what I know - it shouldn't be too hard to take the loaded operation set and run the optimiser on it before it's applied. Andrew On Fri, Sep 11, 2015 at 2:38 AM, Christian Hammond <[email protected]> wrote: > Hi everyone, > > We're working on upgrading our product from Django 1.6 to 1.8, and one of > the biggest tasks is moving from Django Evolution to Django's migrations. > Please correct me if I'm missing something, but from my testing and > investigation into the code, it seems that the migrate command will always > apply each migration file one-at-a-time, individually, rather than > pre-optimizing all the changes that need to be made across all pending > migrations. This results in a lot of unnecessary, potentially expensive SQL > operations. I'm wondering if there are plans to optimize this to reduce > migration time. > > For example, say we have a Book model. Over the course of 5 migrations, > we've added 3 fields, changed one field, and removed one more. (Maybe there > are other migrations for other models being applied in-between these > migrations.) > > Now let's say either a new database is being created for the first time, > or an existing database is being upgraded from a state prior to the > creation of the Book model. My expectation is that the database would only > need to issue a single CREATE TABLE statement (and possibly follow-up ALTER > TABLE ADD CONSTRAINTs). However, it's looking like it's instead executing a > CREATE TABLE and then multiple ALTER TABLEs (even multiple ones per field > within one migration). > > While not too terrible for a brand new table, it is a problem for an > existing table with a lot of data. If you have 5 new field changes to an > existing table containing a million rows and you're running on MySQL (for > instance), you'll end up with 5 ALTER TABLEs instead of one, causing MySQL > to dump data/alter table/import data 5 times. > > Django Evolution used to do this as well, and it made for very slow > database migrations for tables with a lot of data. We had complaints of > multi-day-long upgrades from some big users. Nowadays, it contains an > optimizer ( > https://github.com/beanbaginc/django-evolution/blob/master/django_evolution/mutators.py#L377) > that first gathers all the operations that need to be performed, > consolidates them whenever possible (such as rolling up any follow-up ALTER > TABLEs into an earlier CREATE TABLE, filters out anything table-related for > a table that will just be deleted later, and consolidates ALTER TABLEs into > as few statements as possible.). This has done wonders for database > migration time. > > I know there's squashmigrations, but that's only really useful if you're > in full control over all affected code and databases. (The documentation > states, "if you’re a third-party project, just ensure your users upgrade > releases in order without skipping any," which isn't feasible in shipping > products.). It seems that the same optimization work could be done at > application time though, couldn't it? > > Are there already plans to add an optimizer for the full migration? What > are the big issues blocking it? > > Thanks, > > Christian > > -- > You received this message because you are subscribed to the Google Groups > "Django developers (Contributions to Django itself)" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To post to this group, send email to [email protected]. > Visit this group at http://groups.google.com/group/django-developers. > To view this discussion on the web visit > https://groups.google.com/d/msgid/django-developers/6e91e997-4e17-48fe-9ca4-3f34accfca41%40googlegroups.com > <https://groups.google.com/d/msgid/django-developers/6e91e997-4e17-48fe-9ca4-3f34accfca41%40googlegroups.com?utm_medium=email&utm_source=footer> > . > For more options, visit https://groups.google.com/d/optout. > -- You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAFwN1uoTY7DEw1a5PrJspXgfaK2MYYKBE1ygBU-uwow%3DnL3njw%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
