On 17/04/2018 6:27 PM, Adrien Cossa wrote:
Hi,
On 04/14/2018 02:38 AM, Mike Dewhirst wrote:
Does it actually stop users reading? If the entire migration happens
in a single transaction, the database (Postgres anyway) should remain
accessible until the moment it is committed.
Maybe you could announce a maintenance operation which will only
interrupt certain actions for a few minutes?
I am not sure I understand how that works. If the migration is atomic
(a minimal skeleton of what I mean follows the list below), is it true that:
- the users can read normally between the beginning of the migration
and the beginning of the transaction commit, and they will get the old
data
- the users trying to read during the transaction commit will have to
wait for it to finish, and they will get the migrated data
- users who try to write at any time between the beginning of the
migration and the end of the transaction commit will have to wait for
it to finish, and they might overwrite the migrated data
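For reference, here is the kind of atomic data migration I mean; the
app, model and migration names are made up:

    from django.db import migrations


    def forwards(apps, schema_editor):
        # Use the historical model from the app registry,
        # not a direct import.
        OldModel = apps.get_model('myapp', 'OldModel')  # hypothetical names
        for obj in OldModel.objects.all().iterator():
            pass  # populate the new models here


    class Migration(migrations.Migration):

        # The default on PostgreSQL: the whole migration runs in a single
        # transaction, so readers keep seeing the old data until the commit.
        atomic = True

        dependencies = [
            ('myapp', '0041_previous'),  # hypothetical
        ]

        operations = [
            migrations.RunPython(forwards, migrations.RunPython.noop),
        ]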
If it works like that, I think the solution I had in mind is good
enough for my needs:
- process the queryset in chunks. Note that if you want to use
prefetch_related, you can't use a queryset iterator, so I get the
successive chunks by filtering on PK ranges (see the sketch after this
list). I have benchmarked a bit and a good value for the chunk size
seems to be 500: it is not slower than any other value and it keeps
memory usage down. If I had a lot of RAM, I could also raise this value
to 5000 or 15000 without really slowing down the process.
- I actually have two separate atomic migrations (in case something
goes wrong, that is better than one non-atomic migration containing two
atomic operations): one to process the objects that are not modifiable
by the users (because they are in a certain status, etc.) and a second
to process the objects that could be modified by the users. The second
migration concerns only 4-5% of the total objects, so it should be much
faster. As I use PK ranges to fetch the objects, I have to scale the
range width so that each chunk still contains roughly the desired 500
matching objects: 500 * total_objects_count / filtered_queryset_count.
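Here is a minimal sketch of the PK-range chunking; the helper name is
mine and it assumes integer primary keys:

    from django.db.models import Max, Min


    def pk_range_chunks(queryset, range_width=500):
        """Yield successive chunks of `queryset`, filtered by PK range.

        Unlike QuerySet.iterator(), each chunk is an ordinary queryset,
        so prefetch_related() still works on it.
        """
        bounds = queryset.aggregate(lo=Min('pk'), hi=Max('pk'))
        if bounds['lo'] is None:
            return  # empty queryset
        start = bounds['lo']
        while start <= bounds['hi']:
            yield queryset.filter(pk__gte=start, pk__lt=start + range_width)
            start += range_width


    # Usage sketch; OldModel and 'children' are hypothetical names. For
    # the filtered second migration, widen the range as described above:
    # range_width = 500 * total_objects_count // filtered_queryset_count.
    # for chunk in pk_range_chunks(OldModel.objects.all()):
    #     for obj in chunk.prefetch_related('children'):
    #         ...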
There remains the problem of users who would try to write during the
second migration: their changes would indeed be written to the old
model, but not taken into account by the new models (remember I want to
split one model into two smaller ones). So maybe I should append to the
second migration all the operations responsible for deleting the old
model? That way, people trying to write will get an error, which is the
best we can do here. Am I right?
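Concretely, I am thinking of something like this for the second
migration (all model, field and migration names are hypothetical): copy
the editable objects, then drop the old model in the same atomic
migration, so that a write queued behind the migration's locks fails
instead of landing unseen in the old table:

    from django.db import migrations


    def copy_editable_objects(apps, schema_editor):
        OldModel = apps.get_model('myapp', 'OldModel')
        PartA = apps.get_model('myapp', 'PartA')
        PartB = apps.get_model('myapp', 'PartB')
        editable = OldModel.objects.exclude(status='archived')  # hypothetical filter
        for old in editable.iterator():
            # Hypothetical field mapping between old and new models:
            PartA.objects.create(source_pk=old.pk, name=old.name)
            PartB.objects.create(source_pk=old.pk, details=old.details)


    class Migration(migrations.Migration):

        dependencies = [
            ('myapp', '0042_copy_readonly_objects'),  # hypothetical
        ]

        operations = [
            migrations.RunPython(copy_editable_objects,
                                 migrations.RunPython.noop),
            # Dropping the old table makes any late write fail loudly:
            migrations.DeleteModel('OldModel'),
        ]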
It would be nice to avoid errors. That is why I suggested announcing
that you intend to take the system offline for a short period. It takes
off all the pressure and you can choose the simplest mechanism.
Users will get a benefit from the migration or you wouldn't be doing it.
Therefore they should be happy to accept a little downtime. You might
have to do a bit of selling :)
I might consider making production read-only, dumping the database,
loading it up on a fast machine with heaps of RAM and an SSD for the
migration, then dumping and reloading on the production machine.
That way you can leave it online read-only and take it offline only for
the relatively brief reload after the off-site migration. A bit of
practice and timing will indicate whether that method has legs. Or wings!
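As a rough sketch of that flow, driven from Python; every host name,
database name and file name below is a placeholder, and you would put
the app into read-only mode around step 1 and take it offline around
step 3:

    import subprocess


    def run(*cmd):
        print('+', ' '.join(cmd))
        subprocess.run(cmd, check=True)


    # 1. Dump production; the custom format (-Fc) is compressed and can
    #    be restored in parallel.
    run('pg_dump', '-Fc', '-h', 'prod-db', '-d', 'myapp', '-f', 'before.dump')

    # 2. Restore onto the fast machine and run the migration there.
    run('pg_restore', '--clean', '--if-exists', '-j', '4',
        '-h', 'fast-box', '-d', 'myapp', 'before.dump')
    run('python', 'manage.py', 'migrate')  # with settings pointing at fast-box

    # 3. Dump the migrated data, then reload it on production during the
    #    brief offline window.
    run('pg_dump', '-Fc', '-h', 'fast-box', '-d', 'myapp', '-f', 'after.dump')
    run('pg_restore', '--clean', '--if-exists', '-j', '4',
        '-h', 'prod-db', '-d', 'myapp', 'after.dump')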
Thanks for your help!
Cheers,
Adrien
Adrien Cossa <co...@init.at> wrote:
Hi everybody!
I would like to know what options exist when you have a huge
migration that will obviously not run on your production server.
I have split a model into two smaller ones and then wrote a migration
to populate these new models. The number of original objects is around
250,000, and I also have a few references to update. In the end, the
migration lasted more than 30 minutes on my machine (16 GB RAM, and it
was swapping a lot) and it failed on another machine because it ran out
of RAM (the process was using about 13 GB at that point). On the
production server we have even less RAM, so running the migration as it
is is really out of the question.
I have tried to use all the Django mechanisms I know to optimize the
queries: select_related, prefetch_related, bulk_create,
QuerySet.update... The migration I am talking about now uses
bulk_create(batch_size=None) and processes the whole queryset at once.
Before that, when the migration did not last as long because I had two
fewer references to update, I tried other values for batch_size and
also processed the queryset in pages of a few hundred or a few thousand
objects. The results were no better than batch_size=None and "all at
once", which is why I finally went back to the basic settings (and the
migration then took about 5 minutes). I will have to reintroduce some
tweaks, because the extra updates of the two relations I mentioned make
a big difference here.
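For the record, the pattern looks roughly like this (model and field
names are invented):

    def split_old_model(apps, schema_editor):
        OldModel = apps.get_model('myapp', 'OldModel')  # hypothetical
        PartA = apps.get_model('myapp', 'PartA')

        new_objects = [
            PartA(source_pk=old.pk, name=old.name)  # hypothetical fields
            for old in OldModel.objects.all().iterator()
        ]
        # On PostgreSQL, batch_size=None issues a single INSERT for
        # everything; an explicit batch_size bounds the statement size,
        # but the Python-side list above is what actually eats the RAM,
        # hence the paging mentioned below.
        PartA.objects.bulk_create(new_objects, batch_size=500)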
I am wondering whether anyone has found themselves in a similar
situation, and what solution you finally settled on.
If the migration lasts very long, that is not a problem by itself, but
I don't want to lock the database for 15 minutes. The fact is that I
don't know exactly what happens during the migration process: what is
locked by what? I will split the migration into "pages" to use less RAM
anyway, but I was also thinking of migrating in two different steps
*or* files, in order to process separately the objects that are not
editable (basically most of them, which we keep for history but are
read-only) and the others (which should be much faster, so that people
still working will not be blocked for long). Does that make sense? Any
other ideas?
Thanks a lot!
Adrien