I have learned the hard way this week that data migrations in Django 1.8, 
as wonderful as they are, do not scale.

My test data table is now sitting at about 90,000,000 rows. I was able to 
add a "null=True" field in an instant, as documented. Then came my attempt 
to fill it: I tried using RunSQL, as suggested, and the migration ran for 
more than a day before crashing. Because the entire migration is treated 
as a single transaction, none of the work was committed. My next attempt 
was to use RunPython with manual transaction control, but that cannot be 
done: the "atomic=False" argument to RunPython has no effect on a 
PostgreSQL database, since the migration as a whole still runs inside one 
transaction there.
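
For concreteness, here is a minimal sketch of the kind of migration I am 
describing (the app, model, and field names are made up, not my actual 
code):

    # migrations/0002_add_category.py -- illustrative sketch only
    from django.db import migrations, models

    class Migration(migrations.Migration):

        dependencies = [('myapp', '0001_initial')]

        operations = [
            # Adding the nullable column is instant: it only touches
            # the schema, not the 90M existing rows.
            migrations.AddField(
                model_name='event',
                name='category',
                field=models.IntegerField(null=True),
            ),
            # Filling it is not: on PostgreSQL the whole migration runs
            # in one transaction, so a crash rolls back everything.
            migrations.RunSQL(
                "UPDATE myapp_event SET category = 0"
                " WHERE category IS NULL;"
            ),
        ]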

My final solution (which should finish propagating my new field about 
three days from now) was to have the migration module spawn a copy of 
itself as a no-wait subprocess. Running as a main program, it opens its 
own connection to the database and does the conversion a chunk at a time, 
with manual transaction control.
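
In case the shape of it helps, here is a stripped-down sketch of the idea 
(again with made-up names, and with the error handling and progress 
logging removed):

    # migrations/0003_backfill_category.py -- stripped-down sketch
    import os
    import subprocess
    import sys

    from django.db import migrations


    def spawn_worker(apps, schema_editor):
        # Fire and forget: re-run this very file as a standalone
        # program, so the backfill happens outside the migration's
        # transaction and "migrate" returns immediately.
        subprocess.Popen([sys.executable, os.path.abspath(__file__)])


    class Migration(migrations.Migration):

        # Depends on the migration that added the nullable column
        # (with the failed RunSQL removed).
        dependencies = [('myapp', '0002_add_category')]

        operations = [
            migrations.RunPython(spawn_worker),
        ]


    def backfill():
        # Only reached when the file runs as a main program: set up
        # Django, open our own connection, and commit one chunk at a
        # time so progress survives a crash.
        os.environ.setdefault('DJANGO_SETTINGS_MODULE',
                              'myproject.settings')
        import django
        django.setup()
        from django.db import connection, transaction

        chunk = 10000
        while True:
            with transaction.atomic():
                with connection.cursor() as cursor:
                    cursor.execute(
                        "UPDATE myapp_event SET category = 0"
                        " WHERE id IN (SELECT id FROM myapp_event"
                        "  WHERE category IS NULL LIMIT %s)",
                        [chunk],
                    )
                    if cursor.rowcount == 0:
                        return


    if __name__ == '__main__':
        backfill()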

I think that my solution may be of benefit to others who could adapt the 
code to their own situation, but I am not sure how to publish it. It is 
not a module, so publication on PyPI or the like would be wrong. What 
would be an effective way to share it?
