I was thinking of something like that ... but I am not a good enough 
plpgsql programmer to figure out how to do the chunking in SQL.  I needed a 
way to do the data calculations a thousand rows at a time, rather than all 
ninety-million in one gulp.  So I have Python do the chunking and SQL do 
the update.  

There was also the documentation aspect: "where do I store the SQL code and 
the instructions for how to do the migration manually?"  This way, 
everything is contained in the migration module.  The SQL is stored in 
Python triple-quoted string literals, along with the code to run it, and 
the database connection information comes right out of the django settings 
module.

On Thursday, May 14, 2015 at 4:16:43 PM UTC-6, John Fabiani wrote:
>
>  As a newbie - would it be better to use pgadmin (or psql) to make the 
> changes and migrate --fake so that Django would be happy?
>
> Johnf
>
> On 05/14/2015 02:43 PM, Vernon D. Cole wrote:
>  
> I have learned the hard way this week that data migrations in django 1.8, 
> as wonderful has they are, do not scale.
>
> My test data table is now sitting at about 90,000,000 rows.  I was able to 
> add a "null=True" field in an instant, as documented.  Then came my attempt 
> to fill it -- I tried using RunSQL, as suggested, and the migration ran for 
> more than a day before crashing. The entire migration is treated as a 
> single transaction, so none of the work was committed. My next attempt was 
> to use RunPython with manual transaction control, but that cannot be done 
> because the "atomic=False" argument to RunPython does not work on a 
> Postgres database. 
>
> My final solution (which should be done propagating my new field about 
> three days from now) was to have the migration module spawn a copy of 
> itself as a no-wait subprocess. It then runs as a main program, it opens 
> its own connection to the database, and does the conversion a chunk at a 
> time with manual transaction control. 
>
> I think that my solution may be of benefit to others who might be able to 
> adapt my code to their own situation -- but I am not sure how to publish 
> it.  It is not a module, so publication on PyPi or such would be wrong.  
> What would be effective?
>
>  -- 
> You received this message because you are subscribed to the Google Groups 
> "Django users" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to django-users...@googlegroups.com <javascript:>.
> To post to this group, send email to django...@googlegroups.com 
> <javascript:>.
> Visit this group at http://groups.google.com/group/django-users.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-users/58bbdd64-2b6d-491f-9429-3cba8a54d94f%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/django-users/58bbdd64-2b6d-491f-9429-3cba8a54d94f%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>
>
>  

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-users+unsubscr...@googlegroups.com.
To post to this group, send email to django-users@googlegroups.com.
Visit this group at http://groups.google.com/group/django-users.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-users/dd2abf3f-0981-43cc-ac39-8c05a730f3ec%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to