A cleaned-up version of my patch can be found at
http://paste.pocoo.org/show/474838/.
Test fixes aren't included in the paste, so the modeltests/fixtures tests
will not pass because the results come back in a different order. GitHub
has the latest version, including some test fixes:
https://github.com/akaariai/django/tree/fixture_loading

I hope the patch will help in solving the natural keys problem. The
biggest problem is that the patch doesn't handle the maximum number of
SQL parameters a backend allows. Give it 100 serialized instances of a
model containing 10 fields (1000 parameters, over the 999 that SQLite3
seems to allow) and it will fail.
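
To make the limit concrete, here is a rough sketch (not part of the patch)
of how a batch size could be derived from a backend's parameter cap. The
999 default is the SQLite3 figure from the quoted message below and is an
assumption for other backends; `max_rows_per_insert` and `chunked` are
hypothetical helper names, not Django APIs.

```python
def max_rows_per_insert(num_fields, max_params=999):
    """Return how many rows fit into one multi-row INSERT.

    Each row consumes one SQL parameter per field, so the row count is
    capped at max_params // num_fields. max_params=999 is an assumed
    SQLite-style limit; other backends differ.
    """
    if num_fields < 1:
        raise ValueError("num_fields must be at least 1")
    # A single row is the smallest possible batch.
    return max(1, max_params // num_fields)


def chunked(objs, size):
    """Yield successive lists of at most `size` objects."""
    for start in range(0, len(objs), size):
        yield objs[start:start + size]


# 100 instances with 10 fields each need 1000 parameters, one too many
# for a 999-parameter limit, so they are split into two statements.
rows = list(range(100))
batches = list(chunked(rows, max_rows_per_insert(10)))
print([len(b) for b in batches])  # [99, 1]
```

Each chunk would then go into its own INSERT statement, at the cost of a
few more round trips for wide models.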

 - Anssi

On Sep 12, 4:41 pm, Anssi Kääriäinen <[email protected]> wrote:
> On Sep 12, 3:38 pm, "Jonas H." <[email protected]> wrote:
>
> > On 09/12/2011 12:15 AM, Anssi Kääriäinen wrote:
>
> > > The feature could be useful if there are users loading big fixture
> > > files regularly. Otherwise it complicates fixture loading for little
> > > gain.
>
> > Maybe we could simply add an option to the loaddata command -- so that
> > if someone really needs tons of fixtures for their tests it's possible
> > to profit from bulk insertions by manually invoking loaddata from their
> > test code. And the implementation is quite simple:
>
> > http://paste.pocoo.org/show/474602/ (doesn't cover all edge-cases yet)
>
> > I did some benchmarking with this code and it speeds up fixture loading
> > *a lot*: http://www.chartgo.com/get.do?id=bdfe6af778 (chunksize=0 does
> > not use `bulk_create` but `save`, and the speedup seen for chunksize=1
> > is because `bulk_create` is used, thus avoiding `save` overhead)
>
> > Jonas
>
> I like this idea much better than trying to hack loaddata to use
> bulk_create while maintaining compatibility with the current code.
>
> The hard limitations would be as follows:
>   - There must not be any updates.
>
> Then there are limitations which could be lifted later on:
>
>   - No natural keys (or the targets of the natural keys must exist in
> the DB). I think this could be lifted later on - the dumped objects
> are ordered in a way that natural keys do not form cycles - just save
> the objects in the same order and resolve the natural keys when saving
> - not when deserializing.
>
>   - Inherited models must be saved using the normal way. This could be
> lifted: make bulk_create insert inherited objects if they have PK set
> trusting that the user will insert the base objects in the same
> transaction, or that they are already present. That is, create a
> similar "raw" mode for bulk create that exists for Model.save_base.
>
>   - Objects with M2M data are saved the normal way. This could be
> improved later on, so that m2m data would also be bulk saved along
> with the objects.
>
>   - All objects must be loaded into memory: this is easy to lift, just
> flush the collected objects once per N objects. I am not sure of this,
> but you can probably also flush the collected objects whenever you
> encounter a new class - the objects are serialized one class at a time.
>
>   - Signals aren't sent at all. It is easy to batch send the signals
> if wanted.
>
> My version of the patch solves all those cases in a way compatible
> with the current implementation. The biggest difference to your
> version is that my version can be used when running tests - but the
> speed difference for Django's test suite is somewhere around 2-3%. The
> cost is some added complexity, and one select per batch to see which
> PKs are already in the DB and which ones not. So it seems there is not
> much point for the added complexity.
>
> The most difficult problem is that my patch _will_ break some users'
> fixture loading due to the SQL length / parameter count limits
> of different backends. This is hard to solve cleanly. For example,
> SQLite3 seems to have a 999-parameter limit, so you can save
> 333 three-field models, 99 ten-field models or just 9 hundred-field
> models in one statement. If you have a bulk_create flag, then the
> backwards incompatibility is not a problem.
>
> So, in summary, it seems having a bulk_create flag is the only way
> forward.
>
>  - Anssi
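
The "flush once per N objects, or when a new class shows up" idea from the
quoted message could look roughly like the sketch below. It is plain
Python and only illustrates the grouping; `iter_batches`, `Author` and
`Book` are hypothetical names, and it assumes the deserialized stream
really does arrive one class at a time.

```python
def iter_batches(objects, batch_size=100):
    """Group consecutive objects of the same class into lists.

    A batch is emitted when the class changes or when batch_size objects
    have been collected, so only one batch is held in memory at a time.
    """
    batch = []
    for obj in objects:
        class_changed = batch and type(obj) is not type(batch[0])
        if class_changed or len(batch) >= batch_size:
            yield batch
            batch = []
        batch.append(obj)
    if batch:
        # Flush whatever is left once the stream ends.
        yield batch


class Author:  # stand-in for a model class
    pass


class Book:  # stand-in for a model class
    pass


stream = [Author(), Author(), Author(), Book(), Book()]
sizes = [(type(b[0]).__name__, len(b)) for b in iter_batches(stream, batch_size=2)]
print(sizes)  # [('Author', 2), ('Author', 1), ('Book', 2)]
```

In a real loader each yielded batch would presumably be handed to
`bulk_create` for its model class, with the batch size chosen to stay
under the backend's parameter limit.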
