Hi Everyone,

Coming back weeks later to say thank you for your contributions. I got
quite sidetracked from this task, but I will report back on it soon.

Thank you!

On Wed, Jan 19, 2011 at 10:05 AM, Thomas Weholt <thomas.weh...@gmail.com> wrote:

> Bulk inserts are the way to go if you can. When inserting a bunch of
> data, avoid using the Django ORM. Do it in plain SQL. The overhead of
> creating Django ORM model instances is way too expensive. Although it
> may not be a bulk insert in the sense Nick mentioned above, I wrote DSE [
> http://pypi.python.org/pypi/dse/0.3.1 ] for that exact purpose: to
> insert or update a bunch of data using plain SQL. It hasn't been
> tested much, so don't use it in production. The thing it solves can be
> done in plain SQL:
>
> Use cursor.executemany("<prepared insert statement>", <list of tuples
> with params>).
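>
> For example, a minimal sketch using Django's database connection (the
> search_term table and its single term column are made up for
> illustration):
>
>     from django.db import connection
>
>     # hypothetical schema: search_term(term VARCHAR)
>     terms = [("django",), ("mysql",), ("bulk insert",)]
>     cursor = connection.cursor()
>     cursor.executemany(
>         "INSERT INTO search_term (term) VALUES (%s)",
>         terms,
>     )
>     # depending on your transaction setup, you may need an
>     # explicit commit afterwards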
>
> DSE takes care of creating SQL statements for inserts and updates based
> on your models, handles any default values you might have defined in
> your model, caches lists of params, and executes cursor.executemany
> when the list of cached items reaches a specified limit. My experience
> is that the performance gain of this solution, or a similar one, is
> huge. Using cursor.executemany might be what Nick meant by bulk
> insert, but I think different DB backends handle it differently; I
> don't know. Anyway, I've inserted many thousands of records using DSE
> and it takes a fraction of the time compared to doing it with the
> ORM.
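>
> To make the caching idea concrete, here is a rough sketch of the
> pattern (this is not DSE's actual API; the class and names are made
> up):
>
>     from django.db import connection
>
>     class BulkInserter:
>         # caches parameter tuples and flushes them in batches
>         def __init__(self, sql, limit=1000):
>             self.sql = sql        # prepared insert statement
>             self.limit = limit    # flush threshold
>             self.params = []      # cached parameter tuples
>
>         def add(self, row):
>             self.params.append(row)
>             if len(self.params) >= self.limit:
>                 self.flush()
>
>         def flush(self):
>             if self.params:
>                 connection.cursor().executemany(self.sql, self.params)
>                 self.params = []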
>
>
> NB! DSE is a proof-of-concept project more than anything else. It
> needs a good rewrite, extensive testing, and docs, but it might be
> helpful.
>
> Thomas
>
>
> On Wed, Jan 19, 2011 at 2:35 AM, Nick Arnett <nick.arn...@gmail.com>
> wrote:
> >
> >
> > On Tue, Jan 18, 2011 at 12:04 PM, Sithembewena Lloyd Dube
> > <zebr...@gmail.com> wrote:
> >>
> >> Hi all,
> >>
> >> I am building a search app that will query an API. The app will also
> >> store search terms in a very simple table structure.
> >>
> >> Big question: if the app eventually hit 10 million searches and I was
> >> storing every single search term, would the table hold up or would I
> >> run into issues?
> >
> > As someone else said, 10 million records is no big deal for MySQL, in
> > principle.
> > However, you probably would do better to avoid all the overhead of a
> > database transaction for storing each of these. I'm going to assume
> > that there will be duplicates, especially if you normalize the
> > queries. It would make a lot more sense to log the queries into a
> > text file, which has extremely low overhead. Then you'd periodically
> > process the log files, normalizing and eliminating duplicates,
> > producing a bulk insert to load into the database. Bulk inserts will
> > be FAR more efficient than using Django.
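> >
> > As a rough sketch of that pipeline (the log path, table name, and
> > normalization rule below are made up for illustration):
> >
> >     from django.db import connection
> >
> >     def log_query(term):
> >         # cheap append to a text file; no database work per search
> >         with open("/var/log/searches.log", "a") as f:
> >             f.write(term.strip().lower() + "\n")
> >
> >     def load_log(path):
> >         # periodic job: read, normalize, de-duplicate, bulk insert
> >         with open(path) as f:
> >             terms = set(line.strip() for line in f if line.strip())
> >         connection.cursor().executemany(
> >             "INSERT INTO search_term (term) VALUES (%s)",
> >             [(t,) for t in terms],
> >         )
> >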
> > Nick
> >
>
>
>
> --
> Mvh/Best regards,
> Thomas Weholt
> http://www.weholt.org
>
>
>


-- 
Regards,
Sithembewena Lloyd Dube

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.
