Hi everyone,

Coming back weeks later to say thank you for your contributions. I got quite sidetracked from this task, but I will report back on it soon.
Thank you!

On Wed, Jan 19, 2011 at 10:05 AM, Thomas Weholt <thomas.weh...@gmail.com> wrote:
> Bulk inserts are the way to go if you can. When inserting a bunch of
> data, avoid the Django ORM and do it in plain SQL. The overhead of
> creating Django ORM model instances is far too expensive. Although it
> may not be bulk insert in the sense Nick mentioned above, I wrote DSE
> [ http://pypi.python.org/pypi/dse/0.3.1 ] for that exact purpose: to
> insert or update a bunch of data using plain SQL. It hasn't been
> tested much, so don't use it in production. What it solves can be
> done in plain SQL:
>
> Use cursor.executemany("<prepared insert statement>", <list of tuples
> with params>).
>
> DSE takes care of creating SQL statements for inserts and updates based
> on your models, handles any default values you have defined in your
> model, caches lists of params, and executes cursor.executemany when
> the list of cached items reaches a specified limit. My experience is
> that the performance gain of this solution, or a similar one, is huge.
> Using cursor.executemany might be what Nick meant by bulk insert, but
> I think different DB backends handle it differently; I don't know.
> Anyway, I've inserted many thousands of records using DSE and it takes
> a fraction of the time compared to doing it with the ORM.
>
> NB! DSE is a proof-of-concept project more than anything else. It
> needs a good rewrite, extensive testing, and docs, but it might be
> helpful.
>
> Thomas
>
> On Wed, Jan 19, 2011 at 2:35 AM, Nick Arnett <nick.arn...@gmail.com> wrote:
> > On Tue, Jan 18, 2011 at 12:04 PM, Sithembewena Lloyd Dube
> > <zebr...@gmail.com> wrote:
> >>
> >> Hi all,
> >>
> >> I am building a search app that will query an API. The app will also
> >> store search terms in a very simple table structure.
> >>
> >> Big question: if the app eventually hit 10 million searches and I was
> >> storing every single search term, would the table hold up or would I
> >> run into issues?
> >
> > As someone else said, 10 million records is no big deal for MySQL in
> > principle. However, you would probably do better to avoid the overhead
> > of a database transaction for storing each of these. I'm going to
> > assume that there will be duplicates, especially if you normalize the
> > queries. It would make a lot more sense to log the queries to a text
> > file, which has extremely low overhead. Then you'd periodically process
> > the log files, normalizing and eliminating duplicates, and produce a
> > bulk insert to load into the database. Bulk inserts will be FAR more
> > efficient than using Django.
> >
> > Nick
>
> --
> Mvh/Best regards,
> Thomas Weholt
> http://www.weholt.org
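For anyone who finds this thread later, here is a minimal sketch of the executemany approach Thomas describes, using Django's raw connection cursor. The table name, column names, and batch size are made up for illustration, and the manual commit call is for the Django 1.2/1.3-era transaction API:

    # Batched insert via cursor.executemany, bypassing ORM instance creation.
    # "search_searchterm", its columns, and BATCH_SIZE are hypothetical.
    from django.db import connection, transaction

    BATCH_SIZE = 1000  # flush cached rows once we have this many

    def bulk_insert_terms(rows):
        """rows is an iterable of (term, created) tuples."""
        sql = "INSERT INTO search_searchterm (term, created) VALUES (%s, %s)"
        cursor = connection.cursor()
        batch = []
        for row in rows:
            batch.append(row)
            if len(batch) >= BATCH_SIZE:
                cursor.executemany(sql, batch)
                batch = []
        if batch:  # flush the remainder
            cursor.executemany(sql, batch)
        transaction.commit_unless_managed()  # old-style manual commit

This is essentially what DSE automates: it builds the INSERT statement from the model, fills in default values, and flushes the cached params through executemany once the limit is reached.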
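And a rough sketch of the log-and-bulk-load pattern Nick suggests: append each search term to a flat file at request time, then periodically normalize, dedupe, and insert the batch. The log path and the normalization step are assumptions:

    # At search time, appending to a file is far cheaper than a DB write.
    # LOG_PATH and the lowercase normalization are assumptions.
    from django.db import connection, transaction

    LOG_PATH = "/var/log/myapp/searches.log"  # one raw query per line

    def log_search(term):
        with open(LOG_PATH, "a") as f:
            f.write(term + "\n")

    def process_log():
        # Periodic job: read the log, normalize, dedupe, bulk insert.
        seen = set()
        with open(LOG_PATH) as f:
            for line in f:
                term = line.strip().lower()  # naive normalization
                if term:
                    seen.add(term)
        if seen:
            cursor = connection.cursor()
            cursor.executemany(
                "INSERT INTO search_searchterm (term) VALUES (%s)",
                [(t,) for t in seen],
            )
            transaction.commit_unless_managed()

In a real setup you would rotate the log file before processing so entries written mid-run aren't lost, and on MySQL you could use INSERT IGNORE to skip terms already stored in the table.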
--
Regards,
Sithembewena Lloyd Dube