Hey Russ,

thanks for your reply. I looked at the bulk insertion problem, and
that made me think that probably using the variables themselves is the
problem. It does not really make sense to me why anybody bothers to
use variables in a long SQL query for the IN clause, let alone a bulk
insertion? Does that benefit anything?

Anyway, I conducted an experiment and tried a simple SQL query with
lots of items in the IN clause to prove that its the variables
themselves, not the limitation of IN clause in SQLite, is whats
causing the problem. I have been successfully able to execute a query
as "SELECT * FROM projects WHERE id IN (...)" where on the place of
`...` there were 15,000 numbers. So SQLite has no problem with lot of
stuff as an argument of the IN operator. Now, the only limitation is
the length of the SQL query itself. According to the SQLite
documentation it is 1megabyte, and that suffices for 100k+ elements.
With that big query, you are right, it would be very impractical and
slow for the end users to interact with the application (sending 1M of
data to the webserver will probably be very unresponsive), so I think
this is a fair limitation that should never be exceeded for this use.

In Django code it seems that it would suffice to make a change in the
file 'db/models/sql/where.py'. Particularly, condition on the number
of elements of IN, and, if it is over let's say 100 of them, put them
into the SQL query directly as a string (e.g. '1,2,3') - not as
variables('?, ?, ?').

Though, for the future, I still believe there should be a better
solution than to rely on not reaching this limit. I would propose that
a temporary table should be created, filled up with the right
arguments of IN operator, and the query rewritten as a JOIN on the
original left argument of IN (or for simplicity, and perhaps worse
performance, a nested query) and this temporary table. That of course
only if the number of elements on the right side of IN is more than
some number. But this is for another discussion.

My question is therefore: Will the change in Django code that I
propose have any bad consequences or do you think it might actually
work satisfactorily?

Thanks.

Best,
Lukas

On Apr 23, 12:56 am, Russell Keith-Magee <russ...@keith-magee.com>
wrote:
> Hi Lukas,
>
> I haven't looked into the problem in detail, but it doesn't surprise me that 
> it exists.
>
> It's also probable that it's something that is new to 1.4 (or, at least, that 
> it manifests in slightly different ways in 1.4). One of the features added in 
> 1.4 is bulk insertion of data. This means that you can use a single SQL 
> statement to insert all your m2m relation; however, the downside is that on 
> SQLite, there is a limit of 500 insertions that can be done at any one time 
> (2 values per insertion).
>
> This is logged as ticket #17788. There have been some discussions about a fix 
> for this problem that will break the bulk insertion into batches. I suspect 
> that when this problem is fixed, your problem will go away.
>
> That said, I would also suggest you have a think about how you represent this 
> widget in your UI. The approach you're describing is going to be very slow 
> for the end user -- every time they load the page, they're going to have to 
> download the HTML for a widget 1000 choices; when they send their selections, 
> they're going to have to POST up to 1000 choices back to the server. This all 
> takes time, and won't result in a particularly responsive web page.
>
> If you've got a situation where there are 1000 options, but usually only a 
> small number -- say, O(10) -- are selected, then perhaps an AJAX autocomplete 
> widget of some kind would be a better approach. If the user is going to 
> select a lot more options, then you might need to use a more exotic widget 
> that avoids the need to transfer full lists back and forth.
>
> Yours,
> Russ Magee %-)
>
>
>
>
>
>
>
> On Monday, 23 April 2012 at 5:38 AM, Lukas Zilka wrote:
> > Hello,
>
> > I have a form with ModelMultipleChoiceField on it, and I am getting
> > 'DatabaseError: too many SQL variables' (using SQLite) when the user
> > picks more than 1000 entries in the selection widget and posts the
> > form.
>
> > The problem seems to be the method clean of ModelMultipleChoiceField,
> > which tries to select objects from the database simply by the IN SQL
> > clause (e.g. SELECT * FROM projects WHERE id IN (1,2,3)). When the
> > number of numbers in the IN argument rises over 1000 the too many SQL
> > variables happens.
>
> > I am using the most recent version of Django (1.5.dev17922), though I
> > think it is irrelevant because similar issues happened even with older
> > versions.
>
> > I have always worked around this problem by custom temporary models
> > that I used for joins of more complex queries. But more and more it
> > seems to me that, either there already is a systematic solution to
> > this problem that I am missing (which I hope somebody could point out
> > to me), or there at least needs to be a need for it -- how do you
> > address similar situations? -- in which case I can start thinking how
> > to propose something general that could later be incorporated into
> > Django.
>
> > Thanks for any suggestions or help.
>
> > --
> > You received this message because you are subscribed to the Google Groups 
> > "Django users" group.
> > To post to this group, send email to django-users@googlegroups.com 
> > (mailto:django-users@googlegroups.com).
> > To unsubscribe from this group, send email to 
> > django-users+unsubscr...@googlegroups.com 
> > (mailto:django-users+unsubscr...@googlegroups.com).
> > For more options, visit this group 
> > athttp://groups.google.com/group/django-users?hl=en.

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.

Reply via email to