>
> With an appropriate index, SQL databases should be able to find rows
> matching a condition on two columns about as quickly as if the condition
> was only on one column, even in a table containing million of objects.
>
Databases can of course find the rows with appropriate indexes, but the
problem is that GenericForeignKeys are precisely not such appropriate ones.
> I'd like to see a benchmark before rejecting this idea on performance
> grounds. Could you point us to some data supporting your assertion that
> this isn't scalable?
>
To prove the increased performance I actually did some benchmark. I used
the script in the attachment. It needs a model that has only one field
called text. It randomly generates 10,000 users and 1,000,000 objects. Then
it assigns to each user respectively 1,000 object permissions of randomly
selected previously created objects. At least it checks for each user
respectively 1,000 randomly selected objects again. That's what I got:
For the first version of my contrib.auth patch:
Summary:
granted permissions in 1 day, 10:41:07.523402
checked permissions in 22:48:44.662974
For a module using GenericForeignKeys (I used django-guardian because it
seemed to be the most used one):
granted permissions in 11:31:12.216808
checked permissions <pending>
I don't have the results for checking permissions with django-guardian yet
because it took ages to check them. Each single check for a object
permission took about 4 to 5 seconds! Now think about how long it would
take to check for all users (10,000*1,000*4secs)! I think that is out of
all proportion to my suggested solution.
This happens because of the lack of a appropriate index as you mentioned
before. So the database has to do a full table scan and that's definitely
not scalable at all.
By the way, the reason for the GenericForeignKeys being faster in creating
them is that the database does not have to create the indexes (i.e. the
foreign keys).
> I'm -1 on a contrib app (django.contrib.auth) injecting a dynamic field
> into every model.
>
> (Yes, Django does that for the primary key, to avoid declaring it in every
> model. But it's core, not a contrib app.)
>
I totally agree with this but I don't see another way at the moment.
> What's the definition of "when the auth app is initialized"? Wouldn't your
> solution required that django.contrib.auth be loaded before all other apps?
>
Unfortunately it does but I have a open mind about better implementations.
>
> With the solution that I prefer, this doesn't seem absolutely necessary,
> the following statements are simple enough:
>
> ObjectPermission.objects.create(user=user, object=instance)
> ObjectPermission.objects.get(user=user, object=instance).delete()
>
I think that's matter of taste ;-)
Your need to add ‘object_permissions’ to a model is an incorrect approach
> in my opinion as it would require any old apps to undergo a schema
> migration. Your generic relations concerns might be valid however I would
> like to mention Django guardian does use generic FK but so does built in
> django auth – so I think this approach is just fine.
You don't have to add this field manually, auth app does that. And django
auth does not use generic relationships, it only makes use of the
ContentType field to comfortably store the type assigned to a permission.
Best regards,
Moritz
>
--
You received this message because you are subscribed to the Google Groups
"Django developers" group.
To view this discussion on the web visit
https://groups.google.com/d/msg/django-developers/-/fp-Bu8sgUq0J.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/django-developers?hl=en.
#!/usr/bin/env python -u
import string
import random
from datetime import datetime
NUM_OBJECTS=1000000
NUM_USERS=10000
NUM_PERMS_PER_USER=1000
def random_string(length=25, chars=string.ascii_letters+string.digits):
return ''.join(random.choice(chars) for i in range(length))
class Benchmark(object):
def __init__(self, User, TestModel, perm='testapp.add_testmodel', grant_perm=lambda user, perm, obj: ''):
self.time_grant=0
self.time_check=0
self._user=User
self._model=TestModel
self._perm=perm
self._grant_perm=grant_perm
def create_users(self, num=NUM_USERS):
print 'Creating users:',
self._user.objects.bulk_create(self._user(username=random_string()) for i in range(num))
print 'done'
def create_objects(self, num=NUM_OBJECTS):
print 'Creating objects:',
self._model.objects.bulk_create(self._model(text=random_string(50)) for i in xrange(num))
print 'done'
def grant_permissions(self, num_perms_per_user=NUM_PERMS_PER_USER, num_objects=NUM_OBJECTS):
print 'Grant permissions to users:',
k=1
t1=datetime.now()
for u in self._user.objects.all():
print k,
for i in xrange(num_perms_per_user):
j=random.randrange(1,num_objects)
self._grant_perm(u, self._perm, self._model.objects.get(pk=j))
k+=1
t2=datetime.now()
self.time_grant=t2-t1
print 'done in %s' % self.time_grant
def check_permissions(self, num_perms_per_user=NUM_PERMS_PER_USER, num_objects=NUM_OBJECTS):
print 'Check permissions:',
k=1
t1=datetime.now()
for u in self._user.objects.all():
print k,
for i in xrange(num_perms_per_user):
j=random.randrange(1,num_objects)
u.has_perm(self._perm, self._model.objects.get(pk=j))
k+=1
t2=datetime.now()
self.time_check=t2-t1
print 'done in %s' % self.time_check
def benchmark(self):
self.create_users()
self.create_objects()
self.grant_permissions()
self.check_permissions()
print 'Summary:'
print 'granted permissions in %s' % self.time_grant
print 'checked permissions in %s' % self.time_check