On Jun 2, 2013, at 8:22 AM, Michal Petrucha wrote:
> GenericForeignKey and nontrivial field types
> --------------------------------------------
>
> As I've indicated in my proposal, just casting any value to a string
> and then performing a reversible transformation on such strings may
> work well enough for string and integer database columns, not so much
> for things like dates, timestamps, IP addresses or other similar types.
>
> Any ideas on how to make this work? Should I try to extend the backend
> API to include explicit casts for each nontrivial column type to a
> string representation equal to the one used by Python? Or should I
> just document this as unsupported?
There's already a `db_type` method that you can override (that receives a
`connection` object) for the actual database type. It's pretty easy to do
something to the effect of (for instance):
    if 'postgres' in connection.settings_dict['ENGINE']:
        return 'uuid'
    if 'mysql' in connection.settings_dict['ENGINE']:
        return 'char(36)'
    [...]
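To show where that method lives, here is a minimal sketch that runs without Django installed. `FakeConnection` and `UUIDField` are names made up for illustration; in a real project the field would subclass `django.db.models.Field` and Django would pass in its own connection wrapper, whose `settings_dict` holds the `ENGINE` string.

```python
class FakeConnection(object):
    """Stand-in for Django's connection wrapper so the sketch runs without Django."""

    def __init__(self, engine):
        self.settings_dict = {'ENGINE': engine}


class UUIDField(object):
    """Hypothetical field; a real one would subclass django.db.models.Field."""

    def db_type(self, connection):
        engine = connection.settings_dict['ENGINE']
        if 'postgres' in engine:
            return 'uuid'       # native PostgreSQL type
        if 'mysql' in engine:
            return 'char(36)'   # textual form of a UUID, hyphens included
        # Other backends are left out here, as in the snippet above.
        raise NotImplementedError('no column type chosen for %s' % engine)


pg = FakeConnection('django.db.backends.postgresql_psycopg2')
print(UUIDField().db_type(pg))
```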
However, having done some work myself on trying to create non-trivial field
subclasses, it's everything after that which gets difficult. Django provides an
overridable method to cast the value, but nothing for anything else in the
expression (the field or the operator), which are hard-coded into the backend.
(This is a source of frustration for me personally, because it makes it very
difficult to write classes for, say, PostgreSQL arrays without either
resorting to massively ugly syntax or subclassing nearly every class
involved in the process of creating a query: Manager, QuerySet, Query,
WhereNode...)
I ultimately went through the subclass-half-the-world technique quite recently
(a couple of weeks ago), as I want some non-trivial custom fields for a new
project I am about to start for my company (sadly, the project is private,
although I'd be happy to share field code if it would help in any way). What I
ended up doing is checking the Field subclass for a custom
`get_db_lookup_expression` method (that's not a Django field method -- I made
it up), and then my Field subclasses could use that to return a full expression
in the form "{field} = {value}". If the method is present and returns something
other than None, I use that expression; otherwise I pass the lookup on to
the DatabaseOperators class for its usual processing. Using that method
saves me from having to modify a monolithic DatabaseOperators subclass for
each new field I add (it seems to me that fields should know how to perform
their lookups).
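A rough sketch of that dispatch, in plain Python so it runs on its own. This is not the private code mentioned above: the signature of `get_db_lookup_expression`, the `BackendDefaults` stand-in, and `build_where_fragment` are all assumptions made for illustration. The `coalesce` wrapper is there because PostgreSQL's `array_length` returns NULL for an empty array.

```python
class BackendDefaults(object):
    """Stand-in for the backend's built-in operator handling (hypothetical)."""
    operators = {'exact': '= %s', 'gt': '> %s', 'lt': '< %s'}

    def lookup_sql(self, column, lookup_type):
        return '{0} {1}'.format(column, self.operators[lookup_type])


class PlainField(object):
    """A field without the custom hook; always uses the backend default."""

    def __init__(self, column):
        self.column = column


class ArrayField(PlainField):
    def get_db_lookup_expression(self, lookup_type):
        # Return a full expression for lookups this field handles itself,
        # or None to defer to the backend.
        if lookup_type == 'len':
            return 'coalesce(array_length({0}, 1), 0) = %s'.format(self.column)
        return None


def build_where_fragment(field, lookup_type, backend):
    """Ask the field first; fall back to the backend if it declines."""
    hook = getattr(field, 'get_db_lookup_expression', None)
    if hook is not None:
        expression = hook(lookup_type)
        if expression is not None:
            return expression
    return backend.lookup_sql(field.column, lookup_type)


backend = BackendDefaults()
print(build_where_fragment(ArrayField('photos'), 'len', backend))
print(build_where_fragment(PlainField('bar'), 'gt', backend))
```

The point of the `getattr` check is that existing fields need no changes: only fields that define the hook take part in it.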
The other challenge was defining the QuerySet lookup expressions. Django
essentially hard-codes the things it understands for lookups (e.g.
Foo.objects.filter(bar__gt=5) being transformed into "select ... from app_foo
where bar > 5"). The set of lookup suffixes (exact, gt, gte, lt, lte, etc.) is,
sadly, also essentially hard-coded. I wrote an ArrayField to use PostgreSQL
arrays, and really wanted a way to look up based on the length of the
array (so, something like `Foo.objects.exclude(photos__len=0)`, for instance,
to give me all Foos with at least one photo). I did manage to make that work,
but it was a struggle. Also, the set of lookup suffixes is universal, even
though some of them don't make sense on some fields ("year" on IntegerField,
for instance).
So, my ideas on getting non-trivial field subclasses to work are basically:
1. Make fields be the arbiter of what lookup types they understand.
(IntegerFields shouldn't understand "year"; maybe someone's ArrayField subclass
does understand "len".) This probably needs to be something that can be
determined on the fly, as composite fields will probably need lookup types
corresponding to their individual sub-fields.
2. Make lookup types chainable. In the Array "len" example, `photos__len__gt=5`
makes sense.
3. Make it so fields define how they are looked up based on database engine and
lookup type.
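To make the three points concrete, here is a toy model in plain Python. None of it is Django API: `Field.transforms`, `Field.resolve`, and the engine strings are invented for this sketch. Each field declares the lookups it understands for a given engine (points 1 and 3), and a transform such as "len" hands back another field type so that further lookups can be chained onto it (point 2).

```python
COMPARISONS = {'exact': '=', 'gt': '>', 'gte': '>=', 'lt': '<', 'lte': '<='}


class Field(object):
    """Base field: understands only the plain comparisons."""

    def __init__(self, column):
        self.column = column

    def transforms(self, engine):
        """Map lookup name -> (SQL template, class of the resulting value)."""
        return {}

    def resolve(self, lookups, engine):
        """Turn a chain such as ['len', 'gt'] into a SQL fragment."""
        expression, field = self.column, self
        lookups = list(lookups) or ['exact']
        while lookups:
            name = lookups.pop(0)
            available = field.transforms(engine)
            if name in available:
                template, result_type = available[name]
                expression = template.format(expression)
                field = result_type(expression)   # later lookups apply to this
                if not lookups:
                    lookups = ['exact']           # bare transform implies equality
            elif name in COMPARISONS and not lookups:
                return '{0} {1} %s'.format(expression, COMPARISONS[name])
            else:
                raise ValueError('%s does not understand lookup %r'
                                 % (type(field).__name__, name))


class IntegerField(Field):
    pass  # no "year" here, so bar__year is rejected


class ArrayField(Field):
    def transforms(self, engine):
        if engine == 'postgresql':
            # array_length() is NULL for empty arrays, hence the coalesce.
            return {'len': ('coalesce(array_length({0}, 1), 0)', IntegerField)}
        return {}


photos = ArrayField('photos')
print(photos.resolve(['len', 'gt'], 'postgresql'))   # photos__len__gt
```

Because "len" resolves to an IntegerField, `photos__len__gt` works without ArrayField knowing anything about "gt", and `bar__year` on an IntegerField raises an error instead of producing nonsensical SQL.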
Moving these things into the Field implementation (rather than in the backend)
should mean that non-trivial field subclasses become much easier. It'll also
eliminate the need for, say, django.contrib.gis to have an entirely different
set of backends -- a large reason gis jumps through those hoops (as best as I
can tell from reading it; disclaimer: I am not a contributor) is to work around
the restrictions being described.
I hope that helps your thinking. I have this stuff fresh in my head because
I've just worked on an implementation for PostgreSQL arrays and composite
fields that I need for my work. While I've thought a decent bit about
extensibility (for my own purposes), I haven't open-sourced it largely because
I know I haven't solved all the problems yet. Having read your e-mail, I now
hope that I don't have to, as I expect your work to outshine mine. I look
forward to replacing what I've done with what you do. :-)
One more absolutely massive disclaimer: I am not a core developer or even an
active developer of any kind on Django. I'm sure other voices will offer
guidance, and listen to theirs over mine. I'm sending this because I've spent a
lot of time very recently in the code you are looking to enhance, and because
extensible APIs generally are a passion of mine. That said, take my comments
for what they are -- one of a large number of voices, where I am not a source
of substantial expertise. I hope I can help focus and clarify your thinking a
little bit. I am sure others will reply and offer much more useful and much
more complete guidance.
Best Regards,
Luke Sneeringer