On Thu, Dec 27, 2012 at 11:29 AM, Donald Stufft <[email protected]> wrote:
> On Wednesday, December 26, 2012 at 10:00 PM, Russell Keith-Magee wrote:
>
> Why? Because we've gone to extraordinary lengths to make sure this sort of
> thing is at least theoretically possible.
>
> Although we use the term "ORM", and there's currently only relational
> implementations of Django's ORM, there's nothing relational about the
> Django ORM API. We've very deliberately posed the API in terms of functions
> you want to perform on objects:
>
> * Get me the author named "Douglas Adams"
> * Get a list of books that are more than 3 years old
> * Update the login counter on this user by one.
>
> Because except for very simple models you will not be able to sanely take
> a model written for a Relational database and switch it to a NonRelational
> database. If you cannot provide the same sort of mostly transparent
> switching like the ORM provides for MySQL -> PostgreSQL -> Oracle then
> there is little benefit in keeping it within the same system.
>
> All of your examples work on simple models sure. What about:
>
> * Anything requiring a join, explicitly with select_related() or
>   implicitly with __ magic.
> * select_for_update
> * No standard way of handling "Related" fields (Do you Inline them?
>   Mimic a ForeignKey?)
> * The entire transaction system on systems without transactions
>
> There is also the problem of vastly different access patterns,
> assumptions, and performance characteristics.
>
> * Getting a list of Books that are more than 3 years old is a very
>   simple operation in SQL with very predictable performance, getting a list
>   of books older than 3 years old if they are stored in Redis, less so.
> * Systems that depended on a unique=True enforcing a constraint of
>   uniqueness no longer happening.
> * index=True becoming NO-OP.
> * A simple Join potentially goes from an inexpensive operation to one
>   that requires traversing several million rows with horrible performance.
>
> The access patterns, assumptions of functionality, and assumption of
> performances are so different between even the different NoSQL solutions,
> much less the various NoSQL solutions and Relational databases that either
> you're going to have second class citizens (Sure you can use X system with
> Django models, but only as a completely segregated unit and you can't touch
> [a list of features]), or you're going to need to limit the features down
> to a subset that all databases can support (We already have this problem
> with PostgreSQL vs MySQL vs SQLite, it will be tenfold if we include NoSQL
> databases). In order to actually use the power of your datastore you need
> to use a class of "ORM" that is designed to work within its access
> patterns.
>
> Django as a whole should be avoiding giving people footguns, and
> attempting to shove NonRelational databases into the ORM will be providing
> a massive footgun. As soon as it happens you'll have a whole host of people
> attempting to run apps and sites that depended on things that relational
> databases assured suddenly having it yanked out from underneath them and it
> will be Django's fault for providing that footgun.

That depends entirely on what you consider the goal of the ORM to be.

You have assumed that the goal would be "allow an arbitrary query to run on any underlying data store, and run with equivalent efficiency". In this model, you could take your fully operational Django PostgreSQL project, and roll it out under MongoDB (or any other supported store), and it would Magically Work™.

I completely agree that this is an unrealistic goal, and would, as you rightly point out, constitute a high-calibre footgun.

However, there's another way of looking at it. You're focussing on the ORM as a query generation engine. Of more interest is the ORM as a metadata layer for models in a data store, with some basic reliable querying features.
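
To make the "metadata layer" idea a little more concrete, here is a rough sketch in plain Python. It is not Django code and not a proposed API: Author, DictStore, SqliteStore and save_from_form are hypothetical names made up for illustration, a dataclass stands in for a model's metadata, and a dict stands in for a non-relational store. The only point is that a form-like helper can validate and store an object by inspecting metadata, without knowing which kind of store sits underneath.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Author:
    # Hypothetical model: the dataclass fields play the role of model metadata.
    name: str
    country: str


class DictStore:
    """Stand-in for a non-relational store: documents keyed by an integer id."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def create(self, obj):
        pk = self._next_id
        self._next_id += 1
        self._docs[pk] = {f.name: getattr(obj, f.name) for f in fields(obj)}
        return pk

    def get(self, model, pk):
        return model(**self._docs[pk])


class SqliteStore:
    """Relational store: one table for the model, built from the same metadata."""

    def __init__(self, model):
        self._conn = sqlite3.connect(":memory:")
        self._cols = [f.name for f in fields(model)]
        self._table = model.__name__.lower()
        cols = ", ".join(self._cols)
        self._conn.execute(
            f"CREATE TABLE {self._table} (id INTEGER PRIMARY KEY, {cols})"
        )

    def create(self, obj):
        cols = ", ".join(self._cols)
        marks = ", ".join("?" for _ in self._cols)
        cur = self._conn.execute(
            f"INSERT INTO {self._table} ({cols}) VALUES ({marks})",
            [getattr(obj, c) for c in self._cols],
        )
        return cur.lastrowid

    def get(self, model, pk):
        cols = ", ".join(self._cols)
        row = self._conn.execute(
            f"SELECT {cols} FROM {self._table} WHERE id = ?", (pk,)
        ).fetchone()
        return model(**dict(zip(self._cols, row)))


def save_from_form(model, store, data):
    """Form-like helper: checks data against the model metadata, then stores it.

    It only calls store.create(), so it works with either store above.
    """
    names = [f.name for f in fields(model)]
    missing = [n for n in names if n not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return store.create(model(**{n: data[n] for n in names}))


# The same call works against both stores.
for store in (DictStore(), SqliteStore(Author)):
    pk = save_from_form(Author, store, {"name": "Douglas Adams", "country": "UK"})
    print(type(store).__name__, store.get(Author, pk))
```

Anything beyond create-and-fetch-by-id (joins, range queries, uniqueness guarantees) is deliberately left out of the sketch, because that is exactly the part that would need store-specific thought.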
Think of it this way -- the goal isn't to allow an arbitrary query to run on any data store. The goal is to allow Django's admin to operate on a model in any data store, or to allow a Django ModelForm to retrieve and/or store an object in any datastore. The queries required to support Django's admin and/or ModelForms are all inherently simple CRUD operations -- operations that have simple (and for the most part, efficient) analogs in every data store.

Any non-trivial query will *always* require an understanding of the underlying data storage. The ORM is an abstraction, and while it can make certain queries easier to write, you can't use it in a vacuum -- you have to be aware of the SQL that is being generated. And sometimes, you need to fall back to raw SQL to get the job done.

I don't see a non-relational backend to Django's ORM being any different. We can make simple retrieval operations easy. But there's no way we can automatically optimise queries for every possible data store -- at some point, a brain will need to be engaged in the process, and purpose-built optimisations will need to be developed. Similarly, just because there's an efficient relational representation of a data concept (e.g., a foreign key), doesn't mean there's an equally efficient non-relational representation.

Interestingly, your arguments about the complications of switching from a relational data store to a non-relational store apply equally to switching between different relational stores. Just because you have a project running under MySQL, doesn't mean you can just change the database backend and have it run under PostgreSQL. Django will make certain aspects of the transition easy (i.e., the easy queries), but if you've done any sort of query or index optimisation, or you're relying on transaction behaviour, this switch will be equally problematic.
However, the bit you *won't* have to worry about is the basic, out of the box Admin interface, and the behaviour of metadata inspecting Django features like the forms library.

So -- what I'm talking about here isn't some magical abstraction layer that makes the choice of data store irrelevant. It's a way to make simple things simple, and complex things possible. It's about making it *possible* to wrap a Django form around a MongoDB object. It's about making it possible to display those objects in Django's admin. Yes, some features of the ORM will be lost. Some will be inefficient. And there will almost certainly be some non-relational operations that aren't available on relational stores, and vice versa.

Finally, I'd also point out that a lot of this sort of analysis and discussion has been covered in the past. The hubbub about non-relational stores has been around for a while, and the relationship between NoSQL and Django has been discussed a lot on mailing lists, and at conferences. If you're interested in the topic, it's worth seeing what has been said in the past.

Yours,
Russ Magee %-)

-- 
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.
