In article <pan.2011.01.23.06.09.16@pfln.invalid>,
 Deadly Dirk <dirk@pfln.invalid> wrote:

> The same thing applies to MongoDB which is equally fast but does allow ad
> hoc queries and has quite a few options how to do them. It allows you to
> do the same kind of querying as RDBMS software, with the exception of
> joins. No joins.

Well, sort of. You can use forEach() to get some join-like functionality.
You don't get the full join optimization that SQL gives you, but at least
you get to do some processing on the server side so you don't have to ship
40 gazillion records over the network to pick the three you wanted.

> It also allows map/reduce queries using JavaScript and
> is not completely schema free.

What do you mean by "not completely schema free"?

> Databases have sub-objects called "collections" which can be indexed
> or partitioned across several machines ("sharding"), which is an
> excellent thing for building shared-nothing clusters.

We've been running Mongo 1.6.x for a few months. Based on our experiences,
I'd say sharding is definitely not ready for prime time. There are two
issues: stability and architecture.

First, stability. We see mongos (the sharding proxy) crash a couple of
times a week. We finally got the site stabilized by rigging upstart to
monitor and automatically restart mongos when it crashes. Fortunately,
mongos crashing doesn't cause any data loss (at least not that we've
noticed). Hopefully this is something the 10gen folks will sort out in the
1.8 release.

The architectural issues are more complex. Mongo can enforce uniqueness on
a field, but only on non-sharded collections. Security (i.e. password
authentication) does not work in a sharded environment. If I understand
the release notes correctly, that's something which may get fixed in some
future release.

> Scripting languages like Python are
> very well supported and linked against MongoDB

The Python interface is very nice. In some ways, the JS interface is
nicer, only because you can get away with less quoting, e.g.

   JS:   find({inquisition: {$ne: 'spanish'}})
   Py:   find({'inquisition': {'$ne': 'spanish'}})

The PHP interface is (like everything in PHP) sucky:

   PHP:  find(array('inquisition' => array('$ne' => 'spanish')))

The common thread here is that unlike SQL, you're not feeding the database
a string which it parses, you're feeding it a data structure. You're stuck
with whatever data structure syntax the host language supports. Well,
actually, that's not true. If you wanted to, you could write a front end
which lets you execute:

   "find where inquisition != spanish"

and have code to parse that and turn it into the required data structure.
The odds of anybody doing that are pretty low, however. It would just feel
wrong, in much the same way that SQLAlchemy's functional approach to
building a SQL query just feels wrong to somebody who knows SQL.

> I find MongoDB well suited for what is
> traditionally known as data warehousing.

I'll go along with that. It's a way to build a fast (possibly distributed,
if they get sharding to work right) network datastore with some basic
query capability. Compared to SQL, you end up doing a lot more work on the
application side, and take on a lot more of the responsibility to enforce
data integrity yourself.

> You may want to look
> at this Youtube clip entitled "MongoDB is web scale":
>
> http://www.youtube.com/watch?v=b2F-DItXtZs

That's the funniest thing I've seen in a long time. The only sad part is
that it's all true.

There are some nice things about NoSQL databases (particularly the
schema-free part). A while ago, we discovered that about 200 of the
300,000 documents in one of our collections were effectively duplicates of
other documents ("document" in mongo-speak means "record" or perhaps "row"
in SQL-speak). It was trivial to add "is_dup_of" fields to just those 200
records, and a little bit of code in our application to check the
retrieved documents for that field and retrieve the pointed-to document.
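
A minimal sketch of that kind of application-side check, using a plain
dict as a stand-in for the collection. The resolve_dup name, the _id keys,
the max_hops guard and the sample documents are all made up for
illustration; a real version would fetch from the database instead of
indexing a dict.

```python
# Hypothetical stand-in for a collection: maps _id -> document (a dict).
collection = {
    1: {"_id": 1, "title": "original"},
    2: {"_id": 2, "title": "copy", "is_dup_of": 1},
}


def resolve_dup(collection, doc_id, max_hops=10):
    """Fetch a document, following is_dup_of pointers to the original.

    max_hops is an assumed safety limit so that a bad pointer loop
    raises an error instead of spinning forever.
    """
    doc = collection[doc_id]
    hops = 0
    while "is_dup_of" in doc:
        if hops >= max_hops:
            raise ValueError("is_dup_of chain too long (cycle?)")
        doc = collection[doc["is_dup_of"]]
        hops += 1
    return doc
```

Only the documents that carry the extra field pay for the extra lookup;
everything else comes back unchanged.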
In SQL, that would have meant adding another column, or perhaps another
table. Either way would have been far more painful than the fix we were
able to do in mongo.
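
Going back to the hypothetical front end for "find where inquisition !=
spanish": a toy sketch of such a parser might look like the following. The
parse_query name, the tiny grammar (one field, one operator, one bare-word
value that stays a string) and the operator table are assumptions made for
illustration, not an existing driver API.

```python
import re

# Comparison operators mapped to Mongo-style query operators.
# None means plain equality, which needs no operator wrapper.
OPERATORS = {
    "=": None,
    "!=": "$ne",
    "<": "$lt",
    "<=": "$lte",
    ">": "$gt",
    ">=": "$gte",
}

# Two-character operators are listed first so "<=" is not read as "<".
PATTERN = re.compile(r"^find where (\w+)\s*(!=|<=|>=|=|<|>)\s*(\w+)$")


def parse_query(text):
    """Turn 'find where FIELD OP VALUE' into a query dict."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError("cannot parse query: %r" % text)
    field, op, value = match.groups()
    mongo_op = OPERATORS[op]
    if mongo_op is None:
        return {field: value}
    return {field: {mongo_op: value}}
```

The resulting dict has the same shape as the Py: example above, so it
could be handed to find() as-is.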