On Tue, Jun 3, 2014 at 7:10 PM, Marko Rauhamaa <ma...@pacujo.net> wrote:
> Chris Angelico <ros...@gmail.com>:
>
>> def request.process(self):    # I know this isn't valid syntax
>>     db.act(whatever)          # may block but shouldn't for long
>>     db.commit()               # ditto
>>     write(self, response)     # won't block
>>
>> This works as long as your database is reasonably fast and close
>
> I find that assumption unacceptable.
It is a dangerous assumption.

> The DB APIs desperately need asynchronous variants. As it stands, you
> are forced to delegate your DB access to threads/processes.
>
>> So how do you deal with the possibility that the database will block?
>
> You separate the request and response parts of the DB methods. That's
> how it is implemented internally anyway.
>
> Say no to blocking APIs.

Okay, but how do you handle two simultaneous requests going through the
processing that you see above? You *MUST* separate them onto two
transactions, otherwise one will commit half of the other's work. (Or
are you forgetting Databasing 101 - a transaction should be a logical
unit of work?) And since you can't, with most databases, have two
transactions on one connection, that means you need a separate
connection for each request.

Given that the advantages of asyncio include the ability to scale to
arbitrary numbers of connections, it's not really a good idea to then
say "oh, but you need that many concurrent database connections". Most
systems can probably handle a few thousand threads without a problem,
but a few million is going to cause major issues; and most databases
start getting inefficient at just a few thousand concurrent sessions.

>> but otherwise, you would need to completely rewrite the main code.
>
> That's a good reason to avoid threads. Once you realize you would have
> been better off with an async approach, you'll have to start over. You
> can easily turn a nonblocking solution into a blocking one but not the
> other way around.

Alright. I'm throwing down the gauntlet. Write me a purely nonblocking
web site concept that can handle a million concurrent connections,
where each one requires one query against the database, and one in a
hundred of them requires five queries which happen atomically. I can do
it with a thread pool and blocking database queries: by matching the
thread pool size to the database's concurrent connection limit, I can
manage memory usage fairly easily. How do you do it efficiently with
pure async I/O?

ChrisA
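
P.S. For concreteness, here's roughly what I mean by the thread-pool
version -- a minimal sketch, not production code. connect_db() is a
stand-in (sqlite3 only because it ships with Python; a real deployment
would use a client/server database's DB-API driver), and POOL_SIZE and
the request shape are made up for illustration:

import queue
import sqlite3
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 100   # match this to the database's connection limit

def connect_db():
    # Stand-in for a real client/server database connection
    # (psycopg2.connect() or similar); any DB-API 2.0 driver fits.
    return sqlite3.connect("app.db", check_same_thread=False)

# Exactly POOL_SIZE connections exist, so database sessions and
# memory stay bounded no matter how many clients are waiting.
connections = queue.Queue()
for _ in range(POOL_SIZE):
    connections.put(connect_db())

def run_transaction(statements):
    # statements: list of (sql, params) tuples -- usually one, and
    # five for the one-in-a-hundred case.  They commit or roll back
    # as a single logical unit of work, on a connection that no
    # other request is sharing.
    conn = connections.get()    # blocks until a connection frees up
    try:
        cur = conn.cursor()
        for sql, params in statements:
            cur.execute(sql, params)
        rows = cur.fetchall()   # results of the last statement, if any
        conn.commit()
        return rows
    except Exception:
        conn.rollback()         # never commit half a unit of work
        raise
    finally:
        connections.put(conn)

# The pool size caps how many requests touch the database at once;
# work submitted beyond that simply waits in the executor's queue.
executor = ThreadPoolExecutor(max_workers=POOL_SIZE)

def handle_request(statements):
    return executor.submit(run_transaction, statements)

The point of that shape: the million sockets are cheap to keep open,
but only POOL_SIZE requests are ever inside the database at once, and
each of those has a connection -- and therefore a transaction -- all
to itself.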