Hi Raphael,

On Mon, Feb 28, 2011 at 10:01:55AM +0000, Raphael Cohn wrote:
> AMQP Observations
>
> Your comments about AMQP seem to mostly be appropriate for one of the
> older versions, eg 0-8, and I don't think they particularly apply to
> later versions, eg 1-0. AMQP 0-8 did have some issues that didn't
> always make it an optimal choice. For instance:-
I was using the latest version of RabbitMQ and a few different client
APIs, and analyzed the protocol exchange. I find it's usually best to
see what the latest software actually in use is doing, rather than a
spec that may not yet be implemented. It looks to have been 0.9.1
according to the header. With StormMQ leading the effort on free
clients, this certainly helps, and I'm sure more server implementations
will be popping up.

> - Exchanges, etc are no longer part of the spec; queues can be
>   transient, with configurable transience and timeouts (eg destroy it
>   10s after the last message was retrieved)

Ahh, interesting. This is pretty different from previous versions then.

> - Configuration is part of the act of sending messages, not separate,
>   eg open, send to new queue, etc

Ahh, great.

> Using HTTP
>
> Whilst you can always put any asynchronous protocol over a synchronous
> one, it doesn't always work out too well. For example, starting on
> such an approach means that any 'kernel' will be optimised for
> 'pulling' from a queue, when an efficient queue server handling tens
> of thousands of connections needs to be able to 'push' incoming
> messages, after filtering, to their destinations. Pushing it all into
> the HTTP request is a sensible approach for simple req-response
> protocols, but it's going to put a heavy burden onto your queue
> server.

I would disagree here; a pull-based kernel can still be quite
efficient. Gearman, a different queue protocol/server, is pull-based
(basically long-polling), and the server I wrote could easily route 50k
fully synchronous messages/second on a 4-core machine. This was also
without any form of batching optimizations, which will be part of the
OpenStack queue service. The pull operations need to be designed
correctly so they are optimized for high throughput, which as a result
makes things slightly more chatty for idle or low-throughput
connections.
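To make the long-polling idea concrete, here's a minimal in-memory
sketch (hypothetical names, standard library only; not the actual
service code). The worker blocks on a poll until a message arrives or
the window expires, then simply loops and polls again, so an idle
connection is chattier but a busy one stays saturated:

```python
import queue
import threading

class PullQueue:
    """Toy pull-based (long-polling) queue kernel for illustration."""

    def __init__(self):
        self._messages = queue.Queue()

    def push(self, msg):
        self._messages.put(msg)

    def long_poll(self, timeout=1.0):
        # Block for up to `timeout` seconds; return None on timeout
        # so the caller can just loop around and poll again.
        try:
            return self._messages.get(timeout=timeout)
        except queue.Empty:
            return None

# A producer pushes shortly after the worker starts waiting; the
# worker receives the message inside a single poll window.
q = PullQueue()
threading.Timer(0.1, q.push, args=("job-1",)).start()
print(q.long_poll(timeout=2.0))  # prints "job-1"
```

Batching (pulling several messages per poll) would sit naturally on top
of this loop, which is where the throughput optimizations come in.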
The pull-based kernel also addresses a couple of other problems typical
of distributed queue systems. The first is worker readiness. With
workers that may wish to connect to multiple servers for HA (messages
are spread based on hashing or geographic region), you don't want the
worker busy with a message when another server is trying to push
another message to it. If the worker always initiates the receipt of a
message, you can ensure it is able to process the message immediately.
Otherwise a queue server may push a message to a worker, and that
message could sit blocked until the worker is done with the current
message from the other server, delaying the response time. This same
issue can occur with a single server connection when the worker is
unresponsive due to latency, the machine being busy, etc.

Another issue a pull-based kernel helps with is affinity for fast
workers and response times. By allowing workers to pull the message
after a long-poll, the fastest workers will be the ones doing the most
work. For HA solutions where you have multiple workers on multiple
servers, this allows the workers to naturally do the most work where it
is most efficient. This can be a very useful form of load balancing
that you get for free with pull-based queues.

Having said all this, it will still be very easy to add push-based
protocols on top of this, so protocols like AMQP should not be
difficult to add on.

> RTT: This is almost irrelevant once you decide to use TLS. TLS set up
> and tear down, essential for most cloud operations, is far more
> inefficient than any protocol it tunnels. And anyone sending messages
> without encryption should be shot. It's not acceptable to send other
> people's data unsecured anymore (indeed, if it ever was).

I didn't count TCP and/or SSL/TLS RTTs since those will apply to any
protocol when security is a concern (unless you have a way of
preventing replay attacks built into the service).
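The worker-readiness and fast-worker-affinity behavior of the pull
model described above can be sketched as follows. The `servers` list,
the `fetch` callable, and the server names are hypothetical stand-ins
for real connections and poll calls, not the actual OpenStack queue
service API:

```python
import itertools

def pull_from_servers(servers, fetch):
    """Sketch of a worker pulling from several queue servers for HA.
    The worker initiates every receipt, so it only takes a message
    when it is actually ready, and faster workers naturally end up
    pulling (and doing) more of the work. fetch(server) returns a
    message, or None when that server has nothing queued."""
    for server in itertools.cycle(servers):
        msg = fetch(server)    # poll only while idle, hence ready
        if msg is not None:
            yield server, msg  # process, then loop back and re-poll

# Demo with two fake servers holding one message each.
inboxes = {"east": ["m1"], "west": ["m2"]}
def fetch(server):
    return inboxes[server].pop(0) if inboxes[server] else None

worker = pull_from_servers(["east", "west"], fetch)
print(next(worker))  # ('east', 'm1')
print(next(worker))  # ('west', 'm2')
```

Since the worker never holds an unprocessed pushed message, no server
can stall it on behalf of another; the free load balancing falls out of
the same loop.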
The synchronous RTTs, regardless of the source, do matter, as these
make the user experience suffer. It sounds like the pipelining in the
1-0 spec takes care of my concerns, though.

> 201 Created, etc: What happens to your message if your TCP connection
> dies mid-reply? How do you know if your message was queued? Is there a
> reconciliation API?

See: http://wiki.openstack.org/QueueService#Behavior_and_Constraints

Basically, duplicates are possible for a number of reasons, this being
one of them. Workers need to be idempotent.

> Of course, some of these concerns could be addressed with HTTP
> sessions or cookies, but that's quite nasty to use in most
> environments.

Yeah, we don't want to go there. :)

> Fundamentally, if your wish is to support languages such as PHP with a
> messaging layer, then HTTP initially seems to be the way to go. The
> reality is that any TCP based approach is inappropriate here, because
> opening a TCP connection on the back of a TCP connection is very weak.
> The right solution is to use connection caching - people have done it
> with databases for years for this reason - but some of these, erm, web
> languages, make such an approach too hard. Hopefully the growth of
> sensible back ends like node.js will make this a thing of the past.

I'll hold back my comments on node.js, but I will agree that connection
caching is certainly more widespread than it used to be. We can help
aid this by writing efficient interfaces into the service, but we still
have devices like cell phones (or other devices over cell networks)
where connection caching isn't always going to be an efficient option.

> AMQP has internally multiplexed sessions (virtual connections if you
> will), so a PHP runtime, say, could open just one AMQP connection -
> and assign each incoming request to one of the available sessions.
> With 65,536 sessions available, only the most incredible PHP server
> would need them all at once, given that most PHP code falls over if 3
> people connect... In practice, the right place then to open a
> messaging connection with the sorts of web apps these languages are
> used for is in the browser itself - a job for which WebSockets, and
> not HTTP, would seem the right solution. And AMQP is rather well
> suited to WebSockets.

I'm not too worried about the number of connections; we can deal with
those fairly efficiently. I do worry about the expense of setting up
and tearing down a connection, which, as we've discussed, is now not a
concern with AMQP 1-0. WebSockets, Webhooks, and PuSH all have places
in a service like this, and these will be implemented once the basic
REST interface is ready. I'm certainly much more interested in
implementing AMQP 1-0 sooner rather than later after hearing that my
main efficiency concerns were addressed with the latest spec (thank you
for bringing this to my attention). I think there is still a lot of
value in starting with a REST based API for ease of understanding and
accessibility.

> Transient Queues
>
> Oh, these are really easy to do. But distributed hashing isn't. It's
> actually a really interesting problem for queuing, and one we had to
> address with StormMQ. Intriguingly, in low usage situations, it
> actually makes message receipt non-deterministic and potentially
> unordered; getting this 'right' depends on whether you err in favour
> of at-most-once messaging or at-least-once messaging.

Yup, the behavior and constraints link above talks about the message
ordering with hashing, and how it's not guaranteed. As you can probably
tell by now, this service is taking a minimalist approach. It does a
few things really well, but is punting on some of the tricky issues
like duplication and ordering. This is because it doesn't matter for a
large class of applications.
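Punting on duplication is exactly why workers need to be idempotent, as
mentioned earlier: the service may redeliver a message (for example,
when the reply to an enqueue is lost mid-connection). A minimal sketch;
`make_idempotent` and the in-memory `seen` set are illustrative, not
part of the service, and a real worker would track processed ids in
durable storage:

```python
def make_idempotent(handler):
    """Wrap a handler so redelivered messages (same id) are
    acknowledged but not processed a second time."""
    seen = set()  # toy dedupe store; real workers need durable state

    def wrapper(msg_id, body):
        if msg_id in seen:
            return "duplicate"  # ack, but do no work twice
        seen.add(msg_id)
        handler(body)
        return "processed"

    return wrapper

results = []
worker = make_idempotent(results.append)
print(worker("id-1", "hello"))  # processed
print(worker("id-1", "hello"))  # duplicate (redelivery ignored)
print(results)                  # ['hello']
```

Keeping the dedupe logic in the worker keeps the queue service itself
simple, which is the minimalist trade-off being made here.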
For applications where it does matter, there may be more efficient ways
of handling it in the application context than in a generic queue
service. We can also add features later for such things as common
patterns emerge.

Thanks again for all the great feedback and the information on the 1-0
spec.

-Eric

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp