Excerpts from Samuel Merritt's message of 2014-09-09 16:12:09 -0700:
> On 9/9/14, 12:03 PM, Monty Taylor wrote:
> > On 09/04/2014 01:30 AM, Clint Byrum wrote:
> >> Excerpts from Flavio Percoco's message of 2014-09-04 00:08:47 -0700:
> >>> Greetings,
> >>>
> >>> Last Tuesday the TC held the first graduation review for Zaqar. During
> >>> the meeting some concerns arose. I've listed those concerns below with
> >>> some comments hoping that it will help starting a discussion before the
> >>> next meeting. In addition, I've added some comments about the project
> >>> stability at the bottom and an etherpad link pointing to a list of use
> >>> cases for Zaqar.
> >>>
> >>
> >> Hi Flavio. This was an interesting read. As somebody whose attention has
> >> recently been drawn to Zaqar, I am quite interested in seeing it
> >> graduate.
> >>
> >>> # Concerns
> >>>
> >>> - Concern on operational burden of requiring NoSQL deploy expertise to
> >>> the mix of openstack operational skills
> >>>
> >>> For those of you not familiar with Zaqar, it currently supports 2 nosql
> >>> drivers - MongoDB and Redis - and those are the only 2 drivers it
> >>> supports for now. This will require operators willing to use Zaqar to
> >>> maintain a new (?) NoSQL technology in their system. Before expressing
> >>> our thoughts on this matter, let me say that:
> >>>
> >>> 1. By removing the SQLAlchemy driver, we basically removed the chance
> >>> for operators to use an already deployed "OpenStack-technology"
> >>> 2. Zaqar won't be backed by any AMQP based messaging technology for
> >>> now. Here's[0] a summary of the research the team (mostly done by
> >>> Victoria) did during Juno
> >>> 3. We (OpenStack) used to require Redis for the zmq matchmaker
> >>> 4. We (OpenStack) also use memcached for caching and as the oslo
> >>> caching lib becomes available - or a wrapper on top of dogpile.cache -
> >>> Redis may be used in place of memcached in more and more deployments.
> >>> 5. Ceilometer's recommended storage driver is still MongoDB, although
> >>> Ceilometer has now support for sqlalchemy. (Please correct me if I'm
> >>> wrong).
> >>>
> >>> That being said, it's obvious we already, to some extent, promote some
> >>> NoSQL technologies. However, for the sake of the discussion, lets assume
> >>> we don't.
> >>>
> >>> I truly believe, with my OpenStack (not Zaqar's) hat on, that we can't
> >>> keep avoiding these technologies. NoSQL technologies have been around
> >>> for years and we should be prepared - including OpenStack operators - to
> >>> support these technologies. Not every tool is good for all tasks - one
> >>> of the reasons we removed the sqlalchemy driver in the first place -
> >>> therefore it's impossible to keep an homogeneous environment for all
> >>> services.
> >>>
> >>
> >> I whole heartedly agree that non traditional storage technologies that
> >> are becoming mainstream are good candidates for use cases where SQL
> >> based storage gets in the way. I wish there wasn't so much FUD
> >> (warranted or not) about MongoDB, but that is the reality we live in.
> >>
> >>> With this, I'm not suggesting to ignore the risks and the extra burden
> >>> this adds but, instead of attempting to avoid it completely by not
> >>> evolving the stack of services we provide, we should probably work on
> >>> defining a reasonable subset of NoSQL services we are OK with
> >>> supporting. This will help making the burden smaller and it'll give
> >>> operators the option to choose.
> >>>
> >>> [0] http://blog.flaper87.com/post/marconi-amqp-see-you-later/
> >>>
> >>>
> >>> - Concern on should we really reinvent a queue system rather than
> >>> piggyback on one
> >>>
> >>> As mentioned in the meeting on Tuesday, Zaqar is not reinventing message
> >>> brokers. Zaqar provides a service akin to SQS from AWS with an OpenStack
> >>> flavor on top. [0]
> >>>
> >>
> >> I think Zaqar is more like SMTP and IMAP than AMQP. You're not really
> >> trying to connect two processes in real time. You're trying to do fully
> >> asynchronous messaging with fully randomized access to any message.
> >>
> >> Perhaps somebody should explore whether the approaches taken by large
> >> scale IMAP providers could be applied to Zaqar.
> >>
> >> Anyway, I can't imagine writing a system to intentionally use the
> >> semantics of IMAP and SMTP. I'd be very interested in seeing actual use
> >> cases for it, apologies if those have been posted before.
> >
> > It seems like you're EITHER describing something called XMPP that has at
> > least one open source scalable backend called ejabberd. OR, you've
> > actually hit the nail on the head with bringing up SMTP and IMAP but for
> > some reason that feels strange.
> >
> > SMTP and IMAP already implement every feature you've described, as well
> > as retries/failover/HA and a fully end to end secure transport (if
> > installed properly) If you don't actually set them up to run as a public
> > messaging interface but just as a cloud-local exchange, then you could
> > get by with very low overhead for a massive throughput - it can very
> > easily be run on a single machine for Sean's simplicity, and could just
> > as easily be scaled out using well known techniques for public cloud
> > sized deployments?
> >
> > So why not use existing daemons that do this? You could still use the
> > REST API you've got, but instead of writing it to a mongo backend and
> > trying to implement all of the things that already exist in SMTP/IMAP -
> > you could just have them front to it. You could even bypass normal
> > delivery mechanisms and do neat things with local injection.
> >
> > I don't care about the NoSQL question on its own. Mongo is fine. Redis
> > is fine. I don't think either has any features for this use case that
> > make a licks worth of difference compared to MySQL or Postgres, but I
> > also don't think they are a PROBLEM in an of themselves.
> >
> > The main thing I care about here is every description I've heard of what
> > zaqar wants to do (which does seem to be getting clearer through this
> > thread) is still well implemented somewhere as an existing scalable
> > service. Is zaqar actually Rabbit with a REST interface? Is it ejabberd
> > with a rest interface? Or is it IMAP/SMTP with a REST interface. You'll
> > note that probably nobody would think a single server that wanted to be
> > both Rabbit AND IMAP/SMTP is a good idea ... at least this is one of the
> > reasons why we all think Microsoft Exchange is a pile of garbage, no?
> >
> > I also worry about the fact that one description of zaqar was used to
> > communicate a need for divergent requirements (it needs to be a
> > high-volume fast message broker/queue - which, btw, sounds more like
> > Rabbit/oslo.messaging and less like what Clint describes above) ... and
> > that's why it wants to use falcon and not pecan and why it wants to use
> > mongo and not SQL. And then what we're doing it reimplementing something
> > like rabbit except in python (again, given as the justification for
> > deviating from how other bits of OpenStack work)
> >
> > BUT - if that's not actually what zaqar is - if it isn't a rabbit
> > replacement and doesn't need to do massive high volume sub-second
> > queuing because what it's actually modeling is a message subscription
> > service that's closer to email than to anything else, then there is
> > nothing about the components that are happily used in the rest of
> > OpenStack that should be precluded from being used. A REST api written
> > in pecan should be fine ... as should an SQL backend, because 99% of all
> > operations are going to be primary key lookups where even a moderately
> > tuned database should be absolutely fine at keeping up.
> >
> > So which is it? Because it sounds like to me it's a thing that actually
> > does NOT need to diverge in technology in any way, but that I've been
> > told that it needs to diverge because it's delivering a different set of
> > features - and I'm pretty sure if it _is_ the thing that needs to
> > diverge in technology because of its feature set, then it's a thing I
> > don't think we should be implementing in python in OpenStack because it
> > already exists and it's called AMQP.
>
> Whether Zaqar is more like AMQP or more like email is a really strange
> metric to use for considering its inclusion.
>
> Let me put on my web application developer's hat. Whenever I've worked
> on a web app, I've invariably wound up needing HTTP servers, background
> workers, and some sort of queue to connect the two.
>
> I've done the thing where I've stored queue entries in the app's
> database and had the workers poll for jobs; the load adds up
> surprisingly fast, and it's got some bad positive-feedback failure
> modes. However, it is nice and durable, so my app doesn't lose messages.
>
> I've done the thing where I've thrown together a VM and stuck Redis or
> rabbitmq or beanstalkd on it. That gets me nice, fast queues, but no
> semblance of reliability. If that one VM dies, all my queued messages
> are lost.
>
> Then there's Zaqar, which is this nice HTTP API that I can use for my
> application's queues. I go and make a couple of POST requests and now
> I've got some queues for my application to use. My app servers POST
> messages to their queues, and my background workers sit and make GET
> requests for messages to process. I can have the whole thing up and
> running in a few hours. Better yet, I barely have to monitor the thing.
> I can poll for queue stats every few minutes and alert if the queue gets
> too full, but that's all I've got to do. I don't have to worry about my
> queue VM going into swap, or my queue VM's NIC getting saturated, or
> kernel panics, or automatically promoting rabbitmq slaves to masters, or
> waking up at 3 AM to fix my app when I lose messages during a rabbitmq
> promotion, or any of that stuff. Using Zaqar means I can just worry
> about my application and leave all that other garbage to my cloud provider.

What you just described is the queue pattern I spoke of. It does not
require random access by message ID in any way, shape, or form. It is
also well served by AMQP.

I wonder if people are still confused by Zaqar's API and architecture,
because this would be a fine architecture if what you describe above
were the requirements:

https://www.dropbox.com/s/yonloa9ytlf8fdh/ZaqarQueueOnly.png?dl=0

Just stick a REST shim in front of AMQP that enforces tenant permissions
and maps logical "zaqar queues" to whatever the backend serving queue
is.

So why would we need a NoSQL database for the data itself if all we are
doing is shoving messages in and taking them out the other end?

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
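
To make the "REST shim in front of AMQP" idea a little more concrete,
here is a rough sketch of just the mapping and permission layer. It is
not Zaqar code and it talks to no real broker: InMemoryBroker is a
hypothetical stand-in for an AMQP server, QueueShim and its method names
are invented for illustration, and the "tenant.queue" naming scheme is
an assumption (it presumes tenant IDs contain no dots). The HTTP layer
is left out entirely.

```python
from collections import deque


class InMemoryBroker:
    """Hypothetical stand-in for an AMQP broker: named FIFO queues only."""

    def __init__(self):
        self._queues = {}

    def declare(self, name):
        # Create the backend queue if it does not exist yet.
        self._queues.setdefault(name, deque())

    def publish(self, name, body):
        self._queues[name].append(body)

    def get(self, name):
        # Return the oldest message, or None when the queue is empty.
        queue = self._queues[name]
        return queue.popleft() if queue else None


class QueueShim:
    """Maps (tenant, logical queue) to a backend queue and checks ownership."""

    def __init__(self, broker):
        self._broker = broker
        self._owned = set()  # (tenant, logical_name) pairs that exist

    @staticmethod
    def backend_name(tenant, logical_name):
        # Assumed naming scheme; presumes tenant IDs contain no dots.
        return f"{tenant}.{logical_name}"

    def _require(self, tenant, logical_name):
        # A tenant may only touch queues it created itself.
        if (tenant, logical_name) not in self._owned:
            raise PermissionError(f"{tenant} has no queue {logical_name!r}")
        return self.backend_name(tenant, logical_name)

    def create_queue(self, tenant, logical_name):
        self._owned.add((tenant, logical_name))
        self._broker.declare(self.backend_name(tenant, logical_name))

    def post_message(self, tenant, logical_name, body):
        self._broker.publish(self._require(tenant, logical_name), body)

    def get_message(self, tenant, logical_name):
        return self._broker.get(self._require(tenant, logical_name))


shim = QueueShim(InMemoryBroker())
shim.create_queue("tenant-a", "jobs")
shim.post_message("tenant-a", "jobs", "resize image 1")
print(shim.get_message("tenant-a", "jobs"))
```

Note that nothing in the sketch addresses a message by ID: messages go
in one end and come out the other, which is the whole of the pattern
described in the quoted text.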