Re: [Openstack] Messaging reliability/durability expectations

2014-10-16 Thread Joshua Harlow
Date: Thursday, October 16, 2014 at 6:26 AM To: Aaron Knister Cc: "openstack@lists.openstack.org" Subject: Re: [Openstack] Messaging reliability/durability expectations >

Re: [Openstack] Messaging reliability/durability expectations

2014-10-16 Thread Gordon Sim
On 10/16/2014 01:51 PM, Aaron Knister wrote: Thanks, again, for your replies. I started looking at the code to see about implementing acknowledgements in the Qpid driver and I'll admit after some digging I've come up confused. These lines (it's in master as well as the stable icehouse branch) htt

Re: [Openstack] Messaging reliability/durability expectations

2014-10-16 Thread Aaron Knister
Hi Gordon, Thanks, again, for your replies. I started looking at the code to see about implementing acknowledgements in the Qpid driver and I'll admit after some digging I've come up confused. These lines (it's in master as well as the stable icehouse branch) http://git.io/w3KkQw and http://git

Re: [Openstack] Messaging reliability/durability expectations

2014-10-16 Thread Gordon Sim
On 10/14/2014 10:36 PM, Noel Burton-Krahn wrote: Unfortunately, durable queues don't fix the case where rabbit dies and restarts on a new host (and loses its durable queue store). Note that losing the durable queue store means the store-and-forward guarantee is lost and therefore both request a

Re: [Openstack] Messaging reliability/durability expectations

2014-10-16 Thread Gordon Sim
On 10/14/2014 10:48 PM, Aaron Knister wrote: The fixes to all 3 of these issues seem to be patches to the rabbit driver for oslo. Are the other drivers (e.g. qpid) any more robust or are they just not heavily used so more bugs may be lurking there? As mentioned, the qpid driver does not use ack

Re: [Openstack] Messaging reliability/durability expectations

2014-10-14 Thread Aaron Knister
Thanks Chris, Noel for your helpful replies! The fixes to all 3 of these issues seem to be patches to the rabbit driver for oslo. Are the other drivers (e.g. qpid) any more robust or are they just not heavily used so more bugs may be lurking there? I'd really like to use zeromq but the lack of any

Re: [Openstack] Messaging reliability/durability expectations

2014-10-14 Thread Noel Burton-Krahn
Hi Aaron, Unfortunately, durable queues don't fix the case where rabbit dies and restarts on a new host (and loses its durable queue store). There's a fix here, but it's been waiting a while for a merge: https://review.openstack.org/#/c/109373/ -- Noel On Tue, Oct 14, 2014 at 1:51 PM, Aaron K

Re: [Openstack] Messaging reliability/durability expectations

2014-10-14 Thread Chris Friesen
On 10/14/2014 09:39 AM, Sandy Walsh wrote: Sort of. Openstack RPC-over-AMQP (oslo.messaging) automatically ack()'s all messages that are received. So, it becomes the responsibility of the sender to retry. For example, the scheduler in Nova does this. However, if the client fails before getting t
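To illustrate the auto-ack behaviour Sandy describes, here is a minimal toy sketch in plain Python. It is not oslo.messaging or any real AMQP client; `ToyQueue` and `consume_once` are hypothetical names invented for this example. It only shows why a message that is acknowledged on receipt is gone if the consumer then fails, whereas a message acknowledged after handling can be requeued.

```python
from collections import deque


class ToyQueue:
    """Hypothetical in-memory queue; stands in for a broker queue."""

    def __init__(self, messages):
        self.ready = deque(messages)   # messages waiting for delivery
        self.unacked = {}              # delivery tag -> message awaiting ack
        self._next_tag = 0

    def get(self, auto_ack):
        msg = self.ready.popleft()
        if auto_ack:
            # Acknowledged on receipt: the queue forgets the message now.
            return None, msg
        self._next_tag += 1
        self.unacked[self._next_tag] = msg
        return self._next_tag, msg

    def ack(self, tag):
        del self.unacked[tag]

    def consumer_died(self):
        # Put every unacknowledged message back, preserving original order.
        for tag in sorted(self.unacked, reverse=True):
            self.ready.appendleft(self.unacked.pop(tag))


def consume_once(queue, handler, auto_ack):
    """Deliver one message; return True if the handler succeeded."""
    tag, msg = queue.get(auto_ack)
    try:
        handler(msg)
    except Exception:
        queue.consumer_died()
        return False
    if tag is not None:
        queue.ack(tag)
    return True
```

With `auto_ack=True` a handler failure leaves the queue empty, so only a retry by the sender can recover the request. With `auto_ack=False` the same failure puts the message back on the queue.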

Re: [Openstack] Messaging reliability/durability expectations

2014-10-14 Thread Aaron Knister
(Tested in icehouse) On Tue, Oct 14, 2014 at 4:50 PM, Aaron Knister wrote: > For those of you following along at home-- I just discovered that durable > queues are particularly nice for nova scheduler. Without them an outage of > either the MQ daemon (qpid in my case) or the scheduler itself ca

Re: [Openstack] Messaging reliability/durability expectations

2014-10-14 Thread Aaron Knister
For those of you following along at home-- I just discovered that durable queues are particularly nice for nova scheduler. Without them an outage of either the MQ daemon (qpid in my case) or the scheduler itself can cause the scheduling requests to get dropped on the floor. With durability the ins

Re: [Openstack] Messaging reliability/durability expectations

2014-10-14 Thread Gordon Sim
On 10/14/2014 07:36 PM, Aaron Knister wrote: With RabbitMQ if a message is silently dropped by the broker will a timeout still occur/exception be raised because no reply/ack was received? With the QPID driver the automatic ack()'s Sandy mentioned don't occur? Will the sender eventually become aw

Re: [Openstack] Messaging reliability/durability expectations

2014-10-14 Thread Aaron Knister
Thanks Gordon, and Sandy. With RabbitMQ if a message is silently dropped by the broker will a timeout still occur/exception be raised because no reply/ack was received? With the QPID driver the automatic ack()'s Sandy mentioned don't occur? Will the sender eventually become aware that a message w

Re: [Openstack] Messaging reliability/durability expectations

2014-10-14 Thread Gordon Sim
I agree that greater clarity on expectations around reliability is needed. The drivers all differ in this regard. As it stands today, the impl_rabbit driver only retries an RPC request if an exception occurs while sending it. However, messages are sent unconfirmed[1]. This means a message can
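The difference between retrying only on a send exception and waiting for a broker confirmation can be sketched with a toy model. This is plain Python under stated assumptions, not the impl_rabbit code or any real client API; `FlakyBroker`, `send_unconfirmed` and `send_confirmed` are hypothetical names. The boolean returned by `publish` stands in for a publisher confirm.

```python
class FlakyBroker:
    """Hypothetical broker that silently drops the first N messages."""

    def __init__(self, drop_first_n):
        self.drop_remaining = drop_first_n
        self.delivered = []

    def publish(self, msg):
        # Never raises. The return value plays the role of a confirm:
        # True only if the message was actually enqueued.
        if self.drop_remaining > 0:
            self.drop_remaining -= 1
            return False
        self.delivered.append(msg)
        return True


def send_unconfirmed(broker, msg):
    # Fire and forget: no exception is raised on a silent drop,
    # so a retry-on-exception policy never triggers.
    broker.publish(msg)


def send_confirmed(broker, msg, max_attempts=5):
    # Resend until the broker confirms; return the number of attempts used.
    for attempt in range(1, max_attempts + 1):
        if broker.publish(msg):
            return attempt
    raise TimeoutError(f"no confirmation after {max_attempts} attempts")
```

In this model a dropped unconfirmed message is lost without the sender noticing, while the confirmed variant either gets the message through or raises an error the caller can act on. Note that resending can produce duplicates in a real system if a confirm is lost rather than the message, which the toy does not model.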

Re: [Openstack] Messaging reliability/durability expectations

2014-10-14 Thread Sandy Walsh
s...@gmail.com] Sent: Tuesday, October 14, 2014 12:00 PM To: Raghu Vadapalli Cc: Subject: Re: [Openstack] Messaging reliability/durability expectations Thanks Raghu. I think I might not be asking the right questions. Part of my ignorance here comes from not understanding AMQP. I think really what I

Re: [Openstack] Messaging reliability/durability expectations

2014-10-14 Thread Aaron Knister
Thanks Raghu. I think I might not be asking the right questions. Part of my ignorance here comes from not understanding AMQP. I think really what I'm trying to figure out is whether openstack expects durable queues. It sounds like the answer is no but confirmation of this would be great. Even if

Re: [Openstack] Messaging reliability/durability expectations

2014-10-14 Thread Raghu Vadapalli
-- Raghu On Tuesday, Oct 14, 2014 at 7:49 AM, Aaron Knister wrote: Thanks Remo, could you elaborate a little? Which is part of RabbitMQ? The HA layer or the message retransmission? I'm currently using qpid. Also, just so I'm clear, is it the openstack code or the low-level messaging drivers

Re: [Openstack] Messaging reliability/durability expectations

2014-10-14 Thread Aaron Knister
Thanks Remo, could you elaborate a little? Which is part of RabbitMQ? The HA layer or the message retransmission? I'm currently using qpid. Also, just so I'm clear, is it the openstack code or the low-level messaging drivers (rabbit, zmq, qpid) that retransmit on message delivery failure? Thank

Re: [Openstack] Messaging reliability/durability expectations

2014-10-13 Thread Remo Mattei
That is part of RabbitMQ and yes it will resend the msg. Remo > On Oct 13, 2014, at 22:31, Aaron Knister wrote: > > Hi Everyone, > > I'm building a production-grade cloud where HA is a requirement. I'm > currently working on implementing HA of the messaging layer and am not clear > on what

[Openstack] Messaging reliability/durability expectations

2014-10-13 Thread Aaron Knister
Hi Everyone, I'm building a production-grade cloud where HA is a requirement. I'm currently working on implementing HA of the messaging layer and am not clear on what the expectations/assumptions are of the messaging layer regarding message durability and reliability. I've seen documentation sc