[Openstack] A single cross-zone database?

2011-03-16 Thread Sandy Walsh
Hi y'all, getting any sleep before Feature Freeze? As you know, one of the main design tenets of OpenStack is Share Nothing (where possible). http://wiki.openstack.org/BasicDesignTenets That's the mantra we've been chanting with Zones. But it does cause a problem with a particular Use Case: "

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Justin Santa Barbara
Thanks for raising this Sandy: +1 on keeping separate DBs until a problem arises. I don't see a performance problem with recursively querying child zones. I guess this will partially depend on our zone topology: if the intent is to have child zones that are geographically distributed where the la

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Paul Voccio
Sandy, Not only is this expensive, but there is no way I can see at the moment to do pagination, which is what makes this really expensive. If someone asked for an entire list of all their instances and it was > 10,000 then I would think they're ok with waiting while that response is gathered a

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Justin Santa Barbara
Good point that pagination makes this harder. However, thankfully the limit is implemented using a token (the last ID seen), not an absolute offset, so I believe we can still do pagination even in loosely coordinated DBs. Good job whoever dodged that bullet (Jorge?) (Aside #1: Sorting by uptime
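A minimal sketch of the token-based pagination described above, assuming instance IDs are unique and comparable across zones (an assumption, not something the thread confirms). The `zones` mapping and the `list_page` function are hypothetical names used only for illustration; in a real deployment each zone would be queried over its own API rather than read from an in-memory list.

```python
import heapq
from typing import Dict, List, Optional


def list_page(zones: Dict[str, List[int]], limit: int,
              marker: Optional[int] = None) -> List[int]:
    """Return the next `limit` instance IDs after `marker`, merged across zones.

    `zones` maps a hypothetical zone name to the instance IDs it holds.
    Assumes IDs are globally unique and ordered (illustrative assumption).
    """
    per_zone = []
    for ids in zones.values():
        # Each zone answers independently: IDs after the marker, capped at limit.
        candidates = sorted(i for i in ids if marker is None or i > marker)
        per_zone.append(candidates[:limit])
    # Merge the sorted per-zone answers and keep only the first `limit` IDs.
    return list(heapq.merge(*per_zone))[:limit]


zones = {"zone-a": [1, 4, 7], "zone-b": [2, 5, 8], "zone-c": [3, 6, 9]}
first = list_page(zones, limit=4)
second = list_page(zones, limit=4, marker=first[-1])
```

Because the caller passes back the last ID seen rather than an absolute offset, no zone needs to know how many rows the other zones contributed to earlier pages.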

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Ed Leafe
On Mar 16, 2011, at 12:23 PM, Paul Voccio wrote: > Not only is this expensive, but there is no way I can see at the moment to do > pagination, which is what makes this really expensive. If someone asked for > an entire list of all their instances and it was > 10,000 then I would think > they're

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Paul Voccio
Ed, I would agree. The caching would go with the design tenet #7: Accept eventual consistency and use it where it is appropriate. If we're ok with accepting that the list may or may not always be up to date and feel it's appropriate, we should be good with the caching. pvo On 3/16/11 11:45 AM,

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Andrew Shafer
Global temporal consistency is a myth. If you decide not to cache and support pagination then querying every zone for every page is potentially as misleading as caching because what should be on each page could change for every request. +1 for cache with ttl On Wed, Mar 16, 2011 at 11:58 AM, Pa
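A "cache with ttl" can be sketched roughly as follows. This is an illustration under stated assumptions, not the design proposed in the thread: the `TTLCache` class, its `get_or_fetch` method, and the injectable `clock` parameter are all hypothetical names.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Hypothetical time-to-live cache: entries are reused until they expire."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised in tests
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            # Still fresh: serve the possibly stale value without asking zones.
            return hit[1]
        # Missing or expired: run the expensive query and remember when we did.
        value = fetch()
        self._entries[key] = (now, value)
        return value
```

The trade-off matches the eventual-consistency point made earlier in the thread: a result may be up to `ttl_seconds` old, in exchange for not querying every zone on every request.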

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Justin Santa Barbara
Can someone explain _why_ we need caching? With our approach to pagination, without caching, the answer is always correct: each query always returns the next {limit} values whose ID is >= {start-id}. I agree that in practice this means that there's no way to guarantee you get all values while the

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Glen Campbell
Instead of building data-specific caching (which always worries me), you could simply build the service to return the data directly, then add a "Cache-Control: max-age=NNN" header to the result. That way, users who wanted to improve their performance could add a squid layer (or other caching HTTP pr
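As a rough illustration of the header-based approach, the sketch below is a bare WSGI application that returns its data directly and attaches a `Cache-Control: max-age` header. The `instances_app` name, the empty JSON payload, and the 60-second value are hypothetical placeholders; the thread leaves the actual max-age as NNN.

```python
import json
from typing import Callable, Iterable

# Hypothetical freshness window; the message above leaves the value as NNN.
MAX_AGE_SECONDS = 60


def instances_app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """Return a placeholder instance list and let HTTP caches do the caching."""
    body = json.dumps({"instances": []}).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
        # Any caching proxy placed in front may reuse this response
        # for up to MAX_AGE_SECONDS before asking the service again.
        ("Cache-Control", f"max-age={MAX_AGE_SECONDS}"),
    ]
    start_response("200 OK", headers)
    return [body]
```

With this shape the service itself holds no cache state; operators who want caching add a proxy, and those who do not simply talk to the service directly.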

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Eric Day
We can handle pagination whether we have a single database, multiple databases with cache, or query each zone on each request. In the last case an instance would be identified with the zone it exists in (for example, the marker would be a fully qualified zone:instance name) and we can just pick up
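One way to picture a fully qualified `zone:instance` marker is the sketch below. It assumes zones are visited in a fixed (sorted) order and that instance names sort consistently within a zone; both are illustrative assumptions, and `parse_marker` and `next_page` are hypothetical helpers rather than anything from the Nova code.

```python
from typing import Dict, List, Optional, Tuple


def parse_marker(marker: str) -> Tuple[str, str]:
    """Split a hypothetical 'zone:instance' marker into its two parts."""
    zone, _, instance = marker.partition(":")
    if not zone or not instance:
        raise ValueError(f"expected 'zone:instance', got {marker!r}")
    return zone, instance


def next_page(zones: Dict[str, List[str]], limit: int,
              marker: Optional[str] = None) -> List[str]:
    """Return up to `limit` qualified names that come after `marker`."""
    if limit <= 0:
        return []
    start_zone, start_instance = parse_marker(marker) if marker else (None, None)
    page: List[str] = []
    for zone in sorted(zones):
        if start_zone is not None and zone < start_zone:
            continue  # this zone was fully consumed by earlier pages
        for name in sorted(zones[zone]):
            if zone == start_zone and name <= start_instance:
                continue  # already returned on an earlier page
            page.append(f"{zone}:{name}")
            if len(page) == limit:
                return page
    return page
```

Because the marker names the zone as well as the instance, a follow-up request can skip zones that earlier pages already exhausted instead of querying all of them again.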

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Eric Day
Justin, You don't need a cache, but large installations will probably want it for the usual reasons (improve response time, reduce load, etc). Since this is one of our primary use cases, we need to do it. Also, to do caching correctly, it's not just something you tack on. Efficient caching systems

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Ed Leafe
On Mar 16, 2011, at 2:24 PM, Justin Santa Barbara wrote: > Can someone explain _why_ we need caching? We don't *need* caching - it is simply the most direct way to avoid multiple expensive calls. > With our approach to pagination, without caching, the answer is always > correct: each q

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Eric Day
It's always a trade-off with generic vs context-specific caching. You can usually be more efficient and provide a better UX with context-specific caching, but it can be a bit more work. For Nova (and any distributed application along these lines) I think we need to use context-specific active cach

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Justin Santa Barbara
Inline... > Can someone explain _why_ we need caching? > > We don't *need* caching - it is simply the most direct way to avoid > multiple expensive calls. So if we don't need it...? You cite avoiding expensive calls, but I think it's entirely unproven that those calls are too expensive.

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Eric Day
On Wed, Mar 16, 2011 at 06:47:03PM +, Ed Leafe wrote: > > With our approach to pagination, without caching, the answer is always > > correct: each query always returns the next {limit} values whose ID is >= > > {start-id}. > > But for this example, you have to traverse *all* the zones

[Openstack] Ubuntu Cloud days, call for session

2011-03-16 Thread Ahmed Kamal
Hi everyone, From March 23rd 2011 to March 24th 2011 Ubuntu is hosting the very first Ubuntu Cloud Days. This is an event of IRC tutorials and sessions. It would be awesome if someone would volunteer to hold an openstack irc session. Sessions are usually not too hard to prepare, if you're con

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Justin Santa Barbara
I agree that we could have a better marker, but I'm just going off the spec at the moment. I've checked the agreed blueprint, and caching in zones is out of scope for Cactus. Please propose a discussion topic for the Design Summit. Justin On Wed, Mar 16, 2011 at 12:21 PM, Eric Day wrote:

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Ed Leafe
On Mar 16, 2011, at 3:39 PM, Justin Santa Barbara wrote: > I agree that we could have a better marker, but I'm just going off the spec > at the moment. > > I've checked the agreed blueprint, and caching in zones is out of scope for > Cactus. > > Please propose a discussion topic for the Design

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Justin Santa Barbara
Seems that the person writing the code (Sandy) wants _not_ to do a single DB initially. It sounds like there are back channels where a single DB is being pushed on Sandy. To me, it sounds like we have these choices: 1. We can have a zones implementation in Cactus. As specified in the blue

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Paul Voccio
For Cactus, I'm with Justin and Sandy on #1 to get something working. Justin — you said earlier that you're not sure this is going to be a problem. From experience, this is a problem with trying to query all the instances across zones for Rackspace now. Sandy and others including myself have tal

Re: [Openstack] A single cross-zone database?

2011-03-16 Thread Sandy Walsh
Yup, that's a fair assessment. That said, even without SDB and caching it's going to be tight for Cactus anyway. There are lots of little issues that are cropping up once I got down to 100'. -S From: Justin Santa Barbara <jus...@fathomdb.com> Sandy: Hav