Re: [ovs-dev] RFC: OVN database options

2016-03-11 Thread Ben Pfaff
On Fri, Mar 11, 2016 at 05:49:07PM +0100, Ivan Kelly wrote: > >> Well, if you do the log tailing thing I suggested, then the client > >> will have access to a consistent snapshot, since they would only read > >> from the database directly once, and all client updates after that > >> would come from

Re: [ovs-dev] RFC: OVN database options

2016-03-11 Thread Ben Pfaff
On Fri, Mar 11, 2016 at 12:13:25PM -0500, Mike Bayer wrote: > On 03/10/2016 06:50 PM, Ben Pfaff wrote: > > > >I've been a fan of Postgres since I used it in the 1990s for a web-based > >application. It didn't occur to me that it was appropriate here. > >Julien, thanks so much for joining the discussi

Re: [ovs-dev] RFC: OVN database options

2016-03-11 Thread Mike Bayer
On 03/10/2016 06:50 PM, Ben Pfaff wrote: I've been a fan of Postgres since I used it in the 1990s for a web-based application. It didn't occur to me that it was appropriate here. Julien, thanks so much for joining the discussion. So yes, it has everything OVN needs. It can push notifications t

Re: [ovs-dev] RFC: OVN database options

2016-03-11 Thread Ivan Kelly
>> Well, if you do the log tailing thing I suggested, then the client >> will have access to a consistent snapshot, since they would only read >> from the database directly once, and all client updates after that >> would come from the log which arrive in a well defined order. > > OK. > > I'm conce
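The log-tailing approach quoted above (read the database once, then take every later update from an ordered log) can be sketched roughly as follows. This is an illustrative in-memory model only: `LogEntry`, `Replica`, and the `seq` field are hypothetical names, not part of any ZooKeeper or OVSDB API, and the sketch assumes the log assigns consecutive sequence numbers.

```python
from dataclasses import dataclass, field


@dataclass
class LogEntry:
    seq: int        # position in the totally ordered log (assumed consecutive)
    key: str
    value: object   # None is used here to mean "delete this key"


@dataclass
class Replica:
    data: dict = field(default_factory=dict)
    applied_seq: int = 0

    @classmethod
    def from_snapshot(cls, snapshot, snapshot_seq):
        # The only direct read of the database: a snapshot plus the log
        # position it corresponds to.
        return cls(dict(snapshot), snapshot_seq)

    def apply(self, entry):
        if entry.seq <= self.applied_seq:
            return False  # already reflected in the snapshot; skip
        if entry.seq != self.applied_seq + 1:
            raise ValueError("gap in log; replica must resync")
        if entry.value is None:
            self.data.pop(entry.key, None)
        else:
            self.data[entry.key] = entry.value
        self.applied_seq = entry.seq
        return True


replica = Replica.from_snapshot({"port1": "up"}, snapshot_seq=3)
log = [
    LogEntry(3, "port1", "up"),   # overlaps the snapshot, ignored
    LogEntry(4, "port2", "up"),
    LogEntry(5, "port1", None),
]
for entry in log:
    replica.apply(entry)
```

Because every client applies the same entries in the same order, each replica passes through the same sequence of states, which is what makes the view consistent in this model.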

Re: [ovs-dev] RFC: OVN database options

2016-03-11 Thread Ben Pfaff
On Fri, Mar 11, 2016 at 05:26:06PM +0100, Ivan Kelly wrote: > On Fri, Mar 11, 2016 at 5:20 PM, Ben Pfaff wrote: > > On Fri, Mar 11, 2016 at 05:10:15PM +0100, Ivan Kelly wrote: > >> > Just to make sure, does this mean that a Zookeeper client cannot read a > >> > consistent snapshot of the entire d

Re: [ovs-dev] RFC: OVN database options

2016-03-11 Thread Ivan Kelly
On Fri, Mar 11, 2016 at 5:20 PM, Ben Pfaff wrote: > On Fri, Mar 11, 2016 at 05:10:15PM +0100, Ivan Kelly wrote: >> > Just to make sure, does this mean that a Zookeeper client cannot read a >> > consistent snapshot of the entire database? >> Yes, exactly. It can only read one node at a time, so wr

Re: [ovs-dev] RFC: OVN database options

2016-03-11 Thread Ben Pfaff
On Fri, Mar 11, 2016 at 05:10:15PM +0100, Ivan Kelly wrote: > > Just to make sure, does this mean that a Zookeeper client cannot read a > > consistent snapshot of the entire database? > Yes, exactly. It can only read one node at a time, so writes can occur > between the reading of two nodes. OK.

Re: [ovs-dev] RFC: OVN database options

2016-03-11 Thread Ivan Kelly
> Just to make sure, does this mean that a Zookeeper client cannot read a > consistent snapshot of the entire database? Yes, exactly. It can only read one node at a time, so writes can occur between the reading of two nodes. -Ivan
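The problem described in this exchange, where a write lands between two single-node reads, can be illustrated with a small deterministic simulation. `NodeStore` is a hypothetical stand-in for a store that only offers per-node reads; it is not a ZooKeeper client, and the interleaving is written out by hand rather than produced by real concurrency.

```python
class NodeStore:
    """Toy store that, like the model discussed, reads one node per call."""

    def __init__(self, nodes):
        self._nodes = dict(nodes)

    def get(self, path):
        return self._nodes[path]

    def set(self, path, value):
        self._nodes[path] = value


# The application's invariant is that /a and /b always hold the same value.
store = NodeStore({"/a": 1, "/b": 1})

seen_a = store.get("/a")   # reader: first read
store.set("/a", 2)         # writer: updates both nodes between the reads
store.set("/b", 2)
seen_b = store.get("/b")   # reader: second read

# The reader has combined the old /a with the new /b, a state that never
# existed in the store as a whole.
torn_view = (seen_a, seen_b)
```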

Re: [ovs-dev] RFC: OVN database options

2016-03-11 Thread Ben Pfaff
On Fri, Mar 11, 2016 at 09:58:18AM +0100, Ivan Kelly wrote: > > Zookeeper transactions can be isolated depending on what level of > > isolation you need. > > A setData on a node operation can contain a version, so that it fails > > if that node has changed since the version. This means with a multi

Re: [ovs-dev] RFC: OVN database options

2016-03-11 Thread Julien Danjou
On Thu, Mar 10 2016, Russell Bryant wrote: > Specific to the OVN+OpenStack use case, I imagine a frequent question would > be, "why do I have to use MariaDB+Galera AND PostgreSQL in the same > environment?!" I suppose OpenStack works with PostgreSQL, too, and it's > just a deployment choice that

Re: [ovs-dev] RFC: OVN database options

2016-03-11 Thread Ivan Kelly
> Zookeeper transactions can be isolated depending on what level of > isolation you need. > A setData on a node operation can contain a version, so that it fails > if that node has changed since the version. This means with a multi[1] > of setData operations, you can effectively get a snapshot isol
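The versioned `setData` plus `multi` idea described above amounts to optimistic concurrency control: a batch of writes commits only if every node still has the version the writer read. The following is a minimal in-memory sketch of that technique; `VersionedStore`, `multi_set`, and `BadVersion` are hypothetical names that only model the behaviour described in the message, not the real ZooKeeper client API.

```python
class BadVersion(Exception):
    """Raised when a node changed since the version the writer read."""


class VersionedStore:
    def __init__(self):
        self._nodes = {}  # path -> (value, version)

    def create(self, path, value):
        self._nodes[path] = (value, 0)

    def get(self, path):
        return self._nodes[path]

    def multi_set(self, ops):
        """Apply all (path, new_value, expected_version) ops or none."""
        # Check phase: reject the whole batch if any version is stale.
        for path, _, expected in ops:
            if self._nodes[path][1] != expected:
                raise BadVersion(path)
        # Write phase: every check passed, so apply and bump versions.
        for path, value, expected in ops:
            self._nodes[path] = (value, expected + 1)


store = VersionedStore()
store.create("/a", "x")
store.create("/b", "y")

# Writer 1 reads both nodes and remembers their versions.
_, version_a = store.get("/a")
_, version_b = store.get("/b")

# Writer 2 commits a change to /a first, bumping its version to 1.
store.multi_set([("/a", "x2", version_a)])

# Writer 1's batch now fails as a unit, so /b is left untouched.
try:
    store.multi_set([("/a", "x3", version_a), ("/b", "y3", version_b)])
    committed = True
except BadVersion:
    committed = False
```

A writer whose batch is rejected would normally re-read the nodes and retry, which is the usual cost of this kind of optimistic scheme.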

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Han Zhou
On Wed, Mar 9, 2016 at 11:11 PM, Ben Pfaff wrote: > > Beyond supporting this usage model, the basic requirements for the OVN > use case are: > > - Size: 20 MB to 100 MB of data (estimated database size to hold > data for our target scale of 1,000 hypervisors and 20,000 > logical po

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Ben Pfaff
On Thu, Mar 10, 2016 at 07:11:55PM +0100, Ivan Kelly wrote: > > - Zookeeper. The ZK model is similar to etcd so it may have > > similar issues. Also, ZK makes the transaction log available, > > for use by observer nodes to scale out reads, and this may be > > another way for

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Ben Pfaff
On Thu, Mar 10, 2016 at 12:52:43PM -0500, Russell Bryant wrote: > On Thu, Mar 10, 2016 at 2:11 AM, Ben Pfaff wrote: > > > Database txn ACID consist trk HA OSC Py format > > - --- --- --- --- --- --- --- -- > > ActorDB yes ACID strong

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Ben Pfaff
On Thu, Mar 10, 2016 at 01:14:41PM -0600, Ryan Moats wrote: > Do we want to be *in* the DB business? I think the answer is no, > which means we should be doing the work to *not* be in the DB business > - refactoring the IDL to allow different DBs to be attached while > ensuring that the requiremen

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Ryan Moats
"dev" wrote on 03/10/2016 01:11:09 AM: > From: Ben Pfaff > To: dev@openvswitch.org > Date: 03/10/2016 01:31 AM > Subject: [ovs-dev] RFC: OVN database options > Sent by: "dev" > > Requirements > > > OVN uses two databases, t

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Ben Pfaff
On Fri, Mar 11, 2016 at 12:55:54AM +0900, Dan Mihai Dumitriu wrote: > The NB DB does need HA and ACID transactions, but it has few clients, so > it's probably not a very hard problem - could even use BDB with log > shipping - > http://www.oracle.com/technetwork/database/database-technologies/berkel

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Ben Pfaff
On Fri, Mar 11, 2016 at 12:43:19AM +0900, Dan Mihai Dumitriu wrote: > Another thing to add to the protocol would be per table versioning, so that > when a client gets disconnected, if it happens to reconnect to another > server in the cluster, it can exchange table versions and resync, coming up >
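The per-table versioning suggestion quoted above could work along these lines: on reconnect, the client and the new server compare version numbers per table, and only tables whose versions differ are resynchronized. This is a sketch under assumptions; `tables_to_resync` and the version numbers are hypothetical, and the table names are used purely for illustration.

```python
def tables_to_resync(client_versions, server_versions):
    """Return the tables whose client copy does not match the server's.

    Both arguments map table name -> version number. A table the client
    has never seen counts as stale.
    """
    stale = []
    for table, server_version in server_versions.items():
        if client_versions.get(table) != server_version:
            stale.append(table)
    return sorted(stale)


# Versions the client held when it was disconnected (illustrative values).
client = {"Port_Binding": 41, "Chassis": 7}
# Versions reported by the server it reconnected to.
server = {"Port_Binding": 42, "Chassis": 7, "Datapath_Binding": 3}

stale_tables = tables_to_resync(client, server)
```

Only the stale tables need to be fetched again, so a client that missed few changes comes back up without re-reading the whole database.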

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Ben Pfaff
On Thu, Mar 10, 2016 at 04:15:13PM +0200, Liran Schour wrote: > I'd like to raise the following issues for discussion: > > 1. That the client side is abstracted from the specific choice of > server-side database by using a db-abstraction layer on the client side. > We already have some kind of a

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Ivan Kelly
> - Zookeeper. The ZK model is similar to etcd so it may have > similar issues. Also, ZK makes the transaction log available, > for use by observer nodes to scale out reads, and this may be > another way for the clients to track table changes. Not quite. It does make the log

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Ben Pfaff
On Thu, Mar 10, 2016 at 11:26:29AM +0100, Ivan Kelly wrote: > > - Zookeeper. The issues here are similar to those for etcd. > > Also, Zookeeper transactions don't seem to be isolated. > Zookeeper transactions can be isolated depending on what level of > isolation you need. > A setData on

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Russell Bryant
On Thu, Mar 10, 2016 at 2:11 AM, Ben Pfaff wrote: > Database txn ACID consist trk HA OSC Py format > - --- --- --- --- --- --- --- -- > ActorDB yes ACID strong NO yes yes yes yes sql > Aerospike yes ACID strong NO

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Dan Mihai Dumitriu
On Fri, Mar 11, 2016 at 12:55 AM, Dan Mihai Dumitriu wrote: > Great writeup Ben. > > The NB DB does need HA and ACID transactions, but it has few clients, so > it's probably not a very hard problem - could even use BDB with log > shipping - > http://www.oracle.com/technetwork/database/database-te

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Dan Mihai Dumitriu
Great writeup Ben. The NB DB does need HA and ACID transactions, but it has few clients, so it's probably not a very hard problem - could even use BDB with log shipping - http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index-085366.html . However, one more pot

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Dan Mihai Dumitriu
These are great points Liran. These points are also very closely related to one another. I agree that the SB DB could be entirely in memory - of course, for high availability it should be replicated. As a bonus, replication of an in-memory data structure is easier than of a durable data

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Liran Schour
I'd like to raise the following issues for discussion: 1. That the client side is abstracted from the specific choice of server-side database by using a db-abstraction layer on the client side. We already have some kind of an abstraction layer in the code: ovsdb-idl. Maybe we can start from the

Re: [ovs-dev] RFC: OVN database options

2016-03-10 Thread Ivan Kelly
> - Zookeeper. The issues here are similar to those for etcd. > Also, Zookeeper transactions don't seem to be isolated. Zookeeper transactions can be isolated depending on what level of isolation you need. A setData on a node operation can contain a version, so that it fails if that node

[ovs-dev] RFC: OVN database options

2016-03-09 Thread Ben Pfaff
Requirements OVN uses two databases, the "northbound" and "southbound" databases, in a somewhat idiosyncratic manner. Each client of one of these databases maintains an in-memory replica of the database (or some subset of it), and the server sends it updates to this replica as they a