On Wed, Jun 15, 2016 at 10:02:02AM -0700, Ben Pfaff wrote:
> On Fri, Jun 10, 2016 at 03:59:34PM -0700, Ben Pfaff wrote:
> > After talking to Justin, I think I'm going to take a few days (maybe
> > Wednesday through Friday next week) to hack on etcd related stuff, with
> > the goal being to come up with a detailed to-do list and try to verify
> > that the stuff that I think should work really will.  (For example, Andy
> > thinks that the gRPC implementation for C is incomplete, but neither of
> > us has really tried anything.)
> > 
> > I guess the first step for me personally is an x86-64 container or
> > chroot or something.
> 
> Here are the questions I started out with:
> 
> 1. Can I get etcd v3 working on an i386 laptop somehow, given etcd's
>    hostility to non-x86-64 architectures?
> 
> 2. Can I get a C program to talk to etcd v3, given:
> 
>    - etcd is layered on grpc, which does not have a full set of bindings
>      for C but instead infrastructure bindings on which the other
>      language bindings are layered.
> 
>    - etcd uses protobufs (as is typical) inside grpc, but protobufs do
>      not have official C bindings either.
> 
> 3. How should the OVN databases be arranged within etcd?  There are
>    multiple possibilities:
> 
>    - Define OVSDB bindings to etcd and implement those bindings in the
>      OVSDB client libraries (C and Python).
> 
>    - Define OVSDB bindings to etcd and implement those bindings in the
>      OVSDB server (so that ovsdb-server uses etcd as a storage layer).
> 
>    - Define a native etcd schema for OVN SB (and probably NB) database
>      and make ovn-controller and ovn-northd use it natively.
> 
> Here's what I've accomplished so far:
> 
> * Managed to get etcd v3 beta running in a docker container on my
>   laptop.  This solves the problem I personally have due to running i386
>   userspace on my laptop, because Docker containers run x86-64 userspace
>   internally regardless of whether the host uses i386 or x86-64
>   userspace.
> 
>   (This does not answer the larger question of how archs other than
>   x86-64 will use OVN.)
> 
>   I suspect this is not the same reason that the rest of the world is so
>   excited about containers, but it works for me.
> 
> * Managed to interact with etcd in the x86-64 container from the etcdctl
>   v3 command-line client running on my i386 host.  This does not seem to
>   have the same arch problem as running the etcd server, or at least the
>   client does useful work instead of immediately aborting with an error
>   due to an unsupported arch.
> 
> * Managed to build the grpc C libraries.
> 
> * Managed to use "nanopb" (https://github.com/nanopb/nanopb) to generate
>   C bindings for the Protocol Buffer definitions for the etcd v3
>   protocol.  This took a little bit of coaxing.  I have not yet tried to
>   compile or use what it generated.  (grpc itself uses nanopb
>   internally, which gives me hope that nanopb works OK.)
> 
> Here's what is left:
> 
> * Write a C test program that tries to tie together the grpc C library
>   and the nanopb-generated etcd bindings into something that can
>   actually talk to etcd, probably something comparable to a simplified
>   etcdctl in C.
> 
>   This has multiple steps, at least the following:
> 
>   - Figure out what's needed to tie together all the grpc C
>     infrastructure code into a C library that can actually implement a
>     grpc service.  I don't know where this falls on the spectrum from
>     trivial to troublesome.
> 
>   - Layer a library on top of the one from the previous bullet that can
>     send and receive etcd messages.
> 
>   - Layer a test program on top of the previous library.
> 
>   - Get grpc and nanopb-generated code and everything else to actually
>     link together.  This is always more of a challenge than it ought to
>     be.
> 
> * Figure out OVN database handling (question 3 above).
> 
> So far, I'm feeling fairly optimistic.

Here's a further update.

Yesterday, I wrote and successfully tested a C program that acts as a
gRPC client against the "hello world" example server included in the
gRPC distribution, using the grpc C infrastructure library.  The
program does not use a protobuf library to encode or decode the
request or reply embedded in the RPC messages; instead, it sends a
hard-coded byte array as the request, and it just hex-dumps the
received reply to the console.  Still, this convinces me that it is
practical to implement a gRPC client in C on top of Google's gRPC C
infrastructure library.
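
As an illustration of what such a hard-coded request can look like,
here is a minimal sketch in C.  It is not the test program described
above, and it does not call the gRPC library at all.  It assumes the
request is a protobuf message with a single string in field 1, as in
the HelloRequest message of the gRPC "hello world" example.  The
helper names encode_string_field1() and hex_dump() are made up for
this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Encodes a protobuf message that contains a single string in field 1
 * (wire type 2, length-delimited) into 'buf'.  Returns the number of
 * bytes written.
 *
 * Assumption for this sketch: the string is shorter than 128 bytes, so
 * its length fits in a one-byte varint. */
static size_t
encode_string_field1(const char *s, uint8_t *buf, size_t buf_size)
{
    size_t len = strlen(s);

    assert(len < 128 && len + 2 <= buf_size);

    buf[0] = (1 << 3) | 2;      /* Tag: field number 1, wire type 2. */
    buf[1] = (uint8_t) len;     /* Payload length as a one-byte varint. */
    memcpy(&buf[2], s, len);    /* The string bytes, without a NUL. */
    return len + 2;
}

/* Formats the 'n' bytes in 'data' as space-separated two-digit hex
 * into 'out', which has room for 'out_size' bytes, and returns 'out'.
 * Output is silently truncated if 'out' is too small. */
static char *
hex_dump(const uint8_t *data, size_t n, char *out, size_t out_size)
{
    size_t ofs = 0;

    assert(out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n && ofs + 4 <= out_size; i++) {
        /* Each step writes at most a space, two hex digits, and a NUL. */
        ofs += snprintf(out + ofs, out_size - ofs, "%s%02x",
                        i ? " " : "", data[i]);
    }
    return out;
}
```

For the string "world", the encoder yields the 7 bytes
0a 05 77 6f 72 6c 64, which is the sort of byte array that could be
hard-coded as a request body.  The same hex_dump() helper could be
pointed at whatever bytes come back in the reply.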

My tentative conclusion is that it is practical for OVN to use etcd.
This is short of concluding that OVN should use etcd, but it points in
the right direction.

Here's what's on my mind now.

First, etcd v3 is a prerelease beta, and it's based on a prerelease beta
of gRPC, which is based on a prerelease beta of protobufs.  None of
these prereleases are packaged for Debian or Red Hat, making them hard
to work with.  Raft-based HA for OVSDB is even less advanced since the
code for it is incomplete.

Second, we are extremely time-constrained at this point in the release
cycle.  Our stated intention is to branch in July.  I've been delayed
so much on the Raft implementation by various things (perhaps chief
among them the etcd proof-of-concept!) that I no longer have any
confidence that it can realistically be ready for July.  Switching to
etcd is a project of the same order of magnitude as the Raft
implementation, so I don't think that it is a good idea to undertake
it at this time either.

Third, in the OVN meeting yesterday in #openvswitch, a couple of
attendees said that it would be acceptable for OVN to, at least
initially, have only active-passive HA rather than active-active.
Coincidentally, HPE has posted patches that add support for
active-passive HA, e.g. see this patch from March:
        https://patchwork.ozlabs.org/patch/603152/

My proposal is this:

1. Postpone the OVN etcd vs. Raft decision to the next release cycle, in
   6 months.  This will give etcd and its underlying infrastructure time
   to mature or, alternatively, give us proper time to implement Raft in
   OVSDB.

2. Prioritize getting HPE's OVSDB replication patches into this current
   cycle.  I do not think I've seen a revision of the patches since
   March, so the first step would be to ask for a new version.  Failing
   that, I'll take a look at revising them myself.

Comments?
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
