Re: [ovs-dev] [ovn v2] ovn: Introduce ovn-controller.

Russell Bryant Sat, 11 Apr 2015 16:59:38 -0700

On 04/11/2015 02:59 AM, Justin Pettit wrote:
> 
>> On Apr 10, 2015, at 6:59 AM, Russell Bryant <rbry...@redhat.com> wrote:
>>
>> On 04/09/2015 03:53 AM, Justin Pettit wrote:
>>> +EXTRA_DIST += ovn/controller/ovn-controller.8.xml
>>
>> I like the use of the controller subdir.  I guess we should create
>> similar dirs for ovn-nbctl and ovn-nbd, even though they're both just 1
>> C file for now.  I can do it, perhaps after my pending patches against
>> those files go in.
> 
> Yeah, I was thinking we should especially do that for programs that
> require multiple source files.  ovn-nbd seems likely to need that.  I
> wonder if we should create a "utilities" directory for things like
> ovn-nbctl that only use a single file.  I imagine we'll start getting
> other utilities, too.


Sounds good to me.

>>> +    for (i = 0; i < bridge_rec->n_ports; i++) {
>>> +        const struct ovsrec_port *port_rec = bridge_rec->ports[i];
>>> +        const char *iface_id;
>>> +        int j;
>>> +
>>> +        if (!strcmp(port_rec->name, bridge_rec->name)) {
>>> +            continue;
>>> +        }
>>
>> I don't think I understand the above condition.  It's making the loop
>> ignore all ports that aren't named the same as the bridge?
>>
>> It looks like the condition should just be removed, but maybe I just
>> misunderstand it.
> 
> It should only be matching on ports that have the same name as the
> bridge.  In OVS, we always create a port with the same name as the
> bridge, but it would not be part of the logical network, so the code
> just skips it.

Thanks for the explanation!

> 
>>> +void
>>> +bindings_run(struct controller_ctx *ctx)
>>> +{
>>> +    const struct sbrec_bindings *bindings_rec;
>>> +    struct ovsdb_idl_txn *txn;
>>> +    struct sset lports;
>>> +    const char *name;
>>> +    int retval;
>>> +
>>> +    sset_init(&lports);
>>> +    get_local_iface_ids(ctx, &lports);
>>> +
>>> +    txn = ovsdb_idl_txn_create(ctx->ovnsb_idl);
>>> +    ovsdb_idl_txn_add_comment(txn,
>>> +                              "ovn-controller: updating bindings for '%s'",
>>> +                              ctx->chassis_name);
>>
>> This will probably end up with a lot of extra comments in ovsdb.  This
>> code is going to run a lot more often than when changes will actually be
>> put in a transaction.
> 
> Most of the transactions should have no effect on the database, which
> will return TXN_UNCHANGED.  The comment is associated with a
> transaction, so if the transaction gets dropped, there should be no
> entry in the database.

OK, great.  I wasn't sure if a transaction that effectively only had the
comment would result in anything happening or not.  Thanks for
explaining.  :-)

>>> +
>>> +            if (!strcmp(bindings_rec->chassis, ctx->chassis_name)) {
>>> +                continue;
>>> +            }
>>> +            if (bindings_rec->chassis[0]) {
>>> +                VLOG_INFO("Changing chassis for lport %s from %s to %s",
>>> +                          bindings_rec->logical_port, 
>>> bindings_rec->chassis,
>>> +                          ctx->chassis_name);
>>> +            }
>>> +            sbrec_bindings_set_chassis(bindings_rec, ctx->chassis_name);
>>> +        } else if (!strcmp(bindings_rec->chassis, ctx->chassis_name)) {
>>> +            sbrec_bindings_set_chassis(bindings_rec, "");
>>> +        }
>>> +    }
>>> +
>>> +    retval = ovsdb_idl_txn_commit_block(txn);
>>> +    if (retval == TXN_ERROR) {
>>> +        VLOG_INFO("Problem committing bindings information: %s",
>>> +                  ovsdb_idl_txn_status_to_string(retval));
>>> +    }
>>
>> It's worth tracking if a change was actually put into the transaction in
>> the loop above and skip trying to commit if no changes were needed.
>>
>> Also see the comments regarding transaction committing and error
>> handling that I put in chassis_run in chassis.c, as all of it applies
>> here too.
> 
> As I mentioned above, I don't think a transaction that has no effect
> on the database should actually be committed, so I don't think it's a
> problem.  However, let me know if I misunderstand what you're
> suggesting.

You understood it.  The above explanation makes it less valuable.  I
suppose tracking could save a few CPU cycles, because you could skip
calling txn_commit().  I don't know exactly how many it saves though
without digging into what txn_commit() would really be doing in that case.

>>> +void
>>> +chassis_init(struct controller_ctx *ctx)
>>> +{
>>> +    ovsdb_idl_add_table(ctx->ovs_idl, &ovsrec_table_open_vswitch);
>>> +    ovsdb_idl_add_column(ctx->ovs_idl, 
>>> &ovsrec_open_vswitch_col_external_ids);
>>
>> This is already done in ovn-controller.c.  Do you just want this not to
>> assume the core code has added anything?  I guess it's harmless and
>> indeed helpful for this code to be explicit about what it cares about.
> 
> Yeah, I was just going for explicit.  I just worry if someone changes
> how ovn-controller.c is written, we'll silently get incorrect
> behavior.

OK, makes good sense to me!

> 
>>> +}
>>> +
>>> +static void
>>> +register_chassis(struct controller_ctx *ctx,
>>> +                 const struct sbrec_chassis *chassis_rec,
>>> +                 const char *encap_type, const char *encap_ip)
>>> +{
>>> +    struct sbrec_encap *encap_rec;
>>> +    struct ovsdb_idl_txn *txn;
>>> +    int retval;
>>> +
>>> +    txn = ovsdb_idl_txn_create(ctx->ovnsb_idl);
>>> +    ovsdb_idl_txn_add_comment(txn,
>>> +                              "ovn-controller: registering chassis '%s'",
>>> +                              ctx->chassis_name);
>>> +
>>> +    if (!chassis_rec) {
>>> +        chassis_rec = sbrec_chassis_insert(txn);
>>> +        sbrec_chassis_set_name(chassis_rec, ctx->chassis_name);
>>> +    }
>>> +
>>> +    encap_rec = sbrec_encap_insert(txn);
>>> +
>>> +    sbrec_encap_set_type(encap_rec, encap_type);
>>> +    sbrec_encap_set_ip(encap_rec, encap_ip);
>>> +
>>> +    sbrec_chassis_set_encaps(chassis_rec, &encap_rec, 1);
>>> +
>>> +    retval = ovsdb_idl_txn_commit_block(txn);
>>> +    if (retval != TXN_SUCCESS && retval != TXN_UNCHANGED) {
>>> +        VLOG_INFO("Problem committing chassis information: %s",
>>> +                  ovsdb_idl_txn_status_to_string(retval));
>>> +    }
>>> +    ovsdb_idl_txn_destroy(txn);
>>
>> Some thoughts on the transaction handling here:
>>
>> Another return value that isn't really an error is TXN_TRY_AGAIN.  The
>> docs for TRY_AGAIN say it happens if a "verify" operation fails, and
>> none of those are being used here, so maybe it's never expected.
>>
>> There's also TXN_UNCOMMITTED, which sounds like txn_commit just needs to
>> be called again.
>>
>> If it *does* fail, there's no attempt to make this try again right away.
>> It's depending on the main loop to come back around after some changes
>> have been detected.  I suppose it's possible that could be a long delay
>> we don't really want to wait for.
> 
> Yeah, I agree that it should be handled better.  I've changed the
> logic so that it it will immediately wake if the commit doesn't fail.
> In that case, it just treat all the errors as a sign to try again.
> 
>> A bigger issue here to me though is that this is a blocking operation
>> done outside of the main loop.  I think it would be better to integrate
>> committing transactions into the main loop.  I tried to do this in ovn-nbd.
> 
> There's a trade-off here between creating too many transactions and
> too few.  I was leaning towards more transactions, since they can be
> annotated through comments which can aid debugging.  The argument
> against it is that it creates too many transactions that must be
> waded through and a lot of blocking.  My thinking on this is changing
> towards creating fewer transactions, but my preference is to get it
> in as it is now, and then reassess later.

I want to make sure I understand this.  Are you saying that by not
integrating committing transactions back into the main loop, you end up
with more transactions?  Further, you suggest we may want to rework
this, but you'd prefer to just get it in as-of now?

That's all fine with me.  I just want to make sure I'm following.

I'll try to rephrase my original concern another way.  In this code we
have 2 open connections.  I was thinking that it would be ideal that any
time we ever end up in poll(), we're actually polling on both of them.
By committing the transaction inside of bindings_run() or chassis_run()
like this, we end up doing a poll() on one connection
(txn_commit_block() -> poll_block()).  Input could have arrived on the
other while we're sleeping and we won't wake up to handle it.

I guess I don't expect it to cause any real problems in practice,
though.  I was thinking of it more in terms of an ideal pattern to be
following.

> 
>>> +}
>>> +
>>> +void
>>> +chassis_run(struct controller_ctx *ctx)
>>> +{
>>> +    const struct sbrec_chassis *chassis_rec;
>>> +    const struct ovsrec_open_vswitch *cfg;
>>> +    const char *encap_type, *encap_ip;
>>> +    static bool inited = false;
>>> +
>>> +    SBREC_CHASSIS_FOR_EACH(chassis_rec, ctx->ovnsb_idl) {
>>> +        if (!strcmp(chassis_rec->name, ctx->chassis_name)) {
>>> +            break;
>>> +        }
>>> +    }
>>
>> I noted in some comments on ovn-controller.c that this code is actually
>> getting called much more often than needed.  It's going to do this
>> chassis lookup every time OVN_Southbound changes (a lot), where we
>> really only want to run this if the stuff we're watching in the
>> Open_vSwitch db changes.
> 
> The IDL should prevent transactions that never commit from going to
> ovsdb-server, so this should essentially be a no-op in terms of
> taxing ovsdb-server.  That said, it's obviously best to not
> needlessly waste unnecessary CPU cycles, but in this case, I'm
> planning to add a future update that creates the local tunnels based
> on new additions to the Chassis table.

Ah, I didn't think about this updating the Open_vSwitch db, too.  I
agree that it negates the comment.

>>> +    struct unixctl_server *unixctl;
>>> +    struct controller_ctx ctx;
>>
>> Instead of the memset() further below, I tend to prefer something like:
>>
>>    struct controller_ctx ctx = {
>>        .chassis_name = NULL,
>>    };
> 
> I don't have a strong preference, so I went with your suggestion.

OK, it's really just a habit for me, and I'm not even sure it's
justified.  I have some vague memory of thinking it might provide a
better opportunity to optimize out needing to set memory to 0.  From a
style perspective, it keeps the initialization with the declaration,
which is kind of nice.

Anyway, I hope the minor style things like this aren't too annoying.  I
can do better to keep them to myself if so.  :-)

>>> +
>>> +    get_initial_snapshot(ctx.ovs_idl);
>>> +
>>> +    get_core_config(&ctx);
>>> +
>>> +    ctx.ovnsb_idl = ovsdb_idl_create(ovn_remote, &sbrec_idl_class, true, 
>>> true);
>>> +
>>> +    get_initial_snapshot(ctx.ovnsb_idl);
>>
>> There's a slight downside here in that get_initial_snapshot() blocks and
>> leaves ovs_idl unserviced in the meantime.  It's just during the initial
>> startup though, so I'm OK leaving it.
> 
> I don't understand the issue of leaving ovs_idl unserviced during startup. 

There's probably not a real enough issue to care.  It just means data
could be backing up in ovs_idl that we could be handling while just
sleeping in poll() waiting for data on ovnsb_idl.

>>> +
>>> +    exiting = false;
>>> +    while (!exiting) {
>>> +        ovsdb_idl_run(ctx.ovs_idl);
>>> +        ovsdb_idl_run(ctx.ovnsb_idl);
>>> +
>>> +        if (!ovsdb_idl_is_alive(ctx.ovnsb_idl)) {
>>> +            int retval = ovsdb_idl_get_last_error(ctx.ovnsb_idl);
>>> +            VLOG_ERR("%s: database connection failed (%s)",
>>> +                     ovn_remote, ovs_retval_to_string(retval));
>>> +            retval = EXIT_FAILURE;
>>> +            break;
>>> +        }
>>> +
>>> +        if (!ovsdb_idl_is_alive(ctx.ovs_idl)) {
>>> +            int retval = ovsdb_idl_get_last_error(ctx.ovs_idl);
>>> +            VLOG_ERR("%s: database connection failed (%s)",
>>> +                     ovs_remote, ovs_retval_to_string(retval));
>>> +            retval = EXIT_FAILURE;
>>> +            break;
>>> +        }
>>> +
>>> +        chassis_run(&ctx);
>>> +        bindings_run(&ctx);
>>> +        unixctl_server_run(unixctl);
>>> +
>>> +        unixctl_server_wait(unixctl);
>>> +        if (exiting) {
>>> +            poll_immediate_wake();
>>> +        }
>>> +
>>> +        ovsdb_idl_wait(ctx.ovs_idl);
>>> +        ovsdb_idl_wait(ctx.ovnsb_idl);
>>> +        poll_block();
>>> +    }
>>
>> It would be handy to track the seqno for each IDL so that we know which
>> (if any) db has changes on each loop iteration.  That could let
>> foo_run() optimize out some processing.
>>
>> For example, chassis_run() only cares about changes to the ovs db, and
>> we're only watching the minimal set of that db that we care about.  The
>> OVN_Southbound db is going to change a *LOT* more often.
> 
> I can see your point. but I don't think it's a huge problem, since
> chassis_run() will actually be monitoring all the Chassis tables in a
> near future patch to create all the various tunnels.  Let me know if
> you still think it's an issue knowing that's going to happen,
> though.

Makes sense.  Thanks for pointing that out.  That helps avoid making
this more complicated for no good reason.

>> I see ovs-vswitchd claims:
>>
>>  ovs_print_version(OFP10_VERSION, OFP10_VERSION);
>>
>> but perhaps that is just out of date?
>>
>> and ovs-ofctl says OFP10 through OFP13 and provides an option to let you
>> pick which version to use.
> 
> Yeah, I suspect that's out of date.  We should fix that up.

OK, thanks for clarifying.  I can see how it's the kind of thing to
easily forgotten about.

>>> +
>>> +        case '?':
>>> +            exit(EXIT_FAILURE);
>>
>> I see that this exists in ovs-vsctl, as well.  What's the reason for it?
>>
>> If '?' exists, I guess I'd expect it to call usage().
> 
> This is how getopt() indicates an unsupported option.  It then prints an 
> error to stderr.

Ah, perfect.  Thanks for the explanation! :-)

>>> +
>>> +        default:
>>> +            abort();
>>
>> Maybe log an error about an unuexpected option or something if this were
>> to happen?
> 
> As mentioned, an unexpected option should be caught in getopt(), so this 
> should be just an invalid condition.

Perfect, thanks again for explaining.

> I'll send out v3 in a minute.  There's an incremental at the end of this 
> message.
> 
> Thanks for the thorough review!

Thanks again for being welcoming of my participation!

-- 
Russell Bryant
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [ovn v2] ovn: Introduce ovn-controller.

Reply via email to