I forgot that I needed to respond here.  Sorry, hope late is better
than never.

On Tue, Apr 09, 2013 at 05:40:44PM -0700, Ethan Jackson wrote:
> > The BFD timers are 32-bit counts of microseconds.  The timers in
> > struct bfd are 64-bit and appear to be milliseconds, but I'm not
> > certain that all of them have the same unit.  Comments on units would
> > be welcome.  Do they need to be 64-bit?
> 
> All the timers are in milliseconds, and converted to microseconds when put 
> into
> the packet.  Typically we've used long long int for time, so I've stayed with
> that convention.  Any reason to change it?

No, that's fine, but I found it hard to infer the units other than
from the type.  Could some comments mention that they are msecs?

> > The interface for bfd_configure() is unusual.  It does seem to be
> > appropriate.
> 
> Yeah it's a bit odd.  I think it's better than sprinkling a little bit of bfd
> config code everywhere like we do with CFM.  I'm happy to change it if you
> prefer however.

It's fine.

> > This might be a personal problem, but I find the following:
> >> +        udp_src++;
> >> +        udp_src = udp_src < 49152 ? 49152 : udp_src;
> >> +        bfd->udp_src = udp_src;
> > a little harder to verify as correct than:
> >         bfd->udp_src = (udp_src++ & 16383) + 49152;
> 
> I assume you mean modulo here?

(x & 16383) == (x % 16384) for nonnegative integer x.  If you like %
better then by all means use x % 16384.

> > In bfd_put_packet(), it might be wise to start out with
> > ofpbuf_reserve(p, 2); to properly align what comes afterward.
> 
> I'm happy to do it, but I'm curious why?  Performance?  Some strange
> architecture will be upset?

Just about any RISC architecture gives a bus error if you try to
access misaligned data.

> > It looks like bfd_process_packet() expects the caller to have pulled
> > off everything except the L7 payload.  It would be nice to put this in
> > a comment.
> 
> Perhaps I'm confused.  Based on my reading of the code, we don't expect the
> headers to have been removed.  Instead, we simply require the l7 pointer to
> have been properly populated.

I must have misread the code.   I take it back.

> > I don't see a check for this requirement from 6.8.6 (I think it only
> > requires comparing p->size to msg->length):
> 
> I think we implicitly make this check by using ofpbuf_at() which will return
> NULL if there aren't at least BFD_PACKET_LEN bytes.  

I agree that we know there are at least BFD_PACKET_LEN bytes.  We
don't know that there are at least msg->length bytes, since
msg->length might be greater than BFD_PACKET_LEN.
> 
> > The pseudocode at the end of 6.8.6 only mentions updating
> > bfd.LocalDiag in a couple of cases, but the implementation appears to
> > update LocalDiag in every case where it changes the session state.
> 
> I found the RFC a bit confusing in this respect.  For example,
> suppose you follow their instructions with respect to the diagnostic
> strictly.  Then if the remote neighbor goes down, you will set
> DIAG_RMT_DOWN, and have no way of going back to DIAG_NONE when the
> session comes back up.  It isn't done upon reception of BFD packet
> in either the DOWN or INIT states.  This didn't seem correct, to me,
> so I choose to interpret the spec as setting the diagnostic to none
> if not otherwise specified.  Does that sound reasonable?

LocalDiag is described as:

      The diagnostic code specifying the reason for the most recent
      change in the local session state.  This MUST be initialized to
      zero (No Diagnostic).

I assumed, without carefully thinking it through, that this was sort
of like errno, in that it can get set to some kind of error but
nothing ever sets it back to 0, so that you can always use it to find
out what the most recent error was, even after recovery.  I don't know
whether that is the intent, but if it is then it would explain why it
is never set back to "No Diagnostic".

> > I don't see anything in bfd_process_packet() that corresponds to:
> >
> >       If the packet was not discarded, it has been received for purposes
> >       of the Detection Time expiration rules in section 6.8.4.
> >
> > but maybe I just missed it.
> 
> I interpreted this line as satisfying the requirement:
> 
>     bfd->detect_time = bfd_rx_interval(bfd) * bfd->mult + time_msec();

Oh, then I think it's in the wrong place.  For example, this:

      If bfd.SessionState is AdminDown

          Discard the packet

is implemented as:

    if (bfd->state == STATE_ADMIN_DOWN) {
        VLOG_DBG_RL(&rl, "Administratively down, dropping control message.");
        return;
    }

but

     bfd->detect_time = bfd_rx_interval(bfd) * bfd->mult + time_msec();

happens before that, so that the detection time is reset even if the
packet was discarded.

> > The following loses a bit of information by truncating to 0.  Did you
> > consider a format like "now%+lld ms" so that you get output like
> > "now+5 ms" or "now-10 ms"?
> >> +    ds_put_format(ds, "\tDetect Time: %lldms from now\n",
> >> +                  msec_diff(time_msec(), bfd->detect_time));
> >> +    ds_put_format(ds, "\tNext TX Time: %lldms from now\n",
> >> +                  msec_diff(time_msec(), bfd->next_tx));
> >> +    ds_put_format(ds, "\tLast TX Time: %lldms ago\n",
> >> +                  msec_diff(bfd->last_tx, time_msec()));
> 
> Just to make sure I follow.  The advantage of doing this is that
> when we truncate to 0, we know whether we were slightly above, or
> slightly below zero?  I don't think it matters that much, but I'm
> happy to change.

I guess I can foresee having timers that we think should be in the
future actually being in the past, and vice versa, and by giving the
actual time instead of just "0 from now" or "0 ago" then debugging
might be easier.  But if you do not think that this is likely, then
the format you have already chosen is more user-friendly.

> > ofproto-dpif.c
> > --------------
> >
> > Why do we now need to call time_refresh() in run_fast()?
> 
> I was thinking that we'd want to assure that the results of time_msec() are
> accurate when jumping into the CFM/BFD code.  However, Since we've increased
> the accuracy of the cached time, It's probably not worth it.  At any rate, it
> should be in a different patch.

OK.

Thanks,

Ben.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to