Re: [ovs-dev] [PATCH v2] Simplify kernel sFlow implementation

Jesse Gross Thu, 18 Aug 2011 04:52:36 -0700

On Aug 18, 2011 7:34 AM, "Pravin Shelar" <pshe...@nicira.com> wrote:
>
> On Wed, Aug 17, 2011 at 3:04 PM, Neil McKee <neil.mc...@inmon.com> wrote:
> >
> > On Aug 17, 2011, at 2:30 PM, Ben Pfaff wrote:
> >
> >> On Wed, Aug 17, 2011 at 02:18:52PM -0700, Neil McKee wrote:
> >>>
> >>> On Aug 17, 2011, at 12:25 PM, Ben Pfaff wrote:
> >>>
> >>>> On Wed, Aug 17, 2011 at 11:51:05AM -0700, Pravin Shelar wrote:
> >>>>> On Tue, Aug 16, 2011 at 11:17 PM, Jesse Gross <je...@nicira.com>
wrote:
> >>>>>> The use of the checksum for actions surprised me a little bit, as
it
> >>>>>> is semantically equivalent to what we have today but perhaps not as
> >>>>>> accurate. ?Ben made a couple of good suggestions in the previous
> >>>>>> thread about how to do this cleanly and generically. ?Did you run
into
> >>>>>> problems with those?
> >>>>>
> >>>>> AFAIU, Ben was suggesting to have cookie set from userspace. that
> >>>>> cookie can be returned on upcall. so that userspace can validate
flow
> >>>>> used in kernel datapath. Other part was about keeping flows which
are
> >>>>> deleted and updated. so that sflow can lookup those flows if
required.
> >>>>> I thought cookie can be generated in kernel using checksum and
return
> >>>>> it to userspace. In this way we do not need to change user-kernel
> >>>>> interface when sflow need extra information as it has exact flow
that
> >>>>> kernel used.
> >>>>> Actually it is mentioned in previous mail thread.
> >>>>
> >>>> I was proposing two steps.  In the first step, userspace would set a
> >>>> cookie that directly encodes information needed for sflow output.
> >>>
> >>> And you have to do that every time, regardless of whether any
> >>> packets from that flow end up being sampled?
> >>
> >> It's very cheap, so that's hardly worth worrying about.
> >>
> >>>> In
> >>>> the second step, userspace would set a cookie that uniquely
identifies
> >>>> a version of a flow.  The second step is harder because userspace has
> >>>> to keep around old versions of a flow for a while (the hardest part
is
> >>>> figuring out when they can be discarded, I think).
> >>>>
> >>>> A checksum ties the set of actions to a flow.  That's all we need for
> >>>> sflow, although it means that we either need to keep around old
> >>>> versions of a flow, as in the second step above, or just discard
> >>>> sampled packets for old versions (I think that your patch does the
> >>>> latter).
> >>>
> >>> Better to send the sample out with "unknown" egress port/vlan than
> >>> to just drop it.  Otherwise you are likely to systematically drop
> >>> samples from flows that are short-lived, and the results will be
> >>> biased.  (You do still know the ingress-port, right?)
> >>
> >> Thanks for the advice.  I had not realized that sending a sample with
> >> unknown egress data was an option.  Yes, we still know the ingress
> >> port.
> >>
> >> Our flows only expire through timeout (after about 5 seconds of
> >> inactivity), so I don't think that this would systematically bias
> >> against such flows.
> >>
> >>>> It is not as general a solution as a unique identifier,
> >>>> because when a flow's actions change from A to B and then back to A
> >>>> there is no way to distinguish whether a sampled packet corresponds
to
> >>>> the first or second time that A was set.  (That doesn't matter for
> >>>> sflow.)  That's a corner case; I don't know if it's important.  And,
> >>>> of course, the IP checksum only probabilistically tells us whether
> >>>> there was a change.
> >>>
> >>> This may be stating the obvious, but it's important that you never
> >>> send out a sample with the *wrong* egress port/vlan.  That could
> >>> break all kinds of things (topology discovery, end-host location,
> >>> policy violations etc.)
> >>
> >> Yes, I understand.
> >>
> >>> Perhaps we could know more about what you are trying to achieve by
> >>> changing this?  Every suggestion seems to involve more complexity,
> >>> more overhead or less accuracy.  What's the upside?
> >>
> >> We're trying to minimize the amount of code and complexity in the
> >> kernel.
> >
> > In the datapath directory I can see a total of about 30 lines of code
that is related to sampling.   The code to include the actions in the upcall
seems to be just this:
> >
> >        upcall.actions = acts->actions;
> >        upcall.actions_len = acts->actions_len;
> >
> > how many lines of user-space code are you willing to write to reduce
this?  I must be missing something.   What am I missing?
>
> Final goal is  generalizing sampling action. so that we can assign
> probability to any action.
> so sampling would be userspace action with a probability. thats why we
> need to decided generic interface.


To elaborate a little more on the background:

We're currently preparing to submit the kernel portion of Open vSwitch into
upstream Linux. Once we do that it will be much more difficult to make
changes that require coordination between userspace and kernel because of
compatibility concerns. Therefore, the goal is to implement as much as
possible in userspace and have only the simplest, most generic primatives in
the kernel to support that, particularly in the userspace interface. It's
also a good idea in any case to minimize kernel code.

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Re: [ovs-dev] [PATCH v2] Simplify kernel sFlow implementation

Reply via email to