OK.  Makes sense.

It might be OK from an sFlow perspective if  the user-space lookup occasionally 
fails.    Just as long as it never retrieves the wrong action-list.  That would 
be a problem.

If you can look up the actions in user-space,  do you still need to generate 
that 64-bit packed-field?  It seems like it would be better if you can avoid 
doing that.

Neil


On Aug 11, 2011, at 11:39 AM, Ben Pfaff wrote:

> Neil, don't worry!  This really is a step forward and not a step back.
> The actions passed to the kernel and back to userspace are not really
> as helpful as you think, because they discard a lot of semantic
> information.  These actions do say *what* happened but they do not say
> *why*.  One off-the-cuff example of the difference is that, if packets
> in some stream or streams are being mirrored ("spanned") to a
> monitoring port separate from sFlow, then the kernel actions will just
> show two outputs and we'll just report to sFlow that there were two,
> whereas it might be a better idea to ignore the mirror output and
> report the "primary" output port since that will presumably be more
> useful for analysis at the sflow collector.
> 
> All that we were calculating from the actions before were the egress
> ports, VLAN IDs, and egress priority.  That's only 48 bits total.  The
> rest we can (and already do) recover from the flow data or the packet,
> which will still be available.
> 
> When (if?) we do need more than 64 bits, the way forward is more work,
> but perfectly doable:
> 
>       1. Instead of using the 64-bit data to store the sFlow
>          information itself, switch to using it to store a unique
>          ID.  (The unique ID could even be kernel-generated.)  Track
>          this along with the userspace data associated with each
>          flow.
> 
>       2. When an sflow message arrives from the kernel, look up the
>          userspace data for it based on the unique ID and use it to
>          formulate the sflow message.
> 
>       3. When a flow changes or gets deleted, keep a copy of the old
>          version around in userspace as long as there might still be
>          a sflow message coming along for it from the kernel.
> 
> Only step #3 is any trouble at all.  We might need to do this kind of
> thing anyway, for other reasons, if we want to have perfectly accurate
> flow stats in userspace.  That's because, currently, there is a (very
> brief) window of time, after the kernel deletes a flow from its flow
> table and sends the final statistics to userspace, in which packets
> can still arrive and be processed according to the flow's previous set
> of actions.  The same construct as #3 above is one possible solution.
> 
> On Thu, Aug 11, 2011 at 11:09:36AM -0700, Neil McKee wrote:
>> I think the current behavior where the whole actions-list is serialized and 
>> passed up with the sample is actually really elegant and future-proof.   New 
>> actions are being added all the time,  and having access to a sampled-stream 
>> of "exactly-what-is-happening-in-there" is always going to be good to have.  
>> Not just for sFlow reporting,  but for any analysis or optimization that you 
>> might want to apply.   There is a symmetry to this solution:  what you push 
>> down to the kernel is a list of actions,  and what you get back on the 
>> sampling feed is a packet and the list of actions that were applied to it.  
>> What's not to like about that?
>> 
>> Furthermore,  whereas your current sFlow implementation may only be 
>> reporting the ingress and egress ports, VLAN Ids,  and L2 priorities, plus 
>> the number of outbound interfaces (can you get all that into 64-bits?),  
>> that doesn't mean you will never want to report more than that.   Perhaps 
>> there is already something else that should be exported with the sFlow feed? 
>>   If so,  a new sFlow structure can be designed and it can go out with the 
>> sampled header as another annotation.  It's open-ended,  and adding new 
>> structures here will never require any further change to the kernel module.
>> 
>> In particular,  if an open-flow controller is receiving this sFlow feed so 
>> it can make network-wide optimization decisions in real time,  then it seems 
>> likely that you would want to export details about all the actions.   
>> Restricting the output might result in awkward blind-spots, which could make 
>> these optimization algorithms exponentially harder to write.   The HPLabs 
>> article referenced in this blog post is a good example of the sort of 
>> network-wide optimization I am referring to:
>> http://blog.sflow.com/2011/05/openflow-and-sflow.html
>> 
>> And a first draft to define new structures for reporting OpenFlow-related 
>> data over sFlow is here:
>> http://www.sflow.org/draft-sflow-openflow.txt
>> but given the rapid pace of OpenFlow development,  this probably needs to be 
>> reworked.
>> 
>> Neil
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> On Aug 9, 2011, at 9:42 PM, Ben Pfaff wrote:
>> 
>>> On Tue, Aug 09, 2011 at 10:46:38AM -0700, pravin shelar wrote:
>>>> This patch is first step toward generic/simple sampling
>>>> action which will be completely independent of sflow.
>>>> Following patch cleanup sampling upcall by striping it down to
>>>> minimum essential data that userspace can not figure out for
>>>> itself.
>>>> 
>>>> Signed-off by: Pravin Shelar <pshe...@nicira.com>
>>> 
>>> (That's "-by", not " by".)
>>> 
>>> I like the general approach here--the sflow support is a big wart--but I
>>> think that the userspace interface changes represent a regression.
>>> Before, by passing the actions down to userspace, there was always a
>>> consistent view of those actions.  With this change (as comments in the
>>> code make clear) the actions might have changed or the facet might even
>>> have been deleted.
>>> 
>>> A really good fix would probably require being able to somehow keep
>>> old copies of flows that have changed around for a while in userspace,
>>> link them to the sflow messages, and free them when they are known not
>>> to be needed anymore.  I think that's probably feasible, but in fact I
>>> think that we can avoid regression with something simpler.
>>> 
>>> Suppose we keep the "sample pool" netlink option that this patch
>>> removes, and add a way for userspace to set a small amount of user data
>>> on a flow, and pass that along as the "user data" netlink option.  64
>>> bits of user data should be plenty of space to store what the userspace
>>> sflow implementation actually needs (16 bits of final VLAN TCI plus 32
>>> bits for sFlow's idea of output port).  Plus, those same 64 bits would
>>> be more than enough for the sequence number that the "really good fix"
>>> mentioned in the previous paragraph would probably need.
>>> 
>>> With the current patch, the obvious place to put this user data would be
>>> in the flow itself.  Another approach (that you have probably already
>>> considered) is to make sFlow sampling a flow action, in which case the
>>> user data could be that action's argument.
>>> 
>>> Thanks,
>>> 
>>> Ben.
>>> _______________________________________________
>>> dev mailing list
>>> dev@openvswitch.org
>>> http://openvswitch.org/mailman/listinfo/dev
>> 

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to