OK. Makes sense. It might be OK from an sFlow perspective if the user-space lookup occasionally fails. Just as long as it never retrieves the wrong action-list. That would be a problem.
If you can look up the actions in user-space, do you still need to generate that 64-bit packed-field? It seems like it would be better if you can avoid doing that. Neil On Aug 11, 2011, at 11:39 AM, Ben Pfaff wrote: > Neil, don't worry! This really is a step forward and not a step back. > The actions passed to the kernel and back to userspace are not really > as helpful as you think, because they discard a lot of semantic > information. These actions do say *what* happened but they do not say > *why*. One off-the-cuff example of the difference is that, if packets > in some stream or streams are being mirrored ("spanned") to a > monitoring port separate from sFlow, then the kernel actions will just > show two outputs and we'll just report to sFlow that there were two, > whereas it might be a better idea to ignore the mirror output and > report the "primary" output port since that will presumably be more > useful for analysis at the sflow collector. > > All that we were calculating from the actions before were the egress > ports, VLAN IDs, and egress priority. That's only 48 bits total. The > rest we can (and already do) recover from the flow data or the packet, > which will still be available. > > When (if?) we do need more than 64 bits, the way forward is more work, > but perfectly doable: > > 1. Instead of using the 64-bit data to store the sFlow > information itself, switch to using it to store a unique > ID. (The unique ID could even be kernel-generated.) Track > this along with the userspace data associated with each > flow. > > 2. When an sflow message arrives from the kernel, look up the > userspace data for it based on the unique ID and use it to > formulate the sflow message. > > 3. When a flow changes or gets deleted, keep a copy of the old > version around in userspace as long as there might still be > a sflow message coming along for it from the kernel. > > Only step #3 is any trouble at all. We might need to do this kind of > thing anyway, for other reasons, if we want to have perfectly accurate > flow stats in userspace. That's because, currently, there is a (very > brief) window of time, after the kernel deletes a flow from its flow > table and sends the final statistics to userspace, in which packets > can still arrive and be processed according to the flow's previous set > of actions. The same construct as #3 above is one possible solution. > > On Thu, Aug 11, 2011 at 11:09:36AM -0700, Neil McKee wrote: >> I think the current behavior where the whole actions-list is serialized and >> passed up with the sample is actually really elegant and future-proof. New >> actions are being added all the time, and having access to a sampled-stream >> of "exactly-what-is-happening-in-there" is always going to be good to have. >> Not just for sFlow reporting, but for any analysis or optimization that you >> might want to apply. There is a symmetry to this solution: what you push >> down to the kernel is a list of actions, and what you get back on the >> sampling feed is a packet and the list of actions that were applied to it. >> What's not to like about that? >> >> Furthermore, whereas your current sFlow implementation may only be >> reporting the ingress and egress ports, VLAN Ids, and L2 priorities, plus >> the number of outbound interfaces (can you get all that into 64-bits?), >> that doesn't mean you will never want to report more than that. Perhaps >> there is already something else that should be exported with the sFlow feed? >> If so, a new sFlow structure can be designed and it can go out with the >> sampled header as another annotation. It's open-ended, and adding new >> structures here will never require any further change to the kernel module. >> >> In particular, if an open-flow controller is receiving this sFlow feed so >> it can make network-wide optimization decisions in real time, then it seems >> likely that you would want to export details about all the actions. >> Restricting the output might result in awkward blind-spots, which could make >> these optimization algorithms exponentially harder to write. The HPLabs >> article referenced in this blog post is a good example of the sort of >> network-wide optimization I am referring to: >> http://blog.sflow.com/2011/05/openflow-and-sflow.html >> >> And a first draft to define new structures for reporting OpenFlow-related >> data over sFlow is here: >> http://www.sflow.org/draft-sflow-openflow.txt >> but given the rapid pace of OpenFlow development, this probably needs to be >> reworked. >> >> Neil >> >> >> >> >> >> >> >> On Aug 9, 2011, at 9:42 PM, Ben Pfaff wrote: >> >>> On Tue, Aug 09, 2011 at 10:46:38AM -0700, pravin shelar wrote: >>>> This patch is first step toward generic/simple sampling >>>> action which will be completely independent of sflow. >>>> Following patch cleanup sampling upcall by striping it down to >>>> minimum essential data that userspace can not figure out for >>>> itself. >>>> >>>> Signed-off by: Pravin Shelar <pshe...@nicira.com> >>> >>> (That's "-by", not " by".) >>> >>> I like the general approach here--the sflow support is a big wart--but I >>> think that the userspace interface changes represent a regression. >>> Before, by passing the actions down to userspace, there was always a >>> consistent view of those actions. With this change (as comments in the >>> code make clear) the actions might have changed or the facet might even >>> have been deleted. >>> >>> A really good fix would probably require being able to somehow keep >>> old copies of flows that have changed around for a while in userspace, >>> link them to the sflow messages, and free them when they are known not >>> to be needed anymore. I think that's probably feasible, but in fact I >>> think that we can avoid regression with something simpler. >>> >>> Suppose we keep the "sample pool" netlink option that this patch >>> removes, and add a way for userspace to set a small amount of user data >>> on a flow, and pass that along as the "user data" netlink option. 64 >>> bits of user data should be plenty of space to store what the userspace >>> sflow implementation actually needs (16 bits of final VLAN TCI plus 32 >>> bits for sFlow's idea of output port). Plus, those same 64 bits would >>> be more than enough for the sequence number that the "really good fix" >>> mentioned in the previous paragraph would probably need. >>> >>> With the current patch, the obvious place to put this user data would be >>> in the flow itself. Another approach (that you have probably already >>> considered) is to make sFlow sampling a flow action, in which case the >>> user data could be that action's argument. >>> >>> Thanks, >>> >>> Ben. >>> _______________________________________________ >>> dev mailing list >>> dev@openvswitch.org >>> http://openvswitch.org/mailman/listinfo/dev >> _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev