Thanks Sam, I opened an issue on GitHub for this one: https://github.com/openvswitch/ovs-issues/issues/2
Alessandro

On 02 Aug 2014, at 19:15, Eitan Eliahu <elia...@vmware.com> wrote:

> Hi Samuel,
> You are correct and we are well aware of this issue; in fact, it has been on
> our to-do list for a long time. The plan is to create an NBL from each NB
> and to run the newly created NBL(s) against the flow table.
> If you guys have some free cycles, please go ahead and implement it.
> Eitan
>
> -----Original Message-----
> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Samuel Ghinet
> Sent: Saturday, August 02, 2014 3:08 AM
> To: dev@openvswitch.org
> Subject: [ovs-dev] The NET_BUFFER_LIST issue
>
> Hello guys,
>
> While working on the integration between our kernel netlink component and
> your driver, I had my attention drawn a bit more towards your implementation
> of the buffer management.
> I expect that there was a discussion about it, but I'd like to highlight a
> few things myself, hoping that together we'll come up with a solution that
> will work best for the future.
>
> Namely, in the OvsExtractFlow function, in the code pushed upstream, the
> first NET_BUFFER of the NET_BUFFER_LIST is used as the basis for the flow
> key extraction.
>
> I hope you have read this MSDN documentation:
> http://msdn.microsoft.com/en-us/library/windows/hardware/ff570756(v=vs.85).aspx
>
> The relevant issues here, for our project, are:
> "Each NET_BUFFER structure that is linked to a NET_BUFFER_LIST structure
> describes a single Ethernet frame."
>
> "All NET_BUFFER structures that are associated with a NET_BUFFER_LIST
> structure must have the same Ethernet frame type and IP protocol version
> (IPv4 or IPv6)."
>
> "All NET_BUFFER structures that are associated with a NET_BUFFER_LIST
> structure must have the same source and destination MAC addresses."
>
> "If a driver is sending TCP or UDP frames, all of the NET_BUFFER structures
> that are associated with a NET_BUFFER_LIST structure must be associated with
> the same TCP or UDP connection."
>
> -------
>
> In other words, we have 3 special cases:
> a) We've got one NET_BUFFER_LIST that contains one NET_BUFFER - the best
> case for us, nothing special to do.
>
> b) We've got a TCP or UDP packet split across multiple NET_BUFFERs in a
> NET_BUFFER_LIST.
> This is not a very good case for us, because the packet fields that are
> guaranteed to be the same for all NET_BUFFERs are:
> eth, net layer info (transport protocol only), transport layer (source and
> dest port only), ipv4 info (source and dest ip), arp info, and ipv6 info.
>
> Whereas the fields for which we have NO guarantee that they are the same
> among NET_BUFFERs are:
> net layer: TOS, TTL, fragment type
> transport layer: tcp flags
> ipv6 info: flow label.
>
> c) We have something other than a TCP / UDP packet (e.g. icmp4, icmp6,
> etc.): NONE of these fields are guaranteed to be the same for each
> NET_BUFFER in the NET_BUFFER_LIST:
> net layer: TOS, TTL, fragment type (don't know about mpls, where it fits)
> transport layer: icmp type, icmp code (icmp4 and icmp6)
> ipv4: src and dest address (they are the same only for UDP and TCP)
> arp: perhaps the same problem as above, for ip addresses
> ipv6: src address, dest address, flow label
>
> It can also happen (i.e. we have no guarantee otherwise) that we get a
> NET_BUFFER_LIST with 2 NET_BUFFERs: the first is some icmp6 packet, the
> second contains an ICMP6 Neighbor Discovery. With the current implementation
> of OvsExtractFlow, we would fail to match a flow for the Neighbor Discovery
> packet - instead, the flow for the first icmp6 packet would be used for both
> packets.
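To make the failure mode concrete, here is a minimal sketch of what walking every NET_BUFFER instead of only the first could look like. It is not the code in the tree: ExtractFlowKeyFromNb() and FLOW_KEY are hypothetical placeholders for whatever per-frame extraction helper and key type the driver would use; only NET_BUFFER_LIST_FIRST_NB() and NET_BUFFER_NEXT_NB() are real NDIS accessors.

#include <ndis.h>

/* Hypothetical placeholders, for illustration only. */
typedef struct _FLOW_KEY {
    UCHAR data[128];                 /* stand-in for the real key layout */
} FLOW_KEY;

NDIS_STATUS ExtractFlowKeyFromNb(NET_BUFFER *nb, FLOW_KEY *key);  /* not shown */

static NDIS_STATUS
ExtractFlowKeysPerNb(NET_BUFFER_LIST *nbl)
{
    NET_BUFFER *nb;
    NDIS_STATUS status = NDIS_STATUS_SUCCESS;

    /* Each NET_BUFFER describes one Ethernet frame, so each one gets its
     * own key instead of inheriting the key of the first frame. */
    for (nb = NET_BUFFER_LIST_FIRST_NB(nbl); nb != NULL;
         nb = NET_BUFFER_NEXT_NB(nb)) {
        FLOW_KEY key;

        status = ExtractFlowKeyFromNb(nb, &key);
        if (status != NDIS_STATUS_SUCCESS) {
            break;
        }
        /* ... look up or create a flow for this key ... */
    }
    return status;
}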
>
> ----------
>
> As a one-line conclusion: given the current implementation of the buffer
> management mechanism, the flow mechanics can at any time create or match a
> wrong flow for a given NET_BUFFER.
>
> Our implementation - creating a NET_BUFFER_LIST for each existing NET_BUFFER
> and wrapping it in an OVS_NET_BUFFER - may be less efficient, but it handles
> all of these cases, because it treats each NET_BUFFER in a NET_BUFFER_LIST
> as a totally different packet.
> An improvement to our OVS_NET_BUFFER implementation would be to keep a cache
> of pre-allocated buffers, so as not to waste time allocating and
> de-allocating memory for them each time, and also to take a single
> NET_BUFFER in a NET_BUFFER_LIST as it is, without duplicating it.
>
> I believe this can be considered an architectural issue, and it is best
> addressed early rather than continuing to build functionality on top of it
> and having to change it later - given that network packets are used in many
> places in the project.
>
> I am eager to hear any suggestions, as well as, perhaps, how you thought to
> address this issue.
> Samuel Ghinet
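For what it's worth, here is a rough sketch of the "NBL per NB" plan Eitan describes. OvsCopyNbToSingleNbNbl() and OvsProcessSingleNbl() are hypothetical helpers: the first would allocate a fresh NET_BUFFER_LIST carrying a copy of one NET_BUFFER's data (MDL and NBL-pool handling elided), and the second would run the per-packet flow lookup and actions.

#include <ndis.h>

/* Hypothetical helpers, for illustration only. */
NET_BUFFER_LIST *OvsCopyNbToSingleNbNbl(NDIS_HANDLE nblPool, NET_BUFFER *nb);
VOID OvsProcessSingleNbl(NET_BUFFER_LIST *nbl);

static VOID
OvsSplitAndProcessNbl(NDIS_HANDLE nblPool, NET_BUFFER_LIST *nbl)
{
    NET_BUFFER *nb = NET_BUFFER_LIST_FIRST_NB(nbl);

    if (nb == NULL || NET_BUFFER_NEXT_NB(nb) == NULL) {
        /* Zero or one NET_BUFFER: nothing to split, process the NBL as-is. */
        OvsProcessSingleNbl(nbl);
        return;
    }

    for (; nb != NULL; nb = NET_BUFFER_NEXT_NB(nb)) {
        NET_BUFFER_LIST *single = OvsCopyNbToSingleNbNbl(nblPool, nb);

        if (single == NULL) {
            /* Allocation failed; drop this frame (accounting elided). */
            continue;
        }
        OvsProcessSingleNbl(single);
        /* Completing or freeing 'single' after processing is also elided. */
    }
}

The cache of pre-allocated buffers that Samuel suggests could sit behind OvsCopyNbToSingleNbNbl(), so the allocation cost is only paid when the cache is empty, and the single-NET_BUFFER fast path above keeps the common case free of any copy at all.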