On Tue, Apr 17, 2012 at 07:58:33PM -0700, Jesse Gross wrote: > On Thu, Apr 12, 2012 at 12:36 AM, Simon Horman <ho...@verge.net.au> wrote: > > This is a not yet well exercised implementation of STT intended for review, > > I am sure there are numerous areas that need improvement. > > > > In particular: > > - The transmit path's generation of partial checksums needs to be tested > > - The VLAN stripping code needs to be excercised > > - The code needs to be exercised in the presence of HW checksumming > > - In general, the code has been exercised by running Open vSwtich in > > KVM guests on the same host. Testing between physucal hosts is needed. > > > > This implementation is based on the CAPWAP implementation and in particular > > includes defragmentation code almost identical to CAPWAP. It seems to me > > that while fragmentation can be handled by GSO/TSO, defragmentation code is > > needed in STT in the case where LRO/GRO doesn't reassemble an entire STT > > frame for some reason. > > > > If the defragmentation code, which is of non-trivial length, remains more > > or less in its present state then there is some scope for consolidation > > with CAPWAP. Other code that may possibly be consolidated with CAPWAP has > > been marked accordingly. > > > > This code depends on a encap_rcv hook being added to the Linux Kernel's TCP > > stack. A patch to add such a hook will be posted separately. Ultimately > > this change or some alternative will need to be applied to the mainline > > Linux kernel's TCP stack if STT is to be widely deployed. Motivating this > > change to the TCP stack is part of the purpose of this prototype STT > > implementation. > > > > The configuration of STT is analogous to that of other tunneling > > protocols such as GRE which are supported by Open vSwtich. > > > > e.g. > > > > ovs-vsctl add bridge project0 ports @newport \ > > -- --id=@newport create port name=stt0 interfaces=[@newinterface] \ > > -- --id=@newinterface create interface name=stt0 type=stt > > options="remote_ip=10.0.99.192,key=64" > > > > Signed-off-by: Simon Horman <ho...@verge.net.au> > > Sorry about being slow on this, I've been busy lately. I have a few > high level comments: > > * The hardware offloading parts don't look quite right to me. I > think there's confusion about what is happening between the inner and > outer packets. The offloads that are set when we are given a packet > for encapsulation should essentially just be stored in the STT header > (since this is what we will pass to the NIC if we decapsulate and > forward). Once we do that and add STT, we can set up the resulting > packet for checksum offload/TSO based only on that packet and the > device that it will be forwarded out. > * Conversely on receive we need to use the values stored in the STT > header to recreate the skb offloading metadata.
Thanks, I will work on this. > * I think we could use the OVS flow hash to compute the src port and > save a bunch of code. Sure, I will look into that. The hash calculation code is indeed rather verbose. > * Generally vlan will be stored out of band in skb->vlan_tci, so if > we make it so the generic tunneling code doesn't insert that into the > packet then we won't have to strip it out here. Ok, I will look into that. As it stands It seems that it is being inserted. > * It should always be possible to encapsulate any Ethernet frame. > Things that don't have explicit support like QinQ can still be sent > they just won't get the benefit of offloading (similar to a NIC that > doesn't support a particular header format). My reading of the spec was that it required the inner frame to be untagged ethernet. I'll correct my code as per your description above. > * There are a lot of atomic ops and other expensive things like division... Is there anywhere in particular you are noticing theses? > I would recommend testing with netperf or some other tool that > generates traffic that uses offloading since offloads are such an > important motivator for the protocol. I think that it would help sort > out some of the issues in that area, which in would in turn probably > illuminate some other parts of the protocol. Thanks, good idea. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev