On Fri, Mar 25, 2016 at 6:23 PM, Ben Greear <gree...@candelatech.com> wrote:
> On 03/25/2016 02:59 PM, Vijay Pandurangan wrote:
>>
>> consider two scenarios, where process a sends raw ethernet frames
>> containing UDP packets to b
>>
>> I) process a --> veth --> process b
>>
>> II) process a -> eth -> wire -> eth -> process b
>>
>> I believe (I) is the simplest setup we can create that will replicate this
>> bug.
>>
>> If process a sends frames that contain UDP packets to process b, what
>> is the behaviour we want if the UDP packet *has an incorrect
>> checksum*?
>>
>> It seems to me that I and II should have identical behaviour, and I
>> would think that (II) would not deliver the packets to the
>> application.
>>
>> In (I) with Cong's patch would we be delivering corrupt UDP packets to
>> process b despite an incorrect checksum in (I)?
>>
>> If so, I would argue that this patch isn't right.
>
>
> Checksums are normally used to deal with flaky transport mechanisms,
> and once a machine receives the frame, we do not keep re-calculating
> checksums as we move it through various drivers and subsystems.
>
> In particular, checksums are NOT a security mechanism and can be easily
> faked.
>
> Since packets sent on one veth never actually hit any unreliable transport
> before they are received on the peer veth, then there should be no need to
> checksum packets whose origin is known to be on the local machine.

That's a good argument. I'm trying to figure out how to reconcile your
thoughts with the argument that virtual ethernet devices are an
abstraction that should behave identically to perfectly-functional
physical ethernet devices when connected with a wire.

In my view, the invariant must be identical functionality, and if I
were writing a regression test for this system, that's what I would
test. I think optimizations for eliding checksums should be
implemented only if they don't alter this functionality.

There must be a way to structure / write this code so that we can
optimize veths without causing different behaviour ...

>
> Any frame sent from a socket can be considered to be a local packet in my
> opinion.

I'm not sure that's totally right. Your bridge is adding a delay to
your packets; it could just as easily be simulating corruption by
corrupting 5% of packets going through it. If this change allows
corrupt packets to be delivered to an application when they could not
be delivered if the packets were routed via physical eths, I think
that is a bug.

>
> That is what Cong's patch does as far as I can tell.
>
>
> Thanks,
> Ben
>
>
> --
> Ben Greear <gree...@candelatech.com>
> Candela Technologies Inc http://www.candelatech.com
>
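
To make the check under discussion concrete, here is a minimal Python
sketch of the standard UDP-over-IPv4 checksum (the RFC 1071 ones'
complement sum over the pseudo-header, UDP header and payload). It is
not kernel code and not part of Cong's patch; the function names and
the addresses are made up for illustration. The point is only that a
receiver doing this check would drop a corrupted datagram, which is
the behaviour I'd expect to be the same in scenarios (I) and (II).

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones'-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, udp_len: int) -> bytes:
    """IPv4 pseudo-header: src, dst, zero byte, protocol 17 (UDP), length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, 17, udp_len))


def build_udp(src_ip, dst_ip, sport, dport, payload: bytes) -> bytes:
    """Hypothetical helper: build a UDP segment with a valid checksum."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    csum = internet_checksum(pseudo_header(src_ip, dst_ip, length)
                             + header + payload)
    if csum == 0:
        csum = 0xFFFF  # 0 on the wire means "no checksum" for IPv4 UDP
    return struct.pack("!HHHH", sport, dport, length, csum) + payload


def udp_checksum_ok(src_ip, dst_ip, segment: bytes) -> bool:
    """Hypothetical helper: True if the segment would pass verification."""
    (csum_field,) = struct.unpack("!H", segment[6:8])
    if csum_field == 0:
        return True  # sender did not compute a checksum
    # Summing everything, including the checksum field, must yield 0.
    return internet_checksum(pseudo_header(src_ip, dst_ip, len(segment))
                             + segment) == 0


good = build_udp("10.0.0.1", "10.0.0.2", 1234, 5678, b"hello veth")
bad = bytearray(good)
bad[-1] ^= 0x01  # simulate a single flipped bit in transit
print(udp_checksum_ok("10.0.0.1", "10.0.0.2", good))
print(udp_checksum_ok("10.0.0.1", "10.0.0.2", bytes(bad)))
```

A regression test along these lines would send both the good and the
corrupted frame over each path and assert that the set of datagrams
delivered to process b is the same in both cases.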