On Thu, 2005-15-12 at 05:57 +0100, Krzysztof Oledzki wrote:
> Added CC: to the [EMAIL PROTECTED] mailing list.
>

And I added Shoichi Sakane to the CC. He is responsible for bringing the
"use new SA" feature to BSD to begin with and is one of the original
authors of racoon. So let's take one more go at this discussion ;->
I am annotating the paragraphs for his benefit - I hope you don't mind,
Krzysztof.

> On Wed, 14 Dec 2005, jamal wrote:
> [..]
> > there are two sorts of problematic devices.
> >
> > 1) The Ciscos, I think PIX and their relatives (I heard linksys):
> > These suckers have a "fixed" time between soft expiry time and
> > hard expiry time ;->
> > IKE only negotiates hard expiry, and soft expiry is up to the peer.
> > Racoon says soft expiry = 80% of hard expiry.
> > So if you have the expiry at 10 hours, racoon will set soft expiry
> > at 8 hours. CISCO hardcodes 30 seconds to be between the hard and soft
> > expiry ;-> Yep, when you have RFCs written in a natural language like
> > English shit like this happens. So at the 8 hour mark, racoon
> > renegotiates. For 30 seconds more after that, things continue working.
> > Then for the next 119.5 minutes nothing works because in fact CISCO
> > purges its old SA and Linux (as it should) starts using the new one.
> > The proper way is for CISCO to send an IKE delete; it doesn't.
>
> AFAIK this is not true.

What is not true? ;-> CISCO doesn't hardcode 30 secs as the time between
soft and hard expiry? Or what is said about CISCO not sending an IKE
delete? AFAIK, both are true.

> The real problem is that Linux does not start
> using new SAs without additional routing cache flush as long as old SA
> exist.

To Shoichi: This is _the_ contentious issue per the RFC, and the point of
the discussion is to come up with a solution. The logic I, and I am sure
Herbert and Dave, are using is that if it is not Linux's fault, why should
Linux get into these complicated fixes when there are other ways to do it?
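
For anyone who wants to check the arithmetic in the quoted Cisco case, here is a minimal sketch in Python. It only models the timing as described above: the local side renegotiates at its soft expiry, the peer drops the old SA a fixed number of seconds later, and the local side keeps sending on the old SA until hard expiry. The function name and parameters are made up for illustration; nothing here is racoon or kernel code.

```python
def blackout_seconds(hard_lifetime_s, soft_percent, peer_grace_s):
    """Seconds during which the local side still sends on the old SA
    after the peer has already purged it (simplified model)."""
    # Local soft expiry, e.g. 80% of the negotiated hard lifetime.
    soft_expiry_s = hard_lifetime_s * soft_percent // 100
    # The peer keeps the old SA for a fixed grace period after renegotiation.
    peer_purge_s = soft_expiry_s + peer_grace_s
    # Traffic on the old SA is dropped from the purge until local hard expiry.
    return max(0, hard_lifetime_s - peer_purge_s)


if __name__ == "__main__":
    gap = blackout_seconds(10 * 3600, 80, 30)
    print(gap, "seconds =", gap / 60, "minutes")
```

With a 10 hour lifetime, an 80% soft expiry and a 30 second grace period this gives 7170 seconds, i.e. the 119.5 minutes mentioned above. In this simplified model the gap goes to zero once the peer's grace period covers the whole interval between local soft and hard expiry.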

To Krzysztof: The route cache is _valid_ because the SA has not expired as
far as Linux is concerned, so there is no need to flush it ;->
[I will point out that flushing the route cache is a really bad idea.
Actually it is fine if you have a few SAs; imagine a few hundred or a
thousand SAs.]
For Shoichi-san's benefit, I will point out again that Linux is behaving as
per spec, see:
http://marc.theaimsgroup.com/?l=linux-netdev&m=113070963711648&w=2
The only time the SA is invalid is if it expires or if an explicit delete
is sent. Neither of which happens.

> The problem has nothing with hardcoded expiry time. When acting as
> initiator, Cisco negotiates new IPSec-SA and flushes old one while Linux
> still uses it.

No, that is the effect ;-> The problem has to do with the _hardcoded expiry
timer_ for the SA. CISCO could send an IKE delete to indicate it no longer
uses the SA; otherwise the assumption is that the SA is valid for the
period of time that has been negotiated. CISCO instead keeps the old SA
around for only 30 secs, and the effect shows up because Linux continues to
treat it as useful for the remaining 20%.

> Similar problem exists with Linksys (for example BEFSX41)
> when Linux act as initiator. After 80% of a lifetime it negotiates new SA
> but still uses old one, and such packets are ignored by BEFSX41.
>

Probably same code base. And blame Shoichi for propagation of this bug ;->
If this "fix" did not exist on BSD, Cisco would hardly have interoped with
anybody - given that the majority of deployed IPsec devices probably just
copied the KAME code.

> > To fix this i submitted a patch to racoon which is in their CVS - i was
> > told it will show up around their release 0.7. The patch allows people
> > to hardcode like in cisco a specific time. So this fixes the CISCO
> > problem without touching the kernel.
>
> It may be useful but it does not fix _this_ problem, really.

AFAIK, you of all people have not even bothered trying it, because you
insist on a kernel fix so we can just be good people like BSD ;->

> IPSec
> initiator can negotiate new SA at any time so this will not work when
> Linux is the responder. Negotiating new SA, for example 30s before hard
> expiration, means 30s without communication so this is also unacceptable
> solution.
>

That will only happen if you are unaware that the initiator is CISCO and
have not preconfigured racoon to expect a device like CISCO. Hopefully
someone you are configuring for will bitch and you can fix your configs.
IOW, setting the difference between soft and hard expiry time will fix it
if you know what the peer is, which can be taken for granted.

> > 2) There are other sorts of devices - i am told some made by a vendor
> > called DrayTek in fact delete right away after renegotiation.
> > But they do send an IKE delete, except racoon ignores it ;->
> > As was pointed out to me, since IKEv1 is unreliable such a
> > message could be lost anyway.

Hopefully the ipsec tools folks can pay good attention to this. I could
probably take a crack at fixing it when I get the time.

> > So bug in racoon for sure but not good enough given the unreliability of
> > IKEv1. So in the last discussion Herbert and I had we talked about doing
> > something in the kernel since this was getting frustrating ...
> > Herbert has it on his TODO and i was going to get racoon part once he
> > has his patch.
>

And note the reasoning we use for why we need to do something in the
kernel. It has to do with the IKE delete being unreliable, and not with
the fact that stoopid CISCO uses the new SA immediately and the old SA for
30 more seconds.

cheers,
jamal