On Wed, Aug 14, 2013 at 04:15:25PM +0400, Alexander V. Chernikov wrote:
> On 14.08.2013 16:05, Luigi Rizzo wrote:
> > On Wed, Aug 14, 2013 at 03:47:13PM +0400, Lev Serebryakov wrote:
> >> Hello, Luigi.
> >> You wrote 14 August 2013, 14:21:09:
> >>
> >> LR> Then the problem remains that we should keep a copy of route and
> >> LR> arp information in the socket instead of redoing the lookups on
> >> LR> every single transmission, as they consume some 25% of the time of
> >> LR> a sendto(), and probably even more when it comes to large tcp
> >> LR> segments, sendfile() and the like.
> >> And we should invalidate this info on ARP/route changes, or connection
> >> will be lost in such cases, am I right?.. So, on each such event code
> >> should look into all sockets and check, if routing/ARP information is
> >> still valid for them. Or we should store lists of sockets in routing
> >> and ARP tables... I don't know, what is worse.
> > I think we should start by acknowledging that routing and ARP
> > information is inherently stale, and changes unfrequently.
> > So it is not a disaster if we have incorrect information for some
> > short amount of time (milliseconds) because in the end the remote
> > party that decides to change it and inform us may take much longer
> > than that to distribute the update.
> You can save rte&arp, however doing this
> gives you perfect chance to crash your kernel if egress interface is
> destroyed (like vlan or ng or tun).

I hope I learned not to follow a stale ifp pointer :)
Anyway, ARP is really just the MAC address, so there is no dangling
pointer issue. For the ifp associated with the route, I do not see a
huge problem in marking the route/ifp as a zombie and destroying it
when the last reference goes away.

Not that the current way is any better -- you need to lock/unlock
the rte while you do the lookup, and hold a refcount to the ifp
until the packet is queued. So how does my suggestion make things worse?

cheers
luigi

> >
> > Considering that each lookup takes between 100..300ns if you are
> > lucky (not many misses, relatively empty table etc.), one could
> > reasonably do the lookup at most once per millisecond or so (just
> > reading 'ticks', no need for a nanotime() if you have a slow clock),
> > or whenever we get an error related to the socket, either in the
> > forward path (e.g. ifp points to an interface that is down) or in
> > the reverse path (e.g. a dupack because we sent a packet to the
> > wrong place).
> This sounds like "Hey, the kernel lookup is slow (which is true), let's
> make a hack and don't bother lookups".
> This approach gives us mtx-locked rte refcounts which are used (misused)
> in many places making things worse and decreasing the ability to fix the
> things up..
> >
> > cheers
> > luigi
> > _______________________________________________
> > freebsd-net@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
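
To make the proposal above a bit more concrete, here is a rough userland sketch of the idea: cache the lookup result per socket, copy the MAC address by value, hold a reference on the egress interface, redo the lookup only when the cached entry is older than some number of ticks or the interface has been marked as a zombie, and free the zombie when the last reference goes away. Everything here (struct fake_ifp, struct fake_table, struct nh_cache, nh_get() and so on) is a hypothetical stand-in invented for illustration, not the kernel's struct ifnet or rtentry API, and it ignores locking entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an interface; NOT the kernel's struct ifnet. */
struct fake_ifp {
    int  refs;    /* references held by the system and by socket caches */
    bool zombie;  /* set once the interface has been destroyed */
};

/* Hypothetical stand-in for the current route + ARP state. */
struct fake_table {
    struct fake_ifp *ifp;     /* egress interface, or NULL if no route */
    uint8_t          mac[6];  /* next-hop MAC address */
};

/* Hypothetical per-socket cache of the last lookup result. */
struct nh_cache {
    struct fake_ifp *ifp;     /* referenced egress interface, or NULL */
    uint8_t          mac[6];  /* copied by value, so nothing can dangle */
    unsigned int     stamp;   /* value of 'ticks' at the last lookup */
    bool             valid;
};

int lookups;  /* counts full lookups, for illustration only */

void ifp_ref(struct fake_ifp *ifp) { ifp->refs++; }

/* Drop one reference; a zombie is freed with its last reference.
 * Returns true if the interface was actually freed. */
bool ifp_rele(struct fake_ifp *ifp)
{
    assert(ifp->refs > 0);
    if (--ifp->refs == 0 && ifp->zombie) {
        free(ifp);
        return true;
    }
    return false;
}

/* Destroy: mark as zombie and drop the system's own reference. */
bool ifp_destroy(struct fake_ifp *ifp)
{
    ifp->zombie = true;
    return ifp_rele(ifp);
}

/* Forget the cached result, e.g. after an error on the socket. */
void nh_invalidate(struct nh_cache *c)
{
    if (c->ifp != NULL) {
        ifp_rele(c->ifp);
        c->ifp = NULL;
    }
    c->valid = false;
}

/* Return true if the cache holds a usable next hop. The full lookup is
 * redone only if the entry is stale, invalid, or points at a zombie. */
bool nh_get(struct nh_cache *c, const struct fake_table *t,
            unsigned int ticks, unsigned int max_age)
{
    /* Fast path; unsigned subtraction tolerates 'ticks' wrapping. */
    if (c->valid && !c->ifp->zombie && ticks - c->stamp < max_age)
        return true;

    nh_invalidate(c);
    lookups++;  /* this is where the expensive route/ARP lookup would go */
    if (t->ifp == NULL || t->ifp->zombie)
        return false;
    ifp_ref(t->ifp);
    c->ifp = t->ifp;
    memcpy(c->mac, t->mac, sizeof(c->mac));
    c->stamp = ticks;
    c->valid = true;
    return true;
}
```

A transmit path would call nh_get() before each send and nh_invalidate() on any socket error. Note that the zombie check still has to read through the cached pointer, which is exactly why the cache must hold its own reference on the interface.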