On 14.08.2013 18:00, Luigi Rizzo wrote:
On Wed, Aug 14, 2013 at 05:01:05PM +0400, Alexander V. Chernikov wrote:
On 14.08.2013 16:40, Luigi Rizzo wrote:
...
You can save the rte and ARP entry; however, doing this gives you a perfect
chance to crash your kernel if the egress interface is destroyed
(e.g. a vlan, ng or tun interface).
I hope I learned not to follow a stale ifp pointer :)
Well, currently we have no locks (or other means) to ensure that all other
cores have a "current" pointer to the ifp or its fields (or am I wrong?)
This I don't know -- but in any case, we should fix the race anyway
(another timescale, but still dangerous).
Anyway, ARP is really just the MAC address, so there is no
dangling pointer issue.
For the ifp associated with the route,
I do not see a huge problem in marking the route/ifp as a
zombie and destroying it when the last reference goes away.
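To make the zombie idea a bit more concrete, here is a rough userspace
sketch (all names -- fake_ifp, ifp_detach() and so on -- are invented; this
is not the actual ifnet code): destroying the interface only marks it as
detached and drops the system reference, and the memory is freed by whoever
drops the last reference.

/* Rough sketch of the "zombie ifp" idea: the control path marks the
 * object detached instead of freeing it; the memory goes away only
 * when the last reference is dropped.  Names are invented, this is
 * not the real struct ifnet. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct fake_ifp {
	atomic_int	refcnt;		/* starts at 1 for the "system" reference */
	atomic_bool	detached;	/* set when the interface is destroyed */
	char		name[16];
};

static struct fake_ifp *
ifp_ref(struct fake_ifp *ifp)
{
	atomic_fetch_add(&ifp->refcnt, 1);
	return (ifp);
}

static void
ifp_rele(struct fake_ifp *ifp)
{
	if (atomic_fetch_sub(&ifp->refcnt, 1) == 1)
		free(ifp);		/* last reference is gone */
}

/* Control path: "destroy" just marks the zombie and drops the system
 * reference; packets still holding a reference keep the memory alive
 * until they are done with it. */
static void
ifp_detach(struct fake_ifp *ifp)
{
	atomic_store(&ifp->detached, true);
	ifp_rele(ifp);
}

/* Datapath: refuse to use a zombie, but never follow freed memory. */
static bool
ifp_usable(struct fake_ifp *ifp)
{
	return (!atomic_load(&ifp->detached));
}

int
main(void)
{
	struct fake_ifp *ifp = calloc(1, sizeof(*ifp));

	atomic_init(&ifp->refcnt, 1);
	atomic_init(&ifp->detached, false);
	snprintf(ifp->name, sizeof(ifp->name), "vlan0");

	struct fake_ifp *cached = ifp_ref(ifp);	/* cached by a flow */
	ifp_detach(ifp);			/* admin destroys it */
	printf("%s usable: %d\n", cached->name, ifp_usable(cached));
	ifp_rele(cached);			/* frees it here */
	return (0);
}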
Yes, but references require some synchronization primitives. One
possible solution is using pcpu counters, but that does not play well on
!amd64.
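For reference, the per-CPU counter idea would look roughly like the
made-up userspace model below (it is not the kernel's counter(9)): the hot
path only touches a cache-line-padded local slot, and the cost is pushed to
destroy time, where all slots have to be summed.  The appeal in the kernel
is that the local increment can avoid locked instructions, which is
presumably where the !amd64 pain comes from.

/* Made-up model of per-CPU reference counting: each CPU gets its own
 * cache-line-padded slot, so the hot path never bounces a shared cache
 * line; the expensive part is summing the slots when we want to know
 * whether the object can be destroyed. */
#include <stdatomic.h>
#include <stdio.h>

#define NCPU		8
#define CACHE_LINE	64

struct pcpu_ref {
	_Alignas(CACHE_LINE) atomic_long cnt;
};

static struct pcpu_ref refs[NCPU];

/* Fast path: touch only the local CPU's slot. */
static void
pcpu_ref_acquire(int cpu)
{
	atomic_fetch_add_explicit(&refs[cpu].cnt, 1, memory_order_relaxed);
}

static void
pcpu_ref_release(int cpu)
{
	atomic_fetch_sub_explicit(&refs[cpu].cnt, 1, memory_order_relaxed);
}

/* Slow path, e.g. at interface destroy time: sum every slot.
 * Individual slots may be negative (a reference acquired on one CPU and
 * released on another); only the total is meaningful. */
static long
pcpu_ref_total(void)
{
	long total = 0;

	for (int i = 0; i < NCPU; i++)
		total += atomic_load_explicit(&refs[i].cnt,
		    memory_order_relaxed);
	return (total);
}

int
main(void)
{
	pcpu_ref_acquire(0);	/* packet takes a reference on CPU 0 */
	pcpu_ref_release(3);	/* ... and drops it on CPU 3 */
	printf("outstanding references: %ld\n", pcpu_ref_total());
	return (0);
}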
Again, we should protect against ifp destruction anyway. Surely
we should try to make the protection mechanism cheap (in my proposal,
going through the refcount once per millisecond instead of on every
single packet; there might be better ways, and I am all ears on
that); surely, we cannot dismiss something because "we run without
seatbelts now, so anything else is more expensive".
Sorry, I still don't get this. Are we talking about egress interface
refcounts?
Where are the refcounts incremented and decremented?
By the way, interface destruction is currently mostly synchronous, so a
1 ms wait can effectively cap the interface creation/destruction rate in
BRAS scenarios (mpd with ng*, the Juniper case, ...).
We had a related discussion regarding races on interfaces between
the datapath (if_transmit() and *_rxeof()) and the control path
(ioctls, watchdog, etc.).
The reason I am raising this issue is that I want to fix the
races that emerged when we moved to SMP, not because I want to "make
hacks" and cut corners in unsafe ways.
That's great! I want this fixed, too :)
Not that the current way is any better -- you need to lock and unlock
the rte while you do the lookup, and hold a refcount on the ifp
until the packet is queued. So how does my suggestion make
things worse?
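In a made-up userspace model (not the real rtentry/ifnet code; fake_rte,
driver_transmit() and friends are invented names), the per-packet pattern
being described looks roughly like this:

/* Per-packet pattern: every transmitted packet locks the route entry
 * for the lookup and holds a reference on the egress interface until
 * the packet is handed to the driver. */
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct fake_ifp {
	atomic_int refcnt;
};

struct fake_rte {
	pthread_mutex_t	 lock;
	struct fake_ifp	*ifp;
	unsigned char	 gw_mac[6];
};

/* Stand-in for handing the packet to the driver (if_transmit()). */
static void
driver_transmit(struct fake_ifp *ifp, const void *pkt, size_t len)
{
	(void)ifp; (void)pkt; (void)len;
}

static void
send_one_packet(struct fake_rte *rte, const void *pkt, size_t len)
{
	/* Per packet: lock the rte, grab what we need, pin the ifp ... */
	pthread_mutex_lock(&rte->lock);
	struct fake_ifp *ifp = rte->ifp;
	atomic_fetch_add(&ifp->refcnt, 1);
	pthread_mutex_unlock(&rte->lock);

	/* ... transmit, then drop the interface reference. */
	driver_transmit(ifp, pkt, len);
	atomic_fetch_sub(&ifp->refcnt, 1);
}

int
main(void)
{
	struct fake_ifp ifp = { .refcnt = 1 };
	struct fake_rte rte = { .lock = PTHREAD_MUTEX_INITIALIZER,
	    .ifp = &ifp };

	send_one_packet(&rte, "payload", 7);
	return (0);
}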
Considering that each lookup takes between 100 and 300 ns if you are
lucky (not many misses, a relatively empty table, etc.), one could
reasonably do the lookup at most once per millisecond or so (just
reading 'ticks'; no need for nanotime() if you have a slow clock),
or whenever we get an error related to the socket, either in the
forward path (e.g. the ifp points to an interface that is down) or in
the reverse path (e.g. a dupack because we sent a packet to the
wrong place).
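Very roughly, and with invented names ('ticks' below is only a stand-in for
the kernel tick counter, routing_lookup() for the real FIB + ARP
resolution, nexthop_get()/nexthop_invalidate() are made up), that cached
lookup could look like this: the result is reused while it is fresh, and a
new lookup is forced when the tick counter advances or when the path
reports an error.

/* Made-up sketch of the "re-lookup at most once per ms" idea. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static volatile int ticks;	/* stand-in for the kernel tick counter */

struct nexthop_cache {
	int		last_ticks;	/* when we last validated */
	bool		valid;		/* cleared on any path error */
	unsigned char	dst_mac[6];	/* resolved ARP entry */
};

/* Stand-in for the real FIB + ARP resolution (the 100..300 ns part). */
static bool
routing_lookup(const char *dst, struct nexthop_cache *nc)
{
	(void)dst;
	memset(nc->dst_mac, 0xaa, sizeof(nc->dst_mac));
	return (true);
}

static bool
nexthop_get(const char *dst, struct nexthop_cache *nc)
{
	/*
	 * Reuse the cached result unless it is stale (the tick counter
	 * moved on) or something downstream invalidated it (interface
	 * went down, dupacks hinting we send to the wrong place, ...).
	 */
	if (nc->valid && nc->last_ticks == ticks)
		return (true);
	if (!routing_lookup(dst, nc))
		return (false);
	nc->valid = true;
	nc->last_ticks = ticks;
	return (true);
}

/* Error feedback from the forward or reverse path forces a re-lookup. */
static void
nexthop_invalidate(struct nexthop_cache *nc)
{
	nc->valid = false;
}

int
main(void)
{
	struct nexthop_cache nc = { 0 };

	nexthop_get("10.0.0.1", &nc);	/* full lookup */
	nexthop_get("10.0.0.1", &nc);	/* served from the cache */
	ticks++;			/* a "millisecond" passes */
	nexthop_get("10.0.0.1", &nc);	/* re-validated */
	nexthop_invalidate(&nc);	/* e.g. dupack feedback */
	nexthop_get("10.0.0.1", &nc);	/* forced re-lookup */
	printf("cached mac byte: %02x\n", nc.dst_mac[0]);
	return (0);
}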
This sounds like: "Hey, the kernel lookup is slow (which is true), so let's
make a hack and not bother with lookups".
This approach gives us mtx-locked rte refcounts, which are used (and
misused) in many places, making things worse and decreasing our ability to
fix things up.
cheers
luigi
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"