Hi,

On Wednesday 10 January 2007 07:46, Patrick McHardy wrote:
> > +	rcu_read_lock();
> > +	for (rth = rcu_dereference(rt_hash_table[hash].chain); rth;
> > +	     rth = rcu_dereference(rth->u.rt_next)) {
> > +		if (rth->fl.fl4_dst == iph->daddr &&
> > +		    rth->fl.fl4_src == iph->saddr &&
> > +		    rth->fl.iif == iif &&
> > +		    rth->fl.oif == 0 &&
> > +		    rth->fl.mark == skb->mark &&
> > +		    (rth->u.dst.flags & DST_DIVERTED) &&
> > +		    rth->fl.fl4_tos == tos) {
>
> Mark and tos look unnecessary here since they don't affect the further
> processing of the packet.
Indeed, thanks for spotting it.

> > +			rth->u.dst.lastuse = jiffies;
> > +			dst_hold(&rth->u.dst);
> > +			rth->u.dst.__use++;
> > +			RT_CACHE_STAT_INC(in_hit);
> > +			rcu_read_unlock();
> > +
> > +			dst_release(skb->dst);
> > +			skb->dst = (struct dst_entry*)rth;
> > +
> > +			if (sk) {
> > +				sock_hold(sk);
> > +				skb->sk = sk;
>
> This looks racy, the socket could be closed between the lookup and
> the actual use. Why do you need the socket lookup at all, can't
> you just divert all packets selected by iptables?

Yes, it's racy, but I think this is true for the "regular" socket lookup,
too. Take UDP for example: __udp4_lib_rcv() does the socket lookup, gets a
reference to the socket, and then calls udp_queue_rcv_skb() to queue the
skb. As far as I can see there's nothing there which prevents the socket
from being closed between these calls. sk_common_release() even documents
this behaviour:

[...]
	if (sk->sk_prot->destroy)
		sk->sk_prot->destroy(sk);

	/*
	 * Observation: when sock_common_release is called, processes have
	 * no access to socket. But net still has.
	 * Step one, detach it from networking:
	 *
	 * A. Remove from hash tables.
	 */

	sk->sk_prot->unhash(sk);

	/*
	 * In this point socket cannot receive new packets, but it is possible
	 * that some packets are in flight because some CPU runs receiver and
	 * did hash table lookup before we unhashed socket. They will achieve
	 * receive queue and will be purged by socket destructor.
	 *
	 * Also we still have packets pending on receive queue and probably,
	 * our own packets waiting in device queues. sock_destroy will drain
	 * receive queue, but transmitted packets will delay socket destruction
	 * until the last reference will be released.
	 */
[...]

Of course it's true that doing early lookups and storing that reference in
the skb widens the window considerably, but I think this race is already
handled. Or is there anything I don't see?
--
Regards,
Krisztian Kovacs