On 18 January 2018 at 15:18, Guillaume Nault <g.na...@alphalink.fr> wrote: > On Wed, Jan 17, 2018 at 02:25:38PM -0500, David Miller wrote: >> From: James Chapman <jchap...@katalix.com> >> Date: Wed, 17 Jan 2018 11:13:33 +0000 >> >> > On 16 January 2018 at 19:00, David Miller <da...@davemloft.net> wrote: >> >> From: Tom Herbert <t...@herbertland.com> >> >> Date: Tue, 16 Jan 2018 09:36:41 -0800 >> >> >> >>> sk_user_data is set with the sk_callback lock held in code below. >> >>> Should be able to take the lock earlier can do this check under the >> >>> lock. >> >> >> >> csock, and this csk, is obtained from an arbitrary one of the >> >> process's FDs. It can be any socket type or family, and that socket's >> >> family might set sk_user_data without the callback lock. >> >> >> >> The only socket type check is making sure it is not another PF_KCM >> >> socket. So that doesn't help with this problem. >> > >> > Is it the intention to update all socket code over time to write >> > sk_user_data within the sk_callback lock? If so, I'm happy to address >> > that in the l2tp code (and update the kcm patch to check sk_user_data >> > within the sk_callback lock). Or is the preferred solution to restrict >> > KCM to specific socket families, as suggested by Guillaume earlier in >> > the thread? >> >> I think we have a more fundamental issue here. >> >> sk->sk_user_data is a place where RPC layer specific data is hung off >> of. By this definition SunRPC, RXRPC, RDS, TIPC, and KCM are all >> using it correctly. >> >> Phonet has a similar issue to the one seen here, it tests and changes >> sk_user_data under lock_sock(). The only requirement it makes is >> that the socket type is not SOCK_STREAM. However, this one might be OK >> since only pep_sock sockets can be passed down into gprs_attach(). >> > But, if I read it correctly, that doesn't prevent it from being passed > to kcm_attach() later on, which will overwrite sk_user_data (unless we > update the locking scheme and refuse to overwrite sk_user_data in a > race-free way). > > BTW couldn't the gprs_dev pointer be embedded in struct pep_sock? > This way pep_sk(sk)->gp could be used instead of sk->sk_user_data. > That'd probably be a violation of the phonet's layering, as that'd > tie gprs_dev to pep sockets. OTOH, only pep sockets can currently be > attached to gprs_dev, so in practice that might be a reasonable > compromise. > >> Most of these cases like SunRPC, RXRPC, etc. are fine because they >> only graft on top of TCP and UDP sockets. >> >> The weird situation here is that L2TP does tunneling and stores it's >> private state in sk->sk_user_data like an RPC layer would. And KCM >> allows basically any socket type to be attached. >> >> The RPC layers create their sockets internally, so I cannot see a way >> that those can be sent to a KCM attach operations. And I think that >> is why this RPC invariant is important for sk_user_data usage. >> > SunRPC seems to possibly set sk_user_data on user sockets: svc_addsock() > gets a socket using sockfd_lookup() then passes it to svc_setup_socket() > which in turn sets sk_user_data. I don't know anything about SunRPC, so > I might very well have missed important details, but I believe such a > socket could be passed to KCM which could lead to the same kind of > issues as for L2TP. Other RPCs look safe to me. > >> If all else was equal, even though it doesn't make much sense to KCM >> attach L2TP sockets to KCM, I would suggest to change L2TP to store >> it's private stuff elsewhere. >> >> But that is not the case. Anything using the generic UDP >> encapsulation layer is going to make use of sk->sk_user_data like this >> (see setup_udp_tunnel_sock). >> > Most UDP encapsulations only use kernel sockets though. It seems that > only L2TP and GTP use setup_udp_tunnel_sock() with userpsace sockets. > So it might be feasible to restrict usage of sk_user_data to kernel > sockets only. > > For L2TP, we probably can adapt l2tp_sock_to_tunnel() so that it does > a lookup in a hashtable indexed by the socket pointer, rather than > dereferencing sk_user_data. That doesn't look very satisfying to me, > but that's the only way I found so far.
L2TP needs a way to get at its local data from the socket in the data path. > We also have another user of sk_user_data in l2tp_ppp, but since it > uses its own socket type, I guess we could simply embed the pointer in > its parent structure. > >> It looks like over time we've accumulated this new class of uses >> of sk->sk_user_data, ho hum... >> >> And it's not like we can add a test to KCM to avoid these socket >> types, because they will look like normal UDP datagram sockets. >> >> What a mess... >> >> Furthermore, even if you add a test to KCM, you will now need to >> add the same test to L2TP and anything else which uses sk_user_data >> for tunneling and for which userspace has access to the socket fd. >> >> And it will be racy, indeed, until all such users align to the same >> precise locking scheme for tests and updates to sk_user_data. >> >> Again, what a mess... >> > So, if I understand correctly, we can either restrict sk_user_data to > kernel sockets so that KCM couldn't act on them (but then why would we > make an exception for KCM and allow it to set sk_user_data on > non-kernel sockets?). > Or we could agree on a locking scheme for sk_user_data and update all > users so that they'd fail instead of overwriting it when it's not NULL. > > Assuming my understanding is correct, do you have any preference for > fixing this issue? Or any other ideas? Could we add a new pointer, say, encap_user_data to struct udp_sock and use it instead of sk_user_data for UDP-encap sockets?