The commit 41b14fb8724d ("net: Do not clear the sock TX queue in sk_set_socket()") removes sk_tx_queue_clear() from sk_set_socket() and adds it instead in sk_alloc() and sk_clone_lock() to fix an issue introduced in the commit e022f0b4a03f ("net: Introduce sk_tx_queue_mapping"). On the other hand, the original commit had already put sk_tx_queue_clear() in sk_prot_alloc(): the callee of sk_alloc() and sk_clone_lock(). Thus sk_tx_queue_clear() is called twice in each path.
If we remove sk_tx_queue_clear() in sk_alloc() and sk_clone_lock(), it currently works well because (i) sk_tx_queue_mapping is defined between sk_dontcopy_begin and sk_dontcopy_end, and (ii) sock_copy() called after sk_prot_alloc() in sk_clone_lock() does not overwrite sk_tx_queue_mapping. However, if we move sk_tx_queue_mapping out of the no copy area, it introduces a bug unintentionally. Therefore, this patch adds a compile-time check to take care of the order of sock_copy() and sk_tx_queue_clear() and removes sk_tx_queue_clear() from sk_prot_alloc() so that it does the only allocation and its callers initialize fields. CC: Tariq Toukan <tar...@mellanox.com> CC: Boris Pismenny <bor...@mellanox.com> Signed-off-by: Kuniyuki Iwashima <kun...@amazon.co.jp> Acked-by: Tariq Toukan <tar...@nvidia.com> --- v5: * Move the changelog after the --- separator v4: https://lore.kernel.org/netdev/20210128124229.78315-1-kun...@amazon.co.jp/ * Fix typo in the changelog (runtime -> compile-time) v3: https://lore.kernel.org/netdev/20210128021905.57471-1-kun...@amazon.co.jp/ * Remove Fixes: tag * Add BUILD_BUG_ON * Remove sk_tx_queue_clear() from sk_prot_alloc() instead of sk_alloc() and sk_clone_lock() v2: https://lore.kernel.org/netdev/20210127132215.10842-1-kun...@amazon.co.jp/ * Remove Reviewed-by: tag v1: https://lore.kernel.org/netdev/20210127125018.7059-1-kun...@amazon.co.jp/ net/core/sock.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/net/core/sock.c b/net/core/sock.c index bbcd4b97eddd..cfbd62a5e079 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1657,6 +1657,16 @@ static void sock_copy(struct sock *nsk, const struct sock *osk) #ifdef CONFIG_SECURITY_NETWORK void *sptr = nsk->sk_security; #endif + + /* If we move sk_tx_queue_mapping out of the private section, + * we must check if sk_tx_queue_clear() is called after + * sock_copy() in sk_clone_lock(). + */ + BUILD_BUG_ON(offsetof(struct sock, sk_tx_queue_mapping) < + offsetof(struct sock, sk_dontcopy_begin) || + offsetof(struct sock, sk_tx_queue_mapping) >= + offsetof(struct sock, sk_dontcopy_end)); + memcpy(nsk, osk, offsetof(struct sock, sk_dontcopy_begin)); memcpy(&nsk->sk_dontcopy_end, &osk->sk_dontcopy_end, @@ -1690,7 +1700,6 @@ static struct sock *sk_prot_alloc(struct proto *prot, gfp_t priority, if (!try_module_get(prot->owner)) goto out_free_sec; - sk_tx_queue_clear(sk); } return sk; -- 2.17.2 (Apple Git-113)