On 03/02/2019 09:21 PM, Martin Lau wrote:
> On Sat, Mar 02, 2019 at 10:03:03AM -0800, Alexei Starovoitov wrote:
>> On Sat, Mar 02, 2019 at 08:10:10AM -0800, Martin KaFai Lau wrote:
>>> Lorenz Bauer [thanks!] reported that a ptr returned by bpf_tcp_sock(sk)
>>> can still be accessed after bpf_sk_release(sk).
>>> Both bpf_tcp_sock() and bpf_sk_fullsock() have the same issue.
>>> This patch addresses them together.
>>>
>>> A simple reproducer looks like this:
>>>
>>>     sk = bpf_sk_lookup_tcp();
>>>     /* if (!sk) ... */
>>>     tp = bpf_tcp_sock(sk);
>>>     /* if (!tp) ... */
>>>     bpf_sk_release(sk);
>>>     snd_cwnd = tp->snd_cwnd; /* oops! The verifier does not complain. */
>>>
>>> The problem is the verifier did not scrub the register's states of
>>> the tcp_sock ptr (tp) after bpf_sk_release(sk).
>>>
>>> [ Note that when calling bpf_tcp_sock(sk), the sk is not always
>>>   refcount-acquired. e.g. bpf_tcp_sock(skb->sk). The verifier works
>>>   fine for this case. ]
>>>
>>> Currently, the verifier does not track if a helper's return ptr (in REG_0)
>>> is "carry"-ing one of its argument's refcount status. To carry this info,
>>> the reg1->id needs to be stored in reg0. The reg0->id has already
>>> been used for NULL checking purpose. Hence, a new "refcount_id"
>>> is needed in "struct bpf_reg_state".
>>>
>>> With refcount_id, when bpf_sk_release(sk) is called, the verifier can scrub
>>> all reg states which have a refcount_id match. It is done with the changes
>>> in release_reg_references().
>>>
>>> When acquiring and releasing a refcount, the reg->id is still used.
>>> Hence, we cannot do "bpf_sk_release(tp)" in the above reproducer
>>> example.
>>
>> I think the choice of returning listener full sock from req sock
>> in sk_to_full_sk() was a wrong one.
>> It seems better to make semantics of bpf_tcp_sock() and bpf_sk_fullsock() as
>> always type cast or null.
>> And have a separate helper for req socket that returns
>> inet_reqsk(sk)->rsk_listener.
>>
>> Then it will be ok to call bpf_sk_release(tp) when tp came from
>> bpf_sk_lookup_tcp.
>> The verifier will know that it's the case because its ID will be in
>> acquired_refs.
>>
>> The additional refcount_id won't be necessary.
>> bpf_sk_fullsock() and bpf_tcp_sock() will not call sk_to_full_sk
>> and the verifier will be copying reg1->id into reg0->id.
>>
>> In release_reference() the verifier will do
>>     if (regs[i].id == id)
>>         mark_reg_unknown(env, regs, i);
>> for all socket types.
>>
>> release_reference_state() will stay as-is.
>>
>> imo such logic will be easier to follow.
>>
>> This implicit sk_to_full_sk() makes the whole thing much harder for the
>> verifier and for the bpf program writers.
>>
>> The new bpf_get_listener_sock(sk) doesn't have to copy ID from reg1 to reg0
>> since req socket will not be returned from bpf_sk_lookup_tcp and its ID
>> will not be stored in acquired_refs.
>>
>> Does it make sense?
> I like this idea. Many thanks for thinking it through!
>
> Allowing bpf_sk_release(tp), no need to call bpf_sk_release() on ptr
> returned from bpf_get_listener_sock(sk) and keep one reg->id.
>
> I think it should work. I will rework the patches.

Agreed, makes sense; that seems like a much better fix.

Thanks,
Daniel
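
For anyone following the thread, here is a rough userspace sketch of the single-id scheme discussed above. It is a toy model, not verifier code: the struct layout and every model_* name are made up for illustration, and the real verifier tracks far more state. It only tries to show the two points under discussion: the cast-style helper copies reg1->id into reg0->id, and release scrubs every register whose id matches.

```c
/* Toy model of the proposal in this thread. NOT kernel code:
 * all types and model_* names below are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_REGS 4
#define MAX_REFS 4

enum reg_type { REG_UNKNOWN, REG_PTR_TO_SOCKET, REG_PTR_TO_TCP_SOCK };

struct reg_state {
    enum reg_type type;
    int id;                      /* one id, shared by pointers derived from one socket */
};

struct verifier_state {
    struct reg_state regs[MAX_REGS];
    int acquired_refs[MAX_REFS]; /* ids of references still held */
    int nr_refs;
    int next_id;
};

static void state_init(struct verifier_state *st)
{
    memset(st, 0, sizeof(*st));
    st->next_id = 1;             /* id 0 means "no id" */
}

/* Models bpf_sk_lookup_tcp(): dst gets a fresh id, recorded as acquired. */
static bool model_sk_lookup(struct verifier_state *st, int dst)
{
    int id;

    if (st->nr_refs == MAX_REFS)
        return false;
    id = st->next_id++;
    st->acquired_refs[st->nr_refs++] = id;
    st->regs[dst].type = REG_PTR_TO_SOCKET;
    st->regs[dst].id = id;
    return true;
}

/* Models bpf_tcp_sock() as a pure type cast: result inherits the argument's id. */
static bool model_tcp_sock(struct verifier_state *st, int dst, int src)
{
    if (st->regs[src].type == REG_UNKNOWN)
        return false;
    st->regs[dst].type = REG_PTR_TO_TCP_SOCK;
    st->regs[dst].id = st->regs[src].id;
    return true;
}

/* Models bpf_sk_release(): drop the reference, then scrub all regs with its id. */
static bool model_sk_release(struct verifier_state *st, int src)
{
    int id = st->regs[src].id;
    int i, found = -1;

    if (st->regs[src].type == REG_UNKNOWN)
        return false;
    for (i = 0; i < st->nr_refs; i++)
        if (st->acquired_refs[i] == id)
            found = i;
    if (found < 0)
        return false;            /* id is not an acquired reference */
    st->acquired_refs[found] = st->acquired_refs[--st->nr_refs];
    for (i = 0; i < MAX_REGS; i++) {
        if (st->regs[i].id == id) {
            st->regs[i].type = REG_UNKNOWN;
            st->regs[i].id = 0;
        }
    }
    return true;
}

/* A dereference is accepted only while the register is still a typed pointer. */
static bool model_can_deref(const struct verifier_state *st, int reg)
{
    return st->regs[reg].type != REG_UNKNOWN;
}

/* Replays the reproducer; returns 1 if the late tp access is rejected. */
int replay_reproducer(void)
{
    struct verifier_state st;

    state_init(&st);
    if (!model_sk_lookup(&st, 1))       /* sk = bpf_sk_lookup_tcp(); */
        return -1;
    if (!model_tcp_sock(&st, 2, 1))     /* tp = bpf_tcp_sock(sk); */
        return -1;
    if (!model_sk_release(&st, 1))      /* bpf_sk_release(sk); */
        return -1;
    return !model_can_deref(&st, 2);    /* tp->snd_cwnd is no longer allowed */
}

/* Releases through tp instead of sk; returns 1 if accepted and sk is scrubbed. */
int release_via_tp(void)
{
    struct verifier_state st;

    state_init(&st);
    if (!model_sk_lookup(&st, 1) || !model_tcp_sock(&st, 2, 1))
        return -1;
    return model_sk_release(&st, 2) && !model_can_deref(&st, 1) &&
           st.nr_refs == 0;
}
```

In this model, replay_reproducer() rejects the tp access after bpf_sk_release(sk), and release_via_tp() shows that releasing through tp is accepted because tp carries the id stored in acquired_refs. A register produced by something like bpf_get_listener_sock() would simply not inherit the id, so it would be neither releasable nor scrubbed.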