On Wed, Mar 06, 2019 at 03:59:40PM +0000, Lorenz Bauer wrote:
> On Mon, 4 Mar 2019 at 17:43, Martin Lau <ka...@fb.com> wrote:
> >
> > On Mon, Mar 04, 2019 at 10:33:46AM +0100, Daniel Borkmann wrote:
> > > On 03/02/2019 09:21 PM, Martin Lau wrote:
> > > > On Sat, Mar 02, 2019 at 10:03:03AM -0800, Alexei Starovoitov wrote:
> > > >> On Sat, Mar 02, 2019 at 08:10:10AM -0800, Martin KaFai Lau wrote:
> > > >>> Lorenz Bauer [thanks!] reported that a ptr returned by
> > > >>> bpf_tcp_sock(sk) can still be accessed after bpf_sk_release(sk).
> > > >>> Both bpf_tcp_sock() and bpf_sk_fullsock() have the same issue.
> > > >>> This patch addresses them together.
> > > >>>
> > > >>> A simple reproducer looks like this:
> > > >>>
> > > >>>     sk = bpf_sk_lookup_tcp();
> > > >>>     /* if (!sk) ... */
> > > >>>     tp = bpf_tcp_sock(sk);
> > > >>>     /* if (!tp) ... */
> > > >>>     bpf_sk_release(sk);
> > > >>>     snd_cwnd = tp->snd_cwnd; /* oops! The verifier does not complain. */
> > > >>>
> > > >>> The problem is the verifier did not scrub the register's states of
> > > >>> the tcp_sock ptr (tp) after bpf_sk_release(sk).
> > > >>>
> > > >>> [ Note that when calling bpf_tcp_sock(sk), the sk is not always
> > > >>>   refcount-acquired. e.g. bpf_tcp_sock(skb->sk). The verifier works
> > > >>>   fine for this case. ]
> > > >>>
> > > >>> Currently, the verifier does not track if a helper's return ptr (in
> > > >>> REG_0) is "carry"-ing one of its argument's refcount status. To carry
> > > >>> this info, the reg1->id needs to be stored in reg0. The reg0->id has
> > > >>> already been used for NULL checking purpose. Hence, a new
> > > >>> "refcount_id" is needed in "struct bpf_reg_state".
> > > >>>
> > > >>> With refcount_id, when bpf_sk_release(sk) is called, the verifier can
> > > >>> scrub all reg states which has a refcount_id match. It is done with
> > > >>> the changes in release_reg_references().
> > > >>>
> > > >>> When acquiring and releasing a refcount, the reg->id is still used.
> > > >>> Hence, we cannot do "bpf_sk_release(tp)" in the above reproducer
> > > >>> example.
> > > >>
> > > >> I think the choice of returning listener full sock from req sock
> > > >> in sk_to_full_sk() was a wrong one.
> > > >> It seems better to make semantics of bpf_tcp_sock() and
> > > >> bpf_sk_fullsock() as always type cast or null.
> > > >> And have a separate helper for req socket that returns
> > > >> inet_reqsk(sk)->rsk_listener.
> > > >>
> > > >> Then it will be ok to call bpf_sk_release(tp) when tp came from
> > > >> bpf_sk_lookup_tcp.
> > > >> The verifier will know that it's the case because its ID will be in
> > > >> acquired_refs.
> > > >>
> > > >> The additional refcount_id won't be necessary.
> > > >> bpf_sk_fullsock() and bpf_tcp_sock() will not call sk_to_full_sk
> > > >> and the verifier will be copying reg1->id into reg0->id.
> > > >>
> > > >> In release_reference() the verifier will do
> > > >>     if (regs[i].id == id)
> > > >>         mark_reg_unknown(env, regs, i);
> > > >> for all socket types.
> > > >>
> > > >> release_reference_state() will stay as-is.
> > > >>
> > > >> imo such logic will be easier to follow.
> > > >>
> > > >> This implicit sk_to_full_sk() makes the whole thing much harder for
> > > >> the verifier and for the bpf program writers.
> > > >>
> > > >> The new bpf_get_listener_sock(sk) doesn't have to copy ID from reg1 to
> > > >> reg0 since req socket will not be returned from bpf_sk_lookup_tcp and
> > > >> its ID will not be stored in acquired_refs.
> > > >>
> > > >> Does it make sense ?
> > > > I like this idea. Many thanks for thinking it through!
> > > >
> > > > Allowing bpf_sk_release(tp), no need to call bpf_sk_release() on ptr
> > > > returned from bpf_get_listener_sock(sk) and keep one reg->id.
> > > >
> > > > I think it should work. I will rework the patches.
> > >
> > > Agree, makes sense, that seems much better fix.
> > While I was working on this change, based on the code, one issue I saw is:
> >
> > if the bpf prog does this:
> >
> >     sk = bpf_sk_lookup_tcp();
> >     /* if (!sk) ... */
> >     fullsock = bpf_sk_fullsock(sk);
> >     if (!fullsock) {
> >         bpf_sk_release(sk); /* Fail. sk_reg->id not found in ref state */
> >         return 0;
> >     }
> >
> > The bpf_sk_release(sk) failed because the reference state has already
> > been released by "release_reference_state(state, fullsock_reg->id)" during
> > "if (!fullsock) /* handled by mark_ptr_or_null_regs(is_null == true) */"
> > Logically, I think bpf_sk_release(sk) should not fail regardless of
> > bpf_sk_fullsock() doing sk_to_full_sk() or not.
> >
> > bpf_sk_fullsock() could disallow PTR_TO_SOCKET or PTR_TO_TCP_SOCK but that
> > would be weird.
> >
> > I think we still need two id. May be rename the refcount_id proposed in
> > this patch to ref_obj_id which is the original refcounted object id.
> >
> > If the sk_to_full_sk() is removed from bpf_sk_fullsock() and bpf_tcp_sock(),
> > these two helpers become a simple cast (i.e. either return the same pointer
> > or NULL). Then bpf_sk_release(fullsock) and bpf_sk_release(tp) could work:
> >
> > - When is_null == true, release_reference_state(state, reg->id) is called.
>
> If I understand correctly, this works because we never
> acquire_reference() for tp/ fullsock,
> making this a no-op?

Sorry for the late reply.
Correct. Those two helpers do not take a ref, so release_reference_state()
will not be called.

> > - During bpf_sk_release(fullsock), release_reference(env, reg->ref_obj_id)
> >   is called so that sk, fullsock and tp with the same ref_obj_id will
> >   be mark_reg_unknown().
>
> To clarify, the following states are possible:
> * id == 0, ref_obj_id == 0: not a pointer / reference
> * id != 0, ref_obj_id == 0: a reference which didn't have
>   acquire_reference() called
> * id != 0, ref_obj_id != 0: a reference which had acquire_reference() called
> * id == 0, ref_obj_id != 0: illegal

In this two-id approach, I would think of it this way: id and ref_obj_id
serve two different purposes, one for NULL checking and one for reference
tracking. Once its purpose has been served, either one can be set to 0.

Regardless, I am working on another idea that does not require two ids
in bpf_reg_state. I will give an update on this.
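
To make the two-id idea easier to follow, here is a rough userspace sketch.
It is a toy model and not verifier code: every name in it (toy_reg,
toy_lookup, toy_cast, toy_null_check, toy_release) is made up for
illustration, and it assumes a single outstanding reference. It only tries
to show id being used for NULL checking while ref_obj_id ties a casted
pointer back to the refcounted object it came from.

```c
/* Toy model only: none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>

#define TOY_NREGS 4

enum toy_type { TOY_UNKNOWN, TOY_SOCK_OR_NULL, TOY_SOCK };

struct toy_reg {
	enum toy_type type;
	unsigned int id;         /* NULL-check tracking */
	unsigned int ref_obj_id; /* id of the refcounted object behind this ptr */
};

struct toy_state {
	struct toy_reg regs[TOY_NREGS];
	unsigned int acquired_ref; /* assumption: one outstanding ref, 0 if none */
	unsigned int next_id;
};

/* Models bpf_sk_lookup_tcp(): the only helper here that acquires a ref. */
static void toy_lookup(struct toy_state *st, int dst)
{
	unsigned int id = st->next_id++;

	st->regs[dst] = (struct toy_reg){ TOY_SOCK_OR_NULL, id, id };
	st->acquired_ref = id;
}

/* Models bpf_tcp_sock()/bpf_sk_fullsock() as a plain cast: no ref taken. */
static void toy_cast(struct toy_state *st, int dst, int src)
{
	assert(st->regs[src].type == TOY_SOCK);
	st->regs[dst] = (struct toy_reg){
		TOY_SOCK_OR_NULL, st->next_id++, st->regs[src].ref_obj_id
	};
}

/* Models the two branches of "if (!ptr)". */
static void toy_null_check(struct toy_state *st, int r, bool is_null)
{
	struct toy_reg *reg = &st->regs[r];

	assert(reg->type == TOY_SOCK_OR_NULL);
	if (is_null) {
		/* Only the pointer that acquired the ref drops it on NULL. */
		if (reg->id == st->acquired_ref)
			st->acquired_ref = 0;
		*reg = (struct toy_reg){ TOY_UNKNOWN, 0, 0 };
	} else {
		reg->type = TOY_SOCK;
		reg->id = 0; /* NULL check done, id has served its purpose */
	}
}

/* Models bpf_sk_release(): scrub every reg tied to the released object. */
static bool toy_release(struct toy_state *st, int r)
{
	unsigned int ref = st->regs[r].ref_obj_id;
	int i;

	if (st->regs[r].type != TOY_SOCK || ref == 0 || ref != st->acquired_ref)
		return false; /* no matching ref: the program would be rejected */

	st->acquired_ref = 0;
	for (i = 0; i < TOY_NREGS; i++)
		if (st->regs[i].ref_obj_id == ref)
			st->regs[i] = (struct toy_reg){ TOY_UNKNOWN, 0, 0 };
	return true;
}

/* Reproducer from the patch: is tp still readable after the release? */
bool toy_stale_tp_is_readable(void)
{
	struct toy_state st = { .next_id = 1 };

	toy_lookup(&st, 0);            /* sk = bpf_sk_lookup_tcp() */
	toy_null_check(&st, 0, false); /* sk != NULL */
	toy_cast(&st, 1, 0);           /* tp = bpf_tcp_sock(sk) */
	toy_null_check(&st, 1, false); /* tp != NULL */
	(void)toy_release(&st, 0);     /* bpf_sk_release(sk) */
	return st.regs[1].type == TOY_SOCK;
}

/* Second case: fullsock is NULL, then bpf_sk_release(sk) must still work. */
bool toy_release_after_null_cast_ok(void)
{
	struct toy_state st = { .next_id = 1 };

	toy_lookup(&st, 0);            /* sk = bpf_sk_lookup_tcp() */
	toy_null_check(&st, 0, false); /* sk != NULL */
	toy_cast(&st, 1, 0);           /* fullsock = bpf_sk_fullsock(sk) */
	toy_null_check(&st, 1, true);  /* if (!fullsock) branch */
	return toy_release(&st, 0);    /* bpf_sk_release(sk) */
}
```

In this model the first scenario leaves tp scrubbed after the release, and
the second still finds the reference when sk is released, because the cast
result carries its own id and so its NULL branch never touches the acquired
reference. Whether the real verifier should track it this way is exactly
what is still open above.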