On Tue, Nov 10, 2020 at 04:17:06PM -0800, Andrii Nakryiko wrote:
> On Tue, Nov 10, 2020 at 4:07 PM Martin KaFai Lau <ka...@fb.com> wrote:
> >
> > On Tue, Nov 10, 2020 at 03:53:13PM -0800, Andrii Nakryiko wrote:
> > > On Tue, Nov 10, 2020 at 3:43 PM Martin KaFai Lau <ka...@fb.com> wrote:
> > > >
> > > > On Tue, Nov 10, 2020 at 11:01:12PM +0100, KP Singh wrote:
> > > > > On Mon, Nov 9, 2020 at 9:32 PM John Fastabend
> > > > > <john.fastab...@gmail.com> wrote:
> > > > > >
> > > > > > Andrii Nakryiko wrote:
> > > > > > > On Fri, Nov 6, 2020 at 5:52 PM Martin KaFai Lau <ka...@fb.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Fri, Nov 06, 2020 at 05:14:14PM -0800, Andrii Nakryiko wrote:
> > > > > > > > > On Fri, Nov 6, 2020 at 2:08 PM Martin KaFai Lau
> > > > > > > > > <ka...@fb.com> wrote:
> > > > > > > > > >
> > > > > > > > > > This patch enables the FENTRY/FEXIT/RAW_TP tracing program
> > > > > > > > > > to use
> > > > > > > > > > the bpf_sk_storage_(get|delete) helper, so those tracing
> > > > > > > > > > programs
> > > > > > > > > > can access the sk's bpf_local_storage and the later selftest
> > > > > > > > > > will show some examples.
> > > > > > > > > >
> > > > > > > > > > The bpf_sk_storage is currently used in bpf-tcp-cc, tc,
> > > > > > > > > > cg sockops...etc which is running either in softirq or
> > > > > > > > > > task context.
> > > > > > > > > >
> > > > > > > > > > This patch adds bpf_sk_storage_get_tracing_proto and
> > > > > > > > > > bpf_sk_storage_delete_tracing_proto. They will check
> > > > > > > > > > in runtime that the helpers can only be called when serving
> > > > > > > > > > softirq or running in a task context. That should enable
> > > > > > > > > > most common tracing use cases on sk.
> > > > > > > > > >
> > > > > > > > > > During the load time, the new tracing_allowed() function
> > > > > > > > > > will ensure the tracing prog using the
> > > > > > > > > > bpf_sk_storage_(get|delete)
> > > > > > > > > > helper is not tracing any *sk_storage*() function itself.
> > > > > > > > > > The sk is passed as "void *" when calling into
> > > > > > > > > > bpf_local_storage.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Martin KaFai Lau <ka...@fb.com>
> > > > > > > > > > ---
> > > > > > > > > > include/net/bpf_sk_storage.h | 2 +
> > > > > > > > > > kernel/trace/bpf_trace.c | 5 +++
> > > > > > > > > > net/core/bpf_sk_storage.c | 73
> > > > > > > > > > ++++++++++++++++++++++++++++++++++++
> > > > > > > > > > 3 files changed, 80 insertions(+)
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > [...]
> > > > > > > > >
> > > > > > > > > > + switch (prog->expected_attach_type) {
> > > > > > > > > > + case BPF_TRACE_RAW_TP:
> > > > > > > > > > + /* bpf_sk_storage has no trace point */
> > > > > > > > > > + return true;
> > > > > > > > > > + case BPF_TRACE_FENTRY:
> > > > > > > > > > + case BPF_TRACE_FEXIT:
> > > > > > > > > > + btf_vmlinux = bpf_get_btf_vmlinux();
> > > > > > > > > > + btf_id = prog->aux->attach_btf_id;
> > > > > > > > > > + t = btf_type_by_id(btf_vmlinux, btf_id);
> > > > > > > > > > + tname = btf_name_by_offset(btf_vmlinux,
> > > > > > > > > > t->name_off);
> > > > > > > > > > + return !strstr(tname, "sk_storage");
> > > > > > > > >
> > > > > > > > > I'm always feeling uneasy about substring checks... Also, KP
> > > > > > > > > just
> > > > > > > > > fixed the issue with string-based checks for LSM. Can we use a
> > > > > > > > > BTF_ID_SET of blacklisted functions instead?
> > > > > > > > KP one is different. It accidentally whitelist-ed more than it
> > > > > > > > should.
> > > > > > > >
> > > > > > > > It is a blacklist here. It is actually cleaner and safer to
> > > > > > > > blacklist
> > > > > > > > all functions with "sk_storage" and too pessimistic is fine
> > > > > > > > here.
> > > > > > >
> > > > > > > Fine for whom? Prefix check would be half-bad, but substring
> > > > > > > check is
> > > > > > > horrible. Suddenly "task_storage" (and anything related) would be
> > > > > > > also
> > > > > > > blacklisted. Let's do a prefix check at least.
> > > > > > >
> > > > > >
> > > > > > Agree, prefix check sounds like a good idea. But, just doing a quick
> > > > > > grep seems like it will need at least bpf_sk_storage and sk_storage
> > > > > > to
> > > > > > catch everything.
> > > > >
> > > > > Is there any reason we are not using BTF ID sets and an allow list
> > > > > similar
> > > > > to bpf_d_path helper? (apart from the obvious inconvenience of
> > > > > needing to update the set in the kernel)
> > > > It is a blacklist here, a small recap from commit message.
> > > >
> > > > > During the load time, the new tracing_allowed() function
> > > > > will ensure the tracing prog using the bpf_sk_storage_(get|delete)
> > > > > helper is not tracing any *sk_storage*() function itself.
> > > > > The sk is passed as "void *" when calling into bpf_local_storage.
> > > >
> > > > Both BTF_ID and string-based (either prefix/substr) will work.
> > > >
> > > > The intention is to first disallow a tracing program from tracing
> > > > any function in bpf_sk_storage.c and also calling the
> > > > bpf_sk_storage_(get|delete) helper at the same time.
> > > > This blacklist can be revisited later if there would
> > > > be a use case in some of the blacklist-ed
> > > > functions (which I doubt).
> > > >
> > > > To use BTF_ID, it needs to consider about if the current (and future)
> > > > bpf_sk_storage function can be used in BTF_ID or not:
> > > > static, global/external, or inlined.
> > > >
> > > > If BTF_ID is the best way for doing all black/white list, I don't mind
> > > > either. I could force some to inline and we need to remember
> > > > to revisit the blacklist when the scope of fentry/fexit tracable
> > > > function changed, e.g. when static function becomes traceable
> > >
> > > You can consider static functions traceable already. Arnaldo landed a
> > > change a day or so ago in pahole that exposes static functions in BTF
> > > and makes it possible to fentry/fexit attach them.
> > Good to know.
> >
> > Is all static traceable (and can be used in BTF_ID)?
>
> Only those that end up not inlined, I think. Similarly as with
> kprobes. pahole actually checks mcount section to keep only those that
> are attachable with ftrace. See [0] for patches.
>
> [0]
> https://patchwork.kernel.org/project/netdevbpf/list/?series=379377&state=*
I will go with the prefix then to avoid tagging functions with
inline/noinline.