2018-11-20 15:26 UTC-0800 ~ Stanislav Fomichev <s...@fomichev.me> > On 11/20, Alexei Starovoitov wrote: >> On Wed, Nov 21, 2018 at 12:18:57AM +0100, Daniel Borkmann wrote: >>> On 11/21/2018 12:04 AM, Alexei Starovoitov wrote: >>>> On Tue, Nov 20, 2018 at 01:19:05PM -0800, Stanislav Fomichev wrote: >>>>> On 11/20, Alexei Starovoitov wrote: >>>>>> On Mon, Nov 19, 2018 at 04:46:25PM -0800, Stanislav Fomichev wrote: >>>>>>> [Recent commit 23499442c319 ("bpf: libbpf: retry map creation without >>>>>>> the name") fixed this issue for maps, let's do the same for programs.] >>>>>>> >>>>>>> Since commit 88cda1c9da02 ("bpf: libbpf: Provide basic API support >>>>>>> to specify BPF obj name"), libbpf unconditionally sets bpf_attr->name >>>>>>> for programs. Pre v4.14 kernels don't know about programs names and >>>>>>> return an error about unexpected non-zero data. Retry sys_bpf without >>>>>>> a program name to cover older kernels. >>>>>>> >>>>>>> Signed-off-by: Stanislav Fomichev <s...@google.com> >>>>>>> --- >>>>>>> tools/lib/bpf/bpf.c | 10 ++++++++++ >>>>>>> 1 file changed, 10 insertions(+) >>>>>>> >>>>>>> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c >>>>>>> index 961e1b9fc592..cbe9d757c646 100644 >>>>>>> --- a/tools/lib/bpf/bpf.c >>>>>>> +++ b/tools/lib/bpf/bpf.c >>>>>>> @@ -212,6 +212,16 @@ int bpf_load_program_xattr(const struct >>>>>>> bpf_load_program_attr *load_attr, >>>>>>> if (fd >= 0 || !log_buf || !log_buf_sz) >>>>>>> return fd; >>>>>>> >>>>>>> + if (fd < 0 && errno == E2BIG && load_attr->name) { >>>>>>> + /* Retry the same syscall, but without the name. >>>>>>> + * Pre v4.14 kernels don't support prog names. >>>>>>> + */ >>>>>> >>>>>> I'm afraid that will put unnecessary stress on the kernel. >>>>>> This check needs to be tighter. >>>>>> Like E2BIG and anything in the log_buf probably means that >>>>>> E2BIG came from the verifier and nothing to do with prog_name. >>>>>> Asking kernel to repeat is an unnecessary work. >>>>>> >>>>>> In general we need to think beyond this single prog_name field. >>>>>> There are bunch of other fields in bpf_load_program_xattr() and older >>>>>> kernels >>>>>> won't support them. Are we going to zero them out one by one >>>>>> and retry? I don't think that would be practical. >>>>> I general, we don't want to zero anything out. However, >>>>> for this particular problem the rationale is the following: >>>>> In commit 88cda1c9da02 we started unconditionally setting {prog,map}->name >>>>> from the 'higher' libbpfc layer which breaks users on the older kernels. >>>>> >>>>>> Also libbpf silently ignoring prog_name is not great for debugging. >>>>>> A warning is needed. >>>>>> But it cannot be done out of lib/bpf/bpf.c, since it's a set of syscall >>>>>> wrappers. >>>>>> Imo such "old kernel -> lets retry" feature should probably be done >>>>>> at lib/bpf/libbpf.c level. inside load_program(). >>>>> For maps bpftools calls bpf_create_map_xattr directly, that's why >>>>> for maps I did the retry on the lower level (and why for programs I >>>>> initially >>>>> thought about doing the same). However, in this case maybe asking >>>>> user to omit 'name' argument might be a better option. >>>>> >>>>> For program names, I agree, we might think about doing it on the higher >>>>> level (although I'm not sure whether we want to have different API >>>>> expectations, i.e. bpf_create_map_xattr ignoring the name and >>>>> bpf_load_program_xattr not ignoring the name). >>>>> >>>>> So given that rationale above, what do you think is the best way to >>>>> move forward? >>>>> 1. Same patch, but tighten the retry check inside bpf_load_program_xattr ? >>>>> 2. Move this retry logic into load_program and have different handling >>>>> for bpf_create_map_xattr vs bpf_load_program_xattr ? >>>>> 3. Do 2 and move the retry check for maps from bpf_create_map_xattr >>>>> into bpf_object__create_maps ? >>>>> >>>>> (I'm slightly leaning towards #3) >>>> >>>> me too. I think it's cleaner for maps to do it in >>>> bpf_object__create_maps(). >>>> Originally bpf.c was envisioned to be a thin layer on top of bpf syscall. >>>> Whereas 'smart bits' would go into libbpf.c >>> >>> Can't we create in bpf_object__load() a small helper >>> bpf_object__probe_caps() >>> which would figure this out _once_ upon start with a few things to probe for >>> availability in the underlying kernel for maps and programs? E.g. programs >>> it could try to inject a tiny 'r0 = 0; exit' snippet where we figure out >>> things like prog name support etc. Given underlying kernel doesn't change, >>> we >>> would only try this once and it doesn't require fallback every time. >> >> +1. great idea! > Sounds good, let me try to do it. > > It sounds more like a recent LPC proposal/idea to have some sys_bpf option > to query BPF features. This new bpf_object__probe_caps can probably query > that in the future if we eventually add support for it. >
Hi, LPC proposal indeed. I've been working on implementing this kind of probes in bpftool. I don't probe name support for now (but I can certainly add it), but I detect supported program types, map types, header functions, and a couple of other parameters. The idea (initially from Daniel) was to dump "#define" declarations that could later be included in a header file and used for a BPF project (or alternatively, JSON output). I felt like bpftool was maybe a better place to do it, as the set of probes may grow large (all types, helpers, etc). It might have consequences on the running system: for example, if I don't raise the rlimit in bpftool before starting the probes, a good half of them fail because of consecutive program creation and reclaim delay for locked memory usage. So should we start adding probes to libbpf and should I move mine to the lib as well, or should the one in the v3 of this series be directed to something like bpftool instead? Thanks, Quentin