On 08/11/18 19:42, Alexei Starovoitov wrote:
> same link let's continue at 1pm PST.

So, one thing we didn't really get onto was maps, and you mentioned that
it wasn't really clear what I was proposing there. What I have in mind
comes in two parts:

1) map type. A new BTF_KIND_MAP with metadata 'key_type', 'value_type'
   (both are type_ids referencing other BTF type records), describing
   the type "map from key_type to value_type".
2) record in the 'instances' table. This would have a name_off (the
   name of the map), a type_id (pointing at a BTF_KIND_MAP in the
   'types' table), and potentially also some indication of what symbol
   (from section 'maps') refers to this map.

This is pretty much the exact same metadata that a function in the
'instances' table has, the only differences being
(a) function's type_id points at a BTF_KIND_FUNC record
(b) function's symbol indication refers from .text section
(c) in future functions may be nested inside other functions, whereas
    AIUI a map can't live inside a function. (But a variable, which is
    the other thing that would want to go in an 'instances' table, can.)

So the 'instances' table record structure looks like

struct btf_instance {
	__u32 type_id;	/* Type of object declared.  An index into type section */
	__u32 name_off;	/* Name of object.  An offset into string section */
	__u32 parent;	/* Containing object if any (else 0).  An index into instance section */
};

and we extend the BTF header:

struct btf_header {
	__u16 magic;
	__u8 version;
	__u8 flags;
	__u32 hdr_len;

	/* All offsets are in bytes relative to the end of this header */
	__u32 type_off;	/* offset of type section */
	__u32 type_len;	/* length of type section */
	__u32 str_off;	/* offset of string section */
	__u32 str_len;	/* length of string section */
	__u32 inst_off;	/* offset of instance section */
	__u32 inst_len;	/* length of instance section */
};

Then in the .BTF.ext section, we have both

struct bpf_func_info {
	__u32 prog_symbol;	/* Index of symbol giving address of subprog */
	__u32 inst_id;		/* Index into instance section */
};

struct bpf_map_info {
	__u32 map_symbol;	/* Index of symbol creating this map */
	__u32 inst_id;		/* Index into instance section */
};

(either living in different subsections, or in a single table with the
addition of a kind field, or in a single table relying on the ultimately
referenced type to distinguish funcs from maps).

Note that the name (in btf_instance) of a map or function need not match
the name of the corresponding symbol; we use the .BTF.ext section to tie
together btf_instance IDs and symbol IDs. Then in the case of functions
(subprogs), the prog_symbol can be looked up in the ELF symbol table to
find the address (== insn_offset) of the subprog, as well as the section
containing it (since that might not be .text). Similarly in the case of
maps the BTF info about the map is connected with the info in the maps
section.

Now when the loader has munged this, what it passes to the kernel might
not have map_symbol, but instead map_fd. Instead of prog_symbol it will
have whatever identifies the subprog in the blob of stuff it feeds to
the kernel (so probably insn_offset).

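
To make the symbol-to-instance lookup above concrete, here is a rough
sketch of what a loader might do with these tables. It is only an
illustration under assumptions that are not part of the proposal: the
helper names inst_by_id() and map_name_for_symbol() are hypothetical,
fixed-width uint32_t stands in for __u32, and instance ID 0 is assumed
to be reserved (since parent == 0 means "no containing object"), so that
ID n refers to the n-th record of the instance section.

```c
#include <stddef.h>
#include <stdint.h>

/* Record layouts as proposed above, with uint32_t in place of __u32. */
struct btf_instance {
	uint32_t type_id;	/* index into type section */
	uint32_t name_off;	/* offset into string section */
	uint32_t parent;	/* containing instance, or 0 for none */
};

struct bpf_map_info {
	uint32_t map_symbol;	/* index of ELF symbol creating this map */
	uint32_t inst_id;	/* index into instance section */
};

/*
 * Hypothetical helper. Assumption: instance ID 0 is reserved, so a valid
 * ID n (1 <= n <= n_inst) refers to inst[n - 1].
 */
const struct btf_instance *
inst_by_id(const struct btf_instance *inst, size_t n_inst, uint32_t id)
{
	if (id == 0 || id > n_inst)
		return NULL;
	return &inst[id - 1];
}

/*
 * Hypothetical helper: given the ELF symbol index of a map, return the
 * BTF name of the corresponding instance, or NULL if there is no
 * bpf_map_info record for that symbol or the record is malformed.
 */
const char *
map_name_for_symbol(const struct bpf_map_info *ext, size_t n_ext,
		    const struct btf_instance *inst, size_t n_inst,
		    const char *strs, size_t str_len, uint32_t sym)
{
	for (size_t i = 0; i < n_ext; i++) {
		const struct btf_instance *in;

		if (ext[i].map_symbol != sym)
			continue;
		in = inst_by_id(inst, n_inst, ext[i].inst_id);
		if (!in || in->name_off >= str_len)
			return NULL;
		return strs + in->name_off;
	}
	return NULL;
}
```

The same walk would apply to bpf_func_info with prog_symbol in place of
map_symbol. Note that the name returned comes from the string section
via btf_instance, not from the ELF symbol, which is the decoupling
described above.
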
All this would of course require a bit more compiler support than the
current BPF_ANNOTATE_KV_PAIR, since that just causes the existing BTF
machinery to declare a specially constructed struct type. At the C
level you could still have BPF_ANNOTATE_KV_PAIR and the '____bpf_map_foo'
name, but then the compiler would recognise that and convert it into an
instance record by looking up the name 'foo' in its "maps" section. That
way the special ____bpf_map_* handling (which ties map names to symbol
names, also) would be entirely compiler-internal and not 'leak out' into
the definition of the format. Frontends for other languages which do
possess a native map type (e.g. Python dict) might have other ways of
indicating the key/value type of a map at source level (e.g. PEP 484)
and could directly generate the appropriate BTF_KIND_MAP and
bpf_map_info records rather than (as they would with the current design)
having to encode the information as a struct ____bpf_map_foo
type-definition.

While I realise the desire to concentrate on one topic at once, I think
this question of maps should be discussed in tomorrow's call, since it
is when we start having other kinds of instances besides functions that
the advantages of my design become apparent, unifying the process of
'declaration' of functions, maps, and (eventually) variables while
separating them all from the process of 'definition' of the types of all
three.

Thank you for your continued patience with me.

-Ed