On Fri, Oct 30, 2020 at 4:33 PM Song Liu <songliubrav...@fb.com> wrote: > > > > > On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <and...@kernel.org> wrote: > > > > From: Andrii Nakryiko <andr...@fb.com> > > > > Revamp BTF dedup's string deduplication to match the approach of writable > > BTF > > string management. This allows to transfer deduplicated strings index back > > to > > BTF object after deduplication without expensive extra memory copying and > > hash > > map re-construction. It also simplifies the code and speeds it up, because > > hashmap-based string deduplication is faster than sort + unique approach. > > > > Signed-off-by: Andrii Nakryiko <andr...@fb.com> > > LGTM, with a couple nitpick below: > > Acked-by: Song Liu <songliubrav...@fb.com> > > > --- > > tools/lib/bpf/btf.c | 265 +++++++++++++++++--------------------------- > > 1 file changed, 99 insertions(+), 166 deletions(-) > > > > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c > > index 89fecfe5cb2b..db9331fea672 100644 > > --- a/tools/lib/bpf/btf.c > > +++ b/tools/lib/bpf/btf.c > > @@ -90,6 +90,14 @@ struct btf { > > struct hashmap *strs_hash; > > /* whether strings are already deduplicated */ > > bool strs_deduped; > > + /* extra indirection layer to make strings hashmap work with stable > > + * string offsets and ability to transparently choose between > > + * btf->strs_data or btf_dedup->strs_data as a source of strings. > > + * This is used for BTF strings dedup to transfer deduplicated strings > > + * data back to struct btf without re-building strings index. > > + */ > > + void **strs_data_ptr; > > + > > /* BTF object FD, if loaded into kernel */ > > int fd; > > > > @@ -1363,17 +1371,19 @@ int btf__get_map_kv_tids(const struct btf *btf, > > const char *map_name, > > > > static size_t strs_hash_fn(const void *key, void *ctx) > > { > > - struct btf *btf = ctx; > > - const char *str = btf->strs_data + (long)key; > > + const char ***strs_data_ptr = ctx; > > + const char *strs = **strs_data_ptr; > > + const char *str = strs + (long)key; > > Can we keep using btf as the ctx here? "char ***" hurts my eyes... >
yep, changed to struct btf * > [...] > > > - d->btf->hdr->str_len = end - start; > > + /* replace BTF string data and hash with deduped ones */ > > + free(d->btf->strs_data); > > + hashmap__free(d->btf->strs_hash); > > + d->btf->strs_data = d->strs_data; > > + d->btf->strs_data_cap = d->strs_cap; > > + d->btf->hdr->str_len = d->strs_len; > > + d->btf->strs_hash = d->strs_hash; > > + /* now point strs_data_ptr back to btf->strs_data */ > > + d->btf->strs_data_ptr = &d->btf->strs_data; > > + > > + d->strs_data = d->strs_hash = NULL; > > + d->strs_len = d->strs_cap = 0; > > d->btf->strs_deduped = true; > > + return 0; > > + > > +err_out: > > + free(d->strs_data); > > + hashmap__free(d->strs_hash); > > + d->strs_data = d->strs_hash = NULL; > > + d->strs_len = d->strs_cap = 0; > > + > > + /* restore strings pointer for existing d->btf->strs_hash back */ > > + d->btf->strs_data_ptr = &d->strs_data; > > We have quite some duplicated code in err_out vs. succeed_out cases. > How about we add a helper function, like nope, that won't work, free(d->strs_data) vs free(d->btf->strs_data), same for hashmap__free(), plus there are strict requirements about the exact sequence of assignments in success case > > void free_strs_data(struct btf_dedup *d) > { > free(d->strs_data); > hashmap__free(d->strs_hash); > d->strs_data = d->strs_hash = NULL; > d->strs_len = d->strs_cap = 0; > } > > ?