Pierrick Bouvier <pierrick.bouv...@linaro.org> writes: > On 1/26/24 19:14, Alex Bennée wrote: >> Pierrick Bouvier <pierrick.bouv...@linaro.org> writes: >> >>> We introduce a cpu local storage, automatically managed (and extended) >>> by QEMU itself. Plugin allocate a scoreboard, and don't have to deal >>> with how many cpus are launched. >>> >>> This API will be used by new inline functions but callbacks can benefit >>> from this as well. This way, they can operate without a global lock for >>> simple operations. >>> >>> New functions: >>> - qemu_plugin_scoreboard_free >>> - qemu_plugin_scoreboard_get >>> - qemu_plugin_scoreboard_new >>> - qemu_plugin_scoreboard_size >>> >>> In more, we define a qemu_plugin_u64_t, which is a simple struct holding >>> a pointer to a scoreboard, and a given offset. >>> This allows to have a scoreboard containing structs, without having to >>> bring offset for all operations on a specific field. >>> >>> Since most of the plugins are simply collecting a sum of per-cpu values, >>> qemu_plugin_u64_t directly support this operation as well. >>> >>> New functions: >>> - qemu_plugin_u64_get >>> - qemu_plugin_u64_sum >>> New macros: >>> - qemu_plugin_u64 >>> - qemu_plugin_u64_struct >>> >>> Signed-off-by: Pierrick Bouvier <pierrick.bouv...@linaro.org> >>> --- >>> include/qemu/plugin.h | 7 +++ >>> include/qemu/qemu-plugin.h | 75 ++++++++++++++++++++++++++++ >>> plugins/api.c | 39 +++++++++++++++ >>> plugins/core.c | 97 ++++++++++++++++++++++++++++++++++++ >>> plugins/plugin.h | 8 +++ >>> plugins/qemu-plugins.symbols | 6 +++ >>> 6 files changed, 232 insertions(+) >>> >>> diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h >>> index 9346249145d..5f340192e56 100644 >>> --- a/include/qemu/plugin.h >>> +++ b/include/qemu/plugin.h >>> @@ -115,6 +115,13 @@ struct qemu_plugin_insn { >>> bool mem_only; >>> }; >>> +/* A scoreboard is an array of values, indexed by vcpu_index */ >>> +struct qemu_plugin_scoreboard { >>> + GArray *data; >>> + size_t size; >> Can we get size from GArray->len itself? >> > > GArray->len matches the allocated size, which is different from > "semantic" size. I'll answer to why it was implemented this way in a > next comment.
Does it? The documentation states: guint len; the number of elements in the GArray not including the possible terminating zero element. I think the actual allocated size is hidden from you by glib as an implementation detail. >>> + size_t element_size; >>> +}; >>> + >>> /* >>> * qemu_plugin_insn allocate and cleanup functions. We don't expect to >>> * cleanup many of these structures. They are reused for each fresh >>> diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h >>> index 2c1930e7e45..934059d64c2 100644 >>> --- a/include/qemu/qemu-plugin.h >>> +++ b/include/qemu/qemu-plugin.h >>> @@ -220,6 +220,23 @@ void >>> qemu_plugin_register_vcpu_resume_cb(qemu_plugin_id_t id, >>> struct qemu_plugin_tb; >>> /** struct qemu_plugin_insn - Opaque handle for a translated instruction >>> */ >>> struct qemu_plugin_insn; >>> +/** >>> + * struct qemu_plugin_scoreboard - Opaque handle for a scoreboard >>> + * >>> + * A scoreboard is an array of data, indexed by vcpu_index. >>> + **/ >> stray *'s - I think this is what trips up kdoc. >> > > Thanks, fixed that in a new version. I now build with documentation > enabled, so should not happen again, sorry. > >>> +struct qemu_plugin_scoreboard; >>> + >>> +/** >>> + * qemu_plugin_u64_t - uint64_t member of an entry in a scoreboard >> We generally reserve lower_case_with_underscores_ending_with_a_t for >> scalar types. Its also a little generic. Maybe qemu_plugin_scoreboard_ref? >> > > For _t suffix, I hesitated on this, and followed the qemu_info_t > struct example in the same header. We are nothing if not inconsistent in the code base ;-) There are a number of _t types that in retrospect shouldn't have been. > In the beginning, I picked qemu_plugin_scoreboard_u64, but realized > the name was far too long. qemu_plugin_u64 seemed like the right spot > between length/meaning. > I would like to defend the u64 meaning, because it was explicitely > made to express u64 type, instead of a generic scoreboard entry/ref. > > In more, based on other comments made, maybe it was not clear that a > qemu_plugin_u64 does not necessarily point to a "whole" entry in > scoreboard, but it can target a struct member (u64) in a given entry > as well. (i.e. scoreboard[i].member). I'll give a more detailed answer > below in this message. If we just return values rather than pointers we can stick with simpler types. > This way, scoreboard can be composed of struct values, and a > qemu_plugin_u64 allows to work on a specific member of that struct > without having to manipulate offset everywhere. Thus the get and sum > operation on this type. I would have to see an example of why giving this indirection to the plugin makes things easier. > > Does it makes more sense to you given this? > >>> + * >>> + * This field allows to access a specific uint64_t member in one given >>> entry, >>> + * located at a specified offset. Inline operations expect this as entry. >>> + */ >>> +typedef struct qemu_plugin_u64 { >> > I don't think you need the forward declaration here (which clashes later >> on). >> > > Initially wrote this without a typedef, and left this afterwards. Will > remove it, thanks! > >>> + struct qemu_plugin_scoreboard *score; >>> + size_t offset; >>> +} qemu_plugin_u64_t; >>> /** >>> * enum qemu_plugin_cb_flags - type of callback >>> @@ -754,5 +771,63 @@ int qemu_plugin_read_register(unsigned int vcpu, >>> struct qemu_plugin_register *handle, >>> GByteArray *buf); >>> +/** >>> + * qemu_plugin_scoreboard_new() - alloc a new scoreboard >>> + * >>> + * Returns a pointer to a new scoreboard. It must be freed using >>> + * qemu_plugin_scoreboard_free. >>> + */ >>> +QEMU_PLUGIN_API >>> +struct qemu_plugin_scoreboard *qemu_plugin_scoreboard_new(size_t >>> element_size); >>> + >>> +/** >>> + * qemu_plugin_scoreboard_free() - free a scoreboard >>> + * @score: scoreboard to free >>> + */ >>> +QEMU_PLUGIN_API >>> +void qemu_plugin_scoreboard_free(struct qemu_plugin_scoreboard *score); >>> + >>> +/** >>> + * qemu_plugin_scoreboard_size() - return size of a scoreboard >>> + * @score: scoreboard to query >>> + */ >>> +QEMU_PLUGIN_API >>> +size_t qemu_plugin_scoreboard_size(struct qemu_plugin_scoreboard >>> *score); >> Is this actually host memory size related? The use case in the >> plugins >> seems to be a proxy for num_cpus and I'm note sure we want it used that >> way. We are making the contract that it will always be big enough for >> the number of vcpus created - maybe that is the helper we need? >> > > The current API has a limitation in user mode, where you can't know > how many cpus were run overall. The point of scoreboard was to > abstract this away, by removing the need for plugins to track this > (and use lock, manual memory allocation, and glue code in general). Agreed. > I thought it was more clear to write: > for (int i = 0; i < qemu_plugin_scoreboard_size(s); ++i) { > void *val = qemu_plugin_scoreboard_get(s, i); > } > than: > for (int i = 0; i < qemu_plugin_max_cpus(s); ++i) { > void *val = qemu_plugin_scoreboard_get(s, i); > } > > We can always replace qemu_plugin_scoreboard_size by > qemu_plugin_max_cpus (or another name you prefer), which would return > the max number of cpus that were used during all the execution, if > that works for you. I think that works, do we know it even if we haven't allocated any scoreboards? > >>> + >>> +/** >>> + * qemu_plugin_scoreboard_get() - access value from a scoreboard >>> + * @score: scoreboard to query >>> + * @vcpu_index: entry index >>> + * >>> + * Returns address of entry of a scoreboard matching a given vcpu_index. >>> This >>> + * address can be modified later if scoreboard is resized. >>> + */ >>> +QEMU_PLUGIN_API >>> +void *qemu_plugin_scoreboard_get(struct qemu_plugin_scoreboard *score, >>> + unsigned int vcpu_index); >>> + >>> +/* Macros to define a qemu_plugin_u64_t */ >>> +#define qemu_plugin_u64(score) \ >>> + (qemu_plugin_u64_t){score, 0} >>> +#define qemu_plugin_u64_struct(score, type, member) \ >>> + (qemu_plugin_u64_t){score, offsetof(type, member)} >> qemu_plugin_val_ref and qemu_plugin_struct_ref? >> > > Same comment as before, it would be more clear to keep the u64 in the > name if possible, so we know which data type we manipulate. > >>> + >>> +/** >>> + * qemu_plugin_u64_get() - access specific uint64_t in a scoreboard >>> + * @entry: entry to query >>> + * @vcpu_index: entry index >>> + * >>> + * Returns address of a specific member in a scoreboard entry, matching a >>> given >>> + * vcpu_index. >>> + */ >>> +QEMU_PLUGIN_API >>> +uint64_t *qemu_plugin_u64_get(qemu_plugin_u64_t entry, unsigned int >>> vcpu_index); >>> + >>> +/** >>> + * qemu_plugin_u64_sum() - return sum of all values in a scoreboard >>> + * @entry: entry to sum >>> + */ >>> +QEMU_PLUGIN_API >>> +uint64_t qemu_plugin_u64_sum(qemu_plugin_u64_t entry); >>> #endif /* QEMU_QEMU_PLUGIN_H */ >>> diff --git a/plugins/api.c b/plugins/api.c >>> index e777eb4e9fc..4de94e798c6 100644 >>> --- a/plugins/api.c >>> +++ b/plugins/api.c >>> @@ -547,3 +547,42 @@ static void __attribute__((__constructor__)) >>> qemu_api_init(void) >>> qemu_mutex_init(®_handle_lock); >>> } >>> + >>> +struct qemu_plugin_scoreboard *qemu_plugin_scoreboard_new(size_t >>> element_size) >>> +{ >>> + return plugin_scoreboard_new(element_size); >>> +} >>> + >>> +void qemu_plugin_scoreboard_free(struct qemu_plugin_scoreboard *score) >>> +{ >>> + plugin_scoreboard_free(score); >>> +} >>> + >>> +size_t qemu_plugin_scoreboard_size(struct qemu_plugin_scoreboard *score) >>> +{ >>> + return score->size; >> So this would be score->data->len >> >>> +} >>> + >>> +void *qemu_plugin_scoreboard_get(struct qemu_plugin_scoreboard *score, >>> + unsigned int vcpu_index) >>> +{ >>> + g_assert(vcpu_index < qemu_plugin_scoreboard_size(score)); >>> + char *ptr = score->data->data; >>> + return ptr + vcpu_index * score->element_size; >> GArray has accesses functions for this: >> return &g_array_index(score->data, >> g_array_get_element_size(score->data), >> vcpu_index); >> > > Yes, I'll update this. > >>> +} >>> + >>> +uint64_t *qemu_plugin_u64_get(qemu_plugin_u64_t entry, >>> + unsigned int vcpu_index) >>> +{ >>> + char *ptr = (char *) qemu_plugin_scoreboard_get(entry.score, >>> vcpu_index); >>> + return (uint64_t *)(ptr + entry.offset); >> Not sure whats going with the casting here but I wonder if we are >> giving >> the user too much rope. Why not to return the value itself? In fact why >> do we treat things as an offset rather than an index of uint64_t. >> The qemu_plugin_scoreboard_get can return uint64_t * and the >> individual >> get becomes: >> uint64_t *qemu_plugin_scoreboard_get(struct >> qemu_plugin_scoreboard *score, >> unsigned int vcpu_index) >> { >> g_assert(vcpu_index < qemu_plugin_scoreboard_size(score)); >> return &g_array_index(score->data, score->element_size, vcpu_index); >> } >> uint64_t qemu_plugin_u64_get(struct qemu_plugin_scoreboard >> *score, >> unsigned int vcpu_index, unsigned int >> score_index) >> { >> g_assert(score_index < score->num_elements); >> uint64_t *sb = qemu_plugin_scoreboard_get(score, vcpu_index); >> return sb[score_index]; >> } >> > > As said above, maybe it was not clear in the commit (though expressed > in commit message) what was a distinction between a scoreboard, and a > plugin_u64. > > The latter is an abstraction of a specific member in one scoreboard > entry, while a scoreboard holds any type. > > Let me write a concrete example, which will be clearer: > > typedef struct { > char data1; > bool data2; > void* data3; > enum ...; > uint64_t counter; > } CPUData; > > score = qemu_plugin_scoreboard_new(sizeof(CPUData)); > qemu_plugin_u64_t counter = > qemu_plugin_u64_struct(score, CPUData, counter)); > > From there: > qemu_plugin_u64_get(counter, 2) returns &score[2].counter. > qemu_plugin_u64_sum(counter) returns > sum(i: [0:max_cpus[, score[i].counter). > > With this, a plugin can simply have an abstract representation of cpu > local value, than can be get or set (since we return a pointer with > get) and easily summed, which is what most of the plugins want to do > in the end (at least from what I observed in existing ones). > > If we stick to a void* scoreboard, we end up writing the sum function > in every plugin. In more, every API function "inline" operating on a > scoreboard would need the offsetof(CPUData, counter) passed in > parameter, which is error prone and repetitive. > > With this explaination, do you see more the value to have a scoreboard > *and* a plugin_u64 struct to access it? > >>> +} >>> + >>> +uint64_t qemu_plugin_u64_sum(qemu_plugin_u64_t entry) >>> +{ >>> + uint64_t total = 0; >>> + for (int i = 0; i < qemu_plugin_scoreboard_size(entry.score); ++i) { >>> + total += *qemu_plugin_u64_get(entry, i); >>> + } >>> + return total; >>> +} >>> diff --git a/plugins/core.c b/plugins/core.c >>> index e07b9cdf229..0286a127810 100644 >>> --- a/plugins/core.c >>> +++ b/plugins/core.c >>> @@ -209,6 +209,71 @@ plugin_register_cb_udata(qemu_plugin_id_t id, enum >>> qemu_plugin_event ev, >>> do_plugin_register_cb(id, ev, func, udata); >>> } >>> +struct resize_scoreboard_args { >>> + size_t new_alloc_size; >>> + size_t new_size; >>> +}; >>> + >>> +static void plugin_resize_one_scoreboard(gpointer key, >>> + gpointer value, >>> + gpointer user_data) >>> +{ >>> + struct qemu_plugin_scoreboard *score = >>> + (struct qemu_plugin_scoreboard *) value; >>> + struct resize_scoreboard_args *args = >>> + (struct resize_scoreboard_args *) user_data; >>> + if (score->data->len != args->new_alloc_size) { >>> + g_array_set_size(score->data, args->new_alloc_size); >>> + } >>> + score->size = args->new_size; >> I think you have this in len so if you skip the extra field you can >> just >> do: >> new_size = GPOINTER_TO_UINT(user_data); >> > > If we can avoid managing allocated vs real size, yes. > >>> +} >>> + >>> +static void plugin_grow_scoreboards__locked(CPUState *cpu) >>> +{ >>> + if (cpu->cpu_index < plugin.scoreboard_size) { >>> + return; >>> + } >>> + >>> + bool need_realloc = FALSE; >>> + while (cpu->cpu_index >= plugin.scoreboard_alloc_size) { >>> + plugin.scoreboard_alloc_size *= 2; >> I suspect for USER this makes sense, if we every end up doing system >> expansion then it would just be +1 at a time. In fact given glib does: >> gsize want_alloc = g_nearest_pow (g_array_elt_len (array, >> want_len)); >> maybe we just do a simple increment and leave it up glib to handle >> the >> exponential growth? >> > > Initially, I wanted to go this way, but it seems like glib does not > offer alloc_size information about a GArray. I looked for this in > documentation and doc, and maybe missed it. > > If you have an idea to detect this, I would love to just use it, and > it would made the code much more simple, avoiding the need to keep > track of two sizes. > > The problem we try to solve is to detect when the pointer will be > reallocated, with the consequence that all tb must be flushed and > retranslated to use the new one instead. > >>> + need_realloc = TRUE; >>> + } >>> + plugin.scoreboard_size = cpu->cpu_index + 1; >>> + g_assert(plugin.scoreboard_size <= plugin.scoreboard_alloc_size); >>> + >>> + if (g_hash_table_size(plugin.scoreboards) == 0) { >>> + /* nothing to do, we just updated sizes for future scoreboards */ >>> + return; >>> + } >>> + >>> + if (need_realloc) { >>> +#ifdef CONFIG_USER_ONLY >>> + /** >>> + * cpus must be stopped, as some tb might still use an existing >>> + * scoreboard. >>> + */ >>> + start_exclusive(); >>> +#endif >> Hmm this seems wrong to be USER_ONLY. While we don't expect to >> resize in >> system mode if we did we certainly want to do it during exclusive >> periods. >> > > I agree. Initially did this for both (system and user), but in system, > we hit a segfault. > > $ ./build/qemu-system-x86_64 -plugin ./build/tests/plugin/libinline.so > -smp 17 # > 16 will trigger a reallocation of scoreboard > ../cpu-common.c:196:20: runtime error: member access within null > pointer of type 'struct CPUState' > > It seems like something is not set yet at this time (current_cpu TLS > var), which result in this problem. Maybe this code simply reveals an > existing problem, what do you think? > >>> + } >>> + >>> + struct resize_scoreboard_args args = { >>> + .new_alloc_size = plugin.scoreboard_alloc_size, >>> + .new_size = plugin.scoreboard_size >>> + }; >>> + g_hash_table_foreach(plugin.scoreboards, >>> + &plugin_resize_one_scoreboard, >>> + &args); >> This can just be g_hash_table_foreach(plugin.scoreboards, >> &plugin_resize_one_scoreboard, >> >> GUINT_TO_POINTER(plugin.scoreboard_alloc_size)) >> > > If we use a single size, yes. > >>> + >>> + if (need_realloc) { >>> + /* force all tb to be flushed, as scoreboard pointers were >>> changed. */ >>> + tb_flush(cpu); >>> +#ifdef CONFIG_USER_ONLY >>> + end_exclusive(); >>> +#endif >>> + } >>> +} >>> + >>> void qemu_plugin_vcpu_init_hook(CPUState *cpu) >>> { >>> bool success; >>> @@ -218,6 +283,7 @@ void qemu_plugin_vcpu_init_hook(CPUState *cpu) >>> success = g_hash_table_insert(plugin.cpu_ht, &cpu->cpu_index, >>> &cpu->cpu_index); >>> g_assert(success); >>> + plugin_grow_scoreboards__locked(cpu); >>> qemu_rec_mutex_unlock(&plugin.lock); >>> plugin_vcpu_cb__simple(cpu, QEMU_PLUGIN_EV_VCPU_INIT); >>> @@ -576,8 +642,39 @@ static void __attribute__((__constructor__)) >>> plugin_init(void) >>> qemu_rec_mutex_init(&plugin.lock); >>> plugin.id_ht = g_hash_table_new(g_int64_hash, g_int64_equal); >>> plugin.cpu_ht = g_hash_table_new(g_int_hash, g_int_equal); >>> + plugin.scoreboards = g_hash_table_new(g_int64_hash, g_int64_equal); >>> + plugin.scoreboard_size = 1; >>> + plugin.scoreboard_alloc_size = 16; /* avoid frequent reallocation */ >>> QTAILQ_INIT(&plugin.ctxs); >>> qht_init(&plugin.dyn_cb_arr_ht, plugin_dyn_cb_arr_cmp, 16, >>> QHT_MODE_AUTO_RESIZE); >>> atexit(qemu_plugin_atexit_cb); >>> } >>> + >>> +struct qemu_plugin_scoreboard *plugin_scoreboard_new(size_t element_size) >>> +{ >>> + struct qemu_plugin_scoreboard *score = >>> + g_malloc0(sizeof(struct qemu_plugin_scoreboard)); >>> + score->data = g_array_new(FALSE, TRUE, element_size); >>> + g_array_set_size(score->data, plugin.scoreboard_alloc_size); >>> + score->size = plugin.scoreboard_size; >>> + score->element_size = element_size; >>> + >>> + qemu_rec_mutex_lock(&plugin.lock); >>> + bool inserted = g_hash_table_insert(plugin.scoreboards, score, score); >>> + g_assert(inserted); >>> + qemu_rec_mutex_unlock(&plugin.lock); >>> + >>> + return score; >>> +} >>> + >>> +void plugin_scoreboard_free(struct qemu_plugin_scoreboard *score) >>> +{ >>> + qemu_rec_mutex_lock(&plugin.lock); >>> + bool removed = g_hash_table_remove(plugin.scoreboards, score); >>> + g_assert(removed); >>> + qemu_rec_mutex_unlock(&plugin.lock); >>> + >>> + g_array_free(score->data, TRUE); >>> + g_free(score); >>> +} >>> diff --git a/plugins/plugin.h b/plugins/plugin.h >>> index 2c278379b70..99829c40886 100644 >>> --- a/plugins/plugin.h >>> +++ b/plugins/plugin.h >>> @@ -31,6 +31,10 @@ struct qemu_plugin_state { >>> * but with the HT we avoid adding a field to CPUState. >>> */ >>> GHashTable *cpu_ht; >>> + /* Scoreboards, indexed by their addresses. */ >>> + GHashTable *scoreboards; >>> + size_t scoreboard_size; >>> + size_t scoreboard_alloc_size; >>> DECLARE_BITMAP(mask, QEMU_PLUGIN_EV_MAX); >>> /* >>> * @lock protects the struct as well as ctx->uninstalling. >>> @@ -99,4 +103,8 @@ void plugin_register_vcpu_mem_cb(GArray **arr, >>> void exec_inline_op(struct qemu_plugin_dyn_cb *cb, int >>> cpu_index); >>> +struct qemu_plugin_scoreboard *plugin_scoreboard_new(size_t >>> element_size); >>> + >>> +void plugin_scoreboard_free(struct qemu_plugin_scoreboard *score); >>> + >>> #endif /* PLUGIN_H */ >>> diff --git a/plugins/qemu-plugins.symbols b/plugins/qemu-plugins.symbols >>> index 6963585c1ea..93866d1b5f2 100644 >>> --- a/plugins/qemu-plugins.symbols >>> +++ b/plugins/qemu-plugins.symbols >>> @@ -38,10 +38,16 @@ >>> qemu_plugin_register_vcpu_tb_exec_inline; >>> qemu_plugin_register_vcpu_tb_trans_cb; >>> qemu_plugin_reset; >>> + qemu_plugin_scoreboard_free; >>> + qemu_plugin_scoreboard_get; >>> + qemu_plugin_scoreboard_new; >>> + qemu_plugin_scoreboard_size; >>> qemu_plugin_start_code; >>> qemu_plugin_tb_get_insn; >>> qemu_plugin_tb_n_insns; >>> qemu_plugin_tb_vaddr; >>> + qemu_plugin_u64_get; >>> + qemu_plugin_u64_sum; >>> qemu_plugin_uninstall; >>> qemu_plugin_vcpu_for_each; >>> }; >> > > Thanks for your complete and detailed review! -- Alex Bennée Virtualisation Tech Lead @ Linaro