On Thu, Aug 13, 2015 at 8:26 AM, Ilya Maximets <i.maxim...@samsung.com> wrote:
> Ok, thanks for the explanation.
> In that case, I think, the try-lock solution will be good.
>
> Best regards, Ilya Maximets.

Thx, Ilya, for the suggestions and discussion.  I posted a patch here:
http://openvswitch.org/pipermail/dev/2015-August/058855.html

> On 12.08.2015 23:52, Alex Wang wrote:
> > Hey Ilya,
> >
> > Thanks for the reply, please see my comments inline.
> >
> > Thanks,
> > Alex Wang
> >
> > On Wed, Aug 12, 2015 at 4:19 AM, Ilya Maximets <i.maxim...@samsung.com> wrote:
> > > I agree that this solution is much simpler, but I don't like this
> > > race for 'coverage_mutex'.  Why is this all done this way?
> >
> > Before implementing pmd threads, waiting on the 'coverage_mutex' was
> > affordable for all other threads.
> >
> > > I mean, why should we run coverage_clear() every second?
> >
> > Because we (the main thread) need to calculate the per
> > second/minute/hour rates.
> >
> > > What if we change the type of thread_local unsigned int
> > > counter_##COUNTER from unsigned int to unsigned long long and call
> > > coverage_clear() in coverage_read()?  (This may be possible with my
> > > patch applied.)  Also, we could collect stats for all threads
> > > separately.
> >
> > There is one major issue with your change to 'COVERAGE_DEFINE()':
> >
> > > /* Defines COUNTER.  There must be exactly one such definition at file scope
> > >  * within a program.
> > >  */
> > > #define COVERAGE_DEFINE(COUNTER)                                  \
> > >         DEFINE_STATIC_PER_THREAD_DATA(unsigned int,               \
> > >                                       counter_##COUNTER, 0);      \
> > > -        static unsigned int COUNTER##_count(void)                 \
> > > +        static unsigned int COUNTER##_count(unsigned int i)       \
> > >         {                                                         \
> > > -            unsigned int *countp = counter_##COUNTER##_get();     \
> > > +            volatile unsigned int *countp = coverage_val[i];      \
> > >             unsigned int count = *countp;                          \
> > >             *countp = 0;                                          \
> > >             return count;                                         \
> > > @@ -72,11 +77,17 @@ void coverage_counter_register(struct coverage_counter*);
> > >         {                                                         \
> > >             *counter_##COUNTER##_get() += n;                      \
> > >         }                                                         \
> > > -        extern struct coverage_counter counter_##COUNTER;         \
> > > -        struct coverage_counter counter_##COUNTER                 \
> > > +        extern thread_local struct coverage_counter counter_##COUNTER; \
> > > +        thread_local struct coverage_counter counter_##COUNTER    \
> > >             = { #COUNTER, COUNTER##_count, 0, 0, {0}, {0} };
> >
> > Here a minor issue is that 'thread_local' is not portable.  You need
> > to define it using the OVS wrappers documented in lib/ovs-thread.h.
> >
> > > -        OVS_CONSTRUCTOR(COUNTER##_init) {                         \
> > > +        static void COUNTER##_initializer(void) {                 \
> > >             coverage_counter_register(&counter_##COUNTER);         \
> > > +            coverage_val[n_coverage_counters - 1] =               \
> > > +                                       counter_##COUNTER##_get(); \
> > > +        }                                                         \
> > > +        OVS_CONSTRUCTOR(COUNTER##_init) {                         \
> > > +            coverage_initializer_register(&COUNTER##_initializer); \
> > > +            COUNTER##_initializer();                              \
> > >         }
> >
> > We really should not have the main thread accessing the per-thread
> > data of another thread.  As mentioned in the comments in
> > lib/ovs-thread.h:
> >
> > """
> >  * - The thread_local feature newly defined in C11 <threads.h> works with
> >  *   any data type and initializer, and it is fast.  thread_local does not
> >  *   require once-only initialization like pthread_key_t.
> >  *   C11 does not
> >  *   define what happens if one attempts to access a thread_local object
> >  *   from a thread other than the one to which that object belongs.  There
> >  *   is no provision to call a user-specified destructor when a thread
> >  *   ends.  Typical implementations allow for an arbitrary amount of
> >  *   thread_local storage, but statically allocated only.
> > """
> >
> > We really should not play with any undefined behavior (since we have
> > no way to guarantee portability).  In other words, we really should
> > not use thread-local variables if we ever want to re-architect the
> > coverage module to have the main thread take care of collecting the
> > stats.
> >
> > So, we think the try-lock solution is good enough for now.
> >
> > What do you think?
> >
> > Thanks,
> > Alex Wang
> >
> > > On 12.08.2015 08:01, Alex Wang wrote:
> > > > Just trying to share an alternative: could we still do the
> > > > coverage clear inside 'pmd_thread_main()'?  However, instead of
> > > > contending for the 'coverage_mutex', use 'ovs_mutex_trylock()'.
> > > > Since a pmd thread runs in an infinite loop, it can constantly try
> > > > grabbing the lock and will eventually acquire it.
> > > >
> > > > Something like this:
> > > >
> > > > diff --git a/lib/coverage.c b/lib/coverage.c
> > > > index 6121956..9f23f99 100644
> > > > --- a/lib/coverage.c
> > > > +++ b/lib/coverage.c
> > > > @@ -272,6 +272,37 @@ coverage_clear(void)
> > > >      }
> > > >  }
> > > >
> > > > +/* Runs approximately every COVERAGE_CLEAR_INTERVAL amount of time to
> > > > + * synchronize per-thread counters with global counters.  Exits immediately
> > > > + * without updating the 'thread_time' if it cannot acquire the
> > > > + * 'coverage_mutex'.  This is to avoid lock contention for some
> > > > + * performance-critical threads.
> > > > + */
> > > > +void
> > > > +coverage_try_clear(void)
> > > > +{
> > > > +    long long int now, *thread_time;
> > > > +
> > > > +    now = time_msec();
> > > > +    thread_time = coverage_clear_time_get();
> > > > +
> > > > +    /* Initialize the coverage_clear_time. */
> > > > +    if (*thread_time == LLONG_MIN) {
> > > > +        *thread_time = now + COVERAGE_CLEAR_INTERVAL;
> > > > +    }
> > > > +
> > > > +    if (now >= *thread_time) {
> > > > +        if (!ovs_mutex_trylock(&coverage_mutex)) {
> > > > +            size_t i;
> > > > +
> > > > +            for (i = 0; i < n_coverage_counters; i++) {
> > > > +                struct coverage_counter *c = coverage_counters[i];
> > > > +
> > > > +                c->total += c->count();
> > > > +            }
> > > > +            ovs_mutex_unlock(&coverage_mutex);
> > > > +            *thread_time = now + COVERAGE_CLEAR_INTERVAL;
> > > > +        }
> > > > +    }
> > > > +}
> > > > +
> > > >  /* Runs approximately every COVERAGE_RUN_INTERVAL amount of time to update the
> > > >   * coverage counters' 'min' and 'hr' array.  'min' array is for cumulating
> > > >   * per second counts into per minute count.  'hr' array is for cumulating per
> > > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > > > index c144352..96884ec 100644
> > > > --- a/lib/dpif-netdev.c
> > > > +++ b/lib/dpif-netdev.c
> > > > @@ -2695,6 +2695,7 @@ reload:
> > > >
> > > >              lc = 0;
> > > >
> > > > +            coverage_try_clear();
> > > >              emc_cache_slow_sweep(&pmd->flow_cache);
> > > >              ovsrcu_quiesce();
> > > >
> > > > On Tue, Aug 11, 2015 at 6:24 PM, Alex Wang <al...@nicira.com> wrote:
> > > > > Hi Ilya,
> > > > >
> > > > > Thx a lot for writing up this fix.
> > > > >
> > > > > The reason that you did not see the pmd-related coverage counter
> > > > > stats (in "ovs-appctl coverage/show") is that we do not call
> > > > > coverage_clear() anywhere in pmd_thread_main().  This is because
> > > > > coverage_clear() requires holding the global mutex in the
> > > > > coverage module, which can be super expensive for dpdk packet
> > > > > processing.
> > > > > Want to admit that your fix is more complicated than I expected.
> > > > > Especially, changing the global (extern) struct 'counter_##COUNTER'
> > > > > to per-thread and expanding the array size by the number of threads
> > > > > can also be expensive for the main thread...
> > > > >
> > > > > Do you think there could be a simpler way to achieve this?  I'd
> > > > > like to think about other possible solutions more.
> > > > >
> > > > > Thanks,
> > > > > Alex Wang
> > > > >
> > > > > On Mon, Aug 10, 2015 at 7:45 AM, Ilya Maximets <i.maxim...@samsung.com> wrote:
> > > > > > Currently coverage_counter_register() is called only from
> > > > > > constructor functions at program initialization time by the
> > > > > > main thread.  Coverage counter values are static and
> > > > > > thread-local.
> > > > > >
> > > > > > This means that all COVERAGE_INC() calls from pmd threads
> > > > > > lead to increments of thread-local variables of those threads
> > > > > > that can't be accessed by anyone else.  And they are not used
> > > > > > in the final coverage accounting.
> > > > > >
> > > > > > Fix that by adding the ability to add/remove coverage counters
> > > > > > at runtime for newly created threads and counting coverage
> > > > > > using counters from all threads.
> > > > > >
> > > > > > Signed-off-by: Ilya Maximets <i.maxim...@samsung.com>
> > > > > > ---
> > > > > >  lib/coverage.c    | 83 +++++++++++++++++++++++++++++++++++++++++++++----------
> > > > > >  lib/coverage.h    | 27 ++++++++++++------
> > > > > >  lib/dpif-netdev.c |  4 ++-
> > > > > >  3 files changed, 91 insertions(+), 23 deletions(-)
> > > > > >
> > > > > > diff --git a/lib/coverage.c b/lib/coverage.c
> > > > > > index 6121956..2ae9e7a 100644
> > > > > > --- a/lib/coverage.c
> > > > > > +++ b/lib/coverage.c
> > > > > > @@ -30,14 +30,22 @@ VLOG_DEFINE_THIS_MODULE(coverage);
> > > > > >
> > > > > >  /* The coverage counters.
> > > > > >  */
> > > > > >  static struct coverage_counter **coverage_counters = NULL;
> > > > > > -static size_t n_coverage_counters = 0;
> > > > > >  static size_t allocated_coverage_counters = 0;
> > > > > >
> > > > > > +static void (**coverage_initializers)(void) = NULL;
> > > > > > +static size_t n_coverage_initializers = 0;
> > > > > > +static size_t allocated_coverage_initializers = 0;
> > > > > > +
> > > > > >  static struct ovs_mutex coverage_mutex = OVS_MUTEX_INITIALIZER;
> > > > > >
> > > > > >  DEFINE_STATIC_PER_THREAD_DATA(long long int, coverage_clear_time, LLONG_MIN);
> > > > > >  static long long int coverage_run_time = LLONG_MIN;
> > > > > >
> > > > > > +volatile unsigned int *coverage_val[2048] = {NULL};
> > > > > > +volatile size_t n_coverage_counters = 0;
> > > > > > +
> > > > > > +DEFINE_STATIC_PER_THREAD_DATA(unsigned int, coverage_thread_start, 0);
> > > > > > +
> > > > > >  /* Index counter used to compute the moving average array's index. */
> > > > > >  static unsigned int idx_count = 0;
> > > > > >
> > > > > > @@ -54,7 +62,44 @@ coverage_counter_register(struct coverage_counter* counter)
> > > > > >                                          &allocated_coverage_counters,
> > > > > >                                          sizeof(struct coverage_counter*));
> > > > > >      }
> > > > > > -    coverage_counters[n_coverage_counters++] = counter;
> > > > > > +    coverage_counters[n_coverage_counters] = counter;
> > > > > > +    coverage_val[n_coverage_counters++] = NULL;
> > > > > > +}
> > > > > > +
> > > > > > +void
> > > > > > +coverage_initializer_register(void (*f)(void))
> > > > > > +{
> > > > > > +    if (n_coverage_initializers >= allocated_coverage_initializers) {
> > > > > > +        coverage_initializers = x2nrealloc(coverage_initializers,
> > > > > > +                                           &allocated_coverage_initializers,
> > > > > > +                                           sizeof *coverage_initializers);
> > > > > > +    }
> > > > > > +    coverage_initializers[n_coverage_initializers++] = f;
> > > > > > +}
> > > > > > +
> > > > > > +void
> > > > > > +coverage_register_new_thread(void)
> > > > > > +{
> > > > > > +    int i;
> > > > > > +    ovs_mutex_lock(&coverage_mutex);
> > > > > > +    *coverage_thread_start_get() = n_coverage_counters;
> > > > > > +    for (i = 0; i < n_coverage_initializers; i++)
> > > > > > +        coverage_initializers[i]();
> > > > > > +    ovs_mutex_unlock(&coverage_mutex);
> > > > > > +}
> > > > > > +
> > > > > > +void
> > > > > > +coverage_unregister_thread(void)
> > > > > > +{
> > > > > > +    int i;
> > > > > > +    ovs_mutex_lock(&coverage_mutex);
> > > > > > +    for (i = *coverage_thread_start_get() + n_coverage_initializers;
> > > > > > +         i < n_coverage_counters; i++) {
> > > > > > +        coverage_counters[i - n_coverage_initializers] = coverage_counters[i];
> > > > > > +        coverage_val[i - n_coverage_initializers] = coverage_val[i];
> > > > > > +    }
> > > > > > +    n_coverage_counters -= n_coverage_initializers;
> > > > > > +    ovs_mutex_unlock(&coverage_mutex);
> > > > > >  }
> > > > > >
> > > > > >  static void
> > > > > > @@ -102,13 +147,12 @@ coverage_hash(void)
> > > > > >      uint32_t hash = 0;
> > > > > >      int n_groups, i;
> > > > > >
> > > > > > +    ovs_mutex_lock(&coverage_mutex);
> > > > > >      /* Sort coverage counters into groups with equal totals. */
> > > > > >      c = xmalloc(n_coverage_counters * sizeof *c);
> > > > > > -    ovs_mutex_lock(&coverage_mutex);
> > > > > >      for (i = 0; i < n_coverage_counters; i++) {
> > > > > >          c[i] = coverage_counters[i];
> > > > > >      }
> > > > > > -    ovs_mutex_unlock(&coverage_mutex);
> > > > > >      qsort(c, n_coverage_counters, sizeof *c, compare_coverage_counters);
> > > > > >
> > > > > >      /* Hash the names in each group along with the rank.
> > > > > >  */
> > > > > > @@ -130,6 +174,7 @@ coverage_hash(void)
> > > > > >          i = j;
> > > > > >      }
> > > > > >
> > > > > > +    ovs_mutex_unlock(&coverage_mutex);
> > > > > >      free(c);
> > > > > >
> > > > > >      return hash_int(n_groups, hash);
> > > > > > @@ -200,9 +245,11 @@ coverage_read(struct svec *lines)
> > > > > >  {
> > > > > >      struct coverage_counter **c = coverage_counters;
> > > > > >      unsigned long long int *totals;
> > > > > > +    unsigned long long int *total_sec, *total_min, *total_hr;
> > > > > >      size_t n_never_hit;
> > > > > >      uint32_t hash;
> > > > > >      size_t i;
> > > > > > +    int initial_n = n_coverage_initializers;
> > > > > >
> > > > > >      hash = coverage_hash();
> > > > > >
> > > > > > @@ -213,14 +260,20 @@ coverage_read(struct svec *lines)
> > > > > >                                "hash=%08"PRIx32":",
> > > > > >                                COVERAGE_RUN_INTERVAL/1000, hash));
> > > > > >
> > > > > > -    totals = xmalloc(n_coverage_counters * sizeof *totals);
> > > > > >      ovs_mutex_lock(&coverage_mutex);
> > > > > > +    totals = xcalloc(n_coverage_counters, sizeof *totals);
> > > > > > +    total_sec = xcalloc(n_coverage_counters, sizeof *total_sec);
> > > > > > +    total_min = xcalloc(n_coverage_counters, sizeof *total_min);
> > > > > > +    total_hr = xcalloc(n_coverage_counters, sizeof *total_hr);
> > > > > > +
> > > > > >      for (i = 0; i < n_coverage_counters; i++) {
> > > > > > -        totals[i] = c[i]->total;
> > > > > > +        totals[i % initial_n] += c[i]->total;
> > > > > > +        total_sec[i % initial_n] += c[i]->min[(idx_count - 1) % MIN_AVG_LEN];
> > > > > > +        total_min[i % initial_n] += coverage_array_sum(c[i]->min, MIN_AVG_LEN);
> > > > > > +        total_hr[i % initial_n] += coverage_array_sum(c[i]->hr, HR_AVG_LEN);
> > > > > >      }
> > > > > > -    ovs_mutex_unlock(&coverage_mutex);
> > > > > >
> > > > > > -    for (i = 0; i < n_coverage_counters; i++) {
> > > > > > +    for (i = 0; i < initial_n; i++) {
> > > > > >          if (totals[i]) {
> > > > > >              /* Shows the averaged per-second rates for the last
> > > > > >               * COVERAGE_RUN_INTERVAL interval, the last minute and
> > > > > > @@ -229,18 +282,22 @@ coverage_read(struct svec *lines)
> > > > > >                  xasprintf("%-24s %5.1f/sec %9.3f/sec "
> > > > > >                            "%13.4f/sec total: %llu",
> > > > > >                            c[i]->name,
> > > > > > -                          (c[i]->min[(idx_count - 1) % MIN_AVG_LEN]
> > > > > > +                          (total_sec[i]
> > > > > >                             * 1000.0 / COVERAGE_RUN_INTERVAL),
> > > > > > -                          coverage_array_sum(c[i]->min, MIN_AVG_LEN) / 60.0,
> > > > > > -                          coverage_array_sum(c[i]->hr, HR_AVG_LEN) / 3600.0,
> > > > > > +                          total_min[i] / 60.0,
> > > > > > +                          total_hr[i] / 3600.0,
> > > > > >                            totals[i]));
> > > > > >          } else {
> > > > > >              n_never_hit++;
> > > > > >          }
> > > > > >      }
> > > > > > +    ovs_mutex_unlock(&coverage_mutex);
> > > > > >
> > > > > >      svec_add_nocopy(lines, xasprintf("%"PRIuSIZE" events never hit", n_never_hit));
> > > > > >      free(totals);
> > > > > > +    free(total_sec);
> > > > > > +    free(total_min);
> > > > > > +    free(total_hr);
> > > > > >  }
> > > > > >
> > > > > >  /* Runs approximately every COVERAGE_CLEAR_INTERVAL amount of time to
> > > > > > @@ -265,7 +322,7 @@ coverage_clear(void)
> > > > > >      ovs_mutex_lock(&coverage_mutex);
> > > > > >      for (i = 0; i < n_coverage_counters; i++) {
> > > > > >          struct coverage_counter *c = coverage_counters[i];
> > > > > > -        c->total += c->count();
> > > > > > +        c->total += c->count(i);
> > > > > >      }
> > > > > >      ovs_mutex_unlock(&coverage_mutex);
> > > > > >      *thread_time = now + COVERAGE_CLEAR_INTERVAL;
> > > > > > @@ -340,10 +397,8 @@ coverage_array_sum(const unsigned int *arr, const unsigned int len)
> > > > > >      unsigned int sum = 0;
> > > > > >      size_t i;
> > > > > >
> > > > > > -    ovs_mutex_lock(&coverage_mutex);
> > > > > >      for (i = 0; i < len; i++) {
> > > > > >          sum += arr[i];
> > > > > >      }
> > > > > > -    ovs_mutex_unlock(&coverage_mutex);
> > > > > >      return sum;
> > > > > >  }
> > > > > > diff --git a/lib/coverage.h b/lib/coverage.h
> > > > > > index 832c433..1f5ae17 100644
> > > > > > --- a/lib/coverage.h
> > > > > > +++ b/lib/coverage.h
> > > > > > @@ -45,9 +45,9 @@ BUILD_ASSERT_DECL(COVERAGE_RUN_INTERVAL % COVERAGE_CLEAR_INTERVAL == 0);
> > > > > >
> > > > > >  /* A coverage counter. */
> > > > > >  struct coverage_counter {
> > > > > > -    const char *const name;            /* Textual name. */
> > > > > > -    unsigned int (*const count)(void); /* Gets, zeros this thread's count. */
> > > > > > -    unsigned long long int total;      /* Total count. */
> > > > > > +    const char *const name;                    /* Textual name. */
> > > > > > +    unsigned int (*const count)(unsigned int); /* Gets, zeros this thread's count. */
> > > > > > +    unsigned long long int total;              /* Total count.
> > > > > >  */
> > > > > >      unsigned long long int last_total;
> > > > > >      /* The moving average arrays. */
> > > > > >      unsigned int min[MIN_AVG_LEN];
> > > > > > @@ -55,15 +55,20 @@ struct coverage_counter {
> > > > > >  };
> > > > > >
> > > > > >  void coverage_counter_register(struct coverage_counter*);
> > > > > > +void coverage_initializer_register(void (*f)(void));
> > > > > > +void coverage_register_new_thread(void);
> > > > > > +void coverage_unregister_thread(void);
> > > > > >
> > > > > > +extern volatile unsigned int *coverage_val[2048];
> > > > > > +extern volatile size_t n_coverage_counters;
> > > > > >  /* Defines COUNTER.  There must be exactly one such definition at file scope
> > > > > >   * within a program. */
> > > > > >  #define COVERAGE_DEFINE(COUNTER)                                  \
> > > > > >          DEFINE_STATIC_PER_THREAD_DATA(unsigned int,               \
> > > > > >                                        counter_##COUNTER, 0);      \
> > > > > > -        static unsigned int COUNTER##_count(void)                 \
> > > > > > +        static unsigned int COUNTER##_count(unsigned int i)       \
> > > > > >          {                                                         \
> > > > > > -            unsigned int *countp = counter_##COUNTER##_get();     \
> > > > > > +            volatile unsigned int *countp = coverage_val[i];      \
> > > > > >              unsigned int count = *countp;                         \
> > > > > >              *countp = 0;                                          \
> > > > > >              return count;                                         \
> > > > > > @@ -72,11 +77,17 @@ void coverage_counter_register(struct coverage_counter*);
> > > > > >          {                                                         \
> > > > > >              *counter_##COUNTER##_get() += n;                      \
> > > > > >          }                                                         \
> > > > > > -        extern struct coverage_counter counter_##COUNTER;         \
> > > > > > -        struct coverage_counter counter_##COUNTER                 \
> > > > > > +        extern thread_local struct coverage_counter counter_##COUNTER; \
> > > > > > +        thread_local struct coverage_counter counter_##COUNTER    \
> > > > > >              = { #COUNTER, COUNTER##_count, 0, 0, {0}, {0} };      \
> > > > > > -        OVS_CONSTRUCTOR(COUNTER##_init) {                         \
> > > > > > +        static void COUNTER##_initializer(void) {                 \
> > > > > >              coverage_counter_register(&counter_##COUNTER);        \
> > > > > > +            coverage_val[n_coverage_counters - 1] =               \
> > > > > > +                                       counter_##COUNTER##_get(); \
> > > > > > +        }                                                         \
> > > > > > +        OVS_CONSTRUCTOR(COUNTER##_init) {                         \
> > > > > > +            coverage_initializer_register(&COUNTER##_initializer); \
> > > > > > +            COUNTER##_initializer();                              \
> > > > > >          }
> > > > > >
> > > > > >  /* Adds 1 to COUNTER. */
> > > > > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > > > > > index c144352..a92dd6a 100644
> > > > > > --- a/lib/dpif-netdev.c
> > > > > > +++ b/lib/dpif-netdev.c
> > > > > > @@ -34,6 +34,7 @@
> > > > > >  #include "bitmap.h"
> > > > > >  #include "cmap.h"
> > > > > >  #include "csum.h"
> > > > > > +#include "coverage.h"
> > > > > >  #include "dp-packet.h"
> > > > > >  #include "dpif.h"
> > > > > >  #include "dpif-provider.h"
> > > > > > @@ -2666,7 +2667,7 @@ pmd_thread_main(void *f_)
> > > > > >
> > > > > >      poll_cnt = 0;
> > > > > >      poll_list = NULL;
> > > > > > -
> > > > > > +    coverage_register_new_thread();
> > > > > >      /* Stores the pmd thread's 'pmd' to 'per_pmd_key'. */
> > > > > >      ovsthread_setspecific(pmd->dp->per_pmd_key, pmd);
> > > > > >      pmd_thread_setaffinity_cpu(pmd->core_id);
> > > > > > @@ -2717,6 +2718,7 @@ reload:
> > > > > >      }
> > > > > >
> > > > > >      dp_netdev_pmd_reload_done(pmd);
> > > > > > +    coverage_unregister_thread();
> > > > > >
> > > > > >      free(poll_list);
> > > > > >      return NULL;
> > > > > > --
> > > > > > 2.1.4

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev