On 16 Jun 2023, at 19:19, Aaron Conole wrote:
> Martin Kennelly <mkenn...@redhat.com> writes:
>
>> Hey ovs community,
>>
>> I am a developer working on ovn-kubernetes and I want to programmatically
>> consume long poll information, e.g.:
>> ovs|00211|timeval(handler25)|WARN|Unreasonably long 52388ms poll interval (752ms user, 209ms system)
>>
>> This is currently exposed via journal logs, but it's not practical to
>> consume it there programmatically, and I was hoping you could add it to
>> coverage metrics.
>
> I think it could be useful. I do want to be careful about exposing
> these kinds of data in a way that could be misinterpreted. Already,
> that log in particular gets misinterpreted quite a bit, and RH gets
> customers claiming OVS is misbehaving when they've oversubscribed the
> system.
+1
> Mechanically, it would be pretty simple to do something like:
>
> ---
> diff --git a/lib/timeval.c b/lib/timeval.c
> index 193c7bab17..00e5f2a74d 100644
> --- a/lib/timeval.c
> +++ b/lib/timeval.c
> @@ -40,6 +40,7 @@
>  #include "openvswitch/vlog.h"
>
>  VLOG_DEFINE_THIS_MODULE(timeval);
> +COVERAGE_DEFINE(long_poll_interval);
>
>  #if !defined(HAVE_CLOCK_GETTIME)
>  typedef unsigned int clockid_t;
> @@ -645,6 +646,8 @@ log_poll_interval(long long int last_wakeup)
>          struct rusage rusage;
>
>          if (!getrusage_thread(&rusage)) {
> +            COVERAGE_INC(long_poll_interval);
> +
>              VLOG_WARN("Unreasonably long %lldms poll interval"
>                        " (%lldms user, %lldms system)",
>                        interval,
> ---
>
> This would at least expose the data via the coverage framework, where it
> can be queried with ovs-appctl (e.g. "ovs-appctl coverage/show"). The
> advantage here is that a coverage counter tracks the event rate per second
> over the last 5 seconds, minute, and hour, in addition to the total, so we
> can see whether the condition is ongoing.
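
For the consumer side, here is a rough sketch of how a monitoring tool might pick the proposed counter out of coverage output. It assumes the patch above is applied (so a counter named long_poll_interval exists) and that each counter line has the shape "<name> <rate>/sec <rate>/sec <rate>/sec total: <count>", as printed by "ovs-appctl coverage/show". The struct and function names are made up for illustration and are not part of OVS.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* One parsed counter line.  Hypothetical type, not an OVS structure. */
struct coverage_sample {
    double per_sec_5s;          /* Average rate over the last 5 seconds. */
    double per_sec_min;         /* Average rate over the last minute. */
    double per_sec_hour;        /* Average rate over the last hour. */
    unsigned long long total;   /* Events since the process started. */
};

/* Parses one line in the assumed layout
 *     <name> <rate>/sec <rate>/sec <rate>/sec total: <count>
 * Returns true and fills '*out' only if the line parses and its counter
 * name equals 'name'.  Header lines and other counters return false. */
static bool
parse_coverage_line(const char *line, const char *name,
                    struct coverage_sample *out)
{
    struct coverage_sample s;
    char counter[64];

    if (sscanf(line, " %63s %lf/sec %lf/sec %lf/sec total: %llu",
               counter, &s.per_sec_5s, &s.per_sec_min, &s.per_sec_hour,
               &s.total) != 5) {
        return false;
    }
    if (strcmp(counter, name) != 0) {
        return false;
    }
    *out = s;
    return true;
}

/* A nonzero 5-second rate suggests long polls are happening right now,
 * as opposed to a total that only reflects something in the past. */
static bool
long_poll_ongoing(const struct coverage_sample *s)
{
    return s->per_sec_5s > 0.0;
}
```

A caller would feed each line of the command output to parse_coverage_line() with the name "long_poll_interval" and alert on long_poll_ongoing() rather than on the raw total, which should help with the misinterpretation concern raised above.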