RE: [RFC] Dynamic log/trace control via telemetry

Morten Brørup Wed, 17 Aug 2022 08:34:59 -0700

> From: Dmitry Kozlyuk [mailto:dmitry.kozl...@gmail.com]
> Sent: Wednesday, 17 August 2022 17.15
> 
> 2022-08-16 19:08 (UTC-0700), Stephen Hemminger:
> > Not sure if turning telemetry into a do all control api makes sense.
> 
> I'm sure it doesn't, for "do all".
> Controlling diagnostic collection and output, however,
> is directly related to the telemetry purpose.
> 
> > This seems like a different API.


I agree with Stephen regarding not making the telemetry library a "do all" 
control API. A separate API would be preferable.

And then, a wrapper through the telemetry interface can be provided to that 
API. Best of both worlds. :-)
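
To illustrate the layering, here is a minimal, self-contained sketch. It is not DPDK code: logctl_set_level(), logctl_get_level() and telemetry_log_level_cmd() are hypothetical names, and the "name,level" parameter format and the 0..8 level range are assumptions made only for this example. The point is that the control API stands on its own, and the telemetry side only parses a string and forwards it.

```c
#include <stdlib.h>
#include <string.h>

#define LOGCTL_MAX_TYPES 8
#define LOGCTL_NAME_LEN 32

struct logctl_entry {
	char name[LOGCTL_NAME_LEN];
	int level; /* 0 = disabled */
};

static struct logctl_entry logctl_table[LOGCTL_MAX_TYPES];
static int logctl_count;

/* Standalone control API (hypothetical): usable directly by the application. */
int logctl_set_level(const char *name, int level)
{
	int i;

	if (name == NULL || name[0] == '\0' || level < 0 ||
	    strlen(name) >= LOGCTL_NAME_LEN)
		return -1;
	for (i = 0; i < logctl_count; i++) {
		if (strcmp(logctl_table[i].name, name) == 0) {
			logctl_table[i].level = level;
			return 0;
		}
	}
	if (logctl_count == LOGCTL_MAX_TYPES)
		return -1;
	strcpy(logctl_table[logctl_count].name, name); /* length checked above */
	logctl_table[logctl_count].level = level;
	logctl_count++;
	return 0;
}

/* Unknown log types report level 0, i.e. disabled by default. */
int logctl_get_level(const char *name)
{
	int i;

	for (i = 0; name != NULL && i < logctl_count; i++)
		if (strcmp(logctl_table[i].name, name) == 0)
			return logctl_table[i].level;
	return 0;
}

/*
 * Thin telemetry-side wrapper (hypothetical): parses "name,level"
 * and forwards to the standalone API. No control logic lives here.
 */
int telemetry_log_level_cmd(const char *params)
{
	char buf[64];
	char *comma, *end;
	long level;

	if (params == NULL || strlen(params) >= sizeof(buf))
		return -1;
	strcpy(buf, params); /* length checked above */
	comma = strchr(buf, ',');
	if (comma == NULL)
		return -1;
	*comma = '\0';
	level = strtol(comma + 1, &end, 10);
	if (end == comma + 1 || *end != '\0' || level < 0 || level > 8)
		return -1;
	return logctl_set_level(buf, (int)level);
}
```

With this split, an application that does not use telemetry at all can still call the control API, and the wrapper can be left out without touching it.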

> > Also, the default would have to be disabled for application safety reasons.
> 
> This feature would be for collecting additional info
> in case the collection was not planned and a restart is not desired.
> If it is disabled by default, it is likely to be off when it's needed.

All tracing, logging etc. MUST be disabled by default. You are suggesting the 
opposite, which will definitely impact performance.

And performance will become a valid argument for not adding more trace/logging 
to libraries, if all of it is enabled by default.

And my usual rant: I hope all of this can be disabled at build time - for 
maximum performance.
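
A rough sketch of what I mean by the two switches, runtime default-off plus a build-time kill switch. The names APP_TRACE_COMPILED_IN, APP_TRACE() and app_trace_set_enabled() are hypothetical and exist only for this example; they are not existing DPDK macros or functions.

```c
#include <stdio.h>

/* Build-time switch (hypothetical): compile with -DAPP_TRACE_COMPILED_IN=0
 * to remove every trace point from the binary. */
#ifndef APP_TRACE_COMPILED_IN
#define APP_TRACE_COMPILED_IN 1
#endif

/* Runtime switch: zero-initialized, so tracing is off by default. */
static int app_trace_runtime_enabled;
/* Counts emitted trace lines, only so the behaviour can be observed. */
static unsigned int app_trace_emitted;

#if APP_TRACE_COMPILED_IN
/* Compiled in: costs one branch on a flag when disabled at runtime. */
#define APP_TRACE(...) do { \
	if (app_trace_runtime_enabled) { \
		app_trace_emitted++; \
		fprintf(stderr, __VA_ARGS__); \
	} \
} while (0)
#else
/* Compiled out: the trace point expands to nothing at all. */
#define APP_TRACE(...) do { } while (0)
#endif

/* Runtime control, e.g. reachable from a control API. */
static void app_trace_set_enabled(int on)
{
	app_trace_runtime_enabled = (on != 0);
}
```

Even when tracing is compiled in, nothing is emitted until someone explicitly turns it on, and a build with the switch set to 0 does not even pay for the branch.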

> 
> Let's consider how exactly safety can be compromised.
> 
> 1. Securing telemetry socket access is out of scope for DPDK,
>    that is, any successful access is considered trusted.
> 
> 2. Even read-only telemetry still comes at cost, for example,
>    memory telemetry takes a global lock that blocks all allocations,
>    so affecting the app performance is already possible.
> 
> 3. Important logs and traces enabled at startup may be disabled dynamically.
>    If it's an issue, the API can refuse to disable them.
> 
> 4. Bogus logs may flood the output and slow down the app.
>    Bogus traces can exhaust disk space.
>    Logs should be monitored automatically, so flooding is just an annoyance.
>    Disk space can have a quota.
>    Since the user is trusted (item 1), even if they do it by mistake,
>    they can quickly correct themselves using the same API.
> 

Here's a thought:

Add an API to set an "unlock key", so applications that do not want to expose 
these features to unauthorized users can prevent them from being enabled. 
Authorized users can then unlock the features through an API call that 
supplies the key.
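
Something along these lines, as a minimal sketch only. The names diag_set_unlock_key(), diag_unlock() and diag_control_allowed() are hypothetical, and so is the policy that the features stay open unless the application has set a key.

```c
#include <stddef.h>
#include <string.h>

#define DIAG_KEY_MAX 64

static char diag_key[DIAG_KEY_MAX];
static size_t diag_key_len;
static int diag_locked; /* 0 until the application sets a key */

/* Called by the application at startup to lock dynamic log/trace control. */
int diag_set_unlock_key(const char *key)
{
	size_t len;

	if (key == NULL)
		return -1;
	len = strlen(key);
	if (len == 0 || len >= DIAG_KEY_MAX)
		return -1;
	memcpy(diag_key, key, len);
	diag_key_len = len;
	diag_locked = 1;
	return 0;
}

/* Called by a user (e.g. via a control command) to unlock the features. */
int diag_unlock(const char *key)
{
	unsigned char diff = 0;
	size_t i, len;

	if (!diag_locked)
		return 0; /* nothing to unlock */
	if (key == NULL)
		return -1;
	len = strlen(key);
	if (len != diag_key_len)
		return -1; /* note: this early exit reveals the key length */
	/* Accumulate differences instead of stopping at the first mismatch. */
	for (i = 0; i < len; i++)
		diff |= (unsigned char)(key[i] ^ diag_key[i]);
	if (diff != 0)
		return -1;
	diag_locked = 0;
	return 0;
}

/* Checked by the control path before enabling any log or trace. */
int diag_control_allowed(void)
{
	return !diag_locked;
}
```

The control API would call diag_control_allowed() before changing anything, so an application that never sets a key behaves exactly as it would without this mechanism.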
