On 29/09/2021 14:32, Bruce Richardson wrote:
On Wed, Sep 29, 2021 at 01:28:53PM +0100, Kevin Traynor wrote:
Hi Bruce,

On 24/09/2021 17:18, Bruce Richardson wrote:
When DPDK is run with --in-memory mode, multiple processes can run
simultaneously using the same runtime dir. This leads to each process
removing another process' telemetry socket as it starts up, giving
unexpected behaviour.

This patch changes that behaviour to first check if the existing socket
is active. If not, it's an old socket to be cleaned up and can be
removed. If it is active, telemetry initialization fails and an error
message is printed out giving instructions on how to remove the error;
either by using file-prefix to have a different runtime dir (and
therefore socket path) or by disabling telemetry if it is not needed.
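
For illustration, the check described above could be sketched roughly as follows. This is not the code from the patch: the names probe_and_clean_socket() and enum probe_result are hypothetical, and SOCK_STREAM is assumed here only to keep the sketch portable (the real telemetry socket type may differ). The idea is to try connecting to the existing path; a successful connect means another process owns the socket, while a refused connect means the file is stale and can be removed.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; these are not DPDK names. */
enum probe_result { PROBE_FREE = 0, PROBE_IN_USE = 1, PROBE_ERROR = -1 };

/*
 * Probe a Unix-domain socket path by attempting to connect to it.
 *  - connect() succeeds: a live process is listening, leave the file alone.
 *  - ENOENT: nothing exists at the path, so it is free to use.
 *  - ECONNREFUSED: the file exists but nobody listens, so remove it.
 *  - anything else: report an error and do not touch the file.
 */
static enum probe_result
probe_and_clean_socket(const char *path)
{
    struct sockaddr_un addr;
    int fd, saved_errno;

    assert(path != NULL);
    if (strlen(path) >= sizeof(addr.sun_path))
        return PROBE_ERROR;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path); /* length was checked above */

    /* Assumption: stream socket; adjust to match the listener's type. */
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return PROBE_ERROR;

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        close(fd);
        return PROBE_IN_USE;
    }
    saved_errno = errno;
    close(fd);

    if (saved_errno == ENOENT)
        return PROBE_FREE;
    if (saved_errno == ECONNREFUSED) {
        /* Stale socket file left behind by an earlier process. */
        if (unlink(path) == 0 || errno == ENOENT)
            return PROBE_FREE;
    }
    return PROBE_ERROR;
}
```

A caller would bind its own socket on PROBE_FREE, and on PROBE_IN_USE print the message suggesting a different file-prefix or disabling telemetry. Whether PROBE_IN_USE should then fail initialization or only skip telemetry is the question discussed in the rest of this thread.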


Telemetry is enabled by default but it may not be used by the application.
Hitting this issue will cause rte_eal_init() to fail which will probably
stop or severely limit the application.

So it could change a working application to a non-working one (albeit one
that doesn't interfere with other processes' sockets).

Can it just print a warning that telemetry will not be enabled and continue
so it's not returning an rte_eal_init failure?


For a backported fix, yes, that would probably be better behaviour, but for
the latest branch, I think returning error and having the user explicitly
choose the resolution they want to occur is best. I'll see about doing a
separate backport patch for 20.11.


But this is a runtime message that depends on the runtime environment. The user may not have access to, or know how to change, the EAL parameters.

In the case where the application doesn't care about telemetry, they have gone from not having telemetry to rte_eal_init() failing, which probably has severe consequences.

I could maybe agree if telemetry was disabled by default and the application had set the --telemetry flag, indicating that they want/need it. As it is, it feels like it's possibly a worse outcome for the user.

thanks,
Kevin.

A more minor thing: I see it changes the behaviour from "last one runs with
telemetry" to "first one runs with telemetry". Though it can be figured out
from the commit message, it might be worth calling that change out explicitly.


Sure. I'll resubmit a new version of this without stable CC'ed and include
that behaviour change explicitly in the commit log.

/Bruce

