On Fri,  6 Mar 2026 06:08:22 +0000
Mounika Kalamandala <[email protected]> wrote:

> Add cleanup of EAL runtime directory artifacts during rte_eal_cleanup().
> This removes runtime files created during initialization and ensures
> that no stale files remain under /var/run/dpdk/<file-prefix> after the
> application terminates.
> 
> Update the documentation in multi_proc_support.rst to describe the
> secondary process lifetime requirement when the primary process exits.
> 
> Signed-off-by: Mounika Kalamandala <[email protected]>
> ---

AI has more corrections here. It does raise the point that primary
process does know if secondary is still running. Also the more unfortunate
and common case happens when primary crashes.

Normally eal_clean_runtime_dir() is called on eal_init, not exit.
Also, clean_runtime_dir does a test flock() of the file and will
fail if it is in use, so it may leave things on shutdown.

Overall, this patch is a no from me.
---

**Patch: eal: clean runtime directory files on exit**

## Errors

**1. Runtime directory cleanup while secondary processes may still be running 
(potential correctness issue, ~60% confidence)**

The code unconditionally calls `eal_clean_runtime_dir()` in the primary process 
cleanup path. While the new documentation states secondaries "must be 
terminated" before the primary exits, the code does not enforce or verify this. 
If a secondary process is still running when the primary calls 
`rte_eal_cleanup()`, this removes runtime files (the config, hugepage info, 
socket files) from under the secondary process, which could cause crashes, 
SIGBUS on hugepage access, or silent corruption. The documentation addition 
describes the requirement but the code provides no defense.

Consider at minimum logging a warning if secondary processes are detected as 
still active, or skipping the directory cleanup when secondaries are still 
connected. A best-effort check via the mp socket or `/proc` scan for processes 
with the same file-prefix would make this safer.

**2. Ordering: `eal_clean_runtime_dir()` placed after memory detach**

`eal_clean_runtime_dir()` is called after `rte_eal_memory_detach()` and 
`rte_eal_malloc_heap_cleanup()`. This means the hugepage mapping files are 
already detached from this process, which is fine for the primary itself. 
However, the runtime directory also contains the mp socket and config file. 
Verify that `eal_clean_runtime_dir()` does not rely on any state cleaned up by 
the preceding two calls. If it only uses `internal_conf->runtime_dir` (a 
stack/heap string, not hugepage-backed), the ordering is acceptable — but this 
deserves confirmation since the function wasn't previously called at this point 
in the sequence.

## Warnings

**1. Documentation says "must be terminated" but the patch creates a new 
failure mode rather than preventing it**

The commit message says the patch "ensures that no stale files remain," but it 
also creates a new hazard: a clean `rte_eal_cleanup()` in the primary now 
destroys the runtime directory that running secondaries depend on. Previously, 
stale files were a nuisance; now, a primary exiting with secondaries still 
running actively destroys their runtime state. The commit message and 
documentation should acknowledge this behavioral change more clearly, since 
operators who previously relied on secondaries surviving a primary restart will 
now face a harder failure.

**2. RST indentation suggests this is inside the existing `.. note::` block — 
verify placement**

The added text is indented with 4 spaces, matching the indentation of the 
preceding `.. note::` directive content. If intentional, this is fine. But the 
content describes a general lifecycle requirement, not just a note about device 
access options. It might be clearer as a standalone paragraph or its own 
admonition (e.g., `.. warning::`) rather than appended to the note about 
allow/block options.

**3. Commit body starts with "Add" — consider whether this needs a "why"**

The commit body describes *what* the patch does but not *why*. What problem was 
observed with stale runtime files? Under what circumstances do they cause 
issues? A sentence of motivation (e.g., "After unclean shutdown, stale files in 
/var/run/dpdk prevent re-initialization") would help reviewers and future 
readers understand the necessity.

## Info

The subject line is 42 characters, well within the 60-character limit. Tag 
ordering and format are correct. Copyright year update to 2026 is within valid 
range.

Reply via email to