On 2023-09-11 18:06, Stephen Hemminger wrote:
On Fri, 8 Sep 2023 09:04:29 +0200
Mattias Rönnblom <hof...@lysator.liu.se> wrote:

Also, right now the array is sized at 129 entries to allow for the
maximum number of lcores. When the maximum is increased to 512 or
1024 the problem will get worse.

Using TLS will penalize every thread in the process, not only EAL
threads and registered non-EAL threads, and worse: not only threads
that are using the API in question.

Every thread will carry the TLS memory around, increasing the process
memory footprint.

Thread creation will be slower, since TLS memory is allocated *and
initialized*, lazy user code-level initialization or not.

On my particular Linux x86_64 system, pthread creation overhead looks
something like:

8 us w/o any user code-level use of TLS
11 us w/ 16 kB of TLS
314 us w/ 2 MB of TLS.
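
For anyone who wants to reproduce this kind of number, here is a minimal sketch of one way to measure it. It is not the program used for the figures above. TLS_SIZE, tls_payload and avg_thread_create_ns() are names made up for illustration, and it times a pthread_create() + pthread_join() pair rather than creation alone, so absolute values will differ.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical TLS payload size; rebuild with another value to compare. */
#define TLS_SIZE (16 * 1024)

/* Every thread in the process gets its own copy of this buffer. */
static _Thread_local char tls_payload[TLS_SIZE];

static void *thread_main(void *arg)
{
	(void)arg;
	/* Touch the buffer so the compiler cannot discard it. */
	tls_payload[0] = 1;
	return NULL;
}

/*
 * Average wall-clock cost, in nanoseconds, of one pthread_create() +
 * pthread_join() pair. Returns -1.0 on invalid input or on failure.
 */
double avg_thread_create_ns(int iterations)
{
	struct timespec start, end;
	int i;

	if (iterations <= 0)
		return -1.0;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < iterations; i++) {
		pthread_t thread;

		if (pthread_create(&thread, NULL, thread_main, NULL) != 0)
			return -1.0;
		pthread_join(thread, NULL);
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	return ((double)(end.tv_sec - start.tv_sec) * 1e9 +
		(double)(end.tv_nsec - start.tv_nsec)) / iterations;
}
```

Calling avg_thread_create_ns() with a few thousand iterations, for builds with different TLS_SIZE values, should show the trend. Results depend heavily on the system and the libc.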

Agree that TLS does cause potentially more pages to get allocated on
thread creation, but that argument doesn't make sense here.

Sure. I was talking about the general concept of replacing per-lcore static arrays with TLS.

I consider the general applicability of the TLS pattern relevant here, because it doesn't make sense to implement essentially the same thing in ad hoc, opportunistic ways across the DPDK code base.
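
To make the two patterns concrete, here is a rough sketch in plain C. It is not DPDK code. MAX_LCORE, struct demo_state and the accessor functions are hypothetical stand-ins, and the use of the extra array slot as a shared fallback for threads without an lcore id is an assumption of this example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LCORE 128  /* hypothetical stand-in for the build-time lcore limit */
#define CACHE_LINE 64  /* assumed cache line size */

/* Small per-thread state, padded to a cache line to avoid false sharing. */
struct demo_state {
	_Alignas(CACHE_LINE) uint64_t z[5];
};

/*
 * Pattern 1: static array indexed by lcore id. One slot per lcore, plus
 * one extra slot (an assumption here) shared by threads without an id.
 * Memory cost is fixed and grows with MAX_LCORE.
 */
static struct demo_state per_lcore_state[MAX_LCORE + 1];

/*
 * Pattern 2: TLS. Every thread in the process gets its own copy, whether
 * or not it ever calls the API.
 */
static _Thread_local struct demo_state tls_state;

struct demo_state *state_by_lcore(unsigned int lcore_id)
{
	/* Out-of-range ids fall back to the shared last slot. */
	if (lcore_id >= MAX_LCORE)
		lcore_id = MAX_LCORE;
	return &per_lcore_state[lcore_id];
}

struct demo_state *state_by_tls(void)
{
	return &tls_state;
}

/* Fixed cost of pattern 1, independent of the number of threads. */
size_t static_array_bytes(void)
{
	return sizeof(per_lcore_state);
}

/* Cost of pattern 2 for each thread in the process. */
size_t tls_bytes_per_thread(void)
{
	return sizeof(tls_state);
}
```

With these assumed sizes the static array costs 129 * 64 bytes once, while the TLS variant costs 64 bytes for every thread in the process.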

The rand
state is small, and DPDK applications should not be creating threads
after startup. Thread creation is an expensive set of system calls.

I agree, and I would add that non-EAL threads will likely be few in number, and should all be registered on creation, to ensure they can call DPDK APIs which require an lcore id.

That said, if applications do create threads, DPDK shouldn't make thread creation orders of magnitude slower.
