On Thu, Nov 02, 2023 at 11:19:24AM -0700, Rahul Gupta wrote:
> From: Rahul Gupta <rahulg...@microsoft.com>
> 
> Initialization often requires rte_eal_init + rte_pktmbuf_pool_create
> which can consume a total time of 500-600 ms:
> a) For many devices FLR may take a significant chunk of time
>    (200-250 ms in our use-case), this FLR is triggered during device
>    probe in rte_eal_init().
> b) rte_pktmbuf_pool_create() can consume up to 300-350 ms for
>    applications that require huge memory.
> 
> This cost is incurred on each restart (which happens in our use-case
> during binary updates for servicing).
> This patch provides an optimization using pthreads that applications
> can use and which can save 200-230 ms.
> 
> In this patch, rte_eal_init() is refactored into two parts:
> a) The 1st part is dependent code, i.e. it is a prerequisite of the FLR
>    and mempool creation, so it needs to be executed before any
>    pthreads are created. It is named rte_eal_init_setup().
> b) The 2nd part is independent code, i.e. it can execute in parallel
>    with mempool creation in a pthread. It is named rte_probe_and_ioctl().
> 
> Existing applications require no changes unless they wish to leverage
> the optimization.
> 
> If the application wants to use the pthread functionality, it should:
> a) call rte_eal_init_setup(), then create two or more pthreads,
> b) in one pthread, call rte_probe_and_ioctl(),
> c) in a second pthread, call rte_pktmbuf_pool_create(),
> d) (optional) use other pthreads for any other independent functions.
> 
> Signed-off-by: Rahul Gupta <rahulg...@linux.microsoft.com>

Reading the description, this seems an interesting idea, and a good saving.

If I may, I wonder if I can suggest a slight alternative. Rather than
splitting EAL init into two functions like that, how about providing an
"rte_eal_init_async()" function, which does part 1, and then spawns a
thread for part 2, before returning. We can then provide an
rte_eal_init_done() [or eal_init_async_done()] function to allow apps to
resync and check for EAL being done.

The reason for suggesting this is that the naming and purpose of the APIs
may be a little clearer for the end user. Allowing the async init function
to create threads also allows possible future parallelism in the function
itself. For example, we could do probing of the devices themselves in
parallel.
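
To make the suggestion a little more concrete, here is a rough sketch of
the shape I have in mind, using plain pthreads. Everything in it is a
placeholder for discussion: sketch_eal_init_async() and
sketch_eal_init_done() are hypothetical names, and the stub_* functions
merely stand in for the two halves of EAL init and for the app's mempool
creation. None of this exists in DPDK today.

```c
#include <pthread.h>

/* Placeholder stages: NOT DPDK APIs, just stand-ins for this sketch. */
static int stub_eal_setup(void)      { return 0; } /* part 1: prerequisites */
static int stub_probe_devices(void)  { return 0; } /* part 2: probe, FLR etc. */
static int stub_create_mempool(void) { return 0; } /* app's mempool creation */

static pthread_t init_thread;
static int init_started; /* non-zero once part 2 is spawned and not yet joined */
static int init_result;  /* written by the thread, read only after the join */

static void *init_part2(void *arg)
{
	(void)arg;
	init_result = stub_probe_devices();
	return NULL;
}

/* Hypothetical async init: run part 1 inline, spawn part 2, return at once. */
int sketch_eal_init_async(void)
{
	int ret;

	if (init_started)
		return -1;
	ret = stub_eal_setup();
	if (ret != 0)
		return ret;
	if (pthread_create(&init_thread, NULL, init_part2, NULL) != 0)
		return -1;
	init_started = 1;
	return 0;
}

/* Hypothetical resync point: block until part 2 finishes, return its status. */
int sketch_eal_init_done(void)
{
	if (!init_started)
		return -1;
	if (pthread_join(init_thread, NULL) != 0)
		return -1;
	init_started = 0;
	return init_result;
}

/* How an app might use it: mempool creation overlaps with part 2. */
int app_startup_example(void)
{
	int pool_ret;

	if (sketch_eal_init_async() != 0)
		return -1;
	pool_ret = stub_create_mempool();
	if (sketch_eal_init_done() != 0)
		return -1;
	return pool_ret;
}
```

With this shape the app never touches pthreads itself: it calls the async
init, does its own independent setup, and then resyncs before using any
devices.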

Regards,
/Bruce
