On 13-Nov-19 9:19 AM, Bruce Richardson wrote:
On Wed, Nov 13, 2019 at 10:37:57AM +0530, Venumadhav Josyula wrote:
Hi,
We are using 'rte_mempool_create' for allocation of flow memory. This has
been there for a while. We just migrated to dpdk-18.11 from dpdk-17.05. Now
here is the problem statement.
Problem statement:
In the new DPDK (18.11), 'rte_mempool_create' takes approximately 4.4 sec
for allocation, noticeably longer than in the older DPDK (17.05). We have some
8-9 mempools for our entire product. We do upfront allocation for all of them
(i.e. when the DPDK application is coming up). Our application uses the
run-to-completion model.
Questions:
i) Is that acceptable? Has anybody seen such a thing?
ii) What has changed between the two DPDK versions (18.11 vs. 17.05) from a
memory perspective?
Any pointers are welcome.
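
To pin down where the startup time goes, each pool creation can be timed
individually. The following is only a minimal sketch: `time_call_sec` and
`create_pool_stub` are hypothetical names, and the stub stands in for the
application's real `rte_mempool_create` call so that the example builds without
DPDK.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Hypothetical helper: returns the wall-clock seconds spent inside fn(arg),
 * measured with the monotonic clock. */
static double time_call_sec(void (*fn)(void *), void *arg)
{
    struct timespec t0, t1;

    assert(fn != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    fn(arg);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* Placeholder for the real work. In an application this is where the
 * rte_mempool_create() call for one pool would go (assumption: one call
 * per pool). Here it only counts how often it was invoked. */
static void create_pool_stub(void *arg)
{
    int *calls = arg;

    (*calls)++;
}
```

Logging the returned value for each of the 8-9 pools would show whether the
time is spread evenly or dominated by the largest pools.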
Hi,
from 17.05 to 18.11 there was a change in the default memory model for DPDK. In
17.05 all DPDK memory was allocated statically upfront and that was used for
the memory pools. With 18.11, no large blocks of memory are allocated at
init time; instead the memory is requested from the kernel as it is needed
by the app. This makes the initial startup of an app faster, but the
allocation of new objects like mempools slower, and it could be this that you
are seeing.
Some things to try:
1. Use "--socket-mem" EAL flag to do an upfront allocation of memory for use
by your memory pools and see if it improves things.
2. Try using "--legacy-mem" flag to revert to the old memory model.
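
As a rough illustration of the two suggestions above, the sketch below
assembles an argument vector carrying `--socket-mem` and/or `--legacy-mem`, of
the kind that would then be handed to `rte_eal_init()`. The helper name
`build_eal_args` and its parameters are made up for this example and are not
part of DPDK; the `--socket-mem` value is in megabytes, and any particular
number (such as 1024) is only a placeholder to be sized for the real pools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a DPDK API): fills argv with the program name
 * plus the EAL memory options discussed above.
 *   socket_mem_mb > 0  -> adds "--socket-mem=<MB>" (formatted into buf)
 *   legacy != 0        -> adds "--legacy-mem"
 * argv must have room for at least 4 entries (3 args + NULL terminator).
 * Returns the argument count, or -1 if buf is too small. */
static int build_eal_args(char *prog, unsigned socket_mem_mb, int legacy,
                          char *buf, size_t buflen,
                          char **argv, int max_args)
{
    static char legacy_opt[] = "--legacy-mem";
    int argc = 0;

    assert(prog != NULL && argv != NULL && max_args >= 4);

    argv[argc++] = prog;
    if (socket_mem_mb > 0) {
        int n;

        assert(buf != NULL);
        n = snprintf(buf, buflen, "--socket-mem=%u", socket_mem_mb);
        if (n < 0 || (size_t)n >= buflen)
            return -1;
        argv[argc++] = buf;
    }
    if (legacy)
        argv[argc++] = legacy_opt;
    argv[argc] = NULL;

    return argc;
}
```

The resulting `argc`/`argv` pair would be passed to `rte_eal_init()` ahead of
any application-specific arguments. Trying each option separately makes it
easier to see which one actually changes the allocation time.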
Regards,
/Bruce
I would also add to this the fact that the mempool will, by default,
attempt to allocate IOVA-contiguous memory, with a fallback to non-IOVA
contiguous memory whenever getting IOVA-contiguous memory isn't possible.
If you are running in IOVA as PA mode (such as would be the case if you
are using the igb_uio kernel driver), then, since it is now impossible to
preallocate large PA-contiguous chunks in advance, what will likely
happen is that the mempool will try to allocate IOVA-contiguous
memory, fail, and retry with non-IOVA-contiguous memory (essentially
allocating memory twice). For large mempools (or a large number of
mempools) that can take a bit of time.
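
The try-then-fall-back behaviour described above can be modelled with a toy
example. This is not the mempool library's code: `alloc_with_fallback`,
`toy_backend` and `struct alloc_stats` are invented for illustration, and
`malloc` stands in for the real memory reservation. It only shows why a failed
contiguous attempt doubles the number of allocation passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical backend signature: returns NULL when the request
 * cannot be satisfied. */
typedef void *(*alloc_fn)(size_t len, bool need_contig, void *ctx);

/* Records how many passes were needed and which kind of memory we got. */
struct alloc_stats {
    unsigned attempts;
    bool got_contig;
};

/* First ask for contiguous memory; if that fails, retry without the
 * contiguity requirement. */
static void *alloc_with_fallback(size_t len, alloc_fn fn, void *ctx,
                                 struct alloc_stats *st)
{
    void *p;

    assert(fn != NULL && st != NULL);

    st->attempts = 1;
    st->got_contig = true;
    p = fn(len, true, ctx);
    if (p == NULL) {
        /* Second pass: this is the extra work paid in IOVA as PA mode. */
        st->attempts = 2;
        st->got_contig = false;
        p = fn(len, false, ctx);
    }
    return p;
}

/* Toy backend: ctx points to a bool saying whether contiguous memory
 * is available. malloc() is only a stand-in for the real reservation. */
static void *toy_backend(size_t len, bool need_contig, void *ctx)
{
    const bool *contig_available = ctx;

    if (need_contig && !*contig_available)
        return NULL;
    return malloc(len);
}
```

With contiguous memory unavailable the model needs two passes per request;
with it available, one pass is enough.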
The obvious workaround is to use VFIO and IOVA as VA mode. This lets the
allocator get IOVA-contiguous memory at the outset, so allocation will
complete faster.
The other two alternatives, already suggested in this thread by Bruce
and Olivier, are:
1) use bigger page sizes (such as 1G)
2) use legacy mode (and lose out on all of the benefits provided by the
new memory model)
The recommended solution is to use VFIO/IOMMU, and IOVA as VA mode.
--
Thanks,
Anatoly