On Tue, 2017-12-19 at 11:14 +0000, Anatoly Burakov wrote:
>
> Quick outline of all changes done as part of this patchset:
>
> * Malloc heap adjusted to handle holes in address space
> * Single memseg list replaced by multiple expandable memseg lists
> * VA space for hugepages is preallocated in advance
> * Added dynamic alloc/free for pages, happening as needed on malloc/free

SPDK will need some way to register for a notification when pages are
allocated or freed. For storage, the number of requests per second is
(relative to networking) fairly small: hundreds of thousands per second in
a traditional block storage stack, or a few million per second with SPDK.
Given that, we can afford to do a dynamic lookup from va to pa/iova on each
request in order to greatly simplify our APIs (users can just pass pointers
around instead of mbufs).

DPDK has a way to look up the pa for a given va, but it does so by scanning
/proc/self/pagemap and is very slow. SPDK instead implements its own lookup
table of va to pa/iova, which we populate by scanning through the DPDK
memory segments at startup, so a lookup in our table is sufficiently fast
for storage use cases. If the list of memory segments changes, we need to
know about it in order to update our map.

Having the map also enables a number of other nice things. For instance, we
allow users to register memory that wasn't allocated through DPDK and use it
for DMA operations. We keep that va to pa/iova mapping in the same map. I
appreciate you adding APIs to dynamically register this type of memory with
the IOMMU on our behalf. That allows us to eliminate a nasty hack where we
were looking up the vfio file descriptor through sysfs in order to send the
registration ioctl.

> * Added contiguous memory allocation API's for rte_malloc and rte_memzone
> * Integrated Pawel Wodkowski's patch [1] for registering/unregistering memory
>   with VFIO
>
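
To make the va to pa/iova table described above more concrete, here is a
minimal sketch of one way such a map could be laid out. It is not SPDK's
actual code. It assumes 2 MiB hugepages and a 48-bit virtual address space,
and every name in it (va_map, va_map_set, va_map_clear, va_map_translate)
is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumptions for this sketch: 2 MiB hugepages, 48-bit virtual addresses. */
#define HUGEPAGE_SHIFT 21
#define HUGEPAGE_SIZE  (UINT64_C(1) << HUGEPAGE_SHIFT)
#define L2_ENTRIES     512u          /* 512 x 2 MiB = 1 GiB per second-level table */
#define L1_ENTRIES     (1u << 18)    /* 2^18 x 1 GiB = 256 TiB of VA space */
#define PA_UNMAPPED    UINT64_MAX

struct l2_table { uint64_t pa[L2_ENTRIES]; };
struct va_map   { struct l2_table *l1[L1_ENTRIES]; };

/* Record that the hugepage at 'va' is backed by physical/IO address 'pa'.
 * Both must be hugepage aligned. Returns 0 on success, -1 on failure. */
int va_map_set(struct va_map *m, uint64_t va, uint64_t pa)
{
    assert((va & (HUGEPAGE_SIZE - 1)) == 0);
    assert((pa & (HUGEPAGE_SIZE - 1)) == 0);

    uint64_t page = va >> HUGEPAGE_SHIFT;
    uint64_t i1 = page / L2_ENTRIES;
    if (i1 >= L1_ENTRIES)
        return -1;                   /* outside the assumed 48-bit range */

    if (m->l1[i1] == NULL) {
        /* Allocate second-level tables lazily, all entries unmapped. */
        struct l2_table *t = malloc(sizeof(*t));
        if (t == NULL)
            return -1;
        for (size_t i = 0; i < L2_ENTRIES; i++)
            t->pa[i] = PA_UNMAPPED;
        m->l1[i1] = t;
    }
    m->l1[i1]->pa[page % L2_ENTRIES] = pa;
    return 0;
}

/* Forget the mapping for the hugepage containing 'va' (e.g. page was freed). */
void va_map_clear(struct va_map *m, uint64_t va)
{
    uint64_t page = va >> HUGEPAGE_SHIFT;
    uint64_t i1 = page / L2_ENTRIES;
    if (i1 < L1_ENTRIES && m->l1[i1] != NULL)
        m->l1[i1]->pa[page % L2_ENTRIES] = PA_UNMAPPED;
}

/* Translate any address inside a registered hugepage: two array lookups
 * plus the offset within the page. Returns PA_UNMAPPED if unknown. */
uint64_t va_map_translate(const struct va_map *m, uint64_t va)
{
    uint64_t page = va >> HUGEPAGE_SHIFT;
    uint64_t i1 = page / L2_ENTRIES;
    if (i1 >= L1_ENTRIES || m->l1[i1] == NULL)
        return PA_UNMAPPED;

    uint64_t base = m->l1[i1]->pa[page % L2_ENTRIES];
    if (base == PA_UNMAPPED)
        return PA_UNMAPPED;
    return base + (va & (HUGEPAGE_SIZE - 1));
}
```

With a layout like this, a page allocation notification would call
va_map_set() for each new hugepage and a free notification would call
va_map_clear(). The per-request cost stays at two array lookups regardless
of how many segments exist, and memory registered from outside DPDK can go
into the same table.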