> On May 9, 2018, at 4:12 PM, Ferruh Yigit <ferruh.yi...@intel.com> wrote:
> 
> On 5/9/2018 12:09 PM, Yongseok Koh wrote:
>> This is the new design of Memory Region (MR) for mlx PMD, in order to:
>> - Accommodate the new memory hotplug model.
>> - Support non-contiguous Mempool.
>>
>> There are multiple layers for MR search.
>>
>> L0 looks up the last-hit entry, which is pointed to by mr_ctrl->mru (Most
>> Recently Used). If L0 misses, L1 looks up the address in a fixed-size
>> array by linear search. L0/L1 are in an inline function -
>> mlx4_mr_lookup_cache().
>>
>> If L1 misses, the bottom-half function is called to look up the address
>> in the bigger local cache of the queue. This is L2 - mlx4_mr_addr2mr_bh() -
>> and it is not an inline function. The data structure for L2 is a B-tree.
>>
>> If L2 misses, the search falls into the slowest path, which takes locks in
>> order to access the global device cache (priv->mr.cache), which is also a
>> B-tree and caches the original MR list (priv->mr.mr_list) of the device.
>> Unless the global cache overflows, it is all-inclusive of the MR list.
>> This is L3 - mlx4_mr_lookup_dev(). The size of the L3 cache table is
>> limited and can't be expanded on the fly due to deadlock. Refer to the
>> comments in the code for the details - mr_lookup_dev(). If L3 overflows,
>> the list has to be searched directly, bypassing the cache, although this
>> is slower.
>>
>> If L3 misses, a new MR for the address has to be created -
>> mlx4_mr_create(). When it creates a new MR, it tries to register as many
>> adjacent memsegs as possible that are virtually contiguous around the
>> address. This must take two locks - memory_hotplug_lock and
>> priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any
>> allocation/free of memory inside.
>>
>> In the free callback of the memory hotplug event, freed space is looked
>> up in the MR list and the corresponding bits are cleared from the bitmap
>> of MRs. This can fragment an MR, and the MR will then have multiple
>> search entries in the caches. Once there's a change by the event, the
>> global cache must be rebuilt and all the per-queue caches will be flushed
>> as well. If memory is frequently freed at run-time, that may in the worst
>> case cause jitter in dataplane processing by incurring MR cache flushes
>> and rebuilds, but that is the least probable scenario.
>>
>> To guarantee the most optimal performance, it is highly recommended to use
>> an EAL option - '--socket-mem'. Then, the reserved memory will be pinned
>> and won't be freed dynamically. It is also recommended to configure a
>> per-lcore cache for the Mempool. Even if there are many MRs for a device,
>> or the MRs are highly fragmented, the Mempool cache will still be very
>> helpful in reducing misses on the per-queue caches.
>>
>> '--legacy-mem' is also supported.
>>
>> Signed-off-by: Yongseok Koh <ys...@mellanox.com>
> 
> <...>
> 
>> +/**
>> + * Insert an entry to B-tree lookup table.
>> + *
>> + * @param bt
>> + *   Pointer to B-tree structure.
>> + * @param entry
>> + *   Pointer to new entry to insert.
>> + *
>> + * @return
>> + *   0 on success, -1 on failure.
>> + */
>> +static int
>> +mr_btree_insert(struct mlx4_mr_btree *bt, struct mlx4_mr_cache *entry)
>> +{
>> +	struct mlx4_mr_cache *lkp_tbl;
>> +	uint16_t idx = 0;
>> +	size_t shift;
>> +
>> +	assert(bt != NULL);
>> +	assert(bt->len <= bt->size);
>> +	assert(bt->len > 0);
>> +	lkp_tbl = *bt->table;
>> +	/* Find out the slot for insertion. */
>> +	if (mr_btree_lookup(bt, &idx, entry->start) != UINT32_MAX) {
>> +		DEBUG("abort insertion to B-tree(%p):"
>> +		      " already exist at idx=%u [0x%lx, 0x%lx) lkey=0x%x",
>> +		      (void *)bt, idx, entry->start, entry->end, entry->lkey);
> 
> This and various other logs are causing 32-bit build errors because of %lx
> usage. Can you please check them?
> 
> I feel sad complaining about a patch like this just because of a log format
> issue. We should find a solution to this as a community, either checkpatch
> checks or automated 32-bit builds, I don't know.
Bummer. I have to change my bad habit of using %lx. We will also add a 32-bit
build check to our internal system to catch this kind of mistake beforehand.
I will work with Shahaf to fix it and rebase next-net-mlx.
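For the record, the portable fix is to use the format macros from
<inttypes.h> in place of %lx. Below is a minimal standalone sketch of the
idea, not the actual patch: printf() stands in for the driver's DEBUG()
macro, and the variables are made up so the snippet compiles on its own.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	uintptr_t start = 0x1000; /* illustrative address range */
	uintptr_t end = 0x2000;
	uint32_t lkey = 0xabcd;

	/*
	 * Bad: %lx expects unsigned long, whose width and type differ
	 * between 32-bit and 64-bit targets, so a line like the one below
	 * breaks the 32-bit build under -Wformat -Werror:
	 *
	 *   printf("[0x%lx, 0x%lx) lkey=0x%x\n", start, end, lkey);
	 */

	/* Good: PRIxPTR and PRIx32 match uintptr_t/uint32_t everywhere. */
	printf("[0x%" PRIxPTR ", 0x%" PRIxPTR ") lkey=0x%" PRIx32 "\n",
	       start, end, lkey);
	return 0;
}

The same pattern should apply to the DEBUG() calls in the patch: cast the
value to uintptr_t (or uint64_t) and use the matching PRIxPTR (or PRIx64)
macro.

Thanks,
Yongseok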