Hi Olivier,

On Friday 23 June 2017 10:38 PM, Jan Blunck wrote:
> On Fri, Jun 23, 2017 at 10:11 AM, Olivier Matz <olivier.m...@6wind.com> wrote:
>> Hi Jan,
>>
>> On Sat, 10 Jun 2017 10:31:22 +0200, Jan Blunck <jblu...@infradead.org> wrote:
>>> On Fri, Jun 9, 2017 at 10:29 AM, Olivier Matz <olivier.m...@6wind.com>
>>> wrote:
>>>> When populating a mempool with a virtual memory area, the mempool
>>>> library expects to be able to get the physical address of each page.
>>>>
>>>> When started with --no-huge, the physical addresses may not be available
>>>> because the pages are not locked in memory. It sometimes returns
>>>> RTE_BAD_PHYS_ADDR, which makes the mempool_populate() function to fail.
>>>>
>>>> This was working before the commit cdc242f260e7 ("eal/linux: support
>>>> running as unprivileged user"), because rte_mem_virt2phy() was returning
>>>> 0 instead of RTE_BAD_PHYS_ADDR, which was seen as a valid physical
>>>> address.
>>>>
>>>> Since --no-huge is a debug function that breaks the support of physical
>>>> drivers, always set physical addresses to RTE_BAD_PHYS_ADDR in memzones
>>>> or in rte_mem_virt2phy(), and ensure that mempool won't complain in that
>>>> case.
>>>>
>>>> Fixes: cdc242f260e7 ("eal/linux: support running as unprivileged user")
>>>>
>>>> CC: sta...@dpdk.org
>>>> Signed-off-by: Olivier Matz <olivier.m...@6wind.com>
>>>> ---
>>>>  lib/librte_eal/common/eal_common_memzone.c | 5 ++++-
>>>>  lib/librte_eal/linuxapp/eal/eal_memory.c   | 7 +++++++
>>>>  lib/librte_mempool/rte_mempool.c           | 2 +-
>>>>  3 files changed, 12 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/lib/librte_eal/common/eal_common_memzone.c
>>>> b/lib/librte_eal/common/eal_common_memzone.c
>>>> index 3026e36b8..c465c8fc2 100644
>>>> --- a/lib/librte_eal/common/eal_common_memzone.c
>>>> +++ b/lib/librte_eal/common/eal_common_memzone.c
>>>> @@ -251,7 +251,10 @@ memzone_reserve_aligned_thread_unsafe(const char
>>>> *name, size_t len,
>>>>
>>>>         mcfg->memzone_cnt++;
>>>>         snprintf(mz->name, sizeof(mz->name), "%s", name);
>>>> -       mz->phys_addr = rte_malloc_virt2phy(mz_addr);
>>>> +       if (rte_eal_has_hugepages())
>>>> +               mz->phys_addr = rte_malloc_virt2phy(mz_addr);
>>>> +       else
>>>> +               mz->phys_addr = RTE_BAD_PHYS_ADDR;
>>> Since you set phys_addrs_available to false rte_malloc_virt2phy()
>>> anyway returns RTE_BAD_PHYS_ADDR so I believe the conditional isn't
>>> necessary here.
>>>
>>> Rest of the patch looks good to me.
>> The variable phys_addrs_available only impacts rte_mem_virt2phy().
>> Here, for memzones allocation, rte_malloc_virt2phy() is used, and
>> it gets its physical address by retrieving it from the memseg structure.
>>
>> With the full patch, "dump_memzone" displays something like:
>> Zone 0: name:<rte_eth_dev_data>, phys:0xffffffffffffffff, len:0x30100,
>> [...]
>> ...
>>
>> If I strip the memzone part, it displays:
>> Zone 0: name:<rte_eth_dev_data>, phys:0x7fe382c62640, len:0x30100, [...]
>> ...
>>
>> So I think we should either keep the patch as is, or change the memseg
>> and malloc part like this (it's maybe better):
>>
>> --- a/lib/librte_eal/common/rte_malloc.c
>> +++ b/lib/librte_eal/common/rte_malloc.c
>> @@ -254,5 +254,7 @@ rte_malloc_virt2phy(const void *addr)
>>         const struct malloc_elem *elem = malloc_elem_from_data(addr);
>>         if (elem == NULL)
>>                 return 0;
>> +       if (elem->ms->phys_addr == RTE_BAD_PHYS_ADDR)
>> +               return RTE_BAD_PHYS_ADDR;
>>         return elem->ms->phys_addr + ((uintptr_t)addr -
>> (uintptr_t)elem->ms->addr);
>>  }
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c
>> b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> index 1c99852..2a401ca 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> @@ -973,7 +973,7 @@ rte_eal_hugepage_init(void)
>>                                 strerror(errno));
>>                 return -1;
>>         }
>> -       mcfg->memseg[0].phys_addr = (phys_addr_t)(uintptr_t)addr;
>> +       mcfg->memseg[0].phys_addr = RTE_BAD_PHYS_ADDR;
>>         mcfg->memseg[0].addr = addr;
>>         mcfg->memseg[0].hugepage_sz = RTE_PGSIZE_4K;
>>         mcfg->memseg[0].len = internal_config.memory;
>>
>>
>> Let me know what you are ok with this and I'll send a v2.
>>
> This approach looks better to me.
>
> Thanks,
> Jan

The approach LGTM, with one small comment:

I think we also need to fix the error return description for the
rte_malloc_virt2phy() API. It currently says 'NULL' on error; it should
say 0 or RTE_BAD_PHYS_ADDR. In fact, we should remove '0' as an error
return and keep RTE_BAD_PHYS_ADDR as the only error value. If so, the
change may look like:

        if (elem == NULL || elem->ms->phys_addr == RTE_BAD_PHYS_ADDR)
                return RTE_BAD_PHYS_ADDR;

This assumes that a return value of '0' is treated as an error value in
the current code.

Having said that, a few drivers use rte_malloc_virt2phy() without an
error check. I guess they must now check the return value before using
the phys_addr_t.
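
To make the caller-side impact concrete, here is a minimal standalone
sketch of the single-error-value idea. It is not the actual DPDK code:
the sketch_* structures and functions are hypothetical stand-ins for the
memseg, the malloc element, and a driver setup path, and phys_addr_t and
RTE_BAD_PHYS_ADDR are redefined locally only so that it compiles on its
own.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the DPDK definitions, so the sketch is standalone. */
typedef uint64_t phys_addr_t;
#define RTE_BAD_PHYS_ADDR ((phys_addr_t)-1)

/* Hypothetical, simplified memory segment. */
struct sketch_memseg {
	phys_addr_t phys_addr; /* RTE_BAD_PHYS_ADDR when unavailable (--no-huge) */
	void *addr;            /* virtual base address of the segment */
};

/* Hypothetical, simplified malloc element pointing at its segment. */
struct sketch_elem {
	const struct sketch_memseg *ms;
};

/* Translation with a single error value: RTE_BAD_PHYS_ADDR, never 0. */
static phys_addr_t
sketch_virt2phy(const struct sketch_elem *elem, const void *addr)
{
	if (elem == NULL || elem->ms->phys_addr == RTE_BAD_PHYS_ADDR)
		return RTE_BAD_PHYS_ADDR;
	return elem->ms->phys_addr +
		((uintptr_t)addr - (uintptr_t)elem->ms->addr);
}

/*
 * Caller-side check a driver would need before using the address.
 * Returns 0 and fills *out on success, -1 if no physical address exists.
 */
static int
sketch_driver_setup(const struct sketch_elem *elem, const void *addr,
		phys_addr_t *out)
{
	phys_addr_t pa = sketch_virt2phy(elem, addr);

	if (pa == RTE_BAD_PHYS_ADDR)
		return -1;
	*out = pa;
	return 0;
}
```

With a single sentinel, callers only have to compare against
RTE_BAD_PHYS_ADDR, and 0 is no longer ambiguous between "error" and a
translated address.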