On Thursday 11 October 2018 03:43 PM, Burakov, Anatoly wrote:
On 11-Oct-18 11:07 AM, Shreyansh Jain wrote:
On Thursday 11 October 2018 03:32 PM, Shreyansh Jain wrote:
On Thursday 11 October 2018 02:33 PM, Burakov, Anatoly wrote:
On 09-Oct-18 11:45 AM, Shreyansh Jain wrote:
On Tuesday 25 September 2018 07:09 PM, Shreyansh Jain wrote:
Hello Anatoly,
On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote:
On 25-Sep-18 1:54 PM, Shreyansh Jain wrote:
A common library, valid for dpaaX drivers, which is used to
maintain
a local copy of PA->VA translations.
In case of physical addressing mode (one of the option for
FSLMC, and
only option for DPAA bus), the addresses of descriptors Rx'd are
physical. These need to be converted into equivalent VA for
rte_mbuf
and other similar calls.
Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This
library is an attempt to reduce the overall cost associated with
this translation.
A small table is maintained, containing continuous entries
representing a continguous physical range. Each of these entries
stores the equivalent VA, which is fed during mempool creation, or
memory allocation/deallocation callbacks.
[...]
Also, a couple of nitpicks below.
cosnfig/common_base | 5 +
config/common_linuxapp | 5 +
drivers/common/Makefile | 4 +
drivers/common/dpaax/Makefile | 31 ++
drivers/common/dpaax/dpaax_iova_table.c | 509
++++++++++++++++++
drivers/common/dpaax/dpaax_iova_table.h | 104 ++++
drivers/common/dpaax/dpaax_logs.h | 39 ++
drivers/common/dpaax/meson.build | 12 +
<snip>
+ DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for
vaddr:(%p),"
+ " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset,
+ vaddr, paddr, length);
+ return 0;
+}
+
+int
+dpaax_iova_table_del(phys_addr_t paddr, size_t len __rte_unused)
len is not unused.
I will fix this.
Actually, this function itself is useless - more for symmetry reason.
Callers would be either simply updating the table, or ignoring it
completely. But, yes, this is indeed wrong that I set that unused.
Actually, I was wrong in my first reply. In case of
dpaax_iova_table_del(), len is indeed redundant. This is because
the mapping is for a complete page (min of 2MB size), even if the
request is for lesser length. So, removal of a single entry (of
fixed size) would be done.
In fact, while on this, I think deleting a PA->VA entry itself is
incorrect (not just useless). A single entry (~2MB equivalent) can
represent multiple users (working on a rte_malloc'd area, for
example). So, effectively, its always an update - not an add or del.
I'm not sure what you mean here. If you got a mem event about memory
area being freed, it's guaranteed to *not* have any users - neither
malloc, nor any other memory. And len is always page-aligned.
ok. Maybe I am getting this wrong, but consider this:
1) hugepage size=2MB
2) a = malloc(1M)
this will pin an entry in table for a block starting at VA=(a) and
PA=(a'). Each entry is of 2MB length - that means, even if someone
were to access a+1048577 for an equivalent PA, they would get it
(though, that is a incorrect access).
3) b = malloc(1M)
this *might* lead to a case where same 2MB page is used and
VA=(b==(a+1MB)). Being hugepage backed, PA=(b=PA(a)+1M).
= After b, the PA-VA table has a single entry of 2MB, representing
two mallocs. It can be used for translation for any thread requesting
PAs of a or b.
4) Free(a)
- this would attempt to remove one 2MB entry from PA-VA table. But,
'b' is already valid. Access to get_pa(VA(b)) should return me the
PA(b).
- 'len' is not even used as the entry in PA-VA table is of a fixed
size.
Just to add to this:
- if talking about the mem_event callback, it definitely won't be a
case where same page is still being served under another rte_malloc
- But, calls can come to delete from users of PA-VA table based on
their own rte_free().
And, your comment makes me think - I should probably del entry from
the table only when mem_event callback is received.
Mem events are not triggered on rte_free(), they're triggered on page
deallocation. A call to rte_free/rte_memzone_free/rte_mempool_free etc.
*might* trigger a page deallocation, but *only* if the memory area being
freed encompasses an entire page. If you rte_malloc() 64 bytes and then
rte_free() those 64 bytes, you won't get a mem event *unless* these were
the only 64 bytes allocated on a particular page, and the entire page is
no longer used by anything else.
My understanding is same.
But, it seems my explanation wasn't well written:
For a rte_free(), I am not expecting that mem_event is raised - but, the
caller of rte_free() (the eth or crypto drivers, or applications) may
call the PA-VA table del function to remove the entries.
This voluntary delete of table entry from the drivers or applications
using PA-VA calling del of PA-VA table - is not correct.
The path from mem_event callback clearing the PA-VA table entry is
correct (which I removed in v2) - that time the page (len) would
definitely not be used by anyone and can be removed from PA-VA table.
And, yes, I agree that mem-event may not be on an rte_free().
In the above, (3) is an assumption I am making based on my
understanding how mem allocator is working. Is that wrong?
Basically, this is a restriction of this table - it has a min chunk
of 2MB - even for 1G hugepages - and hence, it is not possible to
honor deletes. I know this is convoluted logic - but, this keeps it
simple and use-able without much performance impact.
[...]