On 29-Oct-18 2:18 PM, Thomas Monjalon wrote:
29/10/2018 14:40, Alejandro Lucero:
On Mon, Oct 29, 2018 at 1:18 PM Yao, Lei A <lei.a....@intel.com> wrote:
*From:* Alejandro Lucero [mailto:alejandro.luc...@netronome.com]
On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon <tho...@monjalon.net>
wrote:

29/10/2018 12:39, Alejandro Lucero:
I have a patch that fixes a bug where rte_eal_check_dma_mask was called with
the mask instead of the maskbits. However, this does not solve the
deadlock.

The deadlock is a bigger concern I think.

I think once the call to rte_eal_check_dma_mask uses the maskbits instead
of the mask, calling rte_memseg_walk_thread_unsafe avoids the deadlock.
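
To illustrate the mask versus maskbits mix-up, here is a minimal sketch. It is not the DPDK implementation: dma_forbidden_bits and check_range are hypothetical helpers, and the only assumption carried over from the thread is that the check takes a bit width (maskbits) and derives the mask internally.

```c
#include <stdint.h>

/* Hypothetical helper: bits that must NOT be set in an address
 * for a device that can only address `maskbits` bits. */
static uint64_t dma_forbidden_bits(uint8_t maskbits)
{
    /* A width of 64 or more covers the whole address space. */
    if (maskbits >= 64)
        return 0;
    return ~((UINT64_C(1) << maskbits) - 1);
}

/* Hypothetical check: 0 if the last byte of [iova, iova + len) is
 * addressable with `maskbits` bits, -1 otherwise. len must be > 0. */
static int check_range(uint64_t iova, uint64_t len, uint8_t maskbits)
{
    uint64_t last = iova + len - 1;

    if ((last & dma_forbidden_bits(maskbits)) != 0)
        return -1;
    return 0;
}
```

With a 40-bit device, check_range(0x1000, 0x1000, 40) accepts the range. If the caller passes the already-computed mask instead, it is truncated to the uint8_t parameter (here to 0), the derived mask then forbids every bit, and the same range is rejected.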

Yao, can you try with the attached patch?

Hi, Lucero

This patch fixes the issue on my side. Thanks a lot
for your quick action.

Great!

I will send an official patch with the changes.

Please, do not forget my other request to better comment functions.


I have to say that I tested the patchset, but I think it was at a point where
legacy_mem was still there, and therefore the dynamic memory allocation code
was not used during memory initialization.

Something concerns me, though. Using
rte_memseg_walk_thread_unsafe could be a problem in some situations,
although those situations are unlikely.

Usually, rte_eal_check_dma_mask is called during initialization, and then
it is safe to use the unsafe function for walking memsegs. With device
hotplug and dynamic memory allocation, however, there is a potential race
condition when the primary process is allocating more memory while
a device is hotplugged and a secondary process does the device
initialization. For now, this is just a problem with the NFP, and the
race window is really unlikely to be hit, but I will work on this
asap.

Yes, this is what concerns me.
You can add a comment explaining the unsafe case which is not handled.

The issue here is that this code is called from both memory-locked and memory-unlocked contexts. Virtio had a similar issue with their mem table update code - they solved it by manually locking the memory before doing everything else, and using the thread_unsafe version of the walk.

Could something like that be done here?
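
A rough sketch of that pattern, using a toy model rather than DPDK code: toy_lock, walk_locked, walk_unsafe and check_with_manual_lock are all hypothetical names, and the toy lock reports re-acquisition with an error where a real non-recursive lock would block forever.

```c
#include <stddef.h>

/* Toy stand-in for the memory lock: it only records whether it is held. */
struct toy_lock { int held; };

static struct toy_lock mem_lock;

static int toy_lock_acquire(struct toy_lock *l)
{
    if (l->held)
        return -1; /* a real non-recursive lock would deadlock here */
    l->held = 1;
    return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
    l->held = 0;
}

typedef int (*walk_cb)(int seg, void *arg);

#define NSEGS 3

/* Walks every segment without touching the lock; the caller must hold it. */
static int walk_unsafe(walk_cb cb, void *arg)
{
    for (int i = 0; i < NSEGS; i++) {
        int ret = cb(i, arg);

        if (ret != 0)
            return ret;
    }
    return 0;
}

/* Takes the lock itself; returns -2 when the lock is already held. */
static int walk_locked(walk_cb cb, void *arg)
{
    int ret;

    if (toy_lock_acquire(&mem_lock) != 0)
        return -2;
    ret = walk_unsafe(cb, arg);
    toy_lock_release(&mem_lock);
    return ret;
}

/* Example callback: counts the segments it visits. */
static int count_cb(int seg, void *arg)
{
    (void)seg;
    ++*(int *)arg;
    return 0;
}

/* The pattern from the paragraph above: lock once up front, then use
 * only the unsafe walk while the lock is held. */
static int check_with_manual_lock(int *count)
{
    int ret;

    if (toy_lock_acquire(&mem_lock) != 0)
        return -2;
    ret = walk_unsafe(count_cb, count);
    toy_lock_release(&mem_lock);
    return ret;
}
```

In this model, calling walk_locked while mem_lock is already held fails with -2, which stands in for the deadlock, while walk_unsafe succeeds in the same context.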



Interestingly, the problem looks like a compiler one. The call to
rte_memseg_walk does not return when made inside rte_eal_check_dma_mask,
but if
you modify the call like this:

-       if (rte_memseg_walk(check_iova, &mask))
+       if (!rte_memseg_walk(check_iova, &mask))

it works, although the value returned to the caller changes, of course.
But the point here is that calling rte_memseg_walk should behave the same
as before, and it does not.

Anyway, the coding style requires saving the return value in a variable
instead of nesting the call in an "if" condition.
And the "if" check should be explicitly != 0 because it is not a real
boolean.
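
As a sketch of that style only: demo_walk, demo_check_iova and demo_check_dma_mask below are hypothetical stand-ins, not the real rte_memseg_walk or its callback.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*demo_cb)(uint64_t last_addr, void *arg);

/* Hypothetical walk: calls cb on each entry and stops at the first
 * non-zero return value, which it propagates to the caller. */
static int demo_walk(const uint64_t *last_addrs, size_t n,
                     demo_cb cb, void *arg)
{
    for (size_t i = 0; i < n; i++) {
        int ret = cb(last_addrs[i], arg);

        if (ret != 0)
            return ret;
    }
    return 0;
}

/* Hypothetical callback: 1 if the address has a forbidden bit set. */
static int demo_check_iova(uint64_t last_addr, void *arg)
{
    const uint64_t *mask = arg;

    return (last_addr & *mask) != 0 ? 1 : 0;
}

static int demo_check_dma_mask(const uint64_t *last_addrs, size_t n,
                               uint64_t mask)
{
    int ret;

    /* Save the result first, then compare it explicitly against 0. */
    ret = demo_walk(last_addrs, n, demo_check_iova, &mask);
    if (ret != 0)
        return -1;
    return 0;
}
```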

PS: please do not top post and avoid HTML emails, thanks

--
Thanks,
Anatoly
