On Mon, Oct 29, 2018 at 2:18 PM Thomas Monjalon <tho...@monjalon.net> wrote:
> 29/10/2018 14:40, Alejandro Lucero: > > On Mon, Oct 29, 2018 at 1:18 PM Yao, Lei A <lei.a....@intel.com> wrote: > > > *From:* Alejandro Lucero [mailto:alejandro.luc...@netronome.com] > > > On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon <tho...@monjalon.net> > > > wrote: > > > > > > 29/10/2018 12:39, Alejandro Lucero: > > > > I got a patch that solves a bug when calling rte_eal_dma_mask using > the > > > > mask instead of the maskbits. However, this does not solves the > > > deadlock. > > > > > > The deadlock is a bigger concern I think. > > > > > > I think once the call to rte_eal_check_dma_mask uses the maskbits > instead > > > of the mask, calling rte_memseg_walk_thread_unsafe avoids the deadlock. > > > > > > Yao, can you try with the attached patch? > > > > > > Hi, Lucero > > > > > > This patch can fix the issue at my side. Thanks a lot > > > for you quick action. > > > > Great! > > > > I will send an official patch with the changes. > > Please, do not forget my other request to better comment functions. > > > Sure. > > I have to say that I tested the patchset, but I think it was where > > legacy_mem was still there and therefore dynamic memory allocation code > not > > used during memory initialization. > > > > There is something that concerns me though. Using > > rte_memseg_walk_thread_unsafe could be a problem under some situations > > although those situations being unlikely. > > > > Usually, calling rte_eal_check_dma_mask happens during initialization. > Then > > it is safe to use the unsafe function for walking memsegs, but with > device > > hotplug and dynamic memory allocation, there exists a potential race > > condition when the primary process is allocating more memory and > > concurrently a device is hotplugged and a secondary process does the > device > > initialization. By now, this is just a problem with the NFP, and the > > potential race condition window really unlikely, but I will work on this > > asap. > > Yes, this is what concerns me. > You can add a comment explaining the unsafe which is not handled. > > I'' do. Thanks! > > > > > Interestingly, the problem looks like a compiler one. Calling > > > > rte_memseg_walk does not return when calling inside rt_eal_dma_mask, > > > but if > > > > you modify the call like this: > > > > > > > > - if (rte_memseg_walk(check_iova, &mask)) > > > > + if (!rte_memseg_walk(check_iova, &mask)) > > > > > > > > it works, although the value returned to the invoker changes, of > course. > > > > But the point here is it should be the same behaviour when calling > > > > rte_memseg_walk than before and it is not. > > > > > > Anyway, the coding style requires to save the return value in a > variable, > > > instead of nesting the call in an "if" condition. > > > And the "if" check should be explicitly != 0 because it is not a real > > > boolean. > > > > > > PS: please do not top post and avoid HTML emails, thanks > > > > > > > > > > > > > >