On 23/02/2015 18:23, Neil Horman wrote: > On Mon, Feb 23, 2015 at 02:58:30PM +0000, Gonzalez Monroy, Sergio wrote: >> On 23/02/2015 13:52, Neil Horman wrote: >>> On Mon, Feb 23, 2015 at 10:25:01AM +0000, Gonzalez Monroy, Sergio wrote: >>>> On 22/02/2015 23:37, Neil Horman wrote: >>>>> On Fri, Feb 20, 2015 at 02:31:36PM +0000, Gonzalez Monroy, Sergio wrote: >>>>>> On 13/02/2015 12:51, Neil Horman wrote: >>>>>>> On Fri, Feb 13, 2015 at 11:08:02AM +0000, Gonzalez Monroy, Sergio wrote: >>>>>>>> On 13/02/2015 10:14, Panu Matilainen wrote: >>>>>>>>> On 02/12/2015 05:52 PM, Neil Horman wrote: >>>>>>>>>> On Thu, Feb 12, 2015 at 04:07:50PM +0200, Panu Matilainen wrote: >>>>>>>>>>> On 02/12/2015 02:23 PM, Neil Horman wrote: >>>>>>>>> [...snip...] >>>>>>>>>>>>>>> So I just realized that I was not having into account a possible >>>>>>>>>>>>>>> scenario, where >>>>>>>>>>>>>>> we have an app built with static dpdk libs then loading a dso >>>>>>>>>>>>>>> with -d >>>>>>>>>>>>>>> option. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In such case, because the pmd would have DT_NEEDED entries, >>>>>>>>>>>>>>> dlopen will >>>>>>>>>>>>>>> fail. >>>>>>>>>>>>>>> So to enable such scenario we would need to build PMDs without >>>>>>>>>>>>>>> DT_NEEDED >>>>>>>>>>>>>>> entries. >>>>>>>>>>>>>> Hmm, for that to be a problem you'd need to have the PMD built >>>>>>>>>>>>>> against >>>>>>>>>>>>>> shared dpdk libs and while the application is built against >>>>>>>>>>>>>> static dpdk >>>>>>>>>>>>>> libs. I dont think that's a supportable scenario in any case. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Or is there some other scenario that I'm not seeing? >>>>>>>>>>>>>> >>>>>>>>>>>>>> - Panu - >>>>>>>>>>>>>> >>>>>>>>>>>>> I agree with you. I suppose it comes down to, do we want to >>>>>>>>>>>>> support such >>>>>>>>>>>>> scenario? >>>>>>>>>>>>> >>>>>>>>>>>>> From what I can see, it seems that we do currently support such >>>>>>>>>>>>> scenario by >>>>>>>>>>>>> building dpdk apps against all static dpdk libs using >>>>>>>>>>>>> --whole-archive (all >>>>>>>>>>>>> libs and not only PMDs). >>>>>>>>>>>>> http://dpdk.org/browse/dpdk/commit/?id=20afd76a504155e947c770783ef5023e87136ad8 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Am I misunderstanding this? >>>>>>>>>>>>> >>>>>>>>>>>> Shoot, you're right, I missed the static build aspect to this. >>>>>>>>>>>> Yes, >>>>>>>>>>>> if we do the following: >>>>>>>>>>>> >>>>>>>>>>>> 1) Build the DPDK as a static library >>>>>>>>>>>> 2) Link an application against (1) >>>>>>>>>>>> 3) Use the dlopen mechanism to load a PMD built as a DSO >>>>>>>>>>>> >>>>>>>>>>>> Then the DT_NEEDED entries in the DSO will go unsatisfied, because >>>>>>>>>>>> the shared >>>>>>>>>>>> objects on which it (the PMD) depends will not exist in the file >>>>>>>>>>>> system. >>>>>>>>>>> I think its even more twisty: >>>>>>>>>>> >>>>>>>>>>> 1) Build the DPDK as a static library >>>>>>>>>>> 2) Link an application against (1) >>>>>>>>>>> 3) Do another build of DPDK as a shared library >>>>>>>>>>> 4) In app 2), use the dlopen mechanism to load a PMD built as a part >>>>>>>>>>> of or >>>>>>>>>>> against 3) >>>>>>>>>>> >>>>>>>>>>> Somehow I doubt this would work very well. >>>>>>>>>>> >>>>>>>>>> Ideally it should, presuming the ABI is preserved between (1) and >>>>>>>>>> (3), >>>>>>>>>> though I >>>>>>>>>> agree, up until recently, that was an assumption that was unreliable. >>>>>>>>> Versioning is a big and important step towards reliability but there >>>>>>>>> are >>>>>>>>> more issues to solve. This of course getting pretty far from the >>>>>>>>> original >>>>>>>>> topic, but at least one such issue is that there are some cases where >>>>>>>>> a >>>>>>>>> config value affects what are apparently public structs (rte_mbuf wrt >>>>>>>>> RTE_MBUF_REFCNT for example), which really is a no-go. >>>>>>>>> >>>>>>>> Agree, the RTE_MBUF_REFCNT is something that needs to be dealt with >>>>>>>> asap. >>>>>>>> I'll look into it. >>>>>>>> >>>>>>>>>>>> I think the problem is a little bit orthogonal to the libdpdk_core >>>>>>>>>>>> problem you >>>>>>>>>>>> were initially addressing. That is to say, this problem of >>>>>>>>>>>> dlopen-ed PMD's >>>>>>>>>>>> exists regardless of weather you build the DPDK as part of a static >>>>>>>>>>>> or dynamic >>>>>>>>>>>> library. The problems just happen to intersect in their >>>>>>>>>>>> manipulation of the >>>>>>>>>>>> DT_NEEDED entries. >>>>>>>>>>>> >>>>>>>>>>>> Ok, so, given the above, I would say your approach is likely >>>>>>>>>>>> correct, just >>>>>>>>>>>> prevent DT_NEEDED entries from getting added to PMD's. Doing so >>>>>>>>>>>> will >>>>>>>>>>>> sidestep >>>>>>>>>>>> loading issue for libraries that may not exist in the filesystem, >>>>>>>>>>>> but thats ok, >>>>>>>>>>>> because by all rights, the symbols codified in those needed >>>>>>>>>>>> libraries should >>>>>>>>>>>> already be present in the running application (either made >>>>>>>>>>>> available >>>>>>>>>>>> by the >>>>>>>>>>>> application having statically linked them, or having the linker >>>>>>>>>>>> load >>>>>>>>>>>> them from >>>>>>>>>>>> the proper libraries at run time). >>>>>>>>>>> My 5c is that I'd much rather see the common case (all static or all >>>>>>>>>>> shared) >>>>>>>>>>> be simple and reliable, which in case of DSOs includes no lying >>>>>>>>>>> (whether by >>>>>>>>>>> omission or otherwise) about DT_NEEDED, ever. That way the issue is >>>>>>>>>>> dealt >>>>>>>>>>> once where it belongs. If somebody wants to go down the rabbit hole >>>>>>>>>>> of >>>>>>>>>>> mixed >>>>>>>>>>> shared + static linkage, let them dig the hole by themselves :) >>>>>>>>>>> >>>>>>>>>> This is a fair point. Can DT_NEEDED sections be stripped via tools >>>>>>>>>> like >>>>>>>>>> objcopy >>>>>>>>>> after the build is complete? If so, end users can hack this corner >>>>>>>>>> case >>>>>>>>>> to work >>>>>>>>>> as needed. >>>>>>>>> Patchelf (http://nixos.org/patchelf.html) appears to support that, but >>>>>>>>> given that source is available it'd be easier to just modify the >>>>>>>>> makefiles >>>>>>>>> if that's really needed. >>>>>>>>> >>>>>>>> I think we agree on the issue. >>>>>>>> >>>>>>>> So I'll be sending a patch to add DT_NEEDED entries to all libraries >>>>>>>> and >>>>>>>> PMDs. The only exception would be librte_eal, which would not have >>>>>>>> proper >>>>>>>> NEEDED entries. >>>>>>>> Do we bother adding a linker script for librte_eal that would include >>>>>>>> dependent libraries? >>>>>>>> >>>>>>> I say yes to the linker script, but will happily bow to an alternate >>>>>>> consensus >>>>>>> Neil >>>>>>> >>>>>> So the case we want to solve is the following circular dependencies: >>>>>> eal -> mempool, malloc >>>>>> mempool -> eal , malloc, ring >>>>>> malloc -> eal >>>>>> ring -> eal, malloc >>>>>> >>>>>> We cannot write/create the proposed (below) linker script at least until >>>>>> we >>>>>> have built mempool and malloc. >>>>>> INPUT ( -lrte_eal.so -lrte_mempool -lrte_malloc ) >>>>>> >>>>> Not sure I understand why you have a build time dependency on this. Link >>>>> time >>>>> perhaps, but not build time. Or am I reading too much into your use of >>>>> the term >>>>> 'built' above? >>>> I meant 'built' as compiled + linked. Am I misusing the term? >>> No, you're not (though I misused the term link time above, I meant to say >>> load >>> time). So you're saying that when you build shared libraries, you get >>> linker >>> errors indicating that, during the build, you're missing symbols, is that >>> correct? I guess I'm confused because I don't see how thats not happening >>> for >>> everyone, right now. In other words, I'm not sure what about your changes >>> is >>> giving rise to that problem. >>> >>>>>> Few ways I have thought about implementing this (not particularly fond of >>>>>> any of them) : >>>>>> - Have the linker script file in the repo (scripts/ ?) in a fixed >>>>>> location >>>>>> and just copy it to $(RTE_OUTPUT)/lib/ once all libs have finished >>>>>> building. >>>>>> - Generate the file on build time from a defined make variable once all >>>>>> libs have finished >>>>>> >>>>> I'm still not sure I understand. Why does this dependency exist at build >>>>> time? >>>>> The dependency between malloc and eal shouldn't be a problem during the >>>>> build, >>>>> as symbols from each other should just remain undefined, and get resolved >>>>> at >>>>> load time. >>>> Is that not the way it is currently implemented? >>>> I get the impression that we are talking about different goals (correct me >>>> if it is not the case) >>>> >>> We may well be, I'm not sure yet. >>> >>>> I thought that the agreed solution was to: >>>> 1) NOT to create/generate a 'core' library >>>> 2) Add DT_NEEDED entries for all libraries (except eal which is the first >>>> library we link) >>>> 3) Use linker script for eal >>>> >>> Ok, we're definately on the same page, as thats what I thought the goal was >>> as >>> well. >>> >>>> Given the previously mentioned circular dependencies between eal, mempool, >>>> malloc and ring: >>>> - eal would not be linked against other libraries (no NEEDED entries) >>>> - malloc is linked against eal (previously built), so malloc would have a >>>> NEEDED entry for eal. >>>> >>>> In that scenario, if the linker script is setup/created after we build eal, >>>> then when we try to link malloc >>>> against eal, the linker will pull mempool and malloc too (because we >>>> included them in the linker script). >>>> Therefore, the link fails as none of those libraries (malloc and mempool) >>>> have been built yet. >>>> >>> Ah, I see now, I wasn't thinking about the extra requirements that DT_NEEDED >>> entries placed on the build conditions. >>> >>> I see now, apologies for being dense previously. Given what you indicate I >>> would say that the solution here is to link the libraries against individual >>> other specific libraries, not the core library that you generate as a linker >>> script. That way you avoid the circular dependency, and the core library >>> just >>> becomes a convienience for application developers looking to link to a >>> single >>> library. >>> >> I'm not sure I quite understand your suggestion. >> >> Could you roughly describe steps for building eal, malloc and mempool libs ? >> For example, something like this? >> 1) build eal, which creates librte_eal.so.1 >> 2) write linker script for librte_eal.so >> 3) build malloc against eal (-lrte_eal ) >> etc > Hm, so I spent a bit of time looking at this, and your right, I thought this > was > really just a artifact of the introduction of --as-needed to the build to > force > DT_NEEDED entries, and my suggestion was that you simply not link the > libraries > that were causing the circular dependency. I had assumed that the link > directives included -lrte_malloc -lrte_mempool for the eal library, but they > weren't really needed, so you could remove them and it would work out. > > Unfortunately that turns out to not be the case. librte_eal does explicitly > use > calls in librte_malloc, and vice versa. The current use of -no-as-needed in > the > build system (which I was previously unaware of), is a hack to avoid having to > address that problem. > > That throws a monkey wrench into this plan. I would see 3 ways forward: > > 1) Fix the problem - That is to say, remove the use of --no-as-needed from the > build, and address the circular dependencies that arise. This could/will mean > actually merging libraries with circular dependencies into a single library, > as > they should be so that they can completely resolve all of the symbols they use > at link time > > 2) Ignore the problem. If we just keep the lack of DT_NEEDED entries in > place, > I think the problem goes away, and we can continue on. > > I think option 1 is likely the more correct approach, as removing DT_NEEDED to > avoid a circuar depenency is a hack, but it may not be the most pragmatic > approach. just living without DT_NEEDED entries and documenting link time > needs > will certainly be faster, and might be the better course of action, especially > if we provide a 'core' pseudo library/linker script that embodies that action > for the end user. > > Neil > So basically for 1) the approach of creating a core library would be a solution. The last suggestion for the core library was to not merge the sources but generate a single library. The problem with that was the versioning wouldn't work as it currently is, given that is per library. So if we were to create a core library, we just need to merge the version.map files of each library into a single version.map for the core library. This approach, as you noted, would be the proper fix.
The other solution would be to just leave eal without DT_NEEDED entries and specify in the docs that apps require eal, mempool, malloc and ring to be link such as: --no-as-needed -lrte_eal -lrte_mempool -lrte_malloc -lrte_ring --as-needed With those flags it should work regardless of --as-needed being set before hand. With this second approach we would have all libraries, but eal, with DT_NEEDED entries and we would not need to create a core library. We don't really need to create a linker script for that (not sure we can even create such linker script with those flags) and just documenting link time needs as you mentioned should be enough. So should I go forward with last suggested approach? Regards, Sergio >> I suppose that another way to go about this, instead of creating the linker >> script that pulls >> dependent libraries, is to always link (using --no-as-needed in case gcc >> adds it by default) >> against these libraries (eal, mempool, malloc, and ring) with necessary doc >> about how to build apps. >> >> Sergio >>> Neil >>> >>>> Was your suggestion to leave all of these libraries (eal, mempool, malloc, >>>> ring) without NEEDED entries? >>>> >>> No, you can add NEEDED entries there, they will just be for the individual >>> libraries, not the core linker script library. >>> >>> Best >>> Neil >>> >>>> Regards, >>>> Sergio >>>>> What is the error you are getting? >>>>> >>>>> Best >>>>> Neil >>>>> >>>>>> Thoughts? any other approached is more than welcome! >>>>>> >>>>>> Sergio >>>>>> >>>>>> PS: Thinking again on the core library and the issue of having multiple >>>>>> version.map files, we could have a core_version.map instead instead of >>>>>> multiple files per core library (eal, mempool, etc) >>>>>> >>>>>> >>>> >> >>