Jeff,

Wow, that is quite a study of the different options. I will spend some time looking over things to better understand the (complex) situation. I will also talk with Lisandro Dalcin about what he thinks the best approach is for mpi4py. One question though. You said that nothing had changed in this respect from 1.2.3 to 1.2.4, but 1.2.3 doesn't show the problem. Does this make sense?

Brian

On 10/16/07, Jeff Squyres <jsquy...@cisco.com> wrote:
> On Oct 12, 2007, at 3:5 PM, Brian Granger wrote:
>
> > > My guess is that Rmpi is dynamically loading libmpi.so, but not
> > > specifying the RTLD_GLOBAL flag. This means that libmpi.so is not
> > > available to the components the way it should be, and all goes
> > > downhill from there. It only mostly works because we do something
> > > silly with how we link most of our components, and Linux is just
> > > smart enough to cover our rears (thankfully).
> >
> > In mpi4py, libmpi.so is linked in at compile time, not loaded using
> > dlopen. Granted, the resulting mpi4py binary is loaded into python
> > using dlopen.
>
> I believe that means that libmpi.so will be loaded as an indirect
> dependency of mpi4py. See the table below.
>
> > > The pt2pt component (rightly) does not have a -lmpi in its link
> > > line. The other components that use symbols in libmpi.so (wrongly)
> > > do have a -lmpi in their link line. This can cause some problems on
> > > some platforms (Linux tends to do dynamic linking / dynamic loading
> > > better than most). That's why only the pt2pt component fails.
> >
> > Did this change from 1.2.3 to 1.2.4?
>
> No:
>
> % diff openmpi-1.2.3/ompi/mca/osc/pt2pt/Makefile.am openmpi-1.2.4/ompi/mca/osc/pt2pt/Makefile.am
> %
>
> > > Solutions:
> > >
> > > - Someone could make the pt2pt osc component link in libmpi.so
> > >   like the rest of the components and hope that no one ever
> > >   tries this on a non-friendly platform.
> >
> > Shouldn't the openmpi build system be able to figure this stuff out on
> > a per platform basis?
>
> I believe that this would not be useful -- see the tables and
> conclusions below.
>
> > > - Debian (and all Rmpi users) could configure Open MPI with the
> > >   --disable-dlopen flag and ignore the problem.
> >
> > Are there disadvantages to this approach?
>
> You won't be able to add more OMPI components to your existing
> installation (e.g., 3rd party components). But that's probably ok,
> at least for now -- not many people are distributing 3rd party OMPI
> components.
>
> > > - Someone could fix Rmpi to dlopen libmpi.so with the RTLD_GLOBAL
> > >   flag and fix the problem properly.
> >
> > Again, my main problem with this solution is that it means I must both
> > link to libmpi at compile time and load it dynamically using dlopen.
> > This doesn't seem right. Also, it makes it impossible on OS X to
> > avoid setting LD_LIBRARY_PATH (OS X doesn't have rpath). Being able
> > to use openmpi without setting LD_LIBRARY_PATH is important.
>
> This is a very complex issue. Here are the possibilities that I see...
> (prepare for confusion!)
>
> ========================================================================
>
> This first table represents what happens in the following scenarios:
>
> - compile an application against Open MPI's libmpi, or
> - compile an "application" DSO that is dlopen'ed with RTLD_GLOBAL, or
> - explicitly dlopen Open MPI's libmpi with RTLD_GLOBAL
>
>                                            OMPI DSO
>                  libmpi       OMPI DSO     components
>      App linked  includes     components   depend on
>      against     components?  available?   libmpi.so?   Result
>      ----------  -----------  ----------   ----------   ----------
>  1.  libmpi.so   no           no           NA           won't run
>  2.  libmpi.so   no           yes          no           yes
>  3.  libmpi.so   no           yes          yes          yes (*1*)
>  4.  libmpi.so   yes          no           NA           yes
>  5.  libmpi.so   yes          yes          no           maybe (*2*)
>  6.  libmpi.so   yes          yes          yes          maybe (*3*)
>      ----------  -----------  ----------   ----------   ----------
>  7.  libmpi.a    no           no           NA           won't run
>  8.  libmpi.a    no           yes          no           yes (*4*)
>  9.  libmpi.a    no           yes          yes          no (*5*)
> 10.  libmpi.a    yes          no           NA           yes
> 11.  libmpi.a    yes          yes          no           maybe (*6*)
> 12.  libmpi.a    yes          yes          yes          no (*7*)
>      ----------  -----------  ----------   ----------   ----------
>
> All libmpi.a scenarios assume that libmpi.so is also available.
>
> In the OMPI v1.2 series, most components link against libmpi.so, but
> some do not (it's our mistake for not being uniform).
>
> (*1*) As far as we know, this works on all platforms that have dlopen
> (i.e., almost everywhere). But we've only tested (recently) Linux,
> OSX, and Solaris. These 3 dynamic loaders are smart enough to realize
> that they only need to load libmpi.so once (i.e., that the implicit
> dependency of libmpi.so brought in by the components is the same
> libmpi.so that is already loaded), so everything works fine.
>
> (*2*) If the *same* component is both in libmpi and available as a
> DSO, the same symbols will be defined twice when the component is
> dlopen'ed and Badness will ensue. If the components are different,
> all platforms should be ok.
>
> (*3*) Same caveat as (*2*) about a component being both in libmpi and
> available as a DSO. Same as (*1*) for whether libmpi.so is loaded
> multiple times by the dynamic loader or not.
>
> (*4*) Only works if the application was compiled with the equivalent
> of the GNU linker's --whole-archive flag.
>
> (*5*) This does not work because libmpi.a will be loaded and libmpi.so
> will also be pulled in as a dependency of the components. As such,
> all the data structures in libmpi will [attempt to] be in the process
> twice: the "main libmpi" will have one set and the libmpi pulled in by
> the component dependencies will have a different set. Nothing good will
> come of that: possibly dynamic linker run-time symbol conflicts or
> possibly two separate copies of the symbols. Both possibilities are
> Bad.
>
> (*6*) Same caveat as (*2*) about a component being both in libmpi and
> available as a DSO.
>
> (*7*) Same problem as (*5*).
>
> ========================================================================
>
> This second table represents what happens in the following scenarios:
>
> - compile an "application" DSO that is dlopen'ed with RTLD_LOCAL, or
> - explicitly dlopen Open MPI's libmpi with RTLD_LOCAL
>
>                                            OMPI DSO
>      App         libmpi       OMPI DSO     components
>      DSO linked  includes     components   depend on
>      against     components?  available?   libmpi.so?   Result
>      ----------  -----------  ----------   ----------   ----------
> 13.  libmpi.so   no           no           NA           won't run
> 14.  libmpi.so   no           yes          no           no (*8*)
> 15.  libmpi.so   no           yes          yes          maybe (*9*)
> 16.  libmpi.so   yes          no           NA           ok
> 17.  libmpi.so   yes          yes          no           no (*10*)
> 18.  libmpi.so   yes          yes          yes          maybe (*11*)
>      ----------  -----------  ----------   ----------   ----------
> 19.  libmpi.a    no           no           NA           won't run
> 20.  libmpi.a    no           yes          no           no (*12*)
> 21.  libmpi.a    no           yes          yes          no (*13*)
> 22.  libmpi.a    yes          no           NA           ok
> 23.  libmpi.a    yes          yes          no           no (*14*)
> 24.  libmpi.a    yes          yes          yes          no (*15*)
>      ----------  -----------  ----------   ----------   ----------
>
> All libmpi.a scenarios assume that libmpi.so is also available.
>
> (*8*) This does not work because the OMPI DSOs require symbols in
> libmpi that will not be able to be found because libmpi.so was not
> loaded in the global scope.
>
> (*9*) This is a fun case: the Linux dynamic linker is smart enough to
> make it work, but others likely will not. What happens is that
> libmpi.so is loaded in a LOCAL scope, but then OMPI dlopens its own
> DSOs that require symbols from libmpi. The Linux linker figures this
> out and resolves the required symbols from the already-loaded LOCAL
> libmpi.so. Other linkers will fail to figure out that there is a
> libmpi.so already loaded in the process and will therefore load a 2nd
> copy. This results in the problems cited in (*5*).
>
> (*10*) This does not work either a) because of the caveat stated in
> (*2*) or b) because of the unresolved symbol issue stated in (*8*).
>
> (*11*) This may not work either because of the caveat stated in (*2*)
> or because of the duplicate libmpi.so issue cited in (*9*). If you are
> using the Linux linker, then (*9*) is not an issue, and it should
> work.
>
> (*12*) Essentially the same as the unresolved symbol issue cited in
> (*8*), but with libmpi.a instead of libmpi.so.
>
> (*13*) Worse than (*9*); the Linux linker will not figure this one out
> because the libmpi.so symbols are not part of "libmpi" -- they are
> simply part of the application DSO and therefore there's no way for
> the linker to know that by loading libmpi.so, it's going to be loading
> a 2nd set of the same symbols that are already in the process. Hence,
> we devolve down to the duplicate symbol issue cited in (*5*).
>
> (*14*) This does not work either a) because of the caveat stated in
> (*2*) or b) because of the unresolved symbols issue stated in (*8*).
>
> (*15*) This may not work either because of the caveat stated in (*2*)
> or because of the duplicate libmpi.so issue cited in (*13*).
>
> ========================================================================
>
> (I'm going to put this data on the OMPI web site somewhere because it
> took me all day yesterday to get it straight in my head and type it
> out :-) )
>
> In the OMPI v1.2 series, most OMPI configurations fall into scenarios
> 2 and 3 (as I mentioned above, we have some components that link
> against libmpi and others that don't -- our mistake for not being
> consistent).
>
> The problematic scenario that the R and Python MPI plugins are
> running into is 14 because the osc_pt2pt component does *not* link
> against libmpi. Most of the rest of our components do link against
> libmpi, and therefore fall into scenario 15, and therefore work on
> Linux (but possibly not elsewhere).
>
> With all this being said, if you are looking for a general solution
> for the Python and R plugins, dlopen() of libmpi with RTLD_GLOBAL
> before MPI_INIT seems to be the way to go.
> Specifically, even if we
> updated osc_pt2pt to link against libmpi, that will work on Linux,
> but not elsewhere. dlopen'ing libmpi with GLOBAL seems to be the
> most portable solution.
>
> Indeed, table 1 also suggests that we should change our components
> (as Brian suggests) to all *not* link against libmpi, because then
> we'll gain the ability to work properly with a static libmpi.a,
> putting OMPI's common usage into scenarios 2 and 8 (which is better
> than the 2, 3, 8, and 9 scenarios that are used today, which means we
> don't work with libmpi.a).
>
> ...but I think that this would break the current R and Python plugins
> until they put in the explicit call to dlopen().
>
> --
> Jeff Squyres
> Cisco Systems
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
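
For reference, here is a minimal sketch of what the "dlopen() of libmpi with RTLD_GLOBAL before MPI_INIT" suggestion might look like from the Python side, using the standard ctypes module. This is only an illustration, not how mpi4py or Rmpi actually do it. The helper name preload_global is made up for this example, and the library name "mpi" assumes that libmpi can be found by the normal loader search path, so it does not by itself address the LD_LIBRARY_PATH concern on OS X.

```python
import ctypes
import ctypes.util


def preload_global(name):
    """Try to dlopen `name` with RTLD_GLOBAL; return the handle or None.

    `name` may be a bare library name such as "mpi" (looked up with
    ctypes.util.find_library) or an explicit file name such as
    "libmpi.so". This helper is hypothetical and not Open MPI specific.
    """
    candidates = []
    found = ctypes.util.find_library(name)
    if found:
        candidates.append(found)
    # Fall back to the name exactly as given (file name or full path).
    candidates.append(name)

    for candidate in candidates:
        try:
            # mode=RTLD_GLOBAL places the library's symbols in the global
            # scope, so DSOs that are dlopen'ed later can resolve against
            # them even if they were not linked against the library.
            return ctypes.CDLL(candidate, mode=ctypes.RTLD_GLOBAL)
        except OSError:
            continue
    return None
```

The intended use would be to call preload_global("mpi") once, before importing the MPI extension module (and therefore before MPI_INIT runs), and to treat a None result as "libmpi could not be preloaded" rather than as a fatal error.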