Hi All,

Thanks for the feedback. I'm a little perplexed about how we got here;
I'd think that if it were linking against the PMI libraries, the slurm version
wouldn't matter. The PMI libraries aren't versioned along with slurm:
/usr/lib64/libpmi.so
/usr/lib64/libpmi.so.0
/usr/lib64/libpmi.so.0.0.0 (real file)
/usr/lib64/libpmi2.so
/usr/lib64/libpmi2.so.0
/usr/lib64/libpmi2.so.0.0.0 (real file)

FWIW slurm has:
/usr/lib64/libslurm.so
/usr/lib64/libslurm.so.29 (real file)
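
One way to see which of these files actually pulls in a versioned libslurm is to run ldd on each one (and on the Open MPI component) and look for unresolved entries. A rough sketch in Python follows. The function name missing_libs and the sample ldd text are made up for illustration; they are not output from these systems.

```python
def missing_libs(ldd_output):
    """Return the names of shared libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved dependencies look like: "libfoo.so.N => not found"
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found" and name not in missing:
            missing.append(name)
    return missing


# Hypothetical ldd output, invented for this example only.
SAMPLE = """\
\tlibpmi.so.0 => /usr/lib64/libpmi.so.0 (0x00007f0000000000)
\tlibslurm.so.28 => not found
\tlibc.so.6 => /lib64/libc.so.6 (0x00007f0000200000)
"""

if __name__ == "__main__":
    print(missing_libs(SAMPLE))
```

In practice the text would come from something like subprocess.run(["ldd", path], capture_output=True, text=True).stdout rather than a hard-coded sample.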

An easy temporary fix is to create a symlink named libslurm.so.28 that points
to libslurm.so.29; things just work. It's not really a long-term strategy, but
it gets folks running again.
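
The workaround can be sketched roughly as below, run in a scratch directory so nothing under /usr/lib64 is touched. The helper name add_compat_symlink is hypothetical. A changed library version number usually signals an ABI change, so treat this as a stopgap, not a guarantee that the old binaries behave correctly.

```python
import os
import tempfile


def add_compat_symlink(lib_dir, real_name, old_name):
    """Create lib_dir/old_name as a symlink to real_name; return the link path."""
    link_path = os.path.join(lib_dir, old_name)
    if not os.path.lexists(link_path):
        # Relative target, so the link stays valid if the directory is moved.
        os.symlink(real_name, link_path)
    return link_path


if __name__ == "__main__":
    # Demonstrate with an empty placeholder file instead of a real library.
    with tempfile.TemporaryDirectory() as scratch:
        open(os.path.join(scratch, "libslurm.so.29"), "wb").close()
        link = add_compat_symlink(scratch, "libslurm.so.29", "libslurm.so.28")
        print(os.path.basename(os.path.realpath(link)))
```

On a real system the equivalent step would need root and would go in the directory that holds libslurm.so.29.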

Sounds like I should follow up with the slurm list.

Regards,

Will

> On Jan 29, 2016, at 3:59 AM, Gilles Gouaillardet 
> <gilles.gouaillar...@gmail.com> wrote:
> 
> on second thought, is there any chance your sysadmin removed the old 
> libslurm.so.x but kept the old libpmix.so.y ?
> In this case, the real issue would be hidden: your sysadmin "broke" the old
> libpmi, but you actually want to use the new one.
> 
> Cheers,
> 
> Gilles
> 
> On Friday, January 29, 2016, Gilles Gouaillardet
> <gilles.gouaillar...@gmail.com> wrote:
> Is openmpi linked with a static libpmi.a that requires a dynamic libslurm?
> That can be checked by running ldd on mca_ess_pmi.so.
> 
> btw, do the slurm folks increase the libpmi.so version each time slurm is
> upgraded? That could be part of the issue ...
> But if they increase the lib version because of ABI changes, it might be a bad
> idea to open libxxx.so instead of libxxx.so.y.
> Generally speaking, libxxx.so.y is provided by the libxxx package, and
> libxxx.so is provided by the libxxx-devel package, which means it might not be
> available on compute nodes.
> We could also dlopen libxxx instead of linking with it, and have the sysadmin
> configure openmpi so it finds the right lib (this approach is used by a
> prominent vendor, and has other pros but also cons).
> 
> Cheers,
> 
> Gilles
> 
> On Friday, January 29, 2016, Ralph Castain <r...@open-mpi.org> wrote:
> It makes sense - but isn’t it slurm that is linking libpmi against libslurm? 
> I don’t think we are making that connection, so it would be a slurm issue to 
> change it.
> 
> 
>> On Jan 28, 2016, at 10:12 PM, William Law <willthe...@gmail.com> wrote:
>> 
>> Hi,
>> 
>> Our group can't find any way to do this, and it'd be helpful.
>> 
>> We use slurm and keep upgrading the slurm environment. OpenMPI bombs out
>> against PMI each time the libslurm version changes, which seems to happen
>> fairly regularly. Is there a way to compile against slurm but insulate
>> ourselves from the libslurm chaos? Obviously we'll ask the slurm folks too.
>> 
>> [wlaw@some-node /scratch/users/wlaw/imb/src]$ mpirun -n 2 --mca grpcomm ^pmi 
>> ./IMB-MPI1 
>> [some-node.local:42584] mca: base: component_find: unable to open 
>> /share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_ess_pmi: 
>> libslurm.so.28: cannot open shared object file: No such file or directory 
>> (ignored)
>> [some-node.local:42585] mca: base: component_find: unable to open 
>> /share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_pubsub_pmi: 
>> libslurm.so.28: cannot open shared object file: No such file or directory 
>> (ignored)
>> [some-node.local:42586] mca: base: component_find: unable to open 
>> /share/sw/free/openmpi/1.6.5/intel/13sp1up1/lib/openmpi/mca_pubsub_pmi: 
>> libslurm.so.28: cannot open shared object file: No such file or directory 
>> (ignored)
>> 
>> (sent it via the wrong email so it bounced..... heh)
>> 
>> Upon further investigation it seems like the most appropriate thing would be 
>> to point it at compile time to libslurm.so instead of libslurm.so.xx; does 
>> that make sense?
>> 
>> Thanks,
>> 
>> Will
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2016/01/28408.php
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2016/01/28415.php
