[slurm-users] slurm EPEL7/8 bump coming in 4 days (20.11.2 to 20.11.5)

2021-04-12 Thread Philip Kovacs
Several people asked me to bump EPEL's slurm up from 20.11.2, mostly due to MPI-related issues with that release, so I've got 20.11.5 on deck with 4 days left until it goes stable. Please protect your private slurm installations so there are no surprises when this release hits the EPEL repos in 4 days. Phil
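One way to protect a private installation is to exclude the slurm packages in the EPEL repo definition. The following is only a sketch under stated assumptions: the helper name `exclude_slurm_from_epel` is hypothetical, the repo section is assumed to be literally `[epel]`, and GNU sed is assumed for `-i` and the one-line `a` command.

```shell
# Hypothetical helper: add "exclude=slurm*" to the [epel] section of a yum
# repo file so that yum/dnf skips EPEL's slurm packages.
# Assumptions: GNU sed, and a section header that is exactly "[epel]".
exclude_slurm_from_epel() {
    repo_file="$1"
    # Do nothing if an exclude line mentioning slurm is already present.
    if grep -q '^exclude=.*slurm' "$repo_file"; then
        return 0
    fi
    # Append the exclude line directly after the [epel] header.
    sed -i '/^\[epel\]$/a exclude=slurm*' "$repo_file"
}
```

It would typically be run as root, e.g. `exclude_slurm_from_epel /etc/yum.repos.d/epel.repo`. If the section already carries an `exclude=` line for other packages, merging the patterns by hand is probably safer than adding a second line.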

Re: [slurm-users] Exclude Slurm packages from the EPEL yum repository

2021-02-03 Thread Philip Kovacs
SB C630, Newark
> On Feb 3, 2021, at 1:06 PM, Philip Kovacs wrote:
>
> I am familiar with the package rename process and it would not have the effect you might think it would.
> If I provide an upgrade path to a new package name, e.g. slurm-xxx, the net effect would b

Re: [slurm-users] Exclude Slurm packages from the EPEL yum repository

2021-02-03 Thread Philip Kovacs
e possible, in the long run, to follow the Fedora packaging guidelines for renaming existing packages?
https://docs.fedoraproject.org/en-US/packaging-guidelines/#renaming-or-replacing-existing-packages
Best regards
Jürgen
On 03.02.21 01:58, Philip Kovacs wrote:
> Lots of mixed reactions

Re: [slurm-users] Exclude Slurm packages from the EPEL yum repository

2021-02-02 Thread Philip Kovacs
Lots of mixed reactions here, many in favor of (and grateful for) the addition to EPEL, many much less enthusiastic. I cannot rename an EPEL package that is now in the wild without providing an upgrade path to the new name. Such an upgrade path would defeat the purpose of the rename and won't help at a

Re: [slurm-users] Exclude Slurm packages from the EPEL yum repository

2021-01-23 Thread Philip Kovacs
I can assure you it was easier for you to filter slurm from your repos than it was for me to make them available to both epel7 and epel8. No good deed goes unpunished I guess.
On Saturday, January 23, 2021, 07:03:08 AM EST, Ole Holm Nielsen wrote:
We use the EPEL yum repository on our C

Re: [slurm-users] [EXT] Re: pmix issue

2020-12-07 Thread Philip Kovacs
Make sure the .so symlink for the pmix lib is available -- not just the versioned .so, e.g. .so.2.   Slurm requires that .so symlink.  Some distros split packages into base/devel, so you may need to install a pmix-devel package, if available, in order to add the .so symlink (which is considered
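A quick way to check for the unversioned symlink is sketched below. The function name `has_pmix_dev_symlink` is hypothetical, and the default directory `/usr/lib64` is an assumption that varies by distro.

```shell
# Hypothetical helper: succeed if an unversioned libpmix.so is present in the
# given library directory (default /usr/lib64, which is an assumption).
has_pmix_dev_symlink() {
    libdir="${1:-/usr/lib64}"
    # -e follows symlinks, so a dangling libpmix.so counts as missing.
    [ -e "$libdir/libpmix.so" ]
}
```

For example: `has_pmix_dev_symlink /usr/lib64 || echo "libpmix.so missing; a pmix-devel package may be needed"`.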

Re: [slurm-users] Slurm 19-05-4-1 and Centos8

2019-12-08 Thread Philip Kovacs
There's a typo in there. It's lazy not -lazy. Try adding exactly this line just before the %configure:
# use -z lazy to allow dlopen with unresolved symbols
export LDFLAGS="%{build_ldflags} -Wl,-z,lazy"    <--- this should fix it
%configure \
On Sunday, December 8, 2019, 0

Re: [slurm-users] Slurm 19-05-4-1 and Centos8

2019-12-05 Thread Philip Kovacs
I answered this question on Oct 28. Simply use lazy binding as required by slurm. See a copy below of my Oct 28 response to your original thread. Just adjust the %build section of the rpm spec to ensure that -Wl,-z,-lazy appears at the end of LDFLAGS. Problem solved.
> You probably built slur

Re: [slurm-users] RHEL8 support - Missing Symbols in SelectType libraries

2019-10-28 Thread Philip Kovacs
>On Monday, October 28, 2019, 03:18:06 PM EDT, Brian Andrus wrote:
>I spoke too soon.
>While I can successfully build/run slurmctld, slurmd is failing because ALL of the SelectType libraries are missing symbols.
>Example from select_cons_tres.so:
># slurmd
>slurmd: error: plugin_load_from

Re: [slurm-users] MPI jobs via mirun vs. srun through PMIx.

2019-09-17 Thread Philip Kovacs
>For our next cluster we will switch from Moab/Torque to Slurm and have to adapt the documentation and example batch scripts for the users. Therefore, I wonder if and why we should recommend (or maybe even urge) our users to use srun instead of mpirun/mpiexec in their batch scripts for MPI j

Re: [slurm-users] MPI jobs via mirun vs. srun through PMIx.

2019-09-16 Thread Philip Kovacs
>according to https://slurm.schedmd.com/mpi_guide.html I have built Slurm 19.05 with PMIx support enabled and it seems to work for both OpenMPI and Intel MPI. (I've also set MpiDefault=pmix in slurm.conf.) But I still don't get the point. Why should I favour `srun ./my_mpi_program` over `mpi

[slurm-users] Announcing availability of new plugin jobcomp/redis

2019-08-26 Thread Philip Kovacs
I have some plans for additional plugins for slurm and I started my efforts with a new job completion plugin for --- redis! Redis is a highly-optimized memory-based key/value store that also features persistence, so data is not lost. Other nice features of redis include replication and sharding

Re: [slurm-users] Trouble installing slurm-19.05.1-2.el7.centos.x86_64

2019-08-15 Thread Philip Kovacs
>I have tried running ldconfig manually as suggested with slurm-19.05.1-2 and it fails the same way...
>error: Failed dependencies:
>        libnvidia-ml.so.1()(64bit) is needed by slurm-19.05.1-2.el7.centos.x86_64
Lou, that's a packaging mistake on the part of the person who created that

Re: [slurm-users] slurm-19.05 link error

2019-07-23 Thread Philip Kovacs
Looks like you need to install hdf5, development headers and libraries.
On Tuesday, July 23, 2019, 08:52:06 PM EDT, Weiguang Chen wrote:
Hi, I'm installing slurm on my Arch Linux server. At the beginning, I used the AUR helper yaourt to install it.
yaourt -S slurm-llnl
But an error

Re: [slurm-users] PMIX with heterogeneous jobs

2019-07-16 Thread Philip Kovacs
Well it looks like it does fail as often as it works.
srun --mpi=pmix -n1 -wporthos : -n1 -wathos ./hello
srun: job 681 queued and waiting for resources
srun: job 681 has been allocated resources
slurmstepd: error: athos [0] pmixp_coll_ring.c:613 [pmixp_coll_ring_check] mpi/pmix: ERROR: 0x153ab

Re: [slurm-users] PMIX with heterogeneous jobs

2019-07-16 Thread Philip Kovacs
Works here on slurm 18.08.8, pmix 3.1.2. The mpi world ranks are unified as they should be.
$ srun --mpi=pmix -n2 -wathos ./hello : -n8 -wporthos ./hello
srun: job 586 queued and waiting for resources
srun: job 586 has been allocated resources
Hello world from processor athos, rank 1 out of 10 pr

Re: [slurm-users] Configure Slurm 17.11.9 in Ubuntu 18.10 with use of PMI

2019-06-20 Thread Philip Kovacs
Also look for the presence of the slurm mpi plugins: mpi_none.so, mpi_openmpi.so, mpi_pmi2.so, mpi_pmix.so, mpi_pmix_v3.so. They are typically installed to /usr/lib64/slurm/. Those plugins are used for the various mpi capabilities and are good "markers" for how your configure detected an
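Listing the installed mpi plugins can be sketched as below. The function name `list_mpi_plugins` is hypothetical, and the default path follows the /usr/lib64/slurm location mentioned above, which may differ on other systems.

```shell
# Hypothetical helper: print the file names of the slurm mpi plugins found in
# a plugin directory (default /usr/lib64/slurm; adjust for your install).
list_mpi_plugins() {
    plugindir="${1:-/usr/lib64/slurm}"
    for f in "$plugindir"/mpi_*.so; do
        # When nothing matches, the glob stays literal, so test for existence.
        if [ -e "$f" ]; then
            basename "$f"
        fi
    done
    return 0
}
```

On a running installation, `srun --mpi=list` reports the mpi types slurm itself knows about, which is a useful cross-check against the files on disk.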

Re: [slurm-users] Slurm tarball numbering vs RPM numbering for first release tarballs.

2019-06-09 Thread Philip Kovacs
As one of the downstream distro packagers, I follow both the tarball and rpm revisions carefully. Please be aware that changes like the one proposed impact us and ought not be made without some announcement so we can know what is going on and adjust our packaging code accordingly. Right now I'

Re: [slurm-users] Slurm 17.11.9, sshare undefined symbol

2018-08-24 Thread Philip Kovacs
Slurm plugins won't load if you build with certain hardening settings, in your case "bind now." Check your linker flags and make sure to remove `-z now` or `-Wl,-z -Wl,now`. If you want to check beforehand that this is the issue, inspect the plugin .so with readelf -d and grep for BIND_NOW. If
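The readelf check described above can be wrapped in a small function. This is only a sketch: the name `has_bind_now` is hypothetical, and the plugin path in the usage line is an assumed example location.

```shell
# Hypothetical helper: succeed if the dynamic section of a shared object
# mentions BIND_NOW, i.e. it was linked with "-z now".
has_bind_now() {
    readelf -d "$1" 2>/dev/null | grep -q 'BIND_NOW'
}
```

For example: `has_bind_now /usr/lib64/slurm/select_cons_res.so && echo "built with bind now; rebuild with lazy binding"`.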

Re: [slurm-users] Slurm version 17.11.1 available

2017-12-22 Thread Philip Kovacs
I notice the slurm download tarballs are affixing a new {-rel} version, e.g. 17.11.1-2.tar.bz2. Ugh. The downstream spec files in the distros don't track releases in lock step with your releases. Would prefer to see just major.minor.micro as before.
On Wednesday, December 20, 2017 7:

Re: [slurm-users] [17.11.1] no good pmi intention goes unpunished

2017-12-21 Thread Philip Kovacs
ort (they are nothing more than symlinks to libpmix), and (b) specify --mpi=pmix on the srun cmd line.
On Dec 21, 2017, at 11:44 AM, Philip Kovacs wrote:
OK, so slurm's libpmi2 is a functional superset of the libpmi2 provided by pmix 2.0+. That's good to know. My point here is that, if y

Re: [slurm-users] [17.11.1] no good pmi intention goes unpunished

2017-12-21 Thread Philip Kovacs
mpiled directly into the plugin.
On Wednesday, December 20, 2017 10:47 PM, "r...@open-mpi.org" wrote:
On Dec 20, 2017, at 6:21 PM, Philip Kovacs wrote:
>  -- slurm.spec: move libpmi to a separate package to solve a conflict with the
>    version provided by PMIx. This wi

[slurm-users] [17.11.1] no good pmi intention goes unpunished

2017-12-20 Thread Philip Kovacs
>  -- slurm.spec: move libpmi to a separate package to solve a conflict with the
>    version provided by PMIx. This will require a separate change to PMIx as
>    well.
I see the intention behind this change since the pmix 2.0+ package provides libpmi/libpmi2 and there is a possible (installation)

Re: [slurm-users] Problem with slurmctl communication with clurmdbd

2017-11-29 Thread Philip Kovacs
Step back from slurm and confirm that MariaDB is up and responsive.
# mysql -uroot -p
Enter password:
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 8
Server version: 10.2.9-MariaDB MariaDB Server
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and

Re: [slurm-users] PMIx and Slurm

2017-11-28 Thread Philip Kovacs
So it just interoperates with it fine if you just have it installed and specify pmix as the launch option? That's neat.
-Paul Edmon-
On 11/28/2017 6:11 PM, Philip Kovacs wrote:
Actually if you're set on installing pmix/pmix-devel from the rpms and then configuring slurm man

Re: [slurm-users] PMIx and Slurm

2017-11-28 Thread Philip Kovacs
e the slurm versions or move the pmix versions of libpmi and libpmi2 back into place in /usr/lib64.
On Tuesday, November 28, 2017 5:32 PM, Philip Kovacs wrote:
The issue is that pmix 2.0+ provides a "backward compatibility" feature, enabled by default, which installs bot

Re: [slurm-users] PMIx and Slurm

2017-11-28 Thread Philip Kovacs
The issue is that pmix 2.0+ provides a "backward compatibility" feature, enabled by default, which installs both libpmi.so and libpmi2.so in addition to libpmix.so. The route with the least friction for you would probably be to uninstall pmix, then install slurm normally, letting it install its l

Re: [slurm-users] Can't start slurmdbd

2017-11-20 Thread Philip Kovacs
Try adding this to your conf:
PluginDir=/usr/lib64/slurm
On Monday, November 20, 2017 6:48 AM, Juan A. Cordero Varelaq wrote:
I did that but got the same errors. slurmdbd.log contains by the way the following:
[2017-11-20T12:39:04.178] error: Couldn't find the specified plugin name

Re: [slurm-users] Fwd: Problem installing slurm on computer cluster

2017-11-16 Thread Philip Kovacs
>Forgive me for saying this. I do have a bit of experience in building HPC systems.
>Distro supplied software packages have improved a lot over the years.
>But they do tend to be out of date compared to the latest versions of (say) Slurm.
It is actually a great deal of work to package Slurm for