Re: [slurm-users] Power9 ACC922

2019-04-16 Thread Sergi More
Hi,
Yes, HW identification and task affinity are working as expected. We have smt=4, and slurm is able to get it without problems. See output from "slurmd -C":
[root@node01 ~]# slurmd -C
NodeName=node01 CPUs=160 Boards=1 SocketsPerBoard=2 CoresPerSocket=20 ThreadsPerCore=4 RealMemory=583992
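As a quick sanity check on the "slurmd -C" line above, the reported CPUs value should match the product of the topology fields (1 board x 2 sockets x 20 cores x 4 threads = 160). A minimal sketch in Python, assuming the output is a single line of space-separated Key=Value pairs as quoted; the helper names parse_node_line and expected_cpus are made up for illustration and are not part of Slurm:

```python
# Hypothetical helpers for checking a "slurmd -C" node line; not a Slurm API.

def parse_node_line(line):
    """Split a 'Key=Value ...' line into a dict, converting numeric values to int."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if not sep:
            continue  # skip anything that is not a Key=Value pair
        fields[key] = int(value) if value.isdigit() else value
    return fields


def expected_cpus(node):
    """Logical CPU count implied by the topology fields."""
    return (node["Boards"] * node["SocketsPerBoard"]
            * node["CoresPerSocket"] * node["ThreadsPerCore"])


# Node line copied from the message above.
NODE_LINE = ("NodeName=node01 CPUs=160 Boards=1 SocketsPerBoard=2 "
             "CoresPerSocket=20 ThreadsPerCore=4 RealMemory=583992")

node = parse_node_line(NODE_LINE)
if node["CPUs"] == expected_cpus(node):
    print("topology is consistent:", node["CPUs"], "logical CPUs")
else:
    print("mismatch:", node["CPUs"], "reported vs", expected_cpus(node), "expected")
```

With smt=4 on a 2 x 20-core AC922, both sides come out to 160, which is what one would expect if Slurm detected the SMT level correctly.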

Re: [slurm-users] Power9 ACC922

2019-04-16 Thread Fulcomer, Samuel
We went straight to ESSL. It also has FFTs and selected LAPACK, some with GPU support ( https://www-01.ibm.com/common/ssi/ShowDoc.wss?docURL=/common/ssi/rep_sm/1/872/ENUS5765-L61/index.html&lang=en&request_locale=en ). I also try to push people to use MKL on Intel, as it has multi-code-path execut

Re: [slurm-users] Power9 ACC922

2019-04-16 Thread Prentice Bisbal
Thanks for the info. Did you try building/using any of the open-source math libraries for Power9, like OpenBLAS, or did you just use ESSL for everything? Prentice On 4/16/19 1:12 PM, Fulcomer, Samuel wrote: We had an AC921 and AC922 for a while as loaners. We had no problems with SLURM. Gett

Re: [slurm-users] Power9 ACC922

2019-04-16 Thread Fulcomer, Samuel
We had an AC921 and AC922 for a while as loaners. We had no problems with SLURM. Getting POWERAI running correctly (bugs since fixed in newer release) and apps properly built and linked to ESSL was the long march. regards, s On Tue, Apr 16, 2019 at 12:59 PM Prentice Bisbal wrote: > Sergi, > >

Re: [slurm-users] Power9 ACC922

2019-04-16 Thread Prentice Bisbal
Sergi, I'm working with Bill on this project. Is all the hardware identification/mapping and task affinity working as expected/desired with the Power9? I assume your answer implies "yes", but I just want to make sure. Prentice On 4/16/19 10:37 AM, Sergi More wrote: Hi, We have a Power9 cl

Re: [slurm-users] Power9 ACC922

2019-04-16 Thread Sergi More
Hi, We have a Power9 cluster (AC922) working without problems. Now with 18.08, but have been running as well with 17.11. No extra steps/problems found during installation because of Power9. Thank you, Sergi. On 16/04/2019 16:05, Bill Wichser wrote: Does anyone on this list run Slurm on the

[slurm-users] Power9 ACC922

2019-04-16 Thread Bill Wichser
Does anyone on this list run Slurm on the Sierra-like machines from IBM? I believe they are the ACC922 nodes. We are looking to purchase a small cluster of these nodes but have concerns about the scheduler. Just looking for a nod that, yes it works fine, as well as any issues seen during dep