The CPUs are Opterons and don't have HT. I have also found that for long runs (2-3 days) with a high number of threads, e.g. 32 threads, the run time without slurm is nearly the same as with it, with only small differences. For shared-memory runs there is no problem and the difference is negligible. So I will try the latest ompi to dig further. I think the problem is not with slurm itself :)

Regards,
Mahmood

On Thu, Apr 26, 2018 at 12:56 PM, John Hearns <hear...@googlemail.com> wrote:
> Mahmood, do you have Hyperthreading enabled?
> That may be the root cause of your problem. If you have hyperthreading, then
> when you start to run more than the number of PHYSICAL cores you
> will get over-subscription. Now, with certain workloads that is fine - that
> is what hyperthreading is all about.
> However HPC workloads have traditionally not benefited from hyperthreading.
>
> I would suggest the following:
>
> a) share the result of cat /proc/cpuinfo with us here so we can figure out
> if HT is enabled
> b) learn how to mimic HT being switched on or off by setting every odd
> numbered CPU core to 'offline'
> This means you can 'play' with HT being on or off without a reboot
> c) reboot one of your servers and look at the BIOS settings
> That is a good idea anyway - please tell us if HT is on or off. What is
> the Power Profile? Are C0 states disabled?
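
For point a) in the quoted message, here is a rough sketch of how the /proc/cpuinfo output could be checked. It assumes the usual x86 Linux layout, where each processor block has a "siblings" field and a "cpu cores" field, and it treats siblings > cpu cores as a hint that HT/SMT is on. The function name ht_enabled is made up for this example, and some AMD parts may report these two fields differently, so take the result as a hint only, not a definitive answer.

```python
def ht_enabled(cpuinfo_text):
    """Guess whether HT/SMT is on from the text of /proc/cpuinfo.

    Returns True or False based on the first processor block, or None
    when the "siblings" / "cpu cores" fields are not present.
    """
    fields = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between processor blocks
        key = key.strip()
        # Keep only the first occurrence of each field of interest.
        if key in ("siblings", "cpu cores") and key not in fields:
            fields[key] = int(value.strip())
    if len(fields) < 2:
        return None
    # More logical CPUs per package than physical cores suggests HT/SMT.
    return fields["siblings"] > fields["cpu cores"]
```

On a node this would be called as ht_enabled(open("/proc/cpuinfo").read()). A None result just means the kernel did not expose both fields, in which case the BIOS check in c) is the more reliable route.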