Hi,

On 03/17/2016 10:00 AM, Rainer Koenig wrote:
I'm experiencing a strange problem running LIGGGHTS on a 48-core workstation running Ubuntu 14.04.4 LTS. If I cold boot the workstation and start one of the examples from LIGGGHTS, then everything looks fine:

  $ mpirun -np 48 liggghts < in.chute_wear

launches the example on all 48 cores; htop in a second window shows that all cores are occupied and run at nearly 100% workload.
Does that machine really have 48 physical cores, or 48 logical CPUs? That is, assuming it's an Intel machine, is Hyper-Threading active or not?
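One way to answer the cores-versus-CPUs question is to count both from /proc/cpuinfo. The following is only a minimal sketch: the function name `count_cpus` is made up for illustration, and it assumes the x86 layout of /proc/cpuinfo, where each logical CPU is a blank-line-separated block with `processor`, `physical id`, and `core id` fields.

```python
import os


def count_cpus(cpuinfo_text):
    """Return (logical_cpus, physical_cores) from /proc/cpuinfo-style text.

    Assumes the x86 field names 'processor', 'physical id' and 'core id'.
    """
    logical = 0
    cores = set()
    # Each logical CPU is described by one blank-line-separated block.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        # (socket, core) identifies one physical core. If the fields are
        # missing, fall back to treating every logical CPU as its own core.
        cores.add((fields.get("physical id", "0"),
                   fields.get("core id", fields["processor"])))
    return logical, len(cores)


if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        logical, physical = count_cpus(f.read())
    print("logical CPUs:", logical, "physical cores:", physical)
```

If the two numbers differ (for example twice as many logical CPUs as physical cores), Hyper-Threading is most likely active. Running this once after a cold boot and once after a warm boot would also show whether the kernel sees the same set of CPUs in both cases.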
So far so good. Now I just reboot the workstation and do the exact same steps as above. This time the job just runs on a few cores (16 to 20) and the cores don't even run at 100% load. So now I'm trying to find out what is wrong. Bad luck is that I can't just ask the vendor of the workstation since I'm working for that vendor and trying to solve this issue. :-) I guess that something that OpenMPI needs is initialized differently when I do a cold boot or a warm boot. But how can I find out what is wrong?
I might be wrong, but your mpirun command does not specify affinity, so it's probably not something in OpenMPI but rather an effect of the way your Linux scheduler works.
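To see whether processes are actually being confined to a subset of CPUs, one could print the affinity mask the kernel has assigned to a process. This is a rough sketch under a few assumptions: it needs Python 3.3 or later on Linux (where `os.sched_getaffinity` is available), and the helper names `allowed_cpus` and `format_ranges` are invented for this example.

```python
import os


def format_ranges(cpus):
    """Compress CPU numbers into a compact string such as '0-3,8,10-11'."""
    cpus = sorted(set(cpus))
    parts = []
    i = 0
    while i < len(cpus):
        j = i
        # Extend the run while the next CPU number is consecutive.
        while j + 1 < len(cpus) and cpus[j + 1] == cpus[j] + 1:
            j += 1
        if i == j:
            parts.append(str(cpus[i]))
        else:
            parts.append("{}-{}".format(cpus[i], cpus[j]))
        i = j + 1
    return ",".join(parts)


def allowed_cpus(pid=0):
    """Return the sorted CPUs that process `pid` may run on (0 = this process)."""
    return sorted(os.sched_getaffinity(pid))


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    cpus = allowed_cpus()
    print("pid", os.getpid(), "may run on", len(cpus), "CPUs:", format_ranges(cpus))
```

Started under mpirun in place of the real binary, each rank would print its own mask. If every rank reports all 48 CPUs after both a cold and a warm boot, affinity is not the limiting factor and the difference lies elsewhere, for instance in how the scheduler places the processes.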
I already tried to look for differences in the Ubuntu boot logs, but there is nothing different.
Did you look into /proc/cpuinfo?

Regards, Thomas
-- 
Thomas Jahns
HD(CP)^2
Abteilung Anwendungssoftware
Deutsches Klimarechenzentrum GmbH
Bundesstraße 45a • D-20146 Hamburg • Germany
Phone: +49 40 460094-151
Fax: +49 40 460094-270
Email: Thomas Jahns <ja...@dkrz.de>
URL: www.dkrz.de
Geschäftsführer: Prof. Dr. Thomas Ludwig
Sitz der Gesellschaft: Hamburg
Amtsgericht Hamburg HRB 39784