Hi,

On 03/17/2016 10:00 AM, Rainer Koenig wrote:
I'm experiencing a strange problem with running LIGGGHTS on a 48-core
workstation running Ubuntu 14.04.4 LTS.

If I cold boot the workstation and start one of the examples from
LIGGGHTS then everything looks fine:

$ mpirun -np 48 liggghts < in.chute_wear

launches the example on all 48 cores; htop in a second window shows that
all cores are occupied and run at nearly 100% workload.

Does that machine really have 48 physical cores, or 48 logical CPUs? That is, assuming it's an Intel machine, is Hyper-Threading active or not?
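
One way to answer that question on Linux is to compare the number of logical CPUs with the number of distinct thread-sibling groups in sysfs. The following is only a sketch: the function name count_cpu_topology is made up for illustration, and it assumes the kernel exposes topology/thread_siblings_list under /sys/devices/system/cpu, which not every kernel or platform does.

```shell
# count_cpu_topology [DIR]
# Hypothetical helper: prints the number of logical CPUs and physical cores
# found under DIR (default: /sys/devices/system/cpu).
count_cpu_topology() {
    sysdir="${1:-/sys/devices/system/cpu}"

    # One readable thread_siblings_list file per logical CPU that exposes topology.
    logical=0
    for f in "$sysdir"/cpu[0-9]*/topology/thread_siblings_list; do
        [ -r "$f" ] && logical=$((logical + 1))
    done

    # Hardware threads of the same core report the same sibling list,
    # so the number of distinct lists is the number of physical cores.
    cores=$(cat "$sysdir"/cpu[0-9]*/topology/thread_siblings_list 2>/dev/null \
        | sort -u | wc -l | tr -d ' ')

    echo "logical=$logical cores=$cores"
}
```

Called without arguments, count_cpu_topology reads the live system. If logical is twice cores, two hardware threads share each core, so a "48 core" machine would really be 24 cores with Hyper-Threading enabled.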

So far so good. Now I just reboot the workstation and do the exact same
steps as above.

This time the job just runs on a few cores (16 to 20) and the cores
don't even run at 100% load.

So now I'm trying to find out what is wrong. Bad luck is that I can't
just ask the vendor of the workstation since I'm working for that vendor
and trying to solve this issue. :-)

I guess that something that OpenMPI needs is initialized differently
depending on whether I do a cold boot or a warm boot. But how can I find
out what is wrong?

I might be wrong, but your mpirun command does not specify any CPU affinity, so it's probably not something in OpenMPI but rather an effect of the way your Linux scheduler places the processes.
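
To see whether the ranks are actually restricted to a subset of CPUs, one can read the Cpus_allowed_list field that Linux reports in /proc/<pid>/status while the job is running. The sketch below is an assumption-laden illustration: the function names allowed_cpus and report_rank_affinity are hypothetical, and it assumes the processes are named liggghts and that pgrep is installed.

```shell
# allowed_cpus STATUS_FILE
# Hypothetical helper: prints the Cpus_allowed_list value from a
# /proc/<pid>/status style file, e.g. "0-47".
allowed_cpus() {
    sed -n 's/^Cpus_allowed_list:[[:space:]]*//p' "$1"
}

# report_rank_affinity NAME
# Prints "<pid> <allowed cpus>" for every running process called NAME.
report_rank_affinity() {
    for pid in $(pgrep -x "$1"); do
        echo "$pid $(allowed_cpus "/proc/$pid/status")"
    done
}
```

Running report_rank_affinity liggghts after a cold boot and again after a warm boot shows whether the allowed CPU set differs. If every rank reports the full range in both cases, no binding is in effect and placement is left to the kernel scheduler. Open MPI's mpirun also has a --report-bindings option, but its availability and the names of the binding options depend on the installed Open MPI version.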

Already tried to look for differences in the Ubuntu boot logs, but there
is nothing different.

Did you look into /proc/cpuinfo?
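
Since /proc/cpuinfo is long, it can help to reduce it to one line per logical CPU and compare the result between a cold and a warm boot. This is a rough sketch, assuming the x86 layout of /proc/cpuinfo in which each block has "processor", "physical id" and "core id" fields in that order; the function name cpuinfo_topology is made up for illustration.

```shell
# cpuinfo_topology [FILE]
# Hypothetical helper: prints one "cpu=N package=P core=C" line per
# logical CPU listed in FILE (default: /proc/cpuinfo).
cpuinfo_topology() {
    awk -F':[ \t]*' '
        $1 ~ /^processor/   { cpu = $2 }   # logical CPU number
        $1 ~ /^physical id/ { pkg = $2 }   # socket / package number
        $1 ~ /^core id/     { print "cpu=" cpu " package=" pkg " core=" $2 }
    ' "${1:-/proc/cpuinfo}"
}
```

For example, saving the output of cpuinfo_topology to cold.txt after a cold boot and to warm.txt after a warm boot (both file names are placeholders), then running diff cold.txt warm.txt, would show whether CPUs are missing or numbered differently after the warm boot.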

Regards, Thomas
--
Thomas Jahns
HD(CP)^2
Abteilung Anwendungssoftware

Deutsches Klimarechenzentrum GmbH
Bundesstraße 45a • D-20146 Hamburg • Germany

Phone:  +49 40 460094-151
Fax:    +49 40 460094-270
Email:  Thomas Jahns <ja...@dkrz.de>
URL:    www.dkrz.de

Geschäftsführer: Prof. Dr. Thomas Ludwig
Sitz der Gesellschaft: Hamburg
Amtsgericht Hamburg HRB 39784
