Hello,

I tried to run a parallel computation (OpenFOAM) using Open MPI on an HPC 
cluster connected with InfiniBand.
When I ran a job using mpirun over a couple of nodes (20 CPUs per node), the 
computation was not accelerated as much as I expected.

For example, I ran the job on 40 CPUs across 2 nodes and checked the CPU usage 
and processes with the top command.
I expected 20 processes to be running on each node, but I found that only 19 
processes were running and one CPU was idle while the others were in use.
The following is a capture of the top output.

As you can see, Cpu1 is idle and there are only 19 simpleFoam processes!

[Image: capture of top output (image002.png)]

I have no idea why this happens.

Sometimes one CPU is idle even though 20 processes are running; in that case, 
two of the processes each run at 50% CPU usage.
That is, those two processes are assigned to the same CPU.
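
One way to confirm that two ranks share a CPU, without reading it off top by eye, is to look at the "processor" field (field 39) of /proc/<pid>/stat, which records the CPU a process last ran on. The following is only a minimal sketch: it assumes Python 3.6 or later is available on the compute node, and the helper names parse_stat and group_by_cpu are made up for this illustration. It shows a point-in-time sample, not the binding policy itself.

```python
from collections import defaultdict
from pathlib import Path


def parse_stat(stat_text):
    """Return (command name, last CPU) from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split around the first '(' and the last ')'.
    start = stat_text.index("(")
    end = stat_text.rindex(")")
    name = stat_text[start + 1:end]
    rest = stat_text[end + 1:].split()
    # rest[0] is field 3 (state), so field 39 (processor) is rest[36].
    return name, int(rest[36])


def group_by_cpu(name="simpleFoam", proc_root="/proc"):
    """Map CPU number -> list of PIDs of processes whose command is `name`."""
    by_cpu = defaultdict(list)
    for stat_file in Path(proc_root).glob("[0-9]*/stat"):
        try:
            comm, cpu = parse_stat(stat_file.read_text())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the line was not parseable
        if comm == name:
            by_cpu[cpu].append(int(stat_file.parent.name))
    return dict(by_cpu)


if __name__ == "__main__":
    for cpu, pids in sorted(group_by_cpu().items()):
        flag = "  <-- shared" if len(pids) > 1 else ""
        print(f"CPU {cpu:2d}: {pids}{flag}")
```

Run on each node while the job is active. A CPU listed with two PIDs, together with a CPU number missing from the output, would match the 50% usage seen in top.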

Please refer to the attached file for the required information about the 
cluster and its environment.
The output of the “ulimit -l” command on both nodes is “unlimited”.
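
The limit reported by an interactive shell may differ from the one seen by processes started through a daemon or a batch system, so it can be worth reading the locked-memory limit from inside a launched process as well. The following is a minimal sketch, assuming Python 3 on Linux; the helper name format_memlock is made up for this illustration. It prints the value in kB, as bash's ulimit -l does.

```python
import resource


def format_memlock(limit):
    """Format an RLIMIT_MEMLOCK value (bytes) the way `ulimit -l` prints it."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return str(limit // 1024)  # ulimit -l reports units of 1024 bytes


soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
print("soft:", format_memlock(soft), "hard:", format_memlock(hard))
```

Launching this script through mpirun on both nodes would show the limit that the MPI processes actually inherit.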

Additional information about the OpenFabrics-based network is as follows:

1.     OpenFabrics version : MLNX_OFED_LINUX-2.4.1.0.0

2.     Linux/kernel info: RHEL6.5 2.6.32-431.el6.x86_64
- Linux distro/version : Red Hat Enterprise Linux Server release 6.5 (Santiago)
- Kernel version : 2.6.32-431.el6.x86_64

3.     Subnet manager : InfiniBand B class

Thank you.

Best regards,
Geon-Hong Kim.

-----------------------------------
Geon-Hong Kim
Engineer, Ph.D.

Performance Evaluation Research Department
Hyundai Maritime Research Institute
Hyundai Heavy Industries Co., Ltd.

Office +82-52-203-8053
Fax +82-52-250-9675
Mobile +82-10-3084-1357
-----------------------------------

Attachment: system_info.tar.bz2
Description: system_info.tar.bz2
