I use Open MPI 1.8.5.
The command is as follows:
$ mpirun -np 40 -hostfile machines simpleFoam -parallel

and the host file “machines” contains:
hpcnode127 cpu=20
hpcnode128 cpu=20
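
As a quick cross-check that the host file provides as many slots as the -np value asks for, something like the following could be used. It is only a sketch: the function name total_slots is made up for illustration, it assumes every line has the form "hostname cpu=N" (or "slots=N") as in the file above, and it counts a host without a count as one slot, which is an assumption of this sketch rather than a statement about Open MPI's own parser.

```python
def total_slots(hostfile_text):
    """Sum the per-host slot counts found in a simple host file."""
    total = 0
    for line in hostfile_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        slots = 1  # assumption: a host listed without a count is one slot
        for field in fields[1:]:
            key, _, value = field.partition("=")
            if key in ("cpu", "slots") and value.isdigit():
                slots = int(value)
        total += slots
    return total


if __name__ == "__main__":
    machines = "hpcnode127 cpu=20\nhpcnode128 cpu=20\n"
    print(total_slots(machines))
```

For the two-line file above this gives 2 x 20 = 40, which matches -np 40, so the host file itself does not appear to be short of slots.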

Another interesting symptom is that if I run two mpirun instances with the 
-np 2 option on the same node, both instances run on the same CPUs.
As shown in the following figure, only two CPUs are busy while four 
simpleFoam processes are running.

[image: image001.png]
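
One way to confirm that processes really share CPUs, rather than reading it off top, is to compare their CPU affinity masks. The following is a minimal sketch for Linux only; the helper names affinities_by_name and shared_cpus are made up for illustration, and it assumes the solver shows up as "simpleFoam" in /proc/<pid>/comm.

```python
import os
from collections import defaultdict


def affinities_by_name(name):
    """Return {pid: set of allowed CPUs} for processes whose command is `name`."""
    result = {}
    try:
        entries = os.listdir("/proc")
    except OSError:
        return result  # no /proc available, nothing to report
    for entry in entries:
        if not entry.isdigit():
            continue
        pid = int(entry)
        try:
            with open(f"/proc/{entry}/comm") as f:
                if f.read().strip() != name:
                    continue
            result[pid] = set(os.sched_getaffinity(pid))
        except OSError:
            continue  # process exited or is not accessible
    return result


def shared_cpus(affinities):
    """Return {cpu: [pids]} for every CPU that more than one PID is allowed on."""
    by_cpu = defaultdict(list)
    for pid, cpus in sorted(affinities.items()):
        for cpu in cpus:
            by_cpu[cpu].append(pid)
    return {cpu: pids for cpu, pids in sorted(by_cpu.items()) if len(pids) > 1}


if __name__ == "__main__":
    print(shared_cpus(affinities_by_name("simpleFoam")))
```

If the processes are not bound at all, every process is allowed on every CPU and all CPUs will be listed, so the result is only informative when the masks are narrower than the whole node.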

Thank you.

Best regards,
Geon-Hong Kim.

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Tuesday, November 24, 2015 4:11 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI with infiniband, child processes of mpirun 
are missing or overlapped on the same cpu

Could you please tell us what version of OpenMPI you are using, and the cmd 
line you used to execute the job?

Thanks
Ralph

On Nov 23, 2015, at 11:05 PM, 김건홍(KIM GEON HONG) 
<geonhong....@hhi.co.kr> wrote:

Hello,

I tried to run a parallel computation (OpenFOAM) using Open MPI on an HPC 
cluster connected with InfiniBand.
When I ran a job using mpirun over a couple of nodes (20 CPUs per node), the 
computation did not speed up as much as I expected.

For example, I ran the job over 40 CPUs on 2 nodes and checked the CPU usage 
and processes with the top command.
I expected 20 processes to be running on each node, but I found that only 19 
processes were running and one CPU was idle while the others were in use.
The following is a capture of the top output.

As you can see, Cpu1 is idle and there are only 19 simpleFoam processes!

<image002.png>

I have no idea why this happens.

Sometimes a CPU is idle while 20 processes are running, but in that case 
two of the processes each run at 50% CPU usage.
That is, those two different processes are assigned to the same CPU.

Please refer to the attached file for the required information about the 
cluster and its environment.
The output of the “ulimit -l” command on both nodes is “unlimited”.

Additional information for the OpenFabrics-based network is as follows:
1. OpenFabrics version: MLNX_OFED_LINUX-2.4.1.0.0
2. Linux/kernel info: RHEL6.5 2.6.32-431.el6.x86_64
- Linux distro/version: Red Hat Enterprise Linux Server release 6.5 (Santiago)
- Kernel version: 2.6.32-431.el6.x86_64
3. Subnet manager: InfiniBand B class

Thank you.

Best regards,
Geon-Hong Kim.

-----------------------------------
Geon-Hong Kim
Engineer, Ph.D.

Performance Evaluation Research Department
Hyundai Maritime Research Institute
Hyundai Heavy Industries Co., Ltd.

Office +82-52-203-8053
Fax +82-52-250-9675
Mobile +82-10-3084-1357
-----------------------------------

<system_info.tar.bz2>
_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2015/11/28100.php
