Dear all,
I am trying to run an experiment by oversubscribing the nodes. I have
several clusters available (I can use up to 8-10 different clusters in
one execution), with around 1300 cores in total. I am executing the EP
benchmark from the NAS suite, which means there is very little MPI
traffic, just a few collective MPI calls.
The number of MPI processes per node depends on the available memory of
each node, so in the machinefile I declare a node 13 times if I want 13
MPI processes on it. Is that correct? If I give a machinefile with 32768
entries when I want to execute 32768 processes, does Open MPI behave as
if there is no oversubscription? If so, how can I write a machinefile
with a different number of MPI processes on each node? The maximum
number of MPI processes I place on a single node is 388.
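To illustrate, here is a sketch of the kind of machinefile I would like
to write (node01 and node02 are placeholder hostnames; I believe the
slots keyword is intended for this, but I am not sure how Open MPI
treats it when oversubscribing):

  # one line per node instead of repeating each hostname
  node01 slots=13
  node02 slots=388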
My problem is that I can execute 16384 processes but not 32768. In the
first case the execution takes around 3 minutes, but in the second case
the benchmark has not even started after 7 hours. There is no error; I
cancel the job myself, but I assume something is wrong because 7 hours
is far too long. I should say that I executed the 16384-process instance
without any problem. I added some debug output to the benchmark and I
can see that the execution stalls in MPI_Init; it never gets past that
point. For the 16384-process instance, MPI_Init takes around 2 minutes
to complete. I checked the memory on all the nodes and there is at least
0.5 GB of free memory on each one.
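For reference, the timing check I mean boils down to something like this
C sketch (the benchmark itself is Fortran and my actual debug prints
differ, but the idea is the same):

  #include <mpi.h>
  #include <stdio.h>
  #include <time.h>

  int main(int argc, char **argv)
  {
      /* Timestamp before and after MPI_Init; stderr is unbuffered,
         so the first line shows up even if MPI_Init never returns. */
      time_t t0 = time(NULL);
      fprintf(stderr, "before MPI_Init: %s", ctime(&t0));

      MPI_Init(&argc, &argv);

      int rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      if (rank == 0) {
          time_t t1 = time(NULL);
          fprintf(stderr, "after MPI_Init:  %s", ctime(&t1));
      }

      MPI_Finalize();
      return 0;
  }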
I know about the mpi_yield_when_idle parameter, but I have read that it
will not improve performance when there are few MPI messages. I tried it
anyway and nothing changed. I also tried mpi_preconnect_mpi just in
case, but again nothing. Could you please suggest a reason why this is
happening?
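For completeness, I set both parameters on the mpirun command line, like
this:

  mpirun --mca mpi_yield_when_idle 1 --mca mpi_preconnect_mpi 1 ...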
Moreover, I used a single node with 48 GB of memory to execute 2048 MPI
processes without any problem; of course, I just had to wait a long time.
I am using Open MPI v1.4.1 and all the clusters are 64-bit.
I execute the benchmark with the following command:
mpirun --mca pml ob1 --mca btl tcp,self \
    --mca btl_tcp_if_exclude ib0,lo,myri0 \
    -machinefile machines -np 32768 ep.D.32768
Best regards,
George Markomanolis