Dear Ralph,

I am copying your email from the website because I have enabled the
option to receive all list emails in a single daily digest.
On 11/04/2012 05:27 PM, George Markomanolis wrote:
> Dear all,
>
> I am trying to run an experiment that oversubscribes the nodes. I
> have several clusters available (I can use up to 8-10 different
> clusters in one execution), totaling around 1300 cores. I am running
> the EP benchmark from the NAS suite, which means there are not many
> MPI messages, just a few collective MPI calls.
>
> The number of MPI processes per node depends on the available memory
> of each node. Thus, in the machinefile I have declared a node 13
> times if I want 13 MPI processes on it. Is that correct?
You *can* do it that way, or you could just use "slots=13" for that
node in the file, and list it only once.
OK, but I assume the result is the same, right?
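For illustration (the host name node01 here is hypothetical), listing a
node 13 times:

    node01
    node01
    (... 11 more identical lines ...)

should be equivalent to the single line:

    node01 slots=13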
> Giving a machinefile with 32768 nodes when I want to execute 32768
> processes, does Open MPI behave as if there is no oversubscription?
Yes, it should - I assume you mean "slots" and not "nodes" in the
above statement, since you indicate that you listed each node multiple
times to set the number of slots on that node.
Yes, I mean slots.
> If yes, how can I give a machinefile with a different number of MPI
> processes on each node? The maximum number of MPI processes I have on
> a single node is 388.
Just assign the number of slots on each node to be the number of
processes you want on that node.
OK
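So, as a sketch (host names hypothetical), a machinefile with a
different number of MPI processes per node might look like:

    node01 slots=13
    node02 slots=48
    node03 slots=388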
>
> My problem is that I can execute 16384 processes but not 32768. In
> the first case the execution takes around 3 minutes, but in the
> second case the benchmark does not even start after 7 hours. There is
> no error; I cancel the job myself, but I assume something is wrong
> because 7 hours is far too long. I should note that I ran the
> 16384-process instance without any problem. I added some debug output
> to the benchmark, and I can see that the execution stalls in
> MPI_Init; it never gets past that point. For the 16384-process
> instance, MPI_Init takes around 2 minutes to finish. I am checking
> the memory on all the nodes, and there is at least 0.5 GB of free
> memory on each node.
>
> I know about the mpi_yield_when_idle parameter, but I have read that
> it will not improve performance if there are not many MPI messages. I
> tried it anyway, and nothing changed. I also tried mpi_preconnect_mpi
> just in case, but again nothing. Could you please suggest a reason
> why this is happening?
You indicated that these jobs are actually spanning multiple clusters
- true? If so, when you cross that 16384 boundary, do you also cross
clusters? Is it possible one or more of the additional clusters is
blocking communications?
I have tried both configurations. I even used exactly the same nodes
with fewer MPI processes per node, to check whether one site was
blocking the others, and I tried half of the machinefile with the
16384-process instance, to see whether there was any issue with using
so many MPI processes per node. Both ran fine with 16384 MPI
processes. I also tried combining different quarters of the
machinefile, to check whether any specific combination of sites was a
problem, and again I had no issue.
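For reference, the MCA parameters mentioned above were passed on the
mpirun command line; a sketch of the invocation I used (same
machinefile and binary as below):

    mpirun --mca mpi_yield_when_idle 1 --mca mpi_preconnect_mpi 1 \
        -machinefile machines -np 32768 ep.D.32768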
>
> Moreover, I used a single node with 48 GB of memory to execute 2048
> MPI processes without any problem; of course, I just had to wait a
> long time.
>
> I am using Open MPI v1.4.1, and all the clusters are 64-bit.
>
> I execute the benchmark with the following command:
> mpirun --mca pml ob1 --mca btl tcp,self --mca btl_tcp_if_exclude \
>     ib0,lo,myri0 -machinefile machines -np 32768 ep.D.32768
You could just leave off the "-np N" part of the command line - we'll
assign one process to every slot specified in the machinefile.
OK, nice
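If I understand correctly, the command above could then be shortened
to the following, with one process launched per slot in the
machinefile:

    mpirun --mca pml ob1 --mca btl tcp,self --mca btl_tcp_if_exclude \
        ib0,lo,myri0 -machinefile machines ep.D.32768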
Best regards,
George Markomanolis
>
> Best regards,
> George Markomanolis
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users