Sorry about the comment re cpus-per-proc - confused this momentarily with another user also using Torque. I confirmed that this works fine with 1.6.5, and would guess you are hitting some bug in 1.6.0. Can you update?
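For reference, a quick sketch of how to confirm which Open MPI version a job actually picks up (assuming ompi_info is installed alongside orterun in the /usr/local/openmpi/bin path used in the launch line quoted below):

    # Either command reports the version of the installation being used:
    /usr/local/openmpi/bin/mpirun --version
    /usr/local/openmpi/bin/ompi_info | grep "Open MPI:"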
On Jun 6, 2014, at 12:20 PM, Ralph Castain <r...@open-mpi.org> wrote:

> You might want to update to 1.6.5, if you can - I'll see what I can find
> 
> On Jun 6, 2014, at 12:07 PM, Sasso, John (GE Power & Water, Non-GE) <john1.sa...@ge.com> wrote:
> 
>> Version 1.6 (i.e. prior to 1.6.1)
>> 
>> -----Original Message-----
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
>> Sent: Friday, June 06, 2014 3:03 PM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] Determining what parameters a scheduler passes to OpenMPI
>> 
>> It's possible that you are hitting a bug - not sure how much the cpus-per-proc option has been exercised in 1.6. Is this 1.6.5, or some other member of that series?
>> 
>> I don't have a Torque machine handy any more, but I should be able to test this scenario on my boxes.
>> 
>> 
>> On Jun 6, 2014, at 10:51 AM, Sasso, John (GE Power & Water, Non-GE) <john1.sa...@ge.com> wrote:
>> 
>>> Re: $PBS_NODEFILE, we use that to create the hostfile that is passed via --hostfile (i.e. the two are the same).
>>> 
>>> To further debug this, I passed "--display-allocation --display-map" to orterun, which resulted in:
>>> 
>>> ======================   ALLOCATED NODES   ======================
>>> 
>>>  Data for node: node0001   Num slots: 16   Max slots: 0
>>>  Data for node: node0002   Num slots: 8    Max slots: 0
>>> 
>>> =================================================================
>>> 
>>> ========================   JOB MAP   ========================
>>> 
>>>  Data for node: node0001   Num procs: 24
>>>     Process OMPI jobid: [24552,1] Process rank: 0
>>>     Process OMPI jobid: [24552,1] Process rank: 1
>>>     Process OMPI jobid: [24552,1] Process rank: 2
>>>     Process OMPI jobid: [24552,1] Process rank: 3
>>>     Process OMPI jobid: [24552,1] Process rank: 4
>>>     Process OMPI jobid: [24552,1] Process rank: 5
>>>     Process OMPI jobid: [24552,1] Process rank: 6
>>>     Process OMPI jobid: [24552,1] Process rank: 7
>>>     Process OMPI jobid: [24552,1] Process rank: 8
>>>     Process OMPI jobid: [24552,1] Process rank: 9
>>>     Process OMPI jobid: [24552,1] Process rank: 10
>>>     Process OMPI jobid: [24552,1] Process rank: 11
>>>     Process OMPI jobid: [24552,1] Process rank: 12
>>>     Process OMPI jobid: [24552,1] Process rank: 13
>>>     Process OMPI jobid: [24552,1] Process rank: 14
>>>     Process OMPI jobid: [24552,1] Process rank: 15
>>>     Process OMPI jobid: [24552,1] Process rank: 16
>>>     Process OMPI jobid: [24552,1] Process rank: 17
>>>     Process OMPI jobid: [24552,1] Process rank: 18
>>>     Process OMPI jobid: [24552,1] Process rank: 19
>>>     Process OMPI jobid: [24552,1] Process rank: 20
>>>     Process OMPI jobid: [24552,1] Process rank: 21
>>>     Process OMPI jobid: [24552,1] Process rank: 22
>>>     Process OMPI jobid: [24552,1] Process rank: 23
>>> 
>>> I have been going through the man page of mpirun, as well as the OpenMPI mailing list and website, and thus far have been unable to determine the reason for the oversubscription of the head node (node0001), even though the PBS scheduler is passing along the correct slot counts (16 and 8, respectively).
>>> 
>>> Am I running into a bug with OpenMPI 1.6?
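For reference, a minimal sketch (assuming a standard Torque/PBS batch environment) of how the "ALLOCATED NODES" block above can be cross-checked against what Torque actually granted, by counting the $PBS_NODEFILE entries per host from inside the job script:

    # Each host appears in $PBS_NODEFILE once per allocated slot; for the job
    # above this should report 16 for node0001 and 8 for node0002:
    sort $PBS_NODEFILE | uniq -c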
>>> 
>>> --john
>>> 
>>> 
>>> -----Original Message-----
>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
>>> Sent: Friday, June 06, 2014 1:30 PM
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] Determining what parameters a scheduler passes to OpenMPI
>>> 
>>> 
>>> On Jun 6, 2014, at 10:24 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>>> 
>>>> On 06/06/2014 01:05 PM, Ralph Castain wrote:
>>>>> You can always add --display-allocation to the cmd line to see what we thought we received.
>>>>> 
>>>>> If you configure OMPI with --enable-debug, you can set --mca ras_base_verbose 10 to see the details.
>>>> 
>>>> Hi John,
>>>> 
>>>> On the Torque side, you can put a "cat $PBS_NODEFILE" line in the job script. This will list the nodes (each repeated according to the number of cores requested).
>>>> I find this to be useful documentation, along with the job number, work directory, etc.
>>>> "man qsub" will show you all the PBS_* environment variables available to the job.
>>>> For instance, you can echo them from a Torque 'prolog' script if the user didn't do it. That will appear in the Torque STDOUT file.
>>>> 
>>>> From outside the job script, "qstat -n" (and variants, say, with -u username) will list the nodes allocated to each job, again repeated as per the requested cores.
>>>> 
>>>> "tracejob job_number" will show similar information.
>>>> 
>>>> If you configured Torque --with-cpuset, there is more information about the cpuset allocated to the job in /dev/cpuset/torque/jobnumber (on the first node listed above, called the "mother superior" in Torque parlance).
>>>> This mostly matters if there is more than one job running on a node.
>>>> However, Torque doesn't bind processes/MPI ranks to cores or sockets or anything else. As Ralph said, Open MPI does that.
>>>> I believe Open MPI doesn't use the cpuset info from Torque.
>>>> (Ralph, please correct me if I am wrong.)
>>> 
>>> You are correct in that we don't use any per-process designations. We do, however, work inside any overall envelope that Torque may impose on us - e.g., if you tell Torque to limit the job to cores 0-4, we will honor that directive and keep all processes within that envelope.
>>> 
>>>> 
>>>> My two cents,
>>>> Gus Correa
>>>> 
>>>>> On Jun 6, 2014, at 10:01 AM, Reuti <re...@staff.uni-marburg.de> wrote:
>>>>> 
>>>>>> On 06.06.2014, at 18:58, Sasso, John (GE Power & Water, Non-GE) wrote:
>>>>>> 
>>>>>>> OK, so at the least, how can I get the node and slots/node info that is passed from PBS?
>>>>>>> 
>>>>>>> I ask because I'm trying to troubleshoot a problem with PBS and the build of OpenMPI 1.6 I noted. If I submit a simple 24-process job through PBS using a script which has:
>>>>>>> 
>>>>>>> /usr/local/openmpi/bin/orterun -n 24 --hostfile /home/sasso/TEST/hosts.file --mca orte_rsh_agent rsh --mca btl openib,tcp,self --mca orte_base_help_aggregate 0 -x PATH -x LD_LIBRARY_PATH /home/sasso/TEST/simplempihello.exe
>>>>>> 
>>>>>> Using --hostfile on your own would mean violating the slot allocation granted by PBS. Just leave this option out. How do you submit your job?
>>>>>> 
>>>>>> -- Reuti
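For reference, a sketch of the launch line above with --hostfile dropped, per Reuti's suggestion, so that the slot counts Torque reports (16 and 8 here) drive the mapping; all other flags and paths are taken from the quoted command:

    # Same launch line, minus --hostfile; with a Torque-aware build, Open MPI
    # takes the node/slot allocation directly from the PBS environment:
    /usr/local/openmpi/bin/orterun -n 24 \
        --mca orte_rsh_agent rsh --mca btl openib,tcp,self \
        --mca orte_base_help_aggregate 0 \
        -x PATH -x LD_LIBRARY_PATH /home/sasso/TEST/simplempihello.exe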
>>>>>> 
>>>>>>> And the hostfile /home/sasso/TEST/hosts.file contains 24 entries (the first 16 being host node0001 and the last 8 being node0002). It appears that all 24 MPI tasks try to start on node0001 instead of being distributed as 16 on node0001 and 8 on node0002. Hence, I am curious what is being passed by PBS.
>>>>>>> 
>>>>>>> --john
>>>>>>> 
>>>>>>> 
>>>>>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
>>>>>>> Sent: Friday, June 06, 2014 12:31 PM
>>>>>>> To: Open MPI Users
>>>>>>> Subject: Re: [OMPI users] Determining what parameters a scheduler passes to OpenMPI
>>>>>>> 
>>>>>>> We currently only get the node and slots/node info from PBS - we don't get any task placement info at all. We then use the mpirun cmd options and built-in mappers to map the tasks to the nodes.
>>>>>>> 
>>>>>>> I suppose we could do more integration in that regard, but we haven't really seen a reason to do so - the OMPI mappers are generally more flexible than anything in the schedulers.
>>>>>>> 
>>>>>>> 
>>>>>>> On Jun 6, 2014, at 9:08 AM, Sasso, John (GE Power & Water, Non-GE) <john1.sa...@ge.com> wrote:
>>>>>>> 
>>>>>>> For the PBS scheduler, and using a build of OpenMPI 1.6 built against the PBS include files and libraries, is there a way to determine (perhaps via some debugging flags passed to mpirun) what job placement parameters are passed from the PBS scheduler to OpenMPI? In particular, I am talking about task placement info such as which nodes to place on, etc.
>>>>>>> Thanks!
>>>>>>> 
>>>>>>> --john
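For reference, the debugging options suggested earlier in the thread can be combined into one sketch of a launch line that shows what Open MPI received from PBS (the ras_base_verbose output only appears if Open MPI was configured with --enable-debug):

    # --display-allocation / --display-map print the node, slot, and mapping
    # info received from PBS; ras_base_verbose adds allocation-module detail:
    /usr/local/openmpi/bin/orterun --display-allocation --display-map \
        --mca ras_base_verbose 10 -n 24 /home/sasso/TEST/simplempihello.exe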