On Jun 6, 2014, at 10:24 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:
> On 06/06/2014 01:05 PM, Ralph Castain wrote:
>> You can always add --display-allocation to the cmd line to see what we
>> thought we received.
>>
>> If you configure OMPI with --enable-debug, you can set --mca
>> ras_base_verbose 10 to see the details.
>
> Hi John
>
> On the Torque side, you can put a line "cat $PBS_NODEFILE" in the job
> script. This will list the nodes, each repeated according to the number
> of cores requested on it. I find this useful as documentation, along
> with the job number, work directory, etc. "man qsub" will show you all
> the PBS_* environment variables available to the job. For instance, you
> can echo them from a Torque 'prolog' script if the user didn't do it;
> that output will appear in the Torque STDOUT file.
>
> From outside the job script, "qstat -n" (and variants, say, with
> -u username) will list the nodes allocated to each job, again repeated
> as per the requested cores. "tracejob job_number" will show similar
> information.
>
> If you configured Torque --with-cpuset, there is more information about
> the cpuset allocated to the job in /dev/cpuset/torque/<job_number>, on
> the first node of the allocation (called the "mother superior" in
> Torque parlance). This mostly matters if there is more than one job
> running on a node. However, Torque doesn't bind processes/MPI ranks to
> cores, sockets, or anything else. As Ralph said, Open MPI does that.
> I believe Open MPI doesn't use the cpuset info from Torque.
> (Ralph, please correct me if I am wrong.)

You are correct in that we don't use any per-process designations. We do,
however, work inside any overall envelope that Torque may impose on us -
e.g., if you tell Torque to limit the job to cores 0-4, we will honor that
directive and keep all processes within that envelope.
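Putting Gus's suggestions and --display-allocation together, a minimal
sketch of a job script that records what the scheduler handed out before
launching (the resource request, paths, and executable name here are
placeholders; the /dev/cpuset path exists only if Torque was built
--with-cpuset):

    #!/bin/bash
    #PBS -l nodes=2:ppn=8
    #PBS -N allocation-check

    cd $PBS_O_WORKDIR

    # Record what Torque handed out: one line per allocated core
    echo "Job $PBS_JOBID, submitted from $PBS_O_WORKDIR"
    echo "--- PBS_NODEFILE ---"
    cat $PBS_NODEFILE

    # Show the cpuset envelope on the mother superior, if present
    # (only there when Torque was built --with-cpuset)
    if [ -d /dev/cpuset/torque/$PBS_JOBID ]; then
        echo "--- cpuset ---"
        cat /dev/cpuset/torque/$PBS_JOBID/cpus
    fi

    # Let Open MPI report the allocation it read from PBS; with no -n
    # and no --hostfile it uses the full allocation as granted
    mpirun --display-allocation ./simplempihello.exe

Comparing the PBS_NODEFILE contents against the --display-allocation
output in the job's STDOUT file shows directly whether OMPI saw what
Torque granted.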
> My two cents,
> Gus Correa

>> On Jun 6, 2014, at 10:01 AM, Reuti <re...@staff.uni-marburg.de> wrote:
>>
>>> Am 06.06.2014 um 18:58 schrieb Sasso, John (GE Power & Water, Non-GE):
>>>
>>>> OK, so at the least, how can I get the node and slots/node info that
>>>> is passed from PBS?
>>>>
>>>> I ask because I'm trying to troubleshoot a problem w/ PBS and the
>>>> build of OpenMPI 1.6 I noted. If I submit a simple 24-process job
>>>> through PBS using a script which has:
>>>>
>>>> /usr/local/openmpi/bin/orterun -n 24 --hostfile
>>>> /home/sasso/TEST/hosts.file --mca orte_rsh_agent rsh --mca btl
>>>> openib,tcp,self --mca orte_base_help_aggregate 0 -x PATH -x
>>>> LD_LIBRARY_PATH /home/sasso/TEST/simplempihello.exe
>>>
>>> Using --hostfile on your own violates the slot allocation granted by
>>> PBS. Just leave this option out. How do you submit your job?
>>>
>>> -- Reuti
>>>
>>>> and the hostfile /home/sasso/TEST/hosts.file contains 24 entries (the
>>>> first 16 being node0001 and the last 8 being node0002), then it
>>>> appears that all 24 MPI tasks try to start on node0001 instead of
>>>> being distributed as 16 on node0001 and 8 on node0002. Hence, I am
>>>> curious what is being passed by PBS.
>>>>
>>>> --john
>>>>
>>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph
>>>> Castain
>>>> Sent: Friday, June 06, 2014 12:31 PM
>>>> To: Open MPI Users
>>>> Subject: Re: [OMPI users] Determining what parameters a scheduler
>>>> passes to OpenMPI
>>>>
>>>> We currently only get the node and slots/node info from PBS - we
>>>> don't get any task placement info at all. We then use the mpirun cmd
>>>> options and built-in mappers to map the tasks to the nodes.
>>>>
>>>> I suppose we could do more integration in that regard, but haven't
>>>> really seen a reason to do so - the OMPI mappers are generally more
>>>> flexible than anything in the schedulers.
>>>>
>>>> On Jun 6, 2014, at 9:08 AM, Sasso, John (GE Power & Water, Non-GE)
>>>> <john1.sa...@ge.com> wrote:
>>>>
>>>> For the PBS scheduler, and a build of OpenMPI 1.6 built against the
>>>> PBS include files + libs, is there a way to determine (perhaps via
>>>> some debugging flags passed to mpirun) what job placement parameters
>>>> are passed from the PBS scheduler to OpenMPI? In particular, I am
>>>> talking about task placement info such as which nodes to place the
>>>> tasks on, etc. Thanks!
>>>>
>>>> --john
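P.S. To make Reuti's suggestion concrete for John's example above: a
sketch of the same run driven entirely by the PBS allocation. The -l
request syntax is just one way to ask Torque for the 16+8 split, and the
job script name is a placeholder; I've also dropped the rsh agent
setting, since a PBS-aware build should launch through TM anyway.

    # Request the same 24 slots from PBS instead of using a hostfile
    qsub -l nodes=node0001:ppn=16+node0002:ppn=8 job.sh

    # Inside job.sh: no -n, no --hostfile; Open MPI reads the
    # allocation directly from PBS and fills all granted slots
    /usr/local/openmpi/bin/orterun --display-allocation \
        --mca btl openib,tcp,self -x PATH -x LD_LIBRARY_PATH \
        /home/sasso/TEST/simplempihello.exe

    # Cross-check from the login node while the job runs
    qstat -n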