On Jun 6, 2014, at 10:24 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:

> On 06/06/2014 01:05 PM, Ralph Castain wrote:
>> You can always add --display-allocation to the cmd line to see what we
>> thought we received.
>> 
>> If you configure OMPI with --enable-debug, you can set --mca
>> ras_base_verbose 10 to see the details
>> 
>> 
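For reference, those two suggestions would look something like this on the
command line (the executable and process count are just placeholders):

    mpirun --display-allocation -n 24 ./a.out
    mpirun --mca ras_base_verbose 10 -n 24 ./a.out   # requires an --enable-debug build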
> 
> Hi John
> 
> On the Torque side, you can put a line "cat $PBS_NODEFILE" in the job script.
> This will list the nodes (each repeated according to the number of cores
> requested).
> I find this useful documentation to keep,
> along with the job number, work directory, etc.
> "man qsub" will show you all the PBS_* environment variables
> available to the job.
> For instance, you can echo them from a Torque
> 'prolog' script, if the user
> didn't do so. That will appear in the Torque STDOUT file.
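
As a rough sketch, a job script preamble like the one below (the resource
request is just an example) puts that information into the job's STDOUT file:

    #PBS -l nodes=2:ppn=16
    #PBS -N mpi_test
    cd $PBS_O_WORKDIR
    echo "Job $PBS_JOBID started in $PBS_O_WORKDIR with these slots:"
    cat $PBS_NODEFILE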
> 
> From outside the job script, "qstat -n" (and variants, say, with -u username)
> will list the nodes allocated to each job,
> again multiple times as per the requested cores.
> 
> "tracejob job_number" will show similar information.
> 
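For example (the job number and username here are placeholders):

    qstat -n -u jsmith     # per-job node/core listing for one user
    tracejob 123456        # scheduling history for job 123456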
> 
> If you configured Torque --with-cpuset,
> there is more information about the cpuset allocated to the job
> in /dev/cpuset/torque/jobnumber (on the first node listed above, called 
> "mother superior" in Torque parlance).
> This mostly matters if there is more than one job running on a node.
> However, Torque doesn't bind processes/MPI_ranks to cores or sockets or 
> whatever.  As Ralph said, Open MPI does that.
> I believe Open MPI doesn't use the cpuset info from Torque.
> (Ralph, please correct me if I am wrong.)

You are correct in that we don't use any per-process designations. We do, 
however, work inside any overall envelope that Torque may impose on us - e.g., 
if you tell Torque to limit the job to cores 0-4, we will honor that directive 
and keep all processes within that envelope.
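
If you want to double-check where things actually land, two quick looks (the
cpuset file layout below assumes a legacy cpuset mount and may differ on your
system):

    mpirun -bind-to-core --report-bindings -n 24 ./a.out   # bind ranks to cores and print each binding
    cat /dev/cpuset/torque/$PBS_JOBID/cpus                 # cores Torque placed in the job's cpuset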


> 
> My two cents,
> Gus Correa
> 
> 
>> On Jun 6, 2014, at 10:01 AM, Reuti <re...@staff.uni-marburg.de> wrote:
>> 
>>> Am 06.06.2014 um 18:58 schrieb Sasso, John (GE Power & Water, Non-GE):
>>> 
>>>> OK, so at the least, how can I get the node and slots/node info that
>>>> is passed from PBS?
>>>> 
>>>> I ask because I'm trying to troubleshoot a problem with PBS and the
>>>> OpenMPI 1.6 build I noted.  If I submit a simple 24-process job
>>>> through PBS using a script which has:
>>>> 
>>>> /usr/local/openmpi/bin/orterun -n 24 --hostfile
>>>> /home/sasso/TEST/hosts.file --mca orte_rsh_agent rsh --mca btl
>>>> openib,tcp,self --mca orte_base_help_aggregate 0 -x PATH -x
>>>> LD_LIBRARY_PATH /home/sasso/TEST/simplempihello.exe
>>> 
>>> Using --hostfile on your own would violate the slot allocation granted
>>> by PBS. Just leave this option out. How do you submit
>>> your job?
>>> 
>>> -- Reuti
>>> 
>>> 
>>>> And while the hostfile /home/sasso/TEST/hosts.file contains 24 entries
>>>> (the first 16 being host node0001 and the last 8 being node0002), it
>>>> appears that all 24 MPI tasks try to start on node0001 instead of being
>>>> distributed as 16 on node0001 and 8 on node0002.  Hence, I am
>>>> curious what is being passed by PBS.
>>>> 
>>>> --john
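
Assuming the OMPI build really was configured with Torque/TM support, one
untested sketch is simply to drop the hostfile and let the allocation come
from PBS (the rsh agent override shouldn't be needed in a TM-aware build
either), keeping the rest of the original command:

    /usr/local/openmpi/bin/orterun -n 24 --mca btl openib,tcp,self \
        --mca orte_base_help_aggregate 0 -x PATH -x LD_LIBRARY_PATH \
        /home/sasso/TEST/simplempihello.exe

With $PBS_NODEFILE listing 16 slots on node0001 and 8 on node0002, the default
by-slot mapping should then put 16 ranks on the first node and 8 on the second.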
>>>> 
>>>> 
>>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph
>>>> Castain
>>>> Sent: Friday, June 06, 2014 12:31 PM
>>>> To: Open MPI Users
>>>> Subject: Re: [OMPI users] Determining what parameters a scheduler
>>>> passes to OpenMPI
>>>> 
>>>> We currently only get the node and slots/node info from PBS - we
>>>> don't get any task placement info at all. We then use the mpirun cmd
>>>> options and built-in mappers to map the tasks to the nodes.
>>>> 
>>>> I suppose we could do more integration in that regard, but haven't
>>>> really seen a reason to do so - the OMPI mappers are generally more
>>>> flexible than anything in the schedulers.
>>>> 
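For example, with the 1.6 series the layout can be adjusted right on the
mpirun command line (illustrative placeholders; see mpirun(1) for the full
list of mapping options):

    mpirun -n 24 -bynode ./a.out       # round-robin ranks across the allocated nodes
    mpirun -n 24 -npernode 8 ./a.out   # cap at 8 ranks per node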
>>>> 
>>>> On Jun 6, 2014, at 9:08 AM, Sasso, John (GE Power & Water, Non-GE)
>>>> <john1.sa...@ge.com> wrote:
>>>> 
>>>> 
>>>> For the PBS scheduler and using a build of OpenMPI 1.6 built against
>>>> PBS include files + libs, is there a way to determine (perhaps via
>>>> some debugging flags passed to mpirun) what job placement parameters
>>>> are passed from the PBS scheduler to OpenMPI?  In particular, I am
>>>> talking about task placement info such as nodes to place on, etc.
>>>>  Thanks!
>>>> 
>>>>             --john
