Sorry about the comment re cpus-per-proc - I momentarily confused this with 
another user who is also using Torque. I confirmed that this works fine with 1.6.5, 
and would guess you are hitting some bug in 1.6.0. Can you update?


On Jun 6, 2014, at 12:20 PM, Ralph Castain <r...@open-mpi.org> wrote:

> You might want to update to 1.6.5, if you can - I'll see what I can find.
> 
> On Jun 6, 2014, at 12:07 PM, Sasso, John (GE Power & Water, Non-GE) 
> <john1.sa...@ge.com> wrote:
> 
>> Version 1.6 (i.e. prior to 1.6.1)
>> 
>> -----Original Message-----
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
>> Sent: Friday, June 06, 2014 3:03 PM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] Determining what parameters a scheduler passes to 
>> OpenMPI
>> 
>> It's possible that you are hitting a bug - I'm not sure how much the 
>> cpus-per-proc option has been exercised in 1.6. Is this 1.6.5, or some other 
>> member of that series?
>> 
>> I don't have a Torque machine handy anymore, but I should be able to test 
>> this scenario on my boxes.
>> 
>> 
>> On Jun 6, 2014, at 10:51 AM, Sasso, John (GE Power & Water, Non-GE) 
>> <john1.sa...@ge.com> wrote:
>> 
>>> Re: $PBS_NODEFILE, we use that to create the hostfile that is passed via 
>>> --hostfile (i.e. the two are the same).  
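>>> (For reference, the hostfile is produced roughly like this - a sketch, the 
>>> actual job script may differ slightly:
>>> 
>>>    cat $PBS_NODEFILE > /home/sasso/TEST/hosts.file
>>> 
>>> i.e. hosts.file is a line-for-line copy of what Torque hands the job.)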
>>> 
>>> To further debug this, I passed "--display-allocation --display-map" to 
>>> orterun, which resulted in:
>>> 
>>> ======================   ALLOCATED NODES   ======================
>>> 
>>> Data for node: node0001        Num slots: 16   Max slots: 0
>>> Data for node: node0002        Num slots: 8    Max slots: 0
>>> 
>>> =================================================================
>>> 
>>> ========================   JOB MAP   ========================
>>> 
>>> Data for node: node0001        Num procs: 24
>>>      Process OMPI jobid: [24552,1] Process rank: 0
>>>      Process OMPI jobid: [24552,1] Process rank: 1
>>>      Process OMPI jobid: [24552,1] Process rank: 2
>>>      Process OMPI jobid: [24552,1] Process rank: 3
>>>      Process OMPI jobid: [24552,1] Process rank: 4
>>>      Process OMPI jobid: [24552,1] Process rank: 5
>>>      Process OMPI jobid: [24552,1] Process rank: 6
>>>      Process OMPI jobid: [24552,1] Process rank: 7
>>>      Process OMPI jobid: [24552,1] Process rank: 8
>>>      Process OMPI jobid: [24552,1] Process rank: 9
>>>      Process OMPI jobid: [24552,1] Process rank: 10
>>>      Process OMPI jobid: [24552,1] Process rank: 11
>>>      Process OMPI jobid: [24552,1] Process rank: 12
>>>      Process OMPI jobid: [24552,1] Process rank: 13
>>>      Process OMPI jobid: [24552,1] Process rank: 14
>>>      Process OMPI jobid: [24552,1] Process rank: 15
>>>      Process OMPI jobid: [24552,1] Process rank: 16
>>>      Process OMPI jobid: [24552,1] Process rank: 17
>>>      Process OMPI jobid: [24552,1] Process rank: 18
>>>      Process OMPI jobid: [24552,1] Process rank: 19
>>>      Process OMPI jobid: [24552,1] Process rank: 20
>>>      Process OMPI jobid: [24552,1] Process rank: 21
>>>      Process OMPI jobid: [24552,1] Process rank: 22
>>>      Process OMPI jobid: [24552,1] Process rank: 23
>>> 
>>> I have been going through the mpirun man page as well as the OpenMPI 
>>> mailing list and website, and thus far I have been unable to determine the 
>>> reason for the oversubscription of the head node (node0001), even though the 
>>> PBS scheduler is passing along the correct slot counts (16 and 8, respectively).
>>> 
>>> Am I running into a bug w/ OpenMPI 1.6?
>>> 
>>> --john
>>> 
>>> 
>>> 
>>> -----Original Message-----
>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph 
>>> Castain
>>> Sent: Friday, June 06, 2014 1:30 PM
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] Determining what parameters a scheduler 
>>> passes to OpenMPI
>>> 
>>> 
>>> On Jun 6, 2014, at 10:24 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>>> 
>>>> On 06/06/2014 01:05 PM, Ralph Castain wrote:
>>>>> You can always add --display-allocation to the cmd line to see what 
>>>>> we thought we received.
>>>>> 
>>>>> If you configure OMPI with --enable-debug, you can set --mca 
>>>>> ras_base_verbose 10 to see the details
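>>>>> For example (a sketch - the executable name is just a placeholder):
>>>>> 
>>>>>    mpirun -n 24 --display-allocation ./a.out
>>>>>    mpirun -n 24 --mca ras_base_verbose 10 ./a.out   # --enable-debug build only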
>>>>> 
>>>>> 
>>>> 
>>>> Hi John
>>>> 
>>>> On the Torque side, you can put a line "cat $PBS_NODEFILE" in the job 
>>>> script.  This will list the nodes (each repeated according to the number 
>>>> of cores requested).
>>>> I find this to be useful documentation, along with the job number, 
>>>> work directory, etc.
>>>> "man qsub" will show you all the PBS_* environment variables 
>>>> available to the job.
>>>> For instance, you can echo them from a Torque 'prolog' script if 
>>>> the user didn't do so. That output will appear in the Torque STDOUT file.
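>>>> A minimal sketch of the kind of lines you could add to the job script 
>>>> (these are the standard Torque/PBS variables):
>>>> 
>>>>    echo "Job ID:   $PBS_JOBID"
>>>>    echo "Workdir:  $PBS_O_WORKDIR"
>>>>    echo "Queue:    $PBS_QUEUE"
>>>>    echo "Nodefile contents:"
>>>>    cat $PBS_NODEFILE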
>>>> 
>>>> From outside the job script, "qstat -n" (and variants, say, with -u
>>>> username) will list the nodes allocated to each job, again multiple 
>>>> times as per the requested cores.
>>>> 
>>>> "tracejob job_number" will show similar information.
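>>>> For example (the username and job number here are just placeholders):
>>>> 
>>>>    qstat -n -u sasso
>>>>    tracejob 12345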
>>>> 
>>>> 
>>>> If you configured Torque with --with-cpuset, there is more information 
>>>> about the cpuset allocated to the job in /dev/cpuset/torque/jobnumber 
>>>> (on the first node listed above, called the "mother superior" in Torque 
>>>> parlance).
>>>> This mostly matters if there is more than one job running on a node.
>>>> However, Torque doesn't bind processes/MPI ranks to cores, sockets, or 
>>>> anything else.  As Ralph said, Open MPI does that.
>>>> I believe Open MPI doesn't use the cpuset info from Torque.
>>>> (Ralph, please correct me if I am wrong.)
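>>>> (By the way, a quick way to peek at that cpuset - the exact file names 
>>>> under the job's cpuset directory depend on the kernel and on how the 
>>>> cpuset filesystem is mounted:
>>>> 
>>>>    cat /dev/cpuset/torque/$PBS_JOBID/cpus
>>>> )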
>>> 
>>> You are correct in that we don't use any per-process designations. We do, 
>>> however, work inside any overall envelope that Torque may impose on us - 
>>> e.g., if you tell Torque to limit the job to cores 0-4, we will honor that 
>>> directive and keep all processes within that envelope.
>>> 
>>> 
>>>> 
>>>> My two cents,
>>>> Gus Correa
>>>> 
>>>> 
>>>>> On Jun 6, 2014, at 10:01 AM, Reuti <re...@staff.uni-marburg.de 
>>>>> <mailto:re...@staff.uni-marburg.de>> wrote:
>>>>> 
>>>>>> On 06.06.2014 at 18:58, Sasso, John (GE Power & Water, Non-GE) wrote:
>>>>>> 
>>>>>>> OK, so at the least, how can I get the node and slots/node info 
>>>>>>> that is passed from PBS?
>>>>>>> 
>>>>>>> I ask because I'm trying to troubleshoot a problem w/ PBS and the 
>>>>>>> build of OpenMPI 1.6 I noted.  If I submit a 24-process simple job 
>>>>>>> through PBS using a script which has:
>>>>>>> 
>>>>>>> /usr/local/openmpi/bin/orterun -n 24 --hostfile 
>>>>>>> /home/sasso/TEST/hosts.file --mca orte_rsh_agent rsh --mca btl 
>>>>>>> openib,tcp,self --mca orte_base_help_aggregate 0 -x PATH -x 
>>>>>>> LD_LIBRARY_PATH /home/sasso/TEST/simplempihello.exe
>>>>>> 
>>>>>> Using --hostfile on your own would mean violating the slot allocation 
>>>>>> granted by PBS. Just leave this option out. How do you 
>>>>>> submit your job?
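>>>>>> I.e. just something like the following (your original command line, minus 
>>>>>> the hostfile), and let Open MPI pick the allocation up from Torque:
>>>>>> 
>>>>>> /usr/local/openmpi/bin/orterun -n 24 --mca orte_rsh_agent rsh --mca btl 
>>>>>> openib,tcp,self --mca orte_base_help_aggregate 0 -x PATH -x 
>>>>>> LD_LIBRARY_PATH /home/sasso/TEST/simplempihello.exe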
>>>>>> 
>>>>>> -- Reuti
>>>>>> 
>>>>>> 
>>>>>>> And although the hostfile /home/sasso/TEST/hosts.file contains 24 entries 
>>>>>>> (the first 16 being host node0001 and the last 8 being node0002), 
>>>>>>> it appears that all 24 MPI tasks try to start on node0001 instead of being
>>>>>>> distributed as 16 on node0001 and 8 on node0002.  Hence, I am
>>>>>>> curious what is being passed by PBS.
>>>>>>> 
>>>>>>> --john
>>>>>>> 
>>>>>>> 
>>>>>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph 
>>>>>>> Castain
>>>>>>> Sent: Friday, June 06, 2014 12:31 PM
>>>>>>> To: Open MPI Users
>>>>>>> Subject: Re: [OMPI users] Determining what parameters a scheduler 
>>>>>>> passes to OpenMPI
>>>>>>> 
>>>>>>> We currently only get the node and slots/node info from PBS - we 
>>>>>>> don't get any task placement info at all. We then use the mpirun 
>>>>>>> cmd options and built-in mappers to map the tasks to the nodes.
>>>>>>> 
>>>>>>> I suppose we could do more integration in that regard, but haven't 
>>>>>>> really seen a reason to do so - the OMPI mappers are generally 
>>>>>>> more flexible than anything in the schedulers.
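>>>>>>> For example, with the 1.6 series you can steer the mapping from the 
>>>>>>> mpirun command line (a sketch; the executable is a placeholder):
>>>>>>> 
>>>>>>>    mpirun -n 24 --bynode ./a.out       # round-robin ranks across the nodes
>>>>>>>    mpirun -n 24 --npernode 8 ./a.out   # cap the number of ranks per node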
>>>>>>> 
>>>>>>> 
>>>>>>> On Jun 6, 2014, at 9:08 AM, Sasso, John (GE Power & Water, Non-GE) 
>>>>>>> <john1.sa...@ge.com <mailto:john1.sa...@ge.com>> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> For the PBS scheduler, and using a build of OpenMPI 1.6 built 
>>>>>>> against the PBS include files and libs, is there a way to determine 
>>>>>>> (perhaps via some debugging flags passed to mpirun) what job 
>>>>>>> placement parameters are passed from the PBS scheduler to OpenMPI?
>>>>>>> In particular, I am talking about task placement info such as which 
>>>>>>> nodes to place tasks on, etc.
>>>>>>> Thanks!
>>>>>>> 
>>>>>>>          --john
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 
