It helps if you use the correct configure option: --without-tm. Regardless, you can always deselect Torque support at runtime; just put the following in your environment.
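For example, in a bash shell this could look like the sketch below (the machinefile path, process count, and program name are placeholders, and the command-line --mca form is shown as an equivalent alternative rather than something quoted from this thread):

    # Tell Open MPI's run-time to skip the Torque (tm) allocation component
    # for everything launched from this shell:
    export OMPI_MCA_ras=^tm
    mpirun -machinefile /path/to/my_machinefile -np 12 ./my_program

    # Or pass the same MCA parameter for a single run only:
    mpirun --mca ras ^tm -machinefile /path/to/my_machinefile -np 12 ./my_program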
OMPI_MCA_ras=^tm tells ORTE to ignore the Torque allocation module, and it should then look at the machinefile.

On Aug 10, 2013, at 4:18 PM, "Lee-Ping Wang" <leep...@stanford.edu> wrote:

> Hi Gus,
>
> I agree that $PBS_JOBID should not point to a file in normal situations, because it is the job identifier given by the scheduler. However, ras_tm_module.c actually does search for a file named $PBS_JOBID, and that seems to be why it was failing. You can see this in the source code as well (look at ras_tm_module.c, I uploaded it to https://dl.dropboxusercontent.com/u/5381783/ras_tm_module.c ). Once I changed the $PBS_JOBID environment variable to the name of the node file, things seemed to work - though I agree, it's not very logical.
>
> I doubt Q-Chem is causing the issue, because I was able to "fix" things by changing $PBS_JOBID before Q-Chem is called. Also, I provided the command line to mpirun in a previous email, where the -machinefile argument correctly points to the custom machine file that I created. The missing environment variables should not matter.
>
> The PBS_NODEFILE created by Torque is /opt/torque/aux//272139.certainty.stanford.edu and it never gets touched. I followed the advice in your earlier email and I created my own node file on each node called /scratch/leeping/pbs_nodefile.$HOSTNAME, and I set PBS_NODEFILE to point to this file. However, this file does not get used either, even if I include it on the mpirun command line, unless I set PBS_JOBID to the file name.
>
> Finally, I was not able to build OpenMPI 1.4.2 without PBS support. I used the configure flag --without-rte-support, but the build failed halfway through.
>
> Thanks,
>
> - Lee-Ping
>
> leeping@certainty-a:~/temp$ qsub -I -q debug -l walltime=1:00:00 -l nodes=1:ppn=12
> qsub: waiting for job 272139.certainty.stanford.edu to start
> qsub: job 272139.certainty.stanford.edu ready
>
> leeping@compute-140-4:~$ echo $PBS_NODEFILE
> /opt/torque/aux//272139.certainty.stanford.edu
>
> leeping@compute-140-4:~$ cat $PBS_NODEFILE
> compute-140-4
> compute-140-4
> compute-140-4
> compute-140-4
> compute-140-4
> compute-140-4
> compute-140-4
> compute-140-4
> compute-140-4
> compute-140-4
> compute-140-4
> compute-140-4
>
> leeping@compute-140-4:~$ echo $PBS_JOBID
> 272139.certainty.stanford.edu
>
> leeping@compute-140-4:~$ cat $PBS_JOBID
> cat: 272139.certainty.stanford.edu: No such file or directory
>
> leeping@compute-140-4:~$ env | grep PBS
> PBS_VERSION=TORQUE-2.5.3
> PBS_JOBNAME=STDIN
> PBS_ENVIRONMENT=PBS_INTERACTIVE
> PBS_O_WORKDIR=/home/leeping/temp
> PBS_TASKNUM=1
> PBS_O_HOME=/home/leeping
> PBS_MOMPORT=15003
> PBS_O_QUEUE=debug
> PBS_O_LOGNAME=leeping
> PBS_O_LANG=en_US.iso885915
> PBS_JOBCOOKIE=A27B00DAF72024CBEBB7CD3752BDBADC
> PBS_NODENUM=0
> PBS_NUM_NODES=1
> PBS_O_SHELL=/bin/bash
> PBS_SERVER=certainty.stanford.edu
> PBS_JOBID=272139.certainty.stanford.edu
> PBS_O_HOST=certainty-a.local
> PBS_VNODENUM=0
> PBS_QUEUE=debug
> PBS_O_MAIL=/var/spool/mail/leeping
> PBS_NUM_PPN=12
> PBS_NODEFILE=/opt/torque/aux//272139.certainty.stanford.edu
> PBS_O_PATH=/opt/intel/Compiler/11.1/064/bin/intel64:/opt/intel/Compiler/11.1/064/bin/intel64:/usr/local/cuda/bin:/home/leeping/opt/psi-4.0b5/bin:/home/leeping/opt/tinker/bin:/home/leeping/opt/cctools/bin:/home/leeping/bin:/home/leeping/local/bin:/home/leeping/opt/bin:/usr/kerberos/bin:/usr/java/latest/bin:/usr/local/bin:/bin:/usr/bin:/opt/ganglia/bin:/opt/ganglia/sbin:/opt/openmpi/bin/:/opt/maui/bin:/opt/torque/bin:/opt/torque/sbin:/opt/rocks/bin:/opt/rocks/sbin:/opt/sun-ct/bin:/home/leeping/bin
>
> -----Original Message-----
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gustavo Correa
> Sent: Saturday, August 10, 2013 3:58 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Error launching single-node tasks from multiple-node job.
>
> Lee-Ping,
>
> Something looks amiss. PBS_JOBID contains the job name. PBS_NODEFILE contains a list (with repetitions up to the number of cores) of the nodes that Torque assigned to the job.
>
> Why things get twisted is hard to tell. It may be something in the Q-Chem scripts (could they be mixing up PBS_JOBID and PBS_NODEFILE?), or it may be something else. A more remote possibility is that the cluster has a Torque qsub wrapper that produces the aforementioned confusion. Unlikely, but possible.
>
> To sort this out, run any simple job (mpiexec -np 32 hostname), or even your very Q-Chem job, but precede it with a bunch of printouts of the PBS environment variables:
> echo $PBS_JOBID
> echo $PBS_NODEFILE
> ls -l $PBS_NODEFILE
> cat $PBS_NODEFILE
> cat $PBS_JOBID [this one should fail, because that is not a file, but it may work if the PBS variables were messed up along the way]
>
> I hope this helps,
> Gus Correa
>
> On Aug 10, 2013, at 6:39 PM, Lee-Ping Wang wrote:
>
>> Hi Gus,
>>
>> It seems the calculation is now working, or at least it didn't crash. I set the PBS_JOBID environment variable to the name of my custom node file. That is to say, I set PBS_JOBID=pbs_nodefile.compute-3-3.local. It appears that ras_tm_module.c is trying to open the file located at /scratch/leeping/$PBS_JOBID for some reason, and it is disregarding the machinefile argument on the command line.
>>
>> It'll be a few hours before I know for sure whether the job actually worked. I still don't know why things are structured this way, however.
>>
>> Thanks,
>>
>> - Lee-Ping
>>
>> -----Original Message-----
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Lee-Ping Wang
>> Sent: Saturday, August 10, 2013 3:07 PM
>> To: 'Open MPI Users'
>> Subject: Re: [OMPI users] Error launching single-node tasks from multiple-node job.
>>
>> Hi Gus,
>>
>> I tried your suggestions. Here is the command line which executes mpirun. I was puzzled because it still reported a file open failure, so I inserted a print statement into ras_tm_module.c and recompiled. The results are below. As you can see, it tries to open a different file (/scratch/leeping/272055.certainty.stanford.edu) than the one I specified (/scratch/leeping/pbs_nodefile.compute-3-3.local).
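(An aside on why changing PBS_JOBID changes which file gets opened: the environment quoted above shows that the Torque node file on this cluster is just the aux directory with the job id appended, and the report above says the module is opening /scratch/leeping/$PBS_JOBID, i.e. the same <directory>/$PBS_JOBID shape, rather than anything derived from -machinefile. A small illustration using only values already quoted in this thread; the /scratch/leeping prefix in the failing case presumably comes from whatever directory the tm module is configured to use, which is not shown here.)

    echo $PBS_JOBID
    # 272139.certainty.stanford.edu
    echo $PBS_NODEFILE
    # /opt/torque/aux//272139.certainty.stanford.edu
    # i.e. the node file Torque writes is <aux dir>/$PBS_JOBID
    # (the doubled slash in $PBS_NODEFILE is harmless):
    ls -l /opt/torque/aux/$PBS_JOBID
    # So whatever string PBS_JOBID holds ends up as the last component of the
    # path that gets opened; compare /scratch/leeping/272055.certainty.stanford.edu
    # in the error message quoted just below.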
>>
>> - Lee-Ping
>>
>> === mpirun command line ===
>> /home/leeping/opt/openmpi-1.4.2-intel11-dbg/bin/mpirun -machinefile /scratch/leeping/pbs_nodefile.compute-3-3.local -x HOME -x PWD -x QC -x QCAUX -x QCCLEAN -x QCFILEPREF -x QCLOCALSCR -x QCPLATFORM -x QCREF -x QCRSH -x QCRUNNAME -x QCSCRATCH -np 24 /home/leeping/opt/qchem40/exe/qcprog.exe .B.in.28642.qcin.1 ./qchem28642/ >>B.out
>>
>> === Error message from compute node ===
>> [compute-3-3.local:28666] Warning: could not find environment variable "QCLOCALSCR"
>> [compute-3-3.local:28666] Warning: could not find environment variable "QCREF"
>> [compute-3-3.local:28666] Warning: could not find environment variable "QCRUNNAME"
>> Attempting to open /scratch/leeping/272055.certainty.stanford.edu
>> [compute-3-3.local:28666] [[56726,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 155
>> [compute-3-3.local:28666] [[56726,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 87
>> [compute-3-3.local:28666] [[56726,0],0] ORTE_ERROR_LOG: File open failure in file base/ras_base_allocate.c at line 133
>> [compute-3-3.local:28666] [[56726,0],0] ORTE_ERROR_LOG: File open failure in file base/plm_base_launch_support.c at line 72
>> [compute-3-3.local:28666] [[56726,0],0] ORTE_ERROR_LOG: File open failure in file plm_tm_module.c at line 167
>>
>> -----Original Message-----
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Lee-Ping Wang
>> Sent: Saturday, August 10, 2013 12:51 PM
>> To: 'Open MPI Users'
>> Subject: Re: [OMPI users] Error launching single-node tasks from multiple-node job.
>>
>> Hi Gus,
>>
>> Thank you. You gave me many helpful suggestions, which I will try out and get back to you. I will provide more specifics (e.g. how my jobs were submitted) in a future email.
>>
>> As for the queue policy, that is a highly political issue because the cluster is a shared resource. My usual recourse is to use the batch system as effectively as possible within the confines of their policies. This is why it makes sense to submit a single multiple-node batch job, which then executes several independent single-node tasks.
>>
>> - Lee-Ping
>>
>> -----Original Message-----
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gustavo Correa
>> Sent: Saturday, August 10, 2013 12:39 PM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] Error launching single-node tasks from multiple-node job.
>>
>> Hi Lee-Ping
>>
>> On Aug 10, 2013, at 3:15 PM, Lee-Ping Wang wrote:
>>
>>> Hi Gus,
>>>
>>> Thank you for your reply. I want to run MPI jobs inside a single node, but due to the resource allocation policies on the clusters, I could get many more resources if I submit multiple-node "batch jobs". Once I have a multiple-node batch job, then I can use a command like "pbsdsh" to run single-node MPI jobs on each node that is allocated to me. Thus, the MPI jobs on each node are running independently of each other and unaware of one another.
>>
>> Even if you use pbsdsh to launch separate MPI jobs on individual nodes, you probably (not 100% sure about that) need to specify the -hostfile naming the specific node that each job will run on.
>>
>> Still quite confused, because you didn't say what your "qsub" command looks like, what Torque script (if any) it is launching, etc.
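A minimal sketch of the per-node hostfile idea mentioned just above: on each node (e.g. via pbsdsh), build a one-line hostfile naming only that node and point mpirun at it. The scratch path, slot count, and task command are placeholders, not taken from this thread:

    # Write a hostfile that names only the node this shell is running on;
    # "slots=12" tells Open MPI how many processes the node should hold.
    echo "$(hostname) slots=12" > /scratch/$USER/hostfile.$(hostname)

    # Launch an independent, single-node MPI job using that hostfile.
    mpirun -machinefile /scratch/$USER/hostfile.$(hostname) -np 12 ./my_single_node_task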
>>
>>> The actual call to mpirun is nontrivial to get, because Q-Chem has a complicated series of wrapper scripts which ultimately calls mpirun.
>>
>> Yes, I just found this out on the Web. See my previous email.
>>
>>> If the jobs are failing immediately, then I only have a small window to view the actual command through "ps" or something.
>>
>> Are you launching the jobs interactively? I.e., with the -I switch to qsub?
>>
>>> Another option is for me to compile OpenMPI without Torque / PBS support. If I do that, then it won't look for the node file anymore. Is this correct?
>>
>> You will need to tell mpiexec where to launch the jobs. If I understand what you are trying to achieve (and I am not sure I do), one way to do it would be to programmatically split the $PBS_NODEFILE into several hostfiles, one per MPI job (so to speak) that you want to launch, and then use each of these node files for each of the MPI jobs. Note that the PBS_NODEFILE has one line per node per core, *not* one line per node. I have no idea how the trick above could be reconciled with the Q-Chem scripts, though.
>>
>> Overall, I don't understand why you would benefit from such a complicated scheme, rather than launching either one big MPI job across all the nodes that you requested (if the problem is large enough to benefit from that many cores), or several small single-node jobs (if the problem is small enough to fit well on a single node).
>>
>> You may want to talk to the cluster managers, because there must be a way to reconcile their queue policies with your needs (if this is not already in place). We run tons of parallel single-node jobs here, for problems that fit well on a single node.
>>
>> My two cents,
>> Gus Correa
>>
>>> I will try your suggestions and get back to you. Thanks!
>>>
>>> - Lee-Ping
>>>
>>> -----Original Message-----
>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gustavo Correa
>>> Sent: Saturday, August 10, 2013 12:04 PM
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] Error launching single-node tasks from multiple-node job.
>>>
>>> Hi Lee-Ping
>>>
>>> I know nothing about Q-Chem, but I was confused by these sentences:
>>>
>>> "That is to say, these tasks are intended to use OpenMPI parallelism on each node, but no parallelism across nodes."
>>>
>>> "I do not observe this error when submitting single-node jobs."
>>>
>>> "Since my jobs are only parallel over the node they're running on, I believe that a node file of any kind is unnecessary."
>>>
>>> Are you trying to run MPI jobs across several nodes or inside a single node?
>>>
>>> ***
>>>
>>> Anyway, as far as I know, if your OpenMPI was compiled with Torque/PBS support, the mpiexec/mpirun command will look for the $PBS_NODEFILE to learn in which node(s) it should launch the MPI processes, regardless of whether you are using one node or more than one node.
>>>
>>> You didn't send your mpiexec command line (which would help), but assuming that Q-Chem allows some level of standard mpiexec command options, you could force passing the $PBS_NODEFILE to it.
>>>
>>> Something like this (for two nodes with 8 cores each):
>>>
>>> #PBS -q myqueue
>>> #PBS -l nodes=2:ppn=8
>>> #PBS -N myjob
>>> cd $PBS_O_WORKDIR
>>> ls -l $PBS_NODEFILE
>>> cat $PBS_NODEFILE
>>>
>>> mpiexec -hostfile $PBS_NODEFILE -np 16 ./my-Q-chem-executable <parameters to Q-chem>
>>>
>>> I hope this helps,
>>> Gus Correa
>>>
>>> On Aug 10, 2013, at 1:51 PM, Lee-Ping Wang wrote:
>>>
>>>> Hi there,
>>>>
>>>> Recently, I've begun some calculations on a cluster where I submit a multiple-node job to the Torque batch system, and the job executes multiple single-node parallel tasks. That is to say, these tasks are intended to use OpenMPI parallelism on each node, but no parallelism across nodes.
>>>>
>>>> Some background: the actual program being executed is Q-Chem 4.0. I use OpenMPI 1.4.2 for this, because Q-Chem is notoriously difficult to compile and this is the last version of OpenMPI that this version of Q-Chem is known to work with.
>>>>
>>>> My jobs are failing with the error message below; I do not observe this error when submitting single-node jobs. From reading the mailing list archives (http://www.open-mpi.org/community/lists/users/2010/03/12348.php), I believe it is looking for a PBS node file somewhere. Since my jobs are only parallel over the node they're running on, I believe that a node file of any kind is unnecessary.
>>>>
>>>> My question is: why is OpenMPI behaving differently when I submit a multi-node job compared to a single-node job? How does OpenMPI detect that it is running under a multi-node allocation? Is there a way I can change OpenMPI's behavior so it always thinks it's running on a single node, regardless of the type of job I submit to the batch system?
>>>>
>>>> Thank you,
>>>>
>>>> - Lee-Ping Wang (Postdoc in Dept. of Chemistry, Stanford University)
>>>>
>>>> [compute-1-1.local:10910] [[42010,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 153
>>>> [compute-1-1.local:10909] [[42009,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 153
>>>> [compute-1-1.local:10911] [[42011,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 153
>>>> [compute-1-1.local:10910] [[42010,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 87
>>>> [compute-1-1.local:10909] [[42009,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 87
>>>> [compute-1-1.local:10911] [[42011,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 87
>>>> [compute-1-1.local:10910] [[42010,0],0] ORTE_ERROR_LOG: File open failure in file base/ras_base_allocate.c at line 133
>>>> [compute-1-1.local:10909] [[42009,0],0] ORTE_ERROR_LOG: File open failure in file base/ras_base_allocate.c at line 133
>>>> [compute-1-1.local:10911] [[42011,0],0] ORTE_ERROR_LOG: File open failure in file base/ras_base_allocate.c at line 133
>>>> [compute-1-1.local:10910] [[42010,0],0] ORTE_ERROR_LOG: File open failure in file base/plm_base_launch_support.c at line 72
>>>> [compute-1-1.local:10909] [[42009,0],0] ORTE_ERROR_LOG: File open failure in file base/plm_base_launch_support.c at line 72
>>>> [compute-1-1.local:10911] [[42011,0],0] ORTE_ERROR_LOG: File open failure in file base/plm_base_launch_support.c at line 72
>>>> [compute-1-1.local:10910] [[42010,0],0] ORTE_ERROR_LOG: File open failure in file plm_tm_module.c at line 167
>>>> [compute-1-1.local:10909] [[42009,0],0] ORTE_ERROR_LOG: File open failure in file plm_tm_module.c at line 167
>>>> [compute-1-1.local:10911] [[42011,0],0] ORTE_ERROR_LOG: File open failure in file plm_tm_module.c at line 167
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users