Re: [OMPI users] Error launching single-node tasks from multiple-node job.

Gustavo Correa Sat, 10 Aug 2013 15:39:32 -0400 (EDT)

Hi Lee-Ping
On Aug 10, 2013, at 3:15 PM, Lee-Ping Wang wrote:

> Hi Gus,
> 
> Thank you for your reply.  I want to run MPI jobs inside a single node, but
> due to the resource allocation policies on the clusters, I could get many
> more resources if I submit multiple-node "batch jobs".  Once I have a
> multiple-node batch job, then I can use a command like "pbsdsh" to run
> single node MPI jobs on each node that is allocated to me.  Thus, the MPI
> jobs on each node are running independently of each other and unaware of one
> another.


Even if you use pbsdsh to launch separate MPI jobs on individual nodes,
you probably (I am not 100% sure about that) need to specify the -hostfile
option, naming the specific node that each job will run on.

I am still quite confused, because you didn't say what your "qsub" command looks like,
what Torque script (if any) it is launching, etc.

> 
> The actual call to mpirun is nontrivial to get, because Q-Chem has a
> complicated series of wrapper scripts which ultimately calls mpirun.  

Yes, I just found this out on the Web.  See my previous email.

> If the
> jobs are failing immediately, then I only have a small window to view the
> actual command through "ps" or something.
> 

Are you launching the jobs interactively?  
I.e., with the -I switch to qsub?


> Another option is for me to compile OpenMPI without Torque / PBS support.
> If I do that, then it won't look for the node file anymore.  Is this
> correct? 

You will need to tell mpiexec where to launch the jobs.
If I understand what you are trying to achieve (and I am not sure I do),
one way to do it would be to programmatically split the $PBS_NODEFILE into
several hostfiles, one per MPI job (so to speak) that you want to launch.
Then use one of these hostfiles for each of the MPI jobs.
Note that the PBS_NODEFILE has one line per core on each node, *not* one line
per node.
I have no idea how the trick above could be reconciled with the Q-Chem scripts,
though.
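
A rough sketch of the splitting idea, untested with Q-Chem and its wrapper
scripts. The function name split_nodefile and the hostfile.<node> naming are
made up for illustration; the only things taken from Torque and Open MPI are
the $PBS_NODEFILE and $PBS_O_WORKDIR variables and the "node slots=N"
hostfile syntax.

```shell
#!/bin/sh
# Sketch: split a Torque node file into one Open MPI hostfile per node.
# split_nodefile and the hostfile.<node> names are hypothetical helpers,
# not part of Torque, Open MPI, or Q-Chem.

split_nodefile() {
    nodefile=$1   # e.g. "$PBS_NODEFILE": one line per allocated core
    outdir=$2     # directory that receives the hostfile.<node> files
    mkdir -p "$outdir" || return 1
    # sort -u collapses the per-core repetitions into one entry per node.
    sort -u "$nodefile" | while read -r node; do
        [ -n "$node" ] || continue
        # Count the lines for this node, i.e. the cores allocated on it.
        slots=$(grep -c -x -F -- "$node" "$nodefile")
        printf '%s slots=%s\n' "$node" "$slots" > "$outdir/hostfile.$node"
    done
}

# Inside a Torque job the batch system sets $PBS_NODEFILE.
if [ -n "${PBS_NODEFILE:-}" ]; then
    split_nodefile "$PBS_NODEFILE" "${PBS_O_WORKDIR:-.}/hostfiles"
fi
```

Each single-node job could then be started with something like
"mpiexec -hostfile hostfiles/hostfile.<node> -np 8 ./my-Q-chem-executable",
assuming the Q-Chem wrappers let you pass mpiexec options through.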

Overall, I don't understand why you would benefit from such a complicated
scheme, rather than launching either one big MPI job across all the nodes that
you requested (if the problem is large enough to benefit from this many cores),
or several small single-node jobs (if the problem is small enough to fit
well on a single node).

You may want to talk to the cluster managers, because there must be a way to
reconcile their queue policies with your needs (if this is not already in place).
We run tons of parallel single-node jobs here, for problems that fit well on a
single node.


My two cents
Gus Correa
 
> 
> I will try your suggestions and get back to you.  Thanks!
> 
> - Lee-Ping
> 
> -----Original Message-----
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gustavo Correa
> Sent: Saturday, August 10, 2013 12:04 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Error launching single-node tasks from
> multiple-node job.
> 
> Hi Lee-Ping
> 
> I know nothing about Q-Chem, but I was confused by these sentences:
> 
> "That is to say, these tasks are intended to use OpenMPI parallelism on each
> node, but no parallelism across nodes. "
> 
> "I do not observe this error when submitting single-node jobs."
> 
> "Since my jobs are only parallel over the node they're running on, I believe
> that a node file of any kind is unnecessary. "
> 
> Are you trying to run MPI jobs across several nodes or inside a single node?
> 
> ***
> 
> Anyway, as far as I know,
> if your OpenMPI was compiled with Torque/PBS support, the mpiexec/mpirun
> command will look for the $PBS_NODEFILE to learn in which node(s) it should
> launch the MPI processes, regardless of whether you are using one node or
> more than one node.
> 
> You didn't send your mpiexec command line (which would help), but assuming
> that Q-Chem allows some level of standard mpiexec command options, you could
> force passing the $PBS_NODEFILE to it.
> 
> Something like this (for two nodes with 8 cores each):
> 
> #PBS -q myqueue
> #PBS -l nodes=2:ppn=8
> #PBS -N myjob
> cd $PBS_O_WORKDIR
> ls -l $PBS_NODEFILE
> cat $PBS_NODEFILE
> 
> mpiexec -hostfile $PBS_NODEFILE -np 16 ./my-Q-chem-executable <parameters to Q-chem>
> 
> I hope this helps,
> Gus Correa
> 
> On Aug 10, 2013, at 1:51 PM, Lee-Ping Wang wrote:
> 
>> Hi there,
>> 
>> Recently, I've begun some calculations on a cluster where I submit a
>> multiple node job to the Torque batch system, and the job executes multiple
>> single-node parallel tasks.  That is to say, these tasks are intended to use
>> OpenMPI parallelism on each node, but no parallelism across nodes.
>>
>> Some background: The actual program being executed is Q-Chem 4.0.  I use
>> OpenMPI 1.4.2 for this, because Q-Chem is notoriously difficult to compile
>> and this is the last known version of OpenMPI that this version of Q-Chem is
>> known to work with.
>>
>> My jobs are failing with the error message below; I do not observe this
>> error when submitting single-node jobs.  From reading the mailing list
>> archives (http://www.open-mpi.org/community/lists/users/2010/03/12348.php),
>> I believe it is looking for a PBS node file somewhere.  Since my jobs are
>> only parallel over the node they're running on, I believe that a node file
>> of any kind is unnecessary.
>>
>> My question is: Why is OpenMPI behaving differently when I submit a
>> multi-node job compared to a single-node job?  How does OpenMPI detect that
>> it is running under a multi-node allocation?  Is there a way I can change
>> OpenMPI's behavior so it always thinks it's running on a single node,
>> regardless of the type of job I submit to the batch system?
>>
>> Thank you,
>>
>> - Lee-Ping Wang (Postdoc in Dept. of Chemistry, Stanford University)
>> 
>> [compute-1-1.local:10910] [[42010,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 153
>> [compute-1-1.local:10909] [[42009,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 153
>> [compute-1-1.local:10911] [[42011,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 153
>> [compute-1-1.local:10910] [[42010,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 87
>> [compute-1-1.local:10909] [[42009,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 87
>> [compute-1-1.local:10911] [[42011,0],0] ORTE_ERROR_LOG: File open failure in file ras_tm_module.c at line 87
>> [compute-1-1.local:10910] [[42010,0],0] ORTE_ERROR_LOG: File open failure in file base/ras_base_allocate.c at line 133
>> [compute-1-1.local:10909] [[42009,0],0] ORTE_ERROR_LOG: File open failure in file base/ras_base_allocate.c at line 133
>> [compute-1-1.local:10911] [[42011,0],0] ORTE_ERROR_LOG: File open failure in file base/ras_base_allocate.c at line 133
>> [compute-1-1.local:10910] [[42010,0],0] ORTE_ERROR_LOG: File open failure in file base/plm_base_launch_support.c at line 72
>> [compute-1-1.local:10909] [[42009,0],0] ORTE_ERROR_LOG: File open failure in file base/plm_base_launch_support.c at line 72
>> [compute-1-1.local:10911] [[42011,0],0] ORTE_ERROR_LOG: File open failure in file base/plm_base_launch_support.c at line 72
>> [compute-1-1.local:10910] [[42010,0],0] ORTE_ERROR_LOG: File open failure in file plm_tm_module.c at line 167
>> [compute-1-1.local:10909] [[42009,0],0] ORTE_ERROR_LOG: File open failure in file plm_tm_module.c at line 167
>> [compute-1-1.local:10911] [[42011,0],0] ORTE_ERROR_LOG: File open failure in file plm_tm_module.c at line 167
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
