Re: [OMPI users] A problem with 'mpiexec -launch-agent'
On Jun 14, 2010, at 5:24 PM, Terry Frankcombe wrote:

> Speaking as no more than an uneducated user, having the behaviour change
> depending on invoking by an absolute path or invoking by some
> unspecified (potentially shell-dependent) path magic seems like a bad
> idea.

FWIW, this specific feature was copied (at the request of multiple users) from another MPI implementation.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
[OMPI users] mpirun jobs only one single node
Hi,

I am using an Open MPI build with tm support. When I run a job requesting two nodes, it runs on only a single node. Here is my script:

>cat mpipbs-script.sh
#PBS -N mpipbs-script
#PBS -q short
### Number of nodes: resources per node
### (4 cores/node, so ppn=4 is ALL resources on the node)
#PBS -l nodes=2:ppn=4
/opt/openmpi-1.4.2/bin/mpirun /scratch0/gsongara/mpitest/hello

torque config:
set queue short resources_max.nodes = 4
set queue short resources_default.nodes = 1
set server resources_default.neednodes = 1
set server resources_default.nodect = 1
set server resources_default.nodes = 1

Can someone please advise if I am missing anything here?

Regards
Govind
Re: [OMPI users] mpirun jobs only one single node
Look at the contents of $PBS_NODEFILE and see how many nodes it contains.

On Jun 15, 2010, at 3:56 AM, Govind Songara wrote:

> Hi,
>
> I am using an Open MPI build with tm support. When I run a job requesting
> two nodes, it runs on only a single node. Here is my script:
>
> >cat mpipbs-script.sh
> #PBS -N mpipbs-script
> #PBS -q short
> ### Number of nodes: resources per node
> ### (4 cores/node, so ppn=4 is ALL resources on the node)
> #PBS -l nodes=2:ppn=4
> /opt/openmpi-1.4.2/bin/mpirun /scratch0/gsongara/mpitest/hello
>
> torque config:
> set queue short resources_max.nodes = 4
> set queue short resources_default.nodes = 1
> set server resources_default.neednodes = 1
> set server resources_default.nodect = 1
> set server resources_default.nodes = 1
>
> Can someone please advise if I am missing anything here?
>
> Regards
> Govind
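A quick way to do that check from inside the job script (a minimal sketch, assuming a standard Torque/PBS setup where $PBS_NODEFILE lists one line per allocated slot; only the standard Torque variable is used here):

# Count allocated slots and distinct hosts in the node file
echo "slots allocated: $(wc -l < $PBS_NODEFILE)"
echo "distinct nodes:"
sort "$PBS_NODEFILE" | uniq -c

With -l nodes=2:ppn=4 you would expect 8 entries spread over two distinct hostnames; a single hostname repeated four times means Torque handed the job only one node, and mpirun is simply following that allocation.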
Re: [OMPI users] A problem with 'mpiexec -launch-agent'
On Jun 14, 2010, at 3:13 PM, Reuti wrote:

> > bash: -c: line 0: syntax error near unexpected token `('
> > bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ;
> > LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export
> > LD_LIBRARY_PATH ; /some_path/myscript /OMPI_dir/bin/(null) --
> > daemonize -mca ess env -mca orte_ess_jobid 1978662912 -mca
> > orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
> > "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'

The problem is that "(null)" in the middle. We'll have to dig into how that got there... Reuti's probably right that something is somehow NULL in there, and glibc is snprintf'ing "(null)" instead of SEGV'ing.

Ralph and I are talking about this issue, but we're hindered by the fact that I'm at the MPI Forum this week (i.e., meetings are taking up all my days). I haven't had a chance to look at the code in depth yet.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
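For reference, glibc's printf/snprintf family substitutes the literal string "(null)" when handed a NULL pointer for a %s conversion instead of crashing (the C standard leaves this undefined; glibc is simply forgiving about it). A minimal demonstration of the effect being described here; the file name and variable name are made up for illustration, not taken from the Open MPI source:

cat > null_demo.c <<'EOF'
#include <stdio.h>

int main(void)
{
    const char *orted_name = NULL;   /* stands in for whatever went unset */
    char cmd[64];

    /* glibc renders the NULL %s argument as "(null)" rather than SEGV'ing */
    snprintf(cmd, sizeof(cmd), "/OMPI_dir/bin/%s", orted_name);
    puts(cmd);                       /* prints: /OMPI_dir/bin/(null) */
    return 0;
}
EOF
gcc null_demo.c -o null_demo && ./null_demo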
Re: [OMPI users] A problem with 'mpiexec -launch-agent'
On 15.06.2010, at 14:52, Jeff Squyres wrote:

> On Jun 14, 2010, at 3:13 PM, Reuti wrote:
>
>>> bash: -c: line 0: syntax error near unexpected token `('
>>> bash: -c: line 0: ` PATH=/OMPI_dir/bin:$PATH ; export PATH ;
>>> LD_LIBRARY_PATH=/OMPI_dir/lib:$LD_LIBRARY_PATH ; export
>>> LD_LIBRARY_PATH ; /some_path/myscript /OMPI_dir/bin/(null) --
>>> daemonize -mca ess env -mca orte_ess_jobid 1978662912 -mca
>>> orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
>>> "1978662912.0;tcp://180.0.14.12:54844;tcp://190.0.14.12:54844"'
>
> The problem is that "(null)" in the middle. We'll have to dig into how that
> got there... Reuti's probably right that something is somehow NULL in there,
> and glibc is snprintf'ing (null) instead of SEGV'ing.

I think the problem is not only the (null) itself, but also the output of "prefix_dir" and "bin_base" (unless the launch agent were to ignore or interpret $1 and $2 in a proper way). The (null) is then the content of "orted_cmd".

-- Reuti

> Ralph and I are talking about this issue, but we're hindered by the fact that
> I'm at the MPI Forum this week (i.e., meetings are taking up all my days). I
> haven't had a chance to look at the code in depth yet.
Re: [OMPI users] mpirun jobs only one single node
I added $PBS_NODEFILE to the script in my last email below. It shows only one node; here is the output:

===
node47.beowulf.cluster node47.beowulf.cluster node47.beowulf.cluster node47.beowulf.cluster
This job has allocated 4 nodes
Hello World! from process 1 out of 4 on node47.beowulf.cluster
Hello World! from process 2 out of 4 on node47.beowulf.cluster
Hello World! from process 3 out of 4 on node47.beowulf.cluster
Hello World! from process 0 out of 4 on node47.beowulf.cluster
===

On 15 June 2010 13:41, Ralph Castain wrote:

> Look at the contents of $PBS_NODEFILE and see how many nodes it contains.
>
> On Jun 15, 2010, at 3:56 AM, Govind Songara wrote:
>
>> Hi,
>>
>> I am using an Open MPI build with tm support. When I run a job requesting
>> two nodes, it runs on only a single node. Here is my script:
>>
>> >cat mpipbs-script.sh
>> #PBS -N mpipbs-script
>> #PBS -q short
>> ### Number of nodes: resources per node
>> ### (4 cores/node, so ppn=4 is ALL resources on the node)
>> #PBS -l nodes=2:ppn=4
>>
>> echo `cat $PBS_NODEFILE`
>> NPROCS=`wc -l < $PBS_NODEFILE`
>> echo This job has allocated $NPROCS nodes
>>
>> /opt/openmpi-1.4.2/bin/mpirun /scratch0/gsongara/mpitest/hello
>>
>> torque config:
>> set queue short resources_max.nodes = 4
>> set queue short resources_default.nodes = 1
>> set server resources_default.neednodes = 1
>> set server resources_default.nodect = 1
>> set server resources_default.nodes = 1
>>
>> Can someone please advise if I am missing anything here?
>>
>> Regards
>> Govind
Re: [OMPI users] mpirun jobs only one single node
That's what I suspected. I suggest you talk to your sys admin about how PBS is configured - looks like you are only getting one node allocated despite your request for two. Probably something in the config needs adjusting.

On Jun 15, 2010, at 7:20 AM, Govind Songara wrote:

> I added $PBS_NODEFILE to the script in my last email below.
> It shows only one node; here is the output:
> ===
> node47.beowulf.cluster node47.beowulf.cluster node47.beowulf.cluster
> node47.beowulf.cluster
> This job has allocated 4 nodes
> Hello World! from process 1 out of 4 on node47.beowulf.cluster
> Hello World! from process 2 out of 4 on node47.beowulf.cluster
> Hello World! from process 3 out of 4 on node47.beowulf.cluster
> Hello World! from process 0 out of 4 on node47.beowulf.cluster
> ===
>
> On 15 June 2010 13:41, Ralph Castain wrote:
>
>> Look at the contents of $PBS_NODEFILE and see how many nodes it contains.
>>
>> On Jun 15, 2010, at 3:56 AM, Govind Songara wrote:
>>
>>> Hi,
>>>
>>> I am using an Open MPI build with tm support. When I run a job requesting
>>> two nodes, it runs on only a single node. Here is my script:
>>>
>>> >cat mpipbs-script.sh
>>> #PBS -N mpipbs-script
>>> #PBS -q short
>>> ### Number of nodes: resources per node
>>> ### (4 cores/node, so ppn=4 is ALL resources on the node)
>>> #PBS -l nodes=2:ppn=4
>>>
>>> echo `cat $PBS_NODEFILE`
>>> NPROCS=`wc -l < $PBS_NODEFILE`
>>> echo This job has allocated $NPROCS nodes
>>>
>>> /opt/openmpi-1.4.2/bin/mpirun /scratch0/gsongara/mpitest/hello
>>>
>>> torque config:
>>> set queue short resources_max.nodes = 4
>>> set queue short resources_default.nodes = 1
>>> set server resources_default.neednodes = 1
>>> set server resources_default.nodect = 1
>>> set server resources_default.nodes = 1
>>>
>>> Can someone please advise if I am missing anything here?
>>>
>>> Regards
>>> Govind
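For that conversation with the admin, a few standard Torque client commands show what the server thinks it can allocate (a sketch, assuming the stock Torque tools are on the PATH; "short" is the queue name taken from the script above):

# List the nodes the server knows about, with their state and np (slot) counts
pbsnodes -a

# Dump the queue and server attributes; compare against the resources_default
# and resources_max settings quoted earlier in the thread
qmgr -c "print queue short"
qmgr -c "print server"

If pbsnodes shows only one node online (or np set too low), or if the scheduler is packing all requested slots onto a single host, that would explain getting node47 four times despite requesting nodes=2:ppn=4.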