On 12.11.2012 at 20:54, Guillermo Marco Puche wrote:

> Hello,
>
> It seems that the program I need to use needs mpi as the PE. I've tested
> with orte and it just runs the same process 16 times.
>
> I'm trying the same submit job script but changing "-pe orte 16" to
> "-pe mpi 16".
>
> And getting the following error:
>
> /opt/gridengine/default/spool/compute-0-1/active_jobs/102.1/pe_hostfile
> compute-0-1
> compute-0-1
> compute-0-1
> compute-0-1
> compute-0-1
> compute-0-1
> compute-0-1
> compute-0-1
> compute-0-0
> compute-0-0
> compute-0-0
> compute-0-0
> compute-0-0
> compute-0-0
> compute-0-0
> compute-0-0
> rm: cannot remove `/tmp/102.1.all.q/rsh': No such file or directory
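For loosely integrated PEs that need an explicit machinefile (as noted later in this thread), the pe_hostfile above already lists the granted hosts. A minimal sketch of turning it into a machinefile — the sample file is recreated locally here purely for illustration; a real job would read $PE_HOSTFILE instead:

```shell
# Recreate a small sample pe_hostfile with one line per granted slot,
# mirroring the dump above (purely illustrative).
cat > pe_hostfile <<'EOF'
compute-0-1
compute-0-1
compute-0-0
compute-0-0
EOF

# A machinefile is just the host list, one entry per slot. The first
# column is taken from each line; when a slot-count column is present
# (the usual "host slots queue range" pe_hostfile layout), each host
# is repeated once per slot.
awk '{ n = (NF >= 2 ? $2 : 1); for (i = 0; i < n; i++) print $1 }' \
    pe_hostfile > machinefile

cat machinefile
```

The resulting machinefile can then be handed to an mpirun that does not query SGE itself.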
Either ignore it or remove the start_proc_args/stop_proc_args from the PE.
It tries to remove a file which is only created if start_proc_args is called
with -catch_rsh.

-- Reuti

> On 12/11/2012 12:18, Guillermo Marco Puche wrote:
>> Hello,
>>
>> I'm currently trying the following job script and then submitting it
>> with qsub. I don't know why it only uses the CPUs of one of my two
>> compute nodes; it's not using both. (compute-0-2 is currently a
>> powered-off node.)
>>
>> #!/bin/bash
>> #$ -S /bin/bash
>> #$ -V
>> ### name
>> #$ -N aln_left
>> ### work dir
>> #$ -cwd
>> ### outputs
>> #$ -j y
>> ### PE
>> #$ -pe orte 16
>> ### all.q
>> #$ -q all.q
>>
>> mpirun -np 16 pBWA aln -f aln_left \
>>     /data_in/references/genomes/human/hg19/bwa_ref/hg19.fa \
>>     /data_in/data/rawdata/HapMap_1.fastq \
>>     > /data_out_2/tmp/05_11_12/mpi/HapMap_cloud.left.sai
>>
>> Here's the all.q config:
>>
>> qname                 all.q
>> hostlist              @allhosts
>> seq_no                0
>> load_thresholds       np_load_avg=1.75
>> suspend_thresholds    NONE
>> nsuspend              1
>> suspend_interval      00:05:00
>> priority              0
>> min_cpu_interval      00:05:00
>> processors            UNDEFINED
>> qtype                 BATCH INTERACTIVE
>> ckpt_list             NONE
>> pe_list               make mpich mpi orte openmpi smp
>> rerun                 FALSE
>> slots                 0,[compute-0-0.local=8],[compute-0-1.local=8], \
>>                       [compute-0-2.local.sg=8]
>> tmpdir                /tmp
>> shell                 /bin/csh
>> prolog                NONE
>> epilog                NONE
>> shell_start_mode      posix_compliant
>> starter_method        NONE
>> suspend_method        NONE
>> resume_method         NONE
>> terminate_method      NONE
>> notify                00:00:60
>> owner_list            NONE
>> user_lists            NONE
>> xuser_lists           NONE
>> subordinate_list      NONE
>> complex_values        NONE
>> projects              NONE
>> xprojects             NONE
>> calendar              NONE
>> initial_state         default
>> s_rt                  INFINITY
>> h_rt                  INFINITY
>> s_cpu                 INFINITY
>> h_cpu                 INFINITY
>> s_fsize               INFINITY
>> h_fsize               INFINITY
>> s_data                INFINITY
>> h_data                INFINITY
>> s_stack               INFINITY
>> h_stack               INFINITY
>> s_core                INFINITY
>> h_core                INFINITY
>> s_rss                 INFINITY
>> h_rss                 INFINITY
>> s_vmem                INFINITY
>> h_vmem                INFINITY
>>
>> Best regards,
>> Guillermo.
>>
>> On 05/11/2012 12:01, Reuti wrote:
>>> Hi,
>>>
>>> On 05.11.2012 at 10:55, Guillermo Marco Puche wrote:
>>>
>>>> I've managed to compile Open MPI for Rocks:
>>>> ompi_info | grep grid
>>>>     MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.4.3)
>>>>
>>>> Now I'm really confused about how I should run my pBWA program with
>>>> Open MPI. The program website (http://pbwa.sourceforge.net/) suggests
>>>> something like:
>>>>
>>>> sqsub -q mpi -n 240 -r 1h --mpp 4G ./pBWA bla bla bla...
>>>
>>> Seems to be a local proprietary command on Sharcnet, or at least a
>>> wrapper around another, unknown queuing system.
>>>
>>>> I don't have sqsub, only the qsub provided by SGE. The "-q" option
>>>> isn't valid here, since in SGE it's for queue selection.
>>>
>>> Correct; the SGE paradigm is to request resources, and SGE will select
>>> an appropriate queue for your job which fulfills the requirements.
>>>
>>>> Maybe the solution is to create a simple job bash script and include
>>>> the parallel environment for SGE and the number of slots (since pBWA
>>>> internally supports Open MPI).
>>>
>>> What is the actual setup of your SGE? Most likely you will need to
>>> define a PE and request it during submission, like for any other Open
>>> MPI application:
>>>
>>> $ qsub -pe orte 240 -l h_rt=1:00:00,h_vmem=4G ./pBWA bla bla bla...
>>>
>>> Assuming "-n" gives the number of cores.
>>> Assuming "-r 1h" means wallclock time: -l h_rt=1:00:00
>>> Assuming "--mpp 4G" requests the memory per slot: -l h_vmem=4G
>>>
>>> Necessary setup:
>>>
>>> http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
>>>
>>> -- Reuti
>>>
>>>> Regards,
>>>> Guillermo.
>>>>
>>>> On 26/10/2012 12:21, Reuti wrote:
>>>>> On 26.10.2012 at 12:02, Guillermo Marco Puche wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> As I said, I'm using Rocks cluster 5.4.3 and it comes with mpirun
>>>>>> (Open MPI) 1.4.3.
>>>>>> But $ ompi_info | grep gridengine shows nothing.
>>>>>> So I'm confused whether I have to update and rebuild Open MPI to
>>>>>> the latest version.
>>>>>
>>>>> You can also remove the supplied version 1.4.3 from your system and
>>>>> build it from source with SGE support. But I don't see the advantage
>>>>> of using an old version. Does ROCKS supply the source of its used
>>>>> version of Open MPI?
>>>>>
>>>>>> Or whether I can keep the current version of MPI and re-build it
>>>>>> (that would be the preferred option, to keep the stability of the
>>>>>> cluster).
>>>>>
>>>>> If you compile and install only in your own $HOME (as a normal user,
>>>>> no root access necessary), then there is no impact on any system
>>>>> tool at all. You just have to take care which version you use by
>>>>> setting the correct $PATH and $LD_LIBRARY_PATH during compilation of
>>>>> your application and during execution of it. This is why I suggested
>>>>> including the name of the used compiler and Open MPI version in the
>>>>> build installation's directory name.
>>>>>
>>>>> There was just a question on the MPICH2 mailing list about which
>>>>> version of `mpiexec` to use; maybe it's additional info:
>>>>>
>>>>> http://lists.mcs.anl.gov/pipermail/mpich-discuss/2012-October/013318.html
>>>>>
>>>>> -- Reuti
>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Best regards,
>>>>>> Guillermo.
>>>>>>
>>>>>> On 26/10/2012 11:59, Reuti wrote:
>>>>>>> On 26.10.2012 at 09:40, Guillermo Marco Puche wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Thank you for the links, Reuti!
>>>>>>>>
>>>>>>>> When they talk about:
>>>>>>>>
>>>>>>>> shell $ ./configure --with-sge
>>>>>>>>
>>>>>>>> Is that in the bash shell or in some other special shell?
>>>>>>>
>>>>>>> There is no special shell required (please have a look at the
>>>>>>> INSTALL file in Open MPI's tar archive).
>>>>>>>
>>>>>>>> Do I have to be in a specific directory to execute that command?
>>>>>>>
>>>>>>> Depends.
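The $PATH/$LD_LIBRARY_PATH advice above can be sketched as a small shell fragment. The install prefix below is an assumption, following the naming scheme suggested elsewhere in this thread; adjust it to the actual home-built install:

```shell
# Sketch: select a home-built Open MPI for both compiling and running.
# MPI_PREFIX is a hypothetical example location, not a Rocks default.
MPI_PREFIX="$HOME/local/openmpi-1.6.2_gcc"

export PATH="$MPI_PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$MPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Sanity check: the first PATH entry should now be the private build.
echo "${PATH%%:*}"
```

The same exports belong in the job script (or must survive via `qsub -V`), so that the compute nodes resolve the same `mpiexec` and libraries as the submit host.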
>>>>>>> As it's set up according to the
>>>>>>> http://en.wikipedia.org/wiki/GNU_build_system, you can either:
>>>>>>>
>>>>>>> $ tar -xf openmpi-1.6.2.tar.gz
>>>>>>> $ cd openmpi-1.6.2
>>>>>>> $ ./configure --prefix=$HOME/local/openmpi-1.6.2_gcc --with-sge
>>>>>>> $ make
>>>>>>> $ make install
>>>>>>>
>>>>>>> It's quite common to build inside the source tree. But if it is
>>>>>>> set up in the right way, it also supports building in different
>>>>>>> directories inside or outside the source tree, which avoids a
>>>>>>> `make distclean` in case you want to generate different builds:
>>>>>>>
>>>>>>> $ tar -xf openmpi-1.6.2.tar.gz
>>>>>>> $ mkdir openmpi-gcc
>>>>>>> $ cd openmpi-gcc
>>>>>>> $ ../openmpi-1.6.2/configure --prefix=$HOME/local/openmpi-1.6.2_gcc --with-sge
>>>>>>> $ make
>>>>>>> $ make install
>>>>>>>
>>>>>>> while at the same time, in another window, you can execute:
>>>>>>>
>>>>>>> $ mkdir openmpi-intel
>>>>>>> $ cd openmpi-intel
>>>>>>> $ ../openmpi-1.6.2/configure --prefix=$HOME/local/openmpi-1.6.2_intel CC=icc CXX=icpc FC=ifort F77=ifort --disable-vt --with-sge
>>>>>>> $ make
>>>>>>> $ make install
>>>>>>>
>>>>>>> (Not to confuse anyone: there is a bug in the combination of the
>>>>>>> Intel compiler and GNU headers with the above version of Open MPI;
>>>>>>> disabling VampirTrace support helps.)
>>>>>>>
>>>>>>> -- Reuti
>>>>>>>
>>>>>>>> Thank you!
>>>>>>>> Sorry again for my ignorance.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Guillermo.
>>>>>>>>
>>>>>>>> On 25/10/2012 19:50, Reuti wrote:
>>>>>>>>> On 25.10.2012 at 19:36, Guillermo Marco Puche wrote:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I've no idea who compiled the application. I just found on the
>>>>>>>>>> seqanswers forum that pBWA was a nice speed-up over the
>>>>>>>>>> original BWA, since it supports native Open MPI.
>>>>>>>>>>
>>>>>>>>>> As you told me, I'll look further into how to compile Open MPI
>>>>>>>>>> with SGE.
>>>>>>>>>> If anyone knows a good introduction/tutorial for this, it
>>>>>>>>>> would be appreciated.
>>>>>>>>>
>>>>>>>>> The Open MPI site has extensive documentation:
>>>>>>>>>
>>>>>>>>> http://www.open-mpi.org/faq/?category=building#build-rte-sge
>>>>>>>>> http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
>>>>>>>>>
>>>>>>>>> Be sure that during execution you pick the correct `mpiexec` and
>>>>>>>>> LD_LIBRARY_PATH from your own build. You can also adjust the
>>>>>>>>> location of Open MPI with the usual --prefix. I put it in
>>>>>>>>> --prefix=$HOME/local/openmpi-1.6.2_shared_gcc, reflecting the
>>>>>>>>> version I built.
>>>>>>>>>
>>>>>>>>> -- Reuti
>>>>>>>>>
>>>>>>>>>> Then I'll try to run it with my current version of Open MPI and
>>>>>>>>>> update if needed.
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Guillermo.
>>>>>>>>>>
>>>>>>>>>> On 25/10/2012 18:53, Reuti wrote:
>>>>>>>>>>> Please keep the list posted, so that others can participate in
>>>>>>>>>>> the discussion. I'm not aware of this application, but maybe
>>>>>>>>>>> someone else on the list could be of broader help.
>>>>>>>>>>>
>>>>>>>>>>> Again: who compiled the application? I can see only the source
>>>>>>>>>>> at the site you posted.
>>>>>>>>>>>
>>>>>>>>>>> -- Reuti
>>>>>>>>>>>
>>>>>>>>>>> On 25.10.2012 at 13:23, Guillermo Marco Puche wrote:
>>>>>>>>>>>
>>>>>>>>>>>> $ ompi_info | grep grid
>>>>>>>>>>>>
>>>>>>>>>>>> Returns nothing. Like I said, I'm a newbie to MPI.
>>>>>>>>>>>> I didn't know that I had to compile anything. I've got a Rocks
>>>>>>>>>>>> installation out of the box, so MPI is installed but nothing
>>>>>>>>>>>> more, I guess.
>>>>>>>>>>>> I've found an old thread in the Rocks discussion list:
>>>>>>>>>>>>
>>>>>>>>>>>> https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2012-April/057303.html
>>>>>>>>>>>>
>>>>>>>>>>>> The user asking there is using this script:
>>>>>>>>>>>>
>>>>>>>>>>>> #$ -S /bin/bash
>>>>>>>>>>>> #
>>>>>>>>>>>> #
>>>>>>>>>>>> # Export all environment variables
>>>>>>>>>>>> #$ -V
>>>>>>>>>>>> # specify the PE and core #
>>>>>>>>>>>> #$ -pe mpi 128
>>>>>>>>>>>> # Customize job name
>>>>>>>>>>>> #$ -N job_hpl_2.0
>>>>>>>>>>>> # Use current working directory
>>>>>>>>>>>> #$ -cwd
>>>>>>>>>>>> # Join stdout and stderr into one file
>>>>>>>>>>>> #$ -j y
>>>>>>>>>>>> # The mpirun command; note the lack of host names, as SGE
>>>>>>>>>>>> # will provide them on-the-fly.
>>>>>>>>>>>> mpirun -np $NSLOTS ./xhpl >> xhpl.out
>>>>>>>>>>>>
>>>>>>>>>>>> But then I read this:
>>>>>>>>>>>>
>>>>>>>>>>>> in rocks sge PE
>>>>>>>>>>>> mpi is loosely integrated
>>>>>>>>>>>> mpich and orte are tightly integrated
>>>>>>>>>>>> required qsub args are different for mpi and mpich than for orte
>>>>>>>>>>>>
>>>>>>>>>>>> mpi and mpich need a machinefile
>>>>>>>>>>>>
>>>>>>>>>>>> by default
>>>>>>>>>>>> mpi, mpich are for mpich2
>>>>>>>>>>>> orte is for openmpi
>>>>>>>>>>>> regards
>>>>>>>>>>>> -LT
>>>>>>>>>>>>
>>>>>>>>>>>> The program I need to run is pBWA:
>>>>>>>>>>>> http://pbwa.sourceforge.net/
>>>>>>>>>>>>
>>>>>>>>>>>> It uses MPI.
>>>>>>>>>>>>
>>>>>>>>>>>> At this moment I'm kind of confused about the next step. I
>>>>>>>>>>>> thought I could just run pBWA with MPI and a simple SGE job,
>>>>>>>>>>>> with multiple processes.
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Guillermo.
>>>>>>>>>>>>
>>>>>>>>>>>> On 25/10/2012 13:17, Reuti wrote:
>>>>>>>>>>>>> On 25.10.2012 at 13:11, Guillermo Marco Puche wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello Reuti,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm stuck here. I've no idea what MPI library I've got. I'm
>>>>>>>>>>>>>> using Rocks Cluster Viper 5.4.3, which ships with CentOS
>>>>>>>>>>>>>> 5.6, SGE, SPM, Open MPI and MPI.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> How can I check which library I have installed?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I found this:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> $ mpirun -V
>>>>>>>>>>>>>> mpirun (Open MPI) 1.4.3
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Report bugs to http://www.open-mpi.org/community/help/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Good, and is this also the one you used to compile the
>>>>>>>>>>>>> application?
>>>>>>>>>>>>>
>>>>>>>>>>>>> To check whether Open MPI was built with SGE support:
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ ompi_info | grep grid
>>>>>>>>>>>>>     MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.6.2)
>>>>>>>>>>>>>
>>>>>>>>>>>>> -- Reuti
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>> Guillermo.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 25/10/2012 13:05, Reuti wrote:
>>>>>>>>>>>>>>> On 25.10.2012 at 10:37, Guillermo Marco Puche wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I found a new version of my tool which supports
>>>>>>>>>>>>>>>> multi-threading but also MPI or Open MPI for additional
>>>>>>>>>>>>>>>> processes.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm kind of new to MPI with SGE. What would be the right
>>>>>>>>>>>>>>>> command for qsub, or config inside a job file, to ask SGE
>>>>>>>>>>>>>>>> to work with 2 MPI processes?
>>>>>>>>>>>>>>>> Will the following code work in an SGE job file?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> #$ -pe mpi 2
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> That's supposed to make the job work with 2 processes
>>>>>>>>>>>>>>>> instead of 1.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Not out of the box: it will grant 2 slots for the job
>>>>>>>>>>>>>>> according to the allocation rules of the PE. But how to
>>>>>>>>>>>>>>> start your application inside the granted allocation in
>>>>>>>>>>>>>>> the jobscript is up to you. Fortunately, the MPI libraries
>>>>>>>>>>>>>>> nowadays get an (almost) automatic integration into
>>>>>>>>>>>>>>> queuing systems without further user intervention.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Which MPI library of the ones mentioned above do you use
>>>>>>>>>>>>>>> when you compile your application?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -- Reuti
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>> Guillermo.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 22/10/2012 17:19, Reuti wrote:
>>>>>>>>>>>>>>>>> On 22.10.2012 at 16:31, Guillermo Marco Puche wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm using a program where I can specify the number of
>>>>>>>>>>>>>>>>>> threads I want to use.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Only threads and not additional processes? Then you are
>>>>>>>>>>>>>>>>> limited to one node, unless you add something like
>>>>>>>>>>>>>>>>> http://www.kerrighed.org/wiki/index.php/Main_Page or
>>>>>>>>>>>>>>>>> http://www.scalemp.com to get a cluster-wide unique
>>>>>>>>>>>>>>>>> process and memory space.
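The "grant slots, then start the application yourself" model above can be illustrated with a minimal job script. This is a sketch only: the script and job names and the binary path are hypothetical placeholders, and with an SGE-aware Open MPI build, mpirun reads the granted allocation itself, so `-np $NSLOTS` merely makes the count explicit. The script is written to a local file here so the directives can be shown in one piece:

```shell
# Sketch of a minimal SGE job script for a tightly integrated Open MPI.
# mpi_job.sh, the -N name, and ./my_mpi_program are hypothetical.
cat > mpi_job.sh <<'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -N mpi_example
#$ -cwd
#$ -j y
#$ -pe orte 2
# With tight integration, mpirun picks up the granted hosts and slot
# counts from SGE; -np $NSLOTS just states the slot count explicitly.
mpirun -np $NSLOTS ./my_mpi_program
EOF
chmod +x mpi_job.sh

# Count the SGE directive lines as a quick sanity check.
grep -c '^#\$' mpi_job.sh
```

Submitted with `qsub mpi_job.sh`, the `-pe orte 2` line plays the role of the `#$ -pe mpi 2` directive asked about above, just against the tightly integrated PE.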
>>>>>>>>>>>>>>>>> -- Reuti
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm able to launch multiple instances of that tool on
>>>>>>>>>>>>>>>>>> separate nodes. For example: job_process_00 on
>>>>>>>>>>>>>>>>>> compute-0-0, job_process_01 on compute-1, etc. Each job
>>>>>>>>>>>>>>>>>> calls that program, which splits into 8 threads (each
>>>>>>>>>>>>>>>>>> of my nodes has 8 CPUs).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> When I set up 16 threads, I can't split them 8 per
>>>>>>>>>>>>>>>>>> node, but I would like to split them between 2 compute
>>>>>>>>>>>>>>>>>> nodes.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Currently I have 4 compute nodes, and I would like to
>>>>>>>>>>>>>>>>>> speed up the process by running 16 threads of my
>>>>>>>>>>>>>>>>>> program split across more than one compute node. At
>>>>>>>>>>>>>>>>>> this moment I'm stuck using only 1 compute node per
>>>>>>>>>>>>>>>>>> process, with 8 threads.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>> Guillermo.
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
