Hi Reuti,

I've been unable to reproduce the issue so far.
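For reference, here is a minimal sketch of the kind of jobscript used here, with your check (a) added. The solver invocation (./my_solver) and the exact memory_free request syntax are placeholders rather than our real submission; the PE name, slot count and orterun option are the ones from the thread below:

  #!/bin/sh
  #$ -N ompi_sge_check
  #$ -pe round_robin 8
  #$ -l memory_free=14G
  #$ -cwd

  # check (a): dump the environment seen by the jobscript and look for
  # $JOB_ID and the various $SGE_* variables
  env | grep -E 'JOB_ID|SGE_|PE_HOSTFILE|NSLOTS'

  # parallel run (./my_solver stands in for the actual binary)
  /opt/openmpi-1.3.3/bin/orterun --bynode ./my_solver

For check (b), the process ancestry on the slave nodes can be inspected while the job is running with something like:

  ps -eo pid,ppid,cmd --forest

to see whether the tasks hang off sge_shepherd or off a system sshd/rshd.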
Sorry for the inconvenience,
Eloi

On Tuesday 25 May 2010 11:32:44 Reuti wrote:
> Hi,
>
> On 25.05.2010 at 09:14, Eloi Gaudry wrote:
> > I do not reset any environment variable during job submission or job
> > handling. Is there a simple way to check that openmpi is working as
> > expected with SGE tight integration (e.g., displaying environment
> > variables, setting options on the command line, etc.)?
>
> a) put a command:
>
> env
>
> in the jobscript and check the output for $JOB_ID and various $SGE_*
> variables.
>
> b) to confirm the misbehavior: are the tasks on the slave nodes kids of
> sge_shepherd or of any system sshd/rshd?
>
> -- Reuti
>
> > Regards,
> > Eloi
> >
> > On Friday 21 May 2010 17:35:24 Reuti wrote:
> >> Hi,
> >>
> >> On 21.05.2010 at 17:19, Eloi Gaudry wrote:
> >>> Hi Reuti,
> >>>
> >>> Yes, the openmpi binaries used were built after having used the
> >>> --with-sge option during configure, and we only use those binaries
> >>> on our cluster.
> >>>
> >>> [eg@moe:~]$ /opt/openmpi-1.3.3/bin/ompi_info
> >>>
> >>>          MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.3.3)
> >>
> >> ok. As you have Tight Integration as the goal and set "control_slaves
> >> TRUE" in your PE, SGE wouldn't allow `qrsh -inherit ...` to nodes
> >> which are not in the list of granted nodes. So it looks like your job
> >> is running outside of this Tight Integration with its own `rsh` or
> >> `ssh`.
> >>
> >> Do you reset $JOB_ID or other environment variables in your jobscript,
> >> which could trigger Open MPI to assume that it's not running inside SGE?
> >>
> >> -- Reuti
> >>
> >>> On Friday 21 May 2010 16:01:54 Reuti wrote:
> >>>> Hi,
> >>>>
> >>>> On 21.05.2010 at 14:11, Eloi Gaudry wrote:
> >>>>> Hi there,
> >>>>>
> >>>>> I'm observing something strange on our cluster managed by SGE 6.2u4
> >>>>> when launching a parallel computation on several nodes, using the
> >>>>> OpenMPI/SGE tight-integration mode (OpenMPI-1.3.3). It seems that
> >>>>> the SGE-allocated slots are not used by OpenMPI, as if OpenMPI were
> >>>>> doing its own round-robin allocation based on the allocated node
> >>>>> hostnames.
> >>>>
> >>>> you compiled Open MPI with --with-sge (and recompiled your
> >>>> applications)? You are using the correct mpiexec?
> >>>>
> >>>> -- Reuti
> >>>>
> >>>>> Here is what I'm doing:
> >>>>> - launch a parallel computation involving 8 processors, each of
> >>>>>   them using 14GB of memory. I'm using a qsub command where I
> >>>>>   request the memory_free resource and use tight integration with
> >>>>>   openmpi
> >>>>> - 3 servers are available:
> >>>>>   . barney with 4 cores (4 slots) and 32GB
> >>>>>   . carl with 4 cores (4 slots) and 32GB
> >>>>>   . charlie with 8 cores (8 slots) and 64GB
> >>>>>
> >>>>> Here is the output of the allocated nodes (OpenMPI output):
> >>>>> ======================  ALLOCATED NODES  ======================
> >>>>>
> >>>>> Data for node: Name: charlie     Launch id: -1  Arch: ffc91200  State: 2
> >>>>>
> >>>>>   Daemon: [[44332,0],0]  Daemon launched: True
> >>>>>   Num slots: 4           Slots in use: 0
> >>>>>   Num slots allocated: 4 Max slots: 0
> >>>>>   Username on node: NULL
> >>>>>   Num procs: 0           Next node_rank: 0
> >>>>>
> >>>>> Data for node: Name: carl.fft    Launch id: -1  Arch: 0         State: 2
> >>>>>
> >>>>>   Daemon: Not defined    Daemon launched: False
> >>>>>   Num slots: 2           Slots in use: 0
> >>>>>   Num slots allocated: 2 Max slots: 0
> >>>>>   Username on node: NULL
> >>>>>   Num procs: 0           Next node_rank: 0
> >>>>>
> >>>>> Data for node: Name: barney.fft  Launch id: -1  Arch: 0         State: 2
> >>>>>
> >>>>>   Daemon: Not defined    Daemon launched: False
> >>>>>   Num slots: 2           Slots in use: 0
> >>>>>   Num slots allocated: 2 Max slots: 0
> >>>>>   Username on node: NULL
> >>>>>   Num procs: 0           Next node_rank: 0
> >>>>>
> >>>>> =================================================================
> >>>>>
> >>>>> Here is what I see when my computation is running on the cluster:
> >>>>> # rank   pid     hostname
> >>>>>
> >>>>>   0      28112   charlie
> >>>>>   1      11417   carl
> >>>>>   2      11808   barney
> >>>>>   3      28113   charlie
> >>>>>   4      11418   carl
> >>>>>   5      11809   barney
> >>>>>   6      28114   charlie
> >>>>>   7      11419   carl
> >>>>>
> >>>>> Note that the parallel environment used under SGE is defined as:
> >>>>> [eg@moe:~]$ qconf -sp round_robin
> >>>>> pe_name            round_robin
> >>>>> slots              32
> >>>>> user_lists         NONE
> >>>>> xuser_lists        NONE
> >>>>> start_proc_args    /bin/true
> >>>>> stop_proc_args     /bin/true
> >>>>> allocation_rule    $round_robin
> >>>>> control_slaves     TRUE
> >>>>> job_is_first_task  FALSE
> >>>>> urgency_slots      min
> >>>>> accounting_summary FALSE
> >>>>>
> >>>>> I'm wondering why OpenMPI didn't use the allocated nodes chosen by
> >>>>> SGE (cf. the "ALLOCATED NODES" report) but instead placed the
> >>>>> processes of the parallel computation one at a time, using a
> >>>>> round-robin method.
> >>>>>
> >>>>> Note that I'm using the '--bynode' option on the orterun command
> >>>>> line. If the behavior I'm observing is simply the consequence of
> >>>>> using this option, please let me know. This would essentially mean
> >>>>> that SGE tight-integration has a lower priority on orterun behavior
> >>>>> than the command-line options.
> >>>>>
> >>>>> Any help would be appreciated,
> >>>>> Thanks,
> >>>>> Eloi
> >>>>
> >>>> _______________________________________________
> >>>> users mailing list
> >>>> us...@open-mpi.org
> >>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Eloi Gaudry

Free Field Technologies
Axis Park Louvain-la-Neuve
Rue Emile Francqui, 1
B-1435 Mont-Saint Guibert
BELGIUM

Company Phone: +32 10 487 959
Company Fax:   +32 10 454 626