On 14.03.2012, at 17:44, Ralph Castain wrote:

> Hi Reuti
>
> I appreciate your help on this thread - I confess I'm puzzled by it. As you know, OMPI doesn't use SGE to launch the individual processes, nor does SGE even know they exist. All SGE is used for is to launch the OMPI daemons (orteds). This is done as a single qrsh call, so won't all the daemons wind up being executed against the same queue regardless of how many queues exist in the system?
Yes, per machine they will then start in one queue (the one the first and only `qrsh -inherit ...` call is assigned to). But between machines they can get different queues. I would also assume that this is not relevant to Open MPI itself. You could call it a cosmetic flaw, but it's worth noting, as some applications expect the same $TMPDIR to be present on all machines under exactly the same name, and this can't be guaranteed if different queues were used for a job.

> Given that the daemons then fork/exec the MPI processes (outside of qrsh), I would think they would inherit that nice setting as well, and so all the procs will be running at the same nice level too.
>
> As for TMPDIR, we don't forward that unless specifically directed to do so, which I didn't see on their cmd line.

The SGE integration of Open MPI will forward all variables from the master task to all nodes by the -V option supplied in the Open MPI source. But for TMPDIR this does no harm, as SGE will override it again with the real $TMPDIR according to the selected queue on each particular slave machine. If the application then overrides this once more by distributing a variable of its own to the slaves, it can fail because the expected $TMPDIR isn't there. As said: maybe it's unrelated to the issue. I just tested with two different queues on two machines and a small mpihello, and it works as expected (a sketch of how to check the per-node $TMPDIR, and of the one-PE-per-queue setup mentioned below, is appended at the end of this mail).

Joshua: is the CentOS 6 installation the same on all nodes, and did you recompile the application with the current version of the library? By "threads" do you refer to "processes"?

-- Reuti

> On Mar 14, 2012, at 2:33 AM, Reuti wrote:
>
>> Hi,
>>
>> On 14.03.2012, at 04:02, Joshua Baker-LePain wrote:
>>
>>> On Tue, 13 Mar 2012 at 5:31pm, Ralph Castain wrote
>>>
>>>> FWIW: I have a Centos6 system myself, and I have no problems running OMPI on it (1.4 or 1.5). I can try building it the same way you do and see what happens.
>>>
>>> I can run as many threads as I like on a single system with no problems, even if those threads are running at different nice levels.
>>
>> How do they get different nice levels - do you renice them? I would assume that all start at the same level as the parent. In the test program you posted there are no threads.
>>
>>> The problem seems to arise when I'm both a) running across multiple machines and b) running threads at differing nice levels (which often happens as a result of our queueing setup).
>>
>> This sounds like you are getting slots from different queues assigned to one and the same job. My experience: don't do it unless you need it. The problem is that SGE can't decide, for its `qrsh -inherit ...` call, which queue is the correct one for that particular call. As a result, all calls to a slave machine can end up in one and the same queue. Although this is not correct, it won't oversubscribe the node, as the overall slot count is usually limited already; it's more a matter of the names SGE sets in the environment of the job:
>>
>> https://arc.liv.ac.uk/trac/SGE/ticket/813
>>
>> As a result, the SGE-set $TMPDIR can differ between the master of the parallel job and a slave, as the name of the queue is part of $TMPDIR. When a wrong $TMPDIR is set on a node (by Open MPI's forwarding?), strange things can happen depending on the application.
>>
>> Do you face the same if you stay in one and the same queue across the machines?
>> If you want to limit the number of available PEs in your setup for the user, you can request a PE by a wildcard; once a PE is selected, SGE will stay within this PE. Attaching each PE to only one queue then avoids mixing slots from different queues (orte1 PE => all.q, orte2 PE => extra.q, and you request orte*).
>>
>> -- Reuti
>>
>>> I can't guarantee that the problem *never* happens when I run across multiple machines with all the threads un-niced, but I haven't been able to reproduce that at will like I can for the other case.
>>>
>>> --
>>> Joshua Baker-LePain
>>> QB3 Shared Cluster Sysadmin
>>> UCSF
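
P.S.: A minimal sketch of how one could check the per-node $TMPDIR inside a parallel job, assuming a tight integration PE (control_slaves TRUE, job_is_first_task FALSE) which is here just called "orte"; the PE name and the script name check_tmpdir.sh are only placeholders, submit it e.g. with `qsub check_tmpdir.sh`:

#!/bin/sh
#$ -pe orte 4
#$ -cwd
# print the queue-dependent $TMPDIR of the master task ...
printf "%s (master): %s\n" "`hostname`" "$TMPDIR"
# ... and of a task started on every host/queue entry granted to the job
# (the master host is listed in $PE_HOSTFILE as well). Each `qrsh -inherit`
# task gets the $TMPDIR of whichever queue SGE assigns it to on that host.
for node in `awk '{print $1}' $PE_HOSTFILE`; do
    printf "%s: " "$node"
    qrsh -inherit $node env | grep '^TMPDIR'
done

As the queue name is part of $TMPDIR, the output shows directly whether the master and the slaves ended up in the same queue.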
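
And a sketch of the one-queue-per-PE setup from the quoted part, assuming the queue names all.q/extra.q and the PE names orte1/orte2 used there as examples, with both PEs already created with the usual tight integration settings (control_slaves TRUE, job_is_first_task FALSE); job.sh stands for the actual job script:

# attach each PE to exactly one queue
$ qconf -aattr queue pe_list orte1 all.q
$ qconf -aattr queue pe_list orte2 extra.q

# request the PE by a wildcard: SGE selects one matching PE and stays in it,
# so all slots of the job come from the single queue that PE is attached to
$ qsub -pe "orte*" 16 job.sh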