On 12.10.2010, at 15:49, Dave Love wrote:

> Chris Jewell <chris.jew...@warwick.ac.uk> writes:
>
>> I've scrapped this system now in favour of the new SGE core binding feature.
>
> How does that work, exactly? I thought the OMPI SGE integration didn't
> support core binding, but good if it does.
With binding_instance set to "set" (the default), the shepherd should already bind the processes to cores. With the other types of binding_instance, the selected cores must be forwarded to the application via an environment variable or in the hostfile. As the binding is only a hint to SGE and not a hard request, the user must plan the allocation a bit beforehand; in particular, it won't work if you oversubscribe a machine.

When I look at /proc/*/status, the applied binding shows up there, and it's also noted in the "config" file of each job's .../active_jobs/... directory. E.g. top shows:

 9926 ms04      39  19  3756  292  228 R   25  0.0   0:19.31 ever
 9927 ms04      39  19  3756  292  228 R   25  0.0   0:19.31 ever
 9925 ms04      39  19  3756  288  228 R   25  0.0   0:19.30 ever
 9928 ms04      39  19  3756  292  228 R   25  0.0   0:19.30 ever

for 4 forks of an endless loop in one and the same job script, submitted with `qsub -binding linear:1 demo.sh`.

Well, the funny thing is that with this kernel version I still get a load of 4, despite the fact that all 4 forks are bound to one core. Should it really be four?

-- Reuti

> ______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
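P.S. For anyone who wants to reproduce this: a minimal job script along the lines of the demo.sh above could look like the sketch below. Only the 4 forks of an endless loop and the `-binding linear:1` submission are what I actually described; the exact loop construct and the Cpus_allowed_list check are just illustrative.

    #!/bin/sh
    #$ -N demo
    #$ -cwd

    # Start 4 forks of an endless loop; with "-binding linear:1" the
    # shepherd binds the job shell to one core and the forks inherit it.
    for i in 1 2 3 4; do
        while :; do :; done &
    done

    # Illustrative check: show the core mask applied to this job shell.
    grep Cpus_allowed_list /proc/$$/status

    wait

Submitted as before with:

    qsub -binding linear:1 demo.sh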