OK, it works, although some temporary errors are reported. This is the NetBSD wip openmpi package as downloaded from the webCVS a couple of days ago, with two changes: my patches as detailed before (I have not yet tried comparing yours with mine), and the compilation and install of the VampirTrace stuff removed at the config stage, via the previously detailed change to the NetBSD package's Makefile.

% cat my_mpirun_job.sh
#!/bin/sh
#
#$ -wd /vol/grid/sgeusers/kingstlind/SGE-MPI
#$ -S /bin/sh
#
/usr/pkg/bin/mpirun -n $NSLOTS /vol/grid/sgeusers/kingstlind/SGE-MPI/hello_c

% qsub -pe kmbmpi 4 my_mpirun_job.sh

% qstat -f
kmbmp...@citron.ecs.vuw.ac.nz  BIP   0/1/1   0.02   nbsd-i386
 419972 0.60500 my_mpirun_ kingstlind   r   12/09/2009 13:10:39   1
-------------------------------------------------------------------------------
kmbmp...@kipp-cafe.ecs.vuw.ac. BIP   0/1/1   0.03   nbsd-i386
 419972 0.60500 my_mpirun_ kingstlind   r   12/09/2009 13:10:39   1
-------------------------------------------------------------------------------
kmbmp...@matterhorn.ecs.vuw.ac BIP   0/1/1   0.02   nbsd-i386
 419972 0.60500 my_mpirun_ kingstlind   r   12/09/2009 13:10:39   1
-------------------------------------------------------------------------------
kmbmp...@old-bailey.ecs.vuw.ac BIP   0/1/1   0.05   nbsd-i386
 419972 0.60500 my_mpirun_ kingstlind   r   12/09/2009 13:10:39   1

% ls -ltr
-rw-r--r--  1 kingstlind  grid    0 Dec  9 13:10 my_mpirun_job.sh.po419972
-rw-r--r--  1 kingstlind  grid    0 Dec  9 13:10 my_mpirun_job.sh.pe419972
-rw-r--r--  1 kingstlind  grid  207 Dec  9 13:10 my_mpirun_job.sh.o419972
-rw-r--r--  1 kingstlind  grid  615 Dec  9 13:10 my_mpirun_job.sh.e419972

% cat my_mpirun_job.sh.o419972
Hello world, I am 0 of 4 on kipp-cafe.ecs.vuw.ac.nz
Hello world, I am 2 of 4 on old-bailey.ecs.vuw.ac.nz
Hello world, I am 3 of 4 on matterhorn.ecs.vuw.ac.nz
Hello world, I am 1 of 4 on citron.ecs.vuw.ac.nz

% cat my_mpirun_job.sh.e419972
[kipp-cafe.ecs.vuw.ac.nz:02387] opal_sockaddr2str failed:Temporary failure in name resolution (return code 4)
[old-bailey.ecs.vuw.ac.nz:03279] opal_sockaddr2str failed:Temporary failure in name resolution (return code 4)
[matterhorn.ecs.vuw.ac.nz:02443] opal_sockaddr2str failed:Temporary failure in name resolution (return code 4)
[old-bailey.ecs.vuw.ac.nz:03279] opal_sockaddr2str failed:Unknown error (return code 4)
[matterhorn.ecs.vuw.ac.nz:02443] opal_sockaddr2str failed:Unknown error (return code 4)
[citron.ecs.vuw.ac.nz:02011] opal_sockaddr2str failed:Temporary failure in name resolution (return code 4)

Oddly enough, those were the non-fatal errors I was seeing for a single-machine MPI job that got me started on all this, and so the wheel has seemingly come full circle, albeit having moved forward by a circumference's length!

But anyroad, by my reckoning, an OpenMPI job is running, under SGE, on NetBSD.

Just need to tidy up the loose ends and patch for OpenMPI 1.4, which I see is just out.

Kevin

--
Kevin M. Buckley                    Room:  CO327
School of Engineering and           Phone: +64 4 463 5971
 Computer Science
Victoria University of Wellington
New Zealand