[OMPI users] Try to submit OMPI job to SGE gives ERRORS (orte_plm_base_select failed & orte_ess_set_name failed)

2011-04-15 Thread Derrick LIN
Hi all, I am trying to set up a small SGE cluster with Open MPI integrated, but I am totally stuck when trying to submit an Open MPI job to the SGE PE. I mainly followed the guide sge-snow.pdf from Revolutions Computing and http://idolinux.blogspot.com/2010/04/quick-install-of-open-mpi-with-grid.html

Re: [OMPI users] Condor and MPI

2011-04-15 Thread Asad Ali
Hi Ralph, Thank you for your reply. On Fri, Apr 15, 2011 at 1:16 PM, Ralph Castain wrote: > Not much we can say with that little info. :-/ > > Are you using Open MPI? If so, what version? > Yes. The version is mpirun (Open MPI) 1.2.7rc2. > When you say the job gets restarted, do you mean t

Re: [OMPI users] Try to submit OMPI job to SGE gives ERRORS (orte_plm_base_select failed & orte_ess_set_name failed)

2011-04-15 Thread Reuti
Hi, On 15.04.2011 at 06:53, Derrick LIN wrote: > I am trying to set up a small SGE cluster with Open MPI integrated, but I am > totally stuck when trying to submit an Open MPI job to the SGE PE. > > I mainly followed the guide sge-snow.pdf from Revolutions Computing and > http://idolinux.blogspot.c

Re: [OMPI users] Condor and MPI

2011-04-15 Thread Reuti
Hi, On 15.04.2011 at 07:25, Asad Ali wrote: > > Yes. The entire job gets restarted. Maybe this is caused by a signal that Condor sends to the job, so that the job terminates and Condor then restarts it. Can you trap the signals in your application as a test? > If so, you had best tal
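The signal-trapping test suggested here can be sketched roughly as follows. This is only an illustration, not code from the thread: the thread does not say which language the application is written in or which signal Condor might send, so the sketch uses Python, and the names `received`, `log_signal` and `install_handlers` as well as the chosen list of signals are assumptions. It assumes a POSIX system; SIGKILL and SIGSTOP cannot be trapped at all.

```python
import os
import signal
import sys
import time

# Signal numbers seen so far, in arrival order (hypothetical helper state).
received = []


def log_signal(signum, frame):
    """Record and report a signal instead of letting it end the process."""
    received.append(signum)
    name = signal.Signals(signum).name
    stamp = time.strftime("%H:%M:%S")
    sys.stderr.write(f"[{stamp}] pid {os.getpid()} received {name}\n")
    sys.stderr.flush()


def install_handlers(signums=(signal.SIGTERM, signal.SIGINT, signal.SIGHUP,
                              signal.SIGUSR1, signal.SIGUSR2)):
    """Install log_signal for each listed signal (the list is an assumption)."""
    for signum in signums:
        signal.signal(signum, log_signal)
```

Calling `install_handlers()` once at start-up and then watching the job's stderr would show whether a catchable signal arrives shortly before the restart. If nothing is logged, the job is presumably being stopped some other way, for example by an untrappable signal.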

Re: [OMPI users] Condor and MPI

2011-04-15 Thread Ralph Castain
On Apr 15, 2011, at 2:59 AM, Reuti wrote: > Hi, > > On 15.04.2011 at 07:25, Asad Ali wrote: > >> >> Yes. The entire job gets restarted. > > maybe this is caused by a signal sent to the job by Condor, so that it gets > terminated and as a result Condor restarts it. Can you trap the signals

Re: [OMPI users] Try to submit OMPI job to SGE gives ERRORS (orte_plm_base_select failed & orte_ess_set_name failed) (Reuti)

2011-04-15 Thread Derrick LIN
> > - what is your SGE configuration `qconf -sconf`?

#global:
execd_spool_dir   /var/spool/gridengine/execd
mailer            /usr/bin/mail
xterm             /usr/bin/xterm
load_sensor       none
prolog            none
epilog

Re: [OMPI users] Try to submit OMPI job to SGE gives ERRORS (orte_plm_base_select failed & orte_ess_set_name failed) (Reuti)

2011-04-15 Thread Reuti
On 15.04.2011 at 23:02, Derrick LIN wrote:
> - what is your SGE configuration `qconf -sconf`?
>
>
> rlogin_daemon     /usr/sbin/sshd -i
> rlogin_command    /usr/bin/ssh
> qlogin_daemon     /usr/sbin/sshd -i
> qlogin_command    /usr/share/gridengine/q

[OMPI users] missing symbols in Windows 1.5.3 binaries?

2011-04-15 Thread Damien
Hiya, I just tested the 1.5.3 binaries and my link pass broke. Using 1.5.3 I get unresolved externals on things like _MPI_NULL_COPY_FN. On 1.5.2.2 it's fine. I did a dumpbin on libmpi.lib for both versions, and in 1.5.3 there are upper-case symbols for _OMPI_C_MPI_NULL_COPY_FN, but not _MPI_