Re: [OMPI users] signal handling

2007-03-13 Thread Ralph Castain
I've been letting this rattle around in my head some more, and *may* have come up with an idea of what *might* be going on. In the GE environment, qsub only launches the daemons - the daemons are the ones that actually "launch" your local application processes. If qsub -notify uses qsub's knowledg

[OMPI users] Orted freezes on launch of application

2007-03-13 Thread David Minor
Hi, I'm an MPICH2 user trying out openmpi. I'm running a 1G network under Red Hat 9, but using the g++ 3.4.3 compiler. Openmpi compiled and installed fine but none of my applications that run under MPICH2 will run. I decided to go backwards and try to run a non-mpi application like /bin/ps, same

[OMPI users] Fun with threading

2007-03-13 Thread Mike Houston
At least with 1.1.4, I'm having a heck of a time with enabling multi-threading. Configuring with --with-threads=posix --enable-mpi-threads --enable-progress-threads leads to mpirun just hanging, even when not launching MPI apps, i.e. mpirun -np 1 hostname, and I can't ctrl-c to kill it, I have

Re: [OMPI users] Fun with threading

2007-03-13 Thread David Minor
Sounds like bad news about the threading. That's probably what's hanging me as well. We're running clusters of multi-core SMPs; our app NEEDS multi-threading. It'd be nice to get an "official" reply on this from someone on the dev team. -David -Original Message- From: users-boun...@ope

Re: [OMPI users] signal handling

2007-03-13 Thread Reuti
On 12.03.2007 at 21:29, Ralph Castain wrote: On 3/12/07 2:18 PM, "Reuti" wrote: On 12.03.2007 at 20:36, Ralph Castain wrote: ORTE propagates the signal to the application processes, but the ORTE daemons never actually look at the signal themselves (looks just like a message to them). So

Re: [OMPI users] signal handling (part 2)

2007-03-13 Thread Reuti
On 12.03.2007 at 21:29, Ralph Castain wrote: But now we are going beyond Mark's initial problem. Back to the initial problem: suspending a parallel job in SGE leads to: 19924 1786 19924 S \_ sge_shepherd-45250 -bg 19926 19924 19926 Ts| \_ /bin/sh /var/spool/sge/node39/ job_script

Re: [OMPI users] Orted freezes on launch of application

2007-03-13 Thread Ralph H Castain
Hi David I think your tar file didn't get attached - at least, it didn't reach me. Can you please send it again? Thanks Ralph On 3/13/07 1:00 AM, "David Minor" wrote: > Hi, > I'm an MPICH2 user trying out openmpi. I'm running a 1G network under Red Hat > 9, but using the g++ 3.4.3 compiler. O

Re: [OMPI users] MPI_Comm_Spawn

2007-03-13 Thread Ralph H Castain
I was informed yesterday that we will not be doing any more bug fixes in the 1.1 series beyond what is in the soon-to-be-released 1.1.5. So I've been asked to confine any "fix" activity to the 1.2 series about to be released. Unfortunately, 1.1.5 won't solve the problem you noted. Tim tells me tha

Re: [OMPI users] signal handling (part 2)

2007-03-13 Thread Olesen, Mark
Hi Reuti (and others), > And now the odd thing: the jobscript (with the mpirun) is gone on the > head node of this parallel job, but all the spawned qrsh processes > are still there: I'm glad that someone else can almost reproduce my problem. On the suspicion that my application was not ignoring

Re: [OMPI users] Orted freezes on launch of application

2007-03-13 Thread David Minor
with tar From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph H Castain Sent: Tuesday, March 13, 2007 3:25 PM To: Open MPI Users Subject: Re: [OMPI users] Orted freezes on launch of application Hi David I think your tar

Re: [OMPI users] signal handling

2007-03-13 Thread Reuti
On 13.03.2007 at 06:01, Ralph Castain wrote: I've been letting this rattle around in my head some more, and *may* have come up with an idea of what *might* be going on. In the GE environment, qsub only launches the daemons - the daemons are the ones that actually "launch" your local appli

Re: [OMPI users] LSF & OpenMPI

2007-03-13 Thread Renato Golin
On 12/03/07, Ralph Castain wrote: I have been asked about providing native LSF support and hope to get to that in the not-too-distant future, but have no access to an LSF machine to verify operation (I may have a cooperative user, though, who will test for me - I would welcome another!). Hi Ra

Re: [OMPI users] Error in MPI_Unpack --- MPI_ERR_TRUNCATE: message truncated

2007-03-13 Thread Tim Mattox
Michael, Can you upgrade to a newer version of Open MPI? There have been several bugfix releases of the 1.1 series, and we are on the verge of releasing v1.2. So, please try either 1.1.4 (or 1.1.5rc1), and/or try v1.2rc3. On 3/12/07, Michael Epitropakis wrote: Dear ompi users, I am using Ope