Owen --
Sorry, we all fell [way] behind on e-mail because many of us were at an
OMPI developer's meeting last week. :-(
In the interim, we have finally released Open MPI v1.1. Could you give
this version a whirl and see if it fixes your problems?
> -----Original Message-----
> From: users-bo
I'm currently working with Owen on this issue.. will continue my
findings on list..
- Galen
On Jun 29, 2006, at 7:56 AM, Jeff Squyres (jsquyres) wrote:
Owen --
Sorry, we all fell [way] behind on e-mail because many of us were at an
OMPI developer's meeting last week. :-(
In the interim, we have finally released Open MPI v1.1. Could you give
this version a whirl and see if it fixes your problems?
Jens --
I'm trolling through old e-mails on this list and it doesn't look like
you ever got an answer to this message.
Did you ever figure out the problem?
> -----Original Message-----
> From: users-boun...@open-mpi.org
> [mailto:users-boun...@open-mpi.org] On Behalf Of Jens Klostermann
> Sent
It doesn't look like you ever got an answer to this question -- sorry!
We sometimes get very bad at mail management. :-(
I'm guessing that this is always going to be a problematic scenario for
Open MPI. We have to do forwarding of stdin/out/err between the MPI
process and mpirun. I'm guessing
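A common way to avoid depending on stdin forwarding at all (a general MPI pattern, not a fix proposed in this thread) is to have only rank 0 read from stdin and then hand the data to the other ranks with a normal collective. A minimal C sketch, assuming mpirun forwards the terminal's stdin to rank 0 only:

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char line[256] = "";
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (0 == rank) {
        /* Only rank 0 touches stdin; mpirun typically forwards the
         * terminal's stdin to this process alone. */
        if (NULL == fgets(line, sizeof(line), stdin)) {
            strcpy(line, "(no input)\n");
        }
    }

    /* Distribute the data with a collective instead of relying on
     * per-process stdin forwarding. */
    MPI_Bcast(line, (int)sizeof(line), MPI_CHAR, 0, MPI_COMM_WORLD);
    printf("rank %d got: %s", rank, line);

    MPI_Finalize();
    return 0;
}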
Sorry for the delay in replying -- sometimes we just get overwhelmed
with all the incoming mail. :-(
> -----Original Message-----
> From: users-boun...@open-mpi.org
> [mailto:users-boun...@open-mpi.org] On Behalf Of Tony Ladd
> Sent: Saturday, June 17, 2006 9:47 AM
> To: us...@open-mpi.org
> Sub
Sorry for the delay in replying. Too much travel, and too much e-mail!
:-)
> -----Original Message-----
> From: users-boun...@open-mpi.org
> [mailto:users-boun...@open-mpi.org] On Behalf Of Michael Kluskens
> Sent: Monday, June 19, 2006 4:56 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] a
@Terry
I hope this is of some help (debugged with TotalView):
Enclosed you will find a graph from TotalView as well as this:
Created process 2 (7633), named "mpirun"
Thread 2.1 has appeared
Thread 2.2 has appeared
Thread 2.1 received a signal (Segmentation Violation)
and the stack trace:
I think you may have caught us in an unintentional breakage. If your Open MPI
was compiled as shared libraries and dynamic shared objects (the default), this
error should not have happened since we did not change mpi.h. So there must be
a second-order effect going on here (somehow the size of
I should have tried this before I replied. I had a further thought (after I
replied, of course) -- I was wondering if one of our components had a reference
to ompi_comm_world (and not your application) and that caused the problem. If
you installed 1.1 over 1.0.2 and didn't uninstall first, an
Patrick --
I'm a little confused about your response. Are you replying to the
"keyval parser" thread (i.e., saying that you had the same problem as
Benjamin Landsteiner), or are you replying to the "mca_oob_tcp_accept"
thread?
> -----Original Message-----
> From: users-boun...@open-mpi.org
> [
I am running into a problem with a simple program (which performs
several MPI_Bcast operations) hanging. Most processes hang in
MPI_Finalize, the others hang in MPI_Bcast. Interestingly enough,
this only happens when I oversubscribe the nodes. For instance, using
IU's Odin cluster, I take 4
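The test program itself is not shown in the archive; a hypothetical reproducer along the lines described (several broadcasts from rank 0, then MPI_Finalize) might look like this in C. The loop count and buffer sizes are illustrative only:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int rank, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Several broadcasts of increasing size from rank 0. */
    for (i = 0; i < 10; ++i) {
        int count = 1 << i;
        int *buf = malloc(count * sizeof(int));
        if (0 == rank) {
            int j;
            for (j = 0; j < count; ++j) {
                buf[j] = j;
            }
        }
        MPI_Bcast(buf, count, MPI_INT, 0, MPI_COMM_WORLD);
        free(buf);
    }

    if (0 == rank) {
        printf("all broadcasts completed\n");
    }

    /* The reported hang shows up here or inside MPI_Bcast when the
     * node is oversubscribed (more ranks than slots). */
    MPI_Finalize();
    return 0;
}

Running it with more processes than the node has slots (for example, -np 8 on a 4-slot host) is the oversubscribed case described above; the exact counts used on Odin are cut off in the archive.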
Jeff,
Sorry for the confusion. It's for the "mca_oob_tcp_accept" thread.
I mistakenly replied to the wrong message ("keyval parser").
As for the "mca_oob_tcp_accept" thread, I have since found
some more information on the problem (1.1 no longer works if stdin is
closed; 1.0
I'm having trouble getting Open MPI to execute jobs when submitting through Torque.
Everything works fine if I use the included mpirun scripts, but this is obviously
not a good solution for the general users on the cluster.
I'm running under OS X 10.4, Darwin 8.6.0. I configured Open MPI with
Hi Doug
wow, looks like some messages are getting lost (or even delivered to the
wrong peer on the same node.. ) Could you also try with:
-mca coll_base_verbose 1 -mca coll_tuned_use_dynamic_rules 1 -mca
coll_tuned_bcast_algorithm <1,2,3,4,5,6>
The values 1-6 control which topology/algorithm
On Jun 29, 2006, at 5:23 PM, Graham E Fagg wrote:
Hi Doug
wow, looks like some messages are getting lost (or even delivered
to the wrong peer on the same node.. ) Could you also try with:
-mca coll_base_verbose 1 -mca coll_tuned_use_dynamic_rules 1 -mca
coll_tuned_bcast_algorithm <1,2,3,4,5,6>
On Thu, 29 Jun 2006, Doug Gregor wrote:
Are there other settings I can tweak to try to find the algorithm
that it's deciding to use at run-time?
Yes, just -mca coll_base_verbose 1
will show what's being decided at run time, i.e.
[reliant:25351] ompi_coll_tuned_bcast_intra_dec_fixed
[reliant:25
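Pulling the suggested switches together, they all go on the mpirun command line in front of the test program. The sketch below is only an illustration (the process count, chosen algorithm number, and executable name are placeholders, not values from the thread), paired with a trivial broadcast for the verbose output to report on:

/* Illustrative invocation (placeholders, not values from the thread):
 *
 *   mpirun -np 4 \
 *          -mca coll_base_verbose 1 \
 *          -mca coll_tuned_use_dynamic_rules 1 \
 *          -mca coll_tuned_bcast_algorithm 3 \
 *          ./bcast_probe
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (0 == rank) {
        value = 123;
    }

    /* A single broadcast is enough for coll_base_verbose to print
     * which decision function / algorithm the tuned component picks. */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("rank %d sees %d\n", rank, value);

    MPI_Finalize();
    return 0;
}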
I am testing the one-sided message passing (mpi_put, mpi_get) that is
now supported in the 1.1 release. It seems to work OK for some simple
test codes, but when I run my big application, it fails. This
application is a large weather model that runs operationally on the SGI
Origin 3000, using
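The weather model itself is Fortran and is not shown in the archive. Purely as a sketch, one simple one-sided pattern of the kind a small test code exercises (an MPI_Put between two window fences) looks like this in C; the buffer contents and ranks are arbitrary:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;
    int winbuf = -1;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Every rank exposes one int through the window. */
    MPI_Win_create(&winbuf, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);
    if (0 == rank && size > 1) {
        int value = 42;
        /* Put one int into rank 1's window at displacement 0. */
        MPI_Put(&value, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
    }
    MPI_Win_fence(0, win);

    if (1 == rank) {
        printf("rank 1 received %d via MPI_Put\n", winbuf);
    }

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}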
On Thu, 29 Jun 2006, Doug Gregor wrote:
When I use algorithm 6, I get:
[odin003.cs.indiana.edu:14174] *** An error occurred in MPI_Bcast
[odin005.cs.indiana.edu:10510] *** An error occurred in MPI_Bcast
Broadcasting integers from root 0...[odin004.cs.indiana.edu:11752]
*** An error occurred in