On Jun 29, 2006, at 11:16 PM, Graham E Fagg wrote:
On Thu, 29 Jun 2006, Doug Gregor wrote:
When I use algorithm 6, I get:
[odin003.cs.indiana.edu:14174] *** An error occurred in MPI_Bcast
[odin005.cs.indiana.edu:10510] *** An error occurred in MPI_Bcast
Broadcasting integers from root 0...[odin004.cs.indiana.edu:11752]
*** An error occurred in MPI_Bcast
On Thu, 29 Jun 2006, Doug Gregor wrote:
Are there other settings I can tweak to try to find the algorithm
that it's deciding to use at run-time?
Yes just: -mca coll_base_verbose 1
will show what's being decided at run time, e.g.
[reliant:25351] ompi_coll_tuned_bcast_intra_dec_fixed
[reliant:25
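
For example, a full invocation might look like this (the process count
and executable name here are placeholders, not from the thread):

    mpirun -np 8 -mca coll_base_verbose 1 ./bcast_test

Each decision made by the tuned collective component is then printed,
as in the [reliant:...] lines above.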
On Jun 29, 2006, at 5:23 PM, Graham E Fagg wrote:
Hi Doug
wow, looks like some messages are getting lost (or even delivered to the
wrong peer on the same node...). Could you also try with:
-mca coll_base_verbose 1 -mca coll_tuned_use_dynamic_rules 1 -mca
coll_tuned_bcast_algorithm <1,2,3,4,5,6>
The values 1-6 control which topology/algorithm is used.
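
For example, to pin the broadcast to one specific algorithm while
keeping the verbose output (a sketch; the process count and executable
name are placeholders):

    mpirun -np 8 -mca coll_base_verbose 1 \
        -mca coll_tuned_use_dynamic_rules 1 \
        -mca coll_tuned_bcast_algorithm 3 ./bcast_test

Repeating the run with each value from 1 to 6 narrows down which
algorithm triggers the hang or the errors reported above.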
I am running into a problem with a simple program (which performs
several MPI_Bcast operations) hanging. Most processes hang in
MPI_Finalize, the others hang in MPI_Bcast. Interestingly enough,
this only happens when I oversubscribe the nodes. For instance, using
IU's Odin cluster, I take 4
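
The program itself isn't shown in the thread, but a minimal sketch of
the shape being described might look like this (the buffer size and
iteration count are made up; broadcasting integers from root 0 matches
the output quoted above):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int buf[1024];
        int rank, i;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            printf("Broadcasting integers from root 0...\n");
            for (i = 0; i < 1024; ++i)
                buf[i] = i;              /* root fills the payload */
        }

        /* Several consecutive broadcasts, as in the report; the other
           ranks hang here when the problem occurs. */
        for (i = 0; i < 100; ++i)
            MPI_Bcast(buf, 1024, MPI_INT, 0, MPI_COMM_WORLD);

        MPI_Finalize();                  /* most ranks reportedly hang here */
        return 0;
    }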