On Nov 29, 2006, at 8:44 AM, Scott Atchley wrote:
My last few runs all completed successfully without hanging. The job
I am currently running just hung one node (it responds to ping, but I
cannot ssh into it or use any terminals connected to it).
There are no messages in dmesg and vmstat shows t
On Nov 21, 2006, at 1:27 PM, Brock Palen wrote:
I had sent a message two weeks ago about this problem and talked with
Jeff at SC06 about how it might not be an OMPI problem. But it now
appears, after working with Myricom, that it is a problem in both
lam-7.1.2 and openmpi-1.1.2/1.1.1. Basically the re
On Nov 27, 2006, at 10:56 AM, Galen Shipman wrote:
Note that MX is supported as both a BTL and an MTL; I would recommend
using the MX MTL, as its performance is much better. If you are using
GM, you can only use OB1 or DR. I would recommend OB1, as DR is only
available in the trunk and is in devel
Here is a paper on PML OB1:
http://www.open-mpi.org/papers/euro-pvmmpi-2006-hpc-protocols
There is also some information in this paper:
http://www.open-mpi.org/papers/ipdps-2006
For a very detailed presentation on OB1 go here:
http://www.open-mpi.org/papers/workshop-2006/wed_01_pt2pt.pdf
In ge
Well, I'm not finding much good information on what 'pml' is, which
ones are available, which one is used by default, or how to
switch between them. Is there a paper someplace that describes this?
Brock Palen
Center for Advanced Computing
bro...@umich.edu
(734)936-1985
On Nov 26, 2006, a
Oh, I just noticed you are using GM. PML CM is only available for MX.
Sorry.
Galen
On Nov 26, 2006, at 9:08 AM, Galen Shipman wrote:
I would suggest trying Open MPI 1.2b1 and PML CM. You can select
PML CM at runtime via:
mpirun -mca pml cm
Have you tried this?
- Galen
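The advice in this thread (PML CM only for MX, OB1 recommended with GM) can be sketched as a small shell helper that picks the PML name and prints the matching mpirun command line. This is only an illustration: the function names pick_pml and pml_command and the application ./my_app are hypothetical, and the script prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: choose a PML from the interconnect, following the thread's advice.
# pick_pml, pml_command and ./my_app are hypothetical names for illustration.
pick_pml() {
  case "$1" in
    mx) echo cm ;;    # PML CM is only available for MX
    gm) echo ob1 ;;   # with GM, OB1 is the recommended choice
    *)  echo "unknown interconnect: $1" >&2; return 1 ;;
  esac
}

# Print (rather than run) the resulting mpirun command line.
pml_command() {
  pml=$(pick_pml "$1") || return 1
  shift
  echo "mpirun -mca pml $pml $*"
}

pml_command gm -np 4 ./my_app
```

On an installed system, ompi_info lists the components that were actually built, which is a quick way to check that the chosen PML is available before passing it to mpirun.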
On Nov 21, 200
On Nov 21, 2006, at 12:28 PM, Scott Atchley wrote:
On Nov 21, 2006, at 1:27 PM, Brock Palen wrote:
I had sent a message two weeks ago about this probl
On Nov 21, 2006, at 2:28 PM, Scott Atchley wrote:
On Nov 21, 2006, at 1:27 PM, Brock Palen wrote:
I had sent a message two weeks ago about this problem and talked with
Jeff at SC06 about how it might not be an OMPI problem. But it now
appears, after working with Myricom, that it is a problem in both
l