Re: [OMPI users] orte-ps and orte-top behavior

2014-10-31 Thread Brock Palen
Thanks! Brock Palen www.umich.edu/~brockp CAEN Advanced Computing XSEDE Campus Champion bro...@umich.edu (734)936-1985 > On Oct 31, 2014, at 2:22 PM, Ralph Castain wrote: > > >> On Oct 30, 2014, at 3:15 PM, Brock Palen wrote: >> >> If I'm on the node hosting mpirun for a job, and run: >>

Re: [OMPI users] orte-ps and orte-top behavior

2014-10-31 Thread Ralph Castain
> On Oct 30, 2014, at 3:15 PM, Brock Palen wrote: > > If I'm on the node hosting mpirun for a job, and run: > > orte-ps > > It finds the job and shows the pids and info for all ranks. > > If I use orte-top, though, it has no such default; I have to find the mpirun > pid and then use it. > >

[OMPI users] IB Retry Limit Errors when fabric changes

2014-10-31 Thread Brock Palen
Does anyone have issues with jobs dying with errors: > The InfiniBand retry count between two MPI processes has been > exceeded. "Retry count" is defined in the InfiniBand spec 1.2 > (section 12.7.38): We started seeing this about a year ago. If we have changes to the IB fabric, this can happe

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-31 Thread Roland Fehrenbacher
> "Nathan" == Nathan Hjelm writes: Hi Nathan Nathan> I want to close the loop on this issue. 1.8.5 will address Nathan> it in several ways: Nathan> - knem support in btl/sm has been fixed. A sanity check was Nathan> disabling knem during component registration. I wrote t

Re: [OMPI users] Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-10-31 Thread Michael.Rachner
Dear developers of OPENMPI, A hang is still observed in MPI_WIN_ALLOCATE_SHARED. But first: Thank you for your advice to employ shmem_mmap_relocate_backing_file = 1. It indeed turned out that the bad (but silent) allocations by MPI_WIN_ALLOCATE_SHARED, which I observed in the past

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-31 Thread Brice Goglin
On 31/10/2014 00:24, Gus Correa wrote: > 2) Any recommendation for the values of the > various vader btl parameters? > [There are 12 of them in OMPI 1.8.3! > That is a real challenge to get right.] > > Which values did you use in your benchmarks? > Defaults? > Other? > > In particular, is there an