Re: [OMPI users] usNIC point-to-point messaging module

2014-04-04 Thread Ralph Castain
Fixed in r31308 and scheduled for inclusion in 1.8.1 Thanks Ralph On Apr 2, 2014, at 12:17 PM, Ralph Castain wrote: > Yeah, it's a change we added to resolve a problem when Slurm is configured > with TaskAffinity set. It's harmless, but annoying - I'm trying to figure out > a solution. > >

Re: [OMPI users] openmpi query

2014-04-04 Thread Nisha Dhankher -M.Tech(CSE)
Sir, the same virt-manager is being used by all PCs. No, I didn't enable openmpi-hetro. Yes, the openmpi version is the same on all of them, through the same kickstart file. OK... actually, sir, Rocks itself installed and configured openmpi and mpich on its own through the HPC roll. On Fri, Apr 4, 2014 at 9:25 AM, Ralph Castain wrote

Re: [OMPI users] openmpi query

2014-04-04 Thread Ralph Castain
Hi Nisha I'm sorry if my questions appear abrasive - I'm just a little frustrated at the communication bottleneck as I can't seem to get a clear picture of your situation. So you really don't need to keep calling me "sir" :-) The error you are hitting is very unusual - it means that the process

Re: [OMPI users] openmpi query

2014-04-04 Thread Nisha Dhankher -M.Tech(CSE)
no it does not happen on names nodes On Fri, Apr 4, 2014 at 7:51 PM, Ralph Castain wrote: > Hi Nisha > > I'm sorry if my questions appear abrasive - I'm just a little frustrated > at the communication bottleneck as I can't seem to get a clear picture of > your situation. So you really don't nee

Re: [OMPI users] openmpi query

2014-04-04 Thread Reuti
On 04.04.2014 at 05:55, Ralph Castain wrote: > On Apr 3, 2014, at 8:03 PM, Nisha Dhankher -M.Tech(CSE) > wrote: > >> thankyou Ralph. >> Yes cluster is heterogenous... > > And did you configure OMPI --enable-heterogeneous? And are you running it > with --hetero-nodes? What version of OMPI ar

Re: [OMPI users] openmpi query

2014-04-04 Thread Ralph Castain
On Apr 4, 2014, at 7:39 AM, Reuti wrote: > On 04.04.2014 at 05:55, Ralph Castain wrote: > >> On Apr 3, 2014, at 8:03 PM, Nisha Dhankher -M.Tech(CSE) >> wrote: >> >>> thankyou Ralph. >>> Yes cluster is heterogenous... >> >> And did you configure OMPI --enable-heterogeneous? And are you runn

Re: [OMPI users] openmpi query

2014-04-04 Thread Ralph Castain
Okay, so if you run mpiBlast on all the non-name nodes, everything is okay? What do you mean by "names nodes"? On Apr 4, 2014, at 7:32 AM, Nisha Dhankher -M.Tech(CSE) wrote: > no it does not happen on names nodes > > > On Fri, Apr 4, 2014 at 7:51 PM, Ralph Castain wrote: > Hi Nisha > > I

[OMPI users] Call stack upon MPI routine error

2014-04-04 Thread Vince Grimes
Dear all: The subject heading is a little misleading because this is in response to part of that original contact. I tried the first two suggestions below (disabling eager DMA and using tcp btl), but to no avail. In all cases I am running over 20 12-core nodes through SGE. In the first case,

Re: [OMPI users] Call stack upon MPI routine error

2014-04-04 Thread Ralph Castain
Running out of file descriptors sounds likely here - if you have 20 procs/node, and fully connect, each node will see 20*220 connections (you don't use tcp between procs on the same node), with each connection requiring a file descriptor. On Apr 4, 2014, at 11:26 AM, Vince Grimes wrote: > De
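
A rough sketch of the arithmetic in the reply above, assuming a fully connected job in which every process opens one TCP connection to each process on another node and none to processes on its own node. The function names are hypothetical. The first call uses the figures quoted in the reply (20 procs/node giving 20*220, which implies 12 nodes); the second uses the 20 nodes of 12 cores described in the original question.

```python
def remote_peers_per_process(nodes, procs_per_node):
    """Peers on other nodes that one process connects to over TCP."""
    total_procs = nodes * procs_per_node
    # Processes on the same node do not use TCP, so exclude them.
    return total_procs - procs_per_node


def tcp_connections_per_node(nodes, procs_per_node):
    """TCP connection endpoints (one file descriptor each) on a single node."""
    return procs_per_node * remote_peers_per_process(nodes, procs_per_node)


if __name__ == "__main__":
    # Figures as stated in the reply: 20 procs/node, 220 remote peers each.
    print(remote_peers_per_process(12, 20), tcp_connections_per_node(12, 20))
    # Layout as stated in the question: 20 nodes with 12 cores each.
    print(remote_peers_per_process(20, 12), tcp_connections_per_node(20, 12))
```

Comparing the per-process figure with the open-file limit reported by `ulimit -n` on the compute nodes is one way to check whether descriptor exhaustion is plausible; this is only an estimate, since connections are typically opened on demand.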

[OMPI users] Waitall never returns

2014-04-04 Thread Ross Boylan
During shutdown of my application the processes issue a waitall, since they have done some Isends. A couple of them never return from that call. Could this be the result of some of the processes already being shut down (the processes with the problem were late in the shutdown sequence)? If so

Re: [OMPI users] Waitall never returns

2014-04-04 Thread Ralph Castain
It sounds like you don't have a balance between sends and recvs somewhere - i.e., some apps send messages, but the intended recipient isn't issuing a recv and waiting until the message has been received before exiting. If the recipient leaves before the isend completes, then the isend will never

Re: [OMPI users] Waitall never returns

2014-04-04 Thread Ross Boylan
On 4/4/2014 6:01 PM, Ralph Castain wrote: It sounds like you don't have a balance between sends and recvs somewhere - i.e., some apps send messages, but the intended recipient isn't issuing a recv and waiting until the message has been received before exiting. If the recipient leaves before th

Re: [OMPI users] Waitall never returns

2014-04-04 Thread George Bosilca
Ross, I’m not familiar with the R implementation you are using, but bear with me and I will explain how you can ask Open MPI about the list of all pending requests on a process. Disclosure: This is Open MPI deep voodoo, an extreme way to debug applications that might save you quite some time.