Re: [OMPI users] Infiniband errors

2012-12-20 Thread Syed Ahsan Ali
Dear Yann Here is the output *[root@compute-01-01 ~]# cat /etc/redhat-release* Red Hat Enterprise Linux Server release 5.3 (Tikanga) *[root@compute-01-01 ~]# uname -a* Linux compute-01-01.private.dns.zone 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux *[root@com

[OMPI users] Question about Lost Messages

2012-12-20 Thread Corey Allen
Hello, I am trying to confirm that I am using OpenMPI in a correct way. I seem to be losing messages but I don't like to assume there's a bug when I'm still new to MPI in general. I have multiple processes in a master / slaves type setup, and I am trying to have multiple persistent non-blocking m

Re: [OMPI users] 1.6.2 affinity failures

2012-12-20 Thread Brock Palen
w00t :-) Thanks Brock Palen www.umich.edu/~brockp CAEN Advanced Computing bro...@umich.edu (734)936-1985 On Dec 20, 2012, at 10:46 AM, Ralph Castain wrote: > Hmmm... I'll see what I can do about the error message. I don't think there > is much in 1.6 I can do, but in 1.7 I could generate an

Re: [OMPI users] 1.6.2 affinity failures

2012-12-20 Thread Ralph Castain
Hmmm... I'll see what I can do about the error message. I don't think there is much in 1.6 I can do, but in 1.7 I could generate an appropriate error message as we have a way to check the topologies. On Dec 20, 2012, at 7:11 AM, Brock Palen wrote: > Ralph, > > Thanks for the info, > That sai

Re: [OMPI users] 1.6.2 affinity failures

2012-12-20 Thread Brock Palen
Ralph, Thanks for the info. That said, I found the problem: one of the new nodes had Hyperthreading on and the rest didn't, so the nodes didn't all match. A quick pdsh lstopo | dshbak -c uncovered the one different node. The error just didn't give me a clue to that being the cause, which

Re: [OMPI users] MPI_Alltoallv performance regression 1.6.0 to 1.6.1

2012-12-20 Thread Iliev, Hristo
Simon, The goal of any MPI implementation is to be as fast as possible. Unfortunately there is no "one size fits all" algorithm that works on all networks and given all possible kinds of peculiarities that your specific communication scheme may have. That's why there are different algorithms and yo

Re: [OMPI users] Windows Open MPI question

2012-12-20 Thread Jeff Squyres
Glad you got it resolved! On Dec 18, 2012, at 8:53 PM, Kumar, Sudhir wrote: > Hi > The error is resolved. The solution was actually in a previous post. > http://www.open-mpi.org/community/lists/users/2011/03/15954.php > > > > -Original Message- > From: Kumar, Sudhir > Sent: Tuesday, D

Re: [OMPI users] Possible memory error

2012-12-20 Thread Jeff Squyres
On Dec 19, 2012, at 11:26 AM, Handerson, Steven wrote: > I fixed the problem we were experiencing by adding a barrier. > The bug occurred between a piece of code that uses (many, over a loop) SEND > (from the leader) > and RECV (in the worker processes) to ship data to the > processing nodes fro