You might try restarting the device drivers. 

$ pdsh -g yourcluster service openibd restart
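
After a restart like the one above, it may be worth confirming that the HCA ports actually came back up before rerunning the application. The following is only a minimal sketch: the helper name ib_ports_active is hypothetical, and it assumes ibstat prints one "State:" line per port (for example "State: Active"), which is worth checking against the output on your own nodes.

```shell
# Hypothetical helper (not part of OFED or Open MPI): reads ibstat output on
# stdin and succeeds only if at least one port is listed and every port
# reports "State: Active". "Physical state:" lines are ignored on purpose.
ib_ports_active() {
    awk '
        /^[[:space:]]*State:/ {
            ports++
            if ($2 != "Active") bad++
        }
        END { exit !(ports > 0 && bad == 0) }
    '
}

# Example usage on a single node (requires the InfiniBand diagnostics):
#   /usr/sbin/ibstat | ib_ports_active && echo "IB ports active"
#
# Across the cluster, reusing the placeholder group name from the command above:
#   pdsh -g yourcluster /usr/sbin/ibstat | grep 'State:'
```

The check only looks at the logical port state. A port can be Active and still perform poorly, so this rules out one failure mode rather than confirming that the openib BTL is healthy.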

Josh

> On Jun 26, 2014, at 6:55 AM, "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> 
> wrote:
> 
> Just curious -- if you run standard ping-pong kinds of MPI benchmarks with 
> the same kind of mpirun command line that you run your application, do you 
> see the expected level of performance?  (i.e., verification that you're using 
> the low latency transport, etc.)
> 
> 
>> On Jun 25, 2014, at 9:52 AM, Fischer, Greg A. <fisch...@westinghouse.com> 
>> wrote:
>> 
>> I looked through my configure log, and that option is not enabled. Thanks 
>> for the suggestion.
>> 
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime 
>> Boissonneault
>> Sent: Wednesday, June 25, 2014 10:51 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] poor performance using the openib btl
>> 
>> Hi,
>> I recovered the name of the option that caused problems for us. It is 
>> --enable-mpi-thread-multiple
>> 
>> This option enables threading within OPAL, which was buggy (at least in the 
>> 1.6.x series). I don't know whether it has been fixed in the 1.8 series. 
>> 
>> I do not see your configure line in the attached file, so I cannot tell 
>> whether it was enabled.
>> 
>> Maxime
>> 
>> On 2014-06-25 10:46, Fischer, Greg A. wrote:
>> Attached are the results of “grep thread” on my configure output. There 
>> appears to be some amount of threading, but is there anything I should look 
>> for in particular?
>> 
>> I see Mike Dubman’s questions on the mailing list website, but his message 
>> didn’t appear to make it to my inbox. The answers to his questions are:
>> 
>> [binford:fischega] $ rpm -qa | grep ofed
>> ofed-doc-1.5.4.1-0.11.5
>> ofed-kmp-default-1.5.4.1_3.0.76_0.11-0.11.5
>> ofed-1.5.4.1-0.11.5
>> 
>> Distro: SLES11 SP3
>> 
>> HCA:
>> [binf102:fischega] $ /usr/sbin/ibstat
>> CA 'mlx4_0'
>>        CA type: MT26428
>> 
>> Command line (path and LD_LIBRARY_PATH are set correctly):
>> mpirun -x LD_LIBRARY_PATH -mca btl openib,sm,self -mca btl_openib_verbose 1 
>> -np 31 $CTF_EXEC
>> 
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Maxime 
>> Boissonneault
>> Sent: Tuesday, June 24, 2014 6:41 PM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] poor performance using the openib btl
>> 
>> What threading options was Open MPI built with?
>> 
>> I have previously seen the openib BTL lock up completely when some level of 
>> threading is enabled.
>> 
>> Maxime Boissonneault
>> 
>> 
>> On 2014-06-24 18:18, Fischer, Greg A. wrote:
>> Hello openmpi-users,
>> 
>> A few weeks ago, I posted to the list about difficulties I was having 
>> getting openib to work with Torque (see “openib segfaults with Torque”, June 
>> 6, 2014). The issues were related to Torque imposing restrictive limits on 
>> locked memory, and have since been resolved.
>> 
>> However, now that I’ve had some time to test the applications, I’m seeing 
>> abysmal performance over the openib layer. Applications run with the tcp btl 
>> execute about 10x faster than with the openib btl. Clearly something still 
>> isn’t quite right.
>> 
>> I tried running with “-mca btl_openib_verbose 1”, but didn’t see anything 
>> resembling a smoking gun. How should I go about determining the source of 
>> the problem? (This uses the same OpenMPI Version 1.8.1 / SLES11 SP3 / GCC 
>> 4.8.3 setup discussed previously.)
>> 
>> Thanks,
>> Greg
>> 
>> 
>> 
>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/06/24697.php
>> 
>> 
>> 
>> 
>> -- 
>> ---------------------------------
>> Maxime Boissonneault
>> Analyste de calcul - Calcul Québec, Université Laval
>> Ph. D. en physique
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
