Re: [OMPI users] Infiniband errors

2012-12-20 Thread Syed Ahsan Ali
Dear Yann, here is the output:
[root@compute-01-01 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.3 (Tikanga)
[root@compute-01-01 ~]# uname -a
Linux compute-01-01.private.dns.zone 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
[root@com

Re: [OMPI users] Infiniband errors

2012-12-19 Thread Yann Droneaud
On Wednesday, 19 December 2012 at 12:12 +0500, Syed Ahsan Ali wrote: > Dear John > > I found this output of ibstatus on some nodes (most probably the > problem causing) > [root@compute-01-08 ~]# ibstatus > > Fatal error: device '*': sys files not found > (/sys/class/infiniband/*/ports) > > Do

Re: [OMPI users] Infiniband errors

2012-12-19 Thread Shamis, Pavel
It seems the driver was not started. I would suggest running lspci to check whether the HCA is visible at the hardware level. Pavel (Pasha) Shamis --- Computer Science Research Group Computer Science and Math Division Oak Ridge National Laboratory On Dec 19, 2012, at 2:12 AM, Syed Ahsan Ali wrote: Dear Joh
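
As a rough illustration of the check behind the ibstatus error in this thread, the Python sketch below walks the sysfs tree that the error message names (/sys/class/infiniband/*/ports) and reports each port's state. The function name list_ib_ports and its sysfs_root parameter are hypothetical, introduced only for this example, and the sketch assumes the usual Linux layout in which each port directory holds a "state" file. An empty result would be consistent with the driver not being loaded; lspci is still needed to confirm that the HCA is visible at the hardware level.

```python
import os


def list_ib_ports(sysfs_root="/sys/class/infiniband"):
    """Return {device: {port: state}} read from sysfs.

    Hypothetical helper for illustration. An empty dict means no
    InfiniBand devices are exposed under sysfs_root.
    """
    result = {}
    if not os.path.isdir(sysfs_root):
        return result
    for dev in sorted(os.listdir(sysfs_root)):
        ports_dir = os.path.join(sysfs_root, dev, "ports")
        if not os.path.isdir(ports_dir):
            continue
        ports = {}
        for port in sorted(os.listdir(ports_dir)):
            state_file = os.path.join(ports_dir, port, "state")
            try:
                with open(state_file) as f:
                    ports[port] = f.read().strip()
            except OSError:
                # State file missing or unreadable (assumed layout not met).
                ports[port] = "unknown"
        result[dev] = ports
    return result


if __name__ == "__main__":
    found = list_ib_ports()
    if not found:
        print("No devices under /sys/class/infiniband; "
              "the driver may not be loaded.")
    for dev, ports in found.items():
        for port, state in ports.items():
            print(f"{dev} port {port}: {state}")
```

Running the script on a node that shows the "sys files not found" error should print the "No devices" message, while a healthy node should list at least one device and port.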

Re: [OMPI users] Infiniband errors

2012-12-19 Thread Syed Ahsan Ali
Dear John, I found this output of ibstatus on some nodes (most probably the cause of the problem): [root@compute-01-08 ~]# ibstatus Fatal error: device '*': sys files not found (/sys/class/infiniband/*/ports) Does this indicate a hardware or software issue? Thanks On Wed, Nov 28, 2012 at 3:17 PM, Jo

Re: [OMPI users] Infiniband errors

2012-11-28 Thread Syed Ahsan Ali
I am not sure about the drivers because those were installed by someone else during cluster setup. I see the following information about the InfiniBand card. The card is DDR InfiniBand Mellanox ConnectX. On Wed, Nov 28, 2012 at 3:17 PM, John Hearns wrote: > Those diagnostics are from Openfabrics. > What ty

Re: [OMPI users] Infiniband errors

2012-11-28 Thread John Hearns
Those diagnostics are from Openfabrics. What type of infiniband card do you have? What drivers are you using?

Re: [OMPI users] Infiniband errors

2012-11-28 Thread Syed Ahsan Ali
Does ibstats come with some other distribution? I don't have this command available right now. On Wed, Nov 28, 2012 at 1:14 PM, John Hearns wrote: > Short answer. Run ibstats or ibstatus. > Look also at the logs of your subnet manager. > > ___ > users mail

Re: [OMPI users] Infiniband errors

2012-11-28 Thread John Hearns
Short answer. Run ibstats or ibstatus. Look also at the logs of your subnet manager.

[OMPI users] Infiniband errors

2012-11-28 Thread Syed Ahsan Ali
Dear All, I have an application which is run using Open MPI and uses InfiniBand flags. The application is a forecast model simulation. A frequent problem is that the InfiniBand mezzanine cards of the servers become faulty (I don't know why it happens so frequently), the model simulation becom