[OMPI users] openmpi/ib noob question

2009-02-11 Thread Gary Draving
Hello, When running the following program on 4 of my nodes I get the expected response: "/usr/local/bin/mpirun --mca btl tcp,self,openib --hostfile ibnodes -np 4 hello_c" Hello, world, I am 0 of 4 Hello, world, I am 2 of 4 Hello, world, I am 1 of 4 Hello, world, I am 3 of 4 But when I run it

Re: [OMPI users] openmpi/ib noob question

2009-02-12 Thread Gary Draving
Feb 11, 2009, at 2:01 PM, Gary Draving wrote: Hello, When running the following program on 4 of my nodes I get the expected response: "/usr/local/bin/mpirun --mca btl tcp,self,openib --hostfile ibnodes -np 4 hello_c" Hello, world, I am 0 of 4 Hello, world, I am 2 of 4 Hello, world, I

[OMPI users] /usr/bin/ld: skipping incompatible /usr/local/lib/libgcc_s.so when searching for -lgcc_s

2009-02-13 Thread Gary Draving
Hi Everyone, I'm getting a "/usr/bin/ld: skipping incompatible /usr/local/lib/libgcc_s.so when searching for -lgcc_s" when compiling some of the openmpi 1.3 examples. The programs still compile and run. Does anyone know if this warning is something I should be concerned about and/or how I c

Re: [OMPI users] /usr/bin/ld: skipping incompatible /usr/local/lib/libgcc_s.so when searching for -lgcc_s

2009-02-13 Thread Gary Draving
runs. So nothing to worry about. To avoid the warning you can switch the (default) search order. best regards, Samuel Gary Draving wrote: Hi Everyone, I'm getting a "/usr/bin/ld: skipping incompatible /usr/local/lib/libgcc_s.so when searching for -lgcc_s" when compiling

[OMPI users] selected pml cm, but peer [[2469, 1], 0] on compute-0-0 selected pml ob1

2009-03-18 Thread Gary Draving
Hi all, anyone ever seen an error like this? Seems like I have some setting wrong in openmpi. I thought I had it set up like the other machines but it seems as though I have missed something. I only get the error when adding machine "fs1" to the hostfile list. The other 40+ machines seem fine.

Re: [OMPI users] selected pml cm, but peer [[2469, 1], 0] on compute-0-0 selected pml ob1

2009-03-19 Thread Gary Draving
PML whereas other nodes are selecting the "ob1" PML component. You can force ob1 to be used via "--mca pml ob1". What kind of hardware/NIC does fs1 have? --Nysal On Wed, 2009-03-18 at 17:17 -0400, Gary Draving wrote: Hi all, anyone ever seen an error like this? Seems like I
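
As a rough sketch of the suggestion in the reply above, the snippet below assembles an mpirun command line that pins the PML to ob1 for every rank. Only the `--mca pml ob1` option comes from the thread; the helper `build_mpirun_cmd`, the hostfile name `hosts`, the process count, and the program name `./ring` are placeholders made up for illustration.

```python
import shlex


def build_mpirun_cmd(program, np, hostfile, mca=None):
    """Return an mpirun argument list (hypothetical helper, not part of Open MPI)."""
    cmd = ["mpirun"]
    # Each MCA parameter is passed on the command line as: --mca <name> <value>
    for name, value in (mca or {}).items():
        cmd += ["--mca", name, str(value)]
    cmd += ["--hostfile", hostfile, "-np", str(np), program]
    return cmd


# Placeholder values; substitute your own hostfile and binary.
cmd = build_mpirun_cmd("./ring", np=4, hostfile="hosts", mca={"pml": "ob1"})
print(shlex.join(cmd))
```

The sketch only prints the command. Passing the list to `subprocess.run` would launch the job, assuming mpirun is on the PATH.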

[OMPI users] btl_openib_ib_max_inline_data warnings

2009-03-19 Thread Gary Draving
Hi All, I have written a simple ring program that seems to run fine but I get the following warning even though I am not explicitly defining the btl_openib_ib_max_inline_data with an MCA parm. I'm only getting the warning in the 3 machines that have the QLE7240, the other 40+ machines with M

Re: [OMPI users] btl_openib_ib_max_inline_data warnings

2009-03-20 Thread Gary Draving
Thanks, I was starting to suspect our mix and match of hardware was causing some problems. Gary Jeff Squyres wrote: On Mar 19, 2009, at 4:02 PM, Gary Draving wrote: I have written a simple ring program that seems to run fine but I get the following warning even though I am not explicitly

[OMPI users] error polling LP CQ with status RETRY EXCEEDED ERROR

2009-03-26 Thread Gary Draving
Hi Everyone, I'm doing some performance testing using HPL with TCP turned off. My HPL.dat file looks like the following: It seems to work well for lower Ns values but as I increase that value it inevitably fails with "[[13535,1],169][btl_openib_component.c:2905:handle_wc] from compute-0-0.lo

Re: [OMPI users] error polling LP CQ with status RETRY EXCEEDED ERROR

2009-03-27 Thread Gary Draving
er 25 -mca btl_openib_ib_timeout 20 Should work. Ralph On Mar 26, 2009, at 2:16 PM, Gary Draving wrote: Hi Everyone, I'm doing some performance testing using HPL with TCP turned off. My HPL.dat file looks like the following: It seems to work well for lower Ns values but as I increase
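
For a sense of what the `btl_openib_ib_timeout 20` setting in the reply means: the InfiniBand local ACK timeout is 4.096 microseconds multiplied by 2 raised to the timeout value. The sketch below just works through that arithmetic. The function name is made up for illustration, and the value 10 is included only for comparison, not as a claim about any default.

```python
def ib_ack_timeout_seconds(timeout_exponent):
    """InfiniBand local ACK timeout: 4.096 microseconds * 2**timeout_exponent."""
    return 4.096e-6 * (2 ** timeout_exponent)


# Compare a smaller exponent with the value suggested in the thread.
for exponent in (10, 20):
    seconds = ib_ack_timeout_seconds(exponent)
    print(f"btl_openib_ib_timeout={exponent}: {seconds:.6f} s before a retry")
```

With a value of 20 the sender waits roughly 4.3 seconds for an ACK before each retry, which gives a busy fabric more slack before the retry count is exhausted.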