[OMPI users] torque pbs behaviour...

2009-08-10 Thread Jody Klymak
My /var/spool/torque/server_priv/nodes file looks like: xserve01.local np=8 xserve02.local np=8 Any idea what could be going wrong or how to debug this properly? There is nothing suspicious in the server or mom logs. Thanks for any help, Jody -- Jody Klymak http://web.uvic.ca/~jklymak/
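
A rough way to sanity-check a nodes file like the one quoted above is to parse its `hostname np=N` entries and total the slots. This is only a sketch: the function name `parse_nodes` and the inline sample are illustrative and not part of Torque, and treating a host without `np=` as one slot is an assumption.

```python
def parse_nodes(text):
    """Return {hostname: slot_count} for lines of the form 'host np=N'."""
    slots = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        host = fields[0]
        np_count = 1  # assumption: a host listed without np= counts as one slot
        for field in fields[1:]:
            if field.startswith("np="):
                np_count = int(field[3:])
        slots[host] = np_count
    return slots


# Sample text mirroring the two entries quoted in the message.
sample = """xserve01.local np=8
xserve02.local np=8
"""
nodes = parse_nodes(sample)
total_slots = sum(nodes.values())
print(nodes, total_slots)
```

With the two entries from the message, the total comes to 16 slots spread over both hosts, which is the figure to compare against where the job's processes actually land.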

Re: [OMPI users] torque pbs behaviour...

2009-08-10 Thread Jody Klymak
nt-Doherty Earth Observatory - Columbia University Palisades, NY, 10964-8000 - USA ----- Jody Klymak wrote: Hi All, I've been trying to get torque pbs to work on my OS X 10.5.7 cluster with openMPI (after finding that Xgrid w

Re: [OMPI users] torque pbs behaviour...

2009-08-10 Thread Jody Klymak
; $H.txt In the stdout, the echo $H returns "xserve02.local" 16 times and only xserve02.local.txt gets created... Again, if I run with "ssh" outside of pbs I get the expected response. Thanks, Jody Ralph On Aug 10, 2009, at 1:43 PM, Jody Klymak wrote: Hi All, I
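
The symptom described here (every process reporting the same host) can be made explicit by tallying the hostnames printed to stdout. The sketch below is hedged: `tally_hosts` is a hypothetical helper, and `captured` is a stand-in that imitates the reported output, not a real job log.

```python
from collections import Counter


def tally_hosts(stdout_text):
    """Count how many times each hostname appears, one hostname per line."""
    return Counter(
        line.strip() for line in stdout_text.splitlines() if line.strip()
    )


# Stand-in for the job's stdout as described in the message:
# the same hostname printed once per process.
captured = "xserve02.local\n" * 16
counts = tally_hosts(captured)
print(counts)
```

If the processes were spread across both nodes, the tally would list both hostnames; a single key means every process started on one machine.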

Re: [OMPI users] torque pbs behaviour...

2009-08-10 Thread Jody Klymak
least mpirun --version tells me that is the mpirun version. I'll get back to you when I get time to rebuild with 1.3.3. Could be that this is the source of my xgrid problems as well. Sorry for the noise. I'll get back to you if I still have problems... Thanks, Jody -- Jody K

[OMPI users] tcp connectivity OS X and 1.3.3

2009-08-11 Thread Jody Klymak
Hello, On Aug 11, 2009, at 8:15 AM, Ralph Castain wrote: You can turn off those mca params I gave you as you are now past that point. I know there are others that can help debug that TCP btl error, but they can help you there. Just to eliminate the mitgcm from the debugging I compiled example/hello_c

Re: [OMPI users] tcp connectivity OS X and 1.3.3

2009-08-11 Thread Jody Klymak
passwordless, but do the nodes need to be passwordless as well? i.e. is xserve01 trying to ssh to xserve02? Anyway, not sure what else I can do to debug this. I'm considering rolling back to 1.1.5 and living without a queue manager... Thanks, Jody -- Jody Klymak http://web.uvic.ca/~jklymak/

Re: [OMPI users] tcp connectivity OS X and 1.3.3

2009-08-12 Thread Jody Klymak
s other OS X users are using non-tcpip communication, and the tcp stuff just doesn't work in 1.3.3. Thanks, Jody -- Jody Klymak http://web.uvic.ca/~jklymak/

Re: [OMPI users] tcp connectivity OS X and 1.3.3

2009-08-12 Thread Jody Klymak
Thanks, Jody On Wed, Aug 12, 2009 at 10:01 AM, Jody Klymak wrote: On Aug 11, 2009, at 18:55 PM, Gus Correa wrote: Did you wipe off the old directories before reinstalling? Check. I prefer to install on a NFS mounted directory, Check Have you tried to ssh from node to node on al

Re: [OMPI users] tcp connectivity OS X and 1.3.3

2009-08-12 Thread Jody Klymak
2009 at 12:51 PM, Jody Klymak wrote: Hi Ralph, That gives me something more to work with... On Aug 12, 2009, at 9:44 AM, Ralph Castain wrote: I believe TCP works fine, Jody, as it is used on Macs fairly widely. I suspect this is something funny about your installation. One thing I have foun

Re: [OMPI users] tcp connectivity OS X and 1.3.3

2009-08-12 Thread Jody Klymak
On Aug 12, 2009, at 12:46 PM, Jody Klymak wrote: So I think ranks 0 and 2 are on xserve02 and rank 1 is on xserve01, Should read xserve03, -- Jody Klymak http://web.uvic.ca/~jklymak/

Re: [OMPI users] tcp connectivity OS X and 1.3.3

2009-08-13 Thread Jody Klymak
On Aug 12, 2009, at 19:09 PM, Ralph Castain wrote: Hmmm...well, I'm going to ask our TCP friends for some help here. Meantime, I do see one thing that stands out. Port 4 is an awfully low port number that usually sits in the reserved range. I checked the /etc/services file on my Mac, and

Re: [OMPI users] tcp connectivity OS X and 1.3.3

2009-08-14 Thread Jody Klymak
hours or so due to an sshd deciding it was a security breach and killing all the processes). Anyways, all seems to be working so far. Sorry that my poor choice in user management caused so many mysteries. Thanks for everyone's help. Cheers, Jody -- Jody Klymak http://web.uvic.ca/~jklymak/

Re: [OMPI users] Problem with linking on OS X

2009-08-19 Thread Jody Klymak
ets linked with /usr/lib/libmpi... Note that the /opt/openmpi/bin path is properly set and ompi_info does output the right info. You do not need to set DYLD_LIBRARY_PATH. I don't have it set and my mpi applications run fine. Did 4 work? Cheers, Jody -- Jody Klymak http://web.uvic.ca/~jklymak/

Re: [OMPI users] Problem with linking on OS X

2009-08-19 Thread Jody Klymak
mpicc and friends... Cheers, Jody On Aug 19, 2009, at 15:57 PM, tomek wrote: OK - I have fixed it by including -L/opt/openmpi/lib at the very beginning of mpicc ... -L/opt/openmpi/lib -o app.exe the rest ... But something is wrong with dyld anyhow. On 19 Aug 2009, at 21:04, Jody Klymak

Re: [OMPI users] openMPI on Xgrid

2010-03-29 Thread Jody Klymak
id/openmpi/etc/openmpi-mca-params.conf to make sure that the right ports are used: # set ports so that they are more valid than the default ones (see email from Ralph Castain) btl_tcp_port_min_v4 = 36900 btl_tcp_port_range = 32 Cheers, Jody -- Jody Klymak http://web.uvic.ca/~jklymak/
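
To see which ports the two settings quoted above would cover, the fragment can be parsed as `name = value` pairs. This is a minimal sketch under assumptions: `parse_mca_params` is a hypothetical helper, the parameter names are copied exactly as quoted in the message, and the range is assumed to count ports starting at the minimum. `ompi_info --param btl tcp` lists the exact parameter names for a given build.

```python
def parse_mca_params(text):
    """Parse 'name = value' lines, ignoring '#' comments and blank lines."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


# Fragment as quoted in the message, one setting per line.
conf = """# set ports so that they are more valid than the default ones
btl_tcp_port_min_v4 = 36900
btl_tcp_port_range = 32
"""
params = parse_mca_params(conf)
first_port = int(params["btl_tcp_port_min_v4"])
port_count = int(params["btl_tcp_port_range"])
# Assumption: the range is a count of ports beginning at the minimum.
last_port = first_port + port_count - 1
print(first_port, last_port)
```

Under that assumption the settings cover ports 36900 through 36931, which are the ones a firewall between the nodes would need to allow.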

Re: [OMPI users] openMPI on Xgrid

2010-03-30 Thread Jody Klymak
uling layer on top of pbs. However, there are folks here who would know far more than I do about these sorts of things. Cheers, Jody -- Jody Klymak http://web.uvic.ca/~jklymak/