Re: [OMPI users] [Open MPI] #3351: JAVA scatter error

2012-12-17 Thread Jeff Squyres
On Dec 15, 2012, at 10:46 AM, Siegmar Gross wrote: > If I misunderstood the mpiJava specification and I must create a special > MPI object from my Java object: How do I create it? Thank you very much > for any help in advance. You sent me a source code listing off-list, but I want to reply on-li

Re: [OMPI users] [Open MPI] #3351: JAVA scatter error

2012-12-17 Thread Jeff Squyres
On Dec 15, 2012, at 10:46 AM, Siegmar Gross wrote: > "Broadcast" works if I have only a root process and it fails when I have > one more process. I'm sorry; I didn't clarify this error. In a broadcast of only 1 process, it's effectively a no-op. So it doesn't need to do anything to the buffer

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Daniel Davidson
Yes, it does. Dan [root@compute-2-1 ~]# ssh compute-2-0 Warning: untrusted X11 forwarding setup failed: xauth key data not generated Warning: No xauth data; using fake authentication data for X11 forwarding. Last login: Mon Dec 17 16:13:00 2012 from compute-2-1.local [root@compute-2-0 ~]# ssh co

Re: [OMPI users] [Open MPI] #3351: JAVA scatter error

2012-12-17 Thread Jeff Squyres
On Dec 15, 2012, at 10:46 AM, Siegmar Gross wrote: >> 1. The datatypes passed to Scatter are not valid MPI datatypes >> (MPI.OBJECT). You need to construct a datatype that is specific to the >> !MyData class, just like you would in C/C++. I think that this is the >> first error that you are see

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Doug Reeder
Daniel, Does passwordless ssh work? You need to make sure that it does. Doug On Dec 17, 2012, at 2:24 PM, Daniel Davidson wrote: > I would also add that scp seems to be creating the file in the /tmp directory > of compute-2-0, and that /var/log/secure is showing ssh connections being > accepted.
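Doug's check above can be done non-interactively; a minimal sketch, with the hostnames taken from this thread (BatchMode makes ssh fail immediately instead of prompting for a password, so a hang or prompt never masks the result):

```shell
# Verify passwordless ssh from the launch node to each compute node.
# BatchMode=yes disables password prompts; ConnectTimeout bounds hangs.
for host in compute-2-0 compute-2-1; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
        echo "$host: passwordless ssh OK"
    else
        echo "$host: passwordless ssh FAILED"
    fi
done
```

If any host prints FAILED, key-based authentication is not set up in that direction and mpirun's ssh launcher cannot start daemons there.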

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Daniel Davidson
I would also add that scp seems to be creating the file in the /tmp directory of compute-2-0, and that /var/log/secure is showing ssh connections being accepted. Is there anything in ssh that can limit connections that I need to look out for? My guess is that it is part of the client prefs an

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Daniel Davidson
After a very long time (15 minutes or so) I finally received the following in addition to what I just sent earlier: [compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc working on WILDCARD [compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc working on WILDCARD [compute-2-0.local:246

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Ralph Castain
Hmmm...and that is ALL the output? If so, then it never succeeded in sending a message back, which leads one to suspect some kind of firewall in the way. Looking at the ssh line, we are going to attempt to send a message from node 2-0 to node 2-1 on the 10.1.255.226 address. Is that going to wo
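Ralph's firewall suspicion can be checked directly on the compute node; a minimal sketch (the address is the one quoted in the thread; `service iptables` assumes the RHEL/CentOS-era init system these nodes appear to use):

```shell
# From compute-2-0, confirm basic reachability of compute-2-1's address.
ping -c 3 10.1.255.226

# ORTE opens TCP connections on ephemeral ports, so a host firewall can
# silently drop the daemon's callback even when ssh (port 22) works.
# List the active iptables rules to see whether anything beyond
# established/ssh traffic is accepted:
service iptables status
```

On a private cluster network, temporarily stopping iptables on both nodes and re-running the job is a quick way to confirm or rule out the firewall.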

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Daniel Davidson
These nodes have not been locked down yet so that jobs cannot be launched from the backend, at least on purpose anyway. The added logging returns the information below: [root@compute-2-1 /]# /home/apps/openmpi-1.7rc5/bin/mpirun -host compute-2-0,compute-2-1 -v -np 10 --leave-session-attached

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Ralph Castain
?? That was all the output? If so, then something is indeed quite wrong as it didn't even attempt to launch the job. Try adding -mca plm_base_verbose 5 to the cmd line. I was assuming you were using ssh as the launcher, but I wonder if you are in some managed environment? If so, then it could b
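Ralph's suggestion combined with the flags Daniel was already using would look like this (paths and hostnames taken from the commands quoted in this thread):

```shell
# Raise the launcher (plm) verbosity alongside the odls verbosity so the
# ssh launch of the remote daemons is itself logged:
/home/apps/openmpi-1.7rc5/bin/mpirun -host compute-2-0,compute-2-1 \
    -np 10 --leave-session-attached \
    -mca plm_base_verbose 5 -mca odls_base_verbose 5 \
    hostname
```

With plm verbosity enabled, the output should show which launcher component was selected (ssh vs. a resource manager), which is exactly the question Ralph is raising.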

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Daniel Davidson
This looks to be having issues as well, and I cannot get any number of processors to give me a different result with the new version. [root@compute-2-1 /]# /home/apps/openmpi-1.7rc5/bin/mpirun -host compute-2-0,compute-2-1 -v -np 50 --leave-session-attached -mca odls_base_verbose 5 hostname [

Re: [OMPI users] EXTERNAL: Re: Problems with shared libraries while launching jobs

2012-12-17 Thread Ralph Castain
On Dec 17, 2012, at 7:42 AM, "Blosch, Edwin L" wrote: > Ralph, > > Unfortunately I didn’t see the ssh output. The output I got was pretty much > as before. Sorry - I forgot that you built from a tarball, and so the debug is "off" by default. You need to reconfigure with --enable-debug to g
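The rebuild Ralph describes is a straight reconfigure of the tarball; a sketch, with an illustrative install prefix (substitute the one actually in use):

```shell
# Tarball builds default to debug off; --enable-debug compiles in the
# verbose diagnostics that the -mca ...verbose flags rely on.
cd openmpi-1.7rc5
./configure --prefix=/opt/openmpi-1.7rc5 --enable-debug
make -j4 all install
```

Note that --enable-debug adds runtime overhead, so it is meant for diagnosing launch problems like this one, not for production installs.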

Re: [OMPI users] EXTERNAL: Re: Problems with shared libraries while launching jobs

2012-12-17 Thread Blosch, Edwin L
Ralph, Unfortunately I didn't see the ssh output. The output I got was pretty much as before. You know, the fact that the error message is not prefixed with a host name makes me think it could be happening on the host where the job is placed by PBS. If there is something wrong in the user env

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Daniel Davidson
I will give this a try, but wouldn't that be an issue as well if the process was run on the head node or another node? So long as the mpi job is not started on either of these two nodes, it works fine. Dan On 12/14/2012 11:46 PM, Ralph Castain wrote: It must be making contact or ORTE wouldn'