Re: [OMPI users] 'orte_ess_base_select failed'

2009-04-06 Thread Russell McQueeney
Jeff Squyres wrote: Run with "--mca ess_base_verbose 1000" on the mpirun command line and send the output, such as: mpirun --mca ess_base_verbose 1000 rest of your command here... On Mar 30, 2009, at 5:33 PM, Russell McQueeney wrote: I only invoked orted manually to see the error mess

[OMPI users] Interaction between Intel and OpenMPI floating point exceptions

2009-04-06 Thread Steve Lowder
Recently I've been running an MPI code that uses the LAPACK slamch routine to determine machine precision parameters. This software is compiled using the latest Intel Fortran compiler and setting the -fpe0 argument to watch for certain floating point errors. The slamch routines crashed and p

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-06 Thread Steve Kargl
On Mon, Apr 06, 2009 at 02:04:16PM -0700, Eugene Loh wrote: > Steve Kargl wrote: >> I recently upgraded OpenMPI from 1.2.9 to 1.3 and then 1.3.1. One of my colleagues reported a dramatic drop in performance with one of his applications. My investigation shows a factor of 10 drop in com

Re: [OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-06 Thread Eugene Loh
Steve Kargl wrote: I recently upgraded OpenMPI from 1.2.9 to 1.3 and then 1.3.1. One of my colleagues reported a dramatic drop in performance with one of his applications. My investigation shows a factor of 10 drop in communication over the memory bus. I've placed a figure that illustrates the

Re: [OMPI users] ssh MPi and program tests

2009-04-06 Thread Gus Correa
Hi Francesco See answers inline. Francesco Pietra wrote: Hi Gus: Partial quick answers below. I have reestablished the ssh connection so that tomorrow I'll run the tests. Everything that relates to running amber is on the "parallel computer", where I have access to everything. On Mon, Apr 6,

Re: [OMPI users] Incorrect results with MPI-IO under OpenMPI v1.3.1

2009-04-06 Thread Yvan Fournier
Hello to all, I have also encountered a similar bug with MPI-IO with Open MPI 1.3.1, reading a Code_Saturne preprocessed mesh file (www.code-saturne.org). Reading the file can be done using 2 MPI-IO modes, or one non-MPI-IO mode. The first MPI-IO mode uses individual file pointers, and involves

Re: [OMPI users] libnuma issue

2009-04-06 Thread Prentice Bisbal
Francesco Pietra wrote: > I am posting again more specifically because it may have been buried in a more generic thread. With debian linux amd64 lenny and openmpi-1.3.1: ./configure cc=/opt/intel/cce/10.1.015/bin/icc cxx=/opt/intel/cce/10.1.015/bin/icpc F77=/opt/intel/fce/10.1.015/bi

Re: [OMPI users] mpirun: symbol lookup error: /usr/local/lib/openmpi/mca_plm_lsf.so: undefined symbol: lsb_init

2009-04-06 Thread Prentice Bisbal
Alessandro Surace wrote: > Hi guys, I try to repost my question... I've a problem with the last stable build and the last nightly snapshot. When I run a job directly with mpirun no problem. If I try to submit it with lsf: bsub -a openmpi -m grid01 mpirun.lsf /mnt/ewd/mpi/fibonacci/fibona

Re: [OMPI users] ssh MPi and program tests

2009-04-06 Thread Francesco Pietra
Hi Gus: Partial quick answers below. I have reestablished the ssh connection so that tomorrow I'll run the tests. Everything that relates to running amber is on the "parallel computer", where I have access to everything. On Mon, Apr 6, 2009 at 7:53 PM, Gus Correa wrote: > Hi Francesco, list > > F

[OMPI users] Incorrect results with MPI-IO under OpenMPI v1.3.1

2009-04-06 Thread Scott Collis
I have been a user of MPI-IO for 4+ years and have a code that has run correctly with MPICH, MPICH2, and OpenMPI 1.2.* I recently upgraded to OpenMPI 1.3.1 and immediately noticed that my MPI-IO generated output files are corrupted. I have not yet had a chance to debug this in detail, but

Re: [OMPI users] ssh MPi and program tests

2009-04-06 Thread Gus Correa
Hi Francesco, list Francesco Pietra wrote: On Mon, Apr 6, 2009 at 5:21 PM, Gus Correa wrote: Hi Francesco Did you try to run examples/connectivity_c.c, or examples/hello_c.c before trying amber? They are in the directory where you untarred the OpenMPI tarball. It is easier to troubleshoot pos

Re: [OMPI users] Problem with running openMPI program

2009-04-06 Thread Gus Correa
Hi Ankush Ankush Kaul wrote: I am not able to check if NFS export/mount of /tmp is working, when i give the command *ssh 192.168.45.65 192.168.67.18* i get the error : bash: 192.168.67.18 : command not found The ssh command syntax above is wrong. Use only one IP addre

[OMPI users] Factor of 10 loss in performance with 1.3.x

2009-04-06 Thread Steve Kargl
Hi, I recently upgraded OpenMPI from 1.2.9 to 1.3 and then 1.3.1. One of my colleagues reported a dramatic drop in performance with one of his applications. My investigation shows a factor of 10 drop in communication over the memory bus. I've placed a figure that illustrates the problem at htt

Re: [OMPI users] ssh MPi and program tests

2009-04-06 Thread Francesco Pietra
On Mon, Apr 6, 2009 at 5:21 PM, Gus Correa wrote: > Hi Francesco. Did you try to run examples/connectivity_c.c, or examples/hello_c.c before trying amber? They are in the directory where you untarred the OpenMPI tarball. It is easier to troubleshoot possible network and host problems

Re: [OMPI users] Problem with running openMPI program

2009-04-06 Thread Ankush Kaul
I am not able to check if NFS export/mount of /tmp is working. When I give the command *ssh 192.168.45.65 192.168.67.18* I get the error: bash: 192.168.67.18: command not found. Let me explain what I understood using an example. First, I make a folder '/work directory' on my master node. Then I

Re: [OMPI users] Problem with running openMPI program

2009-04-06 Thread Gus Correa
Hi Ankush If I remember right, mpirun will put you on your home directory, not on /tmp, when it starts your ssh session. To run on /tmp (or on /mnt/nfs) you may need to use "-path" option. Likewise, you may want to give mpirun a list of hosts (-host option) or a hostfile (-hostfile option), to s

Re: [OMPI users] ssh MPi and program tests

2009-04-06 Thread Gus Correa
Hi Francesco Did you try to run examples/connectivity_c.c, or examples/hello_c.c before trying amber? They are in the directory where you untarred the OpenMPI tarball. It is easier to troubleshoot possible network and host problems with these simpler programs. Also, to avoid confusion, you may u

Re: [OMPI users] Problem with running openMPI program

2009-04-06 Thread Ankush Kaul
Thank you sir, one more thing I am confused about: suppose I have to run a 'pi' program using Open MPI, where do I place the program? Currently I have placed it in the /tmp folder on the master node. This /tmp folder is mounted on /mnt/nfs of the compute node. I run the program from the tmp folder on the

Re: [OMPI users] Problem with running openMPI program

2009-04-06 Thread John Hearns
2009/4/6 Ankush Kaul: > Also how do i come to know that the program is using resources of both the nodes? Log into the second node before you start the program. Run 'top'. Seriously - top is a very, very useful utility.

Re: [OMPI users] ssh MPi and program tests

2009-04-06 Thread Ralph Castain
You might first try and see if you can run something other than amber with your new installation. Make sure you have the PATH and LD_LIBRARY_PATH set correctly on the remote node, or add --prefix to your mpirun cmd line. Also, did you remember to install the OMPI 1.3 libraries on the remote

[OMPI users] ssh MPi and program tests

2009-04-06 Thread Francesco Pietra
I have compiled openmpi 1.3.1 on debian amd64 lenny with icc/ifort (10.1.015) and libnuma. Tests passed: ompi_info | grep libnuma MCA affinity: libnuma (MCA v 2.0, API 2.0) ompi_info | grep maffinity MCA affinity: first use (MCA as above) MCA affinity: libnuma as above. Then, I have compiled

Re: [OMPI users] Problem with running openMPI program

2009-04-06 Thread Ankush Kaul
Thank you Sir, the problem was with the paths of the 'bin' and 'lib' folders, so I used the *mpirun --prefix* option. I want to run a program 'pi' now using the cluster, so where do I place the file on the master and the compute nodes? Also, how do I come to know that the program is using resources of both

Re: [OMPI users] Bogus memcpy or bogus valgrind record

2009-04-06 Thread Number Cruncher
I'd like to add my concern to the thread at http://www.open-mpi.org/community/lists/users/2009/03/8661.php that the latest 1.3 series produces far too much memory-checker noise. We use Valgrind extensively during debugging, and although I'm using the latest snapshot (1.3.2a1r20901) and latest