[OMPI users] could oversubscription clobber an executable?

2009-05-14 Thread Valmor de Almeida
Hello, I am wondering whether light oversubscription could lead to a clobbered program. The particular case is a Fortran 77 (for the most part) code I am working with that can only run on powers of 2 processes (starting with power 1). When I run the program on my single-processor laptop, it shows

[OMPI users] (no subject)

2009-05-14 Thread Camelia Avram
Hi, I'm new to MPI. I'm trying to install OpenMPI and I got some errors. I use the command: ./configure --prefix=/usr/local - no problem with this. But after that: "make all install", I got the next message: "no rule to make target 'VERSION', needed by Makefile.in STOP " What should I do? Than

Re: [OMPI users] (no subject)

2009-05-14 Thread Jeff Squyres
Please send all the information listed here: http://www.open-mpi.org/community/help/ On May 14, 2009, at 1:20 AM, Camelia Avram wrote: Hi, I’m new to MPI. I’m trying to install OpenMPI and I got some errors. I use the command: ./configure --prefix=/usr/local - no problem with this But af

Re: [OMPI users] (no subject)

2009-05-14 Thread Camelia Avram
Hi, Sorry, my mistake. Attached is the config.log file. > make install > no rule to make target 'VERSION', needed by Makefile.in STOP > ompi_info --all > ompi_info: command not found Thanks, Cami -Original Message- From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.or

[OMPI users] build problem

2009-05-14 Thread Jeff Squyres
Please submit *all* the information listed on the help page; the config.log is not enough. Thanks! On May 14, 2009, at 9:15 AM, Camelia Avram wrote: Hi, Sorry, my mistake. Attached is the config.log file. > make install > no rule to make target 'VERSION', needed by Makefile.in STOP > omp

Re: [OMPI users] Problems with "error polling LP CQ with status RNR"

2009-05-14 Thread Jeff Squyres
On May 13, 2009, at 4:55 PM, Åke Sandgren wrote: I'm having problem with getting the "error polling LP CQ with status RNR..." on an otherwise completely empty system. There are no errors visible in the error counters in any of the HCAs or switches or anywhere else. I'm running OMPI 1.3.2 bui

Re: [OMPI users] could oversubscription clobber an executable?

2009-05-14 Thread Jeff Squyres
This sounds like memory badness is occurring somewhere in your application which eventually corrupts things to make them stop working (e.g., writing beyond the end of arrays, etc.). Have you run your app through a memory-checking debugger, perchance? On May 14, 2009, at 1:00 AM, Valmor de

Re: [OMPI users] Problems with "error polling LP CQ with status RNR"

2009-05-14 Thread Åke Sandgren
On Thu, 2009-05-14 at 09:24 -0400, Jeff Squyres wrote: > On May 13, 2009, at 4:55 PM, Åke Sandgren wrote: > > > I'm having problem with getting the "error polling LP CQ with status > > RNR..." on an otherwise completely empty system. > > There are no errors visible in the error counters in any of

Re: [OMPI users] could oversubscription clobber an executable?

2009-05-14 Thread Valmor de Almeida
Jeff Squyres wrote: > This sounds like memory badness is occurring somewhere in your > application which eventually corrupts things to make them stop working > (e.g., writing beyond the end of arrays, etc.). Have you run your app > through a memory-checking debugger, perchance? > > I have the co

Re: [OMPI users] Problems with "error polling LP CQ with status RNR"

2009-05-14 Thread Pavel Shamis (Pasha)
RNR, receive is not ready: it means that on the recv side MPI doesn't have buffers to get the data. It may point to some broken configuration in MPI/ofud or a credit leak in the OFUD code. Åke Sandgren wrote: Hi! I'm having problem with getting the "error polling LP CQ with status RNR..." on an otherw

Re: [OMPI users] could oversubscription clobber an executable?

2009-05-14 Thread Jeff Squyres
On May 14, 2009, at 10:11 AM, Valmor de Almeida wrote: > This sounds like memory badness is occurring somewhere in your > application which eventually corrupts things to make them stop working > (e.g., writing beyond the end of arrays, etc.). Have you run your app > through a memory-checki

[OMPI users] bug in ompi-restart

2009-05-14 Thread Bouguerra mohamed slim
Hello, I think that there is a problem with ompi-restart from the release r-21197. In fact, ompi-restart can restart only if the checkpoint directory is $HOME. For example the checkpoint folder is $HOME. If I try ompi-restart -i $HOME/ompi_global_snapshot_7056.ckpt/ it doesn't work

Re: [OMPI users] could oversubscription clobber an executable?

2009-05-14 Thread John Hearns
2009/5/14 Valmor de Almeida : > > Hello, > > I am wondering whether light oversubscription could lead to a clobbered > program. Apologies if this is a stupid reply. Have you checked if the OOM killer (out of memory killer) is being triggered when you run the program on the laptop? Open a separate w

Re: [OMPI users] could oversubscription clobber an executable?

2009-05-14 Thread Valmor de Almeida
John Hearns wrote: > Have you checked if the OOM killer (out of memory killer) is being > triggered when you run the program on the laptop? > Open a separate window and run 'tail -f /var/log/messages' as the program > runs. Thanks for the reminder. No OOM; the messages file is clean. -- Valmor
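
A rough sketch of the check suggested above, for anyone who would rather scan the log after the run than watch 'tail -f'. The search patterns are an assumption (the kernel's wording varies between versions), and the function names are made up for illustration; /var/log/messages is the path from the message above and may differ on other distributions.

```python
import re

# Assumed patterns: Linux kernels typically log "invoked oom-killer" and
# "Out of memory: ..." when the OOM killer fires; exact wording varies.
OOM_PATTERN = re.compile(r"oom-killer|out of memory", re.IGNORECASE)


def find_oom_lines(lines):
    """Return the log lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_log(path="/var/log/messages"):
    """Scan a log file (hypothetical helper) and return suspicious lines."""
    with open(path, errors="replace") as handle:
        return find_oom_lines(handle)


if __name__ == "__main__":
    hits = scan_log()
    if hits:
        print("\n".join(hits))
    else:
        print("no OOM-killer messages found")
```

An empty result only says that nothing matching these patterns was logged; it does not rule out other memory problems in the application.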

Re: [OMPI users] could oversubscription clobber an executable?

2009-05-14 Thread Valmor de Almeida
Jeff Squyres wrote: > This sounds like memory badness is occurring somewhere in your > application which eventually corrupts things to make them stop working > (e.g., writing beyond the end of arrays, etc.). Have you run your app > through a memory-checking debugger, perchance? A related question

[OMPI users] Cross-Compile MPI Programs

2009-05-14 Thread Gilliland, Spenser D
Hi, We are in the process of setting up a cluster of MPI nodes. The user development machine is x86 and the nodes are ppc405. We have a cross compiler set up on the development machine but have been unsuccessful in using the development machine to build PowerPC MPI applications. What is the pr

[OMPI users] OpenMPI 1.3.2 with PathScale 3.2

2009-05-14 Thread Joshua Bernstein
Greetings All, I'm trying to build OpenMPI 1.3.2 with the Pathscale compiler, version 3.2. A bit of the way through the build the compiler dies with what it thinks is a bad optimization. Has anybody else seen this, or know a workaround for it? I'm going to take it up with Pathscale of course

Re: [OMPI users] OpenMPI 1.3.2 with PathScale 3.2

2009-05-14 Thread Åke Sandgren
On Thu, 2009-05-14 at 13:35 -0700, Joshua Bernstein wrote: > Greetings All, > > I'm trying to build OpenMPI 1.3.2 with the Pathscale compiler, version > 3.2. A > bit of the way through the build the compiler dies with what it things is a > bad > optimization. Has anybody else seen this,

Re: [OMPI users] OpenMPI 1.3.2 with PathScale 3.2

2009-05-14 Thread Ralph Castain
Last I checked when we were building here, I'm not sure Pathscale supports -O3. IIRC, -O2 is the max supported value, though it has been a while since I played with it. Have you checked the man page for it? It could also be something in the VampirTrace code since that is where you are failin

Re: [OMPI users] OpenMPI deadlocks and race conditions ?

2009-05-14 Thread Eugene Loh
François PELLEGRINI wrote: I sometimes run into deadlocks in OpenMPI (1.3.3a1r21206), when running my MPI+threaded PT-Scotch software. So, are there multiple threads per process that perform message-passing operations? Other comments below. Luckily, the case is very small, with 4 procs onl