On May 18, 2009, at 11:50 AM, John Boccio wrote:
Thanks for that comment.
I thought that was what I was doing when I used the full path name
/usr/local/openmpi-1.3/bin/mpif90
Is that not true?
Ah, yes it is -- sorry, I missed that at the end of your output.
The tarball you sent doesn't see
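(For reference: a quick way to confirm which wrapper is actually being
picked up is to ask the wrapper itself. The path below is the one from
the message above; everything else is generic.)

    # Check which mpif90 is first on the PATH, and what it wraps
    which mpif90
    /usr/local/openmpi-1.3/bin/mpif90 --showme   # prints the underlying compiler and flags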
I am using CPMD 3.11.1, not cp2k. Below are the timings for 20 steps
of MD for 32 water molecules (one of standard CPMD benchmarks) with
openmpi, mvapich and Intel MPI, running on 64 cores (8 blades, each
has 2 quad-core 2.2 GHz AMD Barcelona CPUs).
openmpi-1.3.2 time per one MD step is 3.66 s
fork() support in OpenFabrics has always been dicey -- it can lead to
random behavior like this. Supposedly it works in a specific set of
circumstances, but I don't have a recent enough kernel on my machines
to test.
It's best not to use calls to system() if they can be avoided.
Indeed,
On May 19, 2009, at 8:29 AM, Jeff Squyres wrote:
fork() support in OpenFabrics has always been dicey -- it can lead
to random behavior like this. Supposedly it works in a specific set
of circumstances, but I don't have a recent enough kernel on my
machines to test.
It's best not to use
On Tuesday 19 May 2009, Roman Martonak wrote:
...
> openmpi-1.3.2 time per one MD step is 3.66 s
> ELAPSED TIME : 0 HOURS 1 MINUTES 25.90 SECONDS
> = ALL TO ALL COMM 102033. BYTES 4221. =
> = ALL TO ALL COMM 7.802 MB/S
On Mon, 2009-05-18 at 17:05 -0400, Noam Bernstein wrote:
> The code is complicated, the input files are big and lead to long
> computation times, so I don't think I'll be able to make a simple
> test case. Instead I attached to the hanging processes (all 8 of
> them) with gdb during the h
On May 19, 2009, at 9:13 AM, Noam Bernstein wrote:
The MPI code isn't calling fork or system. The serial code is calling
system("mpirun cp2k.popt"). That runs to completion, processes the
output files, and calls system("mpirun cp2k.popt") again, and so on.
Is that in fact likely to be a problem
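(For reference: if the worry is fork()/system() inside an MPI process,
the same run-then-postprocess loop can live entirely in a shell script,
so no MPI rank ever forks. A minimal sketch; the step count and the
post-processing command are invented for illustration.)

    #!/bin/sh
    # Hypothetical outer driver: launch the MPI job, post-process its
    # output, and repeat.  Only the shell forks; the MPI ranks never do.
    for step in 1 2 3; do
        mpirun -np 8 cp2k.popt > cp2k.out.$step 2>&1 || exit 1
        ./process_output cp2k.out.$step   # stand-in for the real post-processing
    done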
On Tue, May 19, 2009 at 3:29 PM, Peter Kjellstrom wrote:
> On Tuesday 19 May 2009, Roman Martonak wrote:
> ...
>> openmpi-1.3.2 time per one MD step is 3.66 s
>> ELAPSED TIME : 0 HOURS 1 MINUTES 25.90 SECONDS
>> = ALL TO ALL COMM 102033. BYTES
Currently ompi-restart does not know how to deal with an absolute or
relative path in the command line argument for the global snapshot
handle. It will always prepend the value of the MCA parameter
snapc_base_global_snapshot_dir, which defaults to $HOME.
So what you are seeing is (currently)
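(Given that behaviour, one workaround is to point the MCA parameter at
the directory that actually holds the checkpoint and pass only the bare
handle. A sketch; the directory and the snapshot handle name are
invented.)

    # Assumed layout: global snapshots stored under /scratch/ckpt (hypothetical)
    export OMPI_MCA_snapc_base_global_snapshot_dir=/scratch/ckpt
    ompi-restart ompi_global_snapshot_12345.ckpt   # bare handle, no leading path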
On May 19, 2009, at 9:32 AM, Ashley Pittman wrote:
Can you confirm that *all* processes are in PMPI_Allreduce at some
point? The collectives commonly get blamed for a lot of hangs, and
it's not always the correct place to look.
For the openmpi run, every single process showed one of those
two
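(For anyone reproducing this: one way to see where every rank is stuck
is to grab a backtrace from each hung process with gdb. A sketch,
assuming the ranks on a node can be found by process name.)

    # Collect a backtrace from every cp2k.popt process on this node
    for pid in $(pgrep cp2k.popt); do
        echo "=== PID $pid ==="
        gdb -batch -ex bt -p "$pid"
    done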
On Tuesday 19 May 2009, Roman Martonak wrote:
> On Tue, May 19, 2009 at 3:29 PM, Peter Kjellstrom wrote:
> > On Tuesday 19 May 2009, Roman Martonak wrote:
> > ...
> >> openmpi-1.3.2 time per one MD step is 3.66 s
> >> ELAPSED TIME : 0 HOURS 1 MINUTES 25.90 SECONDS
On Tue, 2009-05-19 at 11:01 -0400, Noam Bernstein wrote:
> I'd suspect the filesystem too, except that it's hung up in an MPI
> call. As I said before, the whole thing is bizarre. It doesn't matter
> where the executable is, just what CWD is (i.e. I can do mpirun
> /scratch/exec or mpirun
On May 19, 2009, at 12:13 PM, Ashley Pittman wrote:
Finally if you could run it with "--mca btl ^ofed" to rule out the
ofed stack causing the problem that would be useful. You'd need to
check the syntax here.
--mca btl ^openib
We're stuck with that old name for now -- see http://www.open
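(Spelled out, the two equivalent ways to take InfiniBand out of the
picture look like this; the process count and executable name are
placeholders.)

    # Exclude the openib BTL (the "^" negates the list) ...
    mpirun -np 8 --mca btl ^openib ./cp2k.popt
    # ... or name the remaining transports explicitly
    mpirun -np 8 --mca btl tcp,sm,self ./cp2k.popt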
On May 19, 2009, at 12:13 PM, Ashley Pittman wrote:
On Tue, 2009-05-19 at 11:01 -0400, Noam Bernstein wrote:
I'd suspect the filesystem too, except that it's hung up in an MPI
call. As I said before, the whole thing is bizarre. It doesn't matter
where the executable is, just what CWD is (i.
Jeff Squyres wrote:
Hah; this is probably at least tangentially related to
http://www.open-mpi.org/faq/?category=building#pathscale-broken-with-mpi-c++-api
This looks related; perhaps something about suggesting the -gnu4
option might be nice to add, if not there then maybe somep
On May 19, 2009, at 1:22 PM, Joshua Bernstein wrote:
>
http://www.open-mpi.org/faq/?category=building#pathscale-broken-with-mpi-c++-api
They've been very responsive for me and their suggestions generally
do the trick. There is no doubt the compiler is smoking fast; it's
just about compati
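(If the FAQ entry were extended as suggested, the hint would presumably
look something like the following. The compiler driver names are
PathScale's usual ones and the flag placement is an assumption, not a
tested recipe.)

    # Hypothetical: build Open MPI with PathScale in its GCC-4-compatible mode
    ./configure CC=pathcc CXX=pathCC F77=pathf90 FC=pathf90 \
        CFLAGS=-gnu4 CXXFLAGS=-gnu4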
On May 19, 2009, at 12:13 PM, Ashley Pittman wrote:
That is indeed odd, but it shouldn't be too hard to track down: how
often does the failure occur? Presumably when you say you have three
invocations of the program they communicate via files; is the
location of these files changing?
Yeay.
I tried the whole thing again from scratch.
Here is g95 and xcode info.
Using openmpi-1.3
Mac OSX Leopard 10.5.7
g95 from www.g95.com
g95 -v
Using built-in specs.
Target:
Configured with: ../configure --enable-languages=c
Thread model: posix
gcc version 4.0.3 (g95 0.92!) Oct 18 2008
xcode311_
On May 19, 2009, at 2:07 PM, John Boccio wrote:
I tried the whole thing again from scratch.
Here is g95 and xcode info.
Using openmpi-1.3
Any reason you're not using 1.3.2? (the latest release)
sudo ./configure --enable-mpi-f77 --enable-mpi-f90 F77="/usr/bin/g95"
FC="/usr/bin/g95" > confi
On Tue, 2009-05-19 at 14:01 -0400, Noam Bernstein wrote:
I'm glad you got to the bottom of it.
> With one of them, apparently, CP2K will silently go on if the
> file is missing, but then lock up in an MPI call (maybe it leaves
> some variables uninitialized, and then uses them in the call
Thanks a million Jeff. That last email did the trick.
It has now compiled with Fortran bindings.
John Boccio
On May 19, 2009, at 2:37 PM, Jeff Squyres wrote:
On May 19, 2009, at 2:07 PM, John Boccio wrote:
I tried the whole thing again from scratch.
Here is g95 and xcode info.
Using openmpi