Hello all,
I have finally solved the issue, or rather, discovered my oversight. It's a
mistake that will have me mad at myself for a while. I'm new to MPI, though,
and not versed in the MPP communications of LS-DYNA at all, so it was an
oversight easily made.
The key t
On Jul 9, 2010, at 12:43 PM, Douglas Guptill wrote:
> After some lurking and reading, I plan this:
> Debian (lenny)
> + fai - for compute-node operating system install
> + Torque- job scheduler/manager
> + MPI (Intel MPI) - for the application
> +
+1 on all that has been said.
As Eugene stated: this is not an internal Open MPI bug. Your application is
calling some form of an MPI receive with a buffer that is too small. The MPI
specification defines this as a truncation error; hence, Open MPI gives you an
ERR_TRUNCATE. You can fix the
On Jul 12, 2010, at 11:14 AM, Olivier Marsden wrote:
> Hi again,
> after testing as suggested, it is indeed a massive slowdown rather than
> a full-blown machine hang.
Ok.
> Would the next test be to run with debug flags for openmpi ?
You might want to run with
mpirun --mca mpi_yield_when_
I started today reading e-mail quickly and out of order. So, I'm going
back to an earlier message now, but still with the new Subject heading,
which better reflects where you are in your progress. I'm extracting
some questions from this thread, from bottom/old to top/new:
1) What tools to u
I took the liberty of changing the subject line.
Yes, MPI_Barrier waits until all other processes in the communicator
catch up. So, long barrier time usually indicates there is some "load
imbalance"... one or more processes reach the synchronization point
well before the others. Other commun
Also, I finally got some graphical output from Sun Studio Analyzer. I
see MPI_Recv and MPI_Wait taking a lot of time, but I would think that
is OK; this program does heavy number crunching, and I would expect it
to need to Wait or wait to Receive very often since there is a decent
amount of tim
You need to call MPI_Abort, not Finalize. Finalize will block until all procs
call it. Abort causes the system to terminate everyone immediately.
On Jul 14, 2010, at 5:06 AM, Saygin Arkan wrote:
> Hi,
> I'm executing an mpi program, using C++ bindings.
>
> if( rank == 0)
> {
> ...
> ...
> if( !
Hi,
I'm executing an mpi program, using C++ bindings.
if( rank == 0)
{
    ...
    ...
    if( !isFileFound){
        LOG4CXX_ERROR(log, "There are not any files related with the given probe ID");
        Finalize();
        exit(0);
    }
}
Here rank zero stops working, I print the error log. But th
Hi Rolf,
thanks for your input. You're right, I missed the coll_tuned_use_dynamic_rules
option.
I'll check whether the segmentation fault disappears when using the basic
linear bcast algorithm with the command line you provided.
Regards,
Eloi
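For reference, the kind of command line being discussed is sketched below (assuming a reasonably recent Open MPI, where value 1 of coll_tuned_bcast_algorithm selects the basic linear broadcast; the application name and process count are placeholders):

```shell
# Force the basic linear bcast instead of the tuned decision logic.
# coll_tuned_use_dynamic_rules must be enabled for the per-collective
# algorithm override to take effect.
mpirun --mca coll_tuned_use_dynamic_rules 1 \
       --mca coll_tuned_bcast_algorithm 1 \
       -np 8 ./my_app
```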
On Tuesday 13 July 2010 20:39:59 Rolf vandeVaar