Hi,
setting the eager limit to such a drastically high value will cause gigantic
memory consumption for unexpected messages. Any message you send that does not
have a pre-posted receive will malloc 150 MB of temporary storage, and will be
memcopied from that intern
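For anyone following along: the eager limit is an ordinary MCA parameter, so
you can check the built-in default and override it per run instead of baking
a huge value into your setup. A rough sketch, assuming the TCP BTL, a made-up
64 KB value, and a placeholder executable name:

  # show the TCP BTL's compiled-in eager limit
  ompi_info --param btl tcp | grep eager_limit

  # override it for a single run
  mpirun --mca btl_tcp_eager_limit 65536 -np 4 ./my_app

Keeping the limit modest and making sure receives are posted before the
matching sends arrive avoids the unexpected-message buffering described above.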
2010-03-05
马少杰
Dear Sir:
I want to use Open MPI and BLCR to checkpoint. However, I want to restart the
checkpoint on other hosts. For example, I run an MPI program with Open MPI on
host1 and host2, and I save the checkpoint file at an NFS-shared path.
Then I want to restart the job (ompi-res
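For reference, with a checkpoint/restart-enabled build (the configure line is
quoted just below) the cycle usually looks roughly like this. This is only a
sketch: the application name is a placeholder, the snapshot reference shown is
the default naming scheme, and how much control you get over host placement at
restart time varies by version (see ompi-restart --help):

  # start the job with the checkpoint/restart framework enabled
  mpirun -am ft-enable-cr -np 4 -machinefile ./machinefile ./my_app &
  MPIRUN_PID=$!

  # take a checkpoint; ompi-checkpoint prints the global snapshot reference
  ompi-checkpoint $MPIRUN_PID

  # later, restart from that snapshot; keeping it on the NFS-shared path
  # lets whichever hosts run the restart read it
  ompi-restart ompi_global_snapshot_${MPIRUN_PID}.ckpt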
Dear Sir:
- What version of Open MPI are you using?
my version is 1.3.4
- What configure options are you using?
./configure --with-ft=cr --enable-mpi-threads --enable-ft-thread
--with-blcr=$dir --with-blcr-libdir=/$dir/lib
--prefix=/public/mpi/openmpi134-gnu-cr --enable-mpirun-prefix-by-default
Hi,
Thank you for that information.
For the moment, I haven't encountered those problems yet. Maybe because my
program doesn't use much memory (100 MB) and the master machine has huge RAM
(8 GB).
So for now, the solution seems to be the parameter "btl_tcp_eager_limit",
but a cleaner solution is ve
On Mar 5, 2010, at 3:15 AM, 马少杰 wrote:
> Dear Sir:
> - What version of Open MPI are you using?
> my version is 1.3.4
> - What configure options are you using?
> ./configure --with-ft=cr --enable-mpi-threads --enable-ft-thread
> --with-blcr=$dir --with-blcr-libdir=/$dir/lib
> --prefix=/public/m
Hello,
Thanks for the comments. Indeed, until yesterday, I didn't realise the
difference between MVAPICH, MVAPICH2 and Open-MPI.
This problem has now moved from MVAPICH2 to Open MPI, however, because I now
realise that the production environment uses Open MPI, which means my solution
for mvapi
How are you trying to start this external program? With an MPI_Comm_spawn? Or
are you just fork/exec'ing it?
How are you waiting for this external program to finish?
On Mar 5, 2010, at 7:52 AM, abc def wrote:
> Hello,
>
> Thanks for the comments. Indeed, until yesterday, I didn't realise the
This type of failure is usually due to prelink'ing being left enabled
on one or more of the systems. This has come up multiple times on the
Open MPI list, but is actually a problem between BLCR and the Linux
kernel. BLCR has a FAQ entry on this that you will want to check out:
https://upc-
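The usual fix described there, assuming a Red Hat-style /etc/sysconfig/prelink
(paraphrased from memory of that FAQ, so check the entry itself), is to turn
prelinking off and undo what has already been done, as root on every node:

  # stop future prelinking passes
  sed -i 's/^PRELINKING=yes/PRELINKING=no/' /etc/sysconfig/prelink

  # undo the prelinking already applied to installed libraries and binaries
  /usr/sbin/prelink --undo --all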
Hello,
From within the MPI Fortran program I run the following command:
CALL SYSTEM("cd " // TRIM(dir) // " ; mpirun -machinefile ./machinefile -np 1
/home01/group/Execute/DLPOLY.X > job.out 2> job.err ; cd - > /dev/null")
where "dir" is a process-number-dependent directory, to ensure the proc
On Mar 5, 2010, at 8:52 AM, abc def wrote:
> Hello,
> From within the MPI fortran program I run the following command:
>
> CALL SYSTEM("cd " // TRIM(dir) // " ; mpirun -machinefile ./machinefile -np 1
> /home01/group/Execute/DLPOLY.X > job.out 2> job.err ; cd - > /dev/null")
That is guaranteed
On Mar 5, 2010, at 2:38 PM, Ralph Castain wrote:
>> CALL SYSTEM("cd " // TRIM(dir) // " ; mpirun -machinefile ./machinefile -np
>> 1 /home01/group/Execute/DLPOLY.X > job.out 2> job.err ; cd - > /dev/null")
>
> That is guaranteed not to work. The problem is that mpirun sets environmental
> varia
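To make the quoted explanation concrete: the outer mpirun exports a set of
OMPI_* variables, and the nested mpirun started by CALL SYSTEM inherits and
trips over them. One thing to try, given that explanation, is to launch the
inner job through a small wrapper that scrubs those variables first. This is
only a sketch under my own assumptions (the wrapper name is made up, and
clearing OMPI_* may still not be enough on every version):

  #!/bin/sh
  # run_dlpoly.sh <dir> -- hypothetical wrapper, invoked from the Fortran
  # code as CALL SYSTEM("run_dlpoly.sh " // TRIM(dir)) instead of building
  # the long mpirun command line inline.

  # drop every variable the outer mpirun exported into our environment
  for v in $(env | grep '^OMPI_' | cut -d= -f1); do
    unset "$v"
  done

  cd "$1" || exit 1
  exec mpirun -machinefile ./machinefile -np 1 \
      /home01/group/Execute/DLPOLY.X > job.out 2> job.err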