Daniel,
Many papers have been published on the performance modeling of different
collective communication algorithms (and fortunately these models are
implementation independent). I can point you to our research in
collective modeling, which is the underlying infrastructure behind the
decisi
Thanks.
I'll check.
With Blessings, always,
Jerry Mersel
System Administrator
IT Infrastructure Branch | Division of Information Systems
Weizmann Institute of Science
Rehovot 76100, Israel
Tel: +972-8-9342363
Thank you for your response.
I will investigate further.
With Blessings, always,
Jerry Mersel
Unfortunately, there is no way to share memory across nodes. Running out of
memory as you describe can be due to several factors, most typically:
* a memory leak in the application, or the application simply growing too
big for the environment
* one rank running slow, causing it to buil
I've seen several suggestions for "home-brew" systems, usually modifying
the paging mechanism. However, there is one commercial solution I have
seen advertised: https://numascale.com/index.html
I've never used them and have no idea whether they are any good, or as good
as they claim; you'll have to d
Hi all:
I am running Open MPI 1.6.5 with a job that is memory intensive.
The job runs on 7 hosts, using 16 cores on each. On one of the hosts the
memory is exhausted, so the kernel starts to kill the processes.
It could be that there is plenty of free memory on one of the other hosts.
Is
Best guess is you are seeing a race condition. If a proc immediately fails,
we will respond by aborting the launch of any other local processes as we
are going to kill the entire job. So if I get several of them started
before the first one aborts, then any remaining ones will never get
spawned, an
Tip: INTEL-Ftn-compiler problems can be communicated to INTEL there:
https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x
Greetings
Michael Rachner
From: users [mailto:users-boun...@open-mpi.org] On behalf of John Bray
Sent: Tuesday, 18 November 2014 11:0
The original problem used a separate file and not a module. It's clearly a
bizarre Intel bug; I am only continuing to pursue it here because I'm curious
as to why the segfault messages disappear at higher process counts.
John
On 18 November 2014 09:58, wrote:
> It may possibly be a bug in Intel-15.0
It may possibly be a bug in Intel-15.0.
I suspect it has to do with the contains-block, and with the fact that you
call an intrinsic subroutine in that contains-block.
Normally this should work. You may try to separate the influence of the two:
What happens with these 3 variants of your code:
variant a
A delightful bug, this: you get a segfault if your code contains a
random_number call and is compiled with -fopenmp, EVEN IF YOU CANNOT CALL
IT!
program fred
use mpi
integer :: ierr
call mpi_init(ierr)
print *,"hello"
call mpi_finalize(ierr)
contains
subroutine sub
real :: a(10)
call random_number(a)  ! never called, but its mere presence triggers the segfault
end subroutine sub
end program fred