Did you try these test programs? Or do you have any suggestions for how to overcome this bug?
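In the meantime, here is a small diagnostic anyone can try (a sketch from my side, not one of the attached test programs): it prints the size the Open MPI build assumes for MPI_DOUBLE_PRECISION next to the size the compiler actually uses. Given Rainer's observation below ("Fort dbl prec size: 4"), I would expect the pre-built Windows package to print 4 and 8; such a mismatch would explain the access violation, since the reduction then reads and writes the wrong number of bytes.

      program check_dp_size
      include 'mpif.h'
      integer tsize, ierr
      double precision d
      call MPI_INIT(ierr)
!     Size the Open MPI library was configured with for this type.
      call MPI_TYPE_SIZE(MPI_DOUBLE_PRECISION, tsize, ierr)
      write(*,*) "MPI library size for DOUBLE PRECISION:", tsize
!     Size the compiler really uses; sizeof is an ifort extension
!     (with an F2008 compiler, storage_size(d)/8 works as well).
      write(*,*) "compiler size for DOUBLE PRECISION:", sizeof(d)
      call MPI_FINALIZE(ierr)
      end

If those two numbers disagree, every reduction on MPI_DOUBLE_PRECISION will touch the wrong number of bytes, which matches the forrtl access violation shown below.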
Thank you.
-Hiral

On Fri, May 13, 2011 at 11:20 AM, hi <hiralsmaill...@gmail.com> wrote:
> Hi Rainer,
>
>> Does REAL work for You?
> No.
> I am observing the same errors (see below) even with INTEGER; please find
> the attached test programs with INTEGER and REAL.
>
> C:\test> mpirun mar_f_i.exe
> size= 1 , rank= 0
> start --, rcvbuf= 0 0 0 0 0
> end --, rcvbuf= 2 2 2 2 2
>
> C:\test> mpirun -np 2 mar_f_i.exe
> size= 2 , rank= 0
> start --, rcvbuf= 0 0 0 0 0
> size= 2 , rank= 1
> start --, rcvbuf= 0 0 0 0 0
> forrtl: severe (157): Program Exception - access violation
> Image              PC        Routine            Line        Source
> [vibgyor:12628] [[31763,0],0]-[[31763,1],0] mca_oob_tcp_msg_recv:
> readv failed: Unknown error (108)
> --------------------------------------------------------------------------
> WARNING: A process refused to die!
>
> Host: vibgyor
> PID: 488
>
> This process may still be running and/or consuming resources.
>
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 452 on node vibgyor
> exiting improperly. There are two reasons this could occur:
>
> 1. this process did not call "init" before exiting, but others in the
> job did. This can cause a job to hang indefinitely while it waits for
> all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
>
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
>
>
> Thank you.
> -Hiral
>
>
> On Thu, May 12, 2011 at 9:03 PM, Rainer Keller <kel...@hlrs.de> wrote:
>> Hello Hiral,
>> in the ompi_info You attached, the Fortran size detection did not work
>> correctly (on viscluster -- i.e., that shows that You used the standard
>> installation package):
>> ...
>> Fort dbl prec size: 4
>> ...
>>
>> This most probably does not match Your compiler's setting for DOUBLE
>> PRECISION, which probably considers this to be 8.
>>
>> Does REAL work for You?
>>
>> Shiqing is currently away; I will ask when he returns.
>>
>> With best regards,
>> Rainer
>>
>>
>> On Wednesday 11 May 2011 09:29:03 hi wrote:
>>> Hi Jeff,
>>>
>>> > Can you send the info listed on the help page?
>>>
>>> From the HELP page...
>>>
>>> *** For run-time problems:
>>> 1) Check the FAQ first. Really. This can save you a lot of time; many
>>> common problems and solutions are listed there.
>>> I couldn't find a relevant reference in the FAQ.
>>>
>>> 2) The version of Open MPI that you're using.
>>> I am using the pre-built openmpi-1.5.3 64-bit and 32-bit binaries on
>>> Windows 7. I also tried a locally built openmpi-1.5.2 using the Visual
>>> Studio 2008 32-bit compilers.
>>> I tried various compilers: VS-9 32-bit and VS-10 64-bit, and the
>>> corresponding Intel ifort compiler.
>>>
>>> 3) The config.log file from the top-level Open MPI directory, if
>>> available (please compress!).
>>> Don't have one.
>>>
>>> 4) The output of the "ompi_info --all" command from the node where
>>> you're invoking mpirun.
>>> See the output of the pre-built "openmpi-1.5.3_x64/bin/ompi_info --all"
>>> in the attachments.
>>>
>>> 5) If running on more than one node --
>>> I am running the test program on a single node.
>>>
>>> 6) A detailed description of what is failing.
>>> Already described in this post.
>>>
>>> 7) Please include information about your network:
>>> As I am running the test program locally on a single machine, this
>>> might not be required.
>>>
>>> > You forgot ierr in the call to MPI_Finalize. You also paired
>>> > DOUBLE_PRECISION data with MPI_INTEGER in the call to allreduce. And
>>> > you mixed sndbuf and rcvbuf in the call to allreduce, meaning that
>>> > when you print rcvbuf afterwards, it'll always still be 0.
>>>
>>> As I am not a Fortran programmer, this was my mistake!
>>>
>>> > program Test_MPI
>>> >    use mpi
>>> >    implicit none
>>> >
>>> >    DOUBLE PRECISION rcvbuf(5), sndbuf(5)
>>> >    INTEGER nproc, rank, ierr, n, i
>>> >
>>> >    n = 5
>>> >    do i = 1, n
>>> >       sndbuf(i) = 2.0
>>> >       rcvbuf(i) = 0.0
>>> >    end do
>>> >
>>> >    call MPI_INIT(ierr)
>>> >    call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
>>> >    call MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
>>> >    write(*,*) "size=", nproc, ", rank=", rank
>>> >    write(*,*) "start --, rcvbuf=", rcvbuf
>>> >    CALL MPI_ALLREDUCE(sndbuf, rcvbuf, n, &
>>> >         MPI_DOUBLE_PRECISION, MPI_SUM, MPI_COMM_WORLD, ierr)
>>> >    write(*,*) "end --, rcvbuf=", rcvbuf
>>> >
>>> >    CALL MPI_Finalize(ierr)
>>> > end
>>> >
>>> > (you could use "include 'mpif.h'", too -- I tried both)
>>> >
>>> > This program works fine for me.
>>>
>>> I am observing the same crash as described in this thread (when
>>> executing "mpirun -np 2 mar_f_dp.exe"), even with the above correct and
>>> simple test program. I commented out 'use mpi' as it gave me an "Error
>>> in compiled module file" error, so I used the 'include "mpif.h"'
>>> statement (see attachment).
>>>
>>> It seems to be a Windows-specific issue (I could run this test program
>>> on Linux with openmpi-1.5.1).
>>>
>>> Can anybody try this test program on Windows?
>>>
>>> Thank you in advance.
>>> -Hiral
>>
>> --
>> ----------------------------------------------------------------
>>  Dr.-Ing. Rainer Keller    http://www.hlrs.de/people/keller
>>  HLRS                      Tel: ++49 (0)711-685 6 5858
>>  Nobelstrasse 19           Fax: ++49 (0)711-685 6 5832
>>  70550 Stuttgart           email: kel...@hlrs.de
>>  Germany                   AIM/Skype: rusraink
>
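P.S. The attachments may not show up inline in the archive, so for anyone trying to reproduce: the INTEGER variant (mar_f_i) that produced the output above is essentially the same allreduce test with the buffers and the MPI datatype switched to INTEGER. Roughly like this (a fixed-form sketch from my side, not a verbatim copy of the attachment):

      program test_mpi_int
      include 'mpif.h'
      integer rcvbuf(5), sndbuf(5)
      integer nproc, rank, ierr, n, i
!     Fill the send buffer with 2s; the sum over two ranks should be 4.
      n = 5
      do i = 1, n
         sndbuf(i) = 2
         rcvbuf(i) = 0
      end do
      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)
      write(*,*) "size=", nproc, ", rank=", rank
      write(*,*) "start --, rcvbuf=", rcvbuf
!     INTEGER buffers paired with MPI_INTEGER, per Jeff's correction.
      call MPI_ALLREDUCE(sndbuf, rcvbuf, n,
     &                   MPI_INTEGER, MPI_SUM, MPI_COMM_WORLD, ierr)
      write(*,*) "end --, rcvbuf=", rcvbuf
      call MPI_FINALIZE(ierr)
      end

The single-process run above prints 2s at the end (the sum over one rank); a clean two-process run should print 4s, but on the Windows build it crashes before reaching the final write.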