Hello

I've just added the option to use Open MPI instead of MPICH in
an ocean model. The code is quite well tested and has been run in
various parallel setups by various groups.

I've compiled the program using mpif90 (instead of ifort). When I run
it, I get the error shown at the end of this mail.

As you can see, all 13 processes are started, but then each of them
aborts in MPI_Isend with MPI_ERR_RANK (invalid rank).

One problem with load balancing in ocean models that use domain
decomposition is that equal-sized sub-domains do not carry the same
computational burden (the different sub-domains have different
land fractions). To overcome this, a MATLAB tool has been developed that
allows assigning several sub-domains to one processor/core, based on the
sum of water points in the sub-domains. Attached is a figure showing the
actual setup in this case. The neighbor relation is read from a file
produced by said MATLAB tool. Non-existent neighbors are set to -1,
which is the value of MPI_PROC_NULL in MPICH.
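
One thing that may be worth checking here: the MPI standard leaves the
numeric value of MPI_PROC_NULL to the implementation, so a literal -1 in
the neighbor file is only a valid "no neighbor" rank if the MPI library
in use happens to define it that way. Below is a minimal sketch of
translating the file's sentinel into the library's own constant before
any send or receive. It is written in Python for brevity and is not the
model's actual code. The function name, the example neighbor list and
the value -2 passed as proc_null are illustrative assumptions; in a real
program the value would be taken from the MPI library itself (for
example MPI_PROC_NULL from mpif.h in Fortran).

```python
# Sentinel the decomposition file uses for "no neighbor" (-1, as in the post).
FILE_NO_NEIGHBOR = -1


def remap_neighbors(neighbors, proc_null, nprocs):
    """Replace the file's sentinel with the MPI library's null rank.

    neighbors : ranks as read from the neighbor file (hypothetical input)
    proc_null : value of MPI_PROC_NULL in the MPI library actually in use
    nprocs    : size of the communicator, used to validate real ranks
    """
    remapped = []
    for rank in neighbors:
        if rank == FILE_NO_NEIGHBOR:
            # No neighbor on this side: use the library's own null rank.
            remapped.append(proc_null)
        elif 0 <= rank < nprocs:
            remapped.append(rank)
        else:
            raise ValueError(f"neighbor rank {rank} outside 0..{nprocs - 1}")
    return remapped


# Illustrative values only: -2 stands in for the library's MPI_PROC_NULL.
print(remap_neighbors([3, -1, 5, -1], proc_null=-2, nprocs=13))
# prints [3, -2, 5, -2]
```

Validating the remaining ranks against the communicator size at read
time also makes a bad entry in the neighbor file fail early with a clear
message, rather than later inside MPI_Isend.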

The setup is run on a quad-core machine for testing purposes only.

Any ideas what is going wrong?


====  error ======
kb@gate:~/DK/setups/north_sea_fine$ mpirun -np 13 bin/getm_prod_IFORT.96x96
 Process            0  of           13  is alive on gate
[gate:18564] *** An error occurred in MPI_Isend
[gate:18564] *** on communicator MPI_COMM_WORLD
[gate:18564] *** MPI_ERR_RANK: invalid rank
[gate:18564] *** MPI_ERRORS_ARE_FATAL (goodbye)
 Process            1  of           13  is alive on gate
[gate:18565] *** An error occurred in MPI_Isend
[gate:18565] *** on communicator MPI_COMM_WORLD
[gate:18565] *** MPI_ERR_RANK: invalid rank
[gate:18565] *** MPI_ERRORS_ARE_FATAL (goodbye)
 Process            2  of           13  is alive on gate
 Process            3  of           13  is alive on gate
[gate:18567] *** An error occurred in MPI_Isend
[gate:18567] *** on communicator MPI_COMM_WORLD
[gate:18567] *** MPI_ERR_RANK: invalid rank
[gate:18567] *** MPI_ERRORS_ARE_FATAL (goodbye)
 Process            4  of           13  is alive on gate
[gate:18568] *** An error occurred in MPI_Isend
[gate:18568] *** on communicator MPI_COMM_WORLD
[gate:18568] *** MPI_ERR_RANK: invalid rank
[gate:18568] *** MPI_ERRORS_ARE_FATAL (goodbye)
 Process            5  of           13  is alive on gate
[gate:18569] *** An error occurred in MPI_Isend
[gate:18569] *** on communicator MPI_COMM_WORLD
[gate:18569] *** MPI_ERR_RANK: invalid rank
[gate:18569] *** MPI_ERRORS_ARE_FATAL (goodbye)
 Process            7  of           13  is alive on gate
[gate:18571] *** An error occurred in MPI_Isend
[gate:18571] *** on communicator MPI_COMM_WORLD
[gate:18571] *** MPI_ERR_RANK: invalid rank
[gate:18571] *** MPI_ERRORS_ARE_FATAL (goodbye)
 Process            8  of           13  is alive on gate
 Process            9  of           13  is alive on gate
[gate:18573] *** An error occurred in MPI_Isend
[gate:18573] *** on communicator MPI_COMM_WORLD
[gate:18573] *** MPI_ERR_RANK: invalid rank
[gate:18573] *** MPI_ERRORS_ARE_FATAL (goodbye)
 Process           10  of           13  is alive on gate
[gate:18574] *** An error occurred in MPI_Isend
[gate:18574] *** on communicator MPI_COMM_WORLD
[gate:18574] *** MPI_ERR_RANK: invalid rank
[gate:18574] *** MPI_ERRORS_ARE_FATAL (goodbye)
 Process           11  of           13  is alive on gate
 Process           12  of           13  is alive on gate
[gate:18576] *** An error occurred in MPI_Isend
[gate:18576] *** on communicator MPI_COMM_WORLD
[gate:18576] *** MPI_ERR_RANK: invalid rank
[gate:18576] *** MPI_ERRORS_ARE_FATAL (goodbye)
[gate:18566] *** An error occurred in MPI_Isend
[gate:18566] *** on communicator MPI_COMM_WORLD
[gate:18566] *** MPI_ERR_RANK: invalid rank
[gate:18566] *** MPI_ERRORS_ARE_FATAL (goodbye)
[gate:18572] *** An error occurred in MPI_Isend
[gate:18572] *** on communicator MPI_COMM_WORLD
[gate:18572] *** MPI_ERR_RANK: invalid rank
[gate:18572] *** MPI_ERRORS_ARE_FATAL (goodbye)
[gate:18575] *** An error occurred in MPI_Isend
[gate:18575] *** on communicator MPI_COMM_WORLD
[gate:18575] *** MPI_ERR_RANK: invalid rank
[gate:18575] *** MPI_ERRORS_ARE_FATAL (goodbye)
 Process            6  of           13  is alive on gate
[gate:18570] *** An error occurred in MPI_Isend
[gate:18570] *** on communicator MPI_COMM_WORLD
[gate:18570] *** MPI_ERR_RANK: invalid rank
[gate:18570] *** MPI_ERRORS_ARE_FATAL (goodbye)
[gate:18561] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
[gate:18561] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1166



-- 
----------------------------------------------------------------------
Karsten Bolding                    Bolding & Burchard Hydrodynamics
Strandgyden 25                     Phone: +45 64422058
DK-5466 Asperup                    Fax:   +45 64422068
Denmark                            Email: kars...@bolding-burchard.com

http://www.findvej.dk/Strandgyden25,5466,11,3
----------------------------------------------------------------------
