Hi everyone, may I have your help with a strange problem? High-performance computing is new to me, and I don't know much about OpenMPI or about OpenFOAM (OF), which uses the "mpirun" command. I have to support the OF application in my company and have been trying to track down this problem for about a week. The versions are openmpi-1.3.2 and OF 2.0.1, running on openSUSE 11.3 x86_64. The computer is brand new, has 96 GB RAM and 12 CPUs, and was installed with Linux a few weeks ago. I installed OF 2.0.1 according to the vendor's instructions at http://www.openfoam.org/archive/2.0.1/download/suse.php.
Here is the problem: an experienced user tested OF with a test case from one of the vendor's tutorials. He only used the computing power of his local machine "caelde04"; no other computers were accessed by mpirun. He found no problem when testing in single-processor mode, but in multiprocessor mode his calculation hangs as soon as he distributes it to more than 2 CPUs. The OF vendor thinks this is somehow an OpenMPI problem, which is why I am asking for help on this forum. I attached two files: the "decomposeParDict" that resides in the "system" subdirectory of his test case, and the log of the "decomposePar" command followed by the mpirun command "mpirun -np 9 interFoam -parallel" (the run hangs after the last line shown in that log). Do you have an idea where the problem is, or how I can narrow it down? Thanks a lot for any help.

Andre
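(One idea I had for narrowing it down, in case it is useful: check whether mpirun alone can start nine local processes, independent of OF. As far as I understand, something like

testuser@caelde04:~> mpirun -np 9 hostname

should simply print "caelde04" nine times if process startup itself works. Please correct me if that test is not meaningful here.)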
testuser@caelde04:~> cd /home/testuser/OpenFOAM/testuser-2.0.1/nozzleFlow2D
testuser@caelde04:~/OpenFOAM/testuser-2.0.1/nozzleFlow2D> . /opt/openfoam201/etc/bashrc
testuser@caelde04:~/OpenFOAM/testuser-2.0.1/nozzleFlow2D> which mp
mp2bug        mpeg_encode   mpiCC       mpicxx      mpif77-vt   mpirunDebug          mplex
mpartition    mperfmon      mpicc-vt    mpicxx-vt   mpif90      mpi-selector         mprof-decoder
mpeg2enc80    mpic++        mpiCC-vt    mpiexec     mpif90-vt   mpi-selector-menu    mprof-heap-viewer
mpeg2_encode  mpicc         mpic++-vt   mpif77      mpirun      mplayerthumbsconfig  mpt-status
testuser@caelde04:~/OpenFOAM/testuser-2.0.1/nozzleFlow2D> which mpirun
/usr/lib64/mpi/gcc/openmpi/bin/mpirun
testuser@caelde04:~/OpenFOAM/testuser-2.0.1/nozzleFlow2D> decomposePar
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.0.1                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.com                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 2.0.1-51f1de99a4bc
Exec   : decomposePar
Date   : Jan 06 2012
Time   : 11:00:11
Host   : caelde04
PID    : 6988
Case   : /home/testuser/OpenFOAM/testuser-2.0.1/nozzleFlow2D
nProcs : 1
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster
allowSystemOperations : Disallowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Time = 0
Create mesh

Calculating distribution of cells
Selecting decompositionMethod hierarchical

Finished decomposition in 0.01 s

Calculating original mesh data

Distributing cells to processors

Distributing faces to processors

Distributing points to processors

Constructing processor meshes

Processor 0
    Number of cells = 833
    Number of faces shared with processor 1 = 86
    Number of processor patches = 1
    Number of processor faces = 86
    Number of boundary faces = 1693

Processor 1
    Number of cells = 833
    Number of faces shared with processor 0 = 86
    Number of faces shared with processor 2 = 107
    Number of processor patches = 2
    Number of processor faces = 193
    Number of boundary faces = 1699

Processor 2
    Number of cells = 833
    Number of faces shared with processor 1 = 107
    Number of faces shared with processor 3 = 112
    Number of processor patches = 2
    Number of processor faces = 219
    Number of boundary faces = 1683

Processor 3
    Number of cells = 833
    Number of faces shared with processor 2 = 112
    Number of faces shared with processor 4 = 110
    Number of processor patches = 2
    Number of processor faces = 222
    Number of boundary faces = 1680

Processor 4
    Number of cells = 833
    Number of faces shared with processor 3 = 110
    Number of faces shared with processor 5 = 106
    Number of processor patches = 2
    Number of processor faces = 216
    Number of boundary faces = 1682

Processor 5
    Number of cells = 833
    Number of faces shared with processor 4 = 106
    Number of faces shared with processor 6 = 97
    Number of processor patches = 2
    Number of processor faces = 203
    Number of boundary faces = 1687

Processor 6
    Number of cells = 833
    Number of faces shared with processor 5 = 97
    Number of faces shared with processor 7 = 86
    Number of processor patches = 2
    Number of processor faces = 183
    Number of boundary faces = 1693

Processor 7
    Number of cells = 833
    Number of faces shared with processor 6 = 86
    Number of faces shared with processor 8 = 78
    Number of processor patches = 2
    Number of processor faces = 164
    Number of boundary faces = 1688

Processor 8
    Number of cells = 836
    Number of faces shared with processor 7 = 78
    Number of processor patches = 1
    Number of processor faces = 78
    Number of boundary faces = 1770

Number of processor faces = 782
Max number of cells = 836 (0.32% above average 833.33333)
Max number of processor patches = 2 (12.5% above average 1.7777778)
Max number of faces between processors = 222 (27.749361% above average 173.77778)

Processor 0: field transfer
Processor 1: field transfer
Processor 2: field transfer
Processor 3: field transfer
Processor 4: field transfer
Processor 5: field transfer
Processor 6: field transfer
Processor 7: field transfer
Processor 8: field transfer

End.

testuser@caelde04:~/OpenFOAM/testuser-2.0.1/nozzleFlow2D> mpirun -np 9 interFoam -parallel
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.0.1                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.com                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 2.0.1-51f1de99a4bc
Exec   : interFoam -parallel
Date   : Jan 06 2012
Time   : 11:00:53
Host   : caelde04
PID    : 6995
Case   : /home/testuser/OpenFOAM/testuser-2.0.1/nozzleFlow2D
nProcs : 9
Slaves :
8
(
caelde04.6996
caelde04.6997
caelde04.6998
caelde04.6999
caelde04.7000
caelde04.7001
caelde04.7002
caelde04.7003
)

Pstream initialized with:
    floatTransfer    : 0
    nProcsSimpleSum  : 0
    commsType        : nonBlocking
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster
allowSystemOperations : Disallowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Create mesh for time = 0

PIMPLE: Operating solver in PISO mode

Reading field p_rgh

Reading field alpha1

Reading field U

Reading/calculating face flux field phi

Reading transportProperties

Selecting incompressible transport model Newtonian
Selecting incompressible transport model Newtonian
Selecting turbulence model type LESModel
Selecting LES turbulence model oneEqEddy
--> FOAM Warning :
    From function cubeRootVolDelta::calcDelta()
    in file cubeRootVolDelta/cubeRootVolDelta.C at line 52
    Case is 2D, LES is not strictly applicable
oneEqEddyCoeffs
{
    ce              1.048;
    ck              0.094;
}

Reading g
Calculating field g.h
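To separate OpenFOAM from OpenMPI, I was also thinking of compiling a tiny MPI test with the same toolchain and running it with 9 ranks. Below is just a standard MPI ring test I put together myself (the file name test_ring.c is my own choice, nothing from OF); if this also hangs with -np above 2, the problem should be in the MPI layer itself, not in interFoam:

/* test_ring.c - minimal MPI ring test: rank 0 sends a token to the next
 * rank, every rank forwards it, and the token comes back to rank 0.
 * If this hangs with -np > 2, the problem is in the MPI layer. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, token;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int next = (rank + 1) % size;            /* neighbour to send to */
    int prev = (rank + size - 1) % size;     /* neighbour to receive from */

    if (rank == 0) {
        /* Start the token and wait for it to come around the ring. */
        token = 42;
        MPI_Send(&token, 1, MPI_INT, next, 0, MPI_COMM_WORLD);
        MPI_Recv(&token, 1, MPI_INT, prev, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("token returned to rank 0, ring of %d ranks OK\n", size);
    } else {
        /* Receive from the previous rank, then pass the token on. */
        MPI_Recv(&token, 1, MPI_INT, prev, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        MPI_Send(&token, 1, MPI_INT, next, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}

I would build and run it with the same wrappers found above, i.e. "mpicc test_ring.c -o test_ring" and then "mpirun -np 9 ./test_ring". Does that sound like a sensible way to isolate the problem?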
/*--------------------------------*- C++ -*----------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.0.1                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.com                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    note        "mesh decomposition control dictionary";
    object      decomposeParDict;
}

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

numberOfSubdomains 9;

//- Keep owner and neighbour on same processor for faces in zones:
// preserveFaceZones (heater solid1 solid3);

//- Keep owner and neighbour on same processor for faces in patches:
//  (makes sense only for cyclic patches)
//preservePatches (cyclic_half0 cyclic_half1);

//- Use the volScalarField named here as a weight for each cell in the
//  decomposition. For example, use a particle population field to decompose
//  for a balanced number of particles in a lagrangian simulation.
// weightField dsmcRhoNMean;

// method          scotch;
method          hierarchical;
// method          simple;
// method          metis;
// method          manual;
// method          multiLevel;
// method          structured;  // does 2D decomposition of structured mesh

multiLevelCoeffs
{
    // Decomposition methods to apply in turn. This is like hierarchical but
    // fully general - every method can be used at every level.

    level0
    {
        numberOfSubdomains  64;
        //method simple;
        //simpleCoeffs
        //{
        //    n           (2 1 1);
        //    delta       0.001;
        //}
        method scotch;
    }
    level1
    {
        numberOfSubdomains  4;
        method scotch;
    }
}

// Desired output

simpleCoeffs
{
    n               (2 1 1);
    delta           0.001;
}

hierarchicalCoeffs
{
    n               (1 9 1);
    delta           0.001;
    order           xyz;
}

metisCoeffs
{
    processorWeights
    (
        1
        1
        1
        1
        1
        1
        1
        1
        1
    );
}

scotchCoeffs
{
    //processorWeights
    //(
    //    1
    //    1
    //    1
    //    1
    //);
    //writeGraph  true;
    //strategy "b";
}

manualCoeffs
{
    dataFile    "decompositionData";
}

structuredCoeffs
{
    // Patches to do 2D decomposition on. Structured mesh only; cells have
    // to be in 'columns' on top of patches.
    patches     (bottomPatch);
}

//// Is the case distributed
//distributed     yes;

//// Per slave (so nProcs-1 entries) the directory above the case.
//roots
//(
//    "/tmp"
//    "/tmp"
//);

// ************************************************************************* //
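One more thing, to check my understanding of this dictionary: numberOfSubdomains must match the -np given to mpirun, and for the hierarchical method the product of the components of n must equal numberOfSubdomains. Since the hang only appears above 2 CPUs, I would vary the processor count to find where it starts, changing only these two entries, e.g. for a 4-processor test (my own example values):

numberOfSubdomains 4;

hierarchicalCoeffs
{
    n               (1 4 1);    // product of components must equal numberOfSubdomains
    delta           0.001;
    order           xyz;
}

and then rerun decomposePar followed by "mpirun -np 4 interFoam -parallel". Is that the right way to vary the processor count, or am I missing something?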