Dear Jody,

WRF runs well with the serial option (i.e. a single process). I am also running
another application, HRM, using OpenMPI; there is no issue with that, and the
application runs on a cluster of many nodes. The WRF manual says the
following about an MPI run:

If you have run the model on multiple processors using MPI, you should have
a number of rsl.out.* and rsl.error.* files. Type 'tail rsl.out.0000' to see
if you get 'SUCCESS COMPLETE WRF'. This is a good indication that the model
has run successfully.

Take a look at either the rsl.out.0000 file or another standard-out file. This
file logs the time taken to compute one model time step, and to write
one history and restart output:

Timing for main: time 2006-01-21_23:55:00 on domain  2:    4.91110 elapsed seconds.
Timing for main: time 2006-01-21_23:56:00 on domain  2:    4.73350 elapsed seconds.
Timing for main: time 2006-01-21_23:57:00 on domain  2:    4.72360 elapsed seconds.
Timing for main: time 2006-01-21_23:57:00 on domain  1:   19.55880 elapsed seconds.

and

Timing for Writing wrfout_d02_2006-01-22_00:00:00 for domain 2: 1.17970 elapsed seconds.
Timing for main: time 2006-01-22_00:00:00 on domain 1: 27.66230 elapsed seconds.
Timing for Writing wrfout_d01_2006-01-22_00:00:00 for domain 1: 0.60250 elapsed seconds.


If the model did not run to completion, take a look at these standard
output/error files too. If the model has become numerically unstable, it may
have violated the CFL criterion (for numerical stability). Check whether
this is true by typing the following:


grep cfl rsl.error.*   or   grep cfl wrf.out

You might see something like this:

5 points exceeded cfl=2 in domain            1 at time   4.200000
  MAX AT i,j,k:          123          48          3 cfl,w,d(eta)= 4.165821
21 points exceeded cfl=2 in domain            1 at time   4.200000
  MAX AT i,j,k:          123          49          4 cfl,w,d(eta)= 10.66290
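(As a side note, to check all ranks at once instead of only rank 0, something
along these lines should work; it is just standard grep applied to the file
patterns the manual mentions above:

  grep -l 'SUCCESS COMPLETE WRF' rsl.out.*   # list the ranks that finished cleanly
  grep cfl rsl.error.*                       # show CFL violations across all ranks
)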

But when I check the rsl.out.* and rsl.error.* files there is no indication of
any error having occurred. It seems that the application just didn't start.
[pmdtest@pmd02 em_real]$ tail rsl.out.0000
 WRF NUMBER OF TILES FROM OMP_GET_MAX_THREADS =   8
 WRF TILE   1 IS      1 IE    360 JS      1 JE     25
 WRF TILE   2 IS      1 IE    360 JS     26 JE     50
 WRF TILE   3 IS      1 IE    360 JS     51 JE     74
 WRF TILE   4 IS      1 IE    360 JS     75 JE     98
 WRF TILE   5 IS      1 IE    360 JS     99 JE    122
 WRF TILE   6 IS      1 IE    360 JS    123 JE    146
 WRF TILE   7 IS      1 IE    360 JS    147 JE    170
 WRF TILE   8 IS      1 IE    360 JS    171 JE    195
 WRF NUMBER OF TILES =   8
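
(For reference, the launch I am attempting is the usual mpirun invocation; the
process count and hostfile below are placeholders rather than my exact settings:

  mpirun -np 8 --hostfile hosts ./wrf.exe
  ls rsl.out.*     # with a real MPI run I would expect one rsl.out.NNNN file per rank
)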



Best Regards,
-- 
Syed Ahsan Ali Bokhari
Electronic Engineer (EE)

Research & Development Division
Pakistan Meteorological Department H-8/4, Islamabad.
Phone # off  +92518358714
Cell # +923155145014
------------------------------------------------------------------------------------------------------------------------


> Hi
> At first glance I would say this is not an OpenMPI problem,
> but a wrf problem (though I must admit I have no knowledge whatsoever with
> wrf)
>
> Have you tried running a single instance of wrf.exe?
> Have you tried to run a simple application (like a "hello world") on your
> nodes?
>
> Jody
>
>
>
