Hello Gus, Jody

The system has enough memory. I set the stack size to unlimited before
running WRF with the command *ulimit -s unlimited*, but the problem still
occurred.
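
One check I still plan to make, in case it matters: I understand that a
ulimit set in the login shell does not always reach the processes mpirun
starts on other nodes, so the limit each rank actually sees can be printed
with something like this (assuming bash is available on the nodes):

[pmdtest@pmd02 em_real]$ mpirun -np 4 bash -c 'ulimit -s'

If any rank reports a finite value instead of *unlimited*, I will set the
limit in ~/.bashrc or /etc/security/limits.conf on every node.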
Thanks

> Hi Ahsan, Jody
>
> Just a guess that this may be a stack size problem.
> Did you try to run WRF with unlimited stack size?
> Also, does your machine have enough memory to run WRF?
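>
> A rough way to check the memory (assuming Linux nodes) is to watch
> *free -m* on each node while wrf.exe is running, e.g.:
>
>    ssh <node> free -m
>
> where <node> stands for each compute node's name. If the available
> memory drops to near zero during the run, that alone can kill the job.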
>
> I hope this helps,
> Gus Correa
>
>
> jody wrote:
> > Hi
> > At first glance I would say this is not an OpenMPI problem but a WRF
> > problem (though I must admit I have no knowledge whatsoever of WRF).
> >
> > Have you tried running a single instance of wrf.exe?
> > Have you tried to run a simple application (like a "hello world") on
> > your nodes?
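> >
> > If you need a small test program, the Open MPI source tarball ships one
> > (examples/hello_c.c - assuming you still have the source tree around):
> >
> >    mpicc examples/hello_c.c -o hello_c
> >    mpirun -np 4 ./hello_c
> >
> > It should print one line per rank; if this also segfaults, the problem
> > is in the MPI setup rather than in wrf.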
> >
> > Jody
> >
> >
> > On Tue, Feb 22, 2011 at 7:37 AM, Ahsan Ali <ahsansha...@gmail.com> wrote:
> >> Hello,
> >> I am stuck in a problem regarding running the Weather Research and
> >> Forecasting Model (WRFV 3.2.1). I get the following error while running
> >> with mpirun. Any help would be highly appreciated.
> >>
> >> [pmdtest@pmd02 em_real]$ mpirun -np 4 wrf.exe
> >> starting wrf task 0 of 4
> >> starting wrf task 1 of 4
> >> starting wrf task 3 of 4
> >> starting wrf task 2 of 4
> >>
> >> --------------------------------------------------------------------------
> >> mpirun noticed that process rank 3 with PID 6044 on node
> >> pmd02.pakmet.com exited on signal 11 (Segmentation fault).
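> >>
> >> If it helps, I can also try for a backtrace: I believe enabling core
> >> dumps with *ulimit -c unlimited* before mpirun, then opening the core
> >> file with something like *gdb ./wrf.exe core* (the core file name may
> >> differ on this system), should show where the crash happens.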
> >>
> >>
> >>
> >> --
> >> Syed Ahsan Ali Bokhari
> >> Electronic Engineer (EE)
> >> Research & Development Division
> >> Pakistan Meteorological Department H-8/4, Islamabad.
> >> Phone # off  +92518358714
> >> Cell # +923155145014
> >>
> >>
> Dear Jody,
>
> WRF runs well with the serial option (i.e., a single instance). I am
> running another application, HRM, using OpenMPI; there is no issue with
> it, and that application runs on a cluster of many nodes. The WRF manual
> says the following about MPI runs:
>
> *If you have run the model on multiple processors using MPI, you should
> have a number of rsl.out.* and rsl.error.* files. Type "tail rsl.out.0000"
> to see if you get "SUCCESS COMPLETE WRF". This is a good indication that
> the model has run successfully.*
>
> *Take a look at the rsl.out.0000 file or another standard-out file. This
> file logs the time taken to compute one model time step, and to write one
> history and restart output:*
>
> *Timing for main: time 2006-01-21_23:55:00 on domain  2:    4.91110 elapsed seconds.*
>
> *Timing for main: time 2006-01-21_23:56:00 on domain  2:    4.73350 elapsed seconds.*
>
> *Timing for main: time 2006-01-21_23:57:00 on domain  2:    4.72360 elapsed seconds.*
>
> *Timing for main: time 2006-01-21_23:57:00 on domain  1:   19.55880 elapsed seconds.*
>
> *and*
>
> *Timing for Writing wrfout_d02_2006-01-22_00:00:00 for domain 2: 1.17970 elapsed seconds.*
>
> *Timing for main: time 2006-01-22_00:00:00 on domain 1: 27.66230 elapsed seconds.*
>
> *Timing for Writing wrfout_d01_2006-01-22_00:00:00 for domain 1: 0.60250 elapsed seconds.*
>
>
> *If the model did not run to completion, take a look at these standard
> output/error files too. If the model has become numerically unstable, it
> may have violated the CFL criterion (for numerical stability). Check
> whether this is true by typing the following:*
>
>
> *grep cfl rsl.error.* or grep cfl wrf.out*
>
> *you might see something like these:*
>
> *5 points exceeded cfl=2 in domain            1 at time   4.200000 *
>
> *  MAX AT i,j,k:          123          48          3 cfl,w,d(eta)=   4.165821*
>
> *21 points exceeded cfl=2 in domain            1 at time   4.200000 *
>
> *  MAX AT i,j,k:          123          49          4 cfl,w,d(eta)=   10.66290*
>
> But when I check the rsl.out* or rsl.error* files, there is no indication
> that any error occurred. It seems that the application just didn't start.
> [pmdtest@pmd02 em_real]$ tail rsl.out.0000
>  WRF NUMBER OF TILES FROM OMP_GET_MAX_THREADS =   8
>  WRF TILE   1 IS      1 IE    360 JS      1 JE     25
>  WRF TILE   2 IS      1 IE    360 JS     26 JE     50
>  WRF TILE   3 IS      1 IE    360 JS     51 JE     74
>  WRF TILE   4 IS      1 IE    360 JS     75 JE     98
>  WRF TILE   5 IS      1 IE    360 JS     99 JE    122
>  WRF TILE   6 IS      1 IE    360 JS    123 JE    146
>  WRF TILE   7 IS      1 IE    360 JS    147 JE    170
>  WRF TILE   8 IS      1 IE    360 JS    171 JE    195
>  WRF NUMBER OF TILES =   8
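>
> One more thought, though I am not sure it applies: the log shows
> OMP_GET_MAX_THREADS = 8, so this build apparently starts OpenMP threads
> inside each MPI rank. As far as I know, *ulimit -s* only applies to the
> master thread, while the worker threads take their stack size from the
> OMP_STACKSIZE environment variable. So something like
>
>    export OMP_STACKSIZE=512M
>    mpirun -np 4 -x OMP_STACKSIZE wrf.exe
>
> might be worth a try (the 512M value is only a guess).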
>
>
>
Syed Ahsan Ali Bokhari
Electronic Engineer (EE)
Research & Development Division
Pakistan Meteorological Department H-8/4, Islamabad.
Phone # off  +92518358714
Cell # +923155145014
