Hi Yuping,

Using multiple threads within a socket and MPI across sockets may be a
better choice for such a NUMA platform.

Threads can exploit the benefit of shared memory inside a socket, while
MPI can alleviate the cost of non-uniform memory access between sockets.
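
If your application supports it, the general pattern looks like the
minimal MPI+OpenMP sketch below. This is illustrative only, not FUN3D
code, and the rank/thread counts are assumptions for a 4-socket,
16-cores-per-socket box:

    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        /* MPI_THREAD_FUNNELED is enough when only the main thread
           makes MPI calls; the OpenMP threads just do compute. */
        int provided;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* One MPI rank per socket; OMP_NUM_THREADS threads fill the
           cores of that socket and share its local memory. */
        #pragma omp parallel
        printf("rank %d: thread %d of %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads());

        MPI_Finalize();
        return 0;
    }

It would be built and launched with something like this (option names
are from the Open MPI 1.6 series, which you are running):

    mpicc -fopenmp hybrid.c -o hybrid
    mpiexec -np 4 --npersocket 1 --bind-to-socket \
            -x OMP_NUM_THREADS=16 ./hybrid

That way each rank stays on its own socket, and its threads mostly touch
memory local to that socket.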


regards,
Zehan




On Tue, Jun 17, 2014 at 6:19 AM, Yuping Sun <yupingpaula...@att.net> wrote:

> Dear All:
>
> I bought a 64-core workstation and installed NASA FUN3D with Open MPI
> 1.6.5. Then I started test runs of FUN3D using 16, 32, and 48 cores.
> However, the performance of the FUN3D runs is poor. I got the data below:
>
> The run command is (for 32 cores, as an example):
> mpiexec -np 32 --bysocket --bind-to-socket
> ~ysun/Codes/NASA/fun3d-12.3-66687/Mpi/FUN3D_90/nodet_mpi
> --time_timestep_loop --animation_freq -1 > screen.dump_bs30
>
> Cores    Time    Iterations    Time/iteration
> 60       678s    30            22.61s
> 48       702s    30            23.40s
> 32       734s    30            24.50s
> 16       894s    30            29.80s
>
> You can see that with 60 cores, FUN3D completes 30 iterations in 678
> seconds, roughly 22.61 seconds per iteration.
>
> With 16 cores, it completes 30 iterations in 894 seconds, roughly 29.8
> seconds per iteration.
>
> The data above show that the FUN3D run under mpirun does not scale at
> all! I used to run FUN3D with mpirun on an 8-core workstation, and it
> scaled well. The same job also scales well on a Linux cluster.
>
> Could you give me some advice on reducing the performance loss as I use
> more cores, or on the proper mpirun options to get linear scaling when
> going from 16 to 32 to 48 cores?
>
> Thank you.
>
> Yuping
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/06/24654.php
>
