Hello,

I have been running some simple benchmarks and saw some strange behaviour.
All tests were run on 4 nodes with 24 cores each (96 MPI processes in total).

When I run MPI_Allreduce(), the run time spikes (by about 10x) when I go
from reducing a total of 4096 KB to 8192 KB. For example, when count is
2^21 (8192 KB of 4-byte ints):

MPI_Allreduce(send_buf, recv_buf, count, MPI_INT, MPI_SUM, MPI_COMM_WORLD)

is slower than:

MPI_Allreduce(send_buf, recv_buf, count/2, MPI_INT, MPI_SUM,
MPI_COMM_WORLD)
MPI_Allreduce(send_buf + count/2, recv_buf + count/2, count/2, MPI_INT,
MPI_SUM, MPI_COMM_WORLD)
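
To make the sizes and offsets above explicit, here is a minimal sketch of the arithmetic behind the split. The names chunk_t, chunk_of and payload_kb are hypothetical helpers made up for illustration; they are not part of MPI or Open MPI, and the sketch assumes 1 KB = 1024 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type: one piece of a buffer that has been split
   into several chunks. Not part of MPI. */
typedef struct {
    size_t offset; /* element index where the chunk starts */
    int count;     /* number of elements in the chunk */
} chunk_t;

/* Return chunk i (0-based) of nchunks for a buffer of count elements.
   If count is not divisible by nchunks, the remainder is spread over
   the first chunks, one extra element each. */
chunk_t chunk_of(int count, int nchunks, int i)
{
    assert(count >= 0 && nchunks > 0 && i >= 0 && i < nchunks);
    int base = count / nchunks;
    int rem = count % nchunks;
    chunk_t c;
    c.count = base + (i < rem ? 1 : 0);
    c.offset = (size_t)i * (size_t)base + (size_t)(i < rem ? i : rem);
    return c;
}

/* Payload size in KB (assuming 1 KB = 1024 bytes) for count elements
   of elem_size bytes each. */
size_t payload_kb(size_t count, size_t elem_size)
{
    return count * elem_size / 1024;
}
```

With count = 2^21 and 4-byte ints, payload_kb gives 8192, and chunk_of(count, 2, 0) and chunk_of(count, 2, 1) give the two halves used in the split calls above (offsets 0 and count/2, each with count/2 elements, i.e. 4096 KB per call). Each chunk c would then be passed as MPI_Allreduce(send_buf + c.offset, recv_buf + c.offset, c.count, MPI_INT, MPI_SUM, MPI_COMM_WORLD).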

Just wondering if anyone knows what the cause of this behaviour is.

Thanks!
Cooper


Cooper Burns
Senior Research Engineer
(608) 230-1551
convergecfd.com
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
