Ok, I tried that (sorry for the delay... network issues killed our cluster). Setting the env variable you suggested changed the results, but all it did was move the run-time spike from between 4 MB and 8 MB to between 32 KB and 64 KB.
The nodes I'm running on *have* InfiniBand, but I think I am running over Ethernet for these tests.

Any other ideas?

Thanks!
Cooper

Cooper Burns
Senior Research Engineer
(608) 230-1551
convergecfd.com

On Tue, Sep 19, 2017 at 3:44 PM, Howard Pritchard <hpprit...@gmail.com> wrote:

> Hello Cooper
>
> Could you rerun your test with the following env. variable set
>
> export OMPI_MCA_coll=self,basic,libnbc
>
> and see if that helps?
>
> Also, what type of interconnect are you using - ethernet, IB, ...?
>
> Howard
>
>
> 2017-09-19 8:56 GMT-06:00 Cooper Burns <cooper.bu...@convergecfd.com>:
>
>> Hello,
>>
>> I have been running some simple benchmarks and saw some strange
>> behaviour. All tests are done on 4 nodes with 24 cores each (96 MPI
>> processes in total).
>>
>> When I run MPI_Allreduce() I see the run time spike up (about 10x) when
>> I go from reducing a total of 4096 KB to 8192 KB, for example when count
>> is 2^21 (8192 KB of 4-byte ints):
>>
>> MPI_Allreduce(send_buf, recv_buf, count, MPI_INT, MPI_SUM, MPI_COMM_WORLD)
>>
>> is slower than:
>>
>> MPI_Allreduce(send_buf, recv_buf, count/2, MPI_INT, MPI_SUM,
>> MPI_COMM_WORLD)
>> MPI_Allreduce(send_buf + count/2, recv_buf + count/2, count/2, MPI_INT,
>> MPI_SUM, MPI_COMM_WORLD)
>>
>> Just wondering if anyone knows what the cause of this behaviour is.
>>
>> Thanks!
>> Cooper
>>
>> Cooper Burns
>> Senior Research Engineer
>> (608) 230-1551
>> convergecfd.com
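For anyone who wants to reproduce the comparison above, here is a minimal, self-contained sketch of it (this is not the original benchmark; the buffer names, the fixed count of 2^21, and the timing harness are illustrative assumptions). It times one full-size MPI_Allreduce against two half-size calls over the same data:

    /* Minimal sketch: one full-size MPI_Allreduce vs. two half-size calls.
     * Buffer names and COUNT are illustrative, not from the original code. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define COUNT (1 << 21)   /* 2^21 ints = 8192 KB of 4-byte ints */

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int *send_buf = malloc(COUNT * sizeof(int));
        int *recv_buf = malloc(COUNT * sizeof(int));
        for (int i = 0; i < COUNT; i++)
            send_buf[i] = rank + i;

        /* Time one reduction over the whole buffer. */
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        MPI_Allreduce(send_buf, recv_buf, COUNT, MPI_INT, MPI_SUM,
                      MPI_COMM_WORLD);
        double t_full = MPI_Wtime() - t0;

        /* Time the same reduction split into two half-size calls. */
        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        MPI_Allreduce(send_buf, recv_buf, COUNT / 2, MPI_INT, MPI_SUM,
                      MPI_COMM_WORLD);
        MPI_Allreduce(send_buf + COUNT / 2, recv_buf + COUNT / 2, COUNT / 2,
                      MPI_INT, MPI_SUM, MPI_COMM_WORLD);
        double t_split = MPI_Wtime() - t0;

        if (rank == 0)
            printf("full: %f s   split: %f s\n", t_full, t_split);

        free(send_buf);
        free(recv_buf);
        MPI_Finalize();
        return 0;
    }

Running it over a range of counts should show where the spike sits for a given collective component. Note that Howard's suggestion can equivalently be passed on the mpirun command line rather than through the environment, e.g. "mpirun --mca coll self,basic,libnbc" followed by your usual launch arguments.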
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users