Greetings,

Can we please verify that this problem is with GotoBLAS and not with OpenMPI? If I read this correctly, without MPI and with other flavors of MPI you have normal execution, which would normally indicate that the problem is on the OpenMPI side.
I am 2 doors away from Kazushige's office. Please do let me know, so that I can talk to him about this.

Regards
Yaakoub El Khamra


On Tue, Jan 19, 2010 at 9:35 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:
> Hi Dorian and Eloi
>
> I wonder if this is really a Goto BLAS problem or related to
> how OpenMPI was configured.
>
> In a recent sequence of postings on this list,
> a colleague reported several errors which were fixed
> after he removed the (non-default) "--enable-mpi-threads"
> flag from his OpenMPI configuration (and built OpenMPI again,
> and recompiled).
>
> See this thread:
> http://www.open-mpi.org/community/lists/users/2009/12/11640.php
> http://www.open-mpi.org/community/lists/users/2010/01/11695.php
>
> He was also using BLAS (most likely Goto's) in the HPL benchmark.
>
> Did you configure OpenMPI with "--enable-mpi-threads"?
> Have you tried without it?
>
> I hope this helps.
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
>
> Eloi Gaudry wrote:
>> Dorian Krause wrote:
>>> Hi Eloi,
>>>
>>>> Do the segmentation faults you're facing also happen in a
>>>> sequential environment (i.e. not linked against openmpi libraries)?
>>>
>>> No, without MPI everything works fine. Also, linking against mvapich
>>> doesn't give any errors. I think there is a problem with GotoBLAS and
>>> the shared library infrastructure of OpenMPI. The code doesn't get to
>>> the point of executing the gemm operation at all.
>>>
>>>> Have you already informed Kazushige Goto (the developer of GotoBLAS)?
>>>
>>> Not yet. Since the problem only happens with openmpi and the BLAS
>>> (stand-alone) seems to work, I thought the openmpi mailing list would
>>> be the better place to discuss this (to get a grasp of what the error
>>> could be before going to the GotoBLAS mailing list).
>>>
>>>> Regards,
>>>> Eloi
>>>>
>>>> PS: Could you post your Makefile.rule here so that we could check the
>>>> different compilation options chosen?
>>>
>>> I didn't make any changes to Makefile.rule. This is the content
>>> of Makefile.conf:
>>>
>>> OSNAME=Linux
>>> ARCH=x86_64
>>> C_COMPILER=GCC
>>> BINARY32=
>>> BINARY64=1
>>> CEXTRALIB=-L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -lc
>>> F_COMPILER=GFORTRAN
>>> FC=gfortran
>>> BU=_
>>> FEXTRALIB=-L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -lgfortran -lm -lgfortran -lm -lc
>>> CORE=BARCELONA
>>> LIBCORE=barcelona
>>> NUM_CORES=8
>>> HAVE_MMX=1
>>> HAVE_SSE=1
>>> HAVE_SSE2=1
>>> HAVE_SSE3=1
>>> HAVE_SSE4A=1
>>> HAVE_3DNOWEX=1
>>> HAVE_3DNOW=1
>>> MAKE += -j 8
>>> SGEMM_UNROLL_M=8
>>> SGEMM_UNROLL_N=4
>>> DGEMM_UNROLL_M=4
>>> DGEMM_UNROLL_N=4
>>> QGEMM_UNROLL_M=2
>>> QGEMM_UNROLL_N=2
>>> CGEMM_UNROLL_M=4
>>> CGEMM_UNROLL_N=2
>>> ZGEMM_UNROLL_M=2
>>> ZGEMM_UNROLL_N=2
>>> XGEMM_UNROLL_M=1
>>> XGEMM_UNROLL_N=1
>>>
>>>
>>> Thanks,
>>> Dorian
>>>
>> Dorian,
>>
>> I've been experiencing similar issues on two different Opteron
>> architectures (22xx and 25x), in a sequential environment, when using
>> v2-1.10 of GotoBLAS. If you can downgrade to version 2-1.09, I bet you
>> will not experience such issues. Anyway, I'm pretty sure Kazushige is
>> working on fixing this right now.
>>
>> Eloi
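
A minimal stand-alone test along the lines below can help isolate whether the crash is triggered merely by linking GotoBLAS together with OpenMPI, before any application code is involved. This is only a sketch under assumptions: it presumes GotoBLAS exports the standard Fortran BLAS symbol dgemm_, and the wrapper compiler, library path, and library name (mpicc, /path/to/gotoblas, -lgoto2) are placeholders to be adjusted to the local installation.

/*
 * repro.c -- sketch of a minimal MPI + dgemm reproducer, e.g.:
 *   mpicc repro.c -o repro -L/path/to/gotoblas -lgoto2
 *   mpirun -np 1 ./repro
 */
#include <stdio.h>
#include <mpi.h>

/* Fortran-style BLAS prototype: all arguments are passed by reference. */
extern void dgemm_(const char *transa, const char *transb,
                   const int *m, const int *n, const int *k,
                   const double *alpha, const double *a, const int *lda,
                   const double *b, const int *ldb,
                   const double *beta, double *c, const int *ldc);

int main(int argc, char **argv)
{
    /* 2x2 matrices in column-major order; computes C = alpha*A*B + beta*C. */
    double a[4] = { 1.0, 2.0, 3.0, 4.0 };
    double b[4] = { 5.0, 6.0, 7.0, 8.0 };
    double c[4] = { 0.0, 0.0, 0.0, 0.0 };
    double alpha = 1.0, beta = 0.0;
    int n = 2;
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* If the failure comes from the combination of the two libraries,
     * it should show up at or before this call, with no application
     * data involved. */
    dgemm_("N", "N", &n, &n, &n, &alpha, a, &n, b, &n, &beta, c, &n);

    printf("rank %d: c[0] = %g (expected 23)\n", rank, c[0]);

    MPI_Finalize();
    return 0;
}

Comparing the behaviour of this test when built against OpenMPI, against mvapich, and with the MPI calls removed would reproduce the comparison described above and show whether the library combination alone is enough to trigger the failure.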