On Mon, Sep 16, 2013 at 7:04 PM, PaulC <paul.cah...@uk.fujitsu.com> wrote:
> Hi,
>
> I'm attempting to build GROMACS 4.6.3 to run entirely within a single
> Xeon Phi (i.e. native) with either/both Intel MPI/OpenMP for
> parallelisation within the single Xeon Phi.
>
> I followed these instructions from Intel for cross compiling for Xeon
> Phi with cmake:
>
> http://software.intel.com/en-us/articles/cross-compilation-for-intel-xeon-phi-coprocessor-with-cmake
>
> which includes setting:
>
> export CC=icc
> export CXX=icpc
> export FC=ifort
> export CFLAGS="-mmic"
> export CXXFLAGS=$CFLAGS
> export FFLAGS=$CFLAGS
> export MPI_C=mpiicc
> export MPI_CXX=mpiicpc
>
> I then run cmake with:
>
> cmake .. -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_MPI=ON -DGMX_THREAD_MPI=OFF
>   -DGMX_FFT_LIBRARY=mkl -DGMX_CPU_ACCELERATION=None
>   -DCMAKE_INSTALL_PREFIX=~/gromacs
>
> Note -DGMX_THREAD_MPI=OFF. That seems to work fine (see attached
> cmake_output.txt), particularly, it finds the MIC Intel MPI:
>
> -- Found MPI_C: /opt/intel/impi/4.1.1.036/mic/lib/libmpigf.so;/opt/intel/impi/4.1.1.036/mic/lib/libmpi.so;/opt/intel/impi/4.1.1.036/mic/lib/libmpigi.a;/usr/lib64/libdl.so;/usr/lib64/librt.so;/usr/lib64/libpthread.so
> -- Checking for MPI_IN_PLACE
> -- Performing Test MPI_IN_PLACE_COMPILE_OK
> -- Performing Test MPI_IN_PLACE_COMPILE_OK - Success
> -- Checking for MPI_IN_PLACE - yes
>
> When I run make everything trundles along fine until:
>
> [ 20%] Building C object src/gmxlib/CMakeFiles/gmx.dir/thread_mpi/errhandler.c.o
> [ 20%] Building C object src/gmxlib/CMakeFiles/gmx.dir/thread_mpi/tmpi_malloc.c.o
> [ 22%] Building C object src/gmxlib/CMakeFiles/gmx.dir/thread_mpi/atomic.c.o
> [ 22%] Building C object src/gmxlib/CMakeFiles/gmx.dir/thread_mpi/pthreads.c.o
> /tmp/iccQqtl2Vas_.s: Assembler messages:
> /tmp/iccQqtl2Vas_.s:1773: Error: `sfence' is not supported on `k1om'
> make[2]: *** [src/gmxlib/CMakeFiles/gmx.dir/thread_mpi/pthreads.c.o] Error 1
> make[1]: *** [src/gmxlib/CMakeFiles/gmx.dir/all] Error 2
> make: *** [all] Error 2
>
> Why is it still building thread_mpi given the -DGMX_THREAD_MPI=OFF at
> the cmake invocation above?
Because these days thread-MPI not only provides a threading-based MPI
implementation for GROMACS, but also some functionality independent of
that feature, namely efficient atomic operations and thread affinity
setting.

> Any suggestions how best to work around this?

[ FTFY: "Any suggestions how to *fix* this?" ]

What seems to be causing the trouble here is the atomics support. While
x86 normally supports the atomic memory fence operation, Xeon Phi seems
to be not so "normal" and apparently it does not. Now, if you look at
src/gmxlib/thread_mpi/pthreads.c:633 you'll see a
tMPI_Atomic_memory_barrier() which, for x86, is defined in
include/thread_mpi/atomic/gcc_x86.h:105 as

#define tMPI_Atomic_memory_barrier() __asm__ __volatile__("sfence;" : : : "memory")

along with some other atomic operations for icc, among other compilers.

What's strange is that the build system checks whether it can compile a
dummy C file with the atomics stuff included (see cmake/ThreadMPI.cmake).
At first sight it seems that this should already fail at cmake time and
disable the atomics, but apparently it does not.

You have two options:

- Fix the problem by adding an #elif MACRO_TO_CHECK_FOR_MIC_COMPILATION
  branch and implementing an atomic barrier using the appropriate MIC ASM
  instruction (a rough, untested sketch of such a branch is appended
  below the quoted message).
- Fix the atomics check so that the lack of atomics support in thread-MPI
  on MIC is correctly reflected (see cmake/ThreadMPI.cmake:45, which
  compiles cmake/TestAtomics.c). More concretely, the cmake test should
  fail for a MIC build, which should result in atomics support being
  disabled (and hopefully no compile-time error).

I suspect that even the proper fix (the first option) may be as simple as
a couple of lines worth of changes.

Regardless of which option you pick, I would really appreciate it if you
could upload your fix to gerrit.gromacs.org. You could open an issue on
redmine.gromacs.org if you want this issue to be trackable.

Cheers,
--
Szilárd

PS: I hope you know that we have neither SIMD intrinsics support nor any
reasonable accelerator-aware parallelization for MIC (yet), so don't
expect high performance.

> Thanks,
>
> Paul.
>
> cmake_output.txt
> <http://gromacs.5086.x6.nabble.com/file/n5011212/cmake_output.txt>
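For reference, here is a rough, untested sketch of what the first option
could look like in include/thread_mpi/atomic/gcc_x86.h. It assumes that
the compiler predefines __MIC__ for native (-mmic) builds and that a
locked dummy add is an acceptable stand-in for sfence as a full memory
barrier on k1om; both assumptions should be verified before pushing
anything to gerrit:

/* Hypothetical MIC branch; the guard macro and the chosen instruction are
 * only illustrative. The k1om assembler rejects sfence, so fall back to a
 * locked no-op add on the stack, assuming a locked read-modify-write
 * (which has full-fence semantics on ordinary x86) behaves the same way
 * on the coprocessor. */
#if defined(__MIC__)
#define tMPI_Atomic_memory_barrier() \
    __asm__ __volatile__("lock; addl $0, (%%rsp)" : : : "memory")
#else
/* existing x86 definition, unchanged */
#define tMPI_Atomic_memory_barrier() \
    __asm__ __volatile__("sfence;" : : : "memory")
#endif

If sfence is the only instruction the k1om assembler chokes on, a branch
like this might be enough to get past the pthreads.c build error; whether
the remaining atomics in gcc_x86.h assemble cleanly on k1om would still
need to be checked.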