Hi,

Thank you for the information. I'm going to try the new Intel compilers, which I'm downloading now, but they're taking so long to download that I don't think I'll be able to look into this again until after the weekend. BTW, using their Java-based downloader is a bit less painful than their normal download.

In the meantime, if anyone else has some suggestions then please let me know.

Thanks

Nick

2009/11/5 Jeff Squyres <jsquy...@cisco.com>:
> FWIW, I think Intel released 11.1.059 earlier today (I've been trying to
> download it all morning). I doubt it's an issue in this case, but I thought
> I'd mention it as a public service announcement. ;-)
>
> Seg faults are *usually* an application issue (never say "never", but they
> *usually* are). You might want to first contact the RaXML team to see if
> there are any known issues with their software and Open MPI 1.3.3...?
> (Sorry, I'm totally unfamiliar with RaXML)
>
> On Nov 5, 2009, at 12:30 PM, Nick Holway wrote:
>
>> Dear all,
>>
>> I'm trying to run RaXML 7.0.4 on my 64bit Rocks 5.1 cluster (ie Centos
>> 5.2). I compiled Open MPI 1.3.3 using the Intel compilers v 11.1.056
>> using ./configure CC=icc CXX=icpc F77=ifort FC=ifort --with-sge
>> --prefix=/usr/prog/mpi/openmpi/1.3.3/x86_64-no-mem-man
>> --with-memory-manager=none.
>>
>> When I run run RaXML in a qlogin session using
>> /usr/prog/mpi/openmpi/1.3.3/x86_64-no-mem-man/bin/mpirun -np 8
>> /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI
>> -f a -x 12345 -p12345 -# 10 -m GTRGAMMA -s
>> /users/holwani1/jay/ornodko-1582 -n mpitest39
>>
>> I get the following output:
>>
>> This is the RAxML MPI Worker Process Number: 1
>> This is the RAxML MPI Worker Process Number: 3
>>
>> This is the RAxML MPI Master process
>>
>> This is the RAxML MPI Worker Process Number: 7
>>
>> This is the RAxML MPI Worker Process Number: 4
>>
>> This is the RAxML MPI Worker Process Number: 5
>>
>> This is the RAxML MPI Worker Process Number: 2
>>
>> This is the RAxML MPI Worker Process Number: 6
>> IMPORTANT WARNING: Alignment column 1695 contains only undetermined
>> values which will be treated as missing data
>>
>>
>> IMPORTANT WARNING: Sequences A4_H10 and A3ii_E11 are exactly identical
>>
>>
>> IMPORTANT WARNING: Sequences A2_A08 and A9_C10 are exactly identical
>>
>>
>> IMPORTANT WARNING: Sequences A3ii_B03 and A3ii_C06 are exactly identical
>>
>>
>> IMPORTANT WARNING: Sequences A9_D08 and A9_F10 are exactly identical
>>
>>
>> IMPORTANT WARNING: Sequences A3ii_F07 and A9_C08 are exactly identical
>>
>>
>> IMPORTANT WARNING: Sequences A6_F05 and A6_F11 are exactly identical
>>
>> IMPORTANT WARNING
>> Found 6 sequences that are exactly identical to other sequences in the
>> alignment.
>> Normally they should be excluded from the analysis.
>>
>>
>> IMPORTANT WARNING
>> Found 1 column that contains only undetermined values which will be
>> treated as missing data.
>> Normally these columns should be excluded from the analysis.
>>
>> An alignment file with undetermined columns and sequence duplicates
>> removed has already
>> been printed to file /users/holwani1/jay/ornodko-1582.reduced
>>
>>
>> You are using RAxML version 7.0.4 released by Alexandros Stamatakis in
>> April 2008
>>
>> Alignment has 1280 distinct alignment patterns
>>
>> Proportion of gaps and completely undetermined characters in this
>> alignment: 0.124198
>>
>> RAxML rapid bootstrapping and subsequent ML search
>>
>>
>> Executing 10 rapid bootstrap inferences and thereafter a thorough ML
>> search
>>
>> All free model parameters will be estimated by RAxML
>> GAMMA model of rate heteorgeneity, ML estimate of alpha-parameter
>> GAMMA Model parameters will be estimated up to an accuracy of
>> 0.1000000000 Log Likelihood units
>>
>> Partition: 0
>> Name: No Name Provided
>> DataType: DNA
>> Substitution Matrix: GTR
>> Empirical Base Frequencies:
>> pi(A): 0.261129 pi(C): 0.228570 pi(G): 0.315946 pi(T): 0.194354
>>
>>
>> Switching from GAMMA to CAT for rapid Bootstrap, final ML search will
>> be conducted under the GAMMA model you specified
>> Bootstrap[10]: Time 44.442728 bootstrap likelihood -inf, best
>> rearrangement setting 5
>> Bootstrap[0]: Time 44.814948 bootstrap likelihood -inf, best
>> rearrangement setting 5
>> Bootstrap[6]: Time 46.470371 bootstrap likelihood -inf, best
>> rearrangement setting 6
>> [compute-0-11:08698] *** Process received signal ***
>> [compute-0-11:08698] Signal: Segmentation fault (11)
>> [compute-0-11:08698] Signal code: Address not mapped (1)
>> [compute-0-11:08698] Failing at address: 0x408
>> [compute-0-11:08698] [ 0] /lib64/libpthread.so.0 [0x3fb580de80]
>> [compute-0-11:08698] [ 1] /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI(hookup+0) [0x413ca0]
>> [compute-0-11:08698] [ 2] /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI(restoreTL+0xd9) [0x442c09]
>> [compute-0-11:08698] [ 3] /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI [0x42c968]
>> [compute-0-11:08698] [ 4] /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI(doAllInOne+0x91a) [0x42b21a]
>> [compute-0-11:08698] [ 5] /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI(main+0xc25) [0x4063f5]
>> [compute-0-11:08698] [ 6] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3fb501d8b4]
>> [compute-0-11:08698] [ 7] /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI [0x405719]
>> [compute-0-11:08698] *** End of error message ***
>> Bootstrap[1]: Time 8.400332 bootstrap likelihood -inf, best
>> rearrangement setting 5
>> --------------------------------------------------------------------------
>> mpirun noticed that process rank 1 with PID 8698 on node
>> compute-0-11.local exited on signal 11 (Segmentation fault).
>> --------------------------------------------------------------------------
>>
>>
>>
>> My $PATH is
>> /usr/prog/mpi/openmpi/1.3.3/x86_64-no-mem-man/bin/:/usr/prog/mpi/openmpi/1.3.3/x86_64/bin/:/usr/prog/intel/ifort/11.1.056/bin/intel64:/usr/prog/intel/icc/11.1.056//bin/intel64:/usr/prog/intel/ifort/11.1.056/bin/intel64:/usr/prog/intel/icc/11.1.056//bin/intel64:/opt/gridengine/bin/lx26-amd64:/usr/kerberos/sbin:/usr/kerberos/bin:/opt/gridengine/bin/lx26-amd64:/usr/java/latest/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/ganglia/bin:/opt/ganglia/sbin:/opt/rocks/bin:/opt/rocks/sbin:/root/bin
>>
>> My $LD_LIBRARY_PATH is
>>
>> /usr/prog/mpi/openmpi/1.3.3/x86_64-no-mem-man/lib/:/usr/prog/mpi/openmpi/1.3.3/x86_64/lib/:/usr/prog/intel/ifort/11.1.056/lib/intel64:/usr/prog/intel/ifort/11.1.056/mkl/lib/em64t:/usr/prog/intel/icc/11.1.056//lib/intel64:/usr/prog/intel/icc/11.1.056//ipp/em64t/sharedlib:/usr/prog/intel/icc/11.1.056//mkl/lib/em64t:/usr/prog/intel/icc/11.1.056//tbb/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/lib:/usr/prog/intel/ifort/11.1.056/lib/intel64:/usr/prog/intel/ifort/11.1.056/mkl/lib/em64t:/usr/prog/intel/icc/11.1.056//lib/intel64:/usr/prog/intel/icc/11.1.056//ipp/em64t/sharedlib:/usr/prog/intel/icc/11.1.056//mkl/lib/em64t:/usr/prog/intel/icc/11.1.056//tbb/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/lib:/opt/gridengine/lib/lx26-amd64:/opt/gridengine/lib/lx26-amd64
>>
>> Although I'm only running this on one node, it may be helpful to know
>> that there is Infiniband with Voltaire OFED v1.4 on the nodes. Rocks'
>> HPC roll MPIs is not installed. I've tried running the above on
>> multiple nodes but still see the same error. I've attached the
>> config.log and ompi_info to the email.
>>
>> I believe that the input is OK as I can run the serial gcc-compiled
>> raXML on the data with no problems. I tried compiling openmpi with
>> --with-memory-manager=none as a quick google
>> (http://osdir.com/ml/clustering.open-mpi.user/2008-07/msg00201.html)
>> suggested that it could help, but it made no difference. Google also
>> suggested that it could be caused by the compile environment being
>> different to the runtime, to test this I compiled and ran RaXML
>> immediately after I compiled Openmpi in the same session, again with
>> no joy.
>>
>> Does any one know how I can fix this?
>>
>> Thanks
>>
>> Nick
>>
>> <config.tar.gz><ompi-info.tar.gz><ATT2831213.txt>
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
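
A side note on the environment quoted above: both $PATH and $LD_LIBRARY_PATH list two Open MPI 1.3.3 prefixes (x86_64-no-mem-man first, then x86_64). That ordering may well be harmless, but when checking whether compile time and run time see the same installation, it can help to list every directory on a search path that contains a given file, in lookup order. The following is only a rough sketch of that check. The helper name `find_on_path` is made up for illustration, and `libmpi.so` is an assumed file name; the runtime linker actually resolves the soname recorded in the binary, which `ldd` on raxmlHPC-MPI will show.

```python
import os


def find_on_path(filename, search_path):
    """Return the directories in search_path that contain filename.

    search_path is a colon-separated string such as $PATH. Results keep
    lookup order, and repeated directories are reported only once.
    Empty entries are skipped (hypothetical helper, not part of Open MPI).
    """
    hits = []
    seen = set()
    for entry in search_path.split(os.pathsep):
        if not entry:
            continue
        # normpath collapses "//" and trailing "/" so duplicates compare equal
        directory = os.path.normpath(entry)
        if directory in seen:
            continue
        seen.add(directory)
        if os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # File names here are assumptions; adjust them to what ldd reports.
    for var, name in (("PATH", "mpirun"), ("LD_LIBRARY_PATH", "libmpi.so")):
        hits = find_on_path(name, os.environ.get(var, ""))
        print(f"{name} via ${var}:")
        for rank, directory in enumerate(hits):
            marker = "  <- found first" if rank == 0 else ""
            print(f"  {directory}{marker}")
        if len(hits) > 1:
            print("  note: more than one copy is visible on this path")
```

If the first hit for mpirun and the first hit for the MPI library sit under different prefixes, that would be worth ruling out before looking further; if they match, the mixed-installation theory can probably be set aside.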