[OMPI users] New Cluster Centos 6.4 with Openmpi 1.6.4

2013-06-21 Thread thomas . forde
Hi, I'm running into a strange problem when trying to start parallel processing with Numeca Fine/Marine software for jobs. I have managed to set up Open MPI 1.6.4 on the qmaster and all nodes, so they all run the same version. Every time I try to start a job that requires more than 1 node, the job die

Re: [OMPI users] OpenMPI 1.6.4 and Intel Composer_xe_2013.4.183: problem with remote runs, orted: error while loading shared libraries: libimf.so

2013-06-21 Thread thomas . forde
Hi Stefano, your error message shows that you are missing a shared library, not necessarily that the library path is wrong. Do you actually have libimf.so? Can you find the file on your system? ./Thomas From: Stefano Zaghi To: us...@open-mpi.org, List-Post: users@lists.open-mpi.org Date:

Re: [OMPI users] OpenMPI 1.6.4 and Intel Composer_xe_2013.4.183: problem with remote runs, orted: error while loading shared libraries: libimf.so

2013-06-21 Thread thomas . forde
Your settings are as follows: export MPI=/home/stefano/opt/mpi/openmpi/1.6.4/intel export PATH=${MPI}/bin:$PATH export LD_LIBRARY_PATH=${MPI}/lib/openmpi:${MPI}/lib:$LD_LIBRARY_PATH export LD_RUN_PATH=${MPI}/lib/openmpi:${MPI}/lib:$LD_RUN_PATH and your path to the libimf.so file is /home/stefan
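A quick way to test the point made in this thread (whether libimf.so is actually reachable through the directories listed in LD_LIBRARY_PATH) might look like the sketch below. The function name find_lib_in_path is hypothetical and not part of Open MPI or the Intel tools; it only walks a colon-separated path list and prints each location where the named file exists.

```shell
# Hypothetical helper (not an Open MPI or Intel tool): print every directory
# entry of a colon-separated search path that contains the named library.
# Usage: find_lib_in_path <library-file> [search-path]
# The search path defaults to the current LD_LIBRARY_PATH.
find_lib_in_path() {
    lib_name="$1"
    search_path="${2:-$LD_LIBRARY_PATH}"
    found=1                      # exit status: 1 until a match is seen
    old_ifs="$IFS"
    IFS=':'                      # split the path list on colons
    for dir in $search_path; do
        if [ -n "$dir" ] && [ -e "$dir/$lib_name" ]; then
            echo "$dir/$lib_name"
            found=0
        fi
    done
    IFS="$old_ifs"
    return $found
}
```

Calling `find_lib_in_path libimf.so` prints nothing and returns a non-zero status when no listed directory holds the file. Because the error in this thread comes from orted on the remote side, the same check would have to be run in the environment the remote node actually gets, which may differ from the login shell.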

Re: [OMPI users] OpenMPI 1.6.4 and Intel Composer_xe_2013.4.183: problem with remote runs, orted: error while loading shared libraries: libimf.so

2013-06-21 Thread thomas . forde
Hi Stefano, /home/stefano/opt/intel/2013.4.183/lib/intel64/ is also the wrong path, as the file is in ..183/lib/ and not ...183/lib/intel64/. Is that why? ./Thomas On 21 June 2013 at 10:26, "Stefano Zaghi" wrote: > Dear Thomas, > thank you again. > > Symlink in /usr/lib64 is not enough, I h

Re: [OMPI users] New Cluster Centos 6.4 with Openmpi 1.6.4

2013-06-21 Thread thomas . forde
That is what I believe as well. I have done a few tests over the past few hours and heavily tested my queue system by submitting jobs directly with qsub, and I have no problem allocating resources across several nodes. But when I try through the task manager of Numeca Fine/Marine, it stops, and h

[OMPI users] Locker memory Limits error

2013-07-26 Thread thomas . forde
Hi guys, I'm having a strange problem when starting some jobs that I don't understand. It's just 1 node that has an issue, and I find it odd. The OpenFabrics (openib) BTL failed to initialize while trying to allocate some locked memory. This typically can indicate that the memlock limits are set too
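Since the error message points at the memlock limits and only one node is affected, comparing the locked-memory limit across nodes is a natural first check. The sketch below assumes bash, where `ulimit -l` prints the limit in 1024-byte units or the word "unlimited". The function name memlock_status is hypothetical; it only classifies the value it is given.

```shell
# Hypothetical helper: classify a locked-memory limit as reported by
# `ulimit -l` in bash (a number in 1024-byte units, or "unlimited").
# Returns 0 for unlimited, 1 for a numeric limit, 2 for anything else.
memlock_status() {
    limit="$1"
    case "$limit" in
        unlimited)
            echo "ok: unlimited"
            ;;
        ''|*[!0-9]*)
            # empty or non-numeric input: cannot be interpreted
            echo "unknown: $limit"
            return 2
            ;;
        *)
            echo "limited: ${limit} kB"
            return 1
            ;;
    esac
}
# Example call on one node:
#   memlock_status "$(ulimit -l)"
```

Running the example call on a working node and on the failing node would show whether the two differ. Note that the limit seen by a process started through a batch system or ssh can differ from the one in an interactive login shell, so the check is most meaningful when run the same way the job is launched.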

[OMPI users] random error bugging me..

2014-01-18 Thread thomas . forde
Hi, I have had a cluster running well for a while, and 2 days ago we decided to upgrade it from 128 to 256 cores. Most of my deployment of nodes goes through Cobbler and scripting, and it has worked fine before on the first 8 nodes. But after adding new nodes, everything is fucked up and

Re: [OMPI users] random error bugging me..

2014-01-19 Thread thomas . forde
Yes. It's a shared NFS partition on the nodes. Sent from my iPhone > On 19 Jan 2014 at 13:29, "Reuti" wrote: > > Hi, > > On 18.01.2014 at 22:43, thomas.fo...@ulstein.com wrote: > > > I have had a running cluster going good for a while, and 2 days ago we decided to upgrade it from 128 to 2