Did you tell it --bind-to-core? If not, then the procs would be unbound to any
particular core - so your code might well think they are "sharing" cores.
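To make the binding explicit, a command sketch (the app name and process count are placeholders; `--report-bindings` is the usual way to verify where each rank actually lands):

```shell
# Bind each rank to its own core and print the resulting bindings (1.4-series option names)
mpirun -np 4 --bind-to-core --report-bindings ./my_app
```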
On Apr 24, 2012, at 4:46 PM, Kyle Boe wrote:
> Right, I tried using a hostfile, and it made no difference. This is running
> OpenMPI 1.4.4 on
Hi Ralph,
Yes, you are absolutely correct. A user can suppress the warning, however, by
simply setting shmem_mmap_enable_nfs_warning to 0.
For what it's worth, I just verified that the warning shows itself on Panasas
and NFS. Looks like Lustre and GPFS will behave similarly.
Sam
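For anyone who wants the exact knob, the warning can be silenced per-run on the command line (app name illustrative):

```shell
# Suppress the NFS-backed /tmp warning for this run only
mpirun -np 4 -mca shmem_mmap_enable_nfs_warning 0 ./my_app
```

Or persistently, by adding the line `shmem_mmap_enable_nfs_warning = 0` to `~/.openmpi/mca-params.conf`.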
On Apr 24, 2
Right, I tried using a hostfile, and it made no difference. This is running
OpenMPI 1.4.4 on CentOS 5.x machines. The original issue was an error trap
built into my code, where it said one of the cores was asking for
information it already owned. I'm sorry to be vague, but I can't share
anything fr
I thought we had code in the 1.5 series that would "bark" if the tmp dir was on
a network mount? Is that not true?
On Apr 24, 2012, at 3:20 PM, Gutierrez, Samuel K wrote:
> Hi,
>
> I just wanted to record the behind the scenes resolution to this particular
> issue. For more info, take a look
You don't need a hostfile to run multiple procs on the localhost.
What version of OMPI are you using? What was the original issue?
On Apr 24, 2012, at 4:07 PM, Jingcha Joba wrote:
> Try using slots in hostfile ?
>
> --
> Sent from my iPhone
>
> On Apr 24, 2012, at 2:52 PM, Kyle Boe wrote:
>
Try using slots in hostfile ?
--
Sent from my iPhone
On Apr 24, 2012, at 2:52 PM, Kyle Boe wrote:
> I'm having a problem trying to use OpenMPI on some multicore machines I have.
> The code I am running was giving me errors which suggested that MPI was
> assigning multiple processes to the sam
I'm having a problem trying to use OpenMPI on some multicore machines I
have. The code I am running was giving me errors which suggested that MPI
was assigning multiple processes to the same core (which I do not want).
So, I tried launching my job using the -nooversubscribe option, and I get
this e
Hi,
I just wanted to record the behind the scenes resolution to this particular
issue. For more info, take a look at:
https://svn.open-mpi.org/trac/ompi/ticket/3076
It seems as if the problem stems from /tmp being mounted as an NFS space that
is shared between the compute nodes.
This problem
On Apr 24, 2012, at 3:33 PM, Tom Rosmond wrote:
> Yes, I would be interested in such a plugin. But be advised that I am
> strictly a fortran programmer, so if it requires any C/C++ talent, I
> would be in trouble. So maybe, before jumping into that, I would like
> to be able to look at what proc
To throw in my $0.02, though it is worth less.
Were you running this on verbs-based InfiniBand?
We see a problem, for which we have a workaround, even with the newest 1.4.5,
but only on IB; we can reproduce it with IMB. You can find an old thread from
me about it. Your problem might not be the same.
Will do. My machine is currently quite busy, so it will be a while
before I get answers. Stay tuned.
T. Rosmond
On Tue, 2012-04-24 at 13:36 -0600, Ralph Castain wrote:
> Add --display-map to your mpirun cmd line
>
> On Apr 24, 2012, at 1:33 PM, Tom Rosmond wrote:
>
> > Jeff,
> >
> > Yes, I
Add --display-map to your mpirun cmd line
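For example (app name and process count illustrative):

```shell
# Print the process map -- which ranks land on which nodes/slots -- at launch
mpirun -np 4 --display-map ./my_app
```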
On Apr 24, 2012, at 1:33 PM, Tom Rosmond wrote:
> Jeff,
>
> Yes, I would be interested in such a plugin. But be advised that I am
> strictly a fortran programmer, so if it requires any C/C++ talent, I
> would be in trouble. So maybe, before jumping int
Jeff,
Yes, I would be interested in such a plugin. But be advised that I am
strictly a fortran programmer, so if it requires any C/C++ talent, I
would be in trouble. So maybe, before jumping into that, I would like
to be able to look at what processor/node mapping Open-mpi is actually
giving me.
That's very odd, indeed -- it's listed as being inside MPI_INIT, but we don't
get any further details from there. :-\
Any chance you could try upgrading to OMPI 1.4.5 and/or 1.5.5?
On Apr 24, 2012, at 1:57 PM, Jeffrey A Cummings wrote:
> I've been having an intermittent failure during MPI init
On Apr 24, 2012, at 3:01 PM, Tom Rosmond wrote:
> My question is this: If the cartesian mapping is done so the two
> spacial dimensions are the 'most rapidly varying' in equivalent 1-D
> processor mapping, will Open-mpi automatically assign those 2 dimensions
> 'on-node', and assign the 'ensemble
Could you repeat your tests with 1.4.5 and/or 1.5.5?
On Apr 23, 2012, at 1:32 PM, Martin Siegert wrote:
> Hi,
>
> I am debugging a program that hangs in MPI_Allreduce (openmpi-1.4.3).
> An strace of one of the processes shows:
>
> Process 10925 attached with 3 threads - interrupt to quit
> [pi
We have a large ensemble-based atmospheric data assimilation system that
does a 3-D cartesian partitioning of the 'domain' using MPI_DIMS_CREATE,
MPI_CART_CREATE, etc. Two of the dimensions are spatial, i.e. latitude
and longitude; the third is an 'ensemble' dimension, across which
subsets of ense
The ~/.openmpi/mca-params.conf file should contain the same
information on all nodes.
You can install Open MPI as root. However, we do not recommend that
you run Open MPI as root.
If the user $HOME directory is NFS mounted, then you can use an NFS
mounted directory to store your files. With this
Hi,
I ran those cmd's and have posted the outputs on:
https://svn.open-mpi.org/trac/ompi/ticket/3076
-mca shmem posix worked for all -np (even when oversubscribing), however
sysv did not work for any -np.
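For anyone following along, the two invocations being compared look like this (app name and process count illustrative):

```shell
# Force the POSIX shared-memory component (worked here):
mpirun -np 8 -mca shmem posix ./my_app
# Force the System V shared-memory component (failed here):
mpirun -np 8 -mca shmem sysv ./my_app
```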
On Tue, Apr 24, 2012 at 5:36 PM, Gutierrez, Samuel K wrote:
> Hi,
>
> Just out of curios
Hi Jeffrey,
Assuming you are on Linux, a frequent cause of out-of-nowhere segfaults is a
limited/small stack size. They can happen if you [ab]use big automatic
arrays, etc. You can set the stack size bigger/unlimited with the
ulimit/limit command, or edit /etc/security/limits.conf.
Of course,
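A quick sketch of checking and raising the limit (bash/sh syntax; raising past the hard limit needs root, hence the guard):

```shell
# Show the current soft stack limit (kilobytes, or "unlimited")
ulimit -s

# Try to raise it for this shell and its children; ignore failure if the
# hard limit is lower. Root can raise the hard limit in
# /etc/security/limits.conf, e.g.:
#   *  soft  stack  unlimited
#   *  hard  stack  unlimited
ulimit -s unlimited 2>/dev/null || true
```

Remember that the limit must be raised on every node the job runs on, not just the launch node.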
I've been having an intermittent failure during MPI initialization (v
1.4.3) for several months. It comes and goes as I make changes to my
application, that is changes unrelated to MPI calls. Even when I have a
version of my app which shows the problem, it doesn't happen on every
submittal.
Hi, thank you for your reply.
I have some problems:
Q1: I set two kinds of settings in mca-params.conf:
(1) crs_base_snapshot_dir=/root/kidd_openMPI/Tmp
snapc_base_global_snapshot_dir=/root/kidd_openMPI/checkpoints
My master: /root/kidd_openMPI is my Open MPI install dir,
Hi,
Just out of curiosity, what happens when you add
-mca shmem posix
to your mpirun command line using 1.5.5?
Can you also please try:
-mca shmem sysv
I'm shooting in the dark here, but I want to make sure that the failure isn't
due to a small backing store.
Thanks,
Sam
On Apr 16, 2012,
On Tue, Apr 24, 2012 at 10:10 AM, kidd wrote:
> Hi ,Thank you For your reply.
> but I still failed. I must add -x LD_LIBRARY_PATH
> this is my All Setting ;
> 1) Master-Node(cuda07) & Slaves Node(cuda08) :
> Configure:
> ./configure --prefix=/root/kidd_openMPI --with-ft=cr
> --enable-f
Hi, thank you for your reply.
But it still failed; I must add -x LD_LIBRARY_PATH.
These are all my settings:
1) Master-Node(cuda07) & Slaves Node(cuda08) :
Configure:
./configure --prefix=/root/kidd_openMPI --with-ft=cr --enable-ft-thread
--with-blcr=/usr/local/BLCR
--with-blcr-
I am not sure about everything that is going wrong, but there are at least two
issues I found.
First, you are skipping the first line that you read from integers.txt. Maybe
something like this instead.
while (fgets(line, sizeof line, fp) != NULL) {
    sscanf(line, "%d", &data[k]);
    sum = sum + data[k];
    k++;
}
It looks like you are using LAM/MPI. This list is for supporting Open MPI, a
wholly different MPI software implementation. However, speaking as one of the
core LAM/MPI developers, I'll tell you that you should uninstall LAM and
install Open MPI instead. We abandoned LAM/MPI several years ago.
On 4/24/2012 6:19 AM, Syed Ahsan Ali wrote:
I am not familiar with attaching debugger to the processes. Other
things you asked are as follows:
The easiest is to get TotalView or Allinea (both are parallel debuggers)
and attach them to the job; however, they cost money. Another option is to
try padb, look
I am not familiar with attaching debugger to the processes. Other things
you asked are as follows:
Is this the first time you've run it (with Open MPI? with any MPI?) *No.
We have been running this and other models, but this problem has arisen only now.
* How many processes is the job using? Are you o
To determine if an MPI process is waiting for a message, do what Rayson
suggested: attach a debugger to the processes and see if any of them
are stuck in MPI, either internally in an MPI_Recv or MPI_Wait call or
looping on an MPI_Test call.
Other things to consider.
Is this the first time y
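The attach step itself is short; a sketch with plain gdb (the PID and app name are illustrative — padb or a parallel debugger automates this across all ranks):

```shell
# On a node where the job is running, find one rank's PID...
pgrep -f my_app
# ...then grab its stack without stopping it for long; look for MPI_Recv,
# MPI_Wait, or MPI_Test frames near the top of the backtrace
gdb -p 12345 -batch -ex 'thread apply all bt'
```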
Hi,
I have installed MPI, and when I tried to run MPI in parallel on all the
nodes, I got the following error while MPI was trying to establish
connections:
"*ERROR: LAM/MPI unexpectedly received the following on stderr: Permission
denied (publickey,gssapi-with-mic)." So could anyone
I am combining mpi and cuda. Trying to find out sum of array elements
using cuda and using mpi to distribute the array.
my cuda code
#include <stdio.h>

__global__ void add(int *devarray, int *devsum)
{
    int index = blockIdx.x * blockDim.x + threadIdx.x;
    *devsum = *devsum + devarray[index];
Dear Rayson,
That is a numerical model written by the national weather service of a
country. The logs of the model show every detail about the simulation
progress. I have checked on the remote nodes as well; the application binary
is running, but the logs show no progress, it is just waiting at
Seems like there's a bug in the application. Did you or someone else
write it, or did you get it from an ISV?
You can log onto one of the nodes, attach a debugger, and see if the
MPI task is waiting for a message (looping in one of the MPI receive
functions)...
Rayson
==
Dear All,
I am having a problem running an application on a Dell cluster. The model
starts well, but no further progress is shown; it just gets stuck. I have
checked the systems, and there is no apparent hardware error. Other Open MPI
applications are running well on the same cluster. I have tried running