Re: [OMPI users] OpenMPI Checkpoint/Restart is failed

2010-05-18 Thread Hideyuki Jitsumoto
Hi Josh, Thank you for your reply. I applied the patch from Ticket #2139 to openmpi-1.4.1 and reinstalled all of the components from scratch. After that, it worked correctly. There was probably some fault in how I prepared my environment. # I cannot reproduce the environment in which I got the failure. # I'

Re: [OMPI users] init of component openib returned failure

2010-05-18 Thread Jeff Squyres
Try running with: mpirun.openmpi-1.4.1 --mca btl_base_verbose 50 --mca btl self,openib -n 2 --mca btl_openib_verbose 100 ./IMB-MPI1 -npmin 2 PingPong Also, are you saying that running the same command line with osu_latency works just fine? That would be really weird... On May 18, 2010, at 6

Re: [OMPI users] Using a rankfile for ompi-restart

2010-05-18 Thread Josh Hursey
(Sorry for the delay in replying, more below) On Apr 8, 2010, at 1:34 PM, Fernando Lemos wrote: Hello, I've noticed that ompi-restart doesn't support the --rankfile option. It only supports --hostfile/--machinefile. Is there any reason --rankfile isn't supported? Suppose you have a cluster w

Re: [OMPI users] OpenMPI Checkpoint/Restart is failed

2010-05-18 Thread Josh Hursey
(Sorry for the delay in replying, more below) On Apr 12, 2010, at 6:36 AM, Hideyuki Jitsumoto wrote: Hi Members, I tried to use checkpoint/restart with Open MPI, but I cannot get correct checkpoint data. I prepared the execution environment as follows; the strings in () are the names of the output files whic

Re: [OMPI users] (no subject)

2010-05-18 Thread Josh Hursey
The functionality of the checkpoint operation is not tied to CPU utilization. Are you running with the C/R thread enabled? If not, then the checkpoint might be waiting until the process enters the MPI library. Does the system emit an error message describing the error that it encountered? Th

Re: [OMPI users] opal_cr_tmp_dir

2010-05-18 Thread ananda.mudar
That's correct. I have prefixed them with OMPI_MCA_ when I defined them in my environment. Despite that I still see some of these files being created under the default directory /tmp which is different from what I had set. Thanks Ananda From: Josh Hursey Subject

Re: [OMPI users] opal_cr_tmp_dir

2010-05-18 Thread Josh Hursey
When you defined them in your environment did you prefix them with 'OMPI_MCA_'? Open MPI looks for this prefix to identify which parameters are intended for it specifically. -- Josh On May 12, 2010, at 11:09 PM, > wrote: Ralph Defining these parameters in my environment also did not res
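
A minimal sketch of the prefix convention described above, assuming a POSIX shell: an MCA parameter such as opal_cr_tmp_dir (the parameter named in this thread's subject) is exported as OMPI_MCA_opal_cr_tmp_dir. The directory used here is a hypothetical placeholder, not a path taken from the thread.

```shell
# Hypothetical directory; substitute a path that exists on your nodes.
cr_tmp_dir="${HOME}/ompi_cr_tmp"

# An MCA parameter named <name> is read from the environment as OMPI_MCA_<name>.
export OMPI_MCA_opal_cr_tmp_dir="${cr_tmp_dir}"

# List every variable this shell would pass to Open MPI as an MCA parameter.
env | grep '^OMPI_MCA_'
```

Checking the `env` output before launching a job is a quick way to confirm that the prefixed name, and not the bare parameter name, is what actually reaches the environment.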

Re: [OMPI users] ompi-restart fails with "found pid in use"

2010-05-18 Thread Josh Hursey
So I recently hit this same problem while doing some scalability testing. I experimented with adding the --no-restore-pid option, but found the same problem as you mention. Unfortunately, the problem is with BLCR, not Open MPI. BLCR will restart the process with a new PID, but the value ret

Re: [OMPI users] default hostfile (Ubuntu-9.10)

2010-05-18 Thread Ralph Castain
Starting in the 1.3 series, you have to tell OMPI where to find the default hostfile. So put this in your default MCA param file: orte_default_hostfile= That should fix it. On Tue, May 18, 2010 at 7:26 AM, Stefan Kuhne wrote: > On 18.05.2010 15:09, Ralph Castain wrote: > > Hello, > > > Could

Re: [OMPI users] default hostfile (Ubuntu-9.10)

2010-05-18 Thread Stefan Kuhne
On 18.05.2010 15:09, Ralph Castain wrote: Hello, > Could you tell us what version of OMPI you are using? > It's openmpi-1.3.2. Regards, Stefan Kuhne

Re: [OMPI users] default hostfile (Ubuntu-9.10)

2010-05-18 Thread Ralph Castain
Could you tell us what version of OMPI you are using? Thanks Ralph On Tue, May 18, 2010 at 6:51 AM, Stefan Kuhne wrote: > Hello, > > I manage a little HPC-Cluster. > It seems like the default-hostfile is located in /etc/openmpi. > But when I write my hosts in it, it isn't used by mpirun. > > Ho

[OMPI users] default hostfile (Ubuntu-9.10)

2010-05-18 Thread Stefan Kuhne
Hello, I manage a little HPC-Cluster. It seems like the default-hostfile is located in /etc/openmpi. But when I write my hosts in it, it isn't used by mpirun. How can I use a default hostfile? Regards, Stefan Kuhne

[OMPI users] init of component openib returned failure

2010-05-18 Thread Peter Kruse
Hello, trying to run the Intel MPI Benchmarks with OpenMPI 1.4.1 fails while initializing the openib component. The system is Debian GNU/Linux 5.0.4. The command to start the job (under Torque 2.4.7) was: mpirun.openmpi-1.4.1 --mca btl_base_verbose 50 --mca btl self,openib -n 2 ./IMB-MPI1 -npmin 2 PingP