My $0.02 of contribution: try MacPorts.
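For reference, the MacPorts route is roughly the following (a sketch only; the exact port names, e.g. gcc44 and openmpi, are assumptions to verify with "port search" first):

    sudo port selfupdate
    sudo port install gcc44      # provides gfortran
    sudo port install openmpi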
On 2009-05-04, at 11:42, Jeff Squyres wrote:

FWIW, I don't use Xcode, but I use the precompiled gcc/gfortran from here with good success: http://hpc.sourceforge.net/

On May 4, 2009, at 11:38 AM, Warner Yuen wrote:

Have you installed a Fortran compiler? Mac OS X's developer tools do not come with one, so you'll need to install a Fortran compiler if you haven't already done so. I routinely use the Intel IFORT compilers with success. However, I hear many good things about the gfortran compilers on Mac OS X, and you can't beat the price of gfortran!

Warner Yuen
Scientific Computing Consulting Engineer
Apple, Inc.
email: wy...@apple.com
Tel: 408.718.2859

On May 4, 2009, at 7:28 AM, users-requ...@open-mpi.org wrote:

Today's Topics:

  1. How do I compile OpenMPI in Xcode 3.1 (Vicente)
  2. Re: 1.3.1 -rf rankfile behaviour ?? (Ralph Castain)

------------------------------

Message: 1
Date: Mon, 4 May 2009 16:12:44 +0200
From: Vicente <vpui...@gmail.com>
Subject: [OMPI users] How do I compile OpenMPI in Xcode 3.1

Hi, I've seen the FAQ "How do I use Open MPI wrapper compilers in Xcode", but it only covers MPICC. I am using MPIF90, so I did the same thing with MPICC changed to MPIF90 (and the path updated accordingly), but it did not work:

Building target "fortran" of project "fortran" with configuration "Debug"

Checking Dependencies
Invalid value 'MPIF90' for GCC_VERSION

The file "MPIF90.cpcompspec" looks like this:

/**
    Xcode Compiler Specification for MPIF90
*/

{ Type = Compiler;
  Identifier = com.apple.compilers.mpif90;
  BasedOn = com.apple.compilers.gcc.4_0;
  Name = "MPIF90";
  Version = "Default";
  Description = "MPI GNU C/C++ Compiler 4.0";
  ExecPath = "/usr/local/bin/mpif90";    // This gets converted to the g++ variant automatically
  PrecompStyle = pch;
}

and it is located in "/Developer/Library/Xcode/Plug-ins".

When I run mpif90 -v in Terminal it works fine:

Using built-in specs.
Target: i386-apple-darwin8.10.1
Configured with: /tmp/gfortran-20090321/ibin/../gcc/configure --prefix=/usr/local/gfortran --enable-languages=c,fortran --with-gmp=/tmp/gfortran-20090321/gfortran_libs --enable-bootstrap
Thread model: posix
gcc version 4.4.0 20090321 (experimental) [trunk revision 144983] (GCC)

Any idea?

Thanks.

Vincent
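One quick way to separate a toolchain problem from an Xcode plug-in problem is to build a trivial MPI Fortran program straight from Terminal (the file name below is only an illustration, and the Open MPI wrappers are assumed to be first on the PATH):

    $ cat > hello_mpi.f90 <<'EOF'
    program hello_mpi
      use mpi
      implicit none
      integer :: ierr, rank
      call MPI_Init(ierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
      print *, 'hello from rank', rank
      call MPI_Finalize(ierr)
    end program hello_mpi
    EOF
    $ mpif90 hello_mpi.f90 -o hello_mpi
    $ mpirun -n 2 ./hello_mpi

If that builds and runs, the "Invalid value 'MPIF90' for GCC_VERSION" message points at the Xcode compiler specification rather than at the compiler itself.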
------------------------------

Message: 2
Date: Mon, 4 May 2009 08:28:26 -0600
From: Ralph Castain <r...@open-mpi.org>
Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??

Unfortunately, I didn't write any of that code - I was just fixing the mapper so it would properly map the procs. From what I can tell, the proper things are happening there.

I'll have to dig into the code that specifically deals with parsing the results to bind the processes. Afraid that will take a while longer - pretty dark in that hole.

On Mon, May 4, 2009 at 8:04 AM, Geoffroy Pignot <geopig...@gmail.com> wrote:

Hi,

So there are no more crashes with my "crazy" mpirun command, but the paffinity feature seems to be broken: I am not able to pin my processes.

Simple test with a program using your PLPA library:

r011n006% cat hostf
r011n006 slots=4

r011n006% cat rankf
rank 0=r011n006 slot=0        ----> bind to CPU 0, exact?

r011n006% /tmp/HALMPI/openmpi-1.4a/bin/mpirun --hostfile hostf --rankfile rankf --wdir /tmp -n 1 a.out
PLPA Number of processors online: 4
PLPA Number of processor sockets: 2
PLPA Socket 0 (ID 0): 2 cores
PLPA Socket 1 (ID 3): 2 cores

Ctrl+Z
r011n006% bg

r011n006% ps axo stat,user,psr,pid,pcpu,comm | grep gpignot
R+   gpignot   3   9271   97.8   a.out

In fact, whatever slot number I put in my rankfile, a.out always runs on CPU 3. I was expecting it on CPU 0, according to my cpuinfo (see below). The result is the same if I try the other syntax (rank 0=r011n006 slot=0:0 ----> bind to socket 0, core 0, exact?).

Thanks in advance

Geoffroy

PS: I run on RHEL 5
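(A possible cross-check, assuming util-linux's taskset is available on the node: "taskset -cp <pid>" prints the current CPU affinity list of a running process, which should agree with the psr column from the ps output above.)

    r011n006% taskset -cp 9271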
r011n006% uname -a
Linux r011n006 2.6.18-92.1.1NOMAP32.el5 #1 SMP Sat Mar 15 01:46:39 CDT 2008 x86_64 x86_64 x86_64 GNU/Linux

My configure is:
./configure --prefix=/tmp/openmpi-1.4a --libdir='${exec_prefix}/lib64' --disable-dlopen --disable-mpi-cxx --enable-heterogeneous

r011n006% cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU 5150 @ 2.66GHz
stepping        : 6
cpu MHz         : 2660.007
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips        : 5323.68
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

(Processors 1-3 are identical except: processor 1 - physical id 3, core id 0, bogomips 5320.03; processor 2 - physical id 0, core id 1, bogomips 5319.39; processor 3 - physical id 3, core id 1, bogomips 5320.03.)

------------------------------

Message: 2
Date: Mon, 4 May 2009 04:45:57 -0600
From: Ralph Castain <r...@open-mpi.org>
Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??

My apologies - I wasn't clear enough. You need a tarball from r21111 or greater, such as:

http://www.open-mpi.org/nightly/trunk/openmpi-1.4a1r21142.tar.gz

HTH
Ralph

On May 4, 2009, at 2:14 AM, Geoffroy Pignot wrote:

Hi,

I got the openmpi-1.4a1r21095.tar.gz tarball, but unfortunately my command doesn't work:

cat rankf:
rank 0=node1 slot=*
rank 1=node2 slot=*

cat hostf:
node1 slots=2
node2 slots=2

mpirun --rankfile rankf --hostfile hostf --host node1 -n 1 hostname : --host node2 -n 1 hostname

Error, invalid rank (1) in the rankfile (rankf)
--------------------------------------------------------------------------
[r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 403
[r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 86
[r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 86
[r011n006:28986] [[45541,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 1016

Ralph, could you tell me whether my command syntax is correct? If not, could you give me the expected one?

Regards

Geoffroy

2009/4/30 Geoffroy Pignot <geopig...@gmail.com>:

Immediately, Sir!!! :)

Thanks again Ralph

Geoffroy

------------------------------

Message: 2
Date: Thu, 30 Apr 2009 06:45:39 -0600
From: Ralph Castain <r...@open-mpi.org>
Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??

I believe this is fixed now in our development trunk - you can download any tarball starting from last night and give it a try, if you like. Any feedback would be appreciated.

Ralph
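For anyone following along, trying one of those nightly tarballs amounts to roughly this (tarball name taken from the link above, configure options as in Geoffroy's report above; the prefix and -j value are arbitrary):

    $ tar xzf openmpi-1.4a1r21142.tar.gz
    $ cd openmpi-1.4a1r21142
    $ ./configure --prefix=/tmp/openmpi-1.4a --libdir='${exec_prefix}/lib64' \
                  --disable-dlopen --disable-mpi-cxx --enable-heterogeneous
    $ make -j 4
    $ make install
    $ /tmp/openmpi-1.4a/bin/ompi_info | head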
On Apr 14, 2009, at 7:57 AM, Ralph Castain wrote:

Ah now, I didn't say it -worked-, did I? :-)

Clearly a bug exists in the program. I'll try to take a look at it (if Lenny doesn't get to it first), but it won't be until later in the week.

On Apr 14, 2009, at 7:18 AM, Geoffroy Pignot wrote:

I agree with you Ralph, and that's what I expect from openmpi, but my second example shows that it's not working:

cat hostfile.0
r011n002 slots=4
r011n003 slots=4

cat rankfile.0
rank 0=r011n002 slot=0
rank 1=r011n003 slot=1

mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -n 1 hostname
### CRASHED

Error, invalid rank (1) in the rankfile (rankfile.0)
--------------------------------------------------------------------------
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 404
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 87
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 77
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 985
--------------------------------------------------------------------------
A daemon (pid unknown) died unexpectedly on signal 1 while attempting to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
orterun noticed that the job aborted, but has no info as to the process that caused that situation.
--------------------------------------------------------------------------
orterun: clean termination accomplished

Message: 4
Date: Tue, 14 Apr 2009 06:55:58 -0600
From: Ralph Castain <r...@lanl.gov>
Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??

The rankfile cuts across the entire job - it isn't applied on an app_context basis. So the ranks in your rankfile must correspond to the eventual rank of each process in the cmd line.

Unfortunately, that means you have to count ranks. In your case, you only have four, so that makes life easier. Your rankfile would look something like this:

rank 0=r001n001 slot=0
rank 1=r001n002 slot=1
rank 2=r001n001 slot=1
rank 3=r001n002 slot=2

HTH
Ralph
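Putting that rankfile together with the four-process command quoted below, the complete invocation would presumably look something like this (a sketch only; the hostnames, executables and option strings are Geoffroy's placeholders):

    cat rankfile
    rank 0=r001n001 slot=0
    rank 1=r001n002 slot=1
    rank 2=r001n001 slot=1
    rank 3=r001n002 slot=2

    mpirun -rf rankfile \
        -n 1 -host r001n001 master.x options1 : \
        -n 1 -host r001n002 master.x options2 : \
        -n 1 -host r001n001 slave.x options3 : \
        -n 1 -host r001n002 slave.x options4

with ranks assigned 0-3 in app-context order, i.e. ranks 0 and 1 are the two master.x processes and ranks 2 and 3 the two slave.x processes.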
On Apr 14, 2009, at 12:19 AM, Geoffroy Pignot wrote:

Hi,

I agree that my examples are not very clear. What I want to do is launch a multi-executable application (masters and slaves) and benefit from processor affinity. Could you show me how to convert this command using the -rf option (whatever the affinity is)?

mpirun -n 1 -host r001n001 master.x options1 : -n 1 -host r001n002 master.x options2 : -n 1 -host r001n001 slave.x options3 : -n 1 -host r001n002 slave.x options4

Thanks for your help

Geoffroy

Message: 2
Date: Sun, 12 Apr 2009 18:26:35 +0300
From: Lenny Verkhovsky <lenny.verkhov...@gmail.com>
Subject: Re: [OMPI users] 1.3.1 -rf rankfile behaviour ??

Hi,

The first "crash" is OK, since your rankfile has ranks 0 and 1 defined, while n=1, which means only rank 0 is present and can be allocated.

NP must be >= the largest rank in the rankfile.

What exactly are you trying to do?

I tried to recreate your segv but all I got was:

~/work/svn/ompi/trunk/build_x86-64/install/bin/mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -rf rankfile.1 -n 1 hostname
[witch19:30798] mca: base: component_find: paffinity "mca_paffinity_linux" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during opal_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):

  opal_carto_base_select failed
  --> Returned value -13 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
[witch19:30798] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ../../orte/runtime/orte_init.c at line 78
[witch19:30798] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ../../orte/orted/orted_main.c at line 344
--------------------------------------------------------------------------
A daemon (pid 11629) died unexpectedly with status 243 while attempting to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process that caused that situation.
--------------------------------------------------------------------------
mpirun: clean termination accomplished

Lenny.
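Worked out against the hostfile.0/rankfile.0 shown above (a sketch of the accounting, not output from a real run): rankfile.0 defines ranks 0 and 1, so the job as a whole needs at least two processes before every rank in the file can be allocated:

    mpirun --hostfile hostfile.0 -rf rankfile.0 -n 2 hostname                  # np = 2: ranks 0 and 1 both valid
    mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname                  # np = 1: rank 1 has no process, so it is rejected
    mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -n 1 hostname  # np = 2 job-wide, the case Ralph describes as intended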
On 4/10/09, Geoffroy Pignot <geopig...@gmail.com> wrote:

Hi,

I am currently testing the process affinity capabilities of openmpi, and I would like to know whether the rankfile behaviour I describe below is normal or not.

cat hostfile.0
r011n002 slots=4
r011n003 slots=4

cat rankfile.0
rank 0=r011n002 slot=0
rank 1=r011n003 slot=1

##################################################################################

mpirun --hostfile hostfile.0 -rf rankfile.0 -n 2 hostname   ### OK
r011n002
r011n003

##################################################################################

but

mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -n 1 hostname   ### CRASHED

--------------------------------------------------------------------------
Error, invalid rank (1) in the rankfile (rankfile.0)
--------------------------------------------------------------------------
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 404
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 87
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 77
[r011n002:25129] [[63976,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 985
--------------------------------------------------------------------------
A daemon (pid unknown) died unexpectedly on signal 1 while attempting to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
orterun noticed that the job aborted, but has no info as to the process that caused that situation.
--------------------------------------------------------------------------
orterun: clean termination accomplished

It seems that the rankfile option is not propagated to the second command line; there is no global understanding of the ranking inside an mpirun command.
##################################################################################

Assuming that, I tried to provide a rankfile to each command line:

cat rankfile.0
rank 0=r011n002 slot=0

cat rankfile.1
rank 0=r011n003 slot=1

mpirun --hostfile hostfile.0 -rf rankfile.0 -n 1 hostname : -rf rankfile.1 -n 1 hostname   ### CRASHED

[r011n002:28778] *** Process received signal ***
[r011n002:28778] Signal: Segmentation fault (11)
[r011n002:28778] Signal code: Address not mapped (1)
[r011n002:28778] Failing at address: 0x34
[r011n002:28778] [ 0] [0xffffe600]
[r011n002:28778] [ 1] /tmp/HALMPI/openmpi-1.3.1/lib/libopen-rte.so.0(orte_odls_base_default_get_add_procs_data+0x55d) [0x5557decd]
[r011n002:28778] [ 2] /tmp/HALMPI/openmpi-1.3.1/lib/libopen-rte.so.0(orte_plm_base_launch_apps+0x117) [0x555842a7]
[r011n002:28778] [ 3] /tmp/HALMPI/openmpi-1.3.1/lib/openmpi/mca_plm_rsh.so [0x556098c0]
[r011n002:28778] [ 4] /tmp/HALMPI/openmpi-1.3.1/bin/orterun [0x804aa27]
[r011n002:28778] [ 5] /tmp/HALMPI/openmpi-1.3.1/bin/orterun [0x804a022]
[r011n002:28778] [ 6] /lib/libc.so.6(__libc_start_main+0xdc) [0x9f1dec]
[r011n002:28778] [ 7] /tmp/HALMPI/openmpi-1.3.1/bin/orterun [0x8049f71]
[r011n002:28778] *** End of error message ***
Segmentation fault (core dumped)

I hope that I've found a bug, because this capability would be very important for me: launching a multi-executable mpirun command line and being able to bind my executables to sockets.

Thanks in advance for your help

Geoffroy
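For that use case, and assuming the slot=<socket>:<core> form mentioned earlier in this digest behaves as intended, one job-wide rankfile for the four-process master/slave layout might pin the masters to socket 0 and the slaves to socket 1:

    cat rankfile
    rank 0=r001n001 slot=0:0
    rank 1=r001n002 slot=0:0
    rank 2=r001n001 slot=1:0
    rank 3=r001n002 slot=1:0

Here ranks 0-1 would be the master.x processes and ranks 2-3 the slave.x processes, as in Ralph's example above; whether such a binding is actually honoured is exactly what the PLPA test earlier in this digest was probing.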
--
Jeff Squyres
Cisco Systems

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users