Thanks Warner,
This is frustrating... I read the ticket. Six months already, and the fix has
been postponed across two releases... Frankly, I am very skeptical that this
will be fixed in 1.3.4. I really hope so, but when will 1.3.4 be released?

I have to decide between going back to 1.2.x, with the possible disruptions
to my configuration (I use Fink), and waiting.
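
If I do go down the 1.2.x route, what I have in mind (untested, and only a
sketch based on Fink's usual commands and .info fields) is roughly:

  fink list openmpi    # see which version Fink currently packages
  # if only 1.3.3 is offered, point the .info back at a 1.2.x tarball
  # (the Source: and Source-MD5: fields), then:
  fink rebuild openmpi
  fink install openmpi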

And I have already offered to test any nightly snapshot that claims to fix
this bug.

Cheers,
Alan

On Fri, Aug 14, 2009 at 17:20, Warner Yuen <wy...@apple.com> wrote:

> Hi Alan,
>
> Xgrid support is currently broken in the latest version of Open MPI; see
> the ticket below. However, I believe that Xgrid still works with one of the
> earlier 1.2 versions. I don't recall for sure, but I think it's Open MPI
> 1.2.3.
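>
> If you need to stay on 1.3.3 in the meantime, one possible workaround
> (untested on my end, so treat this as a sketch) is to keep Open MPI from
> selecting the broken Xgrid launcher at run time and fall back to the
> normal rsh launcher:
>
>   # exclude the xgrid launch component ('^' negates the list):
>   om-mpirun --mca plm ^xgrid -c 2 mpiapp
>
>   # or unset the environment variables the xgrid component keys on
>   # (assuming I have the names right):
>   unset XGRID_CONTROLLER_HOSTNAME XGRID_CONTROLLER_PASSWORD
>
> Of course that only gets jobs running again; it doesn't give you Xgrid
> scheduling.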
>
> #1777: Xgrid support is broken in the v1.3 series
> ---------------------+------------------------------------------------------
>  Reporter:  jsquyres |        Owner:  brbarret
>      Type:  defect   |       Status:  accepted
>  Priority:  major    |    Milestone:  Open MPI 1.3.4
>   Version:  trunk    |   Resolution:
>  Keywords:           |
> ---------------------+------------------------------------------------------
>
> Changes (by bbenton):
>
>  * milestone:  Open MPI 1.3.3 => Open MPI 1.3.4
>
>
> Warner Yuen
> Scientific Computing
> Consulting Engineer
> Apple, Inc.
> email: wy...@apple.com
> Tel: 408.718.2859
>
> On Aug 14, 2009, at 6:21 AM, users-requ...@open-mpi.org wrote:
>
>
>> Date: Fri, 14 Aug 2009 14:21:30 +0100
>> From: Alan <alanwil...@gmail.com>
>> Subject: [OMPI users] openmpi with xgrid
>> To: us...@open-mpi.org
>>
>>
>> Hi there,
>> I saw http://www.open-mpi.org/community/lists/users/2007/08/3900.php.
>>
>> I use Fink, so I changed the openmpi.info file in order to build openmpi
>> with Xgrid support.
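>>
>> Roughly, the edit (from memory, so the exact field and flag here are
>> best-effort) was to stop the ConfigureParams line from excluding the
>> Xgrid component at configure time:
>>
>>   # before: the xgrid component was skipped during the build
>>   ConfigureParams: --prefix=%p ... --enable-mca-no-build=plm-xgrid
>>   # after: let configure detect XgridFoundation on its own
>>   ConfigureParams: --prefix=%p ...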
>>
>> As you can see:
>> amadeus[2081]:~/Downloads% /sw/bin/ompi_info
>>                Package: Open MPI root@amadeus.local Distribution
>>               Open MPI: 1.3.3
>>  Open MPI SVN revision: r21666
>>  Open MPI release date: Jul 14, 2009
>>               Open RTE: 1.3.3
>>  Open RTE SVN revision: r21666
>>  Open RTE release date: Jul 14, 2009
>>                   OPAL: 1.3.3
>>      OPAL SVN revision: r21666
>>      OPAL release date: Jul 14, 2009
>>           Ident string: 1.3.3
>>                 Prefix: /sw
>> Configured architecture: x86_64-apple-darwin9
>>         Configure host: amadeus.local
>>          Configured by: root
>>          Configured on: Fri Aug 14 12:58:12 BST 2009
>>         Configure host: amadeus.local
>>               Built by:
>>               Built on: Fri Aug 14 13:07:46 BST 2009
>>             Built host: amadeus.local
>>             C bindings: yes
>>           C++ bindings: yes
>>     Fortran77 bindings: yes (single underscore)
>>     Fortran90 bindings: yes
>> Fortran90 bindings size: small
>>             C compiler: gcc
>>    C compiler absolute: /sw/var/lib/fink/path-prefix-10.6/gcc
>>           C++ compiler: g++
>>  C++ compiler absolute: /sw/var/lib/fink/path-prefix-10.6/g++
>>     Fortran77 compiler: gfortran
>>  Fortran77 compiler abs: /sw/bin/gfortran
>>     Fortran90 compiler: gfortran
>>  Fortran90 compiler abs: /sw/bin/gfortran
>>            C profiling: yes
>>          C++ profiling: yes
>>    Fortran77 profiling: yes
>>    Fortran90 profiling: yes
>>         C++ exceptions: no
>>         Thread support: posix (mpi: no, progress: no)
>>          Sparse Groups: no
>>  Internal debug support: no
>>    MPI parameter check: runtime
>> Memory profiling support: no
>> Memory debugging support: no
>>        libltdl support: yes
>>  Heterogeneous support: no
>> mpirun default --prefix: no
>>        MPI I/O support: yes
>>      MPI_WTIME support: gettimeofday
>> Symbol visibility support: yes
>>  FT Checkpoint support: no  (checkpoint thread: no)
>>          MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.3.3)
>>          MCA paffinity: darwin (MCA v2.0, API v2.0, Component v1.3.3)
>>              MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.3.3)
>>              MCA carto: file (MCA v2.0, API v2.0, Component v1.3.3)
>>          MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.3.3)
>>              MCA timer: darwin (MCA v2.0, API v2.0, Component v1.3.3)
>>        MCA installdirs: env (MCA v2.0, API v2.0, Component v1.3.3)
>>        MCA installdirs: config (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA dpm: orte (MCA v2.0, API v2.0, Component v1.3.3)
>>             MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.3.3)
>>          MCA allocator: basic (MCA v2.0, API v2.0, Component v1.3.3)
>>          MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.3.3)
>>               MCA coll: basic (MCA v2.0, API v2.0, Component v1.3.3)
>>               MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.3.3)
>>               MCA coll: inter (MCA v2.0, API v2.0, Component v1.3.3)
>>               MCA coll: self (MCA v2.0, API v2.0, Component v1.3.3)
>>               MCA coll: sm (MCA v2.0, API v2.0, Component v1.3.3)
>>               MCA coll: sync (MCA v2.0, API v2.0, Component v1.3.3)
>>               MCA coll: tuned (MCA v2.0, API v2.0, Component v1.3.3)
>>                 MCA io: romio (MCA v2.0, API v2.0, Component v1.3.3)
>>              MCA mpool: fake (MCA v2.0, API v2.0, Component v1.3.3)
>>              MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.3.3)
>>              MCA mpool: sm (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA pml: cm (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA pml: csum (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA pml: v (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA bml: r2 (MCA v2.0, API v2.0, Component v1.3.3)
>>             MCA rcache: vma (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA btl: self (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA btl: sm (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA btl: tcp (MCA v2.0, API v2.0, Component v1.3.3)
>>               MCA topo: unity (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA osc: rdma (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA iof: hnp (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA iof: orted (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA iof: tool (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA oob: tcp (MCA v2.0, API v2.0, Component v1.3.3)
>>               MCA odls: default (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA ras: slurm (MCA v2.0, API v2.0, Component v1.3.3)
>>              MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.3.3)
>>              MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.3.3)
>>              MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA rml: oob (MCA v2.0, API v2.0, Component v1.3.3)
>>             MCA routed: binomial (MCA v2.0, API v2.0, Component v1.3.3)
>>             MCA routed: direct (MCA v2.0, API v2.0, Component v1.3.3)
>>             MCA routed: linear (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA plm: rsh (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA plm: slurm (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA plm: xgrid (MCA v2.0, API v2.0, Component v1.3.3)
>>              MCA filem: rsh (MCA v2.0, API v2.0, Component v1.3.3)
>>             MCA errmgr: default (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA ess: env (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA ess: hnp (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA ess: singleton (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA ess: slurm (MCA v2.0, API v2.0, Component v1.3.3)
>>                MCA ess: tool (MCA v2.0, API v2.0, Component v1.3.3)
>>            MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.3.3)
>>            MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.3.3)
>>
>> All seemed fine, and I also have the Xgrid controller and agent running
>> on my laptop. But then when I tried:
>>
>> /sw/bin/om-mpirun -c 2 mpiapp  # hello world example for mpi
>> [amadeus.local:40293] [[804,0],0] ORTE_ERROR_LOG: Unknown error: 1 in file
>> src/plm_xgrid_module.m at line 119
>> [amadeus.local:40293] [[804,0],0] ORTE_ERROR_LOG: Unknown error: 1 in file
>> src/plm_xgrid_module.m at line 153
>> --------------------------------------------------------------------------
>> om-mpirun was unable to start the specified application as it
>> encountered an error.
>> More information may be available above.
>> --------------------------------------------------------------------------
>> 2009-08-14 14:16:19.715 om-mpirun[40293:10b] *** Terminating app due to
>> uncaught exception 'NSInvalidArgumentException', reason: '***
>> -[NSKVONotifying_XGConnection<0x1001164b0> finalize]: called when
>> collecting
>> not enabled'
>> 2009-08-14 14:16:19.716 om-mpirun[40293:10b] Stack: (
>>   140735390096156,
>>   140735366109391,
>>   140735390122388,
>>   4295943988,
>>   4295939168,
>>   4295171139,
>>   4295883300,
>>   4295025321,
>>   4294973498,
>>   4295401605,
>>   4295345774,
>>   4295056598,
>>   4295116412,
>>   4295119970,
>>   4295401605,
>>   4294972881,
>>   4295401605,
>>   4295345774,
>>   4295056598,
>>   4295172615,
>>   4295938185,
>>   4294971936,
>>   4294969401,
>>   4294969340
>> )
>> terminate called after throwing an instance of 'NSException'
>> [amadeus:40293] *** Process received signal ***
>> [amadeus:40293] Signal: Abort trap (6)
>> [amadeus:40293] Signal code:  (0)
>> [amadeus:40293] [ 0] 2   libSystem.B.dylib
>> 0x00000000831443fa _sigtramp + 26
>> [amadeus:40293] [ 1] 3   ???
>> 0x000000005fbfb1e8 0x0 + 1606398440
>> [amadeus:40293] [ 2] 4   libstdc++.6.dylib
>> 0x00000000827f2085 _ZN9__gnu_cxx27__verbose_terminate_handlerEv + 377
>> [amadeus:40293] [ 3] 5   libobjc.A.dylib
>> 0x0000000081811adf objc_end_catch + 280
>> [amadeus:40293] [ 4] 6   libstdc++.6.dylib
>> 0x00000000827f0425 __gxx_personality_v0 + 1259
>> [amadeus:40293] [ 5] 7   libstdc++.6.dylib
>> 0x00000000827f045b _ZSt9terminatev + 19
>> [amadeus:40293] [ 6] 8   libstdc++.6.dylib
>> 0x00000000827f054c __cxa_rethrow + 0
>> [amadeus:40293] [ 7] 9   libobjc.A.dylib
>> 0x0000000081811966 objc_exception_rethrow + 0
>> [amadeus:40293] [ 8] 10  CoreFoundation
>> 0x0000000082ef8194 _CF_forwarding_prep_0 + 5700
>> [amadeus:40293] [ 9] 11  mca_plm_xgrid.so
>> 0x00000000000ee734 orte_plm_xgrid_finalize + 4884
>> [amadeus:40293] [10] 12  mca_plm_xgrid.so
>> 0x00000000000ed460 orte_plm_xgrid_finalize + 64
>> [amadeus:40293] [11] 13  libopen-rte.0.dylib
>> 0x0000000000031c43 orte_plm_base_close + 195
>> [amadeus:40293] [12] 14  mca_ess_hnp.so
>> 0x00000000000dfa24 0x0 + 916004
>> [amadeus:40293] [13] 15  libopen-rte.0.dylib
>> 0x000000000000e2a9 orte_finalize + 89
>> [amadeus:40293] [14] 16  om-mpirun
>> 0x000000000000183a start + 4210
>> [amadeus:40293] [15] 17  libopen-pal.0.dylib
>> 0x000000000006a085 opal_event_add_i + 1781
>> [amadeus:40293] [16] 18  libopen-pal.0.dylib
>> 0x000000000005c66e opal_progress + 142
>> [amadeus:40293] [17] 19  libopen-rte.0.dylib
>> 0x0000000000015cd6 orte_trigger_event + 70
>> [amadeus:40293] [18] 20  libopen-rte.0.dylib
>> 0x000000000002467c orte_daemon_recv + 4332
>> [amadeus:40293] [19] 21  libopen-rte.0.dylib
>> 0x0000000000025462 orte_daemon_cmd_processor + 722
>> [amadeus:40293] [20] 22  libopen-pal.0.dylib
>> 0x000000000006a085 opal_event_add_i + 1781
>> [amadeus:40293] [21] 23  om-mpirun
>> 0x00000000000015d1 start + 3593
>> [amadeus:40293] [22] 24  libopen-pal.0.dylib
>> 0x000000000006a085 opal_event_add_i + 1781
>> [amadeus:40293] [23] 25  libopen-pal.0.dylib
>> 0x000000000005c66e opal_progress + 142
>> [amadeus:40293] [24] 26  libopen-rte.0.dylib
>> 0x0000000000015cd6 orte_trigger_event + 70
>> [amadeus:40293] [25] 27  libopen-rte.0.dylib
>> 0x0000000000032207 orte_plm_base_launch_failed + 135
>> [amadeus:40293] [26] 28  mca_plm_xgrid.so
>> 0x00000000000ed089 orte_plm_xgrid_spawn + 89
>> [amadeus:40293] [27] 29  om-mpirun
>> 0x0000000000001220 start + 2648
>> [amadeus:40293] [28] 30  om-mpirun
>> 0x0000000000000839 start + 113
>> [amadeus:40293] [29] 31  om-mpirun
>> 0x00000000000007fc start + 52
>> [amadeus:40293] *** End of error message ***
>> [1]    40293 abort      /sw/bin/om-mpirun -c 2 mpiapp
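>>
>> In case it helps with diagnosis, I can re-run with the launcher
>> framework's debug output turned up, and also force the plain rsh
>> launcher as a sanity check that the crash really is in the xgrid
>> component (parameter names from memory, so treat these as a sketch):
>>
>>   # verbose output from the process launch (plm) framework:
>>   /sw/bin/om-mpirun --mca plm_base_verbose 10 -c 2 mpiapp
>>   # bypass the xgrid launcher entirely:
>>   /sw/bin/om-mpirun --mca plm rsh -c 2 mpiapp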
>>
>>
>> Is there anyone successfully using Open MPI with Xgrid who would be
>> willing to share their experience? I am not new to Xgrid or MPI, but
>> with the two integrated I must say I am in uncharted waters.
>>
>> Any help would be much appreciated.
>>
>> Many thanks in advance,
>> Alan
>> --
>> Alan Wilter S. da Silva, D.Sc. - CCPN Research Associate
>> Department of Biochemistry, University of Cambridge.
>> 80 Tennis Court Road, Cambridge CB2 1GA, UK.
>>
>> >>http://www.bio.cam.ac.uk/~awd28<<



-- 
Alan Wilter S. da Silva, D.Sc. - CCPN Research Associate
Department of Biochemistry, University of Cambridge.
80 Tennis Court Road, Cambridge CB2 1GA, UK.
>>http://www.bio.cam.ac.uk/~awd28<<
