Ah - okay, my misunderstanding. Would you be willing to give the trunk a try? 
It might help to know if the problem is solely in 1.6, or continues.
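
One thing worth checking on the verbose settings quoted below: as far as I know, Open MPI only reads MCA parameters from environment variables that carry the OMPI_MCA_ prefix, so variables named OMPI_plm_base_verbose would be ignored, which might explain the missing output. A minimal sketch of the environment form of the two -mca settings, assuming a bash-like shell and that R/Rmpi is launched from that same shell:

```shell
# Environment equivalent of "-mca plm_base_verbose 5 -mca dpm_base_verbose 5".
# Open MPI looks for MCA parameters in variables named OMPI_MCA_<param>.
export OMPI_MCA_plm_base_verbose=5
export OMPI_MCA_dpm_base_verbose=5

# Confirm both variables are visible before starting R/Rmpi from this shell.
env | grep '^OMPI_MCA_'
```

If the extra output still does not appear, the environment is probably not reaching the spawned processes, which is a question for the Rmpi side.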


On Jul 26, 2012, at 4:32 PM, Brock Palen wrote:

> I think so, sorry if I gave you the impression that Rmpi changed.
> 
> Brock Palen
> www.umich.edu/~brockp
> CAEN Advanced Computing
> bro...@umich.edu
> (734)936-1985
> 
> 
> 
> On Jul 26, 2012, at 7:30 PM, Ralph Castain wrote:
> 
>> Guess I'm confused - your original note indicated that something had changed 
>> in Rmpi that broke things. Are you now saying it was something in OMPI?
>> 
>> On Jul 26, 2012, at 4:22 PM, Brock Palen wrote:
>> 
>>> OK, will see. We had Rmpi working with 1.4, and it has not been updated
>>> since 2010, so this kinda stinks.
>>> 
>>> I will keep digging into it; thanks for the help.
>>> 
>>> On Jul 26, 2012, at 7:16 PM, Ralph Castain wrote:
>>> 
>>>> Crud - afraid you'll have to ask them, then :-(
>>>> 
>>>> 
>>>> On Jul 26, 2012, at 3:50 PM, Brock Palen wrote:
>>>> 
>>>>> Ralph,
>>>>> 
>>>>> Rmpi wraps everything up, so I tried setting them with
>>>>> 
>>>>> export OMPI_plm_base_verbose=5
>>>>> export OMPI_dpm_base_verbose=5
>>>>> 
>>>>> and I get no extra messages, even on a simple MPI-1.0 helloworld
>>>>> example.
>>>>> 
>>>>> 
>>>>> On Jul 26, 2012, at 6:42 PM, Ralph Castain wrote:
>>>>> 
>>>>>> Well, it looks like comm_spawn is working on 1.6. Afraid I don't know 
>>>>>> enough about Rmpi/snow to advise on what changed, but you could add some 
>>>>>> debug params to get an idea of where the problem is occurring:
>>>>>> 
>>>>>> -mca plm_base_verbose 5 -mca dpm_base_verbose 5
>>>>>> 
>>>>>> should tell you from an OMPI perspective. I can try to help debug that 
>>>>>> end, at least.
>>>>>> 
>>>>>> 
>>>>>> On Jul 26, 2012, at 3:02 PM, Ralph Castain wrote:
>>>>>> 
>>>>>>> Weird - looks like it has done a comm_spawn and is having trouble 
>>>>>>> connecting between the jobs. I can check the basic code and make sure 
>>>>>>> it is working - I seem to recall someone else recently talking about 
>>>>>>> Rmpi changes causing problems (different ones than this, IIRC), so you 
>>>>>>> might want to search our user archives for rmpi to see what they ran 
>>>>>>> into. Not sure what Rmpi changed, or why.
>>>>>>> 
>>>>>>> On Jul 26, 2012, at 2:41 PM, Brock Palen wrote:
>>>>>>> 
>>>>>>>> I have run into a problem using Rmpi with OpenMPI (trying to get snow 
>>>>>>>> running).
>>>>>>>> 
>>>>>>>> I built OpenMPI following another post where I built static:
>>>>>>>> 
>>>>>>>> ./configure --prefix=$INSTALL/gcc-4.4.6-static \
>>>>>>>>   --mandir=$INSTALL/gcc-4.4.6-static/man --with-tm=/usr/local/torque/ \
>>>>>>>>   --with-openib --with-psm --enable-static CC=gcc CXX=g++ FC=gfortran \
>>>>>>>>   F77=gfortran
>>>>>>>> 
>>>>>>>> Rmpi/snow work fine when I run on a single node.  When I span more 
>>>>>>>> than one node I get nasty errors (pasted below).
>>>>>>>> 
>>>>>>>> I tested this MPI install with a simple hello world and that works.
>>>>>>>> Any thoughts on what is different about Rmpi/snow that could cause this?
>>>>>>>> 
>>>>>>>> [nyx0400.engin.umich.edu:11927] [[48116,0],4] ORTE_ERROR_LOG: Not found in file routed_binomial.c at line 386
>>>>>>>> [nyx0400.engin.umich.edu:11927] [[48116,0],4]:route_callback tried routing message from [[48116,2],16] to [[48116,1],0]:16, can't find route
>>>>>>>> [nyx0405.engin.umich.edu:07707] [[48116,0],8] ORTE_ERROR_LOG: Not found in file routed_binomial.c at line 386
>>>>>>>> [nyx0405.engin.umich.edu:07707] [[48116,0],8]:route_callback tried routing message from [[48116,2],32] to [[48116,1],0]:16, can't find route
>>>>>>>> [0] func:/home/software/rhel6/openmpi-1.6.0/gcc-4.4.6-static/lib/libopen-rte.so.4(opal_backtrace_print+0x1f) [0x2b7e9209e0df]
>>>>>>>> [1] func:/home/software/rhel6/openmpi-1.6.0/gcc-4.4.6-static/lib/libopen-rte.so.4(+0x9f77a) [0x2b7e9206577a]
>>>>>>>> [2] func:/home/software/rhel6/openmpi-1.6.0/gcc-4.4.6-static/lib/libopen-rte.so.4(mca_oob_tcp_msg_recv_complete+0x27f) [0x2b7e920404af]
>>>>>>>> [3] func:/home/software/rhel6/openmpi-1.6.0/gcc-4.4.6-static/lib/libopen-rte.so.4(+0x7bed2) [0x2b7e92041ed2]
>>>>>>>> [4] func:/home/software/rhel6/openmpi-1.6.0/gcc-4.4.6-static/lib/libopen-rte.so.4(opal_event_base_loop+0x238) [0x2b7e92087e38]
>>>>>>>> [5] func:/home/software/rhel6/openmpi-1.6.0/gcc-4.4.6-static/lib/libopen-rte.so.4(orte_daemon+0x8d8) [0x2b7e92016768]
>>>>>>>> [6] func:orted(main+0x66) [0x400966]
>>>>>>>> [7] func:/lib64/libc.so.6(__libc_start_main+0xfd) [0x3d39c1ecdd]
>>>>>>>> [8] func:orted() [0x400839]
>>>>>>>> [nyx0397.engin.umich.edu:06959] [[48116,0],1] ORTE_ERROR_LOG: Not found in file routed_binomial.c at line 386
>>>>>>>> [nyx0397.engin.umich.edu:06959] [[48116,0],1]:route_callback tried routing message from [[48116,2],7] to [[48116,1],0]:16, can't find route
>>>>>>>> [nyx0401.engin.umich.edu:07782] [[48116,0],5] ORTE_ERROR_LOG: Not found in file routed_binomial.c at line 386
>>>>>>>> [nyx0401.engin.umich.edu:07782] [[48116,0],5]:route_callback tried routing message from [[48116,2],23] to [[48116,1],0]:16, can't find route
>>>>>>>> [nyx0406.engin.umich.edu:07743] [[48116,0],9] ORTE_ERROR_LOG: Not found in file routed_binomial.c at line 386
>>>>>>>> [nyx0406.engin.umich.edu:07743] [[48116,0],9]:route_callback tried routing message from [[48116,2],39] to [[48116,1],0]:16, can't find route
>>>>>>>> [0] func:/home/software/rhel6/openmpi-1.6.0/gcc-4.4.6-static/lib/libopen-rte.so.4(opal_backtrace_print+0x1f) [0x2ae2ad17d0df]
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> users mailing list
>>>>>>>> us...@open-mpi.org
>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users