Hi Jeff,

No worries. I've been able to get the most recent nightly (1.3a, September 25th) to compile, and it does exactly what I need it to do (which is work across different subnets), and I can basically support that myself. (Not quite sure what went wrong the first time I tried this, though.)
Strange thing is, we've searched through the 1.2 code branch for the function that causes this (in the file ompi/mca/btl/tcp/btl_tcp_proc.c; the function is is_private_ipv4()) and adjusted it to always return true. This also seems to work! (I don't think this will be accepted as a patch, as I have absolutely _no_ idea what it'll break, but both solutions seem to work for me(tm).)

Regards,

Jeroen Kleijer

On Tue, Sep 30, 2008 at 4:38 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> Sorry for the delay in replying -- I thought I had replied to this already,
> but I guess I hadn't. :-(
>
> We've talked about this feature several times, but this specific
> functionality hasn't made it into the OMPI code base yet. Sorry! :-(
>
> (Patches would be gladly accepted, but note that we'll likely be kinda picky
> about this code because it's a little hairy and complex...)
>
> On Sep 19, 2008, at 7:00 PM, Jeroen Kleijer wrote:
>
>> Hi,
>>
>> I'm trying to get an Open MPI application running across different
>> nodes, but I seem to have hit a snag when the processes are on different
>> nodes, especially when the machines are on different TCP subnets.
>> The orted daemons start up fine, but after that the application borks with
>> the message:
>>
>> [0,1,2][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect]
>> connect() failed with errno=111
>>
>> I've read in this thread
>> http://thread.gmane.org/gmane.comp.clustering.open-mpi.user/3427/focus=3437
>> that Open MPI currently can't do this yet, but that (pre-release?) versions
>> of Open MPI 1.3 will work.
>> I've tried compiling Open MPI 1.3a (nightly build) and running my
>> program with that (compiled with the mpicc of Open MPI 1.3a), but I got
>> the same error message.
>>
>> Can anybody confirm that:
>> 1) Open MPI has difficulties using the TCP BTL across different subnets
>> 2) there are currently no workarounds for this?
>>
>> If there are solutions to this I'd really like to know about them, as
>> I've been trying this for quite a while now.
>>
>> Regards,
>>
>> Jeroen Kleijer
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> --
> Jeff Squyres
> Cisco Systems