Prakash,

> tm_poll: protocol number dis error 11
> ret is 17002 instead of 0: tm_init failed
> 3 processes killed (possibly by Open MPI)
I encountered a similar problem with OpenPBS before, which also uses the TM interface. It returned TM_ENOTCONNECTED (17002) when I tried to call tm_init a second time (tm_init in turn calls tm_poll, which returned that errno).
I think what you did was call tm_init from another node, which tries to connect to another mom, and I do not think that is allowed. The TM module in Open MPI has already called tm_init once. Why do you need to call tm_init again?
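
To illustrate the "one tm_init per process" point, here is a minimal sketch in C. It does not use the real TM library: mock_tm_init() is a hypothetical stand-in that only mimics the behaviour I described above (a second call fails with 17002), and ensure_tm_init() is a hypothetical guard, not something taken from Open MPI or PBS. The real tm_init takes arguments that are omitted here.

```c
#include <assert.h>

/* Error values as seen in the output above: 0 on success, 17002 on failure. */
#define TM_SUCCESS        0
#define TM_ENOTCONNECTED  17002

/* Hypothetical stand-in for tm_init(): succeeds once, then rejects
 * any further call. This is an assumption for illustration only,
 * not the real TM signature or implementation. */
static int mock_connected = 0;

int mock_tm_init(void)
{
    if (mock_connected)
        return TM_ENOTCONNECTED;
    mock_connected = 1;
    return TM_SUCCESS;
}

/* Hypothetical guard: remember that initialisation already happened
 * and skip the second call instead of letting it fail. */
static int tm_ready = 0;

int ensure_tm_init(void)
{
    if (tm_ready)
        return TM_SUCCESS;      /* already initialised, nothing to do */

    int rc = mock_tm_init();
    if (rc == TM_SUCCESS)
        tm_ready = 1;
    return rc;
}

/* Small usage check: both guarded calls succeed. */
int demo(void)
{
    assert(ensure_tm_init() == TM_SUCCESS);  /* first call initialises */
    assert(ensure_tm_init() == TM_SUCCESS);  /* second call is skipped */
    return 0;
}
```

Note that a guard like this only helps within your own code. It cannot know that Open MPI's TM module has already initialised TM in the same process, so the simplest fix is still not to call tm_init yourself.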
If you are curious about how PBS implements this, you can download the source from openpbs.org. OpenPBS source: v2.3.16/src/lib/Libifl/tm.c
-- Thanks, - Pak Lui pak....@sun.com