Hi Jeff,
  I am using the Open MPI 1.1 alpha 7 release.
  My MPI process with threads was not terminating even when I sent SIGINT from 
the keyboard. I shall check it again anyhow and get back to you.

  I am terminating all MPI actions in the threads (my thread waits on MPI::Recv, 
and I send data to that MPI::Recv from the signal handler to terminate it). I 
call MPI::Finalize in the main thread, where I called MPI::Init.

  A snippet of the code would be:
  Thread
  -------------
  signal(SIGALRM, signal_handler);

  while(flag)
  {
  alarm(3);
  MPI::Recv();
  alarm(0);
  .....
  }

  void signal_handler(int sig)
  {
  MPI::Send(); //to the waiting thread
  flag=false;
  }
  ------------------
  Main
  ---------------------
  Main thread
  {
  MPI::Init();
  pthread_create();
  MPI::Finalize();

  }


  There are 25 such processes, all running the same code as above. Each sends to 
and receives from the others, but they don't signal each other.
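  A note on the handler in the snippet above: the MPI standard does not require 
implementations to be signal safe, so calling MPI::Send from inside a signal 
handler may itself be a source of trouble. A minimal sketch of the usual 
alternative, where the handler only records the request and the thread's loop 
checks it between MPI calls, might look like this. The names g_stop, on_alarm, 
install_stop_handler and stop_requested are illustrative assumptions, not part 
of Open MPI or of the original program.

```cpp
#include <signal.h>

// Written from the handler, read from the thread loop.
// sig_atomic_t is the type that may safely be written in a handler.
static volatile sig_atomic_t g_stop = 0;

// The handler does nothing except record that a stop was requested.
// No MPI calls are made here.
extern "C" void on_alarm(int) { g_stop = 1; }

// Hypothetical helper: installs on_alarm for the given signal.
// Returns true if sigaction() succeeded.
bool install_stop_handler(int signo) {
    struct sigaction sa = {};
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sa.sa_handler = on_alarm;
    return sigaction(signo, &sa, nullptr) == 0;
}

// The receive loop would test this between MPI calls instead of
// relying on the handler to send a wake-up message.
bool stop_requested() { return g_stop != 0; }
```

  With this approach the loop would need a receive that returns periodically 
(for example by polling) so that the flag actually gets checked; how best to do 
that depends on the application.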


  Is there any way I can terminate all processes on the other nodes at once? Or 
do I need to write a script of my own?

  Regards,
  Imran

"Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:
      1. Starting from scratch is probably easiest.  If you installed Open MPI 
to its own directory, just remove the installation directory.  If you installed 
Open MPI to a directory that contains other things, a "make uninstall" in your 
original Open MPI source tree should completely uninstall it properly.

  2. What specific version of Open MPI are you using?  We just fixed a shared 
memory threaded issue -- I'm afraid I didn't follow this thread closely enough 
to remember if you updated before or after that fix.

  3. Are you saying that your processes would not die if you killed them with 
SIGINT?  This would be extremely strange.

  4. Note that there are issues with signals and threads on Linux -- IIRC, you 
can't necessarily guarantee which thread will catch which signal.  It depends 
on what you are doing with your SIGALRM processing -- how are you shutting down 
MPI?  Are you terminating all MPI actions in threads before calling 
MPI_FINALIZE?

  5. Open MPI does not have an equivalent of lamclean or lamwipe at this time.  
Sorry!
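
  Regarding point 4, one common POSIX pattern for making signal delivery 
predictable in a threaded program is to block the signal in every thread and 
have one dedicated thread collect it with sigwait(). The following is only a 
sketch under that assumption, with no MPI calls in it; the names 
g_shutdown_requested, signal_watcher and demo_dedicated_signal_thread are 
hypothetical and SIGUSR1 stands in for whichever signal the application uses.

```cpp
#include <atomic>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

// Set by the watcher thread; worker threads would poll it between MPI calls.
static std::atomic<bool> g_shutdown_requested{false};

// Watcher thread: waits synchronously for any signal in the given set.
static void* signal_watcher(void* arg) {
    const sigset_t* set = static_cast<const sigset_t*>(arg);
    int signo = 0;
    if (sigwait(set, &signo) == 0) {
        g_shutdown_requested.store(true);
    }
    return nullptr;
}

// Hypothetical demo driver: returns true if the watcher thread
// received the process-directed signal.
bool demo_dedicated_signal_thread(int signo) {
    g_shutdown_requested.store(false);

    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);

    // Block the signal here first; threads created afterwards inherit
    // this mask, so no thread receives the signal asynchronously.
    sigset_t old_mask;
    if (pthread_sigmask(SIG_BLOCK, &set, &old_mask) != 0) return false;

    pthread_t watcher;
    if (pthread_create(&watcher, nullptr, signal_watcher, &set) != 0) {
        pthread_sigmask(SIG_SETMASK, &old_mask, nullptr);
        return false;
    }

    // Send a process-directed signal, as the keyboard or alarm() would.
    kill(getpid(), signo);

    // 'set' stays alive until the watcher has finished with it.
    pthread_join(watcher, nullptr);
    pthread_sigmask(SIG_SETMASK, &old_mask, nullptr);
    return g_shutdown_requested.load();
}
```

  In a real application the mask would be set in the main thread before any 
other thread is created, and the MPI threads would check the flag and finish 
their MPI work before the main thread calls MPI_FINALIZE.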




---------------------------------
  From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
Behalf Of imran shaik
Sent: Wednesday, May 31, 2006 1:41 AM
To: Open MPI Users
Subject: Re: [OMPI users] Few more questions



Thanks Brian,
I shall download alpha 8 and upgrade. I have a few more questions.

1) Are there simple ways to upgrade, or shall I start from scratch?


2) Please look at the following error message.

P= 14 NA= 0 RF--> 16
P=10 RN=53
P= 10 NA= 53 RF--> 8
Signal:11 info.si_errno:0(Success) si_code:196609()
Failing at addr:0x2
[0] func:/usr/local/openmpi/lib/libopal.so.0 [0x40178df4]
[1] func:/lib/libpthread.so.0 [0x40040e07]
[2] func:/lib/libc.so.6 [0x402c94f0]
[3] func:/usr/local/openmpi/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0x7e2) [0x4047ded2]
[4] func:/usr/local/openmpi/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_event_thread+0x40) [0x4047d6e4]
[5] func:/lib/libpthread.so.0 [0x4003aae0]
[6] func:/lib/libc.so.6(__clone+0x57) [0x40383927]
*** End of error message ***
P=2 RN=72
P= 2 NA= 72 RF--> 6
P=18 RN=34
P= 18 NA= 34 RF--> 3
mpirun noticed that job rank 3 with PID 5621 on node "Neelw4" exited on signal 11.
-----------------------------
I run 25 processes, each having a thread that makes MPI calls alongside 
its main thread. I use the THREAD_MULTIPLE option. I am registering a function to 
catch the SIGALRM signal in the thread. Each thread catches the signal after some 
time and terminates normally. This looks like the same problem as the previous 
one: sometimes the error message appears, and sometimes it doesn't. 
What could be the problem?

3) None of the threads (not even the main thread) were catching SIGINT.

4) Is there any way to make the threads catch signals without creating problems 
like the ones I faced above?

5) Is there any tool available to wipe out all processes across the nodes, like 
lamclean or wipe? Is there anything you would suggest?

Thanks and regards,

Imran





Brian Barrett <brbar...@open-mpi.org> wrote:
On May 26, 2006, at 11:31 PM, imran shaik wrote:

> I have installed the Open MPI alpha 7 release. I created an MPI program 
> with pthreads. I ran it with just 6 processes, each thread making MPI 
> calls concurrently with the main thread. Things work fine. I use a TCP 
> network.
>
> Sometimes I get a strange error message.



> Sometimes I get this error message, and sometimes not. I would say I 
> get it about once in 7 runs. But I get the output properly and the 
> program works fine. I just wanted to know why that occurred.

We just released alpha 8, which should include a fix for a problem 
that sounds very similar to what you are seeing. Can you try 
upgrading and see if that solves your problem?

> Another thing: I tried to get verbose output from "mpirun", but 
> couldn't, and not from "mpiexec" either. I was using the same command as
> in LAM, mpirun -v -np 6 myprogram, where I used to get verbose output 
> saying which process is running where. Here nothing happens.
>
> What is the problem? Otherwise, how can I know which process is 
> running on which node? Any suggestions?

We don't currently have a good way of dealing with this. You can get 
lots of debugging information from the -d option to mpirun, but it 
would be difficult to get exactly what you are looking for from the 
debugging output.

Your best bet would probably be to use gethostname() and MPI_Comm_rank() 
inside your MPI application and print the results to stdout / stderr.
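
As a rough sketch of that suggestion, the helper below builds the line each 
process could print. To keep it compilable without MPI, the rank is passed in 
as a parameter; in a real program it would come from MPI_Comm_rank on 
MPI_COMM_WORLD. The function name placement_line is an illustrative 
assumption, not an Open MPI API.

```cpp
#include <string>
#include <unistd.h>

// Hypothetical helper: formats "rank N on <hostname>" for one process.
// 'rank' is assumed to be the value obtained from MPI_Comm_rank.
std::string placement_line(int rank) {
    char host[256] = {0};
    // Leave the last byte untouched so the name is always NUL-terminated.
    if (gethostname(host, sizeof(host) - 1) != 0) {
        return "rank " + std::to_string(rank) + " on <unknown host>";
    }
    return "rank " + std::to_string(rank) + " on " + host;
}
```

Each process would print this once after MPI_Init, which gives roughly the 
process-to-node listing that "mpirun -v" provided in LAM.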


Brian

-- 
Brian Barrett
Open MPI developer
http://www.open-mpi.org/


_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
