Hi, 
I found the reason why the program is killed by the operating system when 
the problem size is large.
It consumes more memory, which leads to more swapping. 
This also degrades the program's performance. 
However, I cannot determine which function of the worker process causes the 
problem. 
I have used try-catch in my code, but no exception was thrown.
I found this explanation: 
-------------------------------------------------------------------
When the processes running on your server attempt to allocate more memory 
than your system has available, the kernel begins to swap memory pages to 
and from the disk.
This is done in order to free up sufficient physical memory to meet the RAM 
allocation requirements of the requestor.
-------------------------------------------------------------------
I am not sure whether it is really caused by CPLEX (an optimization model 
solver), by other routines, or by dynamic memory allocation that the CPLEX 
API library performs in the background. 
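
One way to narrow down which routine drives the memory growth is to log the process's peak resident set size around each suspect call. If the kernel's out-of-memory killer is what ends the process, it does so with SIGKILL, which cannot be caught, so a try-catch block would see nothing. The sketch below assumes a C++ worker on Linux, where `getrusage` reports `ru_maxrss` in kilobytes. The names `max_rss_kb` and `PeakRssScope` are hypothetical helpers, not part of Open MPI or CPLEX.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <sys/resource.h>

// Peak resident set size of this process so far.
// On Linux ru_maxrss is reported in kB; returns -1 if the call fails.
long max_rss_kb() {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0) return -1;
    return usage.ru_maxrss;
}

// Hypothetical RAII guard: logs on entry, then logs how much the peak RSS
// grew while the guard was alive. Output goes to stderr, which is unbuffered
// by default, so the "entering" line survives even if the process is killed
// inside the guarded region.
class PeakRssScope {
public:
    explicit PeakRssScope(const std::string& label)
        : label_(label), start_kb_(max_rss_kb()) {
        std::fprintf(stderr, "entering %s (peak RSS %ld kB)\n",
                     label_.c_str(), start_kb_);
    }
    ~PeakRssScope() {
        const long grew_kb = max_rss_kb() - start_kb_;
        std::fprintf(stderr, "leaving %s: peak RSS grew by %ld kB\n",
                     label_.c_str(), grew_kb);
    }
    long start_kb() const { return start_kb_; }

private:
    std::string label_;
    long start_kb_;
};
```

Usage would look like `{ PeakRssScope scope("solve model"); /* call the solver here */ }`, with one guard per suspect routine. If the process is killed, the last "entering" line without a matching "leaving" line points at the region that was running at the time.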
Any help is really appreciated. 
Jack
From: r...@open-mpi.org
List-Post: users@lists.open-mpi.org
Date: Wed, 13 Apr 2011 10:34:38 -0600
To: us...@open-mpi.org
Subject: Re: [OMPI users] OMPI monitor each process behavior




On Apr 13, 2011, at 10:19 AM, Jack Bryan wrote:

Hi, I am using mpirun (Open MPI) 1.3.4

But I have these: 
orte-clean  orted       orte-iof    orte-ps     orterun
Can they do the same thing? 
Unfortunately, no.

If I use them, will they use a lot of memory on each worker node and write a 
lot of output to log files?
No, but they won't help. orte-top would be run only on the head node (i.e., 
where you are logged in), and would generate output to your screen.
But you don't have it with that release, so the point is moot. Afraid there 
isn't much else you can do - you might talk to your sys admin and see what 
tools are available on your cluster for this purpose. Perhaps a nice parallel 
debugger is available?


Any help is really appreciated. 
Thanks
Jack 
From: r...@open-mpi.org
List-Post: users@lists.open-mpi.org
Date: Wed, 13 Apr 2011 08:09:17 -0600
To: us...@open-mpi.org
Subject: Re: [OMPI users] OMPI monitor each process behavior

What version are you using? If you are using 1.5.x, there is an "orte-top" 
command that will do what you ask. It queries the daemons to get the info.

On Apr 12, 2011, at 9:55 PM, Jack Bryan wrote:

Hi all, 
I need to monitor the memory usage of each parallel process on a Linux Open MPI 
cluster. 
But the top and ps commands cannot help here because they only show 
information for the head node. 
I need to follow the behavior of each process on each cluster node.
I cannot use ssh to access each node. 
The program takes 8 hours to finish. 
Any help is really appreciated. 
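
When the nodes cannot be reached by ssh, one option is to have each process report its own memory use. The sketch below assumes Linux and a C++ program that can be modified. It reads the `VmRSS` (current) and `VmHWM` (peak) fields from `/proc/self/status`. The helper names are hypothetical; in a real MPI program the rank and host name would come from `MPI_Comm_rank` and `MPI_Get_processor_name`, which are left out here so the example compiles without MPI.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Return the value in kB of a field such as "VmRSS" or "VmHWM" from text in
// the format of /proc/<pid>/status, or -1 if the field is missing.
long parse_status_kb(std::istream& in, const std::string& key) {
    assert(!key.empty());
    const std::string prefix = key + ":";
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream fields(line.substr(prefix.size()));
            long kb = -1;
            if (fields >> kb) return kb;
            return -1;
        }
    }
    return -1;
}

// Current resident set size of the calling process, in kB (Linux only).
long current_rss_kb() {
    std::ifstream status("/proc/self/status");
    return parse_status_kb(status, "VmRSS");
}

// Peak resident set size of the calling process, in kB (Linux only).
long peak_rss_kb() {
    std::ifstream status("/proc/self/status");
    return parse_status_kb(status, "VmHWM");
}

// Build one log line that identifies the process by rank and host.
std::string format_mem_report(int rank, const std::string& host,
                              long rss_kb, long hwm_kb) {
    std::ostringstream out;
    out << "[rank " << rank << " on " << host << "] VmRSS=" << rss_kb
        << " kB VmHWM=" << hwm_kb << " kB";
    return out.str();
}
```

Each rank could write `format_mem_report(rank, host, current_rss_kb(), peak_rss_kb())` to stderr at regular points in its main loop. Open MPI normally forwards the output of every rank to the terminal where mpirun was started, so the reports from all nodes should be visible on the head node.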
Jack 
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
