Hi all,
I explained the problem I'm facing at http://www.ideone.com/EGMMn
Please help.
thanks
Hi,
The job queue has a time budget, which is set in my job script.
For example, my current queue allows 24 hours.
But my program got SIGKILL (signal 9) less than 2 hours after it started
running.
Are there other possible settings that I need to consider?
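For reference, the time budget in a Torque/PBS job script is typically
requested with a walltime directive along these lines (the value shown is
illustrative, not necessarily my exact script):

    #PBS -l walltime=24:00:00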
thanks
Jack
If your program runs faster across 3 processes, 2 of which are local to each
other, with --mca btl tcp,self compared to --mca btl tcp,sm,self, then
something is very, very strange.
Tim cites all kinds of things that can cause slowdowns, but it's still very,
very odd that simply enabling using t
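For anyone reproducing that comparison, the two runs being contrasted would
look roughly like the following (the process count and application name are
placeholders taken from elsewhere in the thread, not the exact commands used):

    mpirun -np 3 --mca btl tcp,self ./myapplication
    mpirun -np 3 --mca btl tcp,sm,self ./myapplication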
+1 on what Ralph is saying.
You need to talk to your local administrators and ask them why Torque is
killing your job. Perhaps you're submitting to a queue that only allows jobs
to run for a few seconds, or something like that.
On Mar 27, 2011, at 3:08 PM, Ralph Castain wrote:
> It means tha
Hi, I use MPI_Barrier to make all processes terminate at the same time.
int main() {
  for master node:
    while (loop <= LOOP_NUMBER) {
      master distributes tasks to workers; master collects results; ++loop;
    }
  for
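A minimal, self-contained sketch of that structure (LOOP_NUMBER, the
one-double task/result messages, and the placeholder computation are
illustrative assumptions, not the real code):

#include <mpi.h>

#define LOOP_NUMBER 10               /* assumed iteration count */

int main(int argc, char **argv) {
    int rank, size, loop, w;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    for (loop = 0; loop < LOOP_NUMBER; ++loop) {
        if (rank == 0) {
            /* master: hand one task to each worker, then collect results */
            double task, result;
            for (w = 1; w < size; ++w) {
                task = (double)loop;
                MPI_Send(&task, 1, MPI_DOUBLE, w, 0, MPI_COMM_WORLD);
            }
            for (w = 1; w < size; ++w)
                MPI_Recv(&result, 1, MPI_DOUBLE, w, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
        } else {
            /* worker: receive a task, compute, send the result back */
            double task, result;
            MPI_Recv(&task, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            result = task * 2.0;     /* placeholder computation */
            MPI_Send(&result, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }
        MPI_Barrier(MPI_COMM_WORLD); /* keep all ranks in step each iteration */
    }

    MPI_Finalize();
    return 0;
}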
I checked that this issue is not caused by using different compile
options for different libraries. There is a set of libraries and an
executable compiled with mpif90, and the warning appears for the
executable's object and for one of the libraries...
2011/3/25 Dmitry N. Mikushin :
> Hi,
>
> I'm wondering if anyb
This might not have anything to do with your problem, but how do you
finalize your worker nodes when your master loop terminates?
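For example, one common pattern (a sketch with assumed names like STOP_TAG,
not your code) is for the master to send every worker an explicit stop
message after its loop ends, so each worker can leave its receive loop and
reach MPI_Finalize:

#include <mpi.h>

#define STOP_TAG 99                  /* assumed tag for the shutdown message */

int main(int argc, char **argv) {
    int rank, size, w;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        /* ... master's main work loop would run here ... */
        /* when it finishes, tell every worker to shut down */
        for (w = 1; w < size; ++w)
            MPI_Send(NULL, 0, MPI_DOUBLE, w, STOP_TAG, MPI_COMM_WORLD);
    } else {
        /* worker: keep receiving until the stop message arrives */
        MPI_Status status;
        double task;
        for (;;) {
            MPI_Recv(&task, 1, MPI_DOUBLE, 0, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            if (status.MPI_TAG == STOP_TAG)
                break;               /* master is done; go to MPI_Finalize */
            /* ... otherwise process the task and send a result back ... */
        }
    }

    MPI_Finalize();
    return 0;
}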
On Sun, Mar 27, 2011 at 3:27 PM, Jack Bryan wrote:
> Hi, my original bug is :
>
> --
> mpirun
Hi, my original bug is:
--
mpirun noticed that process rank 0 with PID 77967 on node n342 exited on
signal 9 (Killed).
--
The main framework of my code i
It means that Torque is unhappy with your job - either you are running longer
than it permits, or you exceeded some other system limit.
Talk to your sys admin about imposed limits. Usually, there are flags you can
provide to your job submission that allow you to change limits for your program.
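As one illustration (Torque/PBS syntax; the values are placeholders and your
site may cap them), limits are usually adjusted at submission time with
resource requests such as:

    qsub -l walltime=24:00:00 -l mem=8gb myjob.pbs

and a queue's configured limits can be inspected with something like:

    qstat -Q -f <queue_name>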
Hi, I have figured out how to run the command.
OMPI_RANKFILE=$HOME/$PBS_JOBID.ranks
mpirun -np 200 -rf $OMPI_RANKFILE --mca btl self,sm,openib -output-filename
700g200i200p14ye ./myapplication
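For reference, the file behind $OMPI_RANKFILE uses Open MPI's rankfile
syntax; a hypothetical two-entry example (hostnames and slots are
placeholders):

    rank 0=node-1-16 slot=0
    rank 1=node-1-17 slot=0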
Each process prints its output to a distinct file.
But the program is terminated by this error:
--
On Mar 27, 2011, at 7:37 AM, Tim Prince wrote:
> On 3/27/2011 2:26 AM, Michele Marena wrote:
>> Hi,
>> My application performs well when it does not use shared memory, but with
>> shared memory I get worse performance.
>> Am I making a mistake? Is there something I am not paying attention to?
On 3/27/2011 2:26 AM, Michele Marena wrote:
Hi,
My application performs well when it does not use shared memory, but with
shared memory I get worse performance.
Am I making a mistake? Is there something I am not paying attention to?
I know OpenMPI uses /tmp directory to allocate shared memory
This is my machinefile
node-1-16 slots=2
node-1-17 slots=2
node-1-18 slots=2
node-1-19 slots=2
node-1-20 slots=2
node-1-21 slots=2
node-1-22 slots=2
node-1-23 slots=2
Each cluster node has 2 processors. I launch my application with 3
processes, one on node-1-16 (manager) and two on node-1-17 (worke
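A launch of that shape would be something like the following (the machinefile
name and btl list here are illustrative, not necessarily the exact command):

    mpirun -np 3 -machinefile machinefile --mca btl tcp,sm,self ./myapplication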
Hi,
My application performs well when it does not use shared memory, but with
shared memory I get worse performance.
Am I making a mistake? Is there something I am not paying attention to?
I know OpenMPI uses the /tmp directory to allocate shared memory, and it is
on the local filesystem.
I thank you
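In case it matters, would pointing the session directory at a memory-backed
filesystem (for example with the orte_tmpdir_base MCA parameter, if my Open
MPI version supports it) be a reasonable experiment?

    mpirun --mca orte_tmpdir_base /dev/shm -np 3 -machinefile machinefile ./myapplication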