Try upgrading to v3.0, or at least to the latest release in the v2.x series. The
v1.10 series is legacy and no longer maintained.
> On Nov 21, 2017, at 8:20 AM, Kulshrestha, Vipul
> wrote:
>
> Hi,
>
> I am finding that on Ctrl-C, mpirun immediately stops and does not send
> SIGTERM to the chil
Hi,
I am finding that on Ctrl-C, mpirun immediately stops and does not send
SIGTERM to the child processes.
I am using openmpi 1.10.6.
The child processes are able to handle SIGINT. I verified that by a printf in
my signal handler and then issuing SIGINT to my process directly. However, when
Am 13.03.2007 um 06:01 schrieb Ralph Castain:
I've been letting this rattle around in my head some more, and *may* have
come up with an idea of what *might* be going on.
In the GE environment, qsub only launches the daemons - the daemons are the
ones that actually "launch" your local appli
Hi Reuti (and others),
> And now the odd thing: the jobscript (with the mpirun) is gone on the
> head node of this parallel job, but all the spawned qrsh processes
> are still there:
I'm glad that someone else can almost reproduce my problem.
On the suspicion that my application was not ignoring
Am 12.03.2007 um 21:29 schrieb Ralph Castain:
But now we are going beyond Mark's initial problem.
Back to the initial problem: suspending a parallel job in SGE leads to:
19924 1786 19924 S \_ sge_shepherd-45250 -bg
19926 19924 19926 Ts| \_ /bin/sh /var/spool/sge/node39/job_script
Am 12.03.2007 um 21:29 schrieb Ralph Castain:
On 3/12/07 2:18 PM, "Reuti" wrote:
Am 12.03.2007 um 20:36 schrieb Ralph Castain:
ORTE propagates the signal to the application processes, but the ORTE
daemons never actually look at the signal themselves (looks just like a
message to them). So
I've been letting this rattle around in my head some more, and *may* have
come up with an idea of what *might* be going on.
In the GE environment, qsub only launches the daemons - the daemons are the
ones that actually "launch" your local application processes. If qsub
-notify uses qsub's knowledg
On 3/12/07 2:18 PM, "Reuti" wrote:
> Am 12.03.2007 um 20:36 schrieb Ralph Castain:
>
>> ORTE propagates the signal to the application processes, but the ORTE
>> daemons never actually look at the signal themselves (looks just like a
>> message to them). So I'm a little puzzled by that erro
Am 12.03.2007 um 20:36 schrieb Ralph Castain:
ORTE propagates the signal to the application processes, but the ORTE
daemons never actually look at the signal themselves (looks just like a
message to them). So I'm a little puzzled by that error message about the
"daemon received signal 12" -
It's supposed to be...but several of us have found it "blocking" somewhere
in the OPAL subdirectory tree.
On 3/12/07 2:06 PM, "Ben Allan" wrote:
> A build-related question about 1.1.4:
> Is parallel make (make -j 8) supported, at least with GNU make?
>
> thanks,
> Ben
>
> ___
A build-related question about 1.1.4:
Is parallel make (make -j 8) supported, at least with GNU make?
thanks,
Ben
Am 12.03.2007 um 20:22 schrieb Pak Lui:
Hi Mark,
Olesen, Mark wrote:
I'm testing openmpi 1.2rc1 with GridEngine 6.0u9 and ran into interesting
behaviour when using the qsub -notify option.
With -notify, USR1 and USR2 are sent X seconds before sending STOP and KILL
signals, respectively.
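
For readers unfamiliar with the -notify behaviour described above, here is a
minimal sketch of how an application process might catch the two warning
signals before STOP and KILL arrive. It is an illustration only, not code from
Open MPI or Grid Engine: the names received, on_notify and simulate_notify are
hypothetical, and the scheduler is imitated by the process signalling itself.

```python
import os
import signal

# Hypothetical record of which warning signals have arrived so far.
received = []


def on_notify(signum, frame):
    # Keep the handler small: only note the signal; real cleanup
    # (checkpointing, closing files) would happen in the main loop.
    received.append(signal.Signals(signum).name)


# Per the description above, USR1 warns of a coming STOP and USR2 warns
# of a coming KILL, so both get a handler.
signal.signal(signal.SIGUSR1, on_notify)
signal.signal(signal.SIGUSR2, on_notify)


def simulate_notify():
    # Stand-in for the scheduler (an assumption for this sketch):
    # deliver both warning signals to the current process.
    os.kill(os.getpid(), signal.SIGUSR1)
    os.kill(os.getpid(), signal.SIGUSR2)
    return list(received)
```

Without handlers like these, the default action for USR1 and USR2 is to
terminate the process, which is one reason the warnings can look like a crash.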
ORTE propagates the signal to the application processes, but the ORTE
daemons never actually look at the signal themselves (looks just like a
message to them). So I'm a little puzzled by that error message about the
"daemon received signal 12" - I suspect that's just a misleading message
that was s
Am 12.03.2007 um 19:55 schrieb Ralph Castain:
I'll have to look into it - I suspect this is simply an erroneous message
and that no daemon is actually being started.
I'm not entirely sure I understand what's happening, though, in your code.
Are you saying that mpirun starts some number of a
Hi Mark,
Olesen, Mark wrote:
I'm testing openmpi 1.2rc1 with GridEngine 6.0u9 and ran into interesting
behaviour when using the qsub -notify option.
With -notify, USR1 and USR2 are sent X seconds before sending STOP and KILL
signals, respectively.
When the USR2 signal is sent to the process gro
I'll have to look into it - I suspect this is simply an erroneous message
and that no daemon is actually being started.
I'm not entirely sure I understand what's happening, though, in your code.
Are you saying that mpirun starts some number of application processes which
run merrily along, and the
I'm testing openmpi 1.2rc1 with GridEngine 6.0u9 and ran into interesting
behaviour when using the qsub -notify option.
With -notify, USR1 and USR2 are sent X seconds before sending STOP and KILL
signals, respectively.
When the USR2 signal is sent to the process group with the mpirun process, I
re