_LIBRARY_PATH differently
I don't think that Ubuntu does anything differently than any other Linux.
Did you compile Open MPI on your own, or did you install it from a repository?
Are the CUDA applications written by yourself, or are they freely available
applications?
-- Reuti
> and instead add
s.
> Regarding the final part of the email, is it a problem that 'undefined
> reference' is appearing?
Yes, it tried to resolve the missing symbols and didn't succeed.
-- Reuti
>
> Thanks and regards,
> Tim
>
> On 22 May 2017 at 06:54, Reuti wrote:
>
>&
Hi,
On 23.05.2017, at 05:03, Tim Jim wrote:
> Dear Reuti,
>
> Thanks for the reply. What options do I have to test whether it has
> successfully built?
Like before: can you compile and run mpihello.c this time - all as an ordinary
user, in case you installed the Open MPI into so
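For example, a minimal check could be (assuming mpihello.c is the small test
program from the earlier mail, and that the freshly installed mpicc and mpirun
are found first in the PATH):
$ mpicc mpihello.c -o mpihello
$ mpirun -np 2 ./mpihello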
chemistry
> program.
Did you compile Open MPI on your own? Did you move it after the installation?
-- Reuti
64?
2) do you source .bashrc also for interactive logins? Otherwise it should go to
~/.bash_profile or ~/.profile
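A generic snippet for this (not from the original mail) is to source ~/.bashrc
from ~/.bash_profile:
# in ~/.bash_profile
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi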
>
>
> On Tue, 30.5.17, Reuti wrote:
>
> Subject: Re: [OMPI users] No components were able to be opened in the pml
is an additional point: which one?
It might be that you have to put the two exports of PATH and LD_LIBRARY_PATH
in your jobscript instead, if you never want to run the application from the
command line in parallel.
-- Reuti
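As a sketch (the installation prefix is only an assumption here), the jobscript
could start with:
#!/bin/sh
# adjust the prefix to the actual Open MPI installation
export PATH=$HOME/local/openmpi/bin${PATH:+:$PATH}
export LD_LIBRARY_PATH=$HOME/local/openmpi/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
# $NSLOTS is set by SGE to the number of granted slots
mpirun -np $NSLOTS ./a.out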
>
> On Tue,
> qsub -pe orte 8 -b y -V -l m_mem_free=40G -cwd mpirun -np 8 a.out
m_mem_free is part of Univa SGE (but not the various free ones of SGE AFAIK).
Also: this syntax is for SGE; in LSF it's different.
To have this independent from the actual queuing system, one could look into
DR
e their headers installed on it. Then configure OMPI
> --with-xxx pointing to each of the RM’s headers so all the components get
> built. When the binary hits your customer’s machine, only those components
> that have active libraries present will execute.
Just note, th
a string in the environment variable, you may want to use the
plain value in bytes there.
-- Reuti
rsions. So I can't comment on this for sure, but it seems to set the memory
also in cgroups.
-- Reuti
> mpirun just uses the nodes that SGE provides.
>
> What your cmd line does is restrict the entire operation on each node (daemon
> + 8 procs) to 40GB of memory. OMPI
arently
> never propagated through remote startup,
Isn't it a setting inside SGE which the sge_execd is aware of? I never exported
any environment variable for this purpose.
-- Reuti
> so killing those orphans after
> VASP crashes may fail, though resource reporting works. (I ne
> How do I get around this cleanly? This works just fine when I set
> LD_LIBRARY_PATH in my .bashrc, but I’d rather not pollute that if I can avoid
> it.
Do you set or extend the LD_LIBRARY_PATH in your .bashrc?
-- Reuti
IBRARY_PATH
>
> this is the easiest option, but cannot be used if you plan to relocate the
> Open MPI installation directory.
There is the tool `chrpath` to change the rpath and runpath inside a
binary/library. This then has to match the relocated directory.
-- Reuti
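For example (paths made up for illustration; note that chrpath usually cannot
make the new path longer than the original one):
$ chrpath -l /new/prefix/bin/mpirun                    # show the current rpath/runpath
$ chrpath -r /new/prefix/lib /new/prefix/bin/mpirun    # replace it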
> an other
obscript to feed
an "adjusted" $PE_HOSTFILE to Open MPI and then it's working as intended: Open
MPI creates forks.
Does anyone else need such a patch in Open MPI and is it suitable to be
included?
-- Reuti
PS: Only the headnodes have more than one network interface in our cas
list.
-- Reuti
n package with pgcc/pgCC too (then it succeeds for
me)?
Although I think it's a bug, I tend to stay inside one and the same compile
process with only one compiler vendor anyway.
-- Reuti
NB: Latest version is 1.4.2.
> I appreciate if any one can help me.
>
> Best Regard
>
> a simple as changing your LD_LIBRARY_PATH, but it might not.
what's inside the binaries (like mpirun) can be checked with:
$ readelf -a mpirun
...
0x000f (RPATH) Library rpath:
[/opt/pgi/linux86-64/9.0-4/lib]
0x0000001d (RUNPATH) Library run
ot; or "linux threads" then no, you cannot have different
> threads on different nodes under any programming paradigm.
There are some efforts like http://www.kerrighed.org/wiki/index.php/Main_Page,
but for the current release the thread migration is indeed disabled.
-- Reuti
> Ho
w you to transfer information between nodes and start
worker processes:
http://www.netlib.org/pvm3/book/node17.html
Looks like PVM is no longer included in Pelican_HPC by default, but you can
compile it on your own.
-- Reuti
> This is what i've tried to explain in the last msg. A dream for
index.php/Main_Page). Wasn't this
also one idea behind "High Performance Fortran" - running in parallel across
nodes even without knowing that it's across nodes at all while programming, and
accessing all data as if it were local.
-- Reuti
ed passphraseless keys which I would
suggest replacing with hostbased authentication anyway).
-- Reuti
> Does this setting fail over to ssh if rsh is not available or should it just
> use rsh only??? Also is there any command
> (this is a linux cluster) to see if ssh is being used. I
t file, not the name of the input
file (maybe the web page you mention describes MPICH(1)).
-- Reuti
> Is there a solution for our problem?
>
> Regards,
> Andrei
and it's per slot then; and with a tight integration all Open MPI
processes will be children of sge_execd and the limit will be enforced.
-- Reuti
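Assuming the limit in question is SGE's h_vmem, such a per-slot request could
look like:
$ qsub -pe orte 8 -l h_vmem=2G myjob.sh
i.e. 2 GB per granted slot.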
> The nodes of our cluster each have 24GB of physical memory, of
> which 4GB is taken up by the kernel and the root file system.
> Not
On 08.10.2010, at 00:40, Ralph Castain wrote:
>
> On Oct 7, 2010, at 2:55 AM, Reuti wrote:
>
>> On 07.10.2010, at 01:55, David Turner wrote:
>>
>>> Hi,
>>>
>>> We would like to set process memory limits (vmemoryuse, in csh
>>> ter
e the fact that all 4 forks are
bound to one core. Should it really be four?
-- Reuti
Hi,
On 14.10.2010, at 13:23, Dave Love wrote:
> Reuti writes:
>
>> With the default binding_instance set to "set" (the default) the
>> shepherd should bind the processes to cores already. With other types
>> of binding_instance these selected cores must be fo
just to monitor the physical and virtual machines by an application
running under MPI? It sounds like it could be done by Ganglia or Nagios then.
-- Reuti
> Please, Can You help me with architecture of this system (is my thoughts
> right) ?
> And one more qeustion - that is the bes
On 22.10.2010, at 14:09, Vasiliy G Tolstov wrote:
> On Fri, 2010-10-22 at 14:07 +0200, Reuti wrote:
>> Hi,
>>
>> On 22.10.2010, at 10:58, Vasiliy G Tolstov wrote:
>>
>>> Hello. May be this question already answered, but i can't see it in list
>>>
gument list from the program, it has split the
> argument into two (or more).
>
> I have also enclosed the arguments in quotes, but that doesn't seem to
> help.
I don't know about Windows, but sometimes it can help to use single and double
quotes in combination:
'"C:\
hosts
> ./hello
> --
> There are no allocated resources for the application
> ./hello
> that match the requested mapping:
> ../Cluster.hosts
What's in the above file? Do you run it by hand in a cluster or directed by a
queuing system?
-- Reuti
> Verify tha
you mention below. So they should work together by design.
-- Reuti
> Open MPI makes an ABI promise (that started with version 1.3.2) that all the
> releases in a given feature series and its corresponding super-stable series
> (i.e., x.y.* and x.(y+1).*, where y is odd) are ABI compat
On 29.10.2010, at 18:47, Jeff Squyres wrote:
> On Oct 29, 2010, at 12:40 PM, Reuti wrote:
>
>>> I'd have to go check 1.4.3 and 1.4.1 to be sure, but I would generally
>>> *NOT* assume that different versions like this are compatible.
>>
>> I'm g
pe_hostfile. Here the
logical core and socket numbers are printed (they start at 0
and have no holes) in colon-separated pairs (i.e.
0,0:1,0, which means core 0 on socket 0 and core 0 on socket 1).
For more information about the $pe_hostfile check
sge_pe(5)
binding_instance" "pe" and reformat the information in the
$PE_HOSTFILE to a "rankfile", it should work to get the desired allocation.
Maybe you can share the script with this list once you get it working.
-- Reuti
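A rough sketch of such a conversion (untested; it assumes the fourth column of
$PE_HOSTFILE holds the socket,core pairs as described above, and that one rank
per pair is wanted):
#!/bin/sh
# turn the SGE $PE_HOSTFILE binding column into an Open MPI rankfile
rank=0
while read host slots queue binding; do
    for pair in $(echo "$binding" | tr ':' ' '); do
        socket=${pair%,*}
        core=${pair#*,}
        echo "rank $rank=$host slot=$socket:$core"
        rank=$((rank + 1))
    done
done < "$PE_HOSTFILE" > myrankfile
The resulting file could then be handed to Open MPI with `mpirun -rf myrankfile ...`.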
he -binding pe linear:1, each
> execution node binds processes for the job_id to one core. If I have
> -binding pe linear:2, I get:
>
> exec6.cluster.stats.local 2 batch.q@exec6.cluster.stats.local 0,1:0,2
So the cores 1 and 2 on socket 0 aren't free?
-- Reuti
> exec1.
ocesses on each slave
node are threads and don't invoke an additional `qrsh -inherit ...`. If you
have only one MPI process per node, is it working fine?
-- Reuti
> Cheers,
>
> Chris
>
>
> --
> Dr Chris Jewell
> Department of Statistics
> University of Wa
er.stats.local 0,1:0,2
> exec3.cluster.stats.local 1 batch.q@exec3.cluster.stats.local 0,1:0,2
> exec2.cluster.stats.local 1 batch.q@exec2.cluster.stats.local 0,1:0,2
> exec5.cluster.stats.local 1 batch.q@exec5.cluster.stats.local 0,1:0,2
>
> Is
machine if possible.
- possibly only 3, when only 3 slots are granted on a machine
- you will never ever get more than 4 slots per machine, i.e. it's an upper
limit for slots per machine for this particular job
-- Reuti
>
> --td
>
>> -- Reuti
>>
>>
>>
Correction:
On 15.11.2010, at 20:23, Terry Dontje wrote:
> On 11/15/2010 02:11 PM, Reuti wrote:
>> Just to give my understanding of the problem:
>>
>> On 15.11.2010, at 19:57, Terry Dontje wrote:
>>
>>
>>> On 11/15/2010 11:08 AM, Chris Jewell wrote:
On 16.11.2010, at 10:26, Chris Jewell wrote:
> Hi all,
>
>> On 11/15/2010 02:11 PM, Reuti wrote:
>>> Just to give my understanding of the problem:
>>>>
>>>>>> Sorry, I am still trying to grok all your email as what the problem you
>>&g
rd SGE option today (at least, I know
> it used to be). I don't believe any patch or devel work is required (to
> either SGE or OMPI).
When you use a fixed allocation_rule and a matching -binding request it will
work today. But any other case won't be distributed in the correct w
n we get a full output of such a run with -report-bindings turned on. I
> think we should find out that things actually are happening correctly except
> for the fact that the 6 of the nodes have 2 cores allocated but only one is
> being bound to by a process.
You mean, to accept the current behavior as the intended one, since in the end,
with only one job running on these machines, we get what we asked for - despite
the fact that cores are lost for other processes?
-- Reuti
On 16.11.2010, at 17:35, Terry Dontje wrote:
> On 11/16/2010 10:59 AM, Reuti wrote:
>> On 16.11.2010, at 15:26, Terry Dontje wrote:
>>
>>
>>>>>
>>>>> 1. allocate a specified number of cores on each node to your job
>>>>>
>&g
lf unconstrained.
> If I understand the thread correctly, you're saying that this is what happens
> today - true?
Exactly. It won't apply any binding at all, and orted will consider itself
unlimited, i.e. limited only by the number of slots it should use there.
-- Reuti
And a second one, "limit_cores_by_slot_count true/false", instead of new
allocation_rules. Choosing $fillup, $round_robin or others is independent of
limiting it IMO.
-- Reuti
> In summary I don't believe there is any OMPI bugs related to what we've seen
> and the OGE i
on a Rocks cluster. Any
>> ideas on how to debug this will be greatly appreciated.
Is `xterm` working when you start it with `mpirun` on the nodes?
Besides specifying -X all the time, you can also put it into ~/.ssh/config in a
"Host *" rule. SSH authentication is hostbased (or
ostfile $PBS_NODEFILE \
> ./testmom
Often it's sufficient to unset some of the environment variables during
execution to switch off the automatic integration, i.e. unsetting $PBS_JOBID
and $PBS_ENVIRONMENT in the jobscript could already do it.
-- Reuti
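A minimal sketch of such a jobscript (names taken from the quoted command
above, otherwise assumptions):
#!/bin/sh
# keep the nodefile, but hide Torque from Open MPI so the TM startup isn't used
NODEFILE=$PBS_NODEFILE
unset PBS_JOBID PBS_ENVIRONMENT
mpirun -hostfile $NODEFILE ./testmom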
> Brock Palen
> www.umic
PATH=/usr/lib/openmpi/bin${PATH:+:$PATH}
export LD_LIBRARY_PATH=/usr/lib/openmpi/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
-- Reuti
> Currently, mpirun executes successfully on either node individually.
> However, when trying to run over the network, I get:
>
> [mpiuser@c-199 ~]$ mpirun
the nodes. When some of the parallel tasks are started on the nodes, they will
get most of the computing time (this means: intentional oversubscription). The
background queue can be used for less important jobs. Such a setup is useful
when your parallel application isn't running in parallel all the
On 25.01.2011, at 20:10, Will Glover wrote:
> Thanks for your response, Reuti. Actually I had seen you mention the SGE
> mailing list in response to a similar question but I can't for the life of me
> find that list :(
The list was removed with the shutdown of the open source
queuing
system this happens sometimes). What you can try is:
$ mpirun -np 2 --debug ./a.out a b "'c d'"
$ mpirun -np 2 --debug ./a.out a b "\"c d\""
-- Reuti
> and not
>
> a b "c d"
>
> I think there is an issue in parsing the
ame happens with single quotes inside double quotes?
-- Reuti
> and not:
>
> a
> b
> "c d"
>
> 2011/1/27 Reuti
> Hi,
>
> On 27.01.2011, at 09:48, Gabriele Fatigati wrote:
>
> > Dear OpenMPI users and developers,
> >
> > i'm usi
a failed rank and act appropriately on its own?
Having a true ability to survive a dying process (i.e. rank) which might
already have been computing for hours would mean having some kind of "rank
RAID" or "rank Parchive". E.g. start 12 ranks when you need 10 - whatever 2 ranks
On 27.01.2011, at 16:10, Joshua Hursey wrote:
>
> On Jan 27, 2011, at 9:47 AM, Reuti wrote:
>
>> On 27.01.2011, at 15:23, Joshua Hursey wrote:
>>
>>> The current version of Open MPI does not support continued operation of an
>>> MPI application after pro
Open MPI, compile it to build into e.g. ~/local/openmpi-1.4.3, adjust your
PATH and you are done.
Unless you build it with static libraries, it might in addition be necessary to
adjust LD_LIBRARY_PATH at runtime.
I most often use my own version on the clusters I have access to and disregard
any i
will avoid setting any
known_hosts file or passphraseless ssh-keys for each user.
-- Reuti
> HostName domU-12-31-39-07-35-21
> BatchMode yes
> IdentityFile /home/tsakai/.ssh/tsakai
> ChallengeResponseAuthentication no
> IdentitiesOnly yes
>
> # machine B
hostkeys
(private and public); this way it works for all users. Just for reference:
http://arc.liv.ac.uk/SGE/howto/hostbased-ssh.html
You could look into it later.
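In short, the pieces the howto walks through are essentially the standard
OpenSSH settings (exact file locations may differ per distribution):
# /etc/ssh/ssh_config on the clients
HostbasedAuthentication yes
EnableSSHKeysign yes
# /etc/ssh/sshd_config on the servers
HostbasedAuthentication yes
# plus the node names in /etc/ssh/shosts.equiv and the collected host keys
# in /etc/ssh/ssh_known_hosts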
==
- Can you try to use a command when connecting from A to B? E.g. `ssh
domU-12-31-39-06-74-E2 ls`. Is this working too?
- Wha
Hi,
On 10.02.2011, at 22:03, Tena Sakai wrote:
> Hi Reuti,
>
> Thanks for suggesting "LogLevel DEBUG3." I did so and complete
> session is captured in the attached file.
>
> What I did is much similar to what I have done before: verify
> that ssh works and th
2011 from vixen.egcrc.org
> [tsakai@dasher ~]$
> [tsakai@dasher ~]$ cd Notes/R/parallel/Rmpi/
> [tsakai@dasher Rmpi]$
> [tsakai@dasher Rmpi]$ mpirun -app app.ac3
> mpirun: killing job...
a) can you ssh from dasher to vixen?
b) firewall on vixen?
-- Reuti
>
of your source and execution.
-- Reuti
> 2011/3/26 Ralph Castain
> Can you update to a more recent version? That version is several years
> out-of-date - we don't even really support it any more.
>
>
> On Mar 26, 2011, at 1:04 PM, Michele Marena wrote:
>
>> Yes
east under Torque.
>
> Somebody is playing an April Fools joke on you. The majority of
> supercomputers use ssh as their sole launch mechanism, and I have seen no
> indication that anyone intends to change that situation. That said, Torque is
> certainly popular and a good environment.
up, each daemon monitors mpirun's existence. So Torque
> only knows about mpirun, and Torque kills mpirun when (e.g.) walltime is
> reached. OMPI's daemons see that mpirun has died and terminate their local
> processes prior to terminating themselves.
I thought Open MPI has a
r each call
Problems:
1) control `ssh` under Torque
2) provide a partial hostlist to `mpirun`, maybe by disabling the default
tight integration
-- Reuti
> A simple example:
>
> vayu1:~/MPI > qsub -lncpus=24,vmem=24gb,walltime=10:00 -wd -I
> qsub: waiting for job 574900.vu-pbs t
On 05.04.2011, at 11:11, SLIM H.A. wrote:
> After an upgrade of our system I receive the following error message
> (openmpi 1.4.2 with gridengine):
Did you move openmpi 1.4.2 to a new (i.e. different) location?
-- Reuti
each process on each cluster node.
>
> I cannot use ssh to access each node.
What about submitting another job with `mpirun ... ps -e f` or the like - in
case you can request the same nodes?
Can you `qrsh` to a node via the queuing system?
-- Reuti
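E.g. a small diagnostic jobscript could be (assuming the same nodes can be
requested again; --pernode starts one instance per node):
#!/bin/sh
mpirun --pernode ps -e f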
> The program takes 8 hours to fi
e 8 is the maximum IBM
offers from their datasheet, and still you can request 16 per node?
Can it be a memory problem?
-- Reuti
> We have also tried test jobs on 8+7 (or 7+8) with inconclusive results.
> Some of the live jobs run for a month or more and cut down versions do
> not model
could run and
> completed successfully
>
> qsub -q dev.q ./ompi_job.sh
Then you are bypassing SGE’s slot allocation and will have wrong accounting and
no job control of the slave tasks.
-- Reuti
> 4) Although I don't think PATH and LD_LIBRARY_PATH would cause issues in
> ub
nfiguration
> of MPI with Condor.
> If you know any person who uses Condor for running MPI jobs then please let
> me know.
Is the use of Open MPI supported by Condor? In former times they had a special
universe only for MPICH(1), and only for an older version of it, to run
parallel jobs under Condor. Di
libraries in your jobscript? Does the output of:
#!/bin/sh
which mpiexec
echo $LD_LIBRARY_PATH
ldd ompi_job
show the expected ones (ompi_job is the binary and ompi_job.sh the script) when
submitted with a PE request?
-- Reuti
> jsv_url none
> jsv_allowed_mod
All stuff below looks fine.
You can even try to start "from scratch" with a private copy of Open MPI which
you install for example in $HOME/local/openmpi-1.4.3 and set the paths
accordingly.
-- Reuti
> #!/bin/sh
> which mpiexec
> echo $LD_LIBRARY_PATH
> ldd ompi_job
>
t works.
>
> I want to confirm one more thing: does SGE's master host need to have OpenMPI
> installed? Is it relevant?
In principle: no. But often it's installed too, as you will compile on either
the master machine or a dedicated login server.
-- Reuti
> Many thanks Re
.ac.uk/SGE/howto/hostbased-ssh.html , or enable `rsh` on the
machines and tell Open MPI to use it. Is:
mpiexec hostname
giving you a list of the involved machines?
-- Reuti
> Thanks a lot,
> Regards,
> ArchyGU
> Nanyang Technological University
Good, then please supply a hostfile with the names of the machines you want to
use for a particular run and give it as an option to `mpiexec`. See the options
-np and -machinefile.
-- Reuti
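E.g. (host names made up):
$ cat myhosts
node01 slots=4
node02 slots=4
$ mpiexec -np 8 -machinefile myhosts ./a.out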
On 19.04.2011, at 06:38, mohd naseem wrote:
> sir
> when i give mpiexec hostname command.
> it only
run bash -c "$SCRIPT" >& "$out" &, and with "mpi" it will do the same
> with 'mpirun -np 1' prepended. The output I get is:
what about:
( trap "" sigint; exec mpiexec ...) &
i.e. replace the subshell with changed interrupt
o I stop it
> doing so?
I get:
$ cat /proc/31619/status
...
SigCgt: 4b813efb
...
$ trap '' int
$ cat /proc/31619/status
...
SigCgt: 4b813ef9
...
$ trap '' hup
$ cat /proc/31619/status
...
SigCgt: 4b813ef8
Looks like SIGINT(2) is bit 1 and likewise SIGH
t- will cause ctrl-c to be sent to -both-
> processes.
What about setsid and pushing it into a new session instead of using & in the
script?
-- Reuti
>
> At least when I test it, even non-mpirun processes will abort.
>
>> it's not unreasonable to expect to be able
B, the working script looks like:
>>
>> setsid bash -c "mpirun command>& out"&
>> tail -f out
>>
>
> Yes - but now you can't kill mpirun when something goes wrong
You can still send a sigint from the command line to the mpirun process o
tication:
http://arc.liv.ac.uk/SGE/howto/hostbased-ssh.html
You have the same users on all machines with the same UID and GID?
-- Reuti
> mpirun --prefix /usr/local/openmpi1.4.3 -np 4 --hostfile hostfile hello
>
>
>
> Copied below is the output. How d
S. in both parts if the cluster, me (login marked as x here) can login
>> to any node by ssh without need to type the password.
From the headnode of the cluster to a node, or also between nodes?
-- Reuti
>>
>>
>>
>> --
ient? Does the -bynode
> switch imply only one slot is used on each node before it moves on to the
> next?
Do I get it right: inside the slots granted by SGE you want the allocation
inside Open MPI to follow a specific pattern, i.e. which rank goes where?
On 26.07.2011, at 21:51, Ralph Castain wrote:
>
> On Jul 26, 2011, at 1:39 PM, Reuti wrote:
>
>> Hi,
>>
>> On 26.07.2011, at 21:19, Lane, William wrote:
>>
>>> I can successfully run the MPI testcode via OpenMPI 1.3.3 on less than 87
On 27.07.2011, at 19:43, Lane, William wrote:
> Thank you for your help Ralph and Reuti,
>
> The problem turned out to be the number of file descriptors was insufficient.
>
> The reason given by a sys admin was that since SGE isn't a user it wasn't
> initially us
).
>
> ring_c was compiled separately on each computer, however both have the same
> version of openmpi and OSX. I've gone through the FAQ and searched the user
> forum, but I can't quite seems to get this problem unstuck.
do you have any firewall on the machines?
-- Reuti
that code
> has hit a distribution yet.
Can you please point me to these projects?
I was always wondering how to phrase it in a submission request. It would need
to include a specification like: I need 2 hrs 2 cores, then 30 minutes 1 core
and finally 6 hrs 4 cores, which already targets features of a real-ti
d of the user before you submit the job? In
GridEngine you can specify whether the actual group id should be used for the
job, or the default login id.
With a tight integration, the slave processes will also run with the same
group id.
-- Reuti
> Ed
>
> From: Ralph Castain [mailto:
Torque I can't make any definite statement for it.
Are you resetting some variables inside the job script to let it run outside of
Torque, i.e. without the tight integration?
-- Reuti
> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
>
15
>>
>> bin/mpirun --machinefile mpihosts.dat -np 16 -mca orte_rsh_agent
>> /usr/bin/rsh./test_setup
Is it a typo or copy/paste error? There is a blank missing between /usr/bin/rsh
and ./test_setup.
-- Reuti
>> bloscel@f8312's password:
i-forum.org/mpi3-ft/ which covers fault tolerance.
I was pointed to it here
http://www.open-mpi.org/community/lists/users/2011/01/15440.php
-- Reuti
> On Sep 12, 2011, at 5:52 PM, Rob Stewart wrote:
>
>> Hi,
>>
>> I have implemented a simple fault tolerant ping pong
d/or
do local configurations exist overwriting this (qconf -sconfl)?
-- Reuti
>
> Thanks for any guidance,
>
> Ed
>
>
> error: executing task of job 139362 failed: execution daemon on host "f8312"
> didn't accept task
> -
failed: execution daemon on host "f8312"
> didn't accept task
Did you supply a machinefile on your own? In a proper SGE integration it's
running in a parallel environment. Did you define and request one? The error
looks li
but not the launcher).
Did this change? I thought you need --with-sge to get SGE support as it's no
longer the default since 1.3?
-- Reuti
>
> Try setting the following:
>
> -mca plm_rsh_disable_qrsh 1
>
> on your cmd line. That should force it to avoid qrsh, and use rsh ins
ll is fine, the
job script is already the first task and TRUE should work.
-- Reuti
> -Original Message-
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Reuti
> Sent: Tuesday, September 13, 2011 4:27 PM
> To: Open MPI Users
> Sub
On 14.09.2011, at 00:25, Blosch, Edwin L wrote:
> Your comment guided me in the right direction, Reuti. And overlapped with
> your guidance, Ralph.
>
> It works: if I add this flag then it runs
> --mca plm_rsh_disable_qrsh
>
> Thank you both for the explanations.
>
On 14.09.2011, at 00:29, Ralph Castain wrote:
>
> On Sep 13, 2011, at 4:25 PM, Reuti wrote:
>
>> On 13.09.2011, at 23:54, Blosch, Edwin L wrote:
>>
>>> This version of OpenMPI I am running was built without any guidance
>>> regarding SGE in the configur
Hi,
On 14.09.2011, at 17:39, Blosch, Edwin L wrote:
> Thanks, Ralph,
>
> I get the failure messages, unfortunately:
>
> setgid FAILED
> setgid FAILED
> setgid FAILED
N.B.: This would be a side effect of the tight integration to do it
automatically for all slave t
On 14.09.2011, at 19:02, Blosch, Edwin L wrote:
> Thanks for trying.
>
> Do you feel that this is an impossible request without the assistance of some
> process running as root, for example, as Reuti mentioned, the daemons of a
> job scheduler? Or are you saying it will
rent uid/gid on the machines, but with the new feature they must be
the same. Okay, in a cluster they are most likely the same across all machines
anyway. But just to note it as a side effect.
-- Reuti
> Thank you again for the support
>
>
>
> From: users-boun...@open-mpi.org [ma
oo many arguments.
It works again if I use * in place of @, but this will change how bash
will assemble the arguments:
$ set "11 12" "21 22"
$ echo $1
11 12
$ echo $2
21 22
$ sg 24000 "./tt.sh ${*}"
12
11
What I expect is:
$ ./tt.sh "${@}"
21 22
11 1
on other nodes run with group 1040, and the files they create have
>> group ownership 1040.
What about setting the set-gid flag for the directory? Created files therein
will inherit the group from the folder then (which has to be set to the group
in question, of course).
-- Reuti
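E.g. (directory and group made up for illustration):
$ chgrp 1040 /shared/results
$ chmod g+s /shared/results
New files created in /shared/results will then inherit group 1040 regardless of
the creator's primary group.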
software.intel.com/file/6335 (page 260)
You could try the mentioned switches to see whether you get more consistent
output.
If there were an MPI ABI, and you could just drop in any MPI library, it
would be quite easy to spot the real point where the discrepancy occurred.
-- Reuti
> Tha