Jeff,
if my understanding is correct, https requires that open-mpi.org be the only
(httpd) domain served on port 443 for a given IP (e.g. no shared hosting),
and a certificate is based on a host name (e.g. www.open-mpi.org) and can
contain wildcards (e.g. *.open-mpi.org)
so if the first condition is met, th
wrote:
> Is the web server's private key, used to generate the CSR, also
> needed? If so, perhaps IU cannot share that.
>
>
>
> On Sat, Jul 30, 2016 at 11:09 PM, Gilles Gouaillardet
> > wrote:
> > Jeff,
> >
> > if my understanding is correct, htt
The error message is related to a permission issue (which is very
puzzling in itself ...)
can you manually check the permissions ?
cd /home/pi/Downloads/openmpi-2.0.0/opal/asm
ls -l .deps/atomic-asm.Tpo atomic-asm.S
then you can
make clean
make V=1 atomic-asm.lo
and post the output
mea
] Error 1
make[2]: Leaving directory
'/home/pi/Downloads/TEST/openmpi-2.0.0/opal/asm'
make[1]: *** [Makefile:2301: all-recursive] Error 1
make[1]: Leaving directory '/home/pi/Downloads/TEST/openmpi-2.0.0/opal'
make: *** [Makefile:1800: all-recursive] Error 1
== 3 - Hypotheses ==
Solaris
I'm no expert, but searching for "__curbrk" I discovered it belongs to glibc
http://stackoverflow.com/questions/6210685/explanation-for-t
What if you run
mpirun -np 1 mysqlconnect
on your frontend (aka compilation and/or submission) host ?
does it work as expected ?
if yes, then this likely indicates a MySQL permission/configuration issue.
for example, it accepts connections from 'someUser' only from one node,
or maybe mysqld
Siegmar,
how did you configure openmpi ? which java version did you use ?
i just found a regression and you currently have to explicitly add
CFLAGS=-D_REENTRANT CPPFLAGS=-D_REENTRANT
to your configure command line
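for example (the prefix is just a placeholder) :
./configure CFLAGS=-D_REENTRANT CPPFLAGS=-D_REENTRANT --prefix=/opt/openmpi ...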
if you want to debug this issue (i cannot reproduce it on a solaris 11
x86 virtual
Can you also check there is no cpu binding issue (several mpi tasks and/or
OpenMP threads, if any, bound to the same core and doing time sharing) ?
A simple way to check that is to log into a compute node, run top and then
press 1 f j
If some cores have higher usage than others, you are likely doin
Hi Siegmar,
You might need to configure with --enable-debug and add -g -O0 to your CFLAGS
and LDFLAGS
Then once you attach with gdb, you have to find the thread that is polling :
thread 1
bt
thread 2
bt
and so on until you find the good thread
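(a shortcut, since gdb can do this in one shot :
(gdb) thread apply all bt
prints the backtrace of every thread at once)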
If _dbg is a local variable, you need to select the
It looks like we faced a similar issue :
opal_process_name_t is 64-bit aligned whereas orte_process_name_t is 32-bit
aligned. If you run on an alignment-sensitive cpu such as sparc and you are
not lucky (so to speak), you can run into this issue.
i will make a patch for this shortly
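(for illustration, a minimal standalone sketch, with hypothetical names
rather than the actual ompi structs, of why a 4-byte-aligned address
crashes when dereferenced as a 64-bit value on sparc :)

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    char buf[16] __attribute__((aligned(8))); /* gcc syntax */
    uint64_t value;
    memset(buf, 0, sizeof(buf));
    /* buf + 4 is 4-byte aligned but not 8-byte aligned :
     * on sparc, *(uint64_t *)(buf + 4) raises SIGBUS, while x86 tolerates it.
     * memcpy is the portable way to read a misaligned 64-bit value */
    memcpy(&value, buf + 4, sizeof(value));
    printf("%llu\n", (unsigned long long)value);
    return 0;
}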
Ralph Castain
variable declaration
only.
Any thought ?
Ralph Castain wrote:
>Will PR#249 solve it? If so, we should just go with it as I suspect that is
>the long-term solution.
>
>> On Oct 26, 2014, at 4:25 PM, Gilles Gouaillardet
>> wrote:
>>
>> It lo
> If you add changes to your branch, I can pass you a patch with my suggested
> alterations.
>
>> On Oct 26, 2014, at 5:55 PM, Gilles Gouaillardet
>> wrote:
>>
>> No :-(
>> I need some extra work to stop declaring orte_process_name_t and
>> ompi_process_name_
>>>>>> while
>>> (_dbg) poll(NULL, 0, 1);
>>>>>> tyr java 400 nm /usr/local/openmpi-1.9.0_64_gcc/lib64/*.so | grep -i _dbg
>>>>>> tyr java 401 nm /usr/local/openmpi-1.9.0_64_gcc/lib64/*.so | grep -i
>>>>>> JNI_OnLoad
Hi,
i tested on a RedHat 6-like linux server and could not observe any
memory leak.
BTW, are you running 32- or 64-bit cygwin ? and what is your configure
command line ?
Thanks,
Gilles
On 2014/10/27 18:26, Marco Atzeri wrote:
> On 10/27/2014 8:30 AM, maxinator333 wrote:
>> Hello,
>>
>> I notic
Thanks Marco,
I could reproduce the issue even with one node sending/receiving to itself.
I will investigate this tomorrow
Cheers,
Gilles
Marco Atzeri wrote:
>
>
>On 10/27/2014 10:30 AM, Gilles Gouaillardet wrote:
>> Hi,
>>
>> i tested on a RedHat 6 like linux s
Michael,
Could you please run
mpirun -np 1 df -h
mpirun -np 1 df -hi
on both compute and login nodes
Thanks
Gilles
michael.rach...@dlr.de wrote:
>Dear developers of OPENMPI,
>
>We have now installed and tested the bugfixed OPENMPI Nightly Tarball of
>2014-10-24 (openmpi-dev-176-g9334abc.tar.
Michael,
The available space must be greater than the requested size + 5%
From the logs, the error message makes sense to me : there is not enough space
in /tmp
Since the compute nodes have a lot of memory, you might want to try using
/dev/shm instead of /tmp for the backing files
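if i remember correctly, something like
mpirun --mca shmem_mmap_relocate_backing_file 1 --mca shmem_mmap_backing_file_base_dir /dev/shm ...
should do it (parameter names from memory ;
ompi_info --param shmem mmap --level 9 lists the exact ones)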
Cheers,
Gil
Ralph,
On 2014/10/28 0:46, Ralph Castain wrote:
> Actually, I propose to also remove that issue. Simple enough to use a
> hash_table_32 to handle the jobids, and let that point to a
> hash_table_32 of vpids. Since we rarely have more than one jobid
> anyway, the memory overhead actually decreases
Marco,
here is attached a patch that fixes the issue
/* i could not find yet why this does not occur on Linux ... */
could you please give it a try ?
Cheers,
Gilles
On 2014/10/27 18:45, Marco Atzeri wrote:
>
>
> On 10/27/2014 10:30 AM, Gilles Gouaillardet wrote:
>> Hi,
>
Hi Siegmar,
From the jvm logs, there is an alignment error in native_get_attr but i could
not find it by reading the source code.
Could you please do
ulimit -c unlimited
mpiexec ...
and then
gdb /bin/java core
And run bt on all threads until you get a line number in native_get_attr
Thanks
Gill
Thanks Marco,
pthread_mutex_init calls calloc under cygwin but does not allocate memory under
linux, so not invoking pthread_mutex_destroy causes a memory leak only under
cygwin.
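(a minimal sketch of the correct pairing :)

#include <pthread.h>

int main(void) {
    pthread_mutex_t m;
    pthread_mutex_init(&m, NULL); /* may allocate memory on cygwin */
    pthread_mutex_lock(&m);
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);    /* releases that memory */
    return 0;
}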
Gilles
Marco Atzeri wrote:
>On 10/28/2014 12:04 PM, Gilles Gouaillardet wrote:
>> Marco,
>>
Yep, will do today
Ralph Castain wrote:
>Gilles: will you be committing this to trunk and PR to 1.8?
>
>
>> On Oct 28, 2014, at 11:05 AM, Marco Atzeri wrote:
>>
>> On 10/28/2014 4:41 PM, Gilles Gouaillardet wrote:
>>> Thanks Marco,
>>>
>>>
ead can be found to
>>> satisfy query
>>> (gdb) bt
>>> #0 0x7f6173d0 in rtld_db_dlactivity () from
>>> /usr/lib/sparcv9/ld.so.1
>>> #1 0xffff7f6175a8 in rd_event () from /usr/lib/sparcv9/ld.so.1
>>> #2 0x7f618950 in lm
Michael,
could you please share your test program so we can investigate it ?
Cheers,
Gilles
On 2014/10/31 18:53, michael.rach...@dlr.de wrote:
> Dear developers of OPENMPI,
>
> There remains a hanging observed in MPI_WIN_ALLOCATE_SHARED.
>
> But first:
> Thank you for your advices to employ
ved with our large CFD-code.
>
> Are OPENMPI-developers nevertheless interested in that testprogram?
>
> Greetings
> Michael
>
>
>
>
>
>
> -----Original Message-----
> From: users [mailto:users-boun...@open-mpi.org] On behalf of Gilles
> Gouaillar
Michael,
the root cause is that openmpi was compiled with the gnu compiler, not
with the intel compilers.
fortran modules are not binary compatible, so openmpi and your
application must be compiled
with the same compiler.
Cheers,
Gilles
On 2014/11/05 18:25, michael.rach...@dlr.de wrote:
> Dear OPENMPI
ding an
>mpi.mod file, because the User can look inside the module
>and can directly see, if something is missing or possibly wrongly coded.
>
>Greetings
> Michael Rachner
>
>
>-----Original Message-----
>From: users [mailto:users-boun...@open-mpi.org] On behal
Brock,
Is your post related to ib0/eoib0 being used at all, or being used with load
balancing ?
let me clarify this :
--mca btl ^openib
disables the openib btl aka *native* infiniband.
This does not disable ib0 and eoib0 that are handled by the tcp btl.
As you already figured out, btl_tcp_if_inc
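for example (a hedged illustration ; adjust the interface list to your
system, and note lo should remain excluded) :
mpirun --mca btl ^openib --mca btl_tcp_if_exclude lo,ib0,eoib0 ...
disables native infiniband and also keeps the tcp btl off the ipoib/eoib
interfaces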
Ralph,
IIRC there is load balancing across all the btls, for example
between vader and scif.
So load balancing between ib0 and eoib0 is just a particular case that might
not necessarily be handled by the btl tcp.
Cheers,
Gilles
Ralph Castain wrote:
>OMPI discovers all active interfaces and aut
Hi,
IIRC there were some bug fixes between 1.8.1 and 1.8.2 in order to really
use all the published interfaces.
by any chance, are you running a firewall on your head node ?
one possible explanation is the compute node tries to access the public
interface of the head node, and packets get dropped
Could you please send the output of netstat -nr on both head and compute node ?
no problem obfuscating the ip of the head node, i am only interested in
netmasks and routes.
Ralph Castain wrote:
>
>> On Nov 12, 2014, at 2:45 PM, Reuti wrote:
>>
>> Am 12.11.2014 um 17:27 schrieb Reuti:
>>
>>> A
Hi,
it seems you messed up the command line
could you try
$ mpirun --mca btl ^openib --host compute-01-01,compute-01-06 ring_c
can you also try to run mpirun from a compute node instead of the head
node ?
Cheers,
Gilles
On 2014/11/13 16:07, Syed Ahsan Ali wrote:
> Here is what I see when di
> [compute-01-01.private.dns.zone][[11064,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 192.168.108.10 failed: No route to host (113)
>
>
> On Thu, Nov 13, 2014 at 12:11 PM, Gilles Gouaillardet
> wrote:
>> Hi,
>>
>> it seems you me
.0 b) TX bytes:0 (0.0 b)
>
>
>
> So the point is why mpirun is following the ib path while it has
> been disabled. Possible solutions?
>
> On Thu, Nov 13, 2014 at 12:32 PM, Gilles Gouaillardet
> wrote:
>> mpirun complains about the 192.168.108.10 ip address, bu
ddr
>>> 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>>> inet addr:192.168.108.14 Bcast:192.168.108.255
>>> Mask:255.255.255.0
>>> UP BROADCAST MULTICAST MTU:65520 Metric:1
>>> RX packets:0 errors:0 dropped
0.0.0.0 255.0.0.0 U 0 0 0 eth0
> 0.0.0.0         10.0.0.1        0.0.0.0         UG    0      0        0 eth0
> [pmdtest@compute-01-06 ~]$
>
>
> On Thu, Nov 13, 2014 at 12:56 PM, Gilles Gouaillardet
> wrote:
>> This is really weird ?
>
My 0.02 US$
first, the root cause of the problem was that a default gateway was
configured on the node,
but this gateway was unreachable.
imho, this is an incorrect system setting that can lead to unpredictable
results :
- openmpi 1.8.1 works (you are lucky, good for you)
- openmpi 1.8.3 fails (no luck th
Siegmar,
This is correct, --enable-heterogeneous is now fixed in the trunk.
Please also note that -D_REENTRANT is now automatically set on solaris
Cheers
Gilles
Siegmar Gross wrote:
>Hi Jeff, hi Ralph,
>
>> This issue should now be fixed, too.
>
>Yes, it is. Thank you very much for your help.
Hi John,
do you MPI_Init() or do you MPI_Init_thread(MPI_THREAD_MULTIPLE) ?
does your program call MPI anywhere from an OpenMP region ?
does your program call MPI only within an !$OMP MASTER section ?
or does your program not invoke MPI at all from any OpenMP region ?
can you reproduce this
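(for reference, a minimal C sketch of the MPI_THREAD_MULTIPLE
initialization asked about above :)

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided;
    /* request full thread support and check what the library grants */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE)
        printf("MPI_THREAD_MULTIPLE not available (got %d)\n", provided);
    MPI_Finalize();
    return 0;
}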
Daniel,
you can run
$ ompi_info --parseable --all | grep _algorithm: | grep enumerator
that will give you the list of supported algo for the collectives,
here is a sample output :
mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:0:ignore
mca:coll:tuned:param:coll_tuned_allred
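once you have picked an algorithm from that list, it can be forced at run
time, for example (a hedged sketch ; iirc the tuned dynamic rules must be
enabled for the forced value to be honored) :
mpirun --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_allreduce_algorithm 4 ./a.out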
Hi Ghislain,
that sounds like a bug in MPI_Dist_graph_create :-(
you can use MPI_Dist_graph_create_adjacent instead :
MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD, degrees, &targets[0], &weights[0],
                               degrees, &targets[0], &weights[0],
                               info, rankReordering, &commGraph);
it
Best regards,
> Ghislain
>
> 2014-11-21 7:23 GMT+01:00 Gilles Gouaillardet > :
>> Hi Ghislain,
>>
>> that sounds like a bug in MPI_Dist_graph_create :-(
>>
>> you can use MPI_Dist_graph_create_adjacent instead :
>>
>> MPI_Dis
xpect based on prior knowledge.
>
> George.
>
>
> On Fri, Nov 21, 2014 at 3:48 AM, Gilles Gouaillardet <
> gilles.gouaillar...@iferc.org> wrote:
>
>> Ghislain,
>>
>> i can confirm there is a bug in mca_topo_base_dist_graph_distribute
>>
>>
Folks,
one drawback of retrieving time with rdtsc is that this value is core
specific :
if a task is not bound to a core, then the value returned by MPI_Wtime()
might go backward.
if i run the following program with
taskset -c 1 ./time
and then move it across cores
(taskset -cp 0 ; tas
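(the test program is truncated above ; a minimal sketch of what it could
look like :)

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    double start = MPI_Wtime(), prev = start;
    /* run for ~60 seconds ; migrate the process with taskset -cp meanwhile */
    while (prev - start < 60.0) {
        double now = MPI_Wtime();
        if (now < prev)
            printf("MPI_Wtime() went backward by %g s\n", prev - now);
        prev = now;
    }
    MPI_Finalize();
    return 0;
}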
It could be because configure did not find the knem headers, and hence knem
is not supported and this mca parameter is read-only
My 0.2 us$ ...
Dave Love wrote:
>Why can't I set parameters like this (not the only one) with 1.8.3?
>
> WARNING: A user-supplied value attempted to override th
Folks,
FWIW, i observe a similar behaviour on my system.
imho, the root cause is that OFED has been upgraded from a (quite) older
version to the latest 3.12 version
here is the relevant part of code (btl_openib.c from the master) :
static uint64_t calculate_max_reg (void)
{
if (0 == stat("/sys/modu
Luca,
your email mentions openmpi 1.6.5
but gdb output points to openmpi 1.8.1.
could the root cause be a mix of versions that does not occur with the
root account ?
which openmpi version are you expecting ?
you can run
pmap
when your binary is running and/or under gdb to confirm the openmpi libra
size should be
>> unlimited.
>> Check /etc/security/limits.conf and "ulimit -a".
>>
>> I hope this helps,
>> Gus Correa
>>
>> On 12/10/2014 08:28 AM, Gilles Gouaillardet wrote:
>>> Luca,
>>>
>>> your email mentions ope
Alex,
can you try something like
call system("sh -c 'env -i /.../mpirun -np 2 /.../app_name'")
(-i starts with an empty environment)
that being said, you might need to set a few environment variables
manually :
env -i PATH=/bin ...
and that being also said, this "trick" could be just a bad idea :
you
> I realize
> getting passed over a job scheduler with this approach might not work at
> all...
>
> I have looked at the MPI_Comm_spawn call but I failed to understand how it
> could help here. For instance, can I use it to launch an mpi app with the
> option "-n 5" ?
,MPI_INFO_NULL,rank,MPI_COMM_WORLD,my_intercomm,MPI_ERRCODES_IGNORE,status)
> enddo
>
> I do get 15 instances of the 'hello_world' app running: 5 for each parent
> rank 1, 2 and 3.
>
> Thanks a lot, Gilles.
>
> Best regards,
>
> Alex
>
>
>
>
> 2014-1
ront end to use those, but since we have a lot of data to process
>
>it also benefits from a parallel environment.
>
>
>Alex
>
>
>
>
>2014-12-12 2:30 GMT-02:00 Gilles Gouaillardet :
>
>Alex,
>
>just to make sure ...
>this is the behavior you expe
would I track each one for their completion?
>
>Alex
>
>
>2014-12-12 22:35 GMT-02:00 Gilles Gouaillardet :
>
>Alex,
>
>You need MPI_Comm_disconnect at least.
>I am not sure if this is 100% correct or even working.
>
>If you are using third party apps, why don't you do
to say that
>we could do a lot better if they could be executed in parallel.
>
>I am not familiar with DRMAA but it seems to be the right choice to deal with
>job schedulers as it covers the ones I am interested in (pbs/torque and
>loadlever).
>
>Alex
>
>
>2014-12-13
Eric,
can you make your test case (source + input file + howto) available so i
can try to reproduce and fix this ?
Based on the stack trace, i assume this is a complete end user application.
have you tried/been able to reproduce the same kind of crash with a
trimmed test program ?
BTW, what kind
Eric,
i checked the source code (v1.8) and the limit for the shared_fp_fname
is 256 (hard coded).
i am now checking if the overflow is correctly detected (that could
explain the one byte overflow reported by valgrind)
Cheers,
Gilles
On 2014/12/15 11:52, Eric Chamberland wrote:
> Hi again,
>
>
Eric,
here is a patch for the v1.8 series, it fixes a one byte overflow.
valgrind should stop complaining, and assuming this is the root cause of
the memory corruption,
that could also fix your program.
that being said, shared_fp_fname is limited to 255 characters (this is
hard coded) so even if
Hi Siegmar,
a similar issue was reported in mpich with xlf compilers :
http://trac.mpich.org/projects/mpich/ticket/2144
They concluded this is a compiler issue (e.g. the compiler does not
implement TS 29113 subclause 8.1)
Jeff,
i made PR 315 https://github.com/open-mpi/ompi/pull/315
f08 binding
Eric,
thanks for the simple test program.
i think i see what is going wrong and i will make some changes to avoid
the memory overflow.
that being said, there is a hard-coded limit of 256 characters, and your
path is longer than 300 characters.
bottom line, and even if there is no more memory ove
.
Cheers,
Gilles
On 2014/12/16 12:43, Gilles Gouaillardet wrote:
> Eric,
>
> thanks for the simple test program.
>
> i think i see what is going wrong and i will make some changes to avoid
> the memory overflow.
>
> that being said, there is a hard coded limit of 256 charac
future release.
>
>
>However, siesta is launched only by specifying input/output files with i/o
>redirection like
>
>mpirun -n siesta < infile > outfile
>
>
>So far, I could not find anything about how to set an stdin file for an
>spawnee process.
>
ere are some limitations but it works very well for our
>uses... and until a "real" fix is proposed...
>
>Thanks for helping!
>
>Eric
>
>
>On 12/15/2014 11:42 PM, Gilles Gouaillardet wrote:
>> Eric and all,
>>
>> That is clearly a limitation in rom
FWIW
I faced a similar issue on my linux virtualbox.
My shared folder is a vboxfs filesystem, but statfs returns the nfs magic id.
That causes some mess and the test fails.
At this stage i cannot tell whether i should blame the glibc, the kernel, a
virtualbox driver or myself
Cheers,
Gilles
Mike
Siegmar,
could you please give a try to the attached patch ?
/* and keep in mind this is just a workaround that happens to work */
Cheers,
Gilles
On 2014/12/22 22:48, Siegmar Gross wrote:
> Hi,
>
> today I installed openmpi-dev-602-g82c02b4 on my machines (Solaris 10 Sparc,
> Solaris 10 x86_64,
Kawashima-san,
i'd rather consider this as a bug in the README (!)
heterogeneous support has been broken for some time, but it was
eventually fixed.
truth is there are *very* limited resources (both human and hardware)
maintaining heterogeneous
support, but that does not mean heterogeneous suppo
Where does the error occur ?
MPI_Init ?
MPI_Finalize ?
In between ?
In the first case, the bug is likely a mishandled error case,
which means OpenMPI is unlikely to be the root cause of the crash.
Did you check infiniband is up and running on your cluster ?
Cheers,
Gilles
Saliya Ekanayake wrote:
>
lat and ib_read_bw
>that measures latency and bandwidth between two nodes. They are part of
>the "perftest" repo package."
>
>On Dec 28, 2014 10:20 AM, "Saliya Ekanayake" wrote:
>
>This happens at MPI_Init. I've attached the full error message.
>
FWIW ompi does not yet support XRC with OFED 3.12.
Cheers,
Gilles
Deva wrote:
>Hi Waleed,
>
>
>It is highly recommended to upgrade to latest OFED. Meanwhile, Can you try
>latest OMPI release (v1.8.4), where this warning is ignored on older OFEDs
>
>
>-Devendar
>
>
>On Sun, Dec 28, 2014 at 6:
Diego,
First, i recommend you redefine tParticle and add a padding integer so
everything is aligned.
Before invoking MPI_Type_create_struct, you need to
call MPI_Get_address(dummy, base, MPI%err)
displacements = displacements - base
MPI_Type_create_resized might be unnecessary if tParticle is
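(for illustration, a minimal C sketch of the base/displacement logic above ;
the thread uses Fortran, and a hypothetical particle layout is assumed :)

#include <mpi.h>

typedef struct { int ip; double x[3]; } particle_t; /* hypothetical layout */

int main(int argc, char **argv) {
    particle_t dummy;
    MPI_Aint base, disp[2];
    int blocklen[2] = {1, 3};
    MPI_Datatype types[2] = {MPI_INT, MPI_DOUBLE}, tparticle;
    MPI_Init(&argc, &argv);
    MPI_Get_address(&dummy, &base);
    MPI_Get_address(&dummy.ip, &disp[0]);
    MPI_Get_address(&dummy.x, &disp[1]);
    disp[0] -= base; disp[1] -= base; /* displacements relative to base */
    MPI_Type_create_struct(2, blocklen, disp, types, &tparticle);
    MPI_Type_commit(&tparticle);
    MPI_Type_free(&tparticle);
    MPI_Finalize();
    return 0;
}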
>
>What do you think?
>
>George, Did i miss something?
>
>
>Thanks a lot
>
>
>
>
>Diego
>
>
>On 2 January 2015 at 12:51, Gilles Gouaillardet
> wrote:
>
>Diego,
>
>First, i recommend you redefine tParticle and add a p
> What do you mean by "remove mpi_get_address(dummy) from all displacements" ?
>
> Thanks for all your help
>
> Diego
>
>
>
> Diego
>
>
> On 3 January 2015 at 00:45, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com> wrote:
>
>> Die
MENTS*
> * ENDIF*
>
> and the results is:
>
>*139835891001320 -139835852218120 -139835852213832*
> * -139835852195016 8030673735967299609*
>
> I am not able to understand it.
>
> Thanks a lot.
>
> In the attachment you can find the program
>
>
>
>
ces in displacements(2), I have only an integer in
>dummy%ip?
>
>Why do you use dummy(1) and dummy(2)?
>
>
>Thanks a lot
>
>
>
>Diego
>
>
>On 5 January 2015 at 02:44, Gilles Gouaillardet
> wrote:
>
>Diego,
>
>MPI_Get_address was invoked wi
Diego,
my bad, i should have passed displacements(1) to MPI_Type_create_struct
here is an updated version
(note you have to use a REQUEST integer for MPI_Isend and MPI_Irecv,
and you also have to call MPI_Wait to ensure the requests complete)
Cheers,
Gilles
On 2015/01/08 8:23, Diego Avesani w
Well, per the source code, this is not a bug but a feature :
from publish function from ompi/mca/pubsub/orte/pubsub_orte.c
ompi_info_get_bool(info, "ompi_unique", &unique, &flag);
if (0 == flag) {
    /* uniqueness not specified - overwrite by default */
    unique = false;
}
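(a minimal sketch of requesting uniqueness from the application side,
assuming the ompi_unique info key is honored as in the code above :)

#include <mpi.h>

int main(int argc, char **argv) {
    char port[MPI_MAX_PORT_NAME];
    MPI_Info info;
    MPI_Init(&argc, &argv);
    MPI_Open_port(MPI_INFO_NULL, port);
    MPI_Info_create(&info);
    /* ask the pubsub component not to overwrite an existing name */
    MPI_Info_set(info, "ompi_unique", "true");
    MPI_Publish_name("my_service", info, port);
    MPI_Unpublish_name("my_service", info, port);
    MPI_Info_free(&info);
    MPI_Close_port(port);
    MPI_Finalize();
    return 0;
}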
ust as
> reasonable as the alternative (I believe we flipped a coin)
>
>
>> On Jan 7, 2015, at 6:47 PM, Gilles Gouaillardet
>> wrote:
>>
>> Well, per the source code, this is not a bug but a feature :
>>
>>
>> from publish function from ompi/mca/p
the program run in your case?
>
> Thanks again
>
>
>
> Diego
>
>
> On 8 January 2015 at 03:02, Gilles Gouaillardet <
> gilles.gouaillar...@iferc.org> wrote:
>
>> Diego,
>>
>> my bad, i should have passed displacements(1) to MPI_Type_create_stru
> Attached is my copy of your program with fixes for the above-mentioned issues.
>
> BTW, I missed the beginning of this thread -- I assume that this is an
> artificial use of mpi_type_create_resized for the purposes of a small
> example. The specific use of it in this program ap
Hi Siegmar,
could you please try again with adding '-D_STDC_C99' to your CFLAGS ?
Thanks and regards,
Gilles
On 2015/01/12 20:54, Siegmar Gross wrote:
> Hi,
>
> today I tried to build openmpi-dev-685-g881b1dc on my machines
> (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1
> x86_6
Ryan,
this issue has already been reported.
please refer to
http://www.open-mpi.org/community/lists/users/2015/01/26134.php for a
workaround
Cheers,
Gilles
On 2015/01/14 16:35, Novosielski, Ryan wrote:
> OpenMPI 1.8.4 does not appear to be buildable with GCC 4.9.2. The output, as
> requested
Alexander,
i was able to reproduce this behaviour.
basically, bad things happen when the garbage collector is invoked ...
i was even able to reproduce some crashes (though they happen at random
stages) very early in the code
by manually inserting calls to the garbage collector (e.g. System.gc();)
C
Dave,
the QDR Infiniband uses the openib btl (by default :
btl_openib_exclusivity=1024)
i assume the RoCE 10Gbps card is using the tcp btl (by default :
btl_tcp_exclusivity=100)
that means that by default, when both openib and tcp btl could be used,
the tcp btl is discarded.
could you give a try
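(a hedged guess at the experiment : iirc only the btl(s) with the highest
exclusivity are kept, so raising the tcp exclusivity to match openib should
let both be used :)
mpirun --mca btl_tcp_exclusivity 1024 ...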
Simona,
On 2015/02/08 20:45, simona bellavista wrote:
> I have two systems A (aka Host) and B (aka Target). On A a compiler suite
> is installed (intel 14.0.2), on B there is no compiler. I want to compile
> openmpi on A for running it on system B (in particular, I want to use
> mpirun and mpif90)
Khalid,
i am not aware of such a mechanism.
/* there might be a way to use MPI_T_* mechanisms to force the algorithm,
and i will let other folks comment on that */
you definitely cannot directly invoke ompi_coll_tuned_bcast_intra_binomial
(abstraction violation, non-portable, and you miss the som
min
or if you know what you are doing, you can try mpirun -mca sec basic)
on blue waters, that would mean ompi does not run out of the box, but
fails with an understandable message.
that would be less user friendly, but more secure
any thoughts ?
Cheers,
Gilles
[gouaillardet@node0
On 2015/03/26 13:00, Ralph Castain wrote:
> Well, I did some digging around, and this PR looks like the right solution.
ok then :-)
the following stuff is not directly related to ompi, but you might want to
comment on that anyway ...
> Second, the running of munge on the IO nodes is not only okay but
see Munge is/can be used by both SLURM and
> TORQUE.
> (http://docs.adaptivecomputing.com/torque/4-0-2/Content/topics/1-installConfig/serverConfig.htm#usingMUNGEAuth)
>
> If I misunderstood the drift, please ignore ;-)
>
> Mark
>
>
>> On 26 Mar 2015, at 5:38 , Gilles
per the error message, you likely misspelled vader (e.g. missed the "r")
Jeff,
the behavior was initially reported on a single node, so the tcp btl is
unlikely to be used
Cheers,
Gilles
On Friday, May 6, 2016, Zhen Wang wrote:
>
>
> 2016-05-05 9:27 GMT-05:00 Gilles Gouaillardet :
Dave,
I briefly read the papers, and they suggest the SLOAVx algorithm is
implemented by the ml collective module.
this module had some issues and was judged not good for production.
it is disabled by default in the v1.10 series, and has been simply removed
from the v2.x branch.
you can either use (
Siegmar,
at first glance, this looks like a crash of the compiler.
so I guess the root cause is not openmpi
(that being said, a workaround could be implemented in openmpi)
Cheers,
Gilles
On Saturday, May 7, 2016, Siegmar Gross <
siegmar.gr...@informatik.hs-fulda.de> wrote:
> Hi,
>
> today I tr
Siegmar,
did you upgrade your os recently ? or change hyper threading settings ?
this error message typically appears when the numactl-devel rpm is not
installed
(numactl-devel on redhat, the package name might differ on sles)
if not, would you mind retesting from scratch a previous tarball that
Siegmar,
per the config.log, you need to update your CXXFLAGS="-m64
-library=stlport4 -std=sun03"
or just CXXFLAGS="-m64"
Cheers,
Gilles
On Saturday, May 7, 2016, Siegmar Gross <
siegmar.gr...@informatik.hs-fulda.de> wrote:
> Hi,
>
> today I tried to install openmpi-v1.10.2-176-g9d45e07 on my
Devon,
send() is a libc function that is used internally by Open MPI, and your
user function gets called instead of the libc one.
simply rename your function to mysend() or something else that is not used by
libc, and your issue will likely be fixed
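(a minimal sketch of the clash ; with dynamic linking, a send() defined in
the executable interposes on the libc symbol that shared libraries call :)

#include <stdio.h>
#include <sys/types.h>

/* clashes with send(2) : Open MPI's internal socket writes can end up here */
ssize_t send(int sockfd, const void *buf, size_t len, int flags) {
    (void)sockfd; (void)buf; (void)len; (void)flags;
    fprintf(stderr, "user send() intercepted a libc-internal call!\n");
    return -1;
}

int main(void) { return 0; }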
Cheers,
Gilles
On Tuesday, May 10, 2016, Devon Hollow
Hi,
i was able to build openmpi 1.10.2 with the same configure command line
(after i quoted the LDFLAGS parameters)
can you please run
grep SIZEOF_PTRDIFF_T config.status
it should be 4 or 8, but it seems different in your environment (!)
are you running 32 or 64 bit kernel ? on which p
you can direct OpenMPI to only use a specific range of ports (that
should be open in your firewall configuration)
mpirun --mca oob_tcp_static_ipv4_ports <first>-<last> ...
if you use the tcp btl, you can (also) use
mpirun --mca btl_tcp_port_min_v4 <min> --mca btl_tcp_port_range_v4 <range>
...
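for example (hypothetical port numbers ; pick ranges that match your
firewall rules) :
mpirun --mca oob_tcp_static_ipv4_ports 46000-46100 \
       --mca btl_tcp_port_min_v4 46200 --mca btl_tcp_port_range_v4 100 ./a.out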
Cheers,
Gilles
On
Siegmar,
this issue was previously reported at
http://www.open-mpi.org/community/lists/devel/2016/05/18923.php
i just pushed the patch
Cheers,
Gilles
On 5/10/2016 2:27 PM, Siegmar Gross wrote:
Hi,
I tried to install openmpi-dev-4010-g6c9d65c on my "SUSE Linux
Enterprise Server 12 (x86
> That worked perfectly. Thank you. I'm surprised that clang didn't emit a
> warning about this!
>
> -Devon
>
> On Mon, May 9, 2016 at 3:42 PM, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com
> > wrote:
>
>> Devon,
>>
>> send() is a libc
I was basically suggesting you open a few ports to anyone (e.g. any IP
address), and Jeff suggests you open all ports to a few trusted IP
addresses.
btw, how many network interfaces do you have ?
if you have two interfaces (e.g. eth0 for external access and eth1 for private
network) and MPI should only use
Ilias,
at first glance, you are using the PGI preprocessor (!)
can you re-run configure with CPP=cpp,
or after removing all PGI related environment variables,
and see if it helps ?
Cheers,
Gilles
On Wednesday, May 11, 2016, Ilias Miroslav wrote:
> https://www.open-mpi.org/community/lists/use
Hi,
Where did you get the openmpi package from ?
fc20 ships openmpi 1.7.3 ...
does it work as expected if you do not use mpirun
(e.g. ./hello_c)
if yes, then you can try
ldd hello_c
which mpirun
ldd mpirun
mpirun -np 1 ldd hello_c
and confirm both mpirun and hello_c use the same mpi