We are starting to look at supporting MPI on our Hadoop/Spark YARN-based
cluster. I found a bunch of references to Hamster, but what I can't find is whether
it was ever merged into regular Open MPI, and if so, is it just another RM
integration? Or does it need more setup?
I found this:
http://pivota
I think a lot of the information on this page:
http://www.open-mpi.org/faq/?category=java
is out of date with the 1.8 release.
Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
XSEDE Campus Champion
bro...@umich.edu
(734)936-1985
Ralph,
this is also a solution.
The pro is that it seems more lightweight than PR #249.
The two cons I can see are:
- opal_process_name_t alignment goes from 64 to 32 bits
- some functions (opal_hash_table_*) take a uint64_t as argument, so we
still need to use memcpy in order to guarantee 64 bits (see the sketch below)
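A minimal sketch of the memcpy approach mentioned in the second point; the struct layout here (two uint32_t fields) is only a stand-in for the real opal_process_name_t, not its actual definition:

#include <inttypes.h>
#include <string.h>
#include <stdio.h>

/* stand-in for opal_process_name_t: two 32-bit fields, so the struct
 * itself is only guaranteed 32-bit alignment */
typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} process_name_t;

/* memcpy into a local uint64_t yields a properly aligned 64-bit value
 * that can be passed to a uint64_t-keyed hash table */
static uint64_t name_to_key(const process_name_t *name)
{
    uint64_t key;
    memcpy(&key, name, sizeof(key));
    return key;
}

int main(void)
{
    process_name_t n = { 1, 42 };
    printf("key = 0x%016" PRIx64 "\n", name_to_key(&n));
    return 0;
}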
Kawashima-san,
thanks a lot for the detailed explanation.
FWIW, I was previously testing on Solaris 11, which behaves like Linux:
printf("%s", NULL) outputs '(null)'
vs. a SIGSEGV on Solaris 10.
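Passing NULL to a %s conversion is undefined behavior, which is why the symptom differs by platform; a tiny illustration of the portable guard (not taken from the actual commit):

#include <stdio.h>

int main(void)
{
    const char *s = NULL;
    /* printf("%s", NULL) is undefined behavior: glibc and Solaris 11
     * happen to print "(null)", while Solaris 10 SIGSEGVs */
    printf("%s\n", s ? s : "(null)");
    return 0;
}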
I committed a16c1e44189366fbc8e967769e050f517a40f3f8 in order to fix this
issue
(I moved the call to mca_
Hi Gilles, Oscar, Ralph, Takahiro
thank you very much for all your help and time investigating my
problems on SPARC systems.
> thanks a lot for the detailed explanation.
> FWIW, I was previously testing on Solaris 11, which behaves like Linux:
> printf("%s", NULL) outputs '(null)'
> vs a SIGSEGV
Hello,
I noticed this weird behavior because, after a certain time of more than
one minute, the transfer rates of MPI_Send and MPI_Recv dropped by a
factor of 100+. By chance I saw that my program allocated more and
more memory. I have the following minimal working example:
#include
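The poster's example is truncated above; what follows is only a hedged sketch of the kind of send/receive loop described (two ranks repeatedly exchanging a fixed-size buffer), not the original code:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int count = 1 << 20;              /* 1 MiB per message */
    char *buf = malloc(count);

    double t0 = MPI_Wtime();
    for (int iter = 0; iter < 10000; iter++) {
        if (rank == 0) {
            MPI_Send(buf, count, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, count, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, count, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, count, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
        /* the transfer rate should stay roughly constant; the report is
         * that it drops by 100+ after about a minute while memory grows */
        if (rank == 0 && iter % 1000 == 0)
            printf("iter %d, elapsed %.1f s\n", iter, MPI_Wtime() - t0);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}

Run with two ranks, e.g. "mpirun -np 2 ./a.out".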
Hello,
After compiling and running an MPI program, it seems to hang at
MPI_Init(), but it eventually works after a minute or two.
While the problem occurred on my notebook, it did not on my desktop PC.
Both run on Windows 7, Cygwin 64-bit, Open MPI version 1.8.3 r32794
(ompi_info), g++ v4.8.3.
Hi Takahiro, Gilles, Siegmar,
Thank you very much for all your fixes.
I didn't notice the call to 'mca_base_var_register' before MPI_Init.
I'm sorry for the inconvenience.
Regards,
Oscar
On 27/10/14 07:16, Gilles Gouaillardet wrote:
Kawashima-san,
thanks a lot for the detailed explanation.
On 10/27/2014 8:32 AM, maxinator333 wrote:
Hello,
After compiling and running an MPI program, it seems to hang at
MPI_Init(), but it eventually works after a minute or two.
While the problem occurred on my notebook, it did not on my desktop PC.
It can be a timeout on a network interface.
I
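If an unreachable or slow-to-time-out interface is the cause, one common experiment is to restrict Open MPI to the loopback interface; the interface name ("lo") and whether this actually cures the reported hang are assumptions to verify on the affected machine:

mpirun --mca oob_tcp_if_include lo --mca btl_tcp_if_include lo -np 2 ./a.out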
On 10/27/2014 8:30 AM, maxinator333 wrote:
Hello,
I noticed this weird behavior because, after a certain time of more than
one minute, the transfer rates of MPI_Send and MPI_Recv dropped by a
factor of 100+. By chance I saw that my program allocated more and
more memory. I have the following
Hi,
I tested on a RedHat 6-like Linux server and could not observe any
memory leak.
BTW, are you running 32- or 64-bit Cygwin? And what is your configure
command line?
Thanks,
Gilles
On 2014/10/27 18:26, Marco Atzeri wrote:
> On 10/27/2014 8:30 AM, maxinator333 wrote:
>> Hello,
>>
>> I notic
On 10/27/2014 10:30 AM, Gilles Gouaillardet wrote:
Hi,
I tested on a RedHat 6-like Linux server and could not observe any
memory leak.
BTW, are you running 32- or 64-bit Cygwin? And what is your configure
command line?
Thanks,
Gilles
the problem is present in both versions.
cygwin 1.8
Dear Mr. Squyres,
We will try to install your bug-fixed nightly tarball of 2014-10-24 on Cluster5
to see whether it works or not.
The installation, however, will take some time. I will get back to you when I know more.
Let me add the information that on the Laki each node has 16 GB of shared
memory (t
Thanks Marco,
I could reproduce the issue even with one node sending/receiving to itself.
I will investigate this tomorrow
Cheers,
Gilles
Marco Atzeri wrote:
>
>
>On 10/27/2014 10:30 AM, Gilles Gouaillardet wrote:
>> Hi,
>>
>> I tested on a RedHat 6-like Linux server and could not observe any
Dear developers of Open MPI,
We have now installed and tested the bug-fixed Open MPI nightly tarball of
2014-10-24 (openmpi-dev-176-g9334abc.tar.gz) on Cluster5.
As before (with the OpenMPI-1.8.3 release version), the small Fortran test program runs
correctly on the login node.
As before, the program abort
Michael,
Could you please run
mpirun -np 1 df -h
mpirun -np 1 df -hi
on both compute and login nodes
Thanks
Gilles
michael.rach...@dlr.de wrote:
>Dear developers of Open MPI,
>
>We have now installed and tested the bug-fixed Open MPI nightly tarball of
>2014-10-24 (openmpi-dev-176-g9334abc.tar.
Michael,
The available space must be greater than the requested size + 5%.
From the logs, the error message makes sense to me: there is not enough space
in /tmp.
Since the compute nodes have a lot of memory, you might want to try using
/dev/shm instead of /tmp for the backing files (see the command sketch below).
Cheers,
Gil
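A sketch of one way to point the per-job session directory (and hence the shared-memory backing files) at /dev/shm; the exact MCA parameter to use should be confirmed with ompi_info on the installed version:

mpirun --mca orte_tmpdir_base /dev/shm -np 16 ./a.out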
-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles
Gouaillardet
Sent: Monday, 27 October 2014 14:49
To: Open MPI Users
Subject: Re: [OMPI users] FW: Bug in OpenMPI-1.8.3: storage limitation in shared
memory allocation (MPI_WIN_ALLOCATE_SHAR
Dear Gilles,
This is the system response on the login node of cluster5:
cluster5:~/dat> mpirun -np 1 df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda31      228G  5.6G  211G   3% /
udev             32G  232K   32G   1% /dev
tmpfs            32G     0   32G   0% /dev/shm
/dev/sda11
On Mon, Oct 27, 2014 at 02:15:45PM +, michael.rach...@dlr.de wrote:
> Dear Gilles,
>
> This is the system response on the login node of cluster5:
>
> cluster5:~/dat> mpirun -np 1 df -h
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda31      228G  5.6G  211G   3% /
> udev
> On Oct 26, 2014, at 9:56 PM, Brock Palen wrote:
>
> We are starting to look at supporting MPI on our Hadoop/Spark YARN-based
> cluster.
You poor soul…
> I found a bunch of references to Hamster, but what I don't find is if it was
> ever merged into regular Open MPI, and if so is it just an
> On Oct 26, 2014, at 11:12 PM, Gilles Gouaillardet
> wrote:
>
> Ralph,
>
> this is also a solution.
> the pro is it seems more lightweight than PR #249
> the two cons I can see are:
> - opal_process_name_t alignment goes from 64 to 32 bits
> - some functions (opal_hash_table_*) take a uint
Hello,
After compiling and running an MPI program, it seems to hang at
MPI_Init(), but it eventually works after a minute or two.
While the problem occurred on my notebook, it did not on my desktop PC.
It can be a timeout on a network interface.
I see a similar issue with wireless ON but no
Thanks, this is good feedback.
I was worried that with the dynamic nature of YARN containers it would be hard
to coordinate wire-up, and you have confirmed that.
Thanks
Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
XSEDE Campus Champion
bro...@umich.edu
(734)936-1985
> On Oct 27,
FWIW: the “better” solution is to move Hadoop to an HPC-like RM such as Slurm.
We did this at Pivotal as well as at Intel, but in both cases business moves at
the very end of the project (Greenplum becoming Pivotal, and Intel moving its
Hadoop work into Cloudera) blocked its release. Frustrating
Good morning,
I'm trying to build Open MPI with the Intel 14.01 compilers with the following
configure line:
./configure --prefix=/opt/openmpi-1.8.3/intel-14.01 CC=icc CXX=icpc FC=ifort
On a 6-core 3.5 GHz Intel Xeon E5 Mac Pro running Mac OS X 10.9.5.
Configure outputs a pthread error, complainin
On 24/10/14 18:09, Ralph Castain wrote:
I was able to build and run the trunk without problem on Yosemite with:
gcc (MacPorts gcc49 4.9.1_0) 4.9.1
GNU Fortran (MacPorts gcc49 4.9.1_0) 4.9.1
Will test the 1.8 branch now, though I believe the Fortran support in 1.8
is up-to-date
Dear all,
I r
FWIW: I just tested with the Intel 15 compilers on Mac 10.10 and it works fine,
so apparently the problem has been fixed. You should be able to upgrade to the
15 versions, so that might be the best solution
> On Oct 27, 2014, at 11:06 AM, Bosler, Peter Andrew wrote:
>
> Good morning,
>
> I’m
Ralph,
On 2014/10/28 0:46, Ralph Castain wrote:
> Actually, I propose to also remove that issue. Simple enough to use a
> hash_table_32 to handle the jobids, and let that point to a
> hash_table_32 of vpids. Since we rarely have more than one jobid
> anyway, the memory overhead actually decreases (see the sketch below).
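A hedged sketch of the two-level lookup described above: an outer table keyed by the 32-bit jobid, whose entries each hold an inner table keyed by the 32-bit vpid, so no 64-bit key is needed. Plain arrays with linear search stand in for the real opal_hash_table_t, and the field names are illustrative only:

#include <stdint.h>
#include <stdio.h>

#define MAX_JOBS  4
#define MAX_PROCS 8

typedef struct { uint32_t vpid; const char *endpoint; } vpid_entry_t;
typedef struct { uint32_t jobid; int nprocs; vpid_entry_t procs[MAX_PROCS]; } job_entry_t;

static job_entry_t jobs[MAX_JOBS];
static int njobs;

/* outer lookup on jobid, inner lookup on vpid */
static const char *lookup(uint32_t jobid, uint32_t vpid)
{
    for (int j = 0; j < njobs; j++) {
        if (jobs[j].jobid != jobid)
            continue;
        for (int p = 0; p < jobs[j].nprocs; p++)
            if (jobs[j].procs[p].vpid == vpid)
                return jobs[j].procs[p].endpoint;
    }
    return NULL;
}

int main(void)
{
    jobs[0].jobid = 123;
    jobs[0].procs[0] = (vpid_entry_t){ 0, "node0" };
    jobs[0].procs[1] = (vpid_entry_t){ 1, "node1" };
    jobs[0].nprocs = 2;
    njobs = 1;

    printf("(123,1) -> %s\n", lookup(123, 1));
    return 0;
}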