I agree with Gilles -- when you compile with one MPI implementation, but then 
accidentally use the mpirun/mpiexec from a different MPI implementation to 
launch it, it's quite a common symptom to see an MPI_COMM_WORLD size of 1 
(i.e., each MPI process is rank 0 in MPI_COMM_WORLD).

Make sure that you're using the mpifort and mpirun from the same MPI 
implementation (e.g., Open MPI).  You might want to re-build HPL from scratch 
and ensure that you are using a specific mpifort, and then be absolutely 100% 
sure to re-run the resulting HPL with the mpirun from that same MPI 
implementation.
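
If it helps, here is a rough Python sketch of the kind of sanity check I mean. It is not an official Open MPI or MPICH tool, and the helper names (classify_mpi, describe_launcher) are made up for illustration. It assumes that Open MPI's mpirun prints "Open MPI" in its --version output and that MPICH's Hydra launcher prints "HYDRA" or "MPICH"; it also treats "mpirun and mpifort live in the same directory" as a weak hint that they belong together, which will not hold on every system (e.g., distro packages that put everything in /usr/bin).

```python
import os
import shutil
import subprocess


def classify_mpi(version_text):
    """Guess the MPI implementation from a launcher's --version output.

    Heuristic only: the marker strings below are assumptions about what
    each launcher typically prints, not a documented interface.
    """
    text = version_text.lower()
    if "open mpi" in text or "open-mpi" in text:
        return "Open MPI"
    if "hydra" in text or "mpich" in text:
        return "MPICH"
    return "unknown"


def describe_launcher(tool="mpirun"):
    """Return (resolved path, guessed implementation) for a launcher on PATH."""
    path = shutil.which(tool)
    if path is None:
        return None, "not found"
    try:
        result = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return os.path.realpath(path), "unknown"
    return os.path.realpath(path), classify_mpi(result.stdout)


def same_directory(path_a, path_b):
    """Weak hint: True if both resolved paths sit in the same directory."""
    if path_a is None or path_b is None:
        return False
    dir_a = os.path.dirname(os.path.realpath(path_a))
    dir_b = os.path.dirname(os.path.realpath(path_b))
    return dir_a == dir_b


if __name__ == "__main__":
    launcher_path, launcher_impl = describe_launcher("mpirun")
    wrapper_path = shutil.which("mpifort")
    print("mpirun :", launcher_path, "->", launcher_impl)
    print("mpifort:", wrapper_path)
    print("same directory:", same_directory(launcher_path, wrapper_path))
```

If mpirun reports one implementation while mpifort resolves into a different installation tree, that is a strong hint that xhpl was built against one MPI and launched with another.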


> On May 26, 2015, at 9:38 PM, Heerdt, Lanze M. <heerdt...@gcc.edu> wrote:
> 
> I have run a hello world program for any number of processes. If I say “-n 
> 16” I get 4 responses from each node, each a “Hello world! I am process 
> (0-15)” response.
> 
>  
> 
> As a response to the illegal entry in HPL.dat, that doesn’t really make much 
> sense, since I run it just fine with p = 1 and q = 1. It only says that when 
> I change p and q to 2, which I know is not an illegal entry.
> 
>  
> 
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles 
> Gouaillardet
> Sent: Tuesday, May 26, 2015 8:14 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] Running HPL on RPi cluster, seems like MPI is 
> somehow not configured properly since it work with 1 node but not more
> 
>  
> 
> At first glance, it seems all mpi tasks believe they are rank zero and comm 
> world size is 1 (!)
> 
> Did you compile xhpl with Open MPI (and not a stub library for a serial 
> version only)?
> Can you make sure there is nothing wrong with your LD_LIBRARY_PATH and that 
> you do not mix MPI libraries
> (e.g. Open MPI mpirun but xhpl ends up using MPICH, or the other way around)?
> 
> As already suggested by Ralph, I would start by running a hello world program
> (just print rank and size to confirm it works).
> 
> Cheers,
> 
> Gilles
> 
> 
> On 5/27/2015 8:42 AM, Ralph Castain wrote:
> 
> I don't know enough about HPL to resolve the problem. However, I would 
> suggest that you first just try to run the example programs in the examples 
> directory to ensure you have everything working. If they work, then the 
> problem is clearly in the HPL arena.
> 
>  
> 
> I do note that your image reports that you have an illegal entry in HPL.dat - 
> if the examples work, you might start there.
> 
>  
> 
>  
> 
> On Tue, May 26, 2015 at 12:26 PM, Heerdt, Lanze M. <heerdt...@gcc.edu> wrote:
> 
> I realize this may be a bit off topic, but since what I am doing seems to be 
> a pretty common thing, I am hoping to find someone who has done it before and 
> can help. I’ve been at my wits’ end for so long they are calling me 
> Mr. Whittaker.
> 
>  
> 
> I am trying to run HPL on a Raspberry Pi cluster. I used the following guides 
> to get to where I am now:
> 
> http://www.tinkernut.com/2014/04/make-cluster-computer/
> 
> http://www.tinkernut.com/2014/05/make-cluster-computer-part-2/
> 
> https://www.howtoforge.com/tutorial/hpl-high-performance-linpack-benchmark-raspberry-pi/#comments
> 
> and a bit of: 
> https://www.raspberrypi.org/forums/viewtopic.php?p=301458#p301458 when the 
> above guide wasn’t working
> 
>  
> 
> Basically, when I run “mpiexec -machinefile ~/machinefile -n 1 xhpl” it works 
> just fine,
> 
> but when I run “mpiexec -machinefile ~/machinefile -n 4 xhpl” it errors with 
> the attached image. (If I use “mpirun…” I get the exact same behavior.)
> 
> [Note: I HAVE changed the HPL.dat to have “2    Ps” and “2    Qs” from 1 and 
> 1 for when I try to run it with 4 processes]
> 
>  
> 
> This is for a project of mine which I need done by the end of the week, so if 
> you see this after 5/29, thank you, but don’t bother responding.
> 
>  
> 
> I have hpl-2.1, mpi4py-1.3.1, mpich-3.1, and openmpi-1.8.5 at my disposal
> 
> In the machinefile are the 4 IP addresses of my 4 RPi nodes
> 
> 10.15.106.107
> 
> 10.15.101.29
> 
> 10.15.106.108
> 
> 10.15.101.30
> 
>  
> 
> Any other information you need I can easily get to you so please do not 
> hesitate to ask. I have nothing else to do but try and get this to work :P
> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/05/26945.php
> 
>  
> 
> 
> 
> 
> <Zoop.PNG>


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
