[OMPI users] ARMv8 support

2016-03-22 Thread Sreenidhi Bharathkar Ramesh
Hello,

Using openmpi-1.10.1, I was able to successfully run a few tests from the OSU
micro benchmarks on an ARMv8 platform (a Juno evaluation board).

However, please refer to:
- https://www.open-mpi.org/community/lists/devel/2013/01/11955.php  # thread
- https://svn.open-mpi.org/trac/ompi/ticket/3481  # ticket

These seem to indicate that a cleaner way of adding ARM support is being
planned. However, the ticket has not been updated for more than 2 years.

My questions are:
1. Are there any plans in the offing for ARM, specifically a roadmap for
ARMv8?
2. Are there any potential areas or issues identified in OpenMPI for ARMv8 where
users can contribute?

Please let me know.

Thanks,
- Sreenidhi


Re: [OMPI users] Strange problem with mpirun and LIGGGHTS after reboot of machine

2016-03-22 Thread Rainer Koenig
On 17.03.2016 at 10:40, Ralph Castain wrote:
> Just some thoughts offhand:
> 
> * what version of OMPI are you using?

dpkg -l openmpi-bin says 1.6.5-8 from Ubuntu 14.04.
> 
> * are you saying that after the warm reboot, all 48 procs are running on a 
> subset of cores?

Yes. After a cold boot all 48 processes are spread over all 48 cores
and all cores show up as almost 100% in the htop cpu meter.

After a warm boot, the 48 processes are just spread over a few cores and
the rest of the system is idling.
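
One way to put numbers on this observation, instead of reading it off the htop meter, is to count how many of the processes last ran on each core. The following is only a minimal sketch, assuming a Linux /proc filesystem; the function names are hypothetical, and the process name passed in (for example "liggghts") is an assumption about how the binary shows up in /proc. It reads field 39 ("processor") of /proc/<pid>/stat, which is the CPU the task last executed on, so the result is a snapshot and not a binding.

```python
import os
from collections import Counter


def last_cpu(stat_line):
    """Return the CPU a task last ran on, parsed from a /proc/<pid>/stat line."""
    # Field 2 (the command name) is wrapped in parentheses and may itself
    # contain spaces, so split after the last ')'. The remainder starts at
    # field 3 (state), which makes field 39 ("processor") index 36.
    rest = stat_line.rsplit(")", 1)[1].split()
    return int(rest[36])


def cpu_histogram(name, proc="/proc"):
    """Count processes whose command name equals `name`, keyed by last CPU."""
    counts = Counter()
    for pid in filter(str.isdigit, os.listdir(proc)):
        try:
            with open(os.path.join(proc, pid, "stat")) as f:
                line = f.read()
        except OSError:
            continue  # the process exited while we were scanning
        # The kernel truncates the command name to 15 characters.
        comm = line[line.index("(") + 1 : line.rindex(")")]
        if comm == name:
            counts[last_cpu(line)] += 1
    return counts
```

Calling cpu_histogram("liggghts") once after a cold boot and once after a warm boot would give two per-core counts that can be compared directly. Sampling it a few times is advisable, since the scheduler may move unbound processes between calls.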

> * it sounds like some of the cores have been marked as “offline” for some 
> reason. Make sure you have hwloc installed on the machine, and run “lstopo” 
> and see if that is the case

I tried lstopo, but the two graphics I got look almost identical.
The visible difference is in the part of the topology that shows the
graphics adapter and the LAN cards. The path to the graphics adapter shows
the numbers 4,0 twice above the lines, and the path to eth0 shows the
numbers 0,2 twice above the lines. The lstopo output for the warm boot
looks the same, except that those small numbers are missing.

I also ran hwloc-gather-topology and diff'd the 2 results. There
is nothing special to see: there are differences in /proc/stats/ and
/proc/cpuinfo, but they are just other values.

Something is obviously wrong on a low level, but I'm still struggling to
find it. :-/

Rainer
-- 
Dipl.-Inf. (FH) Rainer Koenig
Project Manager Linux Clients
Dept. PDG WPS R&D SW OSE

Fujitsu Technology Solutions
Bürgermeister-Ullrich-Str. 100
86199 Augsburg
Germany

Telephone: +49-821-804-3321
Telefax:   +49-821-804-2131
Mail:  mailto:rainer.koe...@ts.fujitsu.com

Internet ts.fujtsu.com
Company Details  ts.fujitsu.com/imprint.html


Re: [OMPI users] Strange problem with mpirun and LIGGGHTS after reboot of machine

2016-03-22 Thread Gilles Gouaillardet
Rainer,

a first step could be to gather /proc/<pid>/status for your 48 tasks.
then you can
grep Cpus_allowed_list
and see if you find something suspicious.
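
As an illustration of that check, here is a minimal sketch, assuming a Linux /proc filesystem. It reads the Cpus_allowed_list line from /proc/<pid>/status and expands a value such as "0-3,8" into a set of CPU numbers, so the 48 tasks can be compared programmatically. The function names are hypothetical and not part of Open MPI or hwloc.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-3,8,10-11" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list or stray comma
        lo, _, hi = part.partition("-")
        # A single number like "8" has no upper bound, so reuse the lower one.
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def allowed_cpus(pid, proc="/proc"):
    """Return the set of CPUs that process `pid` is allowed to run on."""
    with open(f"{proc}/{pid}/status") as f:
        for line in f:
            if line.startswith("Cpus_allowed_list:"):
                return parse_cpu_list(line.split(":", 1)[1])
    return set()  # field not present in this status file
```

If every task reports the full range of cores after a warm boot as well, the affinity masks are not what restricts the processes, which would point toward the scheduler explanation instead.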

if your processes are idling, then the scheduler might assign them to the
same core.
in this case, your processes not being spread is a consequence and not a
root cause.

just to make sure there are no strange side effects, could you run
mpirun --mca btl sm,self ...

Cheers,

Gilles

