[OMPI users] ARMv8 support
Hello,

Using openmpi-1.10.1, I was able to successfully run a few tests from the OSU micro benchmarks on an ARMv8 platform (a Juno evaluation board).

However, please refer to:
- https://www.open-mpi.org/community/lists/devel/2013/01/11955.php # thread
- https://svn.open-mpi.org/trac/ompi/ticket/3481 # ticket

These seem to indicate that a cleaner way of adding ARM support is being planned, but the ticket has not been updated for more than 2 years.

My questions are:
1. Are there any plans in the offing for ARM, specifically a roadmap for ARMv8?
2. Are there any potential areas or issues identified in OpenMPI for ARMv8 where users can contribute?

Please let me know.

Thanks,
- Sreenidhi
Re: [OMPI users] Strange problem with mpirun and LIGGGHTS after reboot of machine
On 17.03.2016 at 10:40, Ralph Castain wrote:
> Just some thoughts offhand:
>
> * what version of OMPI are you using?

dpkg -l openmpi-bin says 1.6.5-8 from Ubuntu 14.04.

> * are you saying that after the warm reboot, all 48 procs are running on a
> subset of cores?

Yes. After a cold boot all 48 processes are spread over all 48 cores and all cores show up as almost 100% in the htop CPU meter.

After a warm boot, the 48 processes are spread over just a few cores and the rest of the system is idling.

> * it sounds like some of the cores have been marked as “offline” for some
> reason. Make sure you have hwloc installed on the machine, and run “lstopo”
> and see if that is the case

I tried lstopo, but the graphics I got look almost identical. The visible difference is in the part of the topology for the graphics adapter and the LAN cards. The path to the graphics adapter shows the numbers 4,0 twice above the lines, and the path to eth0 shows the numbers 0,2 twice above the lines. The lstopo output for the warm boot looks identical, except that those small numbers are missing.

I also tried hwloc-gather-topology and diff'd the 2 results. There is nothing special to see. There are differences in /proc/stats/ and /proc/cpuinfo, but nothing special, just other values.

Something is obviously wrong on a low level, but I'm still struggling to find it. :-/

Rainer
--
Dipl.-Inf. (FH) Rainer Koenig
Project Manager Linux Clients
Dept. PDG WPS R&D SW OSE

Fujitsu Technology Solutions
Bürgermeister-Ullrich-Str. 100
86199 Augsburg
Germany

Telephone: +49-821-804-3321
Telefax: +49-821-804-2131
Mail: mailto:rainer.koe...@ts.fujitsu.com

Internet ts.fujtsu.com
Company Details ts.fujitsu.com/imprint.html
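
As a complement to the lstopo comparison, the question of whether cores are marked offline can also be checked directly from the kernel's sysfs CPU lists. The following is only a minimal sketch, assuming a Linux system that exposes /sys/devices/system/cpu/{present,online,offline}. The helper names parse_cpu_list and read_cpu_state are made up for this illustration and are not part of hwloc or Open MPI.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-3,8,10-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means no CPUs are in this state
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def read_cpu_state(name, base="/sys/devices/system/cpu"):
    """Read one sysfs CPU list file ('present', 'online' or 'offline')."""
    path = Path(base) / name
    return parse_cpu_list(path.read_text()) if path.exists() else set()


if __name__ == "__main__":
    for state in ("present", "online", "offline"):
        cpus = read_cpu_state(state)
        print(f"{state:8s} {len(cpus):3d} {sorted(cpus)}")
```

Running this once after a cold boot and once after a warm boot, then comparing the "online" and "offline" rows, would show whether the set of usable cores actually differs between the two cases.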
Re: [OMPI users] Strange problem with mpirun and LIGGGHTS after reboot of machine
Rainer,

a first step could be to gather /proc/<pid>/status for your 48 tasks. Then you can grep for Cpus_allowed_list and see if you find something suspicious.

If your processes are idling, the scheduler might assign them to the same core. In that case, your processes not being spread out is a consequence and not a root cause.

Just to make sure there are no strange side effects, could you run

mpirun --mca btl sm,self ...

Cheers,

Gilles
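
The suggestion to collect Cpus_allowed_list for all 48 tasks can be sketched as a small script. This is only an illustration under stated assumptions: the PIDs are passed on the command line (for example from pgrep), and the function names allowed_list and group_by_affinity are hypothetical helpers, not an Open MPI tool.

```python
import sys
from collections import defaultdict
from pathlib import Path


def allowed_list(status_text):
    """Return the Cpus_allowed_list value from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("Cpus_allowed_list:"):
            return line.split(":", 1)[1].strip()
    return None


def group_by_affinity(pids, proc="/proc"):
    """Map each distinct Cpus_allowed_list string to the PIDs that have it."""
    groups = defaultdict(list)
    for pid in pids:
        try:
            text = (Path(proc) / str(pid) / "status").read_text()
        except OSError:
            continue  # process has exited or is not readable
        groups[allowed_list(text)].append(pid)
    return dict(groups)


if __name__ == "__main__":
    # PIDs come from the command line, e.g. the output of pgrep.
    for cpus, pids in group_by_affinity(sys.argv[1:]).items():
        print(f"{cpus}: {len(pids)} process(es) {pids}")
```

If every task reports the full range of cores after a warm boot, the processes are not bound to a subset, which would point towards the scheduler placing idle processes together rather than towards a binding problem.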