[OMPI users] How to force striping_factor (on lustre or other FS)?

2022-11-25 Thread Eric Chamberland via users
Hi, In 2012 we wrote and tested our functions to use MPI I/O to get good performance while doing I/O on a Lustre filesystem. Everything was fine with the "striping_factor" we passed at file creation. Now I am trying to verify some performance degradation we observed, and I am surprised because
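For reference, a minimal sketch of how a striping hint is typically passed at file creation through an MPI_Info object (the value "8" and the file name are illustrative; hints are advisory and an implementation may ignore them):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    MPI_Info info, used;
    char value[MPI_MAX_INFO_VAL + 1];
    int flag;

    MPI_Init(&argc, &argv);

    /* Request 8 Lustre stripes for the new file via the standard
       "striping_factor" hint. */
    MPI_Info_create(&info);
    MPI_Info_set(info, "striping_factor", "8");

    MPI_File_open(MPI_COMM_WORLD, "testfile.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* Ask the library which hints it actually applied. */
    MPI_File_get_info(fh, &used);
    MPI_Info_get(used, "striping_factor", MPI_MAX_INFO_VAL, value, &flag);
    if (flag)
        printf("striping_factor = %s\n", value);

    MPI_Info_free(&used);
    MPI_Info_free(&info);
    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}

Comparing the value reported by MPI_File_get_info with what lfs getstripe shows on the created file is a quick way to see whether the hint was honored.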

Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-25 Thread Jeff Squyres (jsquyres) via users
Ok, this is a good / consistent output. That being said, I don't grok what is happening here: it says it finds 2 slots, but then it tells you it doesn't have enough slots. Let me dig deeper and get back to you... -- Jeff Squyres jsquy...@cisco.com From: timesir

Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-25 Thread Jeff Squyres (jsquyres) via users
Thanks for the output. I'm seeing inconsistencies between your different outputs, however. For example, one of your outputs seems to ignore the hostfile and only shows slots on the local host, but another output shows 2 hosts with 1 slot each. But I don't know what was in the hosts file fo

Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-25 Thread timesir via users
The netmask of both hosts is 255.255.255.0. On 2022/11/15 12:32, timesir wrote: *(py3.9) ➜  /share   mpirun -n 2 --machinefile hosts --mca rmaps_base_verbose 100 --mca ras_base_verbose 100  which mpirun* [computer01:39342] mca: base: component_find: searching NULL for ras components [computer01:39342] m

Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-25 Thread timesir via users
*(py3.9) ➜ /share ompi_info --version* Open MPI v5.0.0rc9 https://www.open-mpi.org/community/help/ *(py3.9) ➜ /share cat hosts* 192.168.180.48 slots=1 192.168.60.203 slots=1 *(py3.9) ➜* /share mpirun -n 2 --machinefile hosts --mca plm_base_verbose 100 --mca rmaps_base_verbose 100 --m

Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-25 Thread timesir via users
Did you receive this email? On Wednesday, November 23, 2022, timesir wrote: > > *1. This command now runs correctly * > > *(py3.9) ➜ /share mpirun -n 2 --machinefile hosts --mca plm_base_verbose > 100 --mca rmaps_base_verbose 100 --mca ras_base_verbose 100 uptime* > > > > *2. But this command gets stuck. It seem

Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-25 Thread Jeff Squyres (jsquyres) via users
Yes, Gilles responded within a few hours: https://www.mail-archive.com/users@lists.open-mpi.org/msg35057.html Looking closer, we should still be seeing more output compared to what you posted. It's almost like you have a busted Open MPI installation -- perhaps it's missing the "hostfile" compo
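One low-effort sanity check here (a sketch; the exact component names and output differ across Open MPI releases, and in the 5.x series much of the launch machinery comes from the bundled PRRTE) is to grep the full ompi_info report on the suspect machine and on a machine where mpirun behaves as expected, and compare:

ompi_info --all | grep -i hostfile

If the suspect installation shows nothing hostfile-related while a working one does, that points at an incomplete build or install rather than a usage problem.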

Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-25 Thread Jeff Squyres (jsquyres) via users
I see 2 config.log files -- can you also send the other information requested on that page? I.e., the version you're using (I think you said in a prior email that it was 5.0rc9, but I'm not 100% sure), and the output from ompi_info --all. -- Jeff Squyres jsquy...@cisco.com

Re: [OMPI users] users Digest, Vol 4818, Issue 1

2022-11-25 Thread 龙龙 via users
Did you receive my email? timesir wrote on Tuesday, November 15, 2022 at 12:33: > *(py3.9) ➜ /share mpirun -n 2 --machinefile hosts --mca > rmaps_base_verbose 100 --mca ras_base_verbose 100 which mpirun* > [computer01:39342] mca: base: component_find: searching NULL for ras > components > [computer01:39342] mca: b