Hi,
In 2012 we wrote and tested functions that use MPI I/O to get good
performance on a Lustre filesystem. Everything worked fine with the
"striping_factor" hint we passed at file creation.
Now I am trying to verify some performance degradation we observed, and I
am surprised because
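(For context on the call being described: the "striping_factor" hint is normally
handed to MPI_File_open through an MPI_Info object at file-creation time. The
sketch below only illustrates that pattern and is not the poster's code; the
path /lustre/scratch/testfile and the stripe count of 8 are made-up placeholders.)

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Lustre striping hints only take effect when the file is created,
           so they are passed at MPI_File_open time with MPI_MODE_CREATE. */
        MPI_Info info;
        MPI_Info_create(&info);
        MPI_Info_set(info, "striping_factor", "8");  /* placeholder stripe count */

        MPI_File fh;
        int rc = MPI_File_open(MPI_COMM_WORLD, "/lustre/scratch/testfile",
                               MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);
        if (rc != MPI_SUCCESS) {
            if (rank == 0) fprintf(stderr, "MPI_File_open failed\n");
            MPI_Abort(MPI_COMM_WORLD, rc);
        }

        /* Read the hints back to see what the MPI-IO layer actually applied. */
        MPI_Info used;
        MPI_File_get_info(fh, &used);
        char value[MPI_MAX_INFO_VAL + 1];
        int flag;
        MPI_Info_get(used, "striping_factor", MPI_MAX_INFO_VAL, value, &flag);
        if (rank == 0 && flag)
            printf("striping_factor in effect: %s\n", value);

        MPI_Info_free(&used);
        MPI_Info_free(&info);
        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }

Comparing the value reported by MPI_File_get_info with what "lfs getstripe" shows
on the created file is one way to check whether the hint is still being honored.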
Ok, this is a good / consistent output. That being said, I don't grok what is
happening here: it says it finds 2 slots, but then it tells you it doesn't have
enough slots.
Let me dig deeper and get back to you...
--
Jeff Squyres
jsquy...@cisco.com
From: timesir
Thanks for the output.
However, I'm seeing inconsistencies between your different outputs. For
example, one of them seems to ignore the hostfile and only shows slots on the
local host, while another shows 2 hosts with 1 slot each. But I don't know
what was in the hosts file fo
The netmask of both hosts is 255.255.255.0
On 2022/11/15 12:32, timesir wrote:
*(py3.9) ➜ /share mpirun -n 2 --machinefile hosts --mca
rmaps_base_verbose 100 --mca ras_base_verbose 100 which mpirun*
[computer01:39342] mca: base: component_find: searching NULL for ras
components
[computer01:39342] m
*(py3.9) ➜ /share ompi_info --version*
Open MPI v5.0.0rc9
https://www.open-mpi.org/community/help/
*(py3.9) ➜ /share cat hosts*
192.168.180.48 slots=1
192.168.60.203 slots=1
*(py3.9) ➜ /share mpirun -n 2 --machinefile hosts --mca
plm_base_verbose 100 --mca rmaps_base_verbose 100 --m
Did you receive this email?
On Wednesday, November 23, 2022, timesir wrote:
>
> *1. This command now runs correctly *
>
> *(py3.9) ➜ /share mpirun -n 2 --machinefile hosts --mca plm_base_verbose
> 100 --mca rmaps_base_verbose 100 --mca ras_base_verbose 100 uptime*
>
> *2. But this command gets stuck. It seems
Yes, Gilles responded within a few hours:
https://www.mail-archive.com/users@lists.open-mpi.org/msg35057.html
Looking closer, we should still be seeing more output than what you posted.
It's almost like you have a busted Open MPI installation -- perhaps it's
missing the "hostfile" component
I see 2 config.log files -- can you also send the other information requested
on that page? I.e., the version you're using (I think you said in a prior
email that it was 5.0rc9, but I'm not 100% sure), and the output from
ompi_info --all.
--
Jeff Squyres
jsquy...@cisco.com
Did you receive my email?
On Tuesday, November 15, 2022 at 12:33, timesir wrote:
> *(py3.9) ➜ /share mpirun -n 2 --machinefile hosts --mca
> rmaps_base_verbose 100 --mca ras_base_verbose 100 which mpirun*
> [computer01:39342] mca: base: component_find: searching NULL for ras
> components
> [computer01:39342] mca: b