Re: [OMPI users] mca_sharedfp_lockfile issues

2021-11-02 Thread Gabriel, Edgar via users
What file system are you running your code on? And is the same directory shared across all nodes? I have seen this error when users try to use a non-shared directory for MPI I/O operations (e.g. /tmp, which is a different drive/folder on each node). Thanks Edgar
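If the directory is shared, it can also help to check which shared-file-pointer component ompio selected; a quick way to see what is available in a given installation (the grep pattern is just an illustration) is:

    ompi_info | grep sharedfp

The lockedfile component keeps an extra lock file next to the data file, so that directory must be visible to every node, while the sm component relies on shared memory and therefore only works when all processes run on a single node.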

Re: [OMPI users] Status of pNFS, CephFS and MPI I/O

2021-09-23 Thread Gabriel, Edgar via users
collective I/O for example).

Re: [OMPI users] Status of pNFS, CephFS and MPI I/O

2021-09-23 Thread Gabriel, Edgar via users
From Eric Chamberland: Thanks for your answer, Edgar! In fact, we are able to use NFS and certainly any POSIX file system on a single-node basis. I should have been asking: what are the supported file systems for *multiple-node* read/write access to files?

Re: [OMPI users] Status of pNFS, CephFS and MPI I/O

2021-09-23 Thread Gabriel, Edgar via users
Eric, generally speaking, ompio should be able to operate correctly on all file systems that support POSIX functions. The generic ufs component, for example, is being used on BeeGFS parallel file systems without problems; we use it on a daily basis. For GPFS, the only reason we

Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-19 Thread Gabriel, Edgar via users
work on an update of the FAQ section.

Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-15 Thread Gabriel, Edgar via users
I would like to correct one of my statements: > The entire infrastructure

Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-15 Thread Gabriel, Edgar via users
From Dave Love via users: > How should we know that's expected to fail? It at least shouldn

Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-14 Thread Gabriel, Edgar via users
I will have a look at those tests. The recent fixes were not correctness fixes, but performance fixes. Nevertheless, we used to pass the mpich tests, but I admit that it is not a test suite that we run regularly; I will have a look at them. The atomicity tests are expected to fail, since this is the one c

Re: [OMPI users] Parallel HDF5 low performance

2020-12-03 Thread Gabriel, Edgar via users
the reasons for potential performance issues on NFS are very different from those on Lustre. Basically, depending on your use case and the NFS configuration, you have to enforce a different locking policy to ensure correct output files. The default value chosen for ompio is the most conservative setting
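If the conservative default turns out to be too slow for a particular workload, ompio's locking behavior can be tuned through an MCA parameter of the fs/ufs component; the parameter name below (fs_ufs_lock_algorithm) is quoted from memory, so verify it locally before relying on it:

    ompi_info --param fs ufs --level 9 | grep lock
    mpirun --mca fs_ufs_lock_algorithm <value> -np 16 ./my_io_app

The ompi_info output describes what each value means; relaxing the locking only makes sense when the ranks are known not to write overlapping regions of the file.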

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-26 Thread Gabriel, Edgar via users
I will have a look at the t_bigio tests on Lustre with ompio. We had some reports from collaborators about performance problems similar to the one that you mention here (which was the reason we were hesitant to make ompio the default on Lustre), but part of the problem is that we were not

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-16 Thread Gabriel, Edgar via users
the --with-lustre option twice, once inside the "--with-io-romio-flags=" string (along with the option that you provided), and once outside (for ompio). Thanks Edgar
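A sketch of what such a configure line could look like; the prefix and the placeholder for the additional ROMIO flag are illustrative, not the exact flags used in this thread:

    ./configure --prefix=/opt/openmpi \
        --with-lustre \
        --with-io-romio-flags="--with-lustre <other-romio-flags>"

The outer --with-lustre enables Lustre support in ompio, while the copy inside --with-io-romio-flags is forwarded to ROMIO's own configure.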

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-16 Thread Gabriel, Edgar via users
this is in theory still correct: the default MPI I/O library used by Open MPI on Lustre file systems is ROMIO in all release versions. That being said, ompio also has support for Lustre starting from the 2.1 series, so you can use it as well. The main reason that we did not switch to
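For anyone who wants to experiment, the MPI I/O implementation can be selected at run time through the io MCA framework; the ROMIO component name varies by release (e.g. romio321 in the 4.x series), so check ompi_info for the exact name:

    mpirun --mca io ompio -np 16 ./my_io_app       # force ompio
    mpirun --mca io romio321 -np 16 ./my_io_app    # force ROMIO (name per ompi_info)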

Re: [OMPI users] ompe support for filesystems

2020-11-04 Thread Gabriel, Edgar via users
the ompio software infrastructure has multiple frameworks.
- fs framework: abstracts out file-system-level operations (open, close, etc.)
- fbtl framework: provides the abstractions and implementations of *individual* file I/O operations (seek, read, write, iread, iwrite)
- fcoll framework: provides the
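To see which components are available for each of these frameworks in a given installation, ompi_info lists them; the grep pattern below is just an illustration:

    ompi_info | grep -E " fs:| fbtl:| fcoll:| sharedfp:"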

Re: [OMPI users] MPI I/O question using MPI_File_write_shared

2020-06-05 Thread Gabriel, Edgar via users
Your code looks correct, and based on your output I would actually suspect that the I/O part finished correctly; the error message that you see is not an I/O error, but comes from the btl (which is communication related). What version of Open MPI are you using, and on what file system? Thanks Edgar
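For reference, a minimal shared-file-pointer pattern looks roughly like the sketch below; the file name and payload are placeholders, not the poster's code:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_File fh;
        /* the file must live in a directory visible to all nodes */
        MPI_File_open(MPI_COMM_WORLD, "out.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        char line[64];
        int len = snprintf(line, sizeof(line), "hello from rank %d\n", rank);

        /* appends at the shared file pointer; with the non-collective variant
           the ordering between ranks is not deterministic */
        MPI_File_write_shared(fh, line, len, MPI_CHAR, MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }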

Re: [OMPI users] Slow collective MPI File IO

2020-04-06 Thread Gabriel, Edgar via users
The one test that would give you a good idea of the upper bound for your scenario would be to write a benchmark where each process writes to a separate file, and look at the overall bandwidth achieved across all processes. The MPI I/O performance will be less than or equal to the bandwidth achieved
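A minimal sketch of such an upper-bound benchmark (size, file names and the use of plain fwrite are illustrative; OS caching can inflate the number unless the files are flushed or large enough):

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define NBYTES (256L * 1024 * 1024)   /* 256 MiB per process; adjust as needed */

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        char *buf = malloc(NBYTES);
        memset(buf, rank, NBYTES);

        char fname[64];
        snprintf(fname, sizeof(fname), "bench.%d.dat", rank);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();

        /* every rank writes its own file with no coordination: an upper bound */
        FILE *fp = fopen(fname, "wb");
        fwrite(buf, 1, NBYTES, fp);
        fclose(fp);

        MPI_Barrier(MPI_COMM_WORLD);
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("aggregate bandwidth: %.1f MiB/s\n",
                   (double)NBYTES * size / (1024.0 * 1024.0) / (t1 - t0));

        free(buf);
        MPI_Finalize();
        return 0;
    }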

Re: [OMPI users] Slow collective MPI File IO

2020-04-06 Thread Gabriel, Edgar via users
Hi, a couple of comments. First, if you use MPI_File_write_at, this is usually not considered collective I/O, even if it is executed by multiple processes; MPI_File_write_at_all would be collective I/O. Second, MPI I/O cannot do ‘magic’, but is bound by the hardware that you are providing. If already a
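A short sketch of the difference between the two calls (offsets, datatypes and the blocked layout are made up for illustration):

    #include <mpi.h>

    /* Independent I/O: every rank issues its own request, no coordination. */
    void write_independent(MPI_File fh, int rank, const double *buf, int count)
    {
        MPI_Offset off = (MPI_Offset)rank * count * sizeof(double);
        MPI_File_write_at(fh, off, buf, count, MPI_DOUBLE, MPI_STATUS_IGNORE);
    }

    /* Collective I/O: all ranks that opened the file must call this together,
       which lets the library aggregate and reorder the requests
       (e.g. two-phase collective I/O in the fcoll framework). */
    void write_collective(MPI_File fh, int rank, const double *buf, int count)
    {
        MPI_Offset off = (MPI_Offset)rank * count * sizeof(double);
        MPI_File_write_at_all(fh, off, buf, count, MPI_DOUBLE, MPI_STATUS_IGNORE);
    }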

Re: [OMPI users] How to prevent linking in GPFS when it is present

2020-03-30 Thread Gabriel, Edgar via users
ompio only recently added support for gpfs, and it is only available in master (so far). If you are using any of the released versions of Open MPI (2.x, 3.x, 4.x) you will not find this feature in ompio yet. Thus, the issue is only how to disable gpfs in romio. I could not find right away an optio
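One avenue that might work, though it is a sketch and not a confirmed recipe from this thread, is to restrict the file-system drivers that the embedded ROMIO builds via its usual configure option:

    ./configure --with-io-romio-flags="--with-file-system=ufs+nfs" ...

Leaving gpfs out of that list should keep ROMIO from pulling in the GPFS libraries, assuming the ROMIO version shipped with your Open MPI honors the flag.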

Re: [OMPI users] Read from file performance degradation when increasing number of processors in some cases

2020-03-06 Thread Gabriel, Edgar via users
How is the performance if you leave a few cores for the OS, e.g. running with 60 processes instead of 64? The reasoning is that the file read operation is really executed by the OS, and could potentially be quite resource intensive. Thanks Edgar

Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-24 Thread Gabriel, Edgar via users
I am not an expert on the one-sided code in Open MPI, but I wanted to comment briefly on the potential MPI I/O related item. As far as I can see, the error message “Read -1, expected 48, errno = 1” does not stem from MPI I/O, at least not from the ompio library. What file system did you use for t

Re: [OMPI users] Deadlock in netcdf tests

2019-10-26 Thread Gabriel, Edgar via users
Orion, it might be a good idea. This bug is triggered from the fcoll/two_phase component (and having spent just two minutes looking at it, I have a suspicion what triggers it, namely an int vs. long conversion issue), so it is probably unrelated to the other one. I need to add running the ne

Re: [OMPI users] Deadlock in netcdf tests

2019-10-25 Thread Gabriel, Edgar via users
Never mind, I see it in the backtrace :-) Will look into it, but am currently traveling. Until then, Gilles' suggestion is probably the right approach. Thanks Edgar

Re: [OMPI users] Deadlock in netcdf tests

2019-10-25 Thread Gabriel, Edgar via users
Orion, I will look into this problem. Is there a specific code or test case that triggers it? Thanks Edgar