Hm, thanks for the report; I will look into this. I have not run the romio tests, but the hdf5 tests are run regularly, and with 3.1.2 you should not see any problems on a regular Unix file system. How many processes did you use, and which tests did you run specifically? The main tests that I execute from their parallel test suite are testphdf5 and t_shapesame.
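
For reference, I typically just launch them with mpirun and force the ompio component, roughly along these lines (the process count and working directory are only placeholders, adjust them to your setup):

  mpirun -np 4 --mca io ompio ./testphdf5
  mpirun -np 4 --mca io ompio ./t_shapesame

If you ran them differently, e.g. with more processes or through the hdf5 make-check harness, that might explain why we are seeing different results.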
I will also look into the testmpio that you mentioned in the next couple of days.

Thanks
Edgar

> -----Original Message-----
> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Dave Love
> Sent: Monday, October 8, 2018 10:20 AM
> To: Open MPI Users <users@lists.open-mpi.org>
> Subject: Re: [OMPI users] ompio on Lustre
>
> I said I'd report back about trying ompio on lustre mounted without flock.
>
> I couldn't immediately figure out how to run MTT. I tried the parallel
> hdf5 tests from the hdf5 1.10.3, but I got errors with that even with the
> relevant environment variable to put the files on (local) /tmp.
> Then it occurred to me rather late that romio would have tests. Using the
> "runtests" script modified to use "--mca io ompio" in the romio/test directory
> from ompi 3.1.2 on no-flock-mounted Lustre, after building the tests with an
> installed ompi-3.1.2, it did this and apparently hung at the end:
>
> **** Testing simple.c ****
> No Errors
> **** Testing async.c ****
> No Errors
> **** Testing async-multiple.c ****
> No Errors
> **** Testing atomicity.c ****
> Process 3: readbuf[118] is 0, should be 10
> Process 2: readbuf[65] is 0, should be 10
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD
> with errorcode 1.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> Process 1: readbuf[145] is 0, should be 10
> **** Testing coll_test.c ****
> No Errors
> **** Testing excl.c ****
> error opening file test
> error opening file test
> error opening file test
>
> Then I ran on local /tmp as a sanity check and still got errors:
>
> **** Testing I/O functions ****
> **** Testing simple.c ****
> No Errors
> **** Testing async.c ****
> No Errors
> **** Testing async-multiple.c ****
> No Errors
> **** Testing atomicity.c ****
> Process 2: readbuf[155] is 0, should be 10
> Process 1: readbuf[128] is 0, should be 10
> Process 3: readbuf[128] is 0, should be 10
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
> with errorcode 1.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> **** Testing coll_test.c ****
> No Errors
> **** Testing excl.c ****
> No Errors
> **** Testing file_info.c ****
> No Errors
> **** Testing i_noncontig.c ****
> No Errors
> **** Testing noncontig.c ****
> No Errors
> **** Testing noncontig_coll.c ****
> No Errors
> **** Testing noncontig_coll2.c ****
> No Errors
> **** Testing aggregation1 ****
> No Errors
> **** Testing aggregation2 ****
> No Errors
> **** Testing hindexed ****
> No Errors
> **** Testing misc.c ****
> file pointer posn = 265, should be 10
>
> byte offset = 3020, should be 1080
>
> file pointer posn = 265, should be 10
>
> byte offset = 3020, should be 1080
>
> file pointer posn = 265, should be 10
>
> byte offset = 3020, should be 1080
>
> file pointer posn in bytes = 3280, should be 1000
>
> file pointer posn = 265, should be 10
>
> byte offset = 3020, should be 1080
>
> file pointer posn in bytes = 3280, should be 1000
>
> file pointer posn in bytes = 3280, should be 1000
>
> file pointer posn in bytes = 3280, should be 1000
>
> Found 12 errors
> **** Testing shared_fp.c ****
> No Errors
> **** Testing ordered_fp.c ****
> No Errors
> **** Testing split_coll.c ****
> No Errors
> **** Testing psimple.c ****
> No Errors
> **** Testing error.c ****
> File set view did not return an error
> Found 1 errors
> **** Testing status.c ****
> No Errors
> **** Testing types_with_zeros ****
> No Errors
> **** Testing darray_read ****
> No Errors
>
> I even got an error with romio on /tmp (modifying the script to use mpirun
> --mca io romi314):
>
> **** Testing error.c ****
> Unexpected error message MPI_ERR_ARG: invalid argument of some other kind
> Found 1 errors