Drew Parsons wrote on 06/04/2020 at 05:08:
> On 2020-04-06 09:56, Drew Parsons wrote:
>> On 2020-04-06 01:48, Gilles Filippini wrote:
>>> Drew Parsons wrote on 05/04/2020 at 18:57:
>>>>
>>>> Another option is to create an environment variable to force h5py to
>>>> load the mpi version even when run in a serial environment without
>>>> mpirun. Easy enough to set up, though I'm interested to see if "mpirun
>>>> -n 1 dh_auto_build" or a variation of that is viable. Maybe
>>>> %:
>>>> mpirun -n 1 dh $@ --with python3 --buildsystem=pybuild
>>>
>>> This way, the test cases run against python3.7 pass, but they fail
>>> against python3.8 with:
>>>
>>> I: pybuild base:217: cd
>>> /build/bitshuffle-z2ZvpN/bitshuffle-0.3.5/.pybuild/cpython3_3.8_bitshuffle/build;
>>>
>>> python3.8 -m unittest discover -v
>>> [pinibrem15:43725] OPAL ERROR: Unreachable in file ext3x_client.c at
>>> line 112
>>> *** An error occurred in MPI_Init_thread
>>> *** on a NULL communicator
>>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>> *** and potentially your MPI job)
>>> [pinibrem15:43725] Local abort before MPI_INIT completed completed
>>> successfully, but am not able to aggregate error messages, and not able
>>> to guarantee that all other processes were killed!
>>> E: pybuild pybuild:352: test: plugin distutils failed with: exit code=1:
>>> cd
>>> /build/bitshuffle-z2ZvpN/bitshuffle-0.3.5/.pybuild/cpython3_3.8_bitshuffle/build;
>>>
>>> python3.8 -m unittest discover -v
>>> dh_auto_test: error: pybuild --test -i python{version} -p "3.7 3.8"
>>> returned exit code 13
>>>
>>> But the HDF5 error is no longer present with python3.7, so this seems
>>> like a step in the right direction.
>>
>>
>> Strange again. I would have expected the same behaviour in python3.8
>> and python3.7, whether successful or unsuccessful.
>
>
> Putting dh into mpirun seems to be interfering with process spawning.
> Once MPI is initialised (for the python3.7 test) it's not reinitialised
> for the python3.8 and so it's in a bad state for the test. Something
> like that.
>
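
[Inline note: a pure-Python illustration of the hypothesis above, with no
real MPI involved. It simply assumes, as suggested here, that a single
mpirun job tolerates one initialisation, so the second interpreter's
attempt fails. The FakeMPIJob class is invented for this sketch and is
not part of OpenMPI or mpi4py.]

```python
class FakeMPIJob:
    """Invented stand-in for the state one mpirun job carries.

    Assumption: a simplified "MPI may only be initialised once per job"
    model, not OpenMPI's actual bookkeeping.
    """

    def __init__(self):
        self.initialized = False

    def init(self):
        # Mirrors MPI_Init_thread: a second attempt within the same job aborts.
        if self.initialized:
            raise RuntimeError("MPI already initialised for this job")
        self.initialized = True

job = FakeMPIJob()
job.init()      # python3.7 test run: succeeds
try:
    job.init()  # python3.8 test run: fails, like the abort in the log above
except RuntimeError as err:
    print(err)  # expected: MPI already initialised for this job
```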
> It's only in the tests where h5py is invoked that we get the problems.
> This variant works, applying mpirun separately for each test run:
>
> override_dh_auto_test:
> 	set -e; \
> 	for py in `py3versions -s -v`; do \
> 		mpirun -n 1 pybuild --test -i python{version} -p $$py; \
> 	done
>
> (could use mpirun -n $(NPROC) for real mpi testing).

Yes, it works! \o/

> Do we want to use this as a solution? Or would you prefer an environment
> variable that h5py can check to allow mpi invocation on a serial process?

I leave this decision up to you. Whatever you choose, it deserves a big
fat note in README.Debian.

> Note that this means bitshuffle as built now is expressly tied in with
> hdf5-mpi and h5py-mpi (this seems intentional by debian/rules and
> debian/control, though the Build-Depends must be updated to
> python3-h5py-mpi). It's a separate question whether it's desirable to
> also support a hdf5-serial build of bitshuffle. Likewise we need to
> think about what we want to happen when bitshuffle is invoked in a
> serial process.

I'll leave that to the bitshuffle maintainer. I'll propose a patch to fix
the current FTBFS, sticking with the mpi flavour to be conservative vs
bitshuffle's previous builds.

> I think part of the confusion here is that bitshuffle (at least in the
> tests) is double-handling the HDF5 library, with direct calls on the one
> hand, but indirectly through h5py as well, on the other hand.

Thanks,

_g.
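
P.S. A rough Python sketch of the environment-variable option discussed
above. The variable name H5PY_FORCE_MPI and the helper want_mpi() are
assumptions made for illustration only, not an existing h5py interface:

```python
import os

def want_mpi(default=False):
    """Return True if the (hypothetical) H5PY_FORCE_MPI variable is set
    to a truthy value; otherwise fall back to the default behaviour
    (use MPI only when launched under mpirun)."""
    val = os.environ.get("H5PY_FORCE_MPI", "")
    if not val:
        return default
    return val.lower() not in ("0", "no", "false")

# Example: a serial process where the user explicitly requests the MPI flavour.
os.environ["H5PY_FORCE_MPI"] = "1"
print(want_mpi())  # expected: True
```

If we go this way, that big fat note in README.Debian would be the place
to document the variable.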

