Here's a version of Debian's SIMD Everywhere patch as a pull request for MMseqs2: https://github.com/soedinglab/MMseqs2/pull/309

On Fri, May 8, 2020 at 9:15 PM Michael Crusoe <michael.cru...@gmail.com> wrote:

> [replying on the debian-med list with permission. Please keep Martin and
> Milot CC'd as they do not subscribe]
>
> On Fri, May 8, 2020 at 7:36 PM Milot Mirdita <mi...@mirdita.de> wrote:
>
>> Hi Michael,
>>
>> I am a developer on the MMseqs2 team and I saw your tweet regarding the
>> AWS ARM64 machines earlier and checked on Debian Salsa if it would be a lot
>> of work enabling ARM64 support with the next release as we worked on that
>> recently.
>
> Hey Milot, thanks for your email!
>
>> I saw that Debian's MMseqs2 now uses SIMDe to abstract away different
>> architectures. While this is a very cool technical achievement, I am very
>> uncomfortable with it without being properly integrated into and monitored
>> by our CI regression testing.
>>
>> During ARM64 development I found that there are a lot of subtle issues
>> that can result in differing sensitivity between architectures (e.g.
>> ARM64's default unsigned char type causes issues, there are many crashes on
>> 32-bit ARM). I am also worried that our two most important platforms
>> (SSE4.1 and AVX2) might suffer from performance regressions.
>
> Interesting! On Debian we have to provide binaries that respect the
> architecture baseline. That means no SSE-, SSE2-, only binaries on
> i386/i686 and no SSE3+ only binaries on AMD64. So that's why we compile
> mmseqs2 multiple times, so there is a version that doesn't violate the
> baseline, along with versions that should match the highest level of SIMD
> support available on the user's CPU.
>
> https://salsa.debian.org/med-team/mmseqs2/-/blob/master/debian/rules#L22
>
> https://salsa.debian.org/med-team/mmseqs2/-/blob/master/debian/bin/simd-dispatch
>
>> We will have ARM64 and hopefully also PPC64LE support in the next
>> release. I would suggest to either wait and use our upstream code, or
>> submit a PR with your changes to us and see how we can integrate everything
>> correctly.
>
> Sure, happy to send the patches! I meant to, but hadn't gotten around to
> it yet
>
> https://wiki.debian.org/SIMDEverywhere#Packages_Status
>
>> Also I would be very glad if you could integrate the full regression
>> suite to spot if all architectures produce consistent results. You can run
>> the regression by calling from the repository:
>> git submodule update --init
>> ./util/regression/run_regression.sh ./path-to-mmseqs-binary scratch-directory
>
> Oh yeah, would love to! Except we need all the upstream sources in a
> single tarball, which git submodules + GitHub releases makes difficult. So
> if you can add a pure source (with all git submodules) tarball to
> https://github.com/soedinglab/MMseqs2/releases that would be appreciated!
>
>> We had refactored this test suite to make it as easy as possible to use
>> for Shayan who initially had proposed to package MMseqs2 for Debian. The
>> test subfolder is badly named and contains scratch scripts for feature
>> development. They don't do anything useful for testing such as finding
>> performance or sensitivity drops.
>
> Noted.
>
>> Thanks for your work and best regards,
>
> Thank you for sharing your work under a F/OSS license and for your
> contributions to Open Science!
>
>> Milot
>
> --
> Michael R. Crusoe

--
Michael R. Crusoe
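
For readers unfamiliar with the "compile several times, then pick the best build at run time" approach described in the quoted reply, here is a rough sketch of the selection step. It is only an illustration and not the contents of Debian's simd-dispatch script: the build suffixes, the `generic` baseline name, the `mmseqs-` prefix, and the helper functions are all hypothetical, and the flag spellings assume the `flags` line of Linux's /proc/cpuinfo on x86.

```python
# Hypothetical sketch of run-time SIMD dispatch: choose the most capable
# build that the current CPU can run, falling back to a baseline build.
# Build suffixes and the "generic" baseline name are illustrative assumptions.

# Ordered from most to least demanding. Each entry pairs a build suffix with
# the CPU flags it requires (spelled as in Linux /proc/cpuinfo on x86).
BUILDS = [
    ("avx2", {"avx2"}),
    ("sse4.1", {"sse4_1"}),
    ("sse2", {"sse2"}),
]
BASELINE = "generic"


def pick_build(cpu_flags):
    """Return the suffix of the first build whose required flags are all present."""
    for suffix, required in BUILDS:
        if required <= cpu_flags:
            return suffix
    return BASELINE


def read_cpu_flags(path="/proc/cpuinfo"):
    """Collect CPU feature flags on Linux x86; return an empty set if unavailable."""
    try:
        with open(path) as handle:
            for line in handle:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


if __name__ == "__main__":
    # Print the name of the binary a wrapper would run (hypothetical naming).
    print("mmseqs-" + pick_build(read_cpu_flags()))
```

Because the list is checked from most to least demanding, a CPU that reports no recognised flags (or a system without a `flags` line) ends up on the baseline build, which is the property the architecture-baseline rule is after.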