On 18.11.2016 at 02:11, Ilia Mirkin wrote:
> On Thu, Nov 17, 2016 at 2:37 AM, Andrew A. <andj2...@gmail.com> wrote:
>> Hello,
>>
>> I'm using Mesa's software renderer for the purposes of regression
>> testing in our graphics software. We render various scenes, save a
>> screencap of the framebuffer for each scene, then compare those
>> framebuffer captures to previously known-good captures.
>>
>> Across runs of these tests on the same hardware, the results seem to
>> be 100% identical. When running the same tests on a different machine,
>> results are *slightly* different. It's very similar within a small
>> tolerance, so this is still usable. However, I was hoping for fully
>> deterministic behavior, even if the hardware is slightly different.
>> Are there some compile time settings or some code that I can change to
>> get Mesa's llvmpipe renderer/rasterizer to be fully deterministic in
>> its output?
>
> You can force the AVX-capable CPU to run in SSE mode. You can do this
> by setting the environment variable
>
> LP_NATIVE_VECTOR_WIDTH=128
>
>> I'm using llvmpipe, and these are the two different CPUs I'm using to
>> run the tests:
>> Intel(R) Xeon(R) CPU E3-1275 v3
>> Intel(R) Xeon(R) CPU X5650
>
> The former has AVX, while the latter does not. I believe this explains
> the difference.
>
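As a rough sketch of how a test harness might apply the LP_NATIVE_VECTOR_WIDTH=128 setting quoted above: the helper names below are hypothetical, and the probe command is only a stand-in for whatever binary actually renders the scenes. The only part taken from the thread is the variable name and its value.

```python
import os
import subprocess
import sys


def llvmpipe_sse_env(base=None):
    """Return a copy of the environment with llvmpipe limited to 128-bit vectors.

    The helper name is hypothetical; only the variable comes from the thread.
    """
    env = dict(os.environ if base is None else base)
    env["LP_NATIVE_VECTOR_WIDTH"] = "128"
    return env


def run_with_sse_width(cmd):
    """Run a command (e.g. the rendering test binary) with the variable set."""
    return subprocess.run(
        cmd,
        env=llvmpipe_sse_env(),
        capture_output=True,
        text=True,
        check=True,
    )


# Stand-in for the real test binary: a child process that just echoes the
# variable back, to confirm it reaches the process that would load Mesa.
probe = [
    sys.executable,
    "-c",
    "import os; print(os.environ['LP_NATIVE_VECTOR_WIDTH'])",
]
result = run_with_sse_width(probe)
print(result.stdout.strip())
```

The variable has to be in the environment of the process that loads Mesa, which is why the sketch passes it to the child process instead of setting it after the renderer has started.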
Yep, forcing 128-bit vectors should do the trick. Note that 8-wide
execution (which we use with AVX) vs. 4-wide should generally not make a
difference by itself (neither should AVX itself, though we disable that
too if you force 128-bit vectors), except:

- We use fma if available (at least on Intel CPUs this even requires
  AVX2, but the former CPU is Haswell, so it has it), so mul+add pairs
  get turned into fma, which is of course numerically different.
  (Setting 128-bit vectors will force this off too.)
- There used to be different attribute interpolation code in llvmpipe
  depending on 4-wide vs. 8-wide. The different code is still there but
  disabled now; we switched to always using the version with higher
  precision. You didn't specify the Mesa version, though, so this may
  still apply to you. This difference was actually pretty annoying, as
  some tests are quite sensitive to it; I'd say it was far more
  significant than the one due to fma.
- The texture sampling code might choose a different codepath (AoS vs.
  SoA filtering, the latter has higher precision) depending on 4-wide
  vs. 8-wide vectors and on the exact sampling environment (e.g. whether
  per-pixel lod is needed). (Even if AoS filtering is chosen both ways,
  it also has some differences in how texture coord wrapping is done
  based on whether AVX is available. I think this should return the same
  results, but I wouldn't quite guarantee it... At least the filtering
  itself within the AoS path will be the same.)

If you used non-x86 CPUs you'd most likely get some more differences (we
switch off denorms on x86 SIMD, for instance, and this stuff is very
much arch-dependent), and there might actually be really different
results in some cases (that is, bugs...), but for x86, at least if you
have SSE4.1, that should be all.

Roland

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
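
To illustrate the fma point in the reply above: a fused multiply-add rounds once, while a separate mul followed by an add rounds twice, so the two can disagree in the last bits. The sketch below is plain Python, not llvmpipe code. It emulates the fused result with exact rational arithmetic, and the input values are picked purely for illustration (and are in double precision, whereas shader math is typically single precision).

```python
from fractions import Fraction


def mul_then_add(a, b, c):
    """Separate multiply and add: the product is rounded before the add."""
    return a * b + c


def fused_mul_add(a, b, c):
    """Emulated fma: compute a*b + c exactly, then round once to a double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Illustrative inputs: the exact product is 1 - 2**-54, which is not
# representable as a double and rounds to 1.0 when computed on its own.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

unfused = mul_then_add(a, b, c)   # 0.0: the 2**-54 term is lost by rounding
fused = fused_mul_add(a, b, c)    # -2**-54: the term survives the single rounding
print(unfused, fused)
```

A difference this small only shows up in a framebuffer comparison when it happens to push a value across a quantization boundary, which fits the "slightly different" results described in the original question.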