On 29.10.2015 at 20:10, Oded Gabbay wrote:
> On Thu, Oct 29, 2015 at 9:02 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>> On Thu, Oct 29, 2015 at 2:44 PM, Oded Gabbay <oded.gab...@gmail.com> wrote:
>>> However, I would hate to keep the situation as is, meaning the test
>>> passes on x86-64 and fails on ppc64le.
>>
>> Sounds like it'd actually be a difference between AVX and SSE4.2 as
>> well -- what happens if you run on your x86_64 chip with
>> LP_NATIVE_VECTOR_WIDTH=128 ? It fails for me on my HSW chip, looking
>> at the results visually it's mostly good but there's a sprinkling of
>> red pixels.
>>
>> -ilia
>
> It fails on my Haswell chip - definitely sprinkling of red pixels.
> Also the error seems to be greater than 5e-7. Even with 1.6e-6 as
> failure point, it still fails, while on ppc64le it passes.
> Only when I went for 2e-6, the test passes.
>
> As I said and Roland explained, the calculation method is inherently
> less accurate in the two-stages path. Although I don't know why on
> SSE4.2 the deviation is a bit larger than on VMX
>
Does VMX have fma, and does it auto-fuse mul/add to fma? I don't think
it should right now, though...

Other than that, I'm not sure why the results would be different. On x86
we explicitly disable denorms, which could cause different results, but
for this example I don't think that should be an issue.

Roland

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev