Hi all,

While working on FP64 for i965, I thought of an issue with the vec4 backend that I'm not sure how to resolve. From what I understand, the execmask works the same way in Align16 mode as in Align1 mode, except that in practice you only use the first 8 channels for SIMD4x2, and the first 4 channels always have the same value, as do the last 4. But this doesn't work for 64-bit things, since there we only operate on 4 components at a time, so it's more like SIMD2x2. For example, imagine that only the second vertex is currently enabled. Then the execmask looks like 00001111, and if we do something like:
mul(4) g24<1>DF g12<4,4,1>DF g13<4,4,1>DF { align16 };

then all 4 channels will be disabled, which is not what we want.

I think the first thing to do is to write a piglit test that tests this case, since currently all the arb_gpu_shader_fp64 tests only use uniforms. We need a test with non-uniform control flow that triggers the case described above. Once we have that, and if we determine there's actually a problem, we need to figure out how to solve it. The ideas I had were:

1. Make every FP64 thing use WE_all. This isn't actually too bad at the moment, since our notion of interference already assumes (more or less) that everything is WE_all, but it prevents us from improving that in the future for FP64 things. Unfortunately, it also means that we can't use writemasks, since setting WE_all makes the EU ignore the writemask, so we'll have to do some trickery to get things with only 1 channel enabled to work correctly.

2. Use the NibCtrl field, and split each FP64 operation into 2. Unfortunately, this field only appeared on gen8, and the PRM only says it works for SIMD4 operations, whereas we need it to work for SIMD2 operations, although there's a chance it'll actually work for SIMD2 as well. This lets us potentially do better register allocation, but it might not work, and even if it does, it won't work for gen7.

#1 sounds like the better solution for now, but who knows... maybe the HW people magically made it work already, and either I'm not aware of it or they didn't document it.

Connor
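
P.S. In case it helps to see the failure case spelled out, here is a toy Python model of it. It only encodes the assumption stated above (an instruction of width N looks at execmask bits 0..N-1, and WE_all overrides the mask); it is not verified against hardware or the PRM, and the names `enabled_channels`, `vertex_enabled`, etc. are made up for illustration.

```python
# Toy model of the execmask case above. ASSUMPTION (not verified on
# hardware): an instruction with exec size N consults execmask bits
# 0..N-1, and WE_all makes every channel execute regardless of the mask.

def enabled_channels(execmask, exec_size, we_all=False):
    """Per-channel enables for an instruction starting at channel 0."""
    if we_all:
        return [True] * exec_size
    return [bool((execmask >> ch) & 1) for ch in range(exec_size)]

# Only the second vertex is enabled. Bit i is channel i here, so this is
# the "00001111" mask from above written with channel 0 as the low bit.
vertex_enabled = [False, True]
execmask = 0b11110000

# 32-bit SIMD4x2: 8 channels, vertex 0 owns 0..3 and vertex 1 owns 4..7.
f32 = enabled_channels(execmask, 8)

# 64-bit mul(4): 4 channels, vertex 0 owns 0..1 and vertex 1 owns 2..3,
# but under the assumption above the EU still looks at mask bits 0..3.
df = enabled_channels(execmask, 4)

# What we would actually want for the 64-bit case.
df_wanted = [vertex_enabled[ch // 2] for ch in range(4)]

# Idea #1: WE_all runs every channel, including the disabled vertex.
df_we_all = enabled_channels(execmask, 4, we_all=True)

print("f32      ", f32)
print("df       ", df)
print("df wanted", df_wanted)
print("df WE_all", df_we_all)
```

The 32-bit case lines up with the mask, the 64-bit case ends up with nothing enabled, and WE_all enables everything, which is why idea #1 needs extra care for the disabled vertex.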