Hi,

while developing support for Vulkan shaderInt16 on Anvil I came across a feature of NIR that was a bit inconvenient: bools are always 32-bit by design, but the Intel hardware produces 16-bit bool results for 16-bit comparisons, so that creates a problem that manifests like this:

    vec1 32 ssa_21 = fge ssa_20, ssa_16
    vec1 16 ssa_22 = b2f ssa_21

Our CMP instruction will produce a 16-bit boolean result for the first NIR instruction (where NIR expects it to be 32-bit), so by the time we emit the second instruction in the driver, the bit-size that NIR provides for the operand of b2f no longer matches reality and we emit incorrect code.

This seems to have been a conscious design choice in NIR, and while discussing this with Jason he was unsure how much we wanted to change this or how to do it, given how thoroughly 32-bit bools are baked into NIR and the complexities that modifying this would also bring to our bit-size validation code.

I have been considering alternatives that don't involve changing NIR to support multiple bit-sizes for booleans:

1) Drivers that need to emit smaller booleans could try to fix the generated NIR by correcting the expected bit-sizes for CMP instructions. This would be rather trivial to implement in drivers (and maybe we could even make a generic pass for other drivers to use if they need it), but it will make the validator complain because it won't recognize comparisons with 16-bit bool outputs as valid NIR opcodes. I also found instances where nir_search would complain about mismatching bit-sizes. I haven't looked any further into it yet though, so maybe we can reasonably work around these issues.

2) Drivers could handle this specially when they emit code from NIR. Specifically, when they see a 32-bit boolean source in an instruction, they would have to search for the instruction that produced that source value and check whether it is a 16-bit or a 32-bit boolean in order to emit proper code for the instruction.

3) Drivers can just convert the 16-bit bool result they generate for a 16-bit cmp to the 32-bit bool that NIR expects, and then possibly run an optimization pass to eliminate these extra conversions and fix up the code accordingly.

Does anyone else have any better ideas?

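
As a rough illustration of option 3, here is a toy model in plain C. It is not actual NIR or driver code: hw_instr, hw_cmp_result_bits, emit_compare and widen_bool16 are hypothetical names made up for this sketch. It assumes that booleans are all-zeros or all-ones at their bit-size, so that sign extension is enough to widen a 16-bit result to the 32-bit value NIR expects.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-size NIR expects for every boolean value. */
#define NIR_BOOL_BITS 32u

/* Toy stand-ins for hardware instructions (hypothetical, not i965 types). */
typedef enum { HW_CMP, HW_MOV_SEXT } hw_op;

typedef struct {
    hw_op    op;
    unsigned dst_bits;
    unsigned src_bits;
} hw_instr;

/* Assumption of this sketch: the hardware CMP writes a 16-bit boolean for
 * 16-bit operands and a 32-bit boolean otherwise. */
static unsigned hw_cmp_result_bits(unsigned src_bits)
{
    return src_bits == 16 ? 16 : 32;
}

/* Emit a comparison. If the hardware result is narrower than what NIR
 * expects, append a sign-extending move so every later user of the value
 * sees a 32-bit boolean. Returns the number of instructions written. */
static size_t emit_compare(unsigned src_bits, hw_instr out[2])
{
    size_t n = 0;
    unsigned res_bits = hw_cmp_result_bits(src_bits);

    out[n++] = (hw_instr){ HW_CMP, res_bits, src_bits };
    if (res_bits != NIR_BOOL_BITS)
        out[n++] = (hw_instr){ HW_MOV_SEXT, NIR_BOOL_BITS, res_bits };

    return n;
}

/* What the extra move computes: sign-extend a 16-bit boolean to 32 bits,
 * so 0xFFFF (true) becomes 0xFFFFFFFF and 0 (false) stays 0. */
static uint32_t widen_bool16(uint16_t b)
{
    return (b & 0x8000u) ? (0xFFFF0000u | b) : b;
}
```

In this model a 16-bit comparison costs one extra move, which is the conversion a later optimization pass would try to eliminate when the only user of the boolean can consume the 16-bit value directly.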
Iago