Returning FLT_MAX instead of 0 also works, which is similar to another hw instruction: V_RCP_CLAMP_F32.
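
To make the difference between the three results for 1 / 0 concrete, here is a small scalar C sketch. It is only an illustration and not driver code: the function names rcp_ieee, rcp_legacy and rcp_clamp are made up for this example, and the clamp behaviour shown (an infinite result is replaced by FLT_MAX with the same sign) is an assumption about what V_RCP_CLAMP_F32 does that should be checked against the ISA documentation.

```c
#include <float.h>

/* Plain reciprocal, as V_RCP_F32 is described in the patch: 1 / x.
 * With IEEE 754 floats, 1 / +-0 gives +-infinity. */
static float rcp_ieee(float x)
{
    return 1.0f / x;
}

/* Legacy rule from the patch comment: 1 / +-0 = 0.
 * Note that -0.0f compares equal to 0.0f, so both zeros take this path. */
static float rcp_legacy(float x)
{
    return (x != 0.0f) ? 1.0f / x : 0.0f;
}

/* Clamp variant (ASSUMPTION, hypothetical model of V_RCP_CLAMP_F32):
 * an infinite result is clamped to +-FLT_MAX, finite results and NaN
 * pass through unchanged. */
static float rcp_clamp(float x)
{
    float r = 1.0f / x;

    if (r > FLT_MAX)
        return FLT_MAX;
    if (r < -FLT_MAX)
        return -FLT_MAX;
    return r;
}
```

For x = 0 these return infinity, 0 and FLT_MAX respectively, while for an ordinary input such as x = 4 all three return 0.25. Either of the last two avoids feeding an infinity into later shader arithmetic, which is presumably why both would work for the app in question.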
The Unreal engine is a pretty big target with a lot of apps out there.
I'm afraid a driconf option isn't feasible.

Marek

On Thu, Dec 4, 2014 at 5:39 PM, Roland Scheidegger <srol...@vmware.com> wrote:
> Hmm I have to say I'm not really convinced of that solution. Because all
> divs are lowered, this will screw the results of all divs (if the rcp
> would come from a legacy arb_fp rcp that would be different and quite
> possible some apps depending on it, problems like that are very common
> for d3d9 apps too). But really there's some expectations stuff conforms
> to ieee754 rules these days, and making divs by zero return 0 ain't so hot.
> Maybe it's the div lowering itself which causes this, in which case it
> should probably be disabled? Might be a good idea anyway (if the driver
> supports native div) since rcp isn't accurate usually.
> Difficult to tell though without seeing the glsl and tgsi shader. But if
> it was really the app expecting zero out of a div by zero (but I have
> doubts about that), I'd certainly classify that as an app bug, and any
> workarounds only be enabled by some dri conf option.
>
> Roland
>
>
>
> On 04.12.2014 at 13:34, Marek Olšák wrote:
>> From: Marek Olšák <marek.ol...@amd.com>
>>
>> Discussion: https://bugs.freedesktop.org/show_bug.cgi?id=83510#c8
>> ---
>>  src/gallium/drivers/radeonsi/si_shader.c | 27 +++++++++++++++++++++++++++
>>  1 file changed, 27 insertions(+)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
>> index 973bac2..e0799c9 100644
>> --- a/src/gallium/drivers/radeonsi/si_shader.c
>> +++ b/src/gallium/drivers/radeonsi/si_shader.c
>> @@ -2744,6 +2744,32 @@ static int si_generate_gs_copy_shader(struct si_screen *sscreen,
>>  	return r;
>>  }
>>
>> +/**
>> + * This emulates V_RCP_LEGACY_F32, which has the following rule for division
>> + * by zero: 1 / 0 = 0
>> + *
>> + * V_RCP_F32(x) = 1 / x
>> + * V_RCP_LEGACY_F32(x) = (x != +-0) ? V_RCP_F32(x) : 0.
>> + */
>> +static void si_llvm_emit_rcp_legacy(const struct lp_build_tgsi_action * action,
>> +				    struct lp_build_tgsi_context * bld_base,
>> +				    struct lp_build_emit_data * emit_data)
>> +{
>> +	LLVMValueRef cmp =
>> +		lp_build_cmp(&bld_base->base,
>> +			     PIPE_FUNC_NOTEQUAL,
>> +			     emit_data->args[0],
>> +			     bld_base->base.zero);
>> +
>> +	LLVMValueRef div =
>> +		lp_build_emit_llvm_binary(bld_base, TGSI_OPCODE_DIV,
>> +					  bld_base->base.one,
>> +					  emit_data->args[0]);
>> +
>> +	emit_data->output[emit_data->chan] =
>> +		lp_build_select(&bld_base->base, cmp, div, bld_base->base.zero);
>> +}
>> +
>>  int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
>>  {
>>  	struct si_shader_selector *sel = shader->selector;
>> @@ -2798,6 +2824,7 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
>>  		bld_base->op_actions[TGSI_OPCODE_MIN].emit = build_tgsi_intrinsic_nomem;
>>  		bld_base->op_actions[TGSI_OPCODE_MIN].intr_name = "llvm.minnum.f32";
>>  	}
>> +	bld_base->op_actions[TGSI_OPCODE_RCP].emit = si_llvm_emit_rcp_legacy;
>>
>>  	si_shader_ctx.radeon_bld.load_system_value = declare_system_value;
>>  	si_shader_ctx.tokens = sel->tokens;
>>
>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev