On August 24, 2018 22:49:14 "Ian Romanick" <i...@freedesktop.org> wrote:
From: Ian Romanick <ian.d.roman...@intel.com>
Funny story... a single shader was hurt for instructions, spills, fills.
That same shader was also the most helped for cycles. #GPUsAreWeird
No changes on any other Intel platform.
Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14304116 -> 14304261 (<.01%)
instructions in affected programs: 12776 -> 12921 (1.13%)
helped: 19
HURT: 1
helped stats (abs) min: 1 max: 16 x̄: 2.32 x̃: 1
helped stats (rel) min: 0.05% max: 7.27% x̄: 0.92% x̃: 0.55%
HURT stats (abs) min: 189 max: 189 x̄: 189.00 x̃: 189
HURT stats (rel) min: 4.87% max: 4.87% x̄: 4.87% x̃: 4.87%
95% mean confidence interval for instructions value: -12.83 27.33
95% mean confidence interval for instructions %-change: -1.57% 0.31%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 527552861 -> 527531226 (<.01%)
cycles in affected programs: 1459195 -> 1437560 (-1.48%)
helped: 16
HURT: 2
helped stats (abs) min: 2 max: 21328 x̄: 1353.69 x̃: 6
helped stats (rel) min: 0.01% max: 5.29% x̄: 0.36% x̃: 0.03%
HURT stats (abs) min: 12 max: 12 x̄: 12.00 x̃: 12
HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -3699.81 1295.92
95% mean confidence interval for cycles %-change: -0.94% 0.30%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 8025 -> 8033 (0.10%)
spills in affected programs: 208 -> 216 (3.85%)
helped: 1
HURT: 1
total fills in shared programs: 10989 -> 11040 (0.46%)
fills in affected programs: 444 -> 495 (11.49%)
helped: 1
HURT: 1
Ivy Bridge
total instructions in shared programs: 11709181 -> 11709153 (<.01%)
instructions in affected programs: 3505 -> 3477 (-0.80%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 23 x̄: 9.33 x̃: 4
helped stats (rel) min: 0.11% max: 1.16% x̄: 0.63% x̃: 0.61%
total cycles in shared programs: 254741126 -> 254738801 (<.01%)
cycles in affected programs: 919067 -> 916742 (-0.25%)
helped: 3
HURT: 0
helped stats (abs) min: 21 max: 2144 x̄: 775.00 x̃: 160
helped stats (rel) min: 0.03% max: 0.90% x̄: 0.32% x̃: 0.03%
total spills in shared programs: 4536 -> 4533 (-0.07%)
spills in affected programs: 40 -> 37 (-7.50%)
helped: 1
HURT: 0
total fills in shared programs: 4819 -> 4813 (-0.12%)
fills in affected programs: 94 -> 88 (-6.38%)
helped: 1
HURT: 0
Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
---
src/intel/compiler/brw_fs_nir.cpp | 38 ++++++++++++++++++++++++++++++++------
1 file changed, 32 insertions(+), 6 deletions(-)
diff --git a/src/intel/compiler/brw_fs_nir.cpp
b/src/intel/compiler/brw_fs_nir.cpp
index 9c9df5ac09f..a2c3d715380 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -3659,9 +3659,20 @@ fs_visitor::nir_emit_cs_intrinsic(const fs_builder &bld,
break;
}
- case nir_intrinsic_shared_atomic_add:
- nir_emit_shared_atomic(bld, BRW_AOP_ADD, instr);
+ case nir_intrinsic_shared_atomic_add: {
+ int op = BRW_AOP_ADD;
+ const nir_const_value *const val =
nir_src_as_const_value(instr->src[1]);
+
+ if (val != NULL) {
+ if (val->i32[0] == 1)
+ op = BRW_AOP_INC;
+ else if (val->i32[0] == -1)
+ op = BRW_AOP_DEC;
+ }
Given the repeated pattern in this and the other patches, would it make
sense to add a little brw_atomic_add_op helper that takes a nir_src? I'm
not sure exactly how that would work with adjusting the number of surfed
everywhere though. Feel free to disregard if it doesn't make sense.
Another thought (more of an FYI, really) is that I had considered doing
this as a NIR lowering pass at one point. Not sure if anyone else has
hardware inc/dec operations though. You don't need to do anything with
that information; I'm just throwing it out there.
--Jason
+
+ nir_emit_shared_atomic(bld, op, instr);
break;
+ }
case nir_intrinsic_shared_atomic_imin:
nir_emit_shared_atomic(bld, BRW_AOP_IMIN, instr);
break;
@@ -4377,9 +4388,20 @@ fs_visitor::nir_emit_intrinsic(const fs_builder
&bld, nir_intrinsic_instr *instr
break;
}
- case nir_intrinsic_ssbo_atomic_add:
- nir_emit_ssbo_atomic(bld, BRW_AOP_ADD, instr);
+ case nir_intrinsic_ssbo_atomic_add: {
+ int op = BRW_AOP_ADD;
+ const nir_const_value *const val =
nir_src_as_const_value(instr->src[2]);
+
+ if (val != NULL) {
+ if (val->i32[0] == 1)
+ op = BRW_AOP_INC;
+ else if (val->i32[0] == -1)
+ op = BRW_AOP_DEC;
+ }
+
+ nir_emit_ssbo_atomic(bld, op, instr);
break;
+ }
case nir_intrinsic_ssbo_atomic_imin:
nir_emit_ssbo_atomic(bld, BRW_AOP_IMIN, instr);
break;
@@ -4888,7 +4910,9 @@ fs_visitor::nir_emit_ssbo_atomic(const fs_builder &bld,
}
fs_reg offset = get_nir_src(instr->src[1]);
- fs_reg data1 = get_nir_src(instr->src[2]);
+ fs_reg data1;
+ if (op != BRW_AOP_INC && op != BRW_AOP_DEC && op != BRW_AOP_PREDEC)
+ data1 = get_nir_src(instr->src[2]);
fs_reg data2;
if (op == BRW_AOP_CMPWR)
data2 = get_nir_src(instr->src[3]);
@@ -4962,7 +4986,9 @@ fs_visitor::nir_emit_shared_atomic(const fs_builder &bld,
fs_reg surface = brw_imm_ud(GEN7_BTI_SLM);
fs_reg offset;
- fs_reg data1 = get_nir_src(instr->src[1]);
+ fs_reg data1;
+ if (op != BRW_AOP_INC && op != BRW_AOP_DEC && op != BRW_AOP_PREDEC)
+ data1 = get_nir_src(instr->src[1]);
fs_reg data2;
if (op == BRW_AOP_CMPWR)
data2 = get_nir_src(instr->src[2]);
--
2.14.4
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev