On 2016-05-01 22:47:40, Jordan Justen wrote:
> 7-10, 12-20, 36-43, 57-58:
> Reviewed-by: Jordan Justen <jordan.l.jus...@intel.com>

34-35 Reviewed-by: Jordan Justen <jordan.l.jus...@intel.com>

>
> I also sent questions about 56 & 59.
>
> On 2016-04-29 04:28:57, Samuel Iglesias Gonsálvez wrote:
> > Hello,
> >
> > This patch series continues adding arb_gpu_shader_fp64 support to the
> > Intel driver. Specifically, this targets the i965 scalar backend for
> > BDW+ hardware (vec4 is still under research and gen7 has its own
> > issues which we intend to tackle after gen8).
> >
> > This adds most of the fp64 scalar implementation. It starts by enabling
> > the various lowering passes in NIR for doubles and then adds all the
> > infrastructure required in the backend to operate with 64-bit floating
> > point data.
> >
> > For reference, this series fixes 1009 fp64 piglit tests in BDW. Fp64
> > totals look like this:
> >
> >   pass: 2523
> >   fail: 46
> >   crash: 447
> >   skip: 16
> >   total: 3032
> >
> > There are a few missing things in this series to achieve a perfect fp64
> > pass rate:
> >
> > 1. Fixes to copy propagation. The fp64 code creates new code patterns
> >    that copy-propagation isn't really ready to handle yet, leading to
> >    incorrect results in some cases. We have 9 patches to fix copy
> >    propagation for fp64 that we intend to send separately after the
> >    main fp64 infrastructure has been reviewed.
> >
> > 2. ubo/ssbo/shared-variables. We will also send the patches for this in
> >    a separate series after this one.
> >
> > 3. A fix for the SIMD lowering pass to properly handle execmasking when
> >    transposing the results of split instructions back together. We have
> >    a local fix for this, but Curro hit the same problem while working
> >    on SIMD32 and has a better solution for it so we intend to use his
> >    solution when it is ready.
> >
> > 4. Spilling. We don't support spilling of DF registers yet and some
> >    piglit tests need this to compile. Jason had plans to work on the
> >    spilling code and address the needs of fp64 along the way.
> >
> > The series does not introduce any regressions in piglit on ILK, SNB,
> > HSW, BDW and SKL.
> >
> > A branch with this series is available for testing here:
> >
> > $ git clone -b i965-fp64-scalar-backend-part-1
> > https://github.com/Igalia/mesa.git
> >
> > You will have to enable the extension with:
> >
> > $ export MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader_fp64
> >
> > The full scalar fp64 implementation, containing also the fixes to
> > copy-propagation as well as ubo/ssbo and our local fix for the SIMD
> > lowering pass is available here:
> >
> > git clone -b i965-fp64 https://github.com/Igalia/mesa.git
> >
> > And for the adventurous, there is also a work-in-progress branch that
> > adds scalar support for HSW here:
> >
> > git clone -b i965-fp64-gen7 https://github.com/Igalia/mesa.git
> >
> > Thanks,
> >
> > Sam
> >
> >
> > Connor Abbott (33):
> >   i965: use double lowering pass
> >   i965: use pack/unpackDouble lowering
> >   i965/disasm: fix disasm of 3-src doubles
> >   i965/eu: allow doubles in math instructions
> >   i965: add brw_imm_df
> >   i965: add support for getting/setting DF immediates
> >   i965: add support for disassembling DF immediates
> >   i965/eu: add support for DF immediates
> >   i965: fix brw_negate_immediate() for doubles
> >   i965: fix is_zero(), is_one() and is_negative_one() for doubles
> >   i965: fixup uniform setup for doubles
> >   i965/fs: print writemask_all when it's enabled
> >   i965/fs: use the NIR bit size when creating registers
> >   i965/fs: don't propagate 64-bit immediates
> >   i965/fs: add support for printing double immediates
> >   i965/fs: always pass the bitsize to brw_type_for_nir_type()
> >   i965/fs: add a stride helper
> >   i965/fs: add PACK opcode
> >   i965/fs: add a pass for lowering PACK opcodes
> >   i965/fs/nir: translate double pack/unpack
> >   i965/fs: fix type_size() for doubles
> >   i965/fs: handle uniforms in byte_offset()
> >   i965/fs: use byte_offset() in offset() for uniforms
> >   i965/fs: fix assign_constant_locations() for doubles
> >   i965/fs: generalize SIMD16 interference workaround
> >   i965/fs: extend exec_size halving in the generator
> >   i965/fs: fix compares for doubles
> >   i965/fs: fix regs_read() for uniforms
> >   i965/fs: fix is_copy_payload() for doubles
> >   i965/fs: fix regs_written in LOAD_PAYLOAD for doubles
> >   i965/fs: fix dst width calculation in CSE
> >   i965/fs: add a pass for legalizing d2f
> >   i965/fs: add support for f2d and d2f
> >
> > Iago Toral Quiroga (15):
> >   i965: fix brw_saturate_immediate() for doubles
> >   i965: fix brw_abs_immediate() for doubles
> >   i965: two-argument instructions can only use 32-bit immediates
> >   i965/fs: optimize pack double
> >   i965/fs: optimize unpack double
> >   i965/fs: handle fp64 opcodes in brw_do_channel_expressions
> >   i965/fs: We only support 32-bit integer ALU operations for now
> >   i965/fs: add null_reg_df
> >   i965/fs: implement fsign() for doubles
> >   i965/fs: implement d2b
> >   i965/fs: implement d2i and d2u
> >   i965/fs: implement i2d and u2d
> >   i965/fs: rename our lower_d2f pass to lower_d2x
> >   i965/fs/lower_simd_width: Fix registers written for split instructions
> >   i965/fs: recognize writes with a subreg_offset > 0 as partial
> >
> > Samuel Iglesias Gonsálvez (7):
> >   i965: enable lrp lowering for doubles
> >   vc4: lower lrp when operating with double operands
> >   freedreno/ir3: lower lrp when operating with double operands
> >   i965/fs: align access to double-based uniforms in push constant buffer
> >   i965/fs: demote_pull_constants() did not take into account double
> >     types
> >   i965/fs: take into account doubles when calculating read_size for
> >     MOV_INDIRECT
> >   i965/fs: fix MOV_INDIRECT exec_size for doubles
> >
> > Topi Pohjolainen (4):
> >   i965: Lower DFRACEXP/DLDEXP
> >   i965: Determine size of double precision float register
> >   i965: Tell backend register about double precision type
> >   i965/eu: Allow 3-src float ops with doubles
> >
src/gallium/drivers/vc4/vc4_program.c | 1 + > > src/mesa/drivers/dri/i965/Makefile.sources | 2 + > > src/mesa/drivers/dri/i965/brw_compiler.c | 2 + > > src/mesa/drivers/dri/i965/brw_compiler.h | 8 + > > src/mesa/drivers/dri/i965/brw_defines.h | 9 + > > src/mesa/drivers/dri/i965/brw_disasm.c | 3 +- > > src/mesa/drivers/dri/i965/brw_eu_emit.c | 60 +++-- > > src/mesa/drivers/dri/i965/brw_fs.cpp | 106 ++++++-- > > src/mesa/drivers/dri/i965/brw_fs.h | 6 +- > > src/mesa/drivers/dri/i965/brw_fs_builder.h | 15 +- > > .../dri/i965/brw_fs_channel_expressions.cpp | 23 +- > > .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 3 + > > src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 3 +- > > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 16 +- > > src/mesa/drivers/dri/i965/brw_fs_lower_d2x.cpp | 75 ++++++ > > src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp | 59 +++++ > > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 287 > > ++++++++++++++++++--- > > src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 67 +++-- > > src/mesa/drivers/dri/i965/brw_inst.h | 25 ++ > > src/mesa/drivers/dri/i965/brw_ir_fs.h | 14 +- > > src/mesa/drivers/dri/i965/brw_link.cpp | 1 + > > src/mesa/drivers/dri/i965/brw_nir.c | 10 + > > src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 7 +- > > src/mesa/drivers/dri/i965/brw_program.c | 1 + > > src/mesa/drivers/dri/i965/brw_reg.h | 10 + > > src/mesa/drivers/dri/i965/brw_shader.cpp | 73 ++++-- > > src/mesa/drivers/dri/i965/brw_shader.h | 1 + > > src/mesa/drivers/dri/i965/brw_wm.c | 2 + > > src/mesa/drivers/dri/i965/gen6_constant_state.c | 12 +- > > 30 files changed, 773 insertions(+), 129 deletions(-) > > create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_d2x.cpp > > create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp > > > > -- > > 2.5.0 > > > > _______________________________________________ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev 