-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 02/05/16 23:50, Mark Janes wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>
>> Hello,
>>
>> This patch series continues adding arb_gpu_shader_fp64 support to
>> the Intel driver. Specifically, this targets the i965 scalar
>> backend for BDW+ hardware (vec4 is still under research and gen7
>> has its own issues which we intend to tackle after gen8).
>>
>> This adds most of the fp64 scalar implementation: it starts by
>> enabling the various lowering passes in NIR for doubles and then
>> adds all the infrastructure required in the backend to operate
>> with 64-bit floating point data.
>>
>> For reference, this series fixes 1009 fp64 piglit tests in BDW.
>> Fp64 totals look like this:
>>
>> pass: 2523  fail: 46  crash: 447  skip: 16  total: 3032
>>
>> There are a few missing things in this series to achieve a
>> perfect fp64 pass rate:
>>
>> 1. Fixes to copy propagation. The fp64 code creates new code
>> patterns that copy-propagation isn't really ready to handle yet,
>> leading to incorrect results in some cases. We have 9 patches to
>> fix copy propagation for fp64 that we intend to send separately
>> after the main fp64 infrastructure has been reviewed.
>>
>> 2. ubo/ssbo/shared-variables. We will also send the patches for
>> this in a separate series after this one.
>>
>> 3. A fix for the SIMD lowering pass to properly handle
>> execmasking when transposing the results of split instructions
>> back together. We have a local fix for this, but Curro hit the
>> same problem while working on SIMD32 and has a better solution
>> for it, so we intend to use his solution when it is ready.
>>
>> 4. Spilling. We don't support spilling of DF registers yet and
>> some piglit tests need this to compile. Jason had plans to work
>> on the spilling code and address the needs of fp64 along the
>> way.
>>
>> The series does not introduce any regressions in piglit on ILK,
>> SNB, HSW, BDW and SKL.
>
> In addition to the fp64 failures and assertions described above, I
> see the following regressions when I run piglit:
>
> piglit.spec.arb_tessellation_shader.execution (5 tests, SKL, IVB,
> HSW, BYT, BSW, BDW)
>

This is weird for gen < 8... Were you running them with
MESA_EXTENSION_OVERRIDE set? That is not supported on gen < 8. For
gen8+, see below.

> These tests give the same assertion as most of the fp64 tests:
> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp:626: int
> brw::type_size_vec4(const glsl_type*): Assertion `!"not reached"'
> failed.
>

This is expected because there is no fp64 support for the vec4 backend
in this patch series.

> piglit.shaders.shadersource-no-compile (all platforms)
>
> Fails with "Failed to link: error: linking with uncompiled shader"
>

Compared with the master HEAD used for part 1 (c750029), this is not a
regression because it fails there too. It seems this was fixed
recently.

> Sam, are you able to reproduce my results?
>

About this:
piglit.spec.glsl-1_10.compiler.vector-dereference-in-dereference.frag
asserts with "glslparsertest:
src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp:422: virtual
ir_visitor_status
ir_channel_expressions_visitor::visit_leave(ir_assignment*): Assertion
`!"should have been lowered"' failed."

This is not a regression against the reference we used (c750029)
because it fails there too.

Can you compare your piglit results using c750029 as the reference?

Thanks a lot!

Sam

>> A branch with this series is available for testing here:
>>
>> $ git clone -b i965-fp64-scalar-backend-part-1 https://github.com/Igalia/mesa.git
>>
>> You will have to enable the extension with:
>>
>> $ export MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader_fp64
>>
>> The full scalar fp64 implementation, containing also the fixes
>> to copy-propagation as well as ubo/ssbo and our local fix for the
>> SIMD lowering pass is available here:
>>
>> git clone -b i965-fp64 https://github.com/Igalia/mesa.git
>>
>> And for the adventurous, there is also a work-in-progress branch
>> that adds scalar support for HSW here:
>>
>> git clone -b i965-fp64-gen7 https://github.com/Igalia/mesa.git
>>
>> Thanks,
>>
>> Sam
>>
>>
>> Connor Abbott (33):
>>   i965: use double lowering pass
>>   i965: use pack/unpackDouble lowering
>>   i965/disasm: fix disasm of 3-src doubles
>>   i965/eu: allow doubles in math instructions
>>   i965: add brw_imm_df
>>   i965: add support for getting/setting DF immediates
>>   i965: add support for disassembling DF immediates
>>   i965/eu: add support for DF immediates
>>   i965: fix brw_negate_immediate() for doubles
>>   i965: fix is_zero(), is_one() and is_negative_one() for doubles
>>   i965: fixup uniform setup for doubles
>>   i965/fs: print writemask_all when it's enabled
>>   i965/fs: use the NIR bit size when creating registers
>>   i965/fs: don't propagate 64-bit immediates
>>   i965/fs: add support for printing double immediates
>>   i965/fs: always pass the bitsize to brw_type_for_nir_type()
>>   i965/fs: add a stride helper
>>   i965/fs: add PACK opcode
>>   i965/fs: add a pass for lowering PACK opcodes
>>   i965/fs/nir: translate double pack/unpack
>>   i965/fs: fix type_size() for doubles
>>   i965/fs: handle uniforms in byte_offset()
>>   i965/fs: use byte_offset() in offset() for uniforms
>>   i965/fs: fix assign_constant_locations() for doubles
>>   i965/fs: generalize SIMD16 interference workaround
>>   i965/fs: extend exec_size halving in the generator
>>   i965/fs: fix compares for doubles
>>   i965/fs: fix regs_read() for uniforms
>>   i965/fs: fix is_copy_payload() for doubles
>>   i965/fs: fix regs_written in LOAD_PAYLOAD for doubles
>>   i965/fs: fix dst width calculation in CSE
>>   i965/fs: add a pass for legalizing d2f
>>   i965/fs: add support for f2d and d2f
>>
>> Iago Toral Quiroga (15):
>>   i965: fix brw_saturate_immediate() for doubles
>>   i965: fix brw_abs_immediate() for doubles
>>   i965: two-argument instructions can only use 32-bit immediates
>>   i965/fs: optimize pack double
>>   i965/fs: optimize unpack double
>>   i965/fs: handle fp64 opcodes in brw_do_channel_expressions
>>   i965/fs: We only support 32-bit integer ALU operations for now
>>   i965/fs: add null_reg_df
>>   i965/fs: implement fsign() for doubles
>>   i965/fs: implement d2b
>>   i965/fs: implement d2i and d2u
>>   i965/fs: implement i2d and u2d
>>   i965/fs: rename our lower_d2f pass to lower_d2x
>>   i965/fs/lower_simd_width: Fix registers written for split instructions
>>   i965/fs: recognize writes with a subreg_offset > 0 as partial
>>
>> Samuel Iglesias Gonsálvez (7):
>>   i965: enable lrp lowering for doubles
>>   vc4: lower lrp when operating with double operands
>>   freedreno/ir3: lower lrp when operating with double operands
>>   i965/fs: align access to double-based uniforms in push constant buffer
>>   i965/fs: demote_pull_constants() did not take into account double types
>>   i965/fs: take into account doubles when calculating read_size for MOV_INDIRECT
>>   i965/fs: fix MOV_INDIRECT exec_size for doubles
>>
>> Topi Pohjolainen (4):
>>   i965: Lower DFRACEXP/DLDEXP
>>   i965: Determine size of double precision float register
>>   i965: Tell backend register about double precision type
>>   i965/eu: Allow 3-src float ops with doubles
>>
>> src/gallium/drivers/freedreno/ir3/ir3_nir.c | 1 +
>> src/gallium/drivers/vc4/vc4_program.c | 1 +
>> src/mesa/drivers/dri/i965/Makefile.sources | 2 +
>> src/mesa/drivers/dri/i965/brw_compiler.c | 2 +
>> src/mesa/drivers/dri/i965/brw_compiler.h | 8 +
>> src/mesa/drivers/dri/i965/brw_defines.h | 9 +
>> src/mesa/drivers/dri/i965/brw_disasm.c | 3 +-
>> src/mesa/drivers/dri/i965/brw_eu_emit.c | 60 +++--
>> src/mesa/drivers/dri/i965/brw_fs.cpp | 106 ++++++--
>> src/mesa/drivers/dri/i965/brw_fs.h | 6 +-
>> src/mesa/drivers/dri/i965/brw_fs_builder.h | 15 +-
>> .../dri/i965/brw_fs_channel_expressions.cpp | 23 +-
>> .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 3 +
>> src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 3 +-
>> src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 16 +-
>> src/mesa/drivers/dri/i965/brw_fs_lower_d2x.cpp | 75 ++++++
>> src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp | 59 +++++
>> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 287 ++++++++++++++++++---
>> src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 67 +++--
>> src/mesa/drivers/dri/i965/brw_inst.h | 25 ++
>> src/mesa/drivers/dri/i965/brw_ir_fs.h | 14 +-
>> src/mesa/drivers/dri/i965/brw_link.cpp | 1 +
>> src/mesa/drivers/dri/i965/brw_nir.c | 10 +
>> src/mesa/drivers/dri/i965/brw_nir_uniforms.cpp | 7 +-
>> src/mesa/drivers/dri/i965/brw_program.c | 1 +
>> src/mesa/drivers/dri/i965/brw_reg.h | 10 +
>> src/mesa/drivers/dri/i965/brw_shader.cpp | 73 ++++--
>> src/mesa/drivers/dri/i965/brw_shader.h | 1 +
>> src/mesa/drivers/dri/i965/brw_wm.c | 2 +
>> src/mesa/drivers/dri/i965/gen6_constant_state.c | 12 +-
>> 30 files changed, 773 insertions(+), 129 deletions(-)
>> create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_d2x.cpp
>> create mode 100644 src/mesa/drivers/dri/i965/brw_fs_lower_pack.cpp
>>
>> -- 2.5.0
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBCAAGBQJXKHjdAAoJEH/0ujLxfcND4fEQAK/OP5lLw+/Ejx1vsn0ddAVb
yxa17GZjTG9QfdhTFGk9I8cqDQfz8Ydd/WO1/YbcYBtE38rY2QFq4fMPLtCF8V2Q
2yXjru4oGoR6pyQuqmblu27Xo6OGTzMcX7xkuxene7nx+fL1TicLlZI5znAxX+uF
gRaLJXYYt1Kbnww4KKirzZIak6K2q+miSSTG+aD0bkJriAabIGjHzLuxyJ849Z47
mL8B1Waeu+SUAibI0U2vwuESKfy7Zanb7huuf17l+bzaH6cXudixdoDpraE0xyxx
F6pqtPxzgqPEDmV6okj4dAKCNHu7d81+0sO0guF7StWzBzLFOfhTeZY3FwJEWbEV
lEVTuVUwCtirA8sVv4Ng6NARi08dd35q7UeRQ/uN245mQfRStXmzhugH+JEhGdJU
QIgweuQkqzpfjyd9ptxj26YfkO1SDvdcg2jqfO2fall336PMyRWq+R67a/pUbquO
hY8TSmo3uugREMWm+eIBdyDEdPjwrtYyU0zCqr7oLV3/2KDFdbyr5U0KAfCMu3Hz
/kfQu80l69MsjUUGU7nTjVAhhax8cdCg+i0LLPLxFMKLLkWzhawD0tDN8F//5V3i
0aOEJapaCLLBFxSrAY0MmSsLNUSbDRoty9PYorpvcyGnTGAMIDWZX9SB3jqjrVf1
SnDZF7fDigfSicd0EdyE
=RdmR
-----END PGP SIGNATURE-----
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev