Patches 23-26, 28, 30-35, 37-38, 40, (41 gets killed later so I didn't review it), 42-44 are
Reviewed-by: Connor Abbott <cwabbo...@gmail.com> I'm going to bed now, but I'll try to do some more later. On Tue, Dec 16, 2014 at 1:04 AM, Jason Ekstrand <ja...@jlekstrand.net> wrote: > NIR (pronounced "ner") is a new IR (internal representation) for the Mesa > shader compiler that will sit between the old IR (GLSL IR) and back-end > compilers. The primary purpose of NIR is to be more efficient for doing > optimizations and generate better code for the back-ends. We have a lot of > optimizations implemented in GLSL IR right now. However, they still > generate fairly bad code primarily because its tree-based structure makes > writing good optimizations difficult. For this reason, we have implemented > a lot of optimizations in the i965 back-end compilers just to fix up the > code we get from GLSL IR. The "proper fix" to this is to implement a > better high-level IR; enter NIR. > > Most of the initial work on NIR including setting up common data > structures, helper methods, and a few basic passes was by Connor Abbot who > interned with us over the summer. Connor did a fantastic job, but there is > still a lot left to be done. I've spent the last two months trying to fill > in the pieces that we need in order to get NIR off the ground. At this > point, we now have compitent in and out of SSA passes, are at zero piglit > regressions for i965 SIMD8 fragment shaders, and the shader-db numbers > aren't terrible. > > This is still a bit experimental. I have been testing only on HSW but it > should work ok on SNB and later. Eventually, once we get booleans fixed > up, it should work fine on older chips as well. It also doesn't yet > support SIMD16, so performance won't be that great. That said, I think we > are at the point now where we should try and land this and I can stop > developing in my masive private branch. Since this isn't quite ready for > prime-time yet, using it requires setting the INTEL_USE_NIR environment > variable. > > A few key points about NIR: > > 1. It is primarily an SSA-based IR. > 2. It supports source/destination-modifiers and swizzles/*write-masks. > 3. Standard GPU operations such as sin() and fmad() are first-class ALU > operations, not intrinsics. > 4. GLSL concepts like inputs, outputs, uniforms, etc. are built into the > IR so we can do proper analysis on them. > 5. Even though it's SSA, it still has a concept of registers and > write-masks in the core IR data structures. This means we can generate > code that is much closer to what backends want. > 6. Control flow is structured explicitly in the IR. > > (*write-masks are not available for SSA values) > > While source/destination modifiers and writemasks/swizzles are not > particularly useful for optimizations, having them represented in the IR > gives us the ability to generate more useful code for backends. > > A few notes about review: > > 1. For those of you who aren't interested in the general compiler, I'm > sorry for the patch-bomb. However, several people have requsted that > we maintain the history of the NIR development since connor's original > drop at the end of the summer. Therefore, while I've squashed several > things, I've tried to leave the diff of what I've done more-or-less > preserved. > > 2. No, this is not LLVM. There was a long-winded discussion about that > when Connor dropped his patches that went a whole lot of nowhere as > usual. I would really prefer if we left that debate alone. If there > must be bikeshedding on the topic, please do so on the cover-letter > e-mail. > > 3. Please keep all bikeshedding about C++, typedefs, etc. on the core > datastructures e-mail. If we need, we can split that off in its own > thread. > > 4. While I welcome review, I don't plan to make non-trivial changes to > specific patches or squash anything beyond what has already been > squashed. I've tried thus far to more-or-less keep the history and I'd > like to continue this if we can. > > 5. Eric Anholt has also written NIR -> TGSI -> NIR passes which will > hopefully get landed soon after NIR initially lands. Exactly how that > all gets hooked up for other gallium drivers beyond vc4 is outside the > scope of this series. > > I have pushed a branch to my personal freedesktop.org account. For certain > types of review, it may be easier to look at the end result rather than the > patches. The branch can be found via freedesktop cgit here: > > http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/nir-v1 > > Last week, I did a presentation for some of the other Intel people to try > and help bring them up to speed on NIR concepts quickly. As part of this, > I typed up a bunch of notes that provide a decent overview of a lot of NIR > concepts. Those notes can be found here: > > http://www.jlekstrand.net/jason/projects/mesa/nir-notes/ > > Happy reviewing! > > P.S. Connor, Don't do too much reviewing before your finals are done. :-P > > Connor Abbott (22): > exec_list: add a list_foreach_typed_reverse() macro > nir: add initial README > nir: add a simple C wrapper around glsl_types.h > nir: add the core datastructures > nir: add core helper functions > nir: add a printer > nir: add a validation pass > nir: add a glsl-to-nir pass > nir: add a pass to lower variables for scalar backends > nir: keep track of the number of input, output, and uniform slots > nir: add a pass to remove unused variables > nir: add a pass to lower sampler instructions > nir: add a pass to lower system value reads > nir: add a pass to lower atomics > nir: add an optimization to turn global registers into local registers > nir: calculate dominance information > nir: add a pass to convert to SSA > nir: add an SSA-based copy propagation pass > nir: add an SSA-based dead code elimination pass > i965/fs: make emit_fragcoord_interpolation() more general > i965/fs: Don't pass through the coordinate type > i965/fs: add a NIR frontend > > Jason Ekstrand (101): > i965/fs: Only use nir for 8-wide non-fast-clear shaders. > i965/fs_nir: Make the sampler register always unsigned > i965/fs_nir: Use the correct types for texture inputs > i965/fs_nir: Use the correct texture offset immediate > Fix what I think are a few NIR typos > Fix up varying pull constants > i965/fs_nir: Add support for sample_pos and sample_id > nir/glsl: Add support for saturate > nir: Add fine and coarse derivative opcodes > nir/glsl: Add support for coarse and fine derivatives > i965/fs_nir: Handle coarse/fine derivatives > nir/lower_atomics: Multiply array offsets by ATOMIC_COUNTER_SIZE > i965/fs_nir: Add atomic counters support > i965/fs: Allow reinterpretation in constant propagation > nir: Add NIR_TRUE and NIR_FALSE constants and use them for boolean > immediates > nir: Add intrinsics to do alternate interpolation on inputs > i965/fs: Don't take an ir_variable for emit_general_interpolation > i965/fs_nir: Don't duplicate emit_general_interpolation > nir: Add a naieve from-SSA pass > nir: Add a lower_vec_to_movs pass > i965/fs_nir: Convert the shader to/from SSA > nir/lower_variables_scalar: Silence a compiler warning > nir: Add a basic metadata management system > nir: Add an assert > nir/foreach_block: Return false if the callback on the last block > fails > nir: Add a foreach_block_reverse function > nir: Add a function to detect if a block is immediately followed by an > if > nir: Make the nir_index_* functions return the nuber of items > nir: Add an SSA-based liveness analysis pass. > nir: Add an initialization function for SSA definitions > nir: Automatically handle SSA uses when an instruction is inserted > nir: Add a function for rewriting all the uses of a SSA def > nir: Add a parallel copy instruction type > nir: Add a function for comparing two sources > nir: Add a better out-of-SSA pass > i965/fs_nir: Do retyping for ALU srouces in get_nir_alu_src > glsl/list: Fix the exec_list_validate function > nir: Validate all lists in the validator > nir/print: Don't reindex things > nir: Differentiate between signed and unsigned versions of find_msb > i965/fs_nir: Validate optimization passes > nir/nir: Fix a bug in move_successors > glsl/list: Add a foreach_list_typed_safe_reverse macro > nir/nir: Use safe iterators when iterating over the CFG > nir/nir: Patch up phi predecessors in move_successors > nir: Add a peephole select optimization > i965/fs_nir: Turn on the peephole select optimization > nir: Validate that the SSA def and register indices are unique > nir: Add a fused multiply-add peephole > nir: Add a basic CSE pass > i965/fs_nir: Add the CSE pass and actually run in a loop > i965/fs_nir: Use an array rather than a hash table for register lookup > i965/fs_nir: Handle SSA constants > i965/fs_nir: Properly saturate multiplies > nir: Add a helper for rewriting an instruction source > nir/lower_samplers: Use the nir_instr_rewrite_src function > nir: Clean up nir_deref helper functions > nir: Make array deref direct vs. indirect an enum > nir: Add a concept of a wildcard array dereference > nir: Use an integer index for specifying structure fields > nir: Don't require a function in ssa_def_init > nir/copy_propagate: Don't cause size mismatches on phi node sources > nir: Validate that the sources of a phi have the same size as the > destination > nir/glsl: Don't allocate a state_slots array for 0 state slots > i965/fs_nir: Don't dump the shader. > nir: Use the enum for the variable mode > nir: Automatically update SSA if uses > nir: Add a copy splitting pass > nir: Add a pass to lower local variable accesses to SSA values > nir: Add a pass to lower local variables to registers > nir: Add a pass for lowering input/output loads/stores > nir: Add a pass to lower global variables to local variables > nir/glsl: Generate SSA NIR > i965/fs_nir: Use the new variable lowering code > nir/validate: Ensure that outputs are write-only and inputs are > read-only > nir: Remove the old variable lowering code > nir: Vectorize intrinsics > nir/validate: Validate intrinsic source/destination sizes > nir: Add gpu_shader5 interpolation intrinsics > nir/glsl: Add support for gpu_shader5 interpolation instrinsics > nir: Add a helper for getting a constant value from an SSA source > i965/fs_nir: Add a has_indirect flag and clean up some of the > input/output code > i965/fs_nir: Implement the ARB_gpu_shader5 interpolation intrinsics > nir: Add neg, abs, and sat opcodes > nir: Add a lowering pass for adding source modifiers where possible > nir: Make the type casting operations static inline functions > nir/glsl: Emit abs, neg, and sat operations instead of source > modifiers > nir: Add an expression matching framework > nir: Add infastructure for generating algebraic transformation passes > nir: Add an algebraic optimization pass > nir: Add a basic constant folding pass > nir: Remove the ffma peephole > nir: Make texture instruction names more consistent > nir: Constant fold array indirects > nir: Use a source for uniform buffer indices instead of an index > nir: Add a sampler index indirect to nir_tex_instr > nir: Rework the way samplers are lowered > i965/fs_nir: Add support for indirect texture arrays > nir/metadata: Rename metadata_dirty to metadata_preserve > nir: Call nir_metadata_preserve more places > nir: Make bcsel a fully vector operation > > src/glsl/Makefile.am | 10 +- > src/glsl/Makefile.sources | 39 +- > src/glsl/list.h | 19 +- > src/glsl/nir/README | 118 ++ > src/glsl/nir/glsl_to_nir.cpp | 1825 +++++++++++++++++ > src/glsl/nir/glsl_to_nir.h | 40 + > src/glsl/nir/nir.c | 2042 > ++++++++++++++++++++ > src/glsl/nir/nir.h | 1433 ++++++++++++++ > src/glsl/nir/nir_algebraic.py | 249 +++ > src/glsl/nir/nir_dominance.c | 298 +++ > src/glsl/nir/nir_from_ssa.c | 859 ++++++++ > src/glsl/nir/nir_intrinsics.c | 49 + > src/glsl/nir/nir_intrinsics.h | 140 ++ > src/glsl/nir/nir_live_variables.c | 282 +++ > src/glsl/nir/nir_lower_atomics.c | 146 ++ > src/glsl/nir/nir_lower_global_vars_to_local.c | 107 + > src/glsl/nir/nir_lower_io.c | 324 ++++ > src/glsl/nir/nir_lower_locals_to_regs.c | 308 +++ > src/glsl/nir/nir_lower_samplers.cpp | 181 ++ > src/glsl/nir/nir_lower_system_values.c | 107 + > src/glsl/nir/nir_lower_to_source_mods.c | 181 ++ > src/glsl/nir/nir_lower_variables.c | 1046 ++++++++++ > src/glsl/nir/nir_lower_vec_to_movs.c | 96 + > src/glsl/nir/nir_metadata.c | 54 + > src/glsl/nir/nir_opcodes.c | 46 + > src/glsl/nir/nir_opcodes.h | 356 ++++ > src/glsl/nir/nir_opt_algebraic.py | 67 + > src/glsl/nir/nir_opt_constant_folding.c | 355 ++++ > src/glsl/nir/nir_opt_copy_propagate.c | 325 ++++ > src/glsl/nir/nir_opt_cse.c | 269 +++ > src/glsl/nir/nir_opt_dce.c | 186 ++ > src/glsl/nir/nir_opt_global_to_local.c | 103 + > src/glsl/nir/nir_opt_peephole_select.c | 214 ++ > src/glsl/nir/nir_print.c | 948 +++++++++ > src/glsl/nir/nir_remove_dead_variables.c | 138 ++ > src/glsl/nir/nir_search.c | 337 ++++ > src/glsl/nir/nir_search.h | 80 + > src/glsl/nir/nir_split_var_copies.c | 225 +++ > src/glsl/nir/nir_to_ssa.c | 660 +++++++ > src/glsl/nir/nir_types.cpp | 143 ++ > src/glsl/nir/nir_types.h | 75 + > src/glsl/nir/nir_validate.c | 912 +++++++++ > src/mesa/drivers/dri/i965/Makefile.sources | 1 + > src/mesa/drivers/dri/i965/brw_fs.cpp | 74 +- > src/mesa/drivers/dri/i965/brw_fs.h | 57 +- > .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 4 +- > src/mesa/drivers/dri/i965/brw_fs_fp.cpp | 32 +- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 1778 +++++++++++++++++ > src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 39 +- > src/mesa/main/bitset.h | 1 + > 50 files changed, 17301 insertions(+), 77 deletions(-) > create mode 100644 src/glsl/nir/README > create mode 100644 src/glsl/nir/glsl_to_nir.cpp > create mode 100644 src/glsl/nir/glsl_to_nir.h > create mode 100644 src/glsl/nir/nir.c > create mode 100644 src/glsl/nir/nir.h > create mode 100644 src/glsl/nir/nir_algebraic.py > create mode 100644 src/glsl/nir/nir_dominance.c > create mode 100644 src/glsl/nir/nir_from_ssa.c > create mode 100644 src/glsl/nir/nir_intrinsics.c > create mode 100644 src/glsl/nir/nir_intrinsics.h > create mode 100644 src/glsl/nir/nir_live_variables.c > create mode 100644 src/glsl/nir/nir_lower_atomics.c > create mode 100644 src/glsl/nir/nir_lower_global_vars_to_local.c > create mode 100644 src/glsl/nir/nir_lower_io.c > create mode 100644 src/glsl/nir/nir_lower_locals_to_regs.c > create mode 100644 src/glsl/nir/nir_lower_samplers.cpp > create mode 100644 src/glsl/nir/nir_lower_system_values.c > create mode 100644 src/glsl/nir/nir_lower_to_source_mods.c > create mode 100644 src/glsl/nir/nir_lower_variables.c > create mode 100644 src/glsl/nir/nir_lower_vec_to_movs.c > create mode 100644 src/glsl/nir/nir_metadata.c > create mode 100644 src/glsl/nir/nir_opcodes.c > create mode 100644 src/glsl/nir/nir_opcodes.h > create mode 100644 src/glsl/nir/nir_opt_algebraic.py > create mode 100644 src/glsl/nir/nir_opt_constant_folding.c > create mode 100644 src/glsl/nir/nir_opt_copy_propagate.c > create mode 100644 src/glsl/nir/nir_opt_cse.c > create mode 100644 src/glsl/nir/nir_opt_dce.c > create mode 100644 src/glsl/nir/nir_opt_global_to_local.c > create mode 100644 src/glsl/nir/nir_opt_peephole_select.c > create mode 100644 src/glsl/nir/nir_print.c > create mode 100644 src/glsl/nir/nir_remove_dead_variables.c > create mode 100644 src/glsl/nir/nir_search.c > create mode 100644 src/glsl/nir/nir_search.h > create mode 100644 src/glsl/nir/nir_split_var_copies.c > create mode 100644 src/glsl/nir/nir_to_ssa.c > create mode 100644 src/glsl/nir/nir_types.cpp > create mode 100644 src/glsl/nir/nir_types.h > create mode 100644 src/glsl/nir/nir_validate.c > create mode 100644 src/mesa/drivers/dri/i965/brw_fs_nir.cpp > > -- > 2.2.0 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev