For my own sake as well as for those reviewing things, I'm keeping both the review/nir-v1 branch and http://patchwork.freedesktop.org/bundle/jekstrand/nir-v1/ up to date as new patches get sent out and Reviewed-by tags get added.
On Tue, Dec 16, 2014 at 10:59 PM, Connor Abbott <cwabbo...@gmail.com> wrote:
>
> Patches 23-26, 28, 30-35, 37-38, 40, (41 gets killed later so I didn't
> review it), 42-44 are
>
> Reviewed-by: Connor Abbott <cwabbo...@gmail.com>
>
> I'm going to bed now, but I'll try to do some more later.
>
> On Tue, Dec 16, 2014 at 1:04 AM, Jason Ekstrand <ja...@jlekstrand.net>
> wrote:
> > NIR (pronounced "ner") is a new IR (internal representation) for the Mesa
> > shader compiler that will sit between the old IR (GLSL IR) and back-end
> > compilers.  The primary purpose of NIR is to be more efficient for doing
> > optimizations and to generate better code for the back-ends.  We have a
> > lot of optimizations implemented in GLSL IR right now.  However, they
> > still generate fairly bad code, primarily because its tree-based structure
> > makes writing good optimizations difficult.  For this reason, we have
> > implemented a lot of optimizations in the i965 back-end compilers just to
> > fix up the code we get from GLSL IR.  The "proper fix" to this is to
> > implement a better high-level IR; enter NIR.
> >
> > Most of the initial work on NIR, including setting up common data
> > structures, helper methods, and a few basic passes, was by Connor Abbott,
> > who interned with us over the summer.  Connor did a fantastic job, but
> > there is still a lot left to be done.  I've spent the last two months
> > trying to fill in the pieces that we need in order to get NIR off the
> > ground.  At this point, we now have competent in and out of SSA passes,
> > are at zero piglit regressions for i965 SIMD8 fragment shaders, and the
> > shader-db numbers aren't terrible.
> >
> > This is still a bit experimental.  I have been testing only on HSW but it
> > should work ok on SNB and later.  Eventually, once we get booleans fixed
> > up, it should work fine on older chips as well.  It also doesn't yet
> > support SIMD16, so performance won't be that great.
> > That said, I think we are at the point now where we should try to land
> > this so that I can stop developing in my massive private branch.  Since
> > this isn't quite ready for prime-time yet, using it requires setting the
> > INTEL_USE_NIR environment variable.
> >
> > A few key points about NIR:
> >
> >  1. It is primarily an SSA-based IR.
> >  2. It supports source/destination-modifiers and swizzles/*write-masks.
> >  3. Standard GPU operations such as sin() and fmad() are first-class ALU
> >     operations, not intrinsics.
> >  4. GLSL concepts like inputs, outputs, uniforms, etc. are built into the
> >     IR so we can do proper analysis on them.
> >  5. Even though it's SSA, it still has a concept of registers and
> >     write-masks in the core IR data structures.  This means we can
> >     generate code that is much closer to what backends want.
> >  6. Control flow is structured explicitly in the IR.
> >
> > (*write-masks are not available for SSA values)
> >
> > While source/destination modifiers and writemasks/swizzles are not
> > particularly useful for optimizations, having them represented in the IR
> > gives us the ability to generate more useful code for backends.
> >
> > A few notes about review:
> >
> >  1. For those of you who aren't interested in the general compiler, I'm
> >     sorry for the patch-bomb.  However, several people have requested that
> >     we maintain the history of the NIR development since Connor's original
> >     drop at the end of the summer.  Therefore, while I've squashed several
> >     things, I've tried to leave the diff of what I've done more-or-less
> >     preserved.
> >
> >  2. No, this is not LLVM.  There was a long-winded discussion about that
> >     when Connor dropped his patches that went a whole lot of nowhere as
> >     usual.  I would really prefer if we left that debate alone.  If there
> >     must be bikeshedding on the topic, please do so on the cover-letter
> >     e-mail.
> >
> >  3. Please keep all bikeshedding about C++, typedefs, etc. on the core
> >     datastructures e-mail.  If we need to, we can split that off into its
> >     own thread.
> >
> >  4. While I welcome review, I don't plan to make non-trivial changes to
> >     specific patches or squash anything beyond what has already been
> >     squashed.  I've tried thus far to more-or-less keep the history and
> >     I'd like to continue this if we can.
> >
> >  5. Eric Anholt has also written NIR -> TGSI -> NIR passes which will
> >     hopefully get landed soon after NIR initially lands.  Exactly how that
> >     all gets hooked up for other gallium drivers beyond vc4 is outside the
> >     scope of this series.
> >
> > I have pushed a branch to my personal freedesktop.org account.  For
> > certain types of review, it may be easier to look at the end result rather
> > than the patches.  The branch can be found via freedesktop cgit here:
> >
> > http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/nir-v1
> >
> > Last week, I did a presentation for some of the other Intel people to try
> > to help bring them up to speed on NIR concepts quickly.  As part of this,
> > I typed up a bunch of notes that provide a decent overview of a lot of NIR
> > concepts.  Those notes can be found here:
> >
> > http://www.jlekstrand.net/jason/projects/mesa/nir-notes/
> >
> > Happy reviewing!
> >
> > P.S. Connor, don't do too much reviewing before your finals are done. :-P
> >
> > Connor Abbott (22):
> >   exec_list: add a list_foreach_typed_reverse() macro
> >   nir: add initial README
> >   nir: add a simple C wrapper around glsl_types.h
> >   nir: add the core datastructures
> >   nir: add core helper functions
> >   nir: add a printer
> >   nir: add a validation pass
> >   nir: add a glsl-to-nir pass
> >   nir: add a pass to lower variables for scalar backends
> >   nir: keep track of the number of input, output, and uniform slots
> >   nir: add a pass to remove unused variables
> >   nir: add a pass to lower sampler instructions
> >   nir: add a pass to lower system value reads
> >   nir: add a pass to lower atomics
> >   nir: add an optimization to turn global registers into local registers
> >   nir: calculate dominance information
> >   nir: add a pass to convert to SSA
> >   nir: add an SSA-based copy propagation pass
> >   nir: add an SSA-based dead code elimination pass
> >   i965/fs: make emit_fragcoord_interpolation() more general
> >   i965/fs: Don't pass through the coordinate type
> >   i965/fs: add a NIR frontend
> >
> > Jason Ekstrand (101):
> >   i965/fs: Only use nir for 8-wide non-fast-clear shaders.
> >   i965/fs_nir: Make the sampler register always unsigned
> >   i965/fs_nir: Use the correct types for texture inputs
> >   i965/fs_nir: Use the correct texture offset immediate
> >   Fix what I think are a few NIR typos
> >   Fix up varying pull constants
> >   i965/fs_nir: Add support for sample_pos and sample_id
> >   nir/glsl: Add support for saturate
> >   nir: Add fine and coarse derivative opcodes
> >   nir/glsl: Add support for coarse and fine derivatives
> >   i965/fs_nir: Handle coarse/fine derivatives
> >   nir/lower_atomics: Multiply array offsets by ATOMIC_COUNTER_SIZE
> >   i965/fs_nir: Add atomic counters support
> >   i965/fs: Allow reinterpretation in constant propagation
> >   nir: Add NIR_TRUE and NIR_FALSE constants and use them for boolean
> >     immediates
> >   nir: Add intrinsics to do alternate interpolation on inputs
> >   i965/fs: Don't take an ir_variable for emit_general_interpolation
> >   i965/fs_nir: Don't duplicate emit_general_interpolation
> >   nir: Add a naieve from-SSA pass
> >   nir: Add a lower_vec_to_movs pass
> >   i965/fs_nir: Convert the shader to/from SSA
> >   nir/lower_variables_scalar: Silence a compiler warning
> >   nir: Add a basic metadata management system
> >   nir: Add an assert
> >   nir/foreach_block: Return false if the callback on the last block
> >     fails
> >   nir: Add a foreach_block_reverse function
> >   nir: Add a function to detect if a block is immediately followed by an
> >     if
> >   nir: Make the nir_index_* functions return the nuber of items
> >   nir: Add an SSA-based liveness analysis pass.
> >   nir: Add an initialization function for SSA definitions
> >   nir: Automatically handle SSA uses when an instruction is inserted
> >   nir: Add a function for rewriting all the uses of a SSA def
> >   nir: Add a parallel copy instruction type
> >   nir: Add a function for comparing two sources
> >   nir: Add a better out-of-SSA pass
> >   i965/fs_nir: Do retyping for ALU srouces in get_nir_alu_src
> >   glsl/list: Fix the exec_list_validate function
> >   nir: Validate all lists in the validator
> >   nir/print: Don't reindex things
> >   nir: Differentiate between signed and unsigned versions of find_msb
> >   i965/fs_nir: Validate optimization passes
> >   nir/nir: Fix a bug in move_successors
> >   glsl/list: Add a foreach_list_typed_safe_reverse macro
> >   nir/nir: Use safe iterators when iterating over the CFG
> >   nir/nir: Patch up phi predecessors in move_successors
> >   nir: Add a peephole select optimization
> >   i965/fs_nir: Turn on the peephole select optimization
> >   nir: Validate that the SSA def and register indices are unique
> >   nir: Add a fused multiply-add peephole
> >   nir: Add a basic CSE pass
> >   i965/fs_nir: Add the CSE pass and actually run in a loop
> >   i965/fs_nir: Use an array rather than a hash table for register lookup
> >   i965/fs_nir: Handle SSA constants
> >   i965/fs_nir: Properly saturate multiplies
> >   nir: Add a helper for rewriting an instruction source
> >   nir/lower_samplers: Use the nir_instr_rewrite_src function
> >   nir: Clean up nir_deref helper functions
> >   nir: Make array deref direct vs. indirect an enum
> >   nir: Add a concept of a wildcard array dereference
> >   nir: Use an integer index for specifying structure fields
> >   nir: Don't require a function in ssa_def_init
> >   nir/copy_propagate: Don't cause size mismatches on phi node sources
> >   nir: Validate that the sources of a phi have the same size as the
> >     destination
> >   nir/glsl: Don't allocate a state_slots array for 0 state slots
> >   i965/fs_nir: Don't dump the shader.
> >   nir: Use the enum for the variable mode
> >   nir: Automatically update SSA if uses
> >   nir: Add a copy splitting pass
> >   nir: Add a pass to lower local variable accesses to SSA values
> >   nir: Add a pass to lower local variables to registers
> >   nir: Add a pass for lowering input/output loads/stores
> >   nir: Add a pass to lower global variables to local variables
> >   nir/glsl: Generate SSA NIR
> >   i965/fs_nir: Use the new variable lowering code
> >   nir/validate: Ensure that outputs are write-only and inputs are
> >     read-only
> >   nir: Remove the old variable lowering code
> >   nir: Vectorize intrinsics
> >   nir/validate: Validate intrinsic source/destination sizes
> >   nir: Add gpu_shader5 interpolation intrinsics
> >   nir/glsl: Add support for gpu_shader5 interpolation instrinsics
> >   nir: Add a helper for getting a constant value from an SSA source
> >   i965/fs_nir: Add a has_indirect flag and clean up some of the
> >     input/output code
> >   i965/fs_nir: Implement the ARB_gpu_shader5 interpolation intrinsics
> >   nir: Add neg, abs, and sat opcodes
> >   nir: Add a lowering pass for adding source modifiers where possible
> >   nir: Make the type casting operations static inline functions
> >   nir/glsl: Emit abs, neg, and sat operations instead of source
> >     modifiers
> >   nir: Add an expression matching framework
> >   nir: Add infastructure for generating algebraic transformation passes
> >   nir: Add an algebraic optimization pass
> >   nir: Add a basic constant folding pass
> >   nir: Remove the ffma peephole
> >   nir: Make texture instruction names more consistent
> >   nir: Constant fold array indirects
> >   nir: Use a source for uniform buffer indices instead of an index
> >   nir: Add a sampler index indirect to nir_tex_instr
> >   nir: Rework the way samplers are lowered
> >   i965/fs_nir: Add support for indirect texture arrays
> >   nir/metadata: Rename metadata_dirty to metadata_preserve
> >   nir: Call nir_metadata_preserve more places
> >   nir: Make bcsel a fully vector operation
> >
> >  src/glsl/Makefile.am                             |   10 +-
> >  src/glsl/Makefile.sources                        |   39 +-
> >  src/glsl/list.h                                  |   19 +-
> >  src/glsl/nir/README                              |  118 ++
> >  src/glsl/nir/glsl_to_nir.cpp                     | 1825 +++++++++++++++++
> >  src/glsl/nir/glsl_to_nir.h                       |   40 +
> >  src/glsl/nir/nir.c                               | 2042 ++++++++++++++++++++
> >  src/glsl/nir/nir.h                               | 1433 ++++++++++++++
> >  src/glsl/nir/nir_algebraic.py                    |  249 +++
> >  src/glsl/nir/nir_dominance.c                     |  298 +++
> >  src/glsl/nir/nir_from_ssa.c                      |  859 ++++++++
> >  src/glsl/nir/nir_intrinsics.c                    |   49 +
> >  src/glsl/nir/nir_intrinsics.h                    |  140 ++
> >  src/glsl/nir/nir_live_variables.c                |  282 +++
> >  src/glsl/nir/nir_lower_atomics.c                 |  146 ++
> >  src/glsl/nir/nir_lower_global_vars_to_local.c    |  107 +
> >  src/glsl/nir/nir_lower_io.c                      |  324 ++++
> >  src/glsl/nir/nir_lower_locals_to_regs.c          |  308 +++
> >  src/glsl/nir/nir_lower_samplers.cpp              |  181 ++
> >  src/glsl/nir/nir_lower_system_values.c           |  107 +
> >  src/glsl/nir/nir_lower_to_source_mods.c          |  181 ++
> >  src/glsl/nir/nir_lower_variables.c               | 1046 ++++++++++
> >  src/glsl/nir/nir_lower_vec_to_movs.c             |   96 +
> >  src/glsl/nir/nir_metadata.c                      |   54 +
> >  src/glsl/nir/nir_opcodes.c                       |   46 +
> >  src/glsl/nir/nir_opcodes.h                       |  356 ++++
> >  src/glsl/nir/nir_opt_algebraic.py                |   67 +
> >  src/glsl/nir/nir_opt_constant_folding.c          |  355 ++++
> >  src/glsl/nir/nir_opt_copy_propagate.c            |  325 ++++
> >  src/glsl/nir/nir_opt_cse.c                       |  269 +++
> >  src/glsl/nir/nir_opt_dce.c                       |  186 ++
> >  src/glsl/nir/nir_opt_global_to_local.c           |  103 +
> >  src/glsl/nir/nir_opt_peephole_select.c           |  214 ++
> >  src/glsl/nir/nir_print.c                         |  948 +++++++++
> >  src/glsl/nir/nir_remove_dead_variables.c         |  138 ++
> >  src/glsl/nir/nir_search.c                        |  337 ++++
> >  src/glsl/nir/nir_search.h                        |   80 +
> >  src/glsl/nir/nir_split_var_copies.c              |  225 +++
> >  src/glsl/nir/nir_to_ssa.c                        |  660 +++++++
> >  src/glsl/nir/nir_types.cpp                       |  143 ++
> >  src/glsl/nir/nir_types.h                         |   75 +
> >  src/glsl/nir/nir_validate.c                      |  912 +++++++++
> >  src/mesa/drivers/dri/i965/Makefile.sources       |    1 +
> >  src/mesa/drivers/dri/i965/brw_fs.cpp             |   74 +-
> >  src/mesa/drivers/dri/i965/brw_fs.h               |   57 +-
> >  .../drivers/dri/i965/brw_fs_copy_propagation.cpp |    4 +-
> >  src/mesa/drivers/dri/i965/brw_fs_fp.cpp          |   32 +-
> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp         | 1778 +++++++++++++++++
> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp     |   39 +-
> >  src/mesa/main/bitset.h                           |    1 +
> >  50 files changed, 17301 insertions(+), 77 deletions(-)
> >  create mode 100644 src/glsl/nir/README
> >  create mode 100644 src/glsl/nir/glsl_to_nir.cpp
> >  create mode 100644 src/glsl/nir/glsl_to_nir.h
> >  create mode 100644 src/glsl/nir/nir.c
> >  create mode 100644 src/glsl/nir/nir.h
> >  create mode 100644 src/glsl/nir/nir_algebraic.py
> >  create mode 100644 src/glsl/nir/nir_dominance.c
> >  create mode 100644 src/glsl/nir/nir_from_ssa.c
> >  create mode 100644 src/glsl/nir/nir_intrinsics.c
> >  create mode 100644 src/glsl/nir/nir_intrinsics.h
> >  create mode 100644 src/glsl/nir/nir_live_variables.c
> >  create mode 100644 src/glsl/nir/nir_lower_atomics.c
> >  create mode 100644 src/glsl/nir/nir_lower_global_vars_to_local.c
> >  create mode 100644 src/glsl/nir/nir_lower_io.c
> >  create mode 100644 src/glsl/nir/nir_lower_locals_to_regs.c
> >  create mode 100644 src/glsl/nir/nir_lower_samplers.cpp
> >  create mode 100644 src/glsl/nir/nir_lower_system_values.c
> >  create mode 100644 src/glsl/nir/nir_lower_to_source_mods.c
> >  create mode 100644 src/glsl/nir/nir_lower_variables.c
> >  create mode 100644 src/glsl/nir/nir_lower_vec_to_movs.c
> >  create mode 100644 src/glsl/nir/nir_metadata.c
> >  create mode 100644 src/glsl/nir/nir_opcodes.c
> >  create mode 100644 src/glsl/nir/nir_opcodes.h
> >  create mode 100644 src/glsl/nir/nir_opt_algebraic.py
> >  create mode 100644 src/glsl/nir/nir_opt_constant_folding.c
> >  create mode 100644 src/glsl/nir/nir_opt_copy_propagate.c
> >  create mode 100644 src/glsl/nir/nir_opt_cse.c
> >  create mode 100644 src/glsl/nir/nir_opt_dce.c
> >  create mode 100644 src/glsl/nir/nir_opt_global_to_local.c
> >  create mode 100644 src/glsl/nir/nir_opt_peephole_select.c
> >  create mode 100644 src/glsl/nir/nir_print.c
> >  create mode 100644 src/glsl/nir/nir_remove_dead_variables.c
> >  create mode 100644 src/glsl/nir/nir_search.c
> >  create mode 100644 src/glsl/nir/nir_search.h
> >  create mode 100644 src/glsl/nir/nir_split_var_copies.c
> >  create mode 100644 src/glsl/nir/nir_to_ssa.c
> >  create mode 100644 src/glsl/nir/nir_types.cpp
> >  create mode 100644 src/glsl/nir/nir_types.h
> >  create mode 100644 src/glsl/nir/nir_validate.c
> >  create mode 100644 src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >
> > --
> > 2.2.0
> >
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>