Jason Ekstrand <ja...@jlekstrand.net> writes: > On Dec 14, 2015 6:24 AM, "Francisco Jerez" <curroje...@riseup.net> wrote: >> >> Jason Ekstrand <ja...@jlekstrand.net> writes: >> >> > On Dec 11, 2015 5:44 AM, "Francisco Jerez" <curroje...@riseup.net> > wrote: >> >> >> >> Jason Ekstrand <ja...@jlekstrand.net> writes: >> >> >> >> > On Dec 10, 2015 6:58 AM, "Francisco Jerez" <curroje...@riseup.net> >> > wrote: >> >> >> >> >> >> Jason Ekstrand <ja...@jlekstrand.net> writes: >> >> >> >> >> >> > We aren't using it anymore. >> >> >> >> >> >> It seems useful to me to be able to represent indirect access as > part >> > of >> >> >> any instruction source or destination register. >> >> >> >> >> >> The following: >> >> >> >> >> >> | mov_indirect g0, g1, a0 >> >> >> | foo g2, g0 >> >> >> >> >> >> and the converse case with indirect destination offset (which you > don't >> >> >> seem to represent currently) can be implemented by the hardware more >> >> >> efficiently using a single instruction in certain cases. The > current >> > IR >> >> >> is able to represent what the hardware can do, but supporting the >> >> >> MOV_INDIRECT instruction only would force us to keep the indirection >> >> >> separate from the instruction that uses it, so it seems like a less >> >> >> expressive representation to me than the current approach, unless >> > you're >> >> >> willing to add _INDIRECT variants of most hardware opcodes. >> >> > >> >> > Yes and, mostly, no. Yes, you can put an indirect on almost anything >> > but >> >> > it has substantial restrictions: >> >> > >> >> Yes, I'm aware of the restrictions of indirect addressing on Gen >> >> hardware. The fact that reladdr can represent addressing modes which >> >> aren't directly implemented in the hardware doesn't invalidate the >> >> abstraction, nor implies that optimization and translation passes must >> >> be made aware of such restrictions. It just means that our abstraction >> >> is a superset of the hardware indirect addressing scheme, which is fine >> >> given a lowering pass to convert indirect addressing into a legal form. >> >> >> >> Your proposal instead goes the opposite direction and replaces the >> >> preexisting abstraction with a new abstraction which can only represent >> >> a tiny subset of the functionality of hardware instructions, which I >> >> don't think is acceptable for a low-level IR. >> >> >> >> > 1) Destination indirects must be uniform (I'm only 95% sure this is > the >> >> > case) >> >> > >> >> > 2) We only have 8 address subregisters on HSW and prior and 16 on BDW >> > which >> >> > means: >> >> > >> >> >> >> That's the kind of thing that SIMD lowering would be able to take care >> >> of easily. >> > >> > Yes, and we use it for MOV_INDIRECT. >> > >> Right, and there's no reason why it couldn't be used for reladdr in the >> same way to handle the above restrictions. >> >> >> > a) All indirects must be uniform OR >> >> > >> >> > b) We must be on Broadwell, have only two indirects, and split > the >> >> > instruction OR >> >> > >> >> > c) We can only have one indirect (and still split the > instruction on >> >> > HSW) >> >> > >> >> > So, yes, "everthing can be uniform", but not in real life. >> >> >> >> > The reladdr abstraction, while maybe ok for a high-level IR, doesn't >> >> > accurately represent the hardware at all because it can't represent >> >> > any of the restrictions without piles of helper functions and checks. >> >> >> >> The reladdr abstraction can represent the full flexibility of the >> >> hardware, while MOV_INDIRECT cannot, that makes reladdr a more accurate >> >> low-level representation of the program. For that reason I'd argue the >> >> exact opposite: MOV_INDIRECT would be a great abstraction and a welcome >> >> simplification for a high-level IR (e.g. NIR), in which nothing is lost >> >> by sacrificing the expressiveness of individual instructions in favour >> >> of instruction composition, because the back-end can always combine >> >> multiple high-level instructions into a single low-level instruction >> >> where the hardware ISA allows, as long as the low-level IR is able to >> >> represent the full functionality of hardware instructions (IOW the >> >> mapping between low-level IR instructions and hardware instructions is >> >> surjective), which is further from being the case after this series. >> > >> > Flexibility and expressiveness isn't always a good thing. >> >> Expressiveness is a good thing as long as it's required to represent the >> capabilities of the hardware. > > As long as it's required to represent those capabilities that we care > about, yes. However, the hardware can do many things that either aren't > useful for us or take so much effort for the little benefit you gain, that > it's not worth it. I'm arguing that saving a single MOV in the few places > that we do in directs isn't worth the complexity. > >> > Yes, reladdr is more "expressive", but it's not getting used because >> > that expressiveness makes it a pain. >> >> AFAIK nobody has bothered to implement it completely down to the >> back-end generator so it's not a wonder that it's not extensively used. >> I'd like to see some evidence that it would actually be a pain >> (e.g. some proof of concept implementation) before we give up and commit >> to an inherently more limited model. If you already have such a proof >> of concept, please share it. If you don't and are fully convinced that >> it will be a pain I suggest you hold up this series for a while and let >> me write the proof of concept -- Doing something with the expectation >> that it's going to be a pain is the best way to make sure it actually >> becomes a pain. Worst-case scenario you're right and it does become a >> pain, at which point I come back and review this series. Best-case >> scenario you're wrong and we don't give up useful functionality of the >> IR based on an expectation. > > Do I have a proof-of-concept in code, no. However, I've run through it in > my head and I have a pretty good idea what it would look like.
Ah, that's what I guessed. I don't think that your previous claim that supporting indirect addressing based on the current reladdr approach would require rewriting optimization passes to be recursive demonstrates a particularly clear notion of what it could look like. > You are free to go off and do it if you don't believe me, but I don't > really want to hold things up while you do. If you go off and > demonstrate that reladdr really is better, its not that hard to > revert. > Whether it will be hard to revert or not depends on how much of the code added in the meantime assumes that reladdr is no longer a thing, and you cannot give any guarantees about that. I don't have any obligation to spend time substantiating or refuting your assumption, and I wont if your plan is to make things unnecessarily difficult for me in the meantime. > One other side-note: One of the substantial advantages to this series is > the removal of the uniform size arrays. If you do decide to make a > different prototype, please make that a major design point. When we get > uniforms in from SPIR-V they may come in std140 packed and with just an > offset and maybe range. The size arrays are implicitly tied to variables. > I want them gone. > >> > You could say it is getting used, but it's current use is just an >> > intermediate to something less expressive than MOV_INDIRECT. In that >> > sense, we are expressing *more* with MOV_INDIRECT than we were before >> > since we are also expressing in directing on registers and not just >> > pull constants. >> > >> Yes, but MOV_INDIRECT is a dead end in the long-term, and the same >> functionality you implement here could have been implemented based on >> the preexisting model. >> >> > Also, while reladdr is "expressive", it doesn't actually represent the >> > hardware. If we wanted that, we would expose the address register file >> > (which I did consider doing). However, that comes with it's own set of >> > issues. >> > >> It represents the hardware more accurately than MOV_INDIRECT in the >> following sense: It allows us to represent a wide subset of the indirect >> addressing capabilities of the hardware down to the generator, on any >> instruction, source or destination we can possibly address indirectly. >> MOV_INDIRECT on the other hand allows us to represent a MOV instruction >> with indirectly-addressed source *only*. >> >> >> > The MOV_INDIRECT instruction, on the other hand, is something we can >> >> > always do because it falls in the "one indirect" case above. The >> >> > reladdr abstraction also suffers from the recursive reladdr problem >> >> > which MOV_INDIRECT neatly side-steps by design. >> >> >> >> I don't think the reladdr "problem" requires any fundamentally more >> >> difficult logic in optimization passes than what you need in order to >> >> handle MOV_INDIRECT instructions reasonably. >> > >> > No, it just requires rewriting them because iterating over sources just >> > became recursive. >> > >> That's quite an overstatement, most optimization passes only need to >> care about the top-level register of a source or destination, because >> wherever a reladdr is present it's not statically determined so you're >> unlikely to be able to optimize through it anyway regardless of whether >> you recurse or not. Cases where you actually want to iterate over each >> register used by an instruction are a minority and don't require >> fundamentally changing the optimization pass, the iteration can be >> factored out as in the straightforward for_each_use() helper I >> originally implemented to solve a different problem but which happens to >> handle arbitrarily-nested reladdr as one would expect. >> >> >> > We can't even use reladdr past initial NIR -> FS stage because none > of >> >> > the optimization passes know what to do with it (and no, I don't want >> >> > to add reladdr support to any of them). >> >> > >> >> > Yes, removing reladdr makes the IR "less expressive", but we're not >> > really >> >> > using that expressiveness nor is reladdr the abstraction we want. > If we >> >> > want to start taking more advantage of relative addressing in the >> > future, >> >> > we can come up with a better abstraction then. >> >> > >> >> > --Jason >> >> > >> >> >> > --- >> >> >> > src/mesa/drivers/dri/i965/brw_fs.cpp | 7 +------ >> >> >> > src/mesa/drivers/dri/i965/brw_ir_fs.h | 5 +---- >> >> >> > 2 files changed, 2 insertions(+), 10 deletions(-) >> >> >> > >> >> >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp >> >> > b/src/mesa/drivers/dri/i965/brw_fs.cpp >> >> >> > index 7cc03c5..786c5fb 100644 >> >> >> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp >> >> >> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp >> >> >> > @@ -433,7 +433,6 @@ fs_reg::fs_reg(struct ::brw_reg reg) : >> >> >> > { >> >> >> > this->reg_offset = 0; >> >> >> > this->subreg_offset = 0; >> >> >> > - this->reladdr = NULL; >> >> >> > this->stride = 1; >> >> >> > if (this->file == IMM && >> >> >> > (this->type != BRW_REGISTER_TYPE_V && >> >> >> > @@ -448,7 +447,6 @@ fs_reg::equals(const fs_reg &r) const >> >> >> > { >> >> >> > return (this->backend_reg::equals(r) && >> >> >> > subreg_offset == r.subreg_offset && >> >> >> > - !reladdr && !r.reladdr && >> >> >> > stride == r.stride); >> >> >> > } >> >> >> > >> >> >> > @@ -4716,9 +4714,7 @@ >> > fs_visitor::dump_instruction(backend_instruction >> >> > *be_inst, FILE *file) >> >> >> > break; >> >> >> > case UNIFORM: >> >> >> > fprintf(file, "u%d", inst->src[i].nr + >> >> > inst->src[i].reg_offset); >> >> >> > - if (inst->src[i].reladdr) { >> >> >> > - fprintf(file, "+reladdr"); >> >> >> > - } else if (inst->src[i].subreg_offset) { >> >> >> > + if (inst->src[i].subreg_offset) { >> >> >> > fprintf(file, "+%d.%d", inst->src[i].reg_offset, >> >> >> > inst->src[i].subreg_offset); >> >> >> > } >> >> >> > @@ -4829,7 +4825,6 @@ >> >> > fs_visitor::get_instruction_generating_reg(fs_inst *start, >> >> >> > { >> >> >> > if (end == start || >> >> >> > end->is_partial_write() || >> >> >> > - reg.reladdr || >> >> >> > !reg.equals(end->dst)) { >> >> >> > return NULL; >> >> >> > } else { >> >> >> > diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h >> >> > b/src/mesa/drivers/dri/i965/brw_ir_fs.h >> >> >> > index c3eec2e..e4f20f4 100644 >> >> >> > --- a/src/mesa/drivers/dri/i965/brw_ir_fs.h >> >> >> > +++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h >> >> >> > @@ -58,8 +58,6 @@ public: >> >> >> > */ >> >> >> > int subreg_offset; >> >> >> > >> >> >> > - fs_reg *reladdr; >> >> >> > - >> >> >> > /** Register region horizontal stride */ >> >> >> > uint8_t stride; >> >> >> > }; >> >> >> > @@ -136,8 +134,7 @@ component(fs_reg reg, unsigned idx) >> >> >> > static inline bool >> >> >> > is_uniform(const fs_reg ®) >> >> >> > { >> >> >> > - return (reg.stride == 0 || reg.is_null()) && >> >> >> > - (!reg.reladdr || is_uniform(*reg.reladdr)); >> >> >> > + return (reg.stride == 0 || reg.is_null()); >> >> >> > } >> >> >> > >> >> >> > /** >> >> >> > -- >> >> >> > 2.5.0.400.gff86faf >> >> >> > >> >> >> > _______________________________________________ >> >> >> > mesa-dev mailing list >> >> >> > mesa-dev@lists.freedesktop.org >> >> >> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev