Jason Ekstrand <ja...@jlekstrand.net> writes:
> On Tue, May 19, 2015 at 5:42 AM, Francisco Jerez <curroje...@riseup.net>
> wrote:
>> Jason Ekstrand <ja...@jlekstrand.net> writes:
>>
>>> On Mon, May 18, 2015 at 10:34 AM, Francisco Jerez <curroje...@riseup.net>
>>> wrote:
>>>>[...]
>>>> I've given this idea a shot. Can you have a look at the
>>>> image-load-store-lower branch of my tree [1]? It's just a quick and
>>>> dirty proof of concept, so don't bother to review it carefully, just let
>>>> me know if you agree with the general design before I spend more time on
>>>> it.
>>>>
>>>> [1]
>>>> http://cgit.freedesktop.org/~currojerez/mesa/log/?h=image-load-store-lower
>>>
>>> I took a look at it. I think patch 3 "Add pass to lower opcodes with
>>> unsupported SIMD width." is more-or-less exactly what I'm talking
>>> about. What I don't understand is the stuff about split payloads.
>>> While I think we *might* be able to split a payload it seems dangerous
>>> and like something we shouldn't be doing.
>>
>> Dangerous how? Can you elaborate?
>
> It is not always the case that if you just leave the header alone and
> split the others that you will get the payload you want for SIMD8.
> More in a moment.
>
>>> This is where the "logical" opcodes I mentioned come into play. I
>>> think there has been some miscommunication there; perhaps I didn't
>>> explain myself very well. Allow me to be more explicit; I'll use
>>> image loads for my example.
>>>
>>> 1) We would add an opcode SHADER_IMAGE_LOAD_LOGICAL (or some other
>>> name) that takes 4 arguments: image, address, format, and dims just
>>> like the emit_image_load helper.
>>> 2) Instead of calling the helper, the visitor would just emit
>>> SHADER_IMAGE_LOAD_LOGICAL instruction with those arguments.
>>> 3) We then run the splitting pass which can easily split the new load
>>> instruction since no payloads are involved.
>>> 4) We then have a lowering pass which knows how to turn
>>> SHADER_IMAGE_LOAD_LOGICAL into an actual load including the payload,
>>> pixel mask, and whatever other fiddly bits there are.
>>>
>>> Steps (1) and (2) may not be quite right (you'll have to help me out
>>> here). We may want to keep emit_image_load so that it can do format
>>> conversion and emit an untyped logical instruction. However, in any
>>> case, the logical instruction does not have any payload sources if we
>>> can at all help it.
>>>
>>> Does that make more sense? Is there something I'm missing?
>>
>> I don't think that a high-level "image load" opcode would be of much use
>> in the back-end IR, the hardware can only do a number of untyped and typed
>> surface operations, and we probably want to represent them as such.
>>
>> My _SPLIT opcodes are roughly the same as the _LOGICAL opcodes you
>> describe -- as far as the visitor and optimization passes are concerned,
>> they both behave as a normal opcode taking an address, surface,
>> dimensions and size as separate arguments, the main difference is that
>> the lowering to a send-message-style opcode (your step 4) is fully
>> deterministic, as the layout of the message payload is inferred from the
>> source_is_payload(i) and regs_read(i) instruction queries. This has two
>> obvious advantages:
>>
>> 1/ The same lowering logic can be reused for *all* send-message opcodes
>>    making use of this infrastructure, so there is no need to implement
>>    ad-hoc lowering logic for each message, which seemed like the
>>    greatest annoyance of your proposal.
>
> The fact that you can do that for untyped reads/writes is great. It
> means we should only need one lowering function for them.
> Unfortunately, other messages such as FB writes aren't going to be
> quite so simple. I'm not sure what texturing will look like but I'll
> hazard a guess that they won't be as trivial either.
>
No, FB writes and texturing both fit under the same framework just fine.
I'll port them if people consider it useful.
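
To make the "fully deterministic" lowering a bit more concrete, here is a minimal sketch of the idea, not the code from the branch. Source, SplitInst, PayloadCopy and lower_to_payload are hypothetical names made up for this illustration; the is_payload and regs_read fields only stand in for the source_is_payload(i) and regs_read(i) queries, whose real signatures are not shown in this thread. The assumption is that payload sources are simply laid out back to back in source order.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one source operand of a _SPLIT-style opcode.
// This is NOT Mesa's fs_reg; it only carries what the sketch needs.
struct Source {
    int first_reg;    // first virtual register holding the value
    int regs_read;    // registers read (stand-in for regs_read(i))
    bool is_payload;  // stand-in for source_is_payload(i)
};

// Hypothetical instruction that keeps its arguments as separate sources.
struct SplitInst {
    std::vector<Source> sources;
};

// One register copy needed to assemble the message payload.
struct PayloadCopy {
    int src_reg;     // register to copy from
    int dst_offset;  // register offset inside the payload
};

// Generic lowering: walk the sources in order and pack every payload
// source contiguously. Nothing here depends on which message is sent.
std::vector<PayloadCopy> lower_to_payload(const SplitInst &inst)
{
    std::vector<PayloadCopy> copies;
    int offset = 0;
    for (const Source &src : inst.sources) {
        if (!src.is_payload)
            continue;  // e.g. a surface index that stays a plain argument
        for (int r = 0; r < src.regs_read; r++)
            copies.push_back({src.first_reg + r, offset++});
    }
    return copies;
}
```

With one non-payload source followed by payload sources of 2 and 4 registers, the sketch yields six copies at payload offsets 0 through 5. The same loop would serve any message whose layout can be described by those two per-source queries.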
> In other words, while it works nicely for those opcodes, I wouldn't
> bother building a lot of infrastructure in the compiler for it unless
> it really saves you something for the untyped surface read/writes.
> The texturing and fb-write code will probably have to be custom. That
> said, we already have that custom code written; we just need to change
> it from emit_single_fb_write to lower_fb_write and make it use the
> builder.
>
>> 2/ It could make the transition easier to Gen9 split send messages, as
>>    we could just change the one lowering pass to emit instructions with
>>    two partially assembled payload sources and let the hardware do the
>>    rest, in a way transparent to the visitor code making use of this
>>    infrastructure.
>>
>> By doing this I can also easily avoid defining the array_reg stuff
>> others seemed to disagree with for some reason, although personally I
>> consider this more an obfuscation than an advantage (sigh).
>
> I understand. You may not like it, but it's a path towards getting
> things merged.
> --Jason
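
For reference, the SIMD-width splitting step both sides of this thread rely on (step 3 of the quoted list) can be sketched roughly as follows. This is only a toy model under stated assumptions: LogicalInst and split_to_width are hypothetical names, not identifiers from the branch, and the sketch assumes the instruction carries no assembled payload, so splitting only means narrowing the execution size and advancing the first channel.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a payload-free instruction in the back-end IR.
struct LogicalInst {
    int opcode;         // illustrative opcode number
    int exec_size;      // number of SIMD channels covered
    int first_channel;  // first channel handled by this instruction
};

// Split an instruction that is wider than the hardware supports into
// equally sized pieces of max_width channels each.
std::vector<LogicalInst> split_to_width(const LogicalInst &inst, int max_width)
{
    assert(max_width > 0);
    std::vector<LogicalInst> out;
    if (inst.exec_size <= max_width) {
        out.push_back(inst);  // already narrow enough, leave untouched
        return out;
    }
    // Assumption of this sketch: widths divide evenly (e.g. 16 into 8).
    assert(inst.exec_size % max_width == 0);
    for (int c = 0; c < inst.exec_size; c += max_width)
        out.push_back({inst.opcode, max_width, inst.first_channel + c});
    return out;
}
```

Calling split_to_width on a SIMD16 instruction with max_width 8 gives two SIMD8 instructions covering channels 0-7 and 8-15. Once a header or other payload has already been assembled into the sources, a split this simple is no longer obviously correct, which is the concern raised about split payloads above.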
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev