On Wed, Aug 20, 2014 at 12:17 PM, Tom Stellard <t...@stellard.net> wrote:
> On Tue, Aug 19, 2014 at 05:19:15PM -0700, Connor Abbott wrote:
>> On Tue, Aug 19, 2014 at 3:57 PM, Tom Stellard <t...@stellard.net> wrote:
>> > On Tue, Aug 19, 2014 at 01:37:56PM -0700, Connor Abbott wrote:
>> >> On Tue, Aug 19, 2014 at 11:40 AM, Francisco Jerez <curroje...@riseup.net> wrote:
>> >> > Tom Stellard <t...@stellard.net> writes:
>> >> >
>> >> >> On Tue, Aug 19, 2014 at 11:04:59AM -0400, Connor Abbott wrote:
>> >> >>> On Mon, Aug 18, 2014 at 8:52 PM, Michel Dänzer <mic...@daenzer.net> wrote:
>> >> >>> > On 19.08.2014 01:28, Connor Abbott wrote:
>> >> >>> >> On Mon, Aug 18, 2014 at 4:32 AM, Michel Dänzer <mic...@daenzer.net> wrote:
>> >> >>> >>> On 16.08.2014 09:12, Connor Abbott wrote:
>> >> >>> >>>> I know what you might be thinking right now. "Wait, *another* IR? Don't
>> >> >>> >>>> we already have like 5 of those, not counting all the driver-specific
>> >> >>> >>>> ones? Isn't this stuff complicated enough already?" Well, there are some
>> >> >>> >>>> pretty good reasons to start afresh (again...). In the years we've been
>> >> >>> >>>> using GLSL IR, we've come to realize that, in fact, it's not what we
>> >> >>> >>>> want *at all* to do optimizations on.
>> >> >>> >>>
>> >> >>> >>> Did you evaluate using LLVM IR instead of inventing yet another one?
>> >> >>> >>>
>> >> >>> >>> --
>> >> >>> >>> Earthling Michel Dänzer            |  http://www.amd.com
>> >> >>> >>> Libre software enthusiast          |  Mesa and X developer
>> >> >>> >>
>> >> >>> >> Yes.
>> >> >>> >> See
>> >> >>> >>
>> >> >>> >> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053502.html
>> >> >>> >>
>> >> >>> >> and
>> >> >>> >>
>> >> >>> >> http://lists.freedesktop.org/archives/mesa-dev/2014-February/053522.html
>> >> >>> >
>> >> >>> > I know Ian can't deal with LLVM for some reason. I was wondering if
>> >> >>> > *you* evaluated it, and if so, why you rejected it.
>> >> >>> >
>> >> >>> > --
>> >> >>> > Earthling Michel Dänzer            |  http://www.amd.com
>> >> >>> > Libre software enthusiast          |  Mesa and X developer
>> >> >>>
>> >> >>> Well, first of all, the fact that Ian and Ken don't want to use it
>> >> >>> means that any plan to use LLVM for the Intel driver is dead in the
>> >> >>> water anyways - you can translate NIR into LLVM if you want, but for
>> >> >>> i965 we want to share optimizations between our 2 backends (FS and
>> >> >>> vec4) that we can't do today in GLSL IR so this is what we want to use
>> >> >>> for that, and since nobody else does anything with the core GLSL
>> >> >>> compiler except when they have to, when we start moving things out of
>> >> >>> GLSL IR this will probably replace GLSL IR as the infrastructure that
>> >> >>> all Mesa drivers use. But with that in mind, here are a few reasons
>> >> >>> why we wouldn't want to use LLVM:
>> >> >>>
>> >> >>> * LLVM wasn't built to understand structured CFG's, meaning that you
>> >> >>> need to re-structurize it using a pass that's fragile and prone to
>> >> >>> break if some other pass "optimizes" the shader in a way that makes it
>> >> >>> non-structured (i.e. not expressible in terms of loops and if
>> >> >>> statements). This loss of information also means that passes that need
>> >> >>> to know things like, for example, the loop nesting depth need to do an
>> >> >>> analysis pass whereas with NIR you can just walk up the control flow
>> >> >>> tree and count the number of loops we hit.
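To make the loop-nesting-depth point above concrete, here is a minimal sketch of "walk up the control flow tree and count the loops". It is only an illustration: the CFNode class, its "kind" strings, and loop_depth() are hypothetical names invented for this example and are not NIR's actual data structures or API.

```python
class CFNode:
    """One node of a structured control-flow tree (hypothetical, not NIR's API)."""

    def __init__(self, kind, parent=None):
        self.kind = kind          # "function", "loop", "if", or "block"
        self.parent = parent      # enclosing node, or None for the root
        self.children = []

    def add(self, kind):
        """Create a child node of the given kind and return it."""
        child = CFNode(kind, parent=self)
        self.children.append(child)
        return child


def loop_depth(node):
    """Count enclosing loops by following parent links up to the root."""
    depth = 0
    current = node.parent
    while current is not None:
        if current.kind == "loop":
            depth += 1
        current = current.parent
    return depth


# function { loop { if { loop { block } } } block }
func = CFNode("function")
outer_loop = func.add("loop")
branch = outer_loop.add("if")
inner_loop = branch.add("loop")
inner_block = inner_loop.add("block")
tail_block = func.add("block")

print(loop_depth(inner_block))  # two enclosing loops
print(loop_depth(tail_block))   # no enclosing loop
```

Because the nesting is stored in the tree itself, no separate analysis has to be computed or kept up to date; the cost is one parent-pointer walk per query.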
>> >> >>>
>> >> >>
>> >> >> LLVM has a pass to structurize the CFG. We use it in the radeon
>> >> >> drivers, and it is run after all of the other LLVM optimizations which
>> >> >> have no concept of structured CFG. It's not bug free, but it works really
>> >> >> well even with all of the complex OpenCL kernels we throw at it.
>> >> >>
>> >> >> Your point about losing information when the CFG is de-structurized is
>> >> >> valid, but for things like loop depth, I'm not sure why we couldn't
>> >> >> write an LLVM analysis pass for this (if one doesn't already exist).
>> >> >>
>> >> >
>> >> > I don't think this is such a big deal either. At least the
>> >> > structurization pass used on newer AMD hardware isn't "fragile" in the
>> >> > way you seem to imply -- AFAIK (unlike the old AMDIL heuristic
>> >> > algorithm) it's guaranteed to give you a valid structurized output no
>> >> > matter what the previous optimization passes have done to the CFG,
>> >> > modulo bugs. I admit that the situation is nevertheless suboptimal.
>> >> > Ideally this information wouldn't get lost along the way. For the long
>> >> > term we may want to represent structured control flow directly in the IR
>> >> > as you say, I just don't see how reinventing the IR saves us any work if
>> >> > we could just fix the existing one.
>> >>
>> >> It seems to me that something like how we represent control flow is a
>> >> pretty fundamental part of the IR - it affects any optimization pass
>> >> that needs to do anything beyond adding and removing instructions. How
>> >> would you fix that, especially given that LLVM is primarily designed
>> >> for CPU's where you don't want to be restricted to structured control
>> >> flow at all? It seems like our goals (preserve the structure) conflict
>> >> with the way LLVM has been designed.
>> >>
>> >
>> > I think it's important to distinguish between LLVM IR and the tools
>> > available to manipulate it.
>> > LLVM IR is meant to be a platform
>> > independent program representation. There is nothing about the IR that
>> > would prevent someone from using it for hardware that required structured
>> > control flow.
>>
>> Right - when I said that structured control flow was a fundamental
>> part of the IR, I meant that in the sense that it's a constraint that
>> all optimization passes have to follow. I was also thinking of NIR,
>> where it actually is a fundamental part of the IR datastructures - all
>> control flow consists of a tree of loops, if statements, and basic
>> blocks and there are no jump statements in the IR except for break,
>> continue, and return. There are helpers to mutate the control flow
>> tree (adding an if after an instruction, deleting a loop, etc.) so
>> that you can more or less pretend you're operating on something like
>> GLSL IR, while the CFG is being updated for you, basic blocks are
>> being created and deleted, etc.
>>
>> >
>> > The tools (mainly the optimization passes) are where decisions about
>> > things like preserving structured control flow are made. There are
>> > currently two strategies available for using the tools to produce programs
>> > with structured control flow:
>> >
>> > 1. Use the CFG structurizer pass
>> >
>> > 2. Only use transforms that maintain the structure of the control flow.
>>
>> I'm a little confused about how this strategy would work. I'm assuming
>> that the control flow structure (i.e. the tree of loops and ifs) is
>> stored in some kind of metadata or fake instruction on top of the IR -
>> I haven't looked into this much, so correct me if I'm wrong. If so,
>> wouldn't you still have to make every optimization pass that touches
>> the CFG properly update that metadata to avoid it going stale, since
>> the optimizations themselves are operating on a list of basic blocks
>> which is a little lower-level?
>>
>
> There is no CFG metadata.
> If you want to collect some information about the
> CFG, you would use an analysis pass to do this. For example, LLVM has an
> analysis pass for computing the dominator tree. If an optimization
> wants to use this analysis it would add this analysis as a pass dependency
> and then LLVM would run the dominator tree analysis before the optimization
> pass.
>
> Once the analysis has been run, the result is cached for other passes to use.
> However, the base assumption is that optimization passes invalidate
> all analysis information, so passes are required to report which analysis
> passes or which features of the program are preserved. So, if a pass reports
> that it preserves the CFG, then the dominator tree analysis is still
> considered valid.
>
> This is a high-level overview of how it works, but to get back to your question,
> if you wanted to use strategy number 2, you could just choose to only run
> optimizations that preserved the CFG.
>
> -Tom
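The caching and invalidation scheme Tom describes can be sketched as a toy pass manager. This is a rough model under stated assumptions, not LLVM's real pass manager: Pass, PassManager, preserves_cfg, and the "domtree" analysis are all hypothetical names made up for the example, and every analysis in the toy is assumed to depend only on the CFG.

```python
class Pass:
    """Toy optimization pass (hypothetical; does not mirror LLVM's classes)."""

    def __init__(self, name, requires=(), preserves_cfg=False):
        self.name = name
        self.requires = tuple(requires)     # analyses needed before running
        self.preserves_cfg = preserves_cfg  # what the pass reports it keeps intact

    def run(self, program):
        pass  # a real pass would transform `program` here


class PassManager:
    """Runs passes, caching analysis results until a pass invalidates them."""

    def __init__(self, analyses):
        self.analyses = analyses  # name -> function(program) -> result
        self.cache = {}           # name -> cached result
        self.runs = []            # log of analyses that were actually computed

    def get_analysis(self, name, program):
        if name not in self.cache:
            self.cache[name] = self.analyses[name](program)
            self.runs.append(name)
        return self.cache[name]

    def run_pass(self, opt_pass, program):
        for name in opt_pass.requires:
            self.get_analysis(name, program)
        opt_pass.run(program)
        # Base assumption: a pass invalidates every cached analysis unless it
        # reports that it preserved the CFG (all toy analyses are CFG-derived).
        if not opt_pass.preserves_cfg:
            self.cache.clear()


program = ["entry", "loop_header", "loop_body", "exit"]  # stand-in for an IR
manager = PassManager({"domtree": lambda prog: "domtree over %d blocks" % len(prog)})

pipeline = [
    Pass("peephole", requires=["domtree"], preserves_cfg=True),
    Pass("cfg_rewrite", requires=["domtree"], preserves_cfg=False),
    Pass("code_motion", requires=["domtree"], preserves_cfg=True),
]
for p in pipeline:
    manager.run_pass(p, program)

# "Strategy number 2": schedule only the passes that report preserving the CFG.
cfg_safe = [p for p in pipeline if p.preserves_cfg]
print(manager.runs)
print([p.name for p in cfg_safe])
```

In this run the dominator tree is computed once for "peephole", reused by "cfg_rewrite", thrown away because that pass does not preserve the CFG, and computed again for "code_motion". Filtering the pipeline down to cfg_safe is what restricting yourself to CFG-preserving optimizations would look like in this model.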
Ah, I see, that makes sense. That does seem like a rather terrible
solution though, since not being able to change the CFG is a harsh
restriction.

>> >
>> > -Tom
>> >
>> >> >
>> >> >>> * LLVM doesn't do modifiers, meaning that we can't do optimizations
>> >> >>> like "clamp(x, 0.0, 1.0) => mov.sat x" and "clamp(x, 0.25, 1.0) =>
>> >> >>> max.sat(x, .25)" in a generic fashion.
>> >> >>>
>> >> >>
>> >> >> The way to handle this with LLVM would be to add intrinsics to
>> >> >> represent the various modifiers and then fold them into instructions
>> >> >> during instruction selection.
>> >> >>
>> >> >
>> >> > IMHO this is a feature. One of the things I don't like about NIR is
>> >> > that it's still vec4-centric. Most drivers are going to want something
>> >> > else and different to each other, we cannot please all of them with one
>> >> > single vector addressing model built into the core instruction set, so
>> >> > I'd rather have modifiers, writemasks and swizzles represented as the
>> >> > composition of separate instructions/intrinsics with simple and
>> >> > well-defined semantics, which can be coalesced back into the real
>> >> > instruction as Tom says (easy even if you don't use LLVM's instruction
>> >> > selector as long as it's SSA form).
>> >>
>> >> While NIR is vec4-centric, nothing's stopping you from splitting up
>> >> instructions and doing optimizations at the scalar level for scalar
>> >> ISA's - in fact, that's what I expect to happen. And for backends that
>> >> really do need to have swizzles and writemasks, coalescing these
>> >> things back into the original instruction is not at all trivial - in
>> >> fact, going into and out of SSA without introducing extra copies even
>> >> in situations like:
>> >>
>> >> foo.xyz = ...
>> >> ... = foo
>> >> foo.x = ...
>> >>
>> >> is a problem that hasn't been solved yet publicly (it seems doable,
>> >> but difficult).
>> >> So while we might not need swizzles and writemasks for
>> >> most backends, for the few that do need it (like, for example, the
>> >> i965 vec4 backend) it will be very nice to have one common lowering
>> >> pass that solves this hard problem, which would be impossible to do
>> >> without having swizzles and writemasks in the IR. And it's very likely
>> >> that these backends, which probably aren't using SSA due to the
>> >> aforementioned difficulties, will also benefit from having modifiers
>> >> already folded for them - this is something that's already a problem
>> >> for i965 vec4 backend and that NIR will help a lot.
>> >>
>> >> >
>> >> >>> * LLVM is hard to embed into other projects, especially if it's used
>> >> >>> as anything but a command-line tool that only runs once. See, for
>> >> >>> example, http://blog.llvm.org/2014/07/ftl-webkits-llvm-based-jit.html
>> >> >>> under "Linking WebKit with LLVM" - most of those problems would also
>> >> >>> apply to us.
>> >> >>>
>> >> >>
>> >> >> You have to keep in mind that the way webkit uses LLVM is totally
>> >> >> different than how Mesa would use LLVM if LLVM IR was adopted as a
>> >> >> common IR.
>> >> >>
>> >> >> webkit is using LLVM as a full JIT compiler, which means it depends
>> >> >> on almost all of the pieces of the LLVM stack, the IR manipulation,
>> >> >> optimization passes, one or more of the code gen backends, as well
>> >> >> as the entire JIT layer. The JIT layer in particular is missing a lot
>> >> >> of functionality in the C API, which makes it more difficult to work with.
>> >> >>
>> >> >> If Mesa were to adopt LLVM IR as a common IR, the only LLVM library
>> >> >> functionality it would need would be the IR manipulation and the
>> >> >> optimization passes.
>> >> >>
>> >> >>> * LLVM is on a different release schedule (6 months vs.
>> >> >>> 3 months), has
>> >> >>> a different review process, etc., which means that to add support for
>> >> >>> new functionality that involves shaders, we now have to submit patches
>> >> >>> to two separate projects, and then 2 months later when we ship Mesa it
>> >> >>> turns out that nobody can actually use the new feature because it
>> >> >>> depends upon an unreleased version of LLVM that won't be released for
>> >> >>> another 3 months and then packaged by distros even later... we've
>> >> >>> already had problems where distros refused to ship newer Mesa releases
>> >> >>> because radeon depended on a version of LLVM newer than the one they
>> >> >>> were shipping, and if we started using LLVM in core Mesa it would get
>> >> >>> even worse. Proprietary drivers solve this problem by just forking
>> >> >>> LLVM, building it with the rest of their driver, and linking it in as
>> >> >>> a static library, but distro packagers would hate us if we did that.
>> >> >>>
>> >> >>
>> >> >> If Mesa were using LLVM IR as a common IR I'm not sure what features
>> >> >> in Mesa would be tied to new additions in LLVM. As I said before,
>> >> >> all Mesa would be using would be the IR manipulations and the
>> >> >> optimization passes. The IR manipulations only require new features
>> >> >> when something new is added to LLVM IR specification, which is rare.
>> >> >> It's possible there could be some lag in new features that go into
>> >> >> the optimization passes, but if there was some optimization that was
>> >> >> deemed really critical, it could be implemented in Mesa using the IR
>> >> >> manipulators.
>> >> >>
>> >> >> -Tom
>> >> >>
>> >> >>> I wouldn't completely rule out LLVM, and I do think they do a lot of
>> >> >>> things right, but for now it seems like it's not the path that the
>> >> >>> Intel team wants to take.
>> >> >>>
>> >> >>> Connor
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev