On Mon, Dec 21, 2015 at 7:48 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
>
> On Dec 21, 2015 9:09 AM, "Connor Abbott" <cwabbo...@gmail.com> wrote:
>>
>> On Mon, Dec 21, 2015 at 11:45 AM, Marek Olšák <mar...@gmail.com> wrote:
>> > On Mon, Dec 21, 2015 at 4:38 PM, Connor Abbott <cwabbo...@gmail.com> wrote:
>> >> On Mon, Dec 21, 2015 at 6:39 AM, Marek Olšák <mar...@gmail.com> wrote:
>> >>> On Mon, Dec 21, 2015 at 6:48 AM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
>> >>>>
>> >>>> On Dec 20, 2015 7:43 PM, "Rob Clark" <robdcl...@gmail.com> wrote:
>> >>>>>
>> >>>>> On Sun, Dec 20, 2015 at 10:29 PM, Connor Abbott <cwabbo...@gmail.com> wrote:
>> >>>>> > On Sun, Dec 20, 2015 at 10:04 PM, Rob Clark <robdcl...@gmail.com> wrote:
>> >>>>> >> On Sun, Dec 20, 2015 at 9:12 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
>> >>>>> >>>
>> >>>>> >>> On Dec 19, 2015 5:55 PM, "Rob Clark" <robdcl...@gmail.com> wrote:
>> >>>>> >>>>
>> >>>>> >>>> From: Rob Clark <robcl...@freedesktop.org>
>> >>>>> >>>>
>> >>>>> >>>> Jason,
>> >>>>> >>>>
>> >>>>> >>>> How much do you hate this idea? Seems like an easy alternative to
>> >>>>> >>>> using ralloc ctx's to clean up nir variants/clones, which would let
>> >>>>> >>>> us drop the parent memctx for nir_shader_create()/clone(), making
>> >>>>> >>>> it easier to introduce reference counting.
>> >>>>> >>>
>> >>>>> >>> I think "hate" is a bit strong. I don't like it but it works. If we
>> >>>>> >>> really want nir_shader refcounted, we'll have to do something.
>> >>>>> >>
>> >>>>> >> I suppose the alternate idea of moving the nir_shader_clone() out of
>> >>>>> >> brw_compile_xyz(), and always passing in the clone would be a cleaner
>> >>>>> >> way.
>> >>>>> >> It looks like each of the brw_compile_xyz() has exactly one
>> >>>>> >> call-site, so doing the nir_shader_clone() inside doesn't really
>> >>>>> >> buy anything.
>> >>>>
>> >>>> You're forgetting that there may be *cough* other users of this API...
>> >>>> We can change those too but I would like the needs of the compiler
>> >>>> users to drive the API, not the cloning. I still have some details to
>> >>>> work out there. In any case, it doesn't really matter; we can figure
>> >>>> something out.
>> >>>>
>> >>>>> >>> About refcounting... The more I think about it the more I'm not
>> >>>>> >>> convinced it's useful. As it stands, we have no use for it and I'm
>> >>>>> >>> not convinced you do either. We'll see if I can convince you. :-)
>> >>>>> >>>
>> >>>>> >>> In the history of i965 using NIR, we've had about three different
>> >>>>> >>> ways of doing things:
>> >>>>> >>>
>> >>>>> >>> 1) GLSL is the gold copy and we run glsl_to_nir for every
>> >>>>> >>> shader/variant compile. This is what we did when we first started
>> >>>>> >>> using NIR because it was easy and didn't involve reworking any
>> >>>>> >>> plumbing.
>> >>>>> >>>
>> >>>>> >>> 2) Lowered NIR is the gold copy; variants are done entirely in the
>> >>>>> >>> back-end IR. This is what we did up until about a month ago.
>> >>>>> >>> Because variants are done in the back-end, we can run glsl_to_nir
>> >>>>> >>> and do all of our optimizing and lowering at link time. Going from
>> >>>>> >>> NIR to the final shader binary is then a read-only operation as far
>> >>>>> >>> as NIR is concerned.
>> >>>>> >>>
>> >>>>> >>> 3) Optimized but not lowered NIR is the gold copy; variants are
>> >>>>> >>> sometimes done in NIR. This is the scheme we use now.
>> >>>>> >>> We call glsl_to_nir and do some of the optimization and lowering
>> >>>>> >>> at link time but leave it in SSA form. When we go to compile the
>> >>>>> >>> final shader, we make a copy, apply variants, do the final lowering
>> >>>>> >>> and then go into the back-end IR.
>> >>>>> >>>
>> >>>>> >>> In each of these cases, we know exactly where we need to make a
>> >>>>> >>> copy without the help of reference counting. In the first, we get a
>> >>>>> >>> fresh copy each time so we are free to destroy the copy. In the
>> >>>>> >>> second, we never have to modify the NIR so no copy. In the third
>> >>>>> >>> scheme, we always have to make a copy because, even if variants are
>> >>>>> >>> a no-op, we still have to go out of SSA form and do final lowering.
>> >>>>> >>> You could say that we could avoid making that copy. However, the
>> >>>>> >>> work to determine when we don't need variants and can do all our
>> >>>>> >>> lowering up-front is far more than the work saved by reference
>> >>>>> >>> counting.
>> >>>>> >>>
>> >>>>> >>> How about gallium? Here's how I imagine it would work (please
>> >>>>> >>> correct me if I'm wrong):
>> >>>>> >>>
>> >>>>> >>> 1) In the TGSI case, tgsi_to_nir gets called for each compile so you
>> >>>>> >>> get a fresh mutable shader each time. In this case, you are free to
>> >>>>> >>> do whatever you want with the shader without making a copy.
>> >>>>> >>>
>> >>>>> >>> 2) In the GLSL case, you run glsl_to_nir and do some basic
>> >>>>> >>> optimizations at link time and hold onto the NIR shader. (Hold a
>> >>>>> >>> reference if you'd like.)
>> >>>>> >>> When you go to compile it in the back-end, it needs to do its own
>> >>>>> >>> lowering so it takes a reference and ends up making a copy.
>> >>>>> >>>
>> >>>>> >>> If this description is anywhere close to correct, then I don't think
>> >>>>> >>> you really need it either. Determining whether or not you need to
>> >>>>> >>> copy is simply "if (comes_from_tgsi)". Maybe there's something
>> >>>>> >>> subtle about the gallium layer that I don't know that makes
>> >>>>> >>> refcounting the best solution. Please enlighten me if there is.
>> >>>>> >>
>> >>>>> >> The issue is that we *potentially* have both the state tracker and
>> >>>>> >> the driver doing some of their own variant management. (Which tbh,
>> >>>>> >> isn't awesome, it would have been nice if someone realized earlier
>> >>>>> >> on that nearly every driver is going to have to do some sort of
>> >>>>> >> variant mgmt and figured out a way just to push it all down to the
>> >>>>> >> driver.. but I can't see a good way to get there from here.)
>> >>>>> >>
>> >>>>> >> With TGSI as the IR, driver just unconditionally does
>> >>>>> >> tgsi_dup_tokens().. because of the whole thing where st does
>> >>>>> >> variants in some cases, things are defined that driver doesn't own
>> >>>>> >> the copy of the TGSI IR passed in after the fxn call to driver
>> >>>>> >> returns.
>> >>>>> >>
>> >>>>> >> With NIR I was hoping to fix this, esp. since nir_shader_clone() is
>> >>>>> >> more heavyweight than tgsi_dup_tokens() (memcpy()).
>> >>>>> >>
>> >>>>> >> Refcnt'ing is a nice solution so that we can pass the driver a
>> >>>>> >> reference that it owns. In cases where state tracker isn't doing
>> >>>>> >> variant mgmt, we pass it the one-and-only ref (avoiding clone).
>> >>>>> >>
>> >>>>> >> I'd suggested that in cases where st does variant mgmt, that st
>> >>>>> >> should do the clone/dup. But that was shot down:
>> >>>>> >>
>> >>>>> >> http://lists.freedesktop.org/archives/mesa-dev/2015-October/097748.html
>> >>>>
>> >>>> It sounds like Marek's argument there is more about lifetime
>> >>>> management than anything. He wants gallium modules to be able to
>> >>>> create IR, call into the driver, and then throw it away. In
>> >>>> particular, he doesn't want them to have to think about cloning. In a
>> >>>> lot of ways it sounds a lot like what i965 wants too. I really like
>> >>>> having brw_compile_foo take a const nir_shader. The difference is that
>> >>>> i965 basically always wants to clone whereas a gallium driver may not
>> >>>> have to if gallium doesn't care what happens to the shader when it's
>> >>>> done. How common is this case? How important is it to optimize for? I
>> >>>> don't know.
>> >>>>
>> >>>> One other thing that bothers me a bit: From Marek's comment, it sounds
>> >>>> like the components want to just pass in IR and be agnostic about
>> >>>> whether the driver wants its own copy or wants to change it or
>> >>>> whatever. This seems like an argument for always cloning to me. From
>> >>>> the perspective of a gallium module, "I want to hang onto this, I'll
>> >>>> keep a reference" seems exactly the same as "I want to hang onto this,
>> >>>> I'll give the driver a copy". How are they actually different given
>> >>>> that the driver basically has to modify what you give it in order to
>> >>>> do lowering?
>> >>>>
>> >>>>> > Ugh... I didn't read this at the time, but I don't like Marek's
>> >>>>> > response.
>> >>>>> > My understanding of the situation, based on this thread, is that
>> >>>>> > there are some cases where the st knows that there's only going to
>> >>>>> > be one variant and can throw away the (NIR or TGSI) shader after it
>> >>>>> > hands it to the driver, while at other times it has to hold onto all
>> >>>>> > the variants and only give the driver a read-only copy (or duplicate
>> >>>>
>> >>>> As per above, my interpretation of Marek's comment is that he doesn't
>> >>>> want the st to have to think about cloning ever. He wants it to assume
>> >>>> that compilation never modifies the IR so the driver should always
>> >>>> clone. You have to keep in mind that Marek is most likely thinking
>> >>>> about caching the TGSI rather than doing in-place lowering in it.
>> >>>>
>> >>>> If I'm understanding Marek correctly, then it sounds like shader
>> >>>> compilation should never touch the IR that's passed in. If this is the
>> >>>> case, it sounds like always cloning is the way to go. At least it's
>> >>>> not *that* expensive.
>> >>>
>> >>> Note that st/mesa needs to keep the FS IR because of glDrawPixels and
>> >>> glBitmap, and the VS IR because of edge flags, glRasterPos evaluation,
>> >>> selection and feedback modes. The last three are done with Draw/LLVM
>> >>> and only support TGSI.
>> >>>
>> >>> Therefore, st/mesa always hangs onto the IR and drivers can't modify
>> >>> it. It also needs VS in TGSI to be able to do everything correctly.
>> >>>
>> >>> What other Gallium modules want or don't want is not that important,
>> >>> but changing the current semantics will require fixing a lot of places.
>> >>> (state trackers - mesa, nine, xa; modules - blitter, draw, hud,
>> >>> postprocess, tests, vl)
>> >>>
>> >>> You really better think about whether changing all those and the risk
>> >>> of breaking them is worth it.
>> >>>
>> >>> Marek
>> >>
>> >> Well, we're talking about passing NIR here, not TGSI, so none of those
>> >> places will need to be updated. NIR is a much more heavyweight IR, and
>> >> copying is much more expensive, so it makes more sense there for the
>> >> st to duplicate it and let the driver own the IR it passes in, to
>> >> reduce copying when there's no variant management necessary.
>> >
>> > My main concern was about TGSI, not so much about NIR.
>>
>> Yes, exactly -- we don't plan on changing the semantics of passing in
>> TGSI, so this only matters when the user passes in a NIR shader.
>
> I think two different concepts of ownership are getting conflated here:
> Right/responsibility to delete and right to modify.
>
> The way I understand it, gallium, as it stands, gives neither to the driver.
> A back-end using NIR requires the right to modify but who deletes it doesn't
> ultimately matter. I think it's dangerous to pass one of these rights to
> the driver and not the other but we need to think about both.
>
> What I'm trying to say is that we have two options here:
>
> 1) gallium passes IR to the back-end that it is free to modify and is
> required to delete when it's done.
>
> 2) gallium passes read-only IR to the back-end and it always makes a copy.

Not always. The copy is optional. Drivers are encouraged not to make a
copy if they don't need it. Or they can keep a copy in a different IR,
which is the same scenario for the original IR.

Marek
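
To make the two ownership options and the optional copy concrete, here is a
minimal sketch in C. Everything in it is hypothetical: struct toy_ir,
toy_lower(), toy_emit(), compile_owning() and compile_readonly() only stand in
for an IR and a driver compile hook. None of it is Mesa or gallium API, and the
"binary" is just a checksum.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an IR such as a NIR shader; not a Mesa type. */
struct toy_ir {
    int num_instrs;
    int *instrs;
};

/* Allocate an IR holding a private copy of n instruction words (n > 0). */
static struct toy_ir *toy_ir_create(const int *instrs, int n)
{
    struct toy_ir *ir = malloc(sizeof(*ir));
    if (!ir)
        return NULL;
    ir->instrs = malloc(sizeof(int) * (size_t)n);
    if (!ir->instrs) {
        free(ir);
        return NULL;
    }
    memcpy(ir->instrs, instrs, sizeof(int) * (size_t)n);
    ir->num_instrs = n;
    return ir;
}

/* Deep copy, playing the role that a shader clone would play. */
static struct toy_ir *toy_ir_clone(const struct toy_ir *src)
{
    return toy_ir_create(src->instrs, src->num_instrs);
}

static void toy_ir_destroy(struct toy_ir *ir)
{
    if (ir) {
        free(ir->instrs);
        free(ir);
    }
}

/* Stand-in for driver lowering: rewrites the IR in place. */
static void toy_lower(struct toy_ir *ir)
{
    for (int i = 0; i < ir->num_instrs; i++)
        ir->instrs[i] *= 2;
}

/* Stand-in for code generation: reads the IR, never writes it. */
static int toy_emit(const struct toy_ir *ir)
{
    int sum = 0;
    for (int i = 0; i < ir->num_instrs; i++)
        sum += ir->instrs[i];
    return sum;
}

/* Option 1: the caller hands over both rights, to modify and to delete. */
static int compile_owning(struct toy_ir *ir)
{
    toy_lower(ir);
    int binary = toy_emit(ir);
    toy_ir_destroy(ir);
    return binary;
}

/* Option 2 with an optional copy: the input is read-only, and the driver
 * clones only when it actually has to modify the IR. Returns -1 if the
 * clone cannot be allocated. */
static int compile_readonly(const struct toy_ir *ir, int needs_lowering)
{
    if (!needs_lowering)
        return toy_emit(ir);

    struct toy_ir *copy = toy_ir_clone(ir);
    if (!copy)
        return -1;
    toy_lower(copy);
    int binary = toy_emit(copy);
    toy_ir_destroy(copy);
    return binary;
}
```

In this sketch, a caller that wants to keep its gold copy under option 1 has to
clone before the call and pass the clone to compile_owning(), while under
option 2 it passes the gold copy directly and the callee decides whether a copy
is needed.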