On Mon, Dec 28, 2015 at 2:05 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote: > > > On Mon, Dec 28, 2015 at 10:33 AM, Rob Clark <robdcl...@gmail.com> wrote: >> >> On Mon, Dec 28, 2015 at 1:20 PM, Jason Ekstrand <ja...@jlekstrand.net> >> wrote: >> > >> > >> > On Mon, Dec 28, 2015 at 9:37 AM, Connor Abbott <cwabbo...@gmail.com> >> > wrote: >> >> >> >> On Mon, Dec 28, 2015 at 10:13 AM, Rob Clark <robdcl...@gmail.com> >> >> wrote: >> >> > On Tue, Dec 22, 2015 at 10:11 PM, Connor Abbott <cwabbo...@gmail.com> >> >> > wrote: >> >> >> On Tue, Dec 22, 2015 at 9:55 PM, Rob Clark <robdcl...@gmail.com> >> >> >> wrote: >> >> >>> On Tue, Dec 22, 2015 at 9:47 PM, Connor Abbott >> >> >>> <cwabbo...@gmail.com> >> >> >>> wrote: >> >> >>>> On Tue, Dec 22, 2015 at 9:02 PM, Rob Clark <robdcl...@gmail.com> >> >> >>>> wrote: >> >> >>>>> On Mon, Dec 21, 2015 at 1:48 PM, Jason Ekstrand >> >> >>>>> <ja...@jlekstrand.net> wrote: >> >> >>>>>> >> >> >>>>>> I think two different concepts of ownership are getting >> >> >>>>>> conflated >> >> >>>>>> here: >> >> >>>>>> Right/responsibility to delete and right to modify. >> >> >>>>>> >> >> >>>>>> The way I understand it, gallium, as it stands, gives neither to >> >> >>>>>> the driver. >> >> >>>>>> A back-end using NIR requires the right to modify but who >> >> >>>>>> deletes >> >> >>>>>> it doesn't >> >> >>>>>> ultimately matter. I think it's dangerous to pass one of these >> >> >>>>>> rights to >> >> >>>>>> the driver and not the other but we need to think about both. >> >> >>>>> >> >> >>>>> yeah, uneasy about driver modifying the IR if the state tracker >> >> >>>>> is >> >> >>>>> still going to potentially spin off variants of the IR.. that >> >> >>>>> sounds >> >> >>>>> like madness. >> >> >>>>> >> >> >>>>> The refcnt'ing I proposed does deal w/ right to modify vs delete >> >> >>>>> via >> >> >>>>> nir_shader(_is)_mutable() which returns something that is >> >> >>>>> guaranteed >> >> >>>>> to be safe to modify (ie. has only a single reference) >> >> >>>>> >> >> >>>>>> What I'm trying to say is that we have two options here: >> >> >>>>>> >> >> >>>>>> 1) gallium passes IR to the back-end that it is free to modify >> >> >>>>>> and >> >> >>>>>> is >> >> >>>>>> required to delete when it's done. >> >> >>>>> >> >> >>>>> with refcnt'ing, s/delete/unref/ >> >> >>>>> >> >> >>>>> The idea is, the st transfers ownership of the reference it >> >> >>>>> passes >> >> >>>>> to >> >> >>>>> the driver. If the st wants to hang on to a reference itself, it >> >> >>>>> must >> >> >>>>> increment the refcnt before passing to the driver (backend). >> >> >>>>> >> >> >>>>> Without refcnt'ing, I suppose we could (since we don't have to >> >> >>>>> follow >> >> >>>>> TGSI semantics), just decree that the driver always takes >> >> >>>>> ownership >> >> >>>>> of >> >> >>>>> the copy passed in, and if the st wants to hang on to a copy too, >> >> >>>>> then >> >> >>>>> it must clone. I suppose this would work well enough for >> >> >>>>> freedreno/vc4, which both end up generating variants later. It >> >> >>>>> does >> >> >>>>> force an extra clone for drivers that immediately translate into >> >> >>>>> their >> >> >>>>> own backend IR and don't need to keep the NIR around, for >> >> >>>>> example. >> >> >>>>> Maybe that is not worth caring about (since at this point it is >> >> >>>>> hypothetical). >> >> >>>> >> >> >>>> While always cloning does have this disadvantage, I don't think >> >> >>>> it's >> >> >>>> really relevant here. Even if the driver throws away the NIR >> >> >>>> immediately after consuming it, almost invariably it's going to >> >> >>>> want >> >> >>>> to modify it. The generic NIR passed in by the state tracker >> >> >>>> (other >> >> >>>> IR -> NIR + some optimizations) is almost never going to be the >> >> >>>> same >> >> >>>> as the NIR after going through driver-specific lowering passes, >> >> >>>> which >> >> >>>> means that drivers are never going to want a read-only version of >> >> >>>> the >> >> >>>> IR. In light of that, I think making the driver own the IR passed >> >> >>>> in >> >> >>>> seems like the most sensible thing. >> >> >>> >> >> >>> well, unless the driver is already doing it's own lowering in it's >> >> >>> own >> >> >>> native IR.. >> >> >> >> >> >> Well, if you're not doing any lowering in NIR, then you aren't >> >> >> really >> >> >> taking any advantage of it. I can't see a plausible scenario where >> >> >> all >> >> >> the lowering is done in the driver's own IR -- and as soon as you do >> >> >> anything in NIR, you need the driver-owns-IR semantics. >> >> > >> >> > When it comes to shader variants, I have a mix, with some things >> >> > lowered in nir and others just handled in backend.. >> >> > >> >> > The re-work / cleanup that I have had on a branch for a while now >> >> > (since it is currently blocked on refcnt'ing) does a first round of >> >> > variant-key independent NIR lowering/opt passes. And then at draw >> >> > time, if the variant key has anything that is lowered in nir, I do a >> >> > second round. >> > >> > >> > Just to be clear, your key-dependent lowering happens after all of your >> > other lowering? If this is the case, then I guarantee you that you're >> > unique in this since i965 and vc4 need to at least run out-of-SSA >> > afterwards. To be honest, I completely forgot that a driver could use >> > fully >> > ssa NIR. >> >> It is a mix.. I do texcoord saturate, clip-plane, and 2-sided color >> lowering in NIR. But flat-shading, binning-pass, and half vs full >> precision color output in ir3. >> >> I do as much lowering in NIR as I can, in an effort to do as much as >> possible at compile time, vs draw time. I do the first round of >> lowering/opt w/ null shader key, which is enough for the common cases. >> >> Pretty much independent, I suppose, of whether I came out of SSA or >> not first. Although binning-pass variant and the instruction >> scheduling I do are easier in SSA. >> >> Somewhat unrelated, but I may end up converting array access to >> registers, but leave everything else in SSA, so I can benefit from >> converting multi-dimensional offsets into a single offset.. this is >> still one open issue w/ gallium glsl_to_nir.. right now I have a >> hacked up version of nir_lower_io that converts global/local >> load/store_var's into load/store_var2 which take an offset as a src >> (like load_input/store_output) instead of deref chain.. not sure yet >> whether this will be the permanent solution, but at least it fixes a >> huge heap of variable-indexing piglits and lets me continue w/ >> implementing lowering passes for everything else that was previously >> done in glsl->tgsi or tgsi->tgsi passes. > > > If you do this, you'll be back to always needing a mutable copy. Most > lowering and optimization passes die the moment they see a register. You'll > either have to go fix a bunch of stuff up to no-op properly or run > vars_to_regs after doing your NIR lowering but before going into your > backend IR. This means that your "gold copy" still has variables and you > always need to lower them to registers before you go into the backend.
ugg.. but good point, thanks for pointing that out before I wasted another afternoon on yet another dead-end for handling deref's.. Ok, I guess I need to think of a better name than load/store_var2 for the new intrinsics ;-) PS. I would really like to keep the nir_variable ptr in the intrinsic instruction, rather than matching things up w/ var->data.driver_location, which is rather clunky.. BR, -R >> >> >> BR, >> -R >> >> >> >> >> >>> >> >> >>> Maybe it is too much of a hypothetical.. I still think refcnt'ing >> >> >>> gives some nice flexibility to deal with various scenarios, and >> >> >>> having >> >> >>> to call nir_shader_unref() isn't so much of a burden. >> >> >> >> >> >> Still, I can't see how this flexibility is at all useful, and it >> >> >> seems >> >> >> like overkill since the driver will always want a mutable version of >> >> >> the IR anyways. >> >> > >> >> > Well, due to the structure I mentioned above, at draw time when I >> >> > need >> >> > to generate a variant with nothing lowered in NIR, I simply incr the >> >> > refcnt on the IR which has already gone through first round of NIR >> >> > passes, and pass that in to my back end compiler. At the end, once >> >> > the shader binary is generated, I can unconditionally unref the >> >> > nir_shader without having to care. >> >> > >> >> > Without refcnt'ing I'd either have to generate a pointless clone or >> >> > keep track that the nir_shader should not actually be free'd. Not >> >> > impossible, just a bit more ugly. >> >> >> >> Assuming you do all your variant management in your driver's IR, then >> >> you don't need to do anything. If you do some variant management in >> >> NIR, then in the function where you do the NIR-based lowering you can >> >> check if you need to do any lowering based on the shader key, clone >> >> first, and give the NIR->ir3 function the cloned IR and then free it. >> >> It might be a "bit more ugly," but it's really not that much different >> >> from the refcounting, and when the extra shader gets created/freed is >> >> made explicit. >> >> >> >> > >> >> > (The gallium glsl_to_nir stuff is also currently using refcnt'ing, >> >> > although at least for freedreno/ir3 it isn't strictly needed.. I >> >> > could >> >> > just unconditionally clone in the state tracker. That said, I'm >> >> > still >> >> > of the opinion that refcnt'ing could be useful to some other driver >> >> > someday) >> >> >> >> "It could be useful to some driver someday" isn't a good argument for >> >> adding stuff today. We've already had enough examples of things in NIR >> >> that we added because we thought it was useful, but turned out not to >> >> be. >> >> >> >> > >> >> > BR, >> >> > -R >> >> > >> >> >>> >> >> >>> BR, >> >> >>> -R >> >> >>> >> >> >>>>> >> >> >>>>> (I guess nouveau is the one driver, that if it ever consumed NIR, >> >> >>>>> would translate immediately into it's own backend IR?) >> >> >>>>> >> >> >>>>>> 2) gallium passes read-only IR to the back-end and it always >> >> >>>>>> makes >> >> >>>>>> a copy. >> >> >>>>> >> >> >>>>> This is how it is w/ TGSI, but I think with NIR we are free to >> >> >>>>> make >> >> >>>>> a >> >> >>>>> clean break. And we *definitely* want to avoid >> >> >>>>> the-driver-always-copies semantics.. >> >> >>>>> >> >> >>>>>> It sounds like, from what Marek is saying, that gallium is >> >> >>>>>> currently doing >> >> >>>>>> (2) and changing it to (1) would be painful. I think reference >> >> >>>>>> counting is >> >> >>>>>> more like an awkward option 1.5 than option 3. Reference >> >> >>>>>> counting >> >> >>>>>> would >> >> >>>>>> mean that gallium passes a reference to the driver which it is >> >> >>>>>> expected to >> >> >>>>>> unref but may keep a second reference if it wants to keep the >> >> >>>>>> driver from >> >> >>>>>> modifying it. Then the driver may or may not make a copy based >> >> >>>>>> on >> >> >>>>>> the >> >> >>>>>> number of references. Why don't we just make it explicit and >> >> >>>>>> add a >> >> >>>>>> read-only bit and call it a day. >> >> >>>>>> >> >> >>>>>> One of the reasons I don't like passing a reference is that it >> >> >>>>>> effectively >> >> >>>>>> puts allocation and freeing in different components of the >> >> >>>>>> driver. >> >> >>>>> >> >> >>>>> With refcnt'ing you should talk in terms of ref/unref rather than >> >> >>>>> allocate/free.. imho. Although maybe that is what you meant. >> >> >>>>> (In >> >> >>>>> which case, yes, that was my idea, that passing in to driver >> >> >>>>> transfers >> >> >>>>> ownership of the passed reference.) >> >> >>>>> >> >> >>>>>> This >> >> >>>>>> means that if and driver doesn't care at all about the shader >> >> >>>>>> that >> >> >>>>>> gets >> >> >>>>>> passed in, it still has to under it to avoid a memory leak. You >> >> >>>>>> can't have >> >> >>>>>> the driver take the reference because then, either it comes in >> >> >>>>>> with >> >> >>>>>> a >> >> >>>>>> recount of 0 and should have been deleted, or the "can I modify >> >> >>>>>> this" check >> >> >>>>>> becomes "recount <= 2" which makes no sense. >> >> >>>>> >> >> >>>>> hmm, no, if ownership of the reference is transferred to the >> >> >>>>> driver, >> >> >>>>> then it becomes "refcount == 1" (and refcount == 0 should be an >> >> >>>>> assert, because something has gone horribly wrong) >> >> >>>>> >> >> >>>>> BR, >> >> >>>>> -R >> > >> > > > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev