On Thu, Jan 3, 2019 at 2:03 PM Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl> wrote:
> On Thu, Jan 3, 2019 at 6:59 PM Jason Ekstrand <ja...@jlekstrand.net> wrote:
> >
> > On Thu, Jan 3, 2019 at 3:39 AM Erik Faye-Lund <erik.faye-l...@collabora.com> wrote:
> >>
> >> On Wed, 2019-01-02 at 10:16 -0600, Jason Ekstrand wrote:
> >> > On Wed, Jan 2, 2019 at 9:43 AM Ilia Mirkin <imir...@alum.mit.edu> wrote:
> >> > > Have a look at the first 4 patches in the series from Jonathan Marek
> >> > > to address some of these issues:
> >> > >
> >> > > https://patchwork.freedesktop.org/series/54295/
> >> > >
> >> > > Not sure exactly what state that work is in, but I've added Jonathan
> >> > > to CC, perhaps he can provide an update.
> >> > >
> >> > > Cheers,
> >> > >
> >> > >   -ilia
> >> > >
> >> > > On Wed, Jan 2, 2019 at 6:28 AM Qiang Yu <yuq...@gmail.com> wrote:
> >> > > >
> >> > > > Hi guys,
> >> > > >
> >> > > > I found this problem with the following test fragment shader during
> >> > > > lima development:
> >> > > >
> >> > > > uniform int color;
> >> > > > void main() {
> >> > > >     if (color > 1)
> >> > > >         gl_FragColor = vec4(1.0, 0.0, 0.0, 1);
> >> > > >     else
> >> > > >         gl_FragColor = vec4(0.0, 1.0, 0.0, 1);
> >> > > > }
> >> > > >
> >> > > > nir_print_shader output:
> >> > > >
> >> > > > impl main {
> >> > > >     block block_0:
> >> > > >     /* preds: */
> >> > > >     vec1 32 ssa_0 = load_const (0x00000001 /* 0.000000 */)
> >> > > >     vec4 32 ssa_1 = load_const (0x3f800000 /* 1.000000 */, 0x00000000 /* 0.000000 */, 0x00000000 /* 0.000000 */, 0x3f800000 /* 1.000000 */)
> >> > > >     vec4 32 ssa_2 = load_const (0x00000000 /* 0.000000 */, 0x3f800000 /* 1.000000 */, 0x00000000 /* 0.000000 */, 0x3f800000 /* 1.000000 */)
> >> > > >     vec1 32 ssa_3 = load_const (0x00000000 /* 0.000000 */)
> >> > > >     vec1 32 ssa_4 = intrinsic load_uniform (ssa_3) (0, 1, 0) /* base=0 */ /* range=1 */ /* component=0 */ /* color */
> >> > > >     vec1 32 ssa_5 = slt ssa_0, ssa_4
> >> > > >     vec1 32 ssa_6 = fnot ssa_5
> >> > > >     vec4 32 ssa_7 = bcsel ssa_6.xxxx, ssa_2, ssa_1
> >> > > >     intrinsic store_output (ssa_7, ssa_3) (0, 15, 0) /* base=0 */ /* wrmask=xyzw */ /* component=0 */ /* gl_FragColor */
> >> > > >     /* succs: block_1 */
> >> > > >     block block_1:
> >> > > > }
> >> > > >
> >> > > > ssa_0 is not converted to float during GLSL-to-NIR translation. I see
> >> > > > glsl_to_nir.cpp creates flt/ilt/ult based on the source type for GPUs
> >> > > > that support native integers, but for GPUs without native integer
> >> > > > support it just creates slt for all source types. And in
> >> > > > nir_lower_constant_initializers, there's also no type conversion for
> >> > > > integer constants.
> >>
> >> > This is a generally sticky issue. In NIR, we have no concept of types
> >> > on SSA values, which has proven perfectly reasonable and actually very
> >> > powerful in a world where integers are supported natively.
> >> > Unfortunately, it causes significant problems for float-only
> >> > architectures.
> >>
> >> I would like to take this chance to say that this untyped SSA-value
> >> choice has led to issues in both radeon_si (because LLVM values are
> >> typed) and zink (similarly, because SPIR-V values are typed), where we
> >> need to do bitcasts on every access because there's just not enough
> >> information available to emit variables with the right type.
> >
> > I'm not sure if I agree that the two problems are the same or not...
> > More on that in a bit.
> >
> >> It took us a lot of time to realize that the metadata from the opcodes
> >> doesn't *really* provide this, because the rest of NIR doesn't treat
> >> values consistently. In fact, this feels arguably more like buggy
> >> behavior; why do we even have fmov when the compiler will emit imovs
> >> for floating-point values all the time...? Or why do we have bitcast...
> >
> > Why do we have different mov opcodes?
> > Because they have different behavior in the presence of
> > source/destination modifiers. You likely don't use those in radeon or
> > Zink, but we do use them on Intel. That being said, I've very seriously
> > considered adding a generic nir_op_mov which is entirely typeless and
> > doesn't support modifiers at all, and making most of core NIR use that.
> > For that matter, there's no real reason why we need fmov with modifiers
> > at all when we could easily replace "ssa1 = fmov.sat |x|" with
> > "ssa1 = fsat |x|" or "ssa1 = fabs.sat x". If we had a generic
> > nir_op_mov to deal with swizzling etc., the only thing having i/fmov
> > would really accomplish is lowering the number of different ways you
> > can write "fmov.sat |x|". The only reason I haven't added this
> > nir_op_mov opcode is because it's a pile of work, and anyone who can't
> > do integers can just implement imov as fmov and it's not a big deal.
>
> Is there anything blocking just always using fmov and removing imov
> from NIR? Currently anyone who wants to add a modifier to an imov can
> just change the opcode to an fmov and add the modifier, can they not?

NIR (and Intel HW) has both integer and float modifiers. Modifiers on
fmov and imov do different things. I'm not really sure what you're
suggesting here.

> >> I would really love it if we could at least consider making this "we
> >> can just treat floats as ints without bitcasts if we feel like it"
> >> design optional for the backend somehow.
> >>
> >> I guess the assumption is that bitcasts are free? They aren't once
> >> you have to emit them and have a back-end remove a bunch of redundant
> >> ones... We should already have all the information to trivially place
> >> casts for backends that care, yet we currently make it hard (unless
> >> your HW/backend happens to be untyped)...
> >>
> >> Is there some way we can perhaps improve this for backends that care?
> >
> > That's an interesting question...
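The equivalence Jason describes above between the different spellings of a saturated absolute-value move can be sketched in Python. This is only a model of the NIR modifier semantics for illustration, not Mesa code; the function names are made up:

```python
def fsat(x):
    """NIR's .sat destination modifier: clamp the result to [0.0, 1.0]."""
    return min(max(x, 0.0), 1.0)

# Three spellings of the same operation, per the discussion above:

def fmov_sat_abs(x):
    """'fmov.sat |x|': an |x| source modifier plus a .sat destination modifier."""
    return fsat(abs(x))

def fsat_abs(x):
    """'fsat |x|': the fsat opcode with an |x| source modifier."""
    return min(max(abs(x), 0.0), 1.0)

def fabs_sat(x):
    """'fabs.sat x': the fabs opcode with a .sat destination modifier."""
    return fsat(abs(x))

# All three agree everywhere, which is why fmov-with-modifiers is redundant
# once a plain, typeless nir_op_mov handles swizzling:
for x in (-2.5, -0.25, 0.0, 0.75, 3.0):
    assert fmov_sat_abs(x) == fsat_abs(x) == fabs_sat(x)
```

The point of the sketch is that the modifier combinations on a mov never express anything a dedicated opcode plus modifiers can't.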
> > Before answering that question, let's back up and ask why. From a
> > hardware perspective, I know of no graphics hardware that puts types
> > on registers. On Intel, we assign types to sources and destinations of
> > instructions, but outside a specific instruction, a register is just a
> > bag of bits. I suspect most other hardware is the same. In the
> > hardware this particular thread was originally about, you could argue
> > that registers have a type and it's always float. The real issue the
> > GLES2 people are having is that constants have a bag-of-bits value
> > which may need to be interpreted as integer or float depending on
> > context which they do not have when lowering. They don't care, for
> > instance, about the fact that imov or bcsel take integers, because
> > they know that the sources will either be floats or integers that
> > they've lowered to floats, and so they know that the result will be a
> > float.
> >
> > The problem you're describing is in converting from NIR to another IR,
> > not to hardware. In LLVM they made a choice to put types on SSA values
> > but then to have the actual semantics be based on the instruction
> > itself. This leads to lots of redundancy in the IR but also lots of
> > things you can validate, which is kind-of nice. Is that redundancy
> > really useful? In our experience with NIR, we haven't found it to be,
> > other than for booleans (now sorted), this one constant issue, and
> > translating to IRs that have that redundancy. Then why did they add
> > it? I'm really not sure, but it may, at least somewhat, be related to
> > the fact that they allow arrays and structs in their SSA values and so
> > need full types. That's another design decision in LLVM which I find
> > highly questionable. What you're suggesting is more-or-less that NIR
> > should carry, maintain, and validate extra useless information just so
> > we can pass it on to LLVM, which is going to use it for what, exactly?
> > Sorry if I'm extremely reluctant to make fairly fundamental changes
> > to NIR with no better reason than "LLVM did it that way".
> >
> > There's a second reason why I don't like the idea of putting types on
> > SSA values: it's extremely deceptive. In SPIR-V you're allowed to do
> > an OpSConvert with a source that is an unsigned 16-bit integer and a
> > destination that is an unsigned 32-bit integer. What happens? Your
> > uint -> uint cast sign-extends! Yup... That's what happens. No joke.
> > The same is true of signed vs. unsigned division or modulus. The end
> > result is that the types in SPIR-V are useless and you can't actually
> > trust them for anything except bit-size and sometimes labelling
> > something as a float vs. int.
> >
> > So how do we deal with the mismatch? My recommendation would be to
> > use the source/destination types of the ALU ops where needed/useful,
> > add special cases for things like imov and bcsel where you can pass
> > the source type through, and insert extra bitcasts whenever they're
> > needed. This is obvious enough that I'm going to guess you're already
> > more-or-less doing exactly that. Sorry if I'm not being more helpful,
> > but the longer this e-mail gets, the more convinced I become that
> > SPIR-V and LLVM made the wrong design decision and we don't want to
> > follow in their folly.
>
> Let me disagree on that one. However, to start with, I agree on the
> SPIR-V unsigned/signed types, because they did not give them any
> semantics and they are hence useless.
>
> However, I think in general types have advantages. In particular, they
> provide some abstraction and flexibility. They allow LLVM frontends to
> generate LLVM IR closer to their source language and then have a set
> of passes lower that to something more useful for a backend (and those
> passes can be shared between a lot of frontends).
> This results in one of the strengths of LLVM, IMO: you can throw a lot
> of pretty high-level IR at it and LLVM will produce something
> reasonable, which is especially nice considering the number of LLVM
> frontends.
>
> Furthermore, having an abstract type like e.g. Sampler (or even just a
> pointer) can be useful to have the frontend and middle end not care
> about properties of that type, like the bit size.
>
> And even NIR is growing a type system:
> - The canonical size descriptor is "bitsize x components".
> - With the 1-bit bools we are very explicitly using the above as types.

You're not wrong....

> - With derefs we have e.g. the requirement that most derefs take
>   another deref as input, which becomes another implicitly encoded
>   part of the type.
> - That we are going to need to keep track of deref bitwidths (for AMD
>   they are *really* 64-bit) means a bunch of plumbing in optimizations
>   that we would not need if they were abstracted.

Sorting out opaque types has been a bit of a sore point lately. NIR (as
you point out later) is a fairly low-level IR, and the assumption was
made when it was designed that things would pretty much have a size by
the time you hit NIR. Booleans were a bit of an odd case, and the 1-bit
solution seems to have worked out very well, but you're right that it's
almost a new type. With pointers, my current plan (see the UBO/SSBO MR)
is to have drivers pick a type for each kind of pointer, and
spirv_to_nir will give them the type they ask for. Would it be better
to have an opaque SSA value type and a lowering pass? That's entirely
possible.

> And neither is the LLVM type system perfect (e.g. they somewhat
> recently removed the destination type from the pointer type).
>
> float vs. int is kinda useless for us, though for a long time LLVM
> used it to put things in general-purpose registers vs. fpu/mmx/sse,
> and that works relatively well as long as people do not do too many
> bitcasts in practice.
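Jason's OpSConvert observation a few paragraphs up is easy to model: the conversion sign-extends based on the source's top bit, regardless of the signedness of the SPIR-V types involved. A small Python sketch of that behavior (a model for illustration, not real SPIR-V tooling):

```python
def op_sconvert_u16_to_u32(bits16):
    """Model SPIR-V OpSConvert from a 16-bit value to 32 bits: it
    sign-extends, even when both declared types are unsigned integers."""
    assert 0 <= bits16 <= 0xFFFF
    if bits16 & 0x8000:          # top bit set: sign-extend into the high half
        return bits16 | 0xFFFF0000
    return bits16                # top bit clear: zero-extension and
                                 # sign-extension coincide

# uint16 0xFFFF is 65535, but the "uint -> uint" cast yields the bit
# pattern of -1, exactly the trap described above:
assert op_sconvert_u16_to_u32(0xFFFF) == 0xFFFFFFFF
assert op_sconvert_u16_to_u32(0x7FFF) == 0x00007FFF  # positives survive
```

This is why the declared types in SPIR-V can't be trusted for anything beyond bit-size: the opcode, not the type, decides the semantics.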
> Overall I don't believe the LLVM type system is a mistake, and that
> was the main point I wanted to get across. I do think it is probably
> not the right path for NIR though, as I see NIR as a bit lower level
> and more backend-centric than LLVM (partly due to its lack of
> comprehensive SSA types). I think NIR being generally more
> backend-centric is the right choice as we grow to many more backends
> than frontends.

I think this is the key issue. NIR has never claimed to be a high-level
IR; it's only claimed to be high enough level to be effectively
cross-vendor.

I should also say that I'm not staunchly opposed to a bit more
descriptive type information on SSA values in principle. As you pointed
out, we basically did that for booleans, and I've very seriously
considered it for pointers and bindless handles. I just don't think
that the need to insert bitcasts or "because LLVM did it" are good
reasons. When you look at the shaders we get out of a lot of D3D
translators, they're full of bitcasts everywhere, including storing
integers and floats in the same vec4; the fact that NIR chews them up
and just doesn't care is a feature. :-)

--Jason

> > --Jason
> >
> >> > There are two general possible solutions:
> >> >
> >> > 1. Convert all integers to floats in glsl_to_nir and prog_to_nir
> >> > and adjust various lowering/optimization passes to handle
> >> > nir_compiler_options::supports_native_integers == false.
> >> >
> >> > 2. Allow integers all the way through until you get very close to
> >> > the end and then lower integers to floats at the last possible
> >> > moment.
> >> >
> >> > Both of these come with significant issues. With the first
> >> > approach, there are potentially a lot of passes that will need to
> >> > be adjusted, and it's not 100% clear what to do with UBO offsets
> >> > and indirect addressing; fortunately, you should be able to
> >> > disable most of those optimizations to get going, so it shouldn't
> >> > be too bad.
> >> > The second would be less invasive to NIR because it doesn't
> >> > require modifying as many passes. However, doing such a lowering
> >> > would be very tricky to get right, primarily because of constants.
> >> > With everything else, you can just sort of assume that inputs are
> >> > floats (you lowered, right?) and lower to produce a float output.
> >> > With constants, however, you don't know whether or not it's an
> >> > integer that needs lowering. We could, in theory, add an extra bit
> >> > to load_const to solve this problem, but there are also
> >> > significant problems with doing that, so it's not clear it's a
> >> > good idea.
> >> >
> >> > I think the patches from Marek (as indicated by Ilia) attempt the
> >> > first approach. If we can do it practically, my suspicion is that
> >> > the first will work better than the second. However, it will take
> >> > some experimentation in order to actually determine that.
> >> >
> >> > --Jason
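The constant problem described above can be made concrete with the bit patterns from Qiang's NIR dump at the top of the thread. The following is an illustrative Python sketch of why a late int-to-float lowering pass can't decide what to do with a load_const (not Mesa code; helper names are made up):

```python
import struct

def bits_as_float(bits):
    """Reinterpret a 32-bit pattern as an IEEE-754 float, which is all a
    float-only GPU can do with the contents of a register."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def float_as_bits(f):
    """Inverse: the raw 32-bit pattern of a float."""
    return struct.unpack("<I", struct.pack("<f", f))[0]

# ssa_0 from the dump: the integer 1 from "color > 1" is stored as the
# bit pattern 0x00000001. Read as a float, that's a denormal near zero,
# not 1.0, so "slt ssa_0, ssa_4" compares garbage on float hardware:
int_one_bits = 0x00000001
assert bits_as_float(int_one_bits) != 1.0
assert bits_as_float(int_one_bits) < 1e-40

# For slt to work, the lowering would have to rewrite the constant to
# float(1), i.e. the bit pattern 0x3f800000:
assert float_as_bits(1.0) == 0x3f800000

# The catch: nothing in the bits distinguishes "int 1065353216" from
# "float 1.0" -- both are 0x3f800000. That ambiguity is exactly why an
# extra "this is an integer" bit on load_const was suggested, and why
# the lowering is tricky to get right.
assert bits_as_float(0x3f800000) == 1.0
```

Non-constant values dodge this because the pass can assume its inputs were already lowered to floats; only constants arrive with no usable context.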
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev