Am 30.11.2016 um 20:19 schrieb Nicolai Hähnle: > On 30.11.2016 19:06, Roland Scheidegger wrote: >> Am 30.11.2016 um 14:35 schrieb Nicolai Hähnle: >>> From: Nicolai Hähnle <nicolai.haeh...@amd.com> >>> >>> This is for geometry shader outputs. Without it, drivers have no way of >>> knowing which stream each output is intended for, and have to >>> conservatively write all outputs to all streams. >>> >>> Separate stream numbers for each component are required due to output >>> packing. >> Are you sure this is true? >> This is an area I don't know much about, but >> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.opengl.org_wiki_Layout-5FQualifier-5F-28GLSL-29&d=DgIDaQ&c=uilaK90D4TOVoH58JNXRgQ&r=_QIjpv-UJ77xEQY8fIYoQtr5qv8wKrPJc7v7_-CYAb0&m=fVpTGTYN2KTEhU17RpFTxEULrsIfC3bdpEin0k8NIYE&s=uamnHj-9Xr12ctr0gHDfCMIMHq8DyUBtKIwHQQpjDLs&e= >> >> tells me "Stream >> assignments for a geometry shader are required to be the same for all >> members of a block, but offsets are not." >> >> Therefore I don't think output packing should ever happen across >> multiple streams. I think it would be MUCH nicer if the semantic needed >> just one stream member... > > There are two variants of that question, I guess. > > The answer to the first variant is: Yes, this is currently true. > lower_packed_varyings will happily pack outputs from different vertex > streams into the same vec4. This affects quite a lot of programs, e.g. > you see it in piglit arb_gpu_shader5-xfb-streams. > > The second question is: Do we want it to be true? I agree that it would > be convenient to be able to use a single Stream member. Also, isolating > the stream0 components from the rest would lead to slightly more > efficient shaders for us in some cases. > > I opted against it so far because I didn't want to think through the > implications of changing lower_packed_varyings. The main question I have > is: if you account for the size of the GS output in # of components, > then it could happen that the number of output vec4s ends up being > larger than (max # of output components) / 4. Will that be a problem > somewhere?
I don't know if that would be a problem, but if it is I'd assume this would be fixable (since the number of actual components ultimately doesn't change). Having outputs belonging to multiple streams in a single output just seems weird... That said, I wonder if it actually would be possible to do that with d3d11 too. With shader model 5 you'd have: dcl_stream 0 dcl_output o0.xy dcl_stream 1 dcl_output o0.zw // legal or not??? Though the shader model 4/5 rules are a bit weird for packing inputs/outputs, I'm not even sure two dcl_output are legal for the same reg without a dcl_stream in between them (but you can pack system values together with ordinary inputs/outputs). So maybe just allowing this is the right solution... Roland > > Nicolai > >> >> Roland >> >> >> >>> --- >>> src/compiler/glsl/ir_print_visitor.cpp | 4 +-- >>> src/gallium/auxiliary/tgsi/tgsi_build.c | 18 +++++++++-- >>> src/gallium/auxiliary/tgsi/tgsi_dump.c | 13 ++++++++ >>> src/gallium/auxiliary/tgsi/tgsi_text.c | 48 >>> ++++++++++++++++++++++++++++++ >>> src/gallium/include/pipe/p_shader_tokens.h | 5 +++- >>> 5 files changed, 83 insertions(+), 5 deletions(-) >>> >>> diff --git a/src/compiler/glsl/ir_print_visitor.cpp >>> b/src/compiler/glsl/ir_print_visitor.cpp >>> index 2b77c14..d401426 100644 >>> --- a/src/compiler/glsl/ir_print_visitor.cpp >>> +++ b/src/compiler/glsl/ir_print_visitor.cpp >>> @@ -173,26 +173,26 @@ void ir_print_visitor::visit(ir_variable *ir) >>> if (ir->data.location != -1) >>> snprintf(loc, sizeof(loc), "location=%i ", ir->data.location); >>> >>> char component[32] = {0}; >>> if (ir->data.explicit_component) >>> snprintf(component, sizeof(component), "component=%i ", >>> ir->data.location_frac); >>> >>> char stream[32] = {0}; >>> if (ir->data.stream & (1u << 31)) { >>> if (ir->data.stream & ~(1u << 31)) { >>> - snprintf(stream, sizeof(stream), "stream(%u,%u,%u,%u)", >>> + snprintf(stream, sizeof(stream), "stream(%u,%u,%u,%u) ", >>> ir->data.stream & 3, (ir->data.stream >> 2) & 3, >>> (ir->data.stream >> 4) & 3, (ir->data.stream >> 6) >>> & 3); >>> } >>> } else if (ir->data.stream) { >>> - snprintf(stream, sizeof(stream), "stream%u", ir->data.stream); >>> + snprintf(stream, sizeof(stream), "stream%u ", ir->data.stream); >>> } >>> >>> const char *const cent = (ir->data.centroid) ? "centroid " : ""; >>> const char *const samp = (ir->data.sample) ? "sample " : ""; >>> const char *const patc = (ir->data.patch) ? "patch " : ""; >>> const char *const inv = (ir->data.invariant) ? "invariant " : ""; >>> const char *const prec = (ir->data.precise) ? "precise " : ""; >>> const char *const mode[] = { "", "uniform ", "shader_storage ", >>> "shader_shared ", "shader_in ", >>> "shader_out ", >>> "in ", "out ", "inout ", >>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_build.c >>> b/src/gallium/auxiliary/tgsi/tgsi_build.c >>> index d525c8f..773f892 100644 >>> --- a/src/gallium/auxiliary/tgsi/tgsi_build.c >>> +++ b/src/gallium/auxiliary/tgsi/tgsi_build.c >>> @@ -232,40 +232,50 @@ tgsi_build_declaration_interp(unsigned >>> interpolate, >>> return di; >>> } >>> >>> static struct tgsi_declaration_semantic >>> tgsi_default_declaration_semantic( void ) >>> { >>> struct tgsi_declaration_semantic ds; >>> >>> ds.Name = TGSI_SEMANTIC_POSITION; >>> ds.Index = 0; >>> - ds.Padding = 0; >>> + ds.StreamX = 0; >>> + ds.StreamY = 0; >>> + ds.StreamZ = 0; >>> + ds.StreamW = 0; >>> >>> return ds; >>> } >>> >>> static struct tgsi_declaration_semantic >>> tgsi_build_declaration_semantic( >>> unsigned semantic_name, >>> unsigned semantic_index, >>> + unsigned streamx, >>> + unsigned streamy, >>> + unsigned streamz, >>> + unsigned streamw, >>> struct tgsi_declaration *declaration, >>> struct tgsi_header *header ) >>> { >>> struct tgsi_declaration_semantic ds; >>> >>> assert( semantic_name <= TGSI_SEMANTIC_COUNT ); >>> assert( semantic_index <= 0xFFFF ); >>> >>> ds.Name = semantic_name; >>> ds.Index = semantic_index; >>> - ds.Padding = 0; >>> + ds.StreamX = streamx; >>> + ds.StreamY = streamy; >>> + ds.StreamZ = streamz; >>> + ds.StreamW = streamw; >>> >>> declaration_grow( declaration, header ); >>> >>> return ds; >>> } >>> >>> static struct tgsi_declaration_image >>> tgsi_default_declaration_image(void) >>> { >>> struct tgsi_declaration_image di; >>> @@ -454,20 +464,24 @@ tgsi_build_full_declaration( >>> struct tgsi_declaration_semantic *ds; >>> >>> if( maxsize <= size ) >>> return 0; >>> ds = (struct tgsi_declaration_semantic *) &tokens[size]; >>> size++; >>> >>> *ds = tgsi_build_declaration_semantic( >>> full_decl->Semantic.Name, >>> full_decl->Semantic.Index, >>> + full_decl->Semantic.StreamX, >>> + full_decl->Semantic.StreamY, >>> + full_decl->Semantic.StreamZ, >>> + full_decl->Semantic.StreamW, >>> declaration, >>> header ); >>> } >>> >>> if (full_decl->Declaration.File == TGSI_FILE_IMAGE) { >>> struct tgsi_declaration_image *di; >>> >>> if (maxsize <= size) { >>> return 0; >>> } >>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_dump.c >>> b/src/gallium/auxiliary/tgsi/tgsi_dump.c >>> index 614bcb2..f74aad1 100644 >>> --- a/src/gallium/auxiliary/tgsi/tgsi_dump.c >>> +++ b/src/gallium/auxiliary/tgsi/tgsi_dump.c >>> @@ -353,20 +353,33 @@ iter_declaration( >>> if (decl->Declaration.Semantic) { >>> TXT( ", " ); >>> ENM( decl->Semantic.Name, tgsi_semantic_names ); >>> if (decl->Semantic.Index != 0 || >>> decl->Semantic.Name == TGSI_SEMANTIC_TEXCOORD || >>> decl->Semantic.Name == TGSI_SEMANTIC_GENERIC) { >>> CHR( '[' ); >>> UID( decl->Semantic.Index ); >>> CHR( ']' ); >>> } >>> + >>> + if (decl->Semantic.StreamX != 0 || decl->Semantic.StreamY != 0 || >>> + decl->Semantic.StreamZ != 0 || decl->Semantic.StreamW != 0) { >>> + TXT(", STREAM("); >>> + UID(decl->Semantic.StreamX); >>> + TXT(", "); >>> + UID(decl->Semantic.StreamY); >>> + TXT(", "); >>> + UID(decl->Semantic.StreamZ); >>> + TXT(", "); >>> + UID(decl->Semantic.StreamW); >>> + CHR(')'); >>> + } >>> } >>> >>> if (decl->Declaration.File == TGSI_FILE_IMAGE) { >>> TXT(", "); >>> ENM(decl->Image.Resource, tgsi_texture_names); >>> TXT(", "); >>> TXT(util_format_name(decl->Image.Format)); >>> if (decl->Image.Writable) >>> TXT(", WR"); >>> if (decl->Image.Raw) >>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c >>> b/src/gallium/auxiliary/tgsi/tgsi_text.c >>> index be80842..1b4f594 100644 >>> --- a/src/gallium/auxiliary/tgsi/tgsi_text.c >>> +++ b/src/gallium/auxiliary/tgsi/tgsi_text.c >>> @@ -1539,20 +1539,68 @@ static boolean parse_declaration( struct >>> translate_ctx *ctx ) >>> ctx->cur = cur; >>> break; >>> } >>> } >>> } >>> } >>> } >>> >>> cur = ctx->cur; >>> eat_opt_white( &cur ); >>> + if (*cur == ',' && >>> + file == TGSI_FILE_OUTPUT && ctx->processor == >>> PIPE_SHADER_GEOMETRY) { >>> + cur++; >>> + eat_opt_white(&cur); >>> + if (str_match_nocase_whole(&cur, "STREAM")) { >>> + uint stream[4]; >>> + >>> + eat_opt_white(&cur); >>> + if (*cur != '(') { >>> + report_error(ctx, "Expected '('"); >>> + return FALSE; >>> + } >>> + cur++; >>> + >>> + for (int i = 0; i < 4; ++i) { >>> + eat_opt_white(&cur); >>> + if (!parse_uint(&cur, &stream[i])) { >>> + report_error(ctx, "Expected literal integer"); >>> + return FALSE; >>> + } >>> + >>> + eat_opt_white(&cur); >>> + if (i < 3) { >>> + if (*cur != ',') { >>> + report_error(ctx, "Expected ','"); >>> + return FALSE; >>> + } >>> + cur++; >>> + } >>> + } >>> + >>> + if (*cur != ')') { >>> + report_error(ctx, "Expected ')'"); >>> + return FALSE; >>> + } >>> + cur++; >>> + >>> + decl.Semantic.StreamX = stream[0]; >>> + decl.Semantic.StreamY = stream[1]; >>> + decl.Semantic.StreamZ = stream[2]; >>> + decl.Semantic.StreamW = stream[3]; >>> + >>> + ctx->cur = cur; >>> + } >>> + } >>> + >>> + cur = ctx->cur; >>> + eat_opt_white( &cur ); >>> if (*cur == ',' && !is_vs_input) { >>> uint i; >>> >>> cur++; >>> eat_opt_white( &cur ); >>> for (i = 0; i < TGSI_INTERPOLATE_COUNT; i++) { >>> if (str_match_nocase_whole( &cur, tgsi_interpolate_names[i] >>> )) { >>> decl.Declaration.Interpolate = 1; >>> decl.Interp.Interpolate = i; >>> >>> diff --git a/src/gallium/include/pipe/p_shader_tokens.h >>> b/src/gallium/include/pipe/p_shader_tokens.h >>> index 4a259db..ee59df0 100644 >>> --- a/src/gallium/include/pipe/p_shader_tokens.h >>> +++ b/src/gallium/include/pipe/p_shader_tokens.h >>> @@ -200,21 +200,24 @@ enum tgsi_semantic { >>> TGSI_SEMANTIC_BASEINSTANCE, >>> TGSI_SEMANTIC_DRAWID, >>> TGSI_SEMANTIC_WORK_DIM, /**< opencl get_work_dim value */ >>> TGSI_SEMANTIC_COUNT, /**< number of semantic values */ >>> }; >>> >>> struct tgsi_declaration_semantic >>> { >>> unsigned Name : 8; /**< one of TGSI_SEMANTIC_x */ >>> unsigned Index : 16; /**< UINT */ >>> - unsigned Padding : 8; >>> + unsigned StreamX : 2; /**< vertex stream (for GS output) */ >>> + unsigned StreamY : 2; >>> + unsigned StreamZ : 2; >>> + unsigned StreamW : 2; >>> }; >>> >>> struct tgsi_declaration_image { >>> unsigned Resource : 8; /**< one of TGSI_TEXTURE_ */ >>> unsigned Raw : 1; >>> unsigned Writable : 1; >>> unsigned Format : 10; /**< one of PIPE_FORMAT_ */ >>> unsigned Padding : 12; >>> }; >>> >>> >> _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev