On 01/23/2013 03:59 AM, Vincent Lejeune wrote:


----- Mail original -----
De : Vadim Girlin <vadimgir...@gmail.com>
À : Christoph Bumiller <e0425...@student.tuwien.ac.at>
Cc : mesa-dev@lists.freedesktop.org
Envoyé le : Mercredi 23 janvier 2013 0h44
Objet : Re: [Mesa-dev] [PATCH] glsl_to_tgsi: indirect array information

On 01/22/2013 10:59 PM, Christoph Bumiller wrote:
  On 21.01.2013 21:10, Vadim Girlin wrote:
  Provide the information about indirectly addressable arrays (ranges of
temps) in
  the shader to the drivers. TGSI representation itself isn't
modified, array
  information is passed as an additional data in the pipe_shader_state,
so the
  drivers can use it as a hint for optimization.
  ---

  It's far from being an ideal solution, but I saw the discussions
about that
  problem starting from 2009 IIRC, and we still have no solution (neither
good
  nor bad) despite the years passed. I hope we can use this not very
intrusive
  approach until we get something better.


  I'd rather not have any hacks in the interface, let alone ones that
  solve the problem only partially (you still won't know which array is
  accessed by a particular instruction, which is important for
  optimization and essential in some cases for making INPUT/OUTPUT arrays
  work), and not just because it reduces the pressure on people to
  implement a proper solution.

  With this, you just get to know which range of TEMPs are indirectly
  addressed and which ones are not, and you can do the same by simply
  creating multiple declarations of TEMPs, one for each array, and adding
  a single bit of info to tgsi_declaration (which has 7 bits of padding
  anyway, so ample space), which is a lot less ugly, and doesn't suffer
  from an arbitrary limit, and doesn't require any modification of
drivers
  either.


Array accessed by any indirect operand can be identified by the
immediate offset, e.g. TEMP[ADDR[0].x+1] implies the array starting from
1, thus we can find it's entry in the information provided by this patch
to get the addressable range for every indirect operand. If I'm not
missing something, glsl_to_tgsi accumulates all other parts of the
offset in the address register before the indirect access. If I'm wrong,
we can fix it to ensure such behavior.

I'm not sure about that ; when I worked on indirect addressing of const memory,
I discovered when tracking vp/fo regression that the immediate offset is the 
result of
  glsl_to_tgsi constant propagation and not related to the underlying array.
This means that the dynamic index can be negative, which is not always
desirable depending on the hw. (In R600 case, const fetch instruction does not
support negative index. MOVA inst does).

For instance, the following pseudo code snippet is fine for an index value of 
-4 :

uniform int index;

float array[4];
float data = array[6 + index];

and is lowered to
MOV TEMP[0] TEMP[ADDR[0].x + 6];


I tried the following shader:

uniform int index;

void main()
{
    float array[4] = float[4](0.1, 0.2, 0.3, 0.4);
    float data = array[6 + index];
    gl_FragColor = vec4(data, 1.0, 0.0, 1.0);
}

Resulting TGSI:

--------------------------------------------------------------
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL OUT[0], COLOR
DCL CONST[0]
DCL TEMP[0], LOCAL
DCL TEMP[1], LOCAL
DCL TEMP[2], LOCAL
DCL TEMP[3], LOCAL
DCL TEMP[4], LOCAL
DCL TEMP[5], LOCAL
DCL TEMP[6], LOCAL
DCL TEMP[7], LOCAL
DCL ADDR[0]
IMM[0] FLT32 {    0.1000,     0.2000,     0.3000,     0.4000}
IMM[1] FLT32 {    1.0000,     0.0000,     0.0000,     0.0000}
IMM[2] INT32 {6, 0, 0, 0}
  0: MOV TEMP[1].yzw, IMM[1].yxyx
  1: MOV TEMP[2], IMM[0].xxxx
  2: MOV TEMP[3], IMM[0].yyyy
  3: MOV TEMP[4], IMM[0].zzzz
  4: MOV TEMP[5], IMM[0].wwww
  5: UADD TEMP[6].x, IMM[2].xxxx, CONST[0].xxxx
  6: UARL ADDR[0].x, TEMP[6].xxxx
  7: MOV TEMP[1].x, TEMP[ADDR[0].x+2].xxxx
  8: MOV_SAT OUT[0], TEMP[1]
  9: END
--------------------------------------------------------------

Also I tried the following:

uniform float array[4];
uniform int index;

void main()
{
    float data = array[6 + index];
    gl_FragColor = vec4(data, 1.0, 0.0, 1.0);
}

Resulting TGSI:

--------------------------------------------------------------
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL OUT[0], COLOR
DCL CONST[0..4]
DCL TEMP[0], LOCAL
DCL TEMP[1], LOCAL
DCL ADDR[0]
IMM[0] FLT32 {    1.0000,     0.0000,     0.0000,     0.0000}
IMM[1] INT32 {6, 0, 0, 0}
  0: MOV TEMP[0].yzw, IMM[0].yxyx
  1: UADD TEMP[1].x, IMM[1].xxxx, CONST[0].xxxx
  2: UARL ADDR[0].x, TEMP[1].xxxx
  3: MOV TEMP[0].x, CONST[ADDR[0].x+1].xxxx
  4: MOV_SAT OUT[0], TEMP[0]
  5: END
--------------------------------------------------------------

So far immediate offset in the indirect operand is always equal to the start offset of the array. Could you provide some more complete example that demonstrates the problem, please.

Vadim

I didn't test your patch atm, but I think you may have to fix glsl_to_tgsi.
Otherwise I'm in favor of implementing something not optimal but far
better that what we have currently.

Vincent


I'll be perfectly OK with any other solution, as long as it's a really
working (already implemented) solution that I can use today, not just
some abstract ideas in the discussions. This patch isn't perfect and can
be improved, but it already works for me. I'll be very happy to use any
other solution from you or anyone else.

Vadim

    src/gallium/auxiliary/tgsi/tgsi_ureg.c     |  2 ++
    src/gallium/include/pipe/p_state.h         | 13 +++++++
    src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 57
+++++++++++++++++++++++++-----
    src/mesa/state_tracker/st_program.c        |  4 +++
    src/mesa/state_tracker/st_program.h        |  1 +
    5 files changed, 69 insertions(+), 8 deletions(-)

  diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
  index 3c2a923..61db431 100644
  --- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
  +++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
  @@ -1658,6 +1658,8 @@ void *ureg_create_shader( struct ureg_program
*ureg,
       else
          memset(&state.stream_output, 0,
sizeof(state.stream_output));

  +   memset(&state.array_info, 0, sizeof(state.array_info));
  +
       if (ureg->processor == TGSI_PROCESSOR_VERTEX)
          return pipe->create_vs_state( pipe, &state );
       else
  diff --git a/src/gallium/include/pipe/p_state.h
b/src/gallium/include/pipe/p_state.h
  index ab49cab..4490f2e 100644
  --- a/src/gallium/include/pipe/p_state.h
  +++ b/src/gallium/include/pipe/p_state.h
  @@ -65,6 +65,8 @@ extern "C" {
    #define PIPE_MAX_TEXTURE_LEVELS   16
    #define PIPE_MAX_SO_BUFFERS        4

  +#define PIPE_MAX_INDIRECT_ARRAYS  16
  +

    struct pipe_reference
    {
  @@ -205,11 +207,22 @@ struct pipe_stream_output_info
       } output[PIPE_MAX_SHADER_OUTPUTS];
    };

  +struct pipe_shader_indirect_array {
  +   unsigned index:16;
  +   unsigned size:16;
  +};
  +
  +struct pipe_shader_array_info
  +{
  +   struct pipe_shader_indirect_array arrays[PIPE_MAX_INDIRECT_ARRAYS];
  +   unsigned num_arrays;
  +};

    struct pipe_shader_state
    {
       const struct tgsi_token *tokens;
       struct pipe_stream_output_info stream_output;
  +   struct pipe_shader_array_info array_info;
    };


  diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
  index 1d96e90..4ded1be 100644
  --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
  +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
  @@ -110,6 +110,7 @@ public:
          this->index2D = 0;
          this->type = type ? type->base_type : GLSL_TYPE_ERROR;
          this->reladdr = NULL;
  +      this->size = 0;
       }

       st_src_reg(gl_register_file file, int index, int type)
  @@ -121,6 +122,7 @@ public:
          this->swizzle = SWIZZLE_XYZW;
          this->negate = 0;
          this->reladdr = NULL;
  +      this->size = 0;
       }

       st_src_reg(gl_register_file file, int index, int type, int
index2D)
  @@ -132,6 +134,7 @@ public:
          this->swizzle = SWIZZLE_XYZW;
          this->negate = 0;
          this->reladdr = NULL;
  +      this->size = 0;
       }

       st_src_reg()
  @@ -143,6 +146,7 @@ public:
          this->swizzle = 0;
          this->negate = 0;
          this->reladdr = NULL;
  +      this->size = 0;
       }

       explicit st_src_reg(st_dst_reg reg);
  @@ -155,6 +159,7 @@ public:
       int type; /** GLSL_TYPE_* from GLSL IR (enum glsl_base_type) */
       /** Register index should be offset by the integer in this reg. */
       st_src_reg *reladdr;
  +   int size;
    };

    class st_dst_reg {
  @@ -244,8 +249,9 @@ public:

    class variable_storage : public exec_node {
    public:
  -   variable_storage(ir_variable *var, gl_register_file file, int
index)
  -      : file(file), index(index), var(var)
  +   variable_storage(ir_variable *var, gl_register_file file, int
index,
  +                    int size = 0)
  +      : file(file), index(index), var(var), size(size)
       {
          /* empty */
       }
  @@ -253,6 +259,7 @@ public:
       gl_register_file file;
       int index;
       ir_variable *var; /* variable that maps to this, if any */
  +   int size;
    };

    class immediate_storage : public exec_node {
  @@ -312,6 +319,7 @@ public:
       struct gl_program *prog;
       struct gl_shader_program *shader_program;
       struct gl_shader_compiler_options *options;
  +   struct pipe_shader_array_info array_info;

       int next_temp;

  @@ -445,6 +453,8 @@ public:
       void emit_block_mov(ir_assignment *ir, const struct glsl_type
*type,
                           st_dst_reg *l, st_src_reg *r);

  +   void add_array_info(const st_src_reg *src);
  +
       void *mem_ctx;
    };

  @@ -1004,7 +1014,8 @@ glsl_to_tgsi_visitor::get_temp(const glsl_type
*type)
       src.file = PROGRAM_TEMPORARY;
       src.index = next_temp;
       src.reladdr = NULL;
  -   next_temp += type_size(type);
  +   src.size = type_size(type);
  +   next_temp += src.size;

       if (type->is_array() || type->is_record()) {
          src.swizzle = SWIZZLE_NOOP;
  @@ -1075,9 +1086,10 @@ glsl_to_tgsi_visitor::visit(ir_variable *ir)
             assert((int) ir->num_state_slots ==
type_size(ir->type));

             storage = new(mem_ctx) variable_storage(ir,
PROGRAM_TEMPORARY,
  -                             this->next_temp);
  +                                                 this->next_temp,
  +
type_size(ir->type));
             this->variables.push_tail(storage);
  -         this->next_temp += type_size(ir->type);
  +         this->next_temp += storage->size;

             dst = st_dst_reg(st_src_reg(PROGRAM_TEMPORARY,
storage->index,
                   native_integers ? ir->type->base_type :
GLSL_TYPE_FLOAT));
  @@ -2029,10 +2041,10 @@
glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
          case ir_var_auto:
          case ir_var_temporary:
             entry = new(mem_ctx) variable_storage(var,
PROGRAM_TEMPORARY,
  -                               this->next_temp);
  +                                               this->next_temp,
  +
type_size(var->type));
             this->variables.push_tail(entry);
  -
  -         next_temp += type_size(var->type);
  +         next_temp += entry->size;
             break;
          }

  @@ -2043,6 +2055,8 @@
glsl_to_tgsi_visitor::visit(ir_dereference_variable *ir)
       }

       this->result = st_src_reg(entry->file, entry->index,
var->type);
  +   this->result.size = entry->size;
  +
       if (!native_integers)
          this->result.type = GLSL_TYPE_FLOAT;
    }
  @@ -2062,12 +2076,16 @@
glsl_to_tgsi_visitor::visit(ir_dereference_array *ir)
       if (index) {
          src.index += index->value.i[0] * element_size;
       } else {
  +
          /* Variable index array dereference.  It eats the
"vec4" of the
           * base of the array and an index that offsets the TGSI
register
           * index.
           */
          ir->array_index->accept(this);

  +      if (src.size > 1 && src.file == PROGRAM_TEMPORARY)
  +         add_array_info(&src);
  +
          st_src_reg index_reg;

          if (element_size == 1) {
  @@ -2969,6 +2987,7 @@ glsl_to_tgsi_visitor::glsl_to_tgsi_visitor()
       prog = NULL;
       shader_program = NULL;
       options = NULL;
  +   array_info.num_arrays = 0;
    }

    glsl_to_tgsi_visitor::~glsl_to_tgsi_visitor()
  @@ -3771,6 +3790,23 @@ glsl_to_tgsi_visitor::renumber_registers(void)
       this->next_temp = new_index;
    }

  +void
  +glsl_to_tgsi_visitor::add_array_info(const st_src_reg *src)
  +{
  +   unsigned i;
  +
  +   if (array_info.num_arrays == PIPE_MAX_INDIRECT_ARRAYS)
  +      return;
  +
  +   for (i = 0; i < array_info.num_arrays; ++i) {
  +      if (array_info.arrays[i].index == (unsigned)src->index)
  +         return;
  +   }
  +   array_info.arrays[i].index = src->index;
  +   array_info.arrays[i].size = src->size;
  +   ++array_info.num_arrays;
  +}
  +
    /**
     * Returns a fragment program which implements the current pixel
transfer ops.
     * Based on get_pixel_transfer_program in st_atom_pixeltransfer.c.
  @@ -4798,6 +4834,11 @@ st_translate_program(
       }

       if (program->indirect_addr_temps) {
  +      struct pipe_shader_array_info *ai =
  +         &((struct st_fragment_program*) proginfo)->array_info;
  +
  +      memcpy(ai, &program->array_info, sizeof(struct
pipe_shader_array_info));
  +
          /* If temps are accessed with indirect addressing, declare
temporaries
           * in sequential order.  Else, we declare them on demand
elsewhere.
           * (Note: the number of temporaries is equal to
program->next_temp)
  diff --git a/src/mesa/state_tracker/st_program.c
b/src/mesa/state_tracker/st_program.c
  index a9111b5..1370c76 100644
  --- a/src/mesa/state_tracker/st_program.c
  +++ b/src/mesa/state_tracker/st_program.c
  @@ -739,6 +739,10 @@ st_translate_fragment_program(struct st_context
*st,
                                    fs_output_semantic_index, FALSE,
                                    key->clamp_color);

  +   if (stfp->glsl_to_tgsi)
  +      memcpy(&variant->tgsi.array_info,
&stfp->array_info,
  +             sizeof(struct pipe_shader_array_info));
  +
       variant->tgsi.tokens = ureg_get_tokens( ureg, NULL );
       ureg_destroy( ureg );

  diff --git a/src/mesa/state_tracker/st_program.h
b/src/mesa/state_tracker/st_program.h
  index 23a262c..d6b2b0f 100644
  --- a/src/mesa/state_tracker/st_program.h
  +++ b/src/mesa/state_tracker/st_program.h
  @@ -90,6 +90,7 @@ struct st_fragment_program
    {
       struct gl_fragment_program Base;
       struct glsl_to_tgsi_visitor* glsl_to_tgsi;
  +   struct pipe_shader_array_info array_info;

       struct st_fp_variant *variants;
    };



_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to