On Thu, May 15, 2014 at 12:57:42AM -0700, Kenneth Graunke wrote: > On 05/12/2014 08:42 AM, Topi Pohjolainen wrote: > > v2: Create the intel renderbuffer with level hardcoded to zero instead > > of overriding it in the surface state configuration. Also moved the > > dimension adjustments for tiling, mip level, msaa into the render > > buffer creation. Finally prepares for another blit path needed for > > miptree updownsampling. > > > > Signed-off-by: Topi Pohjolainen <topi.pohjolai...@intel.com> > > --- > > src/mesa/drivers/dri/i965/Makefile.sources | 1 + > > src/mesa/drivers/dri/i965/brw_context.h | 6 + > > src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c | 492 > > ++++++++++++++++++++++ > > 3 files changed, 499 insertions(+) > > create mode 100644 src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c > > > > diff --git a/src/mesa/drivers/dri/i965/Makefile.sources > > b/src/mesa/drivers/dri/i965/Makefile.sources > > index 9379fd2..5fc90b5 100644 > > --- a/src/mesa/drivers/dri/i965/Makefile.sources > > +++ b/src/mesa/drivers/dri/i965/Makefile.sources > > @@ -76,6 +76,7 @@ i965_FILES = \ > > brw_lower_texture_gradients.cpp \ > > brw_lower_unnormalized_offset.cpp \ > > brw_meta_updownsample.c \ > > + brw_meta_stencil_blit.c \ > > brw_meta_util.c \ > > brw_misc_state.c \ > > brw_object_purgeable.c \ > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h > > b/src/mesa/drivers/dri/i965/brw_context.h > > index 6d4a5b5..d6b7f16 100644 > > --- a/src/mesa/drivers/dri/i965/brw_context.h > > +++ b/src/mesa/drivers/dri/i965/brw_context.h > > @@ -1482,6 +1482,12 @@ GLuint brw_get_rb_for_slice(struct brw_context *brw, > > void brw_meta_updownsample(struct brw_context *brw, > > struct intel_mipmap_tree *src, > > struct intel_mipmap_tree *dst); > > + > > +void brw_meta_fbo_stencil_blit(struct brw_context *brw, > > + GLfloat srcX0, GLfloat srcY0, > > + GLfloat srcX1, GLfloat srcY1, > > + GLfloat dstX0, GLfloat dstY0, > > + GLfloat dstX1, GLfloat dstY1); > > /*====================================================================== > > * brw_misc_state.c > > */ > > diff --git a/src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c > > b/src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c > > new file mode 100644 > > index 0000000..c38987e > > --- /dev/null > > +++ b/src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c > > @@ -0,0 +1,492 @@ > > +/* > > + * Copyright © 2014 Intel Corporation > > + * > > + * Permission is hereby granted, free of charge, to any person obtaining a > > + * copy of this software and associated documentation files (the > > "Software"), > > + * to deal in the Software without restriction, including without > > limitation > > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > > + * and/or sell copies of the Software, and to permit persons to whom the > > + * Software is furnished to do so, subject to the following conditions: > > + * > > + * The above copyright notice and this permission notice (including the > > next > > + * paragraph) shall be included in all copies or substantial portions of > > the > > + * Software. > > + * > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS > > OR > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR > > OTHER > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > > DEALINGS > > + * IN THE SOFTWARE. > > + */ > > + > > +/** > > + * @file brw_meta_stencil_blit.c > > + * > > + * Implements upsampling, downsampling and scaling of stencil miptrees. The > > + * logic can be originally found in brw_blorp_blit.c. > > + * Implementation creates a temporary draw framebuffer object and attaches > > the > > + * destination stencil buffer attachment as color attachment. Source > > attachment > > + * is in turn treated as a stencil texture and the glsl program used for > > the > > + * blitting samples it using stencil-indexing. > > + * > > + * Unfortunately as the data port does not support interleaved > > msaa-surfaces > > + * (stencil is always IMS), the glsl program needs to handle the writing of > > + * individual samples manually. Surface is configured as if it were single > > + * sampled (with adjusted dimensions) and the glsl program extracts the > > + * sample indices from the input coordinates for correct texturing. > > + * > > + * Target surface is also configured as Y-tiled instead of W-tiled in order > > + * to support generations 6-7. Later hardware supports W-tiled as render > > target > > + * and the logic here could be simplified for those. > > + */ > > + > > +#include "brw_context.h" > > +#include "intel_batchbuffer.h" > > +#include "intel_fbo.h" > > + > > +#include "main/blit.h" > > +#include "main/buffers.h" > > +#include "main/fbobject.h" > > +#include "main/uniforms.h" > > +#include "main/texparam.h" > > +#include "main/texobj.h" > > +#include "main/viewport.h" > > +#include "main/enable.h" > > +#include "main/blend.h" > > +#include "main/varray.h" > > +#include "main/shaderapi.h" > > +#include "glsl/ralloc.h" > > + > > +#include "drivers/common/meta.h" > > +#include "brw_meta_util.h" > > + > > +#define FILE_DEBUG_FLAG DEBUG_FBO > > + > > +struct blit_dims { > > + int src_x0, src_y0, src_x1, src_y1; > > + int dst_x0, dst_y0, dst_x1, dst_y1; > > + bool mirror_x, mirror_y; > > +}; > > + > > +static const char *vs_source = > > + "#version 130\n" > > + "in vec2 position;\n" > > + "out vec2 tex_coords;\n" > > + "void main()\n" > > + "{\n" > > + " tex_coords = (position + 1.0) / 2.0;\n" > > + " gl_Position = vec4(position, 0.0, 1.0);\n" > > + "}\n"; > > + > > +static const struct sampler_and_fetch { > > + const char *sampler; > > + const char *fetch; > > +} samplers[] = { > > + { "uniform usampler2D texSampler;\n", > > + " out_color = texelFetch(texSampler, txl_coords, 0)" }, > > + { "#extension GL_ARB_texture_multisample : enable\n" > > + "uniform usampler2DMS texSampler;\n", > > + " out_color = texelFetch(texSampler, txl_coords, sample_index)" } > > +}; > > + > > +/** > > + * Translating Y-tiled to W-tiled: > > + * > > + * X' = (X & ~0b1011) >> 1 | (Y & 0b1) << 2 | X & 0b1 > > + * Y' = (Y & ~0b1) << 1 | (X & 0b1000) >> 2 | (X & 0b10) >> 1 > > + */ > > +static const char *fs_tmpl = > > + "#version 130\n" > > + "%s" > > + "uniform float src_x_scale;\n" > > + "uniform float src_y_scale;\n" > > + "uniform float src_x_off;\n" /* Top right coordinates of the source */ > > + "uniform float src_y_off;\n" /* rectangle in W-tiled space. */ > > + "uniform float dst_x_off;\n" /* Top right coordinates of the target */ > > + "uniform float dst_y_off;\n" /* rectangle in Y-tiled space. */ > > + "uniform float draw_rect_w;\n" /* This is the unnormalized size of the > > */ > > + "uniform float draw_rect_h;\n" /* drawing rectangle in Y-tiled space. */ > > + "uniform int dst_x0;\n" /* This is the bounding rectangle in the > > W-tiled */ > > + "uniform int dst_x1;\n" /* space that will be used to skip pixels lying > > */ > > + "uniform int dst_y0;\n" /* outside. In some cases the Y-tiled rectangle > > */ > > + "uniform int dst_y1;\n" /* is larger. */ > > + "uniform int dst_num_samples;\n" > > + "in vec2 tex_coords;\n" > > + "ivec2 txl_coords;\n" > > + "int sample_index;\n" > > + "out uvec4 out_color;\n" > > + "\n" > > + "void get_unorm_target_coords()\n" > > + "{\n" > > + " txl_coords.x = int(tex_coords.x * draw_rect_w + dst_x_off);\n" > > + " txl_coords.y = int(tex_coords.y * draw_rect_h + dst_y_off);\n" > > + "}\n" > > + "\n" > > + "void translate_dst_to_src()\n" > > + "{\n" > > + " txl_coords.x = int(float(txl_coords.x) * src_x_scale + > > src_x_off);\n" > > + " txl_coords.y = int(float(txl_coords.y) * src_y_scale + > > src_y_off);\n" > > + "}\n" > > + "\n" > > + "void translate_y_to_w_tiling()\n" > > + "{\n" > > + " int X = txl_coords.x;\n" > > + " int Y = txl_coords.y;\n" > > + " txl_coords.x = (X & int(0xfff4)) >> 1;\n" > > + " txl_coords.x |= ((Y & int(0x1)) << 2);\n" > > + " txl_coords.x |= (X & int(0x1));\n" > > + " txl_coords.y = (Y & int(0xfffe)) << 1;\n" > > + " txl_coords.y |= ((X & int(0x8)) >> 2);\n" > > + " txl_coords.y |= ((X & int(0x2)) >> 1);\n" > > + "}\n" > > + "\n" > > + "void decode_msaa()\n" > > + "{\n" > > + " int X = txl_coords.x;\n" > > + " int Y = txl_coords.y;\n" > > + " switch (dst_num_samples) {\n" > > + " case 0:\n" > > + " sample_index = 0;\n" > > + " break;\n" > > + " case 2:\n" > > + " txl_coords.x = ((X & int(0xfffc)) >> 1) | (X & int(0x1));\n" > > + " sample_index = (X & 0x2) >> 1;\n" > > + " break;\n" > > + " case 4:\n" > > + " txl_coords.x = ((X & int(0xfffc)) >> 1) | (X & int(0x1));\n" > > + " txl_coords.y = ((Y & int(0xfffc)) >> 1) | (Y & int(0x1));\n" > > + " sample_index = (Y & 0x2) | ((X & 0x2) >> 1);\n" > > + " break;\n" > > + " case 8:\n" > > + " txl_coords.x = ((X & int(0xfff8)) >> 2) | (X & int(0x1));\n" > > + " txl_coords.y = ((Y & int(0xfffc)) >> 1) | (Y & int(0x1));\n" > > + " sample_index = (X & 0x4) | (Y & 0x2) | ((X & 0x2) >> 1);\n" > > + " }\n" > > + "}\n" > > + "\n" > > + "void discard_outside_bounding_rect()\n" > > + "{\n" > > + " int X = txl_coords.x;\n" > > + " int Y = txl_coords.y;\n" > > + " if (X >= dst_x1 || X < dst_x0 || Y >= dst_y1 || Y < dst_y0)\n" > > + " discard;\n" > > + "}\n" > > + "\n" > > + "void main()\n" > > + "{\n" > > + " get_unorm_target_coords();\n" > > + " translate_y_to_w_tiling();\n" > > + " decode_msaa();" > > + " discard_outside_bounding_rect();\n" > > + " translate_dst_to_src();\n" > > + " %s;\n" > > + "}\n"; > > + > > +/** > > + * Setup uniforms telling the coordinates of the destination rectangle in > > the > > + * native w-tiled space. These are needed to ignore pixels that lie > > outside. > > + * The destination is drawn as Y-tiled and in some cases the Y-tiled > > drawing > > + * rectangle is larger than the original (for example 1x4 w-tiled requires > > + * 16x2 y-tiled). > > + */ > > +static void > > +setup_bounding_rect(GLuint prog, const struct blit_dims *dims) > > +{ > > + _mesa_Uniform1i(_mesa_GetUniformLocation(prog, "dst_x0"), dims->dst_x0); > > + _mesa_Uniform1i(_mesa_GetUniformLocation(prog, "dst_x1"), dims->dst_x1); > > + _mesa_Uniform1i(_mesa_GetUniformLocation(prog, "dst_y0"), dims->dst_y0); > > + _mesa_Uniform1i(_mesa_GetUniformLocation(prog, "dst_y1"), dims->dst_y1); > > +} > > + > > +/** > > + * Setup uniforms telling the destination width, height and the offset. > > These > > + * are needed to unnoormalize the input coordinates and to correctly > > translate > > + * between destination and source that may have differing offsets. > > + */ > > +static void > > +setup_drawing_rect(GLuint prog, const struct blit_dims *dims) > > +{ > > + _mesa_Uniform1f(_mesa_GetUniformLocation(prog, "draw_rect_w"), > > + dims->dst_x1 - dims->dst_x0); > > + _mesa_Uniform1f(_mesa_GetUniformLocation(prog, "draw_rect_h"), > > + dims->dst_y1 - dims->dst_y0); > > + _mesa_Uniform1f(_mesa_GetUniformLocation(prog, "dst_x_off"), > > dims->dst_x0); > > + _mesa_Uniform1f(_mesa_GetUniformLocation(prog, "dst_y_off"), > > dims->dst_y0); > > +} > > + > > +/** > > + * When not mirroring a coordinate (say, X), we need: > > + * src_x - src_x0 = (dst_x - dst_x0 + 0.5) * scale > > + * Therefore: > > + * src_x = src_x0 + (dst_x - dst_x0 + 0.5) * scale > > + * > > + * The program uses "round toward zero" to convert the transformed floating > > + * point coordinates to integer coordinates, whereas the behaviour we > > actually > > + * want is "round to nearest", so 0.5 provides the necessary correction. > > + * > > + * When mirroring X we need: > > + * src_x - src_x0 = dst_x1 - dst_x - 0.5 > > + * Therefore: > > + * src_x = src_x0 + (dst_x1 -dst_x - 0.5) * scale > > + */ > > +static void > > +setup_coord_coeff(GLuint prog, GLuint multiplier, GLuint offset, > > + int src_0, int src_1, int dst_0, int dst_1, bool mirror) > > +{ > > + const float scale = ((float)(src_1 - src_0)) / (dst_1 - dst_0); > > + > > + if (mirror) { > > + _mesa_Uniform1f(multiplier, -scale); > > + _mesa_Uniform1f(offset, src_0 + (dst_1 - 0.5) * scale); > > + } else { > > + _mesa_Uniform1f(multiplier, scale); > > + _mesa_Uniform1f(offset, src_0 + (-dst_0 + 0.5) * scale); > > + } > > +} > > + > > +/** > > + * Setup uniforms providing relation between source and destination > > surfaces. > > + * Destination coordinates are in Y-tiling layout while texelFetch() > > expects > > + * W-tiled coordinates. Once the destination coordinates are > > re-interpreted by > > + * the program into the original W-tiled layout, the program needs to know > > the > > + * offset and scaling factors between the destination and source. > > + * Note that these are calculated in the original W-tiled space before the > > + * destination rectangle is adjusted for possible msaa and Y-tiling. > > + */ > > +static void > > +setup_coord_transform(GLuint prog, const struct blit_dims *dims) > > +{ > > + setup_coord_coeff(prog, > > + _mesa_GetUniformLocation(prog, "src_x_scale"), > > + _mesa_GetUniformLocation(prog, "src_x_off"), > > + dims->src_x0, dims->src_x1, dims->dst_x0, > > dims->dst_x1, > > + dims->mirror_x); > > + > > + setup_coord_coeff(prog, > > + _mesa_GetUniformLocation(prog, "src_y_scale"), > > + _mesa_GetUniformLocation(prog, "src_y_off"), > > + dims->src_y0, dims->src_y1, dims->dst_y0, > > dims->dst_y1, > > + dims->mirror_y); > > +} > > + > > +static GLuint > > +setup_program(struct gl_context *ctx, bool msaa_tex) > > +{ > > + struct blit_state *blit = &ctx->Meta->Blit; > > + static GLuint prog_cache[] = { 0, 0 }; > > I wonder about using static storage here, vs. storing it in a context. > I feel like we need to store it in the context, but I can't concretely > think of why this wouldn't work. We can always fix this later if it's a > problem.
Hmm, I think you have point here. While the resulted programs do not actually depend on the state of the context yet (even thought the ctx->Meta->Blit is referenced) that may change anytime. Also I don't exactly how programs are scoped (can they be shared between different contexts). Hence even if it creates possibly many copies of identical programs for multiple contexts I would prefer it. How about I follow it up with another patch? It will introduce a little more dynamic and I would like to test it thoroughly before pushing. > > > + void *mem_ctx; > > + const char *fs_source; > > + const struct sampler_and_fetch *sampler = &samplers[msaa_tex]; > > + > > + _mesa_meta_setup_vertex_objects(&blit->VAO, &blit->VBO, true, 2, 2, 0); > > + > > + if (prog_cache[msaa_tex]) { > > + _mesa_UseProgram(prog_cache[msaa_tex]); > > + return prog_cache[msaa_tex]; > > + } > > + > > + mem_ctx = ralloc_context(NULL); > > + fs_source = ralloc_asprintf(mem_ctx, fs_tmpl, sampler->sampler, > > + sampler->fetch); > > + _mesa_meta_compile_and_link_program(ctx, vs_source, fs_source, > > + "i965 stencil blit", > > + &prog_cache[msaa_tex]); > > + ralloc_free(mem_ctx); > > You can actually skip the temporary mem_ctx here and just do: > > fs_source = ralloc_asprintf(NULL, fs_tmpl, sampler->sampler, > sampler->fetch); > _mesa_meta_compile... > ralloc_free(fs_source); Oh, nice. I'll revise. > > You're also welcome to use C99 mixed declarations and code in the i965 > directory, if you like. (Either way is fine.) > > I see no reason to hold up this landing, so for the series: > Reviewed-by: Kenneth Graunke <kenn...@whitecape.org> > > My review is pretty weak on the details, but it looks like you're doing > the right sorts of things, in a pretty reasonable manner. The real test > is Piglit. :) We can always fix more things or polish things in > follow-up patches. > > You should go ahead and push this. Thanks, Topi! > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev