On Sun, Jan 4, 2015 at 1:07 PM, Ben Widawsky <b...@bwidawsk.net> wrote:
> I just did a very cursory review. I assume someone smarter than me will do > a > real review, but if not, feel free to ping me. > > I think all the comments apply to both functions. > > On Sat, Jan 03, 2015 at 11:54:15AM -0800, Jason Ekstrand wrote: > > From: Sisinty Sasmita Patra <sisinty.pa...@intel.com> > > > > Added intel_readpixels_tiled_mempcpy and > intel_gettexsubimage_tiled_mempcpy > > functions. These are the fast paths for glReadPixels and glGetTexImage. > > > > v2: Jason Ekstrand <jason.ekstr...@intel.com> > > - Refactor to make the functions look more like the old > > intel_tex_subimage_tiled_memcpy > > - Don't export the readpixels_tiled_memcpy function > > - Fix some pointer arithmatic bugs in partial image downloads (using > > ReadPixels with a non-zero x or y offset) > > - Fix a bug when ReadPixels is performed on an FBO wrapping a texture > > miplevel other than zero. > > > > Signed-off-by: Jason Ekstrand <jason.ekstr...@intel.com> > > --- > > src/mesa/drivers/dri/i965/intel_pixel_read.c | 134 > +++++++++++++++++++++++++- > > src/mesa/drivers/dri/i965/intel_tex.h | 9 ++ > > src/mesa/drivers/dri/i965/intel_tex_image.c | 137 > ++++++++++++++++++++++++++- > > 3 files changed, 277 insertions(+), 3 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c > b/src/mesa/drivers/dri/i965/intel_pixel_read.c > > index beb3152..d1e7798 100644 > > --- a/src/mesa/drivers/dri/i965/intel_pixel_read.c > > +++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c > > @@ -38,14 +38,16 @@ > > > > #include "brw_context.h" > > #include "intel_screen.h" > > +#include "intel_batchbuffer.h" > > #include "intel_blit.h" > > #include "intel_buffers.h" > > #include "intel_fbo.h" > > #include "intel_mipmap_tree.h" > > #include "intel_pixel.h" > > #include "intel_buffer_objects.h" > > +#include "intel_tiled_memcpy.h" > > > > -#define FILE_DEBUG_FLAG DEBUG_PIXEL > > +#define FILE_DEBUG_FLAG DEBUG_TEXTURE > > > > /* For many applications, the new ability to pull the source buffers > > * back out of the GTT and then do the packing/conversion operations > > @@ -161,17 +163,147 @@ do_blit_readpixels(struct gl_context * ctx, > > return true; > > } > > > > +/** > > + * \brief A fast path for glReadPixels > > + * > > + * This fast path is taken when the source format is BGRA, RGBA, > > + * A or L and when the texture memory is X- or Y-tiled. It downloads > > + * the source data by mapping the memory without a GTT fence, thus > > + * acquiring a linear view of the memory. > > That last sentence is confusing since you're using linear differently than > just > about everywhere else (though the statement is accurate). > > How about something like, "It maps the source data with a CPU mapping > which then > needs to be de-tiled [by the CPU] before presenting linear data back to the > user." > Yeah, I'll get that fixed. > > > + * > > + * This is a performance win over the conventional texture download > path. > > + * In the conventional texture download path, > > + * > > + */ > > + > > +static bool > > +intel_readpixels_tiled_memcpy(struct gl_context * ctx, > > + GLint xoffset, GLint yoffset, > > + GLsizei width, GLsizei height, > > + GLenum format, GLenum type, > > + GLvoid * pixels, > > + const struct gl_pixelstore_attrib *pack) > > +{ > > + struct brw_context *brw = brw_context(ctx); > > + struct gl_renderbuffer *rb = ctx->ReadBuffer->_ColorReadBuffer; > > + > > + /* This path supports reading from color buffers only */ > > + if (rb == NULL) > > + return false; > > + > > + struct intel_renderbuffer *irb = intel_renderbuffer(rb); > > + int dst_pitch; > > + > > + /* The miptree's buffer. */ > > + drm_intel_bo *bo; > > + > > + int error = 0; > > + > > + uint32_t cpp; > > + mem_copy_fn mem_copy = NULL; > > + > > + /* This fastpath is restricted to specific renderbuffer types: > > + * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to > support > > + * more types. > > + */ > > + if (!brw->has_llc || > > + !(type == GL_UNSIGNED_BYTE || type == > GL_UNSIGNED_INT_8_8_8_8_REV) || > > + pixels == NULL || > > + _mesa_is_bufferobj(pack->BufferObj) || > > + pack->Alignment > 4 || > > + pack->SkipPixels > 0 || > > + pack->SkipRows > 0 || > > + (pack->RowLength != 0 && pack->RowLength != width) || > > + pack->SwapBytes || > > + pack->LsbFirst || > > + pack->Invert) > > + return false; > > + > > + if (!intel_get_memcpy(rb->Format, format, type, &mem_copy, &cpp)) > > + return false; > > + > > + if (!irb->mt || > > + (irb->mt->tiling != I915_TILING_X && > > + irb->mt->tiling != I915_TILING_Y)) { > > + /* The algorithm is written only for X- or Y-tiled memory. */ > > + return false; > > + } > > + > > + /* Since we are going to read raw data to the miptree, we need to > resolve > > + * any pending fast color clears before we start. > > + */ > > + intel_miptree_resolve_color(brw, irb->mt); > > + > > + bo = irb->mt->bo; > > + > > + if (drm_intel_bo_references(brw->batch.bo, bo)) { > > + perf_debug("Flushing before mapping a referenced bo.\n"); > > + intel_batchbuffer_flush(brw); > > + } > > I don't know much about the spec, but is this even possible for > glReadPixels? > Yes, it is. All you have to do is render and then readpixels without a flush. I added that for precicely that reason. :-) > > > + > > + error = brw_bo_map(brw, bo, false /* write enable */, "miptree"); > > + if (error || bo->virtual == NULL) { > > I realize this was probably just copied from the original code, but it > should be > superfluous (as is the comment about write enable). Not really sure why > Eric > initially added it. > Sure, the comment about write enable probably isn't needed, but it does explain the boolean. As for the other, I talked to Kristian and he said that, while unlikely, it is still possible for brw_bo_map to fail so we should handle the error case. > > + DBG("%s: failed to map bo\n", __FUNCTION__); > > + return false; > > + } > > + > > + dst_pitch = _mesa_image_row_stride(pack, width, format, type); > > + > > + /* We postponed printing this message until having committed to > executing > > + * the function. > > + */ > > + DBG("%s: x,y=(%d,%d) (w,h)=(%d,%d) format=0x%x type=0x%x " > > + "mesa_format=0x%x tiling=%d " > > + "pack=(alignment=%d row_length=%d skip_pixels=%d > skip_rows=%d)\n", > > + __FUNCTION__, xoffset, yoffset, width, height, > > + format, type, rb->Format, irb->mt->tiling, > > + pack->Alignment, pack->RowLength, pack->SkipPixels, > > + pack->SkipRows); > > + > > + /* This renderbuffer can come from a texture which may not be > miplevel 0. > > + * We need to adjust accordingly. > > + */ > > + if (rb->TexImage) { > > + int level = rb->TexImage->Level + > rb->TexImage->TexObject->MinLevel; > > + > > + /* Adjust x and y offset based on miplevel */ > > + xoffset += irb->mt->level[level].level_x; > > + yoffset += irb->mt->level[level].level_y; > > + } > > Can the slice be non-zero? Should this be using > intel_miptree_get_image_offset() instead? > Good catch. For now, I'm going to just implse that the texture's target is one of GL_TEXTURE_2D or GL_TEXTURE_RECTANGLE. None of the download paths handle arrays either, so that seems sane for now. > > > + > > + tiled_to_linear( > > + xoffset * cpp, (xoffset + width) * cpp, > > + yoffset, yoffset + height, > > + pixels - (ptrdiff_t) yoffset * dst_pitch - (ptrdiff_t) xoffset * > cpp, > > + bo->virtual, > > + dst_pitch, irb->mt->pitch, > > + brw->has_swizzling, > > + irb->mt->tiling, > > + mem_copy > > + ); > > + > > + drm_intel_bo_unmap(bo); > > + return true; > > +} > > + > > void > > intelReadPixels(struct gl_context * ctx, > > GLint x, GLint y, GLsizei width, GLsizei height, > > GLenum format, GLenum type, > > const struct gl_pixelstore_attrib *pack, GLvoid * > pixels) > > { > > + bool ok; > > + > > struct brw_context *brw = brw_context(ctx); > > bool dirty; > > > > DBG("%s\n", __FUNCTION__); > > > > + ok = intel_readpixels_tiled_memcpy(ctx, x, y, width, height, > > + format, type, pixels, pack); > > + if(ok) > > + return; > > + > > if (_mesa_is_bufferobj(pack->BufferObj)) { > > /* Using PBOs, so try the BLT based path. */ > > if (do_blit_readpixels(ctx, x, y, width, height, format, type, > pack, > > diff --git a/src/mesa/drivers/dri/i965/intel_tex.h > b/src/mesa/drivers/dri/i965/intel_tex.h > > index 27f7f11..f048e84 100644 > > --- a/src/mesa/drivers/dri/i965/intel_tex.h > > +++ b/src/mesa/drivers/dri/i965/intel_tex.h > > @@ -68,4 +68,13 @@ intel_texsubimage_tiled_memcpy(struct gl_context *ctx, > > const struct gl_pixelstore_attrib > *packing, > > bool for_glTexImage); > > > > +bool > > +intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx, > > + struct gl_texture_image *texImage, > > + GLint xoffset, GLint yofset, > > + GLsizei width, GLsizei height, > > + GLenum format, GLenum type, > > + GLvoid *pixels, > > + const struct gl_pixelstore_attrib > *packing); > > + > > #endif > > diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c > b/src/mesa/drivers/dri/i965/intel_tex_image.c > > index 3317779..075315d 100644 > > --- a/src/mesa/drivers/dri/i965/intel_tex_image.c > > +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c > > @@ -24,7 +24,7 @@ > > #include "intel_blit.h" > > #include "intel_fbo.h" > > #include "intel_image.h" > > - > > +#include "intel_tiled_memcpy.h" > > #include "brw_context.h" > > > > #define FILE_DEBUG_FLAG DEBUG_TEXTURE > > @@ -507,12 +507,145 @@ blit_texture_to_pbo(struct gl_context *ctx, > > return true; > > } > > > > +/** > > + * \brief A fast path for glGetTexImage. > > + * > > + * > > + * This fast path is taken when the texture format is BGRA, RGBA, > > + * A or L and when the texture memory is X- or Y-tiled. It downloads > > + * the texture data by mapping the texture memory without a GTT fence, > thus > > + * acquiring a linear view of the memory. > > + * > > + * This is a performance win over the conventional texture download > path. > > + * In the conventional texture download path, > > + * > > + */ > > + > > +bool > > +intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx, > > + struct gl_texture_image *texImage, > > + GLint xoffset, GLint yoffset, > > + GLsizei width, GLsizei height, > > + GLenum format, GLenum type, > > + GLvoid *pixels, > > + const struct gl_pixelstore_attrib > *packing) > > +{ > > + struct brw_context *brw = brw_context(ctx); > > + struct intel_texture_image *image = intel_texture_image(texImage); > > + int dst_pitch; > > + > > + /* The miptree's buffer. */ > > + drm_intel_bo *bo; > > + > > + int error = 0; > > + > > + uint32_t cpp; > > + mem_copy_fn mem_copy = NULL; > > + > > + /* This fastpath is restricted to specific texture types: > > + * a 2D BGRA, RGBA, L8 or A8 texture. It could be generalized to > support > > + * more types. > > + * > > + * FINISHME: The restrictions below on packing alignment and packing > row > > + * length are likely unneeded now because we calculate the > destination stride > > + * with _mesa_image_row_stride. However, before removing the > restrictions > > + * we need tests. > > + */ > > + if (!brw->has_llc || > > + !(type == GL_UNSIGNED_BYTE || type == > GL_UNSIGNED_INT_8_8_8_8_REV) || > > + !(texImage->TexObject->Target == GL_TEXTURE_2D || > > + texImage->TexObject->Target == GL_TEXTURE_RECTANGLE) || > > + pixels == NULL || > > + _mesa_is_bufferobj(packing->BufferObj) || > > + packing->Alignment > 4 || > > + packing->SkipPixels > 0 || > > + packing->SkipRows > 0 || > > + (packing->RowLength != 0 && packing->RowLength != width) || > > + packing->SwapBytes || > > + packing->LsbFirst || > > + packing->Invert) > > + return false; > > + > > + if (!intel_get_memcpy(texImage->TexFormat, format, type, &mem_copy, > &cpp)) > > + return false; > > + > > + /* If this is a nontrivial texture view, let another path handle it > instead. */ > > + if (texImage->TexObject->MinLayer) > > + return false; > > + > > + if (!image->mt || > > + (image->mt->tiling != I915_TILING_X && > > + image->mt->tiling != I915_TILING_Y)) { > > + /* The algorithm is written only for X- or Y-tiled memory. */ > > + return false; > > + } > > + > > + /* Since we are going to write raw data to the miptree, we need to > resolve > > + * any pending fast color clears before we start. > > + */ > > + intel_miptree_resolve_color(brw, image->mt); > > + > > + bo = image->mt->bo; > > + > > + if (drm_intel_bo_references(brw->batch.bo, bo)) { > > + perf_debug("Flushing before mapping a referenced bo.\n"); > > + intel_batchbuffer_flush(brw); > > + } > > + > > + error = brw_bo_map(brw, bo, false /* write enable */, "miptree"); > > + if (error || bo->virtual == NULL) { > > + DBG("%s: failed to map bo\n", __FUNCTION__); > > + return false; > > + } > > + > > + dst_pitch = _mesa_image_row_stride(packing, width, format, type); > > + > > + DBG("%s: level=%d x,y=(%d,%d) (w,h)=(%d,%d) format=0x%x type=0x%x " > > + "mesa_format=0x%x tiling=%d " > > + "packing=(alignment=%d row_length=%d skip_pixels=%d > skip_rows=%d)\n", > > + __FUNCTION__, texImage->Level, xoffset, yoffset, width, height, > > + format, type, texImage->TexFormat, image->mt->tiling, > > + packing->Alignment, packing->RowLength, packing->SkipPixels, > > + packing->SkipRows); > > + > > + int level = texImage->Level + texImage->TexObject->MinLevel; > > + > > + /* Adjust x and y offset based on miplevel */ > > + xoffset += image->mt->level[level].level_x; > > + yoffset += image->mt->level[level].level_y; > > + > > + tiled_to_linear( > > + xoffset * cpp, (xoffset + width) * cpp, > > + yoffset, yoffset + height, > > + pixels - (ptrdiff_t) yoffset * dst_pitch - (ptrdiff_t) xoffset * > cpp, > > + bo->virtual, > > + dst_pitch, image->mt->pitch, > > + brw->has_swizzling, > > + image->mt->tiling, > > + mem_copy > > + ); > > + > > + drm_intel_bo_unmap(bo); > > + return true; > > +} > > + > > + > > static void > > intel_get_tex_image(struct gl_context *ctx, > > GLenum format, GLenum type, GLvoid *pixels, > > - struct gl_texture_image *texImage) { > > + struct gl_texture_image *texImage) > > +{ > > + bool ok; > > + > > DBG("%s\n", __FUNCTION__); > > > > + ok = intel_gettexsubimage_tiled_memcpy(ctx, texImage, 0, 0, > > + texImage->Width, > texImage->Height, > > + format, type, pixels, > &ctx->Pack); > > + > > + if(ok) > > + return; > > + > > if (_mesa_is_bufferobj(ctx->Pack.BufferObj)) { > > /* Using PBOs, so try the BLT based path. */ > > if (blit_texture_to_pbo(ctx, format, type, pixels, texImage)) > > -- > > 2.2.0 > > > > _______________________________________________ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > -- > Ben Widawsky, Intel Open Source Technology Center >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev