On 01/20/2015 12:09 PM, Jason Ekstrand wrote: > > > On Tue, Jan 13, 2015 at 10:44 AM, Chad Versace > <chad.vers...@intel.com <mailto:chad.vers...@intel.com>> wrote: > > On 01/12/2015 10:22 AM, Jason Ekstrand wrote: >> From: Sisinty Sasmita Patra <sisinty.pa...@intel.com >> <mailto:sisinty.pa...@intel.com>> >> >> Added intel_readpixels_tiled_mempcpy and >> intel_gettexsubimage_tiled_mempcpy functions. These are the fast >> paths for glReadPixels and glGetTexImage. >> >> On chrome, using the RoboHornet 2D Canvas toDataURL test, this >> patch cuts amount of time spent in glReadPixels by more than half >> and reduces the time of the entire test by 10%. >> >> v2: Jason Ekstrand <jason.ekstr...@intel.com >> <mailto:jason.ekstr...@intel.com>> - Refactor to make the functions >> look more like the old intel_tex_subimage_tiled_memcpy - Don't >> export the readpixels_tiled_memcpy function - Fix some pointer >> arithmatic bugs in partial image downloads (using ReadPixels with a >> non-zero x or y offset) - Fix a bug when ReadPixels is performed on >> an FBO wrapping a texture miplevel other than zero. >> >> v3: Jason Ekstrand <jason.ekstr...@intel.com >> <mailto:jason.ekstr...@intel.com>> - Better documentation fot the >> *_tiled_memcpy functions - Add target restrictions for >> renderbuffers wrapping textures >> >> v4: Jason Ekstrand <jason.ekstr...@intel.com >> <mailto:jason.ekstr...@intel.com>> - Only check the return value of >> brw_bo_map for error and not bo->virtual >> >> Signed-off-by: Jason Ekstrand <jason.ekstr...@intel.com >> <mailto:jason.ekstr...@intel.com>> --- >> src/mesa/drivers/dri/i965/intel_pixel_read.c | 142 >> ++++++++++++++++++++++++++- src/mesa/drivers/dri/i965/intel_tex.h >> | 9 ++ src/mesa/drivers/dri/i965/intel_tex_image.c | 139 >> +++++++++++++++++++++++++- 3 files changed, 287 insertions(+), 3 >> deletions(-) > > >> +/** + * \brief A fast path for glReadPixels + * + * This fast path >> is taken when the source format is BGRA, RGBA, + * A or L and when >> the texture memory is X- or Y-tiled. It downloads + * the source >> data by directly mapping the memory without a GTT fence. + * This >> then needs to be de-tiled on the CPU before presenting the data to >> + * the user in the linear fasion. + * + * This is a performance >> win over the conventional texture download path. + * In the >> conventional texture download path, the texture is either mapped + >> * through the GTT or copied to a linear buffer with the blitter >> before + * handing off to a software path. This allows us to avoid >> round-tripping + * through the GPU (in the case where we would be >> blitting) and do only a + * single copy operation. + */ +static >> bool +intel_readpixels_tiled_memcpy(struct gl_context * ctx, + >> GLint xoffset, GLint yoffset, + >> GLsizei width, GLsizei height, + >> GLenum format, GLenum type, + GLvoid * >> pixels, + const struct >> gl_pixelstore_attrib *pack) +{ > > [snip] > >> + + if (!intel_get_memcpy(rb->Format, format, type, &mem_copy, >> &cpp)) + return false; > > The use of intel_get_memcpy here is surprising and relies on a lucky > coincidence, if I understand this function correctly (which maybe I > don't). Whether the driver is copying data *to* of *from* the > miptree, intel_get_memcpy is given the same parameter values. In > other words, the 'tiledFormat' and 'format' parameters of > intel_get_memcpy are symmetric. Please add a comment somewhere (maybe > in intel_get_memcpy's Doxygen, maybe somewhere else is better) that > documents that symmetry. > > > Sure, I can do that. Yes, the symmetry is a bit subtle and relies on > the fact that the only two copy functions that are returned are > memcpy and one for bgra both of which are symmetric. Well, more to > the point, they are their own inverse. I'll add a comment to > intel_get_memcpy to that effect. In particular, I'll add the > following comment to get_memcpy: > > * The only two possible copy functions which are ever returned are a > * direct memcpy and a RGBA <-> BGRA copy function. Since RGBA -> > BGRA and * BGRA -> RGBA are exactly the same operation (and memcpy is > obviously * symmetric), it doesn't matter whether the copy is from > the tiled image * to the untiled or vice versa. The copy function > required is the same in * either case so this function can be used. > > If you don't mind, I'll squash that in to the commit where we add the > tiled_to_linear paths. > > >> +/** + * \brief A fast path for glGetTexImage. + * + * This fast >> path is taken when the source format is BGRA, RGBA, + * A or L and >> when the texture memory is X- or Y-tiled. It downloads + * the >> source data by directly mapping the memory without a GTT fence. + * >> This then needs to be de-tiled on the CPU before presenting the >> data to + * the user in the linear fasion. + * + * This is a >> performance win over the conventional texture download path. + * In >> the conventional texture download path, the texture is either >> mapped + * through the GTT or copied to a linear buffer with the >> blitter before + * handing off to a software path. This allows us >> to avoid round-tripping + * through the GPU (in the case where we >> would be blitting) and do only a + * single copy operation. + */ > > Rather than duplicating such a length comment block, I think it's > better to refer back to the original block with \see > intel_readpixels_tiled_memcpy(). > > > Sure. > > > >> +bool +intel_gettexsubimage_tiled_memcpy(struct gl_context *ctx, + >> struct gl_texture_image *texImage, + >> GLint xoffset, GLint yoffset, + >> GLsizei width, GLsizei height, + >> GLenum format, GLenum type, + >> GLvoid *pixels, + const struct >> gl_pixelstore_attrib *packing) > > The signatures and bodies of intel_gettexsubimage_tiled_memcpy and > intel_readpixels_tiled_memcpy are nearly identical, yet the function > bodies are not trivial. The two functions should really share their > common code. As far as I can tell, the only differences between the > two function bodies are: - intel_readpixels_tiled_memcpy pulls the > miptree out of ctx->_ColorReadBuffer - > intel_gettexsubimage_tiled_memcpy pulls the miptree out of a > texImage - intel_gettexsubimage_tiled_memcpy does a few sanity checks > on that texImage > > > I looked over those again. Yes, they are similar and we could share > some code but it's kind of tricky. We don't want to have too many > "check some stuff" helper functions. One option would be to have > another function with the tiled_memcpy functions that does a takes a > miptree and handles checking the tiling, doing the resolve, mapping > the buffer etc. However, I can't think of a good way to do the other > packing/format checks in a shared function without running into > "shared code for the sake of shared code".
Then forget the code sharing. Perfection is the enemy of the good. With the comment change mentioned earlier, this series is Reviewed-by: Chad Versace <chad.vers...@intel.com>
signature.asc
Description: OpenPGP digital signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev