Re: [Mesa-dev] [PATCH] mesa/formats: Fix swizzle flipping for big-endian targets
On Sun, Aug 9, 2015 at 11:55 PM, Oded Gabbay wrote: > On Mon, Aug 10, 2015 at 9:52 AM, Jason Ekstrand wrote: >> For those of you tracking the big-endian stuff, I've pushed a couple >> patches to a branch: >> >> http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/be-fixes >> >> I've removed the patch to change the way gallium converts PIPE_FORMAT >> to MESA_FORMAT because it was clearly wrong. As I work on this, I'll >> try and keep this branch updated to the current "best known patches". >> Hopefully, we can find/fix all of the bogus code and get BE at least >> sort-of working again. >> --Jason >> > Jason, > There is an additional patch I sent that fixes the swizzle clearing > when running on BE machine: > > http://cgit.freedesktop.org/~gabbayo/mesa/commit/?h=mesa-upstream&id=8bd4265ffe3b357e17430f0ab39eb04e07c7ddf0 > > It got r-b by Emil Velikov but no one pushed it yet to upstream. Thanks! I added my R-B and pushed it. I've also updated my be-fixes branch to include it. --Jason ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91591] rounding.h:102:2: error: #error "Unsupported or undefined LONG_BIT"
https://bugs.freedesktop.org/show_bug.cgi?id=91591 Jose Fonseca changed: What|Removed |Added CC||matts...@gmail.com Assignee|jfons...@vmware.com |matts...@gmail.com --- Comment #7 from Jose Fonseca --- (In reply to Vinson Lee from comment #6) > mesa: 1eaa29cb300e927409281ef0a9413072766eaa3d (11.0.0-devel) That's my fix. So even with it it fails... Not sure what do do next: - replace C-pp logic with regular if (sizeof(long) == 8) - add a configure check for long size. - avoid longs -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: Map packed gallium formats to packed mesa formats.
On Mon, Aug 10, 2015 at 7:58 AM, Jason Ekstrand wrote: > On Sun, Aug 9, 2015 at 3:11 AM, Oded Gabbay wrote: >> On Sun, Aug 9, 2015 at 2:43 AM, Jason Ekstrand wrote: >>> On Sat, Aug 8, 2015 at 3:14 PM, Oded Gabbay wrote: On Sun, Aug 9, 2015 at 12:26 AM, Jason Ekstrand wrote: If I understand your patch, instead of using the endian-dependent defines, you use the LE defines all the time when converting pipe<-->mesa formats I'm not saying its wrong, but these endian-dependent defines, e.g. PIPE_FORMAT_ARGB_UNORM, are used in more places in the gallium code than in the above two functions. Why change only here and not in all places, i.e removing them ? Also, this fixes only llvmpipe/softpipe. It doesn't fix sw_rast. That is still broken >>> >>> The fact that sw_rast is broken bothers me. The sw_rast code doesn't >>> leave core mesa so it shouldn't be affected by any gallium changes. >>> From that perspective, this makes sense. However, sw_rast should be >>> using the auto-generated packing/unpacking functions in core mesa. >>> Did sw_rast work prior to the readpixels changes? It's possible it >>> was already broken. >>> >> I didn't check sw_rast status before readpixels changes. >> However, from looking at the automatic pack/unpack functions, it seems >> strange to me they don't take into account the endianness. >> e.g on ppc64 machine, in file format_pack.c, there is the following function: >> >> static inline void >> pack_ubyte_a8b8g8r8_unorm(const GLubyte src[4], void *dst) >> { >> uint8_t a = _mesa_unorm_to_unorm(src[3], 8, 8); >> uint8_t b = _mesa_unorm_to_unorm(src[2], 8, 8); >> uint8_t g = _mesa_unorm_to_unorm(src[1], 8, 8); >> uint8_t r = _mesa_unorm_to_unorm(src[0], 8, 8); >> >> uint32_t d = 0; >> d |= PACK(a, 0, 8); >> d |= PACK(b, 8, 8); >> d |= PACK(g, 16, 8); >> d |= PACK(r, 24, 8); >> (*(uint32_t *) dst) = d; >> } >> >> PACK is defined as: >> #define PACK(SRC, OFFSET, BITS) (((SRC) & MAX_UINT(BITS)) << (OFFSET)) >> >> So, we see that 'R' is written to the most-significant byte of the >> dword and 'A' is written to the least-significant byte of the dword, >> although in BE, that should be reversed, since this is abgr format, >> not rgba. > > No, what I've been trying to tell you this whole time is that this is > how the a8r8g8b8 format is *defined*. It's defined in terms of > carving up a single 32-bit word in native endianness. This is why we > have to flip things based on endianness if we want to get it in terms > of an array of 8-bit values. You're free to argue that that's a > clumsy definition for a color format, but that's how it's defined none > the less. > >> It's enough to see that the implementation of this function is the >> same for LE and BE to suspect the code, unless I'm missing something. >> >> I tested this patch on llvmpipe, combined with the previous patch you sent on POWER8 machine running ppc64 (BE) First of all, as I said above, it fixes the piglit sanity test. Second, it seems that it improves piglit gpu.py results much more than my patch. "Only" 1368 tests failed, vs. a bit more than 2000 with my patch. Still a lot higher than the ~600 tests that fail on ppc64le >>> >>> I'm not surprised that it didn't get to zero regressions vs. ppc64le. >>> This patch was just kind of thrown together and I didn't check all of >>> the formats. Also, as per my conversation with rolland, I may be >>> misunderstanding the PIPE_FORMAT enums. >>> >>> Really, this is a problem that is going to have to be chased by >>> someone other than me. I have no BE hardware, so I can't really chase >>> it down. I I also have very limited gallium knowledge so I'm probably >>> not the best person to start chasing it through gallium. All I've >>> done so far is throw darts at the wall. I'm sure this patch is >>> incomplete and, to be honest, I didn't even compile-test it as I don't >>> usually build gallium. What i do know is that, for the most part, the >>> core mesa code is consistent with itself as far as endianness goes. >> >> So I'm not married to my patch, and your 2 patches seem to do a better >> job (although sw-rast is still broken, but that's another issue and >> frankly, it is of less importance for me atm), and I accept your >> arguments for your solution. I would suggest that you upstream your >> patches as a start and I will continue from there, based on my >> schedule. > > Unfortunately, from what Marek and Rolland have said, this patch (the > one to change the mesa format -> gallium format conversion) is also > wrong. It seems like the code was doing it correct all along. This > means that something else is wrong somewhere. Possibly in llvmpipe, > possibly somewhere else. It's going to take some tracking. > > --Jason Jason, From my debugging it seems as though the bug originates from the mismatch between how llvmpipe handles BE and
[Mesa-dev] [Bug 91474] egl initializes wrong card with hybrid graphics card
https://bugs.freedesktop.org/show_bug.cgi?id=91474 Marek Chalupa changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |NOTOURBUG --- Comment #4 from Marek Chalupa --- https://bugzilla.gnome.org/show_bug.cgi?id=753434 -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: Map packed gallium formats to packed mesa formats.
On Mon, Aug 10, 2015 at 12:07 AM, Oded Gabbay wrote: > On Mon, Aug 10, 2015 at 7:58 AM, Jason Ekstrand wrote: >> On Sun, Aug 9, 2015 at 3:11 AM, Oded Gabbay wrote: >>> On Sun, Aug 9, 2015 at 2:43 AM, Jason Ekstrand wrote: On Sat, Aug 8, 2015 at 3:14 PM, Oded Gabbay wrote: > On Sun, Aug 9, 2015 at 12:26 AM, Jason Ekstrand > wrote: > > If I understand your patch, instead of using the endian-dependent > defines, you use the LE defines all the time when converting > pipe<-->mesa formats > > I'm not saying its wrong, but these endian-dependent defines, e.g. > PIPE_FORMAT_ARGB_UNORM, are used in more places in the gallium > code than in the above two functions. Why change only here and not in > all places, i.e removing them ? > > Also, this fixes only llvmpipe/softpipe. It doesn't fix sw_rast. That > is still broken The fact that sw_rast is broken bothers me. The sw_rast code doesn't leave core mesa so it shouldn't be affected by any gallium changes. From that perspective, this makes sense. However, sw_rast should be using the auto-generated packing/unpacking functions in core mesa. Did sw_rast work prior to the readpixels changes? It's possible it was already broken. >>> I didn't check sw_rast status before readpixels changes. >>> However, from looking at the automatic pack/unpack functions, it seems >>> strange to me they don't take into account the endianness. >>> e.g on ppc64 machine, in file format_pack.c, there is the following >>> function: >>> >>> static inline void >>> pack_ubyte_a8b8g8r8_unorm(const GLubyte src[4], void *dst) >>> { >>> uint8_t a = _mesa_unorm_to_unorm(src[3], 8, 8); >>> uint8_t b = _mesa_unorm_to_unorm(src[2], 8, 8); >>> uint8_t g = _mesa_unorm_to_unorm(src[1], 8, 8); >>> uint8_t r = _mesa_unorm_to_unorm(src[0], 8, 8); >>> >>> uint32_t d = 0; >>> d |= PACK(a, 0, 8); >>> d |= PACK(b, 8, 8); >>> d |= PACK(g, 16, 8); >>> d |= PACK(r, 24, 8); >>> (*(uint32_t *) dst) = d; >>> } >>> >>> PACK is defined as: >>> #define PACK(SRC, OFFSET, BITS) (((SRC) & MAX_UINT(BITS)) << (OFFSET)) >>> >>> So, we see that 'R' is written to the most-significant byte of the >>> dword and 'A' is written to the least-significant byte of the dword, >>> although in BE, that should be reversed, since this is abgr format, >>> not rgba. >> >> No, what I've been trying to tell you this whole time is that this is >> how the a8r8g8b8 format is *defined*. It's defined in terms of >> carving up a single 32-bit word in native endianness. This is why we >> have to flip things based on endianness if we want to get it in terms >> of an array of 8-bit values. You're free to argue that that's a >> clumsy definition for a color format, but that's how it's defined none >> the less. >> >>> It's enough to see that the implementation of this function is the >>> same for LE and BE to suspect the code, unless I'm missing something. >>> >>> > I tested this patch on llvmpipe, combined with the previous patch you > sent on POWER8 machine running ppc64 (BE) > First of all, as I said above, it fixes the piglit sanity test. > Second, it seems that it improves piglit gpu.py results much more than > my patch. "Only" 1368 tests failed, vs. a bit more than 2000 with my > patch. Still a lot higher than the ~600 tests that fail on ppc64le I'm not surprised that it didn't get to zero regressions vs. ppc64le. This patch was just kind of thrown together and I didn't check all of the formats. Also, as per my conversation with rolland, I may be misunderstanding the PIPE_FORMAT enums. Really, this is a problem that is going to have to be chased by someone other than me. I have no BE hardware, so I can't really chase it down. I I also have very limited gallium knowledge so I'm probably not the best person to start chasing it through gallium. All I've done so far is throw darts at the wall. I'm sure this patch is incomplete and, to be honest, I didn't even compile-test it as I don't usually build gallium. What i do know is that, for the most part, the core mesa code is consistent with itself as far as endianness goes. >>> >>> So I'm not married to my patch, and your 2 patches seem to do a better >>> job (although sw-rast is still broken, but that's another issue and >>> frankly, it is of less importance for me atm), and I accept your >>> arguments for your solution. I would suggest that you upstream your >>> patches as a start and I will continue from there, based on my >>> schedule. >> >> Unfortunately, from what Marek and Rolland have said, this patch (the >> one to change the mesa format -> gallium format conversion) is also >> wrong. It seems like the code was doing it correct all along. This >> means that something else is wrong somewhere. Possibly in llvmpipe, >> possibly somewhere else. It's
Re: [Mesa-dev] [PATCH] st/mesa: Map packed gallium formats to packed mesa formats.
On Mon, Aug 10, 2015 at 10:30 AM, Jason Ekstrand wrote: > On Mon, Aug 10, 2015 at 12:07 AM, Oded Gabbay wrote: >> On Mon, Aug 10, 2015 at 7:58 AM, Jason Ekstrand wrote: >>> On Sun, Aug 9, 2015 at 3:11 AM, Oded Gabbay wrote: On Sun, Aug 9, 2015 at 2:43 AM, Jason Ekstrand wrote: > On Sat, Aug 8, 2015 at 3:14 PM, Oded Gabbay wrote: >> On Sun, Aug 9, 2015 at 12:26 AM, Jason Ekstrand >> wrote: >> >> If I understand your patch, instead of using the endian-dependent >> defines, you use the LE defines all the time when converting >> pipe<-->mesa formats >> >> I'm not saying its wrong, but these endian-dependent defines, e.g. >> PIPE_FORMAT_ARGB_UNORM, are used in more places in the gallium >> code than in the above two functions. Why change only here and not in >> all places, i.e removing them ? >> >> Also, this fixes only llvmpipe/softpipe. It doesn't fix sw_rast. That >> is still broken > > The fact that sw_rast is broken bothers me. The sw_rast code doesn't > leave core mesa so it shouldn't be affected by any gallium changes. > From that perspective, this makes sense. However, sw_rast should be > using the auto-generated packing/unpacking functions in core mesa. > Did sw_rast work prior to the readpixels changes? It's possible it > was already broken. > I didn't check sw_rast status before readpixels changes. However, from looking at the automatic pack/unpack functions, it seems strange to me they don't take into account the endianness. e.g on ppc64 machine, in file format_pack.c, there is the following function: static inline void pack_ubyte_a8b8g8r8_unorm(const GLubyte src[4], void *dst) { uint8_t a = _mesa_unorm_to_unorm(src[3], 8, 8); uint8_t b = _mesa_unorm_to_unorm(src[2], 8, 8); uint8_t g = _mesa_unorm_to_unorm(src[1], 8, 8); uint8_t r = _mesa_unorm_to_unorm(src[0], 8, 8); uint32_t d = 0; d |= PACK(a, 0, 8); d |= PACK(b, 8, 8); d |= PACK(g, 16, 8); d |= PACK(r, 24, 8); (*(uint32_t *) dst) = d; } PACK is defined as: #define PACK(SRC, OFFSET, BITS) (((SRC) & MAX_UINT(BITS)) << (OFFSET)) So, we see that 'R' is written to the most-significant byte of the dword and 'A' is written to the least-significant byte of the dword, although in BE, that should be reversed, since this is abgr format, not rgba. >>> >>> No, what I've been trying to tell you this whole time is that this is >>> how the a8r8g8b8 format is *defined*. It's defined in terms of >>> carving up a single 32-bit word in native endianness. This is why we >>> have to flip things based on endianness if we want to get it in terms >>> of an array of 8-bit values. You're free to argue that that's a >>> clumsy definition for a color format, but that's how it's defined none >>> the less. >>> It's enough to see that the implementation of this function is the same for LE and BE to suspect the code, unless I'm missing something. >> I tested this patch on llvmpipe, combined with the previous patch you >> sent on POWER8 machine running ppc64 (BE) >> First of all, as I said above, it fixes the piglit sanity test. >> Second, it seems that it improves piglit gpu.py results much more than >> my patch. "Only" 1368 tests failed, vs. a bit more than 2000 with my >> patch. Still a lot higher than the ~600 tests that fail on ppc64le > > I'm not surprised that it didn't get to zero regressions vs. ppc64le. > This patch was just kind of thrown together and I didn't check all of > the formats. Also, as per my conversation with rolland, I may be > misunderstanding the PIPE_FORMAT enums. > > Really, this is a problem that is going to have to be chased by > someone other than me. I have no BE hardware, so I can't really chase > it down. I I also have very limited gallium knowledge so I'm probably > not the best person to start chasing it through gallium. All I've > done so far is throw darts at the wall. I'm sure this patch is > incomplete and, to be honest, I didn't even compile-test it as I don't > usually build gallium. What i do know is that, for the most part, the > core mesa code is consistent with itself as far as endianness goes. So I'm not married to my patch, and your 2 patches seem to do a better job (although sw-rast is still broken, but that's another issue and frankly, it is of less importance for me atm), and I accept your arguments for your solution. I would suggest that you upstream your patches as a start and I will continue from there, based on my schedule. >>> >>> Unfortunately, from what Marek and Rolland have said, this patch (the >>> one to change the mesa format -> gallium format conversion) is also >>> wrong. It
Re: [Mesa-dev] [PATCH] st/mesa: Map packed gallium formats to packed mesa formats.
On Mon, Aug 10, 2015 at 12:41 AM, Oded Gabbay wrote: > On Mon, Aug 10, 2015 at 10:30 AM, Jason Ekstrand wrote: >> On Mon, Aug 10, 2015 at 12:07 AM, Oded Gabbay wrote: >>> On Mon, Aug 10, 2015 at 7:58 AM, Jason Ekstrand >>> wrote: On Sun, Aug 9, 2015 at 3:11 AM, Oded Gabbay wrote: > On Sun, Aug 9, 2015 at 2:43 AM, Jason Ekstrand > wrote: >> On Sat, Aug 8, 2015 at 3:14 PM, Oded Gabbay >> wrote: >>> On Sun, Aug 9, 2015 at 12:26 AM, Jason Ekstrand >>> wrote: >>> >>> If I understand your patch, instead of using the endian-dependent >>> defines, you use the LE defines all the time when converting >>> pipe<-->mesa formats >>> >>> I'm not saying its wrong, but these endian-dependent defines, e.g. >>> PIPE_FORMAT_ARGB_UNORM, are used in more places in the gallium >>> code than in the above two functions. Why change only here and not in >>> all places, i.e removing them ? >>> >>> Also, this fixes only llvmpipe/softpipe. It doesn't fix sw_rast. That >>> is still broken >> >> The fact that sw_rast is broken bothers me. The sw_rast code doesn't >> leave core mesa so it shouldn't be affected by any gallium changes. >> From that perspective, this makes sense. However, sw_rast should be >> using the auto-generated packing/unpacking functions in core mesa. >> Did sw_rast work prior to the readpixels changes? It's possible it >> was already broken. >> > I didn't check sw_rast status before readpixels changes. > However, from looking at the automatic pack/unpack functions, it seems > strange to me they don't take into account the endianness. > e.g on ppc64 machine, in file format_pack.c, there is the following > function: > > static inline void > pack_ubyte_a8b8g8r8_unorm(const GLubyte src[4], void *dst) > { > uint8_t a = _mesa_unorm_to_unorm(src[3], 8, 8); > uint8_t b = _mesa_unorm_to_unorm(src[2], 8, 8); > uint8_t g = _mesa_unorm_to_unorm(src[1], 8, 8); > uint8_t r = _mesa_unorm_to_unorm(src[0], 8, 8); > > uint32_t d = 0; > d |= PACK(a, 0, 8); > d |= PACK(b, 8, 8); > d |= PACK(g, 16, 8); > d |= PACK(r, 24, 8); > (*(uint32_t *) dst) = d; > } > > PACK is defined as: > #define PACK(SRC, OFFSET, BITS) (((SRC) & MAX_UINT(BITS)) << (OFFSET)) > > So, we see that 'R' is written to the most-significant byte of the > dword and 'A' is written to the least-significant byte of the dword, > although in BE, that should be reversed, since this is abgr format, > not rgba. No, what I've been trying to tell you this whole time is that this is how the a8r8g8b8 format is *defined*. It's defined in terms of carving up a single 32-bit word in native endianness. This is why we have to flip things based on endianness if we want to get it in terms of an array of 8-bit values. You're free to argue that that's a clumsy definition for a color format, but that's how it's defined none the less. > It's enough to see that the implementation of this function is the > same for LE and BE to suspect the code, unless I'm missing something. > > >>> I tested this patch on llvmpipe, combined with the previous patch you >>> sent on POWER8 machine running ppc64 (BE) >>> First of all, as I said above, it fixes the piglit sanity test. >>> Second, it seems that it improves piglit gpu.py results much more than >>> my patch. "Only" 1368 tests failed, vs. a bit more than 2000 with my >>> patch. Still a lot higher than the ~600 tests that fail on ppc64le >> >> I'm not surprised that it didn't get to zero regressions vs. ppc64le. >> This patch was just kind of thrown together and I didn't check all of >> the formats. Also, as per my conversation with rolland, I may be >> misunderstanding the PIPE_FORMAT enums. >> >> Really, this is a problem that is going to have to be chased by >> someone other than me. I have no BE hardware, so I can't really chase >> it down. I I also have very limited gallium knowledge so I'm probably >> not the best person to start chasing it through gallium. All I've >> done so far is throw darts at the wall. I'm sure this patch is >> incomplete and, to be honest, I didn't even compile-test it as I don't >> usually build gallium. What i do know is that, for the most part, the >> core mesa code is consistent with itself as far as endianness goes. > > So I'm not married to my patch, and your 2 patches seem to do a better > job (although sw-rast is still broken, but that's another issue and > frankly, it is of less importance for me atm), and I accept your > arguments for your solution. I would suggest that you upstream your > patches as a start and I will continue from there, based on my > schedule
[Mesa-dev] [PATCH 1/4] mesa: expose dimension check for glTex*Storage functions
This is done so that following patch can use it to verify dimenstions for multisample variants of glTex*Storage. Signed-off-by: Tapani Pälli --- src/mesa/main/texstorage.c | 22 +- src/mesa/main/texstorage.h | 3 +++ 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/texstorage.c b/src/mesa/main/texstorage.c index 4a2cc60..9a321cc 100644 --- a/src/mesa/main/texstorage.c +++ b/src/mesa/main/texstorage.c @@ -266,6 +266,26 @@ _mesa_AllocTextureStorage_sw(struct gl_context *ctx, return GL_TRUE; } +/** + * Texture width, height and depth check shared with the + * multisample variants of TexStorage functions. + * + * From OpenGL 4.5 Core spec, page 260 (section 8.19) + * + * "An INVALID_VALUE error is generated if width, height, depth + * or levels are less than 1, for commands with the corresponding + * parameters." + * + * (referring to TextureStorage* commands, these also match values + * specified for OpenGL ES 3.1.) + */ +GLboolean +_mesa_tex_storage_invalid_dim(GLsizei width, GLsizei height, GLsizei depth) +{ + if (width < 1 || height < 1 || depth < 1) + return true; + return false; +} /** * Do error checking for calls to glTexStorage1/2/3D(). @@ -287,7 +307,7 @@ tex_storage_error_check(struct gl_context *ctx, * order to allow meta functions to use legacy formats. */ /* size check */ - if (width < 1 || height < 1 || depth < 1) { + if (_mesa_tex_storage_invalid_dim(width, height, depth)) { _mesa_error(ctx, GL_INVALID_VALUE, "glTex%sStorage%uD(width, height or depth < 1)", suffix, dims); diff --git a/src/mesa/main/texstorage.h b/src/mesa/main/texstorage.h index 6f5495f..62108a8 100644 --- a/src/mesa/main/texstorage.h +++ b/src/mesa/main/texstorage.h @@ -98,4 +98,7 @@ _mesa_AllocTextureStorage_sw(struct gl_context *ctx, GLsizei levels, GLsizei width, GLsizei height, GLsizei depth); +extern GLboolean +_mesa_tex_storage_invalid_dim(GLsizei width, GLsizei height, GLsizei depth); + #endif /* TEXSTORAGE_H */ -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/4] mesa: set correct error for non-renderable multisample textures
Signed-off-by: Tapani Pälli --- src/mesa/main/teximage.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 3ea7b2a..c6fd0be 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -5639,9 +5639,18 @@ _mesa_texture_image_multisample(struct gl_context *ctx, GLuint dims, } if (!is_renderable_texture_format(ctx, internalformat)) { - _mesa_error(ctx, GL_INVALID_OPERATION, -"%s(internalformat=%s)", -func, _mesa_enum_to_string(internalformat)); + /* Page 172 of OpenGL ES 3.1 spec says: + * "An INVALID_ENUM error is generated if sizedinternalformat is not + * color-renderable, depth-renderable, or stencil-renderable (as + * defined in section 9.4). + */ + if (_mesa_is_gles31(ctx)) { + _mesa_error(ctx, GL_INVALID_ENUM, "%s(internalformat=%s)", func, + _mesa_enum_to_string(internalformat)); + } else { + _mesa_error(ctx, GL_INVALID_OPERATION, "%s(internalformat=%s)", func, + _mesa_enum_to_string(internalformat)); + } return; } -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/4] OpenGL ES 3.1 texture storage changes
Hello; Here's bunch of changes to texture storage. I have a branch to enable GL_OES_texture_storage_multisample_2d_array and I extracted changes not directly related to extension itself and I see generic to OpenGL ES 3.1. Hopefully I haven't overlapped too much with your work here Marta, sorry if I did! Thanks; Tapani Pälli (4): mesa: expose dimension check for glTex*Storage functions mesa: validate size parameters for glTexStorage*Multisample mesa: set correct error for non-renderable multisample textures mesa: allow multisampled integer texture formats for ES 3.1 src/mesa/main/multisample.c | 6 -- src/mesa/main/teximage.c| 43 --- src/mesa/main/texstorage.c | 22 +- src/mesa/main/texstorage.h | 3 +++ 4 files changed, 68 insertions(+), 6 deletions(-) -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/4] mesa: validate size parameters for glTexStorage*Multisample
Signed-off-by: Tapani Pälli --- src/mesa/main/teximage.c | 28 1 file changed, 28 insertions(+) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index fc69387..3ea7b2a 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -5782,6 +5782,18 @@ _mesa_TexImage3DMultisample(GLenum target, GLsizei samples, "glTexImage3DMultisample"); } +static bool +valid_texstorage_ms_parameters(GLsizei width, GLsizei height, GLsizei depth, + GLsizei samples, const char *func) +{ + GET_CURRENT_CONTEXT(ctx); + + if (_mesa_tex_storage_invalid_dim(width, height, depth) || samples < 1) { + _mesa_error(ctx, GL_INVALID_VALUE, "%s()", func); + return false; + } + return true; +} void GLAPIENTRY _mesa_TexStorage2DMultisample(GLenum target, GLsizei samples, @@ -5795,6 +5807,10 @@ _mesa_TexStorage2DMultisample(GLenum target, GLsizei samples, if (!texObj) return; + if (!valid_texstorage_ms_parameters(width, height, 1, samples, + "glTexStorage2DMultisample")) + return; + _mesa_texture_image_multisample(ctx, 2, texObj, target, samples, internalformat, width, height, 1, fixedsamplelocations, GL_TRUE, @@ -5814,6 +5830,10 @@ _mesa_TexStorage3DMultisample(GLenum target, GLsizei samples, if (!texObj) return; + if (!valid_texstorage_ms_parameters(width, height, depth, samples, + "glTexStorage3DMultisample")) + return; + _mesa_texture_image_multisample(ctx, 3, texObj, target, samples, internalformat, width, height, depth, fixedsamplelocations, GL_TRUE, @@ -5834,6 +5854,10 @@ _mesa_TextureStorage2DMultisample(GLuint texture, GLsizei samples, if (!texObj) return; + if (!valid_texstorage_ms_parameters(width, height, 1, samples, + "glTextureStorage2DMultisample")) + return; + _mesa_texture_image_multisample(ctx, 2, texObj, texObj->Target, samples, internalformat, width, height, 1, fixedsamplelocations, GL_TRUE, @@ -5855,6 +5879,10 @@ _mesa_TextureStorage3DMultisample(GLuint texture, GLsizei samples, if (!texObj) return; + if (!valid_texstorage_ms_parameters(width, height, depth, samples, + "glTextureStorage3DMultisample")) + return; + _mesa_texture_image_multisample(ctx, 3, texObj, texObj->Target, samples, internalformat, width, height, depth, fixedsamplelocations, GL_TRUE, -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/4] mesa: allow multisampled integer texture formats for ES 3.1
Signed-off-by: Tapani Pälli --- src/mesa/main/multisample.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/main/multisample.c b/src/mesa/main/multisample.c index 490bad5..aac5a15 100644 --- a/src/mesa/main/multisample.c +++ b/src/mesa/main/multisample.c @@ -164,9 +164,11 @@ _mesa_check_sample_count(struct gl_context *ctx, GLenum target, * * "If internalformat is a signed or unsigned integer format and samples * is greater than zero, then the error INVALID_OPERATION is generated." +* +* OpenGL ES 3.1 allows this. */ - if (_mesa_is_gles3(ctx) && _mesa_is_enum_format_integer(internalFormat) - && samples > 0) { + if ((ctx->API == API_OPENGLES2 && ctx->Version == 30) && + _mesa_is_enum_format_integer(internalFormat) && samples > 0) { return GL_INVALID_OPERATION; } -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] gles/es3.1: Implement glMemoryBarrierByRegion
> -Original Message- > From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On > Behalf Of Ilia Mirkin > Sent: Friday, August 7, 2015 9:56 PM > To: Matt Turner > Cc: mesa-dev@lists.freedesktop.org > Subject: Re: [Mesa-dev] [PATCH v2] gles/es3.1: Implement > glMemoryBarrierByRegion > > On Fri, Aug 7, 2015 at 2:18 PM, Matt Turner wrote: > > On Tue, Aug 4, 2015 at 1:22 AM, Marta Lofstedt > > wrote: > >> From: Marta Lofstedt > >> > >> Signed-off-by: Marta Lofstedt > >> --- > >> src/mapi/glapi/gen/gl_API.xml | 4 > >> src/mesa/main/shaderimage.c | 40 > + > >> src/mesa/main/shaderimage.h | 3 +++ > >> src/mesa/main/tests/dispatch_sanity.cpp | 3 +-- > >> 4 files changed, 48 insertions(+), 2 deletions(-) > >> > >> diff --git a/src/mapi/glapi/gen/gl_API.xml > >> b/src/mapi/glapi/gen/gl_API.xml index 658efa4..3db4349 100644 > >> --- a/src/mapi/glapi/gen/gl_API.xml > >> +++ b/src/mapi/glapi/gen/gl_API.xml > >> @@ -2966,6 +2966,10 @@ > >> > >> > >> > >> + > >> + > >> + > >> + > >> > >> > >> > >> diff --git a/src/mesa/main/shaderimage.c > >> b/src/mesa/main/shaderimage.c index a348cdb..7337f22 100644 > >> --- a/src/mesa/main/shaderimage.c > >> +++ b/src/mesa/main/shaderimage.c > >> @@ -653,3 +653,43 @@ _mesa_MemoryBarrier(GLbitfield barriers) > >> if (ctx->Driver.MemoryBarrier) > >>ctx->Driver.MemoryBarrier(ctx, barriers); } > >> + > >> +void GLAPIENTRY > >> +_mesa_MemoryBarrierByRegion(GLbitfield barriers) { > >> + GET_CURRENT_CONTEXT(ctx); > >> + > >> + GLbitfield all_allowed_bits = GL_ATOMIC_COUNTER_BARRIER_BIT | > >> + GL_FRAMEBUFFER_BARRIER_BIT | > >> + GL_SHADER_IMAGE_ACCESS_BARRIER_BIT | > >> + GL_SHADER_STORAGE_BARRIER_BIT | > >> + GL_TEXTURE_FETCH_BARRIER_BIT | > >> + GL_UNIFORM_BARRIER_BIT; > >> + > >> + if (ctx->Driver.MemoryBarrier) { > >> + /* From section 7.11.2 of the OpenGL ES 3.1 specification: > >> + * > >> + *"When barriers is ALL_BARRIER_BITS, shader memory accesses > will be > >> + * synchronized relative to all these barrier bits, but not to > >> other > >> + * barrier bits specific to MemoryBarrier." > >> + * > >> + * That is, if barriers is the special value GL_ALL_BARRIER_BITS, > >> then > all > >> + * barriers allowed by glMemoryBarrierByRegion should be > activated." > >> + */ > >> + if (barriers == GL_ALL_BARRIER_BITS) > >> + return ctx->Driver.MemoryBarrier(ctx, all_allowed_bits); > >> + > >> + /* From section 7.11.2 of the OpenGL ES 3.1 specification: > >> + * > >> + *"An INVALID_VALUE error is generated if barriers is not the > special > >> + * value ALL_BARRIER_BITS, and has any bits set other than those > >> + * described above." > >> + */ > >> + if ((barriers & ~all_allowed_bits) != 0) { > >> + _mesa_error(ctx, GL_INVALID_VALUE, > >> + "glMemoryBarrierByRegion(unsupported barrier bit"); > >> + } > >> + > >> + ctx->Driver.MemoryBarrier(ctx, barriers); > >> + } > > > > Would probably be nice to put an unreachable("not implemented") as an > > else case for future implementors. > > > > Reviewed-by: Matt Turner > > I wonder if this shouldn't just be > > if (!ctx->Driver.MemoryBarrier) > INVALID_OPERATION > > But this is largely hypothetical... I'm not too worried about it. > Hi Ilia, Since the patch isn't merged I assume you want me to change something, but I am not sure what I should change. I see no consensus in the existing code on what to do at: If (!ctx->Driver.NNN) Ilia, are you suggesting that setting an INVALID_OPERATION should be the "new" standard way of handling this? > > > > Thanks! > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa/formats: Don't flip channels of null array formats
Before, if we encountered an array format of 0 on a BE system, we would flip all the channels even though it's an invalid format. This would result in a mostly invalid format with a swizzle of or . Instead, we should just return 0 if the array format stashed in the format info is invalid. --- src/mesa/main/formats.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/formats.c b/src/mesa/main/formats.c index 27590ed..d7b2bae 100644 --- a/src/mesa/main/formats.c +++ b/src/mesa/main/formats.c @@ -380,7 +380,8 @@ uint32_t _mesa_format_to_array_format(mesa_format format) { const struct gl_format_info *info = _mesa_get_format_info(format); - if (!_mesa_little_endian() && info->Layout == MESA_FORMAT_LAYOUT_PACKED) + if (info->ArrayFormat && !_mesa_little_endian() && + info->Layout == MESA_FORMAT_LAYOUT_PACKED) return _mesa_array_format_flip_channels(info->ArrayFormat); else return info->ArrayFormat; -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 57/70] i965: Stage blitted buffer uploads through the common upload bo
On Fri, Aug 07, 2015 at 09:14:01PM +0100, Chris Wilson wrote: > Reuse the common upload bo for the temporary blitter storage to try and > minimise resource usage. > > Caveats: this may impose a read-read relationship on the buffer between > the blitter and render engines, and may add stalls on old kernels. Note that read-read has only landed in 4.2, hence we might want to hold off on this one a bit until most distros have upgraded to this ;-) -Daniel > > Signed-off-by: Chris Wilson > --- > src/mesa/drivers/dri/i965/intel_buffer_objects.c | 12 > 1 file changed, 4 insertions(+), 8 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c > b/src/mesa/drivers/dri/i965/intel_buffer_objects.c > index 32df381..2b18e2c 100644 > --- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c > +++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c > @@ -235,17 +235,13 @@ brw_buffer_subdata(struct gl_context *ctx, > (long)offset, (long)offset + size, (long)(size/1024), > intel_obj->gpu_active_start, > intel_obj->gpu_active_end); > - brw_bo *temp_bo = > -brw_bo_create(&brw->batch, "subdata temp", size, 64, 0); > - > - brw_bo_write(temp_bo, 0, data, size, 0, > - PERF_DEBUG(brw, "BufferSubData")); > - > + uint32_t upload; > + brw_bo *temp_bo = NULL; > + intel_upload_data(brw, data, size, 64, &temp_bo, &upload); > intel_emit_linear_blit(brw, > intel_obj->buffer, offset, > -temp_bo, 0, > +temp_bo, upload, > size); > - > brw_bo_put(temp_bo); > return; >} else { > -- > 2.5.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 57/70] i965: Stage blitted buffer uploads through the common upload bo
On Mon, Aug 10, 2015 at 10:40:29AM +0200, Daniel Vetter wrote: > On Fri, Aug 07, 2015 at 09:14:01PM +0100, Chris Wilson wrote: > > Reuse the common upload bo for the temporary blitter storage to try and > > minimise resource usage. > > > > Caveats: this may impose a read-read relationship on the buffer between > > the blitter and render engines, and may add stalls on old kernels. > > Note that read-read has only landed in 4.2, hence we might want to hold > off on this one a bit until most distros have upgraded to this ;-) It's not that big of a penalty. Mesa has already closed and submitted the render batch due to the change of engines and will be forced to do a sync anyway because it is writing into a busy object (busy presumably due to a render operation). You've already lost. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] mesa: expose dimension check for glTex*Storage functions
On Mon, 2015-08-10 at 11:06 +0300, Tapani Pälli wrote: > This is done so that following patch can use it to verify dimenstions > for multisample variants of glTex*Storage. > > Signed-off-by: Tapani Pälli > --- > src/mesa/main/texstorage.c | 22 +- > src/mesa/main/texstorage.h | 3 +++ > 2 files changed, 24 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/main/texstorage.c b/src/mesa/main/texstorage.c > index 4a2cc60..9a321cc 100644 > --- a/src/mesa/main/texstorage.c > +++ b/src/mesa/main/texstorage.c > @@ -266,6 +266,26 @@ _mesa_AllocTextureStorage_sw(struct gl_context *ctx, > return GL_TRUE; > } > > +/** > + * Texture width, height and depth check shared with the > + * multisample variants of TexStorage functions. > + * > + * From OpenGL 4.5 Core spec, page 260 (section 8.19) > + * > + * "An INVALID_VALUE error is generated if width, height, depth > + * or levels are less than 1, for commands with the corresponding > + * parameters." > + * > + * (referring to TextureStorage* commands, these also match values > + * specified for OpenGL ES 3.1.) > + */ > +GLboolean > +_mesa_tex_storage_invalid_dim(GLsizei width, GLsizei height, GLsizei depth) > +{ > + if (width < 1 || height < 1 || depth < 1) > + return true; > + return false; > +} Since its only used in one other place I'm not sure if this is an improvement over just using width < 1 || height < 1 || depth < 1 in the next patch. Or is this used elsewhere in your branch? It would probably make sense to move it to texstorage.h and inline it. > > /** > * Do error checking for calls to glTexStorage1/2/3D(). > @@ -287,7 +307,7 @@ tex_storage_error_check(struct gl_context *ctx, > * order to allow meta functions to use legacy formats. */ > > /* size check */ > - if (width < 1 || height < 1 || depth < 1) { > + if (_mesa_tex_storage_invalid_dim(width, height, depth)) { >_mesa_error(ctx, GL_INVALID_VALUE, >"glTex%sStorage%uD(width, height or depth < 1)", >suffix, dims); > diff --git a/src/mesa/main/texstorage.h b/src/mesa/main/texstorage.h > index 6f5495f..62108a8 100644 > --- a/src/mesa/main/texstorage.h > +++ b/src/mesa/main/texstorage.h > @@ -98,4 +98,7 @@ _mesa_AllocTextureStorage_sw(struct gl_context *ctx, > GLsizei levels, GLsizei width, > GLsizei height, GLsizei depth); > > +extern GLboolean > +_mesa_tex_storage_invalid_dim(GLsizei width, GLsizei height, GLsizei > depth); > + > #endif /* TEXSTORAGE_H */ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] st/dri: Use packed RGB formats
From: Michel Dänzer Fixes Gallium based DRI drivers failing to load on big endian hosts because they can't find any matching fbconfigs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789 Signed-off-by: Michel Dänzer --- src/gallium/state_trackers/dri/dri2.c | 26 +- src/gallium/state_trackers/dri/dri_drawable.c | 8 2 files changed, 17 insertions(+), 17 deletions(-) diff --git a/src/gallium/state_trackers/dri/dri2.c b/src/gallium/state_trackers/dri/dri2.c index 91b4431..fae100e 100644 --- a/src/gallium/state_trackers/dri/dri2.c +++ b/src/gallium/state_trackers/dri/dri2.c @@ -188,10 +188,10 @@ dri2_drawable_get_buffers(struct dri_drawable *drawable, * may occur as the stvis->color_format. */ switch(format) { - case PIPE_FORMAT_B8G8R8A8_UNORM: + case PIPE_FORMAT_BGRA_UNORM: depth = 32; break; - case PIPE_FORMAT_B8G8R8X8_UNORM: + case PIPE_FORMAT_BGRX_UNORM: depth = 24; break; case PIPE_FORMAT_B5G6R5_UNORM: @@ -261,13 +261,13 @@ dri_image_drawable_get_buffers(struct dri_drawable *drawable, case PIPE_FORMAT_B5G6R5_UNORM: image_format = __DRI_IMAGE_FORMAT_RGB565; break; - case PIPE_FORMAT_B8G8R8X8_UNORM: + case PIPE_FORMAT_BGRX_UNORM: image_format = __DRI_IMAGE_FORMAT_XRGB; break; - case PIPE_FORMAT_B8G8R8A8_UNORM: + case PIPE_FORMAT_BGRA_UNORM: image_format = __DRI_IMAGE_FORMAT_ARGB; break; - case PIPE_FORMAT_R8G8B8A8_UNORM: + case PIPE_FORMAT_RGBA_UNORM: image_format = __DRI_IMAGE_FORMAT_ABGR; break; default: @@ -314,10 +314,10 @@ dri2_allocate_buffer(__DRIscreen *sPriv, switch (format) { case 32: - pf = PIPE_FORMAT_B8G8R8A8_UNORM; + pf = PIPE_FORMAT_BGRA_UNORM; break; case 24: - pf = PIPE_FORMAT_B8G8R8X8_UNORM; + pf = PIPE_FORMAT_BGRX_UNORM; break; case 16: pf = PIPE_FORMAT_Z16_UNORM; @@ -724,13 +724,13 @@ dri2_create_image_from_winsys(__DRIscreen *_screen, pf = PIPE_FORMAT_B5G6R5_UNORM; break; case __DRI_IMAGE_FORMAT_XRGB: - pf = PIPE_FORMAT_B8G8R8X8_UNORM; + pf = PIPE_FORMAT_BGRX_UNORM; break; case __DRI_IMAGE_FORMAT_ARGB: - pf = PIPE_FORMAT_B8G8R8A8_UNORM; + pf = PIPE_FORMAT_BGRA_UNORM; break; case __DRI_IMAGE_FORMAT_ABGR: - pf = PIPE_FORMAT_R8G8B8A8_UNORM; + pf = PIPE_FORMAT_RGBA_UNORM; break; default: pf = PIPE_FORMAT_NONE; @@ -845,13 +845,13 @@ dri2_create_image(__DRIscreen *_screen, pf = PIPE_FORMAT_B5G6R5_UNORM; break; case __DRI_IMAGE_FORMAT_XRGB: - pf = PIPE_FORMAT_B8G8R8X8_UNORM; + pf = PIPE_FORMAT_BGRX_UNORM; break; case __DRI_IMAGE_FORMAT_ARGB: - pf = PIPE_FORMAT_B8G8R8A8_UNORM; + pf = PIPE_FORMAT_BGRA_UNORM; break; case __DRI_IMAGE_FORMAT_ABGR: - pf = PIPE_FORMAT_R8G8B8A8_UNORM; + pf = PIPE_FORMAT_RGBA_UNORM; break; default: pf = PIPE_FORMAT_NONE; diff --git a/src/gallium/state_trackers/dri/dri_drawable.c b/src/gallium/state_trackers/dri/dri_drawable.c index 0d2929a..f0cc4a2 100644 --- a/src/gallium/state_trackers/dri/dri_drawable.c +++ b/src/gallium/state_trackers/dri/dri_drawable.c @@ -231,11 +231,11 @@ dri_set_tex_buffer2(__DRIcontext *pDRICtx, GLint target, if (format == __DRI_TEXTURE_FORMAT_RGB) { /* only need to cover the formats recognized by dri_fill_st_visual */ switch (internal_format) { - case PIPE_FORMAT_B8G8R8A8_UNORM: -internal_format = PIPE_FORMAT_B8G8R8X8_UNORM; + case PIPE_FORMAT_BGRA_UNORM: +internal_format = PIPE_FORMAT_BGRX_UNORM; break; - case PIPE_FORMAT_A8R8G8B8_UNORM: -internal_format = PIPE_FORMAT_X8R8G8B8_UNORM; + case PIPE_FORMAT_ARGB_UNORM: +internal_format = PIPE_FORMAT_XRGB_UNORM; break; default: break; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] mesa: expose dimension check for glTex*Storage functions
On 08/10/2015 12:04 PM, Timothy Arceri wrote: On Mon, 2015-08-10 at 11:06 +0300, Tapani Pälli wrote: This is done so that following patch can use it to verify dimenstions for multisample variants of glTex*Storage. Signed-off-by: Tapani Pälli --- src/mesa/main/texstorage.c | 22 +- src/mesa/main/texstorage.h | 3 +++ 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/texstorage.c b/src/mesa/main/texstorage.c index 4a2cc60..9a321cc 100644 --- a/src/mesa/main/texstorage.c +++ b/src/mesa/main/texstorage.c @@ -266,6 +266,26 @@ _mesa_AllocTextureStorage_sw(struct gl_context *ctx, return GL_TRUE; } +/** + * Texture width, height and depth check shared with the + * multisample variants of TexStorage functions. + * + * From OpenGL 4.5 Core spec, page 260 (section 8.19) + * + * "An INVALID_VALUE error is generated if width, height, depth + * or levels are less than 1, for commands with the corresponding + * parameters." + * + * (referring to TextureStorage* commands, these also match values + * specified for OpenGL ES 3.1.) + */ +GLboolean +_mesa_tex_storage_invalid_dim(GLsizei width, GLsizei height, GLsizei depth) +{ + if (width < 1 || height < 1 || depth < 1) + return true; + return false; +} Since its only used in one other place I'm not sure if this is an improvement over just using width < 1 || height < 1 || depth < 1 in the next patch. Or is this used elsewhere in your branch? I would personally prefer to have the check only in one place. It would probably make sense to move it to texstorage.h and inline it. Yep, I can change this. /** * Do error checking for calls to glTexStorage1/2/3D(). @@ -287,7 +307,7 @@ tex_storage_error_check(struct gl_context *ctx, * order to allow meta functions to use legacy formats. */ /* size check */ - if (width < 1 || height < 1 || depth < 1) { + if (_mesa_tex_storage_invalid_dim(width, height, depth)) { _mesa_error(ctx, GL_INVALID_VALUE, "glTex%sStorage%uD(width, height or depth < 1)", suffix, dims); diff --git a/src/mesa/main/texstorage.h b/src/mesa/main/texstorage.h index 6f5495f..62108a8 100644 --- a/src/mesa/main/texstorage.h +++ b/src/mesa/main/texstorage.h @@ -98,4 +98,7 @@ _mesa_AllocTextureStorage_sw(struct gl_context *ctx, GLsizei levels, GLsizei width, GLsizei height, GLsizei depth); +extern GLboolean +_mesa_tex_storage_invalid_dim(GLsizei width, GLsizei height, GLsizei depth); + #endif /* TEXSTORAGE_H */ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/70] i965: Pack read-only booleans into a bitfield
On Fri, Aug 07, 2015 at 09:13:06PM +0100, Chris Wilson wrote: > GCC's read access for single bits in a bitfield is reasonable (just a > move + flag comparison), so let's save some cachelines by packing the > write-once/read-many booleans together. > >text data bss dec hex filename > 6490134191992 26192 6708318 665c5e lib64/i965_dri.so > 6491766191992 26192 6709950 6662be lib64/i965_dri.so > > Small inflation due to the extra immediate masks and entirely dubious as > to whether it is worth it. A run though Martin's testsuite on Braswell also suggests the extra code hurts noticeably, so just ignore this patch... -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI
https://bugs.freedesktop.org/show_bug.cgi?id=91596 Bug ID: 91596 Summary: EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI Product: Mesa Version: git Hardware: All OS: other Status: NEW Severity: critical Priority: medium Component: EGL Assignee: mesa-dev@lists.freedesktop.org Reporter: issor.or...@gmail.com QA Contact: mesa-dev@lists.freedesktop.org Created attachment 117607 --> https://bugs.freedesktop.org/attachment.cgi?id=117607&action=edit logcat, NOTE: SEGFAULTS are unrelated to GUI flashes and independent Hi, while testing mesa 10.7.0devel/11.0.0 with android-x86 ISO builds, I've encountered a problem, which is affecting Android-x86 GUI on both kitkat-x86 and lollipop-x86 and on different HW (Radeon, Nvidia, Intel). The artifacts visible are: - GUI/Backgroung elements flashing, with B/W very rapid changes (100-200 ms cycles) - Old frame buffers polluting the composed GUI/screens, visible using the mouse wheel Symptoms indicate that somehow the handles to the EGL buffers or their DRI equivalents may have problems. After some checks on git on what could cause HW independend problems, I've found the first commit that generates the problem: commit c2c2e9ab604793c6e01f85497f3f5bf645f962fa egl: implement EGL_KHR_gl_colorspace (v2) v2: add missing "break" NOTE: since Android-x86 require at least two patches to have working Intel and Nouveau gallium_dri, main git bisecting operations was done with Radeonsi driver on lollipop-x86, where I could build the untouched mesa 11.0.0 git. Mauro -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI
https://bugs.freedesktop.org/show_bug.cgi?id=91596 --- Comment #1 from Mauro Rossi --- Created attachment 117608 --> https://bugs.freedesktop.org/attachment.cgi?id=117608&action=edit dmesg, NOTE: segfaults are due to an independent problem -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI
https://bugs.freedesktop.org/show_bug.cgi?id=91596 --- Comment #2 from Mauro Rossi --- Created attachment 117609 --> https://bugs.freedesktop.org/attachment.cgi?id=117609&action=edit dumpsys SurfaceFlinger -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/4] OpenGL ES 3.1 texture storage changes
On 08/10/2015 11:06 AM, Tapani Pälli wrote: Hello; Here's bunch of changes to texture storage. I have a branch to enable GL_OES_texture_storage_multisample_2d_array and I extracted changes not directly related to extension itself and I see generic to OpenGL ES 3.1. Hopefully I haven't overlapped too much with your work here Marta, sorry if I did! There was overlap indeed. Patch 4 can be forgotten, it has been adressed already by Marta. For the rest of changes I will put Marta as co-author where changes overlap. Thanks; Tapani Pälli (4): mesa: expose dimension check for glTex*Storage functions mesa: validate size parameters for glTexStorage*Multisample mesa: set correct error for non-renderable multisample textures mesa: allow multisampled integer texture formats for ES 3.1 src/mesa/main/multisample.c | 6 -- src/mesa/main/teximage.c| 43 --- src/mesa/main/texstorage.c | 22 +- src/mesa/main/texstorage.h | 3 +++ 4 files changed, 68 insertions(+), 6 deletions(-) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] r600g: use a bitfield to track dirty atoms
Please never use "long" in Mesa. It only has 32 bits on 32-bit systems. uint64_t is generally used for all unsigned 64-bit variables and "llu" or "ull" is the number suffix. Also, the 64-bit ctz is ctzll and a proper HAVE macro should be added for it too. The general idea is nice, thanks. The number of atoms can be cut down by merging all scissors states into 1 atom (just as there is 1 atom for 16 textures, there can be 1 atom for 16 scissors) and the same applies to viewport states. This would simplify the code, because all dirty bits would fit into 64 bits and there would even be some space left. Patches 1-3: Reviewed-by: Marek Olšák Marek On Sun, Aug 9, 2015 at 11:42 PM, Grazvydas Ignotas wrote: > r600 currently has 73 atoms and looping through their dirty flags has > become costly because checking each flag requires a pointer > dereference before the read. To avoid having to do that add additional > bitfield which can be checked really quickly thanks to tzcnt instruction. > > id field was added to struct r600_atom but that doesn't affect memory > usage for both 32 and 64 bit CPUs because it was stuffed into padding. > > The performance improvement is ~2% for benchmarks that can have FPS in > the thousands but is hardly measurable in "real" programs. > --- > src/gallium/drivers/r600/r600_hw_context.c| 12 +++ > src/gallium/drivers/r600/r600_pipe.h | 45 > +++ > src/gallium/drivers/r600/r600_state_common.c | 8 ++--- > src/gallium/drivers/radeon/r600_pipe_common.h | 1 + > 4 files changed, 56 insertions(+), 10 deletions(-) > > diff --git a/src/gallium/drivers/r600/r600_hw_context.c > b/src/gallium/drivers/r600/r600_hw_context.c > index c048a71..6445151 100644 > --- a/src/gallium/drivers/r600/r600_hw_context.c > +++ b/src/gallium/drivers/r600/r600_hw_context.c > @@ -51,13 +51,13 @@ void r600_need_cs_space(struct r600_context *ctx, > unsigned num_dw, > unsigned i; > > /* The number of dwords all the dirty states would take. */ > - for (i = 0; i < R600_NUM_ATOMS; i++) { > - if (ctx->atoms[i] && ctx->atoms[i]->dirty) { > - num_dw += ctx->atoms[i]->num_dw; > - if (ctx->screen->b.trace_bo) { > - num_dw += R600_TRACE_CS_DWORDS; > - } > + i = r600_next_dirty_atom(ctx, 0); > + while (i < R600_NUM_ATOMS) { > + num_dw += ctx->atoms[i]->num_dw; > + if (ctx->screen->b.trace_bo) { > + num_dw += R600_TRACE_CS_DWORDS; > } > + i = r600_next_dirty_atom(ctx, i + 1); > } > > /* The upper-bound of how much space a draw command would > take. */ > diff --git a/src/gallium/drivers/r600/r600_pipe.h > b/src/gallium/drivers/r600/r600_pipe.h > index 8b61269..5d10bb4 100644 > --- a/src/gallium/drivers/r600/r600_pipe.h > +++ b/src/gallium/drivers/r600/r600_pipe.h > @@ -85,6 +85,9 @@ > #define R600_BIG_ENDIAN 0 > #endif > > +#define R600_DIRTY_ATOM_WORD_BITS (sizeof(unsigned long) * 8) > +#define R600_DIRTY_ATOM_ARRAY_LEN DIV_ROUND_UP(R600_NUM_ATOMS, > R600_DIRTY_ATOM_WORD_BITS) > + > struct r600_context; > struct r600_bytecode; > struct r600_shader_key; > @@ -426,6 +429,8 @@ struct r600_context { > > /* State binding slots are here. */ > struct r600_atom*atoms[R600_NUM_ATOMS]; > + /* Dirty atom bitmask for fast tests */ > + unsigned long > dirty_atoms[R600_DIRTY_ATOM_ARRAY_LEN]; > /* States for CS initialization. */ > struct r600_command_buffer start_cs_cmd; /* invariant state > mostly */ > /** Compute specific registers initializations. The start_cs_cmd atom > @@ -502,7 +507,18 @@ static inline void r600_set_atom_dirty(struct > r600_context *rctx, >struct r600_atom *atom, >bool dirty) > { > + unsigned long mask; > + unsigned int w; > + > atom->dirty = dirty; > + > + assert(atom->id != 0); > + w = atom->id / R600_DIRTY_ATOM_WORD_BITS; > + mask = 1ul << (atom->id % R600_DIRTY_ATOM_WORD_BITS); > + if (dirty) > + rctx->dirty_atoms[w] |= mask; > + else > + rctx->dirty_atoms[w] &= ~mask; > } > > static inline void r600_mark_atom_dirty(struct r600_context *rctx, > @@ -511,6 +527,35 @@ static inline void r600_mark_atom_dirty(struct > r600_context *rctx, > r600_set_atom_dirty(rctx, atom, true); > } > > +static inline unsigned int r600_next_dirty_atom(struct r600_context *rctx, > + unsigned int id) > +{ > +#if !defined(DEBUG) && defined(HAVE___BUILTIN_CTZ) > + unsigned int w = id / R600_DIRTY_ATOM_WO
[Mesa-dev] [PATCH] mesa: add NV_read_{depth, stencil, depth_stencil} extensions
From: Rob Clark These extensions allow reading depth/stencil for GLES contexts, which is useful for tools like apitrace. Signed-off-by: Rob Clark --- I have a patch, which I will send out after some cleanup, that makes apitrace able to dump depth/stencil buffers with GLES, thanks to this extension. src/mesa/main/extensions.c | 3 +++ src/mesa/main/readpix.c| 25 +++-- 2 files changed, 22 insertions(+), 6 deletions(-) diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index 2dbfabd..d934d19 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -385,6 +385,9 @@ static const struct extension extension_table[] = { { "GL_NV_point_sprite", o(NV_point_sprite), GL, 2001 }, { "GL_NV_primitive_restart",o(NV_primitive_restart), GLL,2002 }, { "GL_NV_read_buffer", o(dummy_true), ES2,2011 }, + { "GL_NV_read_depth", o(dummy_true), ES2,2011 }, + { "GL_NV_read_depth_stencil", o(dummy_true), ES2,2011 }, + { "GL_NV_read_stencil", o(dummy_true), ES2,2011 }, { "GL_NV_texgen_reflection",o(dummy_true), GLL,1999 }, { "GL_NV_texture_barrier", o(NV_texture_barrier), GL, 2009 }, { "GL_NV_texture_env_combine4", o(NV_texture_env_combine4), GLL,1999 }, diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c index 2744232..65751aa 100644 --- a/src/mesa/main/readpix.c +++ b/src/mesa/main/readpix.c @@ -917,7 +917,9 @@ read_pixels_es3_error_check(GLenum format, GLenum type, GLboolean is_unsigned_int = GL_FALSE; GLboolean is_signed_int = GL_FALSE; - if (!_mesa_is_color_format(internalFormat)) { + /* TODO just drop the check? Are there any formats to filter out? */ + if (!(_mesa_is_color_format(internalFormat) || + _mesa_is_depth_or_stencil_format(internalFormat))) { return GL_INVALID_OPERATION; } @@ -950,6 +952,22 @@ read_pixels_es3_error_check(GLenum format, GLenum type, (is_unsigned_int && type == GL_UNSIGNED_INT)) return GL_NO_ERROR; break; + case GL_DEPTH_STENCIL: + if ((internalFormat == GL_DEPTH24_STENCIL8) && + (type == GL_UNSIGNED_INT_24_8)) + return GL_NO_ERROR; + if ((internalFormat == GL_DEPTH32F_STENCIL8) && + (type == GL_FLOAT_32_UNSIGNED_INT_24_8_REV)) + return GL_NO_ERROR; + break; + case GL_DEPTH_COMPONENT: + if ((internalFormat == GL_DEPTH_COMPONENT32F) && + (type == GL_FLOAT)) + return GL_NO_ERROR; + if ((internalFormat == GL_DEPTH_COMPONENT24) && + (type == GL_UNSIGNED_INT)) + return GL_NO_ERROR; + break; } return GL_INVALID_OPERATION; @@ -1023,11 +1041,6 @@ _mesa_ReadnPixelsARB( GLint x, GLint y, GLsizei width, GLsizei height, err = read_pixels_es3_error_check(format, type, rb); } - if (err == GL_NO_ERROR && (format == GL_DEPTH_COMPONENT - || format == GL_DEPTH_STENCIL)) { - err = GL_INVALID_ENUM; - } - if (err != GL_NO_ERROR) { _mesa_error(ctx, err, "glReadPixels(invalid format %s and/or type %s)", _mesa_enum_to_string(format), -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] glsl/es3.1: Fix up GL_ARB_compute_shader for GLSL ES 3.1
From: Marta Lofstedt GL_ARB_compute_shader is limited for GLSL version 430. This enables for GLSL ES version 310. V2: Updated error string to also include GLSL 3.10 Signed-off-by: Marta Lofstedt --- src/glsl/glsl_parser.yy | 5 ++--- src/glsl/glsl_parser_extras.h | 5 + 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy index 2b0c8bd..2abdab4 100644 --- a/src/glsl/glsl_parser.yy +++ b/src/glsl/glsl_parser.yy @@ -1519,11 +1519,10 @@ layout_qualifier_id: "invalid %s of %d specified", local_size_qualifiers[i], $3); YYERROR; -} else if (!state->is_version(430, 0) && - !state->ARB_compute_shader_enable) { +} else if (!state->has_compute_shader_()) { _mesa_glsl_error(& @3, state, "%s qualifier requires GLSL 4.30 or " -"ARB_compute_shader", + "GLSL ES 3.10 or ARB_compute_shader", local_size_qualifiers[i]); YYERROR; } else { diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h index eb325f0..57a7555 100644 --- a/src/glsl/glsl_parser_extras.h +++ b/src/glsl/glsl_parser_extras.h @@ -236,6 +236,11 @@ struct _mesa_glsl_parse_state { return ARB_shading_language_420pack_enable || is_version(420, 0); } + bool has_compute_shader() const + { + return ARB_compute_shader_enable || is_version(430, 310); + } + void process_version_directive(YYLTYPE *locp, int version, const char *ident); -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 01/70] util/list: Add convenience functions for moving a list element
On 07/08/15 23:13, Chris Wilson wrote: Just a couple of functions for removing an element from one list and adding to another (perhaps even the same list, just at the head or tail). Used in future patches. Signed-off-by: Chris Wilson --- src/util/list.h | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/src/util/list.h b/src/util/list.h index b98ce59..cce1adc 100644 --- a/src/util/list.h +++ b/src/util/list.h @@ -55,6 +55,12 @@ static inline void list_inithead(struct list_head *item) item->next = item; } +static inline void __list_del(struct list_head *item) +{ +item->prev->next = item->next; +item->next->prev = item->prev; +} + static inline void list_add(struct list_head *item, struct list_head *list) { item->prev = list; @@ -63,6 +69,12 @@ static inline void list_add(struct list_head *item, struct list_head *list) list->next = item; } +inline static void list_move(struct list_head *from, struct list_head *to) +{ + __list_del(from); + list_add(from, to); +} + static inline void list_addtail(struct list_head *item, struct list_head *list) { item->next = list; @@ -71,6 +83,12 @@ static inline void list_addtail(struct list_head *item, struct list_head *list) list->prev = item; } +inline static void list_movetail(struct list_head *from, struct list_head *to) +{ + __list_del(from); + list_addtail(from, to); +} + static inline void list_replace(struct list_head *from, struct list_head *to) { to->prev = from->prev; @@ -81,17 +99,14 @@ static inline void list_replace(struct list_head *from, struct list_head *to) static inline void list_del(struct list_head *item) { -item->prev->next = item->next; -item->next->prev = item->prev; + __list_del(item); item->prev = item->next = NULL; } static inline void list_delinit(struct list_head *item) { -item->prev->next = item->next; -item->next->prev = item->prev; -item->next = item; -item->prev = item; + __list_del(item); + list_inithead(item); } static inline bool list_empty(struct list_head *list) Nice cleanup on top of the adding the new convenience functions. I don't mind having both in one patch but it would be nice adding that you do this cleanup :) With the updated commit message: Reviewed-by: Martin Peres ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/skl: Remove early platform support
If we go with this patch perhaps it would be good to remove supports_simd16_3src entirely from brw_device_info and any code that is referring to it in order to avoid carrying around useless code. Currently it seems like it would be quite easy to add a new brw_device_info and forget to add supports_simd16_3src and never notice that it is redundantly using the fallback. This would be sort of a pain for me because my Skylake is still one that needs this workaround but I guess I can just upgrade easily enough. Regards, - Neil Ben Widawsky writes: > We do not want bug reports from this early stepping of SKL. Few if any were > ever > shipped outside of Intel to early enabling partners, and none will be sold. > > There is a functional change here. If you're using new mesa on an old > kernel/libdrm, the revid will be -1, and we'll use new SKL values instead of > early ones (a hopefully irrelevant improvement IMO). > > Cc: Jason Ekstrand > Cc: Neil Roberts > Signed-off-by: Ben Widawsky > --- > src/mesa/drivers/dri/i965/brw_device_info.c | 15 +++ > 1 file changed, 7 insertions(+), 8 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c > b/src/mesa/drivers/dri/i965/brw_device_info.c > index be517e8..fc89221 100644 > --- a/src/mesa/drivers/dri/i965/brw_device_info.c > +++ b/src/mesa/drivers/dri/i965/brw_device_info.c > @@ -322,10 +322,7 @@ static const struct brw_device_info brw_device_info_chv > = { >.max_gs_entries = 640,\ > } > > -static const struct brw_device_info brw_device_info_skl_early = { > - GEN9_FEATURES, .gt = 1, > - .supports_simd16_3src = false, > -}; > +#define IS_SKL(devinfo) ((devinfo)->gen == 9 && !(devinfo)->is_broxton) > > static const struct brw_device_info brw_device_info_skl_gt1 = { > GEN9_FEATURES, .gt = 1, > @@ -376,10 +373,12 @@ brw_get_device_info(int devid, int revision) >return NULL; > } > > - if (devinfo->gen == 9 && > - !devinfo->is_broxton && > - (revision == 2 || revision == 3 || revision == -1)) > - return &brw_device_info_skl_early; > + if (IS_SKL(devinfo) && (revision != -1 && revision <= 3)) { > + fprintf(stderr, > + "i965_dri.so does not support this PCI ID with revision %d.\n", > + revision); > + return NULL; > + } > > return devinfo; > } > -- > 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa/es3.1: Enable GL_MAX_VERTEX_ATTRIB enums for GLES 3.1
From: Marta Lofstedt Signed-off-by: Marta Lofstedt --- src/mesa/main/get_hash_params.py | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py index 7dc92f1..66c47de 100644 --- a/src/mesa/main/get_hash_params.py +++ b/src/mesa/main/get_hash_params.py @@ -450,6 +450,13 @@ descriptor=[ # GL_ARB_explicit_uniform_location / GLES 3.1 [ "MAX_UNIFORM_LOCATIONS", "CONTEXT_INT(Const.MaxUserAssignableUniformLocations), extra_ARB_explicit_uniform_location_es31" ], + +# GL_ARB_vertex_attrib_binding / GLES 3.1 + [ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", "CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA" ], + [ "MAX_VERTEX_ATTRIB_BINDINGS", "CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA" ], + +# GL 4.4 / GLES 3.1 + [ "MAX_VERTEX_ATTRIB_STRIDE", "CONTEXT_ENUM(Const.MaxVertexAttribStride), NO_EXTRA" ], ]}, # Enums in OpenGL Core profile and ES 3.1 @@ -761,9 +768,6 @@ descriptor=[ [ "MAX_GEOMETRY_INPUT_COMPONENTS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxInputComponents), extra_version_32" ], [ "MAX_GEOMETRY_OUTPUT_COMPONENTS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxOutputComponents), extra_version_32" ], -# GL 4.4 - [ "MAX_VERTEX_ATTRIB_STRIDE", "CONTEXT_ENUM(Const.MaxVertexAttribStride), NO_EXTRA" ], - # GL_ARB_robustness [ "RESET_NOTIFICATION_STRATEGY_ARB", "CONTEXT_ENUM(Const.ResetStrategy), NO_EXTRA" ], @@ -801,10 +805,6 @@ descriptor=[ [ "MAX_GEOMETRY_ATOMIC_COUNTER_BUFFERS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicBuffers), extra_ARB_shader_atomic_counters_and_geometry_shader" ], [ "MAX_GEOMETRY_ATOMIC_COUNTERS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters), extra_ARB_shader_atomic_counters_and_geometry_shader" ], -# GL_ARB_vertex_attrib_binding - [ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", "CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA" ], - [ "MAX_VERTEX_ATTRIB_BINDINGS", "CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA" ], - # GL_ARB_shader_image_load_store [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", "CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), extra_ARB_shader_image_load_store" ], [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples), extra_ARB_shader_image_load_store" ], -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/70] i965: Share the workaround bo between all contexts
On 07/08/15 23:13, Chris Wilson wrote: Since the workaround bo is used strictly as a write-only buffer, we need only allocate one per screen and use the same one from all contexts. (The caveat here is during extension initialisation, where we write into and read back register values from the buffer, but that is performed only once for the first context - and baring synchronisation issues should not be a problem. Safer would be to move that also to the screen.) v2: Give the workaround bo its own init function and don't piggy back intel_bufmgr_init() since it is not that related. Signed-off-by: Chris Wilson Cc: Kenneth Graunke Cc: Martin Peres --- src/mesa/drivers/dri/i965/brw_context.c | 7 +-- src/mesa/drivers/dri/i965/brw_context.h | 4 ++-- src/mesa/drivers/dri/i965/brw_pipe_control.c | 13 - src/mesa/drivers/dri/i965/intel_screen.c | 15 +++ src/mesa/drivers/dri/i965/intel_screen.h | 1 + 5 files changed, 23 insertions(+), 17 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index efcd91a..ac744d7 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -819,12 +819,7 @@ brwCreateContext(gl_api api, } } - if (brw_init_pipe_control(brw, devinfo)) { - *dri_ctx_error = __DRI_CTX_ERROR_NO_MEMORY; - intelDestroyContext(driContextPriv); - return false; - } - + brw_init_pipe_control(brw, devinfo); brw_init_state(brw); intelInitExtensions(ctx); diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index ffdf821..166b852 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -2016,8 +2016,8 @@ gen9_use_linear_1d_layout(const struct brw_context *brw, const struct intel_mipmap_tree *mt); /* brw_pipe_control.c */ -int brw_init_pipe_control(struct brw_context *brw, - const struct brw_device_info *info); +void brw_init_pipe_control(struct brw_context *brw, + const struct brw_device_info *info); void brw_fini_pipe_control(struct brw_context *brw); void brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags); diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c b/src/mesa/drivers/dri/i965/brw_pipe_control.c index 7ee3cb6..872bfe8 100644 --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c @@ -330,26 +330,21 @@ brw_emit_mi_flush(struct brw_context *brw) brw_render_cache_set_clear(brw); } -int +void brw_init_pipe_control(struct brw_context *brw, const struct brw_device_info *devinfo) { if (devinfo->gen < 6) - return 0; + return; /* We can't just use brw_state_batch to get a chunk of space for * the gen6 workaround because it involves actually writing to * the buffer, and the kernel doesn't let us write to the batch. */ - brw->workaround_bo = drm_intel_bo_alloc(brw->bufmgr, - "pipe_control workaround", - 4096, 4096); - if (brw->workaround_bo == NULL) - return -ENOMEM; + brw->workaround_bo = brw->intelScreen->workaround_bo; + drm_intel_bo_reference(brw->workaround_bo); Why do you need reference counting here? Why not simply destroy the buffer when the screen is destroyed? I would assume that a screen could not be destroyed until all its associated contexts are gone, right? In any case, it seems to improve performance by a few percent for Synmark OglBatch7. brw->pipe_controls_since_last_cs_stall = 0; - - return 0; } void diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index 147fa1e..61f1dbe 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -967,6 +967,7 @@ intelDestroyScreen(__DRIscreen * sPriv) { struct intel_screen *intelScreen = sPriv->driverPrivate; + drm_intel_bo_unreference(intelScreen->workaround_bo); dri_bufmgr_destroy(intelScreen->bufmgr); driDestroyOptionInfo(&intelScreen->optionCache); @@ -1106,6 +1107,17 @@ intel_init_bufmgr(struct intel_screen *intelScreen) } static bool +intel_init_workaround_bo(struct intel_screen *intelScreen) +{ + /* A small scratch bo shared by all contexts, primarily used +* for doing PIPECONTROL serialisation writes that are discarded. +*/ + intelScreen->workaround_bo = + drm_intel_bo_alloc(intelScreen->bufmgr, "pipe_control w/a", 4096, 4096); + return intelScreen->workaround_bo != NULL; +} + +static bool intel_detect_swizzling(struct intel_screen *screen) { drm_intel_bo *buffer; @@ -1417,6 +1429,9 @@ __DRIconfig **intelInitScreen2(__DRIscreen *psp) if (!i
Re: [Mesa-dev] [PATCH 04/70] i965: Share the workaround bo between all contexts
On 10/08/15 14:25, Martin Peres wrote: On 07/08/15 23:13, Chris Wilson wrote: Since the workaround bo is used strictly as a write-only buffer, we need only allocate one per screen and use the same one from all contexts. (The caveat here is during extension initialisation, where we write into and read back register values from the buffer, but that is performed only once for the first context - and baring synchronisation issues should not be a problem. Safer would be to move that also to the screen.) v2: Give the workaround bo its own init function and don't piggy back intel_bufmgr_init() since it is not that related. Signed-off-by: Chris Wilson Cc: Kenneth Graunke Cc: Martin Peres --- src/mesa/drivers/dri/i965/brw_context.c | 7 +-- src/mesa/drivers/dri/i965/brw_context.h | 4 ++-- src/mesa/drivers/dri/i965/brw_pipe_control.c | 13 - src/mesa/drivers/dri/i965/intel_screen.c | 15 +++ src/mesa/drivers/dri/i965/intel_screen.h | 1 + 5 files changed, 23 insertions(+), 17 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index efcd91a..ac744d7 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -819,12 +819,7 @@ brwCreateContext(gl_api api, } } - if (brw_init_pipe_control(brw, devinfo)) { - *dri_ctx_error = __DRI_CTX_ERROR_NO_MEMORY; - intelDestroyContext(driContextPriv); - return false; - } - + brw_init_pipe_control(brw, devinfo); brw_init_state(brw); intelInitExtensions(ctx); diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index ffdf821..166b852 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -2016,8 +2016,8 @@ gen9_use_linear_1d_layout(const struct brw_context *brw, const struct intel_mipmap_tree *mt); /* brw_pipe_control.c */ -int brw_init_pipe_control(struct brw_context *brw, - const struct brw_device_info *info); +void brw_init_pipe_control(struct brw_context *brw, + const struct brw_device_info *info); void brw_fini_pipe_control(struct brw_context *brw); void brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags); diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c b/src/mesa/drivers/dri/i965/brw_pipe_control.c index 7ee3cb6..872bfe8 100644 --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c @@ -330,26 +330,21 @@ brw_emit_mi_flush(struct brw_context *brw) brw_render_cache_set_clear(brw); } -int +void brw_init_pipe_control(struct brw_context *brw, const struct brw_device_info *devinfo) { if (devinfo->gen < 6) - return 0; + return; /* We can't just use brw_state_batch to get a chunk of space for * the gen6 workaround because it involves actually writing to * the buffer, and the kernel doesn't let us write to the batch. */ - brw->workaround_bo = drm_intel_bo_alloc(brw->bufmgr, - "pipe_control workaround", - 4096, 4096); - if (brw->workaround_bo == NULL) - return -ENOMEM; + brw->workaround_bo = brw->intelScreen->workaround_bo; + drm_intel_bo_reference(brw->workaround_bo); Why do you need reference counting here? Why not simply destroy the buffer when the screen is destroyed? I would assume that a screen could not be destroyed until all its associated contexts are gone, right? In any case, it seems to improve performance by a few percent for Synmark OglBatch7. Forgot to add my R-b: Reviewed-by: Martin Peres brw->pipe_controls_since_last_cs_stall = 0; - - return 0; } void diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index 147fa1e..61f1dbe 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -967,6 +967,7 @@ intelDestroyScreen(__DRIscreen * sPriv) { struct intel_screen *intelScreen = sPriv->driverPrivate; + drm_intel_bo_unreference(intelScreen->workaround_bo); dri_bufmgr_destroy(intelScreen->bufmgr); driDestroyOptionInfo(&intelScreen->optionCache); @@ -1106,6 +1107,17 @@ intel_init_bufmgr(struct intel_screen *intelScreen) } static bool +intel_init_workaround_bo(struct intel_screen *intelScreen) +{ + /* A small scratch bo shared by all contexts, primarily used +* for doing PIPECONTROL serialisation writes that are discarded. +*/ + intelScreen->workaround_bo = + drm_intel_bo_alloc(intelScreen->bufmgr, "pipe_control w/a", 4096, 4096); + return intelScreen->workaround_bo != NULL; +} + +static bool intel_detect_swizzling(struct intel_screen *screen) { drm_intel_bo *buffer; @@ -1417,6 +1
Re: [Mesa-dev] [PATCH] mesa/es3.1: Enable GL_ARB_separate_shader_objects for GLES 3.1
This patch has the wrong name also the MAX_VERTEX_ATTRIB_STRIDE was missing and this needs to go under a GLES 3.1 label. I have sent a replacement patch for this: http://patchwork.freedesktop.org/patch/56761/ /Marta > -Original Message- > From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On > Behalf Of Matt Turner > Sent: Tuesday, June 23, 2015 11:15 PM > To: Marta Lofstedt > Cc: mesa-dev@lists.freedesktop.org > Subject: Re: [Mesa-dev] [PATCH] mesa/es3.1: Enable > GL_ARB_separate_shader_objects for GLES 3.1 > > On Tue, Jun 23, 2015 at 4:30 AM, Marta Lofstedt > wrote: > > From: Marta Lofstedt > > > > Signed-off-by: Marta Lofstedt > > --- > > src/mesa/main/get_hash_params.py | 8 > > 1 file changed, 4 insertions(+), 4 deletions(-) > > > > diff --git a/src/mesa/main/get_hash_params.py > > b/src/mesa/main/get_hash_params.py > > index 74ff3ba..f7134a9 100644 > > --- a/src/mesa/main/get_hash_params.py > > +++ b/src/mesa/main/get_hash_params.py > > @@ -407,6 +407,10 @@ descriptor=[ > > { "apis": ["GL", "GL_CORE", "GLES3"], "params": [ # > > GL_ARB_sampler_objects / GL 3.3 / GLES 3.0 > >[ "SAMPLER_BINDING", "LOC_CUSTOM, TYPE_INT, > GL_SAMPLER_BINDING, > > NO_EXTRA" ], > > + > > +# GL_ARB_vertex_attrib_binding / GLES 3.1 > > + [ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", > > +"CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA" ], > > + [ "MAX_VERTEX_ATTRIB_BINDINGS", > > +"CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA" ], > > Won't this enable it in ES 3.0 as well? I think you need to replace NO_EXTRA > with something that contains EXTRA_API_ES31. See > extra_ARB_draw_indirect_es31 for an example. > > > ]}, > > > > # Enums in OpenGL Core profile and ES 3.1 @@ -776,10 +780,6 @@ > > descriptor=[ > >[ "MAX_COMBINED_ATOMIC_COUNTER_BUFFERS", > "CONTEXT_INT(Const.MaxCombinedAtomicBuffers), > extra_ARB_shader_atomic_counters" ], > >[ "MAX_COMBINED_ATOMIC_COUNTERS", > > "CONTEXT_INT(Const.MaxCombinedAtomicCounters), > > extra_ARB_shader_atomic_counters" ], > > > > -# GL_ARB_vertex_attrib_binding > > - [ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", > > "CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA" ], > > - [ "MAX_VERTEX_ATTRIB_BINDINGS", > > "CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA" ], > > - > > # GL_ARB_shader_image_load_store > >[ "MAX_IMAGE_UNITS", "CONTEXT_INT(Const.MaxImageUnits), > extra_ARB_shader_image_load_store"], > >[ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", > > "CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), > > extra_ARB_shader_image_load_store"], > > -- > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/70] i965: Share the workaround bo between all contexts
On Mon, Aug 10, 2015 at 02:25:41PM +0300, Martin Peres wrote: > On 07/08/15 23:13, Chris Wilson wrote: > >Since the workaround bo is used strictly as a write-only buffer, we need > >only allocate one per screen and use the same one from all contexts. > > > >(The caveat here is during extension initialisation, where we write into > >and read back register values from the buffer, but that is performed only > >once for the first context - and baring synchronisation issues should not > >be a problem. Safer would be to move that also to the screen.) > > > >v2: Give the workaround bo its own init function and don't piggy back > >intel_bufmgr_init() since it is not that related. > > > >Signed-off-by: Chris Wilson > >Cc: Kenneth Graunke > >Cc: Martin Peres > >--- > > src/mesa/drivers/dri/i965/brw_context.c | 7 +-- > > src/mesa/drivers/dri/i965/brw_context.h | 4 ++-- > > src/mesa/drivers/dri/i965/brw_pipe_control.c | 13 - > > src/mesa/drivers/dri/i965/intel_screen.c | 15 +++ > > src/mesa/drivers/dri/i965/intel_screen.h | 1 + > > 5 files changed, 23 insertions(+), 17 deletions(-) > > > >diff --git a/src/mesa/drivers/dri/i965/brw_context.c > >b/src/mesa/drivers/dri/i965/brw_context.c > >index efcd91a..ac744d7 100644 > >--- a/src/mesa/drivers/dri/i965/brw_context.c > >+++ b/src/mesa/drivers/dri/i965/brw_context.c > >@@ -819,12 +819,7 @@ brwCreateContext(gl_api api, > >} > > } > >- if (brw_init_pipe_control(brw, devinfo)) { > >- *dri_ctx_error = __DRI_CTX_ERROR_NO_MEMORY; > >- intelDestroyContext(driContextPriv); > >- return false; > >- } > >- > >+ brw_init_pipe_control(brw, devinfo); > > brw_init_state(brw); > > intelInitExtensions(ctx); > >diff --git a/src/mesa/drivers/dri/i965/brw_context.h > >b/src/mesa/drivers/dri/i965/brw_context.h > >index ffdf821..166b852 100644 > >--- a/src/mesa/drivers/dri/i965/brw_context.h > >+++ b/src/mesa/drivers/dri/i965/brw_context.h > >@@ -2016,8 +2016,8 @@ gen9_use_linear_1d_layout(const struct brw_context > >*brw, > >const struct intel_mipmap_tree *mt); > > /* brw_pipe_control.c */ > >-int brw_init_pipe_control(struct brw_context *brw, > >- const struct brw_device_info *info); > >+void brw_init_pipe_control(struct brw_context *brw, > >+ const struct brw_device_info *info); > > void brw_fini_pipe_control(struct brw_context *brw); > > void brw_emit_pipe_control_flush(struct brw_context *brw, uint32_t flags); > >diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c > >b/src/mesa/drivers/dri/i965/brw_pipe_control.c > >index 7ee3cb6..872bfe8 100644 > >--- a/src/mesa/drivers/dri/i965/brw_pipe_control.c > >+++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c > >@@ -330,26 +330,21 @@ brw_emit_mi_flush(struct brw_context *brw) > > brw_render_cache_set_clear(brw); > > } > >-int > >+void > > brw_init_pipe_control(struct brw_context *brw, > >const struct brw_device_info *devinfo) > > { > > if (devinfo->gen < 6) > >- return 0; > >+ return; > > /* We can't just use brw_state_batch to get a chunk of space for > > * the gen6 workaround because it involves actually writing to > > * the buffer, and the kernel doesn't let us write to the batch. > > */ > >- brw->workaround_bo = drm_intel_bo_alloc(brw->bufmgr, > >- "pipe_control workaround", > >- 4096, 4096); > >- if (brw->workaround_bo == NULL) > >- return -ENOMEM; > >+ brw->workaround_bo = brw->intelScreen->workaround_bo; > >+ drm_intel_bo_reference(brw->workaround_bo); > > Why do you need reference counting here? Why not simply destroy the > buffer when the screen is destroyed? I would assume that a screen > could not be destroyed until all its associated contexts are gone, > right? Because it makes me feel happier! You are right, it can be removed and replaced with a comment. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/format_utils: Add src_bits == dst_bits cases to [us]norm functions
Jason Ekstrand writes: > @@ -151,8 +153,10 @@ _mesa_snorm_to_snorm(int x, unsigned src_bits, unsigned > dst_bits) >return -MAX_INT(dst_bits); > else if (src_bits < dst_bits) >return EXTEND_NORMALIZED_INT(x, src_bits - 1, dst_bits - 1); > - else > + else if (src_bits > dst_bits) >return x >> (src_bits - dst_bits); > + else > + return x; > } This part seems pretty unnecessary as it seems pretty unlikely that a compiler couldn't optimise away x >> 0. However I'm happy if you want to land it anyway for consistency. If you wanted to optimise it a bit more you could move the if(src_bits==dst_bits) to above the first if statement because I think it would be tricky the for the compiler to optimise that away. Either way, Reviewed-by: Neil Roberts Regards, - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 05/70] i965: Move the pipelined test for SO register access to the screen
On 07/08/15 23:13, Chris Wilson wrote: Moving the test to the screen places it alongside the other global HW feature tesst that want to be shared between contexts. Signed-off-by: Chris Wilson Cc: Kenneth Graunke Seems like a good idea and I am also much more comfortable with this as the workaround bo is per-screen now so contexts could mess up the results: Reviewed-by: Martin Peres --- src/mesa/drivers/dri/i965/brw_context.c | 2 + src/mesa/drivers/dri/i965/intel_extensions.c | 69 - src/mesa/drivers/dri/i965/intel_screen.c | 92 src/mesa/drivers/dri/i965/intel_screen.h | 8 +++ 4 files changed, 102 insertions(+), 69 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index ac744d7..e8d1396 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -728,6 +728,8 @@ brwCreateContext(gl_api api, brw->must_use_separate_stencil = screen->hw_must_use_separate_stencil; brw->has_swizzling = screen->hw_has_swizzling; + brw->has_pipelined_so = + screen->hw_has_pipelined_register & HW_HAS_PIPELINED_SOL_OFFSET; brw->vs.base.stage = MESA_SHADER_VERTEX; brw->gs.base.stage = MESA_SHADER_GEOMETRY; diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index bf8fdae..6346fbc 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b/src/mesa/drivers/dri/i965/intel_extensions.c @@ -32,74 +32,6 @@ #include "intel_reg.h" #include "utils.h" -/** - * Test if we can use MI_LOAD_REGISTER_MEM from an untrusted batchbuffer. - * - * Some combinations of hardware and kernel versions allow this feature, - * while others don't. Instead of trying to enumerate every case, just - * try and write a register and see if works. - */ -static bool -can_do_pipelined_register_writes(struct brw_context *brw) -{ - /* Supposedly, Broadwell just works. */ - if (brw->gen >= 8) - return true; - - static int result = -1; - if (result != -1) - return result; - - /* We use SO_WRITE_OFFSET0 since you're supposed to write it (unlike the -* statistics registers), and we already reset it to zero before using it. -*/ - const int reg = GEN7_SO_WRITE_OFFSET(0); - const int expected_value = 0x1337d0d0; - const int offset = 100; - - /* The register we picked only exists on Gen7+. */ - assert(brw->gen == 7); - - uint32_t *data; - /* Set a value in a BO to a known quantity. The workaround BO already -* exists and doesn't contain anything important, so we may as well use it. -*/ - drm_intel_bo_map(brw->workaround_bo, true); - data = brw->workaround_bo->virtual; - data[offset] = 0x; - drm_intel_bo_unmap(brw->workaround_bo); - - /* Write the register. */ - BEGIN_BATCH(3); - OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2)); - OUT_BATCH(reg); - OUT_BATCH(expected_value); - ADVANCE_BATCH(); - - brw_emit_mi_flush(brw); - - /* Save the register's value back to the buffer. */ - BEGIN_BATCH(3); - OUT_BATCH(MI_STORE_REGISTER_MEM | (3 - 2)); - OUT_BATCH(reg); - OUT_RELOC(brw->workaround_bo, - I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION, - offset * sizeof(uint32_t)); - ADVANCE_BATCH(); - - intel_batchbuffer_flush(brw); - - /* Check whether the value got written. */ - drm_intel_bo_map(brw->workaround_bo, false); - data = brw->workaround_bo->virtual; - bool success = data[offset] == expected_value; - drm_intel_bo_unmap(brw->workaround_bo); - - result = success; - - return success; -} - static bool can_write_oacontrol(struct brw_context *brw) { @@ -328,7 +260,6 @@ intelInitExtensions(struct gl_context *ctx) ctx->Extensions.ARB_texture_compression_bptc = true; ctx->Extensions.ARB_texture_view = true; - brw->has_pipelined_so = can_do_pipelined_register_writes(brw); if (brw->has_pipelined_so) { ctx->Extensions.ARB_draw_indirect = true; ctx->Extensions.ARB_transform_feedback2 = true; diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index 61f1dbe..0a64d2b 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -1186,6 +1186,96 @@ intel_detect_timestamp(struct intel_screen *screen) } /** + * Test if we can use MI_LOAD_REGISTER_MEM from an untrusted batchbuffer. + * + * Some combinations of hardware and kernel versions allow this feature, + * while others don't. Instead of trying to enumerate every case, just + * try and write a register and see if works. + */ +static bool +intel_detect_pipelined_register(struct intel_screen *screen, +int reg, uint32_t expected_value, bool reset) +{ + drm_intel_bo *results, *bo; + uint32_t *batch; + uint32_t offset = 0; + bool success =
Re: [Mesa-dev] [PATCH 06/70] i965: Move the OACONTROL pipelined access check from context to screen
On 07/08/15 23:13, Chris Wilson wrote: Similarly to the pipelined SO_OFFSET check, this moves the global HW compatability check to the screen next to the other global checks. Reviewed-by: Martin Peres Signed-off-by: Chris Wilson --- src/mesa/drivers/dri/i965/intel_extensions.c | 68 +--- src/mesa/drivers/dri/i965/intel_screen.c | 17 +++ src/mesa/drivers/dri/i965/intel_screen.h | 3 +- 3 files changed, 21 insertions(+), 67 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index 6346fbc..e7828c7 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b/src/mesa/drivers/dri/i965/intel_extensions.c @@ -32,71 +32,6 @@ #include "intel_reg.h" #include "utils.h" -static bool -can_write_oacontrol(struct brw_context *brw) -{ - if (brw->gen < 6 || brw->gen >= 8) - return false; - - static int result = -1; - if (result != -1) - return result; - - /* Set "Select Context ID" to a particular address (which is likely not a -* context), but leave all counting disabled. This should be harmless. -*/ - const int expected_value = 0x31337000; - const int offset = 110; - - uint32_t *data; - /* Set a value in a BO to a known quantity. The workaround BO already -* exists and doesn't contain anything important, so we may as well use it. -*/ - drm_intel_bo_map(brw->workaround_bo, true); - data = brw->workaround_bo->virtual; - data[offset] = 0x; - drm_intel_bo_unmap(brw->workaround_bo); - - /* Write OACONTROL. */ - BEGIN_BATCH(3); - OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2)); - OUT_BATCH(OACONTROL); - OUT_BATCH(expected_value); - ADVANCE_BATCH(); - - brw_emit_mi_flush(brw); - - /* Save the register's value back to the buffer. */ - BEGIN_BATCH(3); - OUT_BATCH(MI_STORE_REGISTER_MEM | (3 - 2)); - OUT_BATCH(OACONTROL); - OUT_RELOC(brw->workaround_bo, - I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION, - offset * sizeof(uint32_t)); - ADVANCE_BATCH(); - - brw_emit_mi_flush(brw); - - /* Set OACONTROL back to zero (everything off). */ - BEGIN_BATCH(3); - OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2)); - OUT_BATCH(OACONTROL); - OUT_BATCH(0); - ADVANCE_BATCH(); - - intel_batchbuffer_flush(brw); - - /* Check whether the value got written. */ - drm_intel_bo_map(brw->workaround_bo, false); - data = brw->workaround_bo->virtual; - bool success = data[offset] == expected_value; - drm_intel_bo_unmap(brw->workaround_bo); - - result = success; - - return success; -} - /** * Initializes potential list of extensions if ctx == NULL, or actually enables * extensions for a context. @@ -207,7 +142,8 @@ intelInitExtensions(struct gl_context *ctx) ctx->Extensions.EXT_shader_integer_mix = ctx->Const.GLSLVersion >= 130; ctx->Extensions.EXT_timer_query = true; - if (brw->gen == 5 || can_write_oacontrol(brw)) { + if (brw->gen == 5 || + brw->intelScreen->hw_has_pipelined_register & HW_HAS_PIPELINED_OACONTROL) { ctx->Extensions.AMD_performance_monitor = true; ctx->Extensions.INTEL_performance_query = true; } diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index 0a64d2b..36c7bb2 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -1275,6 +1275,21 @@ intel_detect_pipelined_so(struct intel_screen *screen) false); } +static bool +intel_detect_pipelined_oacontrol(struct intel_screen *screen) +{ + if (screen->devinfo->gen < 6 || screen->devinfo->gen >= 8) + return false; + + /* Set "Select Context ID" to a particular address (which is likely not a +* context), but leave all counting disabled. This should be harmless. +*/ + return intel_detect_pipelined_register(screen, + OACONTROL, + 0x31337000, + true); +} + /** * Return array of MSAA modes supported by the hardware. The array is * zero-terminated and sorted in decreasing order. @@ -1536,6 +1551,8 @@ __DRIconfig **intelInitScreen2(__DRIscreen *psp) intelScreen->hw_has_timestamp = intel_detect_timestamp(intelScreen); if (intel_detect_pipelined_so(intelScreen)) intelScreen->hw_has_pipelined_register |= HW_HAS_PIPELINED_SOL_OFFSET; + if (intel_detect_pipelined_oacontrol(intelScreen)) + intelScreen->hw_has_pipelined_register |= HW_HAS_PIPELINED_OACONTROL; const char *force_msaa = getenv("INTEL_FORCE_MSAA"); if (force_msaa) { diff --git a/src/mesa/drivers/dri/i965/intel_screen.h b/src/mesa/drivers/dri/i965/intel_screen.h index 7890706..e054b69 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.h +++ b/src/mesa/drivers/dri/i965/intel_scree
[Mesa-dev] [PATCH] i965: Fix HW binding tables editing
Since the introduction of new gl_shader_stages in commit a2af956963b6bc4d29f37485e44c98008d2ef077 Author: Fabian Bieler Date: Fri Mar 7 10:19:09 2014 +0100 mesa: add tessellation shader enums the translation table for the stage into the HW binding table edit command was broken, and so we used illegal commands. Fix the array initialisation to be impervious to changes in the gl_shader_stages enum and add the asserts that would have caught the issue earlier. Signed-off-by: Chris Wilson Cc: Abdiel Janulgue Cc: Jordan Justen Cc: Matt Turner Cc: Kenneth Graunke Reviewed-by: Matt Turner --- src/mesa/drivers/dri/i965/brw_binding_tables.c | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c b/src/mesa/drivers/dri/i965/brw_binding_tables.c index 8d3697a..b188fc7 100644 --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c @@ -44,10 +44,10 @@ #include "brw_state.h" #include "intel_batchbuffer.h" -static const GLuint stage_to_bt_edit[MESA_SHADER_FRAGMENT + 1] = { - _3DSTATE_BINDING_TABLE_EDIT_VS, - _3DSTATE_BINDING_TABLE_EDIT_GS, - _3DSTATE_BINDING_TABLE_EDIT_PS, +static const GLuint stage_to_bt_edit[] = { + [MESA_SHADER_VERTEX] = _3DSTATE_BINDING_TABLE_EDIT_VS, + [MESA_SHADER_GEOMETRY] = _3DSTATE_BINDING_TABLE_EDIT_GS, + [MESA_SHADER_FRAGMENT] = _3DSTATE_BINDING_TABLE_EDIT_PS, }; static uint32_t @@ -233,7 +233,8 @@ gen7_edit_hw_binding_table_entry(struct brw_context *brw, uint32_t index, uint32_t surf_offset) { - assert(stage <= MESA_SHADER_FRAGMENT); + assert(stage < ARRAY_SIZE(stage_to_bt_edit)); + assert(stage_to_bt_edit[stage]); uint32_t dw2 = SET_FIELD(index, BRW_BINDING_TABLE_INDEX) | (brw->gen >= 8 ? GEN8_SURFACE_STATE_EDIT(surf_offset) : @@ -259,7 +260,9 @@ gen7_update_binding_table_from_array(struct brw_context *brw, int num_surfaces) { uint32_t dw2 = 0; - assert(stage <= MESA_SHADER_FRAGMENT); + + assert(stage < ARRAY_SIZE(stage_to_bt_edit)); + assert(stage_to_bt_edit[stage]); BEGIN_BATCH(num_surfaces + 2); OUT_BATCH(stage_to_bt_edit[stage] << 16 | num_surfaces); -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Fix HW binding tables editing
On Mon, Aug 10, 2015 at 12:49:47PM +0100, Chris Wilson wrote: > Since the introduction of new gl_shader_stages in > > commit a2af956963b6bc4d29f37485e44c98008d2ef077 > Author: Fabian Bieler > Date: Fri Mar 7 10:19:09 2014 +0100 > > mesa: add tessellation shader enums > > the translation table for the stage into the HW binding table edit > command was broken, and so we used illegal commands. Fix the array > initialisation to be impervious to changes in the gl_shader_stages enum > and add the asserts that would have caught the issue earlier. Sorry, misdirected git send-email, please ignore. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] dri2: Insert a synchronisation point for glXWaitX
"X rendering calls made prior to glXWaitX are guaranteed to be executed before GL rendering calls made after glXWaitX." The goal is to implement that without adding a round-trip to the Xserver. Adding one using XSync() is easy, but we can piggy-back another, the DRI2GetBuffers request made to update the render buffers before the next rendering (thus satisfying the condition of flushing all X operations before the next GL rendering). To this end we can just flag the DRI2 buffers as invalid, and the driver will refresh them in due course. If the DRI2 buffers are, or will be, invalid due to damage from X or through a SwapBuffers call, we will not have to add another roundtrip as the single DRI2GetBuffers will refresh over multiple invalidates. This should fix the historic issue that glXWaitX() has been unreliable, but has recently found itself a new trigger will the removal of unnecessary glViewport calls: commit 95073a2dca03a48f4c77bc846a4a6d1f0eb81ae6 Author: Kenneth Graunke Date: Fri Sep 26 15:13:30 2014 -0700 mesa: Avoid flagging _NEW_VIEWPORT on redundant viewport updates. since each Viewport typically ended up triggering an invalidate. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90264 Signed-off-by: Chris Wilson Cc: Michel Dänzer Cc: Kenneth Graunke Cc: Brian Paul Cc: Ian Romanick --- src/glx/dri2_glx.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c index 5767026..01b5c28 100644 --- a/src/glx/dri2_glx.c +++ b/src/glx/dri2_glx.c @@ -656,6 +656,8 @@ dri2_wait_x(struct glx_context *gc) struct dri2_drawable *priv = (struct dri2_drawable *) GetGLXDRIDrawable(gc->currentDpy, gc->currentDrawable); + dri2InvalidateBuffers(gc->currentDpy, gc->currentDrawable); + if (priv == NULL || !priv->have_fake_front) return; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/70] i965: Combine the multiple pipelined register detection into one round-trip
On 07/08/15 23:13, Chris Wilson wrote: Combining the multiple access checks into a few batches and a single serialising read can reduce detection times from around 100us to 70us on a fast Haswell system. Signed-off-by: Chris Wilson Cc: Kenneth Graunke --- src/mesa/drivers/dri/i965/intel_screen.c | 165 +++ 1 file changed, 101 insertions(+), 64 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index 36c7bb2..0b60f13 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -1185,6 +1185,13 @@ intel_detect_timestamp(struct intel_screen *screen) return 0; } +struct detect_pipelined_register { + uint32_t reg; + uint32_t expected_value; + unsigned result; + bool reset; +}; + /** * Test if we can use MI_LOAD_REGISTER_MEM from an untrusted batchbuffer. * @@ -1192,102 +1199,134 @@ intel_detect_timestamp(struct intel_screen *screen) * while others don't. Instead of trying to enumerate every case, just * try and write a register and see if works. */ -static bool -intel_detect_pipelined_register(struct intel_screen *screen, -int reg, uint32_t expected_value, bool reset) +static void +__intel_detect_pipelined_registers(struct intel_screen *screen, + struct detect_pipelined_register *r, + int count) { - drm_intel_bo *results, *bo; - uint32_t *batch; - uint32_t offset = 0; - bool success = false; + drm_intel_bo *results; + int i; + + if (count == 0) + return; /* Create a zero'ed temporary buffer for reading our results */ results = drm_intel_bo_alloc(screen->bufmgr, "registers", 4096, 0); if (results == NULL) - goto err; - - bo = drm_intel_bo_alloc(screen->bufmgr, "batchbuffer", 4096, 0); - if (bo == NULL) - goto err_results; + return; - if (drm_intel_bo_map(bo, 1)) - goto err_batch; + /* Emit each access in a separate batch buffer so that if the kernel +* rejects an individual access attempt, we don't incorrectly assume +* all the register accesses are invalid. +*/ + for (i = 0; i < count; i++) { + drm_intel_bo *bo; + uint32_t *batch; - batch = bo->virtual; + bo = drm_intel_bo_alloc(screen->bufmgr, "batchbuffer", 4096, 0); + if (bo == NULL) + continue; - /* Write the register. */ - *batch++ = MI_LOAD_REGISTER_IMM | (3 - 2); - *batch++ = reg; - *batch++ = expected_value; + if (drm_intel_bo_map(bo, 1)) + goto err_batch; - /* Save the register's value back to the buffer. */ - *batch++ = MI_STORE_REGISTER_MEM | (3 - 2); - *batch++ = reg; - drm_intel_bo_emit_reloc(bo, (char *)batch -(char *)bo->virtual, - results, offset*sizeof(uint32_t), - I915_GEM_DOMAIN_INSTRUCTION, - I915_GEM_DOMAIN_INSTRUCTION); - *batch++ = results->offset + offset*sizeof(uint32_t); + batch = bo->virtual; - /* And afterwards clear the register */ - if (reset) { + /* Write the register. */ *batch++ = MI_LOAD_REGISTER_IMM | (3 - 2); - *batch++ = reg; - *batch++ = 0; - } + *batch++ = r[i].reg; + *batch++ = r[i].expected_value; + + /* Save the register's value back to the buffer. */ + *batch++ = MI_STORE_REGISTER_MEM | (3 - 2); + *batch++ = r[i].reg; + drm_intel_bo_emit_reloc(bo, (char *)batch -(char *)bo->virtual, + results, i*sizeof(uint32_t), + I915_GEM_DOMAIN_INSTRUCTION, + I915_GEM_DOMAIN_INSTRUCTION); + *batch++ = results->offset + i*sizeof(uint32_t); + + /* And afterwards clear the register */ + if (r[i].reset) { + *batch++ = MI_LOAD_REGISTER_IMM | (3 - 2); + *batch++ = r[i].reg; + *batch++ = 0; + } - *batch++ = MI_BATCH_BUFFER_END; + *batch++ = MI_BATCH_BUFFER_END; - drm_intel_bo_mrb_exec(bo, ALIGN((char *)batch - (char *)bo->virtual, 8), - NULL, 0, 0, - I915_EXEC_RENDER); + drm_intel_bo_mrb_exec(bo, ALIGN((char *)batch - (char *)bo->virtual, 8), +NULL, 0, 0, +I915_EXEC_RENDER); - /* Check whether the value got written. */ +err_batch: + drm_intel_bo_unreference(bo); + } + + /* Check whether the values got written. */ if (drm_intel_bo_map(results, false) == 0) { - success = *((uint32_t *)results->virtual + offset) == expected_value; + uint32_t *data = results->virtual; + for (i = 0; i < count; i++) + if (data[i] == r[i].expected_value) +screen->hw_has_pipelined_register |= r[i].result; drm_intel_bo_unmap(results); } -err_batch: - drm_intel_bo_unrefe
Re: [Mesa-dev] [PATCH 08/70] i965: Remove early release of DRI2 miptree
On 07/08/15 23:13, Chris Wilson wrote: intel_update_winsys_renderbuffer_miptree() will release the existing miptree when wrapping a new DRI2 buffer, so we can remove the early release and so prevent a NULL mt dereference should importing the new DRI2 name fail for any reason. (Reusing the old DRI2 name will result in the rendering going astray, to a stale buffer, and not shown on the screen, but it allows us to issue a warning and not crash much later in innocent code.) Signed-off-by: Chris Wilson --- src/mesa/drivers/dri/i965/brw_context.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index e8d1396..72f3897 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -1388,7 +1388,6 @@ intel_process_dri2_buffer(struct brw_context *brw, buffer->cpp, buffer->pitch); } - intel_miptree_release(&rb->mt); bo = drm_intel_bo_gem_create_from_name(brw->bufmgr, buffer_name, buffer->name); if (!bo) { Looks like a bug fix, shouldn't this be sent separately? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] glsl/es3.1: Fix up GL_ARB_compute_shader for GLSL ES 3.1
On Mon, 2015-08-10 at 13:04 +0200, Marta Lofstedt wrote: > From: Marta Lofstedt > > GL_ARB_compute_shader is limited for GLSL version 430. > This enables for GLSL ES version 310. > > V2: Updated error string to also include GLSL 3.10 > > Signed-off-by: Marta Lofstedt > --- > src/glsl/glsl_parser.yy | 5 ++--- > src/glsl/glsl_parser_extras.h | 5 + > 2 files changed, 7 insertions(+), 3 deletions(-) > > diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy > index 2b0c8bd..2abdab4 100644 > --- a/src/glsl/glsl_parser.yy > +++ b/src/glsl/glsl_parser.yy > @@ -1519,11 +1519,10 @@ layout_qualifier_id: > "invalid %s of %d specified", > local_size_qualifiers[i], $3); > YYERROR; > -} else if (!state->is_version(430, 0) && > - !state->ARB_compute_shader_enable) { > +} else if (!state->has_compute_shader_()) { > _mesa_glsl_error(& @3, state, > "%s qualifier requires GLSL 4.30 or " > -"ARB_compute_shader", > + "GLSL ES 3.10 or ARB_compute_shader", You have tabs in the above line. Also see the AoA example for a way to make the message a little nicer: http://lists.freedesktop.org/archives/mesa-dev/2015-July/090104.html > local_size_qualifiers[i]); > YYERROR; > } else { > diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h > index eb325f0..57a7555 100644 > --- a/src/glsl/glsl_parser_extras.h > +++ b/src/glsl/glsl_parser_extras.h > @@ -236,6 +236,11 @@ struct _mesa_glsl_parse_state { >return ARB_shading_language_420pack_enable || is_version(420, 0); > } > > + bool has_compute_shader() const > + { > + return ARB_compute_shader_enable || is_version(430, 310); > + } > + > void process_version_directive(YYLTYPE *locp, int version, >const char *ident); > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/70] i965: Remove direct includes of intel_batchbuffer.h
On 07/08/15 23:13, Chris Wilson wrote: Upcoming patches eliminate the intel_batchbuffer interface and one of the minor changes that causes a lot of churn is the removal of the header, along with the occassional need to now call intel_reg.h themselves. This patch moves the individual includes into brw_context.h. Signed-off-by: Chris Wilson It compiles, so: Reviewed-by: Martin Peres ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 08/70] i965: Remove early release of DRI2 miptree
On Mon, Aug 10, 2015 at 03:18:09PM +0300, Martin Peres wrote: > On 07/08/15 23:13, Chris Wilson wrote: > >intel_update_winsys_renderbuffer_miptree() will release the existing > >miptree when wrapping a new DRI2 buffer, so we can remove the early > >release and so prevent a NULL mt dereference should importing the new > >DRI2 name fail for any reason. (Reusing the old DRI2 name will result > >in the rendering going astray, to a stale buffer, and not shown on the > >screen, but it allows us to issue a warning and not crash much later in > >innocent code.) > > > >Signed-off-by: Chris Wilson > >--- > > src/mesa/drivers/dri/i965/brw_context.c | 1 - > > 1 file changed, 1 deletion(-) > > > >diff --git a/src/mesa/drivers/dri/i965/brw_context.c > >b/src/mesa/drivers/dri/i965/brw_context.c > >index e8d1396..72f3897 100644 > >--- a/src/mesa/drivers/dri/i965/brw_context.c > >+++ b/src/mesa/drivers/dri/i965/brw_context.c > >@@ -1388,7 +1388,6 @@ intel_process_dri2_buffer(struct brw_context *brw, > >buffer->cpp, buffer->pitch); > > } > >- intel_miptree_release(&rb->mt); > > bo = drm_intel_bo_gem_create_from_name(brw->bufmgr, buffer_name, > >buffer->name); > > if (!bo) { > > Looks like a bug fix, shouldn't this be sent separately? It is. I separated it out for that purpose :) -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/70] i965: Rename intel_batchbuffer to brw_batch
10-12 are: Reviewed-by: Martin Peres On 07/08/15 23:13, Chris Wilson wrote: In order to reduce future churn, rename the intel_batchbuffer struct. Signed-off-by: Chris Wilson --- src/mesa/drivers/dri/i965/brw_batch.h | 4 ++-- src/mesa/drivers/dri/i965/brw_context.h | 2 +- src/mesa/drivers/dri/i965/brw_state_batch.c | 6 ++ src/mesa/drivers/dri/i965/intel_batchbuffer.c | 4 ++-- src/mesa/drivers/dri/i965/intel_batchbuffer.h | 4 ++-- 5 files changed, 9 insertions(+), 11 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_batch.h b/src/mesa/drivers/dri/i965/brw_batch.h index c38b92a..c05f9b0 100644 --- a/src/mesa/drivers/dri/i965/brw_batch.h +++ b/src/mesa/drivers/dri/i965/brw_batch.h @@ -41,7 +41,7 @@ enum brw_gpu_ring { BLT_RING, }; -struct intel_batchbuffer { +typedef struct brw_batch { /** Current batchbuffer being queued up. */ brw_bo *bo; /** Last BO submitted to the hardware. Used for glFinish(). */ @@ -64,7 +64,7 @@ struct intel_batchbuffer { uint32_t *map_next; int reloc_count; } saved; -}; +} brw_batch; #ifdef __cplusplus } diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index ce0ea94..cd8ea50 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1023,7 +1023,7 @@ struct brw_context */ uint32_t reset_count; - struct intel_batchbuffer batch; + brw_batch batch; bool no_batch_wrap; struct { diff --git a/src/mesa/drivers/dri/i965/brw_state_batch.c b/src/mesa/drivers/dri/i965/brw_state_batch.c index d785c89..22dfbe5 100644 --- a/src/mesa/drivers/dri/i965/brw_state_batch.c +++ b/src/mesa/drivers/dri/i965/brw_state_batch.c @@ -40,14 +40,12 @@ brw_track_state_batch(struct brw_context *brw, int size, int index) { - struct intel_batchbuffer *batch = &brw->batch; - if (!brw->state_batch_list) { /* Our structs are always aligned to at least 32 bytes, so * our array doesn't need to be any larger */ brw->state_batch_list = ralloc_size(brw, sizeof(*brw->state_batch_list) * - batch->bo->size / 32); + brw->batch.bo->size / 32); } brw->state_batch_list[brw->state_batch_count].offset = offset; @@ -124,7 +122,7 @@ __brw_state_batch(struct brw_context *brw, uint32_t *out_offset) { - struct intel_batchbuffer *batch = &brw->batch; + brw_batch *batch = &brw->batch; uint32_t offset; assert(size < batch->bo->size); diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c index 44512be..ca65390 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c @@ -112,7 +112,7 @@ static void do_batch_dump(struct brw_context *brw) { struct drm_intel_decode *decode; - struct intel_batchbuffer *batch = &brw->batch; + brw_batch *batch = &brw->batch; int ret; decode = drm_intel_decode_context_alloc(brw->intelScreen->deviceID); @@ -290,7 +290,7 @@ throttle(struct brw_context *brw) static int do_flush_locked(struct brw_context *brw) { - struct intel_batchbuffer *batch = &brw->batch; + brw_batch *batch = &brw->batch; int ret = 0; if (brw->has_llc) { diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h b/src/mesa/drivers/dri/i965/intel_batchbuffer.h index 48d6f5a..5ca3bf3 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h @@ -32,7 +32,7 @@ extern "C" { */ #define BATCH_RESERVED 152 -struct intel_batchbuffer; +struct brw_batch; struct brw_context; enum brw_gpu_ring; @@ -154,7 +154,7 @@ static inline void intel_batchbuffer_advance(struct brw_context *brw) { #ifdef DEBUG - struct intel_batchbuffer *batch = &brw->batch; + brw_batch *batch = &brw->batch; unsigned int _n = USED_BATCH(*batch) - batch->emit; assert(batch->total != 0); if (_n != batch->total) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 08/70] i965: Remove early release of DRI2 miptree
On 10/08/15 15:26, Chris Wilson wrote: On Mon, Aug 10, 2015 at 03:18:09PM +0300, Martin Peres wrote: On 07/08/15 23:13, Chris Wilson wrote: intel_update_winsys_renderbuffer_miptree() will release the existing miptree when wrapping a new DRI2 buffer, so we can remove the early release and so prevent a NULL mt dereference should importing the new DRI2 name fail for any reason. (Reusing the old DRI2 name will result in the rendering going astray, to a stale buffer, and not shown on the screen, but it allows us to issue a warning and not crash much later in innocent code.) Signed-off-by: Chris Wilson --- src/mesa/drivers/dri/i965/brw_context.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index e8d1396..72f3897 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -1388,7 +1388,6 @@ intel_process_dri2_buffer(struct brw_context *brw, buffer->cpp, buffer->pitch); } - intel_miptree_release(&rb->mt); bo = drm_intel_bo_gem_create_from_name(brw->bufmgr, buffer_name, buffer->name); if (!bo) { Looks like a bug fix, shouldn't this be sent separately? It is. I separated it out for that purpose :) -Chris I meant, sent separately of this series. To make it clear it was independent and make the upstreaming faster :) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/70] i965: Rename intel_batchbuffer to brw_batch
On Mon, Aug 10, 2015 at 03:37:24PM +0300, Martin Peres wrote: > 10-12 are: > > Reviewed-by: Martin Peres One thing to consider, do we want to do a s/typedef/struct brw_batch/ cleanup after the transition is complete? brw_batch vs struct brw_batch ? brw_bo vs struct brw_bo ? -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/70] i965: Rename intel_batchbuffer to brw_batch
On 10/08/15 15:45, Chris Wilson wrote: On Mon, Aug 10, 2015 at 03:37:24PM +0300, Martin Peres wrote: 10-12 are: Reviewed-by: Martin Peres One thing to consider, do we want to do a s/typedef/struct brw_batch/ cleanup after the transition is complete? brw_batch vs struct brw_batch ? brw_bo vs struct brw_bo ? -Chris That is a question for jason and Kenneth, isn't it? /me is not really a big contributor to mesa yet ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: Map packed gallium formats to packed mesa formats.
On Mon, Aug 10, 2015 at 10:43 AM, Jason Ekstrand wrote: > On Mon, Aug 10, 2015 at 12:41 AM, Oded Gabbay wrote: >> On Mon, Aug 10, 2015 at 10:30 AM, Jason Ekstrand >> wrote: >>> On Mon, Aug 10, 2015 at 12:07 AM, Oded Gabbay wrote: On Mon, Aug 10, 2015 at 7:58 AM, Jason Ekstrand wrote: > On Sun, Aug 9, 2015 at 3:11 AM, Oded Gabbay wrote: >> On Sun, Aug 9, 2015 at 2:43 AM, Jason Ekstrand >> wrote: >>> On Sat, Aug 8, 2015 at 3:14 PM, Oded Gabbay >>> wrote: On Sun, Aug 9, 2015 at 12:26 AM, Jason Ekstrand wrote: If I understand your patch, instead of using the endian-dependent defines, you use the LE defines all the time when converting pipe<-->mesa formats I'm not saying its wrong, but these endian-dependent defines, e.g. PIPE_FORMAT_ARGB_UNORM, are used in more places in the gallium code than in the above two functions. Why change only here and not in all places, i.e removing them ? Also, this fixes only llvmpipe/softpipe. It doesn't fix sw_rast. That is still broken >>> >>> The fact that sw_rast is broken bothers me. The sw_rast code doesn't >>> leave core mesa so it shouldn't be affected by any gallium changes. >>> From that perspective, this makes sense. However, sw_rast should be >>> using the auto-generated packing/unpacking functions in core mesa. >>> Did sw_rast work prior to the readpixels changes? It's possible it >>> was already broken. >>> >> I didn't check sw_rast status before readpixels changes. >> However, from looking at the automatic pack/unpack functions, it seems >> strange to me they don't take into account the endianness. >> e.g on ppc64 machine, in file format_pack.c, there is the following >> function: >> >> static inline void >> pack_ubyte_a8b8g8r8_unorm(const GLubyte src[4], void *dst) >> { >> uint8_t a = _mesa_unorm_to_unorm(src[3], 8, 8); >> uint8_t b = _mesa_unorm_to_unorm(src[2], 8, 8); >> uint8_t g = _mesa_unorm_to_unorm(src[1], 8, 8); >> uint8_t r = _mesa_unorm_to_unorm(src[0], 8, 8); >> >> uint32_t d = 0; >> d |= PACK(a, 0, 8); >> d |= PACK(b, 8, 8); >> d |= PACK(g, 16, 8); >> d |= PACK(r, 24, 8); >> (*(uint32_t *) dst) = d; >> } >> >> PACK is defined as: >> #define PACK(SRC, OFFSET, BITS) (((SRC) & MAX_UINT(BITS)) << (OFFSET)) >> >> So, we see that 'R' is written to the most-significant byte of the >> dword and 'A' is written to the least-significant byte of the dword, >> although in BE, that should be reversed, since this is abgr format, >> not rgba. > > No, what I've been trying to tell you this whole time is that this is > how the a8r8g8b8 format is *defined*. It's defined in terms of > carving up a single 32-bit word in native endianness. This is why we > have to flip things based on endianness if we want to get it in terms > of an array of 8-bit values. You're free to argue that that's a > clumsy definition for a color format, but that's how it's defined none > the less. > >> It's enough to see that the implementation of this function is the >> same for LE and BE to suspect the code, unless I'm missing something. >> >> I tested this patch on llvmpipe, combined with the previous patch you sent on POWER8 machine running ppc64 (BE) First of all, as I said above, it fixes the piglit sanity test. Second, it seems that it improves piglit gpu.py results much more than my patch. "Only" 1368 tests failed, vs. a bit more than 2000 with my patch. Still a lot higher than the ~600 tests that fail on ppc64le >>> >>> I'm not surprised that it didn't get to zero regressions vs. ppc64le. >>> This patch was just kind of thrown together and I didn't check all of >>> the formats. Also, as per my conversation with rolland, I may be >>> misunderstanding the PIPE_FORMAT enums. >>> >>> Really, this is a problem that is going to have to be chased by >>> someone other than me. I have no BE hardware, so I can't really chase >>> it down. I I also have very limited gallium knowledge so I'm probably >>> not the best person to start chasing it through gallium. All I've >>> done so far is throw darts at the wall. I'm sure this patch is >>> incomplete and, to be honest, I didn't even compile-test it as I don't >>> usually build gallium. What i do know is that, for the most part, the >>> core mesa code is consistent with itself as far as endianness goes. >> >> So I'm not married to my patch, and your 2 patches seem to do a better >> job (although sw-rast is still broken, but that's another issue and >> frankly, it is of less importance for me atm), and I accept yo
Re: [Mesa-dev] [PATCH 13/70] i965: Add a couple of utility functions to ref/unref a brw_bo
On 07/08/15 23:13, Chris Wilson wrote: To further reduce churn when replacing the buffer object implementation, wrap the existing drm_intel_bo_reference/drm_intel_bo_unreference. Signed-off-by: Chris Wilson --- src/mesa/drivers/dri/i965/brw_batch.h | 12 +++ src/mesa/drivers/dri/i965/brw_context.c| 25 +++--- src/mesa/drivers/dri/i965/brw_draw.c | 11 +- src/mesa/drivers/dri/i965/brw_draw_upload.c| 10 - .../drivers/dri/i965/brw_performance_monitor.c | 8 +++ src/mesa/drivers/dri/i965/brw_pipe_control.c | 5 ++--- src/mesa/drivers/dri/i965/brw_program.c| 4 ++-- src/mesa/drivers/dri/i965/brw_queryobj.c | 10 - src/mesa/drivers/dri/i965/brw_state_cache.c| 4 ++-- src/mesa/drivers/dri/i965/brw_vs_surface_state.c | 2 +- src/mesa/drivers/dri/i965/gen6_queryobj.c | 4 ++-- src/mesa/drivers/dri/i965/gen6_sol.c | 4 ++-- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 17 ++- src/mesa/drivers/dri/i965/intel_buffer_objects.c | 14 ++-- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 13 ++- src/mesa/drivers/dri/i965/intel_screen.c | 6 +++--- src/mesa/drivers/dri/i965/intel_syncobj.c | 10 - src/mesa/drivers/dri/i965/intel_upload.c | 7 +++--- 18 files changed, 78 insertions(+), 88 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_batch.h b/src/mesa/drivers/dri/i965/brw_batch.h index c05f9b0..5745aa4 100644 --- a/src/mesa/drivers/dri/i965/brw_batch.h +++ b/src/mesa/drivers/dri/i965/brw_batch.h @@ -66,6 +66,18 @@ typedef struct brw_batch { } saved; } brw_batch; +inline static brw_bo *brw_bo_get(brw_bo *bo) +{ + drm_intel_bo_reference(bo); + return bo; +} + +inline static void brw_bo_put(brw_bo *bo) +{ + if (bo) + drm_intel_bo_unreference(bo); +} Interesting naming here. I really get the _get() one, but _release() would make more sense than put(), wouldn't it? I don't want to bikeshed here, so if I am the only one for whom it _put() makes no sense (aside from being the opposite of get), then feel free to keep it along with my R-b. + #ifdef __cplusplus } #endif diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 8b2d006..583ce7f 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -929,17 +929,11 @@ intelDestroyContext(__DRIcontext * driContextPriv) brw_destroy_state(brw); brw_draw_destroy(brw); - drm_intel_bo_unreference(brw->curbe.curbe_bo); - if (brw->vs.base.scratch_bo) - drm_intel_bo_unreference(brw->vs.base.scratch_bo); - if (brw->gs.base.scratch_bo) - drm_intel_bo_unreference(brw->gs.base.scratch_bo); - if (brw->wm.base.scratch_bo) - drm_intel_bo_unreference(brw->wm.base.scratch_bo); - - gen7_reset_hw_bt_pool_offsets(brw); - drm_intel_bo_unreference(brw->hw_bt_pool.bo); - brw->hw_bt_pool.bo = NULL; + brw_bo_put(brw->curbe.curbe_bo); + brw_bo_put(brw->vs.base.scratch_bo); + brw_bo_put(brw->gs.base.scratch_bo); + brw_bo_put(brw->wm.base.scratch_bo); + brw_bo_put(brw->hw_bt_pool.bo); drm_intel_gem_context_destroy(brw->hw_ctx); @@ -955,10 +949,8 @@ intelDestroyContext(__DRIcontext * driContextPriv) brw_fini_pipe_control(brw); intel_batchbuffer_free(brw); - drm_intel_bo_unreference(brw->throttle_batch[1]); - drm_intel_bo_unreference(brw->throttle_batch[0]); - brw->throttle_batch[1] = NULL; - brw->throttle_batch[0] = NULL; + brw_bo_put(brw->throttle_batch[1]); + brw_bo_put(brw->throttle_batch[0]); driDestroyOptionCache(&brw->optionCache); @@ -1402,6 +1394,7 @@ intel_process_dri2_buffer(struct brw_context *brw, intel_update_winsys_renderbuffer_miptree(brw, rb, bo, drawable->w, drawable->h, buffer->pitch); + brw_bo_put(bo); if (brw_is_front_buffer_drawing(fb) && (buffer->attachment == __DRI_BUFFER_FRONT_LEFT || @@ -1411,8 +1404,6 @@ intel_process_dri2_buffer(struct brw_context *brw, } assert(rb->mt); - - drm_intel_bo_unreference(bo); } /** diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index 6dc0fd8..1aa0093 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -291,7 +291,7 @@ brw_merge_inputs(struct brw_context *brw, GLuint i; for (i = 0; i < brw->vb.nr_buffers; i++) { - drm_intel_bo_unreference(brw->vb.buffers[i].bo); + brw_bo_put(brw->vb.buffers[i].bo); brw->vb.buffers[i].bo = NULL; } brw->vb.nr_buffers = 0; @@ -482,13 +482,12 @@ brw_try_draw_prims(struct gl_context *ctx, brw->draw.gl_basevertex = prims[i].indexed ? pri
Re: [Mesa-dev] [PATCH 07/70] i965: Combine the multiple pipelined register detection into one round-trip
On Mon, Aug 10, 2015 at 03:17:26PM +0300, Martin Peres wrote: > >+static int > >+intel_detect_pipelined_oacontrol(struct intel_screen *screen, > >+ struct detect_pipelined_register *detect) > > { > > if (screen->devinfo->gen < 6 || screen->devinfo->gen >= 8) > >- return false; > >+ return 0; > > /* Set "Select Context ID" to a particular address (which is likely not > > a > > * context), but leave all counting disabled. This should be harmless. > > */ > >- return intel_detect_pipelined_register(screen, > >- OACONTROL, > >- 0x31337000, > >- true); > >+ detect->reg = OACONTROL; > >+ detect->expected_value = 0x31337000; > >+ detect->result = HW_HAS_PIPELINED_OACONTROL; > >+ detect->reset = true; > >+ return 1; > >+} > >+ > >+static void > >+intel_detect_pipelined_register_access(struct intel_screen *screen) > >+{ > >+ struct detect_pipelined_register regs[2], *r =regs; > >+ > >+ /* Combine the multiple register access validation into a single > >+* round trip through the kernel + GPU. > >+*/ > >+ r += intel_detect_pipelined_so(screen, r); > >+ r += intel_detect_pipelined_oacontrol(screen, r); > > Not a fan of this construct. How about changing the return types of > the detect functions to int?. Do you mean if (intel_detect_pipelined_so(screen, r)) r++; or int index = 0; index += intel_detect_pipelined_so(screen, &r[index]); ? -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/70] i965: Combine the multiple pipelined register detection into one round-trip
On 10/08/15 16:00, Chris Wilson wrote: On Mon, Aug 10, 2015 at 03:17:26PM +0300, Martin Peres wrote: +static int +intel_detect_pipelined_oacontrol(struct intel_screen *screen, + struct detect_pipelined_register *detect) { if (screen->devinfo->gen < 6 || screen->devinfo->gen >= 8) - return false; + return 0; /* Set "Select Context ID" to a particular address (which is likely not a * context), but leave all counting disabled. This should be harmless. */ - return intel_detect_pipelined_register(screen, - OACONTROL, - 0x31337000, - true); + detect->reg = OACONTROL; + detect->expected_value = 0x31337000; + detect->result = HW_HAS_PIPELINED_OACONTROL; + detect->reset = true; + return 1; +} + +static void +intel_detect_pipelined_register_access(struct intel_screen *screen) +{ + struct detect_pipelined_register regs[2], *r =regs; + + /* Combine the multiple register access validation into a single +* round trip through the kernel + GPU. +*/ + r += intel_detect_pipelined_so(screen, r); + r += intel_detect_pipelined_oacontrol(screen, r); Not a fan of this construct. How about changing the return types of the detect functions to int?. Do you mean if (intel_detect_pipelined_so(screen, r)) r++; or int index = 0; index += intel_detect_pipelined_so(screen, &r[index]); ? -Chris Sorry, I meant: static BOOL intel_detect_pipelined_so() --> static INT intel_detect_pipelined_so() Booleans are supposed to be booleans, not integers. The fact that true and false map to 0 and 1 are an implementation detail (which should never ever be different for plenty of reasons, but we are talking about semantics here). ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/70] i965: Combine the multiple pipelined register detection into one round-trip
On Mon, Aug 10, 2015 at 04:08:07PM +0300, Martin Peres wrote: > > > On 10/08/15 16:00, Chris Wilson wrote: > >On Mon, Aug 10, 2015 at 03:17:26PM +0300, Martin Peres wrote: > >>>+static int > >>>+intel_detect_pipelined_oacontrol(struct intel_screen *screen, > >>>+ struct detect_pipelined_register *detect) > >>> { > >>> if (screen->devinfo->gen < 6 || screen->devinfo->gen >= 8) > >>>- return false; > >>>+ return 0; > >>> /* Set "Select Context ID" to a particular address (which is likely > >>> not a > >>> * context), but leave all counting disabled. This should be > >>> harmless. > >>> */ > >>>- return intel_detect_pipelined_register(screen, > >>>- OACONTROL, > >>>- 0x31337000, > >>>- true); > >>>+ detect->reg = OACONTROL; > >>>+ detect->expected_value = 0x31337000; > >>>+ detect->result = HW_HAS_PIPELINED_OACONTROL; > >>>+ detect->reset = true; > >>>+ return 1; > >>>+} > >>>+ > >>>+static void > >>>+intel_detect_pipelined_register_access(struct intel_screen *screen) > >>>+{ > >>>+ struct detect_pipelined_register regs[2], *r =regs; > >>>+ > >>>+ /* Combine the multiple register access validation into a single > >>>+* round trip through the kernel + GPU. > >>>+*/ > >>>+ r += intel_detect_pipelined_so(screen, r); > >>>+ r += intel_detect_pipelined_oacontrol(screen, r); > >>Not a fan of this construct. How about changing the return types of > >>the detect functions to int?. > >Do you mean > > if (intel_detect_pipelined_so(screen, r)) > > r++; > >or > > int index = 0; > > index += intel_detect_pipelined_so(screen, &r[index]); > >? > >-Chris > > Sorry, I meant: > > static BOOL intel_detect_pipelined_so() --> static INT > intel_detect_pipelined_so() > > Booleans are supposed to be booleans, not integers. The fact that > true and false map to 0 and 1 are an implementation detail (which > should never ever be different for plenty of reasons, but we are > talking about semantics here). Sorry, it was meant to be int. I missed that I had left it as bool, hence the confusion. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: Map packed gallium formats to packed mesa formats.
Am 10.08.2015 um 07:00 schrieb Jason Ekstrand: > On Sun, Aug 9, 2015 at 6:46 AM, Roland Scheidegger wrote: >> Am 09.08.2015 um 12:11 schrieb Oded Gabbay: >>> On Sun, Aug 9, 2015 at 2:43 AM, Jason Ekstrand wrote: On Sat, Aug 8, 2015 at 3:14 PM, Oded Gabbay wrote: > On Sun, Aug 9, 2015 at 12:26 AM, Jason Ekstrand > wrote: > > If I understand your patch, instead of using the endian-dependent > defines, you use the LE defines all the time when converting > pipe<-->mesa formats > > I'm not saying its wrong, but these endian-dependent defines, e.g. > PIPE_FORMAT_ARGB_UNORM, are used in more places in the gallium > code than in the above two functions. Why change only here and not in > all places, i.e removing them ? > > Also, this fixes only llvmpipe/softpipe. It doesn't fix sw_rast. That > is still broken The fact that sw_rast is broken bothers me. The sw_rast code doesn't leave core mesa so it shouldn't be affected by any gallium changes. From that perspective, this makes sense. However, sw_rast should be using the auto-generated packing/unpacking functions in core mesa. Did sw_rast work prior to the readpixels changes? It's possible it was already broken. >>> I didn't check sw_rast status before readpixels changes. >>> However, from looking at the automatic pack/unpack functions, it seems >>> strange to me they don't take into account the endianness. >>> e.g on ppc64 machine, in file format_pack.c, there is the following >>> function: >>> >>> static inline void >>> pack_ubyte_a8b8g8r8_unorm(const GLubyte src[4], void *dst) >>> { >>> uint8_t a = _mesa_unorm_to_unorm(src[3], 8, 8); >>> uint8_t b = _mesa_unorm_to_unorm(src[2], 8, 8); >>> uint8_t g = _mesa_unorm_to_unorm(src[1], 8, 8); >>> uint8_t r = _mesa_unorm_to_unorm(src[0], 8, 8); >>> >>> uint32_t d = 0; >>> d |= PACK(a, 0, 8); >>> d |= PACK(b, 8, 8); >>> d |= PACK(g, 16, 8); >>> d |= PACK(r, 24, 8); >>> (*(uint32_t *) dst) = d; >>> } >> >> Aside from the endianness issues, it actually looks like this does some >> totally unnecessary math for 8bit unorm to 8bit unorm "conversion". Odd... > > I *think* that gets optimized away. It probably wouldn't hurt to put > a src_bits == dst_bits case in there though. Some very quick test seems to show you're half-right: llvm (clang) can optimize it away (3.6), whereas gcc (4.8.3) can't. It is not all that bad though, because while it can't optimize away the math it does turn the div into a mul, so the math doesn't look all that expensive... (I was mostly just wondering because it seemed like it could be a very common case.) Roland > >> Roland >> >>> PACK is defined as: >>> #define PACK(SRC, OFFSET, BITS) (((SRC) & MAX_UINT(BITS)) << (OFFSET)) >>> >>> So, we see that 'R' is written to the most-significant byte of the >>> dword and 'A' is written to the least-significant byte of the dword, >>> although in BE, that should be reversed, since this is abgr format, >>> not rgba. >>> >>> It's enough to see that the implementation of this function is the >>> same for LE and BE to suspect the code, unless I'm missing something. >>> >>> > I tested this patch on llvmpipe, combined with the previous patch you > sent on POWER8 machine running ppc64 (BE) > First of all, as I said above, it fixes the piglit sanity test. > Second, it seems that it improves piglit gpu.py results much more than > my patch. "Only" 1368 tests failed, vs. a bit more than 2000 with my > patch. Still a lot higher than the ~600 tests that fail on ppc64le I'm not surprised that it didn't get to zero regressions vs. ppc64le. This patch was just kind of thrown together and I didn't check all of the formats. Also, as per my conversation with rolland, I may be misunderstanding the PIPE_FORMAT enums. Really, this is a problem that is going to have to be chased by someone other than me. I have no BE hardware, so I can't really chase it down. I I also have very limited gallium knowledge so I'm probably not the best person to start chasing it through gallium. All I've done so far is throw darts at the wall. I'm sure this patch is incomplete and, to be honest, I didn't even compile-test it as I don't usually build gallium. What i do know is that, for the most part, the core mesa code is consistent with itself as far as endianness goes. >>> >>> So I'm not married to my patch, and your 2 patches seem to do a better >>> job (although sw-rast is still broken, but that's another issue and >>> frankly, it is of less importance for me atm), and I accept your >>> arguments for your solution. I would suggest that you upstream your >>> patches as a start and I will continue from there, based on my >>> schedule. >>> If you want to see what a given format means in a more explicit fashion, just look at the auto-ge
Re: [Mesa-dev] [PATCH] mesa/format_utils: Add src_bits == dst_bits cases to [us]norm functions
Am 10.08.2015 um 07:03 schrieb Jason Ekstrand: > This better ensures that the src_bits == dst_bits case gets optimized away. > --- > src/mesa/main/format_utils.h | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/main/format_utils.h b/src/mesa/main/format_utils.h > index 00ec777..65314e6 100644 > --- a/src/mesa/main/format_utils.h > +++ b/src/mesa/main/format_utils.h > @@ -99,7 +99,7 @@ _mesa_unorm_to_unorm(unsigned x, unsigned src_bits, > unsigned dst_bits) > { > if (src_bits < dst_bits) { >return EXTEND_NORMALIZED_INT(x, src_bits, dst_bits); > - } else { > + } else if (src_bits > dst_bits) { >unsigned src_half = (1 << (src_bits - 1)) - 1; > >if (src_bits + dst_bits > sizeof(x) * 8) { > @@ -109,6 +109,8 @@ _mesa_unorm_to_unorm(unsigned x, unsigned src_bits, > unsigned dst_bits) >} else { > return (x * MAX_UINT(dst_bits) + src_half) / MAX_UINT(src_bits); >} > + } else { > + return x; > } > } > > @@ -151,8 +153,10 @@ _mesa_snorm_to_snorm(int x, unsigned src_bits, unsigned > dst_bits) >return -MAX_INT(dst_bits); > else if (src_bits < dst_bits) >return EXTEND_NORMALIZED_INT(x, src_bits - 1, dst_bits - 1); > - else > + else if (src_bits > dst_bits) >return x >> (src_bits - dst_bits); > + else > + return x; > } > > static inline unsigned > Reviewed-by: Roland Scheidegger (I don't care either way for the snorm case). Roland ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91591] rounding.h:102:2: error: #error "Unsupported or undefined LONG_BIT"
https://bugs.freedesktop.org/show_bug.cgi?id=91591 --- Comment #8 from Roland Scheidegger --- What a mess... I found some reference that for c++ you need to define __STDC_LIMIT_MACROS before including to make it work. Dunno though if that's the problem here. Also, do we need to include ? Otherwise I'd just say we should ditch the longs. Not worth the trouble for something every caller just converts to an int anyway (even if it means that indeed _mesa_lroundevenf() wouldn't quite be lrintf equivalent, but who cares). -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 22/70] i965: Add dword aliases to bitfield structs
On 07/08/15 23:13, Chris Wilson wrote: When processing the packed fields, it often much easier to pass around verb missing after "it"? the dword value (as would be seen by hardware) than it is manipulating the bitfield. By aliasing the bitfield with a uint32_t member, we can treat the value as either a collection of bits or a single value depending upon the situation. Signed-off-by: Chris Wilson --- src/mesa/drivers/dri/i965/brw_structs.h | 33 +++-- 1 file changed, 19 insertions(+), 14 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_structs.h b/src/mesa/drivers/dri/i965/brw_structs.h index 55338c0..b09d4ba 100644 --- a/src/mesa/drivers/dri/i965/brw_structs.h +++ b/src/mesa/drivers/dri/i965/brw_structs.h @@ -391,13 +391,16 @@ struct brw_sf_unit_state unsigned pad3:1; } thread4; - struct + union { - unsigned front_winding:1; - unsigned viewport_transform:1; - unsigned pad0:3; - unsigned sf_viewport_state_offset:27; /* Offset from GENERAL_STATE_BASE */ - } sf5; + struct { + unsigned front_winding:1; + unsigned viewport_transform:1; + unsigned pad0:3; + unsigned sf_viewport_state_offset:27; /* Offset from GENERAL_STATE_BASE */ + } sf5; + uint32_t dw5; + }; struct { @@ -525,15 +528,17 @@ struct brw_wm_unit_state struct thread2 thread2; struct thread3 thread3; - struct { - unsigned stats_enable:1; - unsigned depth_buffer_clear:1; - unsigned sampler_count:3; - unsigned sampler_state_pointer:27; - } wm4; + union { + struct { + unsigned stats_enable:1; + unsigned depth_buffer_clear:1; + unsigned sampler_count:3; + unsigned sampler_state_pointer:27; + } wm4; + uint32_t dw4; + }; - struct - { + struct { unsigned enable_8_pix:1; unsigned enable_16_pix:1; unsigned enable_32_pix:1; Patches 14-22 are: Reviewed-by: Martin Peres ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91591] rounding.h:102:2: error: #error "Unsupported or undefined LONG_BIT"
https://bugs.freedesktop.org/show_bug.cgi?id=91591 --- Comment #9 from Jose Fonseca --- (In reply to Roland Scheidegger from comment #8) > I found some reference that for c++ you need to define __STDC_LIMIT_MACROS > before including to make it work. Dunno though if that's the > problem here. We unconditionally define __STDC_LIMIT_MACROS on scons builds. But we don't on autotools. LLVM requires that define, so build with LLVM will pick it up anyawy. Anwyay, roundeven_test.c is a C file, so I don't believe that's the problem. -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/skl: Remove early platform support
On Aug 10, 2015 4:14 AM, "Neil Roberts" wrote: > > If we go with this patch perhaps it would be good to remove > supports_simd16_3src entirely from brw_device_info and any code that is > referring to it in order to avoid carrying around useless code. > Currently it seems like it would be quite easy to add a new > brw_device_info and forget to add supports_simd16_3src and never notice > that it is redundantly using the fallback. I would second that. > This would be sort of a pain for me because my Skylake is still one that > needs this workaround but I guess I can just upgrade easily enough. > > Regards, > - Neil > > Ben Widawsky writes: > > > We do not want bug reports from this early stepping of SKL. Few if any were ever > > shipped outside of Intel to early enabling partners, and none will be sold. > > > > There is a functional change here. If you're using new mesa on an old > > kernel/libdrm, the revid will be -1, and we'll use new SKL values instead of > > early ones (a hopefully irrelevant improvement IMO). > > > > Cc: Jason Ekstrand > > Cc: Neil Roberts > > Signed-off-by: Ben Widawsky > > --- > > src/mesa/drivers/dri/i965/brw_device_info.c | 15 +++ > > 1 file changed, 7 insertions(+), 8 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c > > index be517e8..fc89221 100644 > > --- a/src/mesa/drivers/dri/i965/brw_device_info.c > > +++ b/src/mesa/drivers/dri/i965/brw_device_info.c > > @@ -322,10 +322,7 @@ static const struct brw_device_info brw_device_info_chv = { > >.max_gs_entries = 640,\ > > } > > > > -static const struct brw_device_info brw_device_info_skl_early = { > > - GEN9_FEATURES, .gt = 1, > > - .supports_simd16_3src = false, > > -}; > > +#define IS_SKL(devinfo) ((devinfo)->gen == 9 && !(devinfo)->is_broxton) > > > > static const struct brw_device_info brw_device_info_skl_gt1 = { > > GEN9_FEATURES, .gt = 1, > > @@ -376,10 +373,12 @@ brw_get_device_info(int devid, int revision) > >return NULL; > > } > > > > - if (devinfo->gen == 9 && > > - !devinfo->is_broxton && > > - (revision == 2 || revision == 3 || revision == -1)) > > - return &brw_device_info_skl_early; > > + if (IS_SKL(devinfo) && (revision != -1 && revision <= 3)) { > > + fprintf(stderr, > > + "i965_dri.so does not support this PCI ID with revision %d.\n", > > + revision); > > + return NULL; > > + } > > > > return devinfo; > > } > > -- > > 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91591] rounding.h:102:2: error: #error "Unsupported or undefined LONG_BIT"
https://bugs.freedesktop.org/show_bug.cgi?id=91591 --- Comment #10 from Roland Scheidegger --- (In reply to Jose Fonseca from comment #9) > Anwyay, roundeven_test.c is a C file, so I don't believe that's the problem. Well the last build failure showed a failure with glsl_types.cpp - only the previous failure was with roundeven_test.c -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] gallium: add GREMEDY_string_marker
From: Rob Clark Signed-off-by: Rob Clark --- src/gallium/docs/source/screen.rst | 1 + src/gallium/drivers/freedreno/freedreno_screen.c | 1 + src/gallium/drivers/i915/i915_screen.c | 1 + src/gallium/drivers/ilo/ilo_screen.c | 1 + src/gallium/drivers/llvmpipe/lp_screen.c | 1 + src/gallium/drivers/nouveau/nv30/nv30_screen.c | 1 + src/gallium/drivers/nouveau/nv50/nv50_screen.c | 1 + src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 1 + src/gallium/drivers/r300/r300_screen.c | 1 + src/gallium/drivers/r600/r600_pipe.c | 1 + src/gallium/drivers/radeonsi/si_pipe.c | 1 + src/gallium/drivers/softpipe/sp_screen.c | 1 + src/gallium/drivers/svga/svga_screen.c | 1 + src/gallium/drivers/vc4/vc4_screen.c | 3 ++- src/gallium/include/pipe/p_context.h | 7 +++ src/gallium/include/pipe/p_defines.h | 1 + src/mesa/state_tracker/st_context.c | 9 + src/mesa/state_tracker/st_extensions.c | 1 + 18 files changed, 33 insertions(+), 1 deletion(-) diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst index dbdccc7..bd451bd 100644 --- a/src/gallium/docs/source/screen.rst +++ b/src/gallium/docs/source/screen.rst @@ -258,6 +258,7 @@ The integer capabilities: How many per-patch outputs and inputs are supported between tessellation control and tessellation evaluation shaders, not counting in TESSINNER and TESSOUTER. The minimum allowed value for OpenGL is 30. +* ``PIPE_CAP_STRING_MARKER``: Whether pipe->emit_string_marker() is supported. .. _pipe_capf: diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c b/src/gallium/drivers/freedreno/freedreno_screen.c index 417d7c6..fbbab2a 100644 --- a/src/gallium/drivers/freedreno/freedreno_screen.c +++ b/src/gallium/drivers/freedreno/freedreno_screen.c @@ -170,6 +170,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS: case PIPE_CAP_START_INSTANCE: case PIPE_CAP_COMPUTE: + case PIPE_CAP_STRING_MARKER: return 0; case PIPE_CAP_SM3: diff --git a/src/gallium/drivers/i915/i915_screen.c b/src/gallium/drivers/i915/i915_screen.c index 6083687..18f8bdf 100644 --- a/src/gallium/drivers/i915/i915_screen.c +++ b/src/gallium/drivers/i915/i915_screen.c @@ -244,6 +244,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap cap) case PIPE_CAP_RESOURCE_FROM_USER_MEMORY: case PIPE_CAP_DEVICE_RESET_STATUS_QUERY: case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS: + case PIPE_CAP_STRING_MARKER: return 0; case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS: diff --git a/src/gallium/drivers/ilo/ilo_screen.c b/src/gallium/drivers/ilo/ilo_screen.c index 338643e..533659b 100644 --- a/src/gallium/drivers/ilo/ilo_screen.c +++ b/src/gallium/drivers/ilo/ilo_screen.c @@ -466,6 +466,7 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap param) case PIPE_CAP_RESOURCE_FROM_USER_MEMORY: case PIPE_CAP_DEVICE_RESET_STATUS_QUERY: case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS: + case PIPE_CAP_STRING_MARKER: return 0; case PIPE_CAP_VENDOR_ID: diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c index 1c6c82e..336b71d 100644 --- a/src/gallium/drivers/llvmpipe/lp_screen.c +++ b/src/gallium/drivers/llvmpipe/lp_screen.c @@ -293,6 +293,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param) case PIPE_CAP_RESOURCE_FROM_USER_MEMORY: case PIPE_CAP_DEVICE_RESET_STATUS_QUERY: case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS: + case PIPE_CAP_STRING_MARKER: return 0; } /* should only get here on unhandled cases */ diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c b/src/gallium/drivers/nouveau/nv30/nv30_screen.c index 97cf058..84367a0 100644 --- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c +++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c @@ -164,6 +164,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_RESOURCE_FROM_USER_MEMORY: case PIPE_CAP_DEVICE_RESET_STATUS_QUERY: case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS: + case PIPE_CAP_STRING_MARKER: return 0; case PIPE_CAP_VENDOR_ID: diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c index d869544..983b5b5 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c @@ -211,6 +211,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_RESOURCE_FROM_USER_MEMORY: case PIPE_CAP_DEVICE_RESET_STATUS_QUERY: case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS: + case PIPE_CAP_STRING_MARKER: return 0; case PIPE_CAP_VENDOR_ID: diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen
[Mesa-dev] [PATCH 0/3] GREMEDY_string_marker support
From: Rob Clark Punch it through mesa and gallium, plus implementation in freedreno to serve as an example. Radeon should be able to do a similar thing with no-op packets. Not sure about other drivers. Plan to use this extension from apitrace, to emit call and frame #'s for draw commands, so I can find the corresponding draw from cmdstream dump (ie. to debug lockups, incorrect rendering, etc). Rob Clark (3): mesa: add GREMEDY_string_marker gallium: add GREMEDY_string_marker freedreno: implement emit_string_marker src/gallium/docs/source/screen.rst| 1 + src/gallium/drivers/freedreno/freedreno_context.c | 27 +++ src/gallium/drivers/freedreno/freedreno_screen.c | 1 + src/gallium/drivers/i915/i915_screen.c| 1 + src/gallium/drivers/ilo/ilo_screen.c | 1 + src/gallium/drivers/llvmpipe/lp_screen.c | 1 + src/gallium/drivers/nouveau/nv30/nv30_screen.c| 1 + src/gallium/drivers/nouveau/nv50/nv50_screen.c| 1 + src/gallium/drivers/nouveau/nvc0/nvc0_screen.c| 1 + src/gallium/drivers/r300/r300_screen.c| 1 + src/gallium/drivers/r600/r600_pipe.c | 1 + src/gallium/drivers/radeonsi/si_pipe.c| 1 + src/gallium/drivers/softpipe/sp_screen.c | 1 + src/gallium/drivers/svga/svga_screen.c| 1 + src/gallium/drivers/vc4/vc4_screen.c | 3 ++- src/gallium/include/pipe/p_context.h | 7 ++ src/gallium/include/pipe/p_defines.h | 1 + src/mapi/glapi/gen/GREMEDY_string_marker.xml | 18 +++ src/mapi/glapi/gen/Makefile.am| 1 + src/mapi/glapi/gen/gl_API.xml | 2 ++ src/mesa/main/dd.h| 6 + src/mesa/main/debug.c | 14 src/mesa/main/debug.h | 3 +++ src/mesa/main/extensions.c| 1 + src/mesa/main/mtypes.h| 1 + src/mesa/state_tracker/st_context.c | 9 src/mesa/state_tracker/st_extensions.c| 1 + 27 files changed, 106 insertions(+), 1 deletion(-) create mode 100644 src/mapi/glapi/gen/GREMEDY_string_marker.xml -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] mesa: add GREMEDY_string_marker
From: Rob Clark Signed-off-by: Rob Clark --- src/mapi/glapi/gen/GREMEDY_string_marker.xml | 18 ++ src/mapi/glapi/gen/Makefile.am | 1 + src/mapi/glapi/gen/gl_API.xml| 2 ++ src/mesa/main/dd.h | 6 ++ src/mesa/main/debug.c| 14 ++ src/mesa/main/debug.h| 3 +++ src/mesa/main/extensions.c | 1 + src/mesa/main/mtypes.h | 1 + 8 files changed, 46 insertions(+) create mode 100644 src/mapi/glapi/gen/GREMEDY_string_marker.xml diff --git a/src/mapi/glapi/gen/GREMEDY_string_marker.xml b/src/mapi/glapi/gen/GREMEDY_string_marker.xml new file mode 100644 index 000..ffa3eac --- /dev/null +++ b/src/mapi/glapi/gen/GREMEDY_string_marker.xml @@ -0,0 +1,18 @@ + + + + + + + + + + + + + + + + + + diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am index 7d9d1a6..405b71f 100644 --- a/src/mapi/glapi/gen/Makefile.am +++ b/src/mapi/glapi/gen/Makefile.am @@ -186,6 +186,7 @@ API_XML = \ EXT_texture_array.xml \ EXT_texture_integer.xml \ EXT_transform_feedback.xml \ + GREMEDY_string_marker.xml \ INTEL_performance_query.xml \ KHR_debug.xml \ KHR_context_flush_control.xml \ diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 658efa4..aea08bd 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -12613,6 +12613,8 @@ http://www.w3.org/2001/XInclude"/> +http://www.w3.org/2001/XInclude"/> + http://www.w3.org/2001/XInclude"/> http://www.w3.org/2001/XInclude"/> diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h index 87eb63e..1fc4ca4 100644 --- a/src/mesa/main/dd.h +++ b/src/mesa/main/dd.h @@ -787,6 +787,12 @@ struct dd_function_table { void (*UseProgram)(struct gl_context *ctx, struct gl_shader_program *shProg); /*@}*/ + /** +* \name GREMEDY debug/marker functions +*/ + /*@{*/ + void (*EmitStringMarker)(struct gl_context *ctx, const GLchar *string, GLsizei len); + /*@}*/ /** * \name Support for multiple T&L engines diff --git a/src/mesa/main/debug.c b/src/mesa/main/debug.c index 5ca7d5c..a640c0d 100644 --- a/src/mesa/main/debug.c +++ b/src/mesa/main/debug.c @@ -655,3 +655,17 @@ _mesa_print_texture(struct gl_context *ctx, struct gl_texture_image *img) ctx->Driver.UnmapTextureImage(ctx, img, slice); } + +void GLAPIENTRY +_mesa_StringMarkerGREMEDY(GLsizei len, const GLvoid * string) +{ + GET_CURRENT_CONTEXT(ctx); + if (ctx->Driver.EmitStringMarker) { + /* if length not specified, string will be null terminated: */ + if (len == 0) + len = strlen(string); + ctx->Driver.EmitStringMarker(ctx, string, len); + } else { + _mesa_warning(ctx, "StringMarkerGREMEDY unsupported"); + } +} diff --git a/src/mesa/main/debug.h b/src/mesa/main/debug.h index 902f595..a18ef8a 100644 --- a/src/mesa/main/debug.h +++ b/src/mesa/main/debug.h @@ -74,4 +74,7 @@ _mesa_dump_image(const char *filename, const void *image, GLuint w, GLuint h, extern void _mesa_print_texture(struct gl_context *ctx, struct gl_texture_image *img); +void GLAPIENTRY +_mesa_StringMarkerGREMEDY(GLsizei len, const GLvoid * string); + #endif diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index d934d19..7181e18 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -364,6 +364,7 @@ static const struct extension extension_table[] = { { "GL_ATI_texture_env_combine3", o(ATI_texture_env_combine3),GLL,2002 }, { "GL_ATI_texture_float", o(ARB_texture_float), GL, 2002 }, { "GL_ATI_texture_mirror_once", o(ATI_texture_mirror_once), GL, 2006 }, + { "GL_GREMEDY_string_marker", o(GREMEDY_string_marker), GL, 2007 }, { "GL_IBM_multimode_draw_arrays", o(dummy_true), GL, 1998 }, { "GL_IBM_rasterpos_clip", o(dummy_true), GLL,1996 }, { "GL_IBM_texture_mirrored_repeat", o(dummy_true), GLL,1998 }, diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 4e00fb6..383074a 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -3953,6 +3953,7 @@ struct gl_extensions GLboolean ATI_texture_env_combine3; GLboolean ATI_fragment_shader; GLboolean ATI_separate_stencil; + GLboolean GREMEDY_string_marker; GLboolean INTEL_performance_query; GLboolean MESA_pack_invert; GLboolean MESA_ycbcr_texture; -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http
[Mesa-dev] [PATCH 3/3] freedreno: implement emit_string_marker
From: Rob Clark Writes string to cmdstream in payload of a no-op packet. Signed-off-by: Rob Clark --- src/gallium/drivers/freedreno/freedreno_context.c | 27 +++ src/gallium/drivers/freedreno/freedreno_screen.c | 2 +- 2 files changed, 28 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/freedreno/freedreno_context.c b/src/gallium/drivers/freedreno/freedreno_context.c index 8e6d431..9d8316e 100644 --- a/src/gallium/drivers/freedreno/freedreno_context.c +++ b/src/gallium/drivers/freedreno/freedreno_context.c @@ -139,6 +139,32 @@ fd_context_flush(struct pipe_context *pctx, struct pipe_fence_handle **fence, } } +/** + * emit marker string as payload of a no-op packet, which can be + * decoded by cffdump. + */ +static void +fd_emit_string_marker(struct pipe_context *pctx, const char *string, int len) +{ + struct fd_context *ctx = fd_context(pctx); + struct fd_ringbuffer *ring = ctx->ring; + const uint32_t *buf = (const void *)string; + + OUT_PKT3(ring, CP_NOP, align(len, 4) / 4); + while (len >= 4) { + OUT_RING(ring, *buf); + buf++; + len -= 4; + } + + /* copy remainder bytes without reading past end of input string: */ + if (len > 0) { + uint32_t w = 0; + memcpy(&w, buf, len); + OUT_RING(ring, w); + } +} + void fd_context_destroy(struct pipe_context *pctx) { @@ -205,6 +231,7 @@ fd_context_init(struct fd_context *ctx, struct pipe_screen *pscreen, pctx->screen = pscreen; pctx->priv = priv; pctx->flush = fd_context_flush; + pctx->emit_string_marker = fd_emit_string_marker; for (i = 0; i < ARRAY_SIZE(ctx->rings); i++) { ctx->rings[i] = fd_ringbuffer_new(screen->pipe, 0x10); diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c b/src/gallium/drivers/freedreno/freedreno_screen.c index fbbab2a..519dad4 100644 --- a/src/gallium/drivers/freedreno/freedreno_screen.c +++ b/src/gallium/drivers/freedreno/freedreno_screen.c @@ -154,6 +154,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_USER_CONSTANT_BUFFERS: case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT: case PIPE_CAP_VERTEXID_NOBASE: + case PIPE_CAP_STRING_MARKER: return 1; case PIPE_CAP_SHADER_STENCIL_EXPORT: @@ -170,7 +171,6 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS: case PIPE_CAP_START_INSTANCE: case PIPE_CAP_COMPUTE: - case PIPE_CAP_STRING_MARKER: return 0; case PIPE_CAP_SM3: -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 06/13] mesa: Replace sampler object locks with atomic inc/dec.
On Fri, Aug 7, 2015 at 10:09 AM, Ian Romanick wrote: > I know we've talked about this about 100 times, but something in the > back of my mind tells me that we have a pre-existing race. What happens > if the p_atomic_dec_zero happens on thread A while thread B is between > the _mesa_lookup_renderbuffer call and the _mesa_reference_renderbuffer > call on the same object? Won't thread A free the memory out from under > thread B? Generally: RefCount is initialized to 1 (in all cases but renderbuffers... that might be a bug -- did you choose renderbuffers in your example for this reason?). When Bind is called the first time, RefCount is incremented to 2. When Delete is called, RefCount is decremented -- RefCount only goes to 0 only if the object isn't bound anywhere, in which case it's actually deleted. If The object is still bound somewhere and RefCount is decremented to non-zero, the object cannot be rebound and will be deleted when its last reference is unbound. Commit 42aaa548 changed the renderbuffer code from RefCount = 1 to = 0. It seems pretty clearly wrong to me, since a glGenRenderbuffers()/glDeleteRenderBuffers() pair would fail to actually delete the renderbuffers. Does anyone else read it differently? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91591] rounding.h:102:2: error: #error "Unsupported or undefined LONG_BIT"
https://bugs.freedesktop.org/show_bug.cgi?id=91591 --- Comment #11 from Matt Turner --- Created attachment 117615 --> https://bugs.freedesktop.org/attachment.cgi?id=117615&action=edit patch Try this -- defines __STDC_LIMIT_MACROS before including limits.h. -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] Add mesa.icd to the .gitignore
Since 4d7e0fa8c731776 this file is generated by the configure script. --- src/gallium/targets/opencl/.gitignore | 1 + 1 file changed, 1 insertion(+) create mode 100644 src/gallium/targets/opencl/.gitignore diff --git a/src/gallium/targets/opencl/.gitignore b/src/gallium/targets/opencl/.gitignore new file mode 100644 index 000..dad573f --- /dev/null +++ b/src/gallium/targets/opencl/.gitignore @@ -0,0 +1 @@ +/mesa.icd -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/skl: Remove early platform support
On Mon, Aug 10, 2015 at 09:09:41AM -0700, Jason Ekstrand wrote: > On Aug 10, 2015 4:14 AM, "Neil Roberts" wrote: > > > > If we go with this patch perhaps it would be good to remove > > supports_simd16_3src entirely from brw_device_info and any code that is > > referring to it in order to avoid carrying around useless code. > > Currently it seems like it would be quite easy to add a new > > brw_device_info and forget to add supports_simd16_3src and never notice > > that it is redundantly using the fallback. > > I would second that. > Either of you opposed to doing it as a follow-on patch? (This failure you speak of was the primary goal for removing it). Initially, I tried to remove it, but we do still seem to use this logic pre-HSW, and so things got messier than I initially anticipated. Neil, you said "if we go with this patch" - I haven't heard any opposition, do you have any? [snip] ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] mesa: expose dimension check for glTex*Storage functions
On 08/10/2015 02:06 AM, Tapani Pälli wrote: This is done so that following patch can use it to verify dimenstions for multisample variants of glTex*Storage. Signed-off-by: Tapani Pälli --- src/mesa/main/texstorage.c | 22 +- src/mesa/main/texstorage.h | 3 +++ 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/texstorage.c b/src/mesa/main/texstorage.c index 4a2cc60..9a321cc 100644 --- a/src/mesa/main/texstorage.c +++ b/src/mesa/main/texstorage.c @@ -266,6 +266,26 @@ _mesa_AllocTextureStorage_sw(struct gl_context *ctx, return GL_TRUE; } +/** + * Texture width, height and depth check shared with the + * multisample variants of TexStorage functions. + * + * From OpenGL 4.5 Core spec, page 260 (section 8.19) + * + * "An INVALID_VALUE error is generated if width, height, depth + * or levels are less than 1, for commands with the corresponding + * parameters." + * + * (referring to TextureStorage* commands, these also match values + * specified for OpenGL ES 3.1.) + */ +GLboolean s/GLboolean/bool/ +_mesa_tex_storage_invalid_dim(GLsizei width, GLsizei height, GLsizei depth) +{ + if (width < 1 || height < 1 || depth < 1) + return true; + return false; +} /** * Do error checking for calls to glTexStorage1/2/3D(). @@ -287,7 +307,7 @@ tex_storage_error_check(struct gl_context *ctx, * order to allow meta functions to use legacy formats. */ /* size check */ - if (width < 1 || height < 1 || depth < 1) { + if (_mesa_tex_storage_invalid_dim(width, height, depth)) { _mesa_error(ctx, GL_INVALID_VALUE, "glTex%sStorage%uD(width, height or depth < 1)", suffix, dims); diff --git a/src/mesa/main/texstorage.h b/src/mesa/main/texstorage.h index 6f5495f..62108a8 100644 --- a/src/mesa/main/texstorage.h +++ b/src/mesa/main/texstorage.h @@ -98,4 +98,7 @@ _mesa_AllocTextureStorage_sw(struct gl_context *ctx, GLsizei levels, GLsizei width, GLsizei height, GLsizei depth); +extern GLboolean +_mesa_tex_storage_invalid_dim(GLsizei width, GLsizei height, GLsizei depth); + #endif /* TEXSTORAGE_H */ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: add NV_read_{depth, stencil, depth_stencil} extensions
On Mon, Aug 10, 2015 at 1:02 PM, Rob Clark wrote: > From: Rob Clark > > These extensions allow reading depth/stencil for GLES contexts, which is > useful for tools like apitrace. > > Signed-off-by: Rob Clark > --- > I have a patch, which I will send out after some cleanup, that makes > apitrace able to dump depth/stencil buffers with GLES, thanks to this > extension. > > src/mesa/main/extensions.c | 3 +++ > src/mesa/main/readpix.c| 25 +++-- > 2 files changed, 22 insertions(+), 6 deletions(-) > > diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c > index 2dbfabd..d934d19 100644 > --- a/src/mesa/main/extensions.c > +++ b/src/mesa/main/extensions.c > @@ -385,6 +385,9 @@ static const struct extension extension_table[] = { > { "GL_NV_point_sprite", o(NV_point_sprite), > GL, 2001 }, > { "GL_NV_primitive_restart",o(NV_primitive_restart), > GLL,2002 }, > { "GL_NV_read_buffer", o(dummy_true), > ES2,2011 }, > + { "GL_NV_read_depth", o(dummy_true), > ES2,2011 }, > + { "GL_NV_read_depth_stencil", o(dummy_true), > ES2,2011 }, > + { "GL_NV_read_stencil", o(dummy_true), > ES2,2011 }, > { "GL_NV_texgen_reflection",o(dummy_true), > GLL,1999 }, > { "GL_NV_texture_barrier", o(NV_texture_barrier), > GL, 2009 }, > { "GL_NV_texture_env_combine4", > o(NV_texture_env_combine4), GLL,1999 }, > diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c > index 2744232..65751aa 100644 > --- a/src/mesa/main/readpix.c > +++ b/src/mesa/main/readpix.c > @@ -917,7 +917,9 @@ read_pixels_es3_error_check(GLenum format, GLenum type, > GLboolean is_unsigned_int = GL_FALSE; > GLboolean is_signed_int = GL_FALSE; > > - if (!_mesa_is_color_format(internalFormat)) { > + /* TODO just drop the check? Are there any formats to filter out? */ Yes, please drop it. > + if (!(_mesa_is_color_format(internalFormat) || > + _mesa_is_depth_or_stencil_format(internalFormat))) { >return GL_INVALID_OPERATION; > } > > @@ -950,6 +952,22 @@ read_pixels_es3_error_check(GLenum format, GLenum type, >(is_unsigned_int && type == GL_UNSIGNED_INT)) > return GL_NO_ERROR; >break; > + case GL_DEPTH_STENCIL: > + if ((internalFormat == GL_DEPTH24_STENCIL8) && > + (type == GL_UNSIGNED_INT_24_8)) > + return GL_NO_ERROR; > + if ((internalFormat == GL_DEPTH32F_STENCIL8) && > + (type == GL_FLOAT_32_UNSIGNED_INT_24_8_REV)) > + return GL_NO_ERROR; > + break; > + case GL_DEPTH_COMPONENT: > + if ((internalFormat == GL_DEPTH_COMPONENT32F) && > + (type == GL_FLOAT)) > + return GL_NO_ERROR; > + if ((internalFormat == GL_DEPTH_COMPONENT24) && > + (type == GL_UNSIGNED_INT)) > + return GL_NO_ERROR; > + break; What about GL_STENCIL_INDEX? Mesa does support GL_OES_stencil8. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/dri: Use packed RGB formats
Reviewed-by: Marek Olšák Marek On Mon, Aug 10, 2015 at 11:44 AM, Michel Dänzer wrote: > From: Michel Dänzer > > Fixes Gallium based DRI drivers failing to load on big endian hosts > because they can't find any matching fbconfigs. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789 > Signed-off-by: Michel Dänzer > --- > src/gallium/state_trackers/dri/dri2.c | 26 +- > src/gallium/state_trackers/dri/dri_drawable.c | 8 > 2 files changed, 17 insertions(+), 17 deletions(-) > > diff --git a/src/gallium/state_trackers/dri/dri2.c > b/src/gallium/state_trackers/dri/dri2.c > index 91b4431..fae100e 100644 > --- a/src/gallium/state_trackers/dri/dri2.c > +++ b/src/gallium/state_trackers/dri/dri2.c > @@ -188,10 +188,10 @@ dri2_drawable_get_buffers(struct dri_drawable *drawable, > * may occur as the stvis->color_format. > */ >switch(format) { > - case PIPE_FORMAT_B8G8R8A8_UNORM: > + case PIPE_FORMAT_BGRA_UNORM: > depth = 32; > break; > - case PIPE_FORMAT_B8G8R8X8_UNORM: > + case PIPE_FORMAT_BGRX_UNORM: > depth = 24; > break; >case PIPE_FORMAT_B5G6R5_UNORM: > @@ -261,13 +261,13 @@ dri_image_drawable_get_buffers(struct dri_drawable > *drawable, >case PIPE_FORMAT_B5G6R5_UNORM: > image_format = __DRI_IMAGE_FORMAT_RGB565; > break; > - case PIPE_FORMAT_B8G8R8X8_UNORM: > + case PIPE_FORMAT_BGRX_UNORM: > image_format = __DRI_IMAGE_FORMAT_XRGB; > break; > - case PIPE_FORMAT_B8G8R8A8_UNORM: > + case PIPE_FORMAT_BGRA_UNORM: > image_format = __DRI_IMAGE_FORMAT_ARGB; > break; > - case PIPE_FORMAT_R8G8B8A8_UNORM: > + case PIPE_FORMAT_RGBA_UNORM: > image_format = __DRI_IMAGE_FORMAT_ABGR; > break; >default: > @@ -314,10 +314,10 @@ dri2_allocate_buffer(__DRIscreen *sPriv, > > switch (format) { >case 32: > - pf = PIPE_FORMAT_B8G8R8A8_UNORM; > + pf = PIPE_FORMAT_BGRA_UNORM; > break; >case 24: > - pf = PIPE_FORMAT_B8G8R8X8_UNORM; > + pf = PIPE_FORMAT_BGRX_UNORM; > break; >case 16: > pf = PIPE_FORMAT_Z16_UNORM; > @@ -724,13 +724,13 @@ dri2_create_image_from_winsys(__DRIscreen *_screen, >pf = PIPE_FORMAT_B5G6R5_UNORM; >break; > case __DRI_IMAGE_FORMAT_XRGB: > - pf = PIPE_FORMAT_B8G8R8X8_UNORM; > + pf = PIPE_FORMAT_BGRX_UNORM; >break; > case __DRI_IMAGE_FORMAT_ARGB: > - pf = PIPE_FORMAT_B8G8R8A8_UNORM; > + pf = PIPE_FORMAT_BGRA_UNORM; >break; > case __DRI_IMAGE_FORMAT_ABGR: > - pf = PIPE_FORMAT_R8G8B8A8_UNORM; > + pf = PIPE_FORMAT_RGBA_UNORM; >break; > default: >pf = PIPE_FORMAT_NONE; > @@ -845,13 +845,13 @@ dri2_create_image(__DRIscreen *_screen, >pf = PIPE_FORMAT_B5G6R5_UNORM; >break; > case __DRI_IMAGE_FORMAT_XRGB: > - pf = PIPE_FORMAT_B8G8R8X8_UNORM; > + pf = PIPE_FORMAT_BGRX_UNORM; >break; > case __DRI_IMAGE_FORMAT_ARGB: > - pf = PIPE_FORMAT_B8G8R8A8_UNORM; > + pf = PIPE_FORMAT_BGRA_UNORM; >break; > case __DRI_IMAGE_FORMAT_ABGR: > - pf = PIPE_FORMAT_R8G8B8A8_UNORM; > + pf = PIPE_FORMAT_RGBA_UNORM; >break; > default: >pf = PIPE_FORMAT_NONE; > diff --git a/src/gallium/state_trackers/dri/dri_drawable.c > b/src/gallium/state_trackers/dri/dri_drawable.c > index 0d2929a..f0cc4a2 100644 > --- a/src/gallium/state_trackers/dri/dri_drawable.c > +++ b/src/gallium/state_trackers/dri/dri_drawable.c > @@ -231,11 +231,11 @@ dri_set_tex_buffer2(__DRIcontext *pDRICtx, GLint target, >if (format == __DRI_TEXTURE_FORMAT_RGB) { > /* only need to cover the formats recognized by dri_fill_st_visual > */ > switch (internal_format) { > - case PIPE_FORMAT_B8G8R8A8_UNORM: > -internal_format = PIPE_FORMAT_B8G8R8X8_UNORM; > + case PIPE_FORMAT_BGRA_UNORM: > +internal_format = PIPE_FORMAT_BGRX_UNORM; > break; > - case PIPE_FORMAT_A8R8G8B8_UNORM: > -internal_format = PIPE_FORMAT_X8R8G8B8_UNORM; > + case PIPE_FORMAT_ARGB_UNORM: > +internal_format = PIPE_FORMAT_XRGB_UNORM; > break; > default: > break; > -- > 2.5.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Don't check for draw-time errors that cannot occur in core profile
From: Ian Romanick On many CPU-limited applications, this is *the* hot path. The idea is to generate per-API versions of brw_draw_prims that elide some checks. This patch removes render-mode and "is everything in VBOs" checks from core-profile contexts. On my IVB laptop (which may have experienced thermal throttling): Gl32Batch7: 3.70955% +/- 1.11344% OglBatch7: 1.04398% +/- 0.772788% These are the same benchmark, but Gl32Batch7 uses an OpenGL 3.2 Core Profile context. v2: Reorder parameters to brw_try_draw_prims to reduce data shuffling. v3: Pass a gl_api into draw_prims instead of a must-be-core-profile flag. This will make it easier to expand to other profiles later. v4: Make brw_draw_prims_generic be a dispatcher. This way we always use the correct per-API version. This should reduce cache pollution when brw_draw_prims_core is used, but it didn't seem to affect performance one way or the other on my IVB. Signed-off-by: Ian Romanick --- src/mesa/drivers/dri/i965/brw_draw.c | 148 +- src/mesa/drivers/dri/i965/brw_draw.h | 54 ++-- src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 6 +- src/mesa/drivers/dri/i965/brw_primitive_restart.c | 4 +- 4 files changed, 164 insertions(+), 48 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index a23e9c0..bfd113f 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -416,10 +416,10 @@ brw_postdraw_set_buffers_need_resolve(struct brw_context *brw) */ static void brw_try_draw_prims(struct gl_context *ctx, - const struct gl_client_array *arrays[], const struct _mesa_prim *prims, GLuint nr_prims, const struct _mesa_index_buffer *ib, + const struct gl_client_array *arrays[], GLuint min_index, GLuint max_index, struct gl_buffer_object *indirect) @@ -572,23 +572,26 @@ retry: return; } -void -brw_draw_prims(struct gl_context *ctx, - const struct _mesa_prim *prims, - GLuint nr_prims, - const struct _mesa_index_buffer *ib, - GLboolean index_bounds_valid, - GLuint min_index, - GLuint max_index, - struct gl_transform_feedback_object *unused_tfb_object, - unsigned stream, - struct gl_buffer_object *indirect) +/** + * \warning + * This function must be static, inline, and always_inline. This is the only + * thing that allows the compiler to optimize the tests of \c + * must_be_core_profile away. + */ +static inline __attribute__((always_inline)) void +draw_prims(struct gl_context *ctx, + const struct _mesa_prim *prims, + GLuint nr_prims, + const struct _mesa_index_buffer *ib, + GLboolean index_bounds_valid, + GLuint min_index, + GLuint max_index, + struct gl_buffer_object *indirect, + gl_api API) { struct brw_context *brw = brw_context(ctx); const struct gl_client_array **arrays = ctx->Array._DrawArrays; - assert(unused_tfb_object == NULL); - if (!brw_check_conditional_render(brw)) return; @@ -598,38 +601,108 @@ brw_draw_prims(struct gl_context *ctx, return; } - /* Do GL_SELECT and GL_FEEDBACK rendering using swrast, even though it -* won't support all the extensions we support. + /* Core profile removed GL_SELECT and GL_FEEDBACK. +* +* FINISHME: OpenGL ES (all versions) also remove GL_SELECT and +* FINISHME: GL_FEEDBACK. We could take advantage of this easily. */ - if (ctx->RenderMode != GL_RENDER) { - perf_debug("%s render mode not supported in hardware\n", - _mesa_enum_to_string(ctx->RenderMode)); - _swsetup_Wakeup(ctx); - _tnl_wakeup(ctx); - _tnl_draw_prims(ctx, prims, nr_prims, ib, - index_bounds_valid, min_index, max_index, NULL, 0, NULL); - return; + if (API == API_OPENGL_COMPAT) { + /* Do GL_SELECT and GL_FEEDBACK rendering using swrast, even though it + * won't support all the extensions we support. + */ + if (ctx->RenderMode != GL_RENDER) { + perf_debug("%s render mode not supported in hardware\n", +_mesa_enum_to_string(ctx->RenderMode)); + _swsetup_Wakeup(ctx); + _tnl_wakeup(ctx); + _tnl_draw_prims(ctx, prims, nr_prims, ib, + index_bounds_valid, min_index, max_index, NULL, 0, + NULL); + return; + } } - /* If we're going to have to upload any of the user's vertex arrays, then -* get the minimum and maximum of their index buffer so we know what range -* to upload. + /* Core profile requires that all vertex data be stored in VBOs, so there +* is n
[Mesa-dev] [Bug 91591] rounding.h:102:2: error: #error "Unsupported or undefined LONG_BIT"
https://bugs.freedesktop.org/show_bug.cgi?id=91591 --- Comment #12 from Jose Fonseca --- if __STDC_LIMIT_MACROS is indeed the problem, then it must be defined globally in configure.ac. Defining in rounding.h won't be effective if by any reason limits.h is included before rounding.h is (ie, the the C file includes limits.h or includes another header that includes limits.h before rounding.h) -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91468] LLVM 3.8(svn): llvm changes llvm-config output again?
https://bugs.freedesktop.org/show_bug.cgi?id=91468 Krzysztof A. Sobiecki changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |NOTOURBUG --- Comment #10 from Krzysztof A. Sobiecki --- Packages on llvm.org/apt should be fixed now, no need for a patch. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91591] rounding.h:102:2: error: #error "Unsupported or undefined LONG_BIT"
https://bugs.freedesktop.org/show_bug.cgi?id=91591 Matt Turner changed: What|Removed |Added Attachment #117615|0 |1 is obsolete|| --- Comment #13 from Matt Turner --- Created attachment 117618 --> https://bugs.freedesktop.org/attachment.cgi?id=117618&action=edit patch Oh, that's true. Here's a patch to define it globally. -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/skl: Remove early platform support
On Mon, Aug 10, 2015 at 9:49 AM, Ben Widawsky wrote: > On Mon, Aug 10, 2015 at 09:09:41AM -0700, Jason Ekstrand wrote: >> On Aug 10, 2015 4:14 AM, "Neil Roberts" wrote: >> > >> > If we go with this patch perhaps it would be good to remove >> > supports_simd16_3src entirely from brw_device_info and any code that is >> > referring to it in order to avoid carrying around useless code. >> > Currently it seems like it would be quite easy to add a new >> > brw_device_info and forget to add supports_simd16_3src and never notice >> > that it is redundantly using the fallback. >> >> I would second that. >> > > Either of you opposed to doing it as a follow-on patch? (This failure you > speak > of was the primary goal for removing it). Initially, I tried to remove it, but > we do still seem to use this logic pre-HSW, and so things got messier than > I initially anticipated. Follow-on is fine with me. > Neil, you said "if we go with this patch" - I haven't heard any opposition, do > you have any? > > [snip] ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/format_utils: Add src_bits == dst_bits cases to [us]norm functions
On Mon, Aug 10, 2015 at 4:35 AM, Neil Roberts wrote: > Jason Ekstrand writes: > >> @@ -151,8 +153,10 @@ _mesa_snorm_to_snorm(int x, unsigned src_bits, unsigned >> dst_bits) >>return -MAX_INT(dst_bits); >> else if (src_bits < dst_bits) >>return EXTEND_NORMALIZED_INT(x, src_bits - 1, dst_bits - 1); >> - else >> + else if (src_bits > dst_bits) >>return x >> (src_bits - dst_bits); >> + else >> + return x; >> } > > This part seems pretty unnecessary as it seems pretty unlikely that a > compiler couldn't optimise away x >> 0. However I'm happy if you want to > land it anyway for consistency. You're right. I probably shouldn't worry about x >> 0. > If you wanted to optimise it a bit more you could move the > if(src_bits==dst_bits) to above the first if statement because I think > it would be tricky the for the compiler to optimise that away. That's not quite right. The first if statement is to "sanitize" the value. For an 8-bit snorm, they could pass in -128 which is slightly less than -1.0 and we want them to get -127 (-1.0) back out. > Either way, > > Reviewed-by: Neil Roberts Thanks! > > Regards, > - Neil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/5] i965/vec4-nir: Handle boolean resolvese on ILK-
On Mon, Aug 3, 2015 at 5:22 PM, Jason Ekstrand wrote: > The analysis code was already there and running, we just weren't doing > anything with the result of it yet. > --- > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 13 + > 1 file changed, 13 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > index b13465b..27f23d0 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > @@ -1292,6 +1292,19 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) > default: >unreachable("Unimplemented ALU operation"); > } > + > + /* If we need to do a boolean resolve, replace the result with -(x & 1) > +* to sign extend the low bit to 0/~0 > +*/ > + if (devinfo->gen <= 5 && > + (instr->instr.pass_flags & BRW_NIR_BOOLEAN_MASK) == > BRW_NIR_BOOLEAN_NEEDS_RESOLVE) { Line wrap. > + dst_reg masked = dst_reg(this, glsl_type::int_type); > + masked.writemask = dst.writemask; > + emit(AND(masked, src_reg(dst), src_reg(1))); > + src_reg masked_neg = src_reg(masked); > + masked_neg.negate = true; > + emit(MOV(retype(dst, BRW_REGISTER_TYPE_D), masked_neg)); > + } > } > > void > -- > 2.4.3 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/5] i965/vec4_nir: Do boolean source modifier resolves on BDW+
On Mon, Aug 3, 2015 at 5:22 PM, Jason Ekstrand wrote: > On BDW+, the negation source modifier on NOT, AND, OR, and XOR, is actually > a boolean negate and not an integer negate. However, NIR's soruce > modifiers are the integer version. We have to resolve it with a MOV prior > to emitting the actual instruction. This is basically the same thing we do > in the FS backend. > --- > src/mesa/drivers/dri/i965/brw_vec4.h | 1 + > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 15 +++ > src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 13 + > 3 files changed, 29 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h > b/src/mesa/drivers/dri/i965/brw_vec4.h > index 985886d..ff143be 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.h > +++ b/src/mesa/drivers/dri/i965/brw_vec4.h > @@ -320,6 +320,7 @@ public: > dst_reg dst, src_reg src0, src_reg src1); > > src_reg fix_3src_operand(src_reg src); > + src_reg resolve_source_modifiers(src_reg src); > > vec4_instruction *emit_math(enum opcode opcode, const dst_reg &dst, const > src_reg &src0, > const src_reg &src1 = src_reg()); > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > index 27f23d0..1b7fb5e 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > @@ -1020,18 +1020,33 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) > } > > case nir_op_inot: > + if (devinfo->gen >= 8) { > + op[0] = resolve_source_modifiers(op[0]); > + } >emit(NOT(dst, op[0])); >break; > > case nir_op_ixor: > + if (devinfo->gen >= 8) { > + op[0] = resolve_source_modifiers(op[0]); > + op[1] = resolve_source_modifiers(op[1]); > + } >emit(XOR(dst, op[0], op[1])); >break; > > case nir_op_ior: > + if (devinfo->gen >= 8) { > + op[0] = resolve_source_modifiers(op[0]); > + op[1] = resolve_source_modifiers(op[1]); > + } >emit(OR(dst, op[0], op[1])); >break; > > case nir_op_iand: > + if (devinfo->gen >= 8) { > + op[0] = resolve_source_modifiers(op[0]); > + op[1] = resolve_source_modifiers(op[1]); > + } >emit(AND(dst, op[0], op[1])); >break; > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp > index c5c0d2c..639f829 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp > @@ -313,6 +313,19 @@ vec4_visitor::fix_3src_operand(src_reg src) > } > > src_reg > +vec4_visitor::resolve_source_modifiers(src_reg src) > +{ > + if (!src.abs && !src.negate) > + return src; > + > + dst_reg resolved = dst_reg(this, glsl_type::ivec4_type); > + resolved.type = src.type; > + emit(MOV(resolved, src)); > + > + return src_reg(resolved); > +} The FS passes a pointer to an fs_reg and then updates that. Let's do that here to be consistent. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965/gen9: Don't use fast copy blit in case of non power of 2 cpp
Many piglit tests (if using fast copy blit in Mesa) failed earlier because I missed adding this condition. Fast copy blit is currently enabled for use only with Yf/Ys tiling. Signed-off-by: Anuj Phogat --- src/mesa/drivers/dri/i965/intel_blit.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c index 6d92580..bab7d90 100644 --- a/src/mesa/drivers/dri/i965/intel_blit.c +++ b/src/mesa/drivers/dri/i965/intel_blit.c @@ -431,8 +431,10 @@ can_fast_copy_blit(struct brw_context *brw, if ((dst_offset | src_offset) & 63) return false; - /* Color depth greater than 128 bits not supported. */ - if (cpp > 16) + /* Color depths which are not power of 2 or greater than 128 bits are +* not supported. +*/ + if (!is_power_of_two(cpp) || cpp > 16) return false; /* For Fast Copy Blits the pitch cannot be a negative number. So, bit 15 -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/70] i965: Rename intel_batchbuffer to brw_batch
On Mon, Aug 10, 2015 at 5:49 AM, Martin Peres wrote: > On 10/08/15 15:45, Chris Wilson wrote: >> >> On Mon, Aug 10, 2015 at 03:37:24PM +0300, Martin Peres wrote: >>> >>> 10-12 are: >>> >>> Reviewed-by: Martin Peres >> >> One thing to consider, do we want to do a s/typedef/struct brw_batch/ >> cleanup after the transition is complete? >> >> brw_batch vs struct brw_batch ? >> brw_bo vs struct brw_bo ? >> -Chris > > > That is a question for jason and Kenneth, isn't it? /me is not really a big > contributor to mesa yet In most of mesa, we've kept the struct decorator. We do typedefs in NIR (which some people have complained about) but the rest of the time we usually don't. Of course, code that's written in C++ doesn't use the "struct" keyword. --Jason ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91591] rounding.h:102:2: error: #error "Unsupported or undefined LONG_BIT"
https://bugs.freedesktop.org/show_bug.cgi?id=91591 --- Comment #14 from Jose Fonseca --- Comment on attachment 117618 --> https://bugs.freedesktop.org/attachment.cgi?id=117618 patch Review of attachment 117618: - Looks great. Reviewed-by: Jose Fonseca -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/5] i965/vec4_nir: Do boolean source modifier resolves on BDW+
On Mon, Aug 10, 2015 at 11:16 AM, Matt Turner wrote: > On Mon, Aug 3, 2015 at 5:22 PM, Jason Ekstrand wrote: >> On BDW+, the negation source modifier on NOT, AND, OR, and XOR, is actually >> a boolean negate and not an integer negate. However, NIR's soruce >> modifiers are the integer version. We have to resolve it with a MOV prior >> to emitting the actual instruction. This is basically the same thing we do >> in the FS backend. >> --- >> src/mesa/drivers/dri/i965/brw_vec4.h | 1 + >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 15 +++ >> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 13 + >> 3 files changed, 29 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h >> b/src/mesa/drivers/dri/i965/brw_vec4.h >> index 985886d..ff143be 100644 >> --- a/src/mesa/drivers/dri/i965/brw_vec4.h >> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h >> @@ -320,6 +320,7 @@ public: >> dst_reg dst, src_reg src0, src_reg src1); >> >> src_reg fix_3src_operand(src_reg src); >> + src_reg resolve_source_modifiers(src_reg src); >> >> vec4_instruction *emit_math(enum opcode opcode, const dst_reg &dst, >> const src_reg &src0, >> const src_reg &src1 = src_reg()); >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> index 27f23d0..1b7fb5e 100644 >> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> @@ -1020,18 +1020,33 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) >> } >> >> case nir_op_inot: >> + if (devinfo->gen >= 8) { >> + op[0] = resolve_source_modifiers(op[0]); >> + } >>emit(NOT(dst, op[0])); >>break; >> >> case nir_op_ixor: >> + if (devinfo->gen >= 8) { >> + op[0] = resolve_source_modifiers(op[0]); >> + op[1] = resolve_source_modifiers(op[1]); >> + } >>emit(XOR(dst, op[0], op[1])); >>break; >> >> case nir_op_ior: >> + if (devinfo->gen >= 8) { >> + op[0] = resolve_source_modifiers(op[0]); >> + op[1] = resolve_source_modifiers(op[1]); >> + } >>emit(OR(dst, op[0], op[1])); >>break; >> >> case nir_op_iand: >> + if (devinfo->gen >= 8) { >> + op[0] = resolve_source_modifiers(op[0]); >> + op[1] = resolve_source_modifiers(op[1]); >> + } >>emit(AND(dst, op[0], op[1])); >>break; >> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp >> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp >> index c5c0d2c..639f829 100644 >> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp >> @@ -313,6 +313,19 @@ vec4_visitor::fix_3src_operand(src_reg src) >> } >> >> src_reg >> +vec4_visitor::resolve_source_modifiers(src_reg src) >> +{ >> + if (!src.abs && !src.negate) >> + return src; >> + >> + dst_reg resolved = dst_reg(this, glsl_type::ivec4_type); >> + resolved.type = src.type; >> + emit(MOV(resolved, src)); >> + >> + return src_reg(resolved); >> +} > > The FS passes a pointer to an fs_reg and then updates that. Let's do > that here to be consistent. Oh... I was just trying to be consistent with other functions in vec4_visitor such as fix_3src_operand() above. --Jason ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965/gen9: Add a condition for starting pixel in fast copy blit
This condition restricts the use of fast copy blit to cases where starting pixel of src and dst is oword (16 byte) aligned. Many piglit tests (if using fast copy blit in Mesa) failed earlier because I missed adding this condition.Fast copy blit is currently enabled for use only with Yf/Ys tiling. Cc: Ben Widawsky Signed-off-by: Anuj Phogat --- src/mesa/drivers/dri/i965/intel_blit.c | 4 1 file changed, 4 insertions(+) diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c index bab7d90..bb8fc66 100644 --- a/src/mesa/drivers/dri/i965/intel_blit.c +++ b/src/mesa/drivers/dri/i965/intel_blit.c @@ -427,6 +427,10 @@ can_fast_copy_blit(struct brw_context *brw, dst_tr_mode == INTEL_MIPTREE_TRMODE_NONE) return false; + /* The start pixel for Fast Copy blit should be on an OWord boundary. */ + if ((dst_x * cpp | src_x * cpp) & 15) + return false; + /* For all surface types buffers must be cacheline-aligned. */ if ((dst_offset | src_offset) & 63) return false; -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/5] i965/vec4_nir: Various fixes for ILK- and BDW+
On Mon, Aug 3, 2015 at 5:22 PM, Jason Ekstrand wrote: > This is a quick little series that fixes a bunch of piglit regressions with > vec4 NIR on ILK- and BDW+. With this series, vec4 NIR has (I think) zero > regressions versus non-NIR on all intel platforms. I haven't yet tested > NIR with vec4 vertex shaders on BDW, but I think it's probably ok. We > should definitely verify that at some point. I gave comments on 2 and 3. With those things fixed, this series is Reviewed-by: Matt Turner > I'm not 100% sure what I think about the first two patches. They actually > end up canceling each other out in effect. However, I do think that the > end result is better. Opinions welcome. Seems fine to me. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/5] Some register allocation improvements
On Fri, Jul 31, 2015 at 10:05 AM, Jason Ekstrand wrote: > The following 5 patches contain a few register allocation cleanups and > performance improvements. Chris Wilson noticed that setting up register > sets on i965 calls reralloc an absurd number of times. I did a little > hacking and found out that the initial size for the collision lists is way > too low. This series also contains a patch to avoid setting up registers > more times than needed on platforms where RA is the same for SIMD8 vs > SIMD16. > > The whole series seems to cut about 4 minutes off a piglit run on BYT. It > usually takes around 31 minutes and this time it ran in 27. > > Jason Ekstrand (5): > ra: Refactor ra_set_finalize > ra: Delete the conflict lists in ra_set_finalize > ra: Allocate bigger initial conflict lists > i965/fs: Use dispatch_width instead of reg_width in alloc_reg_sets The first four are Reviewed-by: Matt Turner > i965/fs: Don't do redundant RA setup on IVB+ I'll take your word that the register sets are identical between SIMD8/16 on IVB+ and give you an Acked-by: Matt Turner ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] i965/fs: Don't do redundant RA setup on IVB+
On Fri, Jul 31, 2015 at 10:05 AM, Jason Ekstrand wrote: > --- > src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp > b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp > index 211f70e..512da22 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp > @@ -79,6 +79,15 @@ brw_alloc_reg_set(struct brw_compiler *compiler, int > dispatch_width) > int base_reg_count = BRW_MAX_GRF; > int index = (dispatch_width / 8) - 1; > > + if (dispatch_width > 8 && devinfo->gen >= 7) { > + /* For IVB+, we don't need the PLN hacks or the 2-reg alignment in s/2-reg /even-/ (to match the PRM) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/5] i965/vec4_nir: Do boolean source modifier resolves on BDW+
On Mon, Aug 10, 2015 at 11:19 AM, Jason Ekstrand wrote: > On Mon, Aug 10, 2015 at 11:16 AM, Matt Turner wrote: >> On Mon, Aug 3, 2015 at 5:22 PM, Jason Ekstrand wrote: >>> On BDW+, the negation source modifier on NOT, AND, OR, and XOR, is actually >>> a boolean negate and not an integer negate. However, NIR's soruce >>> modifiers are the integer version. We have to resolve it with a MOV prior >>> to emitting the actual instruction. This is basically the same thing we do >>> in the FS backend. >>> --- >>> src/mesa/drivers/dri/i965/brw_vec4.h | 1 + >>> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 15 +++ >>> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 13 + >>> 3 files changed, 29 insertions(+) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h >>> b/src/mesa/drivers/dri/i965/brw_vec4.h >>> index 985886d..ff143be 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_vec4.h >>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h >>> @@ -320,6 +320,7 @@ public: >>> dst_reg dst, src_reg src0, src_reg src1); >>> >>> src_reg fix_3src_operand(src_reg src); >>> + src_reg resolve_source_modifiers(src_reg src); >>> >>> vec4_instruction *emit_math(enum opcode opcode, const dst_reg &dst, >>> const src_reg &src0, >>> const src_reg &src1 = src_reg()); >>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >>> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >>> index 27f23d0..1b7fb5e 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >>> @@ -1020,18 +1020,33 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) >>> } >>> >>> case nir_op_inot: >>> + if (devinfo->gen >= 8) { >>> + op[0] = resolve_source_modifiers(op[0]); >>> + } >>>emit(NOT(dst, op[0])); >>>break; >>> >>> case nir_op_ixor: >>> + if (devinfo->gen >= 8) { >>> + op[0] = resolve_source_modifiers(op[0]); >>> + op[1] = resolve_source_modifiers(op[1]); >>> + } >>>emit(XOR(dst, op[0], op[1])); >>>break; >>> >>> case nir_op_ior: >>> + if (devinfo->gen >= 8) { >>> + op[0] = resolve_source_modifiers(op[0]); >>> + op[1] = resolve_source_modifiers(op[1]); >>> + } >>>emit(OR(dst, op[0], op[1])); >>>break; >>> >>> case nir_op_iand: >>> + if (devinfo->gen >= 8) { >>> + op[0] = resolve_source_modifiers(op[0]); >>> + op[1] = resolve_source_modifiers(op[1]); >>> + } >>>emit(AND(dst, op[0], op[1])); >>>break; >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp >>> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp >>> index c5c0d2c..639f829 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp >>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp >>> @@ -313,6 +313,19 @@ vec4_visitor::fix_3src_operand(src_reg src) >>> } >>> >>> src_reg >>> +vec4_visitor::resolve_source_modifiers(src_reg src) >>> +{ >>> + if (!src.abs && !src.negate) >>> + return src; >>> + >>> + dst_reg resolved = dst_reg(this, glsl_type::ivec4_type); >>> + resolved.type = src.type; >>> + emit(MOV(resolved, src)); >>> + >>> + return src_reg(resolved); >>> +} >> >> The FS passes a pointer to an fs_reg and then updates that. Let's do >> that here to be consistent. > > Oh... I was just trying to be consistent with other functions in > vec4_visitor such as fix_3src_operand() above. Heh, consistency is inconsistent! If changing the FS to be consistent with this seems better, that works for me. In that case, make the argument 'const src_reg &src' to avoid the copy. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] clover: upload sample bitfield to kernel
--- src/gallium/state_trackers/clover/core/kernel.cpp | 7 +++ src/gallium/state_trackers/clover/core/sampler.cpp | 12 src/gallium/state_trackers/clover/core/sampler.hpp | 1 + 3 files changed, 20 insertions(+) diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp index a226ec1..c3cf4eb 100644 --- a/src/gallium/state_trackers/clover/core/kernel.cpp +++ b/src/gallium/state_trackers/clover/core/kernel.cpp @@ -575,6 +575,13 @@ kernel::sampler_argument::set(size_t size, const void *value) { void kernel::sampler_argument::bind(exec_context &ctx, const module::argument &marg) { + auto v = bytes(s->bit_field()); + + extend(v, module::argument::zero_ext, marg.target_size); + byteswap(v, ctx.q->device().endianness()); + align(ctx.input, marg.target_align); + insert(ctx.input, v); + st = s->bind(*ctx.q); ctx.samplers.push_back(st); } diff --git a/src/gallium/state_trackers/clover/core/sampler.cpp b/src/gallium/state_trackers/clover/core/sampler.cpp index 6f2784b..23fb76a 100644 --- a/src/gallium/state_trackers/clover/core/sampler.cpp +++ b/src/gallium/state_trackers/clover/core/sampler.cpp @@ -23,6 +23,11 @@ #include "core/sampler.hpp" #include "pipe/p_state.h" +// CLK_* constants are defined in libclc (include/clc/image/image_defines.h) +#define __CLOVER_NORM_MODE_BITS(x) (x) +#define __CLOVER_ADDR_MODE_BITS(x) (((x)-CL_ADDRESS_NONE)<<1) +#define __CLOVER_FILT_MODE_BITS(x) (((x)-CL_FILTER_NEAREST)<<4) + using namespace clover; sampler::sampler(clover::context &ctx, bool norm_mode, @@ -47,6 +52,13 @@ sampler::filter_mode() { return _filter_mode; } +cl_uint +sampler::bit_field() const { + return __CLOVER_NORM_MODE_BITS(_norm_mode) | + __CLOVER_ADDR_MODE_BITS(_addr_mode) | + __CLOVER_FILT_MODE_BITS(_filter_mode); +} + void * sampler::bind(command_queue &q) { struct pipe_sampler_state info {}; diff --git a/src/gallium/state_trackers/clover/core/sampler.hpp b/src/gallium/state_trackers/clover/core/sampler.hpp index 2632c30..d896252 100644 --- a/src/gallium/state_trackers/clover/core/sampler.hpp +++ b/src/gallium/state_trackers/clover/core/sampler.hpp @@ -40,6 +40,7 @@ namespace clover { bool norm_mode(); cl_addressing_mode addr_mode(); cl_filter_mode filter_mode(); + cl_uint bit_field() const; const intrusive_ref context; -- 2.4.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] glsl: avoid compiler's segfault when processing operators with void arguments
Should this patch be cc'd to stable branches? Without it, the compiler crashes on invalid inputs, instead of generating an error. This patch applies cleanly to 10.5 and 10.6. -Mark Renaud Gaubert writes: > This is done by returning an rvalue of type void in the > ast_function_expression::hir function instead of a void expression. > > This produces (in the case of the ternary) an hir with a call > to the void returning function and an assignment of a void variable > which will be optimized out (the assignment) during the optimization > pass. > > This fix results in having a valid subexpression in the many > different cases where the subexpressions are functions whose > return values are void. > > Thus preventing to dereference NULL in the following cases: > * binary operator > * unary operators > * ternary operator > * comparison operators (except equal and nequal operator) > > Equal and nequal had to be handled as a special case because > instead of segfaulting on a forbidden syntax it was now accepting > expressions with a void return value on either (or both) side of > the expression. > > Signed-off-by: Renaud Gaubert > Reviewed-by: Gabriel Laskar > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85252 > > --- > Piglit tests were sent to the Piglit mailing list: > * glsl-1.10 Adds tests on how void functions are handled > > src/glsl/ast_function.cpp | 9 - > src/glsl/ast_to_hir.cpp | 9 - > 2 files changed, 16 insertions(+), 2 deletions(-) > > diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp > index 92e26bf..ac32723 100644 > --- a/src/glsl/ast_function.cpp > +++ b/src/glsl/ast_function.cpp > @@ -1785,7 +1785,14 @@ ast_function_expression::hir(exec_list *instructions, >/* an error has already been emitted */ >value = ir_rvalue::error_value(ctx); >} else { > - value = generate_call(instructions, sig, &actual_parameters, state); > +value = generate_call(instructions, sig, &actual_parameters, state); > +if (!value) { > + ir_variable *const tmp = new(ctx) > ir_variable(glsl_type::void_type, > + "void_var", > + ir_var_temporary); > + instructions->push_tail(tmp); > + value = new(ctx) ir_dereference_variable(tmp); > +} >} > >return value; > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp > index 8cb46be..0d0ad2a 100644 > --- a/src/glsl/ast_to_hir.cpp > +++ b/src/glsl/ast_to_hir.cpp > @@ -1270,7 +1270,14 @@ ast_expression::do_hir(exec_list *instructions, > *applied to one operand that can make them match, in which > *case this conversion is done." > */ > - if ((!apply_implicit_conversion(op[0]->type, op[1], state) > + > + if (op[0]->type == glsl_type::void_type || op[1]->type == > glsl_type::void_type) { > + _mesa_glsl_error(& loc, state, "`%s': wrong operand types: " > + "no operation `%1$s' exists that takes a left-hand " > + "operand of type 'void' or a right operand of type " > + "'void'", (this->oper == ast_equal) ? "==" : "!="); > + error_emitted = true; > + } else if ((!apply_implicit_conversion(op[0]->type, op[1], state) > && !apply_implicit_conversion(op[1]->type, op[0], state)) >|| (op[0]->type != op[1]->type)) { > _mesa_glsl_error(& loc, state, "operands of `%s' must have the same > " > -- > 2.4.5 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] r600, compute: setup compute sampler states and views
--- src/gallium/drivers/r600/evergreen_compute.c | 25 ++ src/gallium/drivers/r600/evergreen_state.c | 30 -- src/gallium/drivers/r600/evergreend.h| 5 + src/gallium/drivers/r600/r600_pipe.h | 7 +- src/gallium/drivers/r600/r600_state_common.c | 32 ++-- 5 files changed, 60 insertions(+), 39 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c index d71eeb9..e886847 100644 --- a/src/gallium/drivers/r600/evergreen_compute.c +++ b/src/gallium/drivers/r600/evergreen_compute.c @@ -504,6 +504,12 @@ static void compute_emit_cs(struct r600_context *ctx, const uint *block_layout, /* Emit constant buffer state */ r600_emit_atom(ctx, &ctx->constbuf_state[PIPE_SHADER_COMPUTE].atom); + /* Emit sampler state */ + r600_emit_atom(ctx, &ctx->samplers[PIPE_SHADER_COMPUTE].states.atom); + + /* Emit sampler view (texture resource) state */ + r600_emit_atom(ctx, &ctx->samplers[PIPE_SHADER_COMPUTE].views.atom); + /* Emit compute shader state */ r600_emit_atom(ctx, &ctx->cs_shader_state.atom); @@ -674,25 +680,6 @@ static void evergreen_set_compute_resources(struct pipe_context * ctx_, } } -void evergreen_set_cs_sampler_view(struct pipe_context *ctx_, - unsigned start_slot, unsigned count, - struct pipe_sampler_view **views) -{ - struct r600_pipe_sampler_view **resource = - (struct r600_pipe_sampler_view **)views; - - for (unsigned i = 0; i < count; i++){ - if (resource[i]) { - assert(i+1 < 12); - /* XXX: Implement */ - assert(!"Compute samplers not implemented."); - ///FETCH0 = VTX0 (param buffer), - //FETCH1 = VTX1 (global buffer pool), FETCH2... = TEX - } - } -} - - static void evergreen_set_global_binding( struct pipe_context *ctx_, unsigned first, unsigned n, struct pipe_resource **resources, diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index 688a092..5f68e08 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -2029,7 +2029,7 @@ static void evergreen_emit_cs_constant_buffers(struct r600_context *rctx, struct static void evergreen_emit_sampler_views(struct r600_context *rctx, struct r600_samplerview_state *state, -unsigned resource_id_base) +unsigned resource_id_base, unsigned pkt_flags) { struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs; uint32_t dirty_mask = state->dirty_mask; @@ -2042,7 +2042,7 @@ static void evergreen_emit_sampler_views(struct r600_context *rctx, rview = state->views[resource_index]; assert(rview); - radeon_emit(cs, PKT3(PKT3_SET_RESOURCE, 8, 0)); + radeon_emit(cs, PKT3(PKT3_SET_RESOURCE, 8, 0) | pkt_flags); radeon_emit(cs, (resource_id_base + resource_index) * 8); radeon_emit_array(cs, rview->tex_resource_words, 8); @@ -2051,11 +2051,11 @@ static void evergreen_emit_sampler_views(struct r600_context *rctx, rview->tex_resource->b.b.nr_samples > 1 ? RADEON_PRIO_SHADER_TEXTURE_MSAA : RADEON_PRIO_SHADER_TEXTURE_RO); - radeon_emit(cs, PKT3(PKT3_NOP, 0, 0)); + radeon_emit(cs, PKT3(PKT3_NOP, 0, 0) | pkt_flags); radeon_emit(cs, reloc); if (!rview->skip_mip_address_reloc) { - radeon_emit(cs, PKT3(PKT3_NOP, 0, 0)); + radeon_emit(cs, PKT3(PKT3_NOP, 0, 0) | pkt_flags); radeon_emit(cs, reloc); } } @@ -2064,17 +2064,26 @@ static void evergreen_emit_sampler_views(struct r600_context *rctx, static void evergreen_emit_vs_sampler_views(struct r600_context *rctx, struct r600_atom *atom) { - evergreen_emit_sampler_views(rctx, &rctx->samplers[PIPE_SHADER_VERTEX].views, 176 + R600_MAX_CONST_BUFFERS); + evergreen_emit_sampler_views(rctx, &rctx->samplers[PIPE_SHADER_VERTEX].views, +176 + R600_MAX_CONST_BUFFERS, 0); } static void evergreen_emit_gs_sampler_views(struct r600_context *rctx, struct r600_atom *atom) { - evergreen_emit_sampler_views(rctx, &rctx->samplers[PIPE_SHADER_GEOMETRY].views, 336 + R600_MAX_CONST_BUFFERS); + evergreen_emit_sampler_views(rctx, &rctx->samplers[PIPE_SHADER_GEOMETRY].views, +336 + R600_MAX_CONS
[Mesa-dev] [PATCH 2/2] clover: recognize sampler kernel arguments
--- src/gallium/state_trackers/clover/llvm/invocation.cpp | 9 + 1 file changed, 9 insertions(+) diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp index 86859af..50c4557 100644 --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp @@ -476,6 +476,15 @@ namespace { continue; } + // Sampler + if (type_name == "sampler_t") { +args.push_back(module::argument(module::argument::sampler, +arg_store_size, target_size, +target_align, +module::argument::zero_ext)); +continue; + } + // Image size implicit argument if (type_name == "__llvm_image_size") { args.push_back(module::argument(module::argument::scalar, -- 2.4.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] r600, compute: setup RATs for write-only images
--- src/gallium/drivers/r600/evergreen_compute.c | 101 --- src/gallium/drivers/r600/evergreen_state.c | 44 +++- src/gallium/drivers/r600/r600_pipe.h | 4 +- src/gallium/drivers/radeon/r600_texture.c| 1 + 4 files changed, 109 insertions(+), 41 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c index 8b27a66..6f39507 100644 --- a/src/gallium/drivers/r600/evergreen_compute.c +++ b/src/gallium/drivers/r600/evergreen_compute.c @@ -99,8 +99,39 @@ struct r600_resource* r600_compute_buffer_alloc_vram( return (struct r600_resource *)buffer; } +static void evergreen_set_cbuf( + struct r600_context *ctx, + unsigned id, + struct pipe_surface *surf) +{ + struct pipe_framebuffer_state *state = &ctx->framebuffer.state; + + /* Add the RAT to the list of color buffers */ + state->cbufs[id] = surf; + + /* Update the number of color buffers */ + state->nr_cbufs = MAX2(id + 1, state->nr_cbufs); + + /* Update the cb_target_mask +* XXX: I think this is a potential spot for bugs once we start doing +* GL interop. cb_target_mask may be modified in the 3D sections +* of this driver. */ + ctx->compute_cb_target_mask |= (0xf << (id << 2)); +} + +static void evergreen_unset_cbuf( + struct r600_context *ctx, + unsigned id) +{ + struct pipe_framebuffer_state *state = &ctx->framebuffer.state; + + state->cbufs[id] = NULL; + ctx->compute_cb_target_mask &= ~(0xf << (id << 2)); + /* nr_cbufs = ceil(last_bit / 4) */ + state->nr_cbufs = (util_last_bit(ctx->compute_cb_target_mask) + 3) >> 2; +} -static void evergreen_set_rat( +static void evergreen_set_rat_buf( struct r600_context *ctx, unsigned id, struct r600_resource* bo, @@ -123,23 +154,11 @@ static void evergreen_set_rat( rat_templ.u.tex.first_layer = 0; rat_templ.u.tex.last_layer = 0; - /* Add the RAT the list of color buffers */ - ctx->framebuffer.state.cbufs[id] = ctx->b.b.create_surface( - (struct pipe_context *)ctx, - (struct pipe_resource *)bo, &rat_templ); - - /* Update the number of color buffers */ - ctx->framebuffer.state.nr_cbufs = - MAX2(id + 1, ctx->framebuffer.state.nr_cbufs); - - /* Update the cb_target_mask -* XXX: I think this is a potential spot for bugs once we start doing -* GL interop. cb_target_mask may be modified in the 3D sections -* of this driver. */ - ctx->compute_cb_target_mask |= (0xf << (id * 4)); + evergreen_set_cbuf(ctx, id, ctx->b.b.create_surface( + (struct pipe_context *)ctx, (struct pipe_resource *)bo, &rat_templ)); surf = (struct r600_surface*)ctx->framebuffer.state.cbufs[id]; - evergreen_init_color_surface_rat(rctx, surf); + evergreen_init_color_surface_rat_buf(ctx, surf); } static void evergreen_cs_set_vertex_buffer( @@ -623,32 +642,34 @@ static void evergreen_set_compute_resources(struct pipe_context * ctx_, struct pipe_surface ** surfaces) { struct r600_context *ctx = (struct r600_context *)ctx_; - struct r600_surface **resources = (struct r600_surface **)surfaces; + struct pipe_surface *surf = NULL; COMPUTE_DBG(ctx->screen, "*** evergreen_set_compute_resources: start = %u count = %u\n", start, count); + if (!surfaces) { + for (unsigned i = 0; i < count; ++i) + evergreen_unset_cbuf(ctx, 1 + i); + return; + } + for (unsigned i = 0; i < count; i++) { - /* The First two vertex buffers are reserved for parameters and -* global buffers. */ - unsigned vtx_id = 2 + i; - if (resources[i]) { - struct r600_resource_global *buffer = - (struct r600_resource_global*) - resources[i]->base.texture; - if (resources[i]->base.writable) { - assert(i+1 < 12); - - evergreen_set_rat(ctx->cs_shader_state.shader, i+1, - (struct r600_resource *)resources[i]->base.texture, - buffer->chunk->start_in_dw*4, - resources[i]->base.texture->width0); - } - - evergreen_cs_set_vertex_buffer(ctx, vtx_id, - buffer->chunk->start_in_dw * 4, - resources[i]->base.texture); - } + surf = surfaces[i]; + if (!surf) + continue; + + /* XXX: Implement constant buffers. */ + assert(surf->wri
Re: [Mesa-dev] [PATCHv2 07/14] i965: Implement surface state set-up for shader images.
On Sat, Aug 8, 2015 at 4:04 AM, Francisco Jerez wrote: > Jason Ekstrand writes: > >> On Wed, May 13, 2015 at 9:43 AM, Francisco Jerez >> wrote: >>> v2: Add SKL support. >>> --- >>> src/mesa/drivers/dri/i965/brw_context.h | 2 + >>> src/mesa/drivers/dri/i965/brw_surface_formats.c | 109 >>> +++ >>> src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 77 >>> 3 files changed, 188 insertions(+) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_context.h >>> b/src/mesa/drivers/dri/i965/brw_context.h >>> index 2dcc23c..c55dcec 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_context.h >>> +++ b/src/mesa/drivers/dri/i965/brw_context.h >>> @@ -1741,6 +1741,8 @@ void brw_upload_abo_surfaces(struct brw_context *brw, >>> bool brw_render_target_supported(struct brw_context *brw, >>> struct gl_renderbuffer *rb); >>> uint32_t brw_depth_format(struct brw_context *brw, mesa_format format); >>> +mesa_format brw_lower_mesa_image_format(const struct brw_device_info >>> *devinfo, >>> +mesa_format format); >>> >>> /* brw_performance_monitor.c */ >>> void brw_init_performance_monitors(struct brw_context *brw); >>> diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c >>> b/src/mesa/drivers/dri/i965/brw_surface_formats.c >>> index 016f87a..e0e7fb6 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c >>> +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c >>> @@ -805,3 +805,112 @@ brw_depth_format(struct brw_context *brw, mesa_format >>> format) >>>unreachable("Unexpected depth format."); >>> } >>> } >>> + >>> +mesa_format >>> +brw_lower_mesa_image_format(const struct brw_device_info *devinfo, >>> +mesa_format format) >>> +{ >>> + switch (format) { >>> + /* These are never lowered. Up to BDW we'll have to fall back to >>> untyped >>> +* surface access for 128bpp formats. >>> +*/ >>> + case MESA_FORMAT_RGBA_UINT32: >>> + case MESA_FORMAT_RGBA_SINT32: >>> + case MESA_FORMAT_RGBA_FLOAT32: >>> + case MESA_FORMAT_R_UINT32: >>> + case MESA_FORMAT_R_SINT32: >>> + case MESA_FORMAT_R_FLOAT32: >>> + return format; >> >> If it's an unsupported format, why not use MESA_FORMAT_NONE? That >> would make it easier for functions that call this function to know >> that it's just not supported and they shouldn't bother trying. >> > Because they are all supported. As the comment says these formats are > always accessed unlowered, the catch is that the shader may have to use > a different (much more painful) mechanism to access them depending on > the hardware generation -- But this function simply implements the > mapping between API image formats and what the 3D pipeline will see, > which is R(GBA)_*32 in this case regardless of whether untyped or typed > messages can be used. Ok, I failed to read. Sorry for the noise. >>> + >>> + /* From HSW to BDW the only 64bpp format supported for typed access is >>> +* RGBA_UINT16. IVB falls back to untyped. >>> +*/ >>> + case MESA_FORMAT_RGBA_UINT16: >>> + case MESA_FORMAT_RGBA_SINT16: >>> + case MESA_FORMAT_RGBA_FLOAT16: >>> + case MESA_FORMAT_RG_UINT32: >>> + case MESA_FORMAT_RG_SINT32: >>> + case MESA_FORMAT_RG_FLOAT32: >>> + return (devinfo->gen >= 9 ? format : >>> + devinfo->gen >= 8 || devinfo->is_haswell ? >>> + MESA_FORMAT_RGBA_UINT16 : MESA_FORMAT_RG_UINT32); >>> + >>> + /* Up to BDW no SINT or FLOAT formats of less than 32 bits per component >>> +* are supported. IVB doesn't support formats with more than one >>> component >>> +* for typed access. For 8 and 16 bpp formats IVB relies on the >>> +* undocumented behavior that typed reads from R_UINT8 and R_UINT16 >>> +* surfaces actually do a 32-bit misaligned read. The alternative >>> would be >>> +* to use two surface state entries with different formats for each >>> image, >>> +* one for reading (using R_UINT32) and another one for writing (using >>> +* R_UINT8 or R_UINT16), but that would complicate the shaders we >>> generate >>> +* even more. >>> +*/ >> >> Let's make sure I'm understanding this correctly. On BDW and HSW, we >> can just use the UINT format for SINT and FLOAT. On IVB, we set the >> surface state to UINT32 and unpack the components in the shader? What >> happens with writes on IVB to 16-bpp images? >> > Heh, not exactly. On IVB we disobey the hardware spec and bind 16-bpp > formats as R16_UINT and 8-bpp formats as R8_UINT, which aren't > documented to work for typed reads. Writes work just fine and update > the right 8- or 16-bit word on the image. Reads *almost* do what one > would expect: The LSBs of the values returned from the data port will > contain the texel we asked for, but the MSBs will be filled with garbage > which we simply mask out > (c.f. image_format_info::has_undefined_high_bits). >
[Mesa-dev] [PATCH 1/2] r600, compute: refactor 1st arg of evergreen_set_rat
--- src/gallium/drivers/r600/evergreen_compute.c | 21 + 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c index d89e3de..8b27a66 100644 --- a/src/gallium/drivers/r600/evergreen_compute.c +++ b/src/gallium/drivers/r600/evergreen_compute.c @@ -101,7 +101,7 @@ struct r600_resource* r600_compute_buffer_alloc_vram( static void evergreen_set_rat( - struct r600_pipe_compute *pipe, + struct r600_context *ctx, unsigned id, struct r600_resource* bo, int start, @@ -109,15 +109,12 @@ static void evergreen_set_rat( { struct pipe_surface rat_templ; struct r600_surface *surf = NULL; - struct r600_context *rctx = NULL; assert(id < 12); assert((size & 3) == 0); assert((start & 0xFF) == 0); - rctx = pipe->ctx; - - COMPUTE_DBG(rctx->screen, "bind rat: %i \n", id); + COMPUTE_DBG(ctx->screen, "bind rat: %i \n", id); /* Create the RAT surface */ memset(&rat_templ, 0, sizeof(rat_templ)); @@ -127,21 +124,21 @@ static void evergreen_set_rat( rat_templ.u.tex.last_layer = 0; /* Add the RAT the list of color buffers */ - pipe->ctx->framebuffer.state.cbufs[id] = pipe->ctx->b.b.create_surface( - (struct pipe_context *)pipe->ctx, + ctx->framebuffer.state.cbufs[id] = ctx->b.b.create_surface( + (struct pipe_context *)ctx, (struct pipe_resource *)bo, &rat_templ); /* Update the number of color buffers */ - pipe->ctx->framebuffer.state.nr_cbufs = - MAX2(id + 1, pipe->ctx->framebuffer.state.nr_cbufs); + ctx->framebuffer.state.nr_cbufs = + MAX2(id + 1, ctx->framebuffer.state.nr_cbufs); /* Update the cb_target_mask * XXX: I think this is a potential spot for bugs once we start doing * GL interop. cb_target_mask may be modified in the 3D sections * of this driver. */ - pipe->ctx->compute_cb_target_mask |= (0xf << (id * 4)); + ctx->compute_cb_target_mask |= (0xf << (id * 4)); - surf = (struct r600_surface*)pipe->ctx->framebuffer.state.cbufs[id]; + surf = (struct r600_surface*)ctx->framebuffer.state.cbufs[id]; evergreen_init_color_surface_rat(rctx, surf); } @@ -720,7 +717,7 @@ static void evergreen_set_global_binding( *(handles[i]) = util_cpu_to_le32(handle); } - evergreen_set_rat(ctx->cs_shader_state.shader, 0, pool->bo, 0, pool->size_in_dw * 4); + evergreen_set_rat(ctx, 0, pool->bo, 0, pool->size_in_dw * 4); evergreen_cs_set_vertex_buffer(ctx, 1, 0, (struct pipe_resource*)pool->bo); } -- 2.4.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/5] i965/vec4_nir: Do boolean source modifier resolves on BDW+
On Mon, Aug 10, 2015 at 11:30 AM, Matt Turner wrote: > On Mon, Aug 10, 2015 at 11:19 AM, Jason Ekstrand wrote: >> On Mon, Aug 10, 2015 at 11:16 AM, Matt Turner wrote: >>> On Mon, Aug 3, 2015 at 5:22 PM, Jason Ekstrand wrote: On BDW+, the negation source modifier on NOT, AND, OR, and XOR, is actually a boolean negate and not an integer negate. However, NIR's soruce modifiers are the integer version. We have to resolve it with a MOV prior to emitting the actual instruction. This is basically the same thing we do in the FS backend. --- src/mesa/drivers/dri/i965/brw_vec4.h | 1 + src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 15 +++ src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 13 + 3 files changed, 29 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 985886d..ff143be 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -320,6 +320,7 @@ public: dst_reg dst, src_reg src0, src_reg src1); src_reg fix_3src_operand(src_reg src); + src_reg resolve_source_modifiers(src_reg src); vec4_instruction *emit_math(enum opcode opcode, const dst_reg &dst, const src_reg &src0, const src_reg &src1 = src_reg()); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 27f23d0..1b7fb5e 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp @@ -1020,18 +1020,33 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) } case nir_op_inot: + if (devinfo->gen >= 8) { + op[0] = resolve_source_modifiers(op[0]); + } emit(NOT(dst, op[0])); break; case nir_op_ixor: + if (devinfo->gen >= 8) { + op[0] = resolve_source_modifiers(op[0]); + op[1] = resolve_source_modifiers(op[1]); + } emit(XOR(dst, op[0], op[1])); break; case nir_op_ior: + if (devinfo->gen >= 8) { + op[0] = resolve_source_modifiers(op[0]); + op[1] = resolve_source_modifiers(op[1]); + } emit(OR(dst, op[0], op[1])); break; case nir_op_iand: + if (devinfo->gen >= 8) { + op[0] = resolve_source_modifiers(op[0]); + op[1] = resolve_source_modifiers(op[1]); + } emit(AND(dst, op[0], op[1])); break; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index c5c0d2c..639f829 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -313,6 +313,19 @@ vec4_visitor::fix_3src_operand(src_reg src) } src_reg +vec4_visitor::resolve_source_modifiers(src_reg src) +{ + if (!src.abs && !src.negate) + return src; + + dst_reg resolved = dst_reg(this, glsl_type::ivec4_type); + resolved.type = src.type; + emit(MOV(resolved, src)); + + return src_reg(resolved); +} >>> >>> The FS passes a pointer to an fs_reg and then updates that. Let's do >>> that here to be consistent. >> >> Oh... I was just trying to be consistent with other functions in >> vec4_visitor such as fix_3src_operand() above. > > Heh, consistency is inconsistent! > > If changing the FS to be consistent with this seems better, that works for me. > > In that case, make the argument 'const src_reg &src' to avoid the copy. Mind if I do that as a follow-on? I'll figure out some way to make these all consistent with each other. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/5] i965/vec4_nir: Do boolean source modifier resolves on BDW+
On Mon, Aug 10, 2015 at 11:37 AM, Jason Ekstrand wrote: > On Mon, Aug 10, 2015 at 11:30 AM, Matt Turner wrote: >> On Mon, Aug 10, 2015 at 11:19 AM, Jason Ekstrand >> wrote: >>> On Mon, Aug 10, 2015 at 11:16 AM, Matt Turner wrote: On Mon, Aug 3, 2015 at 5:22 PM, Jason Ekstrand wrote: > On BDW+, the negation source modifier on NOT, AND, OR, and XOR, is > actually > a boolean negate and not an integer negate. However, NIR's soruce > modifiers are the integer version. We have to resolve it with a MOV prior > to emitting the actual instruction. This is basically the same thing we > do > in the FS backend. > --- > src/mesa/drivers/dri/i965/brw_vec4.h | 1 + > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 15 +++ > src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 13 + > 3 files changed, 29 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h > b/src/mesa/drivers/dri/i965/brw_vec4.h > index 985886d..ff143be 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.h > +++ b/src/mesa/drivers/dri/i965/brw_vec4.h > @@ -320,6 +320,7 @@ public: > dst_reg dst, src_reg src0, src_reg src1); > > src_reg fix_3src_operand(src_reg src); > + src_reg resolve_source_modifiers(src_reg src); > > vec4_instruction *emit_math(enum opcode opcode, const dst_reg &dst, > const src_reg &src0, > const src_reg &src1 = src_reg()); > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > index 27f23d0..1b7fb5e 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > @@ -1020,18 +1020,33 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) > } > > case nir_op_inot: > + if (devinfo->gen >= 8) { > + op[0] = resolve_source_modifiers(op[0]); > + } >emit(NOT(dst, op[0])); >break; > > case nir_op_ixor: > + if (devinfo->gen >= 8) { > + op[0] = resolve_source_modifiers(op[0]); > + op[1] = resolve_source_modifiers(op[1]); > + } >emit(XOR(dst, op[0], op[1])); >break; > > case nir_op_ior: > + if (devinfo->gen >= 8) { > + op[0] = resolve_source_modifiers(op[0]); > + op[1] = resolve_source_modifiers(op[1]); > + } >emit(OR(dst, op[0], op[1])); >break; > > case nir_op_iand: > + if (devinfo->gen >= 8) { > + op[0] = resolve_source_modifiers(op[0]); > + op[1] = resolve_source_modifiers(op[1]); > + } >emit(AND(dst, op[0], op[1])); >break; > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp > index c5c0d2c..639f829 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp > @@ -313,6 +313,19 @@ vec4_visitor::fix_3src_operand(src_reg src) > } > > src_reg > +vec4_visitor::resolve_source_modifiers(src_reg src) > +{ > + if (!src.abs && !src.negate) > + return src; > + > + dst_reg resolved = dst_reg(this, glsl_type::ivec4_type); > + resolved.type = src.type; > + emit(MOV(resolved, src)); > + > + return src_reg(resolved); > +} The FS passes a pointer to an fs_reg and then updates that. Let's do that here to be consistent. >>> >>> Oh... I was just trying to be consistent with other functions in >>> vec4_visitor such as fix_3src_operand() above. >> >> Heh, consistency is inconsistent! >> >> If changing the FS to be consistent with this seems better, that works for >> me. >> >> In that case, make the argument 'const src_reg &src' to avoid the copy. > > Mind if I do that as a follow-on? I'll figure out some way to make > these all consistent with each other. Changing the FS? Sure. Presumably you can make the argument a const ref without a follow-on. :) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Don't check for draw-time errors that cannot occur in core profile
On Mon, Aug 10, 2015 at 10:12 AM, Ian Romanick wrote: > From: Ian Romanick > > On many CPU-limited applications, this is *the* hot path. The idea is > to generate per-API versions of brw_draw_prims that elide some checks. > This patch removes render-mode and "is everything in VBOs" checks from > core-profile contexts. > > On my IVB laptop (which may have experienced thermal throttling): > > Gl32Batch7: 3.70955% +/- 1.11344% I'm getting 3.18414% +/- 0.587956% (n=113) on my IVB, , which probably matches your numbers depending on your value of n. > OglBatch7: 1.04398% +/- 0.772788% I'm getting 1.15377% +/- 1.05898% (n=34) on my IVB, which probably matches your numbers depending on your value of n. I'll review soon. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] i965/vec4_visitor: Make some function arguments const references
--- src/mesa/drivers/dri/i965/brw_vec4.h | 4 ++-- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 0c13d43..24bf354 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -319,13 +319,13 @@ public: void emit_scalar(ir_instruction *ir, enum prog_opcode op, dst_reg dst, src_reg src0, src_reg src1); - src_reg fix_3src_operand(src_reg src); + src_reg fix_3src_operand(const src_reg& src); src_reg resolve_source_modifiers(const src_reg& src); vec4_instruction *emit_math(enum opcode opcode, const dst_reg &dst, const src_reg &src0, const src_reg &src1 = src_reg()); - src_reg fix_math_operand(src_reg src); + src_reg fix_math_operand(const src_reg& src); void emit_pack_half_2x16(dst_reg dst, src_reg src0); void emit_unpack_half_2x16(dst_reg dst, src_reg src0); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index 57d98c9..0acc64a 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -287,7 +287,7 @@ vec4_visitor::emit_dp(dst_reg dst, src_reg src0, src_reg src1, unsigned elements } src_reg -vec4_visitor::fix_3src_operand(src_reg src) +vec4_visitor::fix_3src_operand(const src_reg& src) { /* Using vec4 uniforms in SIMD4x2 programs is difficult. You'd like to be * able to use vertical stride of zero to replicate the vec4 uniform, like @@ -326,7 +326,7 @@ vec4_visitor::resolve_source_modifiers(const src_reg& src) } src_reg -vec4_visitor::fix_math_operand(src_reg src) +vec4_visitor::fix_math_operand(const src_reg& src) { if (devinfo->gen < 6 || devinfo->gen >= 8 || src.file == BAD_FILE) return src; -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev