[Mesa-dev] [PATCH] i965: Go back to using the kernel SOL reset feature.
It turns out the MI_LOAD_REGISTER_IMM approach doesn't work on Haswell,
and regressed essentially all the transform feedback Piglit tests.

This morally reverts eaa6fbe6d54dc99efac4ab8e800edef65ce8220d. However,
the code is still simpler than it was. On BeginTransformFeedback, we
simply flush the batch and set the SOL reset flag so that the next batch
will start with zeroed offsets. There's still no software counting.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64887
Signed-off-by: Kenneth Graunke
Cc: Eric Anholt
---
 src/mesa/drivers/dri/i965/gen7_sol_state.c     | 10 ++--------
 src/mesa/drivers/dri/intel/intel_batchbuffer.c |  4 ++++
 src/mesa/drivers/dri/intel/intel_context.h     |  1 +
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index 8dfac01..9e5f5f7 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -260,14 +260,8 @@ gen7_begin_transform_feedback(struct gl_context *ctx, GLenum mode,
    struct brw_context *brw = brw_context(ctx);
    struct intel_context *intel = &brw->intel;
 
-   /* Reset the SOL buffer offset register. */
-   for (int i = 0; i < 4; i++) {
-      BEGIN_BATCH(3);
-      OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
-      OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
-      OUT_BATCH(0);
-      ADVANCE_BATCH();
-   }
+   intel_batchbuffer_flush(intel);
+   intel->batch.needs_sol_reset = true;
 }
 
 void
diff --git a/src/mesa/drivers/dri/intel/intel_batchbuffer.c b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
index c7f6d56..8c6524e 100644
--- a/src/mesa/drivers/dri/intel/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/intel/intel_batchbuffer.c
@@ -96,6 +96,7 @@ intel_batchbuffer_reset(struct intel_context *intel)
    intel->batch.reserved_space = BATCH_RESERVED;
    intel->batch.state_batch_offset = intel->batch.bo->size;
    intel->batch.used = 0;
+   intel->batch.needs_sol_reset = false;
 }
 
 void
@@ -198,6 +199,9 @@ do_flush_locked(struct intel_context *intel)
          flags = I915_EXEC_BLT;
       }
 
+      if (batch->needs_sol_reset)
+         flags |= I915_EXEC_GEN7_SOL_RESET;
+
       if (ret == 0) {
          if (unlikely(INTEL_DEBUG & DEBUG_AUB) && intel->vtbl.annotate_aub)
             intel->vtbl.annotate_aub(intel);
diff --git a/src/mesa/drivers/dri/intel/intel_context.h b/src/mesa/drivers/dri/intel/intel_context.h
index 8c50e6e..c0f07ff 100644
--- a/src/mesa/drivers/dri/intel/intel_context.h
+++ b/src/mesa/drivers/dri/intel/intel_context.h
@@ -135,6 +135,7 @@ struct intel_batchbuffer {
    uint32_t state_batch_offset;
    bool is_blit;
+   bool needs_sol_reset;
 
    struct {
       uint16_t used;
-- 
1.8.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
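A minimal sketch of the pattern the commit message describes, not the driver code itself: a per-batch flag is set when transform feedback begins, folded into the submit flags at flush time, and cleared when the batch is reset. The `toy_batch` type, the `toy_*` functions, and the `TOY_EXEC_SOL_RESET` value are hypothetical stand-ins for `struct intel_batchbuffer`, the functions touched by the diff, and `I915_EXEC_GEN7_SOL_RESET`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for I915_EXEC_GEN7_SOL_RESET; the value is made up. */
#define TOY_EXEC_SOL_RESET (1u << 8)

/* Hypothetical stand-in for struct intel_batchbuffer. */
struct toy_batch {
   unsigned used;         /* dwords emitted into the current batch */
   bool needs_sol_reset;  /* ask for zeroed SOL offsets on the next submit */
   uint32_t last_flags;   /* flags passed to the most recent submit */
};

/* Mirrors intel_batchbuffer_reset(): start a fresh batch, drop the request. */
static void toy_batch_reset(struct toy_batch *b)
{
   assert(b != NULL);
   b->used = 0;
   b->needs_sol_reset = false;
}

/* Mirrors do_flush_locked(): fold the request into the submit flags. */
static void toy_batch_flush(struct toy_batch *b)
{
   uint32_t flags = 0;

   if (b->needs_sol_reset)
      flags |= TOY_EXEC_SOL_RESET;

   b->last_flags = flags;   /* a real driver would submit the batch here */
   toy_batch_reset(b);
}

/* Mirrors gen7_begin_transform_feedback(): submit what is pending, then
 * mark the next batch so the offsets are reset before it runs. */
static void toy_begin_transform_feedback(struct toy_batch *b)
{
   toy_batch_flush(b);
   b->needs_sol_reset = true;
}
```

The point of the ordering is that work already queued is submitted without the flag, and only the batch that follows carries the reset request.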
[Mesa-dev] [Bug 64791] swrast crashes with compiz
https://bugs.freedesktop.org/show_bug.cgi?id=64791

Peter Åstrand changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |astr...@lysator.liu.se

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH] ilo: Initialize need_flush in draw_vbo.
On Thu, May 23, 2013 at 2:24 PM, Vinson Lee wrote:
> need_flush was uninitialized if hw3d->new_batch was true.
>
> Fixes "Uninitialized scalar variable" defect reported by Coverity.
>
> Signed-off-by: Vinson Lee

Applied, thanks. Now I wonder why gcc did not give me any warning...

> ---
>  src/gallium/drivers/ilo/ilo_3d.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/ilo/ilo_3d.c b/src/gallium/drivers/ilo/ilo_3d.c
> index 980bdb4..ba3fa96 100644
> --- a/src/gallium/drivers/ilo/ilo_3d.c
> +++ b/src/gallium/drivers/ilo/ilo_3d.c
> @@ -371,7 +371,7 @@ draw_vbo(struct ilo_3d *hw3d, const struct ilo_context *ilo,
>           const struct pipe_draw_info *info,
>           int *prim_generated, int *prim_emitted)
>  {
> -   bool need_flush;
> +   bool need_flush = false;
>     int max_len;
>
>     ilo_3d_own_render_ring(hw3d);
> --
> 1.8.2.1

-- 
o...@lunarg.com
Re: [Mesa-dev] Error compiling mesa 9.1.1
Bump .. could someone please help me with this? Possibly I am missing
something small but can't seem to figure out.

Any help is appreciated.
Divick

On Thu, May 16, 2013 at 12:58 PM, Divick Kishore wrote:
> Hi,
>     I am trying to compile mesa 9.1.1 on my debian wheezy machine but
> I get compilation error during the build.
>
> The configure options that I have used are:
>
> ../../configure --prefix=~/lib/mesa/dri_llvm --build=x86_64-linux-gnu
> --with-dri-drivers="swrast" --with-dri-driverdir=~/lib/mesa/dri_llvm/
> --with-dri-searchpath='~/lib/mesa/dri_llvm/' --enable-glx-tls
> --enable-xa --enable-driglx-direct --with-egl-platforms="x11"
> --enable-gallium-llvm=yes --with-gallium-drivers="swrast"
> --enable-gles1 --enable-gles2 --enable-gallium-egl --disable-glu
> CFLAGS="-Wall -O2" CXXFLAGS="-Wall -O2"
>
> And the compilation error that I see is:
>
> make[5]: Entering directory
> `/home/divick/work/work/mesa-9.1.1/build/dri/src/gallium/state_trackers/dri/drm'
> make[5]: *** No rule to make target `dri_context.lo', needed by
> `libdridrm.la'. Stop
>
>
> Using the same set of configure options I am able to build mesa 8.0.5.
> My objective is to build llvm based software only renderer.
>
> Could someone please help me build mesa 9.1.1?
Re: [Mesa-dev] [PATCH] tgsi: add buffer texture to tgsi_util_get_texture_coord_dim()
On 05/23/2013 12:41 AM, Chia-I Wu wrote:
> TGSI_TEXTURE_BUFFER is one-dimensional. Assert that exec_tex() is never
> called with TGSI_TEXTURE_BUFFER.
>
> Signed-off-by: Chia-I Wu
> ---
>  src/gallium/auxiliary/tgsi/tgsi_exec.c |    1 +
>  src/gallium/auxiliary/tgsi/tgsi_util.c |    2 ++
>  2 files changed, 3 insertions(+)
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
> index cb66a40..4482c6b 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
> @@ -1791,6 +1791,7 @@ exec_tex(struct tgsi_exec_machine *mach,
>        fetch_texel_offsets(mach, inst, offsets);
>
>     assert(modifier != TEX_MODIFIER_LEVEL_ZERO);
> +   assert(inst->Texture.Texture != TGSI_TEXTURE_BUFFER);
>
>     dim = tgsi_util_get_texture_coord_dim(inst->Texture.Texture, &shadow_ref);
>
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.c b/src/gallium/auxiliary/tgsi/tgsi_util.c
> index 862b79f..98c1e6e 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_util.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_util.c
> @@ -353,6 +353,7 @@ tgsi_util_get_texture_coord_dim(int tgsi_tex, int *shadow_or_sample)
>      * Depending on the texture target, (src0.xyzw, src1.x) is interpreted
>      * differently:
>      *
> +    * (s, X, X, X, X), for BUFFER
>      * (s, X, X, X, X), for 1D
>      * (s, t, X, X, X), for 2D, RECT
>      * (s, t, r, X, X), for 3D, CUBE
> @@ -373,6 +374,7 @@ tgsi_util_get_texture_coord_dim(int tgsi_tex, int *shadow_or_sample)
>      * (s, t, layer, sample, X), for 2D_ARRAY_MSAA
>      */
>     switch (tgsi_tex) {
> +   case TGSI_TEXTURE_BUFFER:
>     case TGSI_TEXTURE_1D:
>     case TGSI_TEXTURE_SHADOW1D:
>        dim = 1;

Reviewed-by: Brian Paul
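A minimal sketch of the fall-through idiom the second hunk relies on: the new buffer case shares the body of the 1D case, so both report one coordinate. The `toy_texture` enum and `toy_coord_dim()` are hypothetical and cover only a handful of targets; the real function in tgsi_util.c handles many more and also reports the shadow/sample slot.

```c
#include <assert.h>

/* Hypothetical subset of texture targets; real TGSI has many more. */
enum toy_texture {
   TOY_TEX_BUFFER,
   TOY_TEX_1D,
   TOY_TEX_2D,
   TOY_TEX_RECT,
   TOY_TEX_3D,
   TOY_TEX_CUBE,
   TOY_TEX_COUNT
};

/* Number of coordinate components consumed for a given target. */
static int toy_coord_dim(enum toy_texture tex)
{
   assert(tex < TOY_TEX_COUNT);

   switch (tex) {
   case TOY_TEX_BUFFER:   /* the case the patch adds: (s, X, X, X, X) */
   case TOY_TEX_1D:
      return 1;
   case TOY_TEX_2D:
   case TOY_TEX_RECT:
      return 2;
   case TOY_TEX_3D:
   case TOY_TEX_CUBE:
      return 3;
   default:
      return 0;
   }
}
```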
[Mesa-dev] sharing of context data for egl x11 backend
Hi,
   I have written a simple program which shares data like texture and
shaders across two different GLContexts but it doesn't seem to work.

Is sharing of texture and shaders supported in Mesa EGL backend?

I first bind the first context and do the drawing and then I bind the
second context to current thread and then do the drawing with shared
data.

I am using mesa version 8 with llvm softpipe renderer with EGL libs
compiled for X11 backend.

I would appreciate any help,

Regards,
Divick
Re: [Mesa-dev] Error compiling mesa 9.1.1
On Thu, May 23, 2013 at 3:39 AM, Divick Kishore wrote:
> Bump .. could someone please help me with this? Possibly I am missing
> something small but can't seem to figure out.
>
> Any help is appreciated.
> Divick
>
> On Thu, May 16, 2013 at 12:58 PM, Divick Kishore wrote:
>> Hi,
>>     I am trying to compile mesa 9.1.1 on my debian wheezy machine but
>> I get compilation error during the build.
>>
>> The configure options that I have used are:
>>
>> ../../configure --prefix=~/lib/mesa/dri_llvm --build=x86_64-linux-gnu
>> --with-dri-drivers="swrast" --with-dri-driverdir=~/lib/mesa/dri_llvm/
>> --with-dri-searchpath='~/lib/mesa/dri_llvm/' --enable-glx-tls
>> --enable-xa --enable-driglx-direct --with-egl-platforms="x11"
>> --enable-gallium-llvm=yes --with-gallium-drivers="swrast"
>> --enable-gles1 --enable-gles2 --enable-gallium-egl --disable-glu
>> CFLAGS="-Wall -O2" CXXFLAGS="-Wall -O2"
>>
>> And the compilation error that I see is:
>>
>> make[5]: Entering directory
>> `/home/divick/work/work/mesa-9.1.1/build/dri/src/gallium/state_trackers/dri/drm'
>> make[5]: *** No rule to make target `dri_context.lo', needed by
>> `libdridrm.la'. Stop
>>
>>
>> Using the same set of configure options I am able to build mesa 8.0.5.
>> My objective is to build llvm based software only renderer.
>>
>> Could someone please help me build mesa 9.1.1?

I think this is a build system bug caused by not building any classic
hardware DRI drivers. Try --with-dri-drivers=i965,swrast.
[Mesa-dev] [PATCH:mesa 1/2] integer overflow in XF86DRIOpenConnection() [CVE-2013-1993 1/2]
busIdStringLength is a CARD32 and needs to be bounds checked before adding
one to it to come up with the total size to allocate, to avoid integer
overflow leading to underallocation and writing data from the network past
the end of the allocated buffer.

Reported-by: Ilja Van Sprundel
Signed-off-by: Alan Coopersmith
---
 src/glx/XF86dri.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/glx/XF86dri.c b/src/glx/XF86dri.c
index b1cdc9b..8f53bd7 100644
--- a/src/glx/XF86dri.c
+++ b/src/glx/XF86dri.c
@@ -43,6 +43,7 @@ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 #include
 #include
 #include "xf86dristr.h"
+#include
 
 static XExtensionInfo _xf86dri_info_data;
 static XExtensionInfo *xf86dri_info = &_xf86dri_info_data;
@@ -201,7 +202,11 @@ XF86DRIOpenConnection(Display * dpy, int screen, drm_handle_t * hSAREA,
    }
 
    if (rep.length) {
-      if (!(*busIdString = calloc(rep.busIdStringLength + 1, 1))) {
+      if (rep.busIdStringLength < INT_MAX)
+         *busIdString = calloc(rep.busIdStringLength + 1, 1);
+      else
+         *busIdString = NULL;
+      if (*busIdString == NULL) {
          _XEatData(dpy, ((rep.busIdStringLength + 3) & ~3));
          UnlockDisplay(dpy);
          SyncHandle();
-- 
1.7.9.2
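A minimal sketch of the check described above, assuming the length arrives as a 32-bit unsigned value: the guard rejects oversized lengths before the "+ 1" is computed, so the allocation size cannot wrap. `toy_alloc_string()` and `toy_unchecked_size()` are hypothetical helpers written for illustration; they are not part of Mesa or Xlib.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* What the unchecked expression computes in 32-bit unsigned arithmetic. */
static uint32_t toy_unchecked_size(uint32_t wire_len)
{
   return wire_len + 1u;   /* wraps to 0 when wire_len == UINT32_MAX */
}

/* Allocate a zeroed buffer for a wire-supplied string length plus a NUL.
 * Returns NULL if the length is implausibly large or allocation fails. */
static char *toy_alloc_string(uint32_t wire_len)
{
   size_t size;

   if (wire_len >= (uint32_t)INT_MAX)   /* same bound as the patch */
      return NULL;

   size = (size_t)wire_len + 1;
   assert(size > wire_len);             /* cannot wrap past the guard */
   return calloc(size, 1);
}
```

With the guard in place a caller can treat a NULL return the same way for both "too large" and "out of memory", which is how the patched code falls through to its existing error path.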
[Mesa-dev] [PATCH:mesa 0/2] integer overflows in GLX DRI code [CVE-2013-1993]
The X.Org security team has been notified by a security researcher of bugs in
the protocol handling code across libX11 & many of its extension libraries.
These could be exploited in X clients that are setuid or otherwise running
with raised privileges, if a user could run them with their display set to
a Xserver they've modified to exploit them (perhaps a custom Xephyr or remote
Xorg).

More details about these issues can be found in our advisory posting at
http://www.x.org/wiki/Development/Security/Advisory-2013-05-23 .

One of the extensions affected is DRI, for which the code is not in a shared
libXdri, but copied into several locations, including Mesa's GLX library.
This series of patches corrects these bugs in Mesa's copy.

Alan Coopersmith (2):
  integer overflow in XF86DRIOpenConnection() [CVE-2013-1993 1/2]
  integer overflow in XF86DRIGetClientDriverName() [CVE-2013-1993 2/2]

 src/glx/XF86dri.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

-- 
1.7.9.2
[Mesa-dev] [PATCH:mesa 2/2] integer overflow in XF86DRIGetClientDriverName() [CVE-2013-1993 2/2]
clientDriverNameLength is a CARD32 and needs to be bounds checked before
adding one to it to come up with the total size to allocate, to avoid
integer overflow leading to underallocation and writing data from the
network past the end of the allocated buffer.

Reported-by: Ilja Van Sprundel
Signed-off-by: Alan Coopersmith
---
 src/glx/XF86dri.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/glx/XF86dri.c b/src/glx/XF86dri.c
index 8f53bd7..56e3557 100644
--- a/src/glx/XF86dri.c
+++ b/src/glx/XF86dri.c
@@ -305,9 +305,11 @@ XF86DRIGetClientDriverName(Display * dpy, int screen,
    *ddxDriverPatchVersion = rep.ddxDriverPatchVersion;
 
    if (rep.length) {
-      if (!
-          (*clientDriverName =
-           calloc(rep.clientDriverNameLength + 1, 1))) {
+      if (rep.clientDriverNameLength < INT_MAX)
+         *clientDriverName = calloc(rep.clientDriverNameLength + 1, 1);
+      else
+         *clientDriverName = NULL;
+      if (*clientDriverName == NULL) {
          _XEatData(dpy, ((rep.clientDriverNameLength + 3) & ~3));
          UnlockDisplay(dpy);
          SyncHandle();
-- 
1.7.9.2
Re: [Mesa-dev] [PATCH 01/12] intel: Conditionally compile mcs-related code for i965 only.
On 22 May 2013 12:18, Ian Romanick wrote:

> On 05/21/2013 04:52 PM, Paul Berry wrote:
>
>> This patch ifdefs out intel_mipmap_tree::mcs_mt when building the i915
>> (pre-Gen4) driver (MCS buffers aren't supported until Gen7, so there
>> is no need for this field in the i915 driver). This should make it a
>> bit easier to implement fast color clears without undue risk to i915.
>>
>
> We have a bunch of other fields like this (e.g., hiz_mt). Should we have
> done this with those fields, or is this case different?

Probably the only difference in this case is who is writing the patches :)
I'd be willing to write a follow-up patch that ifdefs out some of the other
i965-specific fields if there's interest.

>
>> ---
>>  src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 8 +++++++-
>>  src/mesa/drivers/dri/intel/intel_mipmap_tree.h | 2 ++
>>  2 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>> index 2dfa787..ad9e2b3 100644
>> --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>> +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
>> @@ -624,7 +624,9 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
>>        intel_region_release(&((*mt)->region));
>>        intel_miptree_release(&(*mt)->stencil_mt);
>>        intel_miptree_release(&(*mt)->hiz_mt);
>> +#ifndef I915
>>        intel_miptree_release(&(*mt)->mcs_mt);
>> +#endif
>>        intel_miptree_release(&(*mt)->singlesample_mt);
>>        intel_resolve_map_clear(&(*mt)->hiz_map);
>>
>> @@ -963,8 +965,11 @@ intel_miptree_alloc_mcs(struct intel_context *intel,
>>                          struct intel_mipmap_tree *mt,
>>                          GLuint num_samples)
>>  {
>> -   assert(mt->mcs_mt == NULL);
>>     assert(intel->gen >= 7); /* MCS only used on Gen7+ */
>> +#ifdef I915
>> +   return false;
>>
>
> return NULL;
>
> since this function returns 'struct intel_mipmap_tree*'.

Actually it returns bool. It only looks like it returns struct
intel_mipmap_tree* because it ends with "return mt->mcs_mt;", but that is
implicitly converted to bool.

>
>> +#else
>> +   assert(mt->mcs_mt == NULL);
>>
>>     /* Choose the correct format for the MCS buffer. All that really matters
>>      * is that we allocate the right buffer size, since we'll always be
>> @@ -1021,6 +1026,7 @@ intel_miptree_alloc_mcs(struct intel_context *intel,
>>     intel_miptree_unmap_raw(intel, mt->mcs_mt);
>>
>>     return mt->mcs_mt;
>> +#endif
>>  }
>>
>>  /**
>> diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h b/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
>> index b7376e0..0ec3c5e 100644
>> --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
>> +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.h
>> @@ -373,6 +373,7 @@ struct intel_mipmap_tree
>>      */
>>     struct intel_mipmap_tree *stencil_mt;
>>
>> +#ifndef I915
>>     /**
>>      * \brief MCS miptree for multisampled textures.
>>      *
>> @@ -381,6 +382,7 @@ struct intel_mipmap_tree
>>      * (INTEL_MSAA_FORMAT_CMS).
>>      */
>>     struct intel_mipmap_tree *mcs_mt;
>> +#endif
>>
>>     /* These are also refcounted:
>>      */
>>
>
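A minimal sketch of the conversion Paul describes, assuming a C99 compiler: a function declared to return bool may return a pointer expression, and C converts it by comparing against zero, so non-NULL becomes true and NULL becomes false. `struct toy_tree` and `toy_alloc_aux()` are hypothetical names; they only imitate the shape of `intel_miptree_alloc_mcs()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a miptree with one auxiliary tree. */
struct toy_tree {
   struct toy_tree *aux;
};

/* Declared to return bool, but the body returns a pointer expression.
 * The pointer is implicitly converted: non-NULL -> true, NULL -> false. */
static bool toy_alloc_aux(struct toy_tree *t, struct toy_tree *storage)
{
   assert(t != NULL);
   t->aux = storage;
   return t->aux;
}
```

This is why "return false;" and "return mt->mcs_mt;" can coexist in one function without a type error, even though the second form reads like a pointer return.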
[Mesa-dev] [Bug 64877] R600: OpenCL Regression since llvm commit b6379de427c009
https://bugs.freedesktop.org/show_bug.cgi?id=64877

--- Comment #6 from Tom Stellard ---
Created attachment 79716
  --> https://bugs.freedesktop.org/attachment.cgi?id=79716&action=edit
Possible fix

This patch should fix the bug.

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH:mesa 0/2] integer overflows in GLX DRI code [CVE-2013-1993]
On 05/23/2013 09:44 AM, Alan Coopersmith wrote:
> The X.Org security team has been notified by a security researcher of bugs in
> the protocol handling code across libX11 & many of its extension libraries.
> These could be exploited in X clients that are setuid or otherwise running
> with raised privileges, if a user could run them with their display set to
> a Xserver they've modified to exploit them (perhaps a custom Xephyr or remote
> Xorg).
>
> More details about these issues can be found in our advisory posting at
> http://www.x.org/wiki/Development/Security/Advisory-2013-05-23 .
>
> One of the extensions affected is DRI, for which the code is not in a shared
> libXdri, but copied into several locations, including Mesa's GLX library.
> This series of patches corrects these bugs in Mesa's copy.
>
> Alan Coopersmith (2):
>   integer overflow in XF86DRIOpenConnection() [CVE-2013-1993 1/2]
>   integer overflow in XF86DRIGetClientDriverName() [CVE-2013-1993 2/2]
>
>  src/glx/XF86dri.c | 15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)

Looks good to me, but a second set of eyes would be good.

One thing: these should probably be tagged with "NOTE: Candidate for the
stable branches".

Reviewed-by: Brian Paul
Re: [Mesa-dev] [PATCH 02/12] intel: Create intel_miptree_get_region() to prepare for fast color clear.
Paul Berry writes:

> With the advent of fast color clears, it will no longer be safe for
> the driver to access the data stored in a miptree with impunity. For
> example, sometimes a resolve will need to be performed first (to
> ensure that deferred writes due to a fast clear are performed before
> the buffer is accessed). Other times, fast clear will need to be
> disabled for the miptree, so that its contents can be safely shared
> with an entity that Mesa can't synchronize with easily.
>
> To prepare for that, this patch renames intel_mipmap_tree::region to
> intel_mipmap_tree::region_private and creates an accessor function,
> intel_miptree_get_region(). At the moment, the accessor function
> simply returns region_private. Later in the patch series, this
> function will be expanded to take appropriate actions to maintain the
> proper fast color clear state.
>
> As much as possible, I've tried to restrict the functions which
> directly access region_private to low-level miptree functions
> (e.g. miptree initialization functions), so that it will be easy to
> verify that those functions access the miptree contents safely.

I don't like this change. I think we should be explicitly resolving at
the right points, like in patch 10. In this patch, the places I see
that look like they could trigger a resolve from ACCESS_RENDER would all
break the GPU state, so you have to have things resolved before. This
means that these intel_miptree_get_region() functions just freak me out
when I see them in some code -- "oh crap, would we resolve here? that
would be bad... oh, looks like we prevent that over in this codepath
over here."

Once the places that should absolutely never resolve get removed,
there's hardly anything left in this patch. It also goes against the
work I've done to kill the region struct.
[Mesa-dev] [Bug 64877] R600: OpenCL Regression since llvm commit b6379de427c009
https://bugs.freedesktop.org/show_bug.cgi?id=64877

Aaron Watry changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Aaron Watry ---
This patch does in fact fix that test for me.

I've done a CL piglit run as well, and many of the other related failures
(abs_diff, add_sat, clamp, mad24, max, min, mul24, rotate) that I was having
have also been resolved.

Thank you very much Tom/Vincent for looking into the fix for this.

--Aaron

-- 
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [PATCH libclc] Add bitselect builtin
From: Tom Stellard

---
 generic/include/clc/clc.h                  | 1 +
 generic/include/clc/relational/bitselect.h | 1 +
 2 files changed, 2 insertions(+)
 create mode 100644 generic/include/clc/relational/bitselect.h

diff --git a/generic/include/clc/clc.h b/generic/include/clc/clc.h
index d2858a8..b53a217 100644
--- a/generic/include/clc/clc.h
+++ b/generic/include/clc/clc.h
@@ -80,6 +80,7 @@
 
 /* 6.11.6 Relational Functions */
 #include
+#include
 #include
 
 /* 6.11.8 Synchronization Functions */
diff --git a/generic/include/clc/relational/bitselect.h b/generic/include/clc/relational/bitselect.h
new file mode 100644
index 0000000..e91cbfd
--- /dev/null
+++ b/generic/include/clc/relational/bitselect.h
@@ -0,0 +1 @@
+#define bitselect(x, y, z) ((x) ^ ((z) & ((y) ^ (x))))
-- 
1.8.1.5
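A minimal sketch, in plain C rather than OpenCL C, of why the XOR form in the macro is a bit select: where a mask bit in z is 0 the result keeps the bit from x, and where it is 1 the expression reduces to x ^ y ^ x, which is the bit from y. The `toy_` names are hypothetical and exist only for this check.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as the macro in the patch, under a different name. */
#define toy_bitselect(x, y, z) ((x) ^ ((z) & ((y) ^ (x))))

/* Reference form: take y where the mask z is 1, x where it is 0. */
static uint32_t toy_bitselect_ref(uint32_t x, uint32_t y, uint32_t z)
{
   return (x & ~z) | (y & z);
}

/* Evaluate the XOR form and check it against the reference form. */
static uint32_t toy_bitselect_checked(uint32_t x, uint32_t y, uint32_t z)
{
   uint32_t r = toy_bitselect(x, y, z);

   assert(r == toy_bitselect_ref(x, y, z));
   return r;
}
```

The XOR form needs three operations and no inverted mask, which is presumably why it is attractive as a one-line macro; that motivation is an assumption, since the patch does not state one.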
Re: [Mesa-dev] [PATCH:mesa 1/2] integer overflow in XF86DRIOpenConnection() [CVE-2013-1993 1/2]
On 05/23/2013 08:44 AM, Alan Coopersmith wrote:
> busIdStringLength is a CARD32 and needs to be bounds checked before adding
> one to it to come up with the total size to allocate, to avoid integer
> overflow leading to underallocation and writing data from the network past
> the end of the allocated buffer.
>
> Reported-by: Ilja Van Sprundel
> Signed-off-by: Alan Coopersmith
> ---
>  src/glx/XF86dri.c |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/src/glx/XF86dri.c b/src/glx/XF86dri.c
> index b1cdc9b..8f53bd7 100644
> --- a/src/glx/XF86dri.c
> +++ b/src/glx/XF86dri.c
> @@ -43,6 +43,7 @@ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>  #include
>  #include
>  #include "xf86dristr.h"
> +#include
>
>  static XExtensionInfo _xf86dri_info_data;
>  static XExtensionInfo *xf86dri_info = &_xf86dri_info_data;
> @@ -201,7 +202,11 @@ XF86DRIOpenConnection(Display * dpy, int screen, drm_handle_t * hSAREA,
>     }
>
>     if (rep.length) {
> -      if (!(*busIdString = calloc(rep.busIdStringLength + 1, 1))) {
> +      if (rep.busIdStringLength < INT_MAX)
> +         *busIdString = calloc(rep.busIdStringLength + 1, 1);

But calloc takes size_t, and size_t is unsigned. That makes this look a
little weird. The problem isn't when rep.busIdStringLength is INT_MAX,
the problem occurs when it's UINT_MAX. Right? Even this is only a problem
because of calloc's zero size handling behavior:

    If nmemb or size is 0, then calloc() returns either NULL, or a unique
    pointer value that can later be successfully passed to free().

Good times.

> +      else
> +         *busIdString = NULL;
> +      if (*busIdString == NULL) {
>           _XEatData(dpy, ((rep.busIdStringLength + 3) & ~3));

Doesn't this have a similar overflow issue? If rep.busIdStringLength is
UINT_MAX-2, the result is 0.

>           UnlockDisplay(dpy);
>           SyncHandle();
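A minimal sketch of the arithmetic behind the second review comment, assuming the length is a 32-bit unsigned value: rounding up to a multiple of four with (len + 3) & ~3 wraps to 0 for the top three values, while the same rounding done in 64 bits does not. The `toy_pad4_*` helpers are hypothetical; the sketch shows only the arithmetic and does not settle whether `_XEatData()` is affected in practice.

```c
#include <assert.h>
#include <stdint.h>

/* Round a 32-bit length up to a multiple of 4, as in (len + 3) & ~3.
 * The addition is done in 32 bits, so it wraps for the top three values. */
static uint32_t toy_pad4_u32(uint32_t len)
{
   return (len + 3u) & ~3u;
}

/* Same rounding in 64 bits, where a 32-bit input cannot wrap. */
static uint64_t toy_pad4_u64(uint32_t len)
{
   uint64_t padded = ((uint64_t)len + 3u) & ~(uint64_t)3;

   assert(padded >= len);
   return padded;
}
```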
Re: [Mesa-dev] [PATCH 02/12] intel: Create intel_miptree_get_region() to prepare for fast color clear.
On 23 May 2013 09:57, Eric Anholt wrote:

> Paul Berry writes:
>
> > With the advent of fast color clears, it will no longer be safe for
> > the driver to access the data stored in a miptree with impunity. For
> > example, sometimes a resolve will need to be performed first (to
> > ensure that deferred writes due to a fast clear are performed before
> > the buffer is accessed). Other times, fast clear will need to be
> > disabled for the miptree, so that its contents can be safely shared
> > with an entity that Mesa can't synchronize with easily.
> >
> > To prepare for that, this patch renames intel_mipmap_tree::region to
> > intel_mipmap_tree::region_private and creates an accessor function,
> > intel_miptree_get_region(). At the moment, the accessor function
> > simply returns region_private. Later in the patch series, this
> > function will be expanded to take appropriate actions to maintain the
> > proper fast color clear state.
> >
> > As much as possible, I've tried to restrict the functions which
> > directly access region_private to low-level miptree functions
> > (e.g. miptree initialization functions), so that it will be easy to
> > verify that those functions access the miptree contents safely.
>
> I don't like this change. I think we should be explicitly resolving at
> the right points, like in patch 10. In this patch, the places I see
> that look like they could trigger a resolve from ACCESS_RENDER would all
> break the GPU state, so you have to have things resolved before. This
> means that these intel_miptree_get_region() functions just freak me out
> when I see them in some code -- "oh crap, would we resolve here? that
> would be bad... oh, looks like we prevent that over in this codepath
> over here."
>
> Once the places that should absolutely never resolve get removed,
> there's hardly anything left in this patch. It also goes against the
> work I've done to kill the region struct.
>

We (me, Ken, Eric, and Chad) just had an in-person discussion about this,
and came up with a new plan:

1. Eric will send out some patches that funnel all of the blitting
   operations* through a new intel_miptree_blit() function (whose arguments
   are miptrees rather than regions)

2. Once those land, I'll rework my series so that it does the resolve (and
   other state updating) inside intel_miptree_blit(). Hopefully that will
   allow us to drop this patch.

*Technically there is one blitting operation that can't go through
intel_miptree_blit() (intelEmitImmediateColorExpandBlit(), which is used to
accelerate glBitmap()). We'll put a resolve hook in there as a special case.
Re: [Mesa-dev] [PATCH 02/12] intel: Create intel_miptree_get_region() to prepare for fast color clear.
On 22 May 2013 12:21, Ian Romanick wrote:

> On 05/21/2013 04:52 PM, Paul Berry wrote:
>
>> diff --git a/src/mesa/drivers/dri/i915/i830_vtbl.c b/src/mesa/drivers/dri/i915/i830_vtbl.c
>> index b30f45e..a35f58b 100644
>> --- a/src/mesa/drivers/dri/i915/i830_vtbl.c
>> +++ b/src/mesa/drivers/dri/i915/i830_vtbl.c
>> @@ -760,7 +760,13 @@ i830_update_draw_buffer(struct intel_context *intel)
>>
>>        for (i = 0; i < fb->_NumColorDrawBuffers; i++) {
>>           irb = intel_renderbuffer(fb->_ColorDrawBuffers[i]);
>> -         colorRegions[i] = (irb && irb->mt) ? irb->mt->region : NULL;
>> +         if (irb && irb->mt) {
>> +            colorRegions[i] =
>> +               intel_miptree_get_region(intel, irb->mt,
>> +                                        INTEL_MIPTREE_ACCESS_RENDER);
>> +         } else {
>> +            colorRegions[i] = NULL;
>> +         }
>>
>
> This idiom appears several places. Would it be better to pass irb and let
> intel_miptree_get_region return NULL? Or have a wrapper function? Just
> thinking out loud...
>

I'm not convinced it would be a benefit. But considering the discussion I
just had with Eric, Ken, and Chad (see my other reply with the same subject
line from a few minutes ago), the question is probably moot.
[Mesa-dev] [RFC 0/2] freedreno: adding adreno a3xx support
From: Rob Clark Currently, es2gears, ioquake, xonotic, compiz, etc. work. The shader compiler is quite sub-optimal, but despite that most things seem to be ~2-3x faster compared (and at higher resolution) with the a320 on my nexus4 compared to a220 on my HP touchpad. Since the patches will probably bounce due to size, you can find them on my github tree: https://github.com/freedreno/mesa/tree/a3xx-rfc git://github.com/freedreno/mesa.git a3xx-rfc The first patch is mostly just shuffling things around. The second patch is what actually adds a3xx support. Rob Clark (2): RFC: freedreno: prepare for a3xx RFC: freedreno: add a3xx support configure.ac |2 + src/gallium/drivers/freedreno/Makefile.am | 20 +- src/gallium/drivers/freedreno/a2xx.xml.h | 1473 src/gallium/drivers/freedreno/a2xx/Makefile.am | 28 + src/gallium/drivers/freedreno/a2xx/a2xx.xml.h | 1465 src/gallium/drivers/freedreno/a2xx/disasm-a2xx.c | 632 +++ src/gallium/drivers/freedreno/a2xx/fd2_blend.c | 86 + src/gallium/drivers/freedreno/a2xx/fd2_blend.h | 51 + src/gallium/drivers/freedreno/a2xx/fd2_compiler.c | 1191 + src/gallium/drivers/freedreno/a2xx/fd2_compiler.h | 38 + src/gallium/drivers/freedreno/a2xx/fd2_context.c | 101 ++ src/gallium/drivers/freedreno/a2xx/fd2_context.h | 52 + src/gallium/drivers/freedreno/a2xx/fd2_draw.c | 294 src/gallium/drivers/freedreno/a2xx/fd2_draw.h | 38 + src/gallium/drivers/freedreno/a2xx/fd2_emit.c | 443 + src/gallium/drivers/freedreno/a2xx/fd2_emit.h | 48 + src/gallium/drivers/freedreno/a2xx/fd2_gmem.c | 393 + src/gallium/drivers/freedreno/a2xx/fd2_gmem.h | 36 + src/gallium/drivers/freedreno/a2xx/fd2_program.c | 506 ++ src/gallium/drivers/freedreno/a2xx/fd2_program.h | 82 + .../drivers/freedreno/a2xx/fd2_rasterizer.c| 113 ++ .../drivers/freedreno/a2xx/fd2_rasterizer.h| 55 + src/gallium/drivers/freedreno/a2xx/fd2_screen.c| 109 ++ src/gallium/drivers/freedreno/a2xx/fd2_screen.h| 36 + src/gallium/drivers/freedreno/a2xx/fd2_texture.c | 158 ++ 
src/gallium/drivers/freedreno/a2xx/fd2_texture.h | 69 + src/gallium/drivers/freedreno/a2xx/fd2_util.c | 322 src/gallium/drivers/freedreno/a2xx/fd2_util.h | 47 + src/gallium/drivers/freedreno/a2xx/fd2_zsa.c | 96 ++ src/gallium/drivers/freedreno/a2xx/fd2_zsa.h | 56 + src/gallium/drivers/freedreno/a2xx/instr-a2xx.h| 389 + src/gallium/drivers/freedreno/a2xx/ir-a2xx.c | 635 +++ src/gallium/drivers/freedreno/a2xx/ir-a2xx.h | 180 ++ src/gallium/drivers/freedreno/a3xx/Makefile.am | 28 + src/gallium/drivers/freedreno/a3xx/a3xx.xml.h | 1761 src/gallium/drivers/freedreno/a3xx/disasm-a3xx.c | 946 +++ src/gallium/drivers/freedreno/a3xx/fd3_blend.c | 87 + src/gallium/drivers/freedreno/a3xx/fd3_blend.h | 52 + src/gallium/drivers/freedreno/a3xx/fd3_compiler.c | 998 +++ src/gallium/drivers/freedreno/a3xx/fd3_compiler.h | 38 + src/gallium/drivers/freedreno/a3xx/fd3_context.c | 118 ++ src/gallium/drivers/freedreno/a3xx/fd3_context.h | 68 + src/gallium/drivers/freedreno/a3xx/fd3_draw.c | 229 +++ src/gallium/drivers/freedreno/a3xx/fd3_draw.h | 38 + src/gallium/drivers/freedreno/a3xx/fd3_emit.c | 582 +++ src/gallium/drivers/freedreno/a3xx/fd3_emit.h | 62 + src/gallium/drivers/freedreno/a3xx/fd3_gmem.c | 395 + src/gallium/drivers/freedreno/a3xx/fd3_gmem.h | 36 + src/gallium/drivers/freedreno/a3xx/fd3_program.c | 637 +++ src/gallium/drivers/freedreno/a3xx/fd3_program.h | 111 ++ .../drivers/freedreno/a3xx/fd3_rasterizer.c| 92 + .../drivers/freedreno/a3xx/fd3_rasterizer.h| 56 + src/gallium/drivers/freedreno/a3xx/fd3_screen.c| 103 ++ src/gallium/drivers/freedreno/a3xx/fd3_screen.h| 36 + src/gallium/drivers/freedreno/a3xx/fd3_texture.c | 140 ++ src/gallium/drivers/freedreno/a3xx/fd3_texture.h | 68 + src/gallium/drivers/freedreno/a3xx/fd3_util.c | 292 src/gallium/drivers/freedreno/a3xx/fd3_util.h | 54 + src/gallium/drivers/freedreno/a3xx/fd3_zsa.c | 100 ++ src/gallium/drivers/freedreno/a3xx/fd3_zsa.h | 56 + src/gallium/drivers/freedreno/a3xx/instr-a3xx.h| 523 ++ 
src/gallium/drivers/freedreno/a3xx/ir-a3xx.c | 525 ++ src/gallium/drivers/freedreno/a3xx/ir-a3xx.h | 190 +++ src/gallium/drivers/freedreno/adreno_common.xml.h | 11 +- src/gallium/drivers/freedreno/adreno_pm4.xml.h | 97 +- src/gallium/drivers/freedreno/disasm.c | 632 --- src/gallium/drivers/freedreno/disasm.h |5 +- src/gallium/drivers/freedreno/freedreno_blend.c| 175 -- src/gallium/drivers/fr
Re: [Mesa-dev] [PATCH 06/12] i965/gen7+: Implement fast color clear operation in BLORP.
On 22 May 2013 12:30, Ian Romanick wrote:

> On 05/21/2013 04:52 PM, Paul Berry wrote:
>
>> Since we defer allocation of the MCS miptree until the time of the
>> fast clear operation, this patch also implements creation of the MCS
>> miptree.
>>
>> In addition, this patch adds the field
>> intel_mipmap_tree::fast_clear_color_value, which holds the most recent
>> fast color clear value, if any. We use it to set the SURFACE_STATE's
>> clear color for render targets.
>> ---
>> src/mesa/drivers/dri/i965/brw_blorp.cpp | 1 +
>> src/mesa/drivers/dri/i965/brw_blorp.h | 11 +-
>> src/mesa/drivers/dri/i965/brw_blorp_clear.cpp | 143 +-
>> src/mesa/drivers/dri/i965/brw_clear.c | 2 +-
>> src/mesa/drivers/dri/i965/brw_defines.h | 2 +
>> src/mesa/drivers/dri/i965/gen7_blorp.cpp | 18 ++-
>> src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 10 +-
>> src/mesa/drivers/dri/intel/intel_mipmap_tree.c | 47 +++
>> src/mesa/drivers/dri/intel/intel_mipmap_tree.h | 13 ++
>> 9 files changed, 233 insertions(+), 14 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp b/src/mesa/drivers/dri/i965/brw_blorp.cpp
>> index 20f7153..c6019d1 100644
>> --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
>> @@ -147,6 +147,7 @@ brw_blorp_params::brw_blorp_params()
>>       y1(0),
>>       depth_format(0),
>>       hiz_op(GEN6_HIZ_OP_NONE),
>> +     fast_clear_op(GEN7_FAST_CLEAR_OP_NONE),
>>       num_samples(0),
>>       use_wm_prog(false)
>>  {
>> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h b/src/mesa/drivers/dri/i965/brw_blorp.h
>> index 6360a62..687d7eb 100644
>> --- a/src/mesa/drivers/dri/i965/brw_blorp.h
>> +++ b/src/mesa/drivers/dri/i965/brw_blorp.h
>> @@ -46,7 +46,8 @@ brw_blorp_blit_miptrees(struct intel_context *intel,
>>                          bool mirror_x, bool mirror_y);
>>
>>  bool
>> -brw_blorp_clear_color(struct intel_context *intel, struct gl_framebuffer *fb);
>> +brw_blorp_clear_color(struct intel_context *intel, struct gl_framebuffer *fb,
>> +                      bool partial_clear);
>>
>>  #ifdef __cplusplus
>>  } /* end extern "C" */
>> @@ -195,6 +196,13 @@ struct brw_blorp_prog_data
>>     bool persample_msaa_dispatch;
>>  };
>>
>> +
>> +enum gen7_fast_clear_op {
>> +   GEN7_FAST_CLEAR_OP_NONE,
>> +   GEN7_FAST_CLEAR_OP_FAST_CLEAR,
>> +};
>> +
>> +
>>  class brw_blorp_params
>>  {
>>  public:
>> @@ -212,6 +220,7 @@ public:
>>     brw_blorp_surface_info src;
>>     brw_blorp_surface_info dst;
>>     enum gen6_hiz_op hiz_op;
>> +   enum gen7_fast_clear_op fast_clear_op;
>>     unsigned num_samples;
>>     bool use_wm_prog;
>>     brw_blorp_wm_push_constants wm_push_consts;
>> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
>> index 28d7ad0..675289b 100644
>> --- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
>> @@ -49,7 +49,8 @@ public:
>>     brw_blorp_clear_params(struct brw_context *brw,
>>                            struct gl_framebuffer *fb,
>>                            struct gl_renderbuffer *rb,
>> -                          GLubyte *color_mask);
>> +                          GLubyte *color_mask,
>> +                          bool partial_clear);
>>
>>     virtual uint32_t get_wm_prog(struct brw_context *brw,
>>                                  brw_blorp_prog_data **prog_data) const;
>> @@ -105,10 +106,49 @@ brw_blorp_clear_program::~brw_blorp_clear_program()
>>     ralloc_free(mem_ctx);
>>  }
>>
>> +
>> +/**
>> + * Determine if fast color clear supports the given clear color.
>> + *
>> + * Fast color clear can only clear to color values of 1.0 or 0.0. At the
>> + * moment we only support floating point buffers.
>> + */
>> +static bool
>> +is_color_fast_clear_compatible(gl_format format,
>> +                               const union gl_color_union *color)
>> +{
>> +   if (_mesa_is_format_integer_color(format))
>> +      return false;
>> +
>> +   for (int i = 0; i < 4; i++) {
>> +      if (color->f[i] != 0.0 && color->f[i] != 1.0)
>> +         return false;
>
> Should this generate a perf debug message? Eric may have an opinion about
> generating warnings for the non-fast path...

Sounds reasonable to me. We already have perf debug messages for other things that can inhibit fast clears (e.g. scissor preventing fast depth clear). I'll add it unless I hear an objection.

>> +   }
>> +   return true;
>> +}
>> +
>> +
>> +/**
>> + * Convert the given color to a bitfield suitable for ORing into DWORD 7 of
>> + * SURFACE_STATE.
>> + */
>> +static uint32_t
>> +compute_fast_clear_color_bits(const union gl_color_union *color)
>> +{
>> +   uint32_t bits = 0;
>> +   for (int i = 0; i < 4; i++) {
>> +      if (co
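As an illustration of the review exchange above, here is a minimal sketch of the compatibility check with a message emitted on the slow path. It is not the driver code: `sketch_color`, `sketch_fast_clear_compatible`, and the use of `fprintf` in place of the driver's perf-debug facility are all assumptions made for this example.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for Mesa's gl_color_union; only the float view
 * is modelled here. */
union sketch_color {
   float f[4];
};

/* Returns true when the format is not an integer format and every
 * channel is exactly 0.0 or 1.0.  On the slow path it prints a message,
 * which stands in for a perf-debug warning (an assumption). */
static bool
sketch_fast_clear_compatible(bool format_is_integer,
                             const union sketch_color *color)
{
   if (format_is_integer) {
      fprintf(stderr, "perf: integer format, using slow clear\n");
      return false;
   }

   for (int i = 0; i < 4; i++) {
      if (color->f[i] != 0.0f && color->f[i] != 1.0f) {
         fprintf(stderr, "perf: clear color channel %d is %f, "
                 "using slow clear\n", i, color->f[i]);
         return false;
      }
   }
   return true;
}
```

With this shape, a clear to opaque black takes the fast path, while a mid-grey clear reports which channel ruled it out.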
Re: [Mesa-dev] [PATCH 06/12] i965/gen7+: Implement fast color clear operation in BLORP.
On 22 May 2013 16:32, Eric Anholt wrote:

> Paul Berry writes:
> > diff --git a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
> > index 9d1b91a..657532f 100644
> > --- a/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
> > +++ b/src/mesa/drivers/dri/intel/intel_mipmap_tree.c
> > @@ -1163,6 +1163,53 @@ intel_miptree_alloc_mcs(struct intel_context *intel,
> >  #endif
> >  }
> >
> > +
> > +bool
> > +intel_miptree_alloc_non_msrt_mcs(struct intel_context *intel,
> > +                                 struct intel_mipmap_tree *mt)
> > +{
> > +#ifdef I915
> > +   assert(!"MCS not supported on i915");
> > +#else
>
> New build warning in i915:
>
> intel_mipmap_tree.c: In function 'intel_miptree_alloc_non_msrt_mcs':
> intel_mipmap_tree.c:1230:1: warning: control reaches end of non-void function [-Wreturn-type]

Whoops, you're right. I didn't notice because I was doing a debug build. Thanks.
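The warning above presumably only shows up in non-debug builds because `assert()` expands to nothing when `NDEBUG` is defined, leaving the i915 branch with no return statement. The sketch below shows one possible way to silence it. The names `SKETCH_ASSERT`, `SKETCH_I915`, and `sketch_alloc_non_msrt_mcs` are hypothetical, and this is not necessarily the fix that was applied.

```c
#include <stdbool.h>

/* Mimics what assert() expands to when NDEBUG is defined: nothing. */
#define SKETCH_ASSERT(cond) ((void) 0)

/* Pretend this is the i915 build, where the feature is unavailable. */
#define SKETCH_I915 1

static bool
sketch_alloc_non_msrt_mcs(void)
{
#if SKETCH_I915
   SKETCH_ASSERT(!"MCS not supported on i915");
   /* Explicit return: without it, a release build falls off the end of
    * a non-void function and gcc reports -Wreturn-type. */
   return false;
#else
   return true;   /* placeholder for the real allocation path */
#endif
}
```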
Re: [Mesa-dev] [PATCH 01/12] intel: Conditionally compile mcs-related code for i965 only.
On 05/23/2013 09:33 AM, Paul Berry wrote:
> On 22 May 2013 12:18, Ian Romanick wrote:
>> On 05/21/2013 04:52 PM, Paul Berry wrote:
>>> This patch ifdefs out intel_mipmap_tree::mcs_mt when building the i915
>>> (pre-Gen4) driver (MCS buffers aren't supported until Gen7, so there
>>> is no need for this field in the i915 driver). This should make it a
>>> bit easier to implement fast color clears without undue risk to i915.
>>
>> We have a bunch of other fields like this (e.g., hiz_mt). Should we
>> have done this with those fields, or is this case different?
>
> Probably the only difference in this case is who is writing the
> patches :) I'd be willing to write a follow-up patch that ifdefs out
> some of the other i965-specific fields if there's interest.

For the record, I'm not opposed to any patches that #ifdef out hiz_mt.
Re: [Mesa-dev] [PATCH:mesa 1/2] integer overflow in XF86DRIOpenConnection() [CVE-2013-1993 1/2]
On 05/23/13 11:07 AM, Ian Romanick wrote:
> On 05/23/2013 08:44 AM, Alan Coopersmith wrote:
>>      if (rep.length) {
>> -        if (!(*busIdString = calloc(rep.busIdStringLength + 1, 1))) {
>> +        if (rep.busIdStringLength < INT_MAX)
>> +            *busIdString = calloc(rep.busIdStringLength + 1, 1);
>
> But calloc takes size_t, and size_t is unsigned. That makes this look
> a little weird. The problem isn't when rep.busIdStringLength is
> INT_MAX, the problem occurs when it's UINT_MAX. Right?

Right - UINT_MAX would cause overflow, but we used INT_MAX in most of the checks for these in the X libraries to avoid issues if other parts of the code try to treat it as signed, and really, once your string length hits 2 GB you're in ludicrous territory anyway, and best not to waste time trying to map all those pages.

>> +        else
>> +            *busIdString = NULL;
>> +        if (*busIdString == NULL) {
>>              _XEatData(dpy, ((rep.busIdStringLength + 3) & ~3));
>
> Doesn't this have a similar overflow issue? If rep.busIdStringLength
> is UINT_MAX-2, the result is 0.

Yes - in that case though you'll just not throw away enough data, and on modern systems, trigger an xcb assertion that there's unread data left in the response. In the X libraries I added a new _XEatDataWords API that takes the packet length from req.length and applies overflow checks in one place instead of every caller, but I didn't take the time to figure out how to add new autoconf checks to Mesa to do that here.

--
-Alan Coopersmith- alan.coopersm...@oracle.com
Oracle Solaris Engineering - http://blogs.oracle.com/alanc
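The two wrap-around cases discussed above can be reproduced with 32-bit unsigned arithmetic. The following is only a sketch of the arithmetic: `sketch_padded_len` and `sketch_alloc_string` are hypothetical helpers, not functions from Mesa or Xlib.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Rounds a byte count up to a multiple of 4, the way the reply padding
 * is computed above.  Wraps to 0 for lengths above UINT32_MAX - 3. */
static uint32_t
sketch_padded_len(uint32_t len)
{
   return (len + 3u) & ~3u;
}

/* Allocates len + 1 zeroed bytes, refusing lengths of INT_MAX or more
 * so that the "+ 1" can never wrap.  Returns NULL on refusal. */
static char *
sketch_alloc_string(uint32_t len)
{
   if (len >= (uint32_t) INT_MAX)
      return NULL;
   return calloc((size_t) len + 1, 1);
}
```

A length of `UINT32_MAX - 2` pads to 0, which is the case Ian raises, while the `INT_MAX` bound rejects any length that could make `len + 1` wrap.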
Re: [Mesa-dev] [RFC 0/2] freedreno: adding adreno a3xx support
On Thu, May 23, 2013 at 11:48 AM, Rob Clark wrote:
> From: Rob Clark
>
> Currently, es2gears, ioquake, xonotic, compiz, etc. work. The
> shader compiler is quite sub-optimal, but despite that most things
> seem to be ~2-3x faster compared (and at higher resolution) with
> the a320 on my nexus4 compared to a220 on my HP touchpad.
>
> Since the patches will probably bounce due to size, you can find
> them on my github tree:
>
> https://github.com/freedreno/mesa/tree/a3xx-rfc
> git://github.com/freedreno/mesa.git a3xx-rfc
>
> The first patch is mostly just shuffling things around. The second
> patch is what actually adds a3xx support.
>
> Rob Clark (2):
> RFC: freedreno: prepare for a3xx
> RFC: freedreno: add a3xx support

Might want to fix the subjects to not include the 'RFC: ' before pushing.
Re: [Mesa-dev] [RFC 0/2] freedreno: adding adreno a3xx support
On Thu, May 23, 2013 at 11:48 AM, Rob Clark wrote:
> From: Rob Clark
>
> Currently, es2gears, ioquake, xonotic, compiz, etc. work. The
> shader compiler is quite sub-optimal, but despite that most things
> seem to be ~2-3x faster compared (and at higher resolution) with
> the a320 on my nexus4 compared to a220 on my HP touchpad.
>
> Since the patches will probably bounce due to size, you can find
> them on my github tree:
>
> https://github.com/freedreno/mesa/tree/a3xx-rfc
> git://github.com/freedreno/mesa.git a3xx-rfc
>
> The first patch is mostly just shuffling things around. The second
> patch is what actually adds a3xx support.

In patch 1, PIC_FLAGS is still dead. Should remove it from the current freedreno Makefile.am as well.
Re: [Mesa-dev] [RFC 0/2] freedreno: adding adreno a3xx support
On Thu, May 23, 2013 at 4:21 PM, Matt Turner wrote:
> On Thu, May 23, 2013 at 11:48 AM, Rob Clark wrote:
>> From: Rob Clark
>>
>> Currently, es2gears, ioquake, xonotic, compiz, etc. work. The
>> shader compiler is quite sub-optimal, but despite that most things
>> seem to be ~2-3x faster compared (and at higher resolution) with
>> the a320 on my nexus4 compared to a220 on my HP touchpad.
>>
>> Since the patches will probably bounce due to size, you can find
>> them on my github tree:
>>
>> https://github.com/freedreno/mesa/tree/a3xx-rfc
>> git://github.com/freedreno/mesa.git a3xx-rfc
>>
>> The first patch is mostly just shuffling things around. The second
>> patch is what actually adds a3xx support.
>>
>> Rob Clark (2):
>> RFC: freedreno: prepare for a3xx
>> RFC: freedreno: add a3xx support
>
> Might want to fix the subjects to not include the 'RFC: ' before pushing.

Yup.. not quite ready to push yet, still a couple things to debug and working out the XA support for xf86-video-freedreno (since newer devices drop the 2d core again).. but I figured it was far enough along to start getting review comments.

re: PIC_FLAGS, feel free to push a commit to remove that if you want, or otherwise I can. That is an unrelated change anyways, and no need to wait for $(PIC_FLAGS) isn't used for anything anymore

BR,
-R
[Mesa-dev] [PATCH 0/4] Multiple viewports in Gallium
This series adds support for multiple viewports/scissors to gallium and implements it in llvmpipe. All the other drivers still support just a single viewport/scissor combo and their behavior should be exactly the same as it was. Zack Rusin (4): gallium: Add support for multiple viewports draw: implement support for multiple viewports util/blitter: make sure the blitter can restore all viewports llvmpipe: implement support for multiple viewports src/gallium/auxiliary/cso_cache/cso_context.c | 37 ++- src/gallium/auxiliary/cso_cache/cso_context.h |9 ++-- src/gallium/auxiliary/draw/draw_cliptest_tmp.h | 10 +++- src/gallium/auxiliary/draw/draw_context.c | 50 +++- src/gallium/auxiliary/draw/draw_context.h |5 +- src/gallium/auxiliary/draw/draw_gs.c | 11 - src/gallium/auxiliary/draw/draw_gs.h |1 + src/gallium/auxiliary/draw/draw_pipe_clip.c| 11 - src/gallium/auxiliary/draw/draw_private.h |9 ++-- .../draw/draw_pt_fetch_shade_pipeline_llvm.c |4 +- src/gallium/auxiliary/draw/draw_vs.c |7 --- src/gallium/auxiliary/draw/draw_vs_variant.c | 33 +++-- src/gallium/auxiliary/hud/hud_context.c|6 +-- src/gallium/auxiliary/postprocess/pp_run.c |6 +-- src/gallium/auxiliary/tgsi/tgsi_scan.c |6 +++ src/gallium/auxiliary/tgsi/tgsi_scan.h |1 + src/gallium/auxiliary/tgsi/tgsi_strings.c |3 +- src/gallium/auxiliary/util/u_blit.c| 12 ++--- src/gallium/auxiliary/util/u_blitter.c | 10 ++-- src/gallium/auxiliary/util/u_blitter.h | 24 ++ src/gallium/auxiliary/util/u_gen_mipmap.c |6 +-- src/gallium/auxiliary/vl/vl_compositor.c |4 +- src/gallium/auxiliary/vl/vl_idct.c |4 +- src/gallium/auxiliary/vl/vl_matrix_filter.c|2 +- src/gallium/auxiliary/vl/vl_mc.c |2 +- src/gallium/auxiliary/vl/vl_median_filter.c|2 +- src/gallium/auxiliary/vl/vl_zscan.c|2 +- src/gallium/docs/source/context.rst|8 ++-- src/gallium/drivers/freedreno/freedreno_resource.c |4 +- src/gallium/drivers/freedreno/freedreno_state.c| 10 ++-- src/gallium/drivers/galahad/glhd_context.c | 16 --- src/gallium/drivers/i915/i915_state.c | 12 
+++-- src/gallium/drivers/i915/i915_surface.c|4 +- src/gallium/drivers/identity/id_context.c | 22 + src/gallium/drivers/ilo/ilo_blit.c |2 +- src/gallium/drivers/ilo/ilo_state.c| 14 +++--- src/gallium/drivers/llvmpipe/lp_context.h |9 +++- src/gallium/drivers/llvmpipe/lp_screen.c |2 + src/gallium/drivers/llvmpipe/lp_setup.c| 34 - src/gallium/drivers/llvmpipe/lp_setup.h|5 +- src/gallium/drivers/llvmpipe/lp_setup_context.h|9 ++-- src/gallium/drivers/llvmpipe/lp_setup_line.c | 12 +++-- src/gallium/drivers/llvmpipe/lp_setup_point.c | 12 +++-- src/gallium/drivers/llvmpipe/lp_setup_tri.c| 17 +-- src/gallium/drivers/llvmpipe/lp_state_clip.c | 24 ++ src/gallium/drivers/llvmpipe/lp_state_derived.c| 15 +- src/gallium/drivers/llvmpipe/lp_surface.c |4 +- src/gallium/drivers/noop/noop_state.c | 14 +++--- src/gallium/drivers/nv30/nv30_draw.c |2 +- src/gallium/drivers/nv30/nv30_miptree.c|4 +- src/gallium/drivers/nv30/nv30_state.c | 14 +++--- src/gallium/drivers/nv50/nv50_state.c | 16 --- src/gallium/drivers/nvc0/nvc0_state.c | 14 +++--- src/gallium/drivers/r300/r300_blit.c |4 +- src/gallium/drivers/r300/r300_context.c|2 +- src/gallium/drivers/r300/r300_state.c | 16 --- src/gallium/drivers/r600/evergreen_state.c |5 +- src/gallium/drivers/r600/r600_blit.c |4 +- src/gallium/drivers/r600/r600_state.c |7 +-- src/gallium/drivers/r600/r600_state_common.c |9 ++-- src/gallium/drivers/radeonsi/r600_blit.c |2 +- src/gallium/drivers/radeonsi/si_state.c| 14 +++--- src/gallium/drivers/rbug/rbug_context.c| 22 + src/gallium/drivers/softpipe/sp_screen.c |2 + src/gallium/drivers/softpipe/sp_state_clip.c | 16 --- src/gallium/drivers/softpipe/sp_surface.c |4 +- src/gallium/drivers/svga/svga_pipe_blit.c |4 +- src/gallium/drivers/svga/svga_pipe_misc.c | 18 +++ src/gallium/drivers/svga/svga_swtnl_state.c|2 +- src/gallium/drivers/trace/tr_context.c | 28 ++- src/gallium/inc
[Mesa-dev] [PATCH 2/4] draw: implement support for multiple viewports
This adds support for multiple viewports to the draw module. Multiple viewports depend on the presence of geometry shaders which can write the viewport index. Signed-off-by: Zack Rusin --- src/gallium/auxiliary/draw/draw_cliptest_tmp.h | 10 - src/gallium/auxiliary/draw/draw_context.c | 44 +++- src/gallium/auxiliary/draw/draw_gs.c | 11 - src/gallium/auxiliary/draw/draw_gs.h |1 + src/gallium/auxiliary/draw/draw_pipe_clip.c| 11 - src/gallium/auxiliary/draw/draw_private.h |9 ++-- .../draw/draw_pt_fetch_shade_pipeline_llvm.c |4 +- src/gallium/auxiliary/draw/draw_vs.c |7 src/gallium/auxiliary/draw/draw_vs_variant.c | 33 +-- 9 files changed, 96 insertions(+), 34 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_cliptest_tmp.h b/src/gallium/auxiliary/draw/draw_cliptest_tmp.h index 48f2349..09e1fd7 100644 --- a/src/gallium/auxiliary/draw/draw_cliptest_tmp.h +++ b/src/gallium/auxiliary/draw/draw_cliptest_tmp.h @@ -31,8 +31,6 @@ static boolean TAG(do_cliptest)( struct pt_post_vs *pvs, struct draw_vertex_info *info ) { struct vertex_header *out = info->verts; - const float *scale = pvs->draw->viewport.scale; - const float *trans = pvs->draw->viewport.translate; /* const */ float (*plane)[4] = pvs->draw->plane; const unsigned pos = draw_current_shader_position_output(pvs->draw); const unsigned cv = draw_current_shader_clipvertex_output(pvs->draw); @@ -44,6 +42,9 @@ static boolean TAG(do_cliptest)( struct pt_post_vs *pvs, unsigned j; unsigned i; bool have_cd = false; + unsigned viewport_index_output = + draw_current_shader_viewport_index_output(pvs->draw); + cd[0] = draw_current_shader_clipdistance_output(pvs->draw, 0); cd[1] = draw_current_shader_clipdistance_output(pvs->draw, 1); @@ -52,7 +53,12 @@ static boolean TAG(do_cliptest)( struct pt_post_vs *pvs, for (j = 0; j < info->count; j++) { float *position = out->data[pos]; + int viewport_index = + draw_current_shader_uses_viewport_index(pvs->draw) ? 
+ *((unsigned*)out->data[viewport_index_output]): 0; unsigned mask = 0x0; + const float *scale = pvs->draw->viewports[viewport_index].scale; + const float *trans = pvs->draw->viewports[viewport_index].translate; initialize_vertex_header(out); diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c index ed642c5..46cf165 100644 --- a/src/gallium/auxiliary/draw/draw_context.c +++ b/src/gallium/auxiliary/draw/draw_context.c @@ -317,17 +317,18 @@ void draw_set_viewport_states( struct draw_context *draw, { const struct pipe_viewport_state *viewport = vps; draw_do_flush(draw, DRAW_FLUSH_PARAMETER_CHANGE); - draw->viewport = *viewport; /* struct copy */ - draw->identity_viewport = (viewport->scale[0] == 1.0f && - viewport->scale[1] == 1.0f && - viewport->scale[2] == 1.0f && - viewport->scale[3] == 1.0f && - viewport->translate[0] == 0.0f && - viewport->translate[1] == 0.0f && - viewport->translate[2] == 0.0f && - viewport->translate[3] == 0.0f); - - draw_vs_set_viewport( draw, viewport ); + draw->num_viewports = num_viewports; + memcpy(draw->viewports, vps, + sizeof(struct pipe_viewport_state) * num_viewports); + draw->identity_viewport = (num_viewports == 1) && + (viewport->scale[0] == 1.0f && + viewport->scale[1] == 1.0f && + viewport->scale[2] == 1.0f && + viewport->scale[3] == 1.0f && + viewport->translate[0] == 0.0f && + viewport->translate[1] == 0.0f && + viewport->translate[2] == 0.0f && + viewport->translate[3] == 0.0f); } @@ -694,6 +695,27 @@ draw_current_shader_position_output(const struct draw_context *draw) /** * Return the index of the shader output which will contain the + * viewport index]. 
+ */ +uint +draw_current_shader_viewport_index_output(const struct draw_context *draw) +{ + if (draw->gs.geometry_shader) + return draw->gs.geometry_shader->viewport_index_output; + return 0; +} + +boolean +draw_current_shader_uses_viewport_index(const struct draw_context *draw) +{ + if (draw->gs.geometry_shader) + return draw->gs.geometry_shader->info.writes_viewport_index; + return FALSE; +} + + +/** + * Return the index of the shader output which will contain the * vertex position. */ uint diff --git a/src/gallium/auxiliary/draw/draw_gs.c b/src/gallium/auxiliary/draw/draw_gs.c index fa0981e..67e5117 100644 --- a/src/gallium/auxiliary/draw/draw_gs.c +++ b/src/gallium/auxiliary/draw/draw_gs.c @@ -335,8 +335,13 @@ llvm_fetch_gs_outputs(struct draw_g
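The per-vertex viewport selection in the clip test above can be sketched in isolation as follows. This is a simplified illustration, not the draw module code: the `sketch_` names are hypothetical, the perspective divide is omitted, and the fallback to viewport 0 for an out-of-range index is an assumption of this sketch rather than something the patch does.

```c
#include <assert.h>

#define SKETCH_MAX_VIEWPORTS 16

struct sketch_viewport {
   float scale[4];
   float translate[4];
};

/* Applies the viewport transform of the selected viewport to a position
 * in place.  Out-of-range indices fall back to viewport 0 (assumption). */
static void
sketch_viewport_transform(const struct sketch_viewport *viewports,
                          unsigned num_viewports,
                          unsigned viewport_index,
                          float position[4])
{
   assert(num_viewports >= 1 && num_viewports <= SKETCH_MAX_VIEWPORTS);
   if (viewport_index >= num_viewports)
      viewport_index = 0;

   const float *scale = viewports[viewport_index].scale;
   const float *trans = viewports[viewport_index].translate;

   /* x, y and z are scaled and translated; w is left untouched. */
   for (int i = 0; i < 3; i++)
      position[i] = position[i] * scale[i] + trans[i];
}
```

The point of the change is visible in the lookup: scale and translate are fetched per vertex from the indexed viewport instead of once from a single `draw->viewport`.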
[Mesa-dev] [PATCH 3/4] util/blitter: make sure the blitter can restore all viewports
in case a driver supports multiple viewports the blitter needs to be able to restore all of them and not just the first one. Signed-off-by: Zack Rusin --- src/gallium/auxiliary/util/u_blitter.c |6 +++-- src/gallium/auxiliary/util/u_blitter.h | 24 +--- src/gallium/drivers/freedreno/freedreno_resource.c |4 ++-- src/gallium/drivers/i915/i915_surface.c|4 ++-- src/gallium/drivers/ilo/ilo_blit.c |2 +- src/gallium/drivers/llvmpipe/lp_surface.c |4 ++-- src/gallium/drivers/nv30/nv30_miptree.c|4 ++-- src/gallium/drivers/r300/r300_blit.c |4 ++-- src/gallium/drivers/r600/r600_blit.c |4 ++-- src/gallium/drivers/radeonsi/r600_blit.c |2 +- src/gallium/drivers/softpipe/sp_surface.c |4 ++-- src/gallium/drivers/svga/svga_pipe_blit.c |4 ++-- 12 files changed, 38 insertions(+), 28 deletions(-) diff --git a/src/gallium/auxiliary/util/u_blitter.c b/src/gallium/auxiliary/util/u_blitter.c index e985376..8274b79 100644 --- a/src/gallium/auxiliary/util/u_blitter.c +++ b/src/gallium/auxiliary/util/u_blitter.c @@ -504,7 +504,8 @@ static void blitter_restore_fragment_states(struct blitter_context_priv *ctx) /* XXX check whether these are saved and whether they need to be restored * (depending on the operation) */ pipe->set_stencil_ref(pipe, &ctx->base.saved_stencil_ref); - pipe->set_viewport_states(pipe, 1, &ctx->base.saved_viewport); + pipe->set_viewport_states(pipe, ctx->base.num_saved_viewports, + ctx->base.saved_viewports); } static void blitter_check_saved_fb_state(struct blitter_context_priv *ctx) @@ -1496,7 +1497,8 @@ void util_blitter_blit_generic(struct blitter_context *blitter, blitter_restore_textures(ctx); blitter_restore_fb_state(ctx); if (scissor) { - pipe->set_scissor_states(pipe, 1, &ctx->base.saved_scissor); + pipe->set_scissor_states(pipe, ctx->base.num_saved_scissors, + ctx->base.saved_scissors); } blitter_restore_render_cond(ctx); blitter_unset_running_flag(ctx); diff --git a/src/gallium/auxiliary/util/u_blitter.h b/src/gallium/auxiliary/util/u_blitter.h index 
1901584..5d4296a 100644 --- a/src/gallium/auxiliary/util/u_blitter.h +++ b/src/gallium/auxiliary/util/u_blitter.h @@ -106,8 +106,10 @@ struct blitter_context struct pipe_framebuffer_state saved_fb_state; /**< framebuffer state */ struct pipe_stencil_ref saved_stencil_ref; /**< stencil ref */ - struct pipe_viewport_state saved_viewport; - struct pipe_scissor_state saved_scissor; + struct pipe_viewport_state saved_viewports[PIPE_MAX_VIEWPORTS]; + unsigned num_saved_viewports; + struct pipe_scissor_state saved_scissors[PIPE_MAX_VIEWPORTS]; + unsigned num_saved_scissors; boolean is_sample_mask_saved; unsigned saved_sample_mask; @@ -442,17 +444,23 @@ void util_blitter_save_framebuffer(struct blitter_context *blitter, } static INLINE -void util_blitter_save_viewport(struct blitter_context *blitter, -struct pipe_viewport_state *state) +void util_blitter_save_viewports(struct blitter_context *blitter, + unsigned num_viewports, + struct pipe_viewport_state *states) { - blitter->saved_viewport = *state; + blitter->num_saved_viewports = num_viewports; + memcpy(blitter->saved_viewports, states, + sizeof(struct pipe_viewport_state) * num_viewports); } static INLINE -void util_blitter_save_scissor(struct blitter_context *blitter, - struct pipe_scissor_state *state) +void util_blitter_save_scissors(struct blitter_context *blitter, + unsigned num_scissors, + struct pipe_scissor_state *states) { - blitter->saved_scissor = *state; + blitter->num_saved_scissors = num_scissors; + memcpy(blitter->saved_scissors, states, + sizeof(struct pipe_viewport_state) * num_scissors); } static INLINE diff --git a/src/gallium/drivers/freedreno/freedreno_resource.c b/src/gallium/drivers/freedreno/freedreno_resource.c index 00f3db8..7a0c68d 100644 --- a/src/gallium/drivers/freedreno/freedreno_resource.c +++ b/src/gallium/drivers/freedreno/freedreno_resource.c @@ -254,8 +254,8 @@ fd_blit(struct pipe_context *pctx, const struct pipe_blit_info *blit_info) util_blitter_save_vertex_elements(ctx->blitter, 
ctx->vtx); util_blitter_save_vertex_shader(ctx->blitter, ctx->prog.vp); util_blitter_save_rasterizer(ctx->blitter, ctx->rasterizer); - util_blitter_save_viewport(ctx->blitter, &ctx->viewport); - util_blitter_save_scissor(ctx->blitter, &ctx->scissor); + util_blitter_save_viewports(ctx->blitter, 1, &ctx->viewport); + util_blitter_save_scissors(ctx->blitter, 1, &ctx->sciss
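One detail worth a second look in the patch above: util_blitter_save_scissors() sizes its memcpy with sizeof(struct pipe_viewport_state), which appears to be a copy-and-paste slip, since the array being filled holds scissor states. A minimal sketch of a save helper that avoids the mismatch by taking the size from the destination element is shown below. The `sketch_` types are hypothetical stand-ins for the gallium structures, and the bounds assert is an addition of this sketch.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_VIEWPORTS 16

struct sketch_scissor {
   unsigned short minx, miny, maxx, maxy;
};

struct sketch_blitter {
   struct sketch_scissor saved_scissors[SKETCH_MAX_VIEWPORTS];
   unsigned num_saved_scissors;
};

static void
sketch_save_scissors(struct sketch_blitter *blitter,
                     unsigned num_scissors,
                     const struct sketch_scissor *states)
{
   assert(num_scissors <= SKETCH_MAX_VIEWPORTS);
   blitter->num_saved_scissors = num_scissors;
   /* The element size comes from the destination array, so it cannot
    * drift out of sync with the type being copied. */
   memcpy(blitter->saved_scissors, states,
          sizeof(blitter->saved_scissors[0]) * num_scissors);
}
```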
[Mesa-dev] [PATCH 4/4] llvmpipe: implement support for multiple viewports
Largely related to making sure the rasterizer can correctly pick out the correct scissor box for the current viewport. Signed-off-by: Zack Rusin --- src/gallium/drivers/llvmpipe/lp_context.h |9 -- src/gallium/drivers/llvmpipe/lp_screen.c|2 +- src/gallium/drivers/llvmpipe/lp_setup.c | 34 +++ src/gallium/drivers/llvmpipe/lp_setup.h |5 ++-- src/gallium/drivers/llvmpipe/lp_setup_context.h |9 -- src/gallium/drivers/llvmpipe/lp_setup_line.c| 12 ++-- src/gallium/drivers/llvmpipe/lp_setup_point.c | 12 +--- src/gallium/drivers/llvmpipe/lp_setup_tri.c | 17 src/gallium/drivers/llvmpipe/lp_state_clip.c|8 -- src/gallium/drivers/llvmpipe/lp_state_derived.c | 15 +- src/gallium/drivers/llvmpipe/lp_surface.c |4 +-- 11 files changed, 91 insertions(+), 36 deletions(-) diff --git a/src/gallium/drivers/llvmpipe/lp_context.h b/src/gallium/drivers/llvmpipe/lp_context.h index d605dba..444c768 100644 --- a/src/gallium/drivers/llvmpipe/lp_context.h +++ b/src/gallium/drivers/llvmpipe/lp_context.h @@ -75,10 +75,12 @@ struct llvmpipe_context { struct pipe_constant_buffer constants[PIPE_SHADER_TYPES][LP_MAX_TGSI_CONST_BUFFERS]; struct pipe_framebuffer_state framebuffer; struct pipe_poly_stipple poly_stipple; - struct pipe_scissor_state scissor; + struct pipe_scissor_state scissors[PIPE_MAX_VIEWPORTS]; + unsigned num_scissors; struct pipe_sampler_view *sampler_views[PIPE_SHADER_TYPES][PIPE_MAX_SHADER_SAMPLER_VIEWS]; - struct pipe_viewport_state viewport; + struct pipe_viewport_state viewports[PIPE_MAX_VIEWPORTS]; + unsigned num_viewports; struct pipe_vertex_buffer vertex_buffer[PIPE_MAX_ATTRIBS]; struct pipe_index_buffer index_buffer; struct pipe_resource *mapped_vs_tex[PIPE_MAX_SHADER_SAMPLER_VIEWS]; @@ -116,6 +118,9 @@ struct llvmpipe_context { /** Which vertex shader output slot contains point size */ int psize_slot; + /** Which vertex shader output slot contains viewport index */ + int viewport_index_slot; + /**< minimum resolvable depth value, for polygon offset */ double mrd; diff --git 
a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c index 712b7c6..9c4de72 100644 --- a/src/gallium/drivers/llvmpipe/lp_screen.c +++ b/src/gallium/drivers/llvmpipe/lp_screen.c @@ -231,7 +231,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param) case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER: return 0; case PIPE_CAP_MULTIPLE_VIEWPORTS: - return 0; + return 1; } /* should only get here on unhandled cases */ debug_printf("Unexpected PIPE_CAP %d query\n", param); diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c b/src/gallium/drivers/llvmpipe/lp_setup.c index 9fef34e..caa168d 100644 --- a/src/gallium/drivers/llvmpipe/lp_setup.c +++ b/src/gallium/drivers/llvmpipe/lp_setup.c @@ -616,17 +616,23 @@ lp_setup_set_blend_color( struct lp_setup_context *setup, void -lp_setup_set_scissor( struct lp_setup_context *setup, - const struct pipe_scissor_state *scissor ) +lp_setup_set_scissors( struct lp_setup_context *setup, + unsigned num_scissors, + const struct pipe_scissor_state *scissors ) { + unsigned i; LP_DBG(DEBUG_SETUP, "%s\n", __FUNCTION__); - assert(scissor); + assert(scissors); + + setup->num_scissors = num_scissors; - setup->scissor.x0 = scissor->minx; - setup->scissor.x1 = scissor->maxx-1; - setup->scissor.y0 = scissor->miny; - setup->scissor.y1 = scissor->maxy-1; + for (i = 0; i < num_scissors; ++i) { + setup->scissors[i].x0 = scissors[i].minx; + setup->scissors[i].x1 = scissors[i].maxx-1; + setup->scissors[i].y0 = scissors[i].miny; + setup->scissors[i].y1 = scissors[i].maxy-1; + } setup->dirty |= LP_SETUP_NEW_SCISSOR; } @@ -1012,10 +1018,15 @@ try_update_scene_state( struct lp_setup_context *setup ) } if (setup->dirty & LP_SETUP_NEW_SCISSOR) { - setup->draw_region = setup->framebuffer; - if (setup->scissor_test) { - u_rect_possible_intersection(&setup->scissor, - &setup->draw_region); + unsigned i; + /* we always need at least one draw region */ + setup->draw_regions[0] = setup->framebuffer; + for (i = 0; i < 
setup->num_scissors; ++i) { + setup->draw_regions[i] = setup->framebuffer; + if (setup->scissor_test) { +u_rect_possible_intersection(&setup->scissors[i], + &setup->draw_regions[i]); + } } /* If the framebuffer is large we have to think about fixed-point * integer overflow. For 2K by 2K images, coordinates need 15 bits @@ -1061,6 +1072,7 @@ lp_setup_update_state( struct lp_setup_context *setup, * to know abo
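The scissor handling in lp_setup.c above does two things per viewport slot: it converts the exclusive max coordinates of the scissor into an inclusive rectangle, and it intersects that rectangle with the framebuffer to get a draw region. A self-contained sketch of those two steps follows. The `sketch_` types and the intersection helper are written for this example and are not the u_rect helpers used by llvmpipe.

```c
#include <stdbool.h>

struct sketch_scissor { int minx, miny, maxx, maxy; };  /* max is exclusive */
struct sketch_rect    { int x0, y0, x1, y1; };          /* all inclusive */

/* Converts a scissor with exclusive max edges to an inclusive rect. */
static struct sketch_rect
sketch_scissor_to_rect(const struct sketch_scissor *s)
{
   struct sketch_rect r = { s->minx, s->miny, s->maxx - 1, s->maxy - 1 };
   return r;
}

/* Shrinks *region to its overlap with *clip.  Returns false when the
 * overlap is empty, in which case *region is left with x0 > x1 or
 * y0 > y1. */
static bool
sketch_intersect(const struct sketch_rect *clip, struct sketch_rect *region)
{
   if (clip->x0 > region->x0) region->x0 = clip->x0;
   if (clip->y0 > region->y0) region->y0 = clip->y0;
   if (clip->x1 < region->x1) region->x1 = clip->x1;
   if (clip->y1 < region->y1) region->y1 = clip->y1;
   return region->x0 <= region->x1 && region->y0 <= region->y1;
}
```

Running this once per scissor slot, each time starting from a fresh copy of the framebuffer rectangle, gives the per-viewport draw regions that the rasterizer then indexes by viewport.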
Re: [Mesa-dev] [PATCH 1/4] gallium: Add support for multiple viewports
On 23.05.2013 22:33, Zack Rusin wrote:
> Gallium supported only a single viewport/scissor combination. This
> commit changes the interface to allow us to add support for multiple
> viewports/scissors.
>
> Signed-off-by: Zack Rusin
> ---
> diff --git a/src/gallium/include/pipe/p_context.h b/src/gallium/include/pipe/p_context.h
> index d1130bc..eaaa043 100644
> --- a/src/gallium/include/pipe/p_context.h
> +++ b/src/gallium/include/pipe/p_context.h
> @@ -211,11 +211,13 @@ struct pipe_context {
>     void (*set_polygon_stipple)( struct pipe_context *,
>                                  const struct pipe_poly_stipple * );
>
> -   void (*set_scissor_state)( struct pipe_context *,
> -                              const struct pipe_scissor_state * );
> +   void (*set_scissor_states)( struct pipe_context *,
> +                               unsigned num_scissors,
> +                               const struct pipe_scissor_state * );
>
> -   void (*set_viewport_state)( struct pipe_context *,
> -                               const struct pipe_viewport_state * );
> +   void (*set_viewport_states)( struct pipe_context *,
> +                                unsigned num_viewports,
> +                                const struct pipe_viewport_state *);
>
>     void (*set_fragment_sampler_views)(struct pipe_context *,
>                                        unsigned num_views,
> diff --git a/src/gallium/include/pipe/p_defines.h b/src/gallium/include/pipe/p_defines.h
> index bb86968..00f0a37 100644
> --- a/src/gallium/include/pipe/p_defines.h
> +++ b/src/gallium/include/pipe/p_defines.h
> @@ -507,7 +507,8 @@ enum pipe_cap {
>     PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER = 80,
>     PIPE_CAP_QUERY_PIPELINE_STATISTICS = 81,
>     PIPE_CAP_TEXTURE_BORDER_COLOR_QUIRK = 82,
> -   PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE = 83
> +   PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE = 83,
> +   PIPE_CAP_MULTIPLE_VIEWPORTS = 84

Would it be better if this were PIPE_CAP_MAX_VIEWPORTS instead? Though I guess there's no real need right now to support anything but 16 (as that's needed by d3d10/11, and is the minimum supported value for GL, though GL would allow for more), so I don't have a strong opinion on that. Also please document that CAP bit.

>  };
>
>  #define PIPE_QUIRK_TEXTURE_BORDER_COLOR_SWIZZLE_NV50 (1 << 0)
> diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
> index 50de2d3..b33cf1d 100644
> --- a/src/gallium/include/pipe/p_shader_tokens.h
> +++ b/src/gallium/include/pipe/p_shader_tokens.h
> @@ -164,7 +164,8 @@ struct tgsi_declaration_interp
>  #define TGSI_SEMANTIC_THREAD_ID 18 /**< block-relative id of the current thread */
>  #define TGSI_SEMANTIC_TEXCOORD 19 /**< texture or sprite coordinates */
>  #define TGSI_SEMANTIC_PCOORD 20 /**< point sprite coordinate */
> -#define TGSI_SEMANTIC_COUNT 21 /**< number of semantic values */
> +#define TGSI_SEMANTIC_VIEWPORT_INDEX 21 /**< viewport index */
> +#define TGSI_SEMANTIC_COUNT 22 /**< number of semantic values */
>
>  struct tgsi_declaration_semantic
>  {
> diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h
> index 262078d..ff0aac7 100644
> --- a/src/gallium/include/pipe/p_state.h
> +++ b/src/gallium/include/pipe/p_state.h
> @@ -65,6 +65,7 @@ extern "C" {
>  #define PIPE_MAX_TEXTURE_LEVELS 16
>  #define PIPE_MAX_SO_BUFFERS 4
>  #define PIPE_MAX_SO_OUTPUTS 64
> +#define PIPE_MAX_VIEWPORTS 16
>
>
>  struct pipe_reference

Otherwise looks good to me.

Roland
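To make the PIPE_CAP_MAX_VIEWPORTS suggestion above concrete, here is a rough sketch of a capability query that reports a count rather than a yes/no flag. Everything in it is hypothetical: `sketch_cap`, `SKETCH_CAP_MAX_VIEWPORTS`, and `sketch_get_param` are illustration names only and do not describe the interface that was eventually merged.

```c
/* Hypothetical capability enum with a single count-valued entry. */
enum sketch_cap {
   SKETCH_CAP_MAX_VIEWPORTS
};

#define SKETCH_MAX_VIEWPORTS 16

/* Returns the number of viewports a driver exposes: the full set for a
 * driver that implements multiple viewports, otherwise just one. */
static int
sketch_get_param(int driver_has_multiple_viewports, enum sketch_cap cap)
{
   switch (cap) {
   case SKETCH_CAP_MAX_VIEWPORTS:
      return driver_has_multiple_viewports ? SKETCH_MAX_VIEWPORTS : 1;
   }
   return 0;
}
```

A count-valued cap lets a state tracker treat "1" as the single-viewport case without needing a separate boolean.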
Re: [Mesa-dev] [PATCH 1/4] gallium: Add support for multiple viewports
1) I prefer this interface instead: void (*set_scissor_states)( struct pipe_context *, unsigned start_slot, unsigned count, const struct pipe_scissor_state * ); void (*set_viewport_states)( struct pipe_context *, unsigned start_slot, unsigned count, const struct pipe_viewport_state *); Both functions should allow updating only a subset of all viewports and scissors (from start_slot to start_slot+count-1). This is especially important for meta ops (u_gen_mipmap, etc.), which need to update only the first viewport (and no scissor), leaving the other viewports unchanged. This idea is not new: the vertex buffer and compute sampler functions have the start_slot parameter too. 2) What does cso_context need to keep a copy of all viewports for? All meta ops need only one viewport, just as they need only one vertex buffer and one constant buffer (and cso_context doesn't really allow meta ops to use more than that). For example, see what the cso_context interface for saving and restoring the constant buffer slot 0 looks like. It's preferable to use the same mechanism unless there is a need to have the save and restore functionality for all slots. Marek On Thu, May 23, 2013 at 10:33 PM, Zack Rusin wrote: > Gallium supported only a single viewport/scissor combination. This > commit changes the interface to allow us to add support for multiple > viewports/scissors. > > Signed-off-by: Zack Rusin
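The start_slot/count semantics proposed in this message can be sketched in plain C. This is only an illustration under stated assumptions: `toy_viewport_state`, `toy_context` and `toy_set_viewport_states` are hypothetical stand-ins, not Gallium code; only the parameter order follows the proposal, and the slot limit mirrors PIPE_MAX_VIEWPORTS from the patch.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_VIEWPORTS 16  /* mirrors PIPE_MAX_VIEWPORTS from the patch */

/* Hypothetical, simplified stand-in for struct pipe_viewport_state. */
struct toy_viewport_state {
   float scale[4];
   float translate[4];
};

/* Hypothetical driver context that owns every viewport slot. */
struct toy_context {
   struct toy_viewport_state viewports[TOY_MAX_VIEWPORTS];
};

/* Update only slots [start_slot, start_slot + count - 1].
 * Every other slot keeps whatever it held before the call. */
void
toy_set_viewport_states(struct toy_context *ctx,
                        unsigned start_slot, unsigned count,
                        const struct toy_viewport_state *states)
{
   assert(states);
   assert(start_slot + count <= TOY_MAX_VIEWPORTS);
   memcpy(&ctx->viewports[start_slot], states, sizeof(*states) * count);
}
```

With this shape a meta op can pass start_slot 0 and count 1 and leave slots 1 to 15 untouched, which is the property the message asks for.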
Re: [Mesa-dev] [PATCH 4/4] llvmpipe: implement support for multiple viewports
On 23.05.2013 22:33, Zack Rusin wrote: > Largely related to making sure the rasterizer can correctly > pick out the correct scissor box for the current viewport. > > Signed-off-by: Zack Rusin > --- > src/gallium/drivers/llvmpipe/lp_context.h |9 -- > src/gallium/drivers/llvmpipe/lp_screen.c|2 +- > src/gallium/drivers/llvmpipe/lp_setup.c | 34 > +++ > src/gallium/drivers/llvmpipe/lp_setup.h |5 ++-- > src/gallium/drivers/llvmpipe/lp_setup_context.h |9 -- > src/gallium/drivers/llvmpipe/lp_setup_line.c| 12 ++-- > src/gallium/drivers/llvmpipe/lp_setup_point.c | 12 +--- > src/gallium/drivers/llvmpipe/lp_setup_tri.c | 17 > src/gallium/drivers/llvmpipe/lp_state_clip.c|8 -- > src/gallium/drivers/llvmpipe/lp_state_derived.c | 15 +- > src/gallium/drivers/llvmpipe/lp_surface.c |4 +-- > 11 files changed, 91 insertions(+), 36 deletions(-) > > diff --git a/src/gallium/drivers/llvmpipe/lp_context.h > b/src/gallium/drivers/llvmpipe/lp_context.h > index d605dba..444c768 100644 > --- a/src/gallium/drivers/llvmpipe/lp_context.h > +++ b/src/gallium/drivers/llvmpipe/lp_context.h > @@ -75,10 +75,12 @@ struct llvmpipe_context { > struct pipe_constant_buffer > constants[PIPE_SHADER_TYPES][LP_MAX_TGSI_CONST_BUFFERS]; > struct pipe_framebuffer_state framebuffer; > struct pipe_poly_stipple poly_stipple; > - struct pipe_scissor_state scissor; > + struct pipe_scissor_state scissors[PIPE_MAX_VIEWPORTS]; > + unsigned num_scissors; > struct pipe_sampler_view > *sampler_views[PIPE_SHADER_TYPES][PIPE_MAX_SHADER_SAMPLER_VIEWS]; > > - struct pipe_viewport_state viewport; > + struct pipe_viewport_state viewports[PIPE_MAX_VIEWPORTS]; > + unsigned num_viewports; > struct pipe_vertex_buffer vertex_buffer[PIPE_MAX_ATTRIBS]; > struct pipe_index_buffer index_buffer; > struct pipe_resource *mapped_vs_tex[PIPE_MAX_SHADER_SAMPLER_VIEWS]; > @@ -116,6 +118,9 @@ struct llvmpipe_context { > /** Which vertex shader output slot contains point size */ > int psize_slot; > > + /** Which vertex shader output slot 
contains viewport index */ > + int viewport_index_slot; > + > /**< minimum resolvable depth value, for polygon offset */ > double mrd; > > diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c > b/src/gallium/drivers/llvmpipe/lp_screen.c > index 712b7c6..9c4de72 100644 > --- a/src/gallium/drivers/llvmpipe/lp_screen.c > +++ b/src/gallium/drivers/llvmpipe/lp_screen.c > @@ -231,7 +231,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum > pipe_cap param) > case PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER: >return 0; > case PIPE_CAP_MULTIPLE_VIEWPORTS: > - return 0; > + return 1; > } > /* should only get here on unhandled cases */ > debug_printf("Unexpected PIPE_CAP %d query\n", param); > diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c > b/src/gallium/drivers/llvmpipe/lp_setup.c > index 9fef34e..caa168d 100644 > --- a/src/gallium/drivers/llvmpipe/lp_setup.c > +++ b/src/gallium/drivers/llvmpipe/lp_setup.c > @@ -616,17 +616,23 @@ lp_setup_set_blend_color( struct lp_setup_context > *setup, > > > void > -lp_setup_set_scissor( struct lp_setup_context *setup, > - const struct pipe_scissor_state *scissor ) > +lp_setup_set_scissors( struct lp_setup_context *setup, > + unsigned num_scissors, > + const struct pipe_scissor_state *scissors ) > { > + unsigned i; > LP_DBG(DEBUG_SETUP, "%s\n", __FUNCTION__); > > - assert(scissor); > + assert(scissors); > + > + setup->num_scissors = num_scissors; > > - setup->scissor.x0 = scissor->minx; > - setup->scissor.x1 = scissor->maxx-1; > - setup->scissor.y0 = scissor->miny; > - setup->scissor.y1 = scissor->maxy-1; > + for (i = 0; i < num_scissors; ++i) { > + setup->scissors[i].x0 = scissors[i].minx; > + setup->scissors[i].x1 = scissors[i].maxx-1; > + setup->scissors[i].y0 = scissors[i].miny; > + setup->scissors[i].y1 = scissors[i].maxy-1; > + } > setup->dirty |= LP_SETUP_NEW_SCISSOR; > } > > @@ -1012,10 +1018,15 @@ try_update_scene_state( struct lp_setup_context > *setup ) > } > > if (setup->dirty & LP_SETUP_NEW_SCISSOR) { > - 
setup->draw_region = setup->framebuffer; > - if (setup->scissor_test) { > - u_rect_possible_intersection(&setup->scissor, > - &setup->draw_region); > + unsigned i; > + /* we always need at least one draw region */ > + setup->draw_regions[0] = setup->framebuffer; > + for (i = 0; i < setup->num_scissors; ++i) { > + setup->draw_regions[i] = setup->framebuffer; > + if (setup->scissor_test) { > +u_rect_possible_intersection(&setup->scissors[i], > + &setup->d
Re: [Mesa-dev] [PATCH 3/4] util/blitter: make sure the blitter can restore all viewports
Same as the first patch: u_blitter doesn't really need to change more than one scissor and viewport. With the start_slot parameter in set_viewport_states and set_scissor_states, you can just save, set, and restore the first slot. Note that the same approach is also used for the vertex buffer (only the first slot is changed by u_blitter). Marek On Thu, May 23, 2013 at 10:33 PM, Zack Rusin wrote: > in case a driver supports multiple viewports the blitter needs > to be able to restore all of them and not just the first one. > > Signed-off-by: Zack Rusin > --- > src/gallium/auxiliary/util/u_blitter.c |6 +++-- > src/gallium/auxiliary/util/u_blitter.h | 24 > +--- > src/gallium/drivers/freedreno/freedreno_resource.c |4 ++-- > src/gallium/drivers/i915/i915_surface.c|4 ++-- > src/gallium/drivers/ilo/ilo_blit.c |2 +- > src/gallium/drivers/llvmpipe/lp_surface.c |4 ++-- > src/gallium/drivers/nv30/nv30_miptree.c|4 ++-- > src/gallium/drivers/r300/r300_blit.c |4 ++-- > src/gallium/drivers/r600/r600_blit.c |4 ++-- > src/gallium/drivers/radeonsi/r600_blit.c |2 +- > src/gallium/drivers/softpipe/sp_surface.c |4 ++-- > src/gallium/drivers/svga/svga_pipe_blit.c |4 ++-- > 12 files changed, 38 insertions(+), 28 deletions(-) > > diff --git a/src/gallium/auxiliary/util/u_blitter.c > b/src/gallium/auxiliary/util/u_blitter.c > index e985376..8274b79 100644 > --- a/src/gallium/auxiliary/util/u_blitter.c > +++ b/src/gallium/auxiliary/util/u_blitter.c > @@ -504,7 +504,8 @@ static void blitter_restore_fragment_states(struct > blitter_context_priv *ctx) > /* XXX check whether these are saved and whether they need to be restored > * (depending on the operation) */ > pipe->set_stencil_ref(pipe, &ctx->base.saved_stencil_ref); > - pipe->set_viewport_states(pipe, 1, &ctx->base.saved_viewport); > + pipe->set_viewport_states(pipe, ctx->base.num_saved_viewports, > + ctx->base.saved_viewports); > } > > static void blitter_check_saved_fb_state(struct blitter_context_priv *ctx) > @@ -1496,7 +1497,8 @@ 
void util_blitter_blit_generic(struct blitter_context > *blitter, > blitter_restore_textures(ctx); > blitter_restore_fb_state(ctx); > if (scissor) { > - pipe->set_scissor_states(pipe, 1, &ctx->base.saved_scissor); > + pipe->set_scissor_states(pipe, ctx->base.num_saved_scissors, > + ctx->base.saved_scissors); > } > blitter_restore_render_cond(ctx); > blitter_unset_running_flag(ctx); > diff --git a/src/gallium/auxiliary/util/u_blitter.h > b/src/gallium/auxiliary/util/u_blitter.h > index 1901584..5d4296a 100644 > --- a/src/gallium/auxiliary/util/u_blitter.h > +++ b/src/gallium/auxiliary/util/u_blitter.h > @@ -106,8 +106,10 @@ struct blitter_context > > struct pipe_framebuffer_state saved_fb_state; /**< framebuffer state */ > struct pipe_stencil_ref saved_stencil_ref; /**< stencil ref */ > - struct pipe_viewport_state saved_viewport; > - struct pipe_scissor_state saved_scissor; > + struct pipe_viewport_state saved_viewports[PIPE_MAX_VIEWPORTS]; > + unsigned num_saved_viewports; > + struct pipe_scissor_state saved_scissors[PIPE_MAX_VIEWPORTS]; > + unsigned num_saved_scissors; > boolean is_sample_mask_saved; > unsigned saved_sample_mask; > > @@ -442,17 +444,23 @@ void util_blitter_save_framebuffer(struct > blitter_context *blitter, > } > > static INLINE > -void util_blitter_save_viewport(struct blitter_context *blitter, > -struct pipe_viewport_state *state) > +void util_blitter_save_viewports(struct blitter_context *blitter, > + unsigned num_viewports, > + struct pipe_viewport_state *states) > { > - blitter->saved_viewport = *state; > + blitter->num_saved_viewports = num_viewports; > + memcpy(blitter->saved_viewports, states, > + sizeof(struct pipe_viewport_state) * num_viewports); > } > > static INLINE > -void util_blitter_save_scissor(struct blitter_context *blitter, > - struct pipe_scissor_state *state) > +void util_blitter_save_scissors(struct blitter_context *blitter, > + unsigned num_scissors, > + struct pipe_scissor_state *states) > { > - blitter->saved_scissor = 
*state; > + blitter->num_saved_scissors = num_scissors; > + memcpy(blitter->saved_scissors, states, > + sizeof(struct pipe_viewport_state) * num_scissors); > } > > static INLINE > diff --git a/src/gallium/drivers/freedreno/freedreno_resource.c > b/src/gallium/drivers/freedreno/freedreno_resource.c > index 00f3db8..7a0c68d 100644 > --- a/src/gallium/drivers/freedreno/freedreno_resource.c > +++ b/src/gallium/drivers/f
Re: [Mesa-dev] [PATCH 1/4] gallium: Add support for multiple viewports
> 1) I prefer this interface instead: > > void (*set_scissor_states)( struct pipe_context *, >unsigned start_slot, unsigned count, >const struct pipe_scissor_state * ); > > void (*set_viewport_states)( struct pipe_context *, > unsigned start_slot, unsigned count, > const struct pipe_viewport_state *); > > Both functions should allow updating only a subset of all viewports and > scissors (from start_slot to start_slot+count-1). This is especially > important for meta ops (u_gen_mipmap, etc.), which need to update only > the first viewport (and no scissor), leaving the other viewports > unchanged. This idea is not new: the vertex buffer and compute sampler > functions have the start_slot parameter too. It's obviously based on the d3d10 interface; it just seems like a much simpler interface. I understand that GL does specify the start slot when updating the viewports, but I did think that explicitly specifying viewports from 0 up to the number of viewports set was semantically very clear. > 2) What does cso_context need to keep a copy of all viewports for? All > meta ops need only one viewport, just as they need only one vertex > buffer and one constant buffer (and cso_context doesn't really allow > meta ops to use more than that). For example, see what the cso_context > interface for saving and restoring the constant buffer slot 0 looks > like. It's preferable to use the same mechanism unless there is a need > to have the save and restore functionality for all slots. It's a bit weird to say "none of the utils uses multiple viewports" given that this is the first commit that introduces the concept at all. What if something will? It's not a lot of code, and it seemed to make sense to do it properly from the start. z
Re: [Mesa-dev] [PATCH 3/4] util/blitter: make sure the blitter can restore all viewports
> Same as the first patch: u_blitter doesn't really need to change > more than one scissor and viewport. With the start_slot parameter in > set_viewport_states and set_scissor_states, you can just save, set, > and restore the first slot. Note that the same approach is also used > for the vertex buffer (only the first slot is changed by u_blitter). Sure, but the count on restoring is going to be incorrect. A lot of code depends on the knowledge of how many viewports/scissors are set and unless the restoring can properly restore the count, all of it will be broken. z
Re: [Mesa-dev] [PATCH libclc] Add bitselect builtin
Reviewed-by: Aaron Watry Please also send the attached test patch (or an expanded version of it) to the piglit list. On Thu, May 23, 2013 at 12:48 PM, Tom Stellard wrote: > From: Tom Stellard > > --- > generic/include/clc/clc.h | 1 + > generic/include/clc/relational/bitselect.h | 1 + > 2 files changed, 2 insertions(+) > create mode 100644 generic/include/clc/relational/bitselect.h > > diff --git a/generic/include/clc/clc.h b/generic/include/clc/clc.h > index d2858a8..b53a217 100644 > --- a/generic/include/clc/clc.h > +++ b/generic/include/clc/clc.h > @@ -80,6 +80,7 @@ > > /* 6.11.6 Relational Functions */ > #include > +#include <clc/relational/bitselect.h> > #include > > /* 6.11.8 Synchronization Functions */ > diff --git a/generic/include/clc/relational/bitselect.h > b/generic/include/clc/relational/bitselect.h > new file mode 100644 > index 0000000..e91cbfd > --- /dev/null > +++ b/generic/include/clc/relational/bitselect.h > @@ -0,0 +1 @@ > +#define bitselect(x, y, z) ((x) ^ ((z) & ((y) ^ (x)))) > -- > 1.8.1.5 Attachment: 0001-CL-Basic-test-of-bitselect-builtin.patch
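For readers checking the XOR form of the macro, here is a small host-side C sketch. It assumes the macro reads `((x) ^ ((z) & ((y) ^ (x))))`, and it compares that against an explicit mask-and-merge. `bitselect_ref` and `bitselect_self_check` are hypothetical helpers added for illustration; they are not part of libclc or of the attached piglit test.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as the macro in the patch: for each bit position,
 * take the bit from y where z has a 1 and from x where z has a 0. */
#define bitselect(x, y, z) ((x) ^ ((z) & ((y) ^ (x))))

/* Hypothetical reference version written as an explicit merge,
 * used only to cross-check the XOR form. */
uint32_t
bitselect_ref(uint32_t x, uint32_t y, uint32_t z)
{
   return (x & ~z) | (y & z);
}

/* One worked case: z = 0x0F0F picks the low nibble of each byte from y
 * (0xFF00) and the high nibble of each byte from x (0x00FF). */
void
bitselect_self_check(void)
{
   assert(bitselect(0x00FFu, 0xFF00u, 0x0F0Fu) == 0x0FF0u);
   assert(bitselect_ref(0x00FFu, 0xFF00u, 0x0F0Fu) == 0x0FF0u);
}
```

The XOR form needs one fewer operation than the merge form and never evaluates `~z`, which is presumably why the patch uses it.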
Re: [Mesa-dev] [PATCH 1/4] gallium: Add support for multiple viewports
On Thu, May 23, 2013 at 11:59 PM, Zack Rusin wrote: >> 1) I prefer this interface instead: >> >> void (*set_scissor_states)( struct pipe_context *, >>unsigned start_slot, unsigned count, >>const struct pipe_scissor_state * ); >> >> void (*set_viewport_states)( struct pipe_context *, >> unsigned start_slot, unsigned count, >> const struct pipe_viewport_state *); >> >> Both functions should allow updating only a subset of all viewports and >> scissors (from start_slot to start_slot+count-1). This is especially >> important for meta ops (u_gen_mipmap, etc.), which need to update only >> the first viewport (and no scissor), leaving the other viewports >> unchanged. This idea is not new: the vertex buffer and compute sampler >> functions have the start_slot parameter too. > > It's obviously based on the d3d10 interface; it just seems like a much simpler > interface. I understand that GL does specify the start slot when updating the > viewports, but I did think that explicitly specifying viewports from 0 up to > the number of viewports set was semantically very clear. The number of viewports set doesn't really matter. On latest hardware, there are always 16 viewports. And we usually implement the union of both GL and D3D. It would be reasonable to have the start_slot parameter for pretty much every client (OpenGL, hardware drivers, and internal Mesa meta ops) except D3D (which isn't even present in the repository). > >> 2) What does cso_context need to keep a copy of all viewports for? All >> meta ops need only one viewport, just as they need only one vertex >> buffer and one constant buffer (and cso_context doesn't really allow >> meta ops to use more than that). For example, see what the cso_context >> interface for saving and restoring the constant buffer slot 0 looks >> like. It's preferable to use the same mechanism unless there is a need >> to have the save and restore functionality for all slots. 
> > It's a bit weird to say "none of the utils uses multiple viewports" given > that this is the first commit that introduces the concept at all. What if > something will? It's not a lot of code, and it seemed to make sense to do it > properly from the start. I can imagine u_gen_mipmap using layered rendering (a very important feature of geometry shaders); however, nothing comes to mind that would need to use multiple viewports. If we ever need support for multiple viewports in meta ops / internal rendering code in the middle of application rendering, we can add the necessary code to cso_context. Until then, it's just burning CPU cycles for nothing. Marek
[Mesa-dev] [PATCH] i965: Mask the cut index based on the index buffer type in 3DSTATE_VF.
According to the documentation: "The Cut Index is compared to the fetched (and possibly-sign-extended) vertex index, and if these values are equal, the current primitive topology is terminated. Note that, for index buffers <32bpp, it is possible to set the Cut Index to a (large) value that will never match a sign-extended vertex index." This suggests that we should not set the value to 0xFFFFFFFF for unsigned byte or short index buffers, but rather 0xFF or 0xFFFF. Fixes sporadic failures in the ES 3 instanced_arrays_primitive_restart conformance test when run in combination with other tests. No Piglit regressions. Cc: Ian Romanick Signed-off-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_primitive_restart.c | 27 --- 1 file changed, 19 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c b/src/mesa/drivers/dri/i965/brw_primitive_restart.c index f824915..cf4a1ea 100644 --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c @@ -183,19 +183,30 @@ haswell_upload_cut_index(struct brw_context *brw) if (!intel->is_haswell) return; - const unsigned cut_index_setting = - ctx->Array._PrimitiveRestart ? 
HSW_CUT_INDEX_ENABLE : 0; - - BEGIN_BATCH(2); - OUT_BATCH(_3DSTATE_VF << 16 | cut_index_setting | (2 - 2)); - OUT_BATCH(ctx->Array._RestartIndex); - ADVANCE_BATCH(); + if (ctx->Array._PrimitiveRestart) { + int cut_index = ctx->Array._RestartIndex; + + if (brw->ib.type == GL_UNSIGNED_BYTE) + cut_index &= 0xff; + else if (brw->ib.type == GL_UNSIGNED_SHORT) + cut_index &= 0xffff; + + BEGIN_BATCH(2); + OUT_BATCH(_3DSTATE_VF << 16 | HSW_CUT_INDEX_ENABLE | (2 - 2)); + OUT_BATCH(cut_index); + ADVANCE_BATCH(); + } else { + BEGIN_BATCH(2); + OUT_BATCH(_3DSTATE_VF << 16 | (2 - 2)); + OUT_BATCH(0); + ADVANCE_BATCH(); + } } const struct brw_tracked_state haswell_cut_index = { .dirty = { .mesa = _NEW_TRANSFORM, - .brw = 0, + .brw = BRW_NEW_INDEX_BUFFER, .cache = 0, }, .emit = haswell_upload_cut_index, -- 1.8.2.2
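The masking step in this patch can be tried in isolation. The sketch below is standalone C that assumes the byte and short masks are 0xff and 0xffff; `toy_index_type` and `toy_mask_cut_index` are hypothetical names standing in for the GL index type enums and the inline logic in haswell_upload_cut_index, not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for GL_UNSIGNED_BYTE / _SHORT / _INT. */
enum toy_index_type {
   TOY_INDEX_UBYTE,
   TOY_INDEX_USHORT,
   TOY_INDEX_UINT
};

/* Truncate the restart index to the width of the index buffer type so
 * that it can match an index fetched from that buffer. */
uint32_t
toy_mask_cut_index(uint32_t restart_index, enum toy_index_type type)
{
   switch (type) {
   case TOY_INDEX_UBYTE:
      return restart_index & 0xffu;
   case TOY_INDEX_USHORT:
      return restart_index & 0xffffu;
   case TOY_INDEX_UINT:
      return restart_index;
   }
   assert(!"unknown index type");
   return restart_index;
}
```

One consequence of plain masking, at least in this sketch: a restart index of 0x100 with byte indices becomes 0x00, so index 0 would then be treated as the cut index.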
Re: [Mesa-dev] [PATCH] i965: Mask the cut index based on the index buffer type in 3DSTATE_VF.
On Thu, May 23, 2013 at 3:46 PM, Kenneth Graunke wrote: > According to the documentation: "The Cut Index is compared to the > fetched (and possibly-sign-extended) vertex index, and if these values > are equal, the current primitive topology is terminated. Note that, > for index buffers <32bpp, it is possible to set the Cut Index to a > (large) value that will never match a sign-extended vertex index." > > This suggests that we should not set the value to 0xFFFFFFFF for > unsigned byte or short index buffers, but rather 0xFF or 0xFFFF. I was wondering what the GL spec had to say about this situation. For example, what should happen if the index is 0x100 and bytes are used? Should it effectively disable prim-restart? Should it use 0xff, or 0x00? Unfortunately, I didn't find anything concrete. Reviewed-by: Jordan Justen > Fixes sporadic failures in the ES 3 instanced_arrays_primitive_restart > conformance test when run in combination with other tests. No Piglit > regressions. > > Cc: Ian Romanick Cc: Paul Berry > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_primitive_restart.c | 27 > --- > 1 file changed, 19 insertions(+), 8 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c > b/src/mesa/drivers/dri/i965/brw_primitive_restart.c > index f824915..cf4a1ea 100644 > --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c > +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c > @@ -183,19 +183,30 @@ haswell_upload_cut_index(struct brw_context *brw) > if (!intel->is_haswell) >return; > > - const unsigned cut_index_setting = > - ctx->Array._PrimitiveRestart ? 
HSW_CUT_INDEX_ENABLE : 0; > - > - BEGIN_BATCH(2); > - OUT_BATCH(_3DSTATE_VF << 16 | cut_index_setting | (2 - 2)); > - OUT_BATCH(ctx->Array._RestartIndex); > - ADVANCE_BATCH(); > + if (ctx->Array._PrimitiveRestart) { > + int cut_index = ctx->Array._RestartIndex; > + > + if (brw->ib.type == GL_UNSIGNED_BYTE) > + cut_index &= 0xff; > + else if (brw->ib.type == GL_UNSIGNED_SHORT) > + cut_index &= 0xffff; > + > + BEGIN_BATCH(2); > + OUT_BATCH(_3DSTATE_VF << 16 | HSW_CUT_INDEX_ENABLE | (2 - 2)); > + OUT_BATCH(cut_index); > + ADVANCE_BATCH(); > + } else { > + BEGIN_BATCH(2); > + OUT_BATCH(_3DSTATE_VF << 16 | (2 - 2)); > + OUT_BATCH(0); > + ADVANCE_BATCH(); > + } > } > > const struct brw_tracked_state haswell_cut_index = { > .dirty = { >.mesa = _NEW_TRANSFORM, > - .brw = 0, > + .brw = BRW_NEW_INDEX_BUFFER, >.cache = 0, > }, > .emit = haswell_upload_cut_index, > -- > 1.8.2.2
Re: [Mesa-dev] [PATCH] i965: Mask the cut index based on the index buffer type in 3DSTATE_VF.
On 23 May 2013 15:46, Kenneth Graunke wrote: > According to the documentation: "The Cut Index is compared to the > fetched (and possibly-sign-extended) vertex index, and if these values > are equal, the current primitive topology is terminated. Note that, > for index buffers <32bpp, it is possible to set the Cut Index to a > (large) value that will never match a sign-extended vertex index." > > This suggests that we should not set the value to 0xFFFFFFFF for > unsigned byte or short index buffers, but rather 0xFF or 0xFFFF. > > Fixes sporadic failures in the ES 3 instanced_arrays_primitive_restart > conformance test when run in combination with other tests. No Piglit > regressions. > > Cc: Ian Romanick Cc: Paul Berry > Signed-off-by: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/brw_primitive_restart.c | 27 > --- > 1 file changed, 19 insertions(+), 8 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c > b/src/mesa/drivers/dri/i965/brw_primitive_restart.c > index f824915..cf4a1ea 100644 > --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c > +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c > @@ -183,19 +183,30 @@ haswell_upload_cut_index(struct brw_context *brw) > if (!intel->is_haswell) >return; > > - const unsigned cut_index_setting = > - ctx->Array._PrimitiveRestart ? HSW_CUT_INDEX_ENABLE : 0; > - > - BEGIN_BATCH(2); > - OUT_BATCH(_3DSTATE_VF << 16 | cut_index_setting | (2 - 2)); > - OUT_BATCH(ctx->Array._RestartIndex); > - ADVANCE_BATCH(); > + if (ctx->Array._PrimitiveRestart) { > + int cut_index = ctx->Array._RestartIndex; > + > + if (brw->ib.type == GL_UNSIGNED_BYTE) > Can we put a "/* BRW_NEW_INDEX_BUFFER */" comment above this line? 
With that, this patch is: Reviewed-by: Paul Berry > + cut_index &= 0xff; > + else if (brw->ib.type == GL_UNSIGNED_SHORT) > + cut_index &= 0xffff; > + > + BEGIN_BATCH(2); > + OUT_BATCH(_3DSTATE_VF << 16 | HSW_CUT_INDEX_ENABLE | (2 - 2)); > + OUT_BATCH(cut_index); > + ADVANCE_BATCH(); > + } else { > + BEGIN_BATCH(2); > + OUT_BATCH(_3DSTATE_VF << 16 | (2 - 2)); > + OUT_BATCH(0); > + ADVANCE_BATCH(); > + } > } > > const struct brw_tracked_state haswell_cut_index = { > .dirty = { >.mesa = _NEW_TRANSFORM, > - .brw = 0, > + .brw = BRW_NEW_INDEX_BUFFER, >.cache = 0, > }, > .emit = haswell_upload_cut_index, > -- > 1.8.2.2 >
Re: [Mesa-dev] [PATCH 3/4] util/blitter: make sure the blitter can restore all viewports
That wouldn't be an issue if there were the start_slot parameter in the first place. The hardware viewport count is always 16; I don't think that can be changed (my radeon hardware docs seem to suggest it really can't). OpenGL doesn't provide a way to specify the viewport count either (even the legacy glViewport function actually updates all 16 viewports). The count of set scissor rectangles isn't important either (other than telling the driver which scissors are being changed). Marek On Fri, May 24, 2013 at 12:05 AM, Zack Rusin wrote: >> Same as the first patch: u_blitter doesn't really need to change >> more than one scissor and viewport. With the start_slot parameter in >> set_viewport_states and set_scissor_states, you can just save, set, >> and restore the first slot. Note that the same approach is also used >> for the vertex buffer (only the first slot is changed by u_blitter). > > Sure, but the count on restoring is going to be incorrect. A lot of code > depends on the knowledge of how many viewports/scissors are set and unless > the restoring can properly restore the count, all of it will be broken. > > z
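The "save, set, and restore the first slot" flow argued for in this thread might look roughly like the sketch below. Everything here is hypothetical (`toy_pipe`, `toy_blitter`, `toy_blit` and `toy_set_viewport_states` are illustrative names, not u_blitter or pipe_context code), and it assumes a fixed set of 16 slots with no separate viewport count, as the message describes for the hardware.

```c
#include <assert.h>

#define TOY_MAX_VIEWPORTS 16

/* Hypothetical, simplified viewport state. */
struct toy_viewport {
   float scale[4];
   float translate[4];
};

/* Hypothetical pipe context: a fixed set of slots, no "count" to track. */
struct toy_pipe {
   struct toy_viewport viewports[TOY_MAX_VIEWPORTS];
};

/* Hypothetical blitter: remembers slot 0 only. */
struct toy_blitter {
   struct toy_viewport saved_viewport;
};

void
toy_set_viewport_states(struct toy_pipe *pipe,
                        unsigned start_slot, unsigned count,
                        const struct toy_viewport *states)
{
   assert(start_slot + count <= TOY_MAX_VIEWPORTS);
   for (unsigned i = 0; i < count; i++)
      pipe->viewports[start_slot + i] = states[i];
}

/* Save slot 0, run a meta operation with its own viewport, restore slot 0.
 * Slots 1..15 are never written, so nothing else needs restoring. */
void
toy_blit(struct toy_pipe *pipe, struct toy_blitter *blitter,
         const struct toy_viewport *blit_viewport)
{
   blitter->saved_viewport = pipe->viewports[0];
   toy_set_viewport_states(pipe, 0, 1, blit_viewport);
   /* ... the blit draw call would go here ... */
   toy_set_viewport_states(pipe, 0, 1, &blitter->saved_viewport);
}
```

Under these assumptions there is no count to get wrong on restore, which is the point Marek makes; whether drivers that do track a count can work this way is the open question in the thread.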
Re: [Mesa-dev] [PATCH 06/12] i965/gen7+: Implement fast color clear operation in BLORP.
Paul Berry writes: > On 22 May 2013 12:30, Ian Romanick wrote: > >> On 05/21/2013 04:52 PM, Paul Berry wrote: >> >>> Since we defer allocation of the MCS miptree until the time of the >>> fast clear operation, this patch also implements creation of the MCS >>> miptree. >>> >>> In addition, this patch adds the field >>> intel_mipmap_tree::fast_clear_color_value, which holds the most recent >>> fast color clear value, if any. We use it to set the SURFACE_STATE's >>> clear color for render targets. >>> --- >>> src/mesa/drivers/dri/i965/brw_blorp.cpp | 1 + >>> src/mesa/drivers/dri/i965/brw_blorp.h | 11 +- >>> src/mesa/drivers/dri/i965/brw_blorp_clear.cpp | 143 >>> +- >>> src/mesa/drivers/dri/i965/brw_clear.c | 2 +- >>> src/mesa/drivers/dri/i965/brw_defines.h | 2 + >>> src/mesa/drivers/dri/i965/gen7_blorp.cpp | 18 ++- >>> src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 10 +- >>> src/mesa/drivers/dri/intel/intel_mipmap_tree.c| 47 +++ >>> src/mesa/drivers/dri/intel/intel_mipmap_tree.h| 13 ++ >>> 9 files changed, 233 insertions(+), 14 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp >>> b/src/mesa/drivers/dri/i965/brw_blorp.cpp >>> index 20f7153..c6019d1 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp >>> +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp >>> @@ -147,6 +147,7 @@ brw_blorp_params::brw_blorp_params() >>>y1(0), >>>depth_format(0), >>>hiz_op(GEN6_HIZ_OP_NONE), >>> + fast_clear_op(GEN7_FAST_CLEAR_OP_NONE), >>>num_samples(0), >>>use_wm_prog(false) >>> { >>> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h >>> b/src/mesa/drivers/dri/i965/brw_blorp.h >>> index 6360a62..687d7eb 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_blorp.h >>> +++ b/src/mesa/drivers/dri/i965/brw_blorp.h >>> @@ -46,7 +46,8 @@ brw_blorp_blit_miptrees(struct intel_context *intel, >>> bool mirror_x, bool mirror_y); >>> >>> bool >>> -brw_blorp_clear_color(struct intel_context *intel, struct gl_framebuffer >>> *fb); >>> 
+brw_blorp_clear_color(struct intel_context *intel, struct gl_framebuffer >>> *fb, >>> + bool partial_clear); >>> >>> #ifdef __cplusplus >>> } /* end extern "C" */ >>> @@ -195,6 +196,13 @@ struct brw_blorp_prog_data >>> bool persample_msaa_dispatch; >>> }; >>> >>> + >>> +enum gen7_fast_clear_op { >>> + GEN7_FAST_CLEAR_OP_NONE, >>> + GEN7_FAST_CLEAR_OP_FAST_CLEAR, >>> +}; >>> + >>> + >>> class brw_blorp_params >>> { >>> public: >>> @@ -212,6 +220,7 @@ public: >>> brw_blorp_surface_info src; >>> brw_blorp_surface_info dst; >>> enum gen6_hiz_op hiz_op; >>> + enum gen7_fast_clear_op fast_clear_op; >>> unsigned num_samples; >>> bool use_wm_prog; >>> brw_blorp_wm_push_constants wm_push_consts; >>> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp >>> b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp >>> index 28d7ad0..675289b 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp >>> +++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp >>> @@ -49,7 +49,8 @@ public: >>> brw_blorp_clear_params(struct brw_context *brw, >>> struct gl_framebuffer *fb, >>> struct gl_renderbuffer *rb, >>> - GLubyte *color_mask); >>> + GLubyte *color_mask, >>> + bool partial_clear); >>> >>> virtual uint32_t get_wm_prog(struct brw_context *brw, >>> brw_blorp_prog_data **prog_data) const; >>> @@ -105,10 +106,49 @@ brw_blorp_clear_program::~brw_blorp_clear_program() >>> ralloc_free(mem_ctx); >>> } >>> >>> + >>> +/** >>> + * Determine if fast color clear supports the given clear color. >>> + * >>> + * Fast color clear can only clear to color values of 1.0 or 0.0. At the >>> + * moment we only support floating point buffers. 
>>> + */ >>> +static bool >>> +is_color_fast_clear_compatible(gl_format format, >>> + const union gl_color_union *color) >>> +{ >>> + if (_mesa_is_format_integer_color(format)) >>> + return false; >>> + >>> + for (int i = 0; i < 4; i++) { >>> + if (color->f[i] != 0.0 && color->f[i] != 1.0) >>> + return false; >>> >> >> Should this generate a perf debug message? Eric may have an opinion about >> generating warnings for the non-fast path... > > > Sounds reasonable to me. We already have perf debug messages for other > things that can inhibit fast clears (e.g. scissor preventing fast depth > clear). I'll add it unless I hear an objection. Thanks! I love getting this kind of information into our driver at the point that we're leaving some potential performance on the floor. Even when it doesn't seem important now (non-8x4 fast dept
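One way the suggested perf debug message could be wired in is to have the compatibility check report why the fast path was rejected. The sketch below is a guess at that shape only: `toy_fast_clear_blocker` is a hypothetical helper, the boolean parameter stands in for the `_mesa_is_format_integer_color()` test in the quoted patch, and the driver's actual perf debug mechanism is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns NULL when the clear colour allows a fast
 * clear, or a short reason string that a caller could pass to a perf
 * debug message when falling back to the slow path. */
const char *
toy_fast_clear_blocker(bool format_is_integer, const float color[4])
{
   assert(color);

   if (format_is_integer)
      return "integer color format";

   /* As in the quoted patch, only exact 0.0 or 1.0 per channel qualifies. */
   for (int i = 0; i < 4; i++) {
      if (color[i] != 0.0f && color[i] != 1.0f)
         return "clear color channel is neither 0.0 nor 1.0";
   }

   return NULL;
}
```

A caller would then emit the returned string once per rejected clear, so the lost performance is visible at the point where it happens.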
[Mesa-dev] libclc: vload/vstore initial implementation
I've implemented the OpenCL vload/vstore builtin functions in two parts: 1) A pure CL C implementation, with no assembly. 2) Assembly optimizations for 32-bit int/uint loads/stores of 4+ component vectors. Note: The vstore implementation assumes that the hardware back end supports byte-addressable stores. This may not always be optimal.
[Mesa-dev] [PATCH 1/4] libclc: Initial vload implementation
Should work for all targets and data types. Completely unoptimized. --- generic/include/clc/clc.h | 1 + generic/include/clc/shared/vload.h | 37 ++ generic/lib/SOURCES| 1 + generic/lib/shared/vload.cl| 47 ++ 4 files changed, 86 insertions(+) create mode 100644 generic/include/clc/shared/vload.h create mode 100644 generic/lib/shared/vload.cl diff --git a/generic/include/clc/clc.h b/generic/include/clc/clc.h index d2858a8..7937003 100644 --- a/generic/include/clc/clc.h +++ b/generic/include/clc/clc.h @@ -71,6 +71,7 @@ #include #include #include +#include /* 6.11.5 Geometric Functions */ #include diff --git a/generic/include/clc/shared/vload.h b/generic/include/clc/shared/vload.h new file mode 100644 index 000..93d0750 --- /dev/null +++ b/generic/include/clc/shared/vload.h @@ -0,0 +1,37 @@ +#define _CLC_VLOAD_DECL(PRIM_TYPE, VEC_TYPE, WIDTH, ADDR_SPACE) \ + _CLC_OVERLOAD _CLC_DECL VEC_TYPE vload##WIDTH(size_t offset, const ADDR_SPACE PRIM_TYPE *x); + +#define _CLC_VECTOR_VLOAD_DECL(PRIM_TYPE, ADDR_SPACE) \ + _CLC_VLOAD_DECL(PRIM_TYPE, PRIM_TYPE##2, 2, ADDR_SPACE) \ + _CLC_VLOAD_DECL(PRIM_TYPE, PRIM_TYPE##3, 3, ADDR_SPACE) \ + _CLC_VLOAD_DECL(PRIM_TYPE, PRIM_TYPE##4, 4, ADDR_SPACE) \ + _CLC_VLOAD_DECL(PRIM_TYPE, PRIM_TYPE##8, 8, ADDR_SPACE) \ + _CLC_VLOAD_DECL(PRIM_TYPE, PRIM_TYPE##16, 16, ADDR_SPACE) + +#define _CLC_VECTOR_VLOAD_PRIM1(PRIM_TYPE) \ + _CLC_VECTOR_VLOAD_DECL(PRIM_TYPE, __private) \ + _CLC_VECTOR_VLOAD_DECL(PRIM_TYPE, __local) \ + _CLC_VECTOR_VLOAD_DECL(PRIM_TYPE, __constant) \ + _CLC_VECTOR_VLOAD_DECL(PRIM_TYPE, __global) \ + +#define _CLC_VECTOR_VLOAD_PRIM() \ +_CLC_VECTOR_VLOAD_PRIM1(char) \ +_CLC_VECTOR_VLOAD_PRIM1(uchar) \ +_CLC_VECTOR_VLOAD_PRIM1(short) \ +_CLC_VECTOR_VLOAD_PRIM1(ushort) \ +_CLC_VECTOR_VLOAD_PRIM1(int) \ +_CLC_VECTOR_VLOAD_PRIM1(uint) \ +_CLC_VECTOR_VLOAD_PRIM1(long) \ +_CLC_VECTOR_VLOAD_PRIM1(ulong) \ +_CLC_VECTOR_VLOAD_PRIM1(float) \ + +#ifdef cl_khr_fp64 +#define _CLC_VECTOR_VLOAD() \ + _CLC_VECTOR_VLOAD_PRIM1(double) \ + 
_CLC_VECTOR_VLOAD_PRIM() +#else +#define _CLC_VECTOR_VLOAD() \ + _CLC_VECTOR_VLOAD_PRIM() +#endif + +_CLC_VECTOR_VLOAD() diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES index 59eb9bb..5d9e3fa 100644 --- a/generic/lib/SOURCES +++ b/generic/lib/SOURCES @@ -23,5 +23,6 @@ relational/any.cl shared/clamp.cl shared/max.cl shared/min.cl +shared/vload.cl workitem/get_global_id.cl workitem/get_global_size.cl diff --git a/generic/lib/shared/vload.cl b/generic/lib/shared/vload.cl new file mode 100644 index 000..24d8240 --- /dev/null +++ b/generic/lib/shared/vload.cl @@ -0,0 +1,47 @@ +#include + +#define VLOAD_VECTORIZE(PRIM_TYPE, ADDR_SPACE) \ + _CLC_OVERLOAD _CLC_DEF PRIM_TYPE##2 vload2(size_t offset, const ADDR_SPACE PRIM_TYPE *x) { \ +return (PRIM_TYPE##2)(x[offset] , x[offset+1]); \ + } \ +\ + _CLC_OVERLOAD _CLC_DEF PRIM_TYPE##3 vload3(size_t offset, const ADDR_SPACE PRIM_TYPE *x) { \ +return (PRIM_TYPE##3)(x[offset] , x[offset+1], x[offset+2]); \ + } \ +\ + _CLC_OVERLOAD _CLC_DEF PRIM_TYPE##4 vload4(size_t offset, const ADDR_SPACE PRIM_TYPE *x) { \ +return (PRIM_TYPE##4)(x[offset], x[offset+1], x[offset+2], x[offset+3]); \ + } \ +\ + _CLC_OVERLOAD _CLC_DEF PRIM_TYPE##8 vload8(size_t offset, const ADDR_SPACE PRIM_TYPE *x) { \ +return (PRIM_TYPE##8)(vload4(offset, x), vload4(offset+4, x)); \ + } \ +\ + _CLC_OVERLOAD _CLC_DEF PRIM_TYPE##16 vload16(size_t offset, const ADDR_SPACE PRIM_TYPE *x) { \ +return (PRIM_TYPE##16)(vload8(offset, x), vload8(offset+8, x)); \ + } \ + +#define VLOAD_ADDR_SPACES(SCALAR_GENTYPE) \ +VLOAD_VECTORIZE(SCALAR_GENTYPE, __private) \ +VLOAD_VECTORIZE(SCALAR_GENTYPE, __local) \ +VLOAD_VECTORIZE(SCALAR_GENTYPE, __constant) \ +VLOAD_VECTORIZE(SCALAR_GENTYPE, __global) \ + +#define VLOAD_TYPES() \ +VLOAD_ADDR_SPACES(char) \ +VLOAD_ADDR_SPACES(uchar) \ +VLOAD_ADDR_SPACES(short) \ +VLOAD_ADDR_SPACES(ushort) \ +VLOAD_ADDR_SPACES(int) \ +VLOAD_ADDR_SPACES(uint) \ +VLOAD_ADDR_SPACES(long) \ +VLOAD_ADDR_SPACES(ulong) \ +VLOAD_ADDR_SPACES(float) \ + 
+VLOAD_TYPES() + +#ifdef cl_khr_fp64 +#pragma OPENCL EXTENSION cl_khr_fp64 : enable +VLOAD_ADDR_SPACES(double) +#endif + -- 1.8.1.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
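For anyone who wants to try the element-wise approach outside an OpenCL compiler, here is a minimal sketch in plain C. It is not the libclc code: the `sketch_int4`/`sketch_int8` structs and the `sketch_vload*` names are hypothetical stand-ins for the OpenCL vector types and built-ins. One detail worth checking against the patch: the OpenCL C specification describes vloadn(offset, p) as reading from p + offset * n, so the sketch scales the offset by the vector width, whereas the macros as posted index from x[offset].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for OpenCL's int4 / int8 vector types. */
typedef struct { int s[4]; } sketch_int4;
typedef struct { int s[8]; } sketch_int8;

/* Element-wise 4-wide load. Assumption: the offset is scaled by the
 * vector width, i.e. elements are read starting at p + offset * 4. */
static sketch_int4 sketch_vload4(size_t offset, const int *p)
{
    assert(p != NULL);
    const int *base = p + offset * 4;
    sketch_int4 v;
    for (int i = 0; i < 4; i++)
        v.s[i] = base[i];
    return v;
}

/* 8-wide load composed of two 4-wide loads, in the same spirit as the
 * patch building vload8 out of vload4. */
static sketch_int8 sketch_vload8(size_t offset, const int *p)
{
    sketch_int4 lo = sketch_vload4(offset * 2, p);      /* elements 0..3 */
    sketch_int4 hi = sketch_vload4(offset * 2 + 1, p);  /* elements 4..7 */
    sketch_int8 v;
    for (int i = 0; i < 4; i++) {
        v.s[i] = lo.s[i];
        v.s[i + 4] = hi.s[i];
    }
    return v;
}
```

With an array holding 0..15, `sketch_vload4(1, data)` yields elements 4..7 and `sketch_vload8(1, data)` yields elements 8..15 under the scaling assumption stated above.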
[Mesa-dev] [PATCH 2/4] libclc: Initial vstore implementation
Assumes that the target supports byte-addressable stores. Completely unoptimized. --- generic/include/clc/clc.h | 1 + generic/include/clc/shared/vstore.h | 36 generic/lib/SOURCES | 1 + generic/lib/shared/vstore.cl| 56 + 4 files changed, 94 insertions(+) create mode 100644 generic/include/clc/shared/vstore.h create mode 100644 generic/lib/shared/vstore.cl diff --git a/generic/include/clc/clc.h b/generic/include/clc/clc.h index 7937003..10d30e0 100644 --- a/generic/include/clc/clc.h +++ b/generic/include/clc/clc.h @@ -72,6 +72,7 @@ #include #include #include +#include /* 6.11.5 Geometric Functions */ #include diff --git a/generic/include/clc/shared/vstore.h b/generic/include/clc/shared/vstore.h new file mode 100644 index 000..1f784f8 --- /dev/null +++ b/generic/include/clc/shared/vstore.h @@ -0,0 +1,36 @@ +#define _CLC_VSTORE_DECL(PRIM_TYPE, VEC_TYPE, WIDTH, ADDR_SPACE) \ + _CLC_OVERLOAD _CLC_DECL void vstore##WIDTH(VEC_TYPE vec, size_t offset, ADDR_SPACE PRIM_TYPE *out); + +#define _CLC_VECTOR_VSTORE_DECL(PRIM_TYPE, ADDR_SPACE) \ + _CLC_VSTORE_DECL(PRIM_TYPE, PRIM_TYPE##2, 2, ADDR_SPACE) \ + _CLC_VSTORE_DECL(PRIM_TYPE, PRIM_TYPE##3, 3, ADDR_SPACE) \ + _CLC_VSTORE_DECL(PRIM_TYPE, PRIM_TYPE##4, 4, ADDR_SPACE) \ + _CLC_VSTORE_DECL(PRIM_TYPE, PRIM_TYPE##8, 8, ADDR_SPACE) \ + _CLC_VSTORE_DECL(PRIM_TYPE, PRIM_TYPE##16, 16, ADDR_SPACE) + +#define _CLC_VECTOR_VSTORE_PRIM1(PRIM_TYPE) \ + _CLC_VECTOR_VSTORE_DECL(PRIM_TYPE, __private) \ + _CLC_VECTOR_VSTORE_DECL(PRIM_TYPE, __local) \ + _CLC_VECTOR_VSTORE_DECL(PRIM_TYPE, __global) \ + +#define _CLC_VECTOR_VSTORE_PRIM() \ +_CLC_VECTOR_VSTORE_PRIM1(char) \ +_CLC_VECTOR_VSTORE_PRIM1(uchar) \ +_CLC_VECTOR_VSTORE_PRIM1(short) \ +_CLC_VECTOR_VSTORE_PRIM1(ushort) \ +_CLC_VECTOR_VSTORE_PRIM1(int) \ +_CLC_VECTOR_VSTORE_PRIM1(uint) \ +_CLC_VECTOR_VSTORE_PRIM1(long) \ +_CLC_VECTOR_VSTORE_PRIM1(ulong) \ +_CLC_VECTOR_VSTORE_PRIM1(float) \ + +#ifdef cl_khr_fp64 +#define _CLC_VECTOR_VSTORE() \ + _CLC_VECTOR_VSTORE_PRIM1(double) \ + 
_CLC_VECTOR_VSTORE_PRIM() +#else +#define _CLC_VECTOR_VSTORE() \ + _CLC_VECTOR_VSTORE_PRIM() +#endif + +_CLC_VECTOR_VSTORE() diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES index 5d9e3fa..50cc9bd 100644 --- a/generic/lib/SOURCES +++ b/generic/lib/SOURCES @@ -24,5 +24,6 @@ shared/clamp.cl shared/max.cl shared/min.cl shared/vload.cl +shared/vstore.cl workitem/get_global_id.cl workitem/get_global_size.cl diff --git a/generic/lib/shared/vstore.cl b/generic/lib/shared/vstore.cl new file mode 100644 index 000..e88ccc5 --- /dev/null +++ b/generic/lib/shared/vstore.cl @@ -0,0 +1,56 @@ +#include + +#pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable + +#define VSTORE_VECTORIZE(PRIM_TYPE, ADDR_SPACE) \ + _CLC_OVERLOAD _CLC_DEF void vstore2(PRIM_TYPE##2 vec, size_t offset, ADDR_SPACE PRIM_TYPE *mem) { \ +mem[offset] = vec.s0; \ +mem[offset+1] = vec.s1; \ + } \ +\ + _CLC_OVERLOAD _CLC_DEF void vstore3(PRIM_TYPE##3 vec, size_t offset, ADDR_SPACE PRIM_TYPE *mem) { \ +mem[offset] = vec.s0; \ +mem[offset+1] = vec.s1; \ +mem[offset+2] = vec.s2; \ + } \ +\ + _CLC_OVERLOAD _CLC_DEF void vstore4(PRIM_TYPE##4 vec, size_t offset, ADDR_SPACE PRIM_TYPE *mem) { \ +mem[offset] = vec.s0; \ +mem[offset+1] = vec.s1; \ +mem[offset+2] = vec.s2; \ +mem[offset+3] = vec.s3; \ + } \ +\ + _CLC_OVERLOAD _CLC_DEF void vstore8(PRIM_TYPE##8 vec, size_t offset, ADDR_SPACE PRIM_TYPE *mem) { \ +vstore4(vec.lo, offset, mem); \ +vstore4(vec.hi, offset+4, mem); \ + } \ +\ + _CLC_OVERLOAD _CLC_DEF void vstore16(PRIM_TYPE##16 vec, size_t offset, ADDR_SPACE PRIM_TYPE *mem) { \ +vstore8(vec.lo, offset, mem); \ +vstore8(vec.hi, offset+8, mem); \ + } \ + +#define VSTORE_ADDR_SPACES(SCALAR_GENTYPE) \ +VSTORE_VECTORIZE(SCALAR_GENTYPE, __private) \ +VSTORE_VECTORIZE(SCALAR_GENTYPE, __local) \ +VSTORE_VECTORIZE(SCALAR_GENTYPE, __global) \ + +#define VSTORE_TYPES() \ +VSTORE_ADDR_SPACES(char) \ +VSTORE_ADDR_SPACES(uchar) \ +VSTORE_ADDR_SPACES(short) \ +VSTORE_ADDR_SPACES(ushort) \ 
+VSTORE_ADDR_SPACES(int) \ +VSTORE_ADDR_SPACES(uint) \ +VSTORE_ADDR_SPACES(long) \ +VSTORE_ADDR_SPACES(ulong) \ +VSTORE_ADDR_SPACES(float) \ + +VSTORE_TYPES() + +#ifdef cl_khr_fp64 +#pragma OPENCL EXTENSION cl_khr_fp64 : enable +VSTORE_ADDR_SPACES(double) +#endif + -- 1.8.1.2
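The way the wider stores are built from the .lo and .hi halves can be tried in plain C as well. The following is only a sketch under stated assumptions: `sketch_int4`, `sketch_int8` and the `sketch_vstore*` functions are hypothetical names, not libclc symbols, and the offset is scaled by the vector width as the OpenCL C specification describes for vstoren (mem + offset * n), which differs from the mem[offset] indexing in the macros as posted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for OpenCL's int4 / int8 vector types. */
typedef struct { int s[4]; } sketch_int4;
typedef struct { int s[8]; } sketch_int8;

/* Element-wise 4-wide store. Assumption: elements are written starting
 * at mem + offset * 4. */
static void sketch_vstore4(sketch_int4 vec, size_t offset, int *mem)
{
    assert(mem != NULL);
    int *base = mem + offset * 4;
    for (int i = 0; i < 4; i++)
        base[i] = vec.s[i];
}

/* 8-wide store split into low and high halves, in the same spirit as the
 * patch calling vstore4 on vec.lo and vec.hi. */
static void sketch_vstore8(sketch_int8 vec, size_t offset, int *mem)
{
    sketch_int4 lo, hi;
    for (int i = 0; i < 4; i++) {
        lo.s[i] = vec.s[i];
        hi.s[i] = vec.s[i + 4];
    }
    sketch_vstore4(lo, offset * 2, mem);      /* elements 0..3 */
    sketch_vstore4(hi, offset * 2 + 1, mem);  /* elements 4..7 */
}
```

Storing an 8-element vector at offset 1 into a 16-element array leaves the first eight elements untouched and fills the last eight.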
[Mesa-dev] [PATCH 3/4] libclc: Add assembly versions of vload for global int4/8/16
The assembly should be generic, but at least currently R600 only supports 32-bit loads of int1/4, and I believe that only global is well-supported. R600 lowers the 8/16 component vectors to multiple 4-component loads. The unoptimized C versions of the other stuff is left in place. --- generic/lib/SOURCES | 2 ++ generic/lib/shared/vload.cl | 53 +-- generic/lib/shared/vload_if.ll | 60 generic/lib/shared/vload_impl.ll | 49 4 files changed, 162 insertions(+), 2 deletions(-) create mode 100644 generic/lib/shared/vload_if.ll create mode 100644 generic/lib/shared/vload_impl.ll diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES index 50cc9bd..9f6acf3 100644 --- a/generic/lib/SOURCES +++ b/generic/lib/SOURCES @@ -24,6 +24,8 @@ shared/clamp.cl shared/max.cl shared/min.cl shared/vload.cl +shared/vload_if.ll +shared/vload_impl.ll shared/vstore.cl workitem/get_global_id.cl workitem/get_global_size.cl diff --git a/generic/lib/shared/vload.cl b/generic/lib/shared/vload.cl index 24d8240..f6ebd37 100644 --- a/generic/lib/shared/vload.cl +++ b/generic/lib/shared/vload.cl @@ -27,13 +27,12 @@ VLOAD_VECTORIZE(SCALAR_GENTYPE, __constant) \ VLOAD_VECTORIZE(SCALAR_GENTYPE, __global) \ +//int/uint are special... 
see below #define VLOAD_TYPES() \ VLOAD_ADDR_SPACES(char) \ VLOAD_ADDR_SPACES(uchar) \ VLOAD_ADDR_SPACES(short) \ VLOAD_ADDR_SPACES(ushort) \ -VLOAD_ADDR_SPACES(int) \ -VLOAD_ADDR_SPACES(uint) \ VLOAD_ADDR_SPACES(long) \ VLOAD_ADDR_SPACES(ulong) \ VLOAD_ADDR_SPACES(float) \ @@ -45,3 +44,53 @@ VLOAD_TYPES() VLOAD_ADDR_SPACES(double) #endif +VLOAD_VECTORIZE(int, __private) +VLOAD_VECTORIZE(int, __local) +VLOAD_VECTORIZE(int, __constant) +VLOAD_VECTORIZE(uint, __private) +VLOAD_VECTORIZE(uint, __local) +VLOAD_VECTORIZE(uint, __constant) + +_CLC_OVERLOAD _CLC_DEF int2 vload2(size_t offset, const global int *x) { + return (int2)(x[offset] , x[offset+1]); +} +_CLC_OVERLOAD _CLC_DEF int3 vload3(size_t offset, const global int *x) { + return (int3)(vload2(offset, x), x[offset+2]); +} +_CLC_OVERLOAD _CLC_DEF uint2 vload2(size_t offset, const global uint *x) { + return (uint2)(x[offset] , x[offset+1]); +} +_CLC_OVERLOAD _CLC_DEF uint3 vload3(size_t offset, const global uint *x) { + return (uint3)(vload2(offset, x), x[offset+2]); +} + +/*Note: It is known that R600 doesn't support load <2 x ?> and <3 x ?>... 
so + * they aren't actually overridden here + */ +_CLC_DECL int4 __clc_vload4_int__global(size_t offset, const __global int *); +_CLC_DECL int8 __clc_vload8_int__global(size_t offset, const __global int *); +_CLC_DECL int16 __clc_vload16_int__global(size_t offset, const __global int *); + +_CLC_OVERLOAD _CLC_DEF int4 vload4(size_t offset, const global int *x) { + return __clc_vload4_int__global(offset, x); +} +_CLC_OVERLOAD _CLC_DEF int8 vload8(size_t offset, const global int *x) { + return __clc_vload8_int__global(offset, x); +} +_CLC_OVERLOAD _CLC_DEF int16 vload16(size_t offset, const global int *x) { + return __clc_vload16_int__global(offset, x); +} + +_CLC_DECL uint4 __clc_vload4_uint__global(size_t offset, const __global uint *); +_CLC_DECL uint8 __clc_vload8_uint__global(size_t offset, const __global uint *); +_CLC_DECL uint16 __clc_vload16_uint__global(size_t offset, const __global uint *); + +_CLC_OVERLOAD _CLC_DEF uint4 vload4(size_t offset, const global uint *x) { + return __clc_vload4_uint__global(offset, x); +} +_CLC_OVERLOAD _CLC_DEF uint8 vload8(size_t offset, const global uint *x) { + return __clc_vload8_uint__global(offset, x); +} +_CLC_OVERLOAD _CLC_DEF uint16 vload16(size_t offset, const global uint *x) { + return __clc_vload16_uint__global(offset, x); +} \ No newline at end of file diff --git a/generic/lib/shared/vload_if.ll b/generic/lib/shared/vload_if.ll new file mode 100644 index 000..2634d37 --- /dev/null +++ b/generic/lib/shared/vload_if.ll @@ -0,0 +1,60 @@ +;Start int global vload + +declare <2 x i32> @__clc_vload2_impl_i32__global(i32 %x, i32 %y) +declare <3 x i32> @__clc_vload3_impl_i32__global(i32 %x, i32 %y) +declare <4 x i32> @__clc_vload4_impl_i32__global(i32 %x, i32 %y) +declare <8 x i32> @__clc_vload8_impl_i32__global(i32 %x, i32 %y) +declare <16 x i32> @__clc_vload16_impl_i32__global(i32 %x, i32 %y) + +define <2 x i32> @__clc_vload2_int__global(i32 %x, i32 %y) nounwind readonly alwaysinline { + %call = call <2 x i32> 
@__clc_vload2_impl_i32__global(i32 %x, i32 %y) + ret <2 x i32> %call +} + +define <3 x i32> @__clc_vload3_int__global(i32 %x, i32 %y) nounwind readonly alwaysinline { + %call = call <3 x i32> @__clc_vload3_impl_i32__global(i32 %x, i32 %y) + ret <3 x i32> %call +} + +define <4 x i32> @__clc_vload4_int__global(i32 %x, i32 %y) nounwind readonly alwaysinline { + %call = call <4 x i32> @__clc_vload4_impl_i32__global(i32 %x, i32 %y) + ret <4 x i32> %cal
[Mesa-dev] [PATCH 4/4] libclc: Add assembly versions of vstore for global [u]int4/8/16
The assembly should be generic, but at least currently R600 only supports 32-bit stores of [u]int1/4, and I believe that only global is well-supported. R600 lowers the 8/16 component stores to multiple 4-component stores. The unoptimized C versions of the other stuff is left in place. --- generic/lib/SOURCES | 2 ++ generic/lib/shared/vstore.cl | 63 +++ generic/lib/shared/vstore_if.ll | 59 generic/lib/shared/vstore_impl.ll | 50 +++ 4 files changed, 168 insertions(+), 6 deletions(-) create mode 100644 generic/lib/shared/vstore_if.ll create mode 100644 generic/lib/shared/vstore_impl.ll diff --git a/generic/lib/SOURCES b/generic/lib/SOURCES index 9f6acf3..8cda14a 100644 --- a/generic/lib/SOURCES +++ b/generic/lib/SOURCES @@ -27,5 +27,7 @@ shared/vload.cl shared/vload_if.ll shared/vload_impl.ll shared/vstore.cl +shared/vstore_if.ll +shared/vstore_impl.ll workitem/get_global_id.cl workitem/get_global_size.cl diff --git a/generic/lib/shared/vstore.cl b/generic/lib/shared/vstore.cl index e88ccc5..5b84f47 100644 --- a/generic/lib/shared/vstore.cl +++ b/generic/lib/shared/vstore.cl @@ -15,10 +15,8 @@ } \ \ _CLC_OVERLOAD _CLC_DEF void vstore4(PRIM_TYPE##4 vec, size_t offset, ADDR_SPACE PRIM_TYPE *mem) { \ -mem[offset] = vec.s0; \ -mem[offset+1] = vec.s1; \ -mem[offset+2] = vec.s2; \ -mem[offset+3] = vec.s3; \ +vstore2(vec.lo, offset, mem); \ +vstore2(vec.hi, offset+2, mem); \ } \ \ _CLC_OVERLOAD _CLC_DEF void vstore8(PRIM_TYPE##8 vec, size_t offset, ADDR_SPACE PRIM_TYPE *mem) { \ @@ -36,13 +34,12 @@ VSTORE_VECTORIZE(SCALAR_GENTYPE, __local) \ VSTORE_VECTORIZE(SCALAR_GENTYPE, __global) \ +//int/uint are special... 
see below #define VSTORE_TYPES() \ VSTORE_ADDR_SPACES(char) \ VSTORE_ADDR_SPACES(uchar) \ VSTORE_ADDR_SPACES(short) \ VSTORE_ADDR_SPACES(ushort) \ -VSTORE_ADDR_SPACES(int) \ -VSTORE_ADDR_SPACES(uint) \ VSTORE_ADDR_SPACES(long) \ VSTORE_ADDR_SPACES(ulong) \ VSTORE_ADDR_SPACES(float) \ @@ -54,3 +51,57 @@ VSTORE_TYPES() VSTORE_ADDR_SPACES(double) #endif +VSTORE_VECTORIZE(int, __private) +VSTORE_VECTORIZE(int, __local) +VSTORE_VECTORIZE(uint, __private) +VSTORE_VECTORIZE(uint, __local) + +_CLC_OVERLOAD _CLC_DEF void vstore2(int2 vec, size_t offset, global int *mem) { +mem[offset] = vec.s0; +mem[offset+1] = vec.s1; +} +_CLC_OVERLOAD _CLC_DEF void vstore3(int3 vec, size_t offset, global int *mem) { +mem[offset] = vec.s0; +mem[offset+1] = vec.s1; +mem[offset+2] = vec.s2; +} +_CLC_OVERLOAD _CLC_DEF void vstore2(uint2 vec, size_t offset, global uint *mem) { +mem[offset] = vec.s0; +mem[offset+1] = vec.s1; +} +_CLC_OVERLOAD _CLC_DEF void vstore3(uint3 vec, size_t offset, global uint *mem) { +mem[offset] = vec.s0; +mem[offset+1] = vec.s1; +mem[offset+2] = vec.s2; +} + +/*Note: R600 probably doesn't support store <2 x ?> and <3 x ?>... so + * they aren't actually overridden here... 
lowest-common-denominator + */ +_CLC_DECL void __clc_vstore4_int__global(int4 vec, size_t offset, __global int *); +_CLC_DECL void __clc_vstore8_int__global(int8 vec, size_t offset, __global int *); +_CLC_DECL void __clc_vstore16_int__global(int16 vec, size_t offset, __global int *); + +_CLC_OVERLOAD _CLC_DEF void vstore4(int4 vec, size_t offset, global int *x) { +__clc_vstore4_int__global(vec, offset, x); +} +_CLC_OVERLOAD _CLC_DEF void vstore8(int8 vec, size_t offset, global int *x) { +__clc_vstore8_int__global(vec, offset, x); +} +_CLC_OVERLOAD _CLC_DEF void vstore16(int16 vec, size_t offset, global int *x) { +__clc_vstore16_int__global(vec, offset, x); +} + +_CLC_DECL void __clc_vstore4_uint__global(uint4 vec, size_t offset, __global uint *); +_CLC_DECL void __clc_vstore8_uint__global(uint8 vec, size_t offset, __global uint *); +_CLC_DECL void __clc_vstore16_uint__global(uint16 vec, size_t offset, __global uint *); + +_CLC_OVERLOAD _CLC_DEF void vstore4(uint4 vec, size_t offset, global uint *x) { +__clc_vstore4_uint__global(vec, offset, x); +} +_CLC_OVERLOAD _CLC_DEF void vstore8(uint8 vec, size_t offset, global uint *x) { +__clc_vstore8_uint__global(vec, offset, x); +} +_CLC_OVERLOAD _CLC_DEF void vstore16(uint16 vec, size_t offset, global uint *x) { +__clc_vstore16_uint__global(vec, offset, x); +} diff --git a/generic/lib/shared/vstore_if.ll b/generic/lib/shared/vstore_if.ll new file mode 100644 index 000..30eb552 --- /dev/null +++ b/generic/lib/shared/vstore_if.ll @@ -0,0 +1,59 @@ +;Start int global vstore + +declare void @__clc_vstore2_impl_i32__global(<2 x i32> %vec, i32 %x, i32 %y) +declare void @__clc_vstore3_impl_i32__global(<3 x i32> %vec, i32 %x, i32 %y) +declare void @__clc_vstore4_impl_i32__global(<4 x i32> %vec, i32 %x, i32 %y) +declare void @__clc_vstore8_impl_i32__global(<8 x i32> %vec, i32 %x, i32 %y) +declare
[Mesa-dev] [Bug 64934] New: [llvmpipe] SIGSEGV src/gallium/state_trackers/glx/xlib/glx_api.c:1363
https://bugs.freedesktop.org/show_bug.cgi?id=64934

Priority: medium
Bug ID: 64934
Keywords: have-backtrace
Assignee: mesa-dev@lists.freedesktop.org
Summary: [llvmpipe] SIGSEGV src/gallium/state_trackers/glx/xlib/glx_api.c:1363
Severity: critical
Classification: Unclassified
OS: Linux (All)
Reporter: v...@freedesktop.org
Hardware: x86-64 (AMD64)
Status: NEW
Version: git
Component: Other
Product: Mesa

mesa: b1797c3a3867ab60419bb9ec13dd9cb842edcbe3 (master)

Run glxinfo on llvmpipe on Fedora.

$ glxinfo
Segmentation fault (core dumped)

(gdb) bt
#0 0x7f457293ea17 in glXDestroyContext (dpy=0x211b540, ctx=0x0) at src/gallium/state_trackers/glx/xlib/glx_api.c:1363
#1 0x00403106 in create_context_flags (contextFlags=0, profileMask=1, direct=1, minor=, major=, fbconfig=0x212a3e0, dpy=0x211b540) at glxinfo.c:745
#2 create_context_with_config (dpy=0x211b540, config=0x212a3e0, coreProfile=, direct=1) at glxinfo.c:776
#3 0x0040357d in print_screen_info (dpy=dpy@entry=0x211b540, scrnum=scrnum@entry=0, allowDirect=allowDirect@entry=1, coreProfile=coreProfile@entry=1, limits=limits@entry=0, singleLine=singleLine@entry=0, coreWorked=coreWorked@entry=0) at glxinfo.c:854
#4 0x004016b4 in main (argc=, argv=) at glxinfo.c:1731
(gdb) frame 0
#0 0x7f457293ea17 in glXDestroyContext (dpy=0x211b540, ctx=0x0) at src/gallium/state_trackers/glx/xlib/glx_api.c:1363
1363 XMesaDestroyContext( glxCtx->xmesaContext );
(gdb) print glxCtx
$1 = (GLXContext) 0x0

--
You are receiving this mail because:
You are the assignee for the bug.
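The backtrace shows glXDestroyContext being called with ctx=0x0 and then dereferencing it. One common way to make a destroy entry point tolerate that is an early return on NULL. The sketch below only illustrates the pattern; it is not the Mesa fix, and `sketch_glx_context`, `sketch_backend_destroy` and `sketch_destroy_context` are hypothetical names standing in for the real GLX and XMesa types and calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the GLX context wrapper. */
struct sketch_glx_context {
    void *xmesa_context;   /* stands in for glxCtx->xmesaContext */
};

/* Counts backend destroys so the effect of the guard is observable. */
static int sketch_destroy_count = 0;

/* Stand-in for the backend destroy call; it expects a real context. */
static void sketch_backend_destroy(void *xmesa_context)
{
    assert(xmesa_context != NULL);
    sketch_destroy_count++;
}

/* Returns 1 if a context was destroyed, 0 if ctx was NULL and nothing
 * was done. The NULL case is the one that crashes in the report. */
static int sketch_destroy_context(struct sketch_glx_context *ctx)
{
    if (ctx == NULL)
        return 0;
    sketch_backend_destroy(ctx->xmesa_context);
    free(ctx);
    return 1;
}
```

Whether the guard belongs in the library, in glxinfo's error path, or in both is a separate question that the sketch does not try to answer.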
[Mesa-dev] [Bug 64935] New: [swrast] s_texfetch.c:1335: set_fetch_functions: Assertion `texImage->FetchTexel' failed.
https://bugs.freedesktop.org/show_bug.cgi?id=64935

Priority: medium
Bug ID: 64935
Keywords: have-backtrace, regression
CC: e...@anholt.net
Assignee: mesa-dev@lists.freedesktop.org
Summary: [swrast] s_texfetch.c:1335: set_fetch_functions: Assertion `texImage->FetchTexel' failed.
Severity: critical
Classification: Unclassified
OS: Linux (All)
Reporter: v...@freedesktop.org
Hardware: x86-64 (AMD64)
Status: NEW
Version: git
Component: Other
Product: Mesa

mesa: b1797c3a3867ab60419bb9ec13dd9cb842edcbe3 (master)

Run piglit texture-packed-formats on swrast.

$ ./bin/texture-packed-formats -auto
Mesa warning: failed to remap glClampColorARB
Mesa warning: failed to remap glTexBufferARB
Mesa warning: failed to remap glFramebufferTextureARB
Mesa warning: failed to remap glVertexAttribDivisorARB
Mesa warning: failed to remap glProgramParameteriARB
texture-packed-formats: ../../../src/mesa/swrast/s_texfetch.c:1335: set_fetch_functions: Assertion `texImage->FetchTexel' failed.
Aborted (core dumped)

(gdb) bt
#0 0x7fd37a518425 in __GI_raise (sig=) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x7fd37a51bb8b in __GI_abort () at abort.c:91
#2 0x7fd37a5110ee in __assert_fail_base (fmt=, assertion=0x7fd3770b82f5 "texImage->FetchTexel", file=0x7fd3770b8220 "../../../src/mesa/swrast/s_texfetch.c", line=, function=) at assert.c:94
#3 0x7fd37a511192 in __GI___assert_fail (assertion=0x7fd3770b82f5 "texImage->FetchTexel", file=0x7fd3770b8220 "../../../src/mesa/swrast/s_texfetch.c", line=1335, function=0x7fd3770b8310 "set_fetch_functions") at assert.c:103
#4 0x7fd376f6b347 in set_fetch_functions (samp=0x1fbcd84, texImage=0x21a4f50, dims=2) at ../../../src/mesa/swrast/s_texfetch.c:1335
#5 0x7fd376f6b41b in _mesa_update_fetch_functions (ctx=0x1fa3ed0, unit=0) at ../../../src/mesa/swrast/s_texfetch.c:1356
#6 0x7fd376f442a8 in _swrast_update_texture_samplers (ctx=0x1fa3ed0) at ../../../src/mesa/swrast/s_context.c:481
#7 0x7fd376f44553 in _swrast_validate_derived (ctx=0x1fa3ed0) at ../../../src/mesa/swrast/s_context.c:572
#8 0x7fd376f43e79 in _swrast_validate_triangle (ctx=0x1fa3ed0, v0=0x7fd37599f3b8, v1=0x7fd37599f730, v2=0x7fd37599f040) at ../../../src/mesa/swrast/s_context.c:353
#9 0x7fd376f446b0 in _swrast_Triangle (ctx=0x1fa3ed0, v0=0x7fd37599f3b8, v1=0x7fd37599f730, v2=0x7fd37599f040) at ../../../src/mesa/swrast/s_context.c:630
#10 0x7fd376f90fe9 in triangle_rgba (ctx=0x1fa3ed0, e0=1, e1=2, e2=0) at ../../../src/mesa/swrast_setup/ss_tritmp.h:177
#11 0x7fd376f1dfd0 in _tnl_render_poly_verts (ctx=0x1fa3ed0, start=0, count=4, flags=57) at ../../../src/mesa/tnl/t_vb_rendertmp.h:353
#12 0x7fd376f202bf in run_render (ctx=0x1fa3ed0, stage=0x2030fb0) at ../../../src/mesa/tnl/t_vb_render.c:322
#13 0x7fd376f0e236 in _tnl_run_pipeline (ctx=0x1fa3ed0) at ../../../src/mesa/tnl/t_pipeline.c:164
#14 0x7fd376f0fb48 in _tnl_draw_prims (ctx=0x1fa3ed0, arrays=0x201dab8, prim=0x201be94, nr_prims=1, ib=0x0, min_index=0, max_index=3) at ../../../src/mesa/tnl/t_draw.c:526
#15 0x7fd376f0f7f5 in _tnl_vbo_draw_prims (ctx=0x1fa3ed0, prim=0x201be94, nr_prims=1, ib=0x0, index_bounds_valid=1 '\001', min_index=0, max_index=3, tfb_vertcount=0x0) at ../../../src/mesa/tnl/t_draw.c:426
#16 0x7fd376eedbf8 in vbo_exec_vtx_flush (exec=0x201b738, keepUnmapped=1 '\001') at ../../../src/mesa/vbo/vbo_exec_draw.c:400
#17 0x7fd376ee65f5 in vbo_exec_FlushVertices_internal (exec=0x201b738, unmap=1 '\001') at ../../../src/mesa/vbo/vbo_exec_api.c:555
#18 0x7fd376ee8474 in vbo_exec_FlushVertices (ctx=0x1fa3ed0, flags=1) at ../../../src/mesa/vbo/vbo_exec_api.c:1164
#19 0x7fd376dd75db in enable_texture (ctx=0x1fa3ed0, state=0 '\000', texBit=1024) at ../../../src/mesa/main/enable.c:230
#20 0x7fd376dd9d00 in _mesa_set_enable (ctx=0x1fa3ed0, cap=3553, state=0 '\000') at ../../../src/mesa/main/enable.c:681
#21 0x7fd376ddb37e in _mesa_Disable (cap=3553) at ../../../src/mesa/main/enable.c:1055
#22 0x00401a65 in Test (intFmt=13, dims=2) at piglit/tests/texturing/texture-packed-formats.c:299
#23 0x00401b66 in piglit_display () at piglit/tests/texturing/texture-packed-formats.c:332
#24 0x7fd37a907478 in display () at piglit/tests/util/piglit-framework-gl/piglit_glut_framework.c:60
#25 0x7fd37a2bc137 in fghRedrawWindow (window=0x1f9fe60) at freeglut_main.c:210
#26 fghcbDisplayWindow (window=0x1f9fe60, enumerator=0x7fffed0635b0) at freeglut_main.c:227
#27 0x7fd37a2bf889 in fgEnumWindows (enumCallback=0x7fd37a2bc0d0 , enumerator=0x7fffed0635b0) at freeglut_structure.c:394
#28 0x7fd37a2bc5fa in fghDisplayAll () at freeglut_main.c:249
#29 glutMainLoopEvent () at freeglut_main.c:1450
#30 0x7fd37a2bcf05 in glutMainLoop () at freeglut
Re: [Mesa-dev] Error compiling mesa 9.1.1
Hi Matt,

On Thu, May 23, 2013 at 8:38 PM, Matt Turner wrote:
> On Thu, May 23, 2013 at 3:39 AM, Divick Kishore wrote:
>>> Could someone please help me build mesa 9.1.1?
>
> I think this is a build system bug caused by not building any classic
> hardware DRI drivers. Try --with-dri-drivers=i965,swrast.

I tried building with this option, but it still does not build. It now fails with the following error message:

make[7]: Entering directory `/home/divick/work/work/mesa-9.1.1/build/dri/src/mesa/drivers/dri/common'
  CC     utils.lo
  CC     dri_util.lo
  CC     libdri_test_stubs_la-dri_test.lo
  CC     xmlconfig.lo
  CCLD   libdri_test_stubs.la
  CCLD   libdricommon.la
make[7]: Leaving directory `/home/divick/work/work/mesa-9.1.1/build/dri/src/mesa/drivers/dri/common'
make[6]: Leaving directory `/home/divick/work/work/mesa-9.1.1/build/dri/src/mesa/drivers/dri/common'
Making all in i965
make[6]: Entering directory `/home/divick/work/work/mesa-9.1.1/build/dri/src/mesa/drivers/dri/i965'
make[6]: *** No rule to make target `intel_batchbuffer.lo', needed by `libi965_dri.la'. Stop.

Is there anything else that I could try?

Thanks for your help,
Regards,
Divick
Re: [Mesa-dev] [v4 10/10] egl: dri2: support for creating images out of dma buffers
When touching the src/egl/drivers/dri2 directory, use a commit subject that looks like "egl/dri2: STUFF", not "egl: dri2: STUFF".

On 05/02/2013 12:08 AM, Topi Pohjolainen wrote:
v2:
   - upon success close the given file descriptors
v3:
   - use specific entry for dma buffers instead of the basic for
     primes, and enable the extension based on the availability
     of the hook

Signed-off-by: Topi Pohjolainen
---
 src/egl/drivers/dri2/egl_dri2.c | 280
 1 file changed, 280 insertions(+)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 1011f27..cfa7cf0 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -507,6 +508,10 @@ dri2_setup_screen(_EGLDisplay *disp)
          disp->Extensions.KHR_gl_texture_2D_image = EGL_TRUE;
          disp->Extensions.KHR_gl_texture_cubemap_image = EGL_TRUE;
       }
+      if (dri2_dpy->image->base.version >= 8 &&
+          dri2_dpy->image->createImageFromDmaBufs) {
+         disp->Extensions.EXT_image_dma_buf_import = EGL_TRUE;
+      }
    }
 }

@@ -1170,6 +1175,279 @@ dri2_create_image_mesa_drm_buffer(_EGLDisplay *disp, _EGLContext *ctx,
    return dri2_create_image(disp, dri_image);
 }

+static EGLBoolean
+dri2_check_dma_buf_attribs(const _EGLImageAttribs *attrs)
+{
+   unsigned i;
+
+   /**
+    * The spec says:
+    *
+    * "Required attributes and their values are as follows:
+    *
+    *  * EGL_WIDTH & EGL_HEIGHT: The logical dimensions of the buffer in pixels
+    *
+    *  * EGL_LINUX_DRM_FOURCC_EXT: The pixel format of the buffer, as specified
+    *    by drm_fourcc.h and used as the pixel_format parameter of the
+    *    drm_mode_fb_cmd2 ioctl."
+    *
+    *  * EGL_DMA_BUF_PLANE0_FD_EXT: The dma_buf file descriptor of plane 0 of
+    *    the image.
+    *
+    *  * EGL_DMA_BUF_PLANE0_OFFSET_EXT: The offset from the start of the
+    *    dma_buf of the first sample in plane 0, in bytes.
+    *
+    *  * EGL_DMA_BUF_PLANE0_PITCH_EXT: The number of bytes between the start of
+    *    subsequent rows of samples in plane 0. May have special meaning for
+    *    non-linear formats."
+    *
+    * "* If <target> is EGL_LINUX_DMA_BUF_EXT, and the list of attributes is
+    *    incomplete, EGL_BAD_PARAMETER is generated."
+    */
+   if (attrs->Width <= 0 || attrs->Height <= 0 ||
+       !attrs->DMABufFourCC.IsPresent ||
+       !attrs->DMABufPlaneFds[0].IsPresent ||
+       !attrs->DMABufPlaneOffsets[0].IsPresent ||
+       !attrs->DMABufPlanePitches[0].IsPresent) {
+      _eglError(EGL_BAD_PARAMETER, "attribute(s) missing");
+      return EGL_FALSE;
+   }
+
+   /**
+    * Also:
+    *
+    * "If <target> is EGL_LINUX_DMA_BUF_EXT and one or more of the values
+    *  specified for a plane's pitch or offset isn't supported by EGL,
+    *  EGL_BAD_ACCESS is generated."
+    */
+   for (i = 0; i < sizeof(attrs->DMABufPlanePitches) /
+                   sizeof(attrs->DMABufPlanePitches[0]); ++i) {

Use ARRAY_SIZE here.

+      if (attrs->DMABufPlanePitches[i].IsPresent &&
+          attrs->DMABufPlanePitches[i].Value <= 0) {
+         _eglError(EGL_BAD_ACCESS, "invalid pitch");
+         return EGL_FALSE;
+      }
+   }
+
+   return EGL_TRUE;
+}
+
+/* Returns the total number of file descriptors zero indicating an error. */

/* Returns the total number of file descriptors. Zero indicates an error. */

+static unsigned
+dri2_check_dma_buf_format(const _EGLImageAttribs *attrs)
+{
+   switch (attrs->DMABufFourCC.Value) {
+   case DRM_FORMAT_RGB332:
+   case DRM_FORMAT_BGR233:
+   case DRM_FORMAT_XRGB4444:
+   case DRM_FORMAT_XBGR4444:
+   case DRM_FORMAT_RGBX4444:
+   case DRM_FORMAT_BGRX4444:
+   case DRM_FORMAT_ARGB4444:
+   case DRM_FORMAT_ABGR4444:
+   case DRM_FORMAT_RGBA4444:
+   case DRM_FORMAT_BGRA4444:
+   case DRM_FORMAT_XRGB1555:
+   case DRM_FORMAT_XBGR1555:
+   case DRM_FORMAT_RGBX5551:
+   case DRM_FORMAT_BGRX5551:
+   case DRM_FORMAT_ARGB1555:
+   case DRM_FORMAT_ABGR1555:
+   case DRM_FORMAT_RGBA5551:
+   case DRM_FORMAT_BGRA5551:
+   case DRM_FORMAT_RGB565:
+   case DRM_FORMAT_BGR565:
+   case DRM_FORMAT_RGB888:
+   case DRM_FORMAT_BGR888:
+   case DRM_FORMAT_XRGB8888:
+   case DRM_FORMAT_XBGR8888:
+   case DRM_FORMAT_RGBX8888:
+   case DRM_FORMAT_BGRX8888:
+   case DRM_FORMAT_ARGB8888:
+   case DRM_FORMAT_ABGR8888:
+   case DRM_FORMAT_RGBA8888:
+   case DRM_FORMAT_BGRA8888:
+   case DRM_FORMAT_XRGB2101010:
+   case DRM_FORMAT_XBGR2101010:
+   case DRM_FORMAT_RGBX1010102:
+   case DRM_FORMAT_BGRX1010102:
+   case DRM_FORMAT_ARGB2101010:
+   case DRM_FORMAT_ABGR2101010:
+   case DRM_FORMAT_RGBA1010102:
+   case DRM_FORMAT_BGRA1010102:
+   case DRM_FORMAT_YUYV:
+   case DRM_FORMAT_YVYU:
+   case DRM_FORMAT_UYVY:
+   case DRM_FORMAT_VYUY:
+      /* There must be one and only one plane present */
+      if
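To illustrate the ARRAY_SIZE suggestion from the review, here is a small self-contained sketch in C. It assumes the usual sizeof-based definition and uses a local `SKETCH_ARRAY_SIZE` so that it compiles on its own; Mesa provides its own ARRAY_SIZE macro, and the `sketch_attr` and `sketch_image_attribs` structs are hypothetical simplifications of the attribute structures in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the common idiom, only so the sketch is standalone. */
#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, simplified versions of the per-plane attribute records. */
struct sketch_attr { int value; int is_present; };
struct sketch_image_attribs { struct sketch_attr pitches[3]; };

/* Returns 1 if every pitch that is present is positive, 0 otherwise. */
static int sketch_pitches_valid(const struct sketch_image_attribs *attrs)
{
    assert(attrs != NULL);
    for (size_t i = 0; i < SKETCH_ARRAY_SIZE(attrs->pitches); ++i) {
        if (attrs->pitches[i].is_present && attrs->pitches[i].value <= 0)
            return 0;
    }
    return 1;
}
```

The loop bound now follows the array declaration automatically, so changing the number of planes would not require touching the loop.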
Re: [Mesa-dev] [v4 09/10] egl: definitions for EXT_image_dma_buf_import
On 05/02/2013 12:08 AM, Topi Pohjolainen wrote:
As specified in:

http://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_import.txt

Checking for the valid fourcc values is left for drivers avoiding dependency to drm header files here.

v2:
   - enforce EGL_NO_CONTEXT
v3:
   - declare the extension as EGL (not GLES)
v4:
   - do not update eglext.h manually but rely on update from
     Khronos instead

Signed-off-by: Topi Pohjolainen
---
 src/egl/main/eglapi.c     |  7 -
 src/egl/main/egldisplay.h |  1 +
 src/egl/main/eglimage.c   | 76 +++
 src/egl/main/eglimage.h   | 15 ++
 src/egl/main/eglmisc.c    |  1 +
 5 files changed, 99 insertions(+), 1 deletion(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index bcc5465..2355d45 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -1310,7 +1310,12 @@ eglCreateImageKHR(EGLDisplay dpy, EGLContext ctx, EGLenum target,
    _EGL_CHECK_DISPLAY(disp, EGL_NO_IMAGE_KHR, drv);
    if (!disp->Extensions.KHR_image_base)
       RETURN_EGL_EVAL(disp, EGL_NO_IMAGE_KHR);
-   if (!context && ctx != EGL_NO_CONTEXT)
+
+   /**
+    * "If <target> is EGL_LINUX_DMA_BUF_EXT, <dpy> must be a valid display,
+    *  <ctx> must be EGL_NO_CONTEXT..."
+    */
+   if (ctx != EGL_NO_CONTEXT && (!context || target == EGL_LINUX_DMA_BUF_EXT))
       RETURN_EGL_ERROR(disp, EGL_BAD_CONTEXT, EGL_NO_IMAGE_KHR);

    img = drv->API.CreateImageKHR(drv,
diff --git a/src/egl/main/egldisplay.h b/src/egl/main/egldisplay.h
index 4b33470..5a21f78 100644
--- a/src/egl/main/egldisplay.h
+++ b/src/egl/main/egldisplay.h
@@ -115,6 +115,7 @@ struct _egl_extensions

    EGLBoolean EXT_create_context_robustness;
    EGLBoolean EXT_buffer_age;
+   EGLBoolean EXT_image_dma_buf_import;
 };

diff --git a/src/egl/main/eglimage.c b/src/egl/main/eglimage.c
index bfae709..1cede31 100644
--- a/src/egl/main/eglimage.c
+++ b/src/egl/main/eglimage.c
@@ -93,6 +93,82 @@ _eglParseImageAttribList(_EGLImageAttribs *attrs, _EGLDisplay *dpy,
          attrs->PlaneWL = val;
          break;

+      case EGL_LINUX_DRM_FOURCC_EXT:
+         attrs->DMABufFourCC.Value = val;
+         attrs->DMABufFourCC.IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE0_FD_EXT:
+         attrs->DMABufPlaneFds[0].Value = val;
+         attrs->DMABufPlaneFds[0].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE0_OFFSET_EXT:
+         attrs->DMABufPlaneOffsets[0].Value = val;
+         attrs->DMABufPlaneOffsets[0].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE0_PITCH_EXT:
+         attrs->DMABufPlanePitches[0].Value = val;
+         attrs->DMABufPlanePitches[0].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE1_FD_EXT:
+         attrs->DMABufPlaneFds[1].Value = val;
+         attrs->DMABufPlaneFds[1].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE1_OFFSET_EXT:
+         attrs->DMABufPlaneOffsets[1].Value = val;
+         attrs->DMABufPlaneOffsets[1].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE1_PITCH_EXT:
+         attrs->DMABufPlanePitches[1].Value = val;
+         attrs->DMABufPlanePitches[1].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE2_FD_EXT:
+         attrs->DMABufPlaneFds[2].Value = val;
+         attrs->DMABufPlaneFds[2].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE2_OFFSET_EXT:
+         attrs->DMABufPlaneOffsets[2].Value = val;
+         attrs->DMABufPlaneOffsets[2].IsPresent = EGL_TRUE;
+         break;
+      case EGL_DMA_BUF_PLANE2_PITCH_EXT:
+         attrs->DMABufPlanePitches[2].Value = val;
+         attrs->DMABufPlanePitches[2].IsPresent = EGL_TRUE;
+         break;
+      case EGL_YUV_COLOR_SPACE_HINT_EXT:
+         if (val != EGL_ITU_REC601_EXT || val != EGL_ITU_REC709_EXT ||
+             val != EGL_ITU_REC2020_EXT) {

This should be `val != X && val != Y && val != Z`.

+            err = EGL_BAD_ATTRIBUTE;
+         } else {
+            attrs->DMABufYuvColorSpaceHint.Value = val;
+            attrs->DMABufYuvColorSpaceHint.IsPresent = EGL_TRUE;
+         }
+         break;
+      case EGL_SAMPLE_RANGE_HINT_EXT:
+         if (val != EGL_YUV_FULL_RANGE_EXT || val != EGL_YUV_NARROW_RANGE_EXT) {
+            err = EGL_BAD_ATTRIBUTE;

Again, s/||/&&/. Also, there is a tab above, but all the surrounding code uses spaces.

+         } else {
+            attrs->DMABufSampleRangeHint.Value = val;
+            attrs->DMABufSampleRangeHint.IsPresent = EGL_TRUE;
+         }
+         break;
+      case EGL_YUV_CHROMA_HORIZONTAL_SITING_HINT_EXT:
+         if (val != EGL_YUV_CHROMA_SITING_0_EXT ||
+             val != EGL_YUV_CHROMA_SITING_0_5_EXT) {
+            err = EGL_BAD_ATTRIBUTE;

Again, s/||/&&/ and a tab.

+         } else {
+            attrs->DMABufChromaHorizontalSiting.Value = val;
+            attrs->DMABufChromaHorizontalSiting.IsPresent = EGL_TRUE;
+
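The `||` versus `&&` point from the review can be checked in isolation. The sketch below uses hypothetical stand-in constants (`SKETCH_REC601` and friends) rather than the real EGL tokens from eglext.h, and compares the condition as written in the patch with the form suggested in the review.

```c
#include <assert.h>

/* Hypothetical stand-in values; the real EGL_ITU_REC*_EXT tokens are
 * defined in eglext.h. */
enum { SKETCH_REC601 = 1, SKETCH_REC709 = 2, SKETCH_REC2020 = 3 };

/* Condition as written in the patch. It is true for every val, because
 * no single value can be equal to all three constants at once. */
static int sketch_rejects_with_or(int val)
{
    return val != SKETCH_REC601 || val != SKETCH_REC709 ||
           val != SKETCH_REC2020;
}

/* Condition as suggested in the review: reject only when val matches
 * none of the accepted constants. */
static int sketch_rejects_with_and(int val)
{
    return val != SKETCH_REC601 && val != SKETCH_REC709 &&
           val != SKETCH_REC2020;
}
```

With the `||` form even a valid value is rejected, so the else branch that records the attribute can never run; the `&&` form accepts the three listed values and rejects everything else.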
Re: [Mesa-dev] [v4 06/10] intel: prepare for dri images having more than one plane
On 05/02/2013 12:08 AM, Topi Pohjolainen wrote:

v2 (as advised by Eric):
- use ARRAY_SIZE
- re-use 'image_destroy' for cleaning up after failure
- check directly the region pointer instead of the buffer object when determining if a region exists

Signed-off-by: Topi Pohjolainen
---
 src/mesa/drivers/dri/intel/intel_screen.c | 103 +-
 1 file changed, 72 insertions(+), 31 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_screen.c b/src/mesa/drivers/dri/intel/intel_screen.c
index 4973441..d822b1c 100644
--- a/src/mesa/drivers/dri/intel/intel_screen.c
+++ b/src/mesa/drivers/dri/intel/intel_screen.c
@@ -490,8 +490,14 @@ intel_create_image_from_texture(__DRIcontext *context, int target,
 static void
 intel_destroy_image(__DRIimage *image)
 {
-    intel_region_release(&image->regions[0]);
-    free(image);
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(image->regions); ++i) {
+      if (image->regions[i])
+         intel_region_release(&image->regions[i]);
+   }
+
+   free(image);
 }

 static __DRIimage *
@@ -568,16 +574,22 @@ intel_query_image(__DRIimage *image, int attrib, int *value)
 static __DRIimage *
 intel_dup_image(__DRIimage *orig_image, void *loaderPrivate)
 {
+   int i;
    __DRIimage *image;

    image = calloc(1, sizeof *image);
    if (image == NULL)
       return NULL;

-   intel_region_reference(&image->regions[0], orig_image->regions[0]);
-   if (image->regions[0] == NULL) {
-      free(image);
-      return NULL;

Pre-patch, this hunk returned NULL if orig_image->regions[0] was somehow NULL.

+   for (i = 0; i < ARRAY_SIZE(image->regions); ++i) {
+      if (!orig_image->regions[i])
+         break;

Post-patch, if orig_image->regions[0] was NULL, then this function no longer returns NULL because of the above break. To ensure that this patch doesn't regress anything, it needs to reproduce that behavior with `if (orig_image->regions[0] == NULL) return NULL`. Or, if you're confident (... I'm not, but maybe you are) that orig_image->regions[0] is never NULL, then assert that.
+ + intel_region_reference(&image->regions[i], orig_image->regions[i]); + if (image->regions[i] == NULL) { + intel_destroy_image(image); + return NULL; + } } image->internal_format = orig_image->internal_format; @@ -646,47 +658,76 @@ intel_create_image_from_names(__DRIscreen *screen, } static __DRIimage * +intel_setup_image_from_fds(struct intel_screen *screen, int width, int height, + const struct intel_image_format *f, + const int *fds, int num_fds, const int *strides, + void *loaderPriv) +{ I don't see the utility in extracting this code out of intel_create_image_from_fds() into its own, similarly named function. In fact, it makes the code harder to read. If no following patch reuses this function, then its body should remain in its original location, intel_create_image_from_fds. + int i; + __DRIimage *img; + + if (f->nplanes == 1) + img = intel_allocate_image(f->planes[0].dri_format, loaderPriv); + else + img = intel_allocate_image(__DRI_IMAGE_FORMAT_NONE, loaderPriv); + + if (img == NULL) + return NULL; + + for (i = 0; i < num_fds; i++) { + img->regions[i] = intel_region_alloc_for_fd(screen, f->planes[i].cpp, + width >> f->planes[i].width_shift, + height >> f->planes[i].height_shift, + strides[i], fds[i], "image"); + + if (img->regions[i] == NULL) { + intel_destroy_image(img); + return NULL; + } + } + + intel_setup_image_from_dimensions(img); + + return img; +} + +static __DRIimage * intel_create_image_from_fds(__DRIscreen *screen, int width, int height, int fourcc, int *fds, int num_fds, int *strides, int *offsets, void *loaderPrivate) { struct intel_screen *intelScreen = screen->driverPrivate; - struct intel_image_format *f; + struct intel_image_format *f = intel_image_format_lookup(fourcc); __DRIimage *image; int i, index; - if (fds == NULL || num_fds != 1) - return NULL; - - f = intel_image_format_lookup(fourcc); - if (f == NULL) + /** +* In case the image is to consist of multiple regions, there must be exactly +* one region per plane. 
+*/ + if (fds == NULL || f == NULL || (num_fds > 1 && f->nplanes != num_fds)) return NULL; - if (f->nplanes == 1) - image = intel_allocate_image(f->planes[0].dri_format, loaderPriv); - else - image = intel_allocate_image(__DRI_IMAGE_FORMAT_NONE, loaderPriv); - + image = intel_setup_image_from_fds(intelScreen, width, height, f, + fds, num_fds, strides, loaderPrivate); if (image == NULL) return NULL; - image->regions[0] = intel
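One way to keep the pre-patch behaviour of intel_dup_image() discussed in this review is to test the first region before entering the loop. The sketch below only illustrates that idea and is not the driver code: `struct region`, `struct image` and all function names are simplified, hypothetical stand-ins for intel_region, __DRIimageRec and their helpers, and the reference counting is reduced to a bare counter.

```c
#include <stddef.h>
#include <stdlib.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, simplified stand-ins for intel_region and __DRIimageRec. */
struct region {
   int refcount;
};

struct image {
   struct region *regions[3];
};

/* Take a reference on src and store the pointer in *dst. */
static void
region_reference(struct region **dst, struct region *src)
{
   src->refcount++;
   *dst = src;
}

/* Drop the reference held in *reg and clear the pointer. This sketch
 * never frees the region itself. */
static void
region_release(struct region **reg)
{
   (*reg)->refcount--;
   *reg = NULL;
}

/* Release every region the image holds, then the image itself. */
static void
image_destroy(struct image *img)
{
   for (size_t i = 0; i < ARRAY_SIZE(img->regions); ++i) {
      if (img->regions[i])
         region_release(&img->regions[i]);
   }
   free(img);
}

/* Duplicate an image. Returns NULL when the original has no first
 * region, which is the pre-patch behaviour the review asks to keep. */
static struct image *
image_dup(const struct image *orig)
{
   struct image *img;

   if (orig->regions[0] == NULL)
      return NULL;

   img = calloc(1, sizeof *img);
   if (img == NULL)
      return NULL;

   for (size_t i = 0; i < ARRAY_SIZE(img->regions); ++i) {
      if (!orig->regions[i])
         break;
      region_reference(&img->regions[i], orig->regions[i]);
   }

   return img;
}
```

Checking before the allocation also avoids having to free a half-built image on that path. Whether an assertion would be the better choice depends on whether a NULL first region can actually occur, which the thread leaves open.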
Re: [Mesa-dev] [v4 03/10] intel: replace single region with a vector of regions
On 05/21/2013 11:17 PM, Pohjolainen, Topi wrote:
On Tue, May 21, 2013 at 10:11:17PM -0700, Chad Versace wrote:
On 05/02/2013 12:08 AM, Topi Pohjolainen wrote:

No functional change in preparation for supporting multiple planes per image each having its own region.

Signed-off-by: Topi Pohjolainen
---
 src/mesa/drivers/dri/intel/intel_fbo.c       |  6 +--
 src/mesa/drivers/dri/intel/intel_regions.h   |  7 ++-
 src/mesa/drivers/dri/intel/intel_screen.c    | 69 ++--
 src/mesa/drivers/dri/intel/intel_tex_image.c |  2 +-
 4 files changed, 45 insertions(+), 39 deletions(-)

[snip]

diff --git a/src/mesa/drivers/dri/intel/intel_regions.h b/src/mesa/drivers/dri/intel/intel_regions.h
index 1fb6b27..e610f6b 100644
--- a/src/mesa/drivers/dri/intel/intel_regions.h
+++ b/src/mesa/drivers/dri/intel/intel_regions.h
@@ -129,8 +129,13 @@ struct intel_image_format {
    } planes[3];
 };

+/**
+ * An image representing multiple planes may come in two flavours:
+ * - all planes in single region but in different offsets or
+ * - each plane in its own region.
+ */

In case (1), does image->regions contain a single image, or does it contain 3 pointers to the same image?

More importantly, I don't understand a key point here. By examining a given instance of __DRIimageRec, how can I determine if the image falls under case (1) or case (2)?

Here is a concrete instance of the problem I don't know how to solve. Suppose that I have an instance of __DRIimageRec and that I've determined so far that its contents are as below, where image->planar_format was obtained by a lookup in the intel_screen.c:intel_image_formats table.

image = {
    // ...
    .planar_format = {
        .fourcc = __DRI_IMAGE_FOURCC_YUV422,
        .components = __DRI_IMAGE_COMPONENTS_Y_U_V,
        .nplanes = 3,
        .planes = {
            [0] = {
                .buffer_index = 0,
                .dri_format = __DRI_IMAGE_FORMAT_R8,
                // ...
            },
            [1] = {
                .buffer_index = 1,
                .dri_format = __DRI_IMAGE_FORMAT_R8,
                // ...
            },
            [2] = {
                .buffer_index = 2,
                .dri_format = __DRI_IMAGE_FORMAT_R8,
                // ...
            },
        },
    },
};

With this information, how can I know if dereferencing image->regions[1] will segfault or not? That is, how can I know if this image has one or three images?

All the entities that produce instances of 'struct __DRIimageRec' make certain that 'regions' contains as many valid pointers as there are planes. Hence the controlling counter is 'planar_format.nplanes'. I originally thought of adding a counter along with 'regions', say 'nregions', but then decided against it as it would be duplicating 'planar_format.nplanes' - they would always agree anyway. If 'planar_format' itself is missing (zero pointer) the image is implicitly always packed, having only one region.

In this patch I have simply extended the number of pointers, but no logic is yet trying to access elements of 'regions' beyond the first. The following patches introduce planar cases where other elements are accessed, with 'planar_format' controlling how many. In addition, patch number six takes advantage of the fact that all the unused region pointers in the array are fixed to zero (for now all instances of 'struct __DRIimageRec' are allocated from the heap and all members are initialised to zero). It is also the patch that introduces the creation of images having more than one region, guaranteeing that the planar format is also set accordingly.

I think I understand what you are getting at - these rules are not obvious and should probably be enforced and/or made clearer. Would you have any preference, introducing the 'nregions' perhaps?

I'd prefer to not add 'nregions', since that is duplicate information. I think a fuller explanation in the comments here would suffice.

 struct __DRIimageRec {
-   struct intel_region *region;
+   struct intel_region *regions[3];
    GLenum internal_format;
    uint32_t dri_format;
    GLuint format;

[snip]

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
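The rule described in the reply above ('planar_format.nplanes' is the controlling counter, and a missing 'planar_format' means a single packed region) could be captured in a small helper rather than a separate 'nregions' field. The following is a hedged sketch under that stated rule only: the struct layouts are simplified, hypothetical stand-ins for __DRIimageRec and intel_image_format, and `image_num_regions` is a name invented for this example, not a function from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct region;

struct image_format {
   int nplanes;
};

struct image {
   struct region *regions[3];
   const struct image_format *planar_format; /* NULL for packed images */
};

/* Number of valid entries in img->regions, assuming the rule stated in
 * the thread: no planar format means one packed region, otherwise
 * nplanes is the controlling counter. */
static int
image_num_regions(const struct image *img)
{
   if (img->planar_format == NULL)
      return 1;

   /* nplanes must fit the fixed-size regions array. */
   assert(img->planar_format->nplanes >= 1 &&
          img->planar_format->nplanes <= 3);
   return img->planar_format->nplanes;
}
```

A caller would then index `regions[i]` only for `i < image_num_regions(img)`. Note that this encodes the invariant exactly as worded in the reply; if images with several planes sharing one region are also allowed, the helper would need a different rule.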
Re: [Mesa-dev] Error compiling mesa 9.1.1
On Thu, May 23, 2013 at 8:04 PM, Divick Kishore wrote:
> Hi Matt,
>
> On Thu, May 23, 2013 at 8:38 PM, Matt Turner wrote:
>> On Thu, May 23, 2013 at 3:39 AM, Divick Kishore Could someone please help me build mesa 9.1.1?
>>
>> I think this is a build system bug caused by not building any classic
>> hardware DRI drivers. Try --with-dri-drivers=i965,swrast.
>
> I tried building with this option but still it does not build. It now
> fails with the following error message.
>
> make[7]: Entering directory
> `/home/divick/work/work/mesa-9.1.1/build/dri/src/mesa/drivers/dri/common'
> CC utils.lo
> CC dri_util.lo
> CC libdri_test_stubs_la-dri_test.lo
> CC xmlconfig.lo
> CCLD libdri_test_stubs.la
> CCLD libdricommon.la
> make[7]: Leaving directory
> `/home/divick/work/work/mesa-9.1.1/build/dri/src/mesa/drivers/dri/common'
> make[6]: Leaving directory
> `/home/divick/work/work/mesa-9.1.1/build/dri/src/mesa/drivers/dri/common'
> Making all in i965
> make[6]: Entering directory
> `/home/divick/work/work/mesa-9.1.1/build/dri/src/mesa/drivers/dri/i965'
> make[6]: *** No rule to make target `intel_batchbuffer.lo', needed by
> `libi965_dri.la'. Stop.
>
>
> Anything else that I could try?
>
> Thanks for your help,
> Regards,
> Divick

From the "build" in your path, it looks like you might be trying to do
an out-of-tree build. I don't remember if that completely worked with
9.1.

I just untarred 9.1.3 and did

libtoolize --force
./autogen.sh --with-dri-drivers=i965,swrast
--with-gallium-drivers=swrast --enable-glx-tls
--with-egl-platforms="x11" --enable-gles1 --enable-gles2
--enable-gallium-egl --disable-glu
make -jX

and it built.
Re: [Mesa-dev] [v4 10/10] egl: dri2: support for creating images out of dma buffers
On Thu, May 23, 2013 at 09:39:30PM -0700, Chad Versace wrote: > When touching the src/egl/drivers/dri2 directory, use a commit subject > that looks like "egl/dri2: STUFF", not "egl: dri2: STUFF". > > On 05/02/2013 12:08 AM, Topi Pohjolainen wrote: > >v2: > >- upon success close the given file descriptors > > > >v3: > >- use specific entry for dma buffers instead of the basic for > > primes, and enable the extension based on the availability > > of the hook > > > >Signed-off-by: Topi Pohjolainen > >--- > > src/egl/drivers/dri2/egl_dri2.c | 280 > > > > 1 file changed, 280 insertions(+) > > > >diff --git a/src/egl/drivers/dri2/egl_dri2.c > >b/src/egl/drivers/dri2/egl_dri2.c > >index 1011f27..cfa7cf0 100644 > >--- a/src/egl/drivers/dri2/egl_dri2.c > >+++ b/src/egl/drivers/dri2/egl_dri2.c > >@@ -34,6 +34,7 @@ > > #include > > #include > > #include > >+#include > > #include > > #include > > #include > >@@ -507,6 +508,10 @@ dri2_setup_screen(_EGLDisplay *disp) > > disp->Extensions.KHR_gl_texture_2D_image = EGL_TRUE; > > disp->Extensions.KHR_gl_texture_cubemap_image = EGL_TRUE; > >} > >+ if (dri2_dpy->image->base.version >= 8 && > >+ dri2_dpy->image->createImageFromDmaBufs) { > >+ disp->Extensions.EXT_image_dma_buf_import = EGL_TRUE; > >+ } > > } > > } > > > >@@ -1170,6 +1175,279 @@ dri2_create_image_mesa_drm_buffer(_EGLDisplay *disp, > >_EGLContext *ctx, > > return dri2_create_image(disp, dri_image); > > } > > > >+static EGLBoolean > >+dri2_check_dma_buf_attribs(const _EGLImageAttribs *attrs) > >+{ > >+ unsigned i; > >+ > >+ /** > >+ * The spec says: > >+ * > >+ * "Required attributes and their values are as follows: > >+ * > >+ * * EGL_WIDTH & EGL_HEIGHT: The logical dimensions of the buffer in > >pixels > >+ * > >+ * * EGL_LINUX_DRM_FOURCC_EXT: The pixel format of the buffer, as > >specified > >+ *by drm_fourcc.h and used as the pixel_format parameter of the > >+ *drm_mode_fb_cmd2 ioctl." 
> >+ * > >+ * * EGL_DMA_BUF_PLANE0_FD_EXT: The dma_buf file descriptor of plane 0 > >of > >+ *the image. > >+ * > >+ * * EGL_DMA_BUF_PLANE0_OFFSET_EXT: The offset from the start of the > >+ *dma_buf of the first sample in plane 0, in bytes. > >+ * > >+ * * EGL_DMA_BUF_PLANE0_PITCH_EXT: The number of bytes between the > >start of > >+ *subsequent rows of samples in plane 0. May have special meaning > >for > >+ *non-linear formats." > >+ * > >+ * "* If is EGL_LINUX_DMA_BUF_EXT, and the list of attributes > >is > >+ *incomplete, EGL_BAD_PARAMETER is generated." > >+ */ > >+ if (attrs->Width <= 0 || attrs->Height <= 0 || > >+ !attrs->DMABufFourCC.IsPresent || > >+ !attrs->DMABufPlaneFds[0].IsPresent || > >+ !attrs->DMABufPlaneOffsets[0].IsPresent || > >+ !attrs->DMABufPlanePitches[0].IsPresent) { > >+ _eglError(EGL_BAD_PARAMETER, "attribute(s) missing"); > >+ return EGL_FALSE; > >+ } > >+ > >+ /** > >+* Also: > >+* > >+* "If is EGL_LINUX_DMA_BUF_EXT and one or more of the values > >+* specified for a plane's pitch or offset isn't supported by EGL, > >+* EGL_BAD_ACCESS is generated." > >+*/ > >+ for (i = 0; i < sizeof(attrs->DMABufPlanePitches) / > >+ sizeof(attrs->DMABufPlanePitches[0]); ++i) { > > Use ARRAY_SIZE here. Will do. > > >+ if (attrs->DMABufPlanePitches[i].IsPresent && > >+ attrs->DMABufPlanePitches[i].Value <= 0) { > >+ _eglError(EGL_BAD_ACCESS, "invalid pitch"); > >+ return EGL_FALSE; > >+ } > >+ } > >+ > >+ return EGL_TRUE; > >+} > >+ > >+/* Returns the total number of file descriptors zero indicating an error. */ > > /* Returns the total number of file descriptors. Zero indicates an error. */ Ok. 
> > >+static unsigned > >+dri2_check_dma_buf_format(const _EGLImageAttribs *attrs) > >+{ > >+ switch (attrs->DMABufFourCC.Value) { > >+ case DRM_FORMAT_RGB332: > >+ case DRM_FORMAT_BGR233: > >+ case DRM_FORMAT_XRGB: > >+ case DRM_FORMAT_XBGR: > >+ case DRM_FORMAT_RGBX: > >+ case DRM_FORMAT_BGRX: > >+ case DRM_FORMAT_ARGB: > >+ case DRM_FORMAT_ABGR: > >+ case DRM_FORMAT_RGBA: > >+ case DRM_FORMAT_BGRA: > >+ case DRM_FORMAT_XRGB1555: > >+ case DRM_FORMAT_XBGR1555: > >+ case DRM_FORMAT_RGBX5551: > >+ case DRM_FORMAT_BGRX5551: > >+ case DRM_FORMAT_ARGB1555: > >+ case DRM_FORMAT_ABGR1555: > >+ case DRM_FORMAT_RGBA5551: > >+ case DRM_FORMAT_BGRA5551: > >+ case DRM_FORMAT_RGB565: > >+ case DRM_FORMAT_BGR565: > >+ case DRM_FORMAT_RGB888: > >+ case DRM_FORMAT_BGR888: > >+ case DRM_FORMAT_XRGB: > >+ case DRM_FORMAT_XBGR: > >+ case DRM_FORMAT_RGBX: > >+ case DRM_FORMAT_BGRX: > >+ case DRM_FORMAT_ARGB: > >+ case DRM
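The "Use ARRAY_SIZE here" suggestion made for the pitch loop in dri2_check_dma_buf_attribs() can be sketched in isolation as follows. This is an illustration rather than the patch itself: the macro is defined locally with the usual idiom so the snippet compiles on its own, and `struct attr`, `struct image_attribs` and `pitches_are_valid` are hypothetical, simplified stand-ins for the _EGLImageAttribs fields and the check in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Element count of a true array (not a pointer). Defined here with the
 * usual idiom so the sketch is self-contained; the tree has its own
 * macro of this name. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, simplified stand-in for one attribute record. */
struct attr {
   int value;
   bool is_present;
};

struct image_attribs {
   struct attr plane_pitches[3];
};

/* Returns false if any pitch that was supplied is not positive;
 * pitches that were not supplied are ignored. */
static bool
pitches_are_valid(const struct image_attribs *attrs)
{
   for (size_t i = 0; i < ARRAY_SIZE(attrs->plane_pitches); ++i) {
      if (attrs->plane_pitches[i].is_present &&
          attrs->plane_pitches[i].value <= 0)
         return false;
   }
   return true;
}
```

Compared with spelling out the two `sizeof` expressions in the loop header, the macro keeps the bound tied to the array declaration and makes the loop fit on one line.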
Re: [Mesa-dev] [v4 09/10] egl: definitions for EXT_image_dma_buf_import
On Thu, May 23, 2013 at 09:40:09PM -0700, Chad Versace wrote: > On 05/02/2013 12:08 AM, Topi Pohjolainen wrote: > >As specified in: > > > >http://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_import.txt > > > >Checking for the valid fourcc values is left for drivers avoiding > >dependency to drm header files here. > > > >v2: > >- enforce EGL_NO_CONTEXT > > > >v3: > >- declare the extension as EGL (not GLES) > > > >v4: > >- do not update eglext.h manually but rely on update from > > Khronos instead > > > >Signed-off-by: Topi Pohjolainen > >--- > > src/egl/main/eglapi.c | 7 - > > src/egl/main/egldisplay.h | 1 + > > src/egl/main/eglimage.c | 76 > > +++ > > src/egl/main/eglimage.h | 15 ++ > > src/egl/main/eglmisc.c| 1 + > > 5 files changed, 99 insertions(+), 1 deletion(-) > > > >diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c > >index bcc5465..2355d45 100644 > >--- a/src/egl/main/eglapi.c > >+++ b/src/egl/main/eglapi.c > >@@ -1310,7 +1310,12 @@ eglCreateImageKHR(EGLDisplay dpy, EGLContext ctx, > >EGLenum target, > > _EGL_CHECK_DISPLAY(disp, EGL_NO_IMAGE_KHR, drv); > > if (!disp->Extensions.KHR_image_base) > >RETURN_EGL_EVAL(disp, EGL_NO_IMAGE_KHR); > >- if (!context && ctx != EGL_NO_CONTEXT) > >+ > >+ /** > >+* "If is EGL_LINUX_DMA_BUF_EXT, must be a valid display, > >+* must be EGL_NO_CONTEXT..." 
> >+*/ > >+ if (ctx != EGL_NO_CONTEXT && (!context || target == > >EGL_LINUX_DMA_BUF_EXT)) > >RETURN_EGL_ERROR(disp, EGL_BAD_CONTEXT, EGL_NO_IMAGE_KHR); > > > > img = drv->API.CreateImageKHR(drv, > >diff --git a/src/egl/main/egldisplay.h b/src/egl/main/egldisplay.h > >index 4b33470..5a21f78 100644 > >--- a/src/egl/main/egldisplay.h > >+++ b/src/egl/main/egldisplay.h > >@@ -115,6 +115,7 @@ struct _egl_extensions > > > > EGLBoolean EXT_create_context_robustness; > > EGLBoolean EXT_buffer_age; > >+ EGLBoolean EXT_image_dma_buf_import; > > }; > > > > > >diff --git a/src/egl/main/eglimage.c b/src/egl/main/eglimage.c > >index bfae709..1cede31 100644 > >--- a/src/egl/main/eglimage.c > >+++ b/src/egl/main/eglimage.c > >@@ -93,6 +93,82 @@ _eglParseImageAttribList(_EGLImageAttribs *attrs, > >_EGLDisplay *dpy, > > attrs->PlaneWL = val; > > break; > > > >+ case EGL_LINUX_DRM_FOURCC_EXT: > >+ attrs->DMABufFourCC.Value = val; > >+ attrs->DMABufFourCC.IsPresent = EGL_TRUE; > >+ break; > >+ case EGL_DMA_BUF_PLANE0_FD_EXT: > >+ attrs->DMABufPlaneFds[0].Value = val; > >+ attrs->DMABufPlaneFds[0].IsPresent = EGL_TRUE; > >+ break; > >+ case EGL_DMA_BUF_PLANE0_OFFSET_EXT: > >+ attrs->DMABufPlaneOffsets[0].Value = val; > >+ attrs->DMABufPlaneOffsets[0].IsPresent = EGL_TRUE; > >+ break; > >+ case EGL_DMA_BUF_PLANE0_PITCH_EXT: > >+ attrs->DMABufPlanePitches[0].Value = val; > >+ attrs->DMABufPlanePitches[0].IsPresent = EGL_TRUE; > >+ break; > >+ case EGL_DMA_BUF_PLANE1_FD_EXT: > >+ attrs->DMABufPlaneFds[1].Value = val; > >+ attrs->DMABufPlaneFds[1].IsPresent = EGL_TRUE; > >+ break; > >+ case EGL_DMA_BUF_PLANE1_OFFSET_EXT: > >+ attrs->DMABufPlaneOffsets[1].Value = val; > >+ attrs->DMABufPlaneOffsets[1].IsPresent = EGL_TRUE; > >+ break; > >+ case EGL_DMA_BUF_PLANE1_PITCH_EXT: > >+ attrs->DMABufPlanePitches[1].Value = val; > >+ attrs->DMABufPlanePitches[1].IsPresent = EGL_TRUE; > >+ break; > >+ case EGL_DMA_BUF_PLANE2_FD_EXT: > >+ attrs->DMABufPlaneFds[2].Value = val; > >+ 
attrs->DMABufPlaneFds[2].IsPresent = EGL_TRUE; > >+ break; > >+ case EGL_DMA_BUF_PLANE2_OFFSET_EXT: > >+ attrs->DMABufPlaneOffsets[2].Value = val; > >+ attrs->DMABufPlaneOffsets[2].IsPresent = EGL_TRUE; > >+ break; > >+ case EGL_DMA_BUF_PLANE2_PITCH_EXT: > >+ attrs->DMABufPlanePitches[2].Value = val; > >+ attrs->DMABufPlanePitches[2].IsPresent = EGL_TRUE; > >+ break; > >+ case EGL_YUV_COLOR_SPACE_HINT_EXT: > >+ if (val != EGL_ITU_REC601_EXT || val != EGL_ITU_REC709_EXT || > >+ val != EGL_ITU_REC2020_EXT) { > > This should be `val != X && val != Y && val != Z`. > > >+err = EGL_BAD_ATTRIBUTE; > >+ } else { > >+attrs->DMABufYuvColorSpaceHint.Value = val; > >+attrs->DMABufYuvColorSpaceHint.IsPresent = EGL_TRUE; > >+ } > >+ break; > >+ case EGL_SAMPLE_RANGE_HINT_EXT: > >+ if (val != EGL_YUV_FULL_RANGE_EXT || val != > >EGL_YUV_NARROW_RANGE_EXT) { > >+err = EGL_BAD_ATTRIBUTE; > > Again, s/||/&&/. Also, there is a tab above, but all the surrounding code > uses spaces. > > >+ } else { > >+attrs->DMABufSampleRangeHint.Value = val; > >+attrs->DMABufSampl
[Mesa-dev] [PATCH] glsl linker: Initialize member variable interface_namespace.
Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee
---
 src/glsl/lower_named_interface_blocks.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glsl/lower_named_interface_blocks.cpp b/src/glsl/lower_named_interface_blocks.cpp
index eba667a..922cc02 100644
--- a/src/glsl/lower_named_interface_blocks.cpp
+++ b/src/glsl/lower_named_interface_blocks.cpp
@@ -72,7 +72,8 @@ public:
    hash_table *interface_namespace;
    flatten_named_interface_blocks_declarations(void *mem_ctx)
-      : mem_ctx(mem_ctx)
+      : mem_ctx(mem_ctx),
+        interface_namespace(NULL)
    {
    }
--
1.8.2.1
Re: [Mesa-dev] Error compiling mesa 9.1.1
Hi Matt,

> From the "build" in your path, it looks like you might be trying to do
> an out-of-tree build. I don't remember if that completely worked with
> 9.1.
>
> I just untarred 9.1.3 and did
>
> libtoolize --force
> ./autogen.sh --with-dri-drivers=i965,swrast
> --with-gallium-drivers=swrast --enable-glx-tls
> --with-egl-platforms="x11" --enable-gles1 --enable-gles2
> --enable-gallium-egl --disable-glu
> make -jX
>
> and it built.

Thanks for your update. It builds fine with 9.1.3 but not with 9.1.1. I will simply start using 9.1.3.

Thanks again,
Regards,
Divick