date:20140214

Re: [Mesa-dev] [PATCH 1/2] vl: add motion adaptive deinterlacer

2014-02-14 Thread Christian König


A really nice piece of work, thanks a lot.

Both patches reviewed and pushed upstream.

Cheers,
Christian.

On 13.02.2014 at 21:32, Grigori Goronzy wrote:

---
  src/gallium/auxiliary/Makefile.sources |   3 +-
  src/gallium/auxiliary/vl/vl_deint_filter.c | 491 +
  src/gallium/auxiliary/vl/vl_deint_filter.h |  78 +
  3 files changed, 571 insertions(+), 1 deletion(-)
  create mode 100644 src/gallium/auxiliary/vl/vl_deint_filter.c
  create mode 100644 src/gallium/auxiliary/vl/vl_deint_filter.h

diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources
index c89cbdd..19004e0 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -155,7 +155,8 @@ C_SOURCES := \
  vl/vl_idct.c \
vl/vl_mc.c \
  vl/vl_vertex_buffers.c \
-vl/vl_video_buffer.c
+vl/vl_video_buffer.c \
+   vl/vl_deint_filter.c
  
  GENERATED_SOURCES := \

indices/u_indices_gen.c \
diff --git a/src/gallium/auxiliary/vl/vl_deint_filter.c b/src/gallium/auxiliary/vl/vl_deint_filter.c
new file mode 100644
index 000..9b05154
--- /dev/null
+++ b/src/gallium/auxiliary/vl/vl_deint_filter.c
@@ -0,0 +1,491 @@
+/**
+ *
+ * Copyright 2013 Grigori Goronzy .
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+ * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
+ * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **/
+
+/*
+ *  References:
+ *
+ *  Lin, S. F., Chang, Y. L., & Chen, L. G. (2003).
+ *  Motion adaptive interpolation with horizontal motion detection for deinterlacing.
+ *  Consumer Electronics, IEEE Transactions on, 49(4), 1256-1265.
+ *
+ *  Pei-Yin, C. H. E. N., & Yao-Hsien, L. A. I. (2007).
+ *  A low-complexity interpolation method for deinterlacing.
+ *  IEICE transactions on information and systems, 90(2), 606-608.
+ *
+ */
+
+#include 
+
+#include "pipe/p_context.h"
+
+#include "tgsi/tgsi_ureg.h"
+
+#include "util/u_draw.h"
+#include "util/u_memory.h"
+#include "util/u_math.h"
+
+#include "vl_types.h"
+#include "vl_video_buffer.h"
+#include "vl_vertex_buffers.h"
+#include "vl_deint_filter.h"
+
+enum VS_OUTPUT
+{
+   VS_O_VPOS = 0,
+   VS_O_VTEX = 0
+};
+
+static void *
+create_vert_shader(struct vl_deint_filter *filter)
+{
+   struct ureg_program *shader;
+   struct ureg_src i_vpos;
+   struct ureg_dst o_vpos, o_vtex;
+
+   shader = ureg_create(TGSI_PROCESSOR_VERTEX);
+   if (!shader)
+  return NULL;
+
+   i_vpos = ureg_DECL_vs_input(shader, 0);
+   o_vpos = ureg_DECL_output(shader, TGSI_SEMANTIC_POSITION, VS_O_VPOS);
+   o_vtex = ureg_DECL_output(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTEX);
+
+   ureg_MOV(shader, o_vpos, i_vpos);
+   ureg_MOV(shader, o_vtex, i_vpos);
+
+   ureg_END(shader);
+
+   return ureg_create_shader_and_destroy(shader, filter->pipe);
+}
+
+static void *
+create_copy_frag_shader(struct vl_deint_filter *filter, unsigned field)
+{
+   struct ureg_program *shader;
+   struct ureg_src i_vtex;
+   struct ureg_src sampler;
+   struct ureg_dst o_fragment;
+   struct ureg_dst t_tex;
+
+   shader = ureg_create(TGSI_PROCESSOR_FRAGMENT);
+   if (!shader) {
+  return NULL;
+   }
+   t_tex = ureg_DECL_temporary(shader);
+
+   i_vtex = ureg_DECL_fs_input(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTEX, TGSI_INTERPOLATE_LINEAR);
+   sampler = ureg_DECL_sampler(shader, 2);
+   o_fragment = ureg_DECL_output(shader, TGSI_SEMANTIC_COLOR, 0);
+
+   ureg_MOV(shader, t_tex, i_vtex);
+   if (field) {
+  ureg_MOV(shader, ureg_writemask(t_tex, TGSI_WRITEMASK_ZW),
+   ureg_imm4f(shader, 0, 0, 1.0f, 0));
+   } else {
+  ureg_MOV(shader, ureg_writemask(t_tex, TGSI_WRITEMASK_ZW),
+   ureg_imm1f(shader, 0));
+   }
+
+   ureg_TEX(shader, o_fragment, TGSI_TEXTURE_2D_ARRAY, ureg_src(t_tex), sampler);
+
+   ureg_release_t

Re: [Mesa-dev] Mesa install - Doubts..

2014-02-14 Thread Kenneth Graunke

On 02/13/2014 06:09 AM, sathishkumar sivagurunathan wrote:
[snip]
>   Mobile 4 Series Chipset Integrated Graphics Controller = i915

The kernel driver for that hardware is i915.ko, but the 3D driver is
"i965" (i965_dri.so).

--Ken



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] configure: Use pkg-config for libselinux

2014-02-14 Thread Kusanagi Kouichi

On 2014-02-13 14:13:05 +, Emil Velikov wrote:
> On 13/02/14 13:20, Kusanagi Kouichi wrote:
> > libselinux provides pkgconfig file since 2.0.89 (2009-10-29).
> > 
> Can you check how many of the currently supported distros include that
> version or later ?
> I was nicely surprised when I did a similar change with expat
> 
> Thanks
> -Emil

Here are the results of a quick search:
Debian stable     2.1.9
Ubuntu 10.04 LTS  2.0.89
Fedora 19         2.1.13
openSUSE 12.3     2.1.9
RHEL 5            1.33.4

So this patch will not work on RHEL 5.

[Mesa-dev] [PATCH 2/5] glx: add extra null check in getFBConfigs

2014-02-14 Thread Juha-Pekka Heikkila

Signed-off-by: Juha-Pekka Heikkila 
---
 src/glx/glxext.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glx/glxext.c b/src/glx/glxext.c
index 4a195bd..837c5b0 100644
--- a/src/glx/glxext.c
+++ b/src/glx/glxext.c
@@ -686,7 +686,8 @@ static GLboolean
   fb_req->glxCode = X_GLXGetFBConfigs;
   fb_req->screen = screen;
}
-   else if (strstr(psc->serverGLXexts, "GLX_SGIX_fbconfig") != NULL) {
+   else if (psc->serverGLXexts != NULL &&
+strstr(psc->serverGLXexts, "GLX_SGIX_fbconfig") != NULL) {
   GetReqExtra(GLXVendorPrivateWithReply,
   sz_xGLXGetFBConfigsSGIXReq -
   sz_xGLXVendorPrivateWithReplyReq, vpreq);
-- 
1.8.1.2


[Mesa-dev] [PATCH 3/5] glx: add missing null check in SendMakeCurrentRequest

2014-02-14 Thread Juha-Pekka Heikkila

Signed-off-by: Juha-Pekka Heikkila 
---
 src/glx/indirect_glx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glx/indirect_glx.c b/src/glx/indirect_glx.c
index 28b8cd0..306bf5b 100644
--- a/src/glx/indirect_glx.c
+++ b/src/glx/indirect_glx.c
@@ -84,7 +84,7 @@ SendMakeCurrentRequest(Display * dpy, CARD8 opcode,
* not the SGI extension.
*/
 
-  if ((priv->majorVersion > 1) || (priv->minorVersion >= 3)) {
+  if (priv && ((priv->majorVersion > 1) || (priv->minorVersion >= 3))) {
  xGLXMakeContextCurrentReq *req;
 
  GetReq(GLXMakeContextCurrent, req);
-- 
1.8.1.2


[Mesa-dev] [PATCH 1/5] glx: Add extra null check in __glXClientInfo

2014-02-14 Thread Juha-Pekka Heikkila

Signed-off-by: Juha-Pekka Heikkila 
---
 src/glx/glxcmds.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/glx/glxcmds.c b/src/glx/glxcmds.c
index 38a5262..4d8d0c2 100644
--- a/src/glx/glxcmds.c
+++ b/src/glx/glxcmds.c
@@ -1392,13 +1392,16 @@ void
 __glXClientInfo(Display * dpy, int opcode)
 {
char *ext_str = __glXGetClientGLExtensionString();
-   int size = strlen(ext_str) + 1;
 
-   xcb_connection_t *c = XGetXCBConnection(dpy);
-   xcb_glx_client_info(c,
-   GLX_MAJOR_VERSION, GLX_MINOR_VERSION, size, ext_str);
+   if (ext_str) {
+  int size = strlen(ext_str) + 1;
 
-   free(ext_str);
+  xcb_connection_t *c = XGetXCBConnection(dpy);
+  xcb_glx_client_info(c, GLX_MAJOR_VERSION, GLX_MINOR_VERSION, size,
+  ext_str);
+
+  free(ext_str);
+   }
 }
 
 
-- 
1.8.1.2


[Mesa-dev] [PATCH 4/5] egl: Unhide functionality in _eglInitSync()

2014-02-14 Thread Juha-Pekka Heikkila

_eglInitResource() was used to memset the entire _EGLSync by
writing more than the size of the pointed-to target. This works
only as long as Resource is the first member of _EGLSync;
this patch removes that dependency.

Signed-off-by: Juha-Pekka Heikkila 
---
 src/egl/main/eglsync.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/egl/main/eglsync.c b/src/egl/main/eglsync.c
index 9d0067c..ba8a32f 100644
--- a/src/egl/main/eglsync.c
+++ b/src/egl/main/eglsync.c
@@ -75,7 +75,8 @@ _eglInitSync(_EGLSync *sync, _EGLDisplay *dpy, EGLenum type,
!(type == EGL_SYNC_FENCE_KHR && dpy->Extensions.KHR_fence_sync))
   return _eglError(EGL_BAD_ATTRIBUTE, "eglCreateSyncKHR");
 
-   _eglInitResource(&sync->Resource, sizeof(*sync), dpy);
+   memset(sync, 0, sizeof(_EGLSync));
+   _eglInitResource(&sync->Resource, sizeof(_EGLResource), dpy);
sync->Type = type;
sync->SyncStatus = EGL_UNSIGNALED_KHR;
sync->SyncCondition = EGL_SYNC_PRIOR_COMMANDS_COMPLETE_KHR;
-- 
1.8.1.2
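
A minimal sketch of the dependency this patch removes, using hypothetical stand-in types (struct resource, struct sync_object) and helper names rather than the real _EGLResource, _EGLSync and _eglInitResource(). The containing object is cleared with its own size and the embedded member is initialised separately, so the result no longer depends on where the member sits inside the struct:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; not the Mesa types. */
struct resource {
   void *display;
   int   ref_count;
};

struct sync_object {
   int             type;   /* deliberately placed before the resource */
   struct resource res;
   int             status;
};

/* Initialise only the resource itself, never the bytes around it. */
static void
init_resource(struct resource *res, void *display)
{
   assert(res != NULL);
   memset(res, 0, sizeof(*res));
   res->display = display;
   res->ref_count = 1;
}

/* Clear the whole containing object first, then set up the member. */
static void
init_sync(struct sync_object *sync, void *display, int type)
{
   assert(sync != NULL);
   memset(sync, 0, sizeof(*sync));
   init_resource(&sync->res, display);
   sync->type = type;
}
```

With this split, moving res to a different position in struct sync_object does not change what init_sync() clears.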


[Mesa-dev] [PATCH 5/5] mesa: Add missing null checks into prog_hash_table.c

2014-02-14 Thread Juha-Pekka Heikkila

Check calloc return values in hash_table_insert() and
hash_table_replace()

Signed-off-by: Juha-Pekka Heikkila 
---
 src/mesa/program/prog_hash_table.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/mesa/program/prog_hash_table.c b/src/mesa/program/prog_hash_table.c
index f45ed46..e62ad6f 100644
--- a/src/mesa/program/prog_hash_table.c
+++ b/src/mesa/program/prog_hash_table.c
@@ -143,10 +143,12 @@ hash_table_insert(struct hash_table *ht, void *data, const void *key)
 
 node = calloc(1, sizeof(*node));
 
-node->data = data;
-node->key = key;
+if (node != NULL) {
+node->data = data;
+node->key = key;
 
-insert_at_head(& ht->buckets[bucket], & node->link);
+insert_at_head(& ht->buckets[bucket], & node->link);
+}
 }
 
 bool
@@ -168,10 +170,12 @@ hash_table_replace(struct hash_table *ht, void *data, const void *key)
 
 hn = calloc(1, sizeof(*hn));
 
-hn->data = data;
-hn->key = key;
+if (hn != NULL) {
+hn->data = data;
+hn->key = key;
 
-insert_at_head(& ht->buckets[bucket], & hn->link);
+insert_at_head(& ht->buckets[bucket], & hn->link);
+}
 return false;
 }
 
-- 
1.8.1.2


[Mesa-dev] [PATCH 0/5] Klocwork related patches

2014-02-14 Thread Juha-Pekka Heikkila

Resend of the earlier glx patches, with the issue pointed out by Petri fixed
and two additional patches.

Patch number five is a bit special. hash_table_insert() and
hash_table_replace() don't really have a way to report errors, and I did not
want to change the API since these functions are called from so many places.
A failed (c)allocation is therefore handled only inside the functions, and the
patch relies on the low-memory situation being handled properly outside them.
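
The trade-off described for patch five can be sketched as follows. This is a simplified single-list table with hypothetical names (struct table, table_insert, table_find, table_clear), not the prog_hash_table.c code. Insertion silently does nothing when calloc() fails, so a caller can only notice the failure through a later lookup:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified table: one linked list, keys compared by pointer. */
struct node {
   struct node *next;
   const void  *key;
   void        *data;
};

struct table {
   struct node *head;
};

/* Insert without an error return: a failed allocation is skipped quietly. */
static void
table_insert(struct table *t, void *data, const void *key)
{
   struct node *n;

   assert(t != NULL);
   n = calloc(1, sizeof(*n));
   if (n != NULL) {
      n->data = data;
      n->key = key;
      n->next = t->head;
      t->head = n;
   }
}

/* Return the data stored for key, or NULL if the key is not present. */
static void *
table_find(const struct table *t, const void *key)
{
   const struct node *n;

   for (n = t->head; n != NULL; n = n->next) {
      if (n->key == key)
         return n->data;
   }
   return NULL;
}

/* Free all nodes; the stored data pointers are owned by the caller. */
static void
table_clear(struct table *t)
{
   while (t->head != NULL) {
      struct node *n = t->head;
      t->head = n->next;
      free(n);
   }
}
```

Keeping the void return leaves every existing call site untouched, at the cost that an insertion dropped under memory pressure looks the same as a key that was never added.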

Juha-Pekka Heikkila (5):
  glx: Add extra null check in __glXClientInfo
  glx: add extra null check in getFBConfigs
  glx: add missing null check in SendMakeCurrentRequest
  egl: Unhide functionality in _eglInitSync()
  mesa: Add missing null checks into prog_hash_table.c

 src/egl/main/eglsync.c |  3 ++-
 src/glx/glxcmds.c  | 13 -
 src/glx/glxext.c   |  3 ++-
 src/glx/indirect_glx.c |  2 +-
 src/mesa/program/prog_hash_table.c | 16 ++--
 5 files changed, 23 insertions(+), 14 deletions(-)

-- 
1.8.1.2


[Mesa-dev] [PATCH] gallium/pipebuffer: change pb_cache_manager_create() size_factor to float

2014-02-14 Thread Brian Paul

Requested by Marek.

Cc: "10.1" 
---
 src/gallium/auxiliary/pipebuffer/pb_bufmgr.h   |4 ++--
 src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c |8 
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c  |2 +-
 src/gallium/winsys/svga/drm/vmw_screen_pools.c |2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h b/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h
index 3044ec8..d5b0ee2 100644
--- a/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h
+++ b/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h
@@ -161,8 +161,8 @@ pb_slab_range_manager_create(struct pb_manager *provider,
  */
 struct pb_manager *
 pb_cache_manager_create(struct pb_manager *provider, 
-   unsigned usecs,
-   unsigned size_factor,
+unsigned usecs,
+float size_factor,
 unsigned bypass_usage);
 
 
diff --git a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
index 0469146..32a8875 100644
--- a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
+++ b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
@@ -82,7 +82,7 @@ struct pb_cache_manager

struct list_head delayed;
pb_size numDelayed;
-   unsigned size_factor;
+   float size_factor;
unsigned bypass_usage;
 };
 
@@ -236,7 +236,7 @@ pb_cache_is_buffer_compat(struct pb_cache_buffer *buf,
   return 0;
 
/* be lenient with size */
-   if(buf->base.size > buf->mgr->size_factor*size)
+   if(buf->base.size > (unsigned) (buf->mgr->size_factor * size))
   return 0;

if(!pb_check_alignment(desc->alignment, buf->base.alignment))
@@ -403,8 +403,8 @@ pb_cache_manager_destroy(struct pb_manager *mgr)
  */
 struct pb_manager *
 pb_cache_manager_create(struct pb_manager *provider, 
-   unsigned usecs,
-   unsigned size_factor,
+unsigned usecs,
+float size_factor,
 unsigned bypass_usage)
 {
struct pb_cache_manager *mgr;
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index 6f0e2a5..44cd0d1 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -645,7 +645,7 @@ PUBLIC struct radeon_winsys *radeon_drm_winsys_create(int fd)
 ws->kman = radeon_bomgr_create(ws);
 if (!ws->kman)
 goto fail;
-ws->cman = pb_cache_manager_create(ws->kman, 100, 2, 0);
+ws->cman = pb_cache_manager_create(ws->kman, 100, 2.0f, 0);
 if (!ws->cman)
 goto fail;
 
diff --git a/src/gallium/winsys/svga/drm/vmw_screen_pools.c b/src/gallium/winsys/svga/drm/vmw_screen_pools.c
index 7f7b779..c97b71f 100644
--- a/src/gallium/winsys/svga/drm/vmw_screen_pools.c
+++ b/src/gallium/winsys/svga/drm/vmw_screen_pools.c
@@ -115,7 +115,7 @@ vmw_mob_pools_init(struct vmw_winsys_screen *vws)
struct pb_desc desc;
 
vws->pools.mob_cache = 
-  pb_cache_manager_create(vws->pools.gmr, 10, 2,
+  pb_cache_manager_create(vws->pools.gmr, 10, 2.0f,
   VMW_BUFFER_USAGE_SHARED);
if (!vws->pools.mob_cache)
   return FALSE;
-- 
1.7.10.4
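
The size check touched by this patch can be sketched in isolation. The helper name cached_size_ok is hypothetical, and the lower-bound test is an assumption about the surrounding code that the hunk does not show. Only the upper bound mirrors the changed line, including its cast back to unsigned:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether a cached buffer of cached_size bytes
 * may satisfy a request for requested_size bytes.
 */
static bool
cached_size_ok(unsigned cached_size, unsigned requested_size,
               float size_factor)
{
   assert(size_factor > 0.0f);

   /* Assumed: a cached buffer smaller than the request is never usable. */
   if (cached_size < requested_size)
      return false;

   /* Be lenient with size, but only up to size_factor times the request. */
   if (cached_size > (unsigned) (size_factor * requested_size))
      return false;

   return true;
}
```

A float factor allows fractional limits such as 1.5f, which an unsigned parameter cannot express; passing 2.0f keeps the limit that the radeon and svga callers in this patch use.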


Re: [Mesa-dev] [PATCH 15/19] pipebuffer, winsys: Add a size match parameter to the cached buffer manager

2014-02-14 Thread Brian Paul


Sure.  If you don't mind, I'll just do that as a follow-on patch...

-Brian

On 02/13/2014 07:11 PM, Marek Olšák wrote:

Please, can the size factor be a float?

Thanks,

Marek

On Fri, Feb 14, 2014 at 2:21 AM, Brian Paul  wrote:

From: Thomas Hellstrom 

In some situations it's important to restrict the sizes of buffers that the
cached buffer manager is allowed to return.

Signed-off-by: Thomas Hellstrom 
Cc: "10.1" 
---
  src/gallium/auxiliary/pipebuffer/pb_bufmgr.h   |3 ++-
  src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c |7 +--
  src/gallium/winsys/radeon/drm/radeon_drm_winsys.c  |2 +-
  3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h b/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h
index 2c88cf4..fe4c8c2 100644
--- a/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h
+++ b/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h
@@ -161,7 +161,8 @@ pb_slab_range_manager_create(struct pb_manager *provider,
   */
  struct pb_manager *
  pb_cache_manager_create(struct pb_manager *provider,
-   unsigned usecs);
+   unsigned usecs,
+   unsigned size_factor);


  struct pb_fence_ops;
diff --git a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
index 9728bf4..6de5de0 100644
--- a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
+++ b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
@@ -82,6 +82,7 @@ struct pb_cache_manager

 struct list_head delayed;
 pb_size numDelayed;
+   unsigned size_factor;
  };


@@ -231,7 +232,7 @@ pb_cache_is_buffer_compat(struct pb_cache_buffer *buf,
return 0;

 /* be lenient with size */
-   if(buf->base.size >= 2*size)
+   if(buf->base.size > buf->mgr->size_factor*size)
return 0;

 if(!pb_check_alignment(desc->alignment, buf->base.alignment))
@@ -387,7 +388,8 @@ pb_cache_manager_destroy(struct pb_manager *mgr)

  struct pb_manager *
  pb_cache_manager_create(struct pb_manager *provider,
-   unsigned usecs)
+   unsigned usecs,
+   unsigned size_factor)
  {
 struct pb_cache_manager *mgr;

@@ -403,6 +405,7 @@ pb_cache_manager_create(struct pb_manager *provider,
 mgr->base.flush = pb_cache_manager_flush;
 mgr->provider = provider;
 mgr->usecs = usecs;
+   mgr->size_factor = size_factor;
 LIST_INITHEAD(&mgr->delayed);
 mgr->numDelayed = 0;
 pipe_mutex_init(mgr->mutex);
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index c28f3a7..b7137d2 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -645,7 +645,7 @@ PUBLIC struct radeon_winsys *radeon_drm_winsys_create(int fd)
  ws->kman = radeon_bomgr_create(ws);
  if (!ws->kman)
  goto fail;
-ws->cman = pb_cache_manager_create(ws->kman, 100);
+ws->cman = pb_cache_manager_create(ws->kman, 100, 2);
  if (!ws->cman)
  goto fail;

--
1.7.10.4


[Mesa-dev] [Bug 74988] New: Buffer overrun (segfault) decompressing ETC2 texture in GLBenchmark 3.0 Manhattan

2014-02-14 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=74988

  Priority: medium
Bug ID: 74988
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: Buffer overrun (segfault) decompressing ETC2 texture
in GLBenchmark 3.0 Manhattan
  Severity: normal
Classification: Unclassified
OS: Linux (All)
  Reporter: court...@lunarg.com
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: 10.0
 Component: Mesa core
   Product: Mesa

Saw intermittent segfault running GLBenchmark 3.0's Manhattan test. Narrowed
issue down to writing outside the memory bounds of the texture when
decompressing ETC2 data. Probably the root cause of bug 71002 (which has a nice
repro) as well.

See patch mesa-add-bounds-checking-to-eliminate-buffer-overrun on mesadev for
more details.
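
The kind of overrun described above can be illustrated with a small sketch. It assumes a single-channel 8-bit destination and a hypothetical helper name (store_block_clamped); it is not the texcompress_etc.c code. A decoded 4x4 block is copied into the destination, but only the texels that fall inside the texture are written, which matters when width or height is not a multiple of 4:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_W 4
#define BLOCK_H 4

/*
 * Hypothetical helper: copy one decoded block (BLOCK_W * BLOCK_H bytes,
 * row-major) into dst at texel position (x, y). Rows and columns that
 * would land outside width x height are skipped.
 */
static void
store_block_clamped(uint8_t *dst, unsigned dst_stride,
                    unsigned width, unsigned height,
                    unsigned x, unsigned y,
                    const uint8_t *block)
{
   unsigned i, j;

   assert(dst != NULL && block != NULL);

   for (j = 0; j < BLOCK_H && (y + j) < height; j++) {
      uint8_t *row = dst + (y + j) * dst_stride + x;

      for (i = 0; i < BLOCK_W && (x + i) < width; i++)
         row[i] = block[j * BLOCK_W + i];
   }
}
```

For a 6x6 texture, the block starting at (4, 4) covers only a 2x2 region, so without the two extra loop conditions the copy would write past the last row of the destination.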

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH 00/12] DEMOS Use core profile in two GS demos (v2).

2014-02-14 Thread Brian Paul


On 02/13/2014 03:18 PM, Fabian Bieler wrote:

Hello!

As mesa only supports geometry shaders in core profile contexts this patchset
adjusts the gsraytrace and the geom-outlining-150 demos to use the core
profile.

This is v2 with the comment by Ian Romanick regarding incorrect usage of the
GLSL preprocessor #line directive addressed.

As I don't have git access, I'd appreciate it if someone could commit these
patches (assuming there are no further issues, of course).


Series looks ok to me and I tested with NVIDIA's driver.  You may want 
to wait for Ian to review too though.


Reviewed-by: Brian Paul 


[Mesa-dev] [PATCH] mesa: add bounds checking to eliminate buffer overrun

2014-02-14 Thread Courtney Goeltzenleuchter

Decompressing ETC2 textures was causing intermittent segfaults
by copying the resulting 4x4 texel block to the destination texture
regardless of the size of the destination texture. Issue found
via an application crash in GLBenchmark 3.0's Manhattan test.

Signed-off-by: Courtney Goeltzenleuchter 
---
 src/mesa/main/texcompress_etc.c | 49 +
 1 file changed, 25 insertions(+), 24 deletions(-)

diff --git a/src/mesa/main/texcompress_etc.c b/src/mesa/main/texcompress_etc.c
index e3862be..f9234b0 100644
--- a/src/mesa/main/texcompress_etc.c
+++ b/src/mesa/main/texcompress_etc.c
@@ -684,9 +684,10 @@ etc2_unpack_rgb8(uint8_t *dst_row,
  etc2_rgb8_parse_block(&block, src,
false /* punchthrough_alpha */);
 
- for (j = 0; j < bh; j++) {
+ /* be sure to stay within the bounds of the texture */
+ for (j = 0; j < bh && (j+y) < height; j++) {
 uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps;
-for (i = 0; i < bw; i++) {
+for (i = 0; i < bw && (i+x) < width; i++) {
etc2_rgb8_fetch_texel(&block, i, j, dst,
  false /* punchthrough_alpha */);
dst[3] = 255;
@@ -721,9 +722,9 @@ etc2_unpack_srgb8(uint8_t *dst_row,
  etc2_rgb8_parse_block(&block, src,
false /* punchthrough_alpha */);
 
- for (j = 0; j < bh; j++) {
+ for (j = 0; j < bh && (j+y) < height; j++) {
 uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps;
-for (i = 0; i < bw; i++) {
+for (i = 0; i < bw && (i+x) < width; i++) {
etc2_rgb8_fetch_texel(&block, i, j, dst,
  false /* punchthrough_alpha */);
/* Convert to MESA_FORMAT_B8G8R8A8_SRGB */
@@ -764,9 +765,9 @@ etc2_unpack_rgba8(uint8_t *dst_row,
   for (x = 0; x < width; x+= bw) {
  etc2_rgba8_parse_block(&block, src);
 
- for (j = 0; j < bh; j++) {
+ for (j = 0; j < bh && (j+y) < height; j++) {
 uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps;
-for (i = 0; i < bw; i++) {
+for (i = 0; i < bw && (i+x) < width; i++) {
etc2_rgba8_fetch_texel(&block, i, j, dst);
dst += comps;
 }
@@ -801,9 +802,9 @@ etc2_unpack_srgb8_alpha8(uint8_t *dst_row,
   for (x = 0; x < width; x+= bw) {
  etc2_rgba8_parse_block(&block, src);
 
- for (j = 0; j < bh; j++) {
+ for (j = 0; j < bh && (j+y) < height; j++) {
 uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps;
-for (i = 0; i < bw; i++) {
+for (i = 0; i < bw && (i+x) < width; i++) {
etc2_rgba8_fetch_texel(&block, i, j, dst);
 
/* Convert to MESA_FORMAT_B8G8R8A8_SRGB */
@@ -843,9 +844,9 @@ etc2_unpack_r11(uint8_t *dst_row,
   for (x = 0; x < width; x+= bw) {
  etc2_r11_parse_block(&block, src);
 
- for (j = 0; j < bh; j++) {
+ for (j = 0; j < bh && (j+y) < height; j++) {
 uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps * comp_size;
-for (i = 0; i < bw; i++) {
+for (i = 0; i < bw && (i+x) < width; i++) {
etc2_r11_fetch_texel(&block, i, j, dst);
dst += comps * comp_size;
 }
@@ -879,10 +880,10 @@ etc2_unpack_rg11(uint8_t *dst_row,
  /* red component */
  etc2_r11_parse_block(&block, src);
 
- for (j = 0; j < bh; j++) {
+ for (j = 0; j < bh && (j+y) < height; j++) {
 uint8_t *dst = dst_row + (y + j) * dst_stride +
x * comps * comp_size;
-for (i = 0; i < bw; i++) {
+for (i = 0; i < bw && (i+x) < width; i++) {
etc2_r11_fetch_texel(&block, i, j, dst);
dst += comps * comp_size;
 }
@@ -890,10 +891,10 @@ etc2_unpack_rg11(uint8_t *dst_row,
  /* green component */
  etc2_r11_parse_block(&block, src + 8);
 
- for (j = 0; j < bh; j++) {
+ for (j = 0; j < bh && (j+y) < height; j++) {
 uint8_t *dst = dst_row + (y + j) * dst_stride +
x * comps * comp_size;
-for (i = 0; i < bw; i++) {
+for (i = 0; i < bw && (i+x) < width; i++) {
etc2_r11_fetch_texel(&block, i, j, dst + comp_size);
dst += comps * comp_size;
 }
@@ -926,10 +927,10 @@ etc2_unpack_signed_r11(uint8_t *dst_row,
   for (x = 0; x < width; x+= bw) {
  etc2_r11_parse_block(&block, src);
 
- for (j = 0; j < bh; j++) {
+ for (j = 0; j < bh && (j+y) < height; j++) {
 uint8_t *dst = dst_row + (y + j) * dst_stride +
x * comps * comp_size;
-for (i = 0; i < bw; i++) {
+for (i = 0; i < bw && (i+

Re: [Mesa-dev] [PATCH] mesa: add bounds checking to eliminate buffer overrun

2014-02-14 Thread Courtney Goeltzenleuchter

Forgot to mention: this patch addresses bug #74988.

No piglit regressions.

Courtney


On Fri, Feb 14, 2014 at 8:52 AM, Courtney Goeltzenleuchter <
court...@lunarg.com> wrote:

> Decompressing ETC2 textures was causing intermittent segfaults
> by copying the resulting 4x4 texel block to the destination texture
> regardless of the size of the destination texture. Issue found
> via an application crash in GLBenchmark 3.0's Manhattan test.
>
> Signed-off-by: Courtney Goeltzenleuchter 
> ---
>  src/mesa/main/texcompress_etc.c | 49
> +
>  1 file changed, 25 insertions(+), 24 deletions(-)
>
> diff --git a/src/mesa/main/texcompress_etc.c
> b/src/mesa/main/texcompress_etc.c
> index e3862be..f9234b0 100644
> --- a/src/mesa/main/texcompress_etc.c
> +++ b/src/mesa/main/texcompress_etc.c
> @@ -684,9 +684,10 @@ etc2_unpack_rgb8(uint8_t *dst_row,
>   etc2_rgb8_parse_block(&block, src,
> false /* punchthrough_alpha */);
>
> - for (j = 0; j < bh; j++) {
> + /* be sure to stay within the bounds of the texture */
> + for (j = 0; j < bh && (j+y) < height; j++) {
>  uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps;
> -for (i = 0; i < bw; i++) {
> +for (i = 0; i < bw && (i+x) < width; i++) {
> etc2_rgb8_fetch_texel(&block, i, j, dst,
>   false /* punchthrough_alpha */);
> dst[3] = 255;
> @@ -721,9 +722,9 @@ etc2_unpack_srgb8(uint8_t *dst_row,
>   etc2_rgb8_parse_block(&block, src,
> false /* punchthrough_alpha */);
>
> - for (j = 0; j < bh; j++) {
> + for (j = 0; j < bh && (j+y) < height; j++) {
>  uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps;
> -for (i = 0; i < bw; i++) {
> +for (i = 0; i < bw && (i+x) < width; i++) {
> etc2_rgb8_fetch_texel(&block, i, j, dst,
>   false /* punchthrough_alpha */);
> /* Convert to MESA_FORMAT_B8G8R8A8_SRGB */
> @@ -764,9 +765,9 @@ etc2_unpack_rgba8(uint8_t *dst_row,
>for (x = 0; x < width; x+= bw) {
>   etc2_rgba8_parse_block(&block, src);
>
> - for (j = 0; j < bh; j++) {
> + for (j = 0; j < bh && (j+y) < height; j++) {
>  uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps;
> -for (i = 0; i < bw; i++) {
> +for (i = 0; i < bw && (i+x) < width; i++) {
> etc2_rgba8_fetch_texel(&block, i, j, dst);
> dst += comps;
>  }
> @@ -801,9 +802,9 @@ etc2_unpack_srgb8_alpha8(uint8_t *dst_row,
>for (x = 0; x < width; x+= bw) {
>   etc2_rgba8_parse_block(&block, src);
>
> - for (j = 0; j < bh; j++) {
> + for (j = 0; j < bh && (j+y) < height; j++) {
>  uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps;
> -for (i = 0; i < bw; i++) {
> +for (i = 0; i < bw && (i+x) < width; i++) {
> etc2_rgba8_fetch_texel(&block, i, j, dst);
>
> /* Convert to MESA_FORMAT_B8G8R8A8_SRGB */
> @@ -843,9 +844,9 @@ etc2_unpack_r11(uint8_t *dst_row,
>for (x = 0; x < width; x+= bw) {
>   etc2_r11_parse_block(&block, src);
>
> - for (j = 0; j < bh; j++) {
> + for (j = 0; j < bh && (j+y) < height; j++) {
>  uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps *
> comp_size;
> -for (i = 0; i < bw; i++) {
> +for (i = 0; i < bw && (i+x) < width; i++) {
> etc2_r11_fetch_texel(&block, i, j, dst);
> dst += comps * comp_size;
>  }
> @@ -879,10 +880,10 @@ etc2_unpack_rg11(uint8_t *dst_row,
>   /* red component */
>   etc2_r11_parse_block(&block, src);
>
> - for (j = 0; j < bh; j++) {
> + for (j = 0; j < bh && (j+y) < height; j++) {
>  uint8_t *dst = dst_row + (y + j) * dst_stride +
> x * comps * comp_size;
> -for (i = 0; i < bw; i++) {
> +for (i = 0; i < bw && (i+x) < width; i++) {
> etc2_r11_fetch_texel(&block, i, j, dst);
> dst += comps * comp_size;
>  }
> @@ -890,10 +891,10 @@ etc2_unpack_rg11(uint8_t *dst_row,
>   /* green component */
>   etc2_r11_parse_block(&block, src + 8);
>
> - for (j = 0; j < bh; j++) {
> + for (j = 0; j < bh && (j+y) < height; j++) {
>  uint8_t *dst = dst_row + (y + j) * dst_stride +
> x * comps * comp_size;
> -for (i = 0; i < bw; i++) {
> +for (i = 0; i < bw && (i+x) < width; i++) {
> etc2_r11_fetch_texel(&block, i, j, dst + comp_size);
> dst += comps * comp_size;
>  }
> @@ -926,10

[Mesa-dev] [PATCH] mesa: Fix valgrind uninitialized variable warning

2014-02-14 Thread Courtney Goeltzenleuchter

Initialize field to eliminate valgrind warning.
Signed-off-by: Courtney Goeltzenleuchter 
---
 src/mesa/main/texcompress_etc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/texcompress_etc.c b/src/mesa/main/texcompress_etc.c
index f9234b0..97adc86 100644
--- a/src/mesa/main/texcompress_etc.c
+++ b/src/mesa/main/texcompress_etc.c
@@ -350,6 +350,7 @@ etc2_rgb8_parse_block(struct etc2_block *block,
block->is_t_mode = false;
block->is_h_mode = false;
block->is_planar_mode = false;
+   block->opaque = false;
 
if (punchthrough_alpha)
   block->opaque = src[3] & 0x2;
-- 
1.8.3.2
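
A minimal sketch of the pattern this patch fixes, using a hypothetical, reduced struct (block_state) rather than the real struct etc2_block. A field that is assigned only on one branch is given a default first, so later reads never see an uninitialized value:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the parsed block state. */
struct block_state {
   bool is_planar_mode;
   bool opaque;
};

static void
parse_block(struct block_state *block, const uint8_t *src,
            bool punchthrough_alpha)
{
   assert(block != NULL && src != NULL);

   block->is_planar_mode = false;
   block->opaque = false;   /* default, so the field is always defined */

   /* Only the punchthrough-alpha path overrides the default. */
   if (punchthrough_alpha)
      block->opaque = (src[3] & 0x2) != 0;
}
```

Without the default assignment, a caller that passes punchthrough_alpha as false and later reads opaque would read whatever the memory happened to contain, which is the kind of access valgrind reports.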


Re: [Mesa-dev] [PATCH] gallium/pipebuffer: change pb_cache_manager_create() size_factor to float

2014-02-14 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Fri, Feb 14, 2014 at 3:47 PM, Brian Paul  wrote:
> Requested by Marek.
>
> Cc: "10.1" 
> ---
>  src/gallium/auxiliary/pipebuffer/pb_bufmgr.h   |4 ++--
>  src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c |8 
>  src/gallium/winsys/radeon/drm/radeon_drm_winsys.c  |2 +-
>  src/gallium/winsys/svga/drm/vmw_screen_pools.c |2 +-
>  4 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h 
> b/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h
> index 3044ec8..d5b0ee2 100644
> --- a/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h
> +++ b/src/gallium/auxiliary/pipebuffer/pb_bufmgr.h
> @@ -161,8 +161,8 @@ pb_slab_range_manager_create(struct pb_manager *provider,
>   */
>  struct pb_manager *
>  pb_cache_manager_create(struct pb_manager *provider,
> -   unsigned usecs,
> -   unsigned size_factor,
> +unsigned usecs,
> +float size_factor,
>  unsigned bypass_usage);
>
>
> diff --git a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c 
> b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
> index 0469146..32a8875 100644
> --- a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
> +++ b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
> @@ -82,7 +82,7 @@ struct pb_cache_manager
>
> struct list_head delayed;
> pb_size numDelayed;
> -   unsigned size_factor;
> +   float size_factor;
> unsigned bypass_usage;
>  };
>
> @@ -236,7 +236,7 @@ pb_cache_is_buffer_compat(struct pb_cache_buffer *buf,
>return 0;
>
> /* be lenient with size */
> -   if(buf->base.size > buf->mgr->size_factor*size)
> +   if(buf->base.size > (unsigned) (buf->mgr->size_factor * size))
>return 0;
>
> if(!pb_check_alignment(desc->alignment, buf->base.alignment))
> @@ -403,8 +403,8 @@ pb_cache_manager_destroy(struct pb_manager *mgr)
>   */
>  struct pb_manager *
>  pb_cache_manager_create(struct pb_manager *provider,
> -   unsigned usecs,
> -   unsigned size_factor,
> +unsigned usecs,
> +float size_factor,
>  unsigned bypass_usage)
>  {
> struct pb_cache_manager *mgr;
> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
> b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> index 6f0e2a5..44cd0d1 100644
> --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> @@ -645,7 +645,7 @@ PUBLIC struct radeon_winsys *radeon_drm_winsys_create(int 
> fd)
>  ws->kman = radeon_bomgr_create(ws);
>  if (!ws->kman)
>  goto fail;
> -ws->cman = pb_cache_manager_create(ws->kman, 100, 2, 0);
> +ws->cman = pb_cache_manager_create(ws->kman, 100, 2.0f, 0);
>  if (!ws->cman)
>  goto fail;
>
> diff --git a/src/gallium/winsys/svga/drm/vmw_screen_pools.c 
> b/src/gallium/winsys/svga/drm/vmw_screen_pools.c
> index 7f7b779..c97b71f 100644
> --- a/src/gallium/winsys/svga/drm/vmw_screen_pools.c
> +++ b/src/gallium/winsys/svga/drm/vmw_screen_pools.c
> @@ -115,7 +115,7 @@ vmw_mob_pools_init(struct vmw_winsys_screen *vws)
> struct pb_desc desc;
>
> vws->pools.mob_cache =
> -  pb_cache_manager_create(vws->pools.gmr, 10, 2,
> +  pb_cache_manager_create(vws->pools.gmr, 10, 2.0f,
>VMW_BUFFER_USAGE_SHARED);
> if (!vws->pools.mob_cache)
>return FALSE;
> --
> 1.7.10.4
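A minimal sketch of what the float size_factor allows in the "be lenient with size" test from the quoted patch. The function name and the lower-bound check are assumptions made for illustration and are not taken from pb_bufmgr_cache.c; only the upper-bound comparison mirrors the quoted hunk.

```c
#include <assert.h>

/* Hypothetical stand-in for the size part of pb_cache_is_buffer_compat().
 * Returns 1 if a cached buffer of cached_size may serve a request of
 * requested_size, 0 otherwise. */
static int
demo_size_compatible(unsigned cached_size, unsigned requested_size,
                     float size_factor)
{
   assert(size_factor >= 1.0f);

   /* Assumed: a cached buffer smaller than the request is never usable. */
   if (cached_size < requested_size)
      return 0;

   /* Same shape as the patched comparison: the product is computed in
    * float and converted to unsigned before comparing. */
   if (cached_size > (unsigned) (size_factor * requested_size))
      return 0;

   return 1;
}
```

With an unsigned factor the only possible limits were whole multiples of the request; with a float, a caller could pass something like 1.5f so that a 100-byte request accepts cached buffers up to 150 bytes.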
>

[Mesa-dev] [RFC] llvmpipe texture coordinate rounding

2014-02-14 Thread Jeff Muizelaar

In doing some testing we’ve noticed that trying to draw pixel aligned
textures does not work very well with linear filtering in llvmpipe.

Here’s an example of the problem.

Imagine wanting to paint a 100x100 texture. After being scaled by 100
the texture coordinates will end up as:
0.5, 1.5, 2.5, 3.5, 4.5..

These are then multiplied by 256 and converted to integers giving us:
128, 384, 640, 896, 1152..

We subtract the 128:
0, 256, 512, 768, 1024..

Then mask to get the fractional component and shift to get the integer
component:
0,0, 1,0, 2,0, 3,0, 4,0...

However, if for example 3.5 ends up as 3.499 we get:
895.744 -> 895 -> 767 -> 2,255 instead of 3,0

When we lerp using this value we end up including some of the pixel
value at 2 instead of just at 3.

If we add 0.5 before converting to an integer this problem goes away.

The attached patch does this. It also changes the REPEAT mode code to
use similar integer conversion code as the non-REPEAT path. The new
generated code should be more efficient than the old code.

-Jeff

add-0.5-before-converting-to-integer.patch
Description: Binary data
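
The arithmetic in the mail above can be reproduced in plain C. This is only a sketch of the described steps (scale by 256, convert, subtract 128, mask and shift); the struct and function names are made up for illustration and are not llvmpipe code, and the C cast stands in for the truncating float-to-int conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Result of splitting an 8.8 fixed-point texture coordinate. */
struct fixed_coord {
   int32_t texel;   /* integer texel index */
   int32_t weight;  /* lerp weight towards texel + 1, 0..255 */
};

/* Truncating conversion: the cast drops the fraction, so 895.744
 * becomes 895 before the half-texel (128) is subtracted. */
static struct fixed_coord
split_truncating(float coord)
{
   assert(coord >= 0.5f);
   int32_t fixed = (int32_t) (coord * 256.0f) - 128;
   struct fixed_coord c = { fixed >> 8, fixed & 0xff };
   return c;
}

/* Proposed variant: add 0.5 before the conversion so the fixed-point
 * value is rounded to nearest instead of truncated. */
static struct fixed_coord
split_rounding(float coord)
{
   assert(coord >= 0.5f);
   int32_t fixed = (int32_t) (coord * 256.0f + 0.5f) - 128;
   struct fixed_coord c = { fixed >> 8, fixed & 0xff };
   return c;
}
```

For a coordinate of 3.499 the truncating version yields texel 2 with weight 255, while the rounding version yields texel 3 with weight 0, matching the numbers in the mail. Both versions assume non-negative input.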

Re: [Mesa-dev] [RFC] llvmpipe texture coordinate rounding

2014-02-14 Thread Roland Scheidegger

Am 14.02.2014 18:07, schrieb Jeff Muizelaar:
> In doing some testing we’ve noticed that trying to draw pixel aligned
> textures does not work very well with linear filtering in llvmpipe.
> 
> Here’s an example of the problem.
> 
> Imagine wanting to paint a 100x100 texture. After being scaled by 100
> the texture coordinates will end up as:
> 0.5, 1.5, 2.5, 3.5, 4.5..
> 
> These are then multiplied by 256 and converted to integers giving us:
> 128, 384, 640, 896, 1152..
> 
> We subtract the 128:
> 0, 256, 512, 768, 1024..
> 
> Then mask to get the fractional component and shift to get the integer
> component:
> 0,0, 1,0, 2,0, 3,0, 4,0...
> 
> However, if for example 3.5 ends up as 3.499 we get:
> 895.744 -> 895 -> 767 -> 2,255 instead of 3,0
> 
> When we lerp using this value we end up including some of the pixel
> value at 2 instead of just at 3.
> 
> If we add 0.5 before converting to an integer this problem goes away.
> 
> The attached patch does this. It also changes the REPEAT mode code to
> use similar integer conversion code as the non-REPEAT path. The new
> generated code should be more efficient than the old code.
> 


I'll need to take another look and run some tests, though I've got some
quick comments:


@@ -1031,16 +1082,28 @@ lp_build_sample_image_linear(struct
lp_build_sample_context *bld,
   s = lp_build_mul_imm(&bld->coord_bld, s, 256);
   if (dims >= 2)
  t = lp_build_mul_imm(&bld->coord_bld, t, 256);
   if (dims >= 3)
  r = lp_build_mul_imm(&bld->coord_bld, r, 256);
}

/* convert float to int */
+   half = lp_build_const_vec(bld->gallivm, bld->coord_bld.type, 0.5);
+   s = lp_build_add(&bld->coord_bld, s, half);
+   s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
+   if (dims >= 2) {
+  t = lp_build_add(&bld->coord_bld, t, half);
+  t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
+   }
+   if (dims >= 3) {
+  r = lp_build_add(&bld->coord_bld, r, half);
+  r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
+   }
+
s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
if (dims >= 2)
   t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
if (dims >= 3)
   r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
This looks quite incorrect: you're converting the s/t/r coords twice from
float to int.


/* subtract 0.5 (add -128) */
i32_c128 = lp_build_const_int_vec(bld->gallivm, i32.type, -128);


Also, the add looks iffy as it won't work correctly if the coords are
negative, since the FPToSI is of course trunc, not floor.
Maybe instead of using add + fptosi we should just use lp_build_iround
(which is just one SSE instruction on x86 too, though if you're targeting
another arch it will definitely be more code, at least unless someone
adds an intrinsic for it, if the CPU even has one). Might not matter
though depending on address mode...
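
To illustrate the trunc-vs-floor concern, here is a small C sketch. The function names are hypothetical, and the floor-based variant is only a stand-in for a proper round-to-nearest conversion; it is not meant to describe what lp_build_iround actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* add 0.5, then truncate toward zero: the add + fptosi pattern. */
static int32_t
round_add_half_trunc(float x)
{
   assert(x > -1.0e9f && x < 1.0e9f);
   return (int32_t) (x + 0.5f);
}

/* add 0.5, then floor: correct for negative inputs as well. */
static int32_t
round_add_half_floor(float x)
{
   assert(x > -1.0e9f && x < 1.0e9f);
   float y = x + 0.5f;
   int32_t i = (int32_t) y;            /* truncates toward zero */
   return (y < (float) i) ? i - 1 : i; /* step down for negative fractions */
}
```

Both agree for positive values (1.7 rounds to 2 either way), but for -1.7 the truncating version returns -1 where -2 is expected.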

And I might be missing something, but why do you think the new repeat code is
faster? That might also depend on arch_rounding being available
and such, but at first look it seems slightly more complex to me.

Roland

[Mesa-dev] [Bug 74251] Segfault in st_finalize_texture with Texture Buffer

2014-02-14 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=74251

--- Comment #10 from Andreas Boll  ---
Should be fixed with 
http://cgit.freedesktop.org/mesa/mesa/commit/?id=c6dbcf10dff1f8343a26081f5489ef732ebb5460


[Mesa-dev] [Bug 74717] r600g: 'invalid read' linking geometry shader

2014-02-14 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=74717

Andreas Boll  changed:

           What    |Removed                         |Added
----------------------------------------------------------------------------------
           Assignee|mesa-dev@lists.freedesktop.org  |i...@freedesktop.org
         QA Contact|                                |intel-3d-bugs@lists.freedesktop.org
          Component|Mesa core                       |glsl-compiler

--- Comment #9 from Andreas Boll  ---
(In reply to comment #7)
> Looks like a GLSL compiler issue.

Reassigning to glsl-compiler


[Mesa-dev] [Bug 73900] mesa: Fix build to properly check for supported compiler flags

2014-02-14 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=73900

Andreas Boll  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Andreas Boll  ---
Cherry-picked with 0461451dcdd640a7eda57f4211e0e03352059d3b


Re: [Mesa-dev] [RFC] llvmpipe texture coordinate rounding

2014-02-14 Thread Jeff Muizelaar


On Feb 14, 2014, at 1:00 PM, Roland Scheidegger  wrote:

> Am 14.02.2014 18:07, schrieb Jeff Muizelaar:
> 
> I'll need to take another look and run some tests, though I've got some
> quick comments:
> 
> 
> @@ -1031,16 +1082,28 @@ lp_build_sample_image_linear(struct
> lp_build_sample_context *bld,
>   s = lp_build_mul_imm(&bld->coord_bld, s, 256);
>   if (dims >= 2)
>  t = lp_build_mul_imm(&bld->coord_bld, t, 256);
>   if (dims >= 3)
>  r = lp_build_mul_imm(&bld->coord_bld, r, 256);
>}
> 
>/* convert float to int */
> +   half = lp_build_const_vec(bld->gallivm, bld->coord_bld.type, 0.5);
> +   s = lp_build_add(&bld->coord_bld, s, half);
> +   s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
> +   if (dims >= 2) {
> +  t = lp_build_add(&bld->coord_bld, t, half);
> +  t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
> +   }
> +   if (dims >= 3) {
> +  r = lp_build_add(&bld->coord_bld, r, half);
> +  r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
> +   }
> +
>s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
>if (dims >= 2)
>   t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
>if (dims >= 3)
>   r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
> This looks quite incorrect you're converting the s/t/r coords twice from
> float to int.

Yep. I forgot to remove this second hunk.

>/* subtract 0.5 (add -128) */
>i32_c128 = lp_build_const_int_vec(bld->gallivm, i32.type, -128);
> 
> 
> Also, the add looks iffy as it won't work correctly if the coords are
> negative, since the FPToSI is of course trunc, not floor.

I think it will be ok because the REPEAT case avoids negative coord before 
converting to int and the other cases clamp to 0.

> Maybe instead of using add + fptosi should just use lp_build_iround
> (which is just one sse instruction too on x86 though if you're targeting
> another arch it will definitely be more code at least unless someone
> adds an intrinsic for it if the cpu even has one). Might not matter
> though depending on address mode…

Yeah, that might be a better idea.

> 
> And I might be missing something why do you think the new repeat code is
> faster? Though that might also depend on arch_rounding being available
> and such but at first looks it seems slightly more complex to me.

The current code converts integer and fractional parts to integer separately. 
It also does the subtract 0.5 in floating point instead of integer arithmetic 
(-128).

-Jeff

Re: [Mesa-dev] [PATCH v4 0/9] i965/gen7 instanced GS support for ARB_gpu_shader5

2014-02-14 Thread Anuj Phogat

On Thu, Feb 6, 2014 at 6:28 PM, Jordan Justen  wrote:
> v4:
>  * Merge with recent compute shader parsing of
>input layout qualifiers
>
> v3:
>  * Fix major brokenness of dual instance mode operation
>using Paul's suggestions
>  * Update parsing to allow separate primitive and
>invocation declarations. Fixes piglit test:
>spec/arb_gpu_shader5/execution/invocation-id-in-separate-gs
>  * New: glsl: Generate error for invalid input layout declarations
>This is made easier by the in_qualifier addition in
>this series, but is otherwise an unrelated bug fix.
>  * Added check for valid invocation count values
>
> v2:
>  * Convert gl_InvocationID to a system value
>
> No piglit regressions on HSW.
>
> Instanced GS support requires overriding ARB_gpu_shader5 via
> MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader5, since not all
> parts of GL_ARB_gpu_shader5 are enabled.
>
> Patches are available at:
> git://people.freedesktop.org/~jljusten/mesa gs-inv-id-v4
>
> Jordan Justen (9):
>   glsl: convert GS input primitive to use ast_type_qualifier
>   glsl: Generate error for invalid input layout declarations
>   glsl: parse invocations layout qualifier for ARB_gpu_shader5
>   glsl/linker: produce gl_shader_program Geom.Invocations
>   mesa: initialize gl_geometry_program Invocations field
>   main/shaderapi: GL_GEOMETRY_SHADER_INVOCATIONS GetProgramiv support
>   glsl: add gl_InvocationID variable for ARB_gpu_shader5
>   i965: support gl_InvocationID for gen7
>   i965: support instanced GS on gen7
>
>  src/glsl/ast.h|  16 +++-
>  src/glsl/ast_to_hir.cpp   |   5 +-
>  src/glsl/ast_type.cpp | 110 
> ++
>  src/glsl/builtin_variables.cpp|   2 +
>  src/glsl/glsl_parser.yy   |  73 +-
>  src/glsl/glsl_parser_extras.cpp   |  10 +-
>  src/glsl/glsl_parser_extras.h |   7 +-
>  src/glsl/linker.cpp   |  18 
>  src/mesa/drivers/dri/i965/brw_context.h   |   2 +
>  src/mesa/drivers/dri/i965/brw_defines.h   |  13 +++
>  src/mesa/drivers/dri/i965/brw_shader.cpp  |   2 +
>  src/mesa/drivers/dri/i965/brw_vec4.h  |   1 +
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  |  20 
>  src/mesa/drivers/dri/i965/brw_vec4_gs.c   |   2 +
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  22 -
>  src/mesa/drivers/dri/i965/gen7_gs_state.c |   2 +
>  src/mesa/main/mtypes.h|  11 +++
>  src/mesa/main/shaderapi.c |   7 ++
>  src/mesa/program/program.c|   1 +
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp|   1 +
>  src/mesa/state_tracker/st_program.c   |   1 +
>  21 files changed, 261 insertions(+), 65 deletions(-)
>
> --
> 1.9.rc1
>

After you take care of my comments in patches 3/9 and 8/9, this
series is: Reviewed-by: Anuj Phogat 

Re: [Mesa-dev] [PATCH] mesa: Fix valgrind uninitialized variable warning

2014-02-14 Thread Ian Romanick

On 02/14/2014 08:05 AM, Courtney Goeltzenleuchter wrote:
> Initialize field to eliminate valgrind warning.

There are a couple other fields that aren't set in all paths (e.g.,
flipped).  I want to suggest just memseting the whole structure, but
it's not obvious to me how it's used throughout the code.
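
As a rough sketch of the memset idea, using a hypothetical cut-down struct (the real struct etc2_block has more fields, and whether any of them must survive across calls is exactly the open question above):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced stand-in for struct etc2_block. */
struct demo_block {
   bool is_t_mode;
   bool is_h_mode;
   bool is_planar_mode;
   bool opaque;
   bool flipped;
};

/* Clear every field up front instead of initializing them one by one,
 * so no path can leave a field uninitialized. */
static void
demo_parse_block(struct demo_block *block, const uint8_t *src,
                 bool punchthrough_alpha)
{
   assert(block != NULL && src != NULL);
   memset(block, 0, sizeof(*block));

   if (punchthrough_alpha)
      block->opaque = (src[3] & 0x2) != 0;
}
```

This only works if the parse function is the sole writer of the structure; if callers expect some field to persist between blocks, the memset would clobber it.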

I suspect the code was more clear before ETC2 support was added...

> Signed-off-by: Courtney Goeltzenleuchter 
> ---
>  src/mesa/main/texcompress_etc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/main/texcompress_etc.c b/src/mesa/main/texcompress_etc.c
> index f9234b0..97adc86 100644
> --- a/src/mesa/main/texcompress_etc.c
> +++ b/src/mesa/main/texcompress_etc.c
> @@ -350,6 +350,7 @@ etc2_rgb8_parse_block(struct etc2_block *block,
> block->is_t_mode = false;
> block->is_h_mode = false;
> block->is_planar_mode = false;
> +   block->opaque = false;
>  
> if (punchthrough_alpha)
>block->opaque = src[3] & 0x2;
> 


[Mesa-dev] [Bug 74251] Segfault in st_finalize_texture with Texture Buffer

2014-02-14 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=74251

Marek Olšák  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Marek Olšák  ---
Indeed. The fix will be backported to stable branches if it hasn't been
backported already. Closing.


Re: [Mesa-dev] [PATCH] mesa: add bounds checking to eliminate buffer overrun

2014-02-14 Thread Ian Romanick

On 02/14/2014 07:52 AM, Courtney Goeltzenleuchter wrote:
> Decompressing ETC2 textures was causing intermittent segfault
> by copying resulting 4x4 texel block to the destination texture
> regardless of the size of the destination texture. Issue found
> via application crash in GLBenchmark 3.0's Manhattan test.

So... the problem is that every ETC texture is (physically) a multiple
of 4 width, but we may have allocated a decompression buffer that was
the logical width of the texture.  I think the code needs more of a
comment explaining that.  Otherwise, when someone comes back to it in a
year (or uses it as a model for some other texture compression code),
they won't understand why the code is the way it is.

With that, this patch is

Reviewed-by: Ian Romanick 

One additional suggestion below that you can take or leave.

> Signed-off-by: Courtney Goeltzenleuchter 

Also add:

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74988
Cc: "9.2 10.0 10.1" 

> ---
>  src/mesa/main/texcompress_etc.c | 49 
> +
>  1 file changed, 25 insertions(+), 24 deletions(-)
> 
> diff --git a/src/mesa/main/texcompress_etc.c b/src/mesa/main/texcompress_etc.c
> index e3862be..f9234b0 100644
> --- a/src/mesa/main/texcompress_etc.c
> +++ b/src/mesa/main/texcompress_etc.c
> @@ -684,9 +684,10 @@ etc2_unpack_rgb8(uint8_t *dst_row,
>   etc2_rgb8_parse_block(&block, src,
> false /* punchthrough_alpha */);
>  

Alternately, the code below could be:

const unsigned h = MIN2(bh, height - y);
const unsigned w = MIN2(bw, width - x);

...

for (j = 0; j < h; j++) {

...

   for (i = 0; i < w; i++) {
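
A self-contained sketch of that clamping, assuming one byte per texel and a hypothetical helper name (the real unpack functions write several components per texel and fetch from the parsed block instead of a flat array):

```c
#include <assert.h>
#include <stdint.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Copy a decoded 4x4 block (16 bytes, row-major) into dst at (x, y),
 * writing only the texels that fall inside the logical width x height.
 * Returns the number of texels written. */
static unsigned
demo_store_block(uint8_t *dst, unsigned dst_stride,
                 unsigned width, unsigned height,
                 unsigned x, unsigned y, const uint8_t *block)
{
   assert(x < width && y < height);

   const unsigned bw = 4, bh = 4;
   const unsigned w = MIN2(bw, width - x);
   const unsigned h = MIN2(bh, height - y);
   unsigned written = 0;

   for (unsigned j = 0; j < h; j++) {
      for (unsigned i = 0; i < w; i++) {
         dst[(y + j) * dst_stride + (x + i)] = block[j * bw + i];
         written++;
      }
   }
   return written;
}
```

For a 6x5 destination, the block at (4, 4) writes only 2 texels instead of 16, which is the case that overran the buffer before.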


> - for (j = 0; j < bh; j++) {
> + /* be sure to stay within the bounds of the texture */
> + for (j = 0; j < bh && (j+y) < height; j++) {
>  uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps;
> -for (i = 0; i < bw; i++) {
> +for (i = 0; i < bw && (i+x) < width; i++) {
> etc2_rgb8_fetch_texel(&block, i, j, dst,
>   false /* punchthrough_alpha */);
> dst[3] = 255;
> @@ -721,9 +722,9 @@ etc2_unpack_srgb8(uint8_t *dst_row,
>   etc2_rgb8_parse_block(&block, src,
> false /* punchthrough_alpha */);
>  
> - for (j = 0; j < bh; j++) {
> + for (j = 0; j < bh && (j+y) < height; j++) {
>  uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps;
> -for (i = 0; i < bw; i++) {
> +for (i = 0; i < bw && (i+x) < width; i++) {
> etc2_rgb8_fetch_texel(&block, i, j, dst,
>   false /* punchthrough_alpha */);
> /* Convert to MESA_FORMAT_B8G8R8A8_SRGB */
> @@ -764,9 +765,9 @@ etc2_unpack_rgba8(uint8_t *dst_row,
>for (x = 0; x < width; x+= bw) {
>   etc2_rgba8_parse_block(&block, src);
>  
> - for (j = 0; j < bh; j++) {
> + for (j = 0; j < bh && (j+y) < height; j++) {
>  uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps;
> -for (i = 0; i < bw; i++) {
> +for (i = 0; i < bw && (i+x) < width; i++) {
> etc2_rgba8_fetch_texel(&block, i, j, dst);
> dst += comps;
>  }
> @@ -801,9 +802,9 @@ etc2_unpack_srgb8_alpha8(uint8_t *dst_row,
>for (x = 0; x < width; x+= bw) {
>   etc2_rgba8_parse_block(&block, src);
>  
> - for (j = 0; j < bh; j++) {
> + for (j = 0; j < bh && (j+y) < height; j++) {
>  uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps;
> -for (i = 0; i < bw; i++) {
> +for (i = 0; i < bw && (i+x) < width; i++) {
> etc2_rgba8_fetch_texel(&block, i, j, dst);
>  
> /* Convert to MESA_FORMAT_B8G8R8A8_SRGB */
> @@ -843,9 +844,9 @@ etc2_unpack_r11(uint8_t *dst_row,
>for (x = 0; x < width; x+= bw) {
>   etc2_r11_parse_block(&block, src);
>  
> - for (j = 0; j < bh; j++) {
> + for (j = 0; j < bh && (j+y) < height; j++) {
>  uint8_t *dst = dst_row + (y + j) * dst_stride + x * comps * 
> comp_size;
> -for (i = 0; i < bw; i++) {
> +for (i = 0; i < bw && (i+x) < width; i++) {
> etc2_r11_fetch_texel(&block, i, j, dst);
> dst += comps * comp_size;
>  }
> @@ -879,10 +880,10 @@ etc2_unpack_rg11(uint8_t *dst_row,
>   /* red component */
>   etc2_r11_parse_block(&block, src);
>  
> - for (j = 0; j < bh; j++) {
> + for (j = 0; j < bh && (j+y) < height; j++) {
>  uint8_t *dst = dst_row + (y + j) * dst_stride +
> x * comps * comp_size;
> -for (i = 0; i < bw; i++) {
> +for (i = 0; i < bw && (i+x) < width; i++) {
> etc2_r11_fetch_texel(&block, i, j, d

Re: [Mesa-dev] [PATCH] R600/SI: Custom select 64-bit ADD

2014-02-14 Thread Tom Stellard

On Thu, Feb 13, 2014 at 07:56:26AM -0800, Matt Arsenault wrote:
> 
> On Feb 7, 2014, at 7:46 AM, Tom Stellard  wrote:
> 
> > From: Tom Stellard 
> > 
> > ---
> > lib/Target/R600/AMDGPUISelDAGToDAG.cpp | 48 
> > ++
> > lib/Target/R600/SIISelLowering.cpp | 29 
> > lib/Target/R600/SIISelLowering.h   |  1 -
> > test/CodeGen/R600/add.ll   | 10 +++
> > test/CodeGen/R600/add_i64.ll   | 23 +++-
> > 5 files changed, 75 insertions(+), 36 deletions(-)
> > 
> > diff --git a/lib/Target/R600/AMDGPUISelDAGToDAG.cpp 
> > b/lib/Target/R600/AMDGPUISelDAGToDAG.cpp
> > index a989135..fea875c 100644
> > --- a/lib/Target/R600/AMDGPUISelDAGToDAG.cpp
> > +++ b/lib/Target/R600/AMDGPUISelDAGToDAG.cpp
> > @@ -200,6 +200,54 @@ SDNode *AMDGPUDAGToDAGISel::Select(SDNode *N) {
> >   }
> >   switch (Opc) {
> >   default: break;
> > +  // We are selecting i64 ADD here instead of custom lower it during
> > +  // DAG legalization, so we can fold some i64 ADDs used for address
> > +  // calculation into the LOAD and STORE instructions.
> > +  case ISD::ADD: {
> > +const AMDGPUSubtarget &ST = TM.getSubtarget();
> > +if (N->getValueType(0) != MVT::i64 ||
> > +ST.getGeneration() < AMDGPUSubtarget::SOUTHERN_ISLANDS)
> > +  break;
> > +
> > +SDLoc DL(N);
> > +SDValue LHS = N->getOperand(0);
> > +SDValue RHS = N->getOperand(1);
> > +
> > +SDValue Sub0 = CurDAG->getTargetConstant(AMDGPU::sub0, MVT::i32);
> > +SDValue Sub1 = CurDAG->getTargetConstant(AMDGPU::sub1, MVT::i32);
> > +
> > +SDNode *Lo0 = CurDAG->getMachineNode(TargetOpcode::EXTRACT_SUBREG,
> > + DL, MVT::i32, LHS, Sub0);
> > +SDNode *Hi0 = CurDAG->getMachineNode(TargetOpcode::EXTRACT_SUBREG,
> > + DL, MVT::i32, LHS, Sub1);
> > +
> > +SDNode *Lo1 = CurDAG->getMachineNode(TargetOpcode::EXTRACT_SUBREG,
> > + DL, MVT::i32, RHS, Sub0);
> > +SDNode *Hi1 = CurDAG->getMachineNode(TargetOpcode::EXTRACT_SUBREG,
> > + DL, MVT::i32, RHS, Sub1);
> > +
> > +SDVTList VTList = CurDAG->getVTList(MVT::i32, MVT::Glue);
> > +
> > +SmallVector AddLoArgs;
> > +AddLoArgs.push_back(SDValue(Lo0, 0));
> > +AddLoArgs.push_back(SDValue(Lo1, 0));
> > +
> > +SDNode *AddLo = CurDAG->getMachineNode(AMDGPU::S_ADD_I32, DL,
> > +   VTList, AddLoArgs);
> > +SDValue Carry = SDValue(AddLo, 1);
> > +SDNode *AddHi = CurDAG->getMachineNode(AMDGPU::S_ADDC_U32, DL,
> > +   MVT::i32, SDValue(Hi0, 0),
> > +   SDValue(Hi1, 0), Carry);
> > +
> > +SDValue Args[5] = {
> > +  CurDAG->getTargetConstant(AMDGPU::SReg_64RegClassID, MVT::i32),
> > +  SDValue(AddLo,0),
> > +  Sub0,
> > +  SDValue(AddHi,0),
> > +  Sub1,
> > +};
> > +return CurDAG->SelectNodeTo(N, AMDGPU::REG_SEQUENCE, MVT::i64, Args, 
> > 5);
> > +  }
> >   case ISD::BUILD_VECTOR: {
> > unsigned RegClassID;
> > const AMDGPUSubtarget &ST = TM.getSubtarget();
> > diff --git a/lib/Target/R600/SIISelLowering.cpp 
> > b/lib/Target/R600/SIISelLowering.cpp
> > index 0a22d16..4d2f370 100644
> > --- a/lib/Target/R600/SIISelLowering.cpp
> > +++ b/lib/Target/R600/SIISelLowering.cpp
> > @@ -76,7 +76,6 @@ SITargetLowering::SITargetLowering(TargetMachine &TM) :
> >   setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v16i32, Expand);
> >   setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v16f32, Expand);
> > 
> > -  setOperationAction(ISD::ADD, MVT::i64, Legal);
> 
> Would it be better to mark this as custom lowered, and then just return 
> SDValue() for it? That way it won’t be incorrectly reported as legal for 
> anything that might be checking.
>

That's an interesting idea, but if we do that then the legalizer
will try to expand the nodes, which isn't much of a problem now since
there is no ExpandNode implementation for ISD::ADD, but someone may
want to add one in the future.

I think most places use isLegalOrCustom(), so it may not make
much of a difference anyway.

-Tom
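
For readers following the quoted hunk, the lo/hi split it selects can be modelled in plain C. This is only a sketch of the arithmetic: the comments mapping steps to S_ADD_I32 and S_ADDC_U32 are a reading of the patch, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit add built from two 32-bit adds with a carry in between. */
static uint64_t
add64_via_32(uint64_t lhs, uint64_t rhs)
{
   /* EXTRACT_SUBREG sub0 / sub1 of each operand */
   uint32_t lo0 = (uint32_t) lhs, hi0 = (uint32_t) (lhs >> 32);
   uint32_t lo1 = (uint32_t) rhs, hi1 = (uint32_t) (rhs >> 32);

   uint32_t add_lo = lo0 + lo1;          /* low add, produces a carry */
   uint32_t carry = add_lo < lo0;        /* carry out of the low half */
   uint32_t add_hi = hi0 + hi1 + carry;  /* high add, consumes the carry */

   /* REG_SEQUENCE: reassemble the two halves into one 64-bit value */
   uint64_t result = ((uint64_t) add_hi << 32) | add_lo;
   assert(result == lhs + rhs);
   return result;
}
```

Adding 1 to 0xffffffff, for example, carries into the high half and gives 0x100000000.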
 
> >   setOperationAction(ISD::ADD, MVT::i32, Legal);
> >   setOperationAction(ISD::ADDC, MVT::i32, Legal);
> >   setOperationAction(ISD::ADDE, MVT::i32, Legal);
> > @@ -475,7 +474,6 @@ SDValue SITargetLowering::LowerOperation(SDValue Op, 
> > SelectionDAG &DAG) const {
> >   SIMachineFunctionInfo *MFI = MF.getInfo();
> >   switch (Op.getOpcode()) {
> >   default: return AMDGPUTargetLowering::LowerOperation(Op, DAG);
> > -  case ISD::ADD: return LowerADD(Op, DAG);
> >   case ISD::BRCOND: return LowerBRCOND(Op, DAG);
> >   case ISD::LOAD: {
> > LoadSDNode *Load = dyn_cast(Op);
> > @@ -613,33 +611,6 @@ SDValue SITargetLowering::LowerOperation(SDValue Op, 
> > SelectionDAG &DAG) const {
> >   return SDValue();
> > }
> > 
> > -SDValue SIT

Re: [Mesa-dev] [RFC] llvmpipe texture coordinate rounding

2014-02-14 Thread Roland Scheidegger

Am 14.02.2014 19:59, schrieb Jeff Muizelaar:
> 
> On Feb 14, 2014, at 1:00 PM, Roland Scheidegger wrote:
> 
>> Am 14.02.2014 18:07, schrieb Jeff Muizelaar:
>>
>> I'll need to take another look and run some tests, though I've got some
>> quick comments:
>>
>>
>> @@ -1031,16 +1082,28 @@ lp_build_sample_image_linear(struct
>> lp_build_sample_context *bld,
>>   s = lp_build_mul_imm(&bld->coord_bld, s, 256);
>>   if (dims >= 2)
>>  t = lp_build_mul_imm(&bld->coord_bld, t, 256);
>>   if (dims >= 3)
>>  r = lp_build_mul_imm(&bld->coord_bld, r, 256);
>>}
>>
>>/* convert float to int */
>> +   half = lp_build_const_vec(bld->gallivm, bld->coord_bld.type, 0.5);
>> +   s = lp_build_add(&bld->coord_bld, s, half);
>> +   s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
>> +   if (dims >= 2) {
>> +  t = lp_build_add(&bld->coord_bld, t, half);
>> +  t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
>> +   }
>> +   if (dims >= 3) {
>> +  r = lp_build_add(&bld->coord_bld, r, half);
>> +  r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
>> +   }
>> +
>>s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
>>if (dims >= 2)
>>   t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
>>if (dims >= 3)
>>   r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
>> This looks quite incorrect you're converting the s/t/r coords twice from
>> float to int.
> 
> Yep. I forgot to remove this second hunk.
> 
>>/* subtract 0.5 (add -128) */
>>i32_c128 = lp_build_const_int_vec(bld->gallivm, i32.type, -128);
>>
>>
>> Also, the add looks iffy as it won't work correctly if the coords are
>> negative, since the FPToSI is of course trunc, not floor.
> 
> I think it will be ok because the REPEAT case avoids negative coord
> before converting to int and the other cases clamp to 0.
I think for clamp_to_edge you're right, but repeat will use these coords
for the POT case (npot indeed won't care at all since it doesn't
actually use these values at all).

> 
>> Maybe instead of using add + fptosi should just use lp_build_iround
>> (which is just one sse instruction too on x86 though if you're targeting
>> another arch it will definitely be more code at least unless someone
>> adds an intrinsic for it if the cpu even has one). Might not matter
>> though depending on address mode…
> 
> Yeah, that might be a better idea.
> 
>>
>> And I might be missing something why do you think the new repeat code is
>> faster? Though that might also depend on arch_rounding being available
>> and such but at first looks it seems slightly more complex to me.
> 
> The current code converts integer and fractional parts to integer
> separately. It also does the subtract 0.5 in floating point instead of
> integer arithmetic (-128).

Yeah, you're probably right. With simple instruction counting I actually end up
with the same number of instructions (assuming arch_rounding is
available; I miscounted previously and thought it was two more), but
the ones in your version ought to be a bit cheaper.

I suspect nearest filtering actually suffers from the same inaccuracy:
we also do the mul by 256 so we get 8 fractional bits, then
toss away those 8 bits and just use the non-fractional part. That looks
to me like we only need these 8 bits so we get "reasonably correct"
rounding, but we would still choose the wrong sampling point sometimes
(if it's less than 1/256 pixels away from the center between two
texels). I actually noticed that one a while ago but I can't remember
why I didn't do something about it...

Roland

[Mesa-dev] [PATCH 4/8] i965: Don't try to use the ctx->ReadBuffer when asked to blorp miptrees.

2014-02-14 Thread Eric Anholt

So far it's happened to be that we're only ever calling
intel_miptree_blit() (up/downsampling) from the ReadBuffer, but I stumbled
over a null ReadBuffer case when debugging later parts of the series.
---
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index c23504f..0aeb651 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -1974,7 +1974,6 @@ brw_blorp_blit_params::brw_blorp_blit_params(struct 
brw_context *brw,
  bool mirror_x, bool mirror_y)
 {
struct gl_context *ctx = &brw->ctx;
-   const struct gl_framebuffer *read_fb = ctx->ReadBuffer;
 
src.set(brw, src_mt, src_level, src_layer, false);
dst.set(brw, dst_mt, dst_level, dst_layer, true);
@@ -2127,8 +2126,10 @@ brw_blorp_blit_params::brw_blorp_blit_params(struct 
brw_context *brw,
y0 = wm_push_consts.dst_y0 = dst_y0;
x1 = wm_push_consts.dst_x1 = dst_x1;
y1 = wm_push_consts.dst_y1 = dst_y1;
-   wm_push_consts.rect_grid_x1 = read_fb->Width * wm_prog_key.x_scale - 1.0;
-   wm_push_consts.rect_grid_y1 = read_fb->Height * wm_prog_key.y_scale - 1.0;
+   wm_push_consts.rect_grid_x1 = (minify(src_mt->logical_width0, src_level) *
+  wm_prog_key.x_scale - 1.0);
+   wm_push_consts.rect_grid_y1 = (minify(src_mt->logical_height0, src_level) *
+  wm_prog_key.y_scale - 1.0);
 
wm_push_consts.x_transform.setup(src_x0, src_x1, dst_x0, dst_x1, mirror_x);
wm_push_consts.y_transform.setup(src_y0, src_y1, dst_y0, dst_y1, mirror_y);
-- 
1.9.rc1
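
A small sketch of the new rect_grid computation, assuming minify() halves the size once per mip level and never goes below 1. The demo_ names and the scale values are illustrative only and are not taken from the driver.

```c
#include <assert.h>

/* Assumed behaviour of minify(): size of a given mip level,
 * clamped so that it never reaches zero. */
static unsigned
demo_minify(unsigned size, unsigned level)
{
   assert(level < 32);
   unsigned v = size >> level;
   return v > 1 ? v : 1;
}

/* Last valid grid coordinate derived from the source miptree level
 * rather than from the read framebuffer. */
static float
demo_rect_grid_max(unsigned logical_size0, unsigned level, float scale)
{
   return demo_minify(logical_size0, level) * scale - 1.0f;
}
```

For a 64-texel-wide miptree at level 1 with a scale of 2.0, this gives 32 * 2.0 - 1.0 = 63.0, with no dependency on ctx->ReadBuffer.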


[Mesa-dev] [PATCH 6/8] i965: Drop some duplicated code in DRI winsys BO updates.

2014-02-14 Thread Eric Anholt

The only DRI2 vs DRI3 delta was just how to decide about frontbuffer-ness
for doing the upsample.
---
 src/mesa/drivers/dri/i965/brw_context.c   |  26 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 102 --
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  19 ++---
 3 files changed, 37 insertions(+), 110 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 021287e..ba2f971 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -1302,11 +1302,14 @@ intel_process_dri2_buffer(struct brw_context *brw,
   return;
}
 
-   rb->mt = intel_miptree_create_for_dri2_buffer(brw,
- buffer->attachment,
- intel_rb_format(rb),
- num_samples,
- region);
+   intel_update_winsys_renderbuffer_miptree(brw, rb, region);
+
+   if (brw->is_front_buffer_rendering &&
+   (buffer->attachment == __DRI_BUFFER_FRONT_LEFT ||
+buffer->attachment == __DRI_BUFFER_FAKE_FRONT_LEFT) &&
+   rb->Base.Base.NumSamples > 1) {
+  intel_miptree_upsample(brw, rb->mt);
+   }
 
assert(rb->mt);
 
@@ -1359,12 +1362,13 @@ intel_update_image_buffer(struct brw_context *intel,
   return;
}
 
-   intel_miptree_release(&rb->mt);
-   rb->mt = intel_miptree_create_for_image_buffer(intel,
-  buffer_type,
-  intel_rb_format(rb),
-  num_samples,
-  region);
+   intel_update_winsys_renderbuffer_miptree(intel, rb, region);
+
+   if (intel->is_front_buffer_rendering &&
+   buffer_type == __DRI_IMAGE_BUFFER_FRONT &&
+   rb->Base.Base.NumSamples > 1) {
+  intel_miptree_upsample(intel, rb->mt);
+   }
 }
 
 static void
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 86a3887..63527e1 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -35,6 +35,7 @@
 #include "intel_resolve_map.h"
 #include "intel_tex.h"
 #include "intel_blit.h"
+#include "intel_fbo.h"
 
 #include "brw_blorp.h"
 #include "brw_context.h"
@@ -666,78 +667,6 @@ intel_miptree_create_for_bo(struct brw_context *brw,
return mt;
 }
 
-
-/**
- * For a singlesample DRI2 buffer, this simply wraps the given region with a miptree.
- *
- * For a multisample DRI2 buffer, this wraps the given region with
- * a singlesample miptree, then creates a multisample miptree into which the
- * singlesample miptree is embedded as a child.
- */
-struct intel_mipmap_tree*
-intel_miptree_create_for_dri2_buffer(struct brw_context *brw,
- unsigned dri_attachment,
- mesa_format format,
- uint32_t num_samples,
- struct intel_region *region)
-{
-   struct intel_mipmap_tree *singlesample_mt = NULL;
-   struct intel_mipmap_tree *multisample_mt = NULL;
-
-   /* Only the front and back buffers, which are color buffers, are shared
-* through DRI2.
-*/
-   assert(dri_attachment == __DRI_BUFFER_BACK_LEFT ||
-  dri_attachment == __DRI_BUFFER_FRONT_LEFT ||
-  dri_attachment == __DRI_BUFFER_FAKE_FRONT_LEFT);
-   assert(_mesa_get_format_base_format(format) == GL_RGB ||
-  _mesa_get_format_base_format(format) == GL_RGBA);
-
-   singlesample_mt = intel_miptree_create_for_bo(brw,
- region->bo,
- format,
- 0,
- region->width,
- region->height,
- region->pitch,
- region->tiling);
-   if (!singlesample_mt)
-  return NULL;
-   singlesample_mt->region->name = region->name;
-
-   /* If this miptree is capable of supporting fast color clears, set
-* fast_clear_state appropriately to ensure that fast clears will occur.
-* Allocation of the MCS miptree will be deferred until the first fast
-* clear actually occurs.
-*/
-   if (intel_is_non_msrt_mcs_buffer_supported(brw, singlesample_mt))
-  singlesample_mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED;
-
-   if (num_samples == 0)
-  return singlesample_mt;
-
-   multisample_mt = intel_miptree_create_for_renderbuffer(brw,
-  format,
-  region->width,
-

[Mesa-dev] [PATCH 3/8] i965: Make the mt->target of multisample renderbuffers be 2D_MS.

2014-02-14 Thread Eric Anholt

Mostly mt->target == 2D_MS just results in a few checks that we don't try
to allocate multiple LODs and don't try to do slice copies with them.  But
with the introduction of binding renderbuffers to textures, we need more
consistency.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index c9f5bb3..08b8475 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -813,8 +813,9 @@ intel_miptree_create_for_renderbuffer(struct brw_context *brw,
struct intel_mipmap_tree *mt;
uint32_t depth = 1;
bool ok;
+   GLenum target = num_samples > 1 ? GL_TEXTURE_2D_MULTISAMPLE : GL_TEXTURE_2D;
 
-   mt = intel_miptree_create(brw, GL_TEXTURE_2D, format, 0, 0,
+   mt = intel_miptree_create(brw, target, format, 0, 0,
 width, height, depth, true, num_samples,
  INTEL_MIPTREE_TILING_ANY);
if (!mt)
@@ -1651,7 +1652,8 @@ intel_miptree_updownsample(struct brw_context *brw,
 static void
 assert_is_flat(struct intel_mipmap_tree *mt)
 {
-   assert(mt->target == GL_TEXTURE_2D);
+   assert(mt->target == GL_TEXTURE_2D ||
+  mt->target == GL_TEXTURE_2D_MULTISAMPLE);
assert(mt->first_level == 0);
assert(mt->last_level == 0);
 }
@@ -2363,7 +2365,7 @@ intel_miptree_map_multisample(struct brw_context *brw,
assert(mt->num_samples > 1);
 
/* Only flat, renderbuffer-like miptrees are supported. */
-   if (mt->target != GL_TEXTURE_2D ||
+   if (mt->target != GL_TEXTURE_2D_MULTISAMPLE ||
mt->first_level != 0 ||
mt->last_level != 0) {
   _mesa_problem(ctx, "attempt to map a multisample miptree for "
-- 
1.9.rc1

[Mesa-dev] [PATCH 1/8] meta: Fix blit shader compile on non-glsl-130 drivers.

2014-02-14 Thread Eric Anholt

Compare this VS to the one for the post-130 case.  Fixes piglit
glsl-lod-bias, and presumably tons of other code (I haven't done a full
piglit run on swrast).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74911
---
 src/mesa/drivers/common/meta.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index d3ca3b7..dd905dd 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -193,7 +193,7 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
|| ctx->Const.GLSLVersion < 130) {
   vs_source =
  "attribute vec2 position;\n"
- "attribute vec3 textureCoords;\n"
+ "attribute vec4 textureCoords;\n"
  "varying vec4 texCoords;\n"
  "void main()\n"
  "{\n"
-- 
1.9.rc1

[Mesa-dev] [PATCH 2/8] meta: Push into desktop GL mode when doing meta operations.

2014-02-14 Thread Eric Anholt

This lets us simplify our shaders, and rely on GLES-prohibited
functionality (like ARB_texture_multisample) when writing these
driver-internal functions.
---
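The save/override/restore idea described above can be sketched in isolation. This is only an illustration under stated assumptions, not the Mesa code: `api_sketch`, `ctx_sketch`, `save_sketch`, `meta_begin_sketch` and `meta_end_sketch` are hypothetical stand-ins for the real context, save state and meta entry points.

```c
#include <assert.h>

/* Hypothetical stand-ins for Mesa's API enum and context; every name
 * ending in _sketch is illustrative only. */
enum api_sketch { API_COMPAT_SKETCH, API_CORE_SKETCH, API_ES2_SKETCH };

struct ctx_sketch  { enum api_sketch api; };
struct save_sketch { enum api_sketch api; };

/* Remember the user's API, then switch to desktop (compat) mode so that
 * internal shaders only have to be written for one API. */
static void
meta_begin_sketch(struct ctx_sketch *ctx, struct save_sketch *save)
{
   assert(ctx && save);
   save->api = ctx->api;
   ctx->api = API_COMPAT_SKETCH;
}

/* Put the user's API back once the internal operation is done. */
static void
meta_end_sketch(struct ctx_sketch *ctx, const struct save_sketch *save)
{
   assert(ctx && save);
   ctx->api = save->api;
}
```

Every begin must be paired with an end, otherwise the user's context would stay in desktop mode after the internal operation.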
 src/mesa/drivers/common/meta.c | 39 ---
 src/mesa/drivers/common/meta.h |  3 +++
 2 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index dd905dd..a0613f2 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -204,9 +204,6 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
   fs_source = ralloc_asprintf(mem_ctx,
   "#extension GL_EXT_texture_array : enable\n"
   "#extension GL_ARB_texture_cube_map_array: enable\n"
-  "#ifdef GL_ES\n"
-  "precision highp float;\n"
-  "#endif\n"
   "uniform %s texSampler;\n"
   "varying vec4 texCoords;\n"
   "void main()\n"
@@ -219,7 +216,7 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
}
else {
   vs_source = ralloc_asprintf(mem_ctx,
-  "#version %s\n"
+  "#version 130\n"
   "in vec2 position;\n"
   "in vec4 textureCoords;\n"
   "out vec4 texCoords;\n"
@@ -227,14 +224,10 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
   "{\n"
   "   texCoords = textureCoords;\n"
   "   gl_Position = vec4(position, 0.0, 1.0);\n"
-  "}\n",
-  _mesa_is_desktop_gl(ctx) ? "130" : "300 es");
+  "}\n");
   fs_source = ralloc_asprintf(mem_ctx,
-  "#version %s\n"
+  "#version 130\n"
   "#extension GL_ARB_texture_cube_map_array: enable\n"
-  "#ifdef GL_ES\n"
-  "precision highp float;\n"
-  "#endif\n"
   "uniform %s texSampler;\n"
   "in vec4 texCoords;\n"
   "out vec4 out_color;\n"
@@ -244,7 +237,6 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
   "   out_color = texture(texSampler, %s);\n"
   "   gl_FragDepth = out_color.x;\n"
   "}\n",
-  _mesa_is_desktop_gl(ctx) ? "130" : "300 es",
   shader->type,
   shader->texcoords);
}
@@ -401,6 +393,13 @@ _mesa_meta_begin(struct gl_context *ctx, GLbitfield state)
memset(save, 0, sizeof(*save));
save->SavedState = state;
 
+   /* We always push into desktop GL mode and pop out at the end.  No sense in
+* writing our shaders varying based on the user's context choice, when
+* Mesa can handle either.
+*/
+   save->API = ctx->API;
+   ctx->API = API_OPENGL_COMPAT;
+
/* Pausing transform feedback needs to be done early, or else we won't be
 * able to change other state.
 */
@@ -753,6 +752,8 @@ _mesa_meta_end(struct gl_context *ctx)
const GLbitfield state = save->SavedState;
int i;
 
+   ctx->API = save->API;
+
/* After starting a new occlusion query, initialize the results to the
 * values saved previously. The driver will then continue to increment
 * these values.
@@ -1482,9 +1483,6 @@ meta_glsl_clear_init(struct gl_context *ctx, struct clear_state *clear)
   "  }\n"
   "}\n";
const char *fs_source =
-  "#ifdef GL_ES\n"
-  "precision highp float;\n"
-  "#endif\n"
   "uniform vec4 color;\n"
   "void main()\n"
   "{\n"
@@ -1536,27 +1534,22 @@ meta_glsl_clear_init(struct gl_context *ctx, struct clear_state *clear)
   void *shader_source_mem_ctx = ralloc_context(NULL);
   const char *vs_int_source =
  ralloc_asprintf(shader_source_mem_ctx,
- "#version %s\n"
+ "#version 130\n"
  "in vec4 position;\n"
  "void main()\n"
  "{\n"
  "   gl_Position = position;\n"
- "}\n",
- _mesa_is_desktop_gl(ctx) ? "130" : "300 es");
+ "}\n");
   const char *fs_int_source =
  ralloc_asprintf(shader_source_mem_ctx,
- "#version %s\n"
- "#ifdef GL_ES\n"
- "precision h

[Mesa-dev] [PATCH 5/8] i965: Simplify intel_miptree_updownsample.

2014-02-14 Thread Eric Anholt

Pretty silly to pass in values dereferenced out of one of the arguments.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 35 +--
 1 file changed, 11 insertions(+), 24 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 08b8475..86a3887 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1619,32 +1619,25 @@ intel_offset_S8(uint32_t stride, uint32_t x, uint32_t y, bool swizzled)
 static void
 intel_miptree_updownsample(struct brw_context *brw,
struct intel_mipmap_tree *src,
-   struct intel_mipmap_tree *dst,
-   unsigned width,
-   unsigned height)
+   struct intel_mipmap_tree *dst)
 {
-   int src_x0 = 0;
-   int src_y0 = 0;
-   int dst_x0 = 0;
-   int dst_y0 = 0;
-
brw_blorp_blit_miptrees(brw,
src, 0 /* level */, 0 /* layer */,
dst, 0 /* level */, 0 /* layer */,
-   src_x0, src_y0,
-   width, height,
-   dst_x0, dst_y0,
-   width, height,
+   0, 0,
+   src->logical_width0, src->logical_height0,
+   0, 0,
+   src->logical_width0, src->logical_height0,
GL_NEAREST, false, false /*mirror x, y*/);
 
if (src->stencil_mt) {
   brw_blorp_blit_miptrees(brw,
   src->stencil_mt, 0 /* level */, 0 /* layer */,
   dst->stencil_mt, 0 /* level */, 0 /* layer */,
-  src_x0, src_y0,
-  width, height,
-  dst_x0, dst_y0,
-  width, height,
+  0, 0,
+  src->logical_width0, src->logical_height0,
+  0, 0,
+  src->logical_width0, src->logical_height0,
   GL_NEAREST, false, false /*mirror x, y*/);
}
 }
@@ -1672,10 +1665,7 @@ intel_miptree_downsample(struct brw_context *brw,
 
if (!mt->need_downsample)
   return;
-   intel_miptree_updownsample(brw,
-  mt, mt->singlesample_mt,
-  mt->logical_width0,
-  mt->logical_height0);
+   intel_miptree_updownsample(brw, mt, mt->singlesample_mt);
mt->need_downsample = false;
 }
 
@@ -1692,10 +1682,7 @@ intel_miptree_upsample(struct brw_context *brw,
assert_is_flat(mt);
assert(!mt->need_downsample);
 
-   intel_miptree_updownsample(brw,
-  mt->singlesample_mt, mt,
-  mt->logical_width0,
-  mt->logical_height0);
+   intel_miptree_updownsample(brw, mt->singlesample_mt, mt);
 }
 
 void *
-- 
1.9.rc1

[Mesa-dev] [PATCH 8/8] i965: Drop mt->levels[].width/height.

2014-02-14 Thread Eric Anholt

It often confused people because it was unclear on whether it was the
physical or logical, and people needed the other one as well.  We can
recompute it trivially using the minify() macro, clarifying which value is
being used and making getting the other value obvious.
---
 src/mesa/drivers/dri/i965/brw_blorp.cpp   |  4 +--
 src/mesa/drivers/dri/i965/brw_clear.c |  3 +-
 src/mesa/drivers/dri/i965/brw_tex_layout.c|  5 ++--
 src/mesa/drivers/dri/i965/intel_blit.c|  4 +--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 40 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  5 +---
 src/mesa/drivers/dri/i965/intel_screen.c  |  4 +--
 7 files changed, 23 insertions(+), 42 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp b/src/mesa/drivers/dri/i965/brw_blorp.cpp
index 76537c8..7980013 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
@@ -68,8 +68,8 @@ brw_blorp_mip_info::set(struct intel_mipmap_tree *mt,
this->mt = mt;
this->level = level;
this->layer = layer;
-   this->width = mt->level[level].width;
-   this->height = mt->level[level].height;
+   this->width = minify(mt->physical_width0, level);
+   this->height = minify(mt->physical_height0, level);
 
intel_miptree_get_image_offset(mt, level, layer, &x_offset, &y_offset);
 }
diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/i965/brw_clear.c
index 1964572..d9a8792 100644
--- a/src/mesa/drivers/dri/i965/brw_clear.c
+++ b/src/mesa/drivers/dri/i965/brw_clear.c
@@ -155,7 +155,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
*width of the map (LOD0) is not multiple of 16, fast clear
*optimization must be disabled.
*/
-  if (brw->gen == 6 && (mt->level[depth_irb->mt_level].width % 16) != 0)
+  if (brw->gen == 6 && (minify(mt->physical_width0,
+   depth_irb->mt_level) % 16) != 0)
 return false;
   /* FALLTHROUGH */
 
diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c b/src/mesa/drivers/dri/i965/brw_tex_layout.c
index 61a2eba..76044b2 100644
--- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
@@ -197,8 +197,7 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
for (unsigned level = mt->first_level; level <= mt->last_level; level++) {
   unsigned img_height;
 
-  intel_miptree_set_level_info(mt, level, x, y, width,
-  height, depth);
+  intel_miptree_set_level_info(mt, level, x, y, depth);
 
   img_height = ALIGN(height, mt->align_h);
   if (mt->compressed)
@@ -281,7 +280,7 @@ brw_miptree_layout_texture_3d(struct brw_context *brw,
   if (mt->target == GL_TEXTURE_CUBE_MAP)
  DL = 6;
 
-  intel_miptree_set_level_info(mt, level, 0, 0, WL, HL, DL);
+  intel_miptree_set_level_info(mt, level, 0, 0, DL);
 
   for (unsigned q = 0; q < DL; q++) {
  unsigned x = (q % (1 << level)) * wL;
diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c
index b12ecca..23757f3 100644
--- a/src/mesa/drivers/dri/i965/intel_blit.c
+++ b/src/mesa/drivers/dri/i965/intel_blit.c
@@ -215,10 +215,10 @@ intel_miptree_blit(struct brw_context *brw,
intel_miptree_resolve_color(brw, dst_mt);
 
if (src_flip)
-  src_y = src_mt->level[src_level].height - src_y - height;
+  src_y = minify(src_mt->physical_height0, src_level) - src_y - height;
 
if (dst_flip)
-  dst_y = dst_mt->level[dst_level].height - dst_y - height;
+  dst_y = minify(dst_mt->physical_height0, dst_level) - dst_y - height;
 
int src_pitch = src_mt->region->pitch;
if (src_flip != dst_flip)
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 522837c..88bab9a 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -866,24 +866,10 @@ intel_miptree_match_image(struct intel_mipmap_tree *mt,
 * minification.  This will also catch images not present in the
 * tree, changed targets, etc.
 */
-   if (mt->target == GL_TEXTURE_2D_MULTISAMPLE ||
- mt->target == GL_TEXTURE_2D_MULTISAMPLE_ARRAY) {
-  /* nonzero level here is always bogus */
-  assert(level == 0);
-
-  if (width != mt->logical_width0 ||
-height != mt->logical_height0 ||
-depth != mt->logical_depth0) {
- return false;
-  }
-   }
-   else {
-  /* all normal textures, renderbuffers, etc */
-  if (width != mt->level[level].width ||
-  height != mt->level[level].height ||
-  depth != mt->level[level].depth) {
- return false;
-  }
+   if (width != minify(mt->logical_width0, level) ||
+   height != minify(mt->logical_height0, level) ||
+   depth != mt->level[level].depth) {
+  return false;
}

[Mesa-dev] [PATCH 7/8] i965: Move singlesample_mt to the renderbuffer.

2014-02-14 Thread Eric Anholt

Since only window system renderbuffers can have a singlesample_mt, this
lets us drop a bunch of sanity checking to make sure that we're just a
renderbuffer-like thing.
---
 src/mesa/drivers/dri/i965/brw_context.c   |  20 ++-
 src/mesa/drivers/dri/i965/intel_fbo.c |  91 ++-
 src/mesa/drivers/dri/i965/intel_fbo.h |  47 +-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 209 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  59 +---
 5 files changed, 165 insertions(+), 261 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index ba2f971..1071f9f 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -997,7 +997,7 @@ intel_resolve_for_dri2_flush(struct brw_context *brw,
   if (rb->mt->num_samples <= 1)
  intel_miptree_resolve_color(brw, rb->mt);
   else
- intel_miptree_downsample(brw, rb->mt);
+ intel_renderbuffer_downsample(brw, rb);
}
 }
 
@@ -1270,10 +1270,9 @@ intel_process_dri2_buffer(struct brw_context *brw,
rb->mt->region->name == buffer->name)
   return;
} else {
-   if (rb->mt &&
-   rb->mt->singlesample_mt &&
-   rb->mt->singlesample_mt->region &&
-   rb->mt->singlesample_mt->region->name == buffer->name)
+   if (rb->singlesample_mt &&
+   rb->singlesample_mt->region &&
+   rb->singlesample_mt->region->name == buffer->name)
   return;
}
 
@@ -1308,7 +1307,7 @@ intel_process_dri2_buffer(struct brw_context *brw,
(buffer->attachment == __DRI_BUFFER_FRONT_LEFT ||
 buffer->attachment == __DRI_BUFFER_FAKE_FRONT_LEFT) &&
rb->Base.Base.NumSamples > 1) {
-  intel_miptree_upsample(brw, rb->mt);
+  intel_renderbuffer_upsample(brw, rb);
}
 
assert(rb->mt);
@@ -1355,10 +1354,9 @@ intel_update_image_buffer(struct brw_context *intel,
rb->mt->region->bo == region->bo)
   return;
} else {
-   if (rb->mt &&
-   rb->mt->singlesample_mt &&
-   rb->mt->singlesample_mt->region &&
-   rb->mt->singlesample_mt->region->bo == region->bo)
+   if (rb->singlesample_mt &&
+   rb->singlesample_mt->region &&
+   rb->singlesample_mt->region->bo == region->bo)
   return;
}
 
@@ -1367,7 +1365,7 @@ intel_update_image_buffer(struct brw_context *intel,
if (intel->is_front_buffer_rendering &&
buffer_type == __DRI_IMAGE_BUFFER_FRONT &&
rb->Base.Base.NumSamples > 1) {
-  intel_miptree_upsample(intel, rb->mt);
+  intel_renderbuffer_upsample(intel, rb);
}
 }
 
diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c b/src/mesa/drivers/dri/i965/intel_fbo.c
index cd148f0..62ccfab 100644
--- a/src/mesa/drivers/dri/i965/intel_fbo.c
+++ b/src/mesa/drivers/dri/i965/intel_fbo.c
@@ -74,11 +74,41 @@ intel_delete_renderbuffer(struct gl_context *ctx, struct gl_renderbuffer *rb)
ASSERT(irb);
 
intel_miptree_release(&irb->mt);
+   intel_miptree_release(&irb->singlesample_mt);
 
_mesa_delete_renderbuffer(ctx, rb);
 }
 
 /**
+ * \brief Downsample a winsys renderbuffer from mt to singlesample_mt.
+ *
+ * If the miptree needs no downsample, then skip.
+ */
+void
+intel_renderbuffer_downsample(struct brw_context *brw,
+  struct intel_renderbuffer *irb)
+{
+   if (!irb->need_downsample)
+  return;
+   intel_miptree_updownsample(brw, irb->mt, irb->singlesample_mt);
+   irb->need_downsample = false;
+}
+
+/**
+ * \brief Upsample a winsys renderbuffer from singlesample_mt to mt.
+ *
+ * The upsample is done unconditionally.
+ */
+void
+intel_renderbuffer_upsample(struct brw_context *brw,
+struct intel_renderbuffer *irb)
+{
+   assert(!irb->need_downsample);
+
+   intel_miptree_updownsample(brw, irb->singlesample_mt, irb->mt);
+}
+
+/**
  * \see dd_function_table::MapRenderbuffer
  */
 static void
@@ -92,6 +122,7 @@ intel_map_renderbuffer(struct gl_context *ctx,
struct brw_context *brw = brw_context(ctx);
struct swrast_renderbuffer *srb = (struct swrast_renderbuffer *)rb;
struct intel_renderbuffer *irb = intel_renderbuffer(rb);
+   struct intel_mipmap_tree *mt;
void *map;
int stride;
 
@@ -106,6 +137,39 @@ intel_map_renderbuffer(struct gl_context *ctx,
 
intel_prepare_render(brw);
 
+   /* The MapRenderbuffer API should always be returning a single-sampled
+* mapping.  The case we get mapping of multisampled RBs are in
+* glReadPixels() (or swrast paths like glCopyTexImage()) from a
+* window-system MSAA buffer.
+*
+* If it's a color miptree, there is a ->singlesample_mt which wraps the
+* actual window system renderbuffer (which we may resolve to at any time),
+* while the miptree itself is our driver-private allocation.  If it's a
+* depth or stencil miptree, we have a private MSAA buffer and no shared
+

Re: [Mesa-dev] [RFC] llvmpipe texture coordinate rounding

2014-02-14 Thread Roland Scheidegger

Am 14.02.2014 23:35, schrieb Roland Scheidegger:
> Am 14.02.2014 19:59, schrieb Jeff Muizelaar:
>>
>> On Feb 14, 2014, at 1:00 PM, Roland Scheidegger wrote:
>>
>>> Am 14.02.2014 18:07, schrieb Jeff Muizelaar:
>>>
>>> I'll need to take another look and run some tests, though I've got some
>>> quick comments:
>>>
>>>
>>> @@ -1031,16 +1082,28 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
>>>   s = lp_build_mul_imm(&bld->coord_bld, s, 256);
>>>   if (dims >= 2)
>>>  t = lp_build_mul_imm(&bld->coord_bld, t, 256);
>>>   if (dims >= 3)
>>>  r = lp_build_mul_imm(&bld->coord_bld, r, 256);
>>>}
>>>
>>>/* convert float to int */
>>> +   half = lp_build_const_vec(bld->gallivm, bld->coord_bld.type, 0.5);
>>> +   s = lp_build_add(&bld->coord_bld, s, half);
>>> +   s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
>>> +   if (dims >= 2) {
>>> +  t = lp_build_add(&bld->coord_bld, t, half);
>>> +  t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
>>> +   }
>>> +   if (dims >= 3) {
>>> +  r = lp_build_add(&bld->coord_bld, r, half);
>>> +  r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
>>> +   }
>>> +
>>>s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
>>>if (dims >= 2)
>>>   t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
>>>if (dims >= 3)
>>>   r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
>>> This looks quite incorrect: you're converting the s/t/r coords twice from
>>> float to int.
>>
>> Yep. I forgot to remove this second hunk.
>>
>>>/* subtract 0.5 (add -128) */
>>>i32_c128 = lp_build_const_int_vec(bld->gallivm, i32.type, -128);
>>>
>>>
>>> Also, the add looks iffy as it won't work correctly if the coords are
>>> negative, since the FPToSI is of course trunc, not floor.
>>
>> I think it will be ok because the REPEAT case avoids negative coord
>> before converting to int and the other cases clamp to 0.
> I think for clamp_to_edge you're right, but repeat will use these coords
> for the POT case (npot indeed won't care at all since it doesn't
> actually use these values at all).
Hmm actually looking even closer I suspect this could introduce some
slight error with texel offsets, since these are added later hence
negative values DO matter. So probably safer to just always use correct
rounding. FWIW I'm wondering if we could use llvm intrinsics for this
instead of coming up with our own, llvm seems to have nearbyint, rint
and round. All returning floats though, but maybe it would recognize
patterns if combined with fptosi (I have no idea which one I'd have to
use though nor if they work and if so with what backends).
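To illustrate the trunc-versus-floor point in plain C (this is not llvmpipe code; a C float-to-int cast truncates toward zero in the same way fptosi does, and all function names here are made up for the example):

```c
#include <assert.h>

/* Add 0.5 and truncate toward zero.  This matches round-to-nearest only
 * while x + 0.5 is non-negative. */
static int
round_add_half_trunc(float x)
{
   return (int)(x + 0.5f);
}

/* Floor built from a truncating cast: step down by one whenever the
 * truncation moved the value up, which happens for negative non-integers. */
static int
floor_to_int(float x)
{
   int i = (int)x;
   return (x < (float)i) ? i - 1 : i;
}

/* Add 0.5 and floor.  Rounds to nearest for negative inputs as well. */
static int
round_add_half_floor(float x)
{
   return floor_to_int(x + 0.5f);
}
```

For 2.25 both variants give 2. For -1.25 the truncating variant gives 0 while the floor variant gives -1, so the two only agree as long as the value being converted is never negative.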

> 
>>
>>> Maybe instead of using add + fptosi should just use lp_build_iround
>>> (which is just one sse instruction too on x86 though if you're targeting
>>> another arch it will definitely be more code at least unless someone
>>> adds an intrinsic for it if the cpu even has one). Might not matter
>>> though depending on address mode…
>>
>> Yeah, that might be a better idea.
>>
>>>
>>> And I might be missing something why do you think the new repeat code is
>>> faster? Though that might also depend on arch_rounding being available
>>> and such but at first looks it seems slightly more complex to me.
>>
>> The current code converts integer and fractional parts to integer
>> separately. It also does the subtract 0.5 in floating point instead of
>> integer arithmetic (-128).
> 
> Yeah, you're probably right. With simple instruction counting I end up
> with the same number of instructions actually (assuming arch_rounding is
> available, I miscounted previously and thought it were two more), but
> the ones in your version ought to be a bit cheaper.
> 
> I suspect actually nearest filtering suffers from the same inaccuracy,
> we actually also do the mul by 256 so we get 8 fractional bits, then
> toss away those 8 bits and just use the non-fractional part. That looks
> to me like we only need these 8 bits so we get "reasonably correct"
> rounding, but we still would choose the wrong sampling point sometimes
> (if it's less than 1/256 pixels away from the center between two
> texels). I actually noticed that one a while ago but I can't remember
> why I didn't do something about it...
> 

Roland

Re: [Mesa-dev] [PATCH 1/8] meta: Fix blit shader compile on non-glsl-130 drivers.

2014-02-14 Thread Ian Romanick

On 02/14/2014 03:00 PM, Eric Anholt wrote:
> Compare this VS to the one for the post-130 case.  Fixes piglit
> glsl-lod-bias, and presumably tons of other code (I haven't done a full
> piglit run on swrast).

Looks like a good fix.  If we had NV_explicit_attrib_location[1], we
could easily unify a bunch of the vertex shaders in meta because it
enables "in" and "out" generally. :(

1:
http://www.khronos.org/registry/gles/extensions/NV/NV_explicit_attrib_location.txt

> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74911

Reviewed-by: Ian Romanick 

> ---
>  src/mesa/drivers/common/meta.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> index d3ca3b7..dd905dd 100644
> --- a/src/mesa/drivers/common/meta.c
> +++ b/src/mesa/drivers/common/meta.c
> @@ -193,7 +193,7 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
> || ctx->Const.GLSLVersion < 130) {
>vs_source =
>   "attribute vec2 position;\n"
> - "attribute vec3 textureCoords;\n"
> + "attribute vec4 textureCoords;\n"
>   "varying vec4 texCoords;\n"
>   "void main()\n"
>   "{\n"
> 

Re: [Mesa-dev] [PATCH 8/8] i965: Drop mt->levels[].width/height.

2014-02-14 Thread Chris Forbes

This is a nice improvement.

Reviewed-by: Chris Forbes 

On Sat, Feb 15, 2014 at 12:00 PM, Eric Anholt  wrote:
> It often confused people because it was unclear on whether it was the
> physical or logical, and people needed the other one as well.  We can
> recompute it trivially using the minify() macro, clarifying which value is
> being used and making getting the other value obvious.
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.cpp   |  4 +--
>  src/mesa/drivers/dri/i965/brw_clear.c |  3 +-
>  src/mesa/drivers/dri/i965/brw_tex_layout.c|  5 ++--
>  src/mesa/drivers/dri/i965/intel_blit.c|  4 +--
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 40 ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  5 +---
>  src/mesa/drivers/dri/i965/intel_screen.c  |  4 +--
>  7 files changed, 23 insertions(+), 42 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> index 76537c8..7980013 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> @@ -68,8 +68,8 @@ brw_blorp_mip_info::set(struct intel_mipmap_tree *mt,
> this->mt = mt;
> this->level = level;
> this->layer = layer;
> -   this->width = mt->level[level].width;
> -   this->height = mt->level[level].height;
> +   this->width = minify(mt->physical_width0, level);
> +   this->height = minify(mt->physical_height0, level);
>
> intel_miptree_get_image_offset(mt, level, layer, &x_offset, &y_offset);
>  }
> diff --git a/src/mesa/drivers/dri/i965/brw_clear.c b/src/mesa/drivers/dri/i965/brw_clear.c
> index 1964572..d9a8792 100644
> --- a/src/mesa/drivers/dri/i965/brw_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> @@ -155,7 +155,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
> *width of the map (LOD0) is not multiple of 16, fast clear
> *optimization must be disabled.
> */
> -  if (brw->gen == 6 && (mt->level[depth_irb->mt_level].width % 16) != 0)
> +  if (brw->gen == 6 && (minify(mt->physical_width0,
> +   depth_irb->mt_level) % 16) != 0)
>  return false;
>/* FALLTHROUGH */
>
> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> index 61a2eba..76044b2 100644
> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> @@ -197,8 +197,7 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
> for (unsigned level = mt->first_level; level <= mt->last_level; level++) {
>unsigned img_height;
>
> -  intel_miptree_set_level_info(mt, level, x, y, width,
> -  height, depth);
> +  intel_miptree_set_level_info(mt, level, x, y, depth);
>
>img_height = ALIGN(height, mt->align_h);
>if (mt->compressed)
> @@ -281,7 +280,7 @@ brw_miptree_layout_texture_3d(struct brw_context *brw,
>if (mt->target == GL_TEXTURE_CUBE_MAP)
>   DL = 6;
>
> -  intel_miptree_set_level_info(mt, level, 0, 0, WL, HL, DL);
> +  intel_miptree_set_level_info(mt, level, 0, 0, DL);
>
>for (unsigned q = 0; q < DL; q++) {
>   unsigned x = (q % (1 << level)) * wL;
> diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c
> index b12ecca..23757f3 100644
> --- a/src/mesa/drivers/dri/i965/intel_blit.c
> +++ b/src/mesa/drivers/dri/i965/intel_blit.c
> @@ -215,10 +215,10 @@ intel_miptree_blit(struct brw_context *brw,
> intel_miptree_resolve_color(brw, dst_mt);
>
> if (src_flip)
> -  src_y = src_mt->level[src_level].height - src_y - height;
> +  src_y = minify(src_mt->physical_height0, src_level) - src_y - height;
>
> if (dst_flip)
> -  dst_y = dst_mt->level[dst_level].height - dst_y - height;
> +  dst_y = minify(dst_mt->physical_height0, dst_level) - dst_y - height;
>
> int src_pitch = src_mt->region->pitch;
> if (src_flip != dst_flip)
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 522837c..88bab9a 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -866,24 +866,10 @@ intel_miptree_match_image(struct intel_mipmap_tree *mt,
>  * minification.  This will also catch images not present in the
>  * tree, changed targets, etc.
>  */
> -   if (mt->target == GL_TEXTURE_2D_MULTISAMPLE ||
> - mt->target == GL_TEXTURE_2D_MULTISAMPLE_ARRAY) {
> -  /* nonzero level here is always bogus */
> -  assert(level == 0);
> -
> -  if (width != mt->logical_width0 ||
> -height != mt->logical_height0 ||
> -depth != mt->logical_depth0) {
> - return false;
> -  }
> -   }
> -   else {
> -  /* all normal textures, renderbuffers, etc */
> -  if (width != mt->level[level].width |

[Mesa-dev] [PATCH 2/3] i965/fs: Add an optimization pass to remove redundant flags movs.

2014-02-14 Thread Eric Anholt

We generate steaming piles of these for the centroid workaround, and this
quickly cleans them up.

total instructions in shared programs: 1591228 -> 1590047 (-0.07%)
instructions in affected programs: 26111 -> 24930 (-4.52%)
GAINED:0
LOST:  0

(Improved apps are l4d2, csgo, and dolphin)
---
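The pass can be sketched on a simplified instruction array. This is an illustration only: `op_sketch`, `inst_sketch` and the function name are hypothetical, and the real pass walks the visitor's instruction list and unlinks instructions rather than flagging them.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified instruction kinds. */
enum op_sketch {
   OP_OTHER,
   OP_CONTROL_FLOW,
   OP_MOV_DISPATCH_TO_FLAGS,
   OP_WRITES_FLAG
};

struct inst_sketch {
   enum op_sketch op;
   int flag_subreg;   /* 0 or 1 */
   bool removed;      /* set by the pass instead of unlinking */
};

/* Mark every repeated dispatch-to-flags mov as removed.  Returns the
 * number of instructions marked. */
static int
drop_redundant_mov_to_flags_sketch(struct inst_sketch *insts, int count)
{
   bool flag_mov_found[2] = { false, false };
   int removed = 0;

   for (int i = 0; i < count; i++) {
      struct inst_sketch *inst = &insts[i];

      switch (inst->op) {
      case OP_CONTROL_FLOW:
         /* Be conservative across control flow: forget what was seen. */
         memset(flag_mov_found, 0, sizeof(flag_mov_found));
         break;
      case OP_MOV_DISPATCH_TO_FLAGS:
         assert(inst->flag_subreg == 0 || inst->flag_subreg == 1);
         if (!flag_mov_found[inst->flag_subreg]) {
            flag_mov_found[inst->flag_subreg] = true;
         } else {
            inst->removed = true;
            removed++;
         }
         break;
      case OP_WRITES_FLAG:
         /* Another write clobbers the flag, so the next mov is needed. */
         assert(inst->flag_subreg == 0 || inst->flag_subreg == 1);
         flag_mov_found[inst->flag_subreg] = false;
         break;
      default:
         break;
      }
   }
   return removed;
}
```

The tracking is per flag subregister and is reset at control flow and at any other flag write, so a mov is only dropped when an identical one is still live in the same straight-line run.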
 src/mesa/drivers/dri/i965/brw_fs.cpp | 33 +
 src/mesa/drivers/dri/i965/brw_fs.h   |  1 +
 2 files changed, 34 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index d35928e..85d36f3 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3310,6 +3310,37 @@ fs_visitor::calculate_register_pressure()
}
 }
 
+/**
+ * Look for repeated FS_OPCODE_MOV_DISPATCH_TO_FLAGS and drop the later ones.
+ *
+ * The needs_unlit_centroid_workaround ends up producing one of these per
+ * channel of centroid input, so it's good to clean them up.
+ *
+ * An assumption here is that nothing ever modifies the dispatched pixels
+ * value that FS_OPCODE_MOV_DISPATCH_TO_FLAGS reads from, but the hardware
+ * dictates that anyway.
+ */
+void
+fs_visitor::opt_drop_redundant_mov_to_flags()
+{
+   bool flag_mov_found[2] = {false};
+
+   foreach_list_safe(node, &this->instructions) {
+  fs_inst *inst = (fs_inst *)node;
+
+  if (inst->is_control_flow()) {
+ memset(flag_mov_found, 0, sizeof(flag_mov_found));
+  } else if (inst->opcode == FS_OPCODE_MOV_DISPATCH_TO_FLAGS) {
+ if (!flag_mov_found[inst->flag_subreg])
+flag_mov_found[inst->flag_subreg] = true;
+ else
+inst->remove();
+  } else if (inst->writes_flag()) {
+ flag_mov_found[inst->flag_subreg] = false;
+  }
+   }
+}
+
 bool
 fs_visitor::run()
 {
@@ -3376,6 +3407,8 @@ fs_visitor::run()
   remove_dead_constants();
   setup_pull_constants();
 
+  opt_drop_redundant_mov_to_flags();
+
   bool progress;
   do {
 progress = false;
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index c6f4ffb..2538983 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -352,6 +352,7 @@ public:
bool try_constant_propagate(fs_inst *inst, acp_entry *entry);
bool opt_copy_propagate_local(void *mem_ctx, bblock_t *block,
  exec_list *acp);
+   void opt_drop_redundant_mov_to_flags();
bool register_coalesce();
bool compute_to_mrf();
bool dead_code_eliminate();
-- 
1.9.rc1
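
The pass in this patch is a single forward scan with one "already seen" bit per flag subregister. A minimal sketch of the same bookkeeping on a plain array, assuming a simplified instruction record; the sketch_* types and names are hypothetical and stand in for fs_inst and the instruction list:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified instruction record; not the i965 fs_inst class. */
enum sketch_kind {
    SK_OTHER,          /* does not touch the flag register */
    SK_CONTROL_FLOW,   /* branch or loop boundary: forget what we know */
    SK_MOV_TO_FLAGS,   /* copies the dispatch mask into a flag subregister */
    SK_WRITES_FLAG     /* any other instruction that overwrites the flag */
};

struct sketch_inst {
    enum sketch_kind kind;
    int flag_subreg;   /* 0 or 1 */
    bool removed;      /* set by the pass instead of unlinking from a list */
};

/* Marks later duplicates of SK_MOV_TO_FLAGS as removed and returns how many
 * instructions were dropped. */
static size_t sketch_drop_redundant_movs(struct sketch_inst *insts, size_t n)
{
    bool mov_found[2] = { false, false };
    size_t dropped = 0;

    for (size_t i = 0; i < n; i++) {
        struct sketch_inst *inst = &insts[i];

        switch (inst->kind) {
        case SK_CONTROL_FLOW:
            /* Knowledge does not carry across control flow. */
            memset(mov_found, 0, sizeof(mov_found));
            break;
        case SK_MOV_TO_FLAGS:
            if (!mov_found[inst->flag_subreg]) {
                mov_found[inst->flag_subreg] = true;
            } else {
                inst->removed = true;
                dropped++;
            }
            break;
        case SK_WRITES_FLAG:
            /* The flag no longer holds the dispatch mask. */
            mov_found[inst->flag_subreg] = false;
            break;
        case SK_OTHER:
            break;
        }
    }
    return dropped;
}
```

Resetting at every control-flow instruction is conservative: it gives up some removals in exchange for not needing any dataflow analysis across blocks.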

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/3] i965/vs: Use samplers for UBOs in the VS like we do for non-UBO pulls.

2014-02-14 Thread Eric Anholt

Improves performance of a dolphin emulator trace I had lying around by
3.60131% +/- 0.995887% (n=128).
---
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 3bdb242..0de53ec 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1582,14 +1582,27 @@ vec4_visitor::visit(ir_expression *ir)
  emit(SHR(dst_reg(offset), op[1], src_reg(4)));
   }
 
-  vec4_instruction *pull =
+  if (brw->gen >= 7) {
+ dst_reg grf_offset = dst_reg(this, glsl_type::int_type);
+ grf_offset.type = offset.type;
+
+ emit(MOV(grf_offset, offset));
+
  emit(new(mem_ctx) vec4_instruction(this,
-VS_OPCODE_PULL_CONSTANT_LOAD,
+VS_OPCODE_PULL_CONSTANT_LOAD_GEN7,
 dst_reg(packed_consts),
 surf_index,
-offset));
-  pull->base_mrf = 14;
-  pull->mlen = 1;
+src_reg(grf_offset)));
+  } else {
+ vec4_instruction *pull =
+emit(new(mem_ctx) vec4_instruction(this,
+   VS_OPCODE_PULL_CONSTANT_LOAD,
+   dst_reg(packed_consts),
+   surf_index,
+   offset));
+ pull->base_mrf = 14;
+ pull->mlen = 1;
+  }
 
   packed_consts.swizzle = swizzle_for_size(ir->type->vector_elements);
   packed_consts.swizzle += BRW_SWIZZLE4(const_offset % 16 / 4,
-- 
1.9.rc1

[Mesa-dev] [PATCH 1/3] i965/fs: Drop dead comment about the old proj_attrib_mask optimization.

2014-02-14 Thread Eric Anholt

The code was removed early last year.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 6 --
 1 file changed, 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 4f5558b..d35928e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1108,12 +1108,6 @@ fs_visitor::emit_general_interpolation(ir_variable *ir)
 } else {
/* Smooth/noperspective interpolation case. */
for (unsigned int k = 0; k < type->vector_elements; k++) {
-  /* FINISHME: At some point we probably want to push
-   * this farther by giving similar treatment to the
-   * other potentially constant components of the
-   * attribute, as well as making brw_vs_constval.c
-   * handle varyings other than gl_TexCoord.
-   */
struct brw_reg interp = interp_reg(location, k);
emit_linterp(attr, fs_reg(interp), interpolation_mode,
 ir->data.centroid && !c->key.persample_shading,
-- 
1.9.rc1

[Mesa-dev] [PATCH] mesa: Fix error code generation in glReadPixels()

2014-02-14 Thread Anuj Phogat

Section 4.3.1, page 220, of the OpenGL 3.3 specification explains
the error conditions for glReadPixels():

   "If the format is DEPTH_STENCIL, then values are taken from
both the depth buffer and the stencil buffer. If there is
no depth buffer or if there is no stencil buffer, then the
error INVALID_OPERATION occurs. If the type parameter is
not UNSIGNED_INT_24_8 or FLOAT_32_UNSIGNED_INT_24_8_REV,
then the error INVALID_ENUM occurs."

Fixes failing Khronos CTS test packed_depth_stencil_error.test

Cc: 
Signed-off-by: Anuj Phogat 
---
 src/mesa/main/glformats.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index 77cf263..b797900 100644
--- a/src/mesa/main/glformats.c
+++ b/src/mesa/main/glformats.c
@@ -1257,6 +1257,9 @@ _mesa_error_check_format_and_type(const struct gl_context *ctx,
   ctx->Extensions.ARB_texture_rgb10_a2ui) {
  break; /* OK */
   }
+  if (format == GL_DEPTH_STENCIL && _mesa_is_desktop_gl(ctx)) {
+ return GL_INVALID_ENUM;
+  }
   return GL_INVALID_OPERATION;
 
case GL_UNSIGNED_SHORT_4_4_4_4:
@@ -1280,6 +1283,9 @@ _mesa_error_check_format_and_type(const struct gl_context *ctx,
   ctx->API == API_OPENGLES2) {
  break; /* OK by GL_EXT_texture_type_2_10_10_10_REV */
   }
+  if (format == GL_DEPTH_STENCIL && _mesa_is_desktop_gl(ctx)) {
+ return GL_INVALID_ENUM;
+  }
   return GL_INVALID_OPERATION;
 
case GL_UNSIGNED_INT_24_8:
@@ -1298,7 +1304,8 @@ _mesa_error_check_format_and_type(const struct gl_context *ctx,
   return GL_NO_ERROR;
 
case GL_UNSIGNED_INT_10F_11F_11F_REV:
-  if (!ctx->Extensions.EXT_packed_float) {
+  if (!ctx->Extensions.EXT_packed_float ||
+  (format == GL_DEPTH_STENCIL && _mesa_is_desktop_gl(ctx))) {
  return GL_INVALID_ENUM;
   }
   if (format != GL_RGB) {
-- 
1.8.3.1
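
The rule quoted from the spec can be written as a small decision function. This is only a sketch of the quoted text, not of the Mesa code path: the sketch_* enums are hypothetical stand-ins for the GL constants, and the order in which the two error checks run is an assumption, since the quoted paragraph does not say which error wins when both conditions hold.

```c
/* Stand-in constants for this sketch; real code would use the GL headers. */
enum sketch_format { SK_FMT_RGBA, SK_FMT_DEPTH_STENCIL };

enum sketch_type {
    SK_TYPE_UNSIGNED_BYTE,
    SK_TYPE_UNSIGNED_INT_24_8,
    SK_TYPE_FLOAT_32_UNSIGNED_INT_24_8_REV
};

enum sketch_error { SK_NO_ERROR, SK_INVALID_ENUM, SK_INVALID_OPERATION };

/* Error selection for a DEPTH_STENCIL read, following the quoted paragraph. */
static enum sketch_error
sketch_check_depth_stencil_read(enum sketch_format format,
                                enum sketch_type type,
                                int has_depth, int has_stencil)
{
    if (format != SK_FMT_DEPTH_STENCIL)
        return SK_NO_ERROR;            /* other formats are outside this sketch */

    /* A missing depth or stencil buffer is an operation error. */
    if (!has_depth || !has_stencil)
        return SK_INVALID_OPERATION;

    /* Only the two packed depth/stencil types are acceptable. */
    if (type != SK_TYPE_UNSIGNED_INT_24_8 &&
        type != SK_TYPE_FLOAT_32_UNSIGNED_INT_24_8_REV)
        return SK_INVALID_ENUM;

    return SK_NO_ERROR;
}
```

The patch itself only concerns the type check: it makes the packed-type cases return INVALID_ENUM for DEPTH_STENCIL on desktop GL instead of falling through to INVALID_OPERATION.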

[Mesa-dev] [PATCH 3/3] gallivm: optimize repeat linear npot code in the aos int path

2014-02-14 Thread sroland

From: Jeff Muizelaar 

Similar to the other cases, shift some weight/coord calculations to int
space. This should be slightly faster (on x86 sse it should actually save one
instruction, and generally int instructions are cheaper).
---
 src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c |   74 +
 1 file changed, 62 insertions(+), 12 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
index 03a2ed5..e9f8611 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
@@ -194,6 +194,62 @@ lp_build_sample_wrap_nearest_float(struct lp_build_sample_context *bld,
 
 
 /**
+ * Helper to compute the first coord and the weight for
+ * linear wrap repeat npot textures
+ */
+static void
+lp_build_coord_repeat_npot_linear_int(struct lp_build_sample_context *bld,
+  LLVMValueRef coord_f,
+  LLVMValueRef length_i,
+  LLVMValueRef length_f,
+  LLVMValueRef *coord0_i,
+  LLVMValueRef *weight_i)
+{
+   struct lp_build_context *coord_bld = &bld->coord_bld;
+   struct lp_build_context *int_coord_bld = &bld->int_coord_bld;
+   struct lp_build_context abs_coord_bld;
+   struct lp_type abs_type;
+   LLVMValueRef length_minus_one = lp_build_sub(int_coord_bld, length_i,
+int_coord_bld->one);
+   LLVMValueRef mask, i32_c8, i32_c128, i32_c255;
+
+   /* wrap with normalized floats is just fract */
+   coord_f = lp_build_fract(coord_bld, coord_f);
+   /* mul by size */
+   coord_f = lp_build_mul(coord_bld, coord_f, length_f);
+   /* convert to int, compute lerp weight */
+   coord_f = lp_build_mul_imm(&bld->coord_bld, coord_f, 256);
+
+   /* At this point we don't have any negative numbers so use non-signed
+* build context which might help on some archs.
+*/
+   abs_type = coord_bld->type;
+   abs_type.sign = 0;
+   lp_build_context_init(&abs_coord_bld, bld->gallivm, abs_type);
+   *coord0_i = lp_build_iround(&abs_coord_bld, coord_f);
+
+   /* subtract 0.5 (add -128) */
+   i32_c128 = lp_build_const_int_vec(bld->gallivm, bld->int_coord_type, -128);
+   *coord0_i = LLVMBuildAdd(bld->gallivm->builder, *coord0_i, i32_c128, "");
+
+   /* compute fractional part (AND with 0xff) */
+   i32_c255 = lp_build_const_int_vec(bld->gallivm, bld->int_coord_type, 255);
+   *weight_i = LLVMBuildAnd(bld->gallivm->builder, *coord0_i, i32_c255, "");
+
+   /* compute floor (shift right 8) */
+   i32_c8 = lp_build_const_int_vec(bld->gallivm, bld->int_coord_type, 8);
+   *coord0_i = LLVMBuildAShr(bld->gallivm->builder, *coord0_i, i32_c8, "");
+   /*
+* we avoided the 0.5/length division before the repeat wrap,
+* now need to fix up edge cases with selects
+*/
+   mask = lp_build_compare(int_coord_bld->gallivm, int_coord_bld->type,
+   PIPE_FUNC_LESS, *coord0_i, int_coord_bld->zero);
+   *coord0_i = lp_build_select(int_coord_bld, mask, length_minus_one, *coord0_i);
+}
+
+
+/**
  * Build LLVM code for texture coord wrapping, for linear filtering,
  * for scaled integer texcoords.
  * \param block_length  is the length of the pixel block along the
@@ -251,24 +307,21 @@ lp_build_sample_wrap_linear_int(struct lp_build_sample_context *bld,
  }
  else {
 LLVMValueRef mask;
-LLVMValueRef weight;
 LLVMValueRef length_f = lp_build_int_to_float(&bld->coord_bld, length);
 if (offset) {
offset = lp_build_int_to_float(&bld->coord_bld, offset);
offset = lp_build_div(&bld->coord_bld, offset, length_f);
coord_f = lp_build_add(&bld->coord_bld, coord_f, offset);
 }
-lp_build_coord_repeat_npot_linear(bld, coord_f,
-  length, length_f,
-  &coord0, &weight);
+lp_build_coord_repeat_npot_linear_int(bld, coord_f,
+  length, length_f,
+  &coord0, weight_i);
 mask = lp_build_compare(bld->gallivm, int_coord_bld->type,
 PIPE_FUNC_NOTEQUAL, coord0, length_minus_one);
 coord1 = LLVMBuildAnd(builder,
   lp_build_add(int_coord_bld, coord0,
int_coord_bld->one),
   mask, "");
-weight = lp_build_mul_imm(&bld->coord_bld, weight, 256);
-*weight_i = lp_build_itrunc(&bld->coord_bld, weight);
  }
  break;
 
@@ -308,18 +361,15 @@ lp_build_sample_wrap_linear_int(struct lp_build_sample_context *bld,
  coord0 = LLVMBuildAnd(builder, c
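
The helper added by this patch works on whole vectors through the gallivm builders, but the per-lane arithmetic can be followed with a scalar sketch. This is an illustration under assumptions, not the generated code: the sketch_* names are hypothetical, the 8 fractional bits mirror the constants in the patch, and floor and rounding are spelled out by hand so the block needs no math library.

```c
/* Fractional part in [0, 1], written without libm. Assumes x fits in int. */
static float sketch_fract(float x)
{
    float t = (float)(int)x;     /* conversion truncates toward zero */
    if (t > x)
        t -= 1.0f;               /* negative non-integer input: step down to floor */
    return x - t;
}

/* Scalar version of the repeat-wrap, linear-filter, NPOT coordinate setup:
 * returns the first texel index and an 8-bit lerp weight. */
static void sketch_repeat_npot_linear(float coord, int length,
                                      int *coord0, int *weight)
{
    /* Wrap with normalized floats is just fract; then scale to texels and
     * to 8 fractional bits. */
    float scaled = sketch_fract(coord) * (float)length * 256.0f;

    /* Round to nearest; scaled is never negative here. */
    int fixed = (int)(scaled + 0.5f);

    /* Subtract half a texel (128 in 24.8 fixed point). */
    fixed -= 128;

    /* Floor division by 256 that also works for the negative case. */
    int ipart = fixed >= 0 ? fixed / 256 : -((-fixed + 255) / 256);

    /* Fractional part, always in [0, 255]. */
    *weight = fixed - ipart * 256;

    /* Only -1 can occur on the low side; it wraps to the last texel. */
    *coord0 = ipart < 0 ? length - 1 : ipart;
}
```

With a 100-texel dimension, a coordinate of 0.001 lands 0.4 texels before the centre of texel 0, so the first texel wraps to 99 with a weight of about 0.6 (154/256).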

[Mesa-dev] [PATCH 1/3] gallivm: use correct rounding for linear wrap mode (in the aos int path)

2014-02-14 Thread sroland

From: Jeff Muizelaar 

The previous method for converting coords to ints was slightly inaccurate
(effectively losing 1 bit from the 8-bit lerp weight). This is probably
especially noticeable when trying to draw a pixel-aligned texture.
As an example, for a 100x100 texture after denormalization the texture
coords in this case would turn up as
0.5, 1.5, 2.5, 3.5, 4.5, ...
After the mul by 256, conversion to int and 128 subtraction, they end up as
0, 256, 512, 768, 1024, ...
which gets us the correct coords/weights of
0/0, 1/0, 2/0, 3/0, 4/0, ...
But even LSB errors (which are unavoidable) in the input coords may cause
these coords/weights to be wrong, e.g. for a coord just below 3.5 (such as
3.4999) we'd get a coord/weight of 2/255 instead of 3/0.

Fix this by using round-to-nearest int instead of FPToSI (trunc). Should be
equally fast on x86 sse, though other archs probably suffer a little.
---
 src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c |   14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
index c35b628..1d87ee8 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
@@ -987,7 +987,6 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
const unsigned dims = bld->dims;
LLVMBuilderRef builder = bld->gallivm->builder;
struct lp_build_context i32;
-   LLVMTypeRef i32_vec_type;
LLVMValueRef i32_c8, i32_c128, i32_c255;
LLVMValueRef width_vec, height_vec, depth_vec;
LLVMValueRef s_ipart, s_fpart, s_float;
@@ -1003,8 +1002,6 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
 
lp_build_context_init(&i32, bld->gallivm, lp_type_int_vec(32, bld->vector_width));
 
-   i32_vec_type = lp_build_vec_type(bld->gallivm, i32.type);
-
lp_build_extract_image_sizes(bld,
 &bld->int_size_bld,
 bld->int_coord_type,
@@ -1036,11 +1033,16 @@ lp_build_sample_image_linear(struct lp_build_sample_context *bld,
}
 
/* convert float to int */
-   s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
+   /* For correct rounding, need round to nearest, not truncation here.
+* Note that in some cases (clamp to edge, no texel offsets) we
+* could use a non-signed build context which would help archs which
+* don't have fptosi intrinsic with nearest rounding implemented.
+*/
+   s_ipart = lp_build_iround(&bld->coord_bld, s);
if (dims >= 2)
-  t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
+  t_ipart = lp_build_iround(&bld->coord_bld, t);
if (dims >= 3)
-  r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
+  r_ipart = lp_build_iround(&bld->coord_bld, r);
 
/* subtract 0.5 (add -128) */
i32_c128 = lp_build_const_int_vec(bld->gallivm, i32.type, -128);
-- 
1.7.9.5
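
The arithmetic in the commit message can be checked with a scalar sketch that compares the two conversions on an unnormalized coordinate. It assumes non-negative inputs and 8 fractional bits, as in the patch; the sketch_* names are hypothetical, and rounding is written as "add 0.5 and truncate" only so the block stays free of libm.

```c
/* Split a 24.8 fixed-point value into texel index and 8-bit weight, using a
 * floor division that is also correct for negative values. */
static void sketch_split_fixed8(int fixed, int *ipart, int *weight)
{
    int q = fixed >= 0 ? fixed / 256 : -((-fixed + 255) / 256);
    *ipart = q;
    *weight = fixed - q * 256;
}

/* Old behaviour: truncate when converting to int (what FPToSI does). */
static void sketch_linear_trunc(float coord, int *ipart, int *weight)
{
    int fixed = (int)(coord * 256.0f) - 128;   /* -128 subtracts half a texel */
    sketch_split_fixed8(fixed, ipart, weight);
}

/* New behaviour: round to nearest. Assumes coord >= 0. */
static void sketch_linear_round(float coord, int *ipart, int *weight)
{
    int fixed = (int)(coord * 256.0f + 0.5f) - 128;
    sketch_split_fixed8(fixed, ipart, weight);
}
```

For an exact 3.5 both variants give texel 3 with weight 0. For 3.4999, truncation gives texel 2 with weight 255 while rounding still gives texel 3 with weight 0, which is the one-LSB sensitivity the commit message describes.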

[Mesa-dev] [PATCH 2/3] gallivm: use correct rounding for nearest wrap mode (in the aos int path)

2014-02-14 Thread sroland

From: Roland Scheidegger 

The previous code used coords which were calculated as
(int) (f_coord * tex_size * 256) >> 8.
This is not only unnecessarily complex but can give the wrong texel due to
rounding (as that uses truncation, not round to nearest) if the pixel is less
than 1/256 away from the center between two texels.
Instead, just use correct round-to-nearest int, dropping the shift stuff.
(As for performance, this should always be a win on x86 sse2, though other
archs not implementing arch rounding intrinsics may suffer slightly.)
---
 src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c |   38 +
 1 file changed, 9 insertions(+), 29 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
index 1d87ee8..03a2ed5 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
@@ -567,10 +567,7 @@ lp_build_sample_image_nearest(struct lp_build_sample_context *bld,
   LLVMValueRef *colors)
 {
const unsigned dims = bld->dims;
-   LLVMBuilderRef builder = bld->gallivm->builder;
struct lp_build_context i32;
-   LLVMTypeRef i32_vec_type;
-   LLVMValueRef i32_c8;
LLVMValueRef width_vec, height_vec, depth_vec;
LLVMValueRef s_ipart, t_ipart = NULL, r_ipart = NULL;
LLVMValueRef s_float, t_float = NULL, r_float = NULL;
@@ -580,8 +577,6 @@ lp_build_sample_image_nearest(struct lp_build_sample_context *bld,
 
lp_build_context_init(&i32, bld->gallivm, lp_type_int_vec(32, bld->vector_width));
 
-   i32_vec_type = lp_build_vec_type(bld->gallivm, i32.type);
-
lp_build_extract_image_sizes(bld,
 &bld->int_size_bld,
 bld->int_coord_type,
@@ -593,39 +588,24 @@ lp_build_sample_image_nearest(struct lp_build_sample_context *bld,
s_float = s; t_float = t; r_float = r;
 
if (bld->static_sampler_state->normalized_coords) {
-  LLVMValueRef scaled_size;
   LLVMValueRef flt_size;
 
-  /* scale size by 256 (8 fractional bits) */
-  scaled_size = lp_build_shl_imm(&bld->int_size_bld, int_size, 8);
-
-  flt_size = lp_build_int_to_float(&bld->float_size_bld, scaled_size);
+  flt_size = lp_build_int_to_float(&bld->float_size_bld, int_size);
 
   lp_build_unnormalized_coords(bld, flt_size, &s, &t, &r);
}
-   else {
-  /* scale coords by 256 (8 fractional bits) */
-  s = lp_build_mul_imm(&bld->coord_bld, s, 256);
-  if (dims >= 2)
- t = lp_build_mul_imm(&bld->coord_bld, t, 256);
-  if (dims >= 3)
- r = lp_build_mul_imm(&bld->coord_bld, r, 256);
-   }
 
/* convert float to int */
-   s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
-   if (dims >= 2)
-  t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
-   if (dims >= 3)
-  r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
-
-   /* compute floor (shift right 8) */
-   i32_c8 = lp_build_const_int_vec(bld->gallivm, i32.type, 8);
-   s_ipart = LLVMBuildAShr(builder, s, i32_c8, "");
+   /* For correct rounding, need round to nearest, not truncation here.
+* Note that in some cases (clamp to edge, no texel offsets) we
+* could use a non-signed build context which would help archs which
+* don't have fptosi intrinsic with nearest rounding implemented.
+*/
+   s_ipart = lp_build_iround(&bld->coord_bld, s);
if (dims >= 2)
-  t_ipart = LLVMBuildAShr(builder, t, i32_c8, "");
+  t_ipart = lp_build_iround(&bld->coord_bld, t);
if (dims >= 3)
-  r_ipart = LLVMBuildAShr(builder, r, i32_c8, "");
+  r_ipart = lp_build_iround(&bld->coord_bld, r);
 
/* add texel offsets */
if (offsets[0]) {
-- 
1.7.9.5

Re: [Mesa-dev] [PATCH 1/8] meta: Fix blit shader compile on non-glsl-130 drivers.

2014-02-14 Thread Kenneth Graunke

On 02/14/2014 03:00 PM, Eric Anholt wrote:
> Compare this VS to the one for the post-130 case.  Fixes piglit
> glsl-lod-bias, and presumably tons of other code (I haven't done a full
> piglit run on swrast).
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74911
> ---
>  src/mesa/drivers/common/meta.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> index d3ca3b7..dd905dd 100644
> --- a/src/mesa/drivers/common/meta.c
> +++ b/src/mesa/drivers/common/meta.c
> @@ -193,7 +193,7 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
> || ctx->Const.GLSLVersion < 130) {
>vs_source =
>   "attribute vec2 position;\n"
> - "attribute vec3 textureCoords;\n"
> + "attribute vec4 textureCoords;\n"
>   "varying vec4 texCoords;\n"
>   "void main()\n"
>   "{\n"
> 

This is obviously:
Reviewed-by: Kenneth Graunke 

But I wonder, would it be terribly harmful to just override
ctx->Const.GLSLVersion to 130 in Meta so #version 130 works?

Sure, you could get into trouble if you tried to use things like
ClipDistance and they weren't supported, but I don't see us needing that.

We would need integer, but I don't know of any drivers that allow you to
make integer textures that can't handle integers.  (Gen4-5 expose
EXT_texture_integer without GLSL 1.30, but they can do GLSL 1.30...we
just never finished advertising it...)

Just an idea; I'm not suggesting altering any of these patches.

Re: [Mesa-dev] [PATCH 3/3] gallivm: optimize repeat linear npot code in the aos int path

2014-02-14 Thread Roland Scheidegger

FWIW I've just cleaned 1/3 and 3/3 up a little and split them into
two patches (I really want to be able to track any changes this might
cause separately), and on x86 sse I actually managed to shave off one
instruction by using lp_build_iround() too :-).
2/3 is more of the same, just for the nearest filtering path.
I haven't actually tested any of it yet, but the issue indeed looks very
real to me. I really need to run some internal tests with this (piglit is
usually not close to sensitive enough); the whole texture wrap mode stuff
is a bit of a nightmare, as entirely different paths will be run depending
on cpu flags AND texture format, which makes bugs in there difficult to
detect. At some point I wanted to unify the coord wrapping in the aos and
soa paths, since this doesn't really depend on whether aos or soa filtering
is used, though there are indeed some dependencies if you want to get
optimal code.

Roland


Am 15.02.2014 01:54, schrieb srol...@vmware.com:
> From: Jeff Muizelaar 
> 
> Similar to the other cases, shift some weight/coord calculations to int
> space. This should be slightly faster (on x86 sse it should actually save one
> instruction, and generally int instructions are cheaper).
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c |   74 +
>  1 file changed, 62 insertions(+), 12 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c b/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
> index 03a2ed5..e9f8611 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
> @@ -194,6 +194,62 @@ lp_build_sample_wrap_nearest_float(struct lp_build_sample_context *bld,
>  
>  
>  /**
> + * Helper to compute the first coord and the weight for
> + * linear wrap repeat npot textures
> + */
> +static void
> +lp_build_coord_repeat_npot_linear_int(struct lp_build_sample_context *bld,
> +  LLVMValueRef coord_f,
> +  LLVMValueRef length_i,
> +  LLVMValueRef length_f,
> +  LLVMValueRef *coord0_i,
> +  LLVMValueRef *weight_i)
> +{
> +   struct lp_build_context *coord_bld = &bld->coord_bld;
> +   struct lp_build_context *int_coord_bld = &bld->int_coord_bld;
> +   struct lp_build_context abs_coord_bld;
> +   struct lp_type abs_type;
> +   LLVMValueRef length_minus_one = lp_build_sub(int_coord_bld, length_i,
> +int_coord_bld->one);
> +   LLVMValueRef mask, i32_c8, i32_c128, i32_c255;
> +
> +   /* wrap with normalized floats is just fract */
> +   coord_f = lp_build_fract(coord_bld, coord_f);
> +   /* mul by size */
> +   coord_f = lp_build_mul(coord_bld, coord_f, length_f);
> +   /* convert to int, compute lerp weight */
> +   coord_f = lp_build_mul_imm(&bld->coord_bld, coord_f, 256);
> +
> +   /* At this point we don't have any negative numbers so use non-signed
> +* build context which might help on some archs.
> +*/
> +   abs_type = coord_bld->type;
> +   abs_type.sign = 0;
> +   lp_build_context_init(&abs_coord_bld, bld->gallivm, abs_type);
> +   *coord0_i = lp_build_iround(&abs_coord_bld, coord_f);
> +
> +   /* subtract 0.5 (add -128) */
> +   i32_c128 = lp_build_const_int_vec(bld->gallivm, bld->int_coord_type, -128);
> +   *coord0_i = LLVMBuildAdd(bld->gallivm->builder, *coord0_i, i32_c128, "");
> +
> +   /* compute fractional part (AND with 0xff) */
> +   i32_c255 = lp_build_const_int_vec(bld->gallivm, bld->int_coord_type, 255);
> +   *weight_i = LLVMBuildAnd(bld->gallivm->builder, *coord0_i, i32_c255, "");
> +
> +   /* compute floor (shift right 8) */
> +   i32_c8 = lp_build_const_int_vec(bld->gallivm, bld->int_coord_type, 8);
> +   *coord0_i = LLVMBuildAShr(bld->gallivm->builder, *coord0_i, i32_c8, "");
> +   /*
> +* we avoided the 0.5/length division before the repeat wrap,
> +* now need to fix up edge cases with selects
> +*/
> +   mask = lp_build_compare(int_coord_bld->gallivm, int_coord_bld->type,
> +   PIPE_FUNC_LESS, *coord0_i, int_coord_bld->zero);
> +   *coord0_i = lp_build_select(int_coord_bld, mask, length_minus_one, *coord0_i);
> +}
> +
> +
> +/**
>   * Build LLVM code for texture coord wrapping, for linear filtering,
>   * for scaled integer texcoords.
>   * \param block_length  is the length of the pixel block along the
> @@ -251,24 +307,21 @@ lp_build_sample_wrap_linear_int(struct lp_build_sample_context *bld,
>   }
>   else {
>  LLVMValueRef mask;
> -LLVMValueRef weight;
>  LLVMValueRef length_f = lp_build_int_to_float(&bld->coord_bld, length);
>  if (offset) {
> offset = lp_build_int_to_float(&bld->coord_bld, offset);
> offset = lp_build_div(&bld->coord_bld, offset

Re: [Mesa-dev] [PATCH 5/8] i965: Simplify intel_miptree_updownsample.

2014-02-14 Thread Kenneth Graunke

On 02/14/2014 03:00 PM, Eric Anholt wrote:
> Pretty silly to pass in values dereferenced out of one of the arguments.
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 35 +--
>  1 file changed, 11 insertions(+), 24 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 08b8475..86a3887 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -1619,32 +1619,25 @@ intel_offset_S8(uint32_t stride, uint32_t x, uint32_t y, bool swizzled)
>  static void
>  intel_miptree_updownsample(struct brw_context *brw,
> struct intel_mipmap_tree *src,
> -   struct intel_mipmap_tree *dst,
> -   unsigned width,
> -   unsigned height)
> +   struct intel_mipmap_tree *dst)
>  {
> -   int src_x0 = 0;
> -   int src_y0 = 0;
> -   int dst_x0 = 0;
> -   int dst_y0 = 0;
> -

It would be great to add a comment here stating that the use of src's
logical dimensions in both cases is intentional.  Otherwise, it looks
like a cut and paste bug to the casual observer.

> brw_blorp_blit_miptrees(brw,
> src, 0 /* level */, 0 /* layer */,
> dst, 0 /* level */, 0 /* layer */,
> -   src_x0, src_y0,
> -   width, height,
> -   dst_x0, dst_y0,
> -   width, height,
> +   0, 0,
> +   src->logical_width0, src->logical_height0,
> +   0, 0,
> +   src->logical_width0, src->logical_height0,
> GL_NEAREST, false, false /*mirror x, y*/);
>  
> if (src->stencil_mt) {
>brw_blorp_blit_miptrees(brw,
>src->stencil_mt, 0 /* level */, 0 /* layer */,
>dst->stencil_mt, 0 /* level */, 0 /* layer */,
> -  src_x0, src_y0,
> -  width, height,
> -  dst_x0, dst_y0,
> -  width, height,
> +  0, 0,
> +  src->logical_width0, src->logical_height0,
> +  0, 0,
> +  src->logical_width0, src->logical_height0,
>GL_NEAREST, false, false /*mirror x, y*/);
> }
>  }
> @@ -1672,10 +1665,7 @@ intel_miptree_downsample(struct brw_context *brw,
>  
> if (!mt->need_downsample)
>return;
> -   intel_miptree_updownsample(brw,
> -  mt, mt->singlesample_mt,
> -  mt->logical_width0,
> -  mt->logical_height0);
> +   intel_miptree_updownsample(brw, mt, mt->singlesample_mt);
> mt->need_downsample = false;
>  }
>  
> @@ -1692,10 +1682,7 @@ intel_miptree_upsample(struct brw_context *brw,
> assert_is_flat(mt);
> assert(!mt->need_downsample);
>  
> -   intel_miptree_updownsample(brw,
> -  mt->singlesample_mt, mt,
> -  mt->logical_width0,
> -  mt->logical_height0);
> +   intel_miptree_updownsample(brw, mt->singlesample_mt, mt);
>  }
>  
>  void *
> 




[Mesa-dev] [PATCH] i965/fs: Use conditional sends to do FB writes on HSW+.

2014-02-14 Thread Eric Anholt

This drops the MOVs for header setup, which are totally mis-scheduled.

total instructions in shared programs: 1590047 -> 1589331 (-0.05%)
instructions in affected programs: 43729 -> 43013 (-1.64%)
GAINED:0
LOST:  0

glb27-trex:
x before
+ after
[ministat ASCII histogram of the before/after samples]
    N           Min           Max        Median           Avg        Stddev
x  49         62.33         65.41         63.49      63.53449    0.62757822
+  50         62.28          65.4          63.7       63.6982      0.656564
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c |  2 --
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp  | 22 -
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp| 14 -
 src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 26 +++--
 4 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 8ab043f..5360b56 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2241,8 +2241,6 @@ void brw_fb_WRITE(struct brw_compile *p,
} else {
   insn = next_insn(p, BRW_OPCODE_SEND);
}
-   /* The execution mask is ignored for render target writes. */
-   insn->header.predicate_control = 0;
insn->header.compression_control = BRW_COMPRESSION_NONE;
 
if (brw->gen >= 6) {
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 00f19dc..ee13ced 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -114,18 +114,22 @@ fs_generator::generate_fb_write(fs_inst *inst)
brw_set_mask_control(p, BRW_MASK_DISABLE);
brw_set_compression_control(p, BRW_COMPRESSION_NONE);
 
-   if ((fp && fp->UsesKill) || c->key.alpha_test_func) {
-  struct brw_reg pixel_mask;
+   if (inst->header_present) {
+  /* On HSW, the GPU will use the predicate on SENDC, unless the header is
+   * present.
+   */
+  if (!brw->is_haswell && ((fp && fp->UsesKill) ||
+   c->key.alpha_test_func)) {
+ struct brw_reg pixel_mask;
 
-  if (brw->gen >= 6)
- pixel_mask = retype(brw_vec1_grf(1, 7), BRW_REGISTER_TYPE_UW);
-  else
- pixel_mask = retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_UW);
+ if (brw->gen >= 6)
+pixel_mask = retype(brw_vec1_grf(1, 7), BRW_REGISTER_TYPE_UW);
+ else
+pixel_mask = retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_UW);
 
-  brw_MOV(p, pixel_mask, brw_flag_reg(0, 1));
-   }
+ brw_MOV(p, pixel_mask, brw_flag_reg(0, 1));
+  }
 
-   if (inst->header_present) {
   if (brw->gen >= 6) {
 brw_set_compression_control(p, BRW_COMPRESSION_COMPRESSED);
 brw_MOV(p,
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 45b053d..70b7c66 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -2743,7 +2743,7 @@ fs_visitor::emit_fb_writes()
 *  thread message and on all dual-source messages."
 */
if (brw->gen >= 6 &&
-   !this->fp->UsesKill &&
+   (brw->is_haswell || brw->gen >= 8 || !this->fp->UsesKill) &&
!do_dual_src &&
c->key.nr_color_regions == 1) {
   header_present = false;
@@ -2840,6 +2840,10 @@ fs_visitor::emit_fb_writes()
   inst->mlen = nr - base_mrf;
   inst->eot = true;
   inst->header_present = header_present;
+  if ((brw->gen >= 8 || brw->is_haswell) && fp->UsesKill) {
+ inst->predicate = BRW_PREDICATE_NORMAL;
+ inst->flag_subreg = 1;
+  }
 
   c->prog_data.dual_src_blend = true;
   this->current_annotation = NULL;
@@ -2885,6 +2889,10 @@ fs_visitor::emit_fb_writes()
  inst->mlen = nr - base_mrf;
   inst->eot = eot;
   inst->header_present = header_present;
+  if ((brw->gen >= 8 || brw->is_haswell) && fp->UsesKill) {
+ inst->predicate = BRW_PREDICATE_NORMAL;
+ inst->flag_subreg = 1;
+  }
}
 
if (c->key.nr_color_regions == 0) {
@@ -2902,6 +2910,10 @@ fs_visitor::emit_fb_writes()
   inst->mlen = nr - base_mrf;
   inst->eot = true;
   inst->header_present = header_present;
+  if ((brw->gen >= 8 || brw->is_haswell) && fp->UsesKill) {
+ inst->predicate = BR

[Mesa-dev] [PATCH] mesa: Add GL_TEXTURE_CUBE_MAP_ARRAY to legal_get_tex_level_parameter_target()

2014-02-14 Thread Anuj Phogat

Fixes failing Khronos CTS test packed_depth_stencil_init.test

Cc: 
Signed-off-by: Anuj Phogat 
---
 src/mesa/main/texparam.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/texparam.c b/src/mesa/main/texparam.c
index b7ed50d..bbdbc27 100644
--- a/src/mesa/main/texparam.c
+++ b/src/mesa/main/texparam.c
@@ -986,6 +986,9 @@ legal_get_tex_level_parameter_target(struct gl_context 
*ctx, GLenum target)
case GL_TEXTURE_CUBE_MAP_NEGATIVE_Z_ARB:
case GL_PROXY_TEXTURE_CUBE_MAP_ARB:
   return ctx->Extensions.ARB_texture_cube_map;
+   case GL_TEXTURE_CUBE_MAP_ARRAY_ARB:
+   case GL_PROXY_TEXTURE_CUBE_MAP_ARRAY_ARB:
+  return ctx->Extensions.ARB_texture_cube_map_array;
case GL_TEXTURE_RECTANGLE_NV:
case GL_PROXY_TEXTURE_RECTANGLE_NV:
   return ctx->Extensions.NV_texture_rectangle;
-- 
1.8.3.1


Re: [Mesa-dev] [PATCH 7/8] i965: Move singlesample_mt to the renderbuffer.

2014-02-14 Thread Kenneth Graunke

On 02/14/2014 03:00 PM, Eric Anholt wrote:
> Since only window system renderbuffers can have a singlesample_mt, this
> lets us drop a bunch of sanity checking to make sure that we're just a
> renderbuffer-like thing.
> ---
>  src/mesa/drivers/dri/i965/brw_context.c   |  20 ++-
>  src/mesa/drivers/dri/i965/intel_fbo.c |  91 ++-
>  src/mesa/drivers/dri/i965/intel_fbo.h |  47 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 209 
> +++---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  59 +---
>  5 files changed, 165 insertions(+), 261 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index ba2f971..1071f9f 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -997,7 +997,7 @@ intel_resolve_for_dri2_flush(struct brw_context *brw,
>if (rb->mt->num_samples <= 1)
>   intel_miptree_resolve_color(brw, rb->mt);
>else
> - intel_miptree_downsample(brw, rb->mt);
> + intel_renderbuffer_downsample(brw, rb);
> }
>  }
>  
> @@ -1270,10 +1270,9 @@ intel_process_dri2_buffer(struct brw_context *brw,
> rb->mt->region->name == buffer->name)
>return;
> } else {
> -   if (rb->mt &&
> -   rb->mt->singlesample_mt &&
> -   rb->mt->singlesample_mt->region &&
> -   rb->mt->singlesample_mt->region->name == buffer->name)
> +   if (rb->singlesample_mt &&
> +   rb->singlesample_mt->region &&
> +   rb->singlesample_mt->region->name == buffer->name)
>return;
> }
>  
> @@ -1308,7 +1307,7 @@ intel_process_dri2_buffer(struct brw_context *brw,
> (buffer->attachment == __DRI_BUFFER_FRONT_LEFT ||
>  buffer->attachment == __DRI_BUFFER_FAKE_FRONT_LEFT) &&
> rb->Base.Base.NumSamples > 1) {
> -  intel_miptree_upsample(brw, rb->mt);
> +  intel_renderbuffer_upsample(brw, rb);
> }
>  
> assert(rb->mt);
> @@ -1355,10 +1354,9 @@ intel_update_image_buffer(struct brw_context *intel,
> rb->mt->region->bo == region->bo)
>return;
> } else {
> -   if (rb->mt &&
> -   rb->mt->singlesample_mt &&
> -   rb->mt->singlesample_mt->region &&
> -   rb->mt->singlesample_mt->region->bo == region->bo)
> +   if (rb->singlesample_mt &&
> +   rb->singlesample_mt->region &&
> +   rb->singlesample_mt->region->bo == region->bo)
>return;
> }
>  
> @@ -1367,7 +1365,7 @@ intel_update_image_buffer(struct brw_context *intel,
> if (intel->is_front_buffer_rendering &&
> buffer_type == __DRI_IMAGE_BUFFER_FRONT &&
> rb->Base.Base.NumSamples > 1) {
> -  intel_miptree_upsample(intel, rb->mt);
> +  intel_renderbuffer_upsample(intel, rb);
> }
>  }
>  
> diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c 
> b/src/mesa/drivers/dri/i965/intel_fbo.c
> index cd148f0..62ccfab 100644
> --- a/src/mesa/drivers/dri/i965/intel_fbo.c
> +++ b/src/mesa/drivers/dri/i965/intel_fbo.c
> @@ -74,11 +74,41 @@ intel_delete_renderbuffer(struct gl_context *ctx, struct 
> gl_renderbuffer *rb)
> ASSERT(irb);
>  
> intel_miptree_release(&irb->mt);
> +   intel_miptree_release(&irb->singlesample_mt);
>  
> _mesa_delete_renderbuffer(ctx, rb);
>  }
>  
>  /**
> + * \brief Downsample a winsys renderbuffer from mt to singlesample_mt.
> + *
> + * If the miptree needs no downsample, then skip.
> + */
> +void
> +intel_renderbuffer_downsample(struct brw_context *brw,
> +  struct intel_renderbuffer *irb)
> +{
> +   if (!irb->need_downsample)
> +  return;
> +   intel_miptree_updownsample(brw, irb->mt, irb->singlesample_mt);
> +   irb->need_downsample = false;
> +}
> +
> +/**
> + * \brief Upsample a winsys renderbuffer from singlesample_mt to mt.
> + *
> + * The upsample is done unconditionally.
> + */
> +void
> +intel_renderbuffer_upsample(struct brw_context *brw,
> +struct intel_renderbuffer *irb)
> +{
> +   assert(!irb->need_downsample);
> +
> +   intel_miptree_updownsample(brw, irb->singlesample_mt, irb->mt);
> +}
> +
> +/**
>   * \see dd_function_table::MapRenderbuffer
>   */
>  static void
> @@ -92,6 +122,7 @@ intel_map_renderbuffer(struct gl_context *ctx,
> struct brw_context *brw = brw_context(ctx);
> struct swrast_renderbuffer *srb = (struct swrast_renderbuffer *)rb;
> struct intel_renderbuffer *irb = intel_renderbuffer(rb);
> +   struct intel_mipmap_tree *mt;
> void *map;
> int stride;
>  
> @@ -106,6 +137,39 @@ intel_map_renderbuffer(struct gl_context *ctx,
>  
> intel_prepare_render(brw);
>  

I'm having a little trouble parsing this comment.  A few suggestions...

> +   /* The MapRenderbuffer API should always be returning a single-sampled

"should always return"?

> +* mapping.  The case we get mapping of multisampled RBs are in

"The case w

Re: [Mesa-dev] [PATCH 6/8] i965: Drop some duplicated code in DRI winsys BO updates.

2014-02-14 Thread Kenneth Graunke

On 02/14/2014 03:00 PM, Eric Anholt wrote:
[snip]
> @@ -666,78 +667,6 @@ intel_miptree_create_for_bo(struct brw_context *brw,
> return mt;
>  }
>  
> -
> -/**
> - * For a singlesample DRI2 buffer, this simply wraps the given region with a 
> miptree.
> - *
> - * For a multisample DRI2 buffer, this wraps the given region with
> - * a singlesample miptree, then creates a multisample miptree into which the
> - * singlesample miptree is embedded as a child.
> - */
> -struct intel_mipmap_tree*
> -intel_miptree_create_for_dri2_buffer(struct brw_context *brw,
> - unsigned dri_attachment,
> - mesa_format format,
> - uint32_t num_samples,
> - struct intel_region *region)
> -{
> -   struct intel_mipmap_tree *singlesample_mt = NULL;
> -   struct intel_mipmap_tree *multisample_mt = NULL;
> -
> -   /* Only the front and back buffers, which are color buffers, are shared
> -* through DRI2.
> -*/
> -   assert(dri_attachment == __DRI_BUFFER_BACK_LEFT ||
> -  dri_attachment == __DRI_BUFFER_FRONT_LEFT ||
> -  dri_attachment == __DRI_BUFFER_FAKE_FRONT_LEFT);
> -   assert(_mesa_get_format_base_format(format) == GL_RGB ||
> -  _mesa_get_format_base_format(format) == GL_RGBA);
> -
> -   singlesample_mt = intel_miptree_create_for_bo(brw,
> - region->bo,
> - format,
> - 0,
> - region->width,
> - region->height,
> - region->pitch,
> - region->tiling);
> -   if (!singlesample_mt)
> -  return NULL;
> -   singlesample_mt->region->name = region->name;

The singlesample_mt->region->name = region->name line is missing from
the new code.  I doubt it actually matters, but figured I'd point it out
in case it wasn't intentional.

--Ken




Re: [Mesa-dev] [PATCH 8/8] i965: Drop mt->levels[].width/height.

2014-02-14 Thread Kenneth Graunke

On 02/14/2014 03:00 PM, Eric Anholt wrote:
> It often confused people because it was unclear on whether it was the
> physical or logical, and people needed the other one as well.  We can
> recompute it trivially using the minify() macro, clarifying which value is
> being used and making getting the other value obvious.
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.cpp   |  4 +--
>  src/mesa/drivers/dri/i965/brw_clear.c |  3 +-
>  src/mesa/drivers/dri/i965/brw_tex_layout.c|  5 ++--
>  src/mesa/drivers/dri/i965/intel_blit.c|  4 +--
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 40 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  5 +---
>  src/mesa/drivers/dri/i965/intel_screen.c  |  4 +--
>  7 files changed, 23 insertions(+), 42 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp 
> b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> index 76537c8..7980013 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> @@ -68,8 +68,8 @@ brw_blorp_mip_info::set(struct intel_mipmap_tree *mt,
> this->mt = mt;
> this->level = level;
> this->layer = layer;
> -   this->width = mt->level[level].width;
> -   this->height = mt->level[level].height;
> +   this->width = minify(mt->physical_width0, level);
> +   this->height = minify(mt->physical_height0, level);
>  
> intel_miptree_get_image_offset(mt, level, layer, &x_offset, &y_offset);
>  }
> diff --git a/src/mesa/drivers/dri/i965/brw_clear.c 
> b/src/mesa/drivers/dri/i965/brw_clear.c
> index 1964572..d9a8792 100644
> --- a/src/mesa/drivers/dri/i965/brw_clear.c
> +++ b/src/mesa/drivers/dri/i965/brw_clear.c
> @@ -155,7 +155,8 @@ brw_fast_clear_depth(struct gl_context *ctx)
> *width of the map (LOD0) is not multiple of 16, fast clear
> *optimization must be disabled.
> */
> -  if (brw->gen == 6 && (mt->level[depth_irb->mt_level].width % 16) != 0)
> +  if (brw->gen == 6 && (minify(mt->physical_width0,
> +   depth_irb->mt_level) % 16) != 0)
>return false;
>/* FALLTHROUGH */
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
> b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> index 61a2eba..76044b2 100644
> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> @@ -197,8 +197,7 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
> for (unsigned level = mt->first_level; level <= mt->last_level; level++) {
>unsigned img_height;
>  
> -  intel_miptree_set_level_info(mt, level, x, y, width,
> -height, depth);
> +  intel_miptree_set_level_info(mt, level, x, y, depth);
>  
>img_height = ALIGN(height, mt->align_h);
>if (mt->compressed)
> @@ -281,7 +280,7 @@ brw_miptree_layout_texture_3d(struct brw_context *brw,
>if (mt->target == GL_TEXTURE_CUBE_MAP)
>   DL = 6;
>  
> -  intel_miptree_set_level_info(mt, level, 0, 0, WL, HL, DL);
> +  intel_miptree_set_level_info(mt, level, 0, 0, DL);
>  
>for (unsigned q = 0; q < DL; q++) {
>   unsigned x = (q % (1 << level)) * wL;
> diff --git a/src/mesa/drivers/dri/i965/intel_blit.c 
> b/src/mesa/drivers/dri/i965/intel_blit.c
> index b12ecca..23757f3 100644
> --- a/src/mesa/drivers/dri/i965/intel_blit.c
> +++ b/src/mesa/drivers/dri/i965/intel_blit.c
> @@ -215,10 +215,10 @@ intel_miptree_blit(struct brw_context *brw,
> intel_miptree_resolve_color(brw, dst_mt);
>  
> if (src_flip)
> -  src_y = src_mt->level[src_level].height - src_y - height;
> +  src_y = minify(src_mt->physical_height0, src_level) - src_y - height;
>  
> if (dst_flip)
> -  dst_y = dst_mt->level[dst_level].height - dst_y - height;
> +  dst_y = minify(dst_mt->physical_height0, src_level) - dst_y - height;

This looks like a typo.  Shouldn't this be dst_level?

Assuming you fix that, this series is:
Reviewed-by: Kenneth Graunke 

(You're welcome to take or ignore my other suggestions.)
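
To illustrate why the level argument matters in the flip calculation quoted above, here is a minimal sketch. It assumes minify() halves a dimension once per mip level and clamps at 1; the flip_y() helper and the sizes in the usage note are hypothetical and are not taken from the patch.

```c
#include <assert.h>

/* Assumed behaviour of minify(): size of mip level `levels` for a
 * base dimension `value`, never smaller than 1. */
static unsigned
minify(unsigned value, unsigned levels)
{
   unsigned v = value >> levels;
   return v > 1 ? v : 1;
}

/* Hypothetical helper: flip the row range [y, y + height) within the
 * given mip level of a surface whose level-0 height is height0. */
static unsigned
flip_y(unsigned height0, unsigned level, unsigned y, unsigned height)
{
   unsigned level_height = minify(height0, level);

   /* The range must fit inside the level being addressed. */
   assert(y + height <= level_height);
   return level_height - y - height;
}
```

With a destination that is 64 rows tall at level 0, flipping rows 2..5 of level 1 gives flip_y(64, 1, 2, 4) = 26, while mistakenly passing a source level of 3 gives flip_y(64, 3, 2, 4) = 2. The two only agree when source and destination levels happen to match.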




Re: [Mesa-dev] [PATCH 1/3] i965/fs: Drop dead comment about the old proj_attrib_mask optimization.

2014-02-14 Thread Kenneth Graunke

On 02/14/2014 04:48 PM, Eric Anholt wrote:
> The code was removed early last year.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 6 --
>  1 file changed, 6 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 4f5558b..d35928e 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -1108,12 +1108,6 @@ fs_visitor::emit_general_interpolation(ir_variable *ir)
>} else {
>   /* Smooth/noperspective interpolation case. */
>   for (unsigned int k = 0; k < type->vector_elements; k++) {
> -/* FINISHME: At some point we probably want to push
> - * this farther by giving similar treatment to the
> - * other potentially constant components of the
> - * attribute, as well as making brw_vs_constval.c
> - * handle varyings other than gl_TexCoord.
> - */
> struct brw_reg interp = interp_reg(location, k);
> emit_linterp(attr, fs_reg(interp), interpolation_mode,
>  ir->data.centroid && !c->key.persample_shading,
> 

Patches 1 and 2 are:
Reviewed-by: Kenneth Graunke 

(I haven't had time to look at 3.  Don't wait for me if someone else does.)




Re: [Mesa-dev] [PATCH 1/3] gallivm: use correct rounding for linear wrap mode (in the aos int path)

2014-02-14 Thread Roland Scheidegger

On 15.02.2014 01:54, srol...@vmware.com wrote:
> From: Jeff Muizelaar 
> 
> The previous method for converting coords to ints was slightly inaccurate
> (effectively losing 1bit from the 8bit lerp weight). This is probably
> especially noticeable when trying to draw a pixel-aligned texture.
> As an example, for a 100x100 texture after denormalization the texture
> coords in this case would turn up as
> 0.5, 1.5, 2.5, 3.5, 4.5, ...
> After the mul by 256, conversion to int and 128 subtraction, they end up as
> 0, 256, 512, 768, 1024, ...
> which gets us the correct coords/weights of
> 0/0, 1/0, 2/0, 3/0, 4/0, ...
> But even LSB errors (which are unavoidable) in the input coords may cause
> these coords/weights to be wrong, e.g. for a coord of 3.4 we'd get a
> coord/weight of 2/255 instead.
> 
> Fix this by using round-to-nearest int instead of FPToSi (trunc). Should be
> equally fast on x86 sse though other archs probably suffer a little.
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c |   14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c 
> b/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
> index c35b628..1d87ee8 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_aos.c
> @@ -987,7 +987,6 @@ lp_build_sample_image_linear(struct 
> lp_build_sample_context *bld,
> const unsigned dims = bld->dims;
> LLVMBuilderRef builder = bld->gallivm->builder;
> struct lp_build_context i32;
> -   LLVMTypeRef i32_vec_type;
> LLVMValueRef i32_c8, i32_c128, i32_c255;
> LLVMValueRef width_vec, height_vec, depth_vec;
> LLVMValueRef s_ipart, s_fpart, s_float;
> @@ -1003,8 +1002,6 @@ lp_build_sample_image_linear(struct 
> lp_build_sample_context *bld,
>  
> lp_build_context_init(&i32, bld->gallivm, lp_type_int_vec(32, 
> bld->vector_width));
>  
> -   i32_vec_type = lp_build_vec_type(bld->gallivm, i32.type);
> -
> lp_build_extract_image_sizes(bld,
>  &bld->int_size_bld,
>  bld->int_coord_type,
> @@ -1036,11 +1033,16 @@ lp_build_sample_image_linear(struct 
> lp_build_sample_context *bld,
> }
>  
> /* convert float to int */
> -   s = LLVMBuildFPToSI(builder, s, i32_vec_type, "");
> +   /* For correct rounding, need round to nearest, not truncation here.
> +* Note that in some cases (clamp to edge, no texel offsets) we
> +* could use a non-signed build context which would help archs which
> +* don't have fptosi intrinsic with nearest rounding implemented.
> +*/
> +   s_ipart = lp_build_iround(&bld->coord_bld, s);
> if (dims >= 2)
> -  t = LLVMBuildFPToSI(builder, t, i32_vec_type, "");
> +  t_ipart = lp_build_iround(&bld->coord_bld, t);
> if (dims >= 3)
> -  r = LLVMBuildFPToSI(builder, r, i32_vec_type, "");
> +  r_ipart = lp_build_iround(&bld->coord_bld, r);
>  
> /* subtract 0.5 (add -128) */
> i32_c128 = lp_build_const_int_vec(bld->gallivm, i32.type, -128);
> 

Oops this series is quite a disaster, sorry. That should have been
s_ipart/t_ipart/r_ipart above of course, my typo... And 2/3 is quite
fail too with some nice texel shift, bad math there... 3/3 may be ok...

Roland
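
The effect described in the commit message can be reproduced with plain scalar arithmetic. The following is only a sketch under stated assumptions: the struct and function names are hypothetical, inputs are taken to be at least 0.5 so the fixed-point value stays non-negative, and rounding is done as "add 0.5, then truncate", which is a simplification of what lp_build_iround() would emit.

```c
#include <assert.h>

/* Hypothetical result type: integer texel index plus 8-bit lerp weight. */
struct coord_weight {
   int coord;
   int weight;
};

/* Split a fixed-point value with 8 fractional bits into index/weight.
 * Assumes fixed >= 0 so that the shift and the mask are well defined. */
static struct coord_weight
split_fixed(int fixed)
{
   struct coord_weight cw;

   assert(fixed >= 0);
   cw.coord = fixed >> 8;
   cw.weight = fixed & 0xff;
   return cw;
}

/* Old behaviour: scale by 256, truncate toward zero, subtract 0.5 (128). */
static struct coord_weight
linear_coord_trunc(float s)
{
   return split_fixed((int)(s * 256.0f) - 128);
}

/* New behaviour: scale by 256, round to nearest (half up, for the
 * non-negative inputs assumed here), subtract 0.5 (128). */
static struct coord_weight
linear_coord_round(float s)
{
   return split_fixed((int)(s * 256.0f + 0.5f) - 128);
}
```

For an exact input of 3.5 both variants yield coord 3 with weight 0. For an input that is only slightly low, such as 3.4999, truncation yields coord 2 with weight 255, while rounding still yields coord 3 with weight 0.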

[Mesa-dev] [Bug 75010] New: clang: error: unknown argument: '-fstack-protector-strong'

2014-02-14 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=75010

  Priority: medium
Bug ID: 75010
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: clang: error: unknown argument:
'-fstack-protector-strong'
  Severity: blocker
Classification: Unclassified
OS: Linux (All)
  Reporter: v...@freedesktop.org
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Other
   Product: Mesa

Building with clang 3.4 on Fedora 21 (Rawhide)

mesa: 1020d8937ef52725cc5adafc12465f6332815e82 (master)

$ make
[...]
  CXX  gallivm/lp_bld_debug.lo
clang: error: unknown argument: '-fstack-protector-strong'

$ clang++ --version
clang version 3.4 (tags/RELEASE_34/final)
Target: x86_64-redhat-linux-gnu
Thread model: posix

$ llvm-config --cxxflags
-I/usr/include  -DNDEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS
-D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -O2 -g -pipe -Wall
-Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
-fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches  -m64
-mtune=generic -fomit-frame-pointer -std=c++11 -fvisibility-inlines-hidden
-fno-exceptions -fPIC -Woverloaded-virtual -Wcast-qual

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [PATCH] configure: Try pkg-config first for libselinux

2014-02-14 Thread Kusanagi Kouichi

Signed-off-by: Kusanagi Kouichi 
---
 configure.ac | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/configure.ac b/configure.ac
index 00a0eaa..5c58928 100644
--- a/configure.ac
+++ b/configure.ac
@@ -506,11 +506,12 @@ AC_ARG_ENABLE([selinux],
 [MESA_SELINUX="$enableval"],
 [MESA_SELINUX=no])
 if test "x$enable_selinux" = "xyes"; then
-AC_CHECK_HEADER([selinux/selinux.h],[],
-[AC_MSG_ERROR([SELinux headers not found])])
-AC_CHECK_LIB([selinux],[is_selinux_enabled],[],
- [AC_MSG_ERROR([SELinux library not found])])
-SELINUX_LIBS="-lselinux"
+PKG_CHECK_MODULES([SELINUX], [libselinux], [],
+[AC_CHECK_HEADER([selinux/selinux.h],[],
+ [AC_MSG_ERROR([SELinux headers not found])])
+ AC_CHECK_LIB([selinux],[is_selinux_enabled],[],
+  [AC_MSG_ERROR([SELinux library not found])])
+ SELINUX_LIBS="-lselinux"])
 DEFINES="$DEFINES -DMESA_SELINUX"
 fi
 AC_SUBST([SELINUX_LIBS])
-- 
1.9.0.rc3


Re: [Mesa-dev] [PATCH] mesa: Fix error code generation in glReadPixels()

2014-02-14 Thread Ian Romanick

On 02/14/2014 04:49 PM, Anuj Phogat wrote:
> Section 4.3.1, page 220, of OpenGL 3.3 specification explains
> the error conditions for glReadPixels():
> 
>"If the format is DEPTH_STENCIL, then values are taken from
> both the depth buffer and the stencil buffer. If there is
> no depth buffer or if there is no stencil buffer, then the
> error INVALID_OPERATION occurs. If the type parameter is
> not UNSIGNED_INT_24_8 or FLOAT_32_UNSIGNED_INT_24_8_REV,
> then the error INVALID_ENUM occurs."

Add this quotation to the code with an extra comment "OpenGL ES still
generates GL_INVALID_OPERATION because glReadPixels cannot be used to
read depth or stencil in that API."

> Fixes failing Khronos CTS test packed_depth_stencil_error.test
> 
> Cc: 
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/main/glformats.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
> index 77cf263..b797900 100644
> --- a/src/mesa/main/glformats.c
> +++ b/src/mesa/main/glformats.c
> @@ -1257,6 +1257,9 @@ _mesa_error_check_format_and_type(const struct 
> gl_context *ctx,
>ctx->Extensions.ARB_texture_rgb10_a2ui) {
>   break; /* OK */
>}
> +  if (format == GL_DEPTH_STENCIL && _mesa_is_desktop_gl(ctx)) {
> + return GL_INVALID_ENUM;
> +  }

Wouldn't it be easier to just add the following before the
switch-statement?

   /* Section 4.3.1, page 220, of OpenGL 3.3 specification explains
    * the error conditions for glReadPixels():
    *
    * "If the format is DEPTH_STENCIL, then values are taken from both
    * the depth buffer and the stencil buffer. If there is no depth
    * buffer or if there is no stencil buffer, then the error
    * INVALID_OPERATION occurs. If the type parameter is not
    * UNSIGNED_INT_24_8 or FLOAT_32_UNSIGNED_INT_24_8_REV, then the
    * error INVALID_ENUM occurs."
    *
    * OpenGL ES still generates GL_INVALID_OPERATION because glReadPixels
    * cannot be used to read depth or stencil in that API.
    */
   if (_mesa_is_desktop_gl(ctx) && format == GL_DEPTH_STENCIL
       && type != GL_UNSIGNED_INT_24_8
       && type != GL_FLOAT_32_UNSIGNED_INT_24_8_REV)
      return GL_INVALID_ENUM;

>return GL_INVALID_OPERATION;
>  
> case GL_UNSIGNED_SHORT_4_4_4_4:
> @@ -1280,6 +1283,9 @@ _mesa_error_check_format_and_type(const struct 
> gl_context *ctx,
>ctx->API == API_OPENGLES2) {
>   break; /* OK by GL_EXT_texture_type_2_10_10_10_REV */
>}
> +  if (format == GL_DEPTH_STENCIL && _mesa_is_desktop_gl(ctx)) {
> + return GL_INVALID_ENUM;
> +  }
>return GL_INVALID_OPERATION;
>  
> case GL_UNSIGNED_INT_24_8:
> @@ -1298,7 +1304,8 @@ _mesa_error_check_format_and_type(const struct 
> gl_context *ctx,
>return GL_NO_ERROR;
>  
> case GL_UNSIGNED_INT_10F_11F_11F_REV:
> -  if (!ctx->Extensions.EXT_packed_float) {
> +  if (!ctx->Extensions.EXT_packed_float ||
> +  (format == GL_DEPTH_STENCIL && _mesa_is_desktop_gl(ctx))) {
>   return GL_INVALID_ENUM;
>}
>if (format != GL_RGB) {
> 
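
The early check suggested in this reply can be tried out in isolation. The sketch below is not Mesa code: the function name and the SKETCH_ prefixed constants are local stand-ins (carrying the usual OpenGL enum values) so it builds without GL headers, and returning "no error" here only means that the remaining per-type checks would go on to run.

```c
#include <stdbool.h>

/* Local stand-ins for the OpenGL enums, so no GL headers are needed. */
#define SKETCH_GL_NO_ERROR                        0
#define SKETCH_GL_INVALID_ENUM                    0x0500
#define SKETCH_GL_UNSIGNED_BYTE                   0x1401
#define SKETCH_GL_DEPTH_STENCIL                   0x84F9
#define SKETCH_GL_UNSIGNED_INT_24_8               0x84FA
#define SKETCH_GL_FLOAT_32_UNSIGNED_INT_24_8_REV  0x8DAD

/* Hypothetical early check: on desktop GL, a DEPTH_STENCIL format with
 * any type other than the two packed depth/stencil types is an
 * INVALID_ENUM. Everything else falls through to later checks. */
static unsigned
sketch_check_depth_stencil_type(bool is_desktop_gl, unsigned format,
                                unsigned type)
{
   if (is_desktop_gl && format == SKETCH_GL_DEPTH_STENCIL
       && type != SKETCH_GL_UNSIGNED_INT_24_8
       && type != SKETCH_GL_FLOAT_32_UNSIGNED_INT_24_8_REV)
      return SKETCH_GL_INVALID_ENUM;

   return SKETCH_GL_NO_ERROR;
}
```

Placing the test before the switch statement means each type case no longer needs its own DEPTH_STENCIL special case, which is the simplification being proposed.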


Re: [Mesa-dev] [PATCH] mesa: Add GL_TEXTURE_CUBE_MAP_ARRAY to legal_get_tex_level_parameter_target()

2014-02-14 Thread Ian Romanick

On 02/14/2014 05:31 PM, Anuj Phogat wrote:
> Fixes failing Khronos CTS test packed_depth_stencil_init.test
> 
> Cc: 
> Signed-off-by: Anuj Phogat 

Reviewed-by: Ian Romanick 

> ---
>  src/mesa/main/texparam.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/src/mesa/main/texparam.c b/src/mesa/main/texparam.c
> index b7ed50d..bbdbc27 100644
> --- a/src/mesa/main/texparam.c
> +++ b/src/mesa/main/texparam.c
> @@ -986,6 +986,9 @@ legal_get_tex_level_parameter_target(struct gl_context 
> *ctx, GLenum target)
> case GL_TEXTURE_CUBE_MAP_NEGATIVE_Z_ARB:
> case GL_PROXY_TEXTURE_CUBE_MAP_ARB:
>return ctx->Extensions.ARB_texture_cube_map;
> +   case GL_TEXTURE_CUBE_MAP_ARRAY_ARB:
> +   case GL_PROXY_TEXTURE_CUBE_MAP_ARRAY_ARB:
> +  return ctx->Extensions.ARB_texture_cube_map_array;
> case GL_TEXTURE_RECTANGLE_NV:
> case GL_PROXY_TEXTURE_RECTANGLE_NV:
>return ctx->Extensions.NV_texture_rectangle;
> 


Re: [Mesa-dev] [PATCH 1/8] meta: Fix blit shader compile on non-glsl-130 drivers.

2014-02-14 Thread Ian Romanick

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 02/14/2014 05:09 PM, Kenneth Graunke wrote:
> On 02/14/2014 03:00 PM, Eric Anholt wrote:
>> Compare this VS to the one for the post-130 case.  Fixes piglit 
>> glsl-lod-bias, and presumably tons of other code (I haven't done
>> a full piglit run on swrast).
>> 
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74911
>> ---
>>  src/mesa/drivers/common/meta.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
>> index d3ca3b7..dd905dd 100644
>> --- a/src/mesa/drivers/common/meta.c
>> +++ b/src/mesa/drivers/common/meta.c
>> @@ -193,7 +193,7 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>>         || ctx->Const.GLSLVersion < 130) {
>>        vs_source =
>>           "attribute vec2 position;\n"
>> -         "attribute vec3 textureCoords;\n"
>> +         "attribute vec4 textureCoords;\n"
>>           "varying vec4 texCoords;\n"
>>           "void main()\n"
>>           "{\n"
>> 
> 
> This is obviously: Reviewed-by: Kenneth Graunke
> 
> 
> But I wonder, would it be terribly harmful to just override 
> ctx->Const.GLSLVersion to 130 in Meta so #version 130 works?

ctx->Const.GLSLVersion is already (probably) 130 or greater.  There are
checks in the compiler against API to validate the version in the
shader.  Otherwise applications with an OpenGL ES 3.0 context could
compile desktop GLSL 1.30 shaders, and that seems bad. :)

> Sure, you could get into trouble if you tried to use things like 
> ClipDistance and they weren't supported, but I don't see us needing
> that.
> 
> We would need integer, but I don't know of any drivers that allow
> you to make integer textures that can't handle integers.  (Gen4-5
> expose EXT_texture_integer without GLSL 1.30, but they can do GLSL
> 1.30...we just never finished advertising it...)
> 
> Just an idea; I'm not suggesting altering any of these patches.
> 
> 
> 
> ___ mesa-dev mailing
> list mesa-dev@lists.freedesktop.org 
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.14 (GNU/Linux)

iEYEARECAAYFAlL+2UAACgkQX1gOwKyEAw9KrwCeO2qnkBmcSsYniCyQFBwa+man
RoQAoINS3RZReZs9PlT+q1IASpMzGoGg
=mVCJ
-END PGP SIGNATURE-

Re: [Mesa-dev] [PATCH 1/8] meta: Fix blit shader compile on non-glsl-130 drivers.

2014-02-14 Thread Kenneth Graunke

On 02/14/2014 07:04 PM, Ian Romanick wrote:
> On 02/14/2014 05:09 PM, Kenneth Graunke wrote:
>> On 02/14/2014 03:00 PM, Eric Anholt wrote:
>>> Compare this VS to the one for the post-130 case.  Fixes piglit 
>>> glsl-lod-bias, and presumably tons of other code (I haven't done
>>> a full piglit run on swrast).
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74911
>>> ---
>>>  src/mesa/drivers/common/meta.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
>>> index d3ca3b7..dd905dd 100644
>>> --- a/src/mesa/drivers/common/meta.c
>>> +++ b/src/mesa/drivers/common/meta.c
>>> @@ -193,7 +193,7 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>>>         || ctx->Const.GLSLVersion < 130) {
>>>        vs_source =
>>>           "attribute vec2 position;\n"
>>> -         "attribute vec3 textureCoords;\n"
>>> +         "attribute vec4 textureCoords;\n"
>>>           "varying vec4 texCoords;\n"
>>>           "void main()\n"
>>>           "{\n"
>>>
> 
>> This is obviously: Reviewed-by: Kenneth Graunke
>> 
> 
>> But I wonder, would it be terribly harmful to just override 
>> ctx->Const.GLSLVersion to 130 in Meta so #version 130 works?
> 
> ctx->Const.GLSLVersion is already (probably) 130 or greater.  There are
> checks in the compiler against API to validate the version in the
> shader.  Otherwise applications with an OpenGL ES 3.0 context could
> compile desktop GLSL 1.30 shaders, and that seems bad. :)

I think you misunderstand.

This patch is fixing a bug in a block of code which is:

if (ctx->Const.GLSLVersion < 130) {
   ...do version 110 shaders...
} else {
   ...do version 130 or 300 es shaders...
}

So, the duplication is precisely for drivers that don't do 1.30.  Like
Gen4-5...

--Ken
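
The branch described above could be sketched roughly as follows. This is an illustration only: the function name is hypothetical, the real condition in meta.c also involves other checks, and the shader bodies past the lines quoted in the patch (as well as the whole 1.30 variant) are assumptions made so that the strings form complete shaders.

```c
/* Hypothetical helper: choose a blit vertex shader source by the GLSL
 * version the driver advertises (110-style below 130, 130-style above). */
static const char *
sketch_blit_vs_source(unsigned glsl_version)
{
   if (glsl_version < 130) {
      /* Pre-1.30 dialect: attribute/varying qualifiers. The vec4
       * textureCoords input matches the vec4 texCoords varying. */
      return "attribute vec2 position;\n"
             "attribute vec4 textureCoords;\n"
             "varying vec4 texCoords;\n"
             "void main()\n"
             "{\n"
             "   texCoords = textureCoords;\n"
             "   gl_Position = vec4(position, 0.0, 1.0);\n"
             "}\n";
   }

   /* 1.30 dialect (assumed shape): in/out qualifiers. */
   return "#version 130\n"
          "in vec2 position;\n"
          "in vec4 textureCoords;\n"
          "out vec4 texCoords;\n"
          "void main()\n"
          "{\n"
          "   texCoords = textureCoords;\n"
          "   gl_Position = vec4(position, 0.0, 1.0);\n"
          "}\n";
}
```

A driver that reports GLSL 1.20 would take the first branch, which is the path the vec3 to vec4 fix touches.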




Re: [Mesa-dev] [PATCH 1/2] vl: add motion adaptive deinterlacer

2014-02-14 Thread Ilia Mirkin

Grigori,

I just tried this out on a few interlaced videos I have, and it works
really quite well! One thing I did notice is that when playing back
content with mplayer (and -vo vdpau:deint=3) in full screen mode, the
black bars are now stale/uninitialized textures. This does not happen
if I don't use deint=3. Is it something that we're doing wrong in
nouveau, or is it an issue in st/vdpau? If it's likely an issue in
nouveau, any hints on what it might be?

  -ilia

On Fri, Feb 14, 2014 at 3:18 AM, Christian König
 wrote:
> A really nice piece of work, thanks a lot.
>
> Both patches reviewed and pushed upstream.
>
> Cheers,
> Christian.
>
> On 13.02.2014 21:32, Grigori Goronzy wrote:
>
>> ---
>>   src/gallium/auxiliary/Makefile.sources |   3 +-
>>   src/gallium/auxiliary/vl/vl_deint_filter.c | 491 +
>>   src/gallium/auxiliary/vl/vl_deint_filter.h |  78 +
>>   3 files changed, 571 insertions(+), 1 deletion(-)
>>   create mode 100644 src/gallium/auxiliary/vl/vl_deint_filter.c
>>   create mode 100644 src/gallium/auxiliary/vl/vl_deint_filter.h
>>
>> diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources
>> index c89cbdd..19004e0 100644
>> --- a/src/gallium/auxiliary/Makefile.sources
>> +++ b/src/gallium/auxiliary/Makefile.sources
>> @@ -155,7 +155,8 @@ C_SOURCES := \
>>   vl/vl_idct.c \
>> vl/vl_mc.c \
>>   vl/vl_vertex_buffers.c \
>> -vl/vl_video_buffer.c
>> +vl/vl_video_buffer.c \
>> +   vl/vl_deint_filter.c
>> GENERATED_SOURCES := \
>> indices/u_indices_gen.c \
>> diff --git a/src/gallium/auxiliary/vl/vl_deint_filter.c b/src/gallium/auxiliary/vl/vl_deint_filter.c
>> new file mode 100644
>> index 000..9b05154
>> --- /dev/null
>> +++ b/src/gallium/auxiliary/vl/vl_deint_filter.c
>> @@ -0,0 +1,491 @@
>>
>> +/**
>> + *
>> + * Copyright 2013 Grigori Goronzy .
>> + * All Rights Reserved.
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the
>> + * "Software"), to deal in the Software without restriction, including
>> + * without limitation the rights to use, copy, modify, merge, publish,
>> + * distribute, sub license, and/or sell copies of the Software, and to
>> + * permit persons to whom the Software is furnished to do so, subject to
>> + * the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the
>> + * next paragraph) shall be included in all copies or substantial portions
>> + * of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
>> + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
>> + * IN NO EVENT SHALL TUNGSTEN GRAPHICS AND/OR ITS SUPPLIERS BE LIABLE FOR
>> + * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
>> + * TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
>> + * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>> + *
>> + **/
>> +
>> +/*
>> + *  References:
>> + *
>> + *  Lin, S. F., Chang, Y. L., & Chen, L. G. (2003).
>> + *  Motion adaptive interpolation with horizontal motion detection for deinterlacing.
>> + *  Consumer Electronics, IEEE Transactions on, 49(4), 1256-1265.
>> + *
>> + *  Pei-Yin, C. H. E. N., & Yao-Hsien, L. A. I. (2007).
>> + *  A low-complexity interpolation method for deinterlacing.
>> + *  IEICE transactions on information and systems, 90(2), 606-608.
>> + *
>> + */
>> +
>> +#include 
>> +
>> +#include "pipe/p_context.h"
>> +
>> +#include "tgsi/tgsi_ureg.h"
>> +
>> +#include "util/u_draw.h"
>> +#include "util/u_memory.h"
>> +#include "util/u_math.h"
>> +
>> +#include "vl_types.h"
>> +#include "vl_video_buffer.h"
>> +#include "vl_vertex_buffers.h"
>> +#include "vl_deint_filter.h"
>> +
>> +enum VS_OUTPUT
>> +{
>> +   VS_O_VPOS = 0,
>> +   VS_O_VTEX = 0
>> +};
>> +
>> +static void *
>> +create_vert_shader(struct vl_deint_filter *filter)
>> +{
>> +   struct ureg_program *shader;
>> +   struct ureg_src i_vpos;
>> +   struct ureg_dst o_vpos, o_vtex;
>> +
>> +   shader = ureg_create(TGSI_PROCESSOR_VERTEX);
>> +   if (!shader)
>> +  return NULL;
>> +
>> +   i_vpos = ureg_DECL_vs_input(shader, 0);
>> +   o_vpos = ureg_DECL_output(shader, TGSI_SEMANTIC_POSITION, VS_O_VPOS);
>> +   o_vtex = ureg_DECL_output(shader, TGSI_SEMANTIC_GENERIC, VS_O_VTEX);
>> +
>> +   ureg_MOV(shader, o_vpos, i_vpos);
>> +   ureg_MOV(shader, o_vtex, i_vpos);
>> +
>> +   ureg_END(shader);
>> +
>> +   return ureg_create_shader_and_destroy(shader, filter->pipe);
>> +}
>> +
>> +static void *
>> +create_copy_frag_shader(struct vl_deint_filter *filter, unsigned field)
>>

Re: [Mesa-dev] [PATCH] i965/fs: Use conditional sends to do FB writes on HSW+.

2014-02-14 Thread Matt Turner

Reviewed-by: Matt Turner 

Re: [Mesa-dev] [PATCH] configure: Try pkg-config first for libselinux

2014-02-14 Thread Matt Turner

Reviewed-by: Matt Turner 

I'll commit later, unless someone does it first.
