[Mesa-dev] [PATCH v2 0/2] nvc0: support for GK20A (Tegra K1)

2014-05-27 Thread Alexandre Courbot
The following 2 patches make it possible to run Mesa programs on GK20A
(Tegra K1).

GK20A is very similar to GK104, but uses a new (backward-compatible) 3D class
as well as the same ISA as GK110 (SM35). Taking these differences into account
is sufficient to successfully render simple off-screen buffers.

Changes since v1:
- Update TargetNVC0::getFileSize() to return the right number of GPRs
- Remove definition for unneeded NVISA_GK110_CHIPSET
- Use consistent comparison scheme in nv50_ir_emit_nvc0.cpp

Alexandre Courbot (2):
  nvc0: add GK20A 3D class
  nvc0: use SM35 ISA with GK20A

 src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h  |  2 +-
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp |  2 +-
 .../drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp   | 15 ++-
 src/gallium/drivers/nouveau/nv_object.xml.h   |  1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c|  9 -
 5 files changed, 21 insertions(+), 8 deletions(-)

-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 2/2] nvc0: use SM35 ISA with GK20A

2014-05-27 Thread Alexandre Courbot
GK20A is mostly compatible with GK104, but uses the SM35 ISA. Use
the GK110 path when this chip is detected.

Signed-off-by: Alexandre Courbot 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h  |  2 +-
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp |  2 +-
 .../drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp   | 15 ++-
 3 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
index bbb89d97932e..f829aac0bcc2 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
@@ -91,7 +91,7 @@ struct nv50_ir_prog_symbol
 #define NVISA_GF100_CHIPSET_C0 0xc0
 #define NVISA_GF100_CHIPSET_D0 0xd0
 #define NVISA_GK104_CHIPSET    0xe0
-#define NVISA_GK110_CHIPSET    0xf0
+#define NVISA_GK20A_CHIPSET    0xea
 #define NVISA_GM107_CHIPSET    0x110
 
 struct nv50_ir_prog_info
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
index b1f76cf80432..f69e6a183e19 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
@@ -3027,7 +3027,7 @@ TargetNVC0::createCodeEmitterNVC0(Program::Type type)
 CodeEmitter *
 TargetNVC0::getCodeEmitter(Program::Type type)
 {
-   if (chipset >= NVISA_GK110_CHIPSET)
+   if (chipset >= NVISA_GK20A_CHIPSET)
       return createCodeEmitterGK110(type);
    return createCodeEmitterNVC0(type);
 }
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
index 064e7a2c63f9..963b6e47ddfc 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
@@ -49,9 +49,12 @@ TargetNVC0::getBuiltinCode(const uint32_t **code, uint32_t *size) const
 {
    switch (chipset & ~0xf) {
    case 0xe0:
-      *code = (const uint32_t *)&gk104_builtin_code[0];
-      *size = sizeof(gk104_builtin_code);
-      break;
+      if (chipset < NVISA_GK20A_CHIPSET) {
+         *code = (const uint32_t *)&gk104_builtin_code[0];
+         *size = sizeof(gk104_builtin_code);
+         break;
+      }
+      /* fall-through for GK20A */
    case 0xf0:
    case 0x100:
       *code = (const uint32_t *)&gk110_builtin_code[0];
@@ -71,7 +74,9 @@ TargetNVC0::getBuiltinOffset(int builtin) const
 
    switch (chipset & ~0xf) {
    case 0xe0:
-      return gk104_builtin_offsets[builtin];
+      if (chipset < NVISA_GK20A_CHIPSET)
+         return gk104_builtin_offsets[builtin];
+      /* fall-through for GK20A */
    case 0xf0:
    case 0x100:
       return gk110_builtin_offsets[builtin];
@@ -235,7 +240,7 @@ TargetNVC0::getFileSize(DataFile file) const
 {
    switch (file) {
    case FILE_NULL:      return 0;
-   case FILE_GPR:       return (chipset >= NVISA_GK110_CHIPSET) ? 255 : 63;
+   case FILE_GPR:       return (chipset >= NVISA_GK20A_CHIPSET) ? 255 : 63;
    case FILE_PREDICATE: return 7;
    case FILE_FLAGS:     return 1;
    case FILE_ADDRESS:   return 0;
-- 
1.9.3
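
The chipset dispatch this patch introduces can be summarized in a small standalone sketch. The two chipset constants and the 255/63 GPR counts are taken from the diff above; the enum, the struct and the function name are hypothetical and do not exist in Mesa, and chipsets outside the 0xe0/0xf0/0x100 families are deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Chipset IDs as they appear in nv50_ir_driver.h after this patch.
constexpr uint32_t NVISA_GK104_CHIPSET = 0xe0;
constexpr uint32_t NVISA_GK20A_CHIPSET = 0xea;

// Hypothetical summary types, for illustration only (not Mesa structures).
enum class BuiltinSet { GK104, GK110, NotCovered };

struct CodegenChoice {
   bool use_gk110_emitter;  // true when the SM35 (GK110) emitter is selected
   BuiltinSet builtins;     // which builtin code table is used
   unsigned gpr_count;      // value returned for FILE_GPR
};

CodegenChoice choose_codegen(uint32_t chipset)
{
   CodegenChoice c{};
   // Same comparison as getCodeEmitter() and getFileSize() in the patch.
   c.use_gk110_emitter = chipset >= NVISA_GK20A_CHIPSET;
   c.gpr_count = c.use_gk110_emitter ? 255 : 63;

   switch (chipset & ~0xfu) {
   case 0xe0:
      // GK20A (0xea) shares the 0xe0 family but takes the GK110 builtins.
      c.builtins = (chipset < NVISA_GK20A_CHIPSET) ? BuiltinSet::GK104
                                                   : BuiltinSet::GK110;
      break;
   case 0xf0:
   case 0x100:
      c.builtins = BuiltinSet::GK110;
      break;
   default:
      c.builtins = BuiltinSet::NotCovered;  // older chips are not modeled here
      break;
   }
   return c;
}
```

With this sketch, choose_codegen(0xe4) selects the GK104 builtins and 63 GPRs, while choose_codegen(0xea) selects the GK110 emitter, the GK110 builtins and 255 GPRs.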



[Mesa-dev] [PATCH v2 1/2] nvc0: add GK20A 3D class

2014-05-27 Thread Alexandre Courbot
GK20A is mostly compatible with GK104, but features a new 3D
class. Add it to the relevant header and use it when GK20A is
detected.

Signed-off-by: Alexandre Courbot 
---
 src/gallium/drivers/nouveau/nv_object.xml.h| 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 9 -
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nv_object.xml.h b/src/gallium/drivers/nouveau/nv_object.xml.h
index 4c93e6564838..0a0e187dc028 100644
--- a/src/gallium/drivers/nouveau/nv_object.xml.h
+++ b/src/gallium/drivers/nouveau/nv_object.xml.h
@@ -190,6 +190,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 #define NVC8_3D_CLASS  0x9297
 #define NVE4_3D_CLASS  0xa097
 #define NVF0_3D_CLASS  0xa197
+#define NVEA_3D_CLASS  0xa297
 #define GM107_3D_CLASS 0xb097
 #define NV50_2D_CLASS  0x502d
 #define NVC0_2D_CLASS  0x902d
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index cccfe2bba23d..95e5ef81cd79 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -702,7 +702,14 @@ nvc0_screen_create(struct nouveau_device *dev)
       obj_class = NVF0_3D_CLASS;
       break;
    case 0xe0:
-      obj_class = NVE4_3D_CLASS;
+      switch (dev->chipset) {
+      case 0xea:
+         obj_class = NVEA_3D_CLASS;
+         break;
+      default:
+         obj_class = NVE4_3D_CLASS;
+         break;
+      }
       break;
    case 0xd0:
       obj_class = NVC8_3D_CLASS;
-- 
1.9.3
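
As a rough illustration of the selection logic in this patch, the following standalone sketch maps a chipset ID to a 3D class. The class values come from the nv_object.xml.h hunk above; the helper name is hypothetical, and the assumption that the NVF0 class belongs to the 0xf0 family is inferred from the context lines rather than shown in the diff. Other families are not modeled and return 0.

```cpp
#include <cassert>
#include <cstdint>

// Class IDs as listed in nv_object.xml.h in the patch.
constexpr uint32_t NVE4_3D_CLASS = 0xa097;
constexpr uint32_t NVF0_3D_CLASS = 0xa197;
constexpr uint32_t NVEA_3D_CLASS = 0xa297;

// Hypothetical helper (not a Mesa function): pick the 3D class for Kepler parts.
uint32_t kepler_3d_class(uint32_t chipset)
{
   switch (chipset & ~0xfu) {
   case 0xf0:
      // Assumption: the context line "obj_class = NVF0_3D_CLASS" is the 0xf0 case.
      return NVF0_3D_CLASS;
   case 0xe0:
      // GK20A reports chipset 0xea and gets its own class; the rest stay on NVE4.
      return chipset == 0xea ? NVEA_3D_CLASS : NVE4_3D_CLASS;
   default:
      return 0;  // family not covered by this sketch
   }
}
```

The nested test on the full chipset value is needed because GK20A shares the 0xe0 family mask with GK104-class chips.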



[Mesa-dev] [Bug 79230] After upgrade from 10.1.4 to 10.2-rc4 cross-compile fails

2014-05-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79230

Andreas Boll  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |i...@freedesktop.org
           Keywords|                            |regression
             Blocks|                            |79039

--- Comment #1 from Andreas Boll  ---
Can you bisect?

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 79039] [TRACKER] Mesa 10.2 release tracker

2014-05-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79039

Andreas Boll  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |79230



[Mesa-dev] [Bug 78914] [llvmpipe] Front/Backfaces do not cover the same pixels when rasterized

2014-05-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=78914

--- Comment #16 from Florian Link  ---
This is strange, since in my renderer I render all back faces to one FBO and
all front faces to another FBO, so the front/back faces do not fight in the
Z-buffer.

I really experience missing pixels on faces that share edges, in one FBO the
front triangle edge has different pixels than the back triangle in the other
FBO... 

But maybe your code does something different when depth testing is on?

I will try to adapt my example to show the problem without depth test.



[Mesa-dev] [Bug 78914] [llvmpipe] Front/Backfaces do not cover the same pixels when rasterized

2014-05-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=78914

--- Comment #17 from Florian Link  ---
Ok, you are right, it only happens with depth test enabled.

The strange thing is that it creates these artifacts in my ray caster,
where I get exactly these holes but both front/back faces have the same
interpolated positions, so depth rejection should not create holes because
another triangle should have rendered to that pixel first.

Anyway, I think the depth test should still not generate these rejection
pixels, since it will create problems in e.g. depth peeling and other
algorithms as well.



[Mesa-dev] [Bug 79294] New: Xlib-based build broken on non x86/x86-64 architectures

2014-05-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79294

          Priority: medium
            Bug ID: 79294
          Assignee: mesa-dev@lists.freedesktop.org
           Summary: Xlib-based build broken on non x86/x86-64 architectures
          Severity: blocker
    Classification: Unclassified
                OS: All
          Reporter: andreas.boll@gmail.com
          Hardware: Other
            Status: NEW
           Version: git
         Component: Other
           Product: Mesa

Created attachment 99928
  --> https://bugs.freedesktop.org/attachment.cgi?id=99928&action=edit
Full build log of mesa-10.2.0rc4-powerpc

All non x86/x86-64 architectures are affected by the same issue,
e.g. armel, armhf, mips, mipsel, powerpc, s390x.

For more information see
https://buildd.debian.org/status/package.php?p=mesa&suite=experimental

libtool: compile:  gcc -DPACKAGE_NAME=\"Mesa\" -DPACKAGE_TARNAME=\"mesa\"
-DPACKAGE_VERSION=\"10.2.0-rc4\" "-DPACKAGE_STRING=\"Mesa 10.2.0-rc4\""
"-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\""
-DPACKAGE_URL=\"\" -DPACKAGE=\"mesa\" -DVERSION=\"10.2.0-rc4\" -DSTDC_HEADERS=1
-DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
-DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1
-DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DYYTEXT_POINTER=1
-DHAVE___BUILTIN_BSWAP32=1 -DHAVE___BUILTIN_BSWAP64=1 -DHAVE_DLADDR=1
-DHAVE_CLOCK_GETTIME=1 -DHAVE_PTHREAD=1 -I. -I../../../../../src/mapi/glapi
-D_GNU_SOURCE -DHAVE_PTHREAD -DHAVE_DLOPEN -DHAVE_POSIX_MEMALIGN -DHAVE_LIBDRM
-DHAVE_LIBUDEV -DUSE_XSHM -DMESA_EGL_NO_X11_HEADERS -I../../../../../include
-I../../../../../src/mapi -I../../../src/mapi -I../../../../../src/mesa
-DMAPI_MODE_UTIL -D_FORTIFY_SOURCE=2 -g -O2 -fstack-protector
--param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wall -Wall -std=c99
-Werror=implicit-function-declaration -Werror=missing-prototypes
-fno-strict-aliasing -fno-builtin-memcmp -MT glapi_dispatch.lo -MD -MP -MF
.deps/glapi_dispatch.Tpo -c ../../../../../src/mapi/glapi/glapi_dispatch.c 
-fPIC -DPIC -o .libs/glapi_dispatch.o
../../../../../src/mapi/glapi/glapi_dispatch.c:57:21: error: no previous
prototype for 'glPointSizePointerOES' [-Werror=missing-prototypes]
 #define NAME(func)  gl##func
 ^
../../../src/mapi/glapi/glapitemp.h:7689:24: note: in expansion of macro 'NAME'
 KEYWORD1 void KEYWORD2 NAME(PointSizePointerOES)(GLenum type, GLsizei stride,
const GLvoid * pointer)
^
../../../../../src/mapi/glapi/glapi_dispatch.c:57:21: error: no previous
prototype for 'glAlphaFuncx' [-Werror=missing-prototypes]
 #define NAME(func)  gl##func
 ^
../../../src/mapi/glapi/glapitemp.h:9359:24: note: in expansion of macro 'NAME'
 KEYWORD1 void KEYWORD2 NAME(AlphaFuncx)(GLenum func, GLclampx ref)
^
../../../../../src/mapi/glapi/glapi_dispatch.c:57:21: error: no previous
prototype for 'glClearColorx' [-Werror=missing-prototypes]
 #define NAME(func)  gl##func
 ^
../../../src/mapi/glapi/glapitemp.h:9373:24: note: in expansion of macro 'NAME'
 KEYWORD1 void KEYWORD2 NAME(ClearColorx)(GLclampx red, GLclampx green,
GLclampx blue, GLclampx alpha)
^
../../../../../src/mapi/glapi/glapi_dispatch.c:57:21: error: no previous
prototype for 'glClearDepthx' [-Werror=missing-prototypes]
 #define NAME(func)  gl##func
 ^
../../../src/mapi/glapi/glapitemp.h:9387:24: note: in expansion of macro 'NAME'
 KEYWORD1 void KEYWORD2 NAME(ClearDepthx)(GLclampx depth)
^
../../../../../src/mapi/glapi/glapi_dispatch.c:57:21: error: no previous
prototype for 'glColor4x' [-Werror=missing-prototypes]
 #define NAME(func)  gl##func
 ^
../../../src/mapi/glapi/glapitemp.h:9401:24: note: in expansion of macro 'NAME'
 KEYWORD1 void KEYWORD2 NAME(Color4x)(GLfixed red, GLfixed green, GLfixed blue,
GLfixed alpha)
^
../../../../../src/mapi/glapi/glapi_dispatch.c:57:21: error: no previous
prototype for 'glDepthRangex' [-Werror=missing-prototypes]
 #define NAME(func)  gl##func
 ^
../../../src/mapi/glapi/glapitemp.h:9415:24: note: in expansion of macro 'NAME'
 KEYWORD1 void KEYWORD2 NAME(DepthRangex)(GLclampx zNear, GLclampx zFar)
^
../../../../../src/mapi/glapi/glapi_dispatch.c:57:21: error: no previous
prototype for 'glFogx' [-Werror=missing-prototypes]
 #define NAME(func)  gl##func
 ^
../../../src/mapi/glapi/glapitemp.h:9429:24: note: in expansion of macro 'NAME'
 KEYWORD1 void KEYWORD2 NAME(Fogx)(GLenum pname, GLfixed param)
^
../../../../../src/mapi/glapi/glapi_dispatch.c:57:21: error: no previous
prototype for 'glFogxv' [-Werror=missing-prototypes]
 #define NAME(func)  gl##func
 ^
../../../src/mapi/glapi/glapitemp.h:9443:24: note: in exp

[Mesa-dev] [Bug 79039] [TRACKER] Mesa 10.2 release tracker

2014-05-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79039

Andreas Boll  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |79294



[Mesa-dev] [Bug 79294] Xlib-based build broken on non x86/x86-64 architectures

2014-05-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79294

Andreas Boll  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |i...@freedesktop.org
           Keywords|                            |regression
             Blocks|                            |79039

--- Comment #1 from Andreas Boll  ---
This issue seems to be related to Bug 79230.
Feel free to close this as a duplicate.



[Mesa-dev] [Bug 78914] [llvmpipe] Front/Backfaces do not cover the same pixels when rasterized

2014-05-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=78914

--- Comment #18 from Florian Link  ---
Ok, I can confirm that it is a depth fighting problem and found a fix for my
ray caster. Thank you for your effort!

Still, it would be good if llvmpipe did a depth test of the same quality as
softpipe and NVidia/ATI do.



[Mesa-dev] [Bug 79230] After upgrade from 10.1.4 to 10.2-rc4 cross-compile fails

2014-05-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79230

Emilio Pozuelo Monfort  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |poch...@gmail.com



[Mesa-dev] [PATCH] i915: add a missing NULL pointer check

2014-05-27 Thread Lubomir Rintel
mesaVisual can be NULL with configless context since this commit:

commit 551d459af421a2eb937e9e16301bb64da4624f89
Author: Neil Roberts 
Date:   Fri Mar 7 18:05:47 2014 +

Add the EGL_MESA_configless_context extension
...
Previously the i965 and i915 drivers were explicitly creating a zeroed visual
whenever 0 is passed for the EGLConfig.

We attempt to dereference the visual in i915, and now that we don't create a
zeroed-out one it crashes, breaking at least weston on i915. There's no
point in doing so as it would be zero anyway.

Signed-off-by: Lubomir Rintel 
---
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1100967

 src/mesa/drivers/dri/i915/intel_context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i915/intel_context.c b/src/mesa/drivers/dri/i915/intel_context.c
index a6057d3..09fe371 100644
--- a/src/mesa/drivers/dri/i915/intel_context.c
+++ b/src/mesa/drivers/dri/i915/intel_context.c
@@ -507,7 +507,7 @@ intelInitContext(struct intel_context *intel,
 
    _mesa_meta_init(ctx);
 
-   intel->hw_stencil = mesaVis->stencilBits && mesaVis->depthBits == 24;
+   intel->hw_stencil = mesaVis && mesaVis->stencilBits && mesaVis->depthBits == 24;
    intel->hw_stipple = 1;
 
    intel->RenderIndex = ~0;
-- 
1.9.3
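
The fix relies on the short-circuit evaluation of `&&`: when the pointer is NULL, the member accesses to its right are never evaluated. A minimal standalone sketch of the same pattern follows; the `Visual` struct and the function name are hypothetical stand-ins, not the driver's actual types.

```cpp
#include <cassert>

// Hypothetical stand-in for the two visual fields the driver reads.
struct Visual {
   int stencilBits;
   int depthBits;
};

// Returns true only for a non-null visual with stencil bits and 24-bit depth.
bool wants_hw_stencil(const Visual *vis)
{
   // && evaluates left to right and stops at the first false operand,
   // so vis is never dereferenced when it is null.
   return vis && vis->stencilBits && vis->depthBits == 24;
}
```

A null visual therefore yields false, which matches what a zeroed visual would have produced before the configless-context change.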



Re: [Mesa-dev] [PATCH] targets/opencl: Fix (static) linking with LLVM

2014-05-27 Thread Emil Velikov
On 27/05/14 05:46, Kertesz Laszlo wrote:
> On 05/19/2014 06:06 PM, Kai Wasserbäch wrote:
>> Michel Dänzer schrieb am 19.05.2014 04:12:
>>> On 18.05.2014 18:37, Kai Wasserbäch wrote:

>>>> And instead of just not starting, my X starts crashing, whenever
>>>> libGL fails to load a (32 bit) driver.
>>>
>>> FWIW, some potential alternatives for avoiding the X crashes:
>>>
>>> With current xserver Git master, you can pass the -iglx parameter to
>>> Xorg to prohibit GLX indirect rendering.
>>>
>>> Or just make sure the 32-bit swrast_dri.so works.
>>
>> Thanks a lot for those pointers. I think my swrast failed because it had 
>> picked
>> up some newer SO_VERSION as well. Which would bring me back to static 
>> linking.
>>
>> Kind regards,
>> Kai Wasserbäch
> 
> Hi,
> i too hit the X crashing issue. But i am unable to compile the latest
> git (10.2-branchpoint-318-g4c7bf8a according to 'git describe')
> Here is the errors i get:
> 
> 
> make[3]: Entering directory '/compile/mesa/src/gallium/targets/gbm'
>   CC   gbm.lo
>   CXXLDgbm_gallium_drm.la
> /usr/local/lib/llvm32/lib/libLLVMSupport.a(Process.o): In function
Hmm, that file is not provided by us, so I'm afraid I cannot help you here.
Perhaps the LLVM folks will have a better idea.

> `llvm::sys::Process::FileDescriptorHasColors(int)':
> Process.cpp:(.text._ZN4llvm3sys7Process23FileDescriptorHasColorsEi+0x67): 
> undefined
> reference to `setupterm'
> Process.cpp:(.text._ZN4llvm3sys7Process23FileDescriptorHasColorsEi+0x92): 
> undefined
> reference to `tigetnum'
> Process.cpp:(.text._ZN4llvm3sys7Process23FileDescriptorHasColorsEi+0xa0): 
> undefined
> reference to `set_curterm'
> Process.cpp:(.text._ZN4llvm3sys7Process23FileDescriptorHasColorsEi+0xa8): 
> undefined
> reference to `del_curterm'
> /usr/local/lib/llvm32/lib/libLLVMSupport.a(Compression.o): In function
> `llvm::zlib::compress(llvm::StringRef, llvm::SmallVectorImpl<char>&,
> llvm::zlib::CompressionLevel)':
> Compression.cpp:(.text._ZN4llvm4zlib8compressENS_9StringRefERNS_15SmallVectorImplIcEENS0_16CompressionLevelE+0x26):
> undefined reference to `compressBound'
> Compression.cpp:(.text._ZN4llvm4zlib8compressENS_9StringRefERNS_15SmallVectorImplIcEENS0_16CompressionLevelE+0xa7):
> undefined reference to `compress2'
> collect2: error: ld returned 1 exit status
> Makefile:919: recipe for target 'gbm_gallium_drm.la' failed
> make[3]: *** [gbm_gallium_drm.la] Error 1
> make[3]: Leaving directory '/compile/mesa/src/gallium/targets/gbm'
> Makefile:543: recipe for target 'all-recursive' failed
> make[2]: *** [all-recursive] Error 1
> make[2]: Leaving directory '/compile/mesa/src/gallium/targets'
> Makefile:530: recipe for target 'all-recursive' failed
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory '/compile/mesa/src'
> Makefile:579: recipe for target 'all-recursive' failed
> make: *** [all-recursive] Error 1
> 
> 
> Build script (i use it since forever, it is working with enabled llvm
> shared libs, but as soon as i touch anything OpenGL related, X crashes.)
> 
I would check what exactly is causing the crash and open a ticket at bugzilla,
provided it's not already reported.

> PKG_CONFIG_PATH=/usr/lib/i386-linux-gnu/pkgconfig:/usr/lib/pkgconfig:/usr/local/share/pkgconfig
> ./autogen.sh --sysconfdir=/etc --prefix=/usr \
> --libdir=/usr/lib/i386-linux-gnu --enable-debug \
> CPPFLAGS="-m32" \
> CXXFLAGS="-m32" \
> --with-llvm-prefix=/usr/local/lib/llvm32 \
> LDFLAGS="-L/usr/lib/i386-linux-gnu -L/usr/lib -L/usr/local/lib/llvm32/lib" \
Explicitly setting LDFLAGS is a recipe for disaster. Try to avoid that at all
costs.

Cheers
Emil
> --disable-64-bit --enable-32-bit \
> --enable-texture-float \
> --with-gallium-drivers=r600,swrast,radeonsi \
> --with-dri-drivers="" \
> --enable-vdpau \
> --enable-egl --enable-gles1 --enable-gles2 \
> --enable-glx-tls \
> --with-egl-platforms=x11,drm \
> --enable-gbm \
> --enable-gallium-egl \
> --enable-gallium-llvm \
> --disable-r600-llvm-compiler \
> --disable-dri3 \
> --enable-opencl \
> --enable-shared-glapi \
> --enable-gallium-osmesa \
> --disable-llvm-shared-libs
> 
> 
> 
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
> 



[Mesa-dev] [PATCH 0/3] i965: Add runtime checks for line antialiasing in Gen < 6

2014-05-27 Thread Iago Toral Quiroga
https://bugs.freedesktop.org/show_bug.cgi?id=78679

The c->runtime_check_aads_emit field has been unused since the removal of the
old ARB_fragment_shader backend in commit
098acf6c84333edbb7b1228545e4bdb2572ee0cd.

This field was relevant in Gen < 6 to do proper rendering of polygons in a
scenario where line antialiasing is enabled and one of the polygon faces
is rendered in GL_LINE mode while the other remains GL_FILL.

Currently, this scenario is broken in gm45 and ironlake (although ironlake
was broken before that commit too). In particular, the GL_FILL face of the
polygon renders incorrectly with noise and wrong colors. Line color
interpolation also seems to be incorrect in some cases although I don't know
if this is related to the removed code or is a completely different issue.

There is a test case attached to the bug report that showcases the problem
in Gen4 and Gen5.

This patch series fixes the following issues in this scenario:
* In Gen5: Fixes incorrect rendering of the polygon's GL_FILL face.
* In Gen4: Removes noise and incorrect coloring of the polygon's GL_FILL face.

The following issues remain (would need further investigation):
* In Gen5 and Gen4: color interpolation in GL_LINE faces is not correct (color
is not interpolated in most cases). This seems to be unrelated to antialiasing
settings although behavior is improved when AA is enabled with these patches
(some lines do interpolate in some cases).
* In Gen4: the GL_FILL face's color is flat (does not interpolate). This is
unrelated to antialiasing settings, and happens every time there is a face
being rendered in GL_LINE and another in GL_FILL.

Patch 1: fixes possible crashes when processing code streams that end in a
block structure. This popped up while testing since the second patch creates
this situation.
Patch 2: Checks runtime conditions for proper AA setup in Gen < 6 when doing
framebuffer writes.
Patch 3: Saves unnecessary MOV in some cases involving AA in Gen < 6.
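
To make the runtime check in patch 2 easier to follow, here is a CPU-side sketch of the decision it encodes. The bit position (bit 26 of dword 6 of g1) and the base_mrf + 1 / mlen - 1 adjustment are taken from patch 2/3 of this series; the struct, the constant name and the function are hypothetical, and the real code emits shader instructions rather than evaluating anything on the CPU.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the two message parameters that change.
struct FbWriteParams {
   int base_mrf;
   int mlen;
};

// Bit tested by the AND/CMP pair in patch 2/3.
constexpr uint32_t AA_DATA_BIT = 1u << 26;

// Decide how the framebuffer write is issued when the runtime check is needed.
FbWriteParams select_fb_write(uint32_t g1_dw6, int base_mrf, int mlen)
{
   assert(mlen > 1);
   if ((g1_dw6 & AA_DATA_BIT) == 0) {
      // Bit clear: AA data is not sent, so the message starts one MRF
      // later and is one register shorter (the IF branch in the patch).
      return {base_mrf + 1, mlen - 1};
   }
   // Bit set: send the full message including the AA data (the ELSE branch).
   return {base_mrf, mlen};
}
```

For example, with base_mrf = 1 and mlen = 8, a clear bit gives {2, 7} and a set bit gives {1, 8}.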

Iago Toral Quiroga (3):
  i965: Always set a valid block end pointer
  i965: Add runtime checks for line antialiasing in Gen < 6.
  i965: Do not prepare antialiasing data if it is not required

 src/mesa/drivers/dri/i965/brw_cfg.cpp|   5 ++
 src/mesa/drivers/dri/i965/brw_fs.h   |   5 ++
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 107 ++-
 3 files changed, 81 insertions(+), 36 deletions(-)

-- 
1.9.1



[Mesa-dev] [PATCH 2/3] i965: Add runtime checks for line antialiasing in Gen < 6.

2014-05-27 Thread Iago Toral Quiroga
In Gen < 6 the hardware generates a runtime bit that indicates whether AA data
has to be sent as part of the framebuffer write SEND message. This affects the
specific case where we have set up antialiased line rendering and we render
polygons which have one face set up in GL_LINE mode (line antialiasing
will be used) and the other one in GL_FILL mode (no line antialiasing needed).

Currently we are not doing this runtime test and instead we always send AA
data, which produces incorrect rendering of the GL_FILL face of the polygon
in the aforementioned scenario (verified on ironlake and gm45).

In Gen4 this is, likely, a regression introduced with commit 098acf6c843. In
Gen5 this has never worked properly. Gen > 5 are not affected by this.

The patch fixes the problem by adding the appropriate runtime check and
adjusting the framebuffer write message accordingly in the conflicting
scenario (detected with fs_visitor::runtime_check_aads_emit == TRUE).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78679
---
 src/mesa/drivers/dri/i965/brw_fs.h   |  4 ++
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 86 +---
 2 files changed, 58 insertions(+), 32 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 60a4906..ab8912f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -452,6 +452,10 @@ public:
 
    void emit_color_write(int target, int index, int first_color_mrf);
    void emit_alpha_test();
+   void do_emit_fb_write(int target, int base_mrf, int mlen, bool eot,
+                         bool header_present);
+   void emit_fb_write(int target, int base_mrf, int mlen, bool eot,
+                      bool header_present);
    void emit_fb_writes();
 
    void emit_shader_time_begin();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 171f063..4c3897b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -2731,6 +2731,54 @@ fs_visitor::emit_alpha_test()
 }
 
 void
+fs_visitor::do_emit_fb_write(int target, int base_mrf, int mlen, bool eot,
+                             bool header_present)
+{
+   fs_inst *inst = emit(FS_OPCODE_FB_WRITE);
+   inst->target = target;
+   inst->base_mrf = base_mrf;
+   inst->mlen = mlen;
+   inst->eot = eot;
+   inst->header_present = header_present;
+   if ((brw->gen >= 8 || brw->is_haswell) && fp->UsesKill) {
+      inst->predicate = BRW_PREDICATE_NORMAL;
+      inst->flag_subreg = 1;
+   }
+}
+
+void
+fs_visitor::emit_fb_write(int target, int base_mrf, int mlen, bool eot,
+                          bool header_present)
+{
+   if (!runtime_check_aads_emit) {
+      do_emit_fb_write(target, base_mrf, mlen, eot, header_present);
+   } else {
+      /* This can only happen in Gen < 6
+       */
+      fs_reg reg_tmp_ud = fs_reg(this, glsl_type::uint_type);
+      emit(AND(reg_tmp_ud,
+               fs_reg(get_element_ud(brw_vec8_grf(1,0), 6)),
+               fs_reg(brw_imm_ud(1<<26))));
+      emit(CMP(reg_null_ud,
+               reg_tmp_ud,
+               fs_reg(brw_imm_ud(0)),
+               BRW_CONDITIONAL_Z));
+      emit(IF(BRW_PREDICATE_NORMAL));
+      {
+         /* Shift message header one register since we are not sending
+          * AA data stored in base_mrf+2
+          */
+         do_emit_fb_write(target, base_mrf + 1, mlen - 1, eot, header_present);
+      }
+      emit(BRW_OPCODE_ELSE);
+      {
+         do_emit_fb_write(target, base_mrf, mlen, eot, header_present);
+      }
+      emit(BRW_OPCODE_ENDIF);
+   }
+}
+
+void
 fs_visitor::emit_fb_writes()
 {
this->current_annotation = "FB write header";
@@ -2848,16 +2896,7 @@ fs_visitor::emit_fb_writes()
   if (INTEL_DEBUG & DEBUG_SHADER_TIME)
  emit_shader_time_end();
 
-  fs_inst *inst = emit(FS_OPCODE_FB_WRITE);
-  inst->target = 0;
-  inst->base_mrf = base_mrf;
-  inst->mlen = nr - base_mrf;
-  inst->eot = true;
-  inst->header_present = header_present;
-  if ((brw->gen >= 8 || brw->is_haswell) && fp->UsesKill) {
- inst->predicate = BRW_PREDICATE_NORMAL;
- inst->flag_subreg = 1;
-  }
+  emit_fb_write(0, base_mrf, nr - base_mrf, true, header_present);
 
   prog_data->dual_src_blend = true;
   this->current_annotation = NULL;
@@ -2894,19 +2933,10 @@ fs_visitor::emit_fb_writes()
 emit_shader_time_end();
   }
 
-  fs_inst *inst = emit(FS_OPCODE_FB_WRITE);
-  inst->target = target;
-  inst->base_mrf = base_mrf;
-  if (src0_alpha_to_render_target && target == 0)
- inst->mlen = nr - base_mrf - reg_width;
-  else
- inst->mlen = nr - base_mrf;
-  inst->eot = eot;
-  inst->header_present = header_present;
-  if ((brw->gen >= 8 || brw->is_haswell) && fp->UsesKill) {
- inst->predicate = BRW_PREDICATE_NORMAL;
- inst->flag

[Mesa-dev] [PATCH 1/3] i965: Always set a valid block end pointer

2014-05-27 Thread Iago Toral Quiroga
When an instruction stream ends in a block structure (like an IF/ELSE/ENDIF) the
last block's end pointer will not be set, leading to a crash later on in
fs_live_variables::setup_def_use().

If we have not assigned the end pointer of the last block, set it to the last
instruction.
---
 src/mesa/drivers/dri/i965/brw_cfg.cpp | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp b/src/mesa/drivers/dri/i965/brw_cfg.cpp
index 6bf99f1..d4647c4 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
@@ -257,6 +257,11 @@ cfg_t::cfg_t(exec_list *instructions)
       }
    }
 
+   /* If the instruction stream ended with a block structure we need to
+      set the block's end pointer to the last instruction here */
+   if (!cur->end)
+      cur->end = (backend_instruction *)instructions->get_tail();
+
    cur->end_ip = ip;
 
    make_block_array();
-- 
1.9.1
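
The situation this patch guards against can be reproduced with a toy block builder. This is a heavily simplified sketch and not the cfg_t implementation: instructions are plain enum values, blocks store indices instead of pointers, -1 plays the role of a NULL end pointer, and every name below is hypothetical.

```cpp
#include <cassert>
#include <vector>

enum class Op { ALU, IF, ELSE, ENDIF };

struct Block {
   int start = -1;
   int end = -1;  // -1 stands in for an unset (NULL) end pointer
};

// Split an instruction stream into blocks at every control-flow instruction.
std::vector<Block> build_blocks(const std::vector<Op> &insts)
{
   assert(!insts.empty());
   std::vector<Block> blocks(1);
   blocks[0].start = 0;

   for (int ip = 0; ip < static_cast<int>(insts.size()); ++ip) {
      if (insts[ip] != Op::ALU) {
         // A control-flow instruction closes the current block and opens
         // a new one that begins at the following instruction.
         blocks.back().end = ip;
         Block next;
         next.start = ip + 1;
         blocks.push_back(next);
      }
   }

   // Counterpart of the patch: the trailing block may never have been
   // closed (for instance when the stream ends in ENDIF), so give it a
   // valid end pointing at the last instruction.
   Block &cur = blocks.back();
   if (cur.end < 0)
      cur.end = static_cast<int>(insts.size()) - 1;
   return blocks;
}
```

Without the final fallback, a stream such as ALU, IF, ALU, ENDIF would leave the last block with an unset end, which is the condition that a later pass would trip over.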



[Mesa-dev] [PATCH 3/3] i965: Do not prepare antialiasing data if it is not required

2014-05-27 Thread Iago Toral Quiroga
In Gen < 6 AA data will or will not be sent as part of the framebuffer write
SEND message depending on a runtime condition, so don't bother moving AA
data to the corresponding MRF register until we know that we need to send it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78679
---
 src/mesa/drivers/dri/i965/brw_fs.h   |  1 +
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 23 ++-
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index ab8912f..351d0b6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -452,6 +452,7 @@ public:
 
    void emit_color_write(int target, int index, int first_color_mrf);
    void emit_alpha_test();
+   void emit_aa(int mrf_aa_reg);
    void do_emit_fb_write(int target, int base_mrf, int mlen, bool eot,
                          bool header_present);
    void emit_fb_write(int target, int base_mrf, int mlen, bool eot,
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 4c3897b..de1726f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -2731,6 +2731,17 @@ fs_visitor::emit_alpha_test()
 }
 
 void
+fs_visitor::emit_aa(int mrf_aa_reg)
+{
+   if (payload.aa_dest_stencil_reg) {
+      push_force_uncompressed();
+      emit(MOV(fs_reg(MRF, mrf_aa_reg),
+               fs_reg(brw_vec8_grf(payload.aa_dest_stencil_reg, 0))));
+      pop_force_uncompressed();
+   }
+}
+
+void
 fs_visitor::do_emit_fb_write(int target, int base_mrf, int mlen, bool eot,
                              bool header_present)
 {
@@ -2751,6 +2762,7 @@ fs_visitor::emit_fb_write(int target, int base_mrf, int mlen, bool eot,
                           bool header_present)
 {
    if (!runtime_check_aads_emit) {
+      emit_aa(base_mrf + 2);
       do_emit_fb_write(target, base_mrf, mlen, eot, header_present);
    } else {
       /* This can only happen in Gen < 6
@@ -2766,12 +2778,13 @@ fs_visitor::emit_fb_write(int target, int base_mrf, int mlen, bool eot,
       emit(IF(BRW_PREDICATE_NORMAL));
       {
          /* Shift message header one register since we are not sending
-          * AA data stored in base_mrf+2
+          * AA data in base_mrf+2
           */
          do_emit_fb_write(target, base_mrf + 1, mlen - 1, eot, header_present);
       }
       emit(BRW_OPCODE_ELSE);
       {
+         emit_aa(base_mrf + 2);
          do_emit_fb_write(target, base_mrf, mlen, eot, header_present);
       }
       emit(BRW_OPCODE_ENDIF);
@@ -2819,11 +2832,11 @@ fs_visitor::emit_fb_writes()
   nr += 2;
}
 
+   /* AA data. Depending on runtime conditions we might not need to send it
+* but we reserve space for it for now.
+*/
if (payload.aa_dest_stencil_reg) {
-  push_force_uncompressed();
-  emit(MOV(fs_reg(MRF, nr++),
-   fs_reg(brw_vec8_grf(payload.aa_dest_stencil_reg, 0))));
-  pop_force_uncompressed();
+  nr += 1;
}
 
prog_data->uses_omask =
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 78914] [llvmpipe] Front/Backfaces do not cover the same pixels when rasterized

2014-05-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=78914

--- Comment #19 from Roland Scheidegger  ---
The depth test as such is as accurate as it could be. Doing interpolation with
as much precision as possible is not all that easy due to the properties of
floating-point arithmetic. In particular, the order of the vertices matters for
the math. Reordering would be possible, though it still does not guarantee the
same results for fragments along a shared edge (unless the triangles share all
vertices, that is, it's really the same triangle with reordered edges).
I agree that doing better would be nice; I'm just not entirely sure what clever
tricks are needed to achieve this.
I believe there's also a slight bug in the implementation: the interpolation
should be done with snapped (fixed-point) coordinates, but we do the
interpolation setup with float coordinates. I'm not sure this would help
here, but in contrast to other interpolation issues this one wouldn't
be all that difficult to fix. Another issue is that if you have attributes
with large gradients on a somewhat small triangle, the errors can become huge
the further the triangle is from the framebuffer origin. So interpolation is
definitely not perfect.
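
As a standalone illustration of the ordering problem (this is not llvmpipe
code, and the helper names are hypothetical), summing the same three floats
in a different order can give a different result:

```c
/* Sum three floats as (a + b) + c. The volatile temporaries force every
 * intermediate result to be rounded to single precision. */
static float
sum_left(float a, float b, float c)
{
   volatile float t = a + b;
   volatile float r = t + c;
   return r;
}

/* Sum the same three floats as a + (b + c). */
static float
sum_right(float a, float b, float c)
{
   volatile float t = b + c;
   volatile float r = a + t;
   return r;
}
```

For a = 1e8f, b = -1e8f, c = 1.0f, sum_left returns 1.0f while sum_right
returns 0.0f, because -1e8f + 1.0f rounds back to -1e8f in single precision.
A weighted sum of per-vertex attributes accumulated in vertex order is
exposed to the same effect.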



[Mesa-dev] [PATCH] meta/blit: Use gl_FragColor also in the msaa blit shader

2014-05-27 Thread Topi Pohjolainen
Fixes framebuffer_blit_functionality_multisampled_to_singlesampled_blit
es3 cts test on bdw. Also fixes this on ivb when ivb is forced to use
the meta path.

No piglit regressions on IVB.

Signed-off-by: Topi Pohjolainen 
Cc: Eric Anholt 
Cc: Matt Turner 
Cc: Kenneth Graunke 
Cc: Anuj Phogat 
Cc: "10.2" 
---
 src/mesa/drivers/common/meta_blit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/common/meta_blit.c 
b/src/mesa/drivers/common/meta_blit.c
index 84594d1..5929619 100644
--- a/src/mesa/drivers/common/meta_blit.c
+++ b/src/mesa/drivers/common/meta_blit.c
@@ -273,7 +273,7 @@ setup_glsl_msaa_blit_shader(struct gl_context *ctx,
samples);
  } else {
 ralloc_asprintf_append(&sample_resolve,
-   "   out_color = sample_%d_0 / %f;\n",
+   "   gl_FragColor = sample_%d_0 / %f;\n",
samples, (float)samples);
  }
   }
-- 
1.8.3.1



[Mesa-dev] [PATCH 1/3] vl: add interface for non-referenced frames

2014-05-27 Thread Leo Liu
Signed-off-by: Leo Liu 
---
 src/gallium/include/pipe/p_video_state.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/include/pipe/p_video_state.h 
b/src/gallium/include/pipe/p_video_state.h
index 0256a8f..6621dbd 100644
--- a/src/gallium/include/pipe/p_video_state.h
+++ b/src/gallium/include/pipe/p_video_state.h
@@ -371,6 +371,8 @@ struct pipe_h264_enc_picture_desc
unsigned pic_order_cnt;
unsigned ref_idx_l0;
unsigned ref_idx_l1;
+
+   bool not_referenced;
 };
 
 #ifdef __cplusplus
-- 
1.9.1



[Mesa-dev] [PATCH 3/3] st/omx/enc: implement restricted b frames pattern

2014-05-27 Thread Leo Liu
Signed-off-by: Leo Liu 
---
 src/gallium/state_trackers/omx/vid_enc.c | 11 +--
 src/gallium/state_trackers/omx/vid_enc.h |  1 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/gallium/state_trackers/omx/vid_enc.c 
b/src/gallium/state_trackers/omx/vid_enc.c
index ee31452..e64928b 100644
--- a/src/gallium/state_trackers/omx/vid_enc.c
+++ b/src/gallium/state_trackers/omx/vid_enc.c
@@ -259,6 +259,7 @@ static OMX_ERRORTYPE vid_enc_Constructor(OMX_COMPONENTTYPE 
*comp, OMX_STRING nam
priv->force_pic_type.IntraRefreshVOP = OMX_FALSE; 
priv->frame_num = 0;
priv->pic_order_cnt = 0;
+   priv->restricted_b_frames = 
debug_get_bool_option("OMX_USE_RESTRICTED_B_FRAMES", FALSE);
 
priv->scale.xWidth = OMX_VID_ENC_SCALING_WIDTH_DEFAULT;
priv->scale.xHeight = OMX_VID_ENC_SCALING_WIDTH_DEFAULT;
@@ -994,6 +995,8 @@ static void enc_HandleTask(omx_base_PortType *port, struct 
encode_task *task,
 
picture.picture_type = picture_type;
picture.pic_order_cnt = task->pic_order_cnt;
+   if (priv->restricted_b_frames && picture_type == 
PIPE_H264_ENC_PICTURE_TYPE_B)
+  picture.not_referenced = true;
enc_ControlPicture(port, &picture);
 
/* -- encode frame - */
@@ -1023,7 +1026,9 @@ static void enc_ClearBframes(omx_base_PortType *port, 
struct input_buf_private *
/* handle B frames */
LIST_FOR_EACH_ENTRY(task, &priv->b_frames, list) {
   enc_HandleTask(port, task, PIPE_H264_ENC_PICTURE_TYPE_B);
-  priv->ref_idx_l0 = priv->frame_num++;
+  if (!priv->restricted_b_frames)
+ priv->ref_idx_l0 = priv->frame_num;
+  priv->frame_num++;
}
 
enc_MoveTasks(&priv->b_frames, &inp->tasks);
@@ -1091,7 +1096,9 @@ static OMX_ERRORTYPE 
vid_enc_EncodeFrame(omx_base_PortType *port, OMX_BUFFERHEAD
   /* handle B frames */
   LIST_FOR_EACH_ENTRY(task, &priv->b_frames, list) {
  enc_HandleTask(port, task, PIPE_H264_ENC_PICTURE_TYPE_B);
- priv->ref_idx_l0 = priv->frame_num++;
+ if (!priv->restricted_b_frames)
+priv->ref_idx_l0 = priv->frame_num;
+ priv->frame_num++;
   }
 
   enc_MoveTasks(&priv->b_frames, &inp->tasks);
diff --git a/src/gallium/state_trackers/omx/vid_enc.h 
b/src/gallium/state_trackers/omx/vid_enc.h
index 22f276f..d0350d6 100644
--- a/src/gallium/state_trackers/omx/vid_enc.h
+++ b/src/gallium/state_trackers/omx/vid_enc.h
@@ -77,6 +77,7 @@ DERIVEDCLASS(vid_enc_PrivateType, omx_base_filter_PrivateType)
OMX_U32 frame_num; \
OMX_U32 pic_order_cnt; \
OMX_U32 ref_idx_l0, ref_idx_l1; \
+   OMX_BOOL restricted_b_frames; \
OMX_VIDEO_PARAM_BITRATETYPE bitrate; \
OMX_VIDEO_PARAM_QUANTIZATIONTYPE quant; \
OMX_CONFIG_INTRAREFRESHVOPTYPE force_pic_type; \
-- 
1.9.1



[Mesa-dev] [PATCH 2/3] radeon/vce: implement non-referenced frames

2014-05-27 Thread Leo Liu
Signed-off-by: Leo Liu 
---
 src/gallium/drivers/radeon/radeon_vce.c| 6 --
 src/gallium/drivers/radeon/radeon_vce_40_2_2.c | 2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c 
b/src/gallium/drivers/radeon/radeon_vce.c
index 222f32e..81e62d3 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -231,11 +231,13 @@ static void rvce_end_frame(struct pipe_video_codec 
*encoder,
flush(enc);
 
/* update the CPB backtrack with the just encoded frame */
-   LIST_DEL(&slot->list);
slot->picture_type = enc->pic.picture_type;
slot->frame_num = enc->pic.frame_num;
slot->pic_order_cnt = enc->pic.pic_order_cnt;
-   LIST_ADD(&slot->list, &enc->cpb_slots);
+   if (!enc->pic.not_referenced) {
+   LIST_DEL(&slot->list);
+   LIST_ADD(&slot->list, &enc->cpb_slots);
+   }
 }
 
 static void rvce_get_feedback(struct pipe_video_codec *encoder,
diff --git a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c 
b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
index 3b67b31..3010c5b 100644
--- a/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
+++ b/src/gallium/drivers/radeon/radeon_vce_40_2_2.c
@@ -283,7 +283,7 @@ static void encode(struct rvce_encoder *enc)
RVCE_CS(enc->pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR); // 
encIdrFlag
RVCE_CS(0x00000000); // encIdrPicId
RVCE_CS(0x00000000); // encMGSKeyPic
-   RVCE_CS(0x00000001); // encReferenceFlag
+   RVCE_CS(!enc->pic.not_referenced); // encReferenceFlag
RVCE_CS(0x00000000); // encTemporalLayerIndex
RVCE_CS(0x00000000); // num_ref_idx_active_override_flag
RVCE_CS(0x00000000); // num_ref_idx_l0_active_minus1
-- 
1.9.1



[Mesa-dev] [PATCH 3/4] glsl/tests: call create_test_cases.py in optimization-test

2014-05-27 Thread Connor Abbott
This way, when someone modifies create_test_cases.py and forgets to
commit their changes again, people will notice.

Signed-off-by: Connor Abbott 
---
 src/glsl/tests/optimization-test | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/glsl/tests/optimization-test b/src/glsl/tests/optimization-test
index 8ca7776..d58e335 100755
--- a/src/glsl/tests/optimization-test
+++ b/src/glsl/tests/optimization-test
@@ -9,6 +9,13 @@ fi
 total=0
 pass=0
 
+echo "==   Generating tests  =="
+for dir in */; do
+if [ -e "$dir/create_test_cases.py" ]; then
+cd $dir; python create_test_cases.py; cd ..
+fi
+done
+
 echo "== Testing optimization passes =="
 for test in `find . -iname '*.opt_test'`; do
 echo -n "Testing $test..."
-- 
1.8.3.1



[Mesa-dev] [PATCH 1/4] glsl: be more consistent about printing constants

2014-05-27 Thread Connor Abbott
Make sure that we print the same number of digits when printing 0.0 as
any other floating-point number. This will make generating expected
output files for tests easier. To avoid breaking "make check," update
the generated tests for lower_jumps before the next commit which will
bring create_test_cases.py in line with them.
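
To illustrate the difference, here is a minimal standalone sketch (not the
ir_print_visitor code; the helper names are made up for this example)
comparing the two format strings used for zero:

```c
#include <stddef.h>
#include <stdio.h>

/* Old behaviour for zero constants: one digit after the decimal point. */
static void
format_zero_old(char *buf, size_t size, float value)
{
   snprintf(buf, size, "%.1f", value);
}

/* New behaviour: plain %f, i.e. six digits after the decimal point,
 * the same as any other constant printed with %f. */
static void
format_zero_new(char *buf, size_t size, float value)
{
   snprintf(buf, size, "%f", value);
}
```

With these helpers, 0.0f formats as "0.0" with the old format string and as
"0.000000" with the new one, and -0.0f keeps its sign ("-0.000000").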

Signed-off-by: Connor Abbott 
---
 src/glsl/ir_print_visitor.cpp  |  2 +-
 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test |  3 +--
 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected|  3 +--
 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected|  2 +-
 src/glsl/tests/lower_jumps/lower_breaks_3.opt_test.expected|  2 +-
 src/glsl/tests/lower_jumps/lower_breaks_4.opt_test.expected|  2 +-
 src/glsl/tests/lower_jumps/lower_breaks_5.opt_test.expected|  2 +-
 src/glsl/tests/lower_jumps/lower_breaks_6.opt_test.expected| 10 +-
 .../lower_guarded_conditional_break.opt_test.expected  |  6 +++---
 .../tests/lower_jumps/lower_pulled_out_jump.opt_test.expected  |  8 
 src/glsl/tests/lower_jumps/lower_returns_3.opt_test.expected   |  4 ++--
 src/glsl/tests/lower_jumps/lower_returns_4.opt_test.expected   |  2 +-
 .../lower_jumps/lower_returns_main_false.opt_test.expected |  4 ++--
 .../lower_jumps/lower_returns_main_true.opt_test.expected  |  4 ++--
 .../lower_jumps/lower_returns_sub_false.opt_test.expected  |  4 ++--
 .../tests/lower_jumps/lower_returns_sub_true.opt_test.expected |  4 ++--
 .../tests/lower_jumps/lower_unified_returns.opt_test.expected  |  8 
 .../tests/lower_jumps/remove_continue_at_end_of_loop.opt_test  |  3 +--
 .../remove_continue_at_end_of_loop.opt_test.expected   |  3 +--
 .../return_void_at_end_of_loop_lower_nothing.opt_test  |  3 +--
 .../return_void_at_end_of_loop_lower_nothing.opt_test.expected |  3 +--
 .../return_void_at_end_of_loop_lower_return.opt_test   |  3 +--
 .../return_void_at_end_of_loop_lower_return_and_break.opt_test |  3 +--
 23 files changed, 40 insertions(+), 48 deletions(-)

diff --git a/src/glsl/ir_print_visitor.cpp b/src/glsl/ir_print_visitor.cpp
index 0a7695a..a3d851e 100644
--- a/src/glsl/ir_print_visitor.cpp
+++ b/src/glsl/ir_print_visitor.cpp
@@ -430,7 +430,7 @@ void ir_print_visitor::visit(ir_constant *ir)
 case GLSL_TYPE_FLOAT:
 if (ir->value.f[i] == 0.0f)
/* 0.0 == -0.0, so print with %f to get the proper sign. */
-   fprintf(f, "%.1f", ir->value.f[i]);
+   fprintf(f, "%f", ir->value.f[i]);
 else if (fabs(ir->value.f[i]) < 0.01f)
fprintf(f, "%a", ir->value.f[i]);
 else if (fabs(ir->value.f[i]) > 100.0f)
diff --git a/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test 
b/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test
index b412ba8..e2d4ed1 100755
--- a/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test
+++ b/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test
@@ -8,6 +8,5 @@
 ((declare (out) float a)
  (function main
   (signature void (parameters)
-   ((loop
- ((assign (x) (var_ref a) (constant float (1.000000))) break))
+   ((loop ((assign (x) (var_ref a) (constant float (1.000000))) break))
 EOF
diff --git a/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected 
b/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected
index 56ef3e4..270a43d 100644
--- a/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected
+++ b/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected
@@ -1,5 +1,4 @@
 ((declare (out) float a)
  (function main
   (signature void (parameters)
-   ((loop
- ((assign (x) (var_ref a) (constant float (1.000000))) break))
+   ((loop ((assign (x) (var_ref a) (constant float (1.000000))) break))
diff --git a/src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected 
b/src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected
index dc231f9..73a1d56 100644
--- a/src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected
+++ b/src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected
@@ -3,5 +3,5 @@
   (signature void (parameters)
((loop
  ((assign (x) (var_ref a) (constant float (1.000000)))
-  (if (expression bool > (var_ref b) (constant float (0.0))) (break)
+  (if (expression bool > (var_ref b) (constant float (0.000000))) (break)
(
diff --git a/src/glsl/tests/lower_jumps/lower_breaks_3.opt_test.expected 
b/src/glsl/tests/lower_jumps/lower_breaks_3.opt_test.expected
index 8131b66..53d5392 100644
--- a/src/glsl/tests/lower_jumps/lower_breaks_3.opt_test.expected
+++ b/src/glsl/tests/lower_jumps/lower_breaks_3.opt_test.expected
@@ -3,6 +3,6 @@
   (signature void (parameters)
((loop
  ((assign (x) (var_ref a) (constant float (1.000000)))
-  (if (expression bool > (var_ref b) (constant float (0.0)))
+  (if (expression bool > (var_ref b) (constant float (0.000000)))
((assign (x) (

[Mesa-dev] [PATCH 2/4] glsl/tests/lower_jumps: fix generated sexpr's for loops

2014-05-27 Thread Connor Abbott
In 088494aa (as well as other commits in the series) Paul Berry modified
the tests for lower_jumps to account for the fact that the s-expression
for the loop IR instruction changed from
(loop () () () () (statements...)) to (loop (statements...)), but he
forgot to update create_test_cases.py which he used to create the tests.
Fix that, so that now create_test_cases.py is synced with the generated
tests.

Signed-off-by: Connor Abbott 
---
 src/glsl/tests/lower_jumps/create_test_cases.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/tests/lower_jumps/create_test_cases.py 
b/src/glsl/tests/lower_jumps/create_test_cases.py
index 9974681..3be1079 100644
--- a/src/glsl/tests/lower_jumps/create_test_cases.py
+++ b/src/glsl/tests/lower_jumps/create_test_cases.py
@@ -126,7 +126,7 @@ def loop(statements):
 body.
 """
 check_sexp(statements)
-return [['loop', [], [], [], [], statements]]
+return [['loop', statements]]
 
 def declare_temp(var_type, var_name):
 """Create a declaration of the form
-- 
1.8.3.1



[Mesa-dev] [PATCH 4/4] glsl/tests: remove generated tests from the repo

2014-05-27 Thread Connor Abbott
They were made unnecessary by the last commit.

Signed-off-by: Connor Abbott 
---
 src/glsl/tests/lower_jumps/.gitignore  |  2 ++
 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test | 12 -
 .../lower_jumps/lower_breaks_1.opt_test.expected   |  4 ---
 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test | 15 ---
 .../lower_jumps/lower_breaks_2.opt_test.expected   |  7 --
 src/glsl/tests/lower_jumps/lower_breaks_3.opt_test | 17 -
 .../lower_jumps/lower_breaks_3.opt_test.expected   |  8 --
 src/glsl/tests/lower_jumps/lower_breaks_4.opt_test | 15 ---
 .../lower_jumps/lower_breaks_4.opt_test.expected   |  7 --
 src/glsl/tests/lower_jumps/lower_breaks_5.opt_test | 16 
 .../lower_jumps/lower_breaks_5.opt_test.expected   |  7 --
 src/glsl/tests/lower_jumps/lower_breaks_6.opt_test | 29 --
 .../lower_jumps/lower_breaks_6.opt_test.expected   | 29 --
 .../lower_guarded_conditional_break.opt_test   | 21 
 ...wer_guarded_conditional_break.opt_test.expected | 20 ---
 .../lower_jumps/lower_pulled_out_jump.opt_test | 28 -
 .../lower_pulled_out_jump.opt_test.expected| 25 ---
 .../tests/lower_jumps/lower_returns_1.opt_test | 12 -
 .../lower_jumps/lower_returns_1.opt_test.expected  |  4 ---
 .../tests/lower_jumps/lower_returns_2.opt_test | 13 --
 .../lower_jumps/lower_returns_2.opt_test.expected  |  5 
 .../tests/lower_jumps/lower_returns_3.opt_test | 20 ---
 .../lower_jumps/lower_returns_3.opt_test.expected  | 21 
 .../tests/lower_jumps/lower_returns_4.opt_test | 14 ---
 .../lower_jumps/lower_returns_4.opt_test.expected  | 16 
 .../lower_jumps/lower_returns_main_false.opt_test  | 17 -
 .../lower_returns_main_false.opt_test.expected |  8 --
 .../lower_jumps/lower_returns_main_true.opt_test   | 17 -
 .../lower_returns_main_true.opt_test.expected  | 13 --
 .../lower_jumps/lower_returns_sub_false.opt_test   | 16 
 .../lower_returns_sub_false.opt_test.expected  |  8 --
 .../lower_jumps/lower_returns_sub_true.opt_test| 16 
 .../lower_returns_sub_true.opt_test.expected   | 13 --
 .../lower_jumps/lower_unified_returns.opt_test | 26 ---
 .../lower_unified_returns.opt_test.expected| 21 
 .../remove_continue_at_end_of_loop.opt_test| 12 -
 ...emove_continue_at_end_of_loop.opt_test.expected |  4 ---
 ..._non_void_at_end_of_loop_lower_nothing.opt_test | 16 
 ..._at_end_of_loop_lower_nothing.opt_test.expected |  8 --
 ...n_non_void_at_end_of_loop_lower_return.opt_test | 16 
 ...d_at_end_of_loop_lower_return.opt_test.expected | 19 --
 ..._at_end_of_loop_lower_return_and_break.opt_test | 16 
 ...f_loop_lower_return_and_break.opt_test.expected | 19 --
 ...turn_void_at_end_of_loop_lower_nothing.opt_test | 13 --
 ..._at_end_of_loop_lower_nothing.opt_test.expected |  5 
 ...eturn_void_at_end_of_loop_lower_return.opt_test | 13 --
 ...d_at_end_of_loop_lower_return.opt_test.expected | 11 
 ..._at_end_of_loop_lower_return_and_break.opt_test | 13 --
 ...f_loop_lower_return_and_break.opt_test.expected | 11 
 49 files changed, 2 insertions(+), 696 deletions(-)
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_3.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_3.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_4.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_4.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_5.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_5.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_6.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_6.opt_test.expected
 delete mode 100755 
src/glsl/tests/lower_jumps/lower_guarded_conditional_break.opt_test
 delete mode 100644 
src/glsl/tests/lower_jumps/lower_guarded_conditional_break.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_pulled_out_jump.opt_test
 delete mode 100644 
src/glsl/tests/lower_jumps/lower_pulled_out_jump.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_returns_1.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_returns_1.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_returns_2.opt_test
 delete m

[Mesa-dev] [PATCH 0/4] glsl/tests: remove generated files

2014-05-27 Thread Connor Abbott
While trying to modify the lower_jumps unit tests to account for my SSA
changes, I realized that the tests were not in sync with the file that
generated them. There were two problems:

-The *.expected files all had the same number of digits after the
decimal place (6) whereas the *.out files had 1 digit in "0.0" and 6
digits in "1.000000" when printing constants, which led to failures due
to diffs like:

-   ((if (expression bool > (var_ref b) (constant float (0.000000)))
+   ((if (expression bool > (var_ref b) (constant float (0.0)))


-Loops were incorrect in the input files.

I fixed both problems, and then I removed the generated tests so that
stuff like this won't happen again.

Connor Abbott (4):
  glsl: be more consistent about printing constants
  glsl/tests/lower_jumps: fix generated sexpr's for loops
  glsl/tests: call create_test_cases.py in optimization-test
  glsl/tests: remove generated tests from the repo

 src/glsl/ir_print_visitor.cpp  |  2 +-
 src/glsl/tests/lower_jumps/.gitignore  |  2 ++
 src/glsl/tests/lower_jumps/create_test_cases.py|  2 +-
 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test | 13 --
 .../lower_jumps/lower_breaks_1.opt_test.expected   |  5 
 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test | 15 ---
 .../lower_jumps/lower_breaks_2.opt_test.expected   |  7 --
 src/glsl/tests/lower_jumps/lower_breaks_3.opt_test | 17 -
 .../lower_jumps/lower_breaks_3.opt_test.expected   |  8 --
 src/glsl/tests/lower_jumps/lower_breaks_4.opt_test | 15 ---
 .../lower_jumps/lower_breaks_4.opt_test.expected   |  7 --
 src/glsl/tests/lower_jumps/lower_breaks_5.opt_test | 16 
 .../lower_jumps/lower_breaks_5.opt_test.expected   |  7 --
 src/glsl/tests/lower_jumps/lower_breaks_6.opt_test | 29 --
 .../lower_jumps/lower_breaks_6.opt_test.expected   | 29 --
 .../lower_guarded_conditional_break.opt_test   | 21 
 ...wer_guarded_conditional_break.opt_test.expected | 20 ---
 .../lower_jumps/lower_pulled_out_jump.opt_test | 28 -
 .../lower_pulled_out_jump.opt_test.expected| 25 ---
 .../tests/lower_jumps/lower_returns_1.opt_test | 12 -
 .../lower_jumps/lower_returns_1.opt_test.expected  |  4 ---
 .../tests/lower_jumps/lower_returns_2.opt_test | 13 --
 .../lower_jumps/lower_returns_2.opt_test.expected  |  5 
 .../tests/lower_jumps/lower_returns_3.opt_test | 20 ---
 .../lower_jumps/lower_returns_3.opt_test.expected  | 21 
 .../tests/lower_jumps/lower_returns_4.opt_test | 14 ---
 .../lower_jumps/lower_returns_4.opt_test.expected  | 16 
 .../lower_jumps/lower_returns_main_false.opt_test  | 17 -
 .../lower_returns_main_false.opt_test.expected |  8 --
 .../lower_jumps/lower_returns_main_true.opt_test   | 17 -
 .../lower_returns_main_true.opt_test.expected  | 13 --
 .../lower_jumps/lower_returns_sub_false.opt_test   | 16 
 .../lower_returns_sub_false.opt_test.expected  |  8 --
 .../lower_jumps/lower_returns_sub_true.opt_test| 16 
 .../lower_returns_sub_true.opt_test.expected   | 13 --
 .../lower_jumps/lower_unified_returns.opt_test | 26 ---
 .../lower_unified_returns.opt_test.expected| 21 
 .../remove_continue_at_end_of_loop.opt_test| 13 --
 ...emove_continue_at_end_of_loop.opt_test.expected |  5 
 ..._non_void_at_end_of_loop_lower_nothing.opt_test | 16 
 ..._at_end_of_loop_lower_nothing.opt_test.expected |  8 --
 ...n_non_void_at_end_of_loop_lower_return.opt_test | 16 
 ...d_at_end_of_loop_lower_return.opt_test.expected | 19 --
 ..._at_end_of_loop_lower_return_and_break.opt_test | 16 
 ...f_loop_lower_return_and_break.opt_test.expected | 19 --
 ...turn_void_at_end_of_loop_lower_nothing.opt_test | 14 ---
 ..._at_end_of_loop_lower_nothing.opt_test.expected |  6 -
 ...eturn_void_at_end_of_loop_lower_return.opt_test | 14 ---
 ...d_at_end_of_loop_lower_return.opt_test.expected | 11 
 ..._at_end_of_loop_lower_return_and_break.opt_test | 14 ---
 ...f_loop_lower_return_and_break.opt_test.expected | 11 
 src/glsl/tests/optimization-test   |  7 ++
 52 files changed, 11 insertions(+), 706 deletions(-)
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_3.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_3.opt_

Re: [Mesa-dev] [PATCH 1/3] vl: add interface for non-referenced frames

2014-05-27 Thread Christian König

On 27.05.2014 16:12, Leo Liu wrote:

Signed-off-by: Leo Liu 


Reviewed and pushed upstream.

Thanks,
Christian.


---
  src/gallium/include/pipe/p_video_state.h | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/src/gallium/include/pipe/p_video_state.h 
b/src/gallium/include/pipe/p_video_state.h
index 0256a8f..6621dbd 100644
--- a/src/gallium/include/pipe/p_video_state.h
+++ b/src/gallium/include/pipe/p_video_state.h
@@ -371,6 +371,8 @@ struct pipe_h264_enc_picture_desc
 unsigned pic_order_cnt;
 unsigned ref_idx_l0;
 unsigned ref_idx_l1;
+
+   bool not_referenced;
  };
  
  #ifdef __cplusplus




Re: [Mesa-dev] [PATCH 1/2] nvc0: add GK20A 3D class

2014-05-27 Thread Ilia Mirkin
On Tue, May 27, 2014 at 2:35 AM, Alexandre Courbot  wrote:
> On 05/27/2014 02:29 PM, Ilia Mirkin wrote:
>>
>> On Tue, May 27, 2014 at 12:59 AM, Alexandre Courbot 
>> wrote:
>>>
>>> GK20A is mostly compatible with GK104, but features a new 3D
>>> class. Add it to the relevant header and use it when GK20A is
>>> detected.
>>>
>>> Signed-off-by: Alexandre Courbot 
>>> ---
>>>   src/gallium/drivers/nouveau/nv_object.xml.h| 1 +
>>>   src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 9 -
>>>   2 files changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/gallium/drivers/nouveau/nv_object.xml.h
>>> b/src/gallium/drivers/nouveau/nv_object.xml.h
>>> index 4c93e6564838..0a0e187dc028 100644
>>> --- a/src/gallium/drivers/nouveau/nv_object.xml.h
>>> +++ b/src/gallium/drivers/nouveau/nv_object.xml.h
>>> @@ -190,6 +190,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
>>> SOFTWARE.
>>>   #define NVC8_3D_CLASS
>>> 0x9297
>>>   #define NVE4_3D_CLASS
>>> 0xa097
>>>   #define NVF0_3D_CLASS
>>> 0xa197
>>> +#define NVEA_3D_CLASS
>>> 0xa297
>>>   #define GM107_3D_CLASS
>>> 0xb097
>>>   #define NV50_2D_CLASS
>>> 0x502d
>>>   #define NVC0_2D_CLASS
>>> 0x902d
>>> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>>> b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>>> index cccfe2bba23d..95e5ef81cd79 100644
>>> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>>> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>>> @@ -702,7 +702,14 @@ nvc0_screen_create(struct nouveau_device *dev)
>>> obj_class = NVF0_3D_CLASS;
>>> break;
>>>  case 0xe0:
>>> -  obj_class = NVE4_3D_CLASS;
>>> +  switch (dev->chipset) {
>>> +  case 0xea:
>>> + obj_class = NVEA_3D_CLASS;
>>
>>
>> Again, would be nice to be consistent with the way you set the ISA...
>> perhaps change this to a >= as well? But I guess the two could be
>> disconnected. Up to you, just thought I'd bring it up.
>
>
> Right below we have the following being done:
>
>  switch (dev->chipset) {
>   case 0xc8:
>  obj_class = NVC8_3D_CLASS;
>  break;
>   case 0xc1:
>  obj_class = NVC1_3D_CLASS;
>  break;
>   default:
>  obj_class = NVC0_3D_CLASS;
>  break;
>   }
>
> Shouldn't we try to be consistent with this more local example instead?

Which is why I didn't insist. The situation with nvcx is a little
different -- nvc8 (GF110) and nvc1 (GF108) are special, but e.g. nvce
(GF114) and nvcf (GF116) want the nvc0 class. OTOH you're using >=
0xea as the metric for selecting SM35, so I was just pointing out the
inconsistency. Of course there needn't be a 1:1 mapping between these
things, and the likelihood of another 0xex chipset being released is
fairly low. So:

Reviewed-by: Ilia Mirkin 
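
To make the difference between the two selection schemes concrete, here is
a small sketch. It is not the driver code: `select_3d_class` and
`uses_sm35_isa` are hypothetical helpers, only the 0xe0 and 0xf0 families
are covered, and the mapping of the 0xf0 family to NVF0_3D_CLASS is an
assumption based on the surrounding context of the quoted hunk. The class
IDs and the 0xea value are the ones quoted in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as quoted in this thread. */
#define NVE4_3D_CLASS       0xa097
#define NVF0_3D_CLASS       0xa197
#define NVEA_3D_CLASS       0xa297
#define NVISA_GK20A_CHIPSET 0xea

/* Hypothetical helper: the 3D class is picked by an exact match on the
 * chipset, as in patch 1/2. Returns 0 for families outside this sketch. */
static uint32_t
select_3d_class(uint32_t chipset)
{
   switch (chipset & ~0xfu) {
   case 0xf0:
      return NVF0_3D_CLASS;   /* assumption, see the note above */
   case 0xe0:
      return chipset == 0xea ? NVEA_3D_CLASS : NVE4_3D_CLASS;
   default:
      return 0;
   }
}

/* Hypothetical helper: the ISA is picked by a threshold comparison,
 * as in patch 2/2. */
static bool
uses_sm35_isa(uint32_t chipset)
{
   return chipset >= NVISA_GK20A_CHIPSET;
}
```

With this sketch, a hypothetical chipset 0xeb would get NVE4_3D_CLASS from
the exact match but the SM35 ISA from the threshold, which is the
inconsistency discussed above. Real chipsets 0xe4 and 0xea are handled the
same way by both schemes.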


Re: [Mesa-dev] [PATCH v2 2/2] nvc0: use SM35 ISA with GK20A

2014-05-27 Thread Ilia Mirkin
On Tue, May 27, 2014 at 3:03 AM, Alexandre Courbot  wrote:
> GK20A is mostly compatible with GK104, but uses the SM35 ISA. Use
> the GK110 path when this chip is detected.
>
> Signed-off-by: Alexandre Courbot 

Reviewed-by: Ilia Mirkin 

> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h  |  2 +-
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp |  2 +-
>  .../drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp   | 15 
> ++-
>  3 files changed, 12 insertions(+), 7 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> index bbb89d97932e..f829aac0bcc2 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
> @@ -91,7 +91,7 @@ struct nv50_ir_prog_symbol
>  #define NVISA_GF100_CHIPSET_C0 0xc0
>  #define NVISA_GF100_CHIPSET_D0 0xd0
>  #define NVISA_GK104_CHIPSET0xe0
> -#define NVISA_GK110_CHIPSET0xf0
> +#define NVISA_GK20A_CHIPSET0xea
>  #define NVISA_GM107_CHIPSET0x110
>
>  struct nv50_ir_prog_info
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> index b1f76cf80432..f69e6a183e19 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> @@ -3027,7 +3027,7 @@ TargetNVC0::createCodeEmitterNVC0(Program::Type type)
>  CodeEmitter *
>  TargetNVC0::getCodeEmitter(Program::Type type)
>  {
> -   if (chipset >= NVISA_GK110_CHIPSET)
> +   if (chipset >= NVISA_GK20A_CHIPSET)
>return createCodeEmitterGK110(type);
> return createCodeEmitterNVC0(type);
>  }
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> index 064e7a2c63f9..963b6e47ddfc 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> @@ -49,9 +49,12 @@ TargetNVC0::getBuiltinCode(const uint32_t **code, uint32_t 
> *size) const
>  {
> switch (chipset & ~0xf) {
> case 0xe0:
> -  *code = (const uint32_t *)&gk104_builtin_code[0];
> -  *size = sizeof(gk104_builtin_code);
> -  break;
> +  if (chipset < NVISA_GK20A_CHIPSET) {
> + *code = (const uint32_t *)&gk104_builtin_code[0];
> + *size = sizeof(gk104_builtin_code);
> + break;
> +  }
> +  /* fall-through for GK20A */
> case 0xf0:
> case 0x100:
>*code = (const uint32_t *)&gk110_builtin_code[0];
> @@ -71,7 +74,9 @@ TargetNVC0::getBuiltinOffset(int builtin) const
>
> switch (chipset & ~0xf) {
> case 0xe0:
> -  return gk104_builtin_offsets[builtin];
> +  if (chipset < NVISA_GK20A_CHIPSET)
> + return gk104_builtin_offsets[builtin];
> +  /* fall-through for GK20A */
> case 0xf0:
> case 0x100:
>return gk110_builtin_offsets[builtin];
> @@ -235,7 +240,7 @@ TargetNVC0::getFileSize(DataFile file) const
>  {
> switch (file) {
> case FILE_NULL:  return 0;
> -   case FILE_GPR:   return (chipset >= NVISA_GK110_CHIPSET) ? 255 : 
> 63;
> +   case FILE_GPR:   return (chipset >= NVISA_GK20A_CHIPSET) ? 255 : 
> 63;
> case FILE_PREDICATE: return 7;
> case FILE_FLAGS: return 1;
> case FILE_ADDRESS:   return 0;
> --
> 1.9.3
>
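
The builtin selection in the quoted patch can be restated as a small C sketch. The value 0xea for NVISA_GK20A_CHIPSET is an assumption (the define is not shown in the quoted hunks), and the enum names and the catch-all case are illustrative only, not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: the quoted hunks do not show this define. */
#define NVISA_GK20A_CHIPSET 0xea

/* Illustrative names, not driver identifiers. */
enum builtin_set { BUILTIN_OTHER, BUILTIN_GK104, BUILTIN_GK110 };

/* Same decision as the patched switch in getBuiltinCode(): chips in the
 * 0xe0 family below GK20A keep the GK104 builtins, while GK20A shares
 * the GK110 (SM35) builtins with the 0xf0 and 0x100 families. */
static enum builtin_set builtin_set_for(uint32_t chipset)
{
   uint32_t family = chipset & ~0xfu;

   assert(chipset != 0);

   if (family == 0xe0 && chipset < NVISA_GK20A_CHIPSET)
      return BUILTIN_GK104;
   if (family == 0xe0 || family == 0xf0 || family == 0x100)
      return BUILTIN_GK110;
   return BUILTIN_OTHER;
}
```

Under that assumption, builtin_set_for(0xe4) picks the GK104 set and builtin_set_for(0xea) picks the GK110 set, which is the behaviour the fall-through in the patch is meant to give.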
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] meta/blit: Use gl_FragColor also in the msaa blit shader

2014-05-27 Thread Anuj Phogat
On Tue, May 27, 2014 at 6:21 AM, Topi Pohjolainen
 wrote:
> Fixes framebuffer_blit_functionality_multisampled_to_singlesampled_blit
> es3 cts test on bdw. Also fixes this on ivb when ivb is forced to use
> the meta path.
>
> No piglit regressions on IVB.
>
> Signed-off-by: Topi Pohjolainen 
> Cc: Eric Anholt 
> Cc: Matt Turner 
> Cc: Kenneth Graunke 
> Cc: Anuj Phogat 
> Cc: "10.2" 
> ---
>  src/mesa/drivers/common/meta_blit.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/common/meta_blit.c 
> b/src/mesa/drivers/common/meta_blit.c
> index 84594d1..5929619 100644
> --- a/src/mesa/drivers/common/meta_blit.c
> +++ b/src/mesa/drivers/common/meta_blit.c
> @@ -273,7 +273,7 @@ setup_glsl_msaa_blit_shader(struct gl_context *ctx,
> samples);
>   } else {
>  ralloc_asprintf_append(&sample_resolve,
> -   "   out_color = sample_%d_0 / %f;\n",
> +   "   gl_FragColor = sample_%d_0 / %f;\n",
> samples, (float)samples);
>   }
>}
> --
> 1.8.3.1
>
This fixes msaa blits to multiple render targets for float buffers.

Reviewed-by: Anuj Phogat 


[Mesa-dev] [PATCH 7/8] gallium: create TGSI_PROPERTY to disable viewport and clipping

2014-05-27 Thread Marek Olšák
From: Christoph Bumiller 

Marek v2: add a cap

Signed-off-by: Marek Olšák 
---
 src/gallium/auxiliary/tgsi/tgsi_strings.c|  1 +
 src/gallium/auxiliary/tgsi/tgsi_ureg.c   | 16 
 src/gallium/auxiliary/tgsi/tgsi_ureg.h   |  4 
 src/gallium/docs/source/screen.rst   |  3 +++
 src/gallium/docs/source/tgsi.rst |  9 +
 src/gallium/drivers/freedreno/freedreno_screen.c |  1 +
 src/gallium/drivers/i915/i915_screen.c   |  1 +
 src/gallium/drivers/ilo/ilo_screen.c |  1 +
 src/gallium/drivers/llvmpipe/lp_screen.c |  1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   |  1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   |  1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   |  2 ++
 src/gallium/drivers/r300/r300_screen.c   |  1 +
 src/gallium/drivers/r600/r600_pipe.c |  1 +
 src/gallium/drivers/radeonsi/si_pipe.c   |  1 +
 src/gallium/drivers/softpipe/sp_screen.c |  1 +
 src/gallium/drivers/svga/svga_screen.c   |  1 +
 src/gallium/include/pipe/p_defines.h |  1 +
 src/gallium/include/pipe/p_shader_tokens.h   |  3 ++-
 19 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c 
b/src/gallium/auxiliary/tgsi/tgsi_strings.c
index 34dec4f..713631f 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
@@ -122,6 +122,7 @@ const char *tgsi_property_names[TGSI_PROPERTY_COUNT] =
"FS_DEPTH_LAYOUT",
"VS_PROHIBIT_UCPS",
"GS_INVOCATIONS",
+   "VS_POSITION_WINDOW_SPACE"
 };
 
 const char *tgsi_type_names[5] =
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index 2bf93ee..bd0a3f7 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -173,6 +173,7 @@ struct ureg_program
unsigned char property_fs_coord_pixel_center; /* = 
TGSI_FS_COORD_PIXEL_CENTER_* */
unsigned char property_fs_color0_writes_all_cbufs; /* = 
TGSI_FS_COLOR0_WRITES_ALL_CBUFS * */
unsigned char property_fs_depth_layout; /* TGSI_FS_DEPTH_LAYOUT */
+   boolean property_vs_window_space_position; /* TGSI_VS_WINDOW_SPACE_POSITION 
*/
 
unsigned nr_addrs;
unsigned nr_preds;
@@ -331,6 +332,13 @@ ureg_property_fs_depth_layout(struct ureg_program *ureg,
ureg->property_fs_depth_layout = fs_depth_layout;
 }
 
+void
+ureg_property_vs_window_space_position(struct ureg_program *ureg,
+   boolean vs_window_space_position)
+{
+   ureg->property_vs_window_space_position = vs_window_space_position;
+}
+
 struct ureg_src
 ureg_DECL_fs_input_cyl_centroid(struct ureg_program *ureg,
unsigned semantic_name,
@@ -1508,6 +1516,14 @@ static void emit_decls( struct ureg_program *ureg )
 ureg->property_fs_depth_layout);
}
 
+   if (ureg->property_vs_window_space_position) {
+  assert(ureg->processor == TGSI_PROCESSOR_VERTEX);
+
+  emit_property(ureg,
+TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION,
+ureg->property_vs_window_space_position);
+   }
+
if (ureg->processor == TGSI_PROCESSOR_VERTEX) {
   for (i = 0; i < UREG_MAX_INPUT; i++) {
  if (ureg->vs_inputs[i/32] & (1 << (i%32))) {
diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.h 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
index a0a50b7..28edea6 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.h
@@ -184,6 +184,10 @@ void
 ureg_property_fs_depth_layout(struct ureg_program *ureg,
   unsigned fs_depth_layout);
 
+void
+ureg_property_vs_window_space_position(struct ureg_program *ureg,
+   boolean vs_window_space_position);
+
 
 /***
  * Build shader declarations:
diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index b292257..b8e356f 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -202,6 +202,9 @@ The integer capabilities:
   implemented.
 * ``PIPE_CAP_TEXTURE_GATHER_OFFSETS``: Whether the ``TG4`` instruction can
   accept 4 offsets.
+* ``PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION``: Whether
+  TGSI_PROPERTY_VS_WINDOW_SPACE_POSITION is supported, which disables clipping
+  and viewport transformation.
 
 
 .. _pipe_capf:
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 9500b9d..2ca3c3b 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -2848,6 +2848,15 @@ input primitive. Each invocation will have a different
 TGSI_SEMANTIC_INVOCATIONID system value set. If not specified, assumed to
 be 1.
 
+VS_WINDOW_SPACE_POSITION
+""
+If this property is set on the vertex shader, the TGSI_S

Re: [Mesa-dev] [PATCH] i915: add a missing NULL pointer check

2014-05-27 Thread Ian Romanick
On 05/27/2014 03:31 AM, Lubomir Rintel wrote:
> mesaVisual can be NULL with configless context since this commit:
> 
> commit 551d459af421a2eb937e9e16301bb64da4624f89
> Author: Neil Roberts 
> Date:   Fri Mar 7 18:05:47 2014 +
> 
> Add the EGL_MESA_configless_context extension
> ...
> Previously the i965 and i915 drivers were explicitly creating a zeroed 
> visual
> whenever 0 is passed for the EGLConfig.
> 
> We attempt to dereference the visual in i915 and now we don't create a
> zeroed-out one one it crashes, breaking at least weston in an i915. There's
> point in doing so as it would be zero anyway.

I think you mean "There's no point".  Yeah?

> 
> Signed-off-by: Lubomir Rintel 
> ---
> This fixes https://bugzilla.redhat.com/show_bug.cgi?id=1100967

This should go in the commit message as

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1100967

This commit should also be tagged for the 10.2 branch:

Cc: "10.2" 

>  src/mesa/drivers/dri/i915/intel_context.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i915/intel_context.c 
> b/src/mesa/drivers/dri/i915/intel_context.c
> index a6057d3..09fe371 100644
> --- a/src/mesa/drivers/dri/i915/intel_context.c
> +++ b/src/mesa/drivers/dri/i915/intel_context.c
> @@ -507,7 +507,7 @@ intelInitContext(struct intel_context *intel,
>  
> _mesa_meta_init(ctx);
>  
> -   intel->hw_stencil = mesaVis->stencilBits && mesaVis->depthBits == 24;
> +   intel->hw_stencil = mesaVis && mesaVis->stencilBits && mesaVis->depthBits 
> == 24;

Other than the complaints about the commit message, the code change
looks good.

Reviewed-by: Ian Romanick 

> intel->hw_stipple = 1;
>  
> intel->RenderIndex = ~0;
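
For reference, the guard in the patched line relies on && short-circuiting. A minimal sketch, using a hypothetical two-field struct in place of the real visual:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the visual; only the fields the check reads. */
struct visual {
   int stencilBits;
   int depthBits;
};

static bool wants_hw_stencil(const struct visual *vis)
{
   /* && evaluates left to right and stops at the first false operand,
    * so vis is never dereferenced when it is NULL. */
   return vis && vis->stencilBits && vis->depthBits == 24;
}
```

With a NULL visual (the configless case) this yields false, the same result a zeroed visual gave before.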



Re: [Mesa-dev] [PATCH] meta/blit: Use gl_FragColor also in the msaa blit shader

2014-05-27 Thread Kenneth Graunke
On 05/27/2014 06:21 AM, Topi Pohjolainen wrote:
> Fixes framebuffer_blit_functionality_multisampled_to_singlesampled_blit
> es3 cts test on bdw. Also fixes this on ivb when ivb is forced to use
> the meta path.
> 
> No piglit regressions on IVB.
> 
> Signed-off-by: Topi Pohjolainen 
> Cc: Eric Anholt 
> Cc: Matt Turner 
> Cc: Kenneth Graunke 
> Cc: Anuj Phogat 
> Cc: "10.2" 
> ---
>  src/mesa/drivers/common/meta_blit.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/common/meta_blit.c 
> b/src/mesa/drivers/common/meta_blit.c
> index 84594d1..5929619 100644
> --- a/src/mesa/drivers/common/meta_blit.c
> +++ b/src/mesa/drivers/common/meta_blit.c
> @@ -273,7 +273,7 @@ setup_glsl_msaa_blit_shader(struct gl_context *ctx,
> samples);
>   } else {
>  ralloc_asprintf_append(&sample_resolve,
> -   "   out_color = sample_%d_0 / %f;\n",
> +   "   gl_FragColor = sample_%d_0 / %f;\n",
> samples, (float)samples);
>   }
>}
> 

Seems reasonable in the short term, and this gets:
Reviewed-by: Kenneth Graunke 

Unfortunately, this doesn't fix MRT for integer data.

In the single-sampled case, since we're directly copying data, we were
reading, copying, and writing the data as "float" values, which actually
contained the integer bits.  Here, we can't do that since we need to
process the actual integer data.

I do wonder if we could use intBitsToFloat/uintBitsToFloat to stuff the
integer bits in the float gl_FragColor output.  Just a crazy idea.
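
A CPU-side sketch of what "stuffing the bits" means, in C rather than GLSL and assuming 32-bit IEEE-754 floats. The helper names are made up for illustration; whether every bit pattern (NaNs in particular) would survive a trip through the float output on real hardware is a separate question this does not answer.

```c
#include <stdint.h>
#include <string.h>

/* Analogue of GLSL intBitsToFloat(): copy the bit pattern unchanged,
 * with no numeric conversion. */
static float int_bits_to_float(int32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof f);
   return f;
}

/* Analogue of GLSL floatBitsToInt(): recover the original bits. */
static int32_t float_bits_to_int(float f)
{
   int32_t bits;
   memcpy(&bits, &f, sizeof bits);
   return bits;
}
```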

In the long term (post 10.2), I think we should draft an extension that
allows you to do "layout(location = all)" on user-defined fragment
shader outputs.  (Or some similar syntax.)

--Ken





[Mesa-dev] [PATCH] i965/fs: Set correct number of regs_written for MCS fetches.

2014-05-27 Thread Matt Turner
regs_written is in units of virtual GRFs.
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 171f063..b51ecc1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1578,9 +1578,9 @@ fs_visitor::emit_mcs_fetch(ir_texture *ir, fs_reg 
coordinate, int sampler)
inst->base_mrf = -1;
inst->mlen = next.reg_offset * reg_width;
inst->header_present = false;
-   inst->regs_written = 4 * reg_width; /* we only care about one reg of 
response,
-* but the sampler always writes 4/8
-*/
+   inst->regs_written = 4; /* we only care about one reg of response,
+* but the sampler always writes 4/8
+*/
inst->sampler = sampler;
 
return dest;
-- 
1.8.3.2



Re: [Mesa-dev] [PATCH] i965/fs: Set correct number of regs_written for MCS fetches.

2014-05-27 Thread Chris Forbes
Reviewed-by: Chris Forbes 

On Wed, May 28, 2014 at 10:27 AM, Matt Turner  wrote:
> regs_written is in units of virtual GRFs.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 171f063..b51ecc1 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1578,9 +1578,9 @@ fs_visitor::emit_mcs_fetch(ir_texture *ir, fs_reg 
> coordinate, int sampler)
> inst->base_mrf = -1;
> inst->mlen = next.reg_offset * reg_width;
> inst->header_present = false;
> -   inst->regs_written = 4 * reg_width; /* we only care about one reg of 
> response,
> -* but the sampler always writes 4/8
> -*/
> +   inst->regs_written = 4; /* we only care about one reg of response,
> +* but the sampler always writes 4/8
> +*/
> inst->sampler = sampler;
>
> return dest;
> --
> 1.8.3.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 08/11] Gallium/dri2: implement blitImage

2014-05-27 Thread Axel Davy
Signed-off-by: Axel Davy 
---
 src/gallium/state_trackers/dri/drm/dri2.c | 43 ---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/dri/drm/dri2.c 
b/src/gallium/state_trackers/dri/drm/dri2.c
index b5bc16b..f01257a 100644
--- a/src/gallium/state_trackers/dri/drm/dri2.c
+++ b/src/gallium/state_trackers/dri/drm/dri2.c
@@ -1251,6 +1251,42 @@ dri2_from_dma_bufs(__DRIscreen *screen,
 }
 
 static void
+dri2_blit_image(__DRIcontext *context, __DRIimage *dst, __DRIimage *src,
+int dstx0, int dsty0, int dstwidth, int dstheight,
+int srcx0, int srcy0, int srcwidth, int srcheight)
+{
+   struct dri_context *ctx = dri_context(context);
+   struct pipe_context *pipe = ctx->st->pipe;
+   struct pipe_blit_info blit;
+
+   if (!dst || !src)
+  return;
+
+   memset(&blit, 0, sizeof(blit));
+   blit.dst.resource = dst->texture;
+   blit.dst.box.x = dstx0;
+   blit.dst.box.y = dsty0;
+   blit.dst.box.width = dstwidth;
+   blit.dst.box.height = dstheight;
+   blit.dst.box.depth = 1;
+   blit.dst.format = dst->texture->format;
+   blit.src.resource = src->texture;
+   blit.src.box.x = srcx0;
+   blit.src.box.y = srcy0;
+   blit.src.box.width = srcwidth;
+   blit.src.box.height = srcheight;
+   blit.src.box.depth = 1;
+   blit.src.format = src->texture->format;
+   blit.mask = PIPE_MASK_RGBA;
+   blit.filter = PIPE_TEX_FILTER_NEAREST;
+
+   pipe->blit(pipe, &blit);
+
+   ctx->st->flush(ctx->st, 0, NULL);
+   pipe->flush_resource(pipe, dst->texture);
+}
+
+static void
 dri2_destroy_image(__DRIimage *img)
 {
pipe_resource_reference(&img->texture, NULL);
@@ -1259,7 +1295,7 @@ dri2_destroy_image(__DRIimage *img)
 
 /* The extension is modified during runtime if DRI_PRIME is detected */
 static __DRIimageExtension dri2ImageExtension = {
-.base = { __DRI_IMAGE, 6 },
+.base = { __DRI_IMAGE, 9 },
 
 .createImageFromName  = dri2_create_image_from_name,
 .createImageFromRenderbuffer  = dri2_create_image_from_renderbuffer,
@@ -1271,6 +1307,9 @@ static __DRIimageExtension dri2ImageExtension = {
 .createImageFromNames = dri2_from_names,
 .fromPlanar   = dri2_from_planar,
 .createImageFromTexture   = dri2_create_from_texture,
+.createImageFromFds   = NULL,
+.createImageFromDmaBufs   = NULL,
+.blitImage= dri2_blit_image,
 };
 
 /*
@@ -1325,8 +1364,6 @@ dri2_init_screen(__DRIscreen * sPriv)
 
   if (drmGetCap(sPriv->fd, DRM_CAP_PRIME, &cap) == 0 &&
   (cap & DRM_PRIME_CAP_IMPORT)) {
-
- dri2ImageExtension.base.version = 8;
  dri2ImageExtension.createImageFromFds = dri2_from_fds;
  dri2ImageExtension.createImageFromDmaBufs = dri2_from_dma_bufs;
   }
-- 
1.9.1



[Mesa-dev] [PATCH 06/11] loader: Use drirc device_id parameter in complement to DRI_PRIME

2014-05-27 Thread Axel Davy
Signed-off-by: Axel Davy 
---
 src/Makefile.am |  4 +++-
 src/loader/Makefile.am  | 21 ---
 src/loader/loader.c | 27 +
 src/mesa/drivers/dri/common/xmlconfig.h |  2 ++
 src/mesa/drivers/dri/common/xmlpool/t_options.h | 14 +
 5 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/src/Makefile.am b/src/Makefile.am
index 9d1580f..d4a7090 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -19,12 +19,14 @@
 # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 # IN THE SOFTWARE.
 
-SUBDIRS = gtest loader mapi
+SUBDIRS = gtest mapi
 
 if NEED_OPENGL_COMMON
 SUBDIRS += glsl mesa
 endif
 
+SUBDIRS += loader
+
 if HAVE_DRI_GLX
 SUBDIRS += glx
 endif
diff --git a/src/loader/Makefile.am b/src/loader/Makefile.am
index bddf7ac..3503a51 100644
--- a/src/loader/Makefile.am
+++ b/src/loader/Makefile.am
@@ -29,6 +29,23 @@ libloader_la_CPPFLAGS = \
$(VISIBILITY_CFLAGS) \
$(LIBUDEV_CFLAGS)
 
+libloader_la_SOURCES = $(LOADER_C_FILES)
+libloader_la_LIBADD = $()
+
+if NEED_OPENGL_COMMON
+libloader_la_CPPFLAGS += \
+   -I$(top_srcdir)/src/mesa/drivers/dri/common/ \
+   -I$(top_srcdir)/src/mesa/ \
+   -I$(top_srcdir)/src/mapi/ \
+   -DUSE_DRICONF
+
+libloader_la_SOURCES += \
+   $(top_srcdir)/src/mesa/drivers/dri/common/xmlconfig.c
+
+libloader_la_LIBADD += \
+   -lexpat
+endif
+
 if !HAVE_LIBDRM
 libloader_la_CPPFLAGS += \
-D__NOT_HAVE_DRM_H
@@ -36,8 +53,6 @@ else
 libloader_la_CPPFLAGS += \
$(LIBDRM_CFLAGS)
 
-libloader_la_LIBADD = \
+libloader_la_LIBADD += \
$(LIBDRM_LIBS)
 endif
-
-libloader_la_SOURCES = $(LOADER_C_FILES)
diff --git a/src/loader/loader.c b/src/loader/loader.c
index 3d504f7..e9a8c46 100644
--- a/src/loader/loader.c
+++ b/src/loader/loader.c
@@ -74,6 +74,10 @@
 #include 
 #include 
 #include 
+#ifdef USE_DRICONF
+#include "xmlconfig.h"
+#include "xmlpool.h"
+#endif
 #endif
 #include "loader.h"
 
@@ -310,9 +314,22 @@ drm_open_device(const char *device_name)
return fd;
 }
 
+#ifdef USE_DRICONF
+const char __driConfigOptionsLoader[] =
+DRI_CONF_BEGIN
+DRI_CONF_SECTION_INITIALIZATION
+DRI_CONF_DEVICE_ID_PATH_TAG()
+DRI_CONF_SECTION_END
+DRI_CONF_END;
+#endif
+
 int loader_get_user_preferred_fd(int default_fd, int *different_device)
 {
struct udev *udev;
+#ifdef USE_DRICONF
+   driOptionCache defaultInitOptions;
+   driOptionCache userInitOptions;
+#endif
const char *dri_prime = getenv("DRI_PRIME");
char *prime = NULL;
int is_different_device = 0, fd = default_fd;
@@ -324,6 +341,16 @@ int loader_get_user_preferred_fd(int default_fd, int 
*different_device)
 
if (dri_prime)
   prime = strdup(dri_prime);
+#ifdef USE_DRICONF
+   else {
+  driParseOptionInfo(&defaultInitOptions, __driConfigOptionsLoader);
+  driParseConfigFiles(&userInitOptions, &defaultInitOptions, 0, "loader");
+  if (driCheckOption(&userInitOptions, "device_id", DRI_STRING))
+ prime = strdup(driQueryOptionstr(&userInitOptions, "device_id"));
+  driDestroyOptionCache(&userInitOptions);
+  driDestroyOptionInfo(&defaultInitOptions);
+   }
+#endif
 
if (prime == NULL) {
   *different_device = 0;
diff --git a/src/mesa/drivers/dri/common/xmlconfig.h 
b/src/mesa/drivers/dri/common/xmlconfig.h
index 786caae..a4daa6b 100644
--- a/src/mesa/drivers/dri/common/xmlconfig.h
+++ b/src/mesa/drivers/dri/common/xmlconfig.h
@@ -30,6 +30,8 @@
 #ifndef __XMLCONFIG_H
 #define __XMLCONFIG_H
 
+#include 
+
 #define STRING_CONF_MAXLEN 25
 
 /** \brief Option data types */
diff --git a/src/mesa/drivers/dri/common/xmlpool/t_options.h 
b/src/mesa/drivers/dri/common/xmlpool/t_options.h
index 3bf804a..fc9e104 100644
--- a/src/mesa/drivers/dri/common/xmlpool/t_options.h
+++ b/src/mesa/drivers/dri/common/xmlpool/t_options.h
@@ -321,3 +321,17 @@ DRI_CONF_SECTION_BEGIN \
 DRI_CONF_OPT_BEGIN_B(always_have_depth_buffer, def) \
 DRI_CONF_DESC(en,gettext("Create all visuals with a depth buffer")) \
 DRI_CONF_OPT_END
+
+
+
+/**
+ * \brief Initialization configuration options
+ */
+#define DRI_CONF_SECTION_INITIALIZATION \
+DRI_CONF_SECTION_BEGIN \
+DRI_CONF_DESC(en,gettext("Initialization"))
+
+#define DRI_CONF_DEVICE_ID_PATH_TAG(def) \
+DRI_CONF_OPT_BEGIN(device_id, string, def) \
+DRI_CONF_DESC(en,gettext("Define the graphic device to use if 
possible")) \
+DRI_CONF_OPT_END
-- 
1.9.1
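
The precedence this patch sets up (DRI_PRIME first, the drirc device_id only when the variable is unset) can be sketched as below. choose_device_id is a hypothetical helper, not a function in loader.c, and the two arguments stand in for the getenv() result and the parsed drirc option.

```c
#include <stddef.h>

/* Hypothetical helper: the environment value wins whenever it is set,
 * the drirc value is the fallback, and NULL means "keep the default
 * device". */
static const char *choose_device_id(const char *env_value,
                                    const char *drirc_value)
{
   if (env_value)
      return env_value;
   return drirc_value;
}
```

Note that, as in the patch, a DRI_PRIME that is set but empty still takes precedence, because only a NULL result from getenv() reaches the drirc branch.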



[Mesa-dev] [PATCH 03/11] gallium: Use base.stamp for all drawable invalidation checks.

2014-05-27 Thread Axel Davy
From: Keith Packard 

Upper levels of the stack use base.stamp to tell when a drawable needs to be
revalidated, but the dri state tracker was using dPriv->lastStamp. Those two,
along with dri2.stamp, all get simultaneously incremented when a dri2
invalidate event was delivered, and so end up containing precisely the same
value.

This patch doesn't change the fact that there are three variables, rather it
switches all of the tests to use only base.stamp, which is functionally
equivalent to the previous code.

Then, it passes base.stamp to the image loader getBuffers function so that the
one which is checked will get updated by the XCB special event queue used by 
DRI3.

Signed-off-by: Keith Packard 
Reviewed-by: Marek Olšák 
---
 src/gallium/state_trackers/dri/common/dri_drawable.c | 4 ++--
 src/gallium/state_trackers/dri/drm/dri2.c| 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/dri/common/dri_drawable.c 
b/src/gallium/state_trackers/dri/common/dri_drawable.c
index b7df053..b321415 100644
--- a/src/gallium/state_trackers/dri/common/dri_drawable.c
+++ b/src/gallium/state_trackers/dri/common/dri_drawable.c
@@ -73,7 +73,7 @@ dri_st_framebuffer_validate(struct st_context_iface *stctx,
 * checked.
 */
do {
-  lastStamp = drawable->dPriv->lastStamp;
+  lastStamp = drawable->base.stamp;
   new_stamp = (drawable->texture_stamp != lastStamp);
 
   if (new_stamp || new_mask || screen->broken_invalidate) {
@@ -91,7 +91,7 @@ dri_st_framebuffer_validate(struct st_context_iface *stctx,
  drawable->texture_stamp = lastStamp;
  drawable->texture_mask = statt_mask;
   }
-   } while (lastStamp != drawable->dPriv->lastStamp);
+   } while (lastStamp != drawable->base.stamp);
 
if (!out)
   return TRUE;
diff --git a/src/gallium/state_trackers/dri/drm/dri2.c 
b/src/gallium/state_trackers/dri/drm/dri2.c
index 2dc1d47..b5bc16b 100644
--- a/src/gallium/state_trackers/dri/drm/dri2.c
+++ b/src/gallium/state_trackers/dri/drm/dri2.c
@@ -590,7 +590,7 @@ dri_image_allocate_textures(struct dri_context *ctx,
 
(*sPriv->image.loader->getBuffers) (dPriv,
image_format,
-   &dPriv->dri2.stamp,
+   (uint32_t *) &drawable->base.stamp,
dPriv->loaderPrivate,
buffer_mask,
&images);
-- 
1.9.1
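
A stripped-down model of the revalidation loop in dri_st_framebuffer_validate, to show why the stamp that is tested must be the one the loader updates. The struct and fetch_buffers are illustrative stand-ins; pending_events simulates an invalidate arriving while buffers are being fetched.

```c
#include <stdint.h>

struct drawable {
   uint32_t stamp;          /* bumped on every invalidate event */
   uint32_t texture_stamp;  /* stamp the current textures were fetched at */
   int fetches;             /* how many times buffers were re-fetched */
   int pending_events;      /* invalidates that arrive during a fetch */
};

/* Stand-in for the buffer fetch; an invalidate may land while it runs. */
static void fetch_buffers(struct drawable *d)
{
   d->fetches++;
   if (d->pending_events > 0) {
      d->pending_events--;
      d->stamp++;
   }
}

/* Re-fetch until the stamp stops changing underneath us. */
static void validate(struct drawable *d)
{
   uint32_t last;

   do {
      last = d->stamp;
      if (d->texture_stamp != last) {
         fetch_buffers(d);
         d->texture_stamp = last;
      }
   } while (last != d->stamp);
}
```

If the fetch updated a different counter than the one compared in the loop condition, the second pass would never be triggered and the drawable would keep stale buffers.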



[Mesa-dev] [PATCH 10/11] Wayland/egl: Add Gpu offloading support

2014-05-27 Thread Axel Davy
Signed-off-by: Axel Davy 
---
 src/egl/drivers/dri2/egl_dri2.h |   5 +-
 src/egl/drivers/dri2/platform_wayland.c | 171 ++--
 2 files changed, 142 insertions(+), 34 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index 0dd9d69..4b70c48 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -195,6 +195,8 @@ struct dri2_egl_display
int  authenticated;
int  formats;
uint32_t  capabilities;
+   int  is_different_gpu;
+   int  blit_front;
 #endif
 };
 
@@ -247,7 +249,8 @@ struct dri2_egl_surface
struct {
 #ifdef HAVE_WAYLAND_PLATFORM
   struct wl_buffer   *wl_buffer;
-  __DRIimage *dri_image;
+  __DRIimage *rendering_image;
+  __DRIimage *shared_image;
 #endif
 #ifdef HAVE_DRM_PLATFORM
   struct gbm_bo   *bo;
diff --git a/src/egl/drivers/dri2/platform_wayland.c 
b/src/egl/drivers/dri2/platform_wayland.c
index 537d26e..8d0a90c 100644
--- a/src/egl/drivers/dri2/platform_wayland.c
+++ b/src/egl/drivers/dri2/platform_wayland.c
@@ -238,8 +238,10 @@ dri2_wl_destroy_surface(_EGLDriver *drv, _EGLDisplay 
*disp, _EGLSurface *surf)
for (i = 0; i < ARRAY_SIZE(dri2_surf->color_buffers); i++) {
   if (dri2_surf->color_buffers[i].wl_buffer)
  wl_buffer_destroy(dri2_surf->color_buffers[i].wl_buffer);
-  if (dri2_surf->color_buffers[i].dri_image)
- dri2_dpy->image->destroyImage(dri2_surf->color_buffers[i].dri_image);
+  if (dri2_surf->color_buffers[i].rendering_image) {
+ 
dri2_dpy->image->destroyImage(dri2_surf->color_buffers[i].rendering_image);
+ 
dri2_dpy->image->destroyImage(dri2_surf->color_buffers[i].shared_image);
+  }
}
 
for (i = 0; i < __DRI_BUFFER_COUNT; i++)
@@ -272,11 +274,14 @@ dri2_wl_release_buffers(struct dri2_egl_surface 
*dri2_surf)
   if (dri2_surf->color_buffers[i].wl_buffer &&
   !dri2_surf->color_buffers[i].locked)
  wl_buffer_destroy(dri2_surf->color_buffers[i].wl_buffer);
-  if (dri2_surf->color_buffers[i].dri_image)
- dri2_dpy->image->destroyImage(dri2_surf->color_buffers[i].dri_image);
+  if (dri2_surf->color_buffers[i].rendering_image) {
+ 
dri2_dpy->image->destroyImage(dri2_surf->color_buffers[i].rendering_image);
+ 
dri2_dpy->image->destroyImage(dri2_surf->color_buffers[i].shared_image);
+  }
 
   dri2_surf->color_buffers[i].wl_buffer = NULL;
-  dri2_surf->color_buffers[i].dri_image = NULL;
+  dri2_surf->color_buffers[i].rendering_image = NULL;
+  dri2_surf->color_buffers[i].shared_image = NULL;
   dri2_surf->color_buffers[i].locked = 0;
}
 
@@ -292,6 +297,7 @@ get_back_bo(struct dri2_egl_surface *dri2_surf)
 {
struct dri2_egl_display *dri2_dpy =
   dri2_egl_display(dri2_surf->base.Resource.Display);
+   unsigned int use_flags;
int i;
 
/* We always want to throttle to some event (either a frame callback or
@@ -311,24 +317,45 @@ get_back_bo(struct dri2_egl_surface *dri2_surf)
 continue;
  if (dri2_surf->back == NULL)
 dri2_surf->back = &dri2_surf->color_buffers[i];
- else if (dri2_surf->back->dri_image == NULL)
+ else if (dri2_surf->back->rendering_image == NULL)
 dri2_surf->back = &dri2_surf->color_buffers[i];
   }
}
 
if (dri2_surf->back == NULL)
   return -1;
-   if (dri2_surf->back->dri_image == NULL) {
-  dri2_surf->back->dri_image = 
+
+   if (dri2_surf->back->rendering_image == NULL) {
+  use_flags = __DRI_IMAGE_USE_SHARE;
+
+  if (dri2_dpy->is_different_gpu)
+ use_flags |= __DRI_IMAGE_USE_LINEAR;
+
+  dri2_surf->back->shared_image =
  dri2_dpy->image->createImage(dri2_dpy->dri_screen,
   dri2_surf->base.Width,
   dri2_surf->base.Height,
   __DRI_IMAGE_FORMAT_ARGB,
-  __DRI_IMAGE_USE_SHARE,
+  use_flags,
   NULL);
+  if (dri2_surf->back->shared_image == NULL)
+ return -1;
+
+  if (dri2_dpy->blit_front)
+ dri2_surf->back->rendering_image =
+dri2_dpy->image->createImage(dri2_dpy->dri_screen,
+ dri2_surf->base.Width,
+ dri2_surf->base.Height,
+ __DRI_IMAGE_FORMAT_ARGB,
+ 0,
+ NULL);
+  else
+ dri2_surf->back->rendering_image =
+dri2_dpy->image->dupImage(dri2_surf->back->shared_image, NULL);
+
   dri2_surf->back->age = 0;
}
-   if (dri2_surf->back->dri_image == NULL)
+   if (dri2_surf->

[Mesa-dev] [PATCH 01/11] gallium: Add __DRIimageDriverExtension support to gallium

2014-05-27 Thread Axel Davy
From: Keith Packard 

Provide the hook to pull textures out of __DRIimage structures and use them as
renderbuffers.

Signed-off-by: Keith Packard 
---
 src/gallium/state_trackers/dri/drm/dri2.c | 238 +-
 1 file changed, 230 insertions(+), 8 deletions(-)

diff --git a/src/gallium/state_trackers/dri/drm/dri2.c 
b/src/gallium/state_trackers/dri/drm/dri2.c
index 7dccc5e..cd9964c 100644
--- a/src/gallium/state_trackers/dri/drm/dri2.c
+++ b/src/gallium/state_trackers/dri/drm/dri2.c
@@ -498,6 +498,219 @@ dri2_release_buffer(__DRIscreen *sPriv, __DRIbuffer 
*bPriv)
FREE(buffer);
 }
 
+static void
+dri_image_allocate_textures(struct dri_context *ctx,
+   struct dri_drawable *drawable,
+   const enum st_attachment_type *statts,
+   unsigned statts_count)
+{
+   __DRIdrawable *dPriv = drawable->dPriv;
+   __DRIscreen *sPriv = drawable->sPriv;
+   struct dri_screen *screen = dri_screen(sPriv);
+   unsigned int image_format = __DRI_IMAGE_FORMAT_NONE;
+   uint32_t buffer_mask = 0;
+   struct __DRIimageList images;
+   boolean alloc_depthstencil = FALSE;
+   int i, j;
+   struct pipe_resource templ;
+
+   /* See if we need a depth-stencil buffer. */
+   for (i = 0; i < statts_count; i++) {
+  if (statts[i] == ST_ATTACHMENT_DEPTH_STENCIL) {
+ alloc_depthstencil = TRUE;
+ break;
+  }
+   }
+
+   /* Delete the resources we won't need. */
+   for (i = 0; i < ST_ATTACHMENT_COUNT; i++) {
+  /* Don't delete the depth-stencil buffer, we can reuse it. */
+  if (i == ST_ATTACHMENT_DEPTH_STENCIL && alloc_depthstencil)
+ continue;
+
+  pipe_resource_reference(&drawable->textures[i], NULL);
+   }
+
+   if (drawable->stvis.samples > 1) {
+  for (i = 0; i < ST_ATTACHMENT_COUNT; i++) {
+ boolean del = TRUE;
+
+ /* Don't delete MSAA resources for the attachments which are enabled,
+  * we can reuse them. */
+ for (j = 0; j < statts_count; j++) {
+if (i == statts[j]) {
+   del = FALSE;
+   break;
+}
+ }
+
+ if (del) {
+pipe_resource_reference(&drawable->msaa_textures[i], NULL);
+ }
+  }
+   }
+
+   for (i = 0; i < statts_count; i++) {
+  enum pipe_format pf;
+  unsigned bind;
+
+  dri_drawable_get_format(drawable, statts[i], &pf, &bind);
+  if (pf == PIPE_FORMAT_NONE)
+ continue;
+
+  switch (pf) {
+  case PIPE_FORMAT_B5G6R5_UNORM:
+ image_format = __DRI_IMAGE_FORMAT_RGB565;
+ break;
+  case PIPE_FORMAT_B8G8R8X8_UNORM:
+ image_format = __DRI_IMAGE_FORMAT_XRGB;
+ break;
+  case PIPE_FORMAT_B8G8R8A8_UNORM:
+ image_format = __DRI_IMAGE_FORMAT_ARGB;
+ break;
+  case PIPE_FORMAT_R8G8B8A8_UNORM:
+ image_format = __DRI_IMAGE_FORMAT_ABGR;
+ break;
+  default:
+ image_format = __DRI_IMAGE_FORMAT_NONE;
+ break;
+  }
+
+  switch (statts[i]) {
+  case ST_ATTACHMENT_FRONT_LEFT:
+ buffer_mask |= __DRI_IMAGE_BUFFER_FRONT;
+ break;
+  case ST_ATTACHMENT_BACK_LEFT:
+ buffer_mask |= __DRI_IMAGE_BUFFER_BACK;
+ break;
+  default:
+ continue;
+  }
+   }
+
+   (*sPriv->image.loader->getBuffers) (dPriv,
+   image_format,
+   &dPriv->dri2.stamp,
+   dPriv->loaderPrivate,
+   buffer_mask,
+   &images);
+
+   if (images.image_mask & __DRI_IMAGE_BUFFER_FRONT) {
+  struct pipe_resource *texture = images.front->texture;
+
+  dPriv->w = texture->width0;
+  dPriv->h = texture->height0;
+
+  pipe_resource_reference(&drawable->textures[ST_ATTACHMENT_FRONT_LEFT], 
texture);
+   }
+
+   if (images.image_mask & __DRI_IMAGE_BUFFER_BACK) {
+  struct pipe_resource *texture = images.back->texture;
+
+  dPriv->w = images.back->texture->width0;
+  dPriv->h = images.back->texture->height0;
+
+  pipe_resource_reference(&drawable->textures[ST_ATTACHMENT_BACK_LEFT], 
texture);
+   }
+
+   memset(&templ, 0, sizeof(templ));
+   templ.target = screen->target;
+   templ.last_level = 0;
+   templ.width0 = dPriv->w;
+   templ.height0 = dPriv->h;
+   templ.depth0 = 1;
+   templ.array_size = 1;
+
+   /* Allocate private MSAA colorbuffers. */
+   if (drawable->stvis.samples > 1) {
+  for (i = 0; i < statts_count; i++) {
+ enum st_attachment_type att = statts[i];
+
+ if (att == ST_ATTACHMENT_DEPTH_STENCIL)
+continue;
+
+ if (drawable->textures[att]) {
+templ.format = drawable->textures[att]->format;
+templ.bind = drawable->textures[att]->bind;
+templ.nr_samples = drawable->stvis.samples;
+
+/* Try to reuse the resource.
+ * (the oth

[Mesa-dev] [PATCH 02/11] gallium/dri: fix unsetting of format when encountering depth/stencil

2014-05-27 Thread Axel Davy
From: Ben Skeggs 

Signed-off-by: Ben Skeggs 
Signed-off-by: Keith Packard 
---
 src/gallium/state_trackers/dri/drm/dri2.c | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/gallium/state_trackers/dri/drm/dri2.c 
b/src/gallium/state_trackers/dri/drm/dri2.c
index cd9964c..2dc1d47 100644
--- a/src/gallium/state_trackers/dri/drm/dri2.c
+++ b/src/gallium/state_trackers/dri/drm/dri2.c
@@ -558,6 +558,17 @@ dri_image_allocate_textures(struct dri_context *ctx,
   if (pf == PIPE_FORMAT_NONE)
  continue;
 
+  switch (statts[i]) {
+  case ST_ATTACHMENT_FRONT_LEFT:
+ buffer_mask |= __DRI_IMAGE_BUFFER_FRONT;
+ break;
+  case ST_ATTACHMENT_BACK_LEFT:
+ buffer_mask |= __DRI_IMAGE_BUFFER_BACK;
+ break;
+  default:
+ continue;
+  }
+
   switch (pf) {
   case PIPE_FORMAT_B5G6R5_UNORM:
  image_format = __DRI_IMAGE_FORMAT_RGB565;
@@ -575,17 +586,6 @@ dri_image_allocate_textures(struct dri_context *ctx,
  image_format = __DRI_IMAGE_FORMAT_NONE;
  break;
   }
-
-  switch (statts[i]) {
-  case ST_ATTACHMENT_FRONT_LEFT:
- buffer_mask |= __DRI_IMAGE_BUFFER_FRONT;
- break;
-  case ST_ATTACHMENT_BACK_LEFT:
- buffer_mask |= __DRI_IMAGE_BUFFER_BACK;
- break;
-  default:
- continue;
-  }
}
 
(*sPriv->image.loader->getBuffers) (dPriv,
-- 
1.9.1
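
The commit message does not spell out the failure, so the following is one reading of the reorder, written as a self-contained C sketch with made-up enum names. With the attachment filter placed after the format switch, a depth/stencil attachment later in the list resets the format to "none" before it is skipped; with the filter first, the colour format survives.

```c
/* Illustrative names only; not the state tracker's enums. */
enum att { ATT_FRONT, ATT_BACK, ATT_DEPTH_STENCIL };
enum fmt { FMT_NONE, FMT_ARGB8888 };

/* Colour attachments map to a known image format; anything else hits
 * the equivalent of the switch's default case. */
static enum fmt format_of(enum att a)
{
   return (a == ATT_FRONT || a == ATT_BACK) ? FMT_ARGB8888 : FMT_NONE;
}

static int is_color(enum att a)
{
   return a == ATT_FRONT || a == ATT_BACK;
}

/* filter_first != 0 models the patched order, 0 the original order. */
static enum fmt pick_format(const enum att *atts, unsigned n, int filter_first)
{
   enum fmt image_format = FMT_NONE;
   unsigned i;

   for (i = 0; i < n; i++) {
      if (filter_first && !is_color(atts[i]))
         continue;                      /* skipped before touching format */
      image_format = format_of(atts[i]);
      if (!filter_first && !is_color(atts[i]))
         continue;                      /* too late: format already reset */
   }
   return image_format;
}
```

For the list { ATT_BACK, ATT_DEPTH_STENCIL }, the patched order yields FMT_ARGB8888 and the original order yields FMT_NONE.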



[Mesa-dev] [PATCH 09/11] GLX/DRI3: Add Gpu offloading support.

2014-05-27 Thread Axel Davy
Signed-off-by: Axel Davy 
---
 src/glx/dri3_glx.c  | 235 +++-
 src/glx/dri3_priv.h |   2 +
 2 files changed, 200 insertions(+), 37 deletions(-)

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 3d8a662..54030bb 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -596,6 +596,7 @@ dri3_copy_sub_buffer(__GLXDRIdrawable *pdraw, int x, int y,
 {
struct dri3_drawable *priv = (struct dri3_drawable *) pdraw;
struct dri3_screen *psc = (struct dri3_screen *) pdraw->psc;
+   struct dri3_context *pcp = (struct dri3_context *) __glXGetCurrentContext();
xcb_connection_t *c = XGetXCBConnection(priv->base.psc->dpy);
struct dri3_buffer *back = dri3_back_buffer(priv);
 
@@ -605,6 +606,30 @@ dri3_copy_sub_buffer(__GLXDRIdrawable *pdraw, int x, int y,
if (!priv->have_back || priv->is_pixmap)
   return;
 
+   /* When on a different gpu than the server, we use blitImage
+* for the copies. Do the needed copies before flushing.
+*/
+   if (psc->is_different_gpu && pcp && pcp->driContext) {
+  /* Update the linear buffer part of the back buffer
+   * for the dri3_copy_area operation
+   */
+  psc->image->blitImage(pcp->driContext,
+back->linear_buffer,
+back->image,
+0, 0, back->width,
+back->height,
+0, 0, back->width,
+back->height);
+  /* We use blitImage to update our fake front,
+   */
+  if (priv->have_fake_front)
+ psc->image->blitImage(pcp->driContext,
+   dri3_fake_front_buffer(priv)->image,
+   back->image,
+   x, y, width, height,
+   x, y, width, height);
+   }
+
flags = __DRI2_FLUSH_DRAWABLE;
if (flush)
   flags |= __DRI2_FLUSH_CONTEXT;
@@ -622,7 +647,7 @@ dri3_copy_sub_buffer(__GLXDRIdrawable *pdraw, int x, int y,
/* Refresh the fake front (if present) after we just damaged the real
 * front.
 */
-   if (priv->have_fake_front) {
+   if (priv->have_fake_front && !psc->is_different_gpu) {
   dri3_fence_reset(c, dri3_fake_front_buffer(priv));
   dri3_copy_area(c,
  dri3_back_buffer(priv)->pixmap,
@@ -655,25 +680,62 @@ dri3_copy_drawable(struct dri3_drawable *priv, Drawable 
dest, Drawable src)
 static void
 dri3_wait_x(struct glx_context *gc)
 {
+   struct dri3_context *pcp = (struct dri3_context *) gc;
struct dri3_drawable *priv = (struct dri3_drawable *)
   GetGLXDRIDrawable(gc->currentDpy, gc->currentDrawable);
+   struct dri3_screen *psc;
+   struct dri3_buffer *front;
 
if (priv == NULL || !priv->have_fake_front)
   return;
 
-   dri3_copy_drawable(priv, dri3_fake_front_buffer(priv)->pixmap, 
priv->base.xDrawable);
+   psc = (struct dri3_screen *) priv->base.psc;
+   front = dri3_fake_front_buffer(priv);
+
+   dri3_copy_drawable(priv, front->pixmap, priv->base.xDrawable);
+
+   /* In the psc->is_different_gpu case, the linear buffer has been updated,
+* but not yet the tiled buffer.
+* Copy back to the tiled buffer we use for rendering.
+* Note that we don't need flushing.
+*/
+   if (psc->is_different_gpu && pcp->driContext)
+  psc->image->blitImage(pcp->driContext,
+front->image,
+front->linear_buffer,
+0, 0, front->width,
+front->height,
+0, 0, front->width,
+front->height);
 }
 
 static void
 dri3_wait_gl(struct glx_context *gc)
 {
+   struct dri3_context *pcp = (struct dri3_context *) gc;
struct dri3_drawable *priv = (struct dri3_drawable *)
   GetGLXDRIDrawable(gc->currentDpy, gc->currentDrawable);
+   struct dri3_screen *psc;
+   struct dri3_buffer *front;
 
if (priv == NULL || !priv->have_fake_front)
   return;
 
-   dri3_copy_drawable(priv, priv->base.xDrawable, 
dri3_fake_front_buffer(priv)->pixmap);
+   psc = (struct dri3_screen *) priv->base.psc;
+   front = dri3_fake_front_buffer(priv);
+
+   /* In the psc->is_different_gpu case, we update the linear_buffer
+* before updating the real front.
+*/
+   if (psc->is_different_gpu && pcp->driContext)
+  psc->image->blitImage(pcp->driContext,
+front->linear_buffer,
+front->image,
+0, 0, front->width,
+front->height,
+0, 0, front->width,
+front->height);
+   dri3_copy_drawable(priv, priv->base.xDrawable, front->pixmap);
 }
 
 /**
@@ -741,6 +803,7 @@ dri3_alloc_render_buffer(struct glx_screen *glx_screen, 
Drawable draw,
struct dri3_screen *psc = (struct dri3_screen *) glx_screen;
Display *dpy = glx_

[Mesa-dev] [PATCH 04/11] Loader: Add gpu selection code with DRI_PRIME.

2014-05-27 Thread Axel Davy
Signed-off-by: Axel Davy 
---
 src/loader/loader.c | 188 
 src/loader/loader.h |   7 ++
 2 files changed, 195 insertions(+)

diff --git a/src/loader/loader.c b/src/loader/loader.c
index 666d015..3d504f7 100644
--- a/src/loader/loader.c
+++ b/src/loader/loader.c
@@ -70,6 +70,10 @@
 #ifdef HAVE_LIBUDEV
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 
 #endif
 #include "loader.h"
 
@@ -202,6 +206,180 @@ out:
return (*chip_id >= 0);
 }
 
+static char *
+get_render_node_from_id_path_tag(struct udev *udev,
+ char *id_path_tag,
+ char another_tag)
+{
+   struct udev_device *device;
+   struct udev_enumerate *e;
+   struct udev_list_entry *entry;
+   const char *path, *id_path_tag_tmp;
+   char *path_res;
+   char found = 0;
+   UDEV_SYMBOL(struct udev_enumerate *, udev_enumerate_new,
+   (struct udev *));
+   UDEV_SYMBOL(int, udev_enumerate_add_match_subsystem,
+   (struct udev_enumerate *, const char *));
+   UDEV_SYMBOL(int, udev_enumerate_add_match_sysname,
+   (struct udev_enumerate *, const char *));
+   UDEV_SYMBOL(int, udev_enumerate_scan_devices,
+   (struct udev_enumerate *));
+   UDEV_SYMBOL(struct udev_list_entry *, udev_enumerate_get_list_entry,
+   (struct udev_enumerate *));
+   UDEV_SYMBOL(struct udev_list_entry *, udev_list_entry_get_next,
+   (struct udev_list_entry *));
+   UDEV_SYMBOL(const char *, udev_list_entry_get_name,
+   (struct udev_list_entry *));
+   UDEV_SYMBOL(struct udev_device *, udev_device_new_from_syspath,
+   (struct udev *, const char *));
+   UDEV_SYMBOL(const char *, udev_device_get_property_value,
+   (struct udev_device *, const char *));
+   UDEV_SYMBOL(const char *, udev_device_get_devnode,
+   (struct udev_device *));
+   UDEV_SYMBOL(struct udev_device *, udev_device_unref,
+   (struct udev_device *));
+
+   e = udev_enumerate_new(udev);
+   udev_enumerate_add_match_subsystem(e, "drm");
+   udev_enumerate_add_match_sysname(e, "render*");
+
+   udev_enumerate_scan_devices(e);
+   udev_list_entry_foreach(entry, udev_enumerate_get_list_entry(e)) {
+  path = udev_list_entry_get_name(entry);
+  device = udev_device_new_from_syspath(udev, path);
+  if (!device)
+ continue;
+  id_path_tag_tmp = udev_device_get_property_value(device, "ID_PATH_TAG");
+  if (id_path_tag_tmp) {
+ if ((!another_tag && !strcmp(id_path_tag, id_path_tag_tmp)) ||
+ (another_tag && strcmp(id_path_tag, id_path_tag_tmp))) {
+found = 1;
+break;
+ }
+  }
+  udev_device_unref(device);
+   }
+
+   if (found) {
+  path_res = strdup(udev_device_get_devnode(device));
+  udev_device_unref(device);
+  return path_res;
+   }
+   return NULL;
+}
+
+static char *
+get_id_path_tag_from_fd(struct udev *udev, int fd)
+{
+   struct udev_device *device;
+   const char *id_path_tag_tmp;
+   char *id_path_tag;
+   UDEV_SYMBOL(const char *, udev_device_get_property_value,
+   (struct udev_device *, const char *));
+   UDEV_SYMBOL(struct udev_device *, udev_device_unref,
+   (struct udev_device *));
+
+   device = udev_device_new_from_fd(udev, fd);
+   if (!device)
+  return NULL;
+
+   id_path_tag_tmp = udev_device_get_property_value(device, "ID_PATH_TAG");
+   if (!id_path_tag_tmp)
+  return NULL;
+
+   id_path_tag = strdup(id_path_tag_tmp);
+
+   udev_device_unref(device);
+   return id_path_tag;
+}
+
+static int
+drm_open_device(const char *device_name)
+{
+   int fd;
+#ifdef O_CLOEXEC
+   fd = open(device_name, O_RDWR | O_CLOEXEC);
+   if (fd == -1 && errno == EINVAL)
+#endif
+   {
+  fd = open(device_name, O_RDWR);
+  if (fd != -1)
+ fcntl(fd, F_SETFD, fcntl(fd, F_GETFD) | FD_CLOEXEC);
+   }
+   return fd;
+}
+
+int loader_get_user_preferred_fd(int default_fd, int *different_device)
+{
+   struct udev *udev;
+   const char *dri_prime = getenv("DRI_PRIME");
+   char *prime = NULL;
+   int is_different_device = 0, fd = default_fd;
+   char *default_device_id_path_tag;
+   char *device_name = NULL;
+   char another_tag = 0;
+   UDEV_SYMBOL(struct udev *, udev_new, (void));
+   UDEV_SYMBOL(struct udev *, udev_unref, (struct udev *));
+
+   if (dri_prime)
+  prime = strdup(dri_prime);
+
+   if (prime == NULL) {
+  *different_device = 0;
+  return default_fd;
+   }
+
+   udev = udev_new();
+   if (!udev)
+  goto prime_clean;
+
+   default_device_id_path_tag = get_id_path_tag_from_fd(udev, default_fd);
+   if (!default_device_id_path_tag)
+  goto udev_clean;
+
+   is_different_device = 1;
+   /* two format are supported:
+* "1": choose any other card than the card used by default.
+* id_path_tag: (for example "pci-0000_02_00_0") choose the card
+* with this id_path_tag.
+*/
+   if 

[Mesa-dev] [PATCH 11/11] Radeonsi: Use dma_copy when possible for si_blit.

2014-05-27 Thread Axel Davy
This significantly improves GLX DRI3 GPU offloading, particularly on
CPU-bound benchmarks.
There is no performance impact on DRI2 GPU offloading.

Signed-off-by: Axel Davy 
---
 src/gallium/drivers/radeonsi/si_blit.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_blit.c 
b/src/gallium/drivers/radeonsi/si_blit.c
index 6bc89ab..0e327b5 100644
--- a/src/gallium/drivers/radeonsi/si_blit.c
+++ b/src/gallium/drivers/radeonsi/si_blit.c
@@ -712,6 +712,21 @@ static void si_blit(struct pipe_context *ctx,
return;
}
 
+   if (info->src.box.width ==  info->dst.box.width &&
+   info->src.format == info->dst.format &&
+   info->src.box.width >=0 &&
+   info->src.resource->nr_samples == 0 &&
+   info->dst.resource->nr_samples == 0 &&
+   info->src.box.depth == 1 &&
+   info->dst.box.depth == 1 &&
+   info->mask == PIPE_MASK_RGBA) {
+   sctx->b.dma_copy(ctx, info->dst.resource, info->dst.level,
+info->dst.box.x, info->dst.box.y,
+info->dst.box.z, info->src.resource,
+info->src.level, &(info->src.box));
+   return;
+   }
+
assert(util_blitter_is_blit_supported(sctx->blitter, info));
 
/* The driver doesn't decompress resources automatically while
-- 
1.9.1



[Mesa-dev] [PATCH 00/11] GPU offloading for GLX DRI3 and EGL Wayland

2014-05-27 Thread Axel Davy
Currently, GPU offloading is supported only with GLX DRI2.
You need to set it up with xrandr, and you need a DDX loaded for
the secondary device, even if it has no screen.
You use the DRI_PRIME environment variable to select which GPU the
application should use. Unfortunately, this has some limitations:
rendering to a pixmap is unsupported, and you need either to be
fullscreen or to run in a composited environment to avoid getting
black content.

These patches add GPU offloading support to GLX DRI3 and EGL Wayland.
Most of the limitations mentioned above are addressed.

The first three patches add the __DRIimageDriverExtension support
to gallium. It is needed for GLX DRI3 and to use a render-node
for EGL Wayland.

The next three patches add to the loader the support needed to change
the device that EGL Wayland or GLX DRI3 uses when the user specifies
another device via DRI_PRIME or via the drirc device_id option.

For example if drirc contains:









Then glmark2-wayland will use, if possible, the render node whose
ID_PATH_TAG is pci-0000_01_00_0.

The ID_PATH_TAG of a device is filled in by udev, and you can
get it with the command:
"udevadm info /dev/cardX"

DRI_PRIME can be set either to "1" (meaning 'a device other
than the server's') or to an ID_PATH_TAG.

If render-nodes are not enabled, or if the indicated device doesn't
exist, the server device is used.
There is no need to have a DDX loaded for the device we want to use,
nor do you need to configure anything with xrandr.
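
The selection rule described above can be sketched in isolation as
follows. This is only an illustration under stated assumptions, not the
loader code from this series: choose_device_tag() is a hypothetical
helper, and the list of render-node tags is passed in directly instead
of being enumerated through udev.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Mesa: given the value of DRI_PRIME
 * (may be NULL), the ID_PATH_TAG of the server device, and the tags of
 * the available render nodes, return the tag of the device to use.
 * Falls back to the server device when nothing suitable is found. */
const char *
choose_device_tag(const char *dri_prime,
                  const char *server_tag,
                  const char *const *render_tags, size_t count)
{
   assert(server_tag != NULL);

   /* DRI_PRIME unset: keep the server device. */
   if (dri_prime == NULL)
      return server_tag;

   /* "1" means: any device other than the one the server uses. */
   int want_other = strcmp(dri_prime, "1") == 0;

   for (size_t i = 0; i < count; i++) {
      int is_server = strcmp(render_tags[i], server_tag) == 0;

      if (want_other ? !is_server
                     : strcmp(render_tags[i], dri_prime) == 0)
         return render_tags[i];
   }

   /* No render node matched: the server device is used. */
   return server_tag;
}
```

With two render nodes tagged "pci-0000_00_02_0" (server) and
"pci-0000_01_00_0", both DRI_PRIME=1 and DRI_PRIME=pci-0000_01_00_0
select the second one, while an unknown tag or an empty list falls back
to the server device.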

Two patches add the blitImage specification and implementation to
gallium. blitImage makes it possible to blit one __DRIimage to another.

The GLX DRI3 GPU offloading implementation allows rendering to a pixmap,
and keeps the back and front buffers in sync with copies. There is no
need to be in a composited environment.

The last patch allows blitImage to use DMA copy on radeonsi.
It gives a performance boost, especially in CPU-limited benchmarks.
On these cards, this makes GPU offloading faster with DRI3 than with
DRI2.
Nouveau already has an optimised blitImage path because it uses
the 2D engine to copy.

Currently no official DDX release supports DRI3.
I think this is mainly because the DDX side of the Present API is
difficult to implement. If the DDX doesn't support Present,
the X server uses a fallback with copies. When rendering on the server
card, performance is worse than with DRI2, but since the copy happens on
the server card, it won't affect the performance of GPU offloading.

One proposal would be to add basic DRI3 support to the DDXs (without
Present support for now), with Mesa using it only when DRI2 fails
or when we want to use DRI3 GPU offloading.

Axel Davy (8):
  Loader: Add gpu selection code via DRI_PRIME.
  drirc: Add string support
  loader: Use drirc device_id parameter in complement to DRI_PRIME
  DRIimage: add blitImage to the specification
  Gallium/dri2: implement blitImage
  GLX/DRI3: Add Gpu offloading support.
  Wayland/egl: Add Gpu offloading support
  Radeonsi: Use dma_copy when possible for si_blit.

Ben Skeggs (1):
  gallium/dri: fix unsetting of format when encountering depth/stencil

Keith Packard (2):
  gallium: Add __DRIimageDriverExtension support to gallium
  gallium: Use base.stamp for all drawable invalidation checks.

 include/GL/internal/dri_interface.h|  11 +-
 src/Makefile.am|   4 +-
 src/egl/drivers/dri2/egl_dri2.h|   5 +-
 src/egl/drivers/dri2/platform_wayland.c| 171 ++---
 src/gallium/drivers/radeonsi/si_blit.c |  15 ++
 .../state_trackers/dri/common/dri_drawable.c   |   4 +-
 src/gallium/state_trackers/dri/drm/dri2.c  | 281 -
 src/glx/dri3_glx.c | 235 ++---
 src/glx/dri3_priv.h|   2 +
 src/loader/Makefile.am |  21 +-
 src/loader/loader.c| 215 
 src/loader/loader.h|   7 +
 src/mesa/drivers/dri/common/xmlconfig.c|  29 +++
 src/mesa/drivers/dri/common/xmlconfig.h|   9 +-
 src/mesa/drivers/dri/common/xmlpool/t_options.h|  14 +
 15 files changed, 933 insertions(+), 90 deletions(-)

-- 
1.9.1



[Mesa-dev] [PATCH 07/11] DRIimage: add blitImage to the specification

2014-05-27 Thread Axel Davy
It allows blitting between two __DRIimages.

Signed-off-by: Axel Davy 
---
 include/GL/internal/dri_interface.h | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/GL/internal/dri_interface.h 
b/include/GL/internal/dri_interface.h
index 4d57d0b..2ee3164 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -1005,7 +1005,7 @@ struct __DRIdri2ExtensionRec {
  * extensions.
  */
 #define __DRI_IMAGE "DRI_IMAGE"
-#define __DRI_IMAGE_VERSION 8
+#define __DRI_IMAGE_VERSION 9
 
 /**
  * These formats correspond to the similarly named MESA_FORMAT_*
@@ -1239,6 +1239,15 @@ struct __DRIimageExtensionRec {
  enum __DRIChromaSiting vert_siting,
  unsigned *error,
  void *loaderPrivate);
+
+   /**
+* Blit a part of a __DRIimage to another and flushes
+*
+* \since 9
+*/
+   void (*blitImage)(__DRIcontext *context, __DRIimage *dst, __DRIimage *src,
+ int dstx0, int dsty0, int dstwidth, int dstheight,
+ int srcx0, int srcy0, int srcwidth, int srcheight);
 };
 
 
-- 
1.9.1



[Mesa-dev] [PATCH 05/11] drirc: Add string support

2014-05-27 Thread Axel Davy
Signed-off-by: Axel Davy 
---
 src/mesa/drivers/dri/common/xmlconfig.c | 29 +
 src/mesa/drivers/dri/common/xmlconfig.h |  7 ++-
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/common/xmlconfig.c 
b/src/mesa/drivers/dri/common/xmlconfig.c
index b95e452..d41d2b2 100644
--- a/src/mesa/drivers/dri/common/xmlconfig.c
+++ b/src/mesa/drivers/dri/common/xmlconfig.c
@@ -311,6 +311,11 @@ static GLboolean parseValue (driOptionValue *v, 
driOptionType type,
   case DRI_FLOAT:
v->_float = strToF (string, &tail);
break;
+  case DRI_STRING:
+   if (v->_string)
+   free (v->_string);
+   v->_string = strndup(string, STRING_CONF_MAXLEN);
+   return GL_TRUE;
 }
 
 if (tail == string)
@@ -404,6 +409,8 @@ static GLboolean checkValue (const driOptionValue *v, const 
driOptionInfo *info)
v->_float <= info->ranges[i].end._float)
return GL_TRUE;
break;
+  case DRI_STRING:
+   break;
   default:
assert (0); /* should never happen */
 }
@@ -567,6 +574,8 @@ static void parseOptInfoAttr (struct OptInfoData *data, 
const XML_Char **attr) {
cache->info[opt].type = DRI_INT;
 else if (!strcmp (attrVal[OA_TYPE], "float"))
cache->info[opt].type = DRI_FLOAT;
+else if (!strcmp (attrVal[OA_TYPE], "string"))
+   cache->info[opt].type = DRI_STRING;
 else
XML_FATAL ("illegal type in option: %s.", attrVal[OA_TYPE]);
 
@@ -867,6 +876,7 @@ static void optConfEndElem (void *userData, const XML_Char 
*name) {
 
 /** \brief Initialize an option cache based on info */
 static void initOptionCache (driOptionCache *cache, const driOptionCache 
*info) {
+GLuint i, size = 1 << info->tableSize;
 cache->info = info->info;
 cache->tableSize = info->tableSize;
 cache->values = malloc((1values, info->values,
(1info[i].type == DRI_STRING)
+   XSTRDUP(cache->values[i]._string, info->values[i]._string);
+}
 }
 
 /** \brief Parse the named configuration file */
@@ -981,6 +995,13 @@ void driDestroyOptionInfo (driOptionCache *info) {
 }
 
 void driDestroyOptionCache (driOptionCache *cache) {
+if (cache->info) {
+   GLuint i, size = 1 << cache->tableSize;
+   for (i = 0; i < size; ++i) {
+   if (cache->info[i].type == DRI_STRING)
+   free(cache->values[i]._string);
+   }
+}
 free(cache->values);
 }
 
@@ -1013,3 +1034,11 @@ GLfloat driQueryOptionf (const driOptionCache *cache, 
const char *name) {
 assert (cache->info[i].type == DRI_FLOAT);
 return cache->values[i]._float;
 }
+
+char *driQueryOptionstr (const driOptionCache *cache, const char *name) {
+GLuint i = findOption (cache, name);
+  /* make sure the option is defined and has the correct type */
+assert (cache->info[i].name != NULL);
+assert (cache->info[i].type == DRI_STRING);
+return cache->values[i]._string;
+}
diff --git a/src/mesa/drivers/dri/common/xmlconfig.h 
b/src/mesa/drivers/dri/common/xmlconfig.h
index d0ad42c..786caae 100644
--- a/src/mesa/drivers/dri/common/xmlconfig.h
+++ b/src/mesa/drivers/dri/common/xmlconfig.h
@@ -30,9 +30,11 @@
 #ifndef __XMLCONFIG_H
 #define __XMLCONFIG_H
 
+#define STRING_CONF_MAXLEN 25
+
 /** \brief Option data types */
 typedef enum driOptionType {
-DRI_BOOL, DRI_ENUM, DRI_INT, DRI_FLOAT
+DRI_BOOL, DRI_ENUM, DRI_INT, DRI_FLOAT, DRI_STRING
 } driOptionType;
 
 /** \brief Option value */
@@ -40,6 +42,7 @@ typedef union driOptionValue {
 GLboolean _bool; /**< \brief Boolean */
 GLint _int;  /**< \brief Integer or Enum */
 GLfloat _float;  /**< \brief Floating-point */
+char *_string;   /**< \brief String */
 } driOptionValue;
 
 /** \brief Single range of valid values
@@ -118,5 +121,7 @@ GLboolean driQueryOptionb (const driOptionCache *cache, 
const char *name);
 GLint driQueryOptioni (const driOptionCache *cache, const char *name);
 /** \brief Query a floating-point option value */
 GLfloat driQueryOptionf (const driOptionCache *cache, const char *name);
+/** \brief Query a string option value */
+char *driQueryOptionstr (const driOptionCache *cache, const char *name);
 
 #endif
-- 
1.9.1



[Mesa-dev] [PATCH] i965: Fix repeated usage of rectangle texture coordinate scaling.

2014-05-27 Thread Kenneth Graunke
Previously, we set up new entries in the params[] array on every access
of a rectangle texture.  Unfortunately, we only reserve space for
(2 * MaxTextureImageUnits) extra entries, so programs which accessed
rectangle textures more times than that would write off the end of the
array and likely crash.

We don't really have a decent mapping between the index returned by
_mesa_add_state_reference and our index into the params array, so we
have to manually search for it.
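
The search-before-append idea can be sketched on its own as follows.
This is a simplified illustration, not the driver code: param_table,
MAX_PARAMS and find_or_add_scale() are hypothetical names, and the
table is a plain fixed-size array of pointers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARAMS 8   /* hypothetical capacity, for illustration only */

/* Simplified stand-in for the param[] array and the uniforms counter. */
struct param_table {
   const float *param[MAX_PARAMS];
   unsigned count;
};

/* Return the index of the first of two consecutive slots that point at
 * scale[0] and scale[1]. An existing pair is reused; a new pair is
 * appended only when the search finds none, so repeated lookups of the
 * same scale values do not grow the table. */
unsigned
find_or_add_scale(struct param_table *t, const float *scale)
{
   /* Try to find an existing copy first. */
   for (unsigned i = 0; i < t->count; i++) {
      if (t->param[i] == &scale[0])
         return i;
   }

   /* Not found: append the x and y scale factors. */
   assert(t->count + 2 <= MAX_PARAMS);
   unsigned index = t->count;
   t->param[t->count++] = &scale[0];
   t->param[t->count++] = &scale[1];
   return index;
}
```

Calling find_or_add_scale() any number of times with the same scale
pointer consumes two slots in total, whereas appending on every access
would eventually run past MAX_PARAMS.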

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78691
Signed-off-by: Kenneth Graunke 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 27 ---
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 171f063..be6b8ac 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1486,15 +1486,28 @@ fs_visitor::rescale_texcoord(ir_texture *ir, fs_reg 
coordinate,
 return coordinate;
   }
 
-  scale_x = fs_reg(UNIFORM, uniforms);
-  scale_y = fs_reg(UNIFORM, uniforms + 1);
-
   GLuint index = _mesa_add_state_reference(params,
   (gl_state_index *)tokens);
-  stage_prog_data->param[uniforms++] =
- &prog->Parameters->ParameterValues[index][0].f;
-  stage_prog_data->param[uniforms++] =
- &prog->Parameters->ParameterValues[index][1].f;
+  /* Try to find existing copies of the texrect scale uniforms. */
+  for (unsigned i = 0; i < uniforms; i++) {
+ if (stage_prog_data->param[i] ==
+ &prog->Parameters->ParameterValues[index][0].f) {
+scale_x = fs_reg(UNIFORM, i);
+scale_y = fs_reg(UNIFORM, i + 1);
+break;
+ }
+  }
+
+  /* If we didn't already set them up, do so now. */
+  if (scale_x.file == BAD_FILE) {
+ scale_x = fs_reg(UNIFORM, uniforms);
+ scale_y = fs_reg(UNIFORM, uniforms + 1);
+
+ stage_prog_data->param[uniforms++] =
+&prog->Parameters->ParameterValues[index][0].f;
+ stage_prog_data->param[uniforms++] =
+&prog->Parameters->ParameterValues[index][1].f;
+  }
}
 
/* The 965 requires the EU to do the normalization of GL rectangle
-- 
1.9.2



[Mesa-dev] [PATCH v2 2/4] glsl/tests/lower_jumps: fix generated sexpr's for loops

2014-05-27 Thread Connor Abbott
In 088494aa (as well as other commits in the series) Paul Berry modified
the tests for lower_jumps to account for the fact that the s-expression
for the loop IR instruction changed from
(loop () () () () (statements...)) to (loop (statements...)), but he
forgot to update create_test_cases.py which he used to create the tests.
Fix that, so that now create_test_cases.py is synced with the generated
tests.

Signed-off-by: Connor Abbott 
---
 src/glsl/tests/lower_jumps/create_test_cases.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/tests/lower_jumps/create_test_cases.py 
b/src/glsl/tests/lower_jumps/create_test_cases.py
index 9974681..3be1079 100644
--- a/src/glsl/tests/lower_jumps/create_test_cases.py
+++ b/src/glsl/tests/lower_jumps/create_test_cases.py
@@ -126,7 +126,7 @@ def loop(statements):
 body.
 """
 check_sexp(statements)
-return [['loop', [], [], [], [], statements]]
+return [['loop', statements]]
 
 def declare_temp(var_type, var_name):
 """Create a declaration of the form
-- 
1.8.3.1



[Mesa-dev] [PATCH v2 0/4] glsl/tests: remove generated files

2014-05-27 Thread Connor Abbott
While trying to modify the lower_jumps unit tests to account for my SSA
changes, I realized that the tests were not in sync with the file that
generated them. There were two problems:

-The *.expected files all had the same number of digits after the
decimal place (6) whereas the *.out files had 1 digit in "0.0" and 6
digits in "1.000000" when printing constants, which led to failures due
to diffs like:

-   ((if (expression bool > (var_ref b) (constant float (0.000000)))
+   ((if (expression bool > (var_ref b) (constant float (0.0)))


-Loops were incorrect in the input files.
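
The first mismatch comes down to using one format for every float
constant. As a minimal sketch (format_float_constant() is a
hypothetical helper, not the actual printing code in
ir_print_visitor.cpp), printing with a plain "%f" gives six digits
after the decimal point for 0.0 just as for any other value:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a float constant into buf using the same
 * format for every value. "%f" always emits six digits after the
 * decimal point, so 0.0 is no longer a special case. Returns the
 * snprintf() result. */
int
format_float_constant(char *buf, size_t size, float value)
{
   assert(buf != NULL && size > 0);
   return snprintf(buf, size, "%f", value);
}
```

With this, 0.0f is formatted as "0.000000" and 1.0f as "1.000000",
matching the form used in the *.expected files.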

This series fixes both problems, and then removes the generated tests so
that stuff like this won't happen again.

v2: actually generate the test files

Connor Abbott (4):
  glsl: be more consistent about printing constants
  glsl/tests/lower_jumps: fix generated sexpr's for loops
  glsl/tests: call create_test_cases.py in optimization-test
  glsl/tests: remove generated tests from the repo

 src/glsl/ir_print_visitor.cpp  |  2 +-
 src/glsl/tests/lower_jumps/.gitignore  |  2 ++
 src/glsl/tests/lower_jumps/create_test_cases.py|  2 +-
 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test | 13 --
 .../lower_jumps/lower_breaks_1.opt_test.expected   |  5 
 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test | 15 ---
 .../lower_jumps/lower_breaks_2.opt_test.expected   |  7 --
 src/glsl/tests/lower_jumps/lower_breaks_3.opt_test | 17 -
 .../lower_jumps/lower_breaks_3.opt_test.expected   |  8 --
 src/glsl/tests/lower_jumps/lower_breaks_4.opt_test | 15 ---
 .../lower_jumps/lower_breaks_4.opt_test.expected   |  7 --
 src/glsl/tests/lower_jumps/lower_breaks_5.opt_test | 16 
 .../lower_jumps/lower_breaks_5.opt_test.expected   |  7 --
 src/glsl/tests/lower_jumps/lower_breaks_6.opt_test | 29 --
 .../lower_jumps/lower_breaks_6.opt_test.expected   | 29 --
 .../lower_guarded_conditional_break.opt_test   | 21 
 ...wer_guarded_conditional_break.opt_test.expected | 20 ---
 .../lower_jumps/lower_pulled_out_jump.opt_test | 28 -
 .../lower_pulled_out_jump.opt_test.expected| 25 ---
 .../tests/lower_jumps/lower_returns_1.opt_test | 12 -
 .../lower_jumps/lower_returns_1.opt_test.expected  |  4 ---
 .../tests/lower_jumps/lower_returns_2.opt_test | 13 --
 .../lower_jumps/lower_returns_2.opt_test.expected  |  5 
 .../tests/lower_jumps/lower_returns_3.opt_test | 20 ---
 .../lower_jumps/lower_returns_3.opt_test.expected  | 21 
 .../tests/lower_jumps/lower_returns_4.opt_test | 14 ---
 .../lower_jumps/lower_returns_4.opt_test.expected  | 16 
 .../lower_jumps/lower_returns_main_false.opt_test  | 17 -
 .../lower_returns_main_false.opt_test.expected |  8 --
 .../lower_jumps/lower_returns_main_true.opt_test   | 17 -
 .../lower_returns_main_true.opt_test.expected  | 13 --
 .../lower_jumps/lower_returns_sub_false.opt_test   | 16 
 .../lower_returns_sub_false.opt_test.expected  |  8 --
 .../lower_jumps/lower_returns_sub_true.opt_test| 16 
 .../lower_returns_sub_true.opt_test.expected   | 13 --
 .../lower_jumps/lower_unified_returns.opt_test | 26 ---
 .../lower_unified_returns.opt_test.expected| 21 
 .../remove_continue_at_end_of_loop.opt_test| 13 --
 ...emove_continue_at_end_of_loop.opt_test.expected |  5 
 ..._non_void_at_end_of_loop_lower_nothing.opt_test | 16 
 ..._at_end_of_loop_lower_nothing.opt_test.expected |  8 --
 ...n_non_void_at_end_of_loop_lower_return.opt_test | 16 
 ...d_at_end_of_loop_lower_return.opt_test.expected | 19 --
 ..._at_end_of_loop_lower_return_and_break.opt_test | 16 
 ...f_loop_lower_return_and_break.opt_test.expected | 19 --
 ...turn_void_at_end_of_loop_lower_nothing.opt_test | 14 ---
 ..._at_end_of_loop_lower_nothing.opt_test.expected |  6 -
 ...eturn_void_at_end_of_loop_lower_return.opt_test | 14 ---
 ...d_at_end_of_loop_lower_return.opt_test.expected | 11 
 ..._at_end_of_loop_lower_return_and_break.opt_test | 14 ---
 ...f_loop_lower_return_and_break.opt_test.expected | 11 
 src/glsl/tests/optimization-test   |  8 ++
 52 files changed, 12 insertions(+), 706 deletions(-)
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_3.opt_test
 delete mode 100644 

[Mesa-dev] [PATCH v2 4/4] glsl/tests: remove generated tests from the repo

2014-05-27 Thread Connor Abbott
They were made unnecessary by the last commit.

Signed-off-by: Connor Abbott 
---
 src/glsl/tests/lower_jumps/.gitignore  |  2 ++
 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test | 12 -
 .../lower_jumps/lower_breaks_1.opt_test.expected   |  4 ---
 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test | 15 ---
 .../lower_jumps/lower_breaks_2.opt_test.expected   |  7 --
 src/glsl/tests/lower_jumps/lower_breaks_3.opt_test | 17 -
 .../lower_jumps/lower_breaks_3.opt_test.expected   |  8 --
 src/glsl/tests/lower_jumps/lower_breaks_4.opt_test | 15 ---
 .../lower_jumps/lower_breaks_4.opt_test.expected   |  7 --
 src/glsl/tests/lower_jumps/lower_breaks_5.opt_test | 16 
 .../lower_jumps/lower_breaks_5.opt_test.expected   |  7 --
 src/glsl/tests/lower_jumps/lower_breaks_6.opt_test | 29 --
 .../lower_jumps/lower_breaks_6.opt_test.expected   | 29 --
 .../lower_guarded_conditional_break.opt_test   | 21 
 ...wer_guarded_conditional_break.opt_test.expected | 20 ---
 .../lower_jumps/lower_pulled_out_jump.opt_test | 28 -
 .../lower_pulled_out_jump.opt_test.expected| 25 ---
 .../tests/lower_jumps/lower_returns_1.opt_test | 12 -
 .../lower_jumps/lower_returns_1.opt_test.expected  |  4 ---
 .../tests/lower_jumps/lower_returns_2.opt_test | 13 --
 .../lower_jumps/lower_returns_2.opt_test.expected  |  5 
 .../tests/lower_jumps/lower_returns_3.opt_test | 20 ---
 .../lower_jumps/lower_returns_3.opt_test.expected  | 21 
 .../tests/lower_jumps/lower_returns_4.opt_test | 14 ---
 .../lower_jumps/lower_returns_4.opt_test.expected  | 16 
 .../lower_jumps/lower_returns_main_false.opt_test  | 17 -
 .../lower_returns_main_false.opt_test.expected |  8 --
 .../lower_jumps/lower_returns_main_true.opt_test   | 17 -
 .../lower_returns_main_true.opt_test.expected  | 13 --
 .../lower_jumps/lower_returns_sub_false.opt_test   | 16 
 .../lower_returns_sub_false.opt_test.expected  |  8 --
 .../lower_jumps/lower_returns_sub_true.opt_test| 16 
 .../lower_returns_sub_true.opt_test.expected   | 13 --
 .../lower_jumps/lower_unified_returns.opt_test | 26 ---
 .../lower_unified_returns.opt_test.expected| 21 
 .../remove_continue_at_end_of_loop.opt_test| 12 -
 ...emove_continue_at_end_of_loop.opt_test.expected |  4 ---
 ..._non_void_at_end_of_loop_lower_nothing.opt_test | 16 
 ..._at_end_of_loop_lower_nothing.opt_test.expected |  8 --
 ...n_non_void_at_end_of_loop_lower_return.opt_test | 16 
 ...d_at_end_of_loop_lower_return.opt_test.expected | 19 --
 ..._at_end_of_loop_lower_return_and_break.opt_test | 16 
 ...f_loop_lower_return_and_break.opt_test.expected | 19 --
 ...turn_void_at_end_of_loop_lower_nothing.opt_test | 13 --
 ..._at_end_of_loop_lower_nothing.opt_test.expected |  5 
 ...eturn_void_at_end_of_loop_lower_return.opt_test | 13 --
 ...d_at_end_of_loop_lower_return.opt_test.expected | 11 
 ..._at_end_of_loop_lower_return_and_break.opt_test | 13 --
 ...f_loop_lower_return_and_break.opt_test.expected | 11 
 49 files changed, 2 insertions(+), 696 deletions(-)
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_3.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_3.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_4.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_4.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_5.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_5.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_breaks_6.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_breaks_6.opt_test.expected
 delete mode 100755 
src/glsl/tests/lower_jumps/lower_guarded_conditional_break.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_guarded_conditional_break.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_pulled_out_jump.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_pulled_out_jump.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_returns_1.opt_test
 delete mode 100644 src/glsl/tests/lower_jumps/lower_returns_1.opt_test.expected
 delete mode 100755 src/glsl/tests/lower_jumps/lower_returns_2.opt_test
 delete m

[Mesa-dev] [PATCH v2 3/4] glsl/tests: call create_test_cases.py in optimization-test

2014-05-27 Thread Connor Abbott
This way, when someone modifies create_test_cases.py and forgets to
commit their changes again, people will notice.

v2: make sure we parse the right directories and check for existence the
right way.

Signed-off-by: Connor Abbott 
---
 src/glsl/tests/optimization-test | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/glsl/tests/optimization-test b/src/glsl/tests/optimization-test
index 8ca7776..bf15153 100755
--- a/src/glsl/tests/optimization-test
+++ b/src/glsl/tests/optimization-test
@@ -9,6 +9,14 @@ fi
 total=0
 pass=0
 
+echo "==   Generating tests  =="
+for dir in tests/*/; do
+if [ -e "${dir}create_test_cases.py" ]; then
+cd $dir; python create_test_cases.py; cd ..
+fi
+echo "$dir"
+done
+
 echo "== Testing optimization passes =="
 for test in `find . -iname '*.opt_test'`; do
 echo -n "Testing $test..."
-- 
1.8.3.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 1/4] glsl: be more consistent about printing constants

2014-05-27 Thread Connor Abbott
Make sure that we print the same number of digits when printing 0.0 as
any other floating-point number. This will make generating expected
output files for tests easier. To avoid breaking "make check," update
the generated tests for lower_jumps before the next commit which will
bring create_test_cases.py in line with them.
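
The formatting difference described above can be sketched in isolation.
This is a standalone illustration, not code from the patch: format_float
is a hypothetical helper, and the only facts relied on are the standard
printf semantics of "%.1f" and "%f".

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of Mesa): format one float with a
// printf-style format string and return the result as a std::string.
static std::string format_float(const char *fmt, float value)
{
   char buf[64];
   int n = std::snprintf(buf, sizeof(buf), fmt, value);
   assert(n >= 0 && n < (int)sizeof(buf));
   return std::string(buf);
}

// Old behaviour for zero: "%.1f" prints one fractional digit.
static std::string print_zero_old(float value)
{
   return format_float("%.1f", value);
}

// New behaviour for zero: "%f" prints six fractional digits, the same
// width any other value printed with "%f" gets, and keeps the sign of
// negative zero.
static std::string print_zero_new(float value)
{
   return format_float("%f", value);
}
```

With "%f", zero gets the same number of fractional digits as any other
constant printed through "%f", which is what lets a generator script
produce expected files without special-casing zero.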

Signed-off-by: Connor Abbott 
---
 src/glsl/ir_print_visitor.cpp  |  2 +-
 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test |  3 +--
 src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected|  3 +--
 src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected|  2 +-
 src/glsl/tests/lower_jumps/lower_breaks_3.opt_test.expected|  2 +-
 src/glsl/tests/lower_jumps/lower_breaks_4.opt_test.expected|  2 +-
 src/glsl/tests/lower_jumps/lower_breaks_5.opt_test.expected|  2 +-
 src/glsl/tests/lower_jumps/lower_breaks_6.opt_test.expected| 10 +-
 .../lower_guarded_conditional_break.opt_test.expected  |  6 +++---
 .../tests/lower_jumps/lower_pulled_out_jump.opt_test.expected  |  8 
 src/glsl/tests/lower_jumps/lower_returns_3.opt_test.expected   |  4 ++--
 src/glsl/tests/lower_jumps/lower_returns_4.opt_test.expected   |  2 +-
 .../lower_jumps/lower_returns_main_false.opt_test.expected |  4 ++--
 .../lower_jumps/lower_returns_main_true.opt_test.expected  |  4 ++--
 .../lower_jumps/lower_returns_sub_false.opt_test.expected  |  4 ++--
 .../tests/lower_jumps/lower_returns_sub_true.opt_test.expected |  4 ++--
 .../tests/lower_jumps/lower_unified_returns.opt_test.expected  |  8 
 .../tests/lower_jumps/remove_continue_at_end_of_loop.opt_test  |  3 +--
 .../remove_continue_at_end_of_loop.opt_test.expected   |  3 +--
 .../return_void_at_end_of_loop_lower_nothing.opt_test  |  3 +--
 .../return_void_at_end_of_loop_lower_nothing.opt_test.expected |  3 +--
 .../return_void_at_end_of_loop_lower_return.opt_test   |  3 +--
 .../return_void_at_end_of_loop_lower_return_and_break.opt_test |  3 +--
 23 files changed, 40 insertions(+), 48 deletions(-)

diff --git a/src/glsl/ir_print_visitor.cpp b/src/glsl/ir_print_visitor.cpp
index 0a7695a..a3d851e 100644
--- a/src/glsl/ir_print_visitor.cpp
+++ b/src/glsl/ir_print_visitor.cpp
@@ -430,7 +430,7 @@ void ir_print_visitor::visit(ir_constant *ir)
 case GLSL_TYPE_FLOAT:
 if (ir->value.f[i] == 0.0f)
/* 0.0 == -0.0, so print with %f to get the proper sign. */
-   fprintf(f, "%.1f", ir->value.f[i]);
+   fprintf(f, "%f", ir->value.f[i]);
 else if (fabs(ir->value.f[i]) < 0.01f)
fprintf(f, "%a", ir->value.f[i]);
 else if (fabs(ir->value.f[i]) > 100.0f)
diff --git a/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test b/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test
index b412ba8..e2d4ed1 100755
--- a/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test
+++ b/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test
@@ -8,6 +8,5 @@
 ((declare (out) float a)
  (function main
   (signature void (parameters)
-   ((loop
- ((assign (x) (var_ref a) (constant float (1.00))) break))
+   ((loop ((assign (x) (var_ref a) (constant float (1.00))) break))
 EOF
diff --git a/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected b/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected
index 56ef3e4..270a43d 100644
--- a/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected
+++ b/src/glsl/tests/lower_jumps/lower_breaks_1.opt_test.expected
@@ -1,5 +1,4 @@
 ((declare (out) float a)
  (function main
   (signature void (parameters)
-   ((loop
- ((assign (x) (var_ref a) (constant float (1.00))) break))
+   ((loop ((assign (x) (var_ref a) (constant float (1.00))) break))
diff --git a/src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected b/src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected
index dc231f9..73a1d56 100644
--- a/src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected
+++ b/src/glsl/tests/lower_jumps/lower_breaks_2.opt_test.expected
@@ -3,5 +3,5 @@
   (signature void (parameters)
((loop
  ((assign (x) (var_ref a) (constant float (1.00)))
-  (if (expression bool > (var_ref b) (constant float (0.0))) (break)
+  (if (expression bool > (var_ref b) (constant float (0.00))) (break)
(
diff --git a/src/glsl/tests/lower_jumps/lower_breaks_3.opt_test.expected b/src/glsl/tests/lower_jumps/lower_breaks_3.opt_test.expected
index 8131b66..53d5392 100644
--- a/src/glsl/tests/lower_jumps/lower_breaks_3.opt_test.expected
+++ b/src/glsl/tests/lower_jumps/lower_breaks_3.opt_test.expected
@@ -3,6 +3,6 @@
   (signature void (parameters)
((loop
  ((assign (x) (var_ref a) (constant float (1.00)))
-  (if (expression bool > (var_ref b) (constant float (0.0)))
+  (if (expression bool > (var_ref b) (constant float (0.00)))
((assign (x) (

[Mesa-dev] [PATCH 01/19] i965/fs: Add and use an fs_inst copy constructor.

2014-05-27 Thread Matt Turner
Will get more complicated when fs_reg src becomes a pointer.
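
Why the copy constructor gets more complicated once the sources live
behind a pointer can be sketched as below. This is a simplified model
under stated assumptions, not the driver's code: Reg and Inst are
hypothetical stand-ins for fs_reg and fs_inst, and plain new[]/delete[]
replaces ralloc.

```cpp
#include <cassert>

// Hypothetical stand-in for fs_reg.
struct Reg {
   int file;
   int reg;
};

// Hypothetical stand-in for fs_inst once its sources are heap-allocated.
class Inst {
public:
   int opcode;
   int sources;
   Reg *src;

   Inst(int opcode, int sources)
      : opcode(opcode), sources(sources), src(new Reg[sources]())
   {
      assert(sources >= 0);
   }

   // A byte-wise copy would duplicate only the pointer, leaving both
   // instructions sharing one source array. The copy constructor has to
   // allocate a fresh array and copy each element instead.
   Inst(const Inst &that)
      : opcode(that.opcode), sources(that.sources),
        src(new Reg[that.sources])
   {
      for (int i = 0; i < sources; i++)
         src[i] = that.src[i];
   }

   ~Inst() { delete[] src; }

private:
   // Assignment is left unimplemented so a shallow copy cannot sneak in.
   Inst &operator=(const Inst &);
};
```

After such a copy, writing to one instruction's sources leaves the
other's untouched, which is the property a plain memcpy() stops
providing once src is a pointer.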
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 5 +
 src/mesa/drivers/dri/i965/brw_fs.h   | 1 +
 2 files changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index bd77e0c..5b7d84f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -139,6 +139,11 @@ fs_inst::fs_inst(enum opcode opcode, fs_reg dst,
   assert(src[2].reg_offset >= 0);
 }
 
+fs_inst::fs_inst(const fs_inst &that)
+{
+   memcpy(this, &that, sizeof(that));
+}
+
 #define ALU1(op)\
fs_inst *\
fs_visitor::op(fs_reg dst, fs_reg src0)  \
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 789f0b3..bda233c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -197,6 +197,7 @@ public:
fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0, fs_reg src1);
fs_inst(enum opcode opcode, fs_reg dst,
fs_reg src0, fs_reg src1,fs_reg src2);
+   fs_inst(const fs_inst &that);
 
bool equals(fs_inst *inst) const;
bool overwrites_reg(const fs_reg &reg) const;
-- 
1.8.3.2



[Mesa-dev] [PATCH 00/19] i965/fs: load_payload on Gen7+.

2014-05-27 Thread Matt Turner
Here's a respin of my load_payload series from mid-April with some
feedback from Ken addressed and some bugs fixed.

This series is available in my tree (with a few unrelated patches
before it)

   git://people.freedesktop.org/~mattst88/mesa tex-sources

This is a prep series for implementing SSA in the i965 fragment
shader backend.

I haven't done any testing on Gen < 7, but I think everything
should still work afterward on those platforms. While we won't
generate load_payload instructions from the texture visitor, we
may generate them from the cubemap fixup and from CSE. Both should
be safe since their destinations are general purpose registers,
not MRFs.

It'd be nice to extend this to MRFs, since, as I've planned it, this
work will be necessary for SSA in the fs backend.

total instructions in shared programs: 1686122 -> 1677922 (-0.49%)
instructions in affected programs: 635490 -> 627290 (-1.29%)
GAINED:20
LOST:  0
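
For reference, the percentages above follow directly from the
instruction counts. A minimal sketch of the arithmetic, where
percent_change is a hypothetical helper and not part of any reporting
script:

```cpp
#include <cassert>

// Relative change from `before` to `after`, in percent.
// Negative values mean the count went down.
static double percent_change(long before, long after)
{
   assert(before != 0);
   return 100.0 * (double)(after - before) / (double)before;
}
```

Both lines remove the same 8200 instructions; the second percentage is
larger only because it is measured against the affected programs rather
than the whole set.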

A small handful of shaders gain an extra instruction or two.

I'd really like to get some R-b tags this time around, since this
series is necessary for the SSA work I've been anxious to get back to
for a while.

i965/fs: Add and use an fs_inst copy constructor.
i965/fs: Disable fs_inst assignment operator.
i965/fs: ralloc fs_inst's fs_reg sources.
i965/fs: Store the number of sources an fs_inst has.
i965/fs: Loop from 0 to inst->sources, not 0 to 3.
i965/fs: Clean up fs_inst constructors.
i965/fs: Add a function to resize fs_inst's sources
i965/fs: Add fs_inst constructor that takes a list of

   Preparatory work and infrastructure for instructions with
   variable numbers of sources.

i965/fs: Add SHADER_OPCODE_LOAD_PAYLOAD.
i965/fs: Lower LOAD_PAYLOAD and clean up.
i965/fs: Use LOAD_PAYLOAD in emit_texture_gen7().
i965/fs: Apply cube map array fixup and restore the

   The main implementation of load_payload on Gen7.

i965/fs: Only consider real sources when comparing
i965/fs: Emit load_payload instead of multiple MOVs for
i965/fs: Support register coalescing on LOAD_PAYLOAD
i965/fs: Perform CSE on load_payload instructions if
i965/fs: Copy propagate from load_payload.

   A series of patches to teach our optimization passes
   about the new virtual instruction.

i965/fs: Perform CSE on texture operations.
i965/fs: Optimize SEL with the same sources into a MOV.

   A couple more optimizations enabled by this series.


[Mesa-dev] [PATCH 02/19] i965/fs: Disable fs_inst assignment operator.

2014-05-27 Thread Matt Turner
The fs_reg src array is going to turn into a pointer and we'd rather not
consider the implications of shallow copying fs_insts.
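
The idiom used here, declaring the assignment operator private and
never defining it, can be sketched on its own. Instruction below is a
hypothetical class, not fs_inst; the static_asserts need C++11 and are
only there to make the effect visible (C++11 code would more commonly
write "= delete").

```cpp
#include <type_traits>

class Instruction {
   // Private and never defined: any attempt to assign one Instruction
   // to another outside the class fails to compile.
   Instruction &operator=(const Instruction &);

public:
   explicit Instruction(int opcode) : opcode(opcode) {}

   // Copy construction stays available and can do a proper deep copy.
   Instruction(const Instruction &that) : opcode(that.opcode) {}

   int opcode;
};

static_assert(std::is_copy_constructible<Instruction>::value,
              "copy construction is still allowed");
static_assert(!std::is_copy_assignable<Instruction>::value,
              "assignment is rejected at compile time");
```

Every copy then has to go through the copy constructor, so there is a
single place where the source array has to be handled correctly.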

Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index bda233c..e079842 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -185,6 +185,8 @@ public:
 };
 
 class fs_inst : public backend_instruction {
+   fs_inst &operator=(const fs_inst &);
+
 public:
DECLARE_RALLOC_CXX_OPERATORS(fs_inst)
 
-- 
1.8.3.2



[Mesa-dev] [PATCH 17/19] i965/fs: Copy propagate from load_payload.

2014-05-27 Thread Matt Turner
But only into non-load_payload instructions. Otherwise we would prevent
register coalescing from combining identical payloads.
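
The rule described above reduces to a small predicate. The sketch below
is a simplified model: Opcode, AcpEntry and may_propagate are
hypothetical names, and the real pass applies several other checks that
are left out here.

```cpp
// Hypothetical, reduced opcode set.
enum Opcode { OP_MOV, OP_ADD, OP_LOAD_PAYLOAD };

// Hypothetical stand-in for acp_entry: one available copy dst <- src.
struct AcpEntry {
   int dst_reg;
   int src_reg;
   bool is_from_load_payload;
};

// Returns false only for the case the commit message rules out: a copy
// recorded from a load_payload is not propagated into another
// load_payload, so identical payloads stay recognisable to later passes.
static bool may_propagate(const AcpEntry &entry, Opcode consumer)
{
   if (entry.is_from_load_payload && consumer == OP_LOAD_PAYLOAD)
      return false;
   return true;
}
```

Copies that come from an ordinary MOV are unaffected by this check, and
copies that come from a load_payload can still be propagated into any
other kind of instruction.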
---
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index a1aff21..f00ccf2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -42,6 +42,7 @@ namespace { /* avoid conflict with opt_copy_propagation_elements */
 struct acp_entry : public exec_node {
fs_reg dst;
fs_reg src;
+   bool is_from_load_payload;
 };
 
 struct block_data {
@@ -278,6 +279,10 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
if (entry->src.file == IMM)
   return false;
 
+   if (entry->is_from_load_payload &&
+   inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD)
+  return false;
+
/* Bail if inst is reading more than entry is writing. */
if ((inst->regs_read(this, arg) * inst->src[arg].stride *
 type_sz(inst->src[arg].type)) > type_sz(entry->dst.type))
@@ -545,7 +550,24 @@ fs_visitor::opt_copy_propagate_local(void *copy_prop_ctx, bblock_t *block,
 acp_entry *entry = ralloc(copy_prop_ctx, acp_entry);
 entry->dst = inst->dst;
 entry->src = inst->src[0];
+ entry->is_from_load_payload = false;
 acp[entry->dst.reg % ACP_HASH_SIZE].push_tail(entry);
+  } else if (inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD &&
+ inst->dst.file == GRF) {
+ for (int i = 0; i < inst->sources; i++) {
+if (inst->src[i].file == GRF) {
+   acp_entry *entry = ralloc(copy_prop_ctx, acp_entry);
+   entry->dst = inst->dst;
+   entry->dst.reg_offset = i;
+   entry->src = inst->src[i];
+   entry->is_from_load_payload = true;
+   if (!entry->dst.equals(inst->src[i])) {
+  acp[entry->dst.reg % ACP_HASH_SIZE].push_tail(entry);
+   } else {
+  ralloc_free(entry);
+   }
+}
+ }
   }
}
 
-- 
1.8.3.2



[Mesa-dev] [PATCH 14/19] i965/fs: Emit load_payload instead of multiple MOVs for large VGRFs.

2014-05-27 Thread Matt Turner
---
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 33 
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index e40567f..5037579 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -177,15 +177,20 @@ fs_visitor::opt_cse_local(bblock_t *block, exec_list *aeb)
entry->tmp = tmp;
entry->generator->dst = tmp;
 
-   for (int i = 0; i < written; i++) {
-  fs_inst *copy = MOV(orig_dst, tmp);
+   fs_inst *copy;
+   if (written > 1) {
+  fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written);
+  for (int i = 0; i < written; i++) {
+ sources[i] = tmp;
+ sources[i].reg_offset = i;
+  }
+  copy = LOAD_PAYLOAD(orig_dst, sources, written);
+   } else {
+  copy = MOV(orig_dst, tmp);
   copy->force_writemask_all =
  entry->generator->force_writemask_all;
-  entry->generator->insert_after(copy);
-
-  orig_dst.reg_offset++;
-  tmp.reg_offset++;
}
+   entry->generator->insert_after(copy);
}
 
/* dest <- temp */
@@ -195,15 +200,19 @@ fs_visitor::opt_cse_local(bblock_t *block, exec_list *aeb)
assert(inst->dst.type == entry->tmp.type);
fs_reg dst = inst->dst;
fs_reg tmp = entry->tmp;
-   fs_inst *copy = NULL;
-   for (int i = 0; i < written; i++) {
+   fs_inst *copy;
+   if (written > 1) {
+  fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written);
+  for (int i = 0; i < written; i++) {
+ sources[i] = tmp;
+ sources[i].reg_offset = i;
+  }
+  copy = LOAD_PAYLOAD(dst, sources, written);
+   } else {
   copy = MOV(dst, tmp);
   copy->force_writemask_all = inst->force_writemask_all;
-  inst->insert_before(copy);
-
-  dst.reg_offset++;
-  tmp.reg_offset++;
}
+   inst->insert_before(copy);
 }
 
 /* Set our iterator so that next time through the loop inst->next
-- 
1.8.3.2



[Mesa-dev] [PATCH 05/19] i965/fs: Loop from 0 to inst->sources, not 0 to 3.

2014-05-27 Thread Matt Turner
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 24 +++---
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |  4 ++--
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp   |  2 +-
 .../dri/i965/brw_fs_dead_code_eliminate.cpp|  2 +-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  2 +-
 .../drivers/dri/i965/brw_fs_live_variables.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |  6 +++---
 .../drivers/dri/i965/brw_fs_register_coalesce.cpp  |  2 +-
 .../dri/i965/brw_fs_saturate_propagation.cpp   |  2 +-
 .../drivers/dri/i965/brw_schedule_instructions.cpp | 10 -
 10 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a9a8ac1..8b13683 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1472,7 +1472,7 @@ fs_visitor::assign_curb_setup()
foreach_list(node, &this->instructions) {
   fs_inst *inst = (fs_inst *)node;
 
-  for (unsigned int i = 0; i < 3; i++) {
+  for (unsigned int i = 0; i < inst->sources; i++) {
 if (inst->src[i].file == UNIFORM) {
 int uniform_nr = inst->src[i].reg + inst->src[i].reg_offset;
 int constant_nr;
@@ -1670,7 +1670,7 @@ fs_visitor::split_virtual_grfs()
* the send is reading the whole thing.
*/
   if (inst->is_send_from_grf()) {
- for (int i = 0; i < 3; i++) {
+ for (int i = 0; i < inst->sources; i++) {
 if (inst->src[i].file == GRF) {
split_grf[inst->src[i].reg] = false;
 }
@@ -1703,7 +1703,7 @@ fs_visitor::split_virtual_grfs()
  inst->dst.reg_offset - 1);
 inst->dst.reg_offset = 0;
   }
-  for (int i = 0; i < 3; i++) {
+  for (int i = 0; i < inst->sources; i++) {
 if (inst->src[i].file == GRF &&
 split_grf[inst->src[i].reg] &&
 inst->src[i].reg_offset != 0) {
@@ -1741,7 +1741,7 @@ fs_visitor::compact_virtual_grfs()
   if (inst->dst.file == GRF)
  remap_table[inst->dst.reg] = 0;
 
-  for (int i = 0; i < 3; i++) {
+  for (int i = 0; i < inst->sources; i++) {
  if (inst->src[i].file == GRF)
 remap_table[inst->src[i].reg] = 0;
   }
@@ -1767,7 +1767,7 @@ fs_visitor::compact_virtual_grfs()
   if (inst->dst.file == GRF)
  inst->dst.reg = remap_table[inst->dst.reg];
 
-  for (int i = 0; i < 3; i++) {
+  for (int i = 0; i < inst->sources; i++) {
  if (inst->src[i].file == GRF)
 inst->src[i].reg = remap_table[inst->src[i].reg];
   }
@@ -1807,7 +1807,7 @@ fs_visitor::move_uniform_array_access_to_pull_constants()
foreach_list_safe(node, &this->instructions) {
   fs_inst *inst = (fs_inst *)node;
 
-  for (int i = 0 ; i < 3; i++) {
+  for (int i = 0 ; i < inst->sources; i++) {
  if (inst->src[i].file != UNIFORM || !inst->src[i].reladdr)
 continue;
 
@@ -1857,7 +1857,7 @@ fs_visitor::assign_constant_locations()
foreach_list(node, &this->instructions) {
   fs_inst *inst = (fs_inst *) node;
 
-  for (int i = 0; i < 3; i++) {
+  for (int i = 0; i < inst->sources; i++) {
  if (inst->src[i].file != UNIFORM)
 continue;
 
@@ -1928,7 +1928,7 @@ fs_visitor::demote_pull_constants()
foreach_list(node, &this->instructions) {
   fs_inst *inst = (fs_inst *)node;
 
-  for (int i = 0; i < 3; i++) {
+  for (int i = 0; i < inst->sources; i++) {
 if (inst->src[i].file != UNIFORM)
continue;
 
@@ -2180,7 +2180,7 @@ fs_visitor::compute_to_mrf()
  * MRF's source GRF that we wanted to rewrite, that stops us.
  */
 bool interfered = false;
-for (int i = 0; i < 3; i++) {
+for (int i = 0; i < scan_inst->sources; i++) {
if (scan_inst->src[i].file == GRF &&
scan_inst->src[i].reg == inst->src[0].reg &&
scan_inst->src[i].reg_offset == inst->src[0].reg_offset) {
@@ -2319,7 +2319,7 @@ clear_deps_for_inst_src(fs_inst *inst, int dispatch_width, bool *deps,
!inst->force_sechalf);
 
/* Clear the flag for registers that actually got read (as expected). */
-   for (int i = 0; i < 3; i++) {
+   for (int i = 0; i < inst->sources; i++) {
   int grf;
   if (inst->src[i].file == GRF) {
  grf = inst->src[i].reg;
@@ -2697,7 +2697,7 @@ fs_visitor::dump_instruction(backend_instruction *be_inst, FILE *file)
}
fprintf(file, ":%s, ", brw_reg_type_letters(inst->dst.type));
 
-   for (int i = 0; i < 3 && inst->src[i].file != BAD_FILE; i++) {
+   for (int i = 0; i < inst->sources && inst->src[i].file != BAD_FILE; i++) {
   if (inst->src[i].negate)
  fprintf(file, "-");
   if (inst->src[i].abs)
@@ -2786,7 +2786,7 @@ fs_visitor::dump_instruction(backend_instruction *be_inst, FILE *file)
  fprintf(file

[Mesa-dev] [PATCH 11/19] i965/fs: Use LOAD_PAYLOAD in emit_texture_gen7().

2014-05-27 Thread Matt Turner
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 135 +++
 1 file changed, 73 insertions(+), 62 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index b51ecc1..10ec254 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1268,8 +1268,11 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
int reg_width = dispatch_width / 8;
bool header_present = false;
 
-   fs_reg payload = fs_reg(this, glsl_type::float_type);
-   fs_reg next = payload;
+   fs_reg *sources = ralloc_array(mem_ctx, fs_reg, MAX_SAMPLER_MESSAGE_SIZE);
+   for (int i = 0; i < MAX_SAMPLER_MESSAGE_SIZE; i++) {
+  sources[i] = fs_reg(this, glsl_type::float_type);
+   }
+   int length = 0;
 
if (ir->op == ir_tg4 || (ir->offset && ir->op != ir_txf) || sampler >= 16) {
   /* For general texture offsets (no txf workaround), we need a header to
@@ -1283,12 +1286,13 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
* need to offset the Sampler State Pointer in the header.
*/
   header_present = true;
-  next.reg_offset++;
+  sources[length] = reg_undef;
+  length++;
}
 
if (ir->shadow_comparitor) {
-  emit(MOV(next, shadow_c));
-  next.reg_offset++;
+  emit(MOV(sources[length], shadow_c));
+  length++;
}
 
bool has_nonconstant_offset = ir->offset && !ir->offset->as_constant();
@@ -1300,12 +1304,12 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
case ir_lod:
   break;
case ir_txb:
-  emit(MOV(next, lod));
-  next.reg_offset++;
+  emit(MOV(sources[length], lod));
+  length++;
   break;
case ir_txl:
-  emit(MOV(next, lod));
-  next.reg_offset++;
+  emit(MOV(sources[length], lod));
+  length++;
   break;
case ir_txd: {
   no16("Gen7 does not support sample_d/sample_d_c in SIMD16 mode.");
@@ -1314,21 +1318,21 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
* [hdr], [ref], x, dPdx.x, dPdy.x, y, dPdx.y, dPdy.y, z, dPdx.z, dPdy.z
*/
   for (int i = 0; i < ir->coordinate->type->vector_elements; i++) {
-emit(MOV(next, coordinate));
+emit(MOV(sources[length], coordinate));
 coordinate.reg_offset++;
-next.reg_offset++;
+length++;
 
  /* For cube map array, the coordinate is (u,v,r,ai) but there are
   * only derivatives for (u, v, r).
   */
  if (i < ir->lod_info.grad.dPdx->type->vector_elements) {
-emit(MOV(next, lod));
+emit(MOV(sources[length], lod));
 lod.reg_offset++;
-next.reg_offset++;
+length++;
 
-emit(MOV(next, lod2));
+emit(MOV(sources[length], lod2));
 lod2.reg_offset++;
-next.reg_offset++;
+length++;
  }
   }
 
@@ -1336,45 +1340,45 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate,
   break;
}
case ir_txs:
-  emit(MOV(retype(next, BRW_REGISTER_TYPE_UD), lod));
-  next.reg_offset++;
+  emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_UD), lod));
+  length++;
   break;
case ir_query_levels:
-  emit(MOV(retype(next, BRW_REGISTER_TYPE_UD), fs_reg(0u)));
-  next.reg_offset++;
+  emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_UD), fs_reg(0u)));
+  length++;
   break;
case ir_txf:
   /* Unfortunately, the parameters for LD are intermixed: u, lod, v, r. */
-  emit(MOV(retype(next, BRW_REGISTER_TYPE_D), coordinate));
+  emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_D), coordinate));
   coordinate.reg_offset++;
-  next.reg_offset++;
+  length++;
 
-  emit(MOV(retype(next, BRW_REGISTER_TYPE_D), lod));
-  next.reg_offset++;
+  emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_D), lod));
+  length++;
 
   for (int i = 1; i < ir->coordinate->type->vector_elements; i++) {
-emit(MOV(retype(next, BRW_REGISTER_TYPE_D), coordinate));
+emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_D), coordinate));
 coordinate.reg_offset++;
-next.reg_offset++;
+length++;
   }
 
   coordinate_done = true;
   break;
case ir_txf_ms:
-  emit(MOV(retype(next, BRW_REGISTER_TYPE_UD), sample_index));
-  next.reg_offset++;
+  emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_UD), sample_index));
+  length++;
 
   /* data from the multisample control surface */
-  emit(MOV(retype(next, BRW_REGISTER_TYPE_UD), mcs));
-  next.reg_offset++;
+  emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_UD), mcs));
+  length++;
 
   /* there is no offsetting for this message; just copy in the integer
* texture coordina

[Mesa-dev] [PATCH 06/19] i965/fs: Clean up fs_inst constructors.

2014-05-27 Thread Matt Turner
In a fashion suggested by Ken.
---
Allocating fewer sources than 3 is not handled in this series.

 src/mesa/drivers/dri/i965/brw_fs.cpp | 90 ++--
 src/mesa/drivers/dri/i965/brw_fs.h   | 17 ---
 2 files changed, 32 insertions(+), 75 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 8b13683..f926d97 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -52,95 +52,53 @@ extern "C" {
 #include "glsl/glsl_types.h"
 
 void
-fs_inst::init(int sources)
+fs_inst::init(enum opcode opcode, const fs_reg &dst, fs_reg *src, int sources)
 {
memset(this, 0, sizeof(*this));
 
+   this->opcode = opcode;
+   this->dst = dst;
+   this->src = src;
this->sources = sources;
-   this->src = ralloc_array(this, fs_reg, sources);
 
this->conditional_mod = BRW_CONDITIONAL_NONE;
 
-   this->dst = reg_undef;
-   this->src[0] = reg_undef;
-   this->src[1] = reg_undef;
-   this->src[2] = reg_undef;
-
/* This will be the case for almost all instructions. */
this->regs_written = 1;
 
this->writes_accumulator = false;
 }
 
-fs_inst::fs_inst()
+fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst)
 {
-   init(3);
-   this->opcode = BRW_OPCODE_NOP;
+   fs_reg *src = ralloc_array(this, fs_reg, 3);
+   init(opcode, dst, src, 0);
 }
 
-fs_inst::fs_inst(enum opcode opcode)
+fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0)
 {
-   init(3);
-   this->opcode = opcode;
+   fs_reg *src = ralloc_array(this, fs_reg, 3);
+   src[0] = src0;
+   init(opcode, dst, src, 1);
 }
 
-fs_inst::fs_inst(enum opcode opcode, fs_reg dst)
+fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0,
+ const fs_reg &src1)
 {
-   init(3);
-   this->opcode = opcode;
-   this->dst = dst;
-
-   if (dst.file == GRF)
-  assert(dst.reg_offset >= 0);
+   fs_reg *src = ralloc_array(this, fs_reg, 3);
+   src[0] = src0;
+   src[1] = src1;
+   init(opcode, dst, src, 2);
 }
 
-fs_inst::fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0)
+fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0,
+ const fs_reg &src1, const fs_reg &src2)
 {
-   init(3);
-   this->opcode = opcode;
-   this->dst = dst;
-   this->src[0] = src0;
-
-   if (dst.file == GRF)
-  assert(dst.reg_offset >= 0);
-   if (src[0].file == GRF)
-  assert(src[0].reg_offset >= 0);
-}
-
-fs_inst::fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0, fs_reg src1)
-{
-   init(3);
-   this->opcode = opcode;
-   this->dst = dst;
-   this->src[0] = src0;
-   this->src[1] = src1;
-
-   if (dst.file == GRF)
-  assert(dst.reg_offset >= 0);
-   if (src[0].file == GRF)
-  assert(src[0].reg_offset >= 0);
-   if (src[1].file == GRF)
-  assert(src[1].reg_offset >= 0);
-}
-
-fs_inst::fs_inst(enum opcode opcode, fs_reg dst,
-fs_reg src0, fs_reg src1, fs_reg src2)
-{
-   init(3);
-   this->opcode = opcode;
-   this->dst = dst;
-   this->src[0] = src0;
-   this->src[1] = src1;
-   this->src[2] = src2;
-
-   if (dst.file == GRF)
-  assert(dst.reg_offset >= 0);
-   if (src[0].file == GRF)
-  assert(src[0].reg_offset >= 0);
-   if (src[1].file == GRF)
-  assert(src[1].reg_offset >= 0);
-   if (src[2].file == GRF)
-  assert(src[2].reg_offset >= 0);
+   fs_reg *src = ralloc_array(this, fs_reg, 3);
+   src[0] = src0;
+   src[1] = src1;
+   src[2] = src2;
+   init(opcode, dst, src, 3);
 }
 
 fs_inst::fs_inst(const fs_inst &that)
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 4f8a2b2..fb68923 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -190,15 +190,14 @@ class fs_inst : public backend_instruction {
 public:
DECLARE_RALLOC_CXX_OPERATORS(fs_inst)
 
-   void init(int sources);
-
-   fs_inst();
-   fs_inst(enum opcode opcode);
-   fs_inst(enum opcode opcode, fs_reg dst);
-   fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0);
-   fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0, fs_reg src1);
-   fs_inst(enum opcode opcode, fs_reg dst,
-   fs_reg src0, fs_reg src1,fs_reg src2);
+   void init(enum opcode opcode, const fs_reg &dst, fs_reg *src, int sources);
+
+   fs_inst(enum opcode opcode = BRW_OPCODE_NOP, const fs_reg &dst = reg_undef);
+   fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0);
+   fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0,
+   const fs_reg &src1);
+   fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0,
+   const fs_reg &src1, const fs_reg &src2);
fs_inst(const fs_inst &that);
 
bool equals(fs_inst *inst) const;
-- 
1.8.3.2



[Mesa-dev] [PATCH 08/19] i965/fs: Add fs_inst constructor that takes a list of sources.

2014-05-27 Thread Matt Turner
Also add an emit() function that calls it.
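
The shape of the new constructor and its emit() wrapper can be sketched
as below. This is a toy model under stated assumptions, not the driver
code: Reg, Inst and Visitor are hypothetical stand-ins for fs_reg,
fs_inst and fs_visitor, and a std::list replaces the ralloc-backed
instruction list.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical stand-in for fs_reg.
struct Reg {
   int reg;
};

// Hypothetical stand-in for fs_inst with a variable number of sources.
struct Inst {
   int opcode;
   Reg dst;
   Reg *src;      // borrowed from the caller, not copied
   int sources;

   // Takes the source array as given, so the caller decides how many
   // sources the instruction has and must keep the array alive.
   Inst(int opcode, const Reg &dst, Reg src[], int sources)
      : opcode(opcode), dst(dst), src(src), sources(sources)
   {
      assert(sources >= 0);
   }
};

// Hypothetical stand-in for fs_visitor: owns the instruction stream.
class Visitor {
public:
   // Build an instruction from a source list and append it to the stream.
   Inst *emit(int opcode, const Reg &dst, Reg src[], int sources)
   {
      instructions.push_back(Inst(opcode, dst, src, sources));
      return &instructions.back();
   }

   std::size_t count() const { return instructions.size(); }

private:
   std::list<Inst> instructions;   // list keeps returned pointers stable
};
```

A caller that has assembled, say, four payload registers in an array
would pass the array and the count 4, rather than being limited to the
fixed one-, two- and three-source overloads.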
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 12 
 src/mesa/drivers/dri/i965/brw_fs.h   |  3 +++
 2 files changed, 15 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 1f174d3..c86cb42 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -101,6 +101,11 @@ fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0,
init(opcode, dst, src, 3);
 }
 
+fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst, fs_reg src[], int sources)
+{
+   init(opcode, dst, src, sources);
+}
+
 fs_inst::fs_inst(const fs_inst &that)
 {
memcpy(this, &that, sizeof(that));
@@ -740,6 +745,13 @@ fs_visitor::emit(enum opcode opcode, fs_reg dst,
return emit(new(mem_ctx) fs_inst(opcode, dst, src0, src1, src2));
 }
 
+fs_inst *
+fs_visitor::emit(enum opcode opcode, fs_reg dst,
+ fs_reg src[], int sources)
+{
+   return emit(new(mem_ctx) fs_inst(opcode, dst, src, sources));
+}
+
 void
 fs_visitor::push_force_uncompressed()
 {
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index b7cfb3c..527c3f3 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -198,6 +198,7 @@ public:
const fs_reg &src1);
fs_inst(enum opcode opcode, const fs_reg &dst, const fs_reg &src0,
const fs_reg &src1, const fs_reg &src2);
+   fs_inst(enum opcode opcode, const fs_reg &dst, fs_reg src[], int sources);
fs_inst(const fs_inst &that);
 
void resize_sources(uint8_t num_sources);
@@ -295,6 +296,8 @@ public:
fs_inst *emit(enum opcode opcode, fs_reg dst, fs_reg src0, fs_reg src1);
fs_inst *emit(enum opcode opcode, fs_reg dst,
  fs_reg src0, fs_reg src1, fs_reg src2);
+   fs_inst *emit(enum opcode opcode, fs_reg dst,
+ fs_reg src[], int sources);
 
fs_inst *MOV(fs_reg dst, fs_reg src);
fs_inst *NOT(fs_reg dst, fs_reg src);
-- 
1.8.3.2



[Mesa-dev] [PATCH 04/19] i965/fs: Store the number of sources an fs_inst has.

2014-05-27 Thread Matt Turner
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 21 +++--
 src/mesa/drivers/dri/i965/brw_fs.h   |  3 ++-
 2 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index b06966a..a9a8ac1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -52,11 +52,12 @@ extern "C" {
 #include "glsl/glsl_types.h"
 
 void
-fs_inst::init()
+fs_inst::init(int sources)
 {
memset(this, 0, sizeof(*this));
 
-   this->src = ralloc_array(this, fs_reg, 3);
+   this->sources = sources;
+   this->src = ralloc_array(this, fs_reg, sources);
 
this->conditional_mod = BRW_CONDITIONAL_NONE;
 
@@ -73,19 +74,19 @@ fs_inst::init()
 
 fs_inst::fs_inst()
 {
-   init();
+   init(3);
this->opcode = BRW_OPCODE_NOP;
 }
 
 fs_inst::fs_inst(enum opcode opcode)
 {
-   init();
+   init(3);
this->opcode = opcode;
 }
 
 fs_inst::fs_inst(enum opcode opcode, fs_reg dst)
 {
-   init();
+   init(3);
this->opcode = opcode;
this->dst = dst;
 
@@ -95,7 +96,7 @@ fs_inst::fs_inst(enum opcode opcode, fs_reg dst)
 
 fs_inst::fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0)
 {
-   init();
+   init(3);
this->opcode = opcode;
this->dst = dst;
this->src[0] = src0;
@@ -108,7 +109,7 @@ fs_inst::fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0)
 
 fs_inst::fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0, fs_reg src1)
 {
-   init();
+   init(3);
this->opcode = opcode;
this->dst = dst;
this->src[0] = src0;
@@ -125,7 +126,7 @@ fs_inst::fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0, fs_reg src1)
 fs_inst::fs_inst(enum opcode opcode, fs_reg dst,
 fs_reg src0, fs_reg src1, fs_reg src2)
 {
-   init();
+   init(3);
this->opcode = opcode;
this->dst = dst;
this->src[0] = src0;
@@ -146,9 +147,9 @@ fs_inst::fs_inst(const fs_inst &that)
 {
memcpy(this, &that, sizeof(that));
 
-   this->src = ralloc_array(this, fs_reg, 3);
+   this->src = ralloc_array(this, fs_reg, that.sources);
 
-   for (int i = 0; i < 3; i++)
+   for (int i = 0; i < that.sources; i++)
   this->src[i] = that.src[i];
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 11a5c7c..4f8a2b2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -190,7 +190,7 @@ class fs_inst : public backend_instruction {
 public:
DECLARE_RALLOC_CXX_OPERATORS(fs_inst)
 
-   void init();
+   void init(int sources);
 
fs_inst();
fs_inst(enum opcode opcode);
@@ -216,6 +216,7 @@ public:
uint32_t texture_offset; /**< Texture offset bitfield */
uint32_t offset; /* spill/unspill offset */
 
+   uint8_t sources; /**< Number of fs_reg sources. */
uint8_t conditional_mod; /**< BRW_CONDITIONAL_* */
 
/* Chooses which flag subregister (f0.0 or f0.1) is used for conditional
-- 
1.8.3.2


[Mesa-dev] [PATCH 10/19] i965/fs: Lower LOAD_PAYLOAD and clean up.

2014-05-27 Thread Matt Turner
Clean up with register_coalesce()/dead_code_eliminate().
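
As a rough illustration of what the lowering does, here is a minimal standalone sketch. `Reg` and `Mov` are hypothetical stand-ins for fs_reg and a MOV instruction, not the real i965 types; the only behaviour taken from the patch is that each source becomes one MOV into consecutive destination offsets, and that slot 0 is skipped when a header is present.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for fs_reg and a MOV instruction (assumptions).
struct Reg { int reg; int reg_offset; };
struct Mov { Reg dst; Reg src; };

// Expand one LOAD_PAYLOAD into per-register MOVs. When a header is present,
// slot 0 is left for the generator to fill in, so no MOV is emitted for it.
std::vector<Mov> lower_load_payload(Reg dst, const std::vector<Reg> &src,
                                    bool header_present)
{
    std::vector<Mov> movs;
    for (std::size_t i = 0; i < src.size(); i++) {
        if (!(i == 0 && header_present))
            movs.push_back(Mov{dst, src[i]});
        dst.reg_offset++;   // next payload register
    }
    return movs;
}
```

The resulting MOVs are what register_coalesce() and dead_code_eliminate() then get a chance to remove.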
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 42 
 src/mesa/drivers/dri/i965/brw_fs.h   |  1 +
 2 files changed, 43 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 0856b6b..c0af6d0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2574,6 +2574,43 @@ fs_visitor::lower_uniform_pull_constant_loads()
}
 }
 
+bool
+fs_visitor::lower_load_payload()
+{
+   bool progress = false;
+
+   foreach_list_safe(node, &instructions) {
+  fs_inst *inst = (fs_inst *)node;
+
+  if (inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD) {
+ fs_reg dst = inst->dst;
+
+ /* The generator creates the message header if present, which is in
+  * the first register of the message payload.
+  */
+ if (!inst->header_present) {
+inst->insert_before(MOV(dst, inst->src[0]));
+ } else {
+assert(inst->src[0].file == BAD_FILE);
+ }
+ dst.reg_offset++;
+
+ for (int i = 1; i < inst->sources; i++) {
+inst->insert_before(MOV(dst, inst->src[i]));
+dst.reg_offset++;
+ }
+
+ inst->remove();
+ progress = true;
+  }
+   }
+
+   if (progress)
+  invalidate_live_intervals();
+
+   return progress;
+}
+
 void
 fs_visitor::dump_instructions()
 {
@@ -3071,6 +3108,11 @@ fs_visitor::run()
  progress = OPT(compute_to_mrf) || progress;
   } while (progress);
 
+  if (lower_load_payload()) {
+ register_coalesce();
+ dead_code_eliminate();
+  }
+
   lower_uniform_pull_constant_loads();
 
   assign_curb_setup();
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index d0e459c..2b60945 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -391,6 +391,7 @@ public:
void fail(const char *msg, ...);
void no16(const char *msg, ...);
void lower_uniform_pull_constant_loads();
+   bool lower_load_payload();
 
void push_force_uncompressed();
void pop_force_uncompressed();
-- 
1.8.3.2


[Mesa-dev] [PATCH 19/19] i965/fs: Optimize SEL with the same sources into a MOV.

2014-05-27 Thread Matt Turner
instructions in affected programs: 474 -> 462 (-2.53%)
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index c0af6d0..453683c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2021,7 +2021,12 @@ fs_visitor::opt_algebraic()
  }
  break;
   case BRW_OPCODE_SEL:
- if (inst->saturate && inst->src[1].file == IMM) {
+ if (inst->src[0].equals(inst->src[1])) {
+inst->opcode = BRW_OPCODE_MOV;
+inst->src[1] = reg_undef;
+inst->predicate = BRW_PREDICATE_NONE;
+progress = true;
+ } else if (inst->saturate && inst->src[1].file == IMM) {
 switch (inst->conditional_mod) {
 case BRW_CONDITIONAL_LE:
 case BRW_CONDITIONAL_L:
-- 
1.8.3.2


[Mesa-dev] [PATCH 09/19] i965/fs: Add SHADER_OPCODE_LOAD_PAYLOAD.

2014-05-27 Thread Matt Turner
Will be used to simplify the handling of large virtual GRFs in SSA form.
---
 src/mesa/drivers/dri/i965/brw_defines.h|  2 ++
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 10 ++
 src/mesa/drivers/dri/i965/brw_fs.h |  2 ++
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  3 +++
 src/mesa/drivers/dri/i965/brw_shader.cpp   |  3 +++
 5 files changed, 20 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index c38e447..34467e9 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -797,6 +797,8 @@ enum opcode {
SHADER_OPCODE_TG4,
SHADER_OPCODE_TG4_OFFSET,
 
+   SHADER_OPCODE_LOAD_PAYLOAD,
+
SHADER_OPCODE_SHADER_TIME_ADD,
 
SHADER_OPCODE_UNTYPED_ATOMIC,
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index c86cb42..0856b6b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -241,6 +241,16 @@ fs_visitor::CMP(fs_reg dst, fs_reg src0, fs_reg src1, uint32_t condition)
return inst;
 }
 
+fs_inst *
+fs_visitor::LOAD_PAYLOAD(const fs_reg &dst, fs_reg *src, int sources)
+{
+   fs_inst *inst = new(mem_ctx) fs_inst(SHADER_OPCODE_LOAD_PAYLOAD, dst, src,
+sources);
+   inst->regs_written = sources;
+
+   return inst;
+}
+
 exec_list
 fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_reg &dst,
const fs_reg &surf_index,
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 527c3f3..d0e459c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -338,6 +338,8 @@ public:
   fs_inst *end,
   const fs_reg &reg);
 
+   fs_inst *LOAD_PAYLOAD(const fs_reg &dst, fs_reg *src, int sources);
+
exec_list VARYING_PULL_CONSTANT_LOAD(const fs_reg &dst,
 const fs_reg &surf_index,
 const fs_reg &varying_offset,
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index a5be0ec..26b963b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1713,6 +1713,9 @@ fs_generator::generate_code(exec_list *instructions,
  generate_discard_jump(inst);
  break;
 
+  case SHADER_OPCODE_LOAD_PAYLOAD:
+ break;
+
   case SHADER_OPCODE_SHADER_TIME_ADD:
  generate_shader_time_add(inst, src[0], src[1], src[2]);
  break;
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 254afef..b35862c 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -452,6 +452,9 @@ brw_instruction_name(enum opcode op)
case SHADER_OPCODE_TG4_OFFSET:
   return "tg4_offset";
 
+   case SHADER_OPCODE_LOAD_PAYLOAD:
+  return "load_payload";
+
case SHADER_OPCODE_GEN4_SCRATCH_READ:
   return "gen4_scratch_read";
case SHADER_OPCODE_GEN4_SCRATCH_WRITE:
-- 
1.8.3.2


[Mesa-dev] [PATCH 16/19] i965/fs: Perform CSE on load_payload instructions if it's not a copy.

2014-05-27 Thread Matt Turner
Since CSE creates instructions, if we let CSE generate things register
coalescing can't remove, bad things will happen. Only let CSE combine
non-copy load_payloads.

E.g., allow CSE to handle this

   load_payload vgrf4+0, vgrf5, vgrf6

but not this

   load_payload vgrf4+0, vgrf5+0, vgrf5+1
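
To make the distinction concrete, here is a minimal standalone sketch of the "is this a plain copy" test. `Reg` is a hypothetical simplification of fs_reg that keeps only the virtual register number and offset, and the empty-source guard is an addition for safety in the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for fs_reg: a virtual GRF number plus an offset into it.
struct Reg {
    int reg;
    int reg_offset;
};

// True when the sources are vgrfN+0, vgrfN+1, ..., i.e. the load_payload
// merely copies one contiguous virtual register.
bool is_copy_payload(const std::vector<Reg> &src)
{
    if (src.empty() || src[0].reg_offset != 0)
        return false;

    for (std::size_t i = 1; i < src.size(); i++) {
        if (src[i].reg != src[0].reg ||
            src[i].reg_offset != static_cast<int>(i))
            return false;
    }
    return true;
}
```

With this predicate, the first example above (sources vgrf5, vgrf6) is not a copy and may be CSE'd, while the second (vgrf5+0, vgrf5+1) is a copy and is left for register coalescing.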
---
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index 5037579..75c6aab 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -43,6 +43,22 @@ struct aeb_entry : public exec_node {
 }
 
 static bool
+is_copy_payload(const fs_inst *inst)
+{
+   const int reg = inst->src[0].reg;
+   if (inst->src[0].reg_offset != 0)
+  return false;
+
+   for (int i = 1; i < inst->sources; i++) {
+  if (inst->src[i].reg != reg ||
+  inst->src[i].reg_offset != i) {
+ return false;
+  }
+   }
+   return true;
+}
+
+static bool
 is_expression(const fs_inst *const inst)
 {
switch (inst->opcode) {
@@ -73,6 +89,8 @@ is_expression(const fs_inst *const inst)
case FS_OPCODE_CINTERP:
case FS_OPCODE_LINTERP:
   return true;
+   case SHADER_OPCODE_LOAD_PAYLOAD:
+  return !is_copy_payload(inst);
default:
   return false;
}
-- 
1.8.3.2


[Mesa-dev] [PATCH 18/19] i965/fs: Perform CSE on texture operations.

2014-05-27 Thread Matt Turner
Helps Unigine Tropics and some (old) gstreamer shaders in shader-db.

instructions in affected programs: 792 -> 744 (-6.06%)
---
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index 75c6aab..6e36b8c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -92,7 +92,7 @@ is_expression(const fs_inst *const inst)
case SHADER_OPCODE_LOAD_PAYLOAD:
   return !is_copy_payload(inst);
default:
-  return false;
+  return inst->is_tex();
}
 }
 
@@ -142,6 +142,16 @@ instructions_match(fs_inst *a, fs_inst *b)
   a->conditional_mod == b->conditional_mod &&
   a->dst.type == b->dst.type &&
   a->sources == b->sources &&
+  (a->is_tex() ? (a->texture_offset == b->texture_offset &&
+  a->mlen == b->mlen &&
+  a->regs_written == b->regs_written &&
+  a->base_mrf == b->base_mrf &&
+  a->sampler == b->sampler &&
+  a->target == b->target &&
+  a->eot == b->eot &&
+  a->header_present == b->header_present &&
+  a->shadow_compare == b->shadow_compare)
+   : true) &&
   operands_match(a, b);
 }
 
-- 
1.8.3.2


[Mesa-dev] [PATCH 12/19] i965/fs: Apply cube map array fixup and restore the payload.

2014-05-27 Thread Matt Turner
So that we don't have partial writes to a large VGRF. Will be cleaned up
by register coalescing.
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 10ec254..b94141a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1726,7 +1726,20 @@ fs_visitor::visit(ir_texture *ir)
   type->sampler_array) {
  fs_reg depth = dst;
  depth.reg_offset = 2;
- emit_math(SHADER_OPCODE_INT_QUOTIENT, depth, depth, fs_reg(6));
+ fs_reg fixed_depth = fs_reg(this, glsl_type::int_type);
+ emit_math(SHADER_OPCODE_INT_QUOTIENT, fixed_depth, depth, fs_reg(6));
+
+ fs_reg *fixed_payload = ralloc_array(mem_ctx, fs_reg, inst->regs_written);
+ fs_reg d = dst;
+ for (int i = 0; i < inst->regs_written; i++) {
+if (i == 2) {
+   fixed_payload[i] = fixed_depth;
+} else {
+   d.reg_offset = i;
+   fixed_payload[i] = d;
+}
+ }
+ emit(LOAD_PAYLOAD(dst, fixed_payload, inst->regs_written));
   }
}
 
-- 
1.8.3.2


[Mesa-dev] [PATCH 15/19] i965/fs: Support register coalescing on LOAD_PAYLOAD operands.

2014-05-27 Thread Matt Turner
---
 .../drivers/dri/i965/brw_fs_register_coalesce.cpp  | 59 ++
 1 file changed, 49 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
index a0aa169..0aa4b3e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_register_coalesce.cpp
@@ -46,7 +46,14 @@
 static bool
 is_nop_mov(const fs_inst *inst)
 {
-   if (inst->opcode == BRW_OPCODE_MOV) {
+   if (inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD) {
+  for (int i = 0; i < inst->sources; i++) {
+ if (!inst->dst.equals(inst->src[i])) {
+return false;
+ }
+  }
+  return true;
+   } else if (inst->opcode == BRW_OPCODE_MOV) {
   return inst->dst.equals(inst->src[0]);
}
 
@@ -54,9 +61,26 @@ is_nop_mov(const fs_inst *inst)
 }
 
 static bool
+is_copy_payload(const fs_inst *inst)
+{
+   const int reg = inst->src[0].reg;
+   if (inst->src[0].reg_offset != 0)
+  return false;
+
+   for (int i = 1; i < inst->sources; i++) {
+  if (inst->src[i].reg != reg ||
+  inst->src[i].reg_offset != i) {
+ return false;
+  }
+   }
+   return true;
+}
+
+static bool
 is_coalesce_candidate(const fs_inst *inst, const int *virtual_grf_sizes)
 {
-   if (inst->opcode != BRW_OPCODE_MOV ||
+   if ((inst->opcode != BRW_OPCODE_MOV &&
+inst->opcode != SHADER_OPCODE_LOAD_PAYLOAD) ||
inst->is_partial_write() ||
inst->saturate ||
inst->src[0].file != GRF ||
@@ -72,6 +96,12 @@ is_coalesce_candidate(const fs_inst *inst, const int *virtual_grf_sizes)
virtual_grf_sizes[inst->dst.reg])
   return false;
 
+   if (inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD) {
+  if (!is_copy_payload(inst)) {
+ return false;
+  }
+   }
+
return true;
 }
 
@@ -161,10 +191,18 @@ fs_visitor::register_coalesce()
   if (reg_to != inst->dst.reg)
  continue;
 
-  const int offset = inst->src[0].reg_offset;
-  reg_to_offset[offset] = inst->dst.reg_offset;
-  mov[offset] = inst;
-  channels_remaining--;
+  if (inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD) {
+ for (int i = 0; i < src_size; i++) {
+reg_to_offset[i] = i;
+ }
+ mov[0] = inst;
+ channels_remaining -= inst->sources;
+  } else {
+ const int offset = inst->src[0].reg_offset;
+ reg_to_offset[offset] = inst->dst.reg_offset;
+ mov[offset] = inst;
+ channels_remaining--;
+  }
 
   if (channels_remaining)
  continue;
@@ -186,15 +224,16 @@ fs_visitor::register_coalesce()
  continue;
 
   progress = true;
+  bool was_load_payload = inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD;
 
   for (int i = 0; i < src_size; i++) {
  if (mov[i]) {
 mov[i]->opcode = BRW_OPCODE_NOP;
 mov[i]->conditional_mod = BRW_CONDITIONAL_NONE;
 mov[i]->dst = reg_undef;
-mov[i]->src[0] = reg_undef;
-mov[i]->src[1] = reg_undef;
-mov[i]->src[2] = reg_undef;
+for (int j = 0; j < mov[i]->sources; j++) {
+   mov[i]->src[j] = reg_undef;
+}
  }
   }
 
@@ -202,7 +241,7 @@ fs_visitor::register_coalesce()
  fs_inst *scan_inst = (fs_inst *)node;
 
  for (int i = 0; i < src_size; i++) {
-if (mov[i]) {
+if (mov[i] || was_load_payload) {
if (scan_inst->dst.file == GRF &&
scan_inst->dst.reg == reg_from &&
scan_inst->dst.reg_offset == i) {
-- 
1.8.3.2


[Mesa-dev] [PATCH 03/19] i965/fs: ralloc fs_inst's fs_reg sources.

2014-05-27 Thread Matt Turner
---
Allocating fewer sources than 3 is not handled in this series.

 src/mesa/drivers/dri/i965/brw_fs.cpp | 8 
 src/mesa/drivers/dri/i965/brw_fs.h   | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 5b7d84f..b06966a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -55,6 +55,9 @@ void
 fs_inst::init()
 {
memset(this, 0, sizeof(*this));
+
+   this->src = ralloc_array(this, fs_reg, 3);
+
this->conditional_mod = BRW_CONDITIONAL_NONE;
 
this->dst = reg_undef;
@@ -142,6 +145,11 @@ fs_inst::fs_inst(enum opcode opcode, fs_reg dst,
 fs_inst::fs_inst(const fs_inst &that)
 {
memcpy(this, &that, sizeof(that));
+
+   this->src = ralloc_array(this, fs_reg, 3);
+
+   for (int i = 0; i < 3; i++)
+  this->src[i] = that.src[i];
 }
 
 #define ALU1(op)\
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index e079842..11a5c7c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -211,7 +211,7 @@ public:
bool writes_flag() const;
 
fs_reg dst;
-   fs_reg src[3];
+   fs_reg *src;
 
uint32_t texture_offset; /**< Texture offset bitfield */
uint32_t offset; /* spill/unspill offset */
-- 
1.8.3.2


[Mesa-dev] [PATCH 07/19] i965/fs: Add a function to resize fs_inst's sources array.

2014-05-27 Thread Matt Turner
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 9 +
 src/mesa/drivers/dri/i965/brw_fs.h   | 2 ++
 2 files changed, 11 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index f926d97..1f174d3 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -111,6 +111,15 @@ fs_inst::fs_inst(const fs_inst &that)
   this->src[i] = that.src[i];
 }
 
+void
+fs_inst::resize_sources(uint8_t num_sources)
+{
+   if (this->sources != num_sources) {
+  this->src = reralloc(this, this->src, fs_reg, num_sources);
+  this->sources = num_sources;
+   }
+}
+
 #define ALU1(op)\
fs_inst *\
fs_visitor::op(fs_reg dst, fs_reg src0)  \
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index fb68923..b7cfb3c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -200,6 +200,8 @@ public:
const fs_reg &src1, const fs_reg &src2);
fs_inst(const fs_inst &that);
 
+   void resize_sources(uint8_t num_sources);
+
bool equals(fs_inst *inst) const;
bool overwrites_reg(const fs_reg &reg) const;
bool is_send_from_grf() const;
-- 
1.8.3.2


[Mesa-dev] [PATCH 13/19] i965/fs: Only consider real sources when comparing instructions.

2014-05-27 Thread Matt Turner
---
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index 94f657d..e40567f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -94,10 +94,20 @@ is_expression_commutative(enum opcode op)
 }
 
 static bool
-operands_match(enum opcode op, fs_reg *xs, fs_reg *ys)
+operands_match(fs_inst *a, fs_inst *b)
 {
-   if (!is_expression_commutative(op)) {
-  return xs[0].equals(ys[0]) && xs[1].equals(ys[1]) && xs[2].equals(ys[2]);
+   fs_reg *xs = a->src;
+   fs_reg *ys = b->src;
+
+   if (!is_expression_commutative(a->opcode)) {
+  bool match = true;
+  for (int i = 0; i < a->sources; i++) {
+ if (!xs[i].equals(ys[i])) {
+match = false;
+break;
+ }
+  }
+  return match;
} else {
   return (xs[0].equals(ys[0]) && xs[1].equals(ys[1])) ||
  (xs[1].equals(ys[0]) && xs[0].equals(ys[1]));
@@ -113,7 +123,8 @@ instructions_match(fs_inst *a, fs_inst *b)
   a->predicate_inverse == b->predicate_inverse &&
   a->conditional_mod == b->conditional_mod &&
   a->dst.type == b->dst.type &&
-  operands_match(a->opcode, a->src, b->src);
+  a->sources == b->sources &&
+  operands_match(a, b);
 }
 
 bool
-- 
1.8.3.2


[Mesa-dev] [PATCH 13/21] glsl: Store short variable names inside ir_variable

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

Most of the overhead of the name allocation is the ralloc tracking,
especially on 64-bit.  The allocation of the variable name "i" is 2
bytes for the name and 40 bytes for the ralloc tracking.
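
The idea can be sketched outside of Mesa roughly as follows. This is an illustrative stand-in, not the real ir_variable: the `Variable` type, the 8-byte `inline_name` buffer (playing the role of the padding), and the use of `new[]` instead of ralloc are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, simplified stand-in for ir_variable.
struct Variable {
    const char *name;
    char inline_name[8];   // assumed size; stands in for otherwise unused padding

    explicit Variable(const char *n)
    {
        if (n == nullptr) {
            // No name: point at an empty inline string.
            inline_name[0] = '\0';
            name = inline_name;
        } else if (std::strlen(n) < sizeof(inline_name)) {
            // Short name (including terminator) fits inline: no allocation.
            name = std::strcpy(inline_name, n);
        } else {
            // Long name: fall back to a separate heap allocation.
            const std::size_t len = std::strlen(n) + 1;
            char *heap = new char[len];
            std::memcpy(heap, n, len);
            name = heap;
        }
    }

    Variable(const Variable &) = delete;
    Variable &operator=(const Variable &) = delete;

    ~Variable()
    {
        if (!is_inline())
            delete[] name;
    }

    // Callers that account for or validate the allocation need this check.
    bool is_inline() const { return name == inline_name; }
};
```

Any code that assumes the name is a separate allocation has to test for the inline case first, which is why the memory-usage and validation visitors change in this patch.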

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 225KiB on 64-bit.

Before: IR MEM: variable usage / name / total: 5746368 1439077 7185445
After:  IR MEM: variable usage / name / total: 5746368 1208630 6954998

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 70KiB on 32-bit.

Before: IR MEM: variable usage / name / total: 4327584 915817 5243401
After:  IR MEM: variable usage / name / total: 4327584 844096 5171680

Signed-off-by: Ian Romanick 
---
 src/glsl/ir.cpp  | 11 ++-
 src/glsl/ir_memory_usage.cpp |  3 ++-
 src/glsl/ir_validate.cpp |  2 +-
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index 4907b34..69a0345 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -1536,7 +1536,16 @@ ir_variable::ir_variable(const struct glsl_type *type, const char *name,
: ir_instruction(ir_type_variable), max_ifc_array_access(NULL)
 {
this->type = type;
-   this->name = ralloc_strdup(this, name);
+
+   if (name == NULL) {
+  this->padding[0] = 0;
+  this->name = (char *) this->padding;
+   } else if (strlen(name) < sizeof(this->padding)) {
+  this->name = strcpy((char *) this->padding, name);
+   } else {
+  this->name = ralloc_strdup(this, name);
+   }
+
this->data.explicit_location = false;
this->data.has_initializer = false;
this->data.location = -1;
diff --git a/src/glsl/ir_memory_usage.cpp b/src/glsl/ir_memory_usage.cpp
index 68c0b5c..4918824 100644
--- a/src/glsl/ir_memory_usage.cpp
+++ b/src/glsl/ir_memory_usage.cpp
@@ -63,7 +63,8 @@ ir_memory_usage::visit(ir_variable *ir)
   this->s.variable_usage += (sizeof(ir_state_slot) * ir->num_state_slots)
  + ralloc_header_size;
 
-   this->s.variable_name_usage += strlen(ir->name) + 1 + ralloc_header_size;
+   if (ir->name != (char *) ir->padding)
+  this->s.variable_name_usage += strlen(ir->name) + 1 + ralloc_header_size;
 
return visit_continue;
 }
diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
index 1cfd0d5..08dd250 100644
--- a/src/glsl/ir_validate.cpp
+++ b/src/glsl/ir_validate.cpp
@@ -634,7 +634,7 @@ ir_validate::visit(ir_variable *ir)
 * in the ir_dereference_variable handler to ensure that a variable is
 * declared before it is dereferenced.
 */
-   if (ir->name)
+   if (ir->name && ir->name != (char *) ir->padding)
   assert(ralloc_parent(ir->name) == ir);
 
hash_table_insert(ht, ir, ir);
-- 
1.8.1.4


[Mesa-dev] [PATCH 11/21] glsl: Store ir_variable::ir_type in 8 bits instead of 32

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 64-bit.

No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 32-bit.

Signed-off-by: Ian Romanick 
---
 src/glsl/ir.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 7faee74..bc02f6e 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -92,12 +92,13 @@ enum ir_node_type {
  */
 class ir_instruction : public exec_node {
 private:
-   enum ir_node_type ir_type;
+   uint8_t ir_type;
 
 public:
inline enum ir_node_type get_ir_type() const
{
-  return this->ir_type;
+  STATIC_ASSERT(ir_type_max < 256);
+  return (enum ir_node_type) this->ir_type;
}
 
/**
-- 
1.8.1.4


[Mesa-dev] [PATCH 18/21] glsl: Squish ir_variable::max_ifc_array_access and ::state_slots together

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

At least one of these pointers must be NULL, and we can determine which
will be NULL by looking at other fields.  Use this information to store
both pointers in the same location.
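
A minimal standalone sketch of the pattern, assuming a simple boolean discriminant in place of is_interface_instance(). `Variable`, `StateSlot`, and the setter names are hypothetical and only illustrate how accessors can guard a union of two mutually exclusive pointers.

```cpp
#include <cassert>

// Hypothetical stand-in for ir_state_slot.
struct StateSlot { int tokens[5]; };

class Variable {
public:
    explicit Variable(bool is_interface_instance)
        : interface_instance(is_interface_instance)
    {
        // Initialise whichever member is the active one for this variable.
        if (interface_instance)
            u.max_ifc_array_access = nullptr;
        else
            u.state_slots = nullptr;
    }

    void set_max_ifc_array_access(unsigned *p)
    {
        assert(interface_instance);
        u.max_ifc_array_access = p;
    }

    void set_state_slots(StateSlot *s)
    {
        assert(!interface_instance);
        u.state_slots = s;
    }

    // Each getter returns NULL when the other member is the live one.
    unsigned *get_max_ifc_array_access()
    {
        return interface_instance ? u.max_ifc_array_access : nullptr;
    }

    StateSlot *get_state_slots()
    {
        return interface_instance ? nullptr : u.state_slots;
    }

private:
    bool interface_instance;

    union Storage {
        unsigned *max_ifc_array_access;
        StateSlot *state_slots;
    };
    Storage u;

    // Both pointers occupy a single pointer-sized slot.
    static_assert(sizeof(Storage) == sizeof(unsigned *),
                  "union should be one pointer wide");
};
```

The saving is one pointer per variable, at the cost of routing every access through an accessor that knows which member is valid.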

If anyone can think of a better name for the union than "u", I'm all
ears.

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 204KiB on 64-bit.

Before: IR MEM: variable usage / name / total: 5746368 1121441 6867809
After:  IR MEM: variable usage / name / total: 5537064 1121441 6658505

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 102KiB on 32-bit.

Before: IR MEM: variable usage / name / total: 4327584 787727 5115311
After:  IR MEM: variable usage / name / total: 4222932 787727 5010659

Signed-off-by: Ian Romanick 
---
 src/glsl/ir.cpp   |  4 ++-
 src/glsl/ir.h | 78 ++-
 src/glsl/ir_clone.cpp |  4 +--
 3 files changed, 51 insertions(+), 35 deletions(-)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index 69a0345..50660ac 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -1533,7 +1533,7 @@ ir_swizzle::variable_referenced() const
 
 ir_variable::ir_variable(const struct glsl_type *type, const char *name,
 ir_variable_mode mode)
-   : ir_instruction(ir_type_variable), max_ifc_array_access(NULL)
+   : ir_instruction(ir_type_variable)
 {
this->type = type;
 
@@ -1546,6 +1546,8 @@ ir_variable::ir_variable(const struct glsl_type *type, const char *name,
   this->name = ralloc_strdup(this, name);
}
 
+   this->u.max_ifc_array_access = NULL;
+
this->data.explicit_location = false;
this->data.has_initializer = false;
this->data.location = -1;
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index ab9f27b..95182fb 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -476,7 +476,7 @@ public:
   assert(this->interface_type == NULL);
   this->interface_type = type;
   if (this->is_interface_instance()) {
- this->max_ifc_array_access =
+ this->u.max_ifc_array_access =
 rzalloc_array(this, unsigned, type->length);
   }
}
@@ -488,7 +488,7 @@ public:
 */
void change_interface_type(const struct glsl_type *type)
{
-  if (this->max_ifc_array_access != NULL) {
+  if (this->u.max_ifc_array_access != NULL) {
  /* max_ifc_array_access has already been allocated, so make sure the
   * new interface has the same number of fields as the old one.
   */
@@ -505,7 +505,7 @@ public:
 */
void reinit_interface_type(const struct glsl_type *type)
{
-  if (this->max_ifc_array_access != NULL) {
+  if (this->u.max_ifc_array_access != NULL) {
 #ifndef NDEBUG
  /* Redeclaring gl_PerVertex is only allowed if none of the built-ins
   * it defines have been accessed yet; so it's safe to throw away the
@@ -513,10 +513,10 @@ public:
   * zero.
   */
  for (unsigned i = 0; i < this->interface_type->length; i++)
-assert(this->max_ifc_array_access[i] == 0);
+assert(this->u.max_ifc_array_access[i] == 0);
 #endif
- ralloc_free(this->max_ifc_array_access);
- this->max_ifc_array_access = NULL;
+ ralloc_free(this->u.max_ifc_array_access);
+ this->u.max_ifc_array_access = NULL;
   }
   this->interface_type = NULL;
   init_interface_type(type);
@@ -535,33 +535,45 @@ public:
 */
inline unsigned *get_max_ifc_array_access()
{
-  return this->max_ifc_array_access;
+  assert(this->data._num_state_slots == 0);
+  return this->u.max_ifc_array_access;
}
 
inline unsigned get_num_state_slots() const
{
+  assert(!this->is_interface_instance()
+ || this->data._num_state_slots == 0);
   return this->data._num_state_slots;
}
 
inline void set_num_state_slots(unsigned n)
{
+  assert(!this->is_interface_instance()
+ || n == 0);
   this->data._num_state_slots = n;
}
 
inline ir_state_slot *get_state_slots()
{
-  return this->state_slots;
+  return this->is_interface_instance() ? NULL : this->u.state_slots;
+   }
+
+   inline const ir_state_slot *get_state_slots() const
+   {
+  return this->is_interface_instance() ? NULL : this->u.state_slots;
}
 
inline ir_state_slot *allocate_state_slots(unsigned n)
{
-  this->state_slots = ralloc_array(this, ir_state_slot, n);
+  assert(!this->is_interface_instance());
+
+  this->u.state_slots = ralloc_array(this, ir_state_slot, n);
   this->data._num_state_slots = 0;
 
-  if (this->state_slots != NULL)
+  if (this->u.state_slots != NULL)
  this->data._num_state_slots = n;
 
-  return this->state_slots;
+  return this->u.state_slots;
}
 
/**
@@ -818,28 +830,30 @@ public:
 private:
static const char *const warn_extension_table[];
 
-   /**
-* For variables which satisfy the is_interface_inst

[Mesa-dev] [PATCH 07/21] glsl: Replace ir_variable::warn_extension pointer with an 8-bit index

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

Also move the new warn_extension_index into ir_variable::data.  This
enables slightly better packing.
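
The pointer-to-index replacement can be sketched in isolation like this. The table contents mirror the patch; the free functions `extension_to_index` and `index_to_extension` are hypothetical names used only for illustration, and the bounds check in the lookup is an addition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Index 0 is reserved to mean "no warning".
static const char *const warn_extension_table[] = {
    "",
    "GL_ARB_shader_stencil_export",
    "GL_AMD_shader_stencil_export",
};

static const std::size_t warn_extension_count =
    sizeof(warn_extension_table) / sizeof(warn_extension_table[0]);

// Map an extension name to its 8-bit table index; unknown names map to 0.
std::uint8_t extension_to_index(const char *extension)
{
    for (std::size_t i = 0; i < warn_extension_count; i++) {
        if (std::strcmp(warn_extension_table[i], extension) == 0)
            return static_cast<std::uint8_t>(i);
    }
    return 0;
}

// Map an index back to the extension name, or NULL when no warning is set.
const char *index_to_extension(std::uint8_t index)
{
    if (index == 0 || index >= warn_extension_count)
        return nullptr;
    return warn_extension_table[index];
}
```

Storing one byte instead of a pointer only works because the set of extensions that can trigger the warning is small and known at compile time.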

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 204KiB on 64-bit.

Before: IR MEM: variable usage / name / total: 5955672 1439077 7394749
After:  IR MEM: variable usage / name / total: 5746368 1439077 7185445

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 102KiB on 32-bit.

Before: IR MEM: variable usage / name / total: 4536888 915817 5452705
After:  IR MEM: variable usage / name / total: 4432236 915817 5348053

Signed-off-by: Ian Romanick 
---
 src/glsl/ir.cpp   | 21 ++---
 src/glsl/ir.h | 18 +-
 src/glsl/ir_clone.cpp |  2 --
 3 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index 3d6af56..65a755e 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -1547,7 +1547,7 @@ ir_variable::ir_variable(const struct glsl_type *type, const char *name,
this->data.location = -1;
this->data.location_frac = 0;
this->data.binding = 0;
-   this->warn_extension = NULL;
+   this->data.warn_extension_index = 0;
this->constant_value = NULL;
this->constant_initializer = NULL;
this->data.origin_upper_left = false;
@@ -1610,16 +1610,31 @@ ir_variable::determine_interpolation_mode(bool flat_shade)
   return INTERP_QUALIFIER_SMOOTH;
 }
 
+const char *const ir_variable::warn_extension_table[] = {
+   "",
+   "GL_ARB_shader_stencil_export",
+   "GL_AMD_shader_stencil_export",
+};
+
 void
 ir_variable::enable_extension_warning(const char *extension)
 {
-   this->warn_extension = extension;
+   for (unsigned i = 0; i < Elements(warn_extension_table); i++) {
+  if (strcmp(warn_extension_table[i], extension) == 0) {
+ this->data.warn_extension_index = i;
+ return;
+  }
+   }
+
+   assert(!"Should not get here.");
+   this->data.warn_extension_index = 0;
 }
 
 const char *
 ir_variable::get_extension_warning() const
 {
-   return this->warn_extension;
+   return this->data.warn_extension_index == 0
+  ? NULL : warn_extension_table[this->data.warn_extension_index];
 }
 
 ir_function_signature::ir_function_signature(const glsl_type *return_type,
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 3298a50..4147bbc 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -685,6 +685,13 @@ public:
   uint16_t image_format;
 
   /**
+   * Emit a warning if this variable is accessed.
+   */
+   private:
+  uint8_t warn_extension_index;
+
+   public:
+  /**
* \brief Layout qualifier for gl_FragDepth.
*
* This is not equal to \c ir_depth_layout_none if and only if this
@@ -733,6 +740,10 @@ public:
*/
   unsigned max_array_access;
 
+  /**
+   * Allow (only) ir_variable direct access to private members.
+   */
+  friend class ir_variable;
} data;
 
/**
@@ -767,6 +778,8 @@ public:
ir_constant *constant_initializer;
 
 private:
+   static const char *const warn_extension_table[];
+
/**
 * For variables that are in an interface block or are an instance of an
 * interface block, this is the \c GLSL_TYPE_INTERFACE type for that block.
@@ -774,11 +787,6 @@ private:
 * \sa ir_variable::location
 */
const glsl_type *interface_type;
-
-   /**
-* Emit a warning if this variable is accessed.
-*/
-   const char *warn_extension;
 };
 
 /**
diff --git a/src/glsl/ir_clone.cpp b/src/glsl/ir_clone.cpp
index c00adc5..d594529 100644
--- a/src/glsl/ir_clone.cpp
+++ b/src/glsl/ir_clone.cpp
@@ -53,8 +53,6 @@ ir_variable::clone(void *mem_ctx, struct hash_table *ht) const
 
memcpy(&var->data, &this->data, sizeof(var->data));
 
-   var->warn_extension = this->warn_extension;
-
var->num_state_slots = this->num_state_slots;
if (this->state_slots) {
   /* FINISHME: This really wants to use something like talloc_reference, but
-- 
1.8.1.4
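
Archive note: the table-lookup idea in this patch can be sketched outside Mesa roughly as follows. This is a minimal standalone illustration, not the Mesa code. `variable_sketch` and `warn_table` are hypothetical names; only the two extension strings and the "index 0 means no warning" convention come from the patch. Unlike the patch, the sketch does not assert on an unknown string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for ir_variable::warn_extension_table.
// Entry 0 is reserved to mean "no warning".
static const char *const warn_table[] = {
   "",
   "GL_ARB_shader_stencil_export",
   "GL_AMD_shader_stencil_export",
};

struct variable_sketch {
   // One byte of storage instead of a 4- or 8-byte pointer.
   std::uint8_t warn_index = 0;

   void enable_extension_warning(const char *extension)
   {
      const std::size_t count = sizeof(warn_table) / sizeof(warn_table[0]);
      for (std::size_t i = 0; i < count; i++) {
         if (std::strcmp(warn_table[i], extension) == 0) {
            warn_index = static_cast<std::uint8_t>(i);
            return;
         }
      }
      // Strings missing from the table fall back to "no warning" here.
      warn_index = 0;
   }

   const char *get_extension_warning() const
   {
      return warn_index == 0 ? nullptr : warn_table[warn_index];
   }
};
```

The trade-off is that every string that can be warned about has to be listed in the table up front, which seems acceptable when only a couple of extensions use the mechanism.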



[Mesa-dev] [PATCH 08/21] glsl: Store ir_variable::depth_layout using 3 bits

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

warn_extension_index was moved to improve packing.

No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 64-bit.

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 102KiB on 32-bit.

Before: IR MEM: variable usage / name / total: 4432236 915817 5348053
After:  IR MEM: variable usage / name / total: 4327584 915817 5243401

Signed-off-by: Ian Romanick 
---
 src/glsl/ast_to_hir.cpp |  4 ++--
 src/glsl/ir.h   | 19 +--
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 0621ea7..ef1607d 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2824,8 +2824,8 @@ get_variable_being_redeclared(ir_variable *var, YYLTYPE loc,
  "gl_FragDepth: depth layout is declared here "
  "as '%s, but it was previously declared as "
  "'%s'",
- depth_layout_string(var->data.depth_layout),
- depth_layout_string(earlier->data.depth_layout));
+ depth_layout_string(ir_depth_layout(var->data.depth_layout)),
+ depth_layout_string(ir_depth_layout(earlier->data.depth_layout)));
   }
 
   earlier->data.depth_layout = var->data.depth_layout;
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 4147bbc..8515124 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -671,6 +671,13 @@ public:
*/
   unsigned index:1;
 
+  /**
+   * \brief Layout qualifier for gl_FragDepth.
+   *
+   * This is not equal to \c ir_depth_layout_none if and only if this
+   * variable is \c gl_FragDepth and a layout qualifier is specified.
+   */
+  unsigned depth_layout:3;
 
   /**
* ARB_shader_image_load_store qualifiers.
@@ -681,9 +688,6 @@ public:
   unsigned image_volatile:1;
   unsigned image_restrict:1;
 
-  /** Image internal format if specified explicitly, otherwise GL_NONE. */
-  uint16_t image_format;
-
   /**
* Emit a warning if this variable is accessed.
*/
@@ -691,13 +695,8 @@ public:
   uint8_t warn_extension_index;
 
public:
-  /**
-   * \brief Layout qualifier for gl_FragDepth.
-   *
-   * This is not equal to \c ir_depth_layout_none if and only if this
-   * variable is \c gl_FragDepth and a layout qualifier is specified.
-   */
-  ir_depth_layout depth_layout;
+  /** Image internal format if specified explicitly, otherwise GL_NONE. */
+  uint16_t image_format;
 
   /**
* Storage location of the base of this variable
-- 
1.8.1.4
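
Archive note: a minimal sketch of why storing the enum in a 3-bit field helps, and why the patch has to wrap reads in `ir_depth_layout(...)`. The type and enumerator names below are hypothetical stand-ins, and the five-value enum is an assumption made for illustration. Exact struct sizes depend on the ABI.

```cpp
#include <cassert>

// Hypothetical stand-in for ir_depth_layout (assumed to have five values).
enum depth_layout_sketch {
   layout_none,
   layout_any,
   layout_greater,
   layout_less,
   layout_unchanged
};

// Before: the enum is a full member and cannot share the bit-field word.
struct unpacked_sketch {
   unsigned index:1;
   depth_layout_sketch depth_layout;
};

// After: the value lives in 3 bits next to the other flags.
struct packed_sketch {
   unsigned index:1;
   unsigned depth_layout:3;
};

// Five enumerators need values 0..4, which fit in 3 bits (0..7).
static_assert(layout_unchanged < (1 << 3), "enum must fit in 3 bits");

// The field is now a plain unsigned, so reading it back as the enum
// needs an explicit conversion.
inline depth_layout_sketch get_layout(const packed_sketch &p)
{
   return depth_layout_sketch(p.depth_layout);
}
```

On common ABIs the packed struct is one 4-byte word while the unpacked one is two, which is the kind of per-variable saving the commit message measures.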



[Mesa-dev] [PATCH 05/21] glsl: Use bit-flags image attributes and uint16_t for the image format

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

All of the GL image enums fit in 16-bits.

Also move the fields from the anonymous "image" structure to the next
higher structure.  This will enable packing the bits with the other
bitfield.

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 204KiB on 64-bit.

Before: IR MEM: variable usage / name / total: 6164976 1439077 7604053
After:  IR MEM: variable usage / name / total: 5955672 1439077 7394749

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 204KiB on 32-bit.

Before: IR MEM: variable usage / name / total: 4746192 915817 5662009
After:  IR MEM: variable usage / name / total: 4536888 915817 5452705

Signed-off-by: Ian Romanick 
---
 src/glsl/ast_function.cpp  | 10 +-
 src/glsl/ast_to_hir.cpp| 14 +++---
 src/glsl/builtin_functions.cpp | 10 +-
 src/glsl/ir.cpp| 20 ++--
 src/glsl/ir.h  | 27 +--
 src/glsl/link_uniforms.cpp |  4 ++--
 6 files changed, 42 insertions(+), 43 deletions(-)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index 4b84470..c70b519 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -106,35 +106,35 @@ verify_image_parameter(YYLTYPE *loc, _mesa_glsl_parse_state *state,
 *  qualifiers. [...] It is legal to have additional qualifiers
 *  on a formal parameter, but not to have fewer."
 */
-   if (actual->data.image.coherent && !formal->data.image.coherent) {
+   if (actual->data.image_coherent && !formal->data.image_coherent) {
   _mesa_glsl_error(loc, state,
"function call parameter `%s' drops "
"`coherent' qualifier", formal->name);
   return false;
}
 
-   if (actual->data.image._volatile && !formal->data.image._volatile) {
+   if (actual->data.image_volatile && !formal->data.image_volatile) {
   _mesa_glsl_error(loc, state,
"function call parameter `%s' drops "
"`volatile' qualifier", formal->name);
   return false;
}
 
-   if (actual->data.image.restrict_flag && !formal->data.image.restrict_flag) {
+   if (actual->data.image_restrict && !formal->data.image_restrict) {
   _mesa_glsl_error(loc, state,
"function call parameter `%s' drops "
"`restrict' qualifier", formal->name);
   return false;
}
 
-   if (actual->data.image.read_only && !formal->data.image.read_only) {
+   if (actual->data.image_read_only && !formal->data.image_read_only) {
   _mesa_glsl_error(loc, state,
"function call parameter `%s' drops "
"`readonly' qualifier", formal->name);
   return false;
}
 
-   if (actual->data.image.write_only && !formal->data.image.write_only) {
+   if (actual->data.image_write_only && !formal->data.image_write_only) {
   _mesa_glsl_error(loc, state,
"function call parameter `%s' drops "
"`writeonly' qualifier", formal->name);
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 0128b3f..0621ea7 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2314,11 +2314,11 @@ apply_image_qualifier_to_variable(const struct ast_type_qualifier *qual,
   "global variables");
   }
 
-  var->data.image.read_only |= qual->flags.q.read_only;
-  var->data.image.write_only |= qual->flags.q.write_only;
-  var->data.image.coherent |= qual->flags.q.coherent;
-  var->data.image._volatile |= qual->flags.q._volatile;
-  var->data.image.restrict_flag |= qual->flags.q.restrict_flag;
+  var->data.image_read_only |= qual->flags.q.read_only;
+  var->data.image_write_only |= qual->flags.q.write_only;
+  var->data.image_coherent |= qual->flags.q.coherent;
+  var->data.image_volatile |= qual->flags.q._volatile;
+  var->data.image_restrict |= qual->flags.q.restrict_flag;
   var->data.read_only = true;
 
   if (qual->flags.q.explicit_image_format) {
@@ -2332,7 +2332,7 @@ apply_image_qualifier_to_variable(const struct ast_type_qualifier *qual,
  "base data type of the image");
  }
 
- var->data.image.format = qual->image_format;
+ var->data.image_format = qual->image_format;
   } else {
  if (var->data.mode == ir_var_uniform && !qual->flags.q.write_only) {
 _mesa_glsl_error(loc, state, "uniforms not qualified with "
@@ -2340,7 +2340,7 @@ apply_image_qualifier_to_variable(const struct ast_type_qualifier *qual,
  "qualifier");
  }
 
- var->data.image.format = GL_NONE;
+ var->data.image_format = GL_NONE;
   }
}
 }
diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
index f9f0686..4b538c7 100644
--- a/src/glsl/builtin_functions.cpp
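
Archive note (the diff above is cut off in this archive): the flattened layout and the "may add qualifiers, may not drop them" rule checked in verify_image_parameter() can be sketched as below. `image_data_sketch` and `call_keeps_qualifiers` are hypothetical names; only the field names are taken from the patch, and this is an illustration rather than the Mesa implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the image-related part of ir_variable_data
// after the patch: five 1-bit flags plus a 16-bit format.
struct image_data_sketch {
   unsigned image_read_only:1;
   unsigned image_write_only:1;
   unsigned image_coherent:1;
   unsigned image_volatile:1;
   unsigned image_restrict:1;
   std::uint16_t image_format;
};

// A formal parameter may carry extra qualifiers, but it must not drop
// any qualifier that the actual argument has.
inline bool call_keeps_qualifiers(const image_data_sketch &actual,
                                  const image_data_sketch &formal)
{
   if (actual.image_coherent && !formal.image_coherent)
      return false;
   if (actual.image_volatile && !formal.image_volatile)
      return false;
   if (actual.image_restrict && !formal.image_restrict)
      return false;
   if (actual.image_read_only && !formal.image_read_only)
      return false;
   if (actual.image_write_only && !formal.image_write_only)
      return false;
   return true;
}
```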

[Mesa-dev] [PATCH 16/21] glsl: Make ir_variable::max_ifc_array_access private

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

The payoff for this will come in a few more patches.

Signed-off-by: Ian Romanick 
---
 src/glsl/ast_array_index.cpp | 10 --
 src/glsl/ir.h| 37 -
 src/glsl/ir_validate.cpp |  9 +++--
 src/glsl/link_functions.cpp  | 14 +++---
 src/glsl/linker.cpp  |  5 +++--
 5 files changed, 53 insertions(+), 22 deletions(-)

diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
index f3b060e..ecc4200 100644
--- a/src/glsl/ast_array_index.cpp
+++ b/src/glsl/ast_array_index.cpp
@@ -88,8 +88,14 @@ update_max_array_access(ir_rvalue *ir, unsigned idx, YYLTYPE *loc,
 unsigned field_index =
deref_record->record->type->field_index(deref_record->field);
 assert(field_index < interface_type->length);
-if (idx > deref_var->var->max_ifc_array_access[field_index]) {
-   deref_var->var->max_ifc_array_access[field_index] = idx;
+
+unsigned *const max_ifc_array_access =
+   deref_var->var->get_max_ifc_array_access();
+
+assert(max_ifc_array_access != NULL);
+
+if (idx > max_ifc_array_access[field_index]) {
+   max_ifc_array_access[field_index] = idx;
 
/* Check whether this access will, as a side effect, implicitly
 * cause the size of a built-in array to be too large.
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index fb10c32..bfd790e 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -528,6 +528,17 @@ public:
}
 
/**
+* Get the max_ifc_array_access pointer
+*
+* A "set" function is not needed because the array is dynamically allocated
+* as necessary.
+*/
+   inline unsigned *get_max_ifc_array_access()
+   {
+  return this->max_ifc_array_access;
+   }
+
+   /**
 * Enable emitting extension warnings for this variable
 */
void enable_extension_warning(const char *extension);
@@ -549,19 +560,6 @@ public:
 */
const char *name;
 
-   /**
-* For variables which satisfy the is_interface_instance() predicate, this
-* points to an array of integers such that if the ith member of the
-* interface block is an array, max_ifc_array_access[i] is the maximum
-* array element of that member that has been accessed.  If the ith member
-* of the interface block is not an array, max_ifc_array_access[i] is
-* unused.
-*
-* For variables whose type is not an interface block, this pointer is
-* NULL.
-*/
-   unsigned *max_ifc_array_access;
-
struct ir_variable_data {
 
   /**
@@ -807,6 +805,19 @@ private:
static const char *const warn_extension_table[];
 
/**
+* For variables which satisfy the is_interface_instance() predicate, this
+* points to an array of integers such that if the ith member of the
+* interface block is an array, max_ifc_array_access[i] is the maximum
+* array element of that member that has been accessed.  If the ith member
+* of the interface block is not an array, max_ifc_array_access[i] is
+* unused.
+*
+* For variables whose type is not an interface block, this pointer is
+* NULL.
+*/
+   unsigned *max_ifc_array_access;
+
+   /**
 * For variables that are in an interface block or are an instance of an
 * interface block, this is the \c GLSL_TYPE_INTERFACE type for that block.
 *
diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
index 08dd250..2fca437 100644
--- a/src/glsl/ir_validate.cpp
+++ b/src/glsl/ir_validate.cpp
@@ -662,10 +662,15 @@ ir_validate::visit(ir_variable *ir)
  ir->get_interface_type()->fields.structure;
   for (unsigned i = 0; i < ir->get_interface_type()->length; i++) {
  if (fields[i].type->array_size() > 0) {
-if (ir->max_ifc_array_access[i] >= fields[i].type->length) {
+const unsigned *const max_ifc_array_access =
+   ir->get_max_ifc_array_access();
+
+assert(max_ifc_array_access != NULL);
+
+if (max_ifc_array_access[i] >= fields[i].type->length) {
printf("ir_variable has maximum access out of bounds for "
   "field %s (%d vs %d)\n", fields[i].name,
-  ir->max_ifc_array_access[i], fields[i].type->length);
+  max_ifc_array_access[i], fields[i].type->length);
ir->print();
abort();
 }
diff --git a/src/glsl/link_functions.cpp b/src/glsl/link_functions.cpp
index 56f3f20..60d0d13 100644
--- a/src/glsl/link_functions.cpp
+++ b/src/glsl/link_functions.cpp
@@ -246,11 +246,19 @@ public:
/* Similarly, we need implicit sizes of arrays within interface
 * blocks to be sized by the maximal access in *any* shader.
 */
+   unsigned *const linked_max_ifc_array_access =
+  var->get_max_ifc_array_access();
+   

[Mesa-dev] [PATCH 09/21] glsl: Set ir_instruction::ir_type in the base class constructor

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

This has the added perk that if you forget to set ir_type in the
constructor of a new subclass (or a new constructor of an existing
subclass) the compiler will tell you... instead of relying on
ir_validate or similar run-time detection.

Signed-off-by: Ian Romanick 
---
 src/glsl/ir.cpp | 65 ++---
 src/glsl/ir.h   | 46 +++-
 2 files changed, 57 insertions(+), 54 deletions(-)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index 65a755e..4907b34 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -26,7 +26,8 @@
 #include "ir_visitor.h"
 #include "glsl_types.h"
 
-ir_rvalue::ir_rvalue()
+ir_rvalue::ir_rvalue(enum ir_node_type t)
+   : ir_instruction(t)
 {
this->type = glsl_type::error_type;
 }
@@ -153,8 +154,8 @@ ir_assignment::whole_variable_written()
 
 ir_assignment::ir_assignment(ir_dereference *lhs, ir_rvalue *rhs,
 ir_rvalue *condition, unsigned write_mask)
+   : ir_instruction(ir_type_assignment)
 {
-   this->ir_type = ir_type_assignment;
this->condition = condition;
this->rhs = rhs;
this->lhs = lhs;
@@ -173,8 +174,8 @@ ir_assignment::ir_assignment(ir_dereference *lhs, ir_rvalue *rhs,
 
 ir_assignment::ir_assignment(ir_rvalue *lhs, ir_rvalue *rhs,
 ir_rvalue *condition)
+   : ir_instruction(ir_type_assignment)
 {
-   this->ir_type = ir_type_assignment;
this->condition = condition;
this->rhs = rhs;
 
@@ -198,8 +199,8 @@ ir_assignment::ir_assignment(ir_rvalue *lhs, ir_rvalue *rhs,
 ir_expression::ir_expression(int op, const struct glsl_type *type,
 ir_rvalue *op0, ir_rvalue *op1,
 ir_rvalue *op2, ir_rvalue *op3)
+   : ir_rvalue(ir_type_expression)
 {
-   this->ir_type = ir_type_expression;
this->type = type;
this->operation = ir_expression_operation(op);
this->operands[0] = op0;
@@ -215,9 +216,8 @@ ir_expression::ir_expression(int op, const struct glsl_type *type,
 }
 
 ir_expression::ir_expression(int op, ir_rvalue *op0)
+   : ir_rvalue(ir_type_expression)
 {
-   this->ir_type = ir_type_expression;
-
this->operation = ir_expression_operation(op);
this->operands[0] = op0;
this->operands[1] = NULL;
@@ -324,9 +324,8 @@ ir_expression::ir_expression(int op, ir_rvalue *op0)
 }
 
 ir_expression::ir_expression(int op, ir_rvalue *op0, ir_rvalue *op1)
+   : ir_rvalue(ir_type_expression)
 {
-   this->ir_type = ir_type_expression;
-
this->operation = ir_expression_operation(op);
this->operands[0] = op0;
this->operands[1] = op1;
@@ -420,9 +419,8 @@ ir_expression::ir_expression(int op, ir_rvalue *op0, ir_rvalue *op1)
 
 ir_expression::ir_expression(int op, ir_rvalue *op0, ir_rvalue *op1,
  ir_rvalue *op2)
+   : ir_rvalue(ir_type_expression)
 {
-   this->ir_type = ir_type_expression;
-
this->operation = ir_expression_operation(op);
this->operands[0] = op0;
this->operands[1] = op1;
@@ -610,25 +608,25 @@ ir_expression::get_operator(const char *str)
 }
 
 ir_constant::ir_constant()
+   : ir_rvalue(ir_type_constant)
 {
-   this->ir_type = ir_type_constant;
 }
 
 ir_constant::ir_constant(const struct glsl_type *type,
 const ir_constant_data *data)
+   : ir_rvalue(ir_type_constant)
 {
assert((type->base_type >= GLSL_TYPE_UINT)
  && (type->base_type <= GLSL_TYPE_BOOL));
 
-   this->ir_type = ir_type_constant;
this->type = type;
memcpy(& this->value, data, sizeof(this->value));
 }
 
 ir_constant::ir_constant(float f, unsigned vector_elements)
+   : ir_rvalue(ir_type_constant)
 {
assert(vector_elements <= 4);
-   this->ir_type = ir_type_constant;
this->type = glsl_type::get_instance(GLSL_TYPE_FLOAT, vector_elements, 1);
for (unsigned i = 0; i < vector_elements; i++) {
   this->value.f[i] = f;
@@ -639,9 +637,9 @@ ir_constant::ir_constant(float f, unsigned vector_elements)
 }
 
 ir_constant::ir_constant(unsigned int u, unsigned vector_elements)
+   : ir_rvalue(ir_type_constant)
 {
assert(vector_elements <= 4);
-   this->ir_type = ir_type_constant;
this->type = glsl_type::get_instance(GLSL_TYPE_UINT, vector_elements, 1);
for (unsigned i = 0; i < vector_elements; i++) {
   this->value.u[i] = u;
@@ -652,9 +650,9 @@ ir_constant::ir_constant(unsigned int u, unsigned vector_elements)
 }
 
 ir_constant::ir_constant(int integer, unsigned vector_elements)
+   : ir_rvalue(ir_type_constant)
 {
assert(vector_elements <= 4);
-   this->ir_type = ir_type_constant;
this->type = glsl_type::get_instance(GLSL_TYPE_INT, vector_elements, 1);
for (unsigned i = 0; i < vector_elements; i++) {
   this->value.i[i] = integer;
@@ -665,9 +663,9 @@ ir_constant::ir_constant(int integer, unsigned vector_elements)
 }
 
 ir_constant::ir_constant(bool b, unsigned vector_elements)
+   : ir_rvalue(ir_type_constant)
 {
assert(vector_elements <= 4);
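
Archive note (the diff above is cut off in this archive): the compile-time benefit described in the commit message comes from removing the base class's default constructor. A minimal sketch with hypothetical class names, assuming a three-level hierarchy like the one in the patch:

```cpp
#include <cassert>

enum node_type_sketch { type_assignment, type_expression, type_constant };

class instruction_sketch {
public:
   node_type_sketch ir_type;

protected:
   // No default constructor: every subclass has to pass its type up.
   explicit instruction_sketch(node_type_sketch t) : ir_type(t) {}
};

class rvalue_sketch : public instruction_sketch {
protected:
   explicit rvalue_sketch(node_type_sketch t) : instruction_sketch(t) {}
};

class constant_sketch : public rvalue_sketch {
public:
   // Leaving out ": rvalue_sketch(...)" here would not compile, because
   // rvalue_sketch has no default constructor to fall back on.
   constant_sketch() : rvalue_sketch(type_constant) {}
};
```

With the old pattern (`this->ir_type = ...` in each constructor body) a forgotten assignment still compiles and only shows up at run time.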

[Mesa-dev] [PATCH 14/21] glsl: Use short names for flattening temporaries

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 66KiB on 64-bit.

Before: IR MEM: variable usage / name / total: 5746368 1208630 6954998
After:  IR MEM: variable usage / name / total: 5746368 1140817 6887185

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 42KiB on 32-bit.

Before: IR MEM: variable usage / name / total: 4327584 844096 5171680
After:  IR MEM: variable usage / name / total: 4327584 800183 5127767

Signed-off-by: Ian Romanick 
---
 src/glsl/ir_expression_flattening.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp | 16 +---
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/glsl/ir_expression_flattening.cpp b/src/glsl/ir_expression_flattening.cpp
index c1cadb1..4fd4733 100644
--- a/src/glsl/ir_expression_flattening.cpp
+++ b/src/glsl/ir_expression_flattening.cpp
@@ -78,7 +78,7 @@ ir_expression_flattening_visitor::handle_rvalue(ir_rvalue **rvalue)
 
void *ctx = ralloc_parent(ir);
 
-   var = new(ctx) ir_variable(ir->type, "flattening_tmp", ir_var_temporary);
+   var = new(ctx) ir_variable(ir->type, "$f", ir_var_temporary);
base_ir->insert_before(var);
 
assign = new(ctx) ir_assignment(new(ctx) ir_dereference_variable(var),
diff --git a/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp b/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp
index a9125ca..aac515b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp
@@ -368,9 +368,19 @@ brw_do_vector_splitting(exec_list *instructions)
   entry->mem_ctx = ralloc_parent(entry->var);
 
   for (unsigned int i = 0; i < entry->var->type->vector_elements; i++) {
-const char *name = ralloc_asprintf(mem_ctx, "%s_%c",
-   entry->var->name,
-   "xyzw"[i]);
+ const char *name;
+ char buf[3];
+
+ if (entry->var->name[0] == '$') {
+buf[0] = '$';
+buf[1] = "xyzw"[i];
+buf[2] = '\0';
+name = buf;
+ } else {
+name = ralloc_asprintf(mem_ctx, "%s_%c",
+   entry->var->name,
+   "xyzw"[i]);
+ }
 
 entry->components[i] = new(entry->mem_ctx) ir_variable(type, name,
                                                        ir_var_temporary);
-- 
1.8.1.4
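
Archive note: the naming rule that the brw_fs_vector_splitting.cpp hunk introduces can be sketched as a small pure function. `component_name` is a hypothetical helper that uses std::string instead of ralloc, so it only illustrates the rule, not the allocation behaviour of the patch.

```cpp
#include <cassert>
#include <string>

// Name for component i (0..3) of a split vector variable.
// Compiler temporaries, whose names start with '$', get a fixed
// two-character name; everything else keeps the "<name>_<component>" form.
inline std::string component_name(const std::string &var_name, unsigned i)
{
   assert(i < 4);
   const char c = "xyzw"[i];

   if (!var_name.empty() && var_name[0] == '$')
      return std::string("$") + c;

   return var_name + "_" + c;
}
```

For example, under this rule the third component of `$f` is named `$z`, while the first component of a user variable `color` is still `color_x`.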



[Mesa-dev] [PATCH 20/21] glsl: Use short names for function return value variables

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 181KiB on 64-bit.

Before: IR MEM: variable usage / name / total: 5327760 1121441 6449201
After:  IR MEM: variable usage / name / total: 5327760 935234 6262994

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 114KiB on 32-bit.

Before: IR MEM: variable usage / name / total: 4118280 787727 4906007
After:  IR MEM: variable usage / name / total: 4118280 670980 4789260

Signed-off-by: Ian Romanick 
---
 src/glsl/ast_function.cpp | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index bad410b..24bbd90 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -394,12 +394,9 @@ generate_call(exec_list *instructions, ir_function_signature *sig,
ir_dereference_variable *deref = NULL;
if (!sig->return_type->is_void()) {
   /* Create a new temporary to hold the return value. */
-  ir_variable *var;
+  ir_variable *var =
+ new(ctx) ir_variable(sig->return_type, "$r", ir_var_temporary);
 
-  var = new(ctx) ir_variable(sig->return_type,
-ralloc_asprintf(ctx, "%s_retval",
-sig->function_name()),
-ir_var_temporary);
   instructions->push_tail(var);
 
   deref = new(ctx) ir_dereference_variable(var);
-- 
1.8.1.4
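
Archive note: the KiB figures in the commit message follow from the "total" columns of the Before/After lines. The quoted values appear to be rounded down to whole KiB; that reading, and the helper name `saved_kib`, are assumptions of this sketch.

```cpp
#include <cassert>

// Whole KiB saved between two "total" byte counts (rounded down).
constexpr unsigned saved_kib(unsigned before_total, unsigned after_total)
{
   return (before_total - after_total) / 1024;
}

// 64-bit: 6449201 - 6262994 = 186207 bytes
static_assert(saved_kib(6449201, 6262994) == 181, "64-bit figure");

// 32-bit: 4906007 - 4789260 = 116747 bytes
static_assert(saved_kib(4906007, 4789260) == 114, "32-bit figure");
```

Only the name column changes in this patch, so the same result falls out of the name figures alone (1121441 - 935234 on 64-bit).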



[Mesa-dev] [PATCH 04/21] glsl: Use a single bit for the dual-source blend index

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

The only values allowed are 0 and 1, and the value is checked before
assigning.

With the previous changes, reduces the peak ir_variable memory usage in
a trimmed apitrace of dota2 by 204KiB on 64-bit.

Before: IR MEM: variable usage / name / total: 6374280 1439077 7813357
After:  IR MEM: variable usage / name / total: 6164976 1439077 7604053

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 102KiB on 32-bit.

Before: IR MEM: variable usage / name / total: 4850844 915817 5766661
After:  IR MEM: variable usage / name / total: 4746192 915817 5662009

Signed-off-by: Ian Romanick 
---
 src/glsl/ir.h | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 93d5aef..3767f2a 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -651,6 +651,15 @@ public:
   unsigned from_named_ifc_block_array:1;
 
   /**
+   * Output index for dual source blending.
+   *
+   * \note
+   * The GLSL spec only allows the values 0 or 1 for the index in \b dual
+   * source blending.
+   */
+  unsigned index:1;
+
+  /**
* \brief Layout qualifier for gl_FragDepth.
*
* This is not equal to \c ir_depth_layout_none if and only if this
@@ -679,11 +688,6 @@ public:
   int location;
 
   /**
-   * output index for dual source blending.
-   */
-  int index;
-
-  /**
* Initial binding point for a sampler, atomic, or UBO.
*
* For array types, this represents the binding point for the first element.
-- 
1.8.1.4
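
Archive note: the commit message relies on the value being checked before it is assigned. That matters because an unsigned 1-bit field silently keeps only the low bit of whatever is stored in it. A minimal sketch, with `data_sketch` and `set_blend_index` as hypothetical names:

```cpp
#include <cassert>

struct data_sketch {
   unsigned index:1;   // only 0 and 1 are representable
};

// Store the dual-source blend index only if it is a legal value.
// Without the range check, storing 2 would wrap to 0.
inline bool set_blend_index(data_sketch &d, int requested)
{
   if (requested < 0 || requested > 1)
      return false;

   d.index = static_cast<unsigned>(requested);
   return true;
}
```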



[Mesa-dev] [PATCH 02/21] mesa: Log memory usage statistics for all known shaders

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

Currently this is done at each call to glLinkProgram.  This seems like
as good a place as any.  This is the main place where memory usage will
change, and it enables tracking as applications progress (e.g., load new
levels).

Signed-off-by: Ian Romanick 
---
 src/mesa/main/shaderapi.c | 79 +++
 1 file changed, 79 insertions(+)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 28739da..8e8170e 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -58,6 +58,7 @@
 #include "../glsl/ir.h"
 #include "../glsl/ir_uniform.h"
 #include "../glsl/program.h"
+#include "../glsl/ir_memory_usage.h"
 
 /** Define this to enable shader substitution (see below) */
 #define SHADER_SUBST 0
@@ -887,6 +888,56 @@ compile_shader(struct gl_context *ctx, GLuint shaderObj)
 }
 
 
+#ifdef DEBUG
+static void
+memory_stats_cb(GLuint key, void *data, void *userData)
+{
+   struct ir_memory_statistics *total =
+  (struct ir_memory_statistics *) userData;
+   struct ir_memory_statistics stats;
+   struct gl_shader_program *shProg = (struct gl_shader_program *) data;
+
+   (void) key;
+
+   if (shProg->Type == GL_SHADER_PROGRAM_MESA) {
+  unsigned i;
+
+  for (i = 0; i < MESA_SHADER_STAGES; i++) {
+ struct gl_shader *sh = shProg->_LinkedShaders[i];
+
+ if (shProg->_LinkedShaders[i] != NULL) {
+calculate_ir_tree_memory_usage(sh->ir, &stats);
+
+total->variable_usage += stats.variable_usage;
+total->variable_name_usage += stats.variable_name_usage;
+total->dereference_variable_usage +=
+   stats.dereference_variable_usage;
+total->dereference_array_usage += stats.dereference_array_usage;
+total->dereference_record_usage += stats.dereference_record_usage;
+total->dereference_record_field_usage +=
+   stats.dereference_record_field_usage;
+ }
+  }
+   } else {
+  struct gl_shader *sh = (struct gl_shader *) data;
+
+  assert(sh->Type == GL_FRAGMENT_SHADER
+ || sh->Type == GL_VERTEX_SHADER
+ || sh->Type == GL_GEOMETRY_SHADER_ARB);
+
+  calculate_ir_tree_memory_usage(sh->ir, &stats);
+
+  total->variable_usage += stats.variable_usage;
+  total->variable_name_usage += stats.variable_name_usage;
+  total->dereference_variable_usage += stats.dereference_variable_usage;
+  total->dereference_array_usage += stats.dereference_array_usage;
+  total->dereference_record_usage += stats.dereference_record_usage;
+  total->dereference_record_field_usage +=
+ stats.dereference_record_field_usage;
+   }
+}
+#endif /* DEBUG */
+
 /**
  * Link a program's shaders.
  */
@@ -920,6 +971,34 @@ link_program(struct gl_context *ctx, GLuint program)
   shProg->Name, shProg->InfoLog);
}
 
+   /* At each link, dump the memory usage statistics for *ALL*
+* known shaders.
+*/
+#ifdef DEBUG
+   if (ctx->_Shader->Flags & GLSL_LOG) {
+  struct ir_memory_statistics stats;
+
+  memset(&stats, 0, sizeof(stats));
+  _mesa_HashWalk(ctx->Shared->ShaderObjects,
+ memory_stats_cb,
+ &stats);
+
+  printf("IR MEM: variable usage / name / total: %u %u %u\n",
+ stats.variable_usage,
+ stats.variable_name_usage,
+ stats.variable_usage + stats.variable_name_usage);
+  printf("IR MEM: dereference variable usage: %u\n",
+ stats.dereference_variable_usage);
+  printf("IR MEM: dereference array usage: %u\n",
+ stats.dereference_array_usage);
+  printf("IR MEM: dereference record usage / field / total: %u %u %u\n",
+ stats.dereference_record_usage,
+ stats.dereference_record_field_usage,
+ stats.dereference_record_usage
+ + stats.dereference_record_field_usage);
+   }
+#endif /* DEBUG */
+
/* debug code */
if (0) {
   GLuint i;
-- 
1.8.1.4
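
Archive note: memory_stats_cb() repeats the same field-by-field accumulation in both branches. One way to express that accumulation once is sketched below. The struct is a cut-down, hypothetical stand-in for ir_memory_statistics with only two of its counters, and `sum_stats` is an illustrative helper, not part of the patch.

```cpp
#include <cassert>
#include <vector>

// Cut-down stand-in for ir_memory_statistics (two counters only).
struct memory_stats_sketch {
   unsigned variable_usage = 0;
   unsigned variable_name_usage = 0;

   memory_stats_sketch &operator+=(const memory_stats_sketch &other)
   {
      variable_usage += other.variable_usage;
      variable_name_usage += other.variable_name_usage;
      return *this;
   }

   // The third column printed by the "IR MEM: variable usage" line.
   unsigned total() const
   {
      return variable_usage + variable_name_usage;
   }
};

// Sum the statistics of every shader into one running total.
inline memory_stats_sketch
sum_stats(const std::vector<memory_stats_sketch> &per_shader)
{
   memory_stats_sketch total;
   for (const memory_stats_sketch &s : per_shader)
      total += s;
   return total;
}
```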



[Mesa-dev] [PATCH 15/21] glsl: Use short names for conditional temporaries

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 18KiB on 64-bit.

Before: IR MEM: variable usage / name / total: 5746368 1140817 6887185
After:  IR MEM: variable usage / name / total: 5746368 1121441 6867809

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 12KiB on 32-bit.

Before: IR MEM: variable usage / name / total: 4327584 800183 5127767
After:  IR MEM: variable usage / name / total: 4327584 787727 5115311

Signed-off-by: Ian Romanick 
---
 src/glsl/ast_to_hir.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 3fcec19..bff2e0a 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -1596,7 +1596,7 @@ ast_expression::do_hir(exec_list *instructions,
  result = (cond_val->value.b[0]) ? then_val : else_val;
   } else {
  ir_variable *const tmp =
-new(ctx) ir_variable(type, "conditional_tmp", ir_var_temporary);
+new(ctx) ir_variable(type, "$c", ir_var_temporary);
  instructions->push_tail(tmp);
 
  ir_if *const stmt = new(ctx) ir_if(op[0]);
-- 
1.8.1.4



[Mesa-dev] [PATCH 06/21] glsl: Use accessors for ir_variable::warn_extension

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

The payoff for this will come in the next patch.

No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 64-bit.

No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 32-bit.

Signed-off-by: Ian Romanick 
---
 src/glsl/builtin_variables.cpp |  4 ++--
 src/glsl/ir.cpp| 11 +++
 src/glsl/ir.h  | 22 +-
 3 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index 9b35850..1461953 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -913,14 +913,14 @@ builtin_variable_generator::generate_fs_special_vars()
   ir_variable *const var =
  add_output(FRAG_RESULT_STENCIL, int_t, "gl_FragStencilRefARB");
   if (state->ARB_shader_stencil_export_warn)
- var->warn_extension = "GL_ARB_shader_stencil_export";
+ var->enable_extension_warning("GL_ARB_shader_stencil_export");
}
 
if (state->AMD_shader_stencil_export_enable) {
   ir_variable *const var =
  add_output(FRAG_RESULT_STENCIL, int_t, "gl_FragStencilRefAMD");
   if (state->AMD_shader_stencil_export_warn)
- var->warn_extension = "GL_AMD_shader_stencil_export";
+ var->enable_extension_warning("GL_AMD_shader_stencil_export");
}
 
if (state->ARB_sample_shading_enable) {
diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index c727d89..3d6af56 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -1610,6 +1610,17 @@ ir_variable::determine_interpolation_mode(bool flat_shade)
   return INTERP_QUALIFIER_SMOOTH;
 }
 
+void
+ir_variable::enable_extension_warning(const char *extension)
+{
+   this->warn_extension = extension;
+}
+
+const char *
+ir_variable::get_extension_warning() const
+{
+   return this->warn_extension;
+}
 
 ir_function_signature::ir_function_signature(const glsl_type *return_type,
  builtin_available_predicate b)
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index fac24df..3298a50 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -501,6 +501,18 @@ public:
}
 
/**
+* Enable emitting extension warnings for this variable
+*/
+   void enable_extension_warning(const char *extension);
+
+   /**
+* Get the extension warning string for this variable
+*
+* If warnings are not enabled, \c NULL is returned.
+*/
+   const char *get_extension_warning() const;
+
+   /**
 * Declared type of the variable
 */
const struct glsl_type *type;
@@ -740,11 +752,6 @@ public:
/*@}*/
 
/**
-* Emit a warning if this variable is accessed.
-*/
-   const char *warn_extension;
-
-   /**
 * Value assigned in the initializer of a variable declared "const"
 */
ir_constant *constant_value;
@@ -767,6 +774,11 @@ private:
 * \sa ir_variable::location
 */
const glsl_type *interface_type;
+
+   /**
+* Emit a warning if this variable is accessed.
+*/
+   const char *warn_extension;
 };
 
 /**
-- 
1.8.1.4



[Mesa-dev] [PATCH 01/21] glsl: Add a facility to get some memory usage statistics for a shader

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

v2: Also account for the ralloc header overhead.

Signed-off-by: Ian Romanick 
---
 src/glsl/Makefile.sources|   1 +
 src/glsl/ir_memory_usage.cpp | 104 +++
 src/glsl/ir_memory_usage.h   |  48 
 3 files changed, 153 insertions(+)
 create mode 100644 src/glsl/ir_memory_usage.cpp
 create mode 100644 src/glsl/ir_memory_usage.h

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 5945590..6e230f7 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -41,6 +41,7 @@ LIBGLSL_FILES = \
$(GLSL_SRCDIR)/ir_hierarchical_visitor.cpp \
$(GLSL_SRCDIR)/ir_hv_accept.cpp \
$(GLSL_SRCDIR)/ir_import_prototypes.cpp \
+   $(GLSL_SRCDIR)/ir_memory_usage.cpp \
$(GLSL_SRCDIR)/ir_print_visitor.cpp \
$(GLSL_SRCDIR)/ir_reader.cpp \
$(GLSL_SRCDIR)/ir_rvalue_visitor.cpp \
diff --git a/src/glsl/ir_memory_usage.cpp b/src/glsl/ir_memory_usage.cpp
new file mode 100644
index 000..68c0b5c
--- /dev/null
+++ b/src/glsl/ir_memory_usage.cpp
@@ -0,0 +1,104 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+/**
+ * \file ir_memory_usage.cpp
+ * Determine the amount of memory used by different kinds of IR in a shader.
+ */
+
+#include "ir.h"
+#include "ir_hierarchical_visitor.h"
+#include "ir_memory_usage.h"
+
+class ir_memory_usage : public ir_hierarchical_visitor {
+public:
+   ir_memory_usage()
+   {
+  memset(&this->s, 0, sizeof(this->s));
+   }
+
+   ~ir_memory_usage()
+   {
+  /* empty */
+   }
+
+   virtual ir_visitor_status visit(ir_variable *v);
+   virtual ir_visitor_status visit(ir_dereference_variable *ir);
+   virtual ir_visitor_status visit_enter(class ir_dereference_array *);
+   virtual ir_visitor_status visit_enter(class ir_dereference_record *);
+
+   ir_memory_statistics s;
+};
+
+/* In release builds, the ralloc header contains 5 pointers.
+ */
+static const unsigned ralloc_header_size = 5 * sizeof(void *);
+
+ir_visitor_status
+ir_memory_usage::visit(ir_variable *ir)
+{
+   this->s.variable_usage += sizeof(*ir);
+
+   if (ir->state_slots != NULL)
+  this->s.variable_usage += (sizeof(ir_state_slot) * ir->num_state_slots)
+ + ralloc_header_size;
+
+   this->s.variable_name_usage += strlen(ir->name) + 1 + ralloc_header_size;
+
+   return visit_continue;
+}
+
+
+ir_visitor_status
+ir_memory_usage::visit(ir_dereference_variable *ir)
+{
+   this->s.dereference_variable_usage += sizeof(*ir);
+
+   return visit_continue;
+}
+
+ir_visitor_status
+ir_memory_usage::visit_enter(class ir_dereference_array *ir)
+{
+   this->s.dereference_array_usage += sizeof(*ir);
+   return visit_continue;
+}
+
+ir_visitor_status
+ir_memory_usage::visit_enter(class ir_dereference_record *ir)
+{
+   this->s.dereference_record_usage += sizeof(*ir);
+   this->s.dereference_record_field_usage += strlen(ir->field) + 1
+  + ralloc_header_size;
+   return visit_continue;
+}
+
+extern "C" void
+calculate_ir_tree_memory_usage(exec_list *instructions,
+   struct ir_memory_statistics *stats)
+{
+   ir_memory_usage v;
+
+   v.run(instructions);
+   *stats = v.s;
+}
diff --git a/src/glsl/ir_memory_usage.h b/src/glsl/ir_memory_usage.h
new file mode 100644
index 000..3d137a3
--- /dev/null
+++ b/src/glsl/ir_memory_usage.h
@@ -0,0 +1,48 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditi
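
The accounting rule used by ir_memory_usage::visit(ir_variable *) above can
be sketched outside of Mesa. This is only a minimal illustration under stated
assumptions: toy_var, toy_slot, toy_stats and account() are hypothetical names
that do not exist in Mesa, and the 5-pointer header is simply the figure the
patch assumes for release builds of ralloc.

```cpp
#include <cstddef>
#include <cstring>

// Assumption taken from the patch: each ralloc allocation carries a header
// of 5 pointers in release builds.
static const std::size_t header_size = 5 * sizeof(void *);

// Hypothetical stand-ins for ir_state_slot and ir_variable.
struct toy_slot { int tokens[5]; int swizzle; };

struct toy_var {
   const char *name;        // separately allocated, NUL-terminated string
   std::size_t num_slots;   // 0 when no slot array was allocated
};

struct toy_stats {
   std::size_t variable_usage;
   std::size_t variable_name_usage;
};

// Charge the object itself, then every separate allocation hanging off it
// (payload plus one allocation header each).
inline void account(const toy_var &v, toy_stats &s)
{
   s.variable_usage += sizeof(v);

   if (v.num_slots != 0)
      s.variable_usage += sizeof(toy_slot) * v.num_slots + header_size;

   s.variable_name_usage += std::strlen(v.name) + 1 + header_size;
}
```

For a variable named "gl_Position" with no state slots, this charges
sizeof(toy_var) to the variable bucket and 12 bytes plus one header to the
name bucket.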

[Mesa-dev] [PATCH 10/21] glsl: Make ir_instruction::ir_type private

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

In the next patch, the type of ir_type is going to change from enum to
uint8_t.  Since the type won't be an enum, we won't get compiler
warnings about, for example, switch statements that don't have cases for
all the enum values.  Using a getter that returns the enum type will
enable us to continue getting those warnings.

Also, ir_type should never be changed after an object is created.
Having it set in the constructor and no setter effectively makes it
write-once.

Signed-off-by: Ian Romanick 
---
 src/glsl/ast_function.cpp  | 2 +-
 src/glsl/ast_to_hir.cpp| 2 +-
 src/glsl/ir.h  | 8 +++-
 src/glsl/ir_constant_expression.cpp| 4 ++--
 src/glsl/ir_print_visitor.cpp  | 2 +-
 src/glsl/ir_validate.cpp   | 6 +++---
 src/glsl/loop_analysis.cpp | 2 +-
 src/glsl/loop_controls.cpp | 2 +-
 src/glsl/loop_unroll.cpp   | 2 +-
 src/glsl/lower_clip_distance.cpp   | 4 ++--
 src/glsl/lower_if_to_cond_assign.cpp   | 4 ++--
 src/glsl/lower_jumps.cpp   | 6 +++---
 src/glsl/lower_offset_array.cpp| 2 +-
 src/glsl/lower_ubo_reference.cpp   | 4 ++--
 src/glsl/lower_vector.cpp  | 4 ++--
 src/glsl/lower_vector_insert.cpp   | 2 +-
 src/glsl/opt_constant_folding.cpp  | 2 +-
 src/glsl/opt_cse.cpp   | 2 +-
 src/glsl/opt_redundant_jumps.cpp   | 6 +++---
 src/glsl/opt_structure_splitting.cpp   | 2 +-
 src/glsl/opt_vectorize.cpp | 2 +-
 src/mesa/program/ir_to_mesa.cpp| 2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 +-
 23 files changed, 40 insertions(+), 34 deletions(-)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index c70b519..bad410b 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -175,7 +175,7 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
 
   /* Verify that 'const_in' parameters are ir_constants. */
   if (formal->data.mode == ir_var_const_in &&
- actual->ir_type != ir_type_constant) {
+ actual->get_ir_type() != ir_type_constant) {
 _mesa_glsl_error(&loc, state,
  "parameter `in %s' must be a constant expression",
  formal->name);
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index ef1607d..3fcec19 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -756,7 +756,7 @@ do_assignment(exec_list *instructions, struct _mesa_glsl_parse_state *state,
/* If the assignment LHS comes back as an ir_binop_vector_extract
 * expression, move it to the RHS as an ir_triop_vector_insert.
 */
-   if (lhs->ir_type == ir_type_expression) {
+   if (lhs->get_ir_type() == ir_type_expression) {
   ir_expression *const lhs_expr = lhs->as_expression();
 
   if (unlikely(lhs_expr->operation == ir_binop_vector_extract)) {
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 5d45469..7faee74 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -91,9 +91,15 @@ enum ir_node_type {
  * Base class of all IR instructions
  */
 class ir_instruction : public exec_node {
-public:
+private:
enum ir_node_type ir_type;
 
+public:
+   inline enum ir_node_type get_ir_type() const
+   {
+  return this->ir_type;
+   }
+
/**
 * GCC 4.7+ and clang warn when deleting an ir_instruction unless
 * there's a virtual destructor present.  Because we almost
diff --git a/src/glsl/ir_constant_expression.cpp b/src/glsl/ir_constant_expression.cpp
index 8afe8f7..c07b951 100644
--- a/src/glsl/ir_constant_expression.cpp
+++ b/src/glsl/ir_constant_expression.cpp
@@ -403,7 +403,7 @@ constant_referenced(const ir_dereference *deref,
if (variable_context == NULL)
   return false;
 
-   switch (deref->ir_type) {
+   switch (deref->get_ir_type()) {
case ir_type_dereference_array: {
   const ir_dereference_array *const da =
  (const ir_dereference_array *) deref;
@@ -1785,7 +1785,7 @@ bool ir_function_signature::constant_expression_evaluate_expression_list(const s
 {
foreach_list(n, &body) {
   ir_instruction *inst = (ir_instruction *)n;
-  switch(inst->ir_type) {
+  switch(inst->get_ir_type()) {
 
 /* (declare () type symbol) */
   case ir_type_variable: {
diff --git a/src/glsl/ir_print_visitor.cpp b/src/glsl/ir_print_visitor.cpp
index 0a7695a..e5ac50e 100644
--- a/src/glsl/ir_print_visitor.cpp
+++ b/src/glsl/ir_print_visitor.cpp
@@ -70,7 +70,7 @@ _mesa_print_ir(FILE *f, exec_list *instructions,
foreach_list(n, instructions) {
   ir_instruction *ir = (ir_instruction *) n;
   ir->fprint(f);
-  if (ir->ir_type != ir_type_function)
+  if (ir->get_ir_type() != ir_type_function)
 fprintf(f, "\n");
}
fprintf(f, "\n)");
diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
index 71defc8..1cfd0
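
The reasoning in the commit message above (narrow storage, enum-typed
getter, no setter) can be sketched in isolation. The names toy_node_type,
toy_instruction and describe() are hypothetical; storing the tag in a uint8_t
anticipates the next patch in the series rather than showing this one
verbatim.

```cpp
#include <cstdint>

enum toy_node_type {
   toy_node_variable,
   toy_node_constant,
   toy_node_expression
};

class toy_instruction {
public:
   explicit toy_instruction(toy_node_type t)
      : type(static_cast<uint8_t>(t))
   {
   }

   // The getter hands back the enum type, so a switch over it can still be
   // checked for missing cases (e.g. by -Wswitch).  There is no setter, which
   // makes the tag effectively write-once.
   toy_node_type get_type() const
   {
      return static_cast<toy_node_type>(this->type);
   }

private:
   uint8_t type;   // narrow storage; callers never see this type
};

inline const char *describe(const toy_instruction &ir)
{
   switch (ir.get_type()) {
   case toy_node_variable:   return "variable";
   case toy_node_constant:   return "constant";
   case toy_node_expression: return "expression";
   }
   return "unknown";
}
```

If a fourth enumerator were added without a matching case, a compiler with
switch warnings enabled would be expected to flag describe(), which is the
property the getter is meant to preserve.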

[Mesa-dev] [PATCH 12/21] glsl: Make compiler-added padding ir_instruction usable

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 64-bit.

No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 32-bit.

Signed-off-by: Ian Romanick 
---
 src/glsl/ir.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index bc02f6e..fb10c32 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -95,6 +95,20 @@ private:
uint8_t ir_type;
 
 public:
+   /**
+* Alignment padding that would be added by the compiler
+*
+* Putting a field here makes what would otherwise be dead space usable.
+* Subclasses of ir_instruction can store data here.  Care must be taken
+* for two reasons:
+*
+* 1. Direct descendents in the class hierarchy (e.g., \c ir_dereference
+*and \c ir_dereference_array) must not try to use this space.
+*
+* 2. The size of the padding depends on the architecture.
+*/
+   uint8_t padding[sizeof(intptr_t) - 1];
+
inline enum ir_node_type get_ir_type() const
{
   STATIC_ASSERT(ir_type_max < 256);
-- 
1.8.1.4
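
Why this patch reports no change in memory usage can be checked with a small
stand-alone sketch. It assumes a mainstream ABI where pointers and intptr_t
have the same size and alignment; toy_list_node, toy_without_field,
toy_with_field and padding_is_free() are hypothetical names, and the real
ir_instruction also has a vtable pointer that is left out here.

```cpp
#include <cstdint>

// Stand-in for exec_node: two pointers, so pointer-aligned.
struct toy_list_node {
   toy_list_node *next;
   toy_list_node *prev;
};

// One byte of payload; the compiler rounds the size up to pointer alignment.
struct toy_without_field : toy_list_node {
   uint8_t type;
};

// Same payload, but the bytes the compiler would have added silently are now
// a named array that can hold data.
struct toy_with_field : toy_list_node {
   uint8_t type;
   uint8_t padding[sizeof(intptr_t) - 1];
};

inline bool padding_is_free()
{
   return sizeof(toy_with_field) == sizeof(toy_without_field);
}
```

On typical 32-bit and 64-bit Linux targets padding_is_free() is expected to
return true, with 3 and 7 usable bytes respectively, which matches the
comment's warning that the size of the padding depends on the architecture.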



[Mesa-dev] [PATCH 03/21] glsl: Eliminate ir_variable::data.atomic.buffer_index

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

Just use ir_variable::data.binding... because that's where the
binding is stored for everything else that can use layout(binding=).

No change in the peak ir_variable memory usage in a trimmed apitrace of
dota2 on 64-bit.

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 102KiB on 32-bit.

Before: IR MEM: variable usage / name / total: 4955496 915817 5871313
After:  IR MEM: variable usage / name / total: 4850844 915817 5766661

Signed-off-by: Ian Romanick 
---
 src/glsl/ir.cpp| 2 +-
 src/glsl/ir.h  | 3 +--
 src/glsl/link_atomics.cpp  | 4 +++-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 2 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 2 +-
 5 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index ba8a839..65541c2 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -1546,6 +1546,7 @@ ir_variable::ir_variable(const struct glsl_type *type, const char *name,
this->data.has_initializer = false;
this->data.location = -1;
this->data.location_frac = 0;
+   this->data.binding = 0;
this->warn_extension = NULL;
this->constant_value = NULL;
this->constant_initializer = NULL;
@@ -1561,7 +1562,6 @@ ir_variable::ir_variable(const struct glsl_type *type, const char *name,
this->data.mode = mode;
this->data.interpolation = INTERP_QUALIFIER_NONE;
this->data.max_array_access = 0;
-   this->data.atomic.buffer_index = 0;
this->data.atomic.offset = 0;
this->data.image.read_only = false;
this->data.image.write_only = false;
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index ef4a12d..93d5aef 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -684,7 +684,7 @@ public:
   int index;
 
   /**
-   * Initial binding point for a sampler or UBO.
+   * Initial binding point for a sampler, atomic, or UBO.
*
* For array types, this represents the binding point for the first 
element.
*/
@@ -694,7 +694,6 @@ public:
* Location an atomic counter is stored at.
*/
   struct {
- unsigned buffer_index;
  unsigned offset;
   } atomic;
 
diff --git a/src/glsl/link_atomics.cpp b/src/glsl/link_atomics.cpp
index d92cdb1..8655269 100644
--- a/src/glsl/link_atomics.cpp
+++ b/src/glsl/link_atomics.cpp
@@ -192,7 +192,9 @@ link_assign_atomic_counter_resources(struct gl_context *ctx,
  gl_uniform_storage *const storage = &prog->UniformStorage[id];
 
  mab.Uniforms[j] = id;
- var->data.atomic.buffer_index = i;
+ if (!var->data.explicit_binding)
+var->data.binding = i;
+
  storage->atomic_buffer_index = i;
  storage->offset = var->data.atomic.offset;
  storage->array_stride = (var->type->is_array() ?
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index dcc8441..f2b34e2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -2278,7 +2278,7 @@ fs_visitor::visit_atomic_counter_intrinsic(ir_call *ir)
   ir->actual_parameters.get_head());
ir_variable *location = deref->variable_referenced();
unsigned surf_index = (prog_data->base.binding_table.abo_start +
-  location->data.atomic.buffer_index);
+  location->data.binding);
 
/* Calculate the surface offset */
fs_reg offset(this, glsl_type::uint_type);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 7bad81c..d72c47c 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -2198,7 +2198,7 @@ vec4_visitor::visit_atomic_counter_intrinsic(ir_call *ir)
   ir->actual_parameters.get_head());
ir_variable *location = deref->variable_referenced();
unsigned surf_index = (prog_data->base.binding_table.abo_start +
-  location->data.atomic.buffer_index);
+  location->data.binding);
 
/* Calculate the surface offset */
src_reg offset(this, glsl_type::uint_type);
-- 
1.8.1.4



[Mesa-dev] [PATCH 21/21] i965: Use short names for channel_expressions temporaries

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 39KiB on 64-bit.

Before: IR MEM: variable usage / name / total: 5327760 935234 6262994
After:  IR MEM: variable usage / name / total: 5327760 894914 6222674

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 26KiB on 32-bit.

Before: IR MEM: variable usage / name / total: 4118280 670980 4789260
After:  IR MEM: variable usage / name / total: 4118280 644100 4762380

Signed-off-by: Ian Romanick 
---
 src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp b/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
index ae5bc56..7d4e25b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
@@ -163,8 +163,7 @@ ir_channel_expressions_visitor::visit_leave(ir_assignment *ir)
   assert(!expr->operands[i]->type->is_matrix());
 
   op_var[i] = new(mem_ctx) ir_variable(expr->operands[i]->type,
-  "channel_expressions",
-  ir_var_temporary);
+   "$c", ir_var_temporary);
   ir->insert_before(op_var[i]);
 
   deref = new(mem_ctx) ir_dereference_variable(op_var[i]);
-- 
1.8.1.4
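
One way to read the name figures in this commit message, offered as an
inference and not as something the patch states: if a heap-allocated name is
charged strlen + 1 + a 5-pointer ralloc header (the rule from patch 01), and
"$c" is short enough to live in the ir_instruction padding added elsewhere in
the series and so costs nothing extra, then each renamed temporary saves the
whole cost of "channel_expressions". The helper heap_name_cost() is a
hypothetical name.

```cpp
#include <cstddef>
#include <cstring>

// Bytes charged for one separately allocated name, given the pointer size of
// the target.  Assumes the 5-pointer header used by the statistics patch.
inline std::size_t heap_name_cost(const char *name, std::size_t pointer_size)
{
   return std::strlen(name) + 1 + 5 * pointer_size;
}

// Reported name totals from the commit message.
static const std::size_t saved_64 = 935234 - 894914;   // 40320 bytes
static const std::size_t saved_32 = 670980 - 644100;   // 26880 bytes
```

Under those assumptions the old name costs 60 bytes on 64-bit and 40 bytes on
32-bit, and both reported savings divide evenly by that cost, giving the same
quotient (672) on both architectures. That agreement is what makes the
reading plausible; the count itself does not appear in the patch.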



[Mesa-dev] [PATCH 19/21] glsl: Store ir_variable_data::_num_state_slots and ::binding in 16-bits each

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 204KiB on 64-bit.

Before: IR MEM: variable usage / name / total: 5537064 1121441 6658505
After:  IR MEM: variable usage / name / total: 5327760 1121441 6449201

Reduces the peak ir_variable memory usage in a trimmed apitrace of dota2
by 102KiB on 32-bit.

Before: IR MEM: variable usage / name / total: 4222932 787727 5010659
After:  IR MEM: variable usage / name / total: 4118280 787727 4906007

Signed-off-by: Ian Romanick 
---
 src/glsl/ir.h | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 95182fb..fccbfdd 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -762,10 +762,25 @@ public:
   uint16_t image_format;
 
private:
-  unsigned _num_state_slots;/**< Number of state slots used */
+  /**
+   * Number of state slots used
+   *
+   * \note
+   * This could be stored in as few as 7-bits, if necessary.  If it is made
+   * smaller, add an assertion to \c ir_variable::allocate_state_slots to
+   * be safe.
+   */
+  uint16_t _num_state_slots;
 
public:
   /**
+   * Initial binding point for a sampler, atomic, or UBO.
+   *
+   * For array types, this represents the binding point for the first element.
+   */
+  int16_t binding;
+
+  /**
* Storage location of the base of this variable
*
* The precise meaning of this field depends on the nature of the 
variable.
@@ -786,13 +801,6 @@ public:
   int location;
 
   /**
-   * Initial binding point for a sampler, atomic, or UBO.
-   *
-   * For array types, this represents the binding point for the first element.
-   */
-  int binding;
-
-  /**
* Location an atomic counter is stored at.
*/
   struct {
-- 
1.8.1.4
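
The effect of narrowing and regrouping the fields can be sketched with two
toy layouts. These are hypothetical, heavily reduced stand-ins for the tail
of ir_variable_data (toy_before and toy_after are not Mesa types), and the
sizes in the comments assume 4-byte int with 4-byte alignment.

```cpp
#include <cstdint>

// Before: a 16-bit field followed by 32-bit fields leaves a 2-byte hole.
struct toy_before {
   uint16_t image_format;
   unsigned num_state_slots;   // 2 bytes of padding precede this
   int location;
   int binding;
   unsigned atomic_offset;
};                             // typically 20 bytes

// After: the two narrowed fields sit next to image_format, so three 16-bit
// fields share 8 bytes with a single 2-byte hole.
struct toy_after {
   uint16_t image_format;
   uint16_t num_state_slots;
   int16_t binding;
   int location;               // 2 bytes of padding precede this
   unsigned atomic_offset;
};                             // typically 16 bytes

inline bool layout_shrinks()
{
   return sizeof(toy_after) < sizeof(toy_before);
}
```

The saving in the real structure depends on its full field list and on the
target's alignment rules, which is why the commit message reports different
totals for 32-bit and 64-bit.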



[Mesa-dev] [PATCH 17/21] glsl: Make ir_variable::num_state_slots and ir_variable::state_slots private

2014-05-27 Thread Ian Romanick
From: Ian Romanick 

Also move num_state_slots inside ir_variable_data for better packing.

The payoff for this will come in a few more patches.

Signed-off-by: Ian Romanick 
---
 src/glsl/builtin_variables.cpp |  5 +--
 src/glsl/ir.h  | 56 ++
 src/glsl/ir_clone.cpp  | 13 ++
 src/glsl/ir_memory_usage.cpp   |  5 ++-
 src/glsl/linker.cpp|  7 ++--
 src/mesa/drivers/dri/i965/brw_fs.cpp   |  6 +--
 src/mesa/drivers/dri/i965/brw_shader.cpp   |  6 +--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  6 +--
 src/mesa/program/ir_to_mesa.cpp| 14 +++
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 14 +++
 10 files changed, 75 insertions(+), 57 deletions(-)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index 1461953..5878fbf 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -489,12 +489,9 @@ builtin_variable_generator::add_uniform(const glsl_type *type,
   &_mesa_builtin_uniform_desc[i];
 
const unsigned array_count = type->is_array() ? type->length : 1;
-   uni->num_state_slots = array_count * statevar->num_elements;
 
ir_state_slot *slots =
-  ralloc_array(uni, ir_state_slot, uni->num_state_slots);
-
-   uni->state_slots = slots;
+  uni->allocate_state_slots(array_count * statevar->num_elements);
 
for (unsigned a = 0; a < array_count; a++) {
   for (unsigned j = 0; j < statevar->num_elements; j++) {
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index bfd790e..ab9f27b 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -538,6 +538,32 @@ public:
   return this->max_ifc_array_access;
}
 
+   inline unsigned get_num_state_slots() const
+   {
+  return this->data._num_state_slots;
+   }
+
+   inline void set_num_state_slots(unsigned n)
+   {
+  this->data._num_state_slots = n;
+   }
+
+   inline ir_state_slot *get_state_slots()
+   {
+  return this->state_slots;
+   }
+
+   inline ir_state_slot *allocate_state_slots(unsigned n)
+   {
+  this->state_slots = ralloc_array(this, ir_state_slot, n);
+  this->data._num_state_slots = 0;
+
+  if (this->state_slots != NULL)
+ this->data._num_state_slots = n;
+
+  return this->state_slots;
+   }
+
/**
 * Enable emitting extension warnings for this variable
 */
@@ -723,6 +749,10 @@ public:
   /** Image internal format if specified explicitly, otherwise GL_NONE. */
   uint16_t image_format;
 
+   private:
+  unsigned _num_state_slots;/**< Number of state slots used */
+
+   public:
   /**
* Storage location of the base of this variable
*
@@ -771,22 +801,6 @@ public:
} data;
 
/**
-* Built-in state that backs this uniform
-*
-* Once set at variable creation, \c state_slots must remain invariant.
-* This is because, ideally, this array would be shared by all clones of
-* this variable in the IR tree.  In other words, we'd really like for it
-* to be a fly-weight.
-*
-* If the variable is not a uniform, \c num_state_slots will be zero and
-* \c state_slots will be \c NULL.
-*/
-   /*@{*/
-   unsigned num_state_slots;/**< Number of state slots used */
-   ir_state_slot *state_slots;  /**< State descriptors. */
-   /*@}*/
-
-   /**
 * Value assigned in the initializer of a variable declared "const"
 */
ir_constant *constant_value;
@@ -818,6 +832,16 @@ private:
unsigned *max_ifc_array_access;
 
/**
+* Built-in state that backs this uniform
+*
+* Once set at variable creation, \c state_slots must remain invariant.
+*
+* If the variable is not a uniform, \c _num_state_slots will be zero and
+* \c state_slots will be \c NULL.
+*/
+   ir_state_slot *state_slots;
+
+   /**
 * For variables that are in an interface block or are an instance of an
 * interface block, this is the \c GLSL_TYPE_INTERFACE type for that block.
 *
diff --git a/src/glsl/ir_clone.cpp b/src/glsl/ir_clone.cpp
index d594529..0cd35f0 100644
--- a/src/glsl/ir_clone.cpp
+++ b/src/glsl/ir_clone.cpp
@@ -53,15 +53,10 @@ ir_variable::clone(void *mem_ctx, struct hash_table *ht) const
 
memcpy(&var->data, &this->data, sizeof(var->data));
 
-   var->num_state_slots = this->num_state_slots;
-   if (this->state_slots) {
-  /* FINISHME: This really wants to use something like talloc_reference, 
but
-   * FINISHME: ralloc doesn't have any similar function.
-   */
-  var->state_slots = ralloc_array(var, ir_state_slot,
- this->num_state_slots);
-  memcpy(var->state_slots, this->state_slots,
-sizeof(this->state_slots[0]) * var->num_state_slots);
+   if (this->get_state_slots()) {
+  ir_state_slot *s = var->allocate_state_slots(this->get_num_state_slots());
+  memcpy(s, this->get_state_slots()

[Mesa-dev] [PATCH 00/21] Reduce ir_variable memory usage

2014-05-27 Thread Ian Romanick
This series reduces the memory usage of ir_variable quite significantly.

The first couple patches add a mechanism to determine the amount of
memory used by any kind of IR object.  This is used to collect the data
that is shown in the commit messages through the series.

Most of the rest of the patches rearrange data or store things in
smaller fields.  The two interesting "subseries" are:

Patches 9 through 15 and 20 through 21: Store short variable names in
otherwise "dead" space in the base class.  I didn't rebase these patches
to all be together because I didn't want to re-collect all the data. :)
A small amount more savings could be had here, but in the test case at
hand, it didn't appear worth the effort.  Adding

diff --git a/src/glsl/ir_memory_usage.cpp b/src/glsl/ir_memory_usage.cpp
index f122635..2aead7c 100644
--- a/src/glsl/ir_memory_usage.cpp
+++ b/src/glsl/ir_memory_usage.cpp
@@ -67,6 +67,9 @@ ir_memory_usage::visit(ir_variable *ir)
if (ir->name != (char *) ir->padding)
   this->s.variable_name_usage += strlen(ir->name) + 1 + ralloc_header_size;
 
+   if (ir->name != (char *) ir->padding && ir->data.mode == ir_var_temporary)
+  printf("IR MEM: %s\n", ir->name);
+
return visit_continue;
 }
 
may show some other possibilities.

Patches 16 through 18: Store two fields that are never both used in the
same location.

Here's the punchline.  In a trimmed trace from dota2 on 32-bit,
ir_variable accounts for ~5.5MB before this series.  After this series,
it accounts for only ~4.5MB.

Before: IR MEM: variable usage / name / total: 4955496 915817 5871313
After:  IR MEM: variable usage / name / total: 4118280 644100 4762380

I would love to see before / after data for a full run of dota2 with all
the shaders compiled. :)

This is also available in the ir_variable-diet-v2 branch of my fdo
tree.  The ir_variable-diet branch contains a false start.  I tried to
move a bunch of fields that are only used for shader interface variables
(e.g., uniforms or varyings) to a dynamically allocated structure.  At
least on my test case, the added ralloc overhead made that a loss.  It
may be possible to try a similar technique by subclassing ir_variable,
but I think that will be a lot of work.  The biggest annoyance is that
when ast_to_hir allocates an ir_variable, it doesn't yet know what it
will be... and changes the ir_variable_data::mode after allocation.

As a side note, pahole is a really useful utility, but it lies a little
bit on C++ objects.  It will say things like:

class ir_rvalue : public ir_instruction {
public:

/* class ir_instruction  ; */  /* 0 0 */

/* XXX 16 bytes hole, try to pack */

const class glsl_type  *   type; /*16 4 */

/* size: 20, cachelines: 1, members: 2 */
/* sum members: 4, holes: 1, sum holes: 16 */
/* last cacheline: 20 bytes */
};

I've trimmed out all the methods and other noise.  It says there's this
16-byte "hole."  Notice sizeof(ir_instruction) is 16 bytes and the total
size of ir_rvalue is 20 bytes.  This 16-byte hole is the storage for the
base class members! :) Calling it a hole is a bit misleading.  This also
sent me down a false path, but, thankfully, not for too long.
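
The pahole observation can be reproduced with a two-class sketch. The names
toy_base and toy_derived are hypothetical, and the equality below assumes a
mainstream ABI where the base has no tail padding for the derived member to
reuse.

```cpp
#include <cstddef>

// Stand-in for ir_instruction: its members occupy the start of every
// derived object.
struct toy_base {
   void *a;
   void *b;
};

// Stand-in for ir_rvalue: one pointer-sized member of its own.
struct toy_derived : toy_base {
   const void *type;
};

// The bytes in front of 'type' are the base-class subobject, not a gap.
inline std::size_t bytes_before_own_members()
{
   return sizeof(toy_derived) - sizeof(const void *);
}
```

Here bytes_before_own_members() is expected to equal sizeof(toy_base), which
is the same arithmetic as the 16-byte "hole" plus the 4-byte member adding up
to the 20-byte ir_rvalue in the pahole output.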

Cc: Eero Tamminen 



Re: [Mesa-dev] [PATCH 11/21] glsl: Store ir_variable::ir_type in 8 bits instead of 32

2014-05-27 Thread Matt Turner
On Tue, May 27, 2014 at 7:49 PM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> No change in the peak ir_variable memory usage in a trimmed apitrace of
> dota2 on 64-bit.
>
> No change in the peak ir_variable memory usage in a trimmed apitrace of
> dota2 on 32-bit.
>
> Signed-off-by: Ian Romanick 
> ---
>  src/glsl/ir.h | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/glsl/ir.h b/src/glsl/ir.h
> index 7faee74..bc02f6e 100644
> --- a/src/glsl/ir.h
> +++ b/src/glsl/ir.h
> @@ -92,12 +92,13 @@ enum ir_node_type {
>   */
>  class ir_instruction : public exec_node {
>  private:
> -   enum ir_node_type ir_type;
> +   uint8_t ir_type;
>
>  public:
> inline enum ir_node_type get_ir_type() const
> {
> -  return this->ir_type;
> +  STATIC_ASSERT(ir_type_max < 256);
> +  return (enum ir_node_type) this->ir_type;
> }
>
> /**
> --
> 1.8.1.4

Instead of doing this, you can mark the enum type with the PACKED
attribute. I did this in a similar change in i965 already. See
http://lists.freedesktop.org/archives/mesa-dev/2014-February/054643.html

This way we still get enum type checking and warnings out of switch
statements and such.
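
A minimal sketch of the suggestion, assuming a GCC-compatible compiler.
TOY_PACKED, packed_node_type and packed_instruction are hypothetical names
used only for illustration; Mesa's own PACKED macro is not reproduced here.

```cpp
#include <cstdint>

// Assumption: GCC/Clang attribute syntax.  On other compilers the macro
// expands to nothing and the enum keeps its default size.
#if defined(__GNUC__)
#define TOY_PACKED __attribute__((__packed__))
#else
#define TOY_PACKED
#endif

// The enum itself becomes one byte wide, so no uint8_t field and no cast in
// the getter are needed.
enum TOY_PACKED packed_node_type {
   packed_node_variable,
   packed_node_constant,
   packed_node_max
};

struct packed_instruction {
   packed_node_type type;   // still an enum: switch warnings keep working
   uint8_t padding[sizeof(intptr_t) - 1];
};

inline bool enum_is_one_byte()
{
   return sizeof(packed_node_type) == 1;
}
```

In C++11 and later, a fixed underlying type (enum packed_node_type : uint8_t)
gives the same size without a compiler-specific attribute, at the cost of
requiring that language version.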


Re: [Mesa-dev] [PATCH 06/11] loader: Use drirc device_id parameter in complement to DRI_PRIME

2014-05-27 Thread Michel Dänzer

This breaks the build for me, see below. That's an out-of-tree build FWIW.


make[2]: Entering directory '/home/daenzer/src/mesa-git/mesa/build-amd64/src/loader'
 cd ../../.. && automake-1.14 --foreign src/loader/Makefile
src/loader/Makefile.am:42: warning: source file '$(top_srcdir)/src/mesa/drivers/dri/common/xmlconfig.c' is in a subdirectory,
src/loader/Makefile.am:42: but option 'subdir-objects' is disabled
automake-1.14: warning: possible forward-incompatibility.
automake-1.14: At least a source file is in a subdirectory, but the 'subdir-objects'
automake-1.14: automake option hasn't been enabled.  For now, the corresponding output
automake-1.14: object file(s) will be placed in the top-level directory.  However,
automake-1.14: this behaviour will change in future Automake versions: they will
automake-1.14: unconditionally cause object files to be placed in the same subdirectory
automake-1.14: of the corresponding sources.
automake-1.14: You are advised to start using 'subdir-objects' option throughout your
automake-1.14: project, to avoid future incompatibilities.
 cd ../.. && /bin/bash ./config.status src/loader/Makefile depfiles
config.status: creating src/loader/Makefile
config.status: executing depfiles commands
  CC   libloader_la-loader.lo
  CC   libloader_la-xmlconfig.lo
In file included from ../../../src/loader/loader.c:79:0:
../../../src/mesa/drivers/dri/common/xmlpool.h:103:29: fatal error: xmlpool/options.h: No such file or directory
 #include "xmlpool/options.h"
 ^
compilation terminated.
Makefile:570: recipe for target 'libloader_la-loader.lo' failed
make[2]: *** [libloader_la-loader.lo] Error 1
make[2]: Leaving directory '/home/daenzer/src/mesa-git/mesa/build-amd64/src/loader'


-- 
Earthling Michel Dänzer|  http://www.amd.com
Libre software enthusiast  |Mesa and X developer


Re: [Mesa-dev] [PATCH 01/19] i965/fs: Add and use an fs_inst copy constructor.

2014-05-27 Thread Chris Forbes
Patches 1-8 inclusive are

Reviewed-by: Chris Forbes 

On Wed, May 28, 2014 at 1:47 PM, Matt Turner  wrote:
> Will get more complicated when fs_reg src becomes a pointer.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 5 +
>  src/mesa/drivers/dri/i965/brw_fs.h   | 1 +
>  2 files changed, 6 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index bd77e0c..5b7d84f 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -139,6 +139,11 @@ fs_inst::fs_inst(enum opcode opcode, fs_reg dst,
>assert(src[2].reg_offset >= 0);
>  }
>
> +fs_inst::fs_inst(const fs_inst &that)
> +{
> +   memcpy(this, &that, sizeof(that));
> +}
> +
>  #define ALU1(op)\
> fs_inst *\
> fs_visitor::op(fs_reg dst, fs_reg src0)  \
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index 789f0b3..bda233c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -197,6 +197,7 @@ public:
> fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0, fs_reg src1);
> fs_inst(enum opcode opcode, fs_reg dst,
> fs_reg src0, fs_reg src1,fs_reg src2);
> +   fs_inst(const fs_inst &that);
>
> bool equals(fs_inst *inst) const;
> bool overwrites_reg(const fs_reg &reg) const;
> --
> 1.8.3.2
>
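
The remark that the copy constructor "will get more complicated when fs_reg
src becomes a pointer" can be illustrated with a hypothetical sketch.
toy_reg and toy_inst are made-up types, and new[]/delete[] stand in for
whatever allocator the real code would use; this is not the later Mesa
implementation.

```cpp
#include <cstring>

struct toy_reg {
   int file;
   int nr;
};

class toy_inst {
public:
   toy_inst(int opcode, int sources)
      : opcode(opcode), sources(sources), src(new toy_reg[sources]())
   {
   }

   // A plain memcpy(this, &that, sizeof(that)) would copy the pointer value
   // and leave both instructions sharing one array.  Once a member owns
   // memory, the copy has to allocate and copy the pointed-to elements.
   toy_inst(const toy_inst &that)
      : opcode(that.opcode), sources(that.sources),
        src(new toy_reg[that.sources])
   {
      std::memcpy(this->src, that.src, sizeof(toy_reg) * that.sources);
   }

   ~toy_inst()
   {
      delete[] this->src;
   }

   // Assignment is left out of the sketch; deleting it avoids a shallow copy.
   toy_inst &operator=(const toy_inst &) = delete;

   int opcode;
   int sources;
   toy_reg *src;
};
```

After copying, writes through one instruction's src array are not visible
through the other, which a bitwise copy could not guarantee.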

