date:20160927

Re: [Mesa-dev] OSMesa Virtual Window with height > 16384px

2016-09-27 Thread Phaedra Narayna

Thank you, Brian, for the clarification. I have looked at your past project
and will go that way.

-- 
Philippe

On 23 September 2016 at 18:07:39, Brian Paul (bri...@vmware.com) wrote:

On 09/23/2016 08:08 AM, Phaedra Narayna wrote:
> Hi,
>
> I hit an OSMesa bug when trying to create a virtual window that is more
> than 16384 pixels in height; see below:
>
> *Command:*
> #headless_shell --screenshot --window-size="1920x16385"
> --hide-scrollbars --no-sandbox http://linuxfr.org
> *Error Msg:*
> [0920/154515:ERROR:gl_context_osmesa.cc(72)] OSMesaMakeCurrent failed.
> [0920/154515:ERROR:gles2_cmd_decoder.cc(4992)] GLES2DecoderImpl: Context
> lost because context no longer current after resize callback.
> [0920/154515:ERROR:gles2_cmd_decoder.cc(5108)] Error: 5 for Command
> kResizeCHROMIUM
> [0920/154515:ERROR:gl_context_osmesa.cc(72)] OSMesaMakeCurrent failed.
>
> I am using OSMesa on Arch Linux:
> https://www.archlinux.org/packages/extra/i686/mesa/ (mesa 12.0.3-1 as
> of this writing).
>
> I have checked with the HeadLess Chromium project as per this thread:
>
> https://groups.google.com/a/chromium.org/forum/#!topic/headless-dev/t0ixeHXCzK0
>
> I have checked with the Skia project as per this thread :
>
> https://bugs.chromium.org/p/skia/issues/detail?id=580&can=2&start=0&num=100&q=label%3AHotlist-Fixit&colspec=ID%20Type%20Status%20Priority%20M%20Area%20Owner%20Summary&groupby=&sort=
>
>
> Any idea on how to solve this problem?

There's always a limit on max surface/rendering size because of
rasterization and interpolation limitations.

I don't know which driver you're using, but if you query the
GL_MAX_FRAMEBUFFER_WIDTH and GL_MAX_FRAMEBUFFER_HEIGHT you can find the
limits.

If you need to render a larger image, you'll probably have to break up
the image into tiles which are rendered individually. Here's an old
project of mine that might help: http://www.mesa3d.org/brianp/TR.html

-Brian
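
The tiling idea above can be sketched in C as plain band arithmetic. This is only an illustration under stated assumptions: the 16384-row limit is taken from the report rather than queried from the driver, the names tile_count and tile_at are hypothetical (they are not part of Mesa or of the TR library), and the per-tile rendering and projection setup is left out.

```c
#include <assert.h>

/* One horizontal band of a tall image. */
struct tile {
    int y;      /* first row covered by this tile */
    int height; /* number of rows in this tile */
};

/* Number of tiles needed to cover `total` rows when a single
 * render target can hold at most `max_tile` rows. */
int tile_count(int total, int max_tile)
{
    assert(total >= 0 && max_tile > 0);
    return total / max_tile + (total % max_tile != 0);
}

/* Describe the i-th tile (0-based). Every tile is max_tile rows
 * tall except possibly the last one, which takes the remainder. */
struct tile tile_at(int total, int max_tile, int i)
{
    struct tile t;
    int remaining;

    assert(i >= 0 && i < tile_count(total, max_tile));
    t.y = i * max_tile;
    remaining = total - t.y;
    t.height = remaining < max_tile ? remaining : max_tile;
    return t;
}
```

For the 1920x16385 window from the report, this arithmetic gives two bands: one of 16384 rows and one of a single row. Each band would then be rendered separately and copied into the final image at its y offset.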
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [llvm] r282237 - [InstCombine] Fix for PR29124: reduce insertelements to shufflevector

2016-09-27 Thread Alexey Bataev

Hello Michel,
I will look at it ASAP.

Best regards,
Alexey Bataev

> On 26 Sept. 2016, at 10:07, Michel Dänzer wrote:
> 
> 
> Hi Alexey,
> 
> 
>> On 23/09/16 06:14 PM, Alexey Bataev via llvm-commits wrote:
>> Author: abataev
>> Date: Fri Sep 23 04:14:08 2016
>> New Revision: 282237
>> 
>> URL: http://llvm.org/viewvc/llvm-project?rev=282237&view=rev
>> Log:
>> [InstCombine] Fix for PR29124: reduce insertelements to shufflevector
> 
> This change introduced failures with the Mesa llvmpipe driver unit test
> lp_test_format. See below for information about the CPU, and the
> attachment for the IR and results of the failing sub-tests. Let me know
> if you need more information.
> 
> 
> processor: 0
> vendor_id: AuthenticAMD
> cpu family: 21
> model: 48
> model name: AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G
> stepping: 1
> microcode: 0x6003106
> cpu MHz: 4100.000
> cache size: 2048 KB
> physical id: 0
> siblings: 4
> core id: 0
> cpu cores: 2
> apicid: 16
> initial apicid: 0
> fpu: yes
> fpu_exception: yes
> cpuid level: 13
> wp: yes
> flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
> pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid
> aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2
> popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm
> sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce
> nodeid_msr tbm topoext perfctr_core perfctr_nb bpext arat cpb hw_pstate
> npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid
> decodeassists pausefilter pfthreshold vmmcall fsgsbase bmi1 xsaveopt
> bugs: fxsave_leak sysret_ss_attrs
> bogomips: 8200.55
> TLB size: 1536 4K pages
> clflush size: 64
> cache_alignment: 64
> address sizes: 48 bits physical, 48 bits virtual
> power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro [13]
> 
> 
> -- 
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer
> 

Re: [Mesa-dev] [llvm] r282237 - [InstCombine] Fix for PR29124: reduce insertelements to shufflevector

2016-09-27 Thread Alexey Bataev

Michel, I fixed this bug in r282401.

Best regards,
Alexey Bataev

On 09/26/2016 10:06 AM, Michel Dänzer wrote:
> Hi Alexey,
>
>
> On 23/09/16 06:14 PM, Alexey Bataev via llvm-commits wrote:
>> Author: abataev
>> Date: Fri Sep 23 04:14:08 2016
>> New Revision: 282237
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=282237&view=rev
>> Log:
>> [InstCombine] Fix for PR29124: reduce insertelements to shufflevector
> This change introduced failures with the Mesa llvmpipe driver unit test
> lp_test_format. See below for information about the CPU, and the
> attachment for the IR and results of the failing sub-tests. Let me know
> if you need more information.
>
>
> processor : 0
> vendor_id : AuthenticAMD
> cpu family: 21
> model : 48
> model name: AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G
> stepping  : 1
> microcode : 0x6003106
> cpu MHz   : 4100.000
> cache size: 2048 KB
> physical id   : 0
> siblings  : 4
> core id   : 0
> cpu cores : 2
> apicid: 16
> initial apicid: 0
> fpu   : yes
> fpu_exception : yes
> cpuid level   : 13
> wp: yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
> pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid
> aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2
> popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm
> sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce
> nodeid_msr tbm topoext perfctr_core perfctr_nb bpext arat cpb hw_pstate
> npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid
> decodeassists pausefilter pfthreshold vmmcall fsgsbase bmi1 xsaveopt
> bugs  : fxsave_leak sysret_ss_attrs
> bogomips  : 8200.55
> TLB size  : 1536 4K pages
> clflush size  : 64
> cache_alignment   : 64
> address sizes : 48 bits physical, 48 bits virtual
> power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro [13]
>
>


[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)

2016-09-27 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97879

--- Comment #23 from Michel Dänzer  ---
Please pass --call-graph to perf record (you may need to play with the
different methods supported by that to find the one which works best for you,
see perf record --help) and perf report.

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH] glsl: remove remaining tabs in glsl_parser_extras.h

2016-09-27 Thread Eric Engestrom

On Tue, Sep 27, 2016 at 12:03:40PM +1000, Timothy Arceri wrote:
> ---

Whitespace-only change, and the result is cleaner than before :)
Reviewed-by: Eric Engestrom 

>  src/compiler/glsl/glsl_parser_extras.h | 60 
> +-
>  1 file changed, 30 insertions(+), 30 deletions(-)
> 
> diff --git a/src/compiler/glsl/glsl_parser_extras.h 
> b/src/compiler/glsl/glsl_parser_extras.h
> index f4050e3..b9c9a1a 100644
> --- a/src/compiler/glsl/glsl_parser_extras.h
> +++ b/src/compiler/glsl/glsl_parser_extras.h
> @@ -69,12 +69,12 @@ typedef struct YYLTYPE {
>  # define YYLTYPE_IS_TRIVIAL 1
>  
>  extern void _mesa_glsl_error(YYLTYPE *locp, _mesa_glsl_parse_state *state,
> -  const char *fmt, ...);
> + const char *fmt, ...);
>  
>  
>  struct _mesa_glsl_parse_state {
> _mesa_glsl_parse_state(struct gl_context *_ctx, gl_shader_stage stage,
> -   void *mem_ctx);
> +  void *mem_ctx);
>  
> DECLARE_RALLOC_CXX_OPERATORS(_mesa_glsl_parse_state);
>  
> @@ -816,23 +816,23 @@ struct _mesa_glsl_parse_state {
> unsigned clip_dist_size, cull_dist_size;
>  };
>  
> -# define YYLLOC_DEFAULT(Current, Rhs, N) \
> -do { \
> -   if (N)\
> -   { \
> -  (Current).first_line   = YYRHSLOC(Rhs, 1).first_line;  \
> -  (Current).first_column = YYRHSLOC(Rhs, 1).first_column;\
> -  (Current).last_line= YYRHSLOC(Rhs, N).last_line;   \
> -  (Current).last_column  = YYRHSLOC(Rhs, N).last_column; \
> -   } \
> -   else  \
> -   { \
> -  (Current).first_line   = (Current).last_line = \
> -  YYRHSLOC(Rhs, 0).last_line;\
> -  (Current).first_column = (Current).last_column =   \
> -  YYRHSLOC(Rhs, 0).last_column;  \
> -   } \
> -   (Current).source = 0; \
> +# define YYLLOC_DEFAULT(Current, Rhs, N)\
> +do {\
> +   if (N)   \
> +   {\
> +  (Current).first_line   = YYRHSLOC(Rhs, 1).first_line; \
> +  (Current).first_column = YYRHSLOC(Rhs, 1).first_column;   \
> +  (Current).last_line= YYRHSLOC(Rhs, N).last_line;  \
> +  (Current).last_column  = YYRHSLOC(Rhs, N).last_column;\
> +   }\
> +   else \
> +   {\
> +  (Current).first_line   = (Current).last_line =\
> + YYRHSLOC(Rhs, 0).last_line;\
> +  (Current).first_column = (Current).last_column =  \
> + YYRHSLOC(Rhs, 0).last_column;  \
> +   }\
> +   (Current).source = 0;\
>  } while (0)
>  
>  /**
> @@ -841,11 +841,11 @@ do {
> \
>   * \sa _mesa_glsl_error
>   */
>  extern void _mesa_glsl_warning(const YYLTYPE *locp,
> -_mesa_glsl_parse_state *state,
> -const char *fmt, ...);
> +   _mesa_glsl_parse_state *state,
> +   const char *fmt, ...);
>  
>  extern void _mesa_glsl_lexer_ctor(struct _mesa_glsl_parse_state *state,
> -   const char *string);
> +  const char *string);
>  
>  extern void _mesa_glsl_lexer_dtor(struct _mesa_glsl_parse_state *state);
>  
> @@ -863,9 +863,9 @@ extern int _mesa_glsl_parse(struct _mesa_glsl_parse_state 
> *);
>   * \c false is returned.
>   */
>  extern bool _mesa_glsl_process_extension(const char *name, YYLTYPE 
> *name_locp,
> -  const char *behavior,
> -  YYLTYPE *behavior_locp,
> -  _mesa_glsl_parse_state *state);
> + const char *behavior,
> + YYLTYPE *behavior_locp,
> + _mesa_glsl_parse_state *state);
>  
>  #endif /* __cplusplus */
>  
> @@ -880,11 +880,11 @@ extern "C" {
>  struct glcpp_parser;
>  
>  typedef void (*glcpp_extension_iterator)(
> -

[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)

2016-09-27 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97879

--- Comment #24 from Silvan Jegen  ---
(In reply to Michel Dänzer from comment #23)
> Please pass --call-graph to perf record (you may need to play with the
> different methods supported by that to find the one which works best for
> you, see perf record --help) and perf report.

Looks like my perf is linked to libunwind so according to 'perf help record' I
should be able to run 'perf record --call-graph dwarf -a' during the stall and
generate a CPU profile report including the call graphs. I will do so later
today.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [PATCH] gallium/radeon: Initialize pipe_resource::next to NULL

2016-09-27 Thread Michel Dänzer

From: Michel Dänzer 

Fixes lots of piglit tests crashing due to using uninitialized memory.

Fixes: ecd6fce2611e ("mesa/st: support lowering multi-planar YUV")
Signed-off-by: Michel Dänzer 
---
 src/gallium/drivers/radeon/r600_buffer_common.c | 1 +
 src/gallium/drivers/radeon/r600_texture.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
b/src/gallium/drivers/radeon/r600_buffer_common.c
index 2e8b6f4..cbbcc29 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -511,6 +511,7 @@ r600_alloc_buffer_struct(struct pipe_screen *screen,
rbuffer = MALLOC_STRUCT(r600_resource);
 
rbuffer->b.b = *templ;
+   rbuffer->b.b.next = NULL;
pipe_reference_init(&rbuffer->b.b.reference, 1);
rbuffer->b.b.screen = screen;
rbuffer->b.vtbl = &r600_buffer_vtbl;
diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index b02b2dc..71564e2 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -1040,6 +1040,7 @@ r600_texture_create_object(struct pipe_screen *screen,
 
resource = &rtex->resource;
resource->b.b = *base;
+   resource->b.b.next = NULL;
resource->b.vtbl = &r600_texture_vtbl;
pipe_reference_init(&resource->b.b.reference, 1);
resource->b.b.screen = screen;
-- 
2.9.3
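
A minimal C sketch of the pattern this patch fixes. The struct below is a hypothetical stand-in for pipe_resource with only two fields, and resource_create is an invented name. It assumes that callers pass a template whose next field may hold stale or unset data, so a whole-struct copy carries that value into the new object unless it is cleared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for pipe_resource: only the fields needed here. */
struct resource {
    struct resource *next; /* chain to a follow-on resource, e.g. another plane */
    unsigned width;
};

/* Allocate a resource from a template without inheriting the
 * template's chain pointer. Returns NULL if allocation fails. */
struct resource *resource_create(const struct resource *templ)
{
    struct resource *res = malloc(sizeof(*res));

    if (!res)
        return NULL;
    *res = *templ;    /* copies every field, including whatever templ->next holds */
    res->next = NULL; /* the new object starts with no chained resource */
    return res;
}
```

Without the explicit reset, any code that later walks the next chain would follow whatever pointer value the template happened to contain.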


[Mesa-dev] [Bug 97808] "tgsi/scan: don't set interp flags for inputs only used by INTERP instructions" causes glitches in wine with gallium nine

2016-09-27 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97808

Marek Olšák  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #6 from Marek Olšák  ---
I reverted the problematic patch:
https://cgit.freedesktop.org/mesa/mesa/commit/?id=f019255acf4e3dab40f9504390357cd7798dd3e0

Closing.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [PATCH] gallium/radeon: Initialize pipe_resource::next to NULL

2016-09-27 Thread Nicolai Hähnle


Ouch :)

Reviewed-by: Nicolai Hähnle 

On 27.09.2016 11:18, Michel Dänzer wrote:

From: Michel Dänzer 

Fixes lots of piglit tests crashing due to using uninitialized memory.

Fixes: ecd6fce2611e ("mesa/st: support lowering multi-planar YUV")
Signed-off-by: Michel Dänzer 
---
 src/gallium/drivers/radeon/r600_buffer_common.c | 1 +
 src/gallium/drivers/radeon/r600_texture.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
b/src/gallium/drivers/radeon/r600_buffer_common.c
index 2e8b6f4..cbbcc29 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -511,6 +511,7 @@ r600_alloc_buffer_struct(struct pipe_screen *screen,
rbuffer = MALLOC_STRUCT(r600_resource);

rbuffer->b.b = *templ;
+   rbuffer->b.b.next = NULL;
pipe_reference_init(&rbuffer->b.b.reference, 1);
rbuffer->b.b.screen = screen;
rbuffer->b.vtbl = &r600_buffer_vtbl;
diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index b02b2dc..71564e2 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -1040,6 +1040,7 @@ r600_texture_create_object(struct pipe_screen *screen,

resource = &rtex->resource;
resource->b.b = *base;
+   resource->b.b.next = NULL;
resource->b.vtbl = &r600_texture_vtbl;
pipe_reference_init(&resource->b.b.reference, 1);
resource->b.b.screen = screen;



Re: [Mesa-dev] [PATCH 2/3] st/va: Save surface chroma format in config

2016-09-27 Thread Emil Velikov

Hi Mark,

Patches without any commit message are a bad idea, generally. Please
don't do that.
Here are some articles which should help you on the topic.

[1] http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html
[2] http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html
[3] http://chris.beams.io/posts/git-commit/

On 19 September 2016 at 00:10, Mark Thompson  wrote:
> ---
> We need this stored somewhere to be able to return useful information from 
> vaQuerySurfaceAttributes() in the following patch.
>
>
>  src/gallium/state_trackers/va/config.c | 23 +--
>  src/gallium/state_trackers/va/va_private.h |  1 +
>  2 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/state_trackers/va/config.c 
> b/src/gallium/state_trackers/va/config.c
> index c6c5bb1..bd47381 100644
> --- a/src/gallium/state_trackers/va/config.c
> +++ b/src/gallium/state_trackers/va/config.c
> @@ -191,6 +191,17 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile 
> profile, VAEntrypoint entrypoin
> if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
>config->entrypoint = VAEntrypointVideoProc;
>config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
> +  for (int i = 0; i < num_attribs; i++) {
> + if (attrib_list[i].type == VAConfigAttribRTFormat) {
> +if (attrib_list[i].value & (VA_RT_FORMAT_YUV420 |
> +VA_RT_FORMAT_RGB32)) {
Nit: move this to the previous line. Then again, why is
VA_RT_FORMAT_RGB32 in here ?

> +   config->rt_format = attrib_list[i].value;
> +} else {
> +   FREE(config);
> +   return VA_STATUS_ERROR_UNSUPPORTED_RT_FORMAT;
> +}
> + }
> +  }
>pipe_mutex_lock(drv->mutex);
>*config_id = handle_table_add(drv->htab, config);
>pipe_mutex_unlock(drv->mutex);
> @@ -233,7 +244,7 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, 
> VAEntrypoint entrypoin
>
> config->profile = p;
>
> -   for (int i = 0; i [...]
> +   for (int i = 0; i < num_attribs; i++) {
Unrelated whitespace change ?

>if (attrib_list[i].type == VAConfigAttribRateControl) {
>   if (attrib_list[i].value == VA_RC_CBR)
>  config->rc = PIPE_H264_ENC_RATE_CONTROL_METHOD_CONSTANT;
> @@ -242,6 +253,14 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile 
> profile, VAEntrypoint entrypoin
>   else
>  config->rc = PIPE_H264_ENC_RATE_CONTROL_METHOD_DISABLE;
>}
> +  if (attrib_list[i].type == VAConfigAttribRTFormat) {
> + if (attrib_list[i].value & VA_RT_FORMAT_YUV420) {
s/&/==/ ?

Regards,
Emil
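
The difference behind the "s/&/==/" question can be shown with a short C sketch. The FMT_* flags are hypothetical placeholders (the real VA_RT_FORMAT_* values are defined in va.h), and the function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit flags; the real VA_RT_FORMAT_* values live in va.h. */
#define FMT_YUV420 0x1u
#define FMT_YUV422 0x2u
#define FMT_RGB32  0x4u

/* True if the YUV420 bit is set, regardless of any other bits. */
bool includes_yuv420(unsigned value)
{
    return (value & FMT_YUV420) != 0;
}

/* True only if YUV420 is the sole format requested. */
bool is_exactly_yuv420(unsigned value)
{
    return value == FMT_YUV420;
}
```

A value such as FMT_YUV420 | FMT_YUV422 passes the mask test but fails the equality test, so the choice decides whether a caller may request several formats at once.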

Re: [Mesa-dev] [PATCH 1/5] st/omx/dec/h265: add scaling list data

2016-09-27 Thread Emil Velikov

On 23 September 2016 at 17:32, Leo Liu  wrote:
> Specified by 7.3.4
There's a word missing in there ^ - table 7.3.4 ?

>
> Signed-off-by: Leo Liu 
> ---
>  src/gallium/state_trackers/omx/vid_dec_h265.c | 126 
> +-
>  1 file changed, 121 insertions(+), 5 deletions(-)
>
> diff --git a/src/gallium/state_trackers/omx/vid_dec_h265.c 
> b/src/gallium/state_trackers/omx/vid_dec_h265.c
> index 0772b4d..3c46505 100644
> --- a/src/gallium/state_trackers/omx/vid_dec_h265.c
> +++ b/src/gallium/state_trackers/omx/vid_dec_h265.c
> @@ -57,6 +57,28 @@ enum {
> NAL_UNIT_TYPE_PPS = 34,
>  };
>
> +static const uint8_t Default_8x8_Intra[64] = {
> +   16, 16, 16, 16, 17, 18, 21, 24,
> +   16, 16, 16, 16, 17, 19, 22, 25,
> +   16, 16, 17, 18, 20, 22, 25, 29,
> +   16, 16, 18, 21, 24, 27, 31, 36,
> +   17, 17, 20, 24, 30, 35, 41, 47,
> +   18, 19, 22, 27, 35, 44, 54, 65,
> +   21, 22, 25, 31, 41, 54, 70, 88,
> +   24, 25, 29, 36, 47, 65, 88, 115
> +};
> +
> +static const uint8_t Default_8x8_Inter[64] = {
> +   16, 16, 16, 16, 17, 18, 20, 24,
> +   16, 16, 16, 17, 18, 20, 24, 25,
> +   16, 16, 17, 18, 20, 24, 25, 28,
> +   16, 17, 18, 20, 24, 25, 28, 33,
> +   17, 18, 20, 24, 25, 28, 33, 41,
> +   18, 20, 24, 25, 28, 33, 41, 54,
> +   20, 24, 25, 28, 33, 41, 54, 71,
> +   24, 25, 28, 33, 41, 54, 71, 91
> +};
> +
The style used for the names is a bit iffy - use default_8x8_inter ?
Since neither of these is OMX-specific, is it worth moving them to aux/vl ?

>  struct dpb_list {
> struct list_head list;
> struct pipe_video_buffer *buffer;
> @@ -188,10 +210,104 @@ static unsigned profile_tier_level(struct vl_rbsp 
> *rbsp,
> return level_idc;
>  }
>
> -static void scaling_list_data(void)
> +static void scaling_list_data(vid_dec_PrivateType *priv,
> +  struct vl_rbsp *rbsp, struct pipe_h265_sps 
> *sps)
>  {
> -   /* TODO */
> -   assert(0);
> +   unsigned size_id, matrix_id;
> +
> +   for (size_id = 0; size_id < 4; ++size_id) {
Why would one loop over size_id, if close to everything in the loop is
special cased on the size_id ?

-Emil

Re: [Mesa-dev] [PATCH 3/5] st/omx/dec/h265: decoder size should follow from sps

2016-09-27 Thread Emil Velikov

On 23 September 2016 at 17:32, Leo Liu  wrote:
> So that it will pass correct size to width(height)_in_samples in
> uvd message buffer.
>
The st code is device agnostic. s/uvd/hardware/ perhaps ?

-Emil

Re: [Mesa-dev] [PATCH 5/5] st/omx/dec/h265: fix the skip for before and after list

2016-09-27 Thread Emil Velikov

On 23 September 2016 at 17:32, Leo Liu  wrote:
> Should not be skipped when rps->used false
>
Please 'translate' "rps->used false" to English ? Also one might want
to mention if the patch/es fix any known issue - bugzilla, fixes
"jerky" playback, etc.

Afaict 3-5 are bugfixes so please add a bit more context in the commit
message and
Cc: mesa-sta...@lists.freedesktop.org
Reviewed-by: Emil Velikov 

Thanks
Emil

[Mesa-dev] [Bug 97542] mesa-12.0.1 with llvm-3.9.0_rc3 - src/gallium/state_trackers/clover/llvm/invocation.cpp:212:75: error: no matching function for call to clang::CompilerInvocation::setLangDefault

2016-09-27 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97542

Alexander Tsoy  changed:

   What|Removed |Added

 CC||alexan...@tsoy.me

bastian.beisc...@rwth-aachen.de changed:

   What|Removed |Added

 CC||bastian.beischer@rwth-aache
   ||n.de

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH] v2 st/va Avoid VBR bitrate calculation overflow

2016-09-27 Thread Emil Velikov

On 26 September 2016 at 14:35, Christian König  wrote:
> On 26.09.2016 at 11:44, Andy Furniss wrote:
>>
>> VBR bitrate calc needs 64 bits at high rates.
>> v2 use float.
>>
>> Signed-off-by: Andy Furniss 
>
>
> Reviewed-by: Christian König .
>
> Since Leo is on vacation I will probably collect all remaining mesa patches
> and commit them later today.
>
Christian please s/v2 st/va /st/va: / the commit message and add
Cc: mesa-sta...@lists.freedesktop.org

Thanks
Emil

Andy: git format-patch -v2 ... saves you the manual tweak of the subject line.
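
The overflow mentioned in the quoted commit message can be sketched in C. This assumes the calculation scales a bits-per-second value by a percentage (the patch itself is not quoted in this thread), and the function names are hypothetical. It shows the 64-bit variant, whereas v2 of the patch is said to use float instead.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit arithmetic: the product wraps modulo 2^32 before the division. */
uint32_t scale_percent_u32(uint32_t bits_per_second, uint32_t percent)
{
    return bits_per_second * percent / 100u;
}

/* Widening to 64 bits keeps the intermediate product exact. */
uint32_t scale_percent_u64(uint32_t bits_per_second, uint32_t percent)
{
    return (uint32_t)((uint64_t)bits_per_second * percent / 100u);
}
```

At 50 Mbit/s and 90 percent the intermediate product is 4,500,000,000, which exceeds the 32-bit maximum of 4,294,967,295, so only the widened version returns 45,000,000.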

Re: [Mesa-dev] [PATCH] st/va: Fix vaSyncSurface with no outstanding operation

2016-09-27 Thread Christian König

I just pushed the patch for now because this is really a rather obvious
bugfix.


Please keep digging things like this up.

Thanks for the help,
Christian.

On 27.09.2016 at 02:18, Mark Thompson wrote:

On 27/09/16 00:49, Andy Furniss wrote:

Mark Thompson wrote:

---
A simple fix to the problem described here: 
.

With this applied, the driver no longer hangs/crashes when vaSyncSurface() is 
called in places other than for the first time after an encode operation 
(including a second call on the same surface).

Once I could get ffmpeg (patched) or avconv to roughly work (before the dual 
instance commit), but I can't get either to work now = produces unreadable file.

Testing with git avconv I am trying -

./avconv -vaapi_device :0 -f rawvideo -framerate 50 -s 2560x1440 -pix_fmt nv12 
-i /mnt/ramdisk/trees-1440p50.nv12 -vframes 5 -vf 'hwupload' -c:v h264_vaapi 
-profile:v 66 -b:v 40M  -bf 0 -g 30  -f h264 -y /mnt/ramdisk/out.264

but debugging printfs show refs = 2 and bframes enabled (I also notice with 
your baseline patch that -profile:v 66 fails).

Do you have an example that works for you with avconv + this patch?

Yes: this patch 
 is 
also required to match the vaSyncSurface() change.  The rest of that series to
libav and the one to mesa for config setup make it all a bit more sensible (doesn't
submit a load of packed headers which are ignored), but it does mostly work without.

With all of those, the commands:

./avconv -y -vaapi_device /dev/dri/renderD129 -i in.mp4 -an -vf 
'format=nv12,hwupload' -c:v h264_vaapi -bf 0 out.mp4

./avconv -y -vaapi_device /dev/dri/renderD129 -hwaccel vaapi 
-hwaccel_output_format vaapi -i in.mp4 -an -c:v h264_vaapi -bf 0 out.mp4

./avconv -y -vaapi_device /dev/dri/renderD129 -hwaccel vaapi 
-hwaccel_output_format vaapi -i in.mp4 -an -vf 'scale_vaapi=w=1280:h=720' -c:v 
h264_vaapi -bf 0 out.mp4

work sensibly for me (also with -b for CBR, -qp for CQP, -g for GOP size); I 
imagine raw video as in your example would also be fine.  On profile, 
constrained baseline on the command line is 578 (== 66 | 0x200, for 
constraint_set1_flag).

Thanks,

- Mark
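
The profile arithmetic in the message above (578 == 66 | 0x200) can be checked with a small C sketch. The macro and function names are illustrative only; the values are the ones given in the message.

```c
#include <assert.h>

/* Values as given in the message above; the names are illustrative. */
#define H264_PROFILE_BASELINE 66
#define H264_CONSTRAINT_SET1  0x200 /* marks constraint_set1_flag */

/* Baseline profile number with the constraint_set1 marker OR-ed in. */
int h264_constrained_baseline(void)
{
    return H264_PROFILE_BASELINE | H264_CONSTRAINT_SET1;
}
```

Because 66 has no bit in common with 0x200 (512), the OR is the same as adding the two values, which gives 578.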


Re: [Mesa-dev] [PATCH] v2 st/va Avoid VBR bitrate calculation overflow

2016-09-27 Thread Christian König


On 27.09.2016 at 12:42, Emil Velikov wrote:

On 26 September 2016 at 14:35, Christian König  wrote:

On 26.09.2016 at 11:44, Andy Furniss wrote:

VBR bitrate calc needs 64 bits at high rates.
v2 use float.

Signed-off-by: Andy Furniss 


Reviewed-by: Christian König .

Since Leo is on vacation I will probably collect all remaining mesa patches
and commit them later today.


Christian please s/v2 st/va /st/va: / the commit message and add
Cc: mesa-sta...@lists.freedesktop.org


Done and committed.



Thanks
Emil

Andy: git format-patch -v2 ... saves you the manual tweak of the subject line.


Interesting, I didn't know that option either.

Thanks,
Christian.

Re: [Mesa-dev] [RFC] egl: stop claiming support for pbuffer + msaa (RFC)

2016-09-27 Thread Emil Velikov

On 26 September 2016 at 08:41, Tapani Pälli  wrote:
> This fixes a crash in egl-create-msaa-pbuffer-surface Piglit test
> and same crash in many dEQP EGL tests.
>
> I also found that some Qt example did a workaround because of this
> crash: https://bugreports.qt.io/browse/QTBUG-47509
>
> Signed-off-by: Tapani Pälli 
> ---
>
> This is RFC as I'm not sure if we are supposed to support this. I tried
> to verify overall pbuffer situation with some mesa-demos using pbuffer
> but those are not working for me at all with or without my patch.
>
>  src/egl/main/eglconfig.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/src/egl/main/eglconfig.c b/src/egl/main/eglconfig.c
> index 6161d26..20cf9d4 100644
> --- a/src/egl/main/eglconfig.c
> +++ b/src/egl/main/eglconfig.c
> @@ -407,6 +407,11 @@ _eglValidateConfig(const _EGLConfig *conf, EGLBoolean 
> for_matching)
>return EGL_FALSE;
> }
>
> +   /* pbuffer with MSAA not supported */
FWIW, on my system piglit also crashes and the demos don't render
anything. So I'm leaning towards taking this as-is (for the time being),
plus Cc stable ?

Can you apply a minor polish to the comment - "XXX/TODO: pbuffer +
MSAA does not work + QT bugreport" or alike.

Cc: mesa-sta...@lists.freedesktop.org
Reviewed-by: Emil Velikov 

Thanks
Emil

Re: [Mesa-dev] [PATCH] gallium/radeon: Initialize pipe_resource::next to NULL

2016-09-27 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Tue, Sep 27, 2016 at 11:18 AM, Michel Dänzer  wrote:
> From: Michel Dänzer 
>
> Fixes lots of piglit tests crashing due to using uninitialized memory.
>
> Fixes: ecd6fce2611e ("mesa/st: support lowering multi-planar YUV")
> Signed-off-by: Michel Dänzer 
> ---
>  src/gallium/drivers/radeon/r600_buffer_common.c | 1 +
>  src/gallium/drivers/radeon/r600_texture.c   | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
> b/src/gallium/drivers/radeon/r600_buffer_common.c
> index 2e8b6f4..cbbcc29 100644
> --- a/src/gallium/drivers/radeon/r600_buffer_common.c
> +++ b/src/gallium/drivers/radeon/r600_buffer_common.c
> @@ -511,6 +511,7 @@ r600_alloc_buffer_struct(struct pipe_screen *screen,
> rbuffer = MALLOC_STRUCT(r600_resource);
>
> rbuffer->b.b = *templ;
> +   rbuffer->b.b.next = NULL;
> pipe_reference_init(&rbuffer->b.b.reference, 1);
> rbuffer->b.b.screen = screen;
> rbuffer->b.vtbl = &r600_buffer_vtbl;
> diff --git a/src/gallium/drivers/radeon/r600_texture.c 
> b/src/gallium/drivers/radeon/r600_texture.c
> index b02b2dc..71564e2 100644
> --- a/src/gallium/drivers/radeon/r600_texture.c
> +++ b/src/gallium/drivers/radeon/r600_texture.c
> @@ -1040,6 +1040,7 @@ r600_texture_create_object(struct pipe_screen *screen,
>
> resource = &rtex->resource;
> resource->b.b = *base;
> +   resource->b.b.next = NULL;
> resource->b.vtbl = &r600_texture_vtbl;
> pipe_reference_init(&resource->b.b.reference, 1);
> resource->b.b.screen = screen;
> --
> 2.9.3
>

Re: [Mesa-dev] [RFC] egl: stop claiming support for pbuffer + msaa (RFC)

2016-09-27 Thread Marek Olšák

On Tue, Sep 27, 2016 at 2:34 PM, Emil Velikov  wrote:
> On 26 September 2016 at 08:41, Tapani Pälli  wrote:
>> This fixes a crash in egl-create-msaa-pbuffer-surface Piglit test
>> and same crash in many dEQP EGL tests.
>>
>> I also found that some Qt example did a workaround because of this
>> crash: https://bugreports.qt.io/browse/QTBUG-47509
>>
>> Signed-off-by: Tapani Pälli 
>> ---
>>
>> This is RFC as I'm not sure if we are supposed to support this. I tried
>> to verify overall pbuffer situation with some mesa-demos using pbuffer
>> but those are not working for me at all with or without my patch.
>>
>>  src/egl/main/eglconfig.c | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/src/egl/main/eglconfig.c b/src/egl/main/eglconfig.c
>> index 6161d26..20cf9d4 100644
>> --- a/src/egl/main/eglconfig.c
>> +++ b/src/egl/main/eglconfig.c
>> @@ -407,6 +407,11 @@ _eglValidateConfig(const _EGLConfig *conf, EGLBoolean 
>> for_matching)
>>return EGL_FALSE;
>> }
>>
>> +   /* pbuffer with MSAA not supported */
> Fwiw on my system piglit also crashes + the demos don't render
> anything. So I'm leaning that we want this as-is (for the time being)
> + cc stable ?
>
> Can you apply a minor polish to the comment - "XXX/TODO: pbuffer +
> MSAA does not work + QT bugreport" or alike.

Please don't add "XXX/TODO". pbuffers were spec'd in 1997 and were
meant to be used on GL 1.x hardware that didn't support MSAA
texturing, thus MSAA pbuffers don't make any sense. Just keep the
current comment.

Marek

[Mesa-dev] [PATCH] glx: don't expose systemTimeExtension for DRI2/DRI3/DRISW

2016-09-27 Thread Emil Velikov

The extension is used by, and applicable to, DRI1 drivers only.

Signed-off-by: Emil Velikov 
---
If anyone wants to go ahead and start moving the DRI1 only functionality
to src/glx/dri[1] that'll be appreciated.
---
 src/glx/dri2_glx.c  | 2 --
 src/glx/dri3_glx.c  | 1 -
 src/glx/drisw_glx.c | 1 -
 3 files changed, 4 deletions(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index af388d9..4f847cc 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -1393,8 +1393,6 @@ dri2CreateDisplay(Display * dpy)
else
   pdp->loader_extensions[i++] = &dri2LoaderExtension.base;

-   pdp->loader_extensions[i++] = &systemTimeExtension.base;
-
pdp->loader_extensions[i++] = &dri2UseInvalidate.base;
 
pdp->loader_extensions[i++] = NULL;
diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index 90d7bba..bdefdf3 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -488,7 +488,6 @@ const __DRIuseInvalidateExtension dri3UseInvalidate = {
 
 static const __DRIextension *loader_extensions[] = {
&imageLoaderExtension.base,
-   &systemTimeExtension.base,
&dri3UseInvalidate.base,
NULL
 };
diff --git a/src/glx/drisw_glx.c b/src/glx/drisw_glx.c
index 241ac7f..110b7f8 100644
--- a/src/glx/drisw_glx.c
+++ b/src/glx/drisw_glx.c
@@ -219,7 +219,6 @@ static const __DRIswrastLoaderExtension 
swrastLoaderExtension = {
 };
 
 static const __DRIextension *loader_extensions[] = {
-   &systemTimeExtension.base,
&swrastLoaderExtension.base,
NULL
 };
-- 
2.9.3


[Mesa-dev] [PATCH] egl: use unsigned int index when iterating over attrib_list

2016-09-27 Thread Emil Velikov

From: Emil Velikov 

Otherwise one can overflow the signed variable and (attempt to) cause
all sorts of strange behaviour.

Signed-off-by: Emil Velikov 
---
 src/egl/drivers/dri2/egl_dri2.c | 2 +-
 src/egl/main/eglconfig.c| 3 ++-
 src/egl/main/eglcontext.c   | 3 ++-
 src/egl/main/egldisplay.c   | 2 +-
 src/egl/main/eglimage.c | 3 ++-
 src/egl/main/eglsurface.c   | 3 ++-
 src/egl/main/eglsync.c  | 6 --
 7 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 8e376e3..6a3318b 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -167,7 +167,7 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig 
*dri_config, int id,
_EGLConfig *matching_config;
EGLint num_configs = 0;
EGLint config_id;
-   int i;
+   unsigned int i;
 
dri2_dpy = disp->DriverData;
_eglInitConfig(&base, disp, id);
diff --git a/src/egl/main/eglconfig.c b/src/egl/main/eglconfig.c
index 6161d26..b12ff9d 100644
--- a/src/egl/main/eglconfig.c
+++ b/src/egl/main/eglconfig.c
@@ -514,7 +514,8 @@ EGLBoolean
 _eglParseConfigAttribList(_EGLConfig *conf, _EGLDisplay *dpy,
   const EGLint *attrib_list)
 {
-   EGLint attr, val, i;
+   EGLint attr, val;
+   unsigned int i;
 
_eglInitConfig(conf, dpy, EGL_DONT_CARE);
 
diff --git a/src/egl/main/eglcontext.c b/src/egl/main/eglcontext.c
index 60625f6..694f137 100644
--- a/src/egl/main/eglcontext.c
+++ b/src/egl/main/eglcontext.c
@@ -85,7 +85,8 @@ _eglParseContextAttribList(_EGLContext *ctx, _EGLDisplay *dpy,
const EGLint *attrib_list)
 {
EGLenum api = ctx->ClientAPI;
-   EGLint i, err = EGL_SUCCESS;
+   EGLint err = EGL_SUCCESS;
+   unsigned int i;
 
if (!attrib_list)
   return EGL_SUCCESS;
diff --git a/src/egl/main/egldisplay.c b/src/egl/main/egldisplay.c
index 3d4eb81..201cf7b 100644
--- a/src/egl/main/egldisplay.c
+++ b/src/egl/main/egldisplay.c
@@ -474,7 +474,7 @@ _eglUnlinkResource(_EGLResource *res, _EGLResourceType type)
 static EGLBoolean
 _eglParseX11DisplayAttribList(const EGLint *attrib_list)
 {
-   int i;
+   unsigned int i;
 
if (attrib_list == NULL) {
   return EGL_TRUE;
diff --git a/src/egl/main/eglimage.c b/src/egl/main/eglimage.c
index 818b597..44dbfab 100644
--- a/src/egl/main/eglimage.c
+++ b/src/egl/main/eglimage.c
@@ -41,7 +41,8 @@ EGLint
 _eglParseImageAttribList(_EGLImageAttribs *attrs, _EGLDisplay *dpy,
  const EGLint *attrib_list)
 {
-   EGLint i, err = EGL_SUCCESS;
+   EGLint err = EGL_SUCCESS;
+   unsigned int i;
 
(void) dpy;
 
diff --git a/src/egl/main/eglsurface.c b/src/egl/main/eglsurface.c
index 231a5f0..37ede3e 100644
--- a/src/egl/main/eglsurface.c
+++ b/src/egl/main/eglsurface.c
@@ -70,9 +70,10 @@ _eglParseSurfaceAttribList(_EGLSurface *surf, const EGLint 
*attrib_list)
_EGLDisplay *dpy = surf->Resource.Display;
EGLint type = surf->Type;
EGLint texture_type = EGL_PBUFFER_BIT;
-   EGLint i, err = EGL_SUCCESS;
+   EGLint err = EGL_SUCCESS;
EGLint attr = EGL_NONE;
EGLint val = EGL_NONE;
+   unsigned int i;
 
if (!attrib_list)
   return EGL_SUCCESS;
diff --git a/src/egl/main/eglsync.c b/src/egl/main/eglsync.c
index 33625e9..df313cb 100644
--- a/src/egl/main/eglsync.c
+++ b/src/egl/main/eglsync.c
@@ -40,7 +40,8 @@
 static EGLint
 _eglParseSyncAttribList(_EGLSync *sync, const EGLint *attrib_list)
 {
-   EGLint i, err = EGL_SUCCESS;
+   EGLint err = EGL_SUCCESS;
+   unsigned int i;
 
if (!attrib_list)
   return EGL_SUCCESS;
@@ -69,7 +70,8 @@ _eglParseSyncAttribList(_EGLSync *sync, const EGLint 
*attrib_list)
 static EGLint
 _eglParseSyncAttribList64(_EGLSync *sync, const EGLAttrib *attrib_list)
 {
-   EGLint i, err = EGL_SUCCESS;
+   EGLint err = EGL_SUCCESS;
+   unsigned int i;
 
if (!attrib_list)
   return EGL_SUCCESS;
-- 
2.9.3


Re: [Mesa-dev] st/omx/dec/h265: Correct the timestamping (derived from commit 3b6bda665a5a890f2c98e19d2939d7de92b8cb4c)

2016-09-27 Thread Christian König


Hi Indrajit,

please send this patch once more as text mail. I can't commit it like this.

Regards,
Christian.

Am 20.09.2016 um 13:48 schrieb Das, Indrajit-kumar:


From: Indrajit Das 

Reviewed-by: Christian König 

Reviewed-by: Nishanth Peethambaran 

Signed-off-by: Indrajit Das 

---

src/gallium/state_trackers/omx/vid_dec_h265.c | 13 -

1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/omx/vid_dec_h265.c 
b/src/gallium/state_trackers/omx/vid_dec_h265.c


index 7c0f75d..db20292 100644

--- a/src/gallium/state_trackers/omx/vid_dec_h265.c

+++ b/src/gallium/state_trackers/omx/vid_dec_h265.c

@@ -60,6 +60,7 @@ enum {

struct dpb_list {

struct list_head list;

struct pipe_video_buffer *buffer;

+   OMX_TICKS timestamp;

unsigned poc;

};

@@ -518,6 +519,9 @@ static void 
vid_dec_h265_BeginFrame(vid_dec_PrivateType *priv)


   return;

vid_dec_NeedTarget(priv);

+   if (priv->first_buf_in_frame)

+priv->timestamp = priv->timestamps[0];

+   priv->first_buf_in_frame = false;

if (!priv->codec) {

   struct pipe_video_codec templat = {};

@@ -558,6 +562,8 @@ static struct pipe_video_buffer 
*vid_dec_h265_Flush(vid_dec_PrivateType *priv,


   return NULL;

buf = result->buffer;

+   if (timestamp)

+*timestamp = result->timestamp;

--priv->codec_data.h265.dpb_num;

LIST_DEL(&result->list);

@@ -572,6 +578,7 @@ static void 
vid_dec_h265_EndFrame(vid_dec_PrivateType *priv)


struct pipe_video_buffer *tmp;

struct ref_pic_set *rps;

int i;

+   OMX_TICKS timestamp;

if (!priv->frame_started)

   return;

@@ -621,7 +628,9 @@ static void 
vid_dec_h265_EndFrame(vid_dec_PrivateType *priv)


if (!entry)

   return;

+   priv->first_buf_in_frame = true;

entry->buffer = priv->target;

+   entry->timestamp = priv->timestamp;

entry->poc = get_poc(priv);

LIST_ADDTAIL(&entry->list, &priv->codec_data.h265.dpb_list);

@@ -632,7 +641,8 @@ static void 
vid_dec_h265_EndFrame(vid_dec_PrivateType *priv)


   return;

tmp = priv->in_buffers[0]->pInputPortPrivate;

- priv->in_buffers[0]->pInputPortPrivate = vid_dec_h265_Flush(priv, NULL);

+ priv->in_buffers[0]->pInputPortPrivate = vid_dec_h265_Flush(priv, &timestamp);


+   priv->in_buffers[0]->nTimeStamp = timestamp;

priv->target = tmp;

priv->frame_finished = priv->in_buffers[0]->pInputPortPrivate != NULL;

if (priv->frame_finished &&

@@ -894,4 +904,5 @@ void vid_dec_h265_Init(vid_dec_PrivateType *priv)

priv->Decode = vid_dec_h265_Decode;

priv->EndFrame = vid_dec_h265_EndFrame;

priv->Flush = vid_dec_h265_Flush;

+   priv->first_buf_in_frame = true;

}

--

2.7.4




Re: [Mesa-dev] [PATCH] egl: use unsigned int index when iterating over attrib_list

2016-09-27 Thread Nicolai Hähnle


On 27.09.2016 14:40, Emil Velikov wrote:
> From: Emil Velikov
>
> Otherwise one can overflow the signed variable and (attempt to) cause
> all sorts of strange behaviour.

As long as we're worrying about such things, shouldn't it really be a
size_t then? With that,

Reviewed-by: Nicolai Hähnle 

Cheers,
Nicolai



Signed-off-by: Emil Velikov 
---
 src/egl/drivers/dri2/egl_dri2.c | 2 +-
 src/egl/main/eglconfig.c| 3 ++-
 src/egl/main/eglcontext.c   | 3 ++-
 src/egl/main/egldisplay.c   | 2 +-
 src/egl/main/eglimage.c | 3 ++-
 src/egl/main/eglsurface.c   | 3 ++-
 src/egl/main/eglsync.c  | 6 --
 7 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 8e376e3..6a3318b 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -167,7 +167,7 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig 
*dri_config, int id,
_EGLConfig *matching_config;
EGLint num_configs = 0;
EGLint config_id;
-   int i;
+   unsigned int i;

dri2_dpy = disp->DriverData;
_eglInitConfig(&base, disp, id);
diff --git a/src/egl/main/eglconfig.c b/src/egl/main/eglconfig.c
index 6161d26..b12ff9d 100644
--- a/src/egl/main/eglconfig.c
+++ b/src/egl/main/eglconfig.c
@@ -514,7 +514,8 @@ EGLBoolean
 _eglParseConfigAttribList(_EGLConfig *conf, _EGLDisplay *dpy,
   const EGLint *attrib_list)
 {
-   EGLint attr, val, i;
+   EGLint attr, val;
+   unsigned int i;

_eglInitConfig(conf, dpy, EGL_DONT_CARE);

diff --git a/src/egl/main/eglcontext.c b/src/egl/main/eglcontext.c
index 60625f6..694f137 100644
--- a/src/egl/main/eglcontext.c
+++ b/src/egl/main/eglcontext.c
@@ -85,7 +85,8 @@ _eglParseContextAttribList(_EGLContext *ctx, _EGLDisplay *dpy,
const EGLint *attrib_list)
 {
EGLenum api = ctx->ClientAPI;
-   EGLint i, err = EGL_SUCCESS;
+   EGLint err = EGL_SUCCESS;
+   unsigned int i;

if (!attrib_list)
   return EGL_SUCCESS;
diff --git a/src/egl/main/egldisplay.c b/src/egl/main/egldisplay.c
index 3d4eb81..201cf7b 100644
--- a/src/egl/main/egldisplay.c
+++ b/src/egl/main/egldisplay.c
@@ -474,7 +474,7 @@ _eglUnlinkResource(_EGLResource *res, _EGLResourceType type)
 static EGLBoolean
 _eglParseX11DisplayAttribList(const EGLint *attrib_list)
 {
-   int i;
+   unsigned int i;

if (attrib_list == NULL) {
   return EGL_TRUE;
diff --git a/src/egl/main/eglimage.c b/src/egl/main/eglimage.c
index 818b597..44dbfab 100644
--- a/src/egl/main/eglimage.c
+++ b/src/egl/main/eglimage.c
@@ -41,7 +41,8 @@ EGLint
 _eglParseImageAttribList(_EGLImageAttribs *attrs, _EGLDisplay *dpy,
  const EGLint *attrib_list)
 {
-   EGLint i, err = EGL_SUCCESS;
+   EGLint err = EGL_SUCCESS;
+   unsigned int i;

(void) dpy;

diff --git a/src/egl/main/eglsurface.c b/src/egl/main/eglsurface.c
index 231a5f0..37ede3e 100644
--- a/src/egl/main/eglsurface.c
+++ b/src/egl/main/eglsurface.c
@@ -70,9 +70,10 @@ _eglParseSurfaceAttribList(_EGLSurface *surf, const EGLint 
*attrib_list)
_EGLDisplay *dpy = surf->Resource.Display;
EGLint type = surf->Type;
EGLint texture_type = EGL_PBUFFER_BIT;
-   EGLint i, err = EGL_SUCCESS;
+   EGLint err = EGL_SUCCESS;
EGLint attr = EGL_NONE;
EGLint val = EGL_NONE;
+   unsigned int i;

if (!attrib_list)
   return EGL_SUCCESS;
diff --git a/src/egl/main/eglsync.c b/src/egl/main/eglsync.c
index 33625e9..df313cb 100644
--- a/src/egl/main/eglsync.c
+++ b/src/egl/main/eglsync.c
@@ -40,7 +40,8 @@
 static EGLint
 _eglParseSyncAttribList(_EGLSync *sync, const EGLint *attrib_list)
 {
-   EGLint i, err = EGL_SUCCESS;
+   EGLint err = EGL_SUCCESS;
+   unsigned int i;

if (!attrib_list)
   return EGL_SUCCESS;
@@ -69,7 +70,8 @@ _eglParseSyncAttribList(_EGLSync *sync, const EGLint 
*attrib_list)
 static EGLint
 _eglParseSyncAttribList64(_EGLSync *sync, const EGLAttrib *attrib_list)
 {
-   EGLint i, err = EGL_SUCCESS;
+   EGLint err = EGL_SUCCESS;
+   unsigned int i;

if (!attrib_list)
   return EGL_SUCCESS;



Re: [Mesa-dev] [RFC 1/7] eglplatform.h: introduce and use EGL_USE_PLATFORM_*_KHR

2016-09-27 Thread Emil Velikov

On 25 September 2016 at 00:17, Eric Engestrom  wrote:
> On Thu, Sep 22, 2016 at 09:38:06AM +0100, Emil Velikov wrote:
>> From: Emil Velikov 
>>
>> In order to avoid the current, somewhat fragile detection in
>> eglplatform.h introduce explicit platform selection.
>>
>> The approach is based on the one used in Vulkan and allows one to
>> explicitly "request" the platform they will be using without the need of
>> local hacks.
>>
>> ---
>> XXX: Strictly speaking the default/else case would be the None/native
>> one, but since we still have the "autodetection" heuristics...
>>
>> Admittedly some of the names can be improved, plus there's a limited
>> amount of Symbian users still in the wild. The latter kept for
>> compatibility reasons.
>> ---
>>  include/EGL/eglplatform.h | 56 
>> +++
>>  1 file changed, 56 insertions(+)
>>
>> diff --git a/include/EGL/eglplatform.h b/include/EGL/eglplatform.h
>> index b376e64..923b5f6 100644
>> --- a/include/EGL/eglplatform.h
>> +++ b/include/EGL/eglplatform.h
>> @@ -67,6 +67,62 @@
>>   * implementations.
>>   */
>>
>> +#if defined(EGL_USE_PLATFORM_ANDROID_KHR)
>> +#include <android/native_window.h>
>> +
>> +struct egl_native_pixmap_t;
>> +
>> +typedef struct ANativeWindow*   EGLNativeWindowType;
>> +typedef struct egl_native_pixmap_t* EGLNativePixmapType;
>> +typedef void*   EGLNativeDisplayType;
>> +
>> +#elif defined(EGL_USE_PLATFORM_GBM_KHR) // XXX: Name GBM vs DRM vs other
>> +
>> +typedef struct gbm_device  *EGLNativeDisplayType;
>> +typedef struct gbm_bo  *EGLNativePixmapType;
>> +typedef void   *EGLNativeWindowType;
>> +
>> +#elif defined(EGL_USE_PLATFORM_NONE_KHR) // XXX: Name NONE vs Native vs 
>> other
>> +
>> +typedef void*EGLNativeDisplayType;
>> +typedef khronos_uintptr_t EGLNativePixmapType;
>> +typedef khronos_uintptr_t EGLNativeWindowType;
>> +
>> +#elif defined(EGL_USE_PLATFORM_SYMBIAN_KHR)
>> +
>> +typedef int   EGLNativeDisplayType;
>> +typedef void *EGLNativeWindowType;
>> +typedef void *EGLNativePixmapType;
>> +
>> +#elif defined(EGL_USE_PLATFORM_WAYLAND_KHR)
>> +
>> +typedef struct wl_display *EGLNativeDisplayType;
>> +typedef struct wl_egl_pixmap  *EGLNativePixmapType;
>> +typedef struct wl_egl_window  *EGLNativeWindowType;
>> +
>> +#elif defined(EGL_USE_PLATFORM_WIN32_KHR)
>> +
>> +#ifndef WIN32_LEAN_AND_MEAN
>> +#define WIN32_LEAN_AND_MEAN 1
>> +#endif
>
> Isn't that fragile too?  I'm not familiar with Windows and what this
> #define does, but presumably if someone were to `#include <windows.h>`
> first, the behaviour would change?
> If that's the case, maybe we could add an `#else #warning "Don't include
> <windows.h> before this file"`?
>
Yes, things look rather iffy, but since it's unlikely I'll test the
Windows side I've opted for a copy/paste of the existing code :-)

>> +#include <windows.h>
>> +
>> +typedef HDC EGLNativeDisplayType;
>> +typedef HBITMAP EGLNativePixmapType;
>> +typedef HWNDEGLNativeWindowType;
>> +
>> +#elif defined(EGL_USE_PLATFORM_XLIB_KHR)
>> +#include <X11/Xlib.h>
>> +#include <X11/Xutil.h>
>> +
>> +typedef Display *EGLNativeDisplayType;
>> +typedef Pixmap   EGLNativePixmapType;
>> +typedef Window   EGLNativeWindowType;
>> +
>> +#else
>
> That `+#else` is missing its matching `+#endif` :)
>
> ---8<---
> @@ -134,6 +190,7 @@ typedef khronos_uintptr_tEGLNativeWindowType;
>  #else
>  #error "Platform not recognized"
>  #endif
> +#endif
>
>  /* EGL 1.2 types, renamed for consistency in EGL 1.3 */
>  typedef EGLNativeDisplayType NativeDisplayType;
> --->8---
>
> Overall, I think this series is a good idea, although #7 will probably
> have to wait for a while...
>
Yes, the last 2-3 patches might never make it afaict, but they're good
to illustrate things.

Thanks for catching the silly bug :-)
Emil
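
To make the idea of explicit platform selection concrete, here is a minimal sketch of how such a switch behaves from the client's side. It does not use the real header: the `DEMO_USE_PLATFORM_*` macros, the `Demo*Type` typedefs and `demo_platform_name()` are hypothetical names chosen for illustration, mirroring the shape of the proposed `EGL_USE_PLATFORM_*_KHR` blocks.

```c
/* The client states its platform up front, before the types are chosen. */
#define DEMO_USE_PLATFORM_GBM

#if defined(DEMO_USE_PLATFORM_GBM)

struct gbm_device;
struct gbm_bo;
typedef struct gbm_device *DemoNativeDisplayType;
typedef struct gbm_bo     *DemoNativePixmapType;
typedef void              *DemoNativeWindowType;
#define DEMO_PLATFORM_NAME "gbm"

#elif defined(DEMO_USE_PLATFORM_WAYLAND)

struct wl_display;
struct wl_egl_pixmap;
struct wl_egl_window;
typedef struct wl_display    *DemoNativeDisplayType;
typedef struct wl_egl_pixmap *DemoNativePixmapType;
typedef struct wl_egl_window *DemoNativeWindowType;
#define DEMO_PLATFORM_NAME "wayland"

#else
/* Every #if chain needs its closing #endif, as noted in the review. */
#error "Platform not recognized"
#endif

/* Report which branch of the selection was compiled in. */
static const char *
demo_platform_name(void)
{
   return DEMO_PLATFORM_NAME;
}
```

The point of the explicit macro is that the native types no longer depend on which system headers happened to be included earlier, or on compiler-defined platform symbols.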

[Mesa-dev] [PATCH 2/2] gallium/radeon: use smaller buffers for query results

2016-09-27 Thread Nicolai Hähnle

From: Nicolai Hähnle 

Most of the time, even the 512 bytes that we now get is more than sufficient
(pipeline stats queries are the largest at 184 bytes per shot).
---
 src/gallium/drivers/radeon/r600_query.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_query.c 
b/src/gallium/drivers/radeon/r600_query.c
index 2c3d530..0dce1c9 100644
--- a/src/gallium/drivers/radeon/r600_query.c
+++ b/src/gallium/drivers/radeon/r600_query.c
@@ -311,21 +311,21 @@ void r600_query_hw_destroy(struct r600_common_context 
*rctx,
}
 
r600_resource_reference(&query->buffer.buf, NULL);
FREE(rquery);
 }
 
 static struct r600_resource *r600_new_query_buffer(struct r600_common_context 
*ctx,
   struct r600_query_hw *query)
 {
unsigned buf_size = MAX2(query->result_size,
-ctx->screen->info.gart_page_size);
+ctx->screen->info.min_alloc_size);
 
/* Queries are normally read by the CPU after
 * being written by the gpu, hence staging is probably a good
 * usage pattern.
 */
struct r600_resource *buf = (struct r600_resource*)
pipe_buffer_create(ctx->b.screen, PIPE_BIND_CUSTOM,
   PIPE_USAGE_STAGING, buf_size);
if (!buf)
return NULL;
-- 
2.7.4
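
A small worked example of the size computation may help. The sketch below only mirrors the `MAX2()` expression from the hunk; `demo_query_buf_size` is a hypothetical helper, and the 4096-byte page size used in the note afterwards is an assumed typical value, not something stated in the patch.

```c
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Pick the allocation size for a query result buffer: never smaller than
 * the result itself, never smaller than the allocator's minimum. */
static unsigned
demo_query_buf_size(unsigned result_size, unsigned min_alloc_size)
{
   return MAX2(result_size, min_alloc_size);
}
```

For a 184-byte pipeline statistics result, a 512-byte minimum gives a 512-byte buffer, whereas clamping to an assumed 4096-byte GART page would give 4096 bytes for the same query.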


[Mesa-dev] [PATCH 1/2] gallium/radeon/winsyses: add radeon_winsys::min_alloc_size

2016-09-27 Thread Nicolai Hähnle

From: Nicolai Hähnle 

---
 src/gallium/drivers/radeon/radeon_winsys.h| 1 +
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c | 2 ++
 src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 4 
 3 files changed, 7 insertions(+)

diff --git a/src/gallium/drivers/radeon/radeon_winsys.h 
b/src/gallium/drivers/radeon/radeon_winsys.h
index 55f0395..d0705d6 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -176,20 +176,21 @@ struct radeon_info {
 uint32_tpci_func;
 
 /* Device info. */
 uint32_tpci_id;
 enum radeon_family  family;
 enum chip_class chip_class;
 uint32_tgart_page_size;
 uint64_tgart_size;
 uint64_tvram_size;
 uint64_tmax_alloc_size;
+uint32_tmin_alloc_size;
 boolhas_dedicated_vram;
 boolhas_virtual_memory;
 boolgfx_ib_pad_with_type2;
 boolhas_sdma;
 boolhas_uvd;
 uint32_tuvd_fw_version;
 uint32_tvce_fw_version;
 uint32_tme_fw_version;
 uint32_tpfp_fw_version;
 uint32_tce_fw_version;
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
index c83489d..c28e1ca 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c
@@ -550,20 +550,22 @@ amdgpu_winsys_create(int fd, radeon_screen_create_t 
screen_create)
 
if (!pb_slabs_init(&ws->bo_slabs,
   AMDGPU_SLAB_MIN_SIZE_LOG2, AMDGPU_SLAB_MAX_SIZE_LOG2,
   12, /* number of heaps (domain/flags combinations) */
   ws,
   amdgpu_bo_can_reclaim_slab,
   amdgpu_bo_slab_alloc,
   amdgpu_bo_slab_free))
   goto fail_cache;
 
+   ws->info.min_alloc_size = 1 << AMDGPU_SLAB_MIN_SIZE_LOG2;
+
/* init reference */
pipe_reference_init(&ws->reference, 1);
 
/* Set functions. */
ws->base.unref = amdgpu_winsys_unref;
ws->base.destroy = amdgpu_winsys_destroy;
ws->base.query_info = amdgpu_winsys_query_info;
ws->base.cs_request_feature = amdgpu_cs_request_feature;
ws->base.query_value = amdgpu_query_value;
ws->base.read_registers = amdgpu_read_registers;
diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index ae55746..16e4408 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -767,20 +767,24 @@ radeon_drm_winsys_create(int fd, radeon_screen_create_t 
screen_create)
  * honor the address offset.
  */
 if (!pb_slabs_init(&ws->bo_slabs,
RADEON_SLAB_MIN_SIZE_LOG2, 
RADEON_SLAB_MAX_SIZE_LOG2,
12,
ws,
radeon_bo_can_reclaim_slab,
radeon_bo_slab_alloc,
radeon_bo_slab_free))
 goto fail_cache;
+
+ws->info.min_alloc_size = 1 << RADEON_SLAB_MIN_SIZE_LOG2;
+} else {
+ws->info.min_alloc_size = ws->info.gart_page_size;
 }
 
 if (ws->gen >= DRV_R600) {
 ws->surf_man = radeon_surface_manager_new(ws->fd);
 if (!ws->surf_man)
 goto fail_slab;
 }
 
 /* init reference */
 pipe_reference_init(&ws->reference, 1);
-- 
2.7.4
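
The selection logic in the two winsyses can be summarised in a short sketch. The names below are hypothetical, and the value 9 for the slab minimum is an assumption inferred from the "512 bytes" mentioned in patch 2/2 rather than read from the slab headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 1 << 9 == 512 bytes. */
#define DEMO_SLAB_MIN_SIZE_LOG2 9

/* With a slab allocator the smallest allocation is one slab entry;
 * without one, fall back to a full GART page. */
static uint32_t
demo_min_alloc_size(bool has_slabs, uint32_t gart_page_size)
{
   if (has_slabs)
      return UINT32_C(1) << DEMO_SLAB_MIN_SIZE_LOG2;
   return gart_page_size;
}
```

This matches the shape of the radeon hunk, where the slab-derived value is set inside the branch that initialises the slabs and the page size is used in the `else` branch.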


Re: [Mesa-dev] st/omx/dec/h265: Correct the timestamping (derived from commit 3b6bda665a5a890f2c98e19d2939d7de92b8cb4c)

2016-09-27 Thread Emil Velikov

On 27 September 2016 at 13:13, Christian König  wrote:
> Hi Indrajit,
>
> please send this patch once more as text mail. I can't commit it like this.
>
Having a commit message is strongly recommended. A fixes and/or stable
tag would be even better :-)

Thanks
Emil

Re: [Mesa-dev] [PATCH] egl: use unsigned int index when iterating over attrib_list

2016-09-27 Thread Eric Engestrom

On Tue, Sep 27, 2016 at 04:10:53PM +0200, Nicolai Hähnle wrote:
> On 27.09.2016 14:40, Emil Velikov wrote:
> > From: Emil Velikov 
> > 
> > Otherwise one can overflow the signed variable and (attempt to) cause
> > all sorts of strange behaviour.
> 
> As long as we're worrying about such things, shouldn't it really be a size_t
> then? With that,

Agreed, and you can also have my r-b.

One question though: why these specific `i`s? There are plenty more `i`s
(in these files) that could use the same treatment, not to mention other
variables.
It's not as if these are the most overflow-critical either: I'm pretty
sure if we have >INT_MAX attributes, we have more pressing problems than
overflowing the attrib counter :P
(To be clear, I do think this is a good change, just wondering why)
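
For context, the loops in question all walk an `EGL_NONE`-terminated list of (attribute, value) pairs. A minimal sketch with a `size_t` index, as suggested in the review, could look like the following. `demo_count_attribs` is a hypothetical helper and not a Mesa function; `EGLint` and `EGL_NONE` are redefined locally (with the value from the EGL headers) only to keep the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t EGLint;   /* 32-bit signed, as in the EGL headers */
#define EGL_NONE 0x3038   /* attribute list terminator */

/* Count the (attribute, value) pairs in an EGL_NONE-terminated list.
 * A NULL list is treated as empty. The index is a size_t, so it can
 * never go negative or overflow before the address space runs out. */
static size_t
demo_count_attribs(const EGLint *attrib_list)
{
   size_t i;

   if (!attrib_list)
      return 0;

   for (i = 0; attrib_list[i] != EGL_NONE; i += 2)
      ;
   return i / 2;
}
```

The list contents stay `EGLint`; only the index type changes, which is why the patch splits `i` out of the `EGLint` declarations.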

> 
> Reviewed-by: Nicolai Hähnle 
> 
> Cheers,
> Nicolai
> 
> > 
> > Signed-off-by: Emil Velikov 
> > ---
> >  src/egl/drivers/dri2/egl_dri2.c | 2 +-
> >  src/egl/main/eglconfig.c| 3 ++-
> >  src/egl/main/eglcontext.c   | 3 ++-
> >  src/egl/main/egldisplay.c   | 2 +-
> >  src/egl/main/eglimage.c | 3 ++-
> >  src/egl/main/eglsurface.c   | 3 ++-
> >  src/egl/main/eglsync.c  | 6 --
> >  7 files changed, 14 insertions(+), 8 deletions(-)
> > 
> > diff --git a/src/egl/drivers/dri2/egl_dri2.c 
> > b/src/egl/drivers/dri2/egl_dri2.c
> > index 8e376e3..6a3318b 100644
> > --- a/src/egl/drivers/dri2/egl_dri2.c
> > +++ b/src/egl/drivers/dri2/egl_dri2.c
> > @@ -167,7 +167,7 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig 
> > *dri_config, int id,
> > _EGLConfig *matching_config;
> > EGLint num_configs = 0;
> > EGLint config_id;
> > -   int i;
> > +   unsigned int i;
> > 
> > dri2_dpy = disp->DriverData;
> > _eglInitConfig(&base, disp, id);
> > diff --git a/src/egl/main/eglconfig.c b/src/egl/main/eglconfig.c
> > index 6161d26..b12ff9d 100644
> > --- a/src/egl/main/eglconfig.c
> > +++ b/src/egl/main/eglconfig.c
> > @@ -514,7 +514,8 @@ EGLBoolean
> >  _eglParseConfigAttribList(_EGLConfig *conf, _EGLDisplay *dpy,
> >const EGLint *attrib_list)
> >  {
> > -   EGLint attr, val, i;
> > +   EGLint attr, val;
> > +   unsigned int i;
> > 
> > _eglInitConfig(conf, dpy, EGL_DONT_CARE);
> > 
> > diff --git a/src/egl/main/eglcontext.c b/src/egl/main/eglcontext.c
> > index 60625f6..694f137 100644
> > --- a/src/egl/main/eglcontext.c
> > +++ b/src/egl/main/eglcontext.c
> > @@ -85,7 +85,8 @@ _eglParseContextAttribList(_EGLContext *ctx, _EGLDisplay 
> > *dpy,
> > const EGLint *attrib_list)
> >  {
> > EGLenum api = ctx->ClientAPI;
> > -   EGLint i, err = EGL_SUCCESS;
> > +   EGLint err = EGL_SUCCESS;
> > +   unsigned int i;
> > 
> > if (!attrib_list)
> >return EGL_SUCCESS;
> > diff --git a/src/egl/main/egldisplay.c b/src/egl/main/egldisplay.c
> > index 3d4eb81..201cf7b 100644
> > --- a/src/egl/main/egldisplay.c
> > +++ b/src/egl/main/egldisplay.c
> > @@ -474,7 +474,7 @@ _eglUnlinkResource(_EGLResource *res, _EGLResourceType 
> > type)
> >  static EGLBoolean
> >  _eglParseX11DisplayAttribList(const EGLint *attrib_list)
> >  {
> > -   int i;
> > +   unsigned int i;
> > 
> > if (attrib_list == NULL) {
> >return EGL_TRUE;
> > diff --git a/src/egl/main/eglimage.c b/src/egl/main/eglimage.c
> > index 818b597..44dbfab 100644
> > --- a/src/egl/main/eglimage.c
> > +++ b/src/egl/main/eglimage.c
> > @@ -41,7 +41,8 @@ EGLint
> >  _eglParseImageAttribList(_EGLImageAttribs *attrs, _EGLDisplay *dpy,
> >   const EGLint *attrib_list)
> >  {
> > -   EGLint i, err = EGL_SUCCESS;
> > +   EGLint err = EGL_SUCCESS;
> > +   unsigned int i;
> > 
> > (void) dpy;
> > 
> > diff --git a/src/egl/main/eglsurface.c b/src/egl/main/eglsurface.c
> > index 231a5f0..37ede3e 100644
> > --- a/src/egl/main/eglsurface.c
> > +++ b/src/egl/main/eglsurface.c
> > @@ -70,9 +70,10 @@ _eglParseSurfaceAttribList(_EGLSurface *surf, const 
> > EGLint *attrib_list)
> > _EGLDisplay *dpy = surf->Resource.Display;
> > EGLint type = surf->Type;
> > EGLint texture_type = EGL_PBUFFER_BIT;
> > -   EGLint i, err = EGL_SUCCESS;
> > +   EGLint err = EGL_SUCCESS;
> > EGLint attr = EGL_NONE;
> > EGLint val = EGL_NONE;
> > +   unsigned int i;
> > 
> > if (!attrib_list)
> >return EGL_SUCCESS;
> > diff --git a/src/egl/main/eglsync.c b/src/egl/main/eglsync.c
> > index 33625e9..df313cb 100644
> > --- a/src/egl/main/eglsync.c
> > +++ b/src/egl/main/eglsync.c
> > @@ -40,7 +40,8 @@
> >  static EGLint
> >  _eglParseSyncAttribList(_EGLSync *sync, const EGLint *attrib_list)
> >  {
> > -   EGLint i, err = EGL_SUCCESS;
> > +   EGLint err = EGL_SUCCESS;
> > +   unsigned int i;
> > 
> > if (!attrib_list)
> >return EGL_SUCCESS;
> > @@ -69,7 +70,8 @@ _eglParseSyncAttribList(_EGLSync *sync, const EGLint 
> > *attrib_list)
> >  static EGLint
> >  _eglParseSyncAttribList64(_EGLSync *sync, const EGLAttrib *attrib_list)
> >

[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)

2016-09-27 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97879

--- Comment #25 from Silvan Jegen  ---
Created attachment 126813
  --> https://bugs.freedesktop.org/attachment.cgi?id=126813&action=edit
perf report of RocketLeague stalling/freezing, including callgraphs

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)

2016-09-27 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97879

--- Comment #26 from Silvan Jegen  ---
I generated the report and uploaded the result.

I may have hit a bug in perf while doing it. "perf report" was blocking on a
pread64 call on /dev/dri/card0 when creating the report. I had to use gdb to
close the associated file descriptor to get perf to continue creating the
report but it seems to have worked.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [PATCH] gallium/r300: initialize pipe_resource::next to NULL

2016-09-27 Thread Rob Clark

Signed-off-by: Rob Clark 
---
I had a scan through the rest of pipe_resource allocations, and I think
this is the only remaining one (besides r600_alloc_buffer_struct())
which was using MALLOC_STRUCT()..  sorry 'bout that

 src/gallium/drivers/r300/r300_screen_buffer.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/r300/r300_screen_buffer.c 
b/src/gallium/drivers/r300/r300_screen_buffer.c
index 4747058..24dd92f 100644
--- a/src/gallium/drivers/r300/r300_screen_buffer.c
+++ b/src/gallium/drivers/r300/r300_screen_buffer.c
@@ -163,6 +163,7 @@ struct pipe_resource *r300_buffer_create(struct pipe_screen 
*screen,
 rbuf = MALLOC_STRUCT(r300_resource);
 
 rbuf->b.b = *templ;
+rbuf->b.b.next = NULL;
 rbuf->b.vtbl = &r300_buffer_vtbl;
 pipe_reference_init(&rbuf->b.b.reference, 1);
 rbuf->b.b.screen = screen;
-- 
2.7.4
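
A stripped-down illustration of why the extra assignment matters: the struct copy from the template overwrites every field of the new resource, including `next`, so the new object would otherwise carry whatever the template pointed at. The types and function below are hypothetical stand-ins, not the real `pipe_resource` or `r300_buffer_create()`.

```c
#include <stdlib.h>

/* Hypothetical, minimal stand-in for a chained resource. */
struct demo_resource {
   struct demo_resource *next;
   unsigned width0;
};

/* Create a resource from a template. Returns NULL on allocation failure. */
static struct demo_resource *
demo_resource_create(const struct demo_resource *templ)
{
   struct demo_resource *res = malloc(sizeof(*res));
   if (!res)
      return NULL;

   *res = *templ;     /* copies every field, including templ->next */
   res->next = NULL;  /* a freshly created resource starts unchained */
   return res;
}
```

The caller owns the returned object and releases it with `free()` in this sketch; the real driver uses reference counting instead.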


Re: [Mesa-dev] [PATCH] gallium/r300: initialize pipe_resource::next to NULL

2016-09-27 Thread Marek Olšák

Reviewed-by: Marek Olšák 

Marek

On Tue, Sep 27, 2016 at 5:33 PM, Rob Clark  wrote:
> Signed-off-by: Rob Clark 
> ---
> I had a scan through the rest of pipe_resource allocations, and I think
> this is the only remaining one (besides r600_alloc_buffer_struct())
> which was using MALLOC_STRUCT()..  sorry 'bout that
>
>  src/gallium/drivers/r300/r300_screen_buffer.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/gallium/drivers/r300/r300_screen_buffer.c 
> b/src/gallium/drivers/r300/r300_screen_buffer.c
> index 4747058..24dd92f 100644
> --- a/src/gallium/drivers/r300/r300_screen_buffer.c
> +++ b/src/gallium/drivers/r300/r300_screen_buffer.c
> @@ -163,6 +163,7 @@ struct pipe_resource *r300_buffer_create(struct 
> pipe_screen *screen,
>  rbuf = MALLOC_STRUCT(r300_resource);
>
>  rbuf->b.b = *templ;
> +rbuf->b.b.next = NULL;
>  rbuf->b.vtbl = &r300_buffer_vtbl;
>  pipe_reference_init(&rbuf->b.b.reference, 1);
>  rbuf->b.b.screen = screen;
> --
> 2.7.4
>

[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)

2016-09-27 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97879

--- Comment #27 from Eero Tamminen  ---
(In reply to Silvan Jegen from comment #22)
> Created attachment 126796 [details]
> perf report of RocketLeague stalling/freezing

Overview:
-
 74.86% RocketLeague
     76.73% RocketLeague
     16.33% [kernel.vmlinux]
      2.05% libc-2.24.so
      1.79% libpthread-2.24.so
      1.59% [amdgpu]

 10.22% RenderingThread
     46.08% radeonsi_dri.so
     22.35% RocketLeague
     15.32% libc-2.24.so
      8.99% libpthread-2.24.so
      2.07% [amdgpu]
      1.50% libGL.so.1.2.0
      1.50% [kernel.vmlinux]

  6.86% swapper
     97.25% [kernel.vmlinux]
      2.58% [xhci_hcd]

  6.81% AsyncIOSystem
     99.83% RocketLeague
-

-> If the perf data is just from the freeze(s), I'm not sure the problem here
is the compiler.

(With the "-n" option given to "perf report" this overview could be much more
accurate than calculating the info from the rounded percentages in the
attachment.)
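
As a rough cross-check of the overview, the absolute share of a library can be estimated by multiplying the thread's share by the library's share within that thread. This assumes the second-level percentages are relative to their thread, which is how the numbers above appear to be grouped; `demo_absolute_share` is a hypothetical helper.

```c
/* Convert a nested percentage (library within a thread) into a
 * percentage of all samples. Both inputs and the result are in percent. */
static double
demo_absolute_share(double thread_pct, double dso_pct)
{
   return thread_pct * dso_pct / 100.0;
}
```

Under that assumption, radeonsi_dri.so in RenderingThread accounts for about 10.22 x 46.08 / 100 = 4.7% of all samples, while the game binary in its own thread accounts for about 57.4%. Because the inputs are rounded, these figures are approximate.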

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH] st/va: Fix vaSyncSurface with no outstanding operation

2016-09-27 Thread Andy Furniss


Mark Thompson wrote:

On 27/09/16 00:49, Andy Furniss wrote:

Mark Thompson wrote:

---
A simple fix to the problem described here: 
.

With this applied, the driver no longer hangs/crashes when vaSyncSurface() is 
called in places other than for the first time after an encode operation 
(including a second call on the same surface).


Once I could get ffmpeg (patched) or avconv to roughly work (before the dual 
instance commit), but I can't get either to work now = produces unreadable file.

Testing with git avconv I am trying -

./avconv -vaapi_device :0 -f rawvideo -framerate 50 -s 2560x1440 -pix_fmt nv12 
-i /mnt/ramdisk/trees-1440p50.nv12 -vframes 5 -vf 'hwupload' -c:v h264_vaapi 
-profile:v 66 -b:v 40M  -bf 0 -g 30  -f h264 -y /mnt/ramdisk/out.264

but debugging printfs show refs = 2 and bframes enabled (I also notice with 
your baseline patch that -profile:v 66 fails).

Do you have an example that works for you with avconv + this patch?


Yes: this patch 
 is 
also required to match the vaSyncSurface() change.  The rest of that series to 
libav and the one to mesa for config setup makes it all a bit more sensible (doesn't 
submit a load of packed headers which are ignored), but it does mostly work without.


Ok, thanks, so with that I am back to where I was before it stopped working.

In summary baseline works but JM ref decoder doesn't like the pocs.

b frames don't work properly, but then they don't with gst vaapi either. 
They do work with gst omx.


Looking at output from printfs some differences I see vs gstreamer.

maxrefs is hardcoded to 2 which has side effects =

enc_pic.pc.enc_b_pic_pattern = 1 vs 0 - seems harmless in practice.

There is code that for my h/w disables dual instance when maxrefs > 1 
which means half speed, but there seems to be a bottleneck elsewhere 
that makes avconv 3x slower than gstreamer anyway.


gop: it seems that avconv with -g doesn't set h264->intra_idr_period in 
handleVAEncSequenceParameterBufferType which gets used to set 
context->desc.h264enc.gop_size and enc_pic.rc.gop_size


pocs: gstreamer increments h264->CurrPic.TopFieldOrderCnt in 2s, avconv in 
1s. The code divides this by 2 in handleVAEncPictureParameterBufferType
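
As an illustration of the POC point above, here is a minimal sketch, assuming the "divide by 2" step is plain integer division. The helper name is hypothetical and this is not the st/va code itself.

```c
#include <assert.h>

/* Hypothetical helper mirroring the "divides this by 2" step described
 * above. It is an illustration only, not the actual st/va code. */
static int poc_after_halving(int top_field_order_cnt)
{
   assert(top_field_order_cnt >= 0);
   return top_field_order_cnt / 2;   /* integer division truncates */
}
```

With gstreamer-style input (0, 2, 4, 6) this yields 0, 1, 2, 3. With avconv-style input (0, 1, 2, 3) it yields 0, 0, 1, 1, so pairs of frames end up sharing a value. That might be related to the decoder complaint mentioned earlier, but that link is a guess.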




With all of those, the commands:

./avconv -y -vaapi_device /dev/dri/renderD129 -i in.mp4 -an -vf 
'format=nv12,hwupload' -c:v h264_vaapi -bf 0 out.mp4

./avconv -y -vaapi_device /dev/dri/renderD129 -hwaccel vaapi 
-hwaccel_output_format vaapi -i in.mp4 -an -c:v h264_vaapi -bf 0 out.mp4

./avconv -y -vaapi_device /dev/dri/renderD129 -hwaccel vaapi 
-hwaccel_output_format vaapi -i in.mp4 -an -vf 'scale_vaapi=w=1280:h=720' -c:v 
h264_vaapi -bf 0 out.mp4

work sensibly for me (also with -b for CBR, -qp for CQP, -g for GOP size); I 
imagine raw video as in your example would also be fine.  On profile, 
constrained baseline on the command line is 578 (== 66 | 0x200, for 
constraint_set1_flag).
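
The profile arithmetic above can be checked with a small sketch. The macro and function names are made up for illustration; only the values 66 and 0x200 come from the message.

```c
#include <assert.h>

/* Values taken from the message above; the names are illustrative only. */
#define SKETCH_PROFILE_BASELINE     66      /* H.264 baseline profile_idc */
#define SKETCH_PROFILE_CONSTRAINED  0x200   /* marks constraint_set1_flag */

/* Command-line value for constrained baseline. */
static int constrained_baseline_profile(void)
{
   return SKETCH_PROFILE_BASELINE | SKETCH_PROFILE_CONSTRAINED;
}

/* Strip the constraint bit again to recover the plain profile_idc. */
static int profile_idc(int profile)
{
   assert(profile >= 0);
   return profile & ~SKETCH_PROFILE_CONSTRAINED;
}
```

Here constrained_baseline_profile() evaluates to 578, and profile_idc(578) gives back 66.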

Thanks,

- Mark





Re: [Mesa-dev] [PATCH] egl: use unsigned int index when iterating over attrib_list

2016-09-27 Thread Emil Velikov

On 27 September 2016 at 16:18, Eric Engestrom  wrote:
> On Tue, Sep 27, 2016 at 04:10:53PM +0200, Nicolai Hähnle wrote:
>> On 27.09.2016 14:40, Emil Velikov wrote:
>> > From: Emil Velikov 
>> >
>> > Otherwise one can overflow the signed variable and (attempt to) cause
>> > all sorts of strange behaviour.
>>
>> As long as we're worrying about such things, shouldn't it really be a size_t
>> then? With that,
>
> Agreed, and you can also have my r-b.
>
> One question though: why these specific `i`s? There are plenty more `i`s
> (in these files) that could use the same treatment, not to mention other
> variables.
> It's not as if these are the most overflow-critical either: I'm pretty
> sure if we have >INT_MAX attributes, we have more pressing problems than
> overflowing the attrib counter :P
The gripe is about (possible) intentional abuse of the attrib_list,
which one could use to read/modify the stack*. Nobody in their
right mind is (should be) using more than UINT_MAX attributes, so
size_t won't bring much. But if you insist...

I went ahead with a simple grep for EGL_NONE although one could
expand things throughout egl (and mesa as a whole). Feel free to
pursue :-)

-Emil
* Haven't bothered coming up with specific attack and I'm not 100%
sure it's possible in all the cases.
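
A minimal sketch of the kind of loop under discussion, assuming an EGL-style list of name/value pairs terminated by EGL_NONE. The helper is hypothetical and is not Mesa code; the terminator is defined locally (EGL_NONE is 0x3038 in the EGL headers) so the sketch builds without EGL installed.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the terminator value so no EGL header is needed. */
#define SKETCH_EGL_NONE 0x3038

/* Hypothetical helper, not Mesa code: counts the name/value pairs in an
 * EGL-style attribute list. The index is a size_t, so it cannot overflow
 * into a negative value the way a signed int index could. */
static size_t count_attrib_pairs(const int *attrib_list)
{
   size_t i = 0;

   if (attrib_list == NULL)
      return 0;

   while (attrib_list[i] != SKETCH_EGL_NONE)
      i += 2;   /* skip the attribute name and its value */

   assert(i % 2 == 0);   /* the index only ever advances in pairs */
   return i / 2;
}
```

Like the code being patched, this still trusts the caller to terminate the list; an unsigned index only removes the signed-overflow case.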

[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)

2016-09-27 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97879

--- Comment #28 from Micael Bergeron  ---
I can reproduce this every time. 
Using amdgpu-pro seems to fix the problem.

I asked Psyonix for the debug symbols but had no response.
http://psyonix.com/forum/viewtopic.php?f=36&t=27894

I opened a bug report on Rocket League for the menu freeze.
https://support.rocketleaguegame.com/hc/en-us/requests/1840

Shaders/materials seem to load very slowly compared to the Windows version.

Software:
Linux 4.7.4-1-ARCH #1 SMP PREEMPT Thu Sep 15 15:24:29 CEST 2016 x86_64
GNU/Linux
mesa 12.0.3-1


Re: [Mesa-dev] [PATCH v4 3/7] intel/isl: Allow creation of 1-D compressed textures

2016-09-27 Thread Jason Ekstrand

On Fri, Sep 23, 2016 at 9:52 AM, Nanley Chery  wrote:

> On Fri, Sep 23, 2016 at 12:17:19AM -0700, Jason Ekstrand wrote:
> > Compressed 1-D textures are not a well-defined thing in either GL or
> > Vulkan.
> > However, auxiliary surfaces are treated as compressed textures in ISL and
> > we can do HiZ and CCS with 1-D so we need to be able to create them.  In
> > order to prevent actually using them (the docs say no), we assert in the
> > state setup code.
> >
>
> Thanks for updating this commit message!
>
> > Signed-off-by: Jason Ekstrand 
> > ---
> >  src/intel/isl/isl.c   | 12 +++-
> >  src/intel/isl/isl_surface_state.c |  9 +
> >  2 files changed, 16 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > index a75fddf..710c990 100644
> > --- a/src/intel/isl/isl.c
> > +++ b/src/intel/isl/isl.c
> > @@ -518,7 +518,6 @@ isl_calc_phys_level0_extent_sa(const struct
> isl_device *dev,
> >assert(info->height == 1);
> >assert(info->depth == 1);
> >assert(info->samples == 1);
> > -  assert(!isl_format_is_compressed(info->format));
> >
> >switch (dim_layout) {
> >case ISL_DIM_LAYOUT_GEN4_3D:
> > @@ -527,8 +526,8 @@ isl_calc_phys_level0_extent_sa(const struct
> isl_device *dev,
> >case ISL_DIM_LAYOUT_GEN9_1D:
> >case ISL_DIM_LAYOUT_GEN4_2D:
> >   *phys_level0_sa = (struct isl_extent4d) {
> > -.w = info->width,
> > -.h = 1,
> > +.w = isl_align_npot(info->width, fmtl->bw),
> > +.h = fmtl->bh,
> >  .d = 1,
> >  .a = info->array_len,
> >   };
> > @@ -757,7 +756,7 @@ isl_calc_phys_slice0_extent_sa_gen9_1d(
> >  {
> > MAYBE_UNUSED const struct isl_format_layout *fmtl =
> isl_format_get_layout(info->format);
> >
> > -   assert(phys_level0_sa->height == 1);
> > +   assert(phys_level0_sa->height == fmtl->bh);
> > assert(phys_level0_sa->depth == 1);
> > assert(info->samples == 1);
> > assert(image_align_sa->w >= fmtl->bw);
> > @@ -1567,9 +1566,12 @@ get_image_offset_sa_gen9_1d(const struct
> isl_surf *surf,
> >  uint32_t *x_offset_sa,
> >  uint32_t *y_offset_sa)
> >  {
> > +   MAYBE_UNUSED const struct isl_format_layout *fmtl =
> > +  isl_format_get_layout(surf->format);
> > +
> > assert(level < surf->levels);
> > assert(layer < surf->phys_level0_sa.array_len);
> > -   assert(surf->phys_level0_sa.height == 1);
> > +   assert(surf->phys_level0_sa.height == fmtl->bh);
> > assert(surf->phys_level0_sa.depth == 1);
> > assert(surf->samples == 1);
> >
>
> As mentioned in my previous reply, I no longer think we should update
> get_image_offset_sa_gen9_1d() and
> isl_calc_phys_slice0_extent_sa_gen9_1d() as auxiliary surfaces won't
> have this layout.
>

Right... I didn't match your original comment with why until I thought
about it a bit more.  I dropped those hunks.


> > diff --git a/src/intel/isl/isl_surface_state.c
> b/src/intel/isl/isl_surface_state.c
> > index 979e140..210308c 100644
> > --- a/src/intel/isl/isl_surface_state.c
> > +++ b/src/intel/isl/isl_surface_state.c
> > @@ -215,6 +215,15 @@ isl_genX(surf_fill_state_s)(const struct
> isl_device *dev, void *state,
> >assert(isl_format_supports_rendering(dev->info,
> info->view->format));
> > else if (info->view->usage & ISL_SURF_USAGE_TEXTURE_BIT)
> >assert(isl_format_supports_sampling(dev->info,
> info->view->format));
> > +
> > +   /* From the Sky Lake PRM Vol. 2d, RENDER_SURFACE_STATE::
> SurfaceFormat
> > +*
> > +*This field cannot be a compressed (BC*, DXT*, FXT*, ETC*, EAC*)
> > +*format if the Surface Type is SURFTYPE_1D
> > +*/
> > +   if (info->surf->dim == ISL_SURF_DIM_1D)
> > +  assert(!isl_format_is_compressed(info->view->format));
> > +
>
> Thanks for adding this assertion! Placing it in get_surftype()
> may be a better fit as it already has assertions against 1D images used
> as cube maps.
>
> > s.SurfaceFormat = info->view->format;
> >
> >  #if GEN_IS_HASWELL
> > --
> > 2.5.0.400.gff86faf
> >

[Mesa-dev] [PATCH v3 1/8] intel/isl: Add a format_supports_multisampling helper

2016-09-27 Thread Jason Ekstrand

Signed-off-by: Jason Ekstrand 
Reviewed-by: Chad Versace 
Reviewed-by: Nanley Chery 
---
 src/intel/isl/isl.h|  2 ++
 src/intel/isl/isl_format.c | 28 
 src/intel/isl/isl_gen6.c   | 19 +--
 src/intel/isl/isl_gen7.c   | 16 +---
 src/intel/isl/isl_gen8.c   |  4 +---
 5 files changed, 33 insertions(+), 36 deletions(-)

diff --git a/src/intel/isl/isl.h b/src/intel/isl/isl.h
index ecedc05..cb7c22d 100644
--- a/src/intel/isl/isl.h
+++ b/src/intel/isl/isl.h
@@ -989,6 +989,8 @@ bool isl_format_supports_vertex_fetch(const struct 
brw_device_info *devinfo,
   enum isl_format format);
 bool isl_format_supports_lossless_compression(const struct brw_device_info 
*devinfo,
   enum isl_format format);
+bool isl_format_supports_multisampling(const struct brw_device_info *devinfo,
+   enum isl_format format);
 
 bool isl_format_has_unorm_channel(enum isl_format fmt) ATTRIBUTE_CONST;
 bool isl_format_has_snorm_channel(enum isl_format fmt) ATTRIBUTE_CONST;
diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
index 8507cc5..f3429be 100644
--- a/src/intel/isl/isl_format.c
+++ b/src/intel/isl/isl_format.c
@@ -429,6 +429,34 @@ isl_format_supports_lossless_compression(const struct 
brw_device_info *devinfo,
return format_gen(devinfo) >= format_info[format].lossless_compression;
 }
 
+bool
+isl_format_supports_multisampling(const struct brw_device_info *devinfo,
+  enum isl_format format)
+{
+   /* From the Sandybridge PRM, Volume 4 Part 1 p72, SURFACE_STATE, Surface
+* Format:
+*
+*If Number of Multisamples is set to a value other than
+*MULTISAMPLECOUNT_1, this field cannot be set to the following
+*formats:
+*
+*   - any format with greater than 64 bits per element
+*   - any compressed texture format (BC*)
+*   - any YCRCB* format
+*
+* The restriction on the format's size is removed on Broadwell.
+*/
+   if (devinfo->gen < 8 && isl_format_get_layout(format)->bpb > 64) {
+  return false;
+   } else if (isl_format_is_compressed(format)) {
+  return false;
+   } else if (isl_format_is_yuv(format)) {
+  return false;
+   } else {
+  return true;
+   }
+}
+
 static inline bool
 isl_format_has_channel_type(enum isl_format fmt, enum isl_base_type type)
 {
diff --git a/src/intel/isl/isl_gen6.c b/src/intel/isl/isl_gen6.c
index 2c52e38..b30998d 100644
--- a/src/intel/isl/isl_gen6.c
+++ b/src/intel/isl/isl_gen6.c
@@ -30,8 +30,6 @@ gen6_choose_msaa_layout(const struct isl_device *dev,
   enum isl_tiling tiling,
   enum isl_msaa_layout *msaa_layout)
 {
-   const struct isl_format_layout *fmtl = isl_format_get_layout(info->format);
-
assert(ISL_DEV_GEN(dev) == 6);
assert(info->samples >= 1);
 
@@ -40,22 +38,7 @@ gen6_choose_msaa_layout(const struct isl_device *dev,
   return false;
}
 
-   /* From the Sandybridge PRM, Volume 4 Part 1 p72, SURFACE_STATE, Surface
-* Format:
-*
-*If Number of Multisamples is set to a value other than
-*MULTISAMPLECOUNT_1, this field cannot be set to the following
-*formats:
-*
-*   - any format with greater than 64 bits per element
-*   - any compressed texture format (BC*)
-*   - any YCRCB* format
-*/
-   if (fmtl->bpb > 64)
-  return false;
-   if (isl_format_is_compressed(info->format))
-  return false;
-   if (isl_format_is_yuv(info->format))
+   if (!isl_format_supports_multisampling(dev->info, info->format))
   return false;
 
/* From the Sandybridge PRM, Volume 4 Part 1 p85, SURFACE_STATE, Number of
diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
index 02273f8..7b40291 100644
--- a/src/intel/isl/isl_gen7.c
+++ b/src/intel/isl/isl_gen7.c
@@ -30,8 +30,6 @@ gen7_choose_msaa_layout(const struct isl_device *dev,
 enum isl_tiling tiling,
 enum isl_msaa_layout *msaa_layout)
 {
-   const struct isl_format_layout *fmtl = isl_format_get_layout(info->format);
-
bool require_array = false;
bool require_interleaved = false;
 
@@ -43,19 +41,7 @@ gen7_choose_msaa_layout(const struct isl_device *dev,
   return true;
}
 
-   /* From the Ivybridge PRM, Volume 4 Part 1 p63, SURFACE_STATE, Surface
-* Format:
-*
-*If Number of Multisamples is set to a value other than
-*MULTISAMPLECOUNT_1, this field cannot be set to the following
-*formats: any format with greater than 64 bits per element, any
-*compressed texture format (BC*), and any YCRCB* format.
-*/
-   if (fmtl->bpb > 64)
-  return false;
-   if (isl_format_is_compressed(info->format))
-  return false;
-   if (isl_format_is_yuv(info->format))
+   if (!isl_format_supports_multisampling(dev->info, i
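
The rule that this patch centralizes can be sketched standalone as follows. The struct and function names are hypothetical stand-ins; the real helper queries isl_format_get_layout(), isl_format_is_compressed() and isl_format_is_yuv() rather than taking a struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-format facts the helper consults. */
struct fmt_sketch {
   unsigned bpb;      /* bits per block (element) */
   bool compressed;   /* compressed texture format (BC* etc.) */
   bool yuv;          /* YCRCB* format */
};

/* Sketch of the predicate as it appears in the hunk above. */
static bool fmt_supports_multisampling(int gen, const struct fmt_sketch *fmt)
{
   assert(fmt != NULL);

   /* The >64 bits-per-element restriction only applies before Broadwell. */
   if (gen < 8 && fmt->bpb > 64)
      return false;

   if (fmt->compressed || fmt->yuv)
      return false;

   return true;
}
```

Under these assumptions a 128-bit uncompressed format is rejected on gen 7 but accepted on gen 8, while a compressed format is rejected on every generation.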

[Mesa-dev] [PATCH v3 7/8] intel/isl: Add a detailed comment about multisampling with HiZ

2016-09-27 Thread Jason Ekstrand

Signed-off-by: Jason Ekstrand 
Reviewed-by: Chad Versace 
Reviewed-by: Nanley Chery 
---
 src/intel/isl/isl.c | 60 +++--
 1 file changed, 58 insertions(+), 2 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index ee5330e..749d228 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1288,6 +1288,63 @@ isl_surf_get_hiz_surf(const struct isl_device *dev,
assert(surf->msaa_layout == ISL_MSAA_LAYOUT_NONE ||
   surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED);
 
+   /* From the Broadwell PRM Vol. 7, "Hierarchical Depth Buffer":
+*
+*"The Surface Type, Height, Width, Depth, Minimum Array Element, Render
+*Target View Extent, and Depth Coordinate Offset X/Y of the
+*hierarchical depth buffer are inherited from the depth buffer. The
+*height and width of the hierarchical depth buffer that must be
+*allocated are computed by the following formulas, where HZ is the
+*hierarchical depth buffer and Z is the depth buffer. The Z_Height,
+*Z_Width, and Z_Depth values given in these formulas are those present
+*in 3DSTATE_DEPTH_BUFFER incremented by one.
+*
+*"The value of Z_Height and Z_Width must each be multiplied by 2 before
+*being applied to the table below if Number of Multisamples is set to
+*NUMSAMPLES_4. The value of Z_Height must be multiplied by 2 and
+*Z_Width must be multiplied by 4 before being applied to the table
+*below if Number of Multisamples is set to NUMSAMPLES_8."
+*
+* In the Sky Lake PRM, the second paragraph is replaced with this:
+*
+*"The Z_Height and Z_Width values must equal those present in
+*3DSTATE_DEPTH_BUFFER incremented by one."
+*
+* In other words, on Sandy Bridge through Broadwell, each 128-bit HiZ
+* block corresponds to a region of 8x4 samples in the primary depth
+* surface.  On Sky Lake, on the other hand, each HiZ block corresponds to
+* a region of 8x4 pixels in the primary depth surface regardless of the
+* number of samples.  The dimensions of a HiZ block in both pixels and
+* samples are given in the table below:
+*
+*| SNB - BDW | SKL+
+*  --+---+-
+*1x  |  8 x 4 sa |   8 x 4 sa
+*   MSAA |  8 x 4 px |   8 x 4 px
+*  --+---+-
+*2x  |  8 x 4 sa |  16 x 4 sa
+*   MSAA |  4 x 4 px |   8 x 4 px
+*  --+---+-
+*4x  |  8 x 4 sa |  16 x 8 sa
+*   MSAA |  4 x 2 px |   8 x 4 px
+*  --+---+-
+*8x  |  8 x 4 sa |  32 x 8 sa
+*   MSAA |  2 x 2 px |   8 x 4 px
+*  --+---+-
+*   16x  |N/A| 32 x 16 sa
+*   MSAA |N/A|  8 x  4 px
+*  --+---+-
+*
+* There are a number of different ways that this discrepancy could be
+* handled.  The way we have chosen is to simply make MSAA HiZ have the
+* same number of samples as the parent surface pre-Sky Lake and always be
+* single-sampled on Sky Lake and above.  Since the block sizes of
+* compressed formats are given in samples, this neatly handles everything
+* without the need for additional HiZ formats with different block sizes
+* on SKL+.
+*/
+   const unsigned samples = ISL_DEV_GEN(dev) >= 9 ? 1 : surf->samples;
+
isl_surf_init(dev, hiz_surf,
  .dim = ISL_SURF_DIM_2D,
  .format = ISL_FORMAT_HIZ,
@@ -1296,8 +1353,7 @@ isl_surf_get_hiz_surf(const struct isl_device *dev,
  .depth = 1,
  .levels = surf->levels,
  .array_len = surf->logical_level0_px.array_len,
- /* On SKL+, HiZ is always single-sampled */
- .samples = ISL_DEV_GEN(dev) >= 9 ? 1 : surf->samples,
+ .samples = samples,
  .usage = ISL_SURF_USAGE_HIZ_BIT,
  .tiling_flags = ISL_TILING_HIZ_BIT);
 }
-- 
2.5.0.400.gff86faf
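
The table in the comment above can be reproduced with a short sketch. It assumes interleaved-MSAA scale factors of 2x1, 2x2, 4x2 and 4x4 samples per pixel for 2x, 4x, 8x and 16x, which is what the table implies; all names below are invented for the example and are not ISL functions.

```c
#include <assert.h>

struct extent2d_sketch { unsigned w, h; };

/* Samples per pixel in x and y for interleaved MSAA (assumed values,
 * chosen to match the table in the patch comment). */
static struct extent2d_sketch msaa_scale(unsigned samples)
{
   switch (samples) {
   case 1:  return (struct extent2d_sketch){ 1, 1 };
   case 2:  return (struct extent2d_sketch){ 2, 1 };
   case 4:  return (struct extent2d_sketch){ 2, 2 };
   case 8:  return (struct extent2d_sketch){ 4, 2 };
   case 16: return (struct extent2d_sketch){ 4, 4 };
   default:
      assert(!"unsupported sample count");
      return (struct extent2d_sketch){ 1, 1 };
   }
}

/* Region of the depth surface, in pixels, covered by one HiZ block. */
static struct extent2d_sketch hiz_block_px(int gen, unsigned samples)
{
   const struct extent2d_sketch s = msaa_scale(samples);

   if (gen >= 9)
      return (struct extent2d_sketch){ 8, 4 };   /* SKL+: always 8x4 px */

   assert(samples <= 8);                          /* 16x is N/A before SKL */
   return (struct extent2d_sketch){ 8 / s.w, 4 / s.h };
}

/* The same region expressed in samples. */
static struct extent2d_sketch hiz_block_sa(int gen, unsigned samples)
{
   const struct extent2d_sketch s = msaa_scale(samples);
   const struct extent2d_sketch px = hiz_block_px(gen, samples);

   return (struct extent2d_sketch){ px.w * s.w, px.h * s.h };
}
```

For example, hiz_block_px(8, 4) gives 4 x 2 px and hiz_block_sa(9, 8) gives 32 x 8 sa, matching the 4x and 8x rows of the table.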


[Mesa-dev] [PATCH v3 3/8] intel/isl: Allow creation of 1-D compressed textures

2016-09-27 Thread Jason Ekstrand

Compressed 1-D textures are not a well-defined thing in either GL or Vulkan.
However, auxiliary surfaces are treated as compressed textures in ISL and
we can do HiZ and CCS with 1-D so we need to be able to create them.  In
order to prevent actually using them (the docs say no), we assert in the
state setup code.

Signed-off-by: Jason Ekstrand 
---
 src/intel/isl/isl.c   | 5 ++---
 src/intel/isl/isl_surface_state.c | 9 +
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index a75fddf..185984d 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -518,7 +518,6 @@ isl_calc_phys_level0_extent_sa(const struct isl_device *dev,
   assert(info->height == 1);
   assert(info->depth == 1);
   assert(info->samples == 1);
-  assert(!isl_format_is_compressed(info->format));
 
   switch (dim_layout) {
   case ISL_DIM_LAYOUT_GEN4_3D:
@@ -527,8 +526,8 @@ isl_calc_phys_level0_extent_sa(const struct isl_device *dev,
   case ISL_DIM_LAYOUT_GEN9_1D:
   case ISL_DIM_LAYOUT_GEN4_2D:
  *phys_level0_sa = (struct isl_extent4d) {
-.w = info->width,
-.h = 1,
+.w = isl_align_npot(info->width, fmtl->bw),
+.h = fmtl->bh,
 .d = 1,
 .a = info->array_len,
  };
diff --git a/src/intel/isl/isl_surface_state.c 
b/src/intel/isl/isl_surface_state.c
index 979e140..210308c 100644
--- a/src/intel/isl/isl_surface_state.c
+++ b/src/intel/isl/isl_surface_state.c
@@ -215,6 +215,15 @@ isl_genX(surf_fill_state_s)(const struct isl_device *dev, 
void *state,
   assert(isl_format_supports_rendering(dev->info, info->view->format));
else if (info->view->usage & ISL_SURF_USAGE_TEXTURE_BIT)
   assert(isl_format_supports_sampling(dev->info, info->view->format));
+
+   /* From the Sky Lake PRM Vol. 2d, RENDER_SURFACE_STATE::SurfaceFormat
+*
+*This field cannot be a compressed (BC*, DXT*, FXT*, ETC*, EAC*)
+*format if the Surface Type is SURFTYPE_1D
+*/
+   if (info->surf->dim == ISL_SURF_DIM_1D)
+  assert(!isl_format_is_compressed(info->view->format));
+
s.SurfaceFormat = info->view->format;
 
 #if GEN_IS_HASWELL
-- 
2.5.0.400.gff86faf
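
To make the effect of the two changed lines in the first hunk concrete, here is a minimal sketch. It assumes isl_align_npot() rounds up to a multiple of a value that need not be a power of two; align_npot_sketch() and phys_extent_1d() are stand-ins written for this example, not ISL functions.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to a multiple of a, where a need not be a power of two.
 * Stand-in for isl_align_npot(), written for this sketch. */
static uint32_t align_npot_sketch(uint32_t v, uint32_t a)
{
   assert(a > 0);
   return ((v + a - 1) / a) * a;
}

struct extent_sketch { uint32_t w, h; };

/* Physical level-0 extent, in samples, of a 1-D surface whose format has
 * a bw x bh block: width is padded to whole blocks, height is one block. */
static struct extent_sketch phys_extent_1d(uint32_t width,
                                           uint32_t bw, uint32_t bh)
{
   return (struct extent_sketch){ align_npot_sketch(width, bw), bh };
}
```

For an uncompressed format (1x1 block) this reduces to the old behaviour, width x 1. For a format with an 8x4 block, a 13-wide 1-D surface becomes 16 x 4.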


[Mesa-dev] [PATCH v3 5/8] intel/isl: Handle HiZ and CCS tiling more directly

2016-09-27 Thread Jason Ekstrand

The HiZ and CCS tiling formats are always used for HiZ and CCS surfaces
respectively.  There's no reason why we should go through filter_tiling and
it's much easier to always get HiZ and CCS right if we just handle them
directly.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Topi Pohjolainen 
Reviewed-by: Chad Versace 
Reviewed-by: Nanley Chery 
---
 src/intel/isl/isl.c  | 18 --
 src/intel/isl/isl_gen7.c | 14 --
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 33d7079..ee5330e 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -226,6 +226,22 @@ isl_surf_choose_tiling(const struct isl_device *dev,
 {
isl_tiling_flags_t tiling_flags = info->tiling_flags;
 
+   /* HiZ surfaces always use the HiZ tiling */
+   if (info->usage & ISL_SURF_USAGE_HIZ_BIT) {
+  assert(info->format == ISL_FORMAT_HIZ);
+  assert(tiling_flags == ISL_TILING_HIZ_BIT);
+  *tiling = ISL_TILING_HIZ;
+  return true;
+   }
+
+   /* CCS surfaces always use the CCS tiling */
+   if (info->usage & ISL_SURF_USAGE_CCS_BIT) {
+  assert(isl_format_get_layout(info->format)->txc == ISL_TXC_CCS);
+  assert(tiling_flags == ISL_TILING_CCS_BIT);
+  *tiling = ISL_TILING_CCS;
+  return true;
+   }
+
if (ISL_DEV_GEN(dev) >= 7) {
   gen7_filter_tiling(dev, info, &tiling_flags);
} else {
@@ -254,8 +270,6 @@ isl_surf_choose_tiling(const struct isl_device *dev,
   CHOOSE(ISL_TILING_LINEAR);
}
 
-   CHOOSE(ISL_TILING_CCS);
-   CHOOSE(ISL_TILING_HIZ);
CHOOSE(ISL_TILING_Ys);
CHOOSE(ISL_TILING_Yf);
CHOOSE(ISL_TILING_Y0);
diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
index 7b40291..316b51b 100644
--- a/src/intel/isl/isl_gen7.c
+++ b/src/intel/isl/isl_gen7.c
@@ -217,24 +217,10 @@ gen7_filter_tiling(const struct isl_device *dev,
   *flags &= ~ISL_TILING_W_BIT;
}
 
-   /* The HiZ format and tiling always go together */
-   if (info->format == ISL_FORMAT_HIZ) {
-  *flags &= ISL_TILING_HIZ_BIT;
-   } else {
-  *flags &= ~ISL_TILING_HIZ_BIT;
-   }
-
/* MCS buffers are always Y-tiled */
if (isl_format_get_layout(info->format)->txc == ISL_TXC_MCS)
   *flags &= ISL_TILING_Y0_BIT;
 
-   /* The CCS formats and tiling always go together */
-   if (isl_format_get_layout(info->format)->txc == ISL_TXC_CCS) {
-  *flags &= ISL_TILING_CCS_BIT;
-   } else {
-  *flags &= ~ISL_TILING_CCS_BIT;
-   }
-
if (info->usage & (ISL_SURF_USAGE_DISPLAY_ROTATE_90_BIT |
   ISL_SURF_USAGE_DISPLAY_ROTATE_180_BIT |
   ISL_SURF_USAGE_DISPLAY_ROTATE_270_BIT)) {
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v3 6/8] intel/isl: Remove tiling checks from choose_msaa_layout

2016-09-27 Thread Jason Ekstrand

We already do those checks in filter_tiling.  There's no good reason to
repeat them in choose_msaa_layout.  If anything they should have been
asserts and not "return false" checks.  Also, this check was causing us to
outright reject multisampled HiZ surfaces which wasn't intended.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Chad Versace 
Reviewed-by: Nanley Chery 
---
 src/intel/isl/isl_gen7.c | 10 +++---
 src/intel/isl/isl_gen8.c | 11 ---
 2 files changed, 7 insertions(+), 14 deletions(-)

diff --git a/src/intel/isl/isl_gen7.c b/src/intel/isl/isl_gen7.c
index 316b51b..5b4f0d4 100644
--- a/src/intel/isl/isl_gen7.c
+++ b/src/intel/isl/isl_gen7.c
@@ -249,9 +249,13 @@ gen7_filter_tiling(const struct isl_device *dev,
*   For multisample render targets, this field must be 1 (true). MSRTs
*   can only be tiled.
*
-   * Multisample surfaces never require X tiling, and Y tiling generally
-   * performs better than X. So choose Y. (Unless it's stencil, then it
-   * must be W).
+   * From the Broadwell PRM >> Volume2d: Command Structures >>
+   * RENDER_SURFACE_STATE Tile Mode:
+   *
+   *   If Number of Multisamples is not MULTISAMPLECOUNT_1, this field
+   *   must be YMAJOR.
+   *
+   * As usual, though, stencil is special and requires W-tiling.
*/
   *flags &= (ISL_TILING_ANY_Y_MASK | ISL_TILING_W_BIT);
}
diff --git a/src/intel/isl/isl_gen8.c b/src/intel/isl/isl_gen8.c
index 0049614..2d7f41f 100644
--- a/src/intel/isl/isl_gen8.c
+++ b/src/intel/isl/isl_gen8.c
@@ -41,17 +41,6 @@ gen8_choose_msaa_layout(const struct isl_device *dev,
}
 
/* From the Broadwell PRM >> Volume2d: Command Structures >>
-* RENDER_SURFACE_STATE Tile Mode:
-*
-*- If Number of Multisamples is not MULTISAMPLECOUNT_1, this field
-*  must be YMAJOR.
-*
-* As usual, though, stencil is special.
-*/
-   if (!isl_tiling_is_any_y(tiling) && !isl_surf_usage_is_stencil(info->usage))
-  return false;
-
-   /* From the Broadwell PRM >> Volume2d: Command Structures >>
 * RENDER_SURFACE_STATE Multisampled Surface Storage Format:
 *
 *All multisampled render target surfaces must have this field set to
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v3 8/8] intel/isl: Allow non-2D HiZ surfaces

2016-09-27 Thread Jason Ekstrand

---
 src/intel/isl/isl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 749d228..9735d26 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -1346,11 +1346,11 @@ isl_surf_get_hiz_surf(const struct isl_device *dev,
const unsigned samples = ISL_DEV_GEN(dev) >= 9 ? 1 : surf->samples;
 
isl_surf_init(dev, hiz_surf,
- .dim = ISL_SURF_DIM_2D,
+ .dim = surf->dim,
  .format = ISL_FORMAT_HIZ,
  .width = surf->logical_level0_px.width,
  .height = surf->logical_level0_px.height,
- .depth = 1,
+ .depth = surf->logical_level0_px.depth,
  .levels = surf->levels,
  .array_len = surf->logical_level0_px.array_len,
  .samples = samples,
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH v3 4/8] intel/isl: Allow multisampling with ISL_FORMAT_HiZ

2016-09-27 Thread Jason Ekstrand

HiZ buffers can be multisampled and, on Broadwell and earlier, simply using
interleaved multisampling with a compression block size of 8x4 samples
yields the correct HiZ surface size calculations.  Unfortunately,
choose_msaa_layout was rejecting multisampled HiZ buffers because of format
checks.  Now that we have a simple helper for determining if a format
supports multisampling, that's an easy enough issue to fix.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Chad Versace 
---
 src/intel/isl/isl.c|  4 +++-
 src/intel/isl/isl_format.c | 11 +--
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index 185984d..33d7079 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -572,7 +572,6 @@ isl_calc_phys_level0_extent_sa(const struct isl_device *dev,
  assert(info->depth == 1);
  assert(info->levels == 1);
  assert(isl_format_supports_multisampling(dev->info, info->format));
- assert(fmtl->bw == 1 && fmtl->bh == 1);
 
  *phys_level0_sa = (struct isl_extent4d) {
 .w = info->width,
@@ -584,6 +583,9 @@ isl_calc_phys_level0_extent_sa(const struct isl_device *dev,
  isl_msaa_interleaved_scale_px_to_sa(info->samples,
  &phys_level0_sa->w,
  &phys_level0_sa->h);
+
+ phys_level0_sa->w = isl_align(phys_level0_sa->w, fmtl->bw);
+ phys_level0_sa->h = isl_align(phys_level0_sa->h, fmtl->bh);
  break;
   }
   break;
diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
index f3429be..e373e49 100644
--- a/src/intel/isl/isl_format.c
+++ b/src/intel/isl/isl_format.c
@@ -444,9 +444,16 @@ isl_format_supports_multisampling(const struct 
brw_device_info *devinfo,
 *   - any compressed texture format (BC*)
 *   - any YCRCB* format
 *
-* The restriction on the format's size is removed on Broadwell.
+* The restriction on the format's size is removed on Broadwell.  Also,
+* there is an exception for HiZ which we treat as a compressed format and
+* is allowed to be multisampled on Broadwell and earlier.
 */
-   if (devinfo->gen < 8 && isl_format_get_layout(format)->bpb > 64) {
+   if (format == ISL_FORMAT_HIZ) {
+  /* On SKL+, HiZ is always single-sampled even when the primary surface
+   * is multisampled.  See also isl_surf_get_hiz_surf().
+   */
+  return devinfo->gen <= 8;
+   } else if (devinfo->gen < 8 && isl_format_get_layout(format)->bpb > 64) {
   return false;
} else if (isl_format_is_compressed(format)) {
   return false;
-- 
2.5.0.400.gff86faf
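
The size calculation this patch enables can be sketched as: scale pixels to samples for the interleaved layout, then round up to whole format blocks. This is a simplified illustration under assumed scale factors (2x1, 2x2, 4x2, 4x4 for 2x, 4x, 8x, 16x). The real isl_msaa_interleaved_scale_px_to_sa() may round differently, and both function names below are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. */
static uint32_t align_pot_sketch(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Converts a size in pixels (*w, *h) to a physical size in samples for an
 * interleaved MSAA surface whose format has a bw x bh block. */
static void interleaved_phys_extent(uint32_t samples,
                                    uint32_t bw, uint32_t bh,
                                    uint32_t *w, uint32_t *h)
{
   uint32_t sx, sy;   /* samples per pixel in x and y (assumed values) */

   switch (samples) {
   case 1:  sx = 1; sy = 1; break;
   case 2:  sx = 2; sy = 1; break;
   case 4:  sx = 2; sy = 2; break;
   case 8:  sx = 4; sy = 2; break;
   case 16: sx = 4; sy = 4; break;
   default:
      assert(!"unsupported sample count");
      sx = sy = 1;
      break;
   }

   /* Scale to samples first, then pad to whole blocks, as in the hunk. */
   *w = align_pot_sketch(*w * sx, bw);
   *h = align_pot_sketch(*h * sy, bh);
}
```

With 2x MSAA and an 8x4-sample block, a 101 x 50 px surface scales to 202 x 50 samples and is then padded to 208 x 52.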


[Mesa-dev] [PATCH v3 2/8] intel/isl: Fix up asserts in calc_phys_level0_extent_sa

2016-09-27 Thread Jason Ekstrand

The assertion that a format is uncompressed in the multisample layouts
isn't quite right.  What we really want to assert is that the format
supports multisampling which is a bit more complicated query.  We also want
to assert that it has a block size of 1x1 since we do nothing with the
block size in the phys_level0_sa assignment.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Topi Pohjolainen 
Reviewed-by: Chad Versace 
Reviewed-by: Nanley Chery 
---
 src/intel/isl/isl.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
index c460ddb..a75fddf 100644
--- a/src/intel/isl/isl.c
+++ b/src/intel/isl/isl.c
@@ -558,7 +558,8 @@ isl_calc_phys_level0_extent_sa(const struct isl_device *dev,
   case ISL_MSAA_LAYOUT_ARRAY:
  assert(info->depth == 1);
  assert(info->levels == 1);
- assert(!isl_format_is_compressed(info->format));
+ assert(isl_format_supports_multisampling(dev->info, info->format));
+ assert(fmtl->bw == 1 && fmtl->bh == 1);
 
  *phys_level0_sa = (struct isl_extent4d) {
 .w = info->width,
@@ -571,7 +572,8 @@ isl_calc_phys_level0_extent_sa(const struct isl_device *dev,
   case ISL_MSAA_LAYOUT_INTERLEAVED:
  assert(info->depth == 1);
  assert(info->levels == 1);
- assert(!isl_format_is_compressed(info->format));
+ assert(isl_format_supports_multisampling(dev->info, info->format));
+ assert(fmtl->bw == 1 && fmtl->bh == 1);
 
  *phys_level0_sa = (struct isl_extent4d) {
 .w = info->width,
-- 
2.5.0.400.gff86faf


Re: [Mesa-dev] [PATCH v3 3/8] intel/isl: Allow creation of 1-D compressed textures

2016-09-27 Thread Nanley Chery

On Tue, Sep 27, 2016 at 09:22:00AM -0700, Jason Ekstrand wrote:
> Compressed 1-D textures are not a well-defined thing in either GL or Vulkan.
> However, auxiliary surfaces are treated as compressed textures in ISL and
> we can do HiZ and CCS with 1-D so we need to be able to create them.  In
> order to prevent actually using them (the docs say no), we assert in the
> state setup code.
> 
> Signed-off-by: Jason Ekstrand 
> ---
>  src/intel/isl/isl.c   | 5 ++---
>  src/intel/isl/isl_surface_state.c | 9 +
>  2 files changed, 11 insertions(+), 3 deletions(-)

This patch is
Reviewed-by: Nanley Chery 

> 
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index a75fddf..185984d 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -518,7 +518,6 @@ isl_calc_phys_level0_extent_sa(const struct isl_device 
> *dev,
>assert(info->height == 1);
>assert(info->depth == 1);
>assert(info->samples == 1);
> -  assert(!isl_format_is_compressed(info->format));
>  
>switch (dim_layout) {
>case ISL_DIM_LAYOUT_GEN4_3D:
> @@ -527,8 +526,8 @@ isl_calc_phys_level0_extent_sa(const struct isl_device 
> *dev,
>case ISL_DIM_LAYOUT_GEN9_1D:
>case ISL_DIM_LAYOUT_GEN4_2D:
>   *phys_level0_sa = (struct isl_extent4d) {
> -.w = info->width,
> -.h = 1,
> +.w = isl_align_npot(info->width, fmtl->bw),
> +.h = fmtl->bh,
>  .d = 1,
>  .a = info->array_len,
>   };
> diff --git a/src/intel/isl/isl_surface_state.c 
> b/src/intel/isl/isl_surface_state.c
> index 979e140..210308c 100644
> --- a/src/intel/isl/isl_surface_state.c
> +++ b/src/intel/isl/isl_surface_state.c
> @@ -215,6 +215,15 @@ isl_genX(surf_fill_state_s)(const struct isl_device 
> *dev, void *state,
>assert(isl_format_supports_rendering(dev->info, info->view->format));
> else if (info->view->usage & ISL_SURF_USAGE_TEXTURE_BIT)
>assert(isl_format_supports_sampling(dev->info, info->view->format));
> +
> +   /* From the Sky Lake PRM Vol. 2d, RENDER_SURFACE_STATE::SurfaceFormat
> +*
> +*This field cannot be a compressed (BC*, DXT*, FXT*, ETC*, EAC*)
> +*format if the Surface Type is SURFTYPE_1D
> +*/
> +   if (info->surf->dim == ISL_SURF_DIM_1D)
> +  assert(!isl_format_is_compressed(info->view->format));
> +
> s.SurfaceFormat = info->view->format;
>  
>  #if GEN_IS_HASWELL
> -- 
> 2.5.0.400.gff86faf
> 

[Mesa-dev] [Bug 97952] /usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration

2016-09-27 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97952

Bug ID: 97952
   Summary: /usr/include/string.h:518:12: error: exception
specification in declaration does not match previous
declaration
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Keywords: bisected, regression
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org
CC: robcl...@freedesktop.org

mesa: 4421c0fb0dc7a51c3d639c452ad8a5d55a99cec1 (master 12.1.0-devel)

  Compiling src/compiler/glsl/link_varyings.cpp ...
In file included from src/compiler/glsl/link_varyings.cpp:33:
In file included from src/compiler/glsl/glsl_symbol_table.h:34:
In file included from src/compiler/glsl/ir.h:33:
In file included from src/compiler/glsl_types.h:29:
/usr/include/string.h:518:12: error: exception specification in declaration
does not match previous declaration
extern int ffs (int __i) __THROW __attribute__ ((__const__));
   ^
src/util/bitscan.h:51:13: note: expanded from macro 'ffs'
#define ffs __builtin_ffs
^
src/util/bitscan.h:96:18: note: previous declaration is here
   const int i = ffs(*mask) - 1;
 ^
src/util/bitscan.h:51:13: note: expanded from macro 'ffs'
#define ffs __builtin_ffs
^

ecd6fce2611e88ff8468a354cff8eda39f260a31 is the first bad commit
commit ecd6fce2611e88ff8468a354cff8eda39f260a31
Author: Rob Clark 
Date:   Wed Aug 31 17:44:01 2016 -0400

mesa/st: support lowering multi-planar YUV

Support multi-planar YUV for external EGLImage's (currently just in the
dma-buf import path) by lowering to multiple texture fetch's for each
plane and CSC in shader.

There was some discussion of alternative approaches for tracking the
additional UV or U/V planes:

 
https://lists.freedesktop.org/archives/mesa-dev/2016-September/127832.html

They all seemed worse than pipe_resource::next

Signed-off-by: Rob Clark 

:04 04 80840dcad2d468df7741746515b03247c41a1084
88f03e54885c7b5b1090fb09709e7daf66972941 M  src
bisect run success
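
The diagnostic above comes from `#define ffs __builtin_ffs` being active when the system header is read, so the prototype `extern int ffs (int __i) __THROW ...` is rewritten into a redeclaration of the builtin. One way to avoid that class of conflict, sketched here as an assumption and not as the fix that was applied, is to leave the name `ffs` alone and expose the builtin under a distinct name. `util_ffs` and `scan_bit` below are hypothetical, and the sketch assumes GCC or Clang:

```c
#include <assert.h>
#include <string.h> /* may declare ffs(); no macro rewrites that name here */

/* Hypothetical wrapper with a distinct name, so system headers that
 * declare ffs() are never altered by the preprocessor.
 * Returns the 1-based index of the lowest set bit, or 0 if v == 0.
 */
static inline int
util_ffs(int v)
{
   return __builtin_ffs(v);
}

/* Clear the lowest set bit of *mask and return its 0-based index. */
static inline int
scan_bit(unsigned *mask)
{
   assert(*mask != 0);
   const int i = util_ffs((int)*mask) - 1;
   *mask &= ~(1u << i);
   return i;
}
```

Because nothing is named `ffs` on the preprocessor level, the order in which this header and `<string.h>` are included no longer matters.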

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.

Re: [Mesa-dev] [PATCH v3 4/8] intel/isl: Allow multisampling with ISL_FORMAT_HiZ

2016-09-27 Thread Nanley Chery

On Tue, Sep 27, 2016 at 09:22:01AM -0700, Jason Ekstrand wrote:
> HiZ buffers can be multisampled and, on Broadwell and earlier, simply using
> interleaved multisampling with a compression block size of 8x4 samples
> yields the correct HiZ surface size calculations.  Unfortunately,
> choose_msaa_layout was rejecting multisampled HiZ buffers because of format
> checks.  Now that we have a simple helper for determining if a format
> supports multisampling, that's an easy enough issue to fix.
> 
> Signed-off-by: Jason Ekstrand 
> Reviewed-by: Chad Versace 
> ---
>  src/intel/isl/isl.c|  4 +++-
>  src/intel/isl/isl_format.c | 11 +--
>  2 files changed, 12 insertions(+), 3 deletions(-)
> 

This patch is
Reviewed-by: Nanley Chery 
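
To make the size calculation concrete, here is a rough sketch, not ISL's code. It scales a pixel extent to samples using the interleaved MSAA sample dimensions (2x1 for 2 samples, 2x2 for 4, 4x2 for 8, 4x4 for 16) and then rounds up to the HiZ block size. The helper names are made up for illustration, and the real `isl_msaa_interleaved_scale_px_to_sa` may apply additional padding that is omitted here:

```c
#include <assert.h>
#include <stdint.h>

struct extent2d {
   uint32_t w;
   uint32_t h;
};

/* Samples per pixel along each axis for interleaved MSAA. */
static struct extent2d
interleaved_px_size_sa(uint32_t samples)
{
   switch (samples) {
   case 1:  return (struct extent2d) { 1, 1 };
   case 2:  return (struct extent2d) { 2, 1 };
   case 4:  return (struct extent2d) { 2, 2 };
   case 8:  return (struct extent2d) { 4, 2 };
   case 16: return (struct extent2d) { 4, 4 };
   default:
      assert(!"unsupported sample count");
      return (struct extent2d) { 1, 1 };
   }
}

/* Round n up to a multiple of a. */
static uint32_t
align_up(uint32_t n, uint32_t a)
{
   return ((n + a - 1) / a) * a;
}

/* Size in samples of a HiZ surface treated as a compressed format with
 * bw x bh blocks (8x4 in the patch description).
 */
static struct extent2d
hiz_extent_sa(uint32_t width_px, uint32_t height_px, uint32_t samples,
              uint32_t bw, uint32_t bh)
{
   const struct extent2d s = interleaved_px_size_sa(samples);
   return (struct extent2d) {
      .w = align_up(width_px * s.w, bw),
      .h = align_up(height_px * s.h, bh),
   };
}
```

For a 10x10 px surface with 4 samples and 8x4 blocks this gives 24x20 samples; the single-sampled case gives 16x12.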

> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index 185984d..33d7079 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -572,7 +572,6 @@ isl_calc_phys_level0_extent_sa(const struct isl_device 
> *dev,
>   assert(info->depth == 1);
>   assert(info->levels == 1);
>   assert(isl_format_supports_multisampling(dev->info, info->format));
> - assert(fmtl->bw == 1 && fmtl->bh == 1);
>  
>   *phys_level0_sa = (struct isl_extent4d) {
>  .w = info->width,
> @@ -584,6 +583,9 @@ isl_calc_phys_level0_extent_sa(const struct isl_device 
> *dev,
>   isl_msaa_interleaved_scale_px_to_sa(info->samples,
>   &phys_level0_sa->w,
>   &phys_level0_sa->h);
> +
> + phys_level0_sa->w = isl_align(phys_level0_sa->w, fmtl->bw);
> + phys_level0_sa->h = isl_align(phys_level0_sa->h, fmtl->bh);
>   break;
>}
>break;
> diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
> index f3429be..e373e49 100644
> --- a/src/intel/isl/isl_format.c
> +++ b/src/intel/isl/isl_format.c
> @@ -444,9 +444,16 @@ isl_format_supports_multisampling(const struct 
> brw_device_info *devinfo,
>  *   - any compressed texture format (BC*)
>  *   - any YCRCB* format
>  *
> -* The restriction on the format's size is removed on Broadwell.
> +* The restriction on the format's size is removed on Broadwell.  Also,
> +* there is an exception for HiZ which we treat as a compressed format and
> +* is allowed to be multisampled on Broadwell and earlier.
>  */
> -   if (devinfo->gen < 8 && isl_format_get_layout(format)->bpb > 64) {
> +   if (format == ISL_FORMAT_HIZ) {
> +  /* On SKL+, HiZ is always single-sampled even when the primary surface
> +   * is multisampled.  See also isl_surf_get_hiz_surf().
> +   */
> +  return devinfo->gen <= 8;
> +   } else if (devinfo->gen < 8 && isl_format_get_layout(format)->bpb > 64) {
>return false;
> } else if (isl_format_is_compressed(format)) {
>return false;
> -- 
> 2.5.0.400.gff86faf
> 

Re: [Mesa-dev] [PATCH v3 8/8] intel/isl: Allow non-2D HiZ surfaces

2016-09-27 Thread Nanley Chery

On Tue, Sep 27, 2016 at 09:22:05AM -0700, Jason Ekstrand wrote:
> ---
>  src/intel/isl/isl.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 

This patch is
Reviewed-by: Nanley Chery 

> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index 749d228..9735d26 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -1346,11 +1346,11 @@ isl_surf_get_hiz_surf(const struct isl_device *dev,
> const unsigned samples = ISL_DEV_GEN(dev) >= 9 ? 1 : surf->samples;
>  
> isl_surf_init(dev, hiz_surf,
> - .dim = ISL_SURF_DIM_2D,
> + .dim = surf->dim,
>   .format = ISL_FORMAT_HIZ,
>   .width = surf->logical_level0_px.width,
>   .height = surf->logical_level0_px.height,
> - .depth = 1,
> + .depth = surf->logical_level0_px.depth,
>   .levels = surf->levels,
>   .array_len = surf->logical_level0_px.array_len,
>   .samples = samples,
> -- 
> 2.5.0.400.gff86faf
> 

Re: [Mesa-dev] [PATCH V2 07/11] anv/image: Memset hiz surfaces to 0 when binding memory

2016-09-27 Thread Chad Versace

On Mon 26 Sep 2016, Nanley Chery wrote:
> From: Jason Ekstrand 
> 
> Nanley Chery (amend):
>  - Change memset value from 0xff to 0 (a defined value for HiZ).
> 
> Signed-off-by: Nanley Chery 
> 
> ---
> 
> v2. Add asserts (Jason)
> Handle NULL return value of the mmap

Reviewed-by: Chad Versace 

Re: [Mesa-dev] [PATCH V2 10/11] genX/cmd_buffer: Enable fast depth clears

2016-09-27 Thread Chad Versace

On Mon 26 Sep 2016, Nanley Chery wrote:
> From: Nanley Chery 
> 
> Provides an FPS increase of ~30% on the Sascha triangle and multisampling
> demos.
> 
> Clears that happen within a render pass via vkCmdClearAttachments are safe
> even if the clear color changes. This is because the meta implementation does
> not use LOAD_OP_CLEAR which avoids any conflicts with 3DSTATE_CLEAR_PARAMS.
> 
> Signed-off-by: Nanley Chery 
> Reviewed-by: Jason Ekstrand 
> 
> ---
> 
> v2. Update granularity comment for accuracy
> 
>  src/intel/vulkan/anv_pass.c| 13 +
>  src/intel/vulkan/gen8_cmd_buffer.c |  6 ++
>  src/intel/vulkan/genX_cmd_buffer.c |  4 +---
>  3 files changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c
> index 69c3c7e..595c2ea 100644
> --- a/src/intel/vulkan/anv_pass.c
> +++ b/src/intel/vulkan/anv_pass.c
> @@ -155,5 +155,18 @@ void anv_GetRenderAreaGranularity(
>  VkRenderPassrenderPass,
>  VkExtent2D* pGranularity)
>  {
> +   ANV_FROM_HANDLE(anv_render_pass, pass, renderPass);
> +
> +   /* This granularity satisfies HiZ fast clear alignment requirements
> +* for all sample counts.
> +*/
> +   for (unsigned i = 0; i < pass->subpass_count; ++i) {
> +  if (pass->subpasses[i].depth_stencil_attachment !=
> +  VK_ATTACHMENT_UNUSED) {
> + *pGranularity = (VkExtent2D) { .width = 8, .height = 4 };
> + return;
> +  }
> +   }
> +
> *pGranularity = (VkExtent2D) { 1, 1 };
>  }
> diff --git a/src/intel/vulkan/gen8_cmd_buffer.c 
> b/src/intel/vulkan/gen8_cmd_buffer.c
> index 14e6a7b..96e972c 100644
> --- a/src/intel/vulkan/gen8_cmd_buffer.c
> +++ b/src/intel/vulkan/gen8_cmd_buffer.c
> @@ -479,6 +479,12 @@ genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer 
> *cmd_buffer,
>   cmd_state->render_area.extent.height % px_dim.h)
>  return;
>}
> +
> +  anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp) {
> + cp.DepthClearValueValid = true;
> + cp.DepthClearValue =
> +cmd_buffer->state.attachments[ds].clear_value.depthStencil.depth;
> +  }
>break;
> case BLORP_HIZ_OP_DEPTH_RESOLVE:
>if (cmd_buffer->state.pass->attachments[ds].store_op !=
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
> b/src/intel/vulkan/genX_cmd_buffer.c
> index 2cb1539..290fefc 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -1320,9 +1320,6 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer 
> *cmd_buffer)
> } else {
>anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb);
> }
> -
> -   /* Clear the clear params. */
> -   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp);

We may need to preserve emission of 3DSTATE_CLEAR_PARAMS here. Two reasons:

Reason 1. If hiz is enabled in the 3DSTATE_DEPTH_BUFFER, and the hiz
   surface has some bits in the clear state, and
   3DSTATE_CLEAR_PARAMS.DepthClearValueValid is 0, and we emit a draw
   call, what does the hardware do when it accesses a cleared pixel?
   I don't want to find out.

Reason 2. The PRM says we have to (though, to be honest, I don't trust
   the PRM's logic).

From the Skylake PRM >> Vol7: 3D-Media-GPGPU >> Section: Hierarchical
Depth Buffer:
|
|  If HiZ is enabled, you must initialize the clear value by either:
|
| 1. Perform a depth clear pass to initialize the clear value.
| 2. Send a 3dstate_clear_params packet with valid = 1.
|
|  Without one of these events, context switching will fail, as it will
|  try to save off a clear value even though no valid clear value has
|  been set. When context restore happens, HW will restore an
|  uninitialized clear value.

Even though the hardware docs claim we need 3DSTATE_CLEAR_PARAMS when
hiz is enabled, the docs are vague about the consequences. Does context
switching really fail, as the first sentence claims? Or does context
switching actually succeed, but context restore gives us an invalid
clear value (which doesn't hurt us), as the second sentence claims?
Oh hw docs... :/

As a consequence of that reasoning, we should set 
3DSTATE_CLEAR_PARAMS.DepthClearValueValid = 1 
whenever hiz is enabled, even if we don't care about the actual clear value.
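
That conclusion can be sketched as a small decision function. This is an illustration only: `struct clear_params` and `pick_clear_params` are hypothetical and merely mirror the two 3DSTATE_CLEAR_PARAMS fields used in the patch (DepthClearValueValid and DepthClearValue); no real packet is emitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the two 3DSTATE_CLEAR_PARAMS fields discussed. */
struct clear_params {
   bool  depth_clear_value_valid;
   float depth_clear_value;
};

/* Decide what to program alongside the depth buffer state.
 *
 * hiz_enabled:  the depth buffer has a HiZ surface attached
 * have_clear:   a fast depth clear supplied a clear value
 * clear_depth:  that value (ignored when have_clear is false)
 */
static struct clear_params
pick_clear_params(bool hiz_enabled, bool have_clear, float clear_depth)
{
   struct clear_params cp = { false, 0.0f };

   /* A fast clear value only exists when HiZ is in use. */
   assert(!have_clear || hiz_enabled);

   if (hiz_enabled) {
      /* Always mark the value valid when HiZ is on, even if the
       * value itself is a don't-care.
       */
      cp.depth_clear_value_valid = true;
      cp.depth_clear_value = have_clear ? clear_depth : 0.0f;
   }

   return cp;
}
```

The point of the design is that the valid bit tracks whether HiZ is enabled, not whether a clear happened, so a context save never sees an unset clear value.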

Re: [Mesa-dev] [PATCH V2 05/11] anv: Allocate hiz surface

2016-09-27 Thread Chad Versace

On Mon 26 Sep 2016, Nanley Chery wrote:
> From: Chad Versace 
> 
> Nanley Chery:
> (rebase)
>  - Use isl_surf_get_hiz_surf()
> (amend)
>  - Only add a HiZ surface onto a depth/stencil attachment
>  - Add comment above HiZ surface addition
>  - Hide HiZ behind INTEL_VK_HIZ prior to BDW
>  - Disable HiZ for untested cases
>  - Remove DISABLE_AUX_BIT instead of preventing it from being added
> 
> Signed-off-by: Nanley Chery 
> Reviewed-by: Jason Ekstrand 
> Reviewed-by: Chad Versace  (v1)

Reviewed-by: Chad Versace  (v2)

Re: [Mesa-dev] [PATCH V2 08/11] anv/cmd_buffer: Add code for performing HZ operations

2016-09-27 Thread Chad Versace

On Mon 26 Sep 2016, Nanley Chery wrote:
> Create a function that performs one of three HiZ operations -
> depth/stencil clears, HiZ resolve, and depth resolves.
> 
> Signed-off-by: Nanley Chery 
> 
> ---
> 
> v2. Add documentation
> Fix the alignment check
> Don't minify clear rectangle (Jason)
> Use blorp enums (Jason)
> Enable depth stalls and flushes
> Use full RT rectangle for resolve ops
> Add stencil clear todo
> 
>  src/intel/vulkan/anv_genX.h|   3 +
>  src/intel/vulkan/gen7_cmd_buffer.c |   6 ++
>  src/intel/vulkan/gen8_cmd_buffer.c | 167 +
>  3 files changed, 176 insertions(+)



> +/**
> + * Emit the HZ_OP packet in the sequence specified by the BDW PRM section
> + * entitled: "Optimized Depth Buffer Clear and/or Stencil Buffer Clear."
> + *
> + * \todo Enable Stencil Buffer-only clears
> + */
> +void
> +genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer *cmd_buffer,
> +  enum blorp_hiz_op op)
> +{

All other "emission" functions in gen8_cmd_buffer.c are named
gen8_cmd_buffer_emit_foo(). I think this function should be named
gen8_cmd_buffer_emit_hz_op for consistency.

> +   struct anv_cmd_state *cmd_state = &cmd_buffer->state;
> +   const struct anv_image_view *iview =
> +  anv_cmd_buffer_get_depth_stencil_view(cmd_buffer);
> +
> +   if (iview == NULL || !anv_image_has_hiz(iview->image))
> +  return;

Shouldn't this check for subpass_count > 1, like the previous patches
do?

> +
> +   const uint32_t ds = cmd_state->subpass->depth_stencil_attachment;
> +   const bool full_surface_op =
> + cmd_state->render_area.extent.width == iview->extent.width &&
> + cmd_state->render_area.extent.height == iview->extent.height;
> +
> +   /* Validate that we can perform the HZ operation and that it's necessary. */
> +   switch (op) {
> +   case BLORP_HIZ_OP_DEPTH_CLEAR:
> +  if (cmd_buffer->state.pass->attachments[ds].load_op !=
> +  VK_ATTACHMENT_LOAD_OP_CLEAR)
> + return;
> +
> +  /* Apply alignment restrictions. Despite the BDW PRM mentioning this is
> +   * only needed for a depth buffer surface type of D16_UNORM, testing
> +   * showed it to be necessary for other depth formats as well
> +   * (e.g., D32_FLOAT).
> +   */
> +  if (!full_surface_op) {
> +
> + struct isl_extent2d px_dim;
> +#if GEN_GEN == 8
> + /* Pre-SKL, HiZ has an 8x4 sample block. As the number of samples
> +  * increases, the number of pixels representable by this block
> +  * decreases by a factor of the sample dimensions. Sample dimensions
> +  * scale following the MSAA interleaved pattern.
> +  *
> +  * Sample|Sample|Pixel
> +  * Count |Dim   |Dim
> +  * ===
> +  *1  | 1x1  | 8x4
> +  *2  | 2x1  | 4x4
> +  *4  | 2x2  | 4x2
> +  *8  | 4x2  | 2x2
> +  *   16  | 4x4  | 2x1
> +  *
> +  * Table: Pixel Dimensions in a HiZ Sample Block Pre-SKL
> +  */
> + const struct isl_extent2d sa_dim =
> +isl_get_interleaved_msaa_px_size_sa(iview->image->samples);
> + px_dim.w = 8 / sa_dim.w;
> + px_dim.h = 4 / sa_dim.h;
> +#else
> + /* SKL+, the sample block becomes a "pixel block" so the expected
> +  * pixel dimension is a constant 8x4 px for all sample counts.
> +  */
> + px_dim = (struct isl_extent2d) { .w = 8, .h = 4};
> +#endif
> +
> + /* Fast depth clears clear an entire sample block at a time. As a
> +  * result, the rectangle must be aligned to the pixel dimensions of
> +  * a sample block for a successful operation.
> +  */
> + if (cmd_state->render_area.offset.x % px_dim.w ||
> + cmd_state->render_area.offset.y % px_dim.h ||
> + cmd_state->render_area.extent.width % px_dim.w ||
> + cmd_state->render_area.extent.height % px_dim.h)
> +return;
> +  }
> +  break;
> +   case BLORP_HIZ_OP_DEPTH_RESOLVE:
> +  if (cmd_buffer->state.pass->attachments[ds].store_op !=
> +  VK_ATTACHMENT_STORE_OP_STORE)
> + return;
> +  break;
> +   case BLORP_HIZ_OP_HIZ_RESOLVE:
> +  if (cmd_buffer->state.pass->attachments[ds].load_op !=
> +  VK_ATTACHMENT_LOAD_OP_LOAD)
> + return;
> +  break;
> +   case BLORP_HIZ_OP_NONE:
> +  unreachable("Invalid HiZ OP");
> +  break;
> +   }
> +
> +   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_WM_HZ_OP), hzp) {
> +  switch (op) {
> +  case BLORP_HIZ_OP_DEPTH_CLEAR:
> + hzp.StencilBufferClearEnable = VK_IMAGE_ASPECT_STENCIL_BIT &
> +cmd_state->attachments[ds].pending_clear_aspects;
> + hzp.DepthBufferClearEnable = VK_IMAGE_ASPECT_DEPTH_BIT &
> +cmd_state->attachments[ds].pending_clear_aspects;
> +
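
The alignment rule in the quoted hunk can be tried out in isolation. The sketch below is a standalone approximation, not the driver code: `hiz_block_px` and `rect_is_block_aligned` are invented names, the `pre_skl` flag replaces the `GEN_GEN == 8` preprocessor check, and the sample dimensions come from the table in the patch comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct extent2d { uint32_t w, h; };
struct rect2d   { uint32_t x, y, w, h; };

/* Pixel footprint of one HiZ block for a given sample count. */
static struct extent2d
hiz_block_px(uint32_t samples, bool pre_skl)
{
   struct extent2d sa = { 1, 1 };

   /* On SKL+ the block covers a constant 8x4 pixels. */
   if (!pre_skl)
      return (struct extent2d) { 8, 4 };

   /* Pre-SKL the block is 8x4 samples, so divide by the sample layout. */
   switch (samples) {
   case 1:  break;
   case 2:  sa = (struct extent2d) { 2, 1 }; break;
   case 4:  sa = (struct extent2d) { 2, 2 }; break;
   case 8:  sa = (struct extent2d) { 4, 2 }; break;
   case 16: sa = (struct extent2d) { 4, 4 }; break;
   default: assert(!"unsupported sample count"); break;
   }

   return (struct extent2d) { 8 / sa.w, 4 / sa.h };
}

/* A partial fast clear is only allowed when the rectangle's offset and
 * size are multiples of the block's pixel footprint.
 */
static bool
rect_is_block_aligned(struct rect2d r, struct extent2d px)
{
   return r.x % px.w == 0 && r.y % px.h == 0 &&
          r.w % px.w == 0 && r.h % px.h == 0;
}
```

A render area of offset (8, 4) and size 16x6 passes the check for 4 samples pre-SKL, where the block is 4x2 px, but not on SKL+, where the 8x4 px block does not divide a height of 6.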

Re: [Mesa-dev] [PATCH V2 09/11] genX/cmd_buffer: Enable rendering to HiZ

2016-09-27 Thread Chad Versace

On Mon 26 Sep 2016, Nanley Chery wrote:
> From: Chad Versace 
> 
> Nanley Chery:
> (rebase)
>  - Resolve conflicts with new anv_batch_emit macro
> (amend)
>  - Handle a QPitch TODO
>  - Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems
>  - Only use HiZ for single-subpass renderpasses
>  - Emit the HiZ instruction before the stencil instruction to follow the
>optimized clear sequence specified in the PRMs
>  - Don't modify clear params
>  - Enable resolves when a HiZ buffer is used to ensure depth buffer validity
> 
> Provides an FPS increase of ~15% on the Sascha triangle and multisampling
> demos.

Woo!

> Signed-off-by: Nanley Chery 
> 
> ---
> 
> v2: Emit zero'ed 3DSTATE_HIER_DEPTH_BUFFER when hiz is disabled
> (Jason, Chad)
> 
>  src/intel/vulkan/gen8_cmd_buffer.c |  4 
>  src/intel/vulkan/genX_cmd_buffer.c | 43 ++
>  2 files changed, 43 insertions(+), 4 deletions(-)
> 
> diff --git a/src/intel/vulkan/gen8_cmd_buffer.c 
> b/src/intel/vulkan/gen8_cmd_buffer.c
> index a13413c..14e6a7b 100644
> --- a/src/intel/vulkan/gen8_cmd_buffer.c
> +++ b/src/intel/vulkan/gen8_cmd_buffer.c
> @@ -417,6 +417,10 @@ genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer 
> *cmd_buffer,
> if (iview == NULL || !anv_image_has_hiz(iview->image))
>return;
>  
> +   /* FIXME: Implement multi-subpass HiZ */

This should be a FINISHME, not a FIXME, as nothing is broken and there
is no bug. It's just disabled.

> +   if (cmd_buffer->state.pass->subpass_count > 1)
> +  return;
> +

Anyway, that's just a small nitpick.

Reviewed-by: Chad Versace 

[Mesa-dev] [PATCH 5/6] vc4: use the new parent/child pools for transfers

2016-09-27 Thread Nicolai Hähnle

From: Nicolai Hähnle 

---
 src/gallium/drivers/vc4/vc4_context.c  | 5 ++---
 src/gallium/drivers/vc4/vc4_context.h  | 2 +-
 src/gallium/drivers/vc4/vc4_resource.c | 4 ++--
 src/gallium/drivers/vc4/vc4_screen.c   | 3 +++
 src/gallium/drivers/vc4/vc4_screen.h   | 3 +++
 5 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_context.c 
b/src/gallium/drivers/vc4/vc4_context.c
index 3863e44..b780b13 100644
--- a/src/gallium/drivers/vc4/vc4_context.c
+++ b/src/gallium/drivers/vc4/vc4_context.c
@@ -89,21 +89,21 @@ vc4_context_destroy(struct pipe_context *pctx)
 
 if (vc4->blitter)
 util_blitter_destroy(vc4->blitter);
 
 if (vc4->primconvert)
 util_primconvert_destroy(vc4->primconvert);
 
 if (vc4->uploader)
 u_upload_destroy(vc4->uploader);
 
-slab_destroy(&vc4->transfer_pool);
+slab_destroy_child(&vc4->transfer_pool);
 
 pipe_surface_reference(&vc4->framebuffer.cbufs[0], NULL);
 pipe_surface_reference(&vc4->framebuffer.zsbuf, NULL);
 
 vc4_program_fini(pctx);
 
 ralloc_free(vc4);
 }
 
 struct pipe_context *
@@ -132,22 +132,21 @@ vc4_context_create(struct pipe_screen *pscreen, void 
*priv, unsigned flags)
 vc4_draw_init(pctx);
 vc4_state_init(pctx);
 vc4_program_init(pctx);
 vc4_query_init(pctx);
 vc4_resource_context_init(pctx);
 
 vc4_job_init(vc4);
 
 vc4->fd = screen->fd;
 
-slab_create(&vc4->transfer_pool, sizeof(struct vc4_transfer),
- 16);
+slab_create_child(&vc4->transfer_pool, &screen->transfer_pool);
 vc4->blitter = util_blitter_create(pctx);
 if (!vc4->blitter)
 goto fail;
 
 vc4->primconvert = util_primconvert_create(pctx,
(1 << PIPE_PRIM_QUADS) - 1);
 if (!vc4->primconvert)
 goto fail;
 
 vc4->uploader = u_upload_create(pctx, 16 * 1024,
diff --git a/src/gallium/drivers/vc4/vc4_context.h 
b/src/gallium/drivers/vc4/vc4_context.h
index 87d8c79..0d6b8d0 100644
--- a/src/gallium/drivers/vc4/vc4_context.h
+++ b/src/gallium/drivers/vc4/vc4_context.h
@@ -290,21 +290,21 @@ struct vc4_context {
 struct hash_table *jobs;
 
 /**
  * Map from vc4_resource to a job writing to that resource.
  *
  * Primarily for flushing jobs rendering to textures that are now
  * being read from.
  */
 struct hash_table *write_jobs;
 
-struct slab_mempool transfer_pool;
+struct slab_child_pool transfer_pool;
 struct blitter_context *blitter;
 
 /** bitfield of VC4_DIRTY_* */
 uint32_t dirty;
 
 struct primconvert_context *primconvert;
 
 struct hash_table *fs_cache, *vs_cache;
 struct set *fs_inputs_set;
 uint32_t next_uncompiled_program_id;
diff --git a/src/gallium/drivers/vc4/vc4_resource.c 
b/src/gallium/drivers/vc4/vc4_resource.c
index bfa8f40..9932bb3 100644
--- a/src/gallium/drivers/vc4/vc4_resource.c
+++ b/src/gallium/drivers/vc4/vc4_resource.c
@@ -113,21 +113,21 @@ vc4_resource_transfer_unmap(struct pipe_context *pctx,
 
 blit.mask = util_format_get_mask(ptrans->resource->format);
 blit.filter = PIPE_TEX_FILTER_NEAREST;
 
 pctx->blit(pctx, &blit);
 
 pipe_resource_reference(&trans->ss_resource, NULL);
 }
 
 pipe_resource_reference(&ptrans->resource, NULL);
-slab_free_st(&vc4->transfer_pool, ptrans);
+slab_free(&vc4->transfer_pool, ptrans);
 }
 
 static struct pipe_resource *
 vc4_get_temp_resource(struct pipe_context *pctx,
   struct pipe_resource *prsc,
   const struct pipe_box *box)
 {
 struct pipe_resource temp_setup;
 
 memset(&temp_setup, 0, sizeof(temp_setup));
@@ -189,21 +189,21 @@ vc4_resource_transfer_map(struct pipe_context *pctx,
  */
 if (usage & PIPE_TRANSFER_WRITE)
 vc4_flush_jobs_reading_resource(vc4, prsc);
 else
 vc4_flush_jobs_writing_resource(vc4, prsc);
 }
 
 if (usage & PIPE_TRANSFER_WRITE)
 rsc->writes++;
 
-trans = slab_alloc_st(&vc4->transfer_pool);
+trans = slab_alloc(&vc4->transfer_pool);
 if (!trans)
 return NULL;
 
 /* XXX: Handle DONTBLOCK, DISCARD_RANGE, PERSISTENT, COHERENT. */
 
 /* slab_alloc_st() doesn't zero: */
 memset(trans, 0, sizeof(*trans));
 ptrans = &trans->base;
 
 pipe_resource_reference(&ptrans->resource, prsc);
diff --git a/src/gallium/drivers/vc4/vc4_screen.c 
b/src/gallium/drivers/vc4/vc4_screen.c
index 3dc85d5..64bff5d 100644
--- a/src/gallium/drivers/vc4/vc4_screen.c
+++ b/src/gallium/drivers/vc4/vc4_screen.c
@@ -91

[Mesa-dev] [PATCH 2/6] gallium/radeon: use the new parent/child pools for transfers

2016-09-27 Thread Nicolai Hähnle

From: Nicolai Hähnle 

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97894
---
 src/gallium/drivers/radeon/r600_buffer_common.c | 4 ++--
 src/gallium/drivers/radeon/r600_pipe_common.c   | 9 ++---
 src/gallium/drivers/radeon/r600_pipe_common.h   | 4 +++-
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c 
b/src/gallium/drivers/radeon/r600_buffer_common.c
index 2e8b6f4..6f5018f 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -276,21 +276,21 @@ void r600_invalidate_resource(struct pipe_context *ctx,
 static void *r600_buffer_get_transfer(struct pipe_context *ctx,
  struct pipe_resource *resource,
   unsigned level,
   unsigned usage,
   const struct pipe_box *box,
  struct pipe_transfer **ptransfer,
  void *data, struct r600_resource *staging,
  unsigned offset)
 {
struct r600_common_context *rctx = (struct r600_common_context*)ctx;
-   struct r600_transfer *transfer = slab_alloc_st(&rctx->pool_transfers);
+   struct r600_transfer *transfer = slab_alloc(&rctx->pool_transfers);
 
transfer->transfer.resource = resource;
transfer->transfer.level = level;
transfer->transfer.usage = usage;
transfer->transfer.box = *box;
transfer->transfer.stride = 0;
transfer->transfer.layer_stride = 0;
transfer->offset = offset;
transfer->staging = staging;
*ptransfer = &transfer->transfer;
@@ -461,21 +461,21 @@ static void r600_buffer_transfer_unmap(struct 
pipe_context *ctx,
struct r600_common_context *rctx = (struct r600_common_context*)ctx;
struct r600_transfer *rtransfer = (struct r600_transfer*)transfer;
 
if (transfer->usage & PIPE_TRANSFER_WRITE &&
!(transfer->usage & PIPE_TRANSFER_FLUSH_EXPLICIT))
r600_buffer_do_flush_region(ctx, transfer, &transfer->box);
 
if (rtransfer->staging)
r600_resource_reference(&rtransfer->staging, NULL);
 
-   slab_free_st(&rctx->pool_transfers, transfer);
+   slab_free(&rctx->pool_transfers, transfer);
 }
 
 void r600_buffer_subdata(struct pipe_context *ctx,
 struct pipe_resource *buffer,
 unsigned usage, unsigned offset,
 unsigned size, const void *data)
 {
struct pipe_transfer *transfer = NULL;
struct pipe_box box;
uint8_t *map = NULL;
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index b0d9813..b35f0bb 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -424,22 +424,21 @@ static void r600_set_debug_callback(struct pipe_context 
*ctx,
if (cb)
rctx->debug = *cb;
else
memset(&rctx->debug, 0, sizeof(rctx->debug));
 }
 
 bool r600_common_context_init(struct r600_common_context *rctx,
  struct r600_common_screen *rscreen,
  unsigned context_flags)
 {
-   slab_create(&rctx->pool_transfers,
-sizeof(struct r600_transfer), 64);
+   slab_create_child(&rctx->pool_transfers, &rscreen->pool_transfers);
 
rctx->screen = rscreen;
rctx->ws = rscreen->ws;
rctx->family = rscreen->family;
rctx->chip_class = rscreen->chip_class;
 
if (rscreen->chip_class >= CIK)
rctx->max_db = MAX2(8, rscreen->info.num_render_backends);
else if (rscreen->chip_class >= EVERGREEN)
rctx->max_db = 8;
@@ -525,21 +524,21 @@ void r600_common_context_cleanup(struct 
r600_common_context *rctx)
rctx->ws->cs_destroy(rctx->gfx.cs);
if (rctx->dma.cs)
rctx->ws->cs_destroy(rctx->dma.cs);
if (rctx->ctx)
rctx->ws->ctx_destroy(rctx->ctx);
 
if (rctx->uploader) {
u_upload_destroy(rctx->uploader);
}
 
-   slab_destroy(&rctx->pool_transfers);
+   slab_destroy_child(&rctx->pool_transfers);
 
if (rctx->allocator_zeroed_memory) {
u_suballocator_destroy(rctx->allocator_zeroed_memory);
}
rctx->ws->fence_reference(&rctx->last_gfx_fence, NULL);
rctx->ws->fence_reference(&rctx->last_sdma_fence, NULL);
 }
 
 void r600_context_add_resource_size(struct pipe_context *ctx, struct 
pipe_resource *r)
 {
@@ -1137,20 +1136,22 @@ bool r600_common_screen_init(struct r600_common_screen 
*rscreen,
}
 
r600_init_screen_texture_functions(rscreen);
r600_init_screen_query_functions(rscreen);
 
rscreen->ws

[Mesa-dev] [PATCH 3/6] r300: use the new parent/child pools for transfers

2016-09-27 Thread Nicolai Hähnle

From: Nicolai Hähnle 

---
 src/gallium/drivers/r300/r300_context.c   | 5 ++---
 src/gallium/drivers/r300/r300_context.h   | 2 +-
 src/gallium/drivers/r300/r300_screen.c| 3 +++
 src/gallium/drivers/r300/r300_screen.h| 2 ++
 src/gallium/drivers/r300/r300_screen_buffer.c | 4 ++--
 5 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/r300/r300_context.c 
b/src/gallium/drivers/r300/r300_context.c
index 3e5f1d6..b914cdb 100644
--- a/src/gallium/drivers/r300/r300_context.c
+++ b/src/gallium/drivers/r300/r300_context.c
@@ -93,21 +93,21 @@ static void r300_destroy_context(struct pipe_context* 
context)
 r300_release_referenced_objects(r300);
 
 if (r300->cs)
 r300->rws->cs_destroy(r300->cs);
 if (r300->ctx)
 r300->rws->ctx_destroy(r300->ctx);
 
 rc_destroy_regalloc_state(&r300->fs_regalloc_state);
 
 /* XXX: No way to tell if this was initialized or not? */
-slab_destroy(&r300->pool_transfers);
+slab_destroy_child(&r300->pool_transfers);
 
 /* Free the structs allocated in r300_setup_atoms() */
 if (r300->aa_state.state) {
 FREE(r300->aa_state.state);
 FREE(r300->blend_color_state.state);
 FREE(r300->clip_state.state);
 FREE(r300->fb_state.state);
 FREE(r300->gpu_flush.state);
 FREE(r300->hyperz_state.state);
 FREE(r300->invariant_state.state);
@@ -378,22 +378,21 @@ struct pipe_context* r300_create_context(struct 
pipe_screen* screen,
 return NULL;
 
 r300->rws = rws;
 r300->screen = r300screen;
 
 r300->context.screen = screen;
 r300->context.priv = priv;
 
 r300->context.destroy = r300_destroy_context;
 
-slab_create(&r300->pool_transfers,
- sizeof(struct pipe_transfer), 64);
+slab_create_child(&r300->pool_transfers, &r300screen->pool_transfers);
 
 r300->ctx = rws->ctx_create(rws);
 if (!r300->ctx)
 goto fail;
 
 r300->cs = rws->cs_create(r300->ctx, RING_GFX, r300_flush_callback, r300);
 if (r300->cs == NULL)
 goto fail;
 
 if (!r300screen->caps.has_tcl) {
diff --git a/src/gallium/drivers/r300/r300_context.h 
b/src/gallium/drivers/r300/r300_context.h
index 592479a..264ace5 100644
--- a/src/gallium/drivers/r300/r300_context.h
+++ b/src/gallium/drivers/r300/r300_context.h
@@ -589,21 +589,21 @@ struct r300_context {
 boolean alpha_to_one;
 boolean alpha_to_coverage;
 
 void *dsa_decompress_zmask;
 
 struct pipe_index_buffer index_buffer;
 struct pipe_vertex_buffer vertex_buffer[PIPE_MAX_ATTRIBS];
 unsigned nr_vertex_buffers;
 struct u_upload_mgr *uploader;
 
-struct slab_mempool pool_transfers;
+struct slab_child_pool pool_transfers;
 
 /* Stat counter. */
 uint64_t flush_counter;
 
 /* const tracking for VS */
 int vs_const_base;
 
 /* Vertex array state info */
 boolean vertex_arrays_dirty;
 boolean vertex_arrays_indexed;
diff --git a/src/gallium/drivers/r300/r300_screen.c 
b/src/gallium/drivers/r300/r300_screen.c
index f6949ce..4d41693 100644
--- a/src/gallium/drivers/r300/r300_screen.c
+++ b/src/gallium/drivers/r300/r300_screen.c
@@ -669,20 +669,21 @@ static boolean r300_is_format_supported(struct 
pipe_screen* screen,
 
 static void r300_destroy_screen(struct pipe_screen* pscreen)
 {
 struct r300_screen* r300screen = r300_screen(pscreen);
 struct radeon_winsys *rws = radeon_winsys(pscreen);
 
 if (rws && !rws->unref(rws))
   return;
 
 pipe_mutex_destroy(r300screen->cmask_mutex);
+slab_destroy_parent(&r300screen->pool_transfers);
 
 if (rws)
   rws->destroy(rws);
 
 FREE(r300screen);
 }
 
 static void r300_fence_reference(struct pipe_screen *screen,
  struct pipe_fence_handle **ptr,
  struct pipe_fence_handle *fence)
@@ -731,15 +732,17 @@ struct pipe_screen* r300_screen_create(struct 
radeon_winsys *rws)
 r300screen->screen.get_paramf = r300_get_paramf;
 r300screen->screen.get_video_param = r300_get_video_param;
 r300screen->screen.is_format_supported = r300_is_format_supported;
 r300screen->screen.is_video_format_supported = 
vl_video_buffer_is_format_supported;
 r300screen->screen.context_create = r300_create_context;
 r300screen->screen.fence_reference = r300_fence_reference;
 r300screen->screen.fence_finish = r300_fence_finish;
 
 r300_init_screen_resource_functions(r300screen);
 
+slab_create_parent(&r300screen->pool_transfers, sizeof(struct 
pipe_transfer), 64);
+
 util_format_s3tc_init();
 pipe_mutex_init(r300screen->cmask_mutex);
 
 return &r300screen->screen;
 }
diff --git a/src/gallium/drivers/r300/r300_screen.h 
b/src/gallium/drivers/r300/r300_screen.h
index 5cd5a40..4b783af 100644
--- a/src/gallium/drivers/r300/r300_screen.h
+++ b/src/gallium/drivers/r300/r300_screen.h
@@ -37,20 +37,22 @@ struct r300_screen {
 
 struct radeon_winsys *r

[Mesa-dev] [PATCH 0/6] Fix map/unmaps from different contexts

2016-09-27 Thread Nicolai Hähnle

Hi all,

it can happen that a buffer that is mapped from one context ends up being
unmapped in another one. I found nothing in the spec to forbid it, and in
any case it's a scenario that people are increasingly likely to run into
with persistent maps and multiple contexts.

This leads to crashes in some drivers because the Gallium pipe_transfer
object is allocated from a per-context allocator.

While a possible fix would be to simply use a per-screen allocator, this
would require an additional mutex lock/unlock in a pretty critical path.
This series takes an alternative path of re-designing util/slab so that
it can support freeing an object in a different pool than the one it was
allocated from. Please review!
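
As a rough illustration of what "freeing an object in a different pool" involves, the sketch below models two per-context pools in plain C. Every name in it (toy_child_pool, toy_alloc, toy_free, the migrated list) is hypothetical and made up for this example; it is not the util/slab interface, and it leaves out the page management and the locking that a real implementation would need on the cross-pool path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy types for illustration; not the util/slab structs. */
struct toy_child_pool;

struct toy_element {
   struct toy_child_pool *owner; /* child pool that allocated this element */
   struct toy_element *next;     /* free-list link */
};

struct toy_child_pool {
   struct toy_element *free_list; /* only touched by the owning context */
   struct toy_element *migrated;  /* our elements freed through another pool */
   unsigned fast_frees;           /* frees by this pool of its own elements */
   unsigned slow_frees;           /* frees by this pool of foreign elements */
};

static struct toy_element *
toy_alloc(struct toy_child_pool *pool)
{
   struct toy_element *elt;

   /* Reclaim elements that other pools handed back to us. */
   if (!pool->free_list) {
      pool->free_list = pool->migrated;
      pool->migrated = NULL;
   }

   elt = pool->free_list;
   if (elt) {
      pool->free_list = elt->next;
   } else {
      elt = malloc(sizeof(*elt));
      if (!elt)
         return NULL;
   }
   elt->owner = pool;
   elt->next = NULL;
   return elt;
}

static void
toy_free(struct toy_child_pool *pool, struct toy_element *elt)
{
   if (elt->owner == pool) {
      /* Fast path: same pool, no synchronization needed. */
      elt->next = pool->free_list;
      pool->free_list = elt;
      pool->fast_frees++;
   } else {
      /* Slow path: hand the element back to its owner. A real
       * implementation would have to serialize this, e.g. with a
       * mutex held by the shared parent. */
      elt->next = elt->owner->migrated;
      elt->owner->migrated = elt;
      pool->slow_frees++;
   }
}

/* "Map" in context A, "unmap" in context B, then allocate again in A.
 * Returns 1 when A gets the same element back via its migrated list. */
static int
toy_cross_pool_demo(void)
{
   struct toy_child_pool ctx_a = {0}, ctx_b = {0};
   struct toy_element *first = toy_alloc(&ctx_a);
   struct toy_element *again;
   int reused;

   if (!first)
      return -1;
   toy_free(&ctx_b, first);   /* freed through the "wrong" pool */
   again = toy_alloc(&ctx_a); /* A reclaims it */
   reused = (again == first) &&
            ctx_b.slow_frees == 1 && ctx_b.fast_frees == 0;
   free(again);
   return reused;
}
```

Only the cross-pool free takes the slow branch, so the common case of mapping and unmapping in the same context keeps its lock-free cost.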

Thanks,
Nicolai

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 6/6] virgl: use the new parent/child pools for transfers

2016-09-27 Thread Nicolai Hähnle

From: Nicolai Hähnle 

---
 src/gallium/drivers/virgl/virgl_buffer.c  | 4 ++--
 src/gallium/drivers/virgl/virgl_context.c | 5 ++---
 src/gallium/drivers/virgl/virgl_context.h | 2 +-
 src/gallium/drivers/virgl/virgl_screen.c  | 4 
 src/gallium/drivers/virgl/virgl_screen.h  | 3 +++
 src/gallium/drivers/virgl/virgl_texture.c | 4 ++--
 6 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/virgl/virgl_buffer.c 
b/src/gallium/drivers/virgl/virgl_buffer.c
index de99796..2e63aeb 100644
--- a/src/gallium/drivers/virgl/virgl_buffer.c
+++ b/src/gallium/drivers/virgl/virgl_buffer.c
@@ -55,21 +55,21 @@ static void *virgl_buffer_transfer_map(struct pipe_context 
*ctx,
bool doflushwait = false;
 
if ((usage & PIPE_TRANSFER_READ) && (vbuf->on_list == TRUE))
   doflushwait = true;
else
   doflushwait = virgl_res_needs_flush_wait(vctx, &vbuf->base, usage);
 
if (doflushwait)
   ctx->flush(ctx, NULL, 0);
 
-   trans = slab_alloc_st(&vctx->texture_transfer_pool);
+   trans = slab_alloc(&vctx->texture_transfer_pool);
if (!trans)
   return NULL;
 
trans->base.resource = resource;
trans->base.level = level;
trans->base.usage = usage;
trans->base.box = *box;
trans->base.stride = 0;
trans->base.layer_stride = 0;
 
@@ -107,21 +107,21 @@ static void virgl_buffer_transfer_unmap(struct 
pipe_context *ctx,
   if (!(transfer->usage & PIPE_TRANSFER_FLUSH_EXPLICIT)) {
  struct virgl_screen *vs = virgl_screen(ctx->screen);
  vbuf->base.clean = FALSE;
  vctx->num_transfers++;
  vs->vws->transfer_put(vs->vws, vbuf->base.hw_res,
&transfer->box, trans->base.stride, 
trans->base.layer_stride, trans->offset, transfer->level);
 
   }
}
 
-   slab_free_st(&vctx->texture_transfer_pool, trans);
+   slab_free(&vctx->texture_transfer_pool, trans);
 }
 
 static void virgl_buffer_transfer_flush_region(struct pipe_context *ctx,
struct pipe_transfer *transfer,
const struct pipe_box *box)
 {
struct virgl_context *vctx = virgl_context(ctx);
struct virgl_buffer *vbuf = virgl_buffer(transfer->resource);
 
if (!vbuf->on_list) {
diff --git a/src/gallium/drivers/virgl/virgl_context.c 
b/src/gallium/drivers/virgl/virgl_context.c
index a6c0597..e693a73 100644
--- a/src/gallium/drivers/virgl/virgl_context.c
+++ b/src/gallium/drivers/virgl/virgl_context.c
@@ -855,21 +855,21 @@ virgl_context_destroy( struct pipe_context *ctx )
vctx->framebuffer.zsbuf = NULL;
vctx->framebuffer.nr_cbufs = 0;
virgl_encoder_destroy_sub_ctx(vctx, vctx->hw_sub_ctx_id);
virgl_flush_eq(vctx, vctx);
 
rs->vws->cmd_buf_destroy(vctx->cbuf);
if (vctx->uploader)
   u_upload_destroy(vctx->uploader);
util_primconvert_destroy(vctx->primconvert);
 
-   slab_destroy(&vctx->texture_transfer_pool);
+   slab_destroy_child(&vctx->texture_transfer_pool);
FREE(vctx);
 }
 
 struct pipe_context *virgl_context_create(struct pipe_screen *pscreen,
   void *priv,
   unsigned flags)
 {
struct virgl_context *vctx;
struct virgl_screen *rs = virgl_screen(pscreen);
vctx = CALLOC_STRUCT(virgl_context);
@@ -936,22 +936,21 @@ struct pipe_context *virgl_context_create(struct 
pipe_screen *pscreen,
 
vctx->base.resource_copy_region = virgl_resource_copy_region;
vctx->base.flush_resource = virgl_flush_resource;
vctx->base.blit =  virgl_blit;
 
virgl_init_context_resource_functions(&vctx->base);
virgl_init_query_functions(vctx);
virgl_init_so_functions(vctx);
 
list_inithead(&vctx->to_flush_bufs);
-   slab_create(&vctx->texture_transfer_pool, sizeof(struct virgl_transfer),
-16);
+   slab_create_child(&vctx->texture_transfer_pool, rs->texture_transfer_pool);
 
vctx->primconvert = util_primconvert_create(&vctx->base, 
rs->caps.caps.v1.prim_mask);
vctx->uploader = u_upload_create(&vctx->base, 1024 * 1024,
  PIPE_BIND_INDEX_BUFFER, 
PIPE_USAGE_STREAM);
if (!vctx->uploader)
goto fail;
 
vctx->hw_sub_ctx_id = rs->sub_ctx_id++;
virgl_encoder_create_sub_ctx(vctx, vctx->hw_sub_ctx_id);
 
diff --git a/src/gallium/drivers/virgl/virgl_context.h 
b/src/gallium/drivers/virgl/virgl_context.h
index 3b9901f..597ed49 100644
--- a/src/gallium/drivers/virgl/virgl_context.h
+++ b/src/gallium/drivers/virgl/virgl_context.h
@@ -49,21 +49,21 @@ struct virgl_textures_info {
 };
 
 struct virgl_context {
struct pipe_context base;
struct virgl_cmd_buf *cbuf;
 
struct virgl_textures_info samplers[PIPE_SHADER_TYPES];
 
struct pipe_framebuffer_state framebuffer;
 
-   struct slab_mempool texture_transfer_pool;
+   struct slab_child_pool texture_transfer_pool;
 
struct pipe_index_buffer index_buffer;
struct u_upload_mgr

[Mesa-dev] [PATCH 1/6] util/slab: re-design to allow migration between pools

2016-09-27 Thread Nicolai Hähnle

From: Nicolai Hähnle 

This is basically a re-write of the slab allocator into a design where
multiple child pools are linked to a parent pool. The intention is that
every (GL, pipe) context has its own child pool, while the corresponding
parent pool is held by the winsys or screen, or possibly the GL share group.

The fast path is still used when objects are freed by the same child pool
that allocated them. However, it is now also possible to free an object in a
different pool, as long as they belong to the same parent. Objects also
survive the destruction of the (child) pool from which they were allocated.

The slow path will return freed objects to the child pool from which they
were originally allocated. If that child pool was destroyed, the corresponding
page is considered an orphan and will be freed once all objects in it have
been freed.
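
The orphan handling described above can be sketched in isolation. The code below is a simplified model with hypothetical names (toy_page, toy_element, toy_mark_orphan, toy_free_orphaned): the owner field holds the page address with its least significant bit set, and the page is released when its remaining-element counter reaches zero. It uses a plain decrement where a thread-safe implementation would need an atomic one.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical page header; elements are stored right behind it.
 * uintptr_t keeps the elements that follow pointer-aligned. */
struct toy_page {
   uintptr_t num_remaining; /* elements of this page not yet freed */
};

struct toy_element {
   intptr_t owner; /* page address with bit 0 set once orphaned */
};

static void
toy_mark_orphan(struct toy_element *elt, struct toy_page *page)
{
   /* malloc'ed pages are at least 2-byte aligned, so bit 0 is free. */
   elt->owner = (intptr_t)page | 1;
}

static int
toy_is_orphan(const struct toy_element *elt)
{
   return (elt->owner & 1) != 0;
}

/* Free one element of an orphaned page.
 * Returns 1 if this was the last element and the page was released. */
static int
toy_free_orphaned(struct toy_element *elt)
{
   struct toy_page *page;

   assert(toy_is_orphan(elt));
   page = (struct toy_page *)(elt->owner & ~(intptr_t)1);
   if (--page->num_remaining == 0) {
      free(page); /* elt lives inside the page and is gone now, too */
      return 1;
   }
   return 0;
}

/* Orphan a page holding three elements and free them one by one.
 * Returns how many of the frees released the page. */
static int
toy_orphan_demo(void)
{
   enum { NUM_ELEMENTS = 3 };
   struct toy_page *page =
      malloc(sizeof(*page) + NUM_ELEMENTS * sizeof(struct toy_element));
   struct toy_element *elts;
   int released = 0;
   int i;

   if (!page)
      return -1;
   elts = (struct toy_element *)(page + 1);
   page->num_remaining = NUM_ELEMENTS;
   for (i = 0; i < NUM_ELEMENTS; i++)
      toy_mark_orphan(&elts[i], page);
   for (i = 0; i < NUM_ELEMENTS; i++)
      released += toy_free_orphaned(&elts[i]);
   return released;
}
```

Tagging the low bit lets one field serve both as a child-pool pointer and as an orphan marker, so the element header does not grow.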

This allocation pattern is required for pipe_transfers that correspond to
(GL) buffer object mappings when the mapping is created in one context
which is later destroyed while other contexts of the same share group live
on -- see the bug report referenced below.

Note that individual drivers do need to migrate to the new interface in
order to benefit and fix the bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97894
---
 src/util/slab.c | 278 +---
 src/util/slab.h |  64 ++---
 2 files changed, 253 insertions(+), 89 deletions(-)

diff --git a/src/util/slab.c b/src/util/slab.c
index af75152..654d936 100644
--- a/src/util/slab.c
+++ b/src/util/slab.c
@@ -16,166 +16,296 @@
  * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
  * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
  * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
  * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
  * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
  * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
  * USE OR OTHER DEALINGS IN THE SOFTWARE. */
 
 #include "slab.h"
 #include "macros.h"
-#include "simple_list.h"
+#include "u_atomic.h"
 #include 
 #include 
 #include 
 
 #define ALIGN(value, align) (((value) + (align) - 1) & ~((align) - 1))
 
+#define SLAB_MAGIC_ALLOCATED 0xcafe4321
+#define SLAB_MAGIC_FREE 0x7ee01234
+
 #ifdef DEBUG
-#define SLAB_MAGIC 0xcafe4321
-#define SET_MAGIC(element)   (element)->magic = SLAB_MAGIC
-#define CHECK_MAGIC(element) assert((element)->magic == SLAB_MAGIC)
+#define SET_MAGIC(element, value)   (element)->magic = (value)
+#define CHECK_MAGIC(element, value) assert((element)->magic == (value))
 #else
-#define SET_MAGIC(element)
-#define CHECK_MAGIC(element)
+#define SET_MAGIC(element, value)
+#define CHECK_MAGIC(element, value)
 #endif
 
 /* One array element within a big buffer. */
 struct slab_element_header {
-   /* The next free element. */
-   struct slab_element_header *next_free;
+   /* Linked list of elements. */
+   struct list_head head;
+
+   /* This is either
+* - a pointer to the child pool to which this element belongs, or
+* - a pointer to the orphaned page of the element, with the least
+*   significant bit set to 1.
+*/
+   intptr_t owner;
 
 #ifdef DEBUG
-   /* Use intptr_t to keep the header aligned to a pointer size. */
intptr_t magic;
 #endif
 };
 
+/* The page is an array of allocations in one block. */
+struct slab_page_header {
+   union {
+  /* Next page in the same child pool. */
+  struct slab_page_header *next;
+
+  /* Number of remaining, non-freed elements (for orphaned pages). */
+  unsigned num_remaining;
+   } u;
+   /* Memory after the last member is dedicated to the page itself.
+* The allocated size is always larger than this structure.
+*/
+};
+
+
 static struct slab_element_header *
-slab_get_element(struct slab_mempool *pool,
+slab_get_element(struct slab_parent_pool *parent,
  struct slab_page_header *page, unsigned index)
 {
return (struct slab_element_header*)
-  ((uint8_t*)&page[1] + (pool->element_size * index));
+  ((uint8_t*)&page[1] + (parent->element_size * index));
+}
+
+/* The given object/element belongs to an orphaned page (i.e. the owning child
+ * pool has been destroyed). Mark the element as freed and free the whole page
+ * when no elements are left in it.
+ */
+static void
+slab_free_orphaned(struct slab_element_header *elt)
+{
+   struct slab_page_header *page;
+
+   assert(elt->owner & 1);
+
+   page = (struct slab_page_header *)(elt->owner & ~(intptr_t)1);
+   if (!p_atomic_dec_return(&page->u.num_remaining))
+  free(page);
+}
+
+/**
+ * Create a parent pool for the allocation of same-sized objects.
+ *
+ * \param item_size Size of one object.
+ * \param num_items Number of objects to allocate at once.
+ */
+void
+slab_create_parent(struct slab_parent_pool *parent,
+   unsigned item_size,

[Mesa-dev] [PATCH 4/6] freedreno: use the new parent/child pools for transfers

2016-09-27 Thread Nicolai Hähnle

From: Nicolai Hähnle 

---
 src/gallium/drivers/freedreno/freedreno_context.c  | 5 ++---
 src/gallium/drivers/freedreno/freedreno_context.h  | 2 +-
 src/gallium/drivers/freedreno/freedreno_resource.c | 4 ++--
 src/gallium/drivers/freedreno/freedreno_screen.c   | 4 
 src/gallium/drivers/freedreno/freedreno_screen.h   | 3 +++
 5 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/freedreno/freedreno_context.c 
b/src/gallium/drivers/freedreno/freedreno_context.c
index f8604f1..0b12409 100644
--- a/src/gallium/drivers/freedreno/freedreno_context.c
+++ b/src/gallium/drivers/freedreno/freedreno_context.c
@@ -114,21 +114,21 @@ fd_context_destroy(struct pipe_context *pctx)
 
if (ctx->blitter)
util_blitter_destroy(ctx->blitter);
 
if (ctx->clear_rs_state)
pctx->delete_rasterizer_state(pctx, ctx->clear_rs_state);
 
if (ctx->primconvert)
util_primconvert_destroy(ctx->primconvert);
 
-   slab_destroy(&ctx->transfer_pool);
+   slab_destroy_child(&ctx->transfer_pool);
 
for (i = 0; i < ARRAY_SIZE(ctx->pipe); i++) {
struct fd_vsc_pipe *pipe = &ctx->pipe[i];
if (!pipe->bo)
break;
fd_bo_del(pipe->bo);
}
 
fd_device_del(ctx->dev);
 
@@ -258,22 +258,21 @@ fd_context_init(struct fd_context *ctx, struct 
pipe_screen *pscreen,
pctx->set_debug_callback = fd_set_debug_callback;
 
/* TODO what about compute?  Ideally it creates it's own independent
 * batches per compute job (since it isn't using tiling, so no point
 * in getting involved with the re-ordering madness)..
 */
if (!screen->reorder) {
ctx->batch = fd_bc_alloc_batch(&screen->batch_cache, ctx);
}
 
-   slab_create(&ctx->transfer_pool, sizeof(struct fd_transfer),
-   16);
+   slab_create_child(&ctx->transfer_pool, &screen->transfer_pool);
 
fd_draw_init(pctx);
fd_resource_context_init(pctx);
fd_query_context_init(pctx);
fd_texture_init(pctx);
fd_state_init(pctx);
fd_hw_query_init(pctx);
 
ctx->blitter = util_blitter_create(pctx);
if (!ctx->blitter)
diff --git a/src/gallium/drivers/freedreno/freedreno_context.h 
b/src/gallium/drivers/freedreno/freedreno_context.h
index e1b7b23..c4c08a6 100644
--- a/src/gallium/drivers/freedreno/freedreno_context.h
+++ b/src/gallium/drivers/freedreno/freedreno_context.h
@@ -114,21 +114,21 @@ struct fd_context {
struct fd_device *dev;
struct fd_screen *screen;
 
struct util_queue flush_queue;
 
struct blitter_context *blitter;
void *clear_rs_state;
struct primconvert_context *primconvert;
 
/* slab for pipe_transfer allocations: */
-   struct slab_mempool transfer_pool;
+   struct slab_child_pool transfer_pool;
 
/* slabs for fd_hw_sample and fd_hw_sample_period allocations: */
struct slab_mempool sample_pool;
struct slab_mempool sample_period_pool;
 
/* sample-providers for hw queries: */
const struct fd_hw_sample_provider 
*sample_providers[MAX_HW_SAMPLE_PROVIDERS];
 
/* list of active queries: */
struct list_head active_queries;
diff --git a/src/gallium/drivers/freedreno/freedreno_resource.c 
b/src/gallium/drivers/freedreno/freedreno_resource.c
index 1874271..addfc40 100644
--- a/src/gallium/drivers/freedreno/freedreno_resource.c
+++ b/src/gallium/drivers/freedreno/freedreno_resource.c
@@ -418,21 +418,21 @@ fd_resource_transfer_unmap(struct pipe_context *pctx,
fd_bo_cpu_fini(rsc->bo);
if (rsc->stencil)
fd_bo_cpu_fini(rsc->stencil->bo);
}
 
util_range_add(&rsc->valid_buffer_range,
   ptrans->box.x,
   ptrans->box.x + ptrans->box.width);
 
pipe_resource_reference(&ptrans->resource, NULL);
-   slab_free_st(&ctx->transfer_pool, ptrans);
+   slab_free(&ctx->transfer_pool, ptrans);
 
free(trans->staging);
 }
 
 static void *
 fd_resource_transfer_map(struct pipe_context *pctx,
struct pipe_resource *prsc,
unsigned level, unsigned usage,
const struct pipe_box *box,
struct pipe_transfer **pptrans)
@@ -444,21 +444,21 @@ fd_resource_transfer_map(struct pipe_context *pctx,
struct pipe_transfer *ptrans;
enum pipe_format format = prsc->format;
uint32_t op = 0;
uint32_t offset;
char *buf;
int ret = 0;
 
DBG("prsc=%p, level=%u, usage=%x, box=%dx%d+%d,%d", prsc, level, usage,
box->width, box->height, box->x, box->y);
 
-   ptrans = slab_alloc_st(&ctx->transfer_pool);
+   ptrans = slab_alloc(&ctx->transfer_pool);
if (!ptrans)
return NULL;

[Mesa-dev] [PATCH] swr: Removed stalling SwrWaitForIdle from queries.

2016-09-27 Thread Bruce Cherniak

A previous fundamental change in stats gathering added a temporary
SwrWaitForIdle to begin_query and end_query.  The code has been reworked
to remove the stall.
---
 src/gallium/drivers/swr/swr_context.cpp |  33 +++
 src/gallium/drivers/swr/swr_context.h   |  11 ++-
 src/gallium/drivers/swr/swr_query.cpp   | 152 +---
 src/gallium/drivers/swr/swr_query.h |  10 +--
 4 files changed, 87 insertions(+), 119 deletions(-)

diff --git a/src/gallium/drivers/swr/swr_context.cpp 
b/src/gallium/drivers/swr/swr_context.cpp
index 15e60cd..cbc60e0 100644
--- a/src/gallium/drivers/swr/swr_context.cpp
+++ b/src/gallium/drivers/swr/swr_context.cpp
@@ -24,6 +24,7 @@
 #include "util/u_memory.h"
 #include "util/u_inlines.h"
 #include "util/u_format.h"
+#include "util/u_atomic.h"
 
 extern "C" {
 #include "util/u_transfer.h"
@@ -352,9 +353,9 @@ swr_UpdateStats(HANDLE hPrivateContext, const SWR_STATS 
*pStats)
if (!pDC)
   return;
 
-   struct swr_context *ctx = (struct swr_context *)pDC->swr_ctx;
+   struct swr_query_result *pqr = (struct swr_query_result *)pDC->pStats;
 
-   SWR_STATS *pSwrStats = &ctx->stats;
+   SWR_STATS *pSwrStats = &pqr->core;
 
pSwrStats->DepthPassCount += pStats->DepthPassCount;
pSwrStats->PsInvocations += pStats->PsInvocations;
@@ -369,22 +370,24 @@ swr_UpdateStatsFE(HANDLE hPrivateContext, const 
SWR_STATS_FE *pStats)
if (!pDC)
   return;
 
-   struct swr_context *ctx = (struct swr_context *)pDC->swr_ctx;
+   struct swr_query_result *pqr = (struct swr_query_result *)pDC->pStats;
 
-   SWR_STATS_FE *pSwrStats = &ctx->statsFE;
-   pSwrStats->IaVertices += pStats->IaVertices;
-   pSwrStats->IaPrimitives += pStats->IaPrimitives;
-   pSwrStats->VsInvocations += pStats->VsInvocations;
-   pSwrStats->HsInvocations += pStats->HsInvocations;
-   pSwrStats->DsInvocations += pStats->DsInvocations;
-   pSwrStats->GsInvocations += pStats->GsInvocations;
-   pSwrStats->CInvocations += pStats->CInvocations;
-   pSwrStats->CPrimitives += pStats->CPrimitives;
-   pSwrStats->GsPrimitives += pStats->GsPrimitives;
+   SWR_STATS_FE *pSwrStats = &pqr->coreFE;
+   p_atomic_add(&pSwrStats->IaVertices, pStats->IaVertices);
+   p_atomic_add(&pSwrStats->IaPrimitives, pStats->IaPrimitives);
+   p_atomic_add(&pSwrStats->VsInvocations, pStats->VsInvocations);
+   p_atomic_add(&pSwrStats->HsInvocations, pStats->HsInvocations);
+   p_atomic_add(&pSwrStats->DsInvocations, pStats->DsInvocations);
+   p_atomic_add(&pSwrStats->GsInvocations, pStats->GsInvocations);
+   p_atomic_add(&pSwrStats->CInvocations, pStats->CInvocations);
+   p_atomic_add(&pSwrStats->CPrimitives, pStats->CPrimitives);
+   p_atomic_add(&pSwrStats->GsPrimitives, pStats->GsPrimitives);
 
for (unsigned i = 0; i < 4; i++) {
-  pSwrStats->SoPrimStorageNeeded[i] += pStats->SoPrimStorageNeeded[i];
-  pSwrStats->SoNumPrimsWritten[i] += pStats->SoNumPrimsWritten[i];
+  p_atomic_add(&pSwrStats->SoPrimStorageNeeded[i],
+pStats->SoPrimStorageNeeded[i]);
+  p_atomic_add(&pSwrStats->SoNumPrimsWritten[i],
+pStats->SoNumPrimsWritten[i]);
}
 }
 
diff --git a/src/gallium/drivers/swr/swr_context.h 
b/src/gallium/drivers/swr/swr_context.h
index 6854d69..eecfe0d 100644
--- a/src/gallium/drivers/swr/swr_context.h
+++ b/src/gallium/drivers/swr/swr_context.h
@@ -92,7 +92,7 @@ struct swr_draw_context {
float userClipPlanes[PIPE_MAX_CLIP_PLANES][4];
 
SWR_SURFACE_STATE renderTargets[SWR_NUM_ATTACHMENTS];
-   void *swr_ctx;
+   void *pStats;
 };
 
 /* gen_llvm_types FINI */
@@ -159,9 +159,6 @@ struct swr_context {
/* SWR private state - draw context */
struct swr_draw_context swrDC;
 
-   SWR_STATS stats;
-   SWR_STATS_FE statsFE;
-
unsigned dirty; /**< Mask of SWR_NEW_x flags */
 };
 
@@ -172,11 +169,13 @@ swr_context(struct pipe_context *pipe)
 }
 
 static INLINE void
-swr_update_draw_context(struct swr_context *ctx)
+swr_update_draw_context(struct swr_context *ctx,
+  struct swr_query_result *pqr = nullptr)
 {
swr_draw_context *pDC =
   (swr_draw_context *)SwrGetPrivateContextState(ctx->swrContext);
-   ctx->swrDC.swr_ctx = ctx;
+   if (pqr)
+  ctx->swrDC.pStats = pqr;
memcpy(pDC, &ctx->swrDC, sizeof(swr_draw_context));
 }
 
diff --git a/src/gallium/drivers/swr/swr_query.cpp 
b/src/gallium/drivers/swr/swr_query.cpp
index c51c529..8bb0b16 100644
--- a/src/gallium/drivers/swr/swr_query.cpp
+++ b/src/gallium/drivers/swr/swr_query.cpp
@@ -71,48 +71,6 @@ swr_destroy_query(struct pipe_context *pipe, struct 
pipe_query *q)
 }
 
 
-static void
-swr_gather_stats(struct pipe_context *pipe, struct swr_query *pq)
-{
-   struct swr_context *ctx = swr_context(pipe);
-
-   assert(pq->result);
-   struct swr_query_result *result = pq->result;
-   boolean enable_stats = pq->enable_stats;
-
-   /* A few results don't require the core, so don't involve it */
-   switch (pq->type) {
-   case PIPE_QUERY_TIMESTAMP:
-   case PIPE_QUERY_TIME_ELAPSED:
-  result->timestamp

Re: [Mesa-dev] [PATCH] st/va: Fix vaSyncSurface with no outstanding operation

2016-09-27 Thread Mark Thompson

On 27/09/16 16:48, Andy Furniss wrote:
> Ok, thanks, so with that I am back to where I was before it stopped working.
> 
> In summary baseline works but JM ref decoder doesn't like the pocs.
> 
> b frames don't work properly, but then they don't with gst vaapi either. They 
> do work with gst omx.
> 
> Looking at output from printfs some differences I see vs gstreamer.
> 
> maxrefs is hardcoded to 2 which has side effects =

Easily fixed: 
.

> enc_pic.pc.enc_b_pic_pattern = 1 vs 0 - seems harmless in practice.
> 
> There is code that for my h/w disables dual instance when maxrefs > 1 which 
> means half speed, but there seems to be a bottleneck elsewhere that makes 
> avconv 3x slower than gstreamer anyway.

I'm not really sure how fast I should be expecting this stuff to work - can you 
offer any numbers about how fast it goes for you?  I only get ~30fps doing 
1080p transcode on an R7 360, which compares rather badly to ~240fps on the 
Skylake 6300 in the same machine.

> gop, it seems that avconv with -g doesn't set h264->intra_idr_period in 
> handleVAEncSequenceParameterBufferType which gets used to set 
> context->desc.h264enc.gop_size and enc_pic.rc.gop_size

Hmm, this is an error on my part.  I'll fix it, but I need to test a bit 
further to be sure I'm not breaking anything else.

> pocs: gstreamer increments h264->CurrPic.TopFieldOrderCnt in 2s, avconv in 1s. The 
> code divides this by 2 in handleVAEncPictureParameterBufferType

That code is just wrong, isn't it?  It works for pic_order_cnt_type == 2, but 
it needs to look at the pic_order_cnt_lsb and delta_pic_order_cnt values on the 
slice header for other cases.  (Looking at gstreamer, it has POC type 0 (as I 
do), but then the POC values match what POC type 2 would create in the 
no-B-frame case, so this happens to work.)
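
For reference, the pic_order_cnt_type == 2 case can be written down in a few lines. This is a minimal sketch of the frame-only derivation following clause 8.2.1.3 of the H.264 specification; the function name and parameter names are made up for the example, and field pictures as well as the FrameNumOffset wrap-around bookkeeping are left to the caller.

```c
#include <assert.h>

/* POC of a frame for pic_order_cnt_type == 2 (frames only).
 * frame_num_offset: accumulated offset from frame_num wrap-arounds
 * frame_num:        frame_num from the slice header
 * is_idr:           nonzero for IDR pictures
 * is_reference:     nonzero when nal_ref_idc != 0 */
static int
poc_type2(int frame_num_offset, int frame_num, int is_idr, int is_reference)
{
   int poc;

   if (is_idr)
      return 0;

   poc = 2 * (frame_num_offset + frame_num);
   if (!is_reference)
      poc -= 1; /* non-reference pictures get the odd value below */
   return poc;
}
```

With only reference frames this yields 0, 2, 4, ..., which is why halving TopFieldOrderCnt gives a usable counter in that case but not for POC types 0 and 1, where the value depends on what the slice header carries.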

I'll see if I can make a patch for this.

Thanks,

- Mark


[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)

2016-09-27 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97879

--- Comment #29 from Silvan Jegen  ---
(In reply to Micael Bergeron from comment #28)
> I asked Psyonix for the debug symbols but had no response.
> http://psyonix.com/forum/viewtopic.php?f=36&t=27894

Maybe you can add a link to this bug report on their support forums? I have
seen that people at Feral who do ports to Linux have commented on mesa bug
reports about the games they are porting.


[Mesa-dev] [PATCH v3 1/6] nv50/ir: add preliminary support for SHLADD

2016-09-27 Thread Samuel Pitoiset

This instruction has been available since SM20 (Fermi) and computes
(a << b) + c in one shot. In some situations, IMAD should be
replaced by SHLADD when its multiplier is a power of 2, and ADD+SHL
should be replaced by SHLADD as well.
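
The arithmetic behind both replacements can be checked with a small host-side model. This is only a sketch in plain C with made-up helper names (shladd, pow2_shift, mad_via_shladd); it is not codegen code and says nothing about how the actual optimization pass is structured.

```c
#include <assert.h>
#include <stdint.h>

/* Reference model of the instruction: (a << shift) + c. */
static uint32_t
shladd(uint32_t a, unsigned shift, uint32_t c)
{
   assert(shift < 32); /* the shift is a small immediate */
   return (a << shift) + c;
}

/* Returns log2(m) if m is a power of two, -1 otherwise. */
static int
pow2_shift(uint32_t m)
{
   int shift = 0;

   if (m == 0 || (m & (m - 1)) != 0)
      return -1;
   while ((m >>= 1) != 0)
      shift++;
   return shift;
}

/* a * m + c, going through shladd when the multiplier allows it. */
static uint32_t
mad_via_shladd(uint32_t a, uint32_t m, uint32_t c)
{
   int shift = pow2_shift(m);

   if (shift >= 0)
      return shladd(a, (unsigned)shift, c);
   return a * m + c;
}
```

Since unsigned arithmetic wraps modulo 2^32 for both the multiply and the shift, the two forms agree for every input whenever the multiplier is a power of two.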

v3: - fix neg flag
- remove isFloatType() in isOpSupported()
- teach isModSupported() about SHLADD

v2: - fix up the commutative table on nv50/ir

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir.h   |  1 +
 src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp   |  1 +
 src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp  |  6 +++---
 src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp |  5 +++--
 src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 11 +--
 5 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
index d6011d9..bedbdcc 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
@@ -57,6 +57,7 @@ enum operation
OP_MAD,
OP_FMA,
OP_SAD, // abs(src0 - src1) + src2
+   OP_SHLADD,
OP_ABS,
OP_NEG,
OP_NOT,
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
index 22f2f5d..dbd0f7d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
@@ -86,6 +86,7 @@ const char *operationStr[OP_LAST + 1] =
"mad",
"fma",
"sad",
+   "shladd",
"abs",
"neg",
"not",
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp
index 7d7b315..273ec34 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp
@@ -30,7 +30,7 @@ const uint8_t Target::operationSrcNr[] =
0, 0,   // NOP, PHI
0, 0, 0, 0, // UNION, SPLIT, MERGE, CONSTRAINT
1, 1, 2,// MOV, LOAD, STORE
-   2, 2, 2, 2, 2, 3, 3, 3, // ADD, SUB, MUL, DIV, MOD, MAD, FMA, SAD
+   2, 2, 2, 2, 2, 3, 3, 3, 3, // ADD, SUB, MUL, DIV, MOD, MAD, FMA, SAD, SHLADD
1, 1, 1,// ABS, NEG, NOT
2, 2, 2, 2, 2,  // AND, OR, XOR, SHL, SHR
2, 2, 1,// MAX, MIN, SAT
@@ -70,10 +70,10 @@ const OpClass Target::operationClass[] =
OPCLASS_MOVE,
OPCLASS_LOAD,
OPCLASS_STORE,
-   // ADD, SUB, MUL; DIV, MOD; MAD, FMA, SAD
+   // ADD, SUB, MUL; DIV, MOD; MAD, FMA, SAD, SHLADD
OPCLASS_ARITH, OPCLASS_ARITH, OPCLASS_ARITH,
OPCLASS_ARITH, OPCLASS_ARITH,
-   OPCLASS_ARITH, OPCLASS_ARITH, OPCLASS_ARITH,
+   OPCLASS_ARITH, OPCLASS_ARITH, OPCLASS_ARITH, OPCLASS_ARITH,
// ABS, NEG; NOT, AND, OR, XOR; SHL, SHR
OPCLASS_CONVERT, OPCLASS_CONVERT,
OPCLASS_LOGIC, OPCLASS_LOGIC, OPCLASS_LOGIC, OPCLASS_LOGIC,
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
index 1246cc6..83b4102 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
@@ -115,12 +115,12 @@ void TargetNV50::initOpInfo()
{
   // ADD, MUL, MAD, FMA, AND, OR, XOR, MAX, MIN, SET_AND, SET_OR, SET_XOR,
   // SET, SELP, SLCT
-  0x0670ca00, 0x003f, 0x, 0x
+  0x0ce0ca00, 0x007e, 0x, 0x
};
static const uint32_t shortForm[(OP_LAST + 31) / 32] =
{
   // MOV, ADD, SUB, MUL, MAD, SAD, RCP, L/PINTERP, TEX, TXF
-  0x00014e40, 0x0040, 0x0930, 0x
+  0x00014e40, 0x0080, 0x1260, 0x
};
static const operation noDestList[] =
{
@@ -438,6 +438,7 @@ TargetNV50::isOpSupported(operation op, DataType ty) const
case OP_EXTBF:
case OP_EXIT: // want exit modifier instead (on NOP if required)
case OP_MEMBAR:
+   case OP_SHLADD:
   return false;
case OP_SAD:
   return ty == TYPE_S32;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
index f75e395..8606065 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
@@ -105,6 +105,7 @@ static const struct opProperties _initProps[] =
{ OP_MAX,0x3, 0x3, 0x0, 0x0, 0x2, 0x2 },
{ OP_MIN,0x3, 0x3, 0x0, 0x0, 0x2, 0x2 },
{ OP_MAD,0x7, 0x0, 0x0, 0x8, 0x6, 0x2 | 0x8 }, // special c[] constraint
+   { OP_SHLADD, 0x5, 0x0, 0x0, 0x0, 0x4, 0x6 },
{ OP_MADSP,  0x0, 0x0, 0x0, 0x0, 0x6, 0x2 },
{ OP_ABS,0x0, 0x0, 0x0, 0x0, 0x1, 0x0 },
{ OP_NEG,0x0, 0x1, 0x0, 0x0, 0x1, 0x0 },
@@ -158,13 +159,13 @@ void TargetNVC0::initOpInfo()
{
   // ADD, MUL, MAD, FMA, AND, OR, XOR, MAX, MIN, SET_AND, SET_OR, SET_XOR,
   //

Re: [Mesa-dev] [PATCH v3 2/6] nvc0/ir: add emission for SHLADD

2016-09-27 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Tue, Sep 27, 2016 at 2:57 PM, Samuel Pitoiset
 wrote:
> Unfortunately, we can't use the emit helpers for GF100/GK110
> because src1 and src2 are swapped.
>
> v3: - remove useless use of src1 neg mod
> v2: - s/emitSHLADD/emitISCADD for GM107 emitter
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 52 
> ++
>  .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 32 +
>  .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp  | 43 ++
>  3 files changed, 127 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> index 61c450b..ce20ed3 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
> @@ -96,6 +96,7 @@ private:
> void emitDMUL(const Instruction *);
> void emitIMAD(const Instruction *);
> void emitISAD(const Instruction *);
> +   void emitSHLADD(const Instruction *);
> void emitFMAD(const Instruction *);
> void emitDMAD(const Instruction *);
> void emitMADSP(const Instruction *i);
> @@ -757,6 +758,54 @@ CodeEmitterGK110::emitISAD(const Instruction *i)
>  }
>
>  void
> +CodeEmitterGK110::emitSHLADD(const Instruction *i)
> +{
> +   uint8_t addOp = (i->src(2).mod.neg() << 1) | i->src(0).mod.neg();
> +   const ImmediateValue *imm = i->src(1).get()->asImm();
> +   assert(imm);
> +
> +   if (i->src(2).getFile() == FILE_IMMEDIATE) {
> +  code[0] = 0x1;
> +  code[1] = 0xc0c << 20;
> +   } else {
> +  code[0] = 0x2;
> +  code[1] = 0x20c << 20;
> +   }
> +   code[1] |= addOp << 19;
> +
> +   emitPredicate(i);
> +
> +   defId(i->def(0), 2);
> +   srcId(i->src(0), 10);
> +
> +   if (i->flagsDef >= 0)
> +  code[1] |= 1 << 18;
> +
> +   assert(!(imm->reg.data.u32 & 0xffe0));
> +   code[1] |= imm->reg.data.u32 << 10;
> +
> +   switch (i->src(2).getFile()) {
> +   case FILE_GPR:
> +  assert(code[0] & 0x2);
> +  code[1] |= 0xc << 28;
> +  srcId(i->src(2), 23);
> +  break;
> +   case FILE_MEMORY_CONST:
> +  assert(code[0] & 0x2);
> +  code[1] |= 0x4 << 28;
> +  setCAddress14(i->src(2));
> +  break;
> +   case FILE_IMMEDIATE:
> +  assert(code[0] & 0x1);
> +  setShortImmediate(i, 2);
> +  break;
> +   default:
> +  assert(!"bad src2 file");
> +  break;
> +   }
> +}
> +
> +void
>  CodeEmitterGK110::emitNOT(const Instruction *i)
>  {
> code[0] = 0x0003fc02; // logop(mov2) dst, 0, not src
> @@ -2403,6 +2452,9 @@ CodeEmitterGK110::emitInstruction(Instruction *insn)
> case OP_SAD:
>emitISAD(insn);
>break;
> +   case OP_SHLADD:
> +  emitSHLADD(insn);
> +  break;
> case OP_NOT:
>emitNOT(insn);
>break;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> index cfde66c..3fedafd 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
> @@ -152,6 +152,7 @@ private:
> void emitIADD();
> void emitIMUL();
> void emitIMAD();
> +   void emitISCADD();
> void emitIMNMX();
> void emitICMP();
> void emitISET();
> @@ -1813,6 +1814,34 @@ CodeEmitterGM107::emitIMAD()
>  }
>
>  void
> +CodeEmitterGM107::emitISCADD()
> +{
> +   switch (insn->src(2).getFile()) {
> +   case FILE_GPR:
> +  emitInsn(0x5c18);
> +  emitGPR (0x14, insn->src(2));
> +  break;
> +   case FILE_MEMORY_CONST:
> +  emitInsn(0x4c18);
> +  emitCBUF(0x22, -1, 0x14, 16, 2, insn->src(2));
> +  break;
> +   case FILE_IMMEDIATE:
> +  emitInsn(0x3818);
> +  emitIMMD(0x14, 19, insn->src(2));
> +  break;
> +   default:
> +  assert(!"bad src1 file");
> +  break;
> +   }
> +   emitNEG (0x31, insn->src(0));
> +   emitNEG (0x30, insn->src(2));
> +   emitCC  (0x2f);
> +   emitIMMD(0x27, 5, insn->src(1));
> +   emitGPR (0x08, insn->src(0));
> +   emitGPR (0x00, insn->def(0));
> +}
> +
> +void
>  CodeEmitterGM107::emitIMNMX()
>  {
> switch (insn->src(1).getFile()) {
> @@ -3098,6 +3127,9 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
>   emitIMAD();
>}
>break;
> +   case OP_SHLADD:
> +  emitISCADD();
> +  break;
> case OP_MIN:
> case OP_MAX:
>if (isFloatType(insn->dType)) {
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> index d8ca6ab..0be9f7a 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> @@ -101,6 +101,7 @@ private:
> void emitDMUL(const Instruction *);
> void emitIMAD(const Instruction *);
> void emitISAD(const Instruction *);
> +   void emitSHLA

Re: [Mesa-dev] [PATCH v3 1/6] nv50/ir: add preliminary support for SHLADD

2016-09-27 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Tue, Sep 27, 2016 at 2:55 PM, Samuel Pitoiset
 wrote:
> This instruction is available since SM20 (Fermi) and allows computing
> (a << b) + c in one shot. In some situations, IMAD should be
> replaced by SHLADD when b is a power of 2, and ADD+SHL should be
> replaced by SHLADD as well.
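
The equivalence described above can be sketched in C as follows. This is not Mesa code: the helper names shladd, imad and log2_if_pow2 are hypothetical, and the sketch assumes 32-bit unsigned wraparound arithmetic with a shift amount below 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference model of SHLADD: (a << b) + c with 32-bit wraparound. */
static uint32_t shladd(uint32_t a, uint32_t b, uint32_t c)
{
   assert(b < 32); /* assumption of this sketch: shift fits in 5 bits */
   return (a << b) + c;
}

/* Hypothetical reference model of IMAD: a * m + c with 32-bit wraparound. */
static uint32_t imad(uint32_t a, uint32_t m, uint32_t c)
{
   return a * m + c;
}

/* Returns log2(m) when m is a power of two, otherwise -1.
 * A non-negative result means imad(a, m, c) could be rewritten as
 * shladd(a, result, c). */
static int log2_if_pow2(uint32_t m)
{
   int shift = 0;

   if (m == 0 || (m & (m - 1)) != 0)
      return -1;
   while ((m >>= 1) != 0)
      shift++;
   return shift;
}
```

For example, a multiply-add by 8 maps to a shift of 3, so imad(5, 8, 7) and shladd(5, 3, 7) give the same result.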
>
> v3: - fix neg flag
> - remove isFloatType() in isOpSupported()
> - teach isModSupported() about SHLADD
>
> v2: - fix up the commutative table on nv50/ir
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir.h   |  1 +
>  src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp   |  1 +
>  src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp  |  6 +++---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp |  5 +++--
>  src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp | 11 +--
>  5 files changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> index d6011d9..bedbdcc 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
> @@ -57,6 +57,7 @@ enum operation
> OP_MAD,
> OP_FMA,
> OP_SAD, // abs(src0 - src1) + src2
> +   OP_SHLADD,
> OP_ABS,
> OP_NEG,
> OP_NOT,
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> index 22f2f5d..dbd0f7d 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
> @@ -86,6 +86,7 @@ const char *operationStr[OP_LAST + 1] =
> "mad",
> "fma",
> "sad",
> +   "shladd",
> "abs",
> "neg",
> "not",
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp
> index 7d7b315..273ec34 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.cpp
> @@ -30,7 +30,7 @@ const uint8_t Target::operationSrcNr[] =
> 0, 0,   // NOP, PHI
> 0, 0, 0, 0, // UNION, SPLIT, MERGE, CONSTRAINT
> 1, 1, 2,// MOV, LOAD, STORE
> -   2, 2, 2, 2, 2, 3, 3, 3, // ADD, SUB, MUL, DIV, MOD, MAD, FMA, SAD
> +   2, 2, 2, 2, 2, 3, 3, 3, 3, // ADD, SUB, MUL, DIV, MOD, MAD, FMA, SAD, 
> SHLADD
> 1, 1, 1,// ABS, NEG, NOT
> 2, 2, 2, 2, 2,  // AND, OR, XOR, SHL, SHR
> 2, 2, 1,// MAX, MIN, SAT
> @@ -70,10 +70,10 @@ const OpClass Target::operationClass[] =
> OPCLASS_MOVE,
> OPCLASS_LOAD,
> OPCLASS_STORE,
> -   // ADD, SUB, MUL; DIV, MOD; MAD, FMA, SAD
> +   // ADD, SUB, MUL; DIV, MOD; MAD, FMA, SAD, SHLADD
> OPCLASS_ARITH, OPCLASS_ARITH, OPCLASS_ARITH,
> OPCLASS_ARITH, OPCLASS_ARITH,
> -   OPCLASS_ARITH, OPCLASS_ARITH, OPCLASS_ARITH,
> +   OPCLASS_ARITH, OPCLASS_ARITH, OPCLASS_ARITH, OPCLASS_ARITH,
> // ABS, NEG; NOT, AND, OR, XOR; SHL, SHR
> OPCLASS_CONVERT, OPCLASS_CONVERT,
> OPCLASS_LOGIC, OPCLASS_LOGIC, OPCLASS_LOGIC, OPCLASS_LOGIC,
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> index 1246cc6..83b4102 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nv50.cpp
> @@ -115,12 +115,12 @@ void TargetNV50::initOpInfo()
> {
>// ADD, MUL, MAD, FMA, AND, OR, XOR, MAX, MIN, SET_AND, SET_OR, 
> SET_XOR,
>// SET, SELP, SLCT
> -  0x0670ca00, 0x003f, 0x, 0x
> +  0x0ce0ca00, 0x007e, 0x, 0x
> };
> static const uint32_t shortForm[(OP_LAST + 31) / 32] =
> {
>// MOV, ADD, SUB, MUL, MAD, SAD, RCP, L/PINTERP, TEX, TXF
> -  0x00014e40, 0x0040, 0x0930, 0x
> +  0x00014e40, 0x0080, 0x1260, 0x
> };
> static const operation noDestList[] =
> {
> @@ -438,6 +438,7 @@ TargetNV50::isOpSupported(operation op, DataType ty) const
> case OP_EXTBF:
> case OP_EXIT: // want exit modifier instead (on NOP if required)
> case OP_MEMBAR:
> +   case OP_SHLADD:
>return false;
> case OP_SAD:
>return ty == TYPE_S32;
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp 
> b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> index f75e395..8606065 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
> @@ -105,6 +105,7 @@ static const struct opProperties _initProps[] =
> { OP_MAX,0x3, 0x3, 0x0, 0x0, 0x2, 0x2 },
> { OP_MIN,0x3, 0x3, 0x0, 0x0, 0x2, 0x2 },
> { OP_MAD,0x7, 0x0, 0x0, 0x8, 0x6, 0x2 | 0x8 }, // special c[] 
> constraint
> +   { OP_SHLADD, 0x5, 0x0, 0x0, 0

Re: [Mesa-dev] [PATCH v3 02/14] mesa/main: add support for ARB_compute_variable_groups_size

2016-09-27 Thread Nicolai Hähnle


On 26.09.2016 19:23, Samuel Pitoiset wrote:

v2: - update formatting spec quotations (Ian)
- move the total_invocations check outside of the loop (Ian)

Signed-off-by: Samuel Pitoiset 
---
 src/mesa/main/api_validate.c | 96 
 src/mesa/main/api_validate.h |  4 ++
 src/mesa/main/compute.c  | 17 +++
 src/mesa/main/context.c  |  6 +++
 src/mesa/main/dd.h   |  9 
 src/mesa/main/extensions_table.h |  1 +
 src/mesa/main/get.c  | 12 +
 src/mesa/main/get_hash_params.py |  3 ++
 src/mesa/main/mtypes.h   | 24 +-
 src/mesa/main/shaderapi.c|  1 +
 src/mesa/main/shaderobj.c|  2 +
 11 files changed, 174 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index 6cb626a..fa24854 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -1096,6 +1096,7 @@ GLboolean
 _mesa_validate_DispatchCompute(struct gl_context *ctx,
const GLuint *num_groups)
 {
+   struct gl_shader_program *prog;
int i;
FLUSH_CURRENT(ctx, 0);

@@ -1128,6 +1129,88 @@ _mesa_validate_DispatchCompute(struct gl_context *ctx,
   }
}

+   /* The ARB_compute_variable_group_size spec says:
+*
+* "An INVALID_OPERATION error is generated by DispatchCompute if the active
+* program for the compute shader stage has a variable work group size."


Not sure what Ian said about this, but I seem to recall that the quotes 
are always indented slightly...



+*/
+   prog = ctx->_Shader->CurrentProgram[MESA_SHADER_COMPUTE];
+   if (prog->Comp.LocalSizeVariable) {
+  _mesa_error(ctx, GL_INVALID_OPERATION,
+  "glDispatchCompute(variable work group size forbidden)");
+  return GL_FALSE;
+   }
+
+   return GL_TRUE;
+}
+
+GLboolean
+_mesa_validate_DispatchComputeGroupSizeARB(struct gl_context *ctx,
+   const GLuint *num_groups,
+   const GLuint *group_size)
+{
+   struct gl_shader_program *prog;
+   GLuint total_invocations = 1;
+   int i;
+
+   FLUSH_CURRENT(ctx, 0);
+
+   if (!check_valid_to_compute(ctx, "glDispatchComputeGroupSizeARB"))
+  return GL_FALSE;
+
+   for (i = 0; i < 3; i++) {
+  /* The ARB_compute_variable_group_size spec says:
+   *
+   * "An INVALID_VALUE error is generated by DispatchComputeGroupSizeARB if
+   * any of <group_size_x>, <group_size_y>, or <group_size_z> is less than
+   * or equal to zero or greater than the maximum local work group size for
+   * compute shaders with variable group size
+   * (MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB) in the corresponding dimension."
+   *
+   * However, the "less than" is a spec bug because they are declared as
+   * unsigned integers.


... also here, where it would help a lot to visually set it apart from 
the rest of the comment, and in a number of places below :)




+   */
+  if (group_size[i] == 0 ||
+  group_size[i] > ctx->Const.MaxComputeVariableGroupSize[i]) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "glDispatchComputeGroupSizeARB(group_size_%c)", 'x' + i);
+ return GL_FALSE;
+  }
+
+  /* The ARB_compute_variable_group_size spec says:
+   *
+   * "An INVALID_OPERATION error is generated by
+   * DispatchComputeGroupSizeARB if the active program for the compute
+   * shader stage has a fixed work group size."
+   */
+  prog = ctx->_Shader->CurrentProgram[MESA_SHADER_COMPUTE];
+  if (prog->Comp.LocalSize[i] != 0) {
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "glDispatchComputeGroupSizeARB(fixed work group size "
+ "forbidden)");
+ return GL_FALSE;
+  }


You're using different logic here to check for fixed work group sizes 
than above in DispatchCompute. I think the approach above is better. And 
the check doesn't need to be inside the loop, does it?



+
+  total_invocations *= group_size[i];
+   }
+
+   /* The ARB_compute_variable_group_size spec says:
+*
+* "An INVALID_VALUE error is generated by DispatchComputeGroupSizeARB if
+* the product of <group_size_x>, <group_size_y>, and <group_size_z> exceeds
+* the implementation-dependent maximum local work group invocation count
+* for compute shaders with variable group size
+* (MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB)."
+*/
+   if (total_invocations > ctx->Const.MaxComputeVariableGroupInvocations) {
+  _mesa_error(ctx, GL_INVALID_VALUE,
+  "glDispatchComputeGroupSizeARB(product of local_sizes "
+  "exceeds MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB "
+  "(%d > %d))", total_invocations,
+  ctx->Const.MaxComputeVariableGroupInvocations);
+  return GL_FALSE;
+   }
+
return GL_TRUE;
 }

@@ -1137,6 +1220,7 @@ valid_dispatch_indirect(struct gl_context *ctx,

[Mesa-dev] [PATCH v3 2/6] nvc0/ir: add emission for SHLADD

2016-09-27 Thread Samuel Pitoiset

Unfortunately, we can't use the emit helpers for GF100/GK110
because src1 and src2 are swapped.

v3: - remove useless use of src1 neg mod
v2: - s/emitSHLADD/emitISCADD for GM107 emitter

Signed-off-by: Samuel Pitoiset 
---
 .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 52 ++
 .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 32 +
 .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp  | 43 ++
 3 files changed, 127 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
index 61c450b..ce20ed3 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
@@ -96,6 +96,7 @@ private:
void emitDMUL(const Instruction *);
void emitIMAD(const Instruction *);
void emitISAD(const Instruction *);
+   void emitSHLADD(const Instruction *);
void emitFMAD(const Instruction *);
void emitDMAD(const Instruction *);
void emitMADSP(const Instruction *i);
@@ -757,6 +758,54 @@ CodeEmitterGK110::emitISAD(const Instruction *i)
 }
 
 void
+CodeEmitterGK110::emitSHLADD(const Instruction *i)
+{
+   uint8_t addOp = (i->src(2).mod.neg() << 1) | i->src(0).mod.neg();
+   const ImmediateValue *imm = i->src(1).get()->asImm();
+   assert(imm);
+
+   if (i->src(2).getFile() == FILE_IMMEDIATE) {
+  code[0] = 0x1;
+  code[1] = 0xc0c << 20;
+   } else {
+  code[0] = 0x2;
+  code[1] = 0x20c << 20;
+   }
+   code[1] |= addOp << 19;
+
+   emitPredicate(i);
+
+   defId(i->def(0), 2);
+   srcId(i->src(0), 10);
+
+   if (i->flagsDef >= 0)
+  code[1] |= 1 << 18;
+
+   assert(!(imm->reg.data.u32 & 0xffe0));
+   code[1] |= imm->reg.data.u32 << 10;
+
+   switch (i->src(2).getFile()) {
+   case FILE_GPR:
+  assert(code[0] & 0x2);
+  code[1] |= 0xc << 28;
+  srcId(i->src(2), 23);
+  break;
+   case FILE_MEMORY_CONST:
+  assert(code[0] & 0x2);
+  code[1] |= 0x4 << 28;
+  setCAddress14(i->src(2));
+  break;
+   case FILE_IMMEDIATE:
+  assert(code[0] & 0x1);
+  setShortImmediate(i, 2);
+  break;
+   default:
+  assert(!"bad src2 file");
+  break;
+   }
+}
+
+void
 CodeEmitterGK110::emitNOT(const Instruction *i)
 {
code[0] = 0x0003fc02; // logop(mov2) dst, 0, not src
@@ -2403,6 +2452,9 @@ CodeEmitterGK110::emitInstruction(Instruction *insn)
case OP_SAD:
   emitISAD(insn);
   break;
+   case OP_SHLADD:
+  emitSHLADD(insn);
+  break;
case OP_NOT:
   emitNOT(insn);
   break;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index cfde66c..3fedafd 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -152,6 +152,7 @@ private:
void emitIADD();
void emitIMUL();
void emitIMAD();
+   void emitISCADD();
void emitIMNMX();
void emitICMP();
void emitISET();
@@ -1813,6 +1814,34 @@ CodeEmitterGM107::emitIMAD()
 }
 
 void
+CodeEmitterGM107::emitISCADD()
+{
+   switch (insn->src(2).getFile()) {
+   case FILE_GPR:
+  emitInsn(0x5c18);
+  emitGPR (0x14, insn->src(2));
+  break;
+   case FILE_MEMORY_CONST:
+  emitInsn(0x4c18);
+  emitCBUF(0x22, -1, 0x14, 16, 2, insn->src(2));
+  break;
+   case FILE_IMMEDIATE:
+  emitInsn(0x3818);
+  emitIMMD(0x14, 19, insn->src(2));
+  break;
+   default:
+  assert(!"bad src1 file");
+  break;
+   }
+   emitNEG (0x31, insn->src(0));
+   emitNEG (0x30, insn->src(2));
+   emitCC  (0x2f);
+   emitIMMD(0x27, 5, insn->src(1));
+   emitGPR (0x08, insn->src(0));
+   emitGPR (0x00, insn->def(0));
+}
+
+void
 CodeEmitterGM107::emitIMNMX()
 {
switch (insn->src(1).getFile()) {
@@ -3098,6 +3127,9 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
  emitIMAD();
   }
   break;
+   case OP_SHLADD:
+  emitISCADD();
+  break;
case OP_MIN:
case OP_MAX:
   if (isFloatType(insn->dType)) {
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
index d8ca6ab..0be9f7a 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
@@ -101,6 +101,7 @@ private:
void emitDMUL(const Instruction *);
void emitIMAD(const Instruction *);
void emitISAD(const Instruction *);
+   void emitSHLADD(const Instruction *a);
void emitFMAD(const Instruction *);
void emitDMAD(const Instruction *);
void emitMADSP(const Instruction *);
@@ -759,6 +760,45 @@ CodeEmitterNVC0::emitIMAD(const Instruction *i)
 }
 
 void
+CodeEmitterNVC0::emitSHLADD(const Instruction *i)
+{
+   uint8_t addOp = (i->src(2).mod.neg() << 1) | i->src(0).mod.neg();
+   const ImmediateValue *imm = i->src(1).get()

Re: [Mesa-dev] [PATCH v3 04/14] glsl: process local_size_variable input qualifier

2016-09-27 Thread Nicolai Hähnle


On 26.09.2016 19:23, Samuel Pitoiset wrote:

This is the new layout qualifier introduced by
ARB_compute_variable_group_size, which allows using a variable work
group size.

Signed-off-by: Samuel Pitoiset 
Reviewed-by: Ian Romanick 
---
 src/compiler/glsl/ast.h  |  5 +
 src/compiler/glsl/ast_type.cpp   |  6 ++
 src/compiler/glsl/glsl_parser.yy | 13 +
 src/compiler/glsl/glsl_parser_extras.cpp |  6 ++
 src/compiler/glsl/glsl_parser_extras.h   |  6 ++
 5 files changed, 36 insertions(+)

diff --git a/src/compiler/glsl/ast.h b/src/compiler/glsl/ast.h
index 4c648d0..55f009a 100644
--- a/src/compiler/glsl/ast.h
+++ b/src/compiler/glsl/ast.h
@@ -553,6 +553,11 @@ struct ast_type_qualifier {
   */
  unsigned local_size:3;

+/** \name Layout qualifiers for ARB_compute_variable_group_size. */
+/** \{ */
+unsigned local_size_variable:1;
+/** \} */
+
 /** \name Layout and memory qualifiers for 
ARB_shader_image_load_store. */
 /** \{ */
 unsigned early_fragment_tests:1;
diff --git a/src/compiler/glsl/ast_type.cpp b/src/compiler/glsl/ast_type.cpp
index f3f6b29..3f19f1f 100644
--- a/src/compiler/glsl/ast_type.cpp
+++ b/src/compiler/glsl/ast_type.cpp
@@ -503,6 +503,7 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
  state->in_qualifier->flags.q.local_size == 0;

   valid_in_mask.flags.q.local_size = 7;
+  valid_in_mask.flags.q.local_size_variable = 1;
   break;
default:
   _mesa_glsl_error(loc, state,
@@ -580,6 +581,10 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
   this->point_mode = q.point_mode;
}

+   if (q.flags.q.local_size_variable) {
+  state->cs_input_local_size_variable_specified = true;
+   }


What if previously a fixed local group size was specified? I think you 
need to check for this either here or in the next patch.
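
One way such a check could look, as a rough sketch only: the cover letter lists a separate patch that rejects compute shaders with both fixed and variable local size, and the code below is not that patch. The struct, its fields and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-shader state; the two flags mirror the idea of
 * "a fixed local size was seen" and "local_size_variable was seen". */
struct sketch_cs_state {
   bool fixed_specified;
   bool variable_specified;
};

/* Merges one layout declaration into the state.
 * Returns NULL on success, or a static error string when the shader would
 * end up with both a fixed and a variable local size. The state is left
 * untouched on error. */
static const char *
sketch_merge_cs_qualifier(struct sketch_cs_state *state,
                          bool fixed, bool variable)
{
   bool any_fixed, any_variable;

   assert(state != NULL);

   any_fixed = state->fixed_specified || fixed;
   any_variable = state->variable_specified || variable;

   if (any_fixed && any_variable)
      return "compute shader has both fixed and variable local size";

   state->fixed_specified = any_fixed;
   state->variable_specified = any_variable;
   return NULL;
}
```

Because the test uses the accumulated state, the conflict is caught regardless of which declaration comes first.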




+
if (create_node) {
   if (create_gs_ast) {
  node = new(mem_ctx) ast_gs_input_layout(*loc, q.prim_type);
@@ -653,6 +658,7 @@ ast_type_qualifier::validate_flags(YYLTYPE *loc,
 bad.flags.q.prim_type ? " prim_type" : "",
 bad.flags.q.max_vertices ? " max_vertices" : "",
 bad.flags.q.local_size ? " local_size" : "",
+bad.flags.q.local_size_variable ? " local_size_variable" : "",
 bad.flags.q.early_fragment_tests ? " early_fragment_tests" : "",
 bad.flags.q.explicit_image_format ? " image_format" : "",
 bad.flags.q.coherent ? " coherent" : "",


You need to add a %s to the monster format string above. Doesn't this 
trigger a compiler warning?
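
To illustrate the point with a reduced example: in C, arguments beyond the last conversion in a printf-style format are evaluated but otherwise ignored, so an argument added in the middle of the list without a matching %s shifts every later string by one slot and drops the last one. The sketch below keeps conversions and arguments in step. The struct and function are hypothetical and much smaller than the real format string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical reduced set of "bad qualifier" flags. */
struct sketch_bad_flags {
   bool local_size;
   bool local_size_variable;
   bool early_fragment_tests;
};

/* Formats the offending qualifier names into buf and returns the length
 * snprintf reports. There are three "%s" conversions for three arguments;
 * adding a fourth argument requires adding a fourth "%s". */
static int
sketch_format_bad_flags(char *buf, size_t size,
                        const struct sketch_bad_flags *bad)
{
   assert(buf != NULL && bad != NULL);

   return snprintf(buf, size, "invalid qualifiers:%s%s%s",
                   bad->local_size ? " local_size" : "",
                   bad->local_size_variable ? " local_size_variable" : "",
                   bad->early_fragment_tests ? " early_fragment_tests" : "");
}
```

Whether a compiler warns about the mismatch depends on the warning flags in use and on the function being annotated as printf-like.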


Cheers,
Nicolai



diff --git a/src/compiler/glsl/glsl_parser.yy b/src/compiler/glsl/glsl_parser.yy
index 9e1fd9e..38cbd3f 100644
--- a/src/compiler/glsl/glsl_parser.yy
+++ b/src/compiler/glsl/glsl_parser.yy
@@ -1491,6 +1491,19 @@ layout_qualifier_id:
  }
   }

+  /* Layout qualifiers for ARB_compute_variable_group_size. */
+  if (!$$.flags.i) {
+ if (match_layout_qualifier($1, "local_size_variable", state) == 0) {
+$$.flags.q.local_size_variable = 1;
+ }
+
+ if ($$.flags.i && !state->ARB_compute_variable_group_size_enable) {
+_mesa_glsl_error(& @1, state,
+ "qualifier `local_size_variable` requires "
+ "ARB_compute_variable_group_size");
+ }
+  }
+
   if (!$$.flags.i) {
  _mesa_glsl_error(& @1, state, "unrecognized layout identifier "
   "`%s'", $1);
diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index eff5235..e200b7d 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -297,6 +297,8 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct 
gl_context *_ctx,
   sizeof(this->atomic_counter_offsets));
this->allow_extension_directive_midshader =
   ctx->Const.AllowGLSLExtensionDirectiveMidShader;
+
+   this->cs_input_local_size_variable_specified = false;
 }

 /**
@@ -1676,6 +1678,7 @@ set_shader_inout_layout(struct gl_shader *shader,
if (shader->Stage != MESA_SHADER_COMPUTE) {
   /* Should have been prevented by the parser. */
   assert(!state->cs_input_local_size_specified);
+  assert(!state->cs_input_local_size_variable_specified);
}

if (shader->Stage != MESA_SHADER_FRAGMENT) {
@@ -1791,6 +1794,9 @@ set_shader_inout_layout(struct gl_shader *shader,
  for (int i = 0; i < 3; i++)
 shader->info.Comp.LocalSize[i] = 0;
   }
+
+  shader->info.Comp.LocalSizeVariable =
+ state->cs_input_local_size_variable_specified;
   break;

case MESA_SHADER_FRAGMENT:
diff --git a/src/compiler/glsl/glsl_parser_extras.h 
b/src/compiler/glsl/glsl_parser_extras.h
index 7528df7..127e

Re: [Mesa-dev] [PATCH v3 00/14] add support for ARB_compute_variable_group_size

2016-09-27 Thread Nicolai Hähnle


On 26.09.2016 19:23, Samuel Pitoiset wrote:

v3: - use a new case statement in r600_pipe_common.c
- fix compilation with softpipe
- initialize max_variable_threads_per_block to 0


I have sent some remarks on patches 2 and 4. Patches 1, 3, 5-11:

Reviewed-by: Nicolai Hähnle 



v2: - update formatting spec quotations
- add PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK
- expose the ext based on that new cap
- add missing relnotes
- various cosmetic changes

From original cover-letter:

Hi,

This series implements ARB_compute_variable_group_size written against GL 4.3.
This extension allows dispatching compute work with a variable work group size
via a new function called glDispatchComputeGroupSizeARB().

Because this extension is pretty similar to ARB_compute_shader, all Gallium
drivers which already support compute shaders will expose
ARB_compute_variable_group_size with that series.

I did write a bunch of piglit tests, have a look here if you want:
https://lists.freedesktop.org/archives/piglit/2016-September/020755.html

All tests pass on Fermi (GF119) as well as all previous compute shader tests.

Marek, Nicolai and other AMD folks, I don't know if radeonsi will need a fix
somewhere for handling a variable work group size, but as I don't have the
hardware, I can't test. Let me know if something needs to be slightly updated.

Please review,
Thanks!

Samuel Pitoiset (14):
  glapi: add entry points for GL_ARB_compute_variable_group_size
  mesa/main: add support for ARB_compute_variable_groups_size
  glsl: add enable flags for ARB_compute_variable_group_size
  glsl: process local_size_variable input qualifier
  glsl: reject compute shaders with fixed and variable local size
  glsl/linker: handle errors when a variable local size is used
  glsl: add gl_LocalGroupSizeARB as a system value
  gallium: add PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK
  st/mesa: add mapping for SYSTEM_VALUE_LOCAL_GROUP_SIZE
  st/mesa: add support for dispatching a variable local size
  st/mesa: expose ARB_compute_variable_group_size
  nv50/ir: use 1024 threads/block for variable local size
  nvc0: expose ARB_compute_variable_group_size
  docs: mark ARB_compute_variable_group_size as done for nvc0

 docs/features.txt  |  2 +-
 docs/relnotes/12.1.0.html  |  1 +
 src/compiler/glsl/ast.h|  5 ++
 src/compiler/glsl/ast_to_hir.cpp   | 14 
 src/compiler/glsl/ast_type.cpp |  6 ++
 src/compiler/glsl/builtin_variables.cpp|  6 ++
 src/compiler/glsl/glsl_parser.yy   | 13 +++
 src/compiler/glsl/glsl_parser_extras.cpp   |  7 ++
 src/compiler/glsl/glsl_parser_extras.h |  8 ++
 src/compiler/glsl/linker.cpp   | 25 +-
 src/compiler/glsl/standalone.cpp   |  4 +
 src/compiler/glsl/standalone_scaffolding.cpp   |  5 ++
 src/compiler/shader_enums.h|  1 +
 src/gallium/docs/source/screen.rst |  4 +
 src/gallium/drivers/ilo/ilo_screen.c   |  2 +
 .../drivers/nouveau/codegen/nv50_ir_target.h   |  3 +-
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |  2 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |  1 +
 src/gallium/drivers/radeon/r600_pipe_common.c  |  2 +
 src/gallium/drivers/softpipe/sp_screen.c   |  1 +
 src/gallium/include/pipe/p_defines.h   |  3 +-
 .../glapi/gen/ARB_compute_variable_group_size.xml  | 25 ++
 src/mapi/glapi/gen/Makefile.am |  1 +
 src/mapi/glapi/gen/gl_API.xml  |  4 +-
 src/mesa/main/api_validate.c   | 96 ++
 src/mesa/main/api_validate.h   |  4 +
 src/mesa/main/compute.c| 25 ++
 src/mesa/main/compute.h|  5 ++
 src/mesa/main/context.c|  6 ++
 src/mesa/main/dd.h |  9 ++
 src/mesa/main/extensions_table.h   |  1 +
 src/mesa/main/get.c| 12 +++
 src/mesa/main/get_hash_params.py   |  3 +
 src/mesa/main/mtypes.h | 24 +-
 src/mesa/main/shaderapi.c  |  1 +
 src/mesa/main/shaderobj.c  |  2 +
 src/mesa/main/tests/dispatch_sanity.cpp|  3 +
 src/mesa/state_tracker/st_cb_compute.c | 15 +++-
 src/mesa/state_tracker/st_extensions.c | 22 +
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +
 40 files changed, 365 insertions(+), 10 deletions(-)
 create mode 100644 src/mapi/glapi/gen/ARB_compute_variable_group_size.xml


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 8/8] intel/isl: Allow non-2D HiZ surfaces

2016-09-27 Thread Jason Ekstrand

On Sep 27, 2016 10:10 AM, "Nanley Chery"  wrote:
>
> On Tue, Sep 27, 2016 at 09:22:05AM -0700, Jason Ekstrand wrote:
> > ---
> >  src/intel/isl/isl.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
>
> This patch is
> Reviewed-by: Nanley Chery 

That's all of them. Feel free to push the lot (I'm on vacation) or to put
them in your series.

> > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > index 749d228..9735d26 100644
> > --- a/src/intel/isl/isl.c
> > +++ b/src/intel/isl/isl.c
> > @@ -1346,11 +1346,11 @@ isl_surf_get_hiz_surf(const struct isl_device *dev,
> > const unsigned samples = ISL_DEV_GEN(dev) >= 9 ? 1 : surf->samples;
> >
> > isl_surf_init(dev, hiz_surf,
> > - .dim = ISL_SURF_DIM_2D,
> > + .dim = surf->dim,
> >   .format = ISL_FORMAT_HIZ,
> >   .width = surf->logical_level0_px.width,
> >   .height = surf->logical_level0_px.height,
> > - .depth = 1,
> > + .depth = surf->logical_level0_px.depth,
> >   .levels = surf->levels,
> >   .array_len = surf->logical_level0_px.array_len,
> >   .samples = samples,
> > --
> > 2.5.0.400.gff86faf

[Mesa-dev] [PATCH] nvc0: update GM107 sched control codes format

2016-09-27 Thread Samuel Pitoiset

envyas now uses a much better representation for those control
codes and it displays the different flags instead of an
unreadable hex number.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/codegen/lib/gm107.asm | 42 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_surface.c   |  4 +--
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/lib/gm107.asm 
b/src/gallium/drivers/nouveau/codegen/lib/gm107.asm
index 758cc81..67b98da 100644
--- a/src/gallium/drivers/nouveau/codegen/lib/gm107.asm
+++ b/src/gallium/drivers/nouveau/codegen/lib/gm107.asm
@@ -11,39 +11,39 @@
 // SIZE:22 / 14 * 8 bytes
 //
 gm107_div_u32:
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
flo u32 $r2 $r1
lop xor 1 $r2 $r2 0x1f
mov $r3 0x1 0xf
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
shl $r2 $r3 $r2
i2i u32 u32 $r1 neg $r1
imul u32 u32 $r3 $r1 $r2
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
imad u32 u32 hi $r2 $r2 $r3 $r2
imul u32 u32 $r3 $r1 $r2
imad u32 u32 hi $r2 $r2 $r3 $r2
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
imul u32 u32 $r3 $r1 $r2
imad u32 u32 hi $r2 $r2 $r3 $r2
imul u32 u32 $r3 $r1 $r2
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
imad u32 u32 hi $r2 $r2 $r3 $r2
imul u32 u32 $r3 $r1 $r2
imad u32 u32 hi $r2 $r2 $r3 $r2
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
mov $r3 $r0 0xf
imul u32 u32 hi $r0 $r0 $r2
i2i u32 u32 $r2 neg $r1
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
imad u32 u32 $r1 $r1 $r0 $r3
isetp ge u32 and $p0 1 $r1 $r2 1
$p0 iadd $r1 $r1 neg $r2
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
$p0 iadd $r0 $r0 0x1
$p0 isetp ge u32 and $p0 1 $r1 $r2 1
$p0 iadd $r1 $r1 neg $r2
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
$p0 iadd $r0 $r0 0x1
ret
nop 0
@@ -55,47 +55,47 @@ gm107_div_u32:
 // CLOBBER: $r2 - $r3, $p0 - $p3
 //
 gm107_div_s32:
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
isetp lt and $p2 0x1 $r0 0 1
isetp lt xor $p3 1 $r1 0 $p2
i2i s32 s32 $r0 abs $r0
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
i2i s32 s32 $r1 abs $r1
flo u32 $r2 $r1
lop xor 1 $r2 $r2 0x1f
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
mov $r3 0x1 0xf
shl $r2 $r3 $r2
i2i u32 u32 $r1 neg $r1
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
imul u32 u32 $r3 $r1 $r2
imad u32 u32 hi $r2 $r2 $r3 $r2
imul u32 u32 $r3 $r1 $r2
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
imad u32 u32 hi $r2 $r2 $r3 $r2
imul u32 u32 $r3 $r1 $r2
imad u32 u32 hi $r2 $r2 $r3 $r2
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
imul u32 u32 $r3 $r1 $r2
imad u32 u32 hi $r2 $r2 $r3 $r2
imul u32 u32 $r3 $r1 $r2
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
imad u32 u32 hi $r2 $r2 $r3 $r2
mov $r3 $r0 0xf
imul u32 u32 hi $r0 $r0 $r2
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
i2i u32 u32 $r2 neg $r1
imad u32 u32 $r1 $r1 $r0 $r3
isetp ge u32 and $p0 1 $r1 $r2 1
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
$p0 iadd $r1 $r1 neg $r2
$p0 iadd $r0 $r0 0x1
$p0 isetp ge u32 and $p0 1 $r1 $r2 1
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
$p0 iadd $r1 $r1 neg $r2
$p0 iadd $r0 $r0 0x1
$p3 i2i s32 s32 $r0 neg $r0
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
$p2 i2i s32 s32 $r1 neg $r1
ret
nop 0
@@ -103,7 +103,7 @@ gm107_div_s32:
 // STUB
 gm107_rcp_f64:
 gm107_rsq_f64:
-   sched 0x7e0 0x7e0 0x7e0
+   sched (st 0x0) (st 0x0) (st 0x0)
ret
nop 0
nop 0
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
index e353242..0d14058 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_surface.c
@@ -836,11 +836,11 @@ nvc0_blitter_make_vp(struct nvc0_blitter *blit)
};
static const uint32_t code_gm107[] =
{
-  0xfc0007e0, 0x001f8000, /* sched 0x7e0 0x7e0 0x7e0 */
+  0xfc0007e0, 0x001f8000, /* sched (st 0x0) (st 0x0) (st 0x0) */
   0x0807ff04, 0xefd8ff80, /* ld b64 $r4 a[0x80] 0x0 */
   0x0907ff00, 0xefd97f80, /* ld b96 $r0 a[0x90] 0x0 */
   0x0707ff04, 0xeff0ff80, /* st b64 a[0x70] $r4 0x0 */
-  0xfc0007e0, 0x, /* sched 0x7e0 0x7e0 0x0 */
+  0xfc0007e0, 0x, /* sched (st 0x0) (st 0x0) (st 0x0) */
   0x0807ff00, 0xeff17f80, /* st b96 a[0x80] $r0 0x0 */
   0x0007000f, 0xe300, /* exit */
};
-- 
2.10.0


Re: [Mesa-dev] [PATCH V2 09/11] genX/cmd_buffer: Enable rendering to HiZ

2016-09-27 Thread Nanley Chery

On Tue, Sep 27, 2016 at 11:00:06AM -0700, Chad Versace wrote:
> On Mon 26 Sep 2016, Nanley Chery wrote:
> > From: Chad Versace 
> > 
> > Nanley Chery:
> > (rebase)
> >  - Resolve conflicts with new anv_batch_emit macro
> > (amend)
> >  - Handle a QPitch TODO
> >  - Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems
> >  - Only use HiZ for single-subpass renderpasses
> >  - Emit the HiZ instruction before the stencil instruction to follow the
> >optimized clear sequence specified in the PRMs
> >  - Don't modify clear params
> >  - Enable resolves when a HiZ buffer is used to ensure depth buffer validity
> > 
> > Provides an FPS increase of ~15% on the Sascha triangle and multisampling
> > demos.
> 
> Woo!
> 
> > Signed-off-by: Nanley Chery 
> > 
> > ---
> > 
> > v2: Emit zero'ed 3DSTATE_HIER_DEPTH_BUFFER when hiz is disabled
> > (Jason, Chad)
> > 
> >  src/intel/vulkan/gen8_cmd_buffer.c |  4 
> >  src/intel/vulkan/genX_cmd_buffer.c | 43 
> > ++
> >  2 files changed, 43 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/intel/vulkan/gen8_cmd_buffer.c 
> > b/src/intel/vulkan/gen8_cmd_buffer.c
> > index a13413c..14e6a7b 100644
> > --- a/src/intel/vulkan/gen8_cmd_buffer.c
> > +++ b/src/intel/vulkan/gen8_cmd_buffer.c
> > @@ -417,6 +417,10 @@ genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer 
> > *cmd_buffer,
> > if (iview == NULL || !anv_image_has_hiz(iview->image))
> >return;
> >  
> > +   /* FIXME: Implement multi-subpass HiZ */
> 
> This should be a FINISHME, not a FIXME, as nothing is broken and there
> is no bug. It's just disabled.
> 

Good catch, I'll update it.

> > +   if (cmd_buffer->state.pass->subpass_count > 1)
> > +  return;
> > +
> 
> Anyway, that's just a small nitpick.
> 
> Reviewed-by: Chad Versace 

Thanks!

Re: [Mesa-dev] [RFC] egl: stop claiming support for pbuffer + msaa (RFC)

2016-09-27 Thread Ian Romanick

On 09/26/2016 12:41 AM, Tapani Pälli wrote:
> This fixes a crash in egl-create-msaa-pbuffer-surface Piglit test
> and same crash in many dEQP EGL tests.
> 
> I also found that some Qt example did a workaround because of this
> crash: https://bugreports.qt.io/browse/QTBUG-47509

Eh... I would have to double-check, but I thought each config had a set
of bits that described the kinds of surfaces (window, pixmap, or
pbuffer) it could be used with.  I know this is how GLX works.  Assuming
that's the case, the correct fix is to never set the pbuffer bit for
configs that have multisample.

If I'm reading this patch correctly, it looks like it will remove those
configs completely.  We still want to have multisample windows.  Am I
understanding this correctly?

> 
> Signed-off-by: Tapani Pälli 
> ---
> 
> This is RFC as I'm not sure if we are supposed to support this. I tried
> to verify overall pbuffer situation with some mesa-demos using pbuffer
> but those are not working for me at all with or without my patch.
> 
>  src/egl/main/eglconfig.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/src/egl/main/eglconfig.c b/src/egl/main/eglconfig.c
> index 6161d26..20cf9d4 100644
> --- a/src/egl/main/eglconfig.c
> +++ b/src/egl/main/eglconfig.c
> @@ -407,6 +407,11 @@ _eglValidateConfig(const _EGLConfig *conf, EGLBoolean 
> for_matching)
>return EGL_FALSE;
> }
>  
> +   /* pbuffer with MSAA not supported */
> +   if (conf->SurfaceType & EGL_PBUFFER_BIT && conf->Samples) {
> +  return EGL_FALSE;
> +   }
> +
> if (!(conf->SurfaceType & EGL_WINDOW_BIT)) {
>if (conf->NativeVisualID != 0 || conf->NativeVisualType != EGL_NONE)
>   valid = EGL_FALSE;
> 


[Mesa-dev] [Bug 77449] Tracker bug for all bugs related to Steam titles

2016-09-27 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=77449

Vedran Miletić  changed:

   What|Removed |Added

 Depends on|97917   |


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=97917
[Bug 97917] Enabling sisched gives Assertion `!NodePtr->isKnownSentinel()'
failed
-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH V2 10/11] genX/cmd_buffer: Enable fast depth clears

2016-09-27 Thread Nanley Chery

On Tue, Sep 27, 2016 at 11:00:21AM -0700, Chad Versace wrote:
> On Mon 26 Sep 2016, Nanley Chery wrote:
> > From: Nanley Chery 
> > 
> > Provides an FPS increase of ~30% on the Sascha triangle and multisampling
> > demos.
> > 
> > Clears that happen within a render pass via vkCmdClearAttachments are safe
> > even if the clear color changes. This is because the meta implementation 
> > does
> > not use LOAD_OP_CLEAR which avoids any conflicts with 3DSTATE_CLEAR_PARAMS.
> > 
> > Signed-off-by: Nanley Chery 
> > Reviewed-by: Jason Ekstrand 
> > 
> > ---
> > 
> > v2. Update granularity comment for accuracy
> > 
> >  src/intel/vulkan/anv_pass.c| 13 +
> >  src/intel/vulkan/gen8_cmd_buffer.c |  6 ++
> >  src/intel/vulkan/genX_cmd_buffer.c |  4 +---
> >  3 files changed, 20 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/intel/vulkan/anv_pass.c b/src/intel/vulkan/anv_pass.c
> > index 69c3c7e..595c2ea 100644
> > --- a/src/intel/vulkan/anv_pass.c
> > +++ b/src/intel/vulkan/anv_pass.c
> > @@ -155,5 +155,18 @@ void anv_GetRenderAreaGranularity(
> >  VkRenderPassrenderPass,
> >  VkExtent2D* pGranularity)
> >  {
> > +   ANV_FROM_HANDLE(anv_render_pass, pass, renderPass);
> > +
> > +   /* This granularity satisfies HiZ fast clear alignment requirements
> > +* for all sample counts.
> > +*/
> > +   for (unsigned i = 0; i < pass->subpass_count; ++i) {
> > +  if (pass->subpasses[i].depth_stencil_attachment !=
> > +  VK_ATTACHMENT_UNUSED) {
> > + *pGranularity = (VkExtent2D) { .width = 8, .height = 4 };
> > + return;
> > +  }
> > +   }
> > +
> > *pGranularity = (VkExtent2D) { 1, 1 };
> >  }
> > diff --git a/src/intel/vulkan/gen8_cmd_buffer.c 
> > b/src/intel/vulkan/gen8_cmd_buffer.c
> > index 14e6a7b..96e972c 100644
> > --- a/src/intel/vulkan/gen8_cmd_buffer.c
> > +++ b/src/intel/vulkan/gen8_cmd_buffer.c
> > @@ -479,6 +479,12 @@ genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer 
> > *cmd_buffer,
> >   cmd_state->render_area.extent.height % px_dim.h)
> >  return;
> >}
> > +
> > +  anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp) {
> > + cp.DepthClearValueValid = true;
> > + cp.DepthClearValue =
> > +
> > cmd_buffer->state.attachments[ds].clear_value.depthStencil.depth;
> > +  }
> >break;
> > case BLORP_HIZ_OP_DEPTH_RESOLVE:
> >if (cmd_buffer->state.pass->attachments[ds].store_op !=
> > diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
> > b/src/intel/vulkan/genX_cmd_buffer.c
> > index 2cb1539..290fefc 100644
> > --- a/src/intel/vulkan/genX_cmd_buffer.c
> > +++ b/src/intel/vulkan/genX_cmd_buffer.c
> > @@ -1320,9 +1320,6 @@ cmd_buffer_emit_depth_stencil(struct anv_cmd_buffer 
> > *cmd_buffer)
> > } else {
> >anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_STENCIL_BUFFER), sb);
> > }
> > -
> > -   /* Clear the clear params. */
> > -   anv_batch_emit(&cmd_buffer->batch, GENX(3DSTATE_CLEAR_PARAMS), cp);
> 
> We may need to preserve emission of 3DSTATE_CLEAR_PARAMS here. Two reasons:
> 
> Reason 1. If hiz is enabled in the 3DSTATE_DEPTH_BUFFER, and the hiz
>surface has some bits in the clear state, and 
> 3DSTATE_CLEAR_PARAMS.DepthClearValueValid is 0,
>and we emit a draw call, what does the hardware do when it
>accesses a cleared pixel? I don't want to find out.
> 

Good point.

> Reason 2. The PRM says we have to (though, to be honest, I don't trust 
> the PRM's logic).
> 
> From the Skylake PRM >> Vol7: 3D-Media-GPGPU >> Section: Hierarchical
> Depth Buffer:
> | 
> |  If HiZ is enabled, you must initialize the clear value by either:
> | 
> | 1. Perform a depth clear pass to initialize the clear value.
> | 2. Send a 3dstate_clear_params packet with valid = 1.
> | 
> |  Without one of these events, context switching will fail, as it 
> will try
> |  to save off a clear value even though no valid clear value has 
> been set.
> |  When context restore happens, HW will restore an uninitialized 
> clear value.
> 
> Even though the hardware docs claim we need 3DSTATE_CLEAR_PARAMS when 
> hiz is
> enabled, the docs are vague about the consequences. Does context 
> switching
> really fail, as claimed by #1? Or does context switching actually 
> succeed, but
> context restore gives us an invalid clear value (which doesn't hurt 
> us), as
> claimed by #2? Oh hw docs... :/
> 

I didn't trust the logic either, but I agree. It's good to keep the
diff as small as reasonably possible.

> As a consequence of that reasoning, we should set 
> 3DSTATE_CLEAR_PARAMS.DepthClearValueValid = 1 
> whenever hiz is enabled, even if we don't care about the actual clear value.

In the V3, I plan to emit that packet once at device ini

[Mesa-dev] [PATCH] i965: Remove useless (harmful) assertion

2016-09-27 Thread Ben Widawsky

From: Ben Widawsky 

The code already skips doing the depth stall on gen >= 8, and as we
enable new platforms this assertion will fail needlessly. Instead of
changing the caller, make this simple change.

Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/brw_pipe_control.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c 
b/src/mesa/drivers/dri/i965/brw_pipe_control.c
index 358d31d..c7e3b3c 100644
--- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
+++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
@@ -253,7 +253,7 @@ brw_emit_pipe_control_write(struct brw_context *brw, 
uint32_t flags,
 void
 brw_emit_depth_stall_flushes(struct brw_context *brw)
 {
-   assert(brw->gen >= 6 && brw->gen <= 9);
+   assert(brw->gen >= 6);
 
/* Starting on BDW, these pipe controls are unnecessary.
 *
-- 
2.10.0


Re: [Mesa-dev] [PATCH V2 10/11] genX/cmd_buffer: Enable fast depth clears

2016-09-27 Thread Chad Versace

On Tue 27 Sep 2016, Nanley Chery wrote:
> On Tue, Sep 27, 2016 at 11:00:21AM -0700, Chad Versace wrote:

> > As a consequence of that reasoning, we should set 
> > 3DSTATE_CLEAR_PARAMS.DepthClearValueValid = 1 
> > whenever hiz is enabled, even if we don't care about the actual clear value.
> 
> In the V3, I plan to emit that packet once at device initialization time
> HSW+, and to always emit it (in the expected location) for IVB/BYT. Only
> the latter platforms have the restriction that it must always be
> programmed with the other depth/stencil commands.

Is there any benefit to emitting it multiple times on ivb/byt? Does
emitting once during initialization, as for hsw, also work for ivb/byt?
If so, the code is cleaner if the two gens share the same workaround
code.

Re: [Mesa-dev] [PATCH V2 10/11] genX/cmd_buffer: Enable fast depth clears

2016-09-27 Thread Nanley Chery

On Tue, Sep 27, 2016 at 03:12:17PM -0700, Chad Versace wrote:
> On Tue 27 Sep 2016, Nanley Chery wrote:
> > On Tue, Sep 27, 2016 at 11:00:21AM -0700, Chad Versace wrote:
> 
> > > As a consequence of that reasoning, we should set 
> > > 3DSTATE_CLEAR_PARAMS.DepthClearValueValid = 1 
> > > whenever hiz is enabled, even if we don't care about the actual clear 
> > > value.
> > 
> > In the V3, I plan to emit that packet once at device initialization time
> > HSW+, and to always emit it (in the expected location) for IVB/BYT. Only
> > the latter platforms have the restriction that it must always be
> > programmed with the other depth/stencil commands.
> 
> Is there any benefit to emitting it multiple times on ivb/byt? Does
> emitting once during initialization, as for hsw, also work for ivb/byt?
> If so, the code is cleaner if the two gens share the same workaround
> code.

The benefit of emitting it multiple times on IVB/BYT is that we're
(possibly) following the oddly-worded programming note for the packet:

   From the IVB PRM Vol2P1, 11.5.5.4 3DSTATE_CLEAR_PARAMS:

  3DSTATE_CLEAR_PARAMS must always be programmed in the along with
  the other Depth/Stencil state commands(i.e.  3DSTATE_DEPTH_BUFFER,
  3DSTATE_STENCIL_BUFFER, or 3DSTATE_HIER_DEPTH_BUFFER)

HSW+ doesn't have this restriction, so we're free to only do it once.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] HUD: Add support for block I/O, network I/O and lmsensor stats

2016-09-27 Thread Steven Toth

V4: Merged with master as of 2016/9/27 6pm

V3: Flatten the entire patchset ready for the ML

V2: Additional separate patches based on feedback
a) configure.ac: Add a comment related to libsensors

b) HUD: Disable Block/NIC I/O stats by default.
Implement configuration option --enable-gallium-extra-hud=yes
and enable both statistics when this option is enabled.

c) Configure.ac: Minor cleanup to user visible configuration settings

d) Configure.ac: HUD stats - build system improvements
Move the -lsensors out of a deeper Makefile, bring it into the configure.ac.
Also, rename a compiler directive to more closely follow the standard.

V1: Initial release to the ML
Three new features:
1. Disk/block I/O device read/write stats in MB/s.
2. Network Interface RX/TX transfer statistics as a percentage
   of the overall NIC speed.
3. lmsensor power, voltage and temperature sensors.
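
For readers unfamiliar with how such rates are usually derived, here is a
minimal sketch of the arithmetic behind features 1 and 2. It is not the
patch's code: the helper names are hypothetical, and it assumes the
counters are sampled periodically, that Linux block-device sector counts
are in 512-byte units, and that 1 MB means 1024 * 1024 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Linux reports block-device sector counts in 512-byte units. */
#define SKETCH_SECTOR_BYTES 512.0

/* Hypothetical helper: convert the growth of a sector counter over
 * elapsed_us microseconds into MB/s. */
static double
sketch_mb_per_sec(uint64_t prev_sectors, uint64_t cur_sectors,
                  uint64_t elapsed_us)
{
   assert(cur_sectors >= prev_sectors);
   if (elapsed_us == 0)
      return 0.0;

   const double bytes =
      (double)(cur_sectors - prev_sectors) * SKETCH_SECTOR_BYTES;
   const double seconds = (double)elapsed_us / 1e6;
   return bytes / (1024.0 * 1024.0) / seconds;
}

/* Hypothetical helper: NIC throughput as a percentage of the link speed,
 * given a byte counter and the link speed in Mbit/s. */
static double
sketch_nic_percent(uint64_t prev_bytes, uint64_t cur_bytes,
                   uint64_t elapsed_us, uint64_t link_mbit)
{
   assert(cur_bytes >= prev_bytes);
   if (elapsed_us == 0 || link_mbit == 0)
      return 0.0;

   const double bits = (double)(cur_bytes - prev_bytes) * 8.0;
   const double seconds = (double)elapsed_us / 1e6;
   return bits / seconds / ((double)link_mbit * 1e6) * 100.0;
}
```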

The lmsensor changes add a dependency on libsensors, so lmsensor support
is disabled by default and must be enabled with --enable-lmsensors.

Signed-off-by: Steven Toth 
---
 configure.ac |  42 +++
 src/gallium/auxiliary/Makefile.am|   2 +
 src/gallium/auxiliary/Makefile.sources   |   3 +
 src/gallium/auxiliary/hud/hud_context.c  |  73 +
 src/gallium/auxiliary/hud/hud_diskstat.c | 335 
 src/gallium/auxiliary/hud/hud_nic.c  | 441 +++
 src/gallium/auxiliary/hud/hud_private.h  |  25 ++
 src/gallium/auxiliary/hud/hud_sensors_temp.c | 374 +++
 src/gallium/include/pipe/p_defines.h |   4 +
 9 files changed, 1299 insertions(+)
 create mode 100644 src/gallium/auxiliary/hud/hud_diskstat.c
 create mode 100644 src/gallium/auxiliary/hud/hud_nic.c
 create mode 100644 src/gallium/auxiliary/hud/hud_sensors_temp.c

diff --git a/configure.ac b/configure.ac
index b9e6000..adddbe9 100644
--- a/configure.ac
+++ b/configure.ac
@@ -91,6 +91,7 @@ XCBGLX_REQUIRED=1.8.1
 XSHMFENCE_REQUIRED=1.1
 XVMC_REQUIRED=1.0.6
 PYTHON_MAKO_REQUIRED=0.8.0
+LIBSENSORS_REQUIRED=4.0.0
 
 dnl Check for progs
 AC_PROG_CPP
@@ -871,6 +872,32 @@ AC_ARG_ENABLE([dri],
 [enable_dri="$enableval"],
 [enable_dri=yes])
 
+AC_ARG_ENABLE([gallium-extra-hud],
+[AS_HELP_STRING([--enable-gallium-extra-hud],
+[enable HUD block/NIC I/O HUD stats support 
@<:@default=disabled@:>@])],
+[enable_gallium_extra_hud="$enableval"],
+[enable_gallium_extra_hud=no])
+AM_CONDITIONAL(HAVE_GALLIUM_EXTRA_HUD, test "x$enable_gallium_extra_hud" = 
xyes)
+if test "x$enable_gallium_extra_hud" = xyes ; then
+DEFINES="${DEFINES} -DHAVE_GALLIUM_EXTRA_HUD=1"
+fi
+
+#TODO: no pkgconfig .pc available for libsensors.
+#PKG_CHECK_MODULES([LIBSENSORS], [libsensors >= $LIBSENSORS_REQUIRED], 
[enable_lmsensors=yes], [enable_lmsensors=no])
+AC_ARG_ENABLE([lmsensors],
+[AS_HELP_STRING([--enable-lmsensors],
+[enable HUD lmsensor support @<:@default=disabled@:>@])],
+[enable_lmsensors="$enableval"],
+[enable_lmsensors=no])
+AM_CONDITIONAL(HAVE_LIBSENSORS, test "x$enable_lmsensors" = xyes)
+if test "x$enable_lmsensors" = xyes ; then
+DEFINES="${DEFINES} -DHAVE_LIBSENSORS=1"
+LIBSENSORS_LDFLAGS="-lsensors"
+else
+LIBSENSORS_LDFLAGS=""
+fi
+AC_SUBST(LIBSENSORS_LDFLAGS)
+
 case "$host_os" in
 linux*)
 dri3_default=yes
@@ -1124,6 +1151,8 @@ AM_CONDITIONAL(HAVE_DRISW_KMS, test "x$have_drisw_kms" = 
xyes )
 AM_CONDITIONAL(HAVE_DRI2, test "x$enable_dri" = xyes -a "x$dri_platform" = 
xdrm -a "x$have_libdrm" = xyes )
 AM_CONDITIONAL(HAVE_DRI3, test "x$enable_dri3" = xyes -a "x$dri_platform" = 
xdrm -a "x$have_libdrm" = xyes )
 AM_CONDITIONAL(HAVE_APPLEDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = 
xapple )
+AM_CONDITIONAL(HAVE_LMSENSORS, test "x$enable_lmsensors" = xyes )
+AM_CONDITIONAL(HAVE_GALLIUM_EXTRA_HUD, test "x$enable_gallium_extra_hud" = 
xyes )
 AM_CONDITIONAL(HAVE_WINDOWSDRI, test "x$enable_dri" = xyes -a "x$dri_platform" 
= xwindows )
 
 AC_ARG_ENABLE([shared-glapi],
@@ -2888,6 +2917,19 @@ else
 echo "Gallium: no"
 fi
 
+echo ""
+if test "x$enable_gallium_extra_hud" != xyes; then
+echo "HUD extra stats: no"
+else
+echo "HUD extra stats: yes"
+fi
+
+if test "x$enable_lmsensors" != xyes; then
+echo "HUD lmsensors:   no"
+else
+echo "HUD lmsensors:   yes"
+fi
+
 dnl Shader cache
 echo ""
 echo "Shader cache:$enable_shader_cache"
diff --git a/src/gallium/auxiliary/Makefile.am 
b/src/gallium/auxiliary/Makefile.am
index d971a2b..4a4a4fb 100644
--- a/src/gallium/auxiliary/Makefile.am
+++ b/src/gallium/auxiliary/Makefile.am
@@ -34,6 +34,8 @@ libgallium_la_SOURCES += \
 
 endif
 
+libgallium_la_LDFLAGS = $(LIBSENSORS_LDFLAGS)
+
 MKDIR_GEN = $(AM_V_at)$(MKDIR_P) $(@D)
 PYTHON_GEN =  $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS)
 
diff --git a/src/gallium/auxiliary/Makefile.sources 
b/src/gallium/auxiliary/Makefile.sources
index ed9eaa8..3d728ae 100644
--- a/src/gallium/auxiliary/Makefile.sour

[Mesa-dev] [PATCH 2/2] HUD: Add support for block I/O, network I/O and lmsensor stats

2016-09-27 Thread Steven Toth

V5: Feedback based on peer review
 convert sprintf to snprintf
 convert char * to const char *
 int arg converted to bool
 Func changes to take a filename vs a larger struct.
 omit the space between '*' and the param name.

Signed-off-by: Steven Toth 
---
 src/gallium/auxiliary/hud/hud_diskstat.c | 36 ++-
 src/gallium/auxiliary/hud/hud_nic.c  | 53 +++-
 src/gallium/auxiliary/hud/hud_private.h  | 12 +++
 src/gallium/auxiliary/hud/hud_sensors_temp.c | 23 ++--
 4 files changed, 66 insertions(+), 58 deletions(-)

diff --git a/src/gallium/auxiliary/hud/hud_diskstat.c 
b/src/gallium/auxiliary/hud/hud_diskstat.c
index d22afb7..b2ee1f8 100644
--- a/src/gallium/auxiliary/hud/hud_diskstat.c
+++ b/src/gallium/auxiliary/hud/hud_diskstat.c
@@ -85,7 +85,7 @@ static int gdiskstat_count = 0;
 static struct list_head gdiskstat_list;
 
 static struct diskstat_info *
-find_dsi_by_name(char *n, int mode)
+find_dsi_by_name(const char *n, int mode)
 {
list_for_each_entry(struct diskstat_info, dsi, &gdiskstat_list, list) {
   if (dsi->mode != mode)
@@ -97,10 +97,10 @@ find_dsi_by_name(char *n, int mode)
 }
 
 static int
-get_file_values(struct diskstat_info *dsi, struct stat_s *s)
+get_file_values(char *fn, struct stat_s *s)
 {
int ret = 0;
-   FILE *fh = fopen(dsi->sysfs_filename, "r");
+   FILE *fh = fopen(fn, "r");
if (!fh)
   return -1;
 
@@ -128,7 +128,7 @@ query_dsi_load(struct hud_graph *gr)
if (dsi->last_time) {
   if (dsi->last_time + gr->pane->period <= now) {
  struct stat_s stat;
- if (get_file_values(dsi, &stat) < 0)
+ if (get_file_values(dsi->sysfs_filename, &stat) < 0)
 return;
  float val = 0;
 
@@ -157,7 +157,7 @@ query_dsi_load(struct hud_graph *gr)
   switch (dsi->mode) {
   case DISKSTAT_RD:
   case DISKSTAT_WR:
- get_file_values(dsi, &dsi->last_stat);
+ get_file_values(dsi->sysfs_filename, &dsi->last_stat);
  break;
   }
   dsi->last_time = now;
@@ -179,7 +179,7 @@ free_query_data(void *p)
   * \param  mode  query read or write (DISKSTAT_RD/DISKSTAT_WR) statistics.
   */
 void
-hud_diskstat_graph_install(struct hud_pane *pane, char *dev_name,
+hud_diskstat_graph_install(struct hud_pane *pane, const char *dev_name,
unsigned int mode)
 {
struct hud_graph *gr;
@@ -205,10 +205,10 @@ hud_diskstat_graph_install(struct hud_pane *pane, char 
*dev_name,
 
dsi->mode = mode;
if (dsi->mode == DISKSTAT_RD) {
-  sprintf(gr->name, "%s-Read-MB/s", dsi->name);
+  snprintf(gr->name, sizeof(gr->name), "%s-Read-MB/s", dsi->name);
}
else if (dsi->mode == DISKSTAT_WR) {
-  sprintf(gr->name, "%s-Write-MB/s", dsi->name);
+  snprintf(gr->name, sizeof(gr->name), "%s-Write-MB/s", dsi->name);
}
else
   return;
@@ -226,24 +226,26 @@ hud_diskstat_graph_install(struct hud_pane *pane, char 
*dev_name,
 }
 
 static void
-add_object_part(char *basename, char *name, int objmode)
+add_object_part(const char *basename, char *name, int objmode)
 {
struct diskstat_info *dsi = CALLOC_STRUCT(diskstat_info);
 
strcpy(dsi->name, name);
-   sprintf(dsi->sysfs_filename, "%s/%s/stat", basename, name);
+   snprintf(dsi->sysfs_filename, sizeof(dsi->sysfs_filename), "%s/%s/stat",
+  basename, name);
dsi->mode = objmode;
list_addtail(&dsi->list, &gdiskstat_list);
gdiskstat_count++;
 }
 
 static void
-add_object(char *basename, char *name, int objmode)
+add_object(const char *basename, char *name, int objmode)
 {
struct diskstat_info *dsi = CALLOC_STRUCT(diskstat_info);
 
strcpy(dsi->name, name);
-   sprintf(dsi->sysfs_filename, "%s/stat", basename);
+   snprintf(dsi->sysfs_filename, sizeof(dsi->sysfs_filename), "%s/stat",
+  basename);
dsi->mode = objmode;
list_addtail(&dsi->list, &gdiskstat_list);
gdiskstat_count++;
@@ -256,7 +258,7 @@ add_object(char *basename, char *name, int objmode)
   * \return  number of detected block I/O devices.
   */
 int
-hud_get_num_disks(int displayhelp)
+hud_get_num_disks(bool displayhelp)
 {
struct dirent *dp;
struct stat stat_buf;
@@ -281,8 +283,8 @@ hud_get_num_disks(int displayhelp)
  continue;
 
   char basename[256];
-  sprintf(basename, "/sys/block/%s", dp->d_name);
-  sprintf(name, "%s/stat", basename);
+  snprintf(basename, sizeof(basename), "/sys/block/%s", dp->d_name);
+  snprintf(name, sizeof(name), "%s/stat", basename);
   if (stat(name, &stat_buf) < 0)
  continue;
 
@@ -305,7 +307,7 @@ hud_get_num_disks(int displayhelp)
 continue;
 
  char p[64];
- sprintf(p, "%s/%s/stat", basename, dpart->d_name);
+ snprintf(p, sizeof(p), "%s/%s/stat", basename, dpart->d_name);
  if (stat(p, &stat_buf) < 0)
 continue;
 
@@ -321,7 +323,7 @@ hud_get_num_disks(int displayhelp)
if (displayhelp) {
   list_for_each_entry(

Re: [Mesa-dev] [PATCH V2 08/11] anv/cmd_buffer: Add code for performing HZ operations

2016-09-27 Thread Nanley Chery

On Tue, Sep 27, 2016 at 11:00:14AM -0700, Chad Versace wrote:
> On Mon 26 Sep 2016, Nanley Chery wrote:
> > Create a function that performs one of three HiZ operations -
> > depth/stencil clears, HiZ resolve, and depth resolves.
> > 
> > Signed-off-by: Nanley Chery 
> > 
> > ---
> > 
> > v2. Add documentation
> > Fix the alignment check
> > Don't minify clear rectangle (Jason)
> > Use blorp enums (Jason)
> > Enable depth stalls and flushes
> > Use full RT rectangle for resolve ops
> > Add stencil clear todo
> > 
> >  src/intel/vulkan/anv_genX.h|   3 +
> >  src/intel/vulkan/gen7_cmd_buffer.c |   6 ++
> >  src/intel/vulkan/gen8_cmd_buffer.c | 167 
> > +
> >  3 files changed, 176 insertions(+)
> 
> 
> 
> > +/**
> > + * Emit the HZ_OP packet in the sequence specified by the BDW PRM section
> > + * entitled: "Optimized Depth Buffer Clear and/or Stencil Buffer Clear."
> > + *
> > + * \todo Enable Stencil Buffer-only clears
> > + */
> > +void
> > +genX(cmd_buffer_do_hz_op)(struct anv_cmd_buffer *cmd_buffer,
> > +  enum blorp_hiz_op op)
> > +{
> 
> All other "emission" functions in gen8_cmd_buffer.c are named
> gen8_cmd_buffer_emit_foo(). I think this function should be named
> gen8_cmd_buffer_emit_hz_op for consistency.
> 

Sounds good. I'll fix that in the v3.

> > +   struct anv_cmd_state *cmd_state = &cmd_buffer->state;
> > +   const struct anv_image_view *iview =
> > +  anv_cmd_buffer_get_depth_stencil_view(cmd_buffer);
> > +
> > +   if (iview == NULL || !anv_image_has_hiz(iview->image))
> > +  return;
> 
> Shouldn't this check for subpass_count > 1, like the previous patches
> do?
> 

The following patch in the series adds this check. I wait until then
because this patch only adds code to implement HiZ and isn't aware of
any implementation restrictions. The next patch enables HiZ for
single-subpass renderpasses, and so the appropriate restriction is added
here as well.

Previous patches checked for other properties relating to HiZ (gen and
number of miplevels).

> > +
> > +   const uint32_t ds = cmd_state->subpass->depth_stencil_attachment;
> > +   const bool full_surface_op =
> > + cmd_state->render_area.extent.width == iview->extent.width &&
> > + cmd_state->render_area.extent.height == iview->extent.height;
> > +
> > +   /* Validate that we can perform the HZ operation and that it's 
> > necessary. */
> > +   switch (op) {
> > +   case BLORP_HIZ_OP_DEPTH_CLEAR:
> > +  if (cmd_buffer->state.pass->attachments[ds].load_op !=
> > +  VK_ATTACHMENT_LOAD_OP_CLEAR)
> > + return;
> > +
> > +  /* Apply alignment restrictions. Despite the BDW PRM mentioning this 
> > is
> > +   * only needed for a depth buffer surface type of D16_UNORM, testing
> > +   * showed it to be necessary for other depth formats as well
> > +   * (e.g., D32_FLOAT).
> > +   */
> > +  if (!full_surface_op) {
> > +
> > + struct isl_extent2d px_dim;
> > +#if GEN_GEN == 8
> > + /* Pre-SKL, HiZ has an 8x4 sample block. As the number of samples
> > +  * increases, the number of pixels representable by this block
> > +  * decreases by a factor of the sample dimensions. Sample 
> > dimensions
> > +  * scale following the MSAA interleaved pattern.
> > +  *
> > +  * Sample|Sample|Pixel
> > +  * Count |Dim   |Dim
> > +  * ===
> > +  *1  | 1x1  | 8x4
> > +  *2  | 2x1  | 4x4
> > +  *4  | 2x2  | 4x2
> > +  *8  | 4x2  | 2x2
> > +  *   16  | 4x4  | 2x1
> > +  *
> > +  * Table: Pixel Dimensions in a HiZ Sample Block Pre-SKL
> > +  */
> > + const struct isl_extent2d sa_dim =
> > +isl_get_interleaved_msaa_px_size_sa(iview->image->samples);
> > + px_dim.w = 8 / sa_dim.w;
> > + px_dim.h = 4 / sa_dim.h;
> > +#else
> > + /* SKL+, the sample block becomes a "pixel block" so the expected
> > +  * pixel dimension is a constant 8x4 px for all sample counts.
> > +  */
> > + px_dim = (struct isl_extent2d) { .w = 8, .h = 4};
> > +#endif
> > +
> > + /* Fast depth clears clear an entire sample block at a time. As a
> > +  * result, the rectangle must be aligned to the pixel dimensions 
> > of
> > +  * a sample block for a successful operation.
> > +  */
> > + if (cmd_state->render_area.offset.x % px_dim.w ||
> > + cmd_state->render_area.offset.y % px_dim.h ||
> > + cmd_state->render_area.extent.width % px_dim.w ||
> > + cmd_state->render_area.extent.height % px_dim.h)
> > +return;
> > +  }
> > +  break;
> > +   case BLORP_HIZ_OP_DEPTH_RESOLVE:
> > +  if (cmd_buffer->state.pass->attachments[ds].store_op !=
> > +  VK_ATTACHMENT_STORE_OP_STORE)
> > + return;
> > +
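
The alignment rule in the quoted patch can be sketched in isolation as
follows. This is an illustration under stated assumptions, not the driver
code: the sketch_* names are hypothetical, the sample dimensions are
copied from the table in the patch comment, and only the pre-SKL case
(an 8x4-sample HiZ block) is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_extent2d { uint32_t w, h; };

/* Sample dimensions of the interleaved MSAA pattern, taken from the table
 * in the quoted patch. Hypothetical stand-in for the ISL helper. */
static struct sketch_extent2d
sketch_msaa_sample_dim(uint32_t samples)
{
   switch (samples) {
   case 1:  return (struct sketch_extent2d) { 1, 1 };
   case 2:  return (struct sketch_extent2d) { 2, 1 };
   case 4:  return (struct sketch_extent2d) { 2, 2 };
   case 8:  return (struct sketch_extent2d) { 4, 2 };
   case 16: return (struct sketch_extent2d) { 4, 4 };
   default:
      assert(0 && "unsupported sample count");
      return (struct sketch_extent2d) { 1, 1 };
   }
}

/* Pixel footprint of one 8x4-sample HiZ block (the pre-SKL case). */
static struct sketch_extent2d
sketch_hiz_block_px(uint32_t samples)
{
   const struct sketch_extent2d sa = sketch_msaa_sample_dim(samples);
   return (struct sketch_extent2d) { 8 / sa.w, 4 / sa.h };
}

/* True if the rectangle's offset and size are multiples of the block. */
static bool
sketch_rect_is_block_aligned(uint32_t x, uint32_t y, uint32_t w, uint32_t h,
                             struct sketch_extent2d px)
{
   return x % px.w == 0 && y % px.h == 0 &&
          w % px.w == 0 && h % px.h == 0;
}
```

With one sample per pixel the block is 8x4 pixels, so a render area
starting at x = 4 is rejected; at two samples the block shrinks to 4x4
pixels and the same rectangle qualifies.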

Re: [Mesa-dev] [PATCH] i965: Remove useless (harmful) assertion

2016-09-27 Thread Anuj Phogat

On Tue, Sep 27, 2016 at 3:02 PM, Ben Widawsky
 wrote:
> From: Ben Widawsky 
>
> The code already skips doing the depth stall on gen >= 8, and as we
> enable new platforms this assertion will fail needlessly. Instead of
> changing the caller, make this simple change.
>
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_pipe_control.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c 
> b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> index 358d31d..c7e3b3c 100644
> --- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
> +++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
> @@ -253,7 +253,7 @@ brw_emit_pipe_control_write(struct brw_context *brw, 
> uint32_t flags,
>  void
>  brw_emit_depth_stall_flushes(struct brw_context *brw)
>  {
> -   assert(brw->gen >= 6 && brw->gen <= 9);
> +   assert(brw->gen >= 6);
>
> /* Starting on BDW, these pipe controls are unnecessary.
>  *
> --
> 2.10.0
>

Reviewed-by: Anuj Phogat 

Re: [Mesa-dev] [PATCH v3] HUD: Add support for block I/O, network I/O and lmsensor stats

2016-09-27 Thread Steven Toth

On Fri, Sep 23, 2016 at 12:19 PM, Brian Paul  wrote:
> Hi Steven,
>
> I did a more thorough review per your request...

Thank you Brian.

All of your suggestions have been implemented, and new patches pushed to the ML.

...with the exception of one, primarily because I wanted to comment.

>> +#if HAVE_LIBSENSORS
>> +  else if (sscanf(name, "sensors_temp_cu-%s", arg_name) == 1) {
>> + hud_sensors_temp_graph_install(pane, &name[16],
>
>
> What's the significance of name[16]?  Should that be a #define ?

Everything after the hyphen is essentially the unique sensor name;
everything before it is a routing string that tells the Mesa HUD to use
the lmsensor HUD module rather than, say, the disk stats module.

So 16 is the length of "sensors_temp_cu-", and we pass the remainder
into the sensor-specific initializer func, which is all it cares
about.

I'm happy to implement whatever the project recommends, so are you
suggesting instead:

#define SOMEPREFIX "sensors_temp_cu-"
then
hud_sensors_temp_graph_install(pane, &name[sizeof(SOMEPREFIX) - 1]

Or, have I misunderstood your comment?
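
A minimal sketch of the prefix-length idiom, for reference. The SKETCH_*
macro and function names are hypothetical placeholders, not names from
the patch. Note that the "- 1" belongs outside the sizeof parentheses:
sizeof on the literal counts its terminating NUL, whereas sizeof applied
to "literal - 1" would measure a pointer instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical macro name, in the spirit of the placeholder above. */
#define SKETCH_PREFIX "sensors_temp_cu-"

/* sizeof on a string literal includes the terminating NUL, so subtract
 * one to get the number of visible characters. */
#define SKETCH_PREFIX_LEN (sizeof(SKETCH_PREFIX) - 1)

/* Return the part of name that follows the prefix, or NULL if name does
 * not start with the prefix. */
static const char *
sketch_strip_prefix(const char *name)
{
   assert(name != NULL);

   if (strncmp(name, SKETCH_PREFIX, SKETCH_PREFIX_LEN) != 0)
      return NULL;
   return name + SKETCH_PREFIX_LEN;
}
```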

Thanks again.

-- 
Steven Toth - Kernel Labs
http://www.kernellabs.com

Re: [Mesa-dev] [PATCH] st/va: Fix vaSyncSurface with no outstanding operation

2016-09-27 Thread Andy Furniss


Mark Thompson wrote:

On 27/09/16 16:48, Andy Furniss wrote:

Ok, thanks, so with that I am back to where I was before it stopped working.

In summary baseline works but JM ref decoder doesn't like the pocs.

b frames don't work properly, but then they don't with gst vaapi either. They 
do work with gst omx.

Looking at output from printfs some differences I see vs gstreamer.

maxrefs is hardcoded to 2 which has side effects =


Easily fixed: 
.


Great, thanks.



enc_pic.pc.enc_b_pic_pattern = 1 vs 0 - seems harmless in practice.

There is code that for my h/w disables dual instance when maxrefs > 1 which 
means half speed, but there seems to be a bottleneck elsewhere that makes avconv 
3x slower than gstreamer anyway.


I'm not really sure how fast I should be expecting this stuff to work - can you 
offer any numbers about how fast it goes for you?  I only get ~30fps doing 
1080p transcode on an R7 360, which compares rather badly to ~240fps on the 
Skylake 6300 in the same machine.


I'll have to come back on this one tomorrow as it's late here.

I started to get some numbers but found a possible bug = I made a 1000 
frame 15mbit 1080p50 mkv with ffmpeg/libx264.


Using it as source for transcode (s/w or h/w dec) it came out far too 
low bitrate.


After re-applying debugging patch to mesa it turns out that framerate is 
being set as 1000 in the encoder, I don't know if the file is duff or if 
it's the patches.


more tomorrow.


gop, it seems that avconv with -g doesn't set h264->intra_idr_period in 
handleVAEncSequenceParameterBufferType which gets used to set 
context->desc.h264enc.gop_size and enc_pic.rc.gop_size


Hmm, this is an error on my part.  I'll fix it, but I need to test a bit 
further to be sure I'm not breaking anything else.



pocs gstreamer increments h264->CurrPic.TopFieldOrderCnt in 2s avconv 1s. The 
code divides this by 2 in handleVAEncPictureParameterBufferType


That code is just wrong, isn't it?  It works for pic_order_cnt_type == 2, but 
it needs to look at the pic_order_cnt_lsb and delta_pic_order_cnt values on the 
slice header for other cases.  (Looking at gstreamer, it has POC type 0 (as I 
do), but then the POC values match what POC type 2 would create in the 
no-B-frame case, so this happens to work.)


My knowledge of low level detail is not so good.


I'll see if I can make a patch for this.


Thanks.




Thanks,

- Mark





Re: [Mesa-dev] [PATCH v3] HUD: Add support for block I/O, network I/O and lmsensor stats

2016-09-27 Thread Brian Paul


On 09/27/2016 05:17 PM, Steven Toth wrote:

> On Fri, Sep 23, 2016 at 12:19 PM, Brian Paul  wrote:
>> Hi Steven,
>>
>> I did a more thorough review per your request...
>
> Thank you Brian.
>
> All of your suggestions have been implemented, and new patches pushed to the ML.


Were you planning on squashing the two patches?  I think you should.

BTW, I see that some of the new changes could use const qualifiers.  For 
example:


@@ -97,10 +97,10 @@ find_dsi_by_name(char *n, int mode)
 }

 static int
-get_file_values(struct diskstat_info *dsi, struct stat_s *s)
+get_file_values(char *fn, struct stat_s *s)

That could be 'const char *fn'.  Same thing in other places.

Sorry to be nit picky about const qualifiers, but they're pretty helpful 
when reading code to help understand which arguments may or may not be 
modified by a function.
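
To illustrate the point, here is a reduced sketch of a file-reading helper
with the qualifiers in place. It is not the code from the patch:
sketch_read_first_value is a hypothetical name, and the function reads
only a single value.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, reduced version of a stats reader. The path is only read,
 * so it is const-qualified; the result is written, so it is not. A reader
 * can tell from the prototype alone which argument may change. */
static int
sketch_read_first_value(const char *fn, unsigned long *out)
{
   assert(fn != NULL && out != NULL);

   FILE *fh = fopen(fn, "r");
   if (!fh)
      return -1;

   const int ok = fscanf(fh, "%lu", out) == 1;
   fclose(fh);
   return ok ? 0 : -1;
}
```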





> ...with the exception of one, primarily because I wanted to comment.
>
>>> +#if HAVE_LIBSENSORS
>>> +  else if (sscanf(name, "sensors_temp_cu-%s", arg_name) == 1) {
>>> + hud_sensors_temp_graph_install(pane, &name[16],
>>
>> What's the significance of name[16]?  Should that be a #define ?
>
> Everything after the hyphen is essentially the unique sensor name;
> everything before it is a routing string that tells the Mesa HUD to use
> the lmsensor HUD module rather than, say, the disk stats module.
>
> So 16 is the length of "sensors_temp_cu-", and we pass the remainder
> into the sensor-specific initializer func, which is all it cares
> about.
>
> I'm happy to implement whatever the project recommends, so are you
> suggesting instead:
>
> #define SOMEPREFIX "sensors_temp_cu-"
> then
> hud_sensors_temp_graph_install(pane, &name[sizeof(SOMEPREFIX) - 1]
>
> Or, have I misunderstood your comment?


Thanks for explaining.  But why not do this?

  else if (sscanf(name, "sensors_temp_cu-%s", arg_name) == 1) {
 hud_sensors_temp_graph_install(pane, arg_name, ...

as you did in other places?
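
A self-contained sketch of that pattern, for illustration only: the helper
name and buffer size are hypothetical, and the %63s field width is an
addition that keeps sscanf within the buffer. Also note that %s stops at
the first whitespace character.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_ARG_MAX 64

/* Hypothetical helper: extract the sensor name that follows the routing
 * prefix. Returns 1 on a match and 0 otherwise. The %63s width leaves
 * room for the terminating NUL in a SKETCH_ARG_MAX-byte buffer. */
static int
sketch_parse_sensor_name(const char *name, char arg_name[SKETCH_ARG_MAX])
{
   assert(name != NULL && arg_name != NULL);

   return sscanf(name, "sensors_temp_cu-%63s", arg_name) == 1;
}
```

Since sscanf has already matched the literal prefix before filling
arg_name, the caller needs neither a hard-coded offset nor a length macro.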

-Brian


[Mesa-dev] [Bug 97879] [amdgpu] Rocket League: long hangs (several seconds) when loading assets (models/textures/shaders?)

2016-09-27 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97879

--- Comment #30 from Michel Dänzer  ---
(In reply to Eero Tamminen from comment #20)
> Apitrace's own CPU overhead is so high that it's not very good for
> identifying CPU bottlenecks.

That may be true in general, but taking a CPU profile while replaying the
referenced apitrace clearly shows that most of the CPU cycles during the
startup phase are spent in the GLSL compiler frontend code for me. Can't you
reproduce that?

Anyway, it looks like there may be at least one other, not shader compilation
related, issue at play here. But that doesn't mean nothing can be done about
the shader compilation issue.

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH] gallium/r300: initialize pipe_resource::next to NULL

2016-09-27 Thread Michel Dänzer

On 28/09/16 12:33 AM, Rob Clark wrote:
> Signed-off-by: Rob Clark 
> ---
> I had a scan through the rest of pipe_resource allocations, and I think
> this is the only remaining one (besides r600_alloc_buffer_struct())
> which was using MALLOC_STRUCT()..  sorry 'bout that

Note that the MALLOC_STRUCT here isn't relevant:


> diff --git a/src/gallium/drivers/r300/r300_screen_buffer.c 
> b/src/gallium/drivers/r300/r300_screen_buffer.c
> index 4747058..24dd92f 100644
> --- a/src/gallium/drivers/r300/r300_screen_buffer.c
> +++ b/src/gallium/drivers/r300/r300_screen_buffer.c
> @@ -163,6 +163,7 @@ struct pipe_resource *r300_buffer_create(struct 
> pipe_screen *screen,
>  rbuf = MALLOC_STRUCT(r300_resource);
>  
>  rbuf->b.b = *templ;

The pipe_resource::next field is copied in from the template here, so
the question is really whether the next field of the template is
initialized to NULL by all callers.
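
To illustrate the point, here is a small sketch with hypothetical stand-in types (struct resource and struct driver_resource are not the real gallium structures). A whole-struct assignment from the template copies the link pointer too, so the allocator used does not matter; what matters is either the template's next field or an explicit reset after the copy:

```c
#include <stdlib.h>

/* Hypothetical stand-ins for pipe_resource and a driver-side resource. */
struct resource {
   struct resource *next;
   unsigned width;
};

struct driver_resource {
   struct resource base;
   void *driver_private;
};

static struct driver_resource *
driver_resource_create(const struct resource *templ)
{
   struct driver_resource *res = malloc(sizeof(*res));
   if (!res)
      return NULL;

   res->base = *templ;       /* copies every field, including templ->next */
   res->base.next = NULL;    /* so the link is reset explicitly afterwards */
   res->driver_private = NULL;
   return res;
}
```

With the explicit reset in place, the result no longer depends on whether every caller zeroed the template's next field.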


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

[Mesa-dev] [PATCH 1/5] glsl: remove tabs from linker.{cpp,h}

2016-09-27 Thread Timothy Arceri

---
 src/compiler/glsl/linker.cpp | 807 +--
 src/compiler/glsl/linker.h   |   8 +-
 2 files changed, 407 insertions(+), 408 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 290811f..e6b2231 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -102,8 +102,8 @@ public:
   ir_variable *const var = ir->lhs->variable_referenced();
 
   if (strcmp(name, var->name) == 0) {
-found = true;
-return visit_stop;
+ found = true;
+ return visit_stop;
   }
 
   return visit_continue_with_parent;
@@ -113,26 +113,26 @@ public:
{
   foreach_two_lists(formal_node, &ir->callee->parameters,
 actual_node, &ir->actual_parameters) {
-ir_rvalue *param_rval = (ir_rvalue *) actual_node;
-ir_variable *sig_param = (ir_variable *) formal_node;
-
-if (sig_param->data.mode == ir_var_function_out ||
-sig_param->data.mode == ir_var_function_inout) {
-   ir_variable *var = param_rval->variable_referenced();
-   if (var && strcmp(name, var->name) == 0) {
-  found = true;
-  return visit_stop;
-   }
-}
+ ir_rvalue *param_rval = (ir_rvalue *) actual_node;
+ ir_variable *sig_param = (ir_variable *) formal_node;
+
+ if (sig_param->data.mode == ir_var_function_out ||
+ sig_param->data.mode == ir_var_function_inout) {
+ir_variable *var = param_rval->variable_referenced();
+if (var && strcmp(name, var->name) == 0) {
+   found = true;
+   return visit_stop;
+}
+ }
   }
 
   if (ir->return_deref != NULL) {
-ir_variable *const var = ir->return_deref->variable_referenced();
+ ir_variable *const var = ir->return_deref->variable_referenced();
 
-if (strcmp(name, var->name) == 0) {
-   found = true;
-   return visit_stop;
-}
+ if (strcmp(name, var->name) == 0) {
+found = true;
+return visit_stop;
+ }
   }
 
   return visit_continue_with_parent;
@@ -163,8 +163,8 @@ public:
virtual ir_visitor_status visit(ir_dereference_variable *ir)
{
   if (strcmp(this->name, ir->var->name) == 0) {
-this->found = true;
-return visit_stop;
+ this->found = true;
+ return visit_stop;
   }
 
   return visit_continue;
@@ -665,7 +665,7 @@ validate_vertex_shader_executable(struct gl_shader_program 
*prog,
   linker_error(prog,
"vertex shader does not write to `gl_Position'. \n");
 }
-return;
+ return;
   }
}
 
@@ -708,7 +708,7 @@ validate_fragment_shader_executable(struct 
gl_shader_program *prog,
 
if (frag_color.variable_found() && frag_data.variable_found()) {
   linker_error(prog,  "fragment shader writes to both "
-  "`gl_FragColor' and `gl_FragData'\n");
+   "`gl_FragColor' and `gl_FragData'\n");
}
 }
 
@@ -793,7 +793,7 @@ validate_geometry_shader_emissions(struct gl_context *ctx,
 bool
 validate_intrastage_arrays(struct gl_shader_program *prog,
ir_variable *const var,
-  ir_variable *const existing)
+   ir_variable *const existing)
 {
/* Consider the types to be "the same" if both types are arrays
 * of the same type and one of the arrays is implicitly sized.
@@ -1084,7 +1084,7 @@ cross_validate_uniforms(struct gl_shader_program *prog)
glsl_symbol_table variables;
for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
   if (prog->_LinkedShaders[i] == NULL)
-continue;
+ continue;
 
   cross_validate_globals(prog, prog->_LinkedShaders[i]->ir, &variables,
  true);
@@ -1125,7 +1125,7 @@ interstage_cross_validate_uniform_blocks(struct 
gl_shader_program *prog,
  InterfaceBlockStageIndex[i][j] = -1;
 
   if (sh == NULL)
-continue;
+ continue;
 
   unsigned sh_num_blocks;
   struct gl_uniform_block **sh_blks;
@@ -1162,8 +1162,8 @@ interstage_cross_validate_uniform_blocks(struct 
gl_shader_program *prog,
   for (unsigned j = 0; j < *num_blks; j++) {
  int stage_index = InterfaceBlockStageIndex[i][j];
 
-if (stage_index != -1) {
-   struct gl_linked_shader *sh = prog->_LinkedShaders[i];
+ if (stage_index != -1) {
+struct gl_linked_shader *sh = prog->_LinkedShaders[i];
 
 blks[j].stageref |= (1 << i);
 
@@ -1171,7 +1171,7 @@ interstage_cross_validate_uniform_blocks(struct 
gl_shader_program *prog,
sh->ShaderStorageBlocks : sh->UniformBlocks;
 
 sh_blks[stage_index] = &blks[j];
-}
+ }
   }
}
 
@@ -1201,7 +1201,7 @@ populate_symbol_table(gl_linked_shader *sh)

[Mesa-dev] [PATCH 2/5] glsl: remove tabs from ast_expr.cpp

2016-09-27 Thread Timothy Arceri

---
 src/compiler/glsl/ast_expr.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ast_expr.cpp b/src/compiler/glsl/ast_expr.cpp
index e624d11..1fd5b6e 100644
--- a/src/compiler/glsl/ast_expr.cpp
+++ b/src/compiler/glsl/ast_expr.cpp
@@ -79,7 +79,7 @@ ast_expression::operator_string(enum ast_operators op)
 
 
 ast_expression_bin::ast_expression_bin(int oper, ast_expression *ex0,
-  ast_expression *ex1) :
+   ast_expression *ex1) :
ast_expression(oper, ex0, ex1, NULL)
 {
assert((oper >= ast_plus) && (oper <= ast_logic_not));
-- 
2.7.4


[Mesa-dev] [PATCH 3/5] glsl: remove remaining tabs from ast_array_index.cpp

2016-09-27 Thread Timothy Arceri

---
 src/compiler/glsl/ast_array_index.cpp | 73 +--
 1 file changed, 36 insertions(+), 37 deletions(-)

diff --git a/src/compiler/glsl/ast_array_index.cpp 
b/src/compiler/glsl/ast_array_index.cpp
index 2e36035..e29dafb 100644
--- a/src/compiler/glsl/ast_array_index.cpp
+++ b/src/compiler/glsl/ast_array_index.cpp
@@ -141,24 +141,24 @@ get_implicit_array_size(struct _mesa_glsl_parse_state 
*state,
 
 ir_rvalue *
 _mesa_ast_array_index_to_hir(void *mem_ctx,
-struct _mesa_glsl_parse_state *state,
-ir_rvalue *array, ir_rvalue *idx,
-YYLTYPE &loc, YYLTYPE &idx_loc)
+ struct _mesa_glsl_parse_state *state,
+ ir_rvalue *array, ir_rvalue *idx,
+ YYLTYPE &loc, YYLTYPE &idx_loc)
 {
if (!array->type->is_error()
&& !array->type->is_array()
&& !array->type->is_matrix()
&& !array->type->is_vector()) {
   _mesa_glsl_error(& idx_loc, state,
-  "cannot dereference non-array / non-matrix / "
-  "non-vector");
+   "cannot dereference non-array / non-matrix / "
+   "non-vector");
}
 
if (!idx->type->is_error()) {
   if (!idx->type->is_integer()) {
-_mesa_glsl_error(& idx_loc, state, "array index must be integer type");
+ _mesa_glsl_error(& idx_loc, state, "array index must be integer 
type");
   } else if (!idx->type->is_scalar()) {
-_mesa_glsl_error(& idx_loc, state, "array index must be scalar");
+ _mesa_glsl_error(& idx_loc, state, "array index must be scalar");
   }
}
 
@@ -182,33 +182,32 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
*negative constant expression."
*/
   if (array->type->is_matrix()) {
-if (array->type->row_type()->vector_elements <= idx) {
-   type_name = "matrix";
-   bound = array->type->row_type()->vector_elements;
-}
+ if (array->type->row_type()->vector_elements <= idx) {
+type_name = "matrix";
+bound = array->type->row_type()->vector_elements;
+ }
   } else if (array->type->is_vector()) {
-if (array->type->vector_elements <= idx) {
-   type_name = "vector";
-   bound = array->type->vector_elements;
-}
+ if (array->type->vector_elements <= idx) {
+type_name = "vector";
+bound = array->type->vector_elements;
+ }
   } else {
-/* glsl_type::array_size() returns -1 for non-array types.  This means
- * that we don't need to verify that the type is an array before
- * doing the bounds checking.
- */
-if ((array->type->array_size() > 0)
-&& (array->type->array_size() <= idx)) {
-   type_name = "array";
-   bound = array->type->array_size();
-}
+ /* glsl_type::array_size() returns -1 for non-array types.  This means
+  * that we don't need to verify that the type is an array before
+  * doing the bounds checking.
+  */
+ if ((array->type->array_size() > 0)
+ && (array->type->array_size() <= idx)) {
+type_name = "array";
+bound = array->type->array_size();
+ }
   }
 
   if (bound > 0) {
-_mesa_glsl_error(& loc, state, "%s index must be < %u",
- type_name, bound);
+ _mesa_glsl_error(& loc, state, "%s index must be < %u",
+  type_name, bound);
   } else if (idx < 0) {
-_mesa_glsl_error(& loc, state, "%s index must be >= 0",
- type_name);
+ _mesa_glsl_error(& loc, state, "%s index must be >= 0", type_name);
   }
 
   if (array->type->is_array())
@@ -253,18 +252,18 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
   * on uniform blocks but not shader storage blocks.
   *
   */
-_mesa_glsl_error(&loc, state, "%s block array index must be constant",
+ _mesa_glsl_error(&loc, state, "%s block array index must be constant",
   array->variable_referenced()->data.mode
   == ir_var_uniform ? "uniform" : "shader storage");
   } else {
-/* whole_variable_referenced can return NULL if the array is a
- * member of a structure.  In this case it is safe to not update
- * the max_array_access field because it is never used for fields
- * of structures.
- */
-ir_variable *v = array->whole_variable_referenced();
-if (v != NULL)
-   v->data.max_array_access = array->type->array_size() - 1;
+ /* whole_variable_referenced can return NULL if the array is a
+  * member of a structure.  In this case it is safe to not update
+  * the max_array_access

[Mesa-dev] [PATCH 5/5] glsl: remove remaining tabs from ast_type.cpp

2016-09-27 Thread Timothy Arceri

---
 src/compiler/glsl/ast_type.cpp | 39 ---
 1 file changed, 16 insertions(+), 23 deletions(-)

diff --git a/src/compiler/glsl/ast_type.cpp b/src/compiler/glsl/ast_type.cpp
index f3f6b29..b586f94 100644
--- a/src/compiler/glsl/ast_type.cpp
+++ b/src/compiler/glsl/ast_type.cpp
@@ -114,7 +114,7 @@ ast_type_qualifier::has_auxiliary_storage() const
  */
 bool
 ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
-   _mesa_glsl_parse_state *state,
+_mesa_glsl_parse_state *state,
 const ast_type_qualifier &q,
 bool is_single_layout_merge)
 {
@@ -182,16 +182,15 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
 
if (is_single_layout_merge && !state->has_enhanced_layouts() &&
(this->flags.i & q.flags.i & ~allowed_duplicates_mask.flags.i) != 0) {
-  _mesa_glsl_error(loc, state,
-  "duplicate layout qualifiers used");
+  _mesa_glsl_error(loc, state, "duplicate layout qualifiers used");
   return false;
}
 
if (q.flags.q.prim_type) {
   if (this->flags.q.prim_type && this->prim_type != q.prim_type) {
-_mesa_glsl_error(loc, state,
- "conflicting primitive type qualifiers used");
-return false;
+ _mesa_glsl_error(loc, state,
+  "conflicting primitive type qualifiers used");
+ return false;
   }
   this->prim_type = q.prim_type;
}
@@ -206,8 +205,8 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
 
if (q.flags.q.subroutine_def) {
   if (this->flags.q.subroutine_def) {
-_mesa_glsl_error(loc, state,
- "conflicting subroutine qualifiers used");
+ _mesa_glsl_error(loc, state,
+  "conflicting subroutine qualifiers used");
   } else {
  this->subroutine_list = q.subroutine_list;
   }
@@ -284,27 +283,24 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
 
if (q.flags.q.vertex_spacing) {
   if (this->flags.q.vertex_spacing && this->vertex_spacing != 
q.vertex_spacing) {
-_mesa_glsl_error(loc, state,
- "conflicting vertex spacing used");
-return false;
+ _mesa_glsl_error(loc, state, "conflicting vertex spacing used");
+ return false;
   }
   this->vertex_spacing = q.vertex_spacing;
}
 
if (q.flags.q.ordering) {
   if (this->flags.q.ordering && this->ordering != q.ordering) {
-_mesa_glsl_error(loc, state,
- "conflicting ordering used");
-return false;
+ _mesa_glsl_error(loc, state, "conflicting ordering used");
+ return false;
   }
   this->ordering = q.ordering;
}
 
if (q.flags.q.point_mode) {
   if (this->flags.q.point_mode && this->point_mode != q.point_mode) {
-_mesa_glsl_error(loc, state,
- "conflicting point mode used");
-return false;
+ _mesa_glsl_error(loc, state, "conflicting point mode used");
+ return false;
   }
   this->point_mode = q.point_mode;
}
@@ -328,8 +324,7 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
 
if (this->flags.q.in &&
(this->flags.i & ~input_layout_mask.flags.i) != 0) {
-  _mesa_glsl_error(loc, state,
-  "invalid input layout qualifier used");
+  _mesa_glsl_error(loc, state, "invalid input layout qualifier used");
   return false;
}
 
@@ -428,8 +423,7 @@ ast_type_qualifier::merge_out_qualifier(YYLTYPE *loc,
 
/* Generate an error when invalid input layout qualifiers are used. */
if ((q.flags.i & ~valid_out_mask.flags.i) != 0) {
-  _mesa_glsl_error(loc, state,
-  "invalid output layout qualifiers used");
+  _mesa_glsl_error(loc, state, "invalid output layout qualifiers used");
   return false;
}
 
@@ -513,8 +507,7 @@ ast_type_qualifier::merge_in_qualifier(YYLTYPE *loc,
 
/* Generate an error when invalid input layout qualifiers are used. */
if ((q.flags.i & ~valid_in_mask.flags.i) != 0) {
-  _mesa_glsl_error(loc, state,
-  "invalid input layout qualifiers used");
+  _mesa_glsl_error(loc, state, "invalid input layout qualifiers used");
   return false;
}
 
-- 
2.7.4


[Mesa-dev] [PATCH 4/5] glsl: remove remaining tabs from ast_to_hir.cpp

2016-09-27 Thread Timothy Arceri

---
 src/compiler/glsl/ast_to_hir.cpp | 78 
 1 file changed, 38 insertions(+), 40 deletions(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index 9de8454..2ad97d9 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -63,7 +63,7 @@ using namespace ir_builder;
 
 static void
 detect_conflicting_assignments(struct _mesa_glsl_parse_state *state,
-  exec_list *instructions);
+   exec_list *instructions);
 static void
 remove_per_vertex_blocks(exec_list *instructions,
  _mesa_glsl_parse_state *state, ir_variable_mode mode);
@@ -883,7 +883,7 @@ validate_assignment(struct _mesa_glsl_parse_state *state,
/* Check for implicit conversion in GLSL 1.20 */
if (apply_implicit_conversion(lhs->type, rhs, state)) {
   if (rhs->type == lhs->type)
-return rhs;
+ return rhs;
}
 
_mesa_glsl_error(&loc, state,
@@ -1029,11 +1029,11 @@ get_lvalue_copy(exec_list *instructions, ir_rvalue 
*lvalue)
ir_variable *var;
 
var = new(ctx) ir_variable(lvalue->type, "_post_incdec_tmp",
- ir_var_temporary);
+  ir_var_temporary);
instructions->push_tail(var);
 
instructions->push_tail(new(ctx) ir_assignment(new(ctx) 
ir_dereference_variable(var),
- lvalue));
+  lvalue));
 
return new(ctx) ir_dereference_variable(var);
 }
@@ -1160,11 +1160,11 @@ do_comparison(void *mem_ctx, int operation, ir_rvalue 
*op0, ir_rvalue *op1)
  */
 ir_rvalue *
 get_scalar_boolean_operand(exec_list *instructions,
-  struct _mesa_glsl_parse_state *state,
-  ast_expression *parent_expr,
-  int operand,
-  const char *operand_name,
-  bool *error_emitted)
+   struct _mesa_glsl_parse_state *state,
+   ast_expression *parent_expr,
+   int operand,
+   const char *operand_name,
+   bool *error_emitted)
 {
ast_expression *expr = parent_expr->subexpressions[operand];
void *ctx = state;
@@ -1457,8 +1457,8 @@ ast_expression::do_hir(exec_list *instructions,
* in a scalar boolean.  See page 57 of the GLSL 1.50 spec.
*/
   assert(type->is_error()
-|| ((type->base_type == GLSL_TYPE_BOOL)
-&& type->is_scalar()));
+ || ((type->base_type == GLSL_TYPE_BOOL)
+ && type->is_scalar()));
 
   result = new(ctx) ir_expression(operations[this->oper], type,
   op[0], op[1]);
@@ -3401,9 +3401,9 @@ apply_layout_qualifier_to_variable(const struct 
ast_type_qualifier *qual,
  ? "origin_upper_left" : "pixel_center_integer";
 
   _mesa_glsl_error(loc, state,
-  "layout qualifier `%s' can only be applied to "
-  "fragment shader input `gl_FragCoord'",
-  qual_string);
+   "layout qualifier `%s' can only be applied to "
+   "fragment shader input `gl_FragCoord'",
+   qual_string);
}
 
if (qual->flags.q.explicit_location) {
@@ -3717,7 +3717,7 @@ apply_type_qualifier_to_variable(const struct 
ast_type_qualifier *qual,
else if (qual->flags.q.in)
   var->data.mode = is_parameter ? ir_var_function_in : ir_var_shader_in;
else if (qual->flags.q.attribute
-   || (qual->flags.q.varying && (state->stage == 
MESA_SHADER_FRAGMENT)))
+|| (qual->flags.q.varying && (state->stage == 
MESA_SHADER_FRAGMENT)))
   var->data.mode = ir_var_shader_in;
else if (qual->flags.q.out)
   var->data.mode = is_parameter ? ir_var_function_out : ir_var_shader_out;
@@ -4013,9 +4013,9 @@ get_variable_being_redeclared(ir_variable *var, YYLTYPE 
loc,
  */
 ir_rvalue *
 process_initializer(ir_variable *var, ast_declaration *decl,
-   ast_fully_specified_type *type,
-   exec_list *initializer_instructions,
-   struct _mesa_glsl_parse_state *state)
+ast_fully_specified_type *type,
+exec_list *initializer_instructions,
+struct _mesa_glsl_parse_state *state)
 {
ir_rvalue *result = NULL;
 
@@ -4600,7 +4600,7 @@ ast_declarator_list::hir(exec_list *instructions,
* confusing error.
*/
   assert(this->type->specifier->structure == NULL || decl_type != NULL
-|| state->error);
+ || state->error);
 
   if (decl_type == NULL) {
  _mesa_glsl_error(&loc, state,
@@ -4752,7 +4752,7 @@ ast_declarator_list::hir(exec_list *instructions,
   }
 
   apply_type

[Mesa-dev] [PATCH 0/7] egl: Fixes and cleanups for EGLSync

2016-09-27 Thread Chad Versace

Fixes a deadlock in
dEQP-EGL.functional.fence_sync.invalid.get_invalid_value.

With the deadlock fixed, it's now possible to run all of
'dEQP-EGL.functional.fence_sync.*'.

The patch series' main goal is to unify the attribute parsing between
eglCreateSyncKHR(..., EGLint *attrib_list) and
eglCreateSync(..., EGLAttrib *attrib_list).
During the unification, some bugs found by inspection are fixed.
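
As a rough sketch of what such a unification can look like (not the code from this series): the narrow entry point widens its list once and then reuses a single parser that only understands the wide type. All names here are hypothetical; attr_int and attr_wide stand in for EGLint and EGLAttrib, and ATTR_NONE is just a terminator chosen for the sketch.

```c
#include <stdint.h>
#include <stdlib.h>

#define ATTR_NONE 0x3038     /* list terminator used by this sketch */

typedef int32_t  attr_int;   /* stand-in for EGLint */
typedef intptr_t attr_wide;  /* stand-in for EGLAttrib */

/* Shared parser: counts key/value pairs in a wide list; NULL means empty. */
static size_t
count_pairs(const attr_wide *list)
{
   size_t n = 0;

   if (list) {
      while (list[2 * n] != ATTR_NONE)
         ++n;
   }
   return n;
}

/* Narrow entry point: widen the list, then reuse the shared parser.
 * Returns the pair count, or -1 if the temporary list cannot be allocated. */
static long
count_pairs_from_ints(const attr_int *ints)
{
   size_t len = 0;
   attr_wide *wide;
   long result;

   if (!ints)
      return (long) count_pairs(NULL);

   while (ints[2 * len] != ATTR_NONE)
      ++len;

   wide = malloc((2 * len + 1) * sizeof(*wide));
   if (!wide)
      return -1;

   for (size_t i = 0; i < 2 * len; ++i)
      wide[i] = ints[i];
   wide[2 * len] = ATTR_NONE;

   result = (long) count_pairs(wide);
   free(wide);
   return result;
}
```

The design choice is that validation logic lives in one place, so the two entry points cannot drift apart in which attributes they accept.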

Tested with the following. No regressions found. Ratios are pass/fail.

                                         BEFORE     AFTER
  dEQP-EGL.functional.fence_sync.*       deadlock   27/0
  dEQP-EGL.functional.reusable_sync.*    8/17       8/17
  piglit egl_khr_fence_sync              pass       pass

This series lives at
git://git.kiwitree.net/~chadv/mesa  review/fences-v02

Chad Versace (7):
  egl: Fix missing unlock in eglGetSyncAttribKHR
  egl: Fix truncation error in _eglParseSyncAttribList64
  egl: Fix an error path in eglCreateSync*
  egl: Add _eglConvertIntsToAttribs()
  egl: Cleanup control flow in _eglParseSyncAttribList
  egl: Drop duplicate check on EGLSync type
  egl: Unify the EGLint/EGLAttrib paths in eglCreateSync*

 src/egl/drivers/dri2/egl_dri2.c |  6 +--
 src/egl/main/eglapi.c   | 88 -
 src/egl/main/eglapi.h   |  5 ++-
 src/egl/main/eglsync.c  | 59 ++-
 src/egl/main/eglsync.h  |  2 +-
 5 files changed, 95 insertions(+), 65 deletions(-)

-- 
2.10.0


[Mesa-dev] [PATCH 4/7] egl: Add _eglConvertIntsToAttribs()

2016-09-27 Thread Chad Versace

This function converts an attribute list from EGLint[] to EGLAttrib[].
It will be used in the following patches to clean up EGLSync attribute parsing.
---
 src/egl/main/eglapi.c | 41 +
 src/egl/main/eglapi.h |  2 ++
 2 files changed, 43 insertions(+)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 07f6794..697957e 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -251,6 +251,47 @@ _eglUnlockDisplay(_EGLDisplay *dpy)
 }
 
 
+/**
+ * Convert an attribute list from EGLint[] to EGLAttrib[].
+ *
+ * Return an EGL error code. The output parameter out_attrib_list is modified
+ * only on success.
+ */
+EGLint
+_eglConvertIntsToAttribs(const EGLint *int_list, EGLAttrib **out_attrib_list)
+{
+   size_t len = 0;
+   EGLAttrib *attrib_list;
+
+   if (int_list) {
+  while (int_list[2*len] != EGL_NONE)
+ ++len;
+   }
+
+   if (len == 0) {
+  *out_attrib_list = NULL;
+  return EGL_SUCCESS;
+   }
+
+   if (2*len + 1 > SIZE_MAX / sizeof(EGLAttrib))
+  return EGL_BAD_ALLOC;
+
+   attrib_list = malloc((2*len + 1) * sizeof(EGLAttrib));
+   if (!attrib_list)
+  return EGL_BAD_ALLOC;
+
+   for (size_t i = 0; i < len; ++i) {
+  attrib_list[2*i + 0] = int_list[2*i + 0];
+  attrib_list[2*i + 1] = int_list[2*i + 1];
+   }
+
+   attrib_list[2*len] = EGL_NONE;
+
+   *out_attrib_list = attrib_list;
+   return EGL_SUCCESS;
+}
+
+
 static EGLint *
 _eglConvertAttribsToInt(const EGLAttrib *attr_list)
 {
diff --git a/src/egl/main/eglapi.h b/src/egl/main/eglapi.h
index 2d6a24f..5d9c1b8 100644
--- a/src/egl/main/eglapi.h
+++ b/src/egl/main/eglapi.h
@@ -199,6 +199,8 @@ struct _egl_api
 struct mesa_glinterop_export_out *out);
 };
 
+EGLint _eglConvertIntsToAttribs(const EGLint *int_list,
+EGLAttrib **out_attrib_list);
 
 #ifdef __cplusplus
 }
-- 
2.10.0


[Mesa-dev] [PATCH 1/7] egl: Fix missing unlock in eglGetSyncAttribKHR

2016-09-27 Thread Chad Versace

On the error path, eglGetSyncAttribKHR neglected to unlock the
EGLDisplay before returning.

Fixes deadlock in dEQP-EGL.functional.fence_sync.invalid.get_invalid_value.

Cc: mesa-sta...@lists.freedesktop.org
Cc: Mark Janes 
---
 src/egl/main/eglapi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 1c62a80..44fc0b8 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -1602,7 +1602,7 @@ eglGetSyncAttribKHR(EGLDisplay dpy, EGLSync sync, EGLint 
attribute, EGLint *valu
EGLBoolean result;
 
if (!value)
-  RETURN_EGL_ERROR(NULL, EGL_BAD_PARAMETER, EGL_FALSE);
+  RETURN_EGL_ERROR(disp, EGL_BAD_PARAMETER, EGL_FALSE);
 
attrib = *value;
result = _eglGetSyncAttribCommon(disp, s, attribute, &attrib);
-- 
2.10.0
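
The bug class fixed by this patch can be sketched in isolation. Everything below is hypothetical: struct display uses a plain flag in place of a real mutex, and return_error() only models an exit helper that unlocks when it is handed a display, which is what the one-line diff suggests. Handing it NULL on an early return would leave the lock held, and the next call would block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical display; the flag stands in for a real mutex. */
struct display {
   int locked;
};

static void
display_lock(struct display *d)
{
   assert(!d->locked);   /* a real mutex would deadlock here instead */
   d->locked = 1;
}

static void
display_unlock(struct display *d)
{
   d->locked = 0;
}

/* Exit helper: unlocks only if it is given the display to unlock. */
static int
return_error(struct display *disp_to_unlock, int ret)
{
   if (disp_to_unlock)
      display_unlock(disp_to_unlock);
   return ret;
}

/* Returns 1 on success, 0 on error; the display is unlocked either way. */
static int
get_sync_value(struct display *disp, int *value)
{
   display_lock(disp);

   if (!value)
      return return_error(disp, 0);   /* passing NULL here would keep the lock */

   *value = 42;
   return return_error(disp, 1);
}
```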


[Mesa-dev] [PATCH 6/7] egl: Drop duplicate check on EGLSync type

2016-09-27 Thread Chad Versace

_eglInitSync checked that the display supported the sync type (such as
EGL_SYNC_FENCE), and did it wrong. When the check failed it emitted
EGL_BAD_ATTRIBUTE, but sometimes EGL_BAD_PARAMETER is needed.

_eglCreateSync already does the error checking, and it does it right.
---
 src/egl/main/eglsync.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/src/egl/main/eglsync.c b/src/egl/main/eglsync.c
index 6f77992..afb724f 100644
--- a/src/egl/main/eglsync.c
+++ b/src/egl/main/eglsync.c
@@ -110,12 +110,6 @@ _eglInitSync(_EGLSync *sync, _EGLDisplay *dpy, EGLenum 
type,
 {
EGLint err;
 
-   if (!(type == EGL_SYNC_REUSABLE_KHR && dpy->Extensions.KHR_reusable_sync) &&
-   !(type == EGL_SYNC_FENCE_KHR && dpy->Extensions.KHR_fence_sync) &&
-   !(type == EGL_SYNC_CL_EVENT_KHR && dpy->Extensions.KHR_cl_event2 &&
- attrib_list64))
-  return _eglError(EGL_BAD_ATTRIBUTE, "eglCreateSyncKHR");
-
_eglInitResource(&sync->Resource, sizeof(*sync), dpy);
sync->Type = type;
sync->SyncStatus = EGL_UNSIGNALED_KHR;
-- 
2.10.0


[Mesa-dev] [PATCH 3/7] egl: Fix an error path in eglCreateSync*

2016-09-27 Thread Chad Versace

When the user called eglCreateSync64KHR on a display without
EGL_KHR_cl_event2 (the only extension that exposes it), we returned
EGL_NO_SYNC but did not update the error code.

We also did the same for eglCreateSync on a display without EGL 1.5.
---
 src/egl/main/eglapi.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c
index 44fc0b8..07f6794 100644
--- a/src/egl/main/eglapi.c
+++ b/src/egl/main/eglapi.c
@@ -1395,8 +1395,18 @@ _eglCreateSync(_EGLDisplay *disp, EGLenum type, const 
EGLint *attrib_list,
 
_EGL_CHECK_DISPLAY(disp, EGL_NO_SYNC_KHR, drv);
 
-   if (!disp->Extensions.KHR_cl_event2 && is64)
-  RETURN_EGL_EVAL(disp, EGL_NO_SYNC_KHR);
+   if (!disp->Extensions.KHR_cl_event2 && is64) {
+  /* There exist two EGLAttrib variants of eglCreateSync*:
+   * eglCreateSync64KHR which requires EGL_KHR_cl_event2, and eglCreateSync
+   * which requires EGL 1.5. Here we use the presence of EGL_KHR_cl_event2
+   * support as a proxy for EGL 1.5 support, even though that's not
+   * entirely correct (though _eglComputeVersion does the same).
+   *
+   * The EGL spec provides no guidance on how to handle unsupported
+   * functions. EGL_BAD_MATCH seems reasonable.
+   */
+  RETURN_EGL_ERROR(disp, EGL_BAD_MATCH, EGL_NO_SYNC_KHR);
+   }
 
/* return an error if the client API doesn't support GL_OES_EGL_sync */
if (!ctx || ctx->Resource.Display != disp ||
-- 
2.10.0
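
A small sketch of the error-path rule this patch enforces: a function that reports failure through a sentinel return value should also record why. The names are hypothetical; last_error stands in for the error state that a caller would query afterwards, and the codes are placeholders rather than real EGL values.

```c
#include <stddef.h>

enum { ERR_SUCCESS = 0, ERR_BAD_MATCH = 1 };   /* placeholder codes */

static int last_error = ERR_SUCCESS;   /* stand-in for the queryable error state */

struct sync {
   int type;
};

static struct sync the_sync;

/* Returns NULL on failure and always records the reason. */
static struct sync *
create_sync(int extension_supported, int type)
{
   if (!extension_supported) {
      last_error = ERR_BAD_MATCH;   /* without this, callers see a stale code */
      return NULL;
   }

   the_sync.type = type;
   last_error = ERR_SUCCESS;
   return &the_sync;
}
```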

