Re: [Mesa-dev] [PATCH 2/2] clover: implement CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE

2015-06-25 Thread Grigori Goronzy
On 2015-05-28 13:04, Grigori Goronzy wrote: Work-group size should always be aligned to subgroup size; this is a basic requirement, otherwise some work-items will be no-operation. It might make sense to refine the value according to a kernel's resource usage, but that's a possible op
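
A minimal sketch of the alignment rule argued above, assuming a subgroup (wavefront) width of 64. The width and the helper name `align_work_group_size` are illustrative assumptions, not clover code. A work-group of 100 items still occupies two 64-wide subgroups, so 28 lanes would sit idle unless the size is rounded up to a multiple of the subgroup size:

```c
#include <assert.h>
#include <stddef.h>

/* Round a requested work-group size up to the next multiple of the
 * subgroup size. Hypothetical helper, for illustration only. */
static size_t
align_work_group_size(size_t requested, size_t subgroup_size)
{
   assert(subgroup_size > 0);
   return (requested + subgroup_size - 1) / subgroup_size * subgroup_size;
}
```

With a 64-wide subgroup, a request of 100 becomes 128, while a request of 64 is left unchanged.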

Re: [Mesa-dev] [PATCH 1/2] clover: fix event handling of buffer operations

2015-06-25 Thread Grigori Goronzy
On 2015-06-09 22:52, Francisco Jerez wrote: + + if (blocking) + hev().wait(); + hard_event::wait() may fail, so this should probably be done before the ret_object() call to avoid leaks. Alright... C++ exceptions are a minefield. :) Is there any reason you didn't make the same change

Re: [Mesa-dev] [PATCH] Revert "radeon/llvm: enable unsafe math for graphics shaders"

2015-02-18 Thread Grigori Goronzy
Hi, AFAIR not enabling this makes LLVM generate really slow code in some common cases. Maybe this is just a bug in LLVM/R600 triggered by unsafe FP math optimization or some optimization is too eager. Other drivers do fine with these types of optimization. What's the impact on performance with un

Re: [Mesa-dev] [PATCH] Revert "radeon/llvm: enable unsafe math for graphics shaders"

2015-02-18 Thread Grigori Goronzy
On 2015-02-18 09:13, Michel Dänzer wrote: On 18.02.2015 16:52, Grigori Goronzy wrote: Hi, AFAIR not enabling this makes LLVM generate really slow code in some common cases. Maybe this is just a bug in LLVM/R600 triggered by unsafe FP math optimization or some optimization is too eager

[Mesa-dev] [PATCH 2/3] radv: fix uninitialized variables

2016-10-11 Thread Grigori Goronzy
This gets rid of "may be used uninitialized" compiler warnings. --- src/amd/vulkan/radv_formats.c | 2 +- src/amd/vulkan/radv_pipeline.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/amd/vulkan/radv_formats.c b/src/amd/vulkan/radv_formats.c index 90c140c..76d5fa1 1006

[Mesa-dev] [PATCH 1/3] radv: add missing unreachable

2016-10-11 Thread Grigori Goronzy
--- src/amd/vulkan/radv_descriptor_set.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/amd/vulkan/radv_descriptor_set.c b/src/amd/vulkan/radv_descriptor_set.c index d1d2b1f..ba8a002 100644 --- a/src/amd/vulkan/radv_descriptor_set.c +++ b/src/amd/vulkan/radv_descriptor_set.c @@ -113,6 +1

[Mesa-dev] [PATCH 3/3] radv: fix strict aliasing violation

2016-10-11 Thread Grigori Goronzy
--- src/amd/vulkan/radv_pipeline_cache.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/amd/vulkan/radv_pipeline_cache.c b/src/amd/vulkan/radv_pipeline_cache.c index 032a7e4..85a2b6d 100644 --- a/src/amd/vulkan/radv_pipeline_cache.c +++ b/src/amd/vulkan/radv_pipeli

Re: [Mesa-dev] Mesa 12.1.0 release plan (Was Re: Next Mesa release, anyone?)

2016-10-19 Thread Grigori Goronzy
On 2016-10-04 12:32, Emil Velikov wrote: On 2 October 2016 at 14:17, Axel Davy wrote: I'd prefer myself Oct 14, because we have a lot of patches for nine, and they deserve more cleaning and testing, but if it's Oct 7, we'll try be on time. 14th it is. As mentioned before: _don't_ wait for t

Re: [Mesa-dev] [r600g] Mesa CVS 4e9aa67: vdpau has only MPEG1/2 on RV730

2013-09-30 Thread Grigori Goronzy
On 30.09.2013 10:06, Michel Dänzer wrote: On Sun, 2013-09-29 at 22:34 +0200, Dieter Nützel wrote: after latest git pull I've only MPEG1, MPEG2_SIMPLE and MPEG2_MAIN with my RV730 (AGP). Same problem on PALM. Bisection shows that it is caused by commit 68f6dec32. The initialization order se

[Mesa-dev] [PATCH] r600g: fix UVD detection

2013-09-30 Thread Grigori Goronzy
UVD was checked before the info fields were initialized. Introduced by commit 68f6dec32. --- src/gallium/drivers/r600/r600_pipe.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c index 097

[Mesa-dev] [PATCH] st/egl: flush resources before presentation

2013-10-01 Thread Grigori Goronzy
Fixes regression on r600g due to fast clear introduced by commit edbbfac6. --- src/gallium/state_trackers/egl/x11/native_dri2.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/gallium/state_trackers/egl/x11/native_dri2.c b/src/gallium/state_trackers/egl/x11/native_dri2.c inde

[Mesa-dev] [PATCH 1/2] radeon/uvd, st/vdpau: fix video format reporting

2013-10-01 Thread Grigori Goronzy
UVD can only support NV12 in the case of hardware decoding, but we can still use all other formats for software decoding. Use the UNKNOWN entrypoint to signal that we're not interested in hardware decoding. --- src/gallium/drivers/radeon/radeon_uvd.c | 7 +-- src/gallium/state_trackers/vdpau

[Mesa-dev] [PATCH 2/2] st/vl: use MPEG-2 chroma cositing

2013-10-01 Thread Grigori Goronzy
MPEG-2 and later video standards align the chroma sample position horizontally with the leftmost luma sample position. Add a half-texel offset to the chroma texture sampling coordinate to sample at this position instead of sampling in the center between the luma texels. This avoids minor color
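
To illustrate the offset, here is a rough sketch assuming 4:2:0 subsampling and texel centers at half-integer coordinates. Both helper names are hypothetical and this is not the st/vl shader code. With chroma co-sited on the even luma columns, chroma texel i sits at luma coordinate 2i + 0.5 rather than 2i + 1, so the chroma lookup shifts by half a luma texel, which is a quarter of a chroma texel:

```c
#include <assert.h>

/* Chroma texel coordinate for a luma-space x coordinate, assuming the
 * chroma sample is centered between two luma samples. */
static float
chroma_coord_centered(float luma_x)
{
   assert(luma_x >= 0.0f);
   return luma_x * 0.5f;
}

/* Same lookup with the chroma sample co-sited with the left luma sample:
 * half a luma texel equals a quarter of a chroma texel. */
static float
chroma_coord_cosited(float luma_x)
{
   assert(luma_x >= 0.0f);
   return luma_x * 0.5f + 0.25f;
}
```

Under these assumptions, the center of luma texel 0 (x = 0.5) lands exactly on the center of chroma texel 0 (0.5) in the co-sited variant, whereas the centered variant would sample between texels.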

[Mesa-dev] [PATCH 1/2] r600g: texture offsets for non-TXF instructions

2013-10-02 Thread Grigori Goronzy
All texture instructions can use offsets, not just TXF. Offsets into the literals array were wrong, too. --- src/gallium/drivers/r600/r600_shader.c | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drive

Re: [Mesa-dev] [PATCH 1/2] r600g: texture offsets for non-TXF instructions

2013-10-02 Thread Grigori Goronzy
On 03.10.2013 00:12, Grigori Goronzy wrote: All texture instructions can use offsets, not just TXF. Offsets into the literals array were wrong, too. BTW, I just noticed it now: this fixes the fs-textureOffset-2D piglit test, which unfortunately does not appear to be part of any of the test

Re: [Mesa-dev] [PATCH 1/2] radeon/uvd, st/vdpau: fix video format reporting

2013-10-07 Thread Grigori Goronzy
On 07.10.2013 11:25, Christian König wrote: On 01.10.2013 21:12, Ilia Mirkin wrote: On Tue, Oct 1, 2013 at 3:06 PM, Grigori Goronzy wrote: UVD can only support NV12 in the case of hardware decoding, but we can still use all other formats for software decoding. Use the UNKNOWN entrypoint to

[Mesa-dev] [PATCH 1/6] radeon/uvd: fix video format reporting

2013-10-08 Thread Grigori Goronzy
UVD can only support NV12 in the case of hardware decoding, but we can still use all other formats for software decoding. Use the UNKNOWN profile to signal that we're not interested in hardware decoding. v2: use profile instead of entrypoint --- src/gallium/drivers/radeon/radeon_uvd.c | 7 +-

[Mesa-dev] [PATCH 3/6] radeon/uvd: disable VC-1 simple/main profile

2013-10-08 Thread Grigori Goronzy
It doesn't work (decodes to garbage) with most videos on UVD 3.0. Worse yet, it often results in random memory corruption or GPU hangs. Rumor has it only the newest UVD hardware could do it anyway. --- src/gallium/drivers/radeon/radeon_uvd.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)

[Mesa-dev] [PATCH 2/6] radeon/uvd: try to fix VC-1 decoding

2013-10-08 Thread Grigori Goronzy
The DPB size calculations seem to be off; there is various random corruption happening, even with advanced profile. Always assuming a minimum number of references appears to fix it, similarly to H.264. This might overallocate the DPB. Also clean up the SPS/PPS field setup so that it matches VC-1 s

[Mesa-dev] [PATCH 4/6] st/vdpau: fix GenerateCSCMatrix with NULL procamp

2013-10-08 Thread Grigori Goronzy
As per API specification, it is legal to supply a NULL procamp. In this case, a CSC matrix according to the colorspace should be generated, but no further adjustments are made. Addresses: https://trac.videolan.org/vlc/ticket/9281 https://bugs.freedesktop.org/show_bug.cgi?id=68792 --- src/gallium/

[Mesa-dev] [PATCH 5/6] st/vdpau: add new formats to OutputSurface rendering

2013-10-08 Thread Grigori Goronzy
OutputSurfaces have simple YCbCr rendering functionality built in, but so far only 4:2:0 subsampling worked correctly. This fixes 4:2:2 and 4:4:4 formats. --- src/gallium/state_trackers/vdpau/output.c| 2 +- src/gallium/state_trackers/vdpau/vdpau_private.h | 23 +++ 2

[Mesa-dev] [PATCH 6/6] st/vdpau: really block until surface is idle

2013-10-08 Thread Grigori Goronzy
pipe_screen::fence_finish with zero timeout returns quickly and doesn't wait at all. Fix that, and also delete the fence afterwards, so that QuerySurfaceStatus returns the right state later. Addresses: https://trac.videolan.org/vlc/ticket/9281 https://bugs.freedesktop.org/show_bug.cgi?id=68792 ---

[Mesa-dev] [PATCH 2/2] st/vdpau: add format conversions for GetBitsYCbCr

2013-10-09 Thread Grigori Goronzy
Add simple plain C routines for NV12<->YV12 and YUYV<->UYVY conversions. The NV12->YV12 conversion is commonly used, for instance by VLC. --- src/gallium/state_trackers/vdpau/surface.c | 125 +++-- 1 file changed, 117 insertions(+), 8 deletions(-) diff --git a/src/gallium/
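
A sketch of what the NV12 to YV12 chroma conversion involves, assuming the usual layouts (NV12 stores one interleaved CbCr plane, YV12 stores separate Cr and Cb planes). The function name, parameters, and pitches are hypothetical and do not mirror the routines in surface.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split an interleaved NV12 chroma plane (Cb, Cr, Cb, Cr, ...) into the
 * separate V and U planes that YV12 expects. Pitches are in bytes. */
static void
nv12_chroma_to_yv12(const uint8_t *uv, size_t uv_pitch,
                    uint8_t *v, size_t v_pitch,
                    uint8_t *u, size_t u_pitch,
                    size_t chroma_width, size_t chroma_height)
{
   assert(uv && u && v);
   for (size_t y = 0; y < chroma_height; y++) {
      const uint8_t *src = uv + y * uv_pitch;
      for (size_t x = 0; x < chroma_width; x++) {
         u[y * u_pitch + x] = src[2 * x];     /* Cb */
         v[y * v_pitch + x] = src[2 * x + 1]; /* Cr */
      }
   }
}
```

The luma plane is identical in both formats and can be copied row by row, so only the chroma planes need this treatment.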

[Mesa-dev] [PATCH 1/2] r600/uvd: fix mapping of UVD surfaces for readback

2013-10-09 Thread Grigori Goronzy
R600_RESOURCE_FLAG_TRANSFER forces direct mapping, and reading from VRAM is simply too slow. VDPAU GetBitsYCbCr is unusable. Change to the new PIPE_BIND_LINEAR and adjust r600_transfer_map so that it uses a staging texture. --- src/gallium/drivers/r600/r600_uvd.c | 6 +++--- src/gallium/dri

Re: [Mesa-dev] [PATCH 1/2] r600/uvd: fix mapping of UVD surfaces for readback

2013-10-10 Thread Grigori Goronzy
On 10.10.2013 11:41, Christian König wrote: On 09.10.2013 22:19, Grigori Goronzy wrote: R600_RESOURCE_FLAG_TRANSFER forces direct mapping, and reading from VRAM is simply too slow. VDPAU GetBitsYCbCr is unusable. Change to the new PIPE_BIND_LINEAR and adjust r600_transfer_map so that it uses

[Mesa-dev] [PATCH] r600g: fix crash in set_framebuffer_state

2013-10-10 Thread Grigori Goronzy
We should be able to safely set the framebuffer state without a fragment shader bound. bind_ps_state will take care of updating the necessary state bits later. --- src/gallium/drivers/r600/evergreen_state.c | 4 +++- src/gallium/drivers/r600/r600_state.c | 4 +++- 2 files changed, 6 insertion

[Mesa-dev] [PATCH] r600g: fix crash in set_framebuffer_state

2013-10-10 Thread Grigori Goronzy
We should be able to safely set the framebuffer state without a fragment shader bound. bind_ps_state will take care of updating the necessary state bits later. v2: check in update_db_shader_control --- src/gallium/drivers/r600/evergreen_state.c | 23 +++ src/gallium/drivers/r6

[Mesa-dev] [PATCH 2/3] radeon: use staging for mapping linear textures

2013-10-13 Thread Grigori Goronzy
Textures that likely reside in VRAM, are mapped for reading and don't require direct mapping should be staged into GTT, to avoid bad performance. This fixes readback performance of VDPAU surfaces. --- src/gallium/drivers/radeon/r600_texture.c | 6 ++ 1 file changed, 6 insertions(+) diff --git

[Mesa-dev] [PATCH 1/3] radeon/uvd: use PIPE_BIND_LINEAR for video surfaces

2013-10-13 Thread Grigori Goronzy
This new bind flag forces linear storage, but does not have other side effects like R600_RESOURCE_FLAG_TRANSFER. --- src/gallium/drivers/r600/r600_uvd.c | 6 +++--- src/gallium/drivers/radeonsi/radeonsi_uvd.c | 8 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/src/

[Mesa-dev] [PATCH 3/3] st/vdpau: add format conversions for GetBitsYCbCr

2013-10-13 Thread Grigori Goronzy
Add simple plain C routines for NV12<->YV12 and YUYV<->UYVY conversions. The NV12->YV12 conversion is commonly used, for instance by VLC. --- src/gallium/state_trackers/vdpau/surface.c | 125 +++-- 1 file changed, 117 insertions(+), 8 deletions(-) diff --git a/src/gallium/

Re: [Mesa-dev] Decode hi10p with mesa uvd vdpau

2013-10-26 Thread Grigori Goronzy
On 26.10.2013 16:31, Peter Frühberger wrote: Hi, I looked at the openmax decoder posted yesterday and have seen that only two fields are missing to also decode hi10p with the current vdpau uvd infrastructure in place. I mailed two patches to the vdpau mailing list in order to get the API bumped

[Mesa-dev] [PATCH] st/vdpau: resolve delayed rendering for GL interop

2013-11-05 Thread Grigori Goronzy
Otherwise OutputSurface interop has funny results sometimes. This fixes interop with the mpv media player. --- src/gallium/state_trackers/vdpau/output.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/gallium/state_trackers/vdpau/output.c b/src/gallium/state_trackers/vdpau/output.c index

Re: [Mesa-dev] [PATCH] st/vdpau: resolve delayed rendering for GL interop

2013-11-06 Thread Grigori Goronzy
Mesa 10.0, but I don't know if this is a realistic goal. Best regards Grigori Thanks for the help, Christian. On 06.11.2013 00:35, Grigori Goronzy wrote: Otherwise OutputSurface interop has funny results sometimes. This fixes interop with the mpv media player. --- src/gallium/state

[Mesa-dev] [PATCH 1/4] r600g/sb: work around hw issues with stack on eg/cm

2013-11-15 Thread Grigori Goronzy
From: Vadim Girlin v2: make it actually work, improve condition Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68503 Cc: "10.0" Signed-off-by: Vadim Girlin --- src/gallium/drivers/r600/sb/sb_bc.h| 21 src/gallium/drivers/r600/sb/sb_bc_finalize.cpp | 129 +

Re: [Mesa-dev] Anonymous structure in uniform question

2013-11-24 Thread Grigori Goronzy
o YMMV. :) Best regards Grigori From 386dc4f201a65a2a8740c8c9f4a039d5c8209a9c Mon Sep 17 00:00:00 2001 From: Grigori Goronzy Date: Sun, 24 Nov 2013 20:24:58 +0100 Subject: [PATCH] WIP: fix unnamed struct type conflicts If two shader stages define the same unnamed struct type, they will co

[Mesa-dev] [PATCH 1/3] util/u_format: move utility function from r600g

2014-06-04 Thread Grigori Goronzy
We need this for radeonsi, and it might be useful for other drivers, too. --- src/gallium/auxiliary/util/u_format.c | 11 +++ src/gallium/auxiliary/util/u_format.h | 3 +++ src/gallium/drivers/r600/r600_blit.c | 12 +--- 3 files changed, 15 insertions(+), 11 deletions(-) diff --

[Mesa-dev] [PATCH 2/3] radeonsi: add sampling of 4:2:2 subsampled textures

2014-06-04 Thread Grigori Goronzy
This makes 4:2:2 video surfaces work in VDPAU. --- src/gallium/drivers/radeon/r600_texture.c | 5 +- src/gallium/drivers/radeonsi/si_blit.c| 91 ++- src/gallium/drivers/radeonsi/si_state.c | 15 + 3 files changed, 71 insertions(+), 40 deletions(-) diff --git

[Mesa-dev] [PATCH 3/3] radeon/uvd: disable VC-1 simple/main on UVD 2.x

2014-06-04 Thread Grigori Goronzy
It's about as broken as on later UVD revisions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66452 Cc: "10.1 10.2" --- src/gallium/drivers/radeon/radeon_video.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeon/radeon_video.c b/src/gall

Re: [Mesa-dev] [PATCH 2/3] radeonsi: add sampling of 4:2:2 subsampled textures

2014-06-17 Thread Grigori Goronzy
Ping? I'm not sure if this is completely correct, but this code path is only exercised by VDPAU and it seems to work fine on SI. Grigori On 04.06.2014 18:54, Grigori Goronzy wrote: > This makes 4:2:2 video surfaces work in VDPAU. > --- > src/gallium/drivers/radeon/r600_texture.c

Re: [Mesa-dev] [PATCH 2/3] radeonsi: add sampling of 4:2:2 subsampled textures

2014-06-18 Thread Grigori Goronzy
> This looks good to me. >> >> Reviewed-by: Marek Olšák >> >> Marek >> >> On Wed, Jun 4, 2014 at 6:54 PM, Grigori Goronzy >> wrote: >>> This makes 4:2:2 video surfaces work in VDPAU. >>> --- >>> src/gal

Re: [Mesa-dev] [PATCH 2/3] radeonsi: add sampling of 4:2:2 subsampled textures

2014-07-02 Thread Grigori Goronzy
On 02.07.2014 22:18, Andy Furniss wrote: > > Before I knew how to get field sync to use my TVs deinterlacer I had to > modify mesa so that I could use the vdpau de-interlacer(s), when I did > this I noticed that 422 didn't work and looked the same as it does now > this has gone in with my si. > A

Re: [Mesa-dev] [PATCH 4/5] r600g, radeonsi: Use write-combined persistent GTT mappings

2014-07-17 Thread Grigori Goronzy
On 17.07.2014 12:01, Michel Dänzer wrote: > From: Michel Dänzer > > This is hopefully safe: The kernel makes sure writes to these mappings > finish before the GPU might start reading from them, and the GPU caches > are invalidated at the start of a command stream. > Aren't CPU reads from write-c

[Mesa-dev] [PATCH 1/2] radeon/llvm: enable unsafe math for graphics shaders

2014-07-17 Thread Grigori Goronzy
Accuracy of some operations was recently improved in the R600 backend, at the cost of slower code. This is required for compute shaders, but not for graphics shaders. Add unsafe-fp-math hint to make LLVM generate faster but possibly less accurate code. Piglit didn't indicate any regressions. ---

[Mesa-dev] [PATCH 2/2] radeon/llvm: fix formatting

2014-07-17 Thread Grigori Goronzy
Use K&R and same indent as most other code. No functional change intended. --- src/gallium/drivers/radeon/radeon_llvm_emit.c | 24 ++-- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c b/src/gallium/drivers/radeon/ra

Re: [Mesa-dev] [PATCH 4/5] r600g, radeonsi: Use write-combined persistent GTT mappings

2014-07-18 Thread Grigori Goronzy
On 18.07.2014 13:45, Marek Olšák wrote: > If the requirements of GL_MAP_COHERENT_BIT are satisfied, then the > patch is okay. > Apart from correctness, I still wonder how this will affect performance, most notably CPU reads. This change unconditionally uses write-combined, uncached memory for MAP_

Re: [Mesa-dev] [PATCH 1/2] radeon/llvm: enable unsafe math for graphics shaders

2014-07-21 Thread Grigori Goronzy
On 17.07.2014 21:24, Tom Stellard wrote: > On Thu, Jul 17, 2014 at 06:44:25PM +0200, Grigori Goronzy wrote: >> Accuracy of some operations was recently improved in the R600 backend, >> at the cost of slower code. This is required for compute shaders, >> but not for graphics s

[Mesa-dev] [PATCH] radeonsi: implement BPTC texture support

2014-07-23 Thread Grigori Goronzy
Passes corrected piglit test and should also handle signed vs unsigned float correctly. --- src/gallium/drivers/radeonsi/si_state.c | 20 1 file changed, 20 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index 3de

Re: [Mesa-dev] [PATCH] st/mesa: expose EXT_framebuffer_multisample_blit_scaled if MSAA is supported

2013-07-16 Thread Grigori Goronzy
On 16.07.2013 19:26, Marek Olšák wrote: Surprisingly all drivers supporting MSAA can already do this (r300g and r600g for sure) and I think Christoph wanted to have this feature for his Nouveau drivers anyway. OK, they can do it, but is it actually any faster than doing a resolve and regular b

Re: [Mesa-dev] [PATCH] st/mesa: expose EXT_framebuffer_multisample_blit_scaled if MSAA is supported

2013-07-16 Thread Grigori Goronzy
On 17.07.2013 02:05, Marek Olšák wrote: No, it's not faster, but it's not slower either. Now that I think about it, I can't come up with a good shader-based algorithm for the resolve operation. I don't think Christoph's approach that an MSAA texture can be viewed as a larger single-sample textu

[Mesa-dev] [PATCH 1/3] gallium: add flush_resource context function

2013-09-09 Thread Grigori Goronzy
From: Marek Olšák r600g needs explicit flushing before DRI2 buffers are presented on the screen. v2: add (stub) implementations for all drivers, fix frontbuffer flushing --- src/gallium/docs/source/context.rst | 13 + src/gallium/drivers/freedreno/freedreno_resource.

[Mesa-dev] [PATCH 2/3] r600g: add support for separately allocated CMASKs

2013-09-09 Thread Grigori Goronzy
--- src/gallium/drivers/r600/evergreen_state.c | 24 +++- src/gallium/drivers/r600/r600_hw_context.c | 12 +--- src/gallium/drivers/r600/r600_resource.h | 3 +++ src/gallium/drivers/r600/r600_texture.c| 25 - 4 files changed, 55 insertions

[Mesa-dev] [PATCH 3/3] r600g: fast color clears for single-sample buffers

2013-09-09 Thread Grigori Goronzy
Allocate a CMASK on demand and use it to fast clear single-sample colorbuffers. Both FBOs and window system colorbuffers are fast cleared. Expand as needed when colorbuffers are mapped or displayed on screen. --- src/gallium/drivers/r600/evergreen_state.c | 11 + src/gallium/drivers/r600/r60

Re: [Mesa-dev] [PATCH 2/3] r600g: add support for separately allocated CMASKs

2013-09-09 Thread Grigori Goronzy
On 09.09.2013 16:09, Marek Olšák wrote: /* Check colorbuffers. */ for (i = 0; i < rctx->framebuffer.state.nr_cbufs; i++) { + struct r600_texture *tex = + (struct r600_texture*)rctx->framebuffer.state.cbufs[i]->texture; + Please check if cbu

[Mesa-dev] [PATCH v2 3/3] r600g: fast color clears for single-sample buffers

2013-09-10 Thread Grigori Goronzy
Allocate a CMASK on demand and use it to fast clear single-sample colorbuffers. Both FBOs and window system colorbuffers are fast cleared. Expand as needed when colorbuffers are mapped or displayed on screen. v2: cosmetics, move transfer expansion into dma_blit --- src/gallium/drivers/r600/evergr

[Mesa-dev] [PATCH v2 2/3] r600g: add support for separately allocated CMASKs

2013-09-10 Thread Grigori Goronzy
v2: check for NULL cbufs --- src/gallium/drivers/r600/evergreen_state.c | 24 +++- src/gallium/drivers/r600/r600_hw_context.c | 18 ++ src/gallium/drivers/r600/r600_resource.h | 3 +++ src/gallium/drivers/r600/r600_texture.c| 25 -

[Mesa-dev] [PATCH v2 1/3] gallium: add flush_resource context function

2013-09-10 Thread Grigori Goronzy
From: Marek Olšák r600g needs explicit flushing before DRI2 buffers are presented on the screen. v2: add (stub) implementations for all drivers, fix frontbuffer flushing v3: fix galahad --- src/gallium/docs/source/context.rst | 13 + src/gallium/drivers/freedreno/fre

[Mesa-dev] [PATCH 1/2] glsl: extract function for record comparisons

2013-11-26 Thread Grigori Goronzy
--- src/glsl/glsl_types.cpp | 61 +++-- src/glsl/glsl_types.h | 7 ++ 2 files changed, 41 insertions(+), 27 deletions(-) diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp index f740130..6c9727e 100644 --- a/src/glsl/glsl_types.cpp ++

[Mesa-dev] [PATCH 2/2] glsl: match unnamed record types across stages

2013-11-26 Thread Grigori Goronzy
Unnamed record types are assigned to separate types per stage, e.g. uniform struct { ... } a; if defined in both vertex and fragment shader, will result in two separate types of different name. When linking the shader, this results in a type conflict. However, there is no reason why this should n

Re: [Mesa-dev] [PATCH 1/2] glsl: extract function for record comparisons

2013-12-03 Thread Grigori Goronzy
Ping? Can anyone review this, please? Grigori On 27.11.2013 00:15, Grigori Goronzy wrote: --- src/glsl/glsl_types.cpp | 61 +++-- src/glsl/glsl_types.h | 7 ++ 2 files changed, 41 insertions(+), 27 deletions(-) diff --git a/src/glsl

Re: [Mesa-dev] [RFC] r600g/radeonsi: Use caching buffer manager for textures as well

2014-04-10 Thread Grigori Goronzy
On 10.04.2014 11:23, Michel Dänzer wrote: From: Michel Dänzer --- This is just an RFC; if other developers approve of this approach, I can make a more extensive patch removing the use_reusable_pool parameters. The x11perf numbers below compare ShmGet/PutImage before and after this change with

Re: [Mesa-dev] The way r600g handles shaders that use more than available GPRs

2014-04-20 Thread Grigori Goronzy
On 20.04.2014 03:02, Marek Olšák wrote: It looks like the check is not needed with SB, because SB performs register allocation. What happens if you comment out the conditional which fails? SB takes the machine code generated by the "classic" compiler as input, so the check is still needed. Th

[Mesa-dev] [PATCH] r600g: fix mega_fetch_count

2013-06-03 Thread Grigori Goronzy
According to ISA docs, the range is 1..64, so effectively bytes_to_fetch-1. --- src/gallium/drivers/r600/r600_shader.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 81ed3ce..0444579 100
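
The off-by-one described here can be sketched as follows. This assumes only what the message states (a valid range of 1..64 bytes, stored as the byte count minus one); the helper name is hypothetical and not taken from r600_shader.c:

```c
#include <assert.h>

/* Encode a fetch size of 1..64 bytes as the field value 0..63. */
static unsigned
encode_mega_fetch_count(unsigned bytes_to_fetch)
{
   assert(bytes_to_fetch >= 1 && bytes_to_fetch <= 64);
   return bytes_to_fetch - 1;
}
```

A 16-byte fetch is therefore written as 15, and the maximum 64-byte fetch as 63.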

Re: [Mesa-dev] [PATCH] r600g/sb: improve math optimizations

2013-06-04 Thread Grigori Goronzy
On 31.05.2013 14:37, Vadim Girlin wrote: There are no regressions on evergreen with piglit tests or any other apps that I tested, with and without llvm backend. (Issue with Unigine Heaven that I mentioned on #dri-devel yesterday was in fact caused by my own well-hidden bug, now it's fixed). Impr

[Mesa-dev] [RFC] r600g: implement fast color clears on evergreen+

2013-06-07 Thread Grigori Goronzy
This is my first try to contribute anything useful to Mesa, so please bear with me. This is not finished, but I'd like feedback to make sure the code's quality and style is in line with what is expected in Mesa.

[Mesa-dev] [PATCH] r600g: implement fast color clears on evergreen+

2013-06-07 Thread Grigori Goronzy
Allows MSAA colorbuffers, which have a CMASK automatically and don't need any further special handling, to be fast cleared. Instead of clearing the buffer, set the clear color and the CMASK to the cleared state. --- src/gallium/drivers/r600/evergreen_state.c | 8 +++- src/gallium/drivers/r600

Re: [Mesa-dev] [PATCH] r600g: implement fast color clears on evergreen+

2013-06-07 Thread Grigori Goronzy
On 08.06.2013 00:40, Marek Olšák wrote: Also the fast clear shouldn't be used for array, cube, and 3D textures unless all layers are cleared together. OK. I hadn't really thought about these. One more thing. If you don't use piglit, I recommend using it before sending patches to the mailing

[Mesa-dev] [PATCH] r600g: implement fast color clears on evergreen+

2013-06-10 Thread Grigori Goronzy
Allows MSAA colorbuffers, which have a CMASK automatically and don't need any further special handling, to be fast cleared. Instead of clearing the buffer, set the clear color and the CMASK to the cleared state. Fast clear is used only when all bound colorbuffers fulfill certain conditions: a CMAS

Re: [Mesa-dev] [PATCH] r600g: implement fast color clears on evergreen+

2013-06-11 Thread Grigori Goronzy
On 11.06.2013 02:41, Marek Olšák wrote: >> + >> + /* cannot pack color, needs support in u_format */ >> + if (desc->pack_rgba_float == NULL) { >> + return false; >> + } > > Hi Grigori, > > Is this for disallowing integer textures? You

[Mesa-dev] [PATCH v2] r600g: implement fast color clears on evergreen+

2013-06-11 Thread Grigori Goronzy
Allows MSAA colorbuffers, which have a CMASK automatically and don't need any further special handling, to be fast cleared. Instead of clearing the buffer, set the clear color and the CMASK to the cleared state. Fast clear is used only when all bound colorbuffers fulfill certain conditions: a CMAS

Re: [Mesa-dev] [PATCH v2] r600g: implement fast color clears on evergreen+

2013-06-28 Thread Grigori Goronzy
On 12.06.2013 00:04, Grigori Goronzy wrote: Allows MSAA colorbuffers, which have a CMASK automatically and don't need any further special handling, to be fast cleared. Instead of clearing the buffer, set the clear color and the CMASK to the cleared state. Fast clear is used only when all

[Mesa-dev] [PATCH 1/3] gallium: add expand_resource interface

2013-07-10 Thread Grigori Goronzy
This interface is used to expand fast-cleared window system colorbuffers. --- src/gallium/include/pipe/p_context.h | 8 src/gallium/state_trackers/dri/common/dri_drawable.c | 4 src/gallium/state_trackers/dri/drm/dri2.c| 8 ++-- 3 files changed, 18 ins

[Mesa-dev] [PATCH 2/3] r600g: add support for separately allocated CMASKs

2013-07-10 Thread Grigori Goronzy
--- src/gallium/drivers/r600/evergreen_state.c | 24 +++- src/gallium/drivers/r600/r600_hw_context.c | 12 +--- src/gallium/drivers/r600/r600_resource.h | 3 +++ src/gallium/drivers/r600/r600_texture.c| 25 - 4 files changed, 55 insertions

[Mesa-dev] [PATCH 3/3] r600g: fast color clears for single-sample buffers

2013-07-10 Thread Grigori Goronzy
Allocate a CMASK on demand and use it to fast clear single-sample colorbuffers. Both FBOs and window system colorbuffers are fast cleared. Expand as needed when colorbuffers are mapped or displayed on screen. --- src/gallium/drivers/r600/evergreen_state.c | 11 src/gallium/drivers/r600/r600

Re: [Mesa-dev] [PATCH 1/3] gallium: add expand_resource interface

2013-07-12 Thread Grigori Goronzy
On 12.07.2013 16:19, Jose Fonseca wrote: I admit I haven't fully understood what's being proposed yet. But just a few quick words. I always wanted to have a "present" method that ensures that the contents of a resource is made visible to whatever the consumer is (full-screen flip, blit to prim

Re: [Mesa-dev] [PATCH 27/30] r600g: calculate a better value for array_size

2014-02-04 Thread Grigori Goronzy
On 04.02.2014 00:53, Dave Airlie wrote: From: Dave Airlie attempt to calculate a better value for array size to avoid breaking apps. Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/r600_shader.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers

Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions

2014-02-05 Thread Grigori Goronzy
On 05.02.2014 18:08, Jose Fonseca wrote: I honestly hope that GL_AMD_pinned_memory doesn't become popular. It would have been alright if it wasn't for this bit in http://www.opengl.org/registry/specs/AMD/pinned_memory.txt which says: 2) Can the application still use the buffer using the C

[Mesa-dev] [PATCH] gallium: add geometry shader output limits

2014-02-05 Thread Grigori Goronzy
--- src/gallium/drivers/freedreno/freedreno_screen.c | 5 + src/gallium/drivers/i915/i915_screen.c | 5 + src/gallium/drivers/ilo/ilo_screen.c | 3 +++ src/gallium/drivers/llvmpipe/lp_screen.c | 3 +++ src/gallium/drivers/nouveau/nv30/nv30_screen.c | 2 ++ s

Re: [Mesa-dev] [PATCH] gallium: add geometry shader output limits

2014-02-08 Thread Grigori Goronzy
On 06.02.2014 02:46, Michel Dänzer wrote: + case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS: + return 16384; radeonsi currently can't handle more than 4095 total output components, as the buffer resource for writing to the GSVS ring only has 14 bits for the stride in byte

[Mesa-dev] [PATCH v2] gallium: add geometry shader output limits

2014-02-08 Thread Grigori Goronzy
v2: adjust limits for radeonsi and llvmpipe --- src/gallium/drivers/freedreno/freedreno_screen.c | 5 + src/gallium/drivers/i915/i915_screen.c | 5 + src/gallium/drivers/ilo/ilo_screen.c | 3 +++ src/gallium/drivers/llvmpipe/lp_screen.c | 3 +++ src/gallium/dr

[Mesa-dev] [PATCH v3] gallium: add geometry shader output limits

2014-02-09 Thread Grigori Goronzy
v2: adjust limits for radeonsi and llvmpipe v3: add documentation Cc: "10.1" --- src/gallium/docs/source/screen.rst | 6 ++ src/gallium/drivers/freedreno/freedreno_screen.c | 5 + src/gallium/drivers/i915/i915_screen.c | 5 + src/gallium/drivers/ilo/ilo_screen

[Mesa-dev] [PATCH 1/2] vl: add motion adaptive deinterlacer

2014-02-13 Thread Grigori Goronzy
/vl_deint_filter.c b/src/gallium/auxiliary/vl/vl_deint_filter.c new file mode 100644 index 000..9b05154 --- /dev/null +++ b/src/gallium/auxiliary/vl/vl_deint_filter.c @@ -0,0 +1,491 @@ +/** + * + * Copyright 2013 Grigori Goronzy

[Mesa-dev] [PATCH 2/2] st/vdpau: add support for DEINTERLACE_TEMPORAL

2014-02-13 Thread Grigori Goronzy
--- src/gallium/state_trackers/vdpau/mixer.c | 69 ++-- src/gallium/state_trackers/vdpau/query.c | 1 + src/gallium/state_trackers/vdpau/vdpau_private.h | 7 +++ 3 files changed, 73 insertions(+), 4 deletions(-) diff --git a/src/gallium/state_trackers/vdpau/m

Re: [Mesa-dev] [PATCH 1/2] vl: add motion adaptive deinterlacer

2014-02-15 Thread Grigori Goronzy
On 15.02.2014 13:14, Andy Furniss wrote: Thanks Grigori for doing this - looks really good on HD stuff I've tested and of course is easily fast enough, unlike anything on the CPU at high res. Any plans for the future? Well, adding edge-guided spatial interpolation for the temporal-spatial mo

[Mesa-dev] [PATCH 1/2] st/vdpau: fix possible NULL dereference

2014-03-02 Thread Grigori Goronzy
--- src/gallium/state_trackers/vdpau/mixer.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/gallium/state_trackers/vdpau/mixer.c b/src/gallium/state_trackers/vdpau/mixer.c index 996fd8e..e6bfb8c 100644 --- a/src/gallium/state_trackers/vdpau/mixer.c +++ b/src/galli

[Mesa-dev] [PATCH 2/2] NV_vdpau_interop: fix IsSurfaceNV return type

2014-03-02 Thread Grigori Goronzy
The spec incorrectly used void as return type, when it should have been GLboolean. This has now been fixed. According to Nvidia, their implementation always used GLboolean. --- include/GL/glext.h | 2 +- src/mapi/glapi/gen/NV_vdpau_interop.xml | 1 + src/mesa/main/vdpau.c

Re: [Mesa-dev] [PATCH 2/2] radeon/uvd: fix VC-1 simple/main profile decode

2015-09-23 Thread Grigori Goronzy
Hi, On 23.09.2015 10:11, Christian König wrote: > From: Boyuan Zhang > > Signed-off-by: Boyuan Zhang > Reviewed-by: Christian König > --- Thanks, nice to see this finally getting fixed, and it was a pretty simple thing after all... well, not quite yet apparently. Sometimes playback works corr

[Mesa-dev] [PATCH 2/2] radeonsi: use guard band clipping

2016-04-06 Thread Grigori Goronzy
With the previous changes to handling of viewport clipping, it is almost trivial to add proper support for guard band clipping. Select a suitable integer clipping value to keep inside the rasterizer's guard band range of [-32768, 32767] and program the hardware to use guard band clipping. Guard b

[Mesa-dev] [PATCH 1/2] radeonsi: do per-pixel clipping based on viewport states

2016-04-06 Thread Grigori Goronzy
From: Marek Olšák In other words, vport scissors are derived from viewport states. If the scissor test is enabled, the intersection of both is used. The guard band will disable clipping, so we have to clip per-pixel. v2: fix check for r600_draw_rectangle and other overflow conditions. (Grigori)

[Mesa-dev] [PATCH v2] radeonsi: use guard band clipping

2016-04-06 Thread Grigori Goronzy
With the previous changes to handling of viewport clipping, it is almost trivial to add proper support for guard band clipping. Select a suitable integer clipping value to keep inside the rasterizer's guard band range of [-32768, 32767] and program the hardware to use guard band clipping. Guard b
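
A minimal sketch of how such a clipping value might be chosen for one axis is given below. Only the [-32768, 32767] range comes from the patch description; the helper name, the use of the viewport center and half-extent as inputs, and the rounding down to a whole number are assumptions made for illustration, not the code from the patch.

```c
#include <assert.h>

/* Upper limit of the rasterizer's guard band range, as quoted in the
 * patch description. Using the smaller magnitude keeps both ends safe. */
#define GUARD_BAND_MAX 32767.0f

static float absf(float v) { return v < 0.0f ? -v : v; }

/* Hypothetical helper for a single axis. `translate` is the viewport
 * center and `scale` its half-extent, both in pixels. Returns the largest
 * whole-number factor g >= 1 such that translate +/- g * scale stays
 * inside the guard band range. A result of 1 means there is no room for
 * a guard band beyond the viewport itself. */
static float guard_band_factor(float translate, float scale)
{
    float half = absf(scale);
    assert(half > 0.0f);

    float g = (GUARD_BAND_MAX - absf(translate)) / half;
    if (g < 1.0f)
        return 1.0f;
    if (g > GUARD_BAND_MAX)
        return GUARD_BAND_MAX; /* keep the integer conversion in range */

    return (float)(int)g; /* round down to an integer clipping value */
}
```

For a 1920-pixel-wide viewport starting at 0 (center 960, half-extent 960) the sketch gives a factor of 33, since 960 + 33 * 960 = 32640 is still inside the range while a factor of 34 would not be.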

Re: [Mesa-dev] [PATCH 0/5] R600, GCN: Guard Band support

2016-04-11 Thread Grigori Goronzy
ssor & viewport code is deleted. Thanks for implementing this properly. Reviewed-by: Grigori Goronzy Grigori

Re: [Mesa-dev] [PATCH] radeonsi: fix mask checking when emitting scissors and viewports

2016-04-11 Thread Grigori Goronzy
e case: Only 1 viewport is active. */ - if (mask & 1 && - !si_get_vs_info(sctx)->writes_viewport_index) { + if (!si_get_vs_info(sctx)->writes_viewport_index) { + if (!(mask & 1)) + return; + Reviewed-by: Grigori Goronzy

Re: [Mesa-dev] [PATCH 1/4] gallium/radeon: add clear_texture function

2016-04-15 Thread Grigori Goronzy
On 2016-04-15 18:38, Ilia Mirkin wrote: + } else { + union pipe_color_union color; + switch (util_format_get_blocksizebits(res->format)) { + case 128: + sf->format = PIPE_FORMAT_R32G32B32A32_UINT; Just as an FYI... this is sa

[Mesa-dev] [PATCH] amdgpu/winsys: adjust IB size based on buffer wait time

2016-04-15 Thread Grigori Goronzy
Small IBs help to reduce stalls for workloads that require a lot of synchronization. On the other hand, if there is no notable synchronization, we can use a large IB size to slightly improve performance in some cases. This introduces tuning of the IB size based on feedback on the average buffer wa

[Mesa-dev] [RFC] dynamic IB size tuning for radeonsi

2016-04-15 Thread Grigori Goronzy
Hi, apps that cause a lot of synchronization benefit from small IB sizes. The current IB size is a bit on the large side for this class of apps. On the other hand, if there isn't much synchronization going on, increasing the IB size can slightly improve performance, too. Here's a quick hack that

Re: [Mesa-dev] [PATCH 1/4] gallium/radeon: add clear_texture function

2016-04-16 Thread Grigori Goronzy
On 2016-04-15 20:30, Jakob Sinclair wrote: In other places in radeonsi that require reinterpretation (e.g. si_blit.c), the surface template is modified instead of changing the surface after creation. I'm not sure if r600/radeonsi like it if the format is changed late like here. Seems to be cleane

Re: [Mesa-dev] [RFC] dynamic IB size tuning for radeonsi

2016-04-17 Thread Grigori Goronzy
Interesting, and thanks for poking at this issue. I've been thinking about tuning IB sizes as well. I'd like for us to get this right, so I wonder: What's your theory for _why_ your change helps? See below. I think you discovered it yourself. I'll be honest with you: Right now, I think your a

[Mesa-dev] [PATCH 1/2] winsys/amdgpu: adjust IB size based on buffer wait time

2016-04-19 Thread Grigori Goronzy
Small IBs help to reduce stalls for workloads that require a lot of synchronization. On the other hand, if there is no notable synchronization, we can use a large IB size to slightly improve performance in some cases. This introduces tuning of the IB size based on feedback on the average buffer wa
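
The feedback idea described here could be sketched roughly as follows. This is not the code from the patch: the struct, the halve/double policy, the threshold, and all names are hypothetical, chosen only to show a size that shrinks when buffer waits are long and grows when they are short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tuner state; none of these fields exist in the winsys. */
struct ib_tuner {
    uint32_t ib_size_dw;        /* current IB size in dwords */
    uint32_t min_dw;            /* lower bound for the IB size */
    uint32_t max_dw;            /* upper bound for the IB size */
    uint64_t wait_threshold_ns; /* average wait regarded as "a lot" */
};

/* Adjust the IB size from the measured average buffer wait time and
 * return the new size. Long waits suggest heavy synchronization, so the
 * IB is halved; short waits allow it to double. The result is always
 * kept within [min_dw, max_dw]. */
static uint32_t ib_tuner_update(struct ib_tuner *t, uint64_t avg_wait_ns)
{
    assert(t->min_dw > 0 && t->min_dw <= t->max_dw);

    if (avg_wait_ns > t->wait_threshold_ns) {
        t->ib_size_dw /= 2;
        if (t->ib_size_dw < t->min_dw)
            t->ib_size_dw = t->min_dw;
    } else if (t->ib_size_dw > t->max_dw / 2) {
        t->ib_size_dw = t->max_dw; /* doubling would exceed the bound */
    } else {
        t->ib_size_dw *= 2;
    }
    return t->ib_size_dw;
}
```

A real implementation would likely smooth the measurement over several submissions to avoid oscillating between sizes; the sketch reacts to each sample directly to stay short.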

[Mesa-dev] [PATCH 2/2] winsys/amdgpu: clean up and fix switch statement

2016-04-19 Thread Grigori Goronzy
Add missing break, add default case. Additionally initialize variables to avoid compiler warnings. --- src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c b/src/gallium/winsys/amdgpu/drm/amdgp
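
For illustration, the kind of cleanup described (every case ends in a break, a default case exists, and the result variable is initialized) looks roughly like the sketch below. The enum, the function, and the numeric values are hypothetical placeholders and are not taken from amdgpu_cs.c.

```c
#include <assert.h>

/* Hypothetical ring types, for illustration only. */
enum ring_type { RING_GFX, RING_DMA, RING_UVD };

/* Returns a placeholder alignment value per ring type. */
static unsigned ring_alignment(enum ring_type ring)
{
    /* Initialized up front so every path returns a defined value and the
     * compiler has no reason to warn about an uninitialized variable. */
    unsigned alignment = 0;

    switch (ring) {
    case RING_GFX:
        alignment = 8;
        break; /* without this break, control would fall into RING_DMA */
    case RING_DMA:
        alignment = 4;
        break;
    case RING_UVD:
        alignment = 16;
        break;
    default:
        /* Unexpected value: flag it in debug builds, keep 0 otherwise. */
        assert(!"unknown ring type");
        break;
    }
    return alignment;
}
```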

Re: [Mesa-dev] [PATCH 1/2] winsys/amdgpu: adjust IB size based on buffer wait time

2016-04-20 Thread Grigori Goronzy
thout any calls into the kernel, right? The winsys code makes that conditional and calls into the kernel when no fence pointer is available. Grigori On 19.04.2016 18:13, Grigori Goronzy wrote: Small IBs help to reduce stalls for workloads that require a lot of synchronization. On the other han

[Mesa-dev] [PATCH 2/2] clover: try userptr for CL_MEM_USE_HOST_PTR

2015-05-19 Thread Grigori Goronzy
According to spec, CL_MEM_USE_HOST_PTR should directly use host memory, if possible. This is just what userptr is for, so use it. In case the memory cannot be mapped, a fallback similar to CL_MEM_COPY_HOST_PTR is used. --- src/gallium/state_trackers/clover/core/memory.cpp | 2 +- src/gallium/s
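
The try-then-fall-back behaviour described above might be sketched as follows. This is plain C rather than clover's C++, and everything in it is hypothetical: the struct, the function names, and in particular the page-alignment check, which merely stands in for whatever conditions a real driver's userptr import imposes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed page size; real code would query the system. */
#define PAGE_SIZE_ASSUMED 4096u

/* Hypothetical buffer object. */
struct host_buffer {
    void  *data;
    size_t size;
    int    owns_data; /* 1 if data was allocated by the fallback path */
};

/* Stand-in for a userptr import: here, direct use is assumed to require a
 * page-aligned address and a size that is a multiple of the page size. */
static int can_map_directly(const void *ptr, size_t size)
{
    return ((uintptr_t)ptr % PAGE_SIZE_ASSUMED) == 0 &&
           (size % PAGE_SIZE_ASSUMED) == 0;
}

/* Use the host memory in place when possible; otherwise copy it into a
 * freshly allocated buffer, similar in spirit to CL_MEM_COPY_HOST_PTR.
 * Returns 0 on success and -1 if the fallback allocation fails. */
static int host_buffer_init(struct host_buffer *buf, void *host_ptr,
                            size_t size)
{
    assert(buf && host_ptr && size > 0);

    buf->size = size;
    if (can_map_directly(host_ptr, size)) {
        buf->data = host_ptr;
        buf->owns_data = 0;
        return 0;
    }

    buf->data = malloc(size);
    if (!buf->data)
        return -1;
    memcpy(buf->data, host_ptr, size);
    buf->owns_data = 1;
    return 0;
}
```

The `owns_data` flag records which path was taken so that only fallback copies are freed when the buffer is destroyed.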
