On 2015-05-28 13:04, Grigori Goronzy wrote:
Work-group size should always be aligned to subgroup size; this is a
basic requirement, otherwise some work-items will be no-operation.
It might make sense to refine the value according to a kernel's
resource usage, but that's a possible op
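The alignment requirement in the message above can be illustrated with a small sketch. The function names are hypothetical and are not part of clover or any driver; the sketch only assumes that work-items are scheduled in whole subgroups, so a partially filled subgroup leaves lanes idle.

```c
#include <stddef.h>

/* Hypothetical helper: round a work-group size up to the next multiple
 * of the subgroup size. subgroup_size must be non-zero. */
static size_t
align_to_subgroup(size_t work_group_size, size_t subgroup_size)
{
   size_t rem = work_group_size % subgroup_size;

   return rem ? work_group_size + (subgroup_size - rem) : work_group_size;
}

/* Hypothetical helper: number of lanes that would execute as no-ops if
 * the work-group size were used without alignment. */
static size_t
idle_lanes(size_t work_group_size, size_t subgroup_size)
{
   return align_to_subgroup(work_group_size, subgroup_size) - work_group_size;
}
```

For example, with a subgroup size of 64, a work-group of 100 items occupies two subgroups and leaves 28 lanes idle, while a work-group of 128 leaves none.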
On 2015-06-09 22:52, Francisco Jerez wrote:
+
+ if (blocking)
+ hev().wait();
+
hard_event::wait() may fail, so this should probably be done before the
ret_object() call to avoid leaks.
Alright... C++ exceptions are a minefield. :)
Is there any reason you didn't make
the same change
Hi,
AFAIR not enabling this makes LLVM generate really slow code in some
common cases. Maybe this is just a bug in LLVM/R600 triggered by unsafe
FP math optimization or some optimization is too eager. Other drivers do
fine with these types of optimization.
What's the impact on performance with un
On 2015-02-18 09:13, Michel Dänzer wrote:
On 18.02.2015 16:52, Grigori Goronzy wrote:
Hi,
AFAIR not enabling this makes LLVM generate really slow code in some
common cases. Maybe this is just a bug in LLVM/R600 triggered by unsafe
FP math optimization or some optimization is too eager
This gets rid of "may be used uninitialized" compiler warnings.
---
src/amd/vulkan/radv_formats.c | 2 +-
src/amd/vulkan/radv_pipeline.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/amd/vulkan/radv_formats.c b/src/amd/vulkan/radv_formats.c
index 90c140c..76d5fa1 1006
---
src/amd/vulkan/radv_descriptor_set.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/amd/vulkan/radv_descriptor_set.c
b/src/amd/vulkan/radv_descriptor_set.c
index d1d2b1f..ba8a002 100644
--- a/src/amd/vulkan/radv_descriptor_set.c
+++ b/src/amd/vulkan/radv_descriptor_set.c
@@ -113,6 +1
---
src/amd/vulkan/radv_pipeline_cache.c | 7 +--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/src/amd/vulkan/radv_pipeline_cache.c
b/src/amd/vulkan/radv_pipeline_cache.c
index 032a7e4..85a2b6d 100644
--- a/src/amd/vulkan/radv_pipeline_cache.c
+++ b/src/amd/vulkan/radv_pipeli
On 2016-10-04 12:32, Emil Velikov wrote:
On 2 October 2016 at 14:17, Axel Davy wrote:
I'd prefer myself Oct 14, because we have a lot of patches for nine, and
they deserve more cleaning and testing, but if it's Oct 7, we'll try to be
on time.
14th it is. As mentioned before: _don't_ wait for t
On 30.09.2013 10:06, Michel Dänzer wrote:
On Sun, 2013-09-29 at 22:34 +0200, Dieter Nützel wrote:
after latest git pull I've only MPEG1, MPEG2_SIMPLE and MPEG2_MAIN with
my RV730 (AGP).
Same problem on PALM. Bisection shows that it is caused by commit
68f6dec32. The initialization order se
UVD was checked before the info fields were initialized. Introduced
by commit 68f6dec32.
---
src/gallium/drivers/r600/r600_pipe.c | 13 +++--
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/src/gallium/drivers/r600/r600_pipe.c
b/src/gallium/drivers/r600/r600_pipe.c
index 097
Fixes regression on r600g due to fast clear introduced by commit
edbbfac6.
---
src/gallium/state_trackers/egl/x11/native_dri2.c | 11 +++
1 file changed, 11 insertions(+)
diff --git a/src/gallium/state_trackers/egl/x11/native_dri2.c
b/src/gallium/state_trackers/egl/x11/native_dri2.c
inde
UVD can only support NV12 in the case of hardware decoding, but we
can still use all other formats for software decoding. Use the UNKNOWN
entrypoint to signal that we're not interested in hardware decoding.
---
src/gallium/drivers/radeon/radeon_uvd.c | 7 +--
src/gallium/state_trackers/vdpau
MPEG-2 and later video standards align the chroma sample position
horizontally with the leftmost luma sample position. Add a half-texel
offset to the chroma texture sampling coordinate to sample at
this position instead of sampling in the center between the luma
texels. This avoids minor color
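The offset described in the message above can be sketched as follows. This is only an illustration under stated assumptions, not the code from the patch: it assumes normalized texture coordinates, 2x horizontal chroma subsampling, and that "half-texel" means half a luma texel. The function name is hypothetical.

```c
/* Hypothetical helper: map a normalized luma-space coordinate u to the
 * coordinate used to sample a horizontally 2x-subsampled chroma plane
 * whose samples are co-sited with the left luma sample of each pair.
 *
 * Chroma texel i has its center at (2i + 1) / luma_width, but its value
 * belongs to luma sample 2i, whose center is at (2i + 0.5) / luma_width.
 * Shifting the lookup by half a luma texel makes the two coincide. */
static double
chroma_coord_left_sited(double u, unsigned luma_width)
{
   return u + 0.5 / (double)luma_width;
}
```

With a luma width of 8, the center of luma texel 2 is at 0.3125; the shifted lookup lands at 0.375, which is the center of chroma texel 1 in the 4-texel-wide chroma plane.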
All texture instructions can use offsets, not just TXF. Offsets into
the literals array were wrong, too.
---
src/gallium/drivers/r600/r600_shader.c | 20 ++--
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/src/gallium/drivers/r600/r600_shader.c
b/src/gallium/drive
On 03.10.2013 00:12, Grigori Goronzy wrote:
All texture instructions can use offsets, not just TXF. Offsets into
the literals array were wrong, too.
BTW, I just noticed it now: this fixes the fs-textureOffset-2D piglit
test, which unfortunately does not appear to be part of any of the test
On 07.10.2013 11:25, Christian König wrote:
On 01.10.2013 21:12, Ilia Mirkin wrote:
On Tue, Oct 1, 2013 at 3:06 PM, Grigori Goronzy
wrote:
UVD can only support NV12 in the case of hardware decoding, but we
can still use all other formats for software decoding. Use the UNKNOWN
entrypoint to
UVD can only support NV12 in the case of hardware decoding, but we
can still use all other formats for software decoding. Use the UNKNOWN
profile to signal that we're not interested in hardware decoding.
v2: use profile instead of entrypoint
---
src/gallium/drivers/radeon/radeon_uvd.c | 7 +-
It doesn't work (decodes to garbage) with most videos on UVD 3.0. Worse
yet, it often results in random memory corruption or GPU hangs. Rumor
has it only the newest UVD hardware could do it anyway.
---
src/gallium/drivers/radeon/radeon_uvd.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
The DPB size calculations seem to be off; there is various random
corruption happening, even with advanced profile. Always assuming
a minimum number of references appears to fix it, similarly to
H.264. This might overallocate the DPB. Also clean up the SPS/PPS
field setup so that it matches VC-1 s
As per API specification, it is legal to supply a NULL procamp. In this
case, a CSC matrix according to the colorspace should be generated,
but no further adjustments are made.
Addresses:
https://trac.videolan.org/vlc/ticket/9281
https://bugs.freedesktop.org/show_bug.cgi?id=68792
---
src/gallium/
OutputSurfaces have simple YCbCr rendering functionality built in,
but so far only 4:2:0 subsampling worked correctly. This fixes 4:2:2
and 4:4:4 formats.
---
src/gallium/state_trackers/vdpau/output.c | 2 +-
src/gallium/state_trackers/vdpau/vdpau_private.h | 23 +++
2
pipe_screen::fence_finish with zero timeout returns quickly and
doesn't wait at all. Fix that, and also delete the fence afterwards,
so that QuerySurfaceStatus returns the right state later.
Addresses:
https://trac.videolan.org/vlc/ticket/9281
https://bugs.freedesktop.org/show_bug.cgi?id=68792
---
Add simple plain C routines for NV12<->YV12 and YUYV<->UYVY
conversions. The NV12->YV12 conversion is commonly used, for instance
by VLC.
---
src/gallium/state_trackers/vdpau/surface.c | 125 +++--
1 file changed, 117 insertions(+), 8 deletions(-)
diff --git a/src/gallium/
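The NV12->YV12 conversion mentioned in the message above comes down to de-interleaving the chroma plane. The sketch below is not the code from the patch; the function name, parameters and stride handling are assumptions made for illustration. It relies only on the standard layouts: NV12 stores chroma as interleaved U,V byte pairs, while YV12 stores separate V and U planes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split the interleaved UV plane of an NV12 image
 * into the separate V and U planes used by YV12. The Y plane has the
 * same layout in both formats and can be copied row by row as-is.
 * Strides are in bytes; chroma_width and chroma_height are the size of
 * one chroma plane in samples. */
static void
nv12_chroma_to_yv12(const uint8_t *uv, size_t uv_stride,
                    uint8_t *v, size_t v_stride,
                    uint8_t *u, size_t u_stride,
                    unsigned chroma_width, unsigned chroma_height)
{
   for (unsigned y = 0; y < chroma_height; y++) {
      const uint8_t *src = uv + y * uv_stride;

      for (unsigned x = 0; x < chroma_width; x++) {
         u[y * u_stride + x] = src[2 * x];     /* even bytes hold U */
         v[y * v_stride + x] = src[2 * x + 1]; /* odd bytes hold V */
      }
   }
}
```

The reverse direction (YV12->NV12) would interleave the two planes with the same indexing.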
R600_RESOURCE_FLAG_TRANSFER forces direct mapping, and reading from
VRAM is simply too slow. VDPAU GetBitsYCbCr is unusable. Change to
the new PIPE_BIND_LINEAR and adjust r600_transfer_map so that it uses
a staging texture.
---
src/gallium/drivers/r600/r600_uvd.c | 6 +++---
src/gallium/dri
On 10.10.2013 11:41, Christian König wrote:
On 09.10.2013 22:19, Grigori Goronzy wrote:
R600_RESOURCE_FLAG_TRANSFER forces direct mapping, and reading from
VRAM is simply too slow. VDPAU GetBitsYCbCr is unusable. Change to
the new PIPE_BIND_LINEAR and adjust r600_transfer_map so that it uses
We should be able to safely set the framebuffer state without a
fragment shader bound. bind_ps_state will take care of updating the
necessary state bits later.
---
src/gallium/drivers/r600/evergreen_state.c | 4 +++-
src/gallium/drivers/r600/r600_state.c | 4 +++-
2 files changed, 6 insertion
We should be able to safely set the framebuffer state without a
fragment shader bound. bind_ps_state will take care of updating the
necessary state bits later.
v2: check in update_db_shader_control
---
src/gallium/drivers/r600/evergreen_state.c | 23 +++
src/gallium/drivers/r6
Textures that likely reside in VRAM, are mapped for reading, and
don't require direct mapping should be staged into GTT to avoid bad
performance. This fixes readback performance of VDPAU surfaces.
---
src/gallium/drivers/radeon/r600_texture.c | 6 ++
1 file changed, 6 insertions(+)
diff --git
This new bind flag forces linear storage, but does not have other
side effects like R600_RESOURCE_FLAG_TRANSFER.
---
src/gallium/drivers/r600/r600_uvd.c | 6 +++---
src/gallium/drivers/radeonsi/radeonsi_uvd.c | 8
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/src/
Add simple plain C routines for NV12<->YV12 and YUYV<->UYVY
conversions. The NV12->YV12 conversion is commonly used, for instance
by VLC.
---
src/gallium/state_trackers/vdpau/surface.c | 125 +++--
1 file changed, 117 insertions(+), 8 deletions(-)
diff --git a/src/gallium/
On 26.10.2013 16:31, Peter Frühberger wrote:
Hi,
I looked at the openmax decoder posted yesterday and have seen that
only two fields are missing to also decode hi10p with the current
vdpau uvd infrastructure in place.
I mailed two patches to the vdpau mailing list in order to get the API
bumped
Otherwise OutputSurface interop has funny results sometimes.
This fixes interop with the mpv media player.
---
src/gallium/state_trackers/vdpau/output.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/gallium/state_trackers/vdpau/output.c
b/src/gallium/state_trackers/vdpau/output.c
index
Mesa 10.0, but I don't
know if this is a realistic goal.
Best regards
Grigori
Thanks for the help,
Christian.
On 06.11.2013 00:35, Grigori Goronzy wrote:
Otherwise OutputSurface interop has funny results sometimes.
This fixes interop with the mpv media player.
---
src/gallium/state
From: Vadim Girlin
v2: make it actually work, improve condition
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68503
Cc: "10.0"
Signed-off-by: Vadim Girlin
---
src/gallium/drivers/r600/sb/sb_bc.h | 21
src/gallium/drivers/r600/sb/sb_bc_finalize.cpp | 129 +
o YMMV. :)
Best regards
Grigori
From 386dc4f201a65a2a8740c8c9f4a039d5c8209a9c Mon Sep 17 00:00:00 2001
From: Grigori Goronzy
Date: Sun, 24 Nov 2013 20:24:58 +0100
Subject: [PATCH] WIP: fix unnamed struct type conflicts
If two shader stages define the same unnamed struct type, they will
co
We need this for radeonsi, and it might be useful for other drivers,
too.
---
src/gallium/auxiliary/util/u_format.c | 11 +++
src/gallium/auxiliary/util/u_format.h | 3 +++
src/gallium/drivers/r600/r600_blit.c | 12 +---
3 files changed, 15 insertions(+), 11 deletions(-)
diff --
This makes 4:2:2 video surfaces work in VDPAU.
---
src/gallium/drivers/radeon/r600_texture.c | 5 +-
src/gallium/drivers/radeonsi/si_blit.c | 91 ++-
src/gallium/drivers/radeonsi/si_state.c | 15 +
3 files changed, 71 insertions(+), 40 deletions(-)
diff --git
It's about as broken as on later UVD revisions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66452
Cc: "10.1 10.2"
---
src/gallium/drivers/radeon/radeon_video.c | 5 -
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/gallium/drivers/radeon/radeon_video.c
b/src/gall
Ping? I'm not sure if this is completely correct, but this code path is
only exercised by VDPAU and it seems to work fine on SI.
Grigori
On 04.06.2014 18:54, Grigori Goronzy wrote:
> This makes 4:2:2 video surfaces work in VDPAU.
> ---
> src/gallium/drivers/radeon/r600_texture.c
> This looks good to me.
>>
>> Reviewed-by: Marek Olšák
>>
>> Marek
>>
>> On Wed, Jun 4, 2014 at 6:54 PM, Grigori Goronzy
>> wrote:
>>> This makes 4:2:2 video surfaces work in VDPAU.
>>> ---
>>> src/gal
On 02.07.2014 22:18, Andy Furniss wrote:
>
> Before I knew how to get field sync to use my TVs deinterlacer I had to
> modify mesa so that I could use the vdpau de-interlacer(s), when I did
> this I noticed that 422 didn't work and looked the same as it does now
> this has gone in with my si.
>
A
On 17.07.2014 12:01, Michel Dänzer wrote:
> From: Michel Dänzer
>
> This is hopefully safe: The kernel makes sure writes to these mappings
> finish before the GPU might start reading from them, and the GPU caches
> are invalidated at the start of a command stream.
>
Aren't CPU reads from write-c
Accuracy of some operations was recently improved in the R600 backend,
at the cost of slower code. This is required for compute shaders,
but not for graphics shaders. Add unsafe-fp-math hint to make LLVM
generate faster but possibly less accurate code.
Piglit didn't indicate any regressions.
---
Use K&R and same indent as most other code. No functional change
intended.
---
src/gallium/drivers/radeon/radeon_llvm_emit.c | 24 ++--
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c
b/src/gallium/drivers/radeon/ra
On 18.07.2014 13:45, Marek Olšák wrote:
> If the requirements of GL_MAP_COHERENT_BIT are satisfied, then the
> patch is okay.
>
Apart from correctness, I still wonder how this will affect performance,
most notably CPU reads. This change unconditionally uses write-combined,
uncached memory for MAP_
On 17.07.2014 21:24, Tom Stellard wrote:
> On Thu, Jul 17, 2014 at 06:44:25PM +0200, Grigori Goronzy wrote:
>> Accuracy of some operations was recently improved in the R600 backend,
>> at the cost of slower code. This is required for compute shaders,
>> but not for graphics s
Passes corrected piglit test and should also handle signed vs unsigned
float correctly.
---
src/gallium/drivers/radeonsi/si_state.c | 20
1 file changed, 20 insertions(+)
diff --git a/src/gallium/drivers/radeonsi/si_state.c
b/src/gallium/drivers/radeonsi/si_state.c
index 3de
On 16.07.2013 19:26, Marek Olšák wrote:
Surprisingly all drivers supporting MSAA can already do this (r300g and r600g
for sure) and I think Christoph wanted to have this feature for his Nouveau
drivers anyway.
OK, they can do it, but is it actually any faster than doing a resolve
and regular b
On 17.07.2013 02:05, Marek Olšák wrote:
No, it's not faster, but it's not slower either.
Now that I think about it, I can't come up with a good shader-based
algorithm for the resolve operation.
I don't think Christoph's approach that an MSAA texture can be viewed
as a larger single-sample textu
From: Marek Olšák
r600g needs explicit flushing before DRI2 buffers are presented on the screen.
v2: add (stub) implementations for all drivers, fix frontbuffer flushing
---
src/gallium/docs/source/context.rst | 13 +
src/gallium/drivers/freedreno/freedreno_resource.
---
src/gallium/drivers/r600/evergreen_state.c | 24 +++-
src/gallium/drivers/r600/r600_hw_context.c | 12 +---
src/gallium/drivers/r600/r600_resource.h | 3 +++
src/gallium/drivers/r600/r600_texture.c | 25 -
4 files changed, 55 insertions
Allocate a CMASK on demand and use it to fast clear single-sample
colorbuffers. Both FBOs and window system colorbuffers are fast
cleared. Expand as needed when colorbuffers are mapped or displayed
on screen.
---
src/gallium/drivers/r600/evergreen_state.c | 11 +
src/gallium/drivers/r600/r60
On 09.09.2013 16:09, Marek Olšák wrote:
/* Check colorbuffers. */
for (i = 0; i < rctx->framebuffer.state.nr_cbufs; i++) {
+ struct r600_texture *tex =
+ (struct r600_texture*)rctx->framebuffer.state.cbufs[i]->texture;
+
Please check if cbu
Allocate a CMASK on demand and use it to fast clear single-sample
colorbuffers. Both FBOs and window system colorbuffers are fast
cleared. Expand as needed when colorbuffers are mapped or displayed
on screen.
v2: cosmetics, move transfer expansion into dma_blit
---
src/gallium/drivers/r600/evergr
v2: check for NULL cbufs
---
src/gallium/drivers/r600/evergreen_state.c | 24 +++-
src/gallium/drivers/r600/r600_hw_context.c | 18 ++
src/gallium/drivers/r600/r600_resource.h | 3 +++
src/gallium/drivers/r600/r600_texture.c | 25 -
From: Marek Olšák
r600g needs explicit flushing before DRI2 buffers are presented on the screen.
v2: add (stub) implementations for all drivers, fix frontbuffer flushing
v3: fix galahad
---
src/gallium/docs/source/context.rst | 13 +
src/gallium/drivers/freedreno/fre
---
src/glsl/glsl_types.cpp | 61 +++--
src/glsl/glsl_types.h | 7 ++
2 files changed, 41 insertions(+), 27 deletions(-)
diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
index f740130..6c9727e 100644
--- a/src/glsl/glsl_types.cpp
++
Unnamed record types are assigned to separate types per stage, e.g.
uniform struct { ... } a;
if defined in both vertex and fragment shader, will result in two
separate types of different name. When linking the shader, this
results in a type conflict. However, there is no reason why this
should n
Ping? Can anyone review this, please?
Grigori
On 27.11.2013 00:15, Grigori Goronzy wrote:
---
src/glsl/glsl_types.cpp | 61 +++--
src/glsl/glsl_types.h | 7 ++
2 files changed, 41 insertions(+), 27 deletions(-)
diff --git a/src/glsl
On 10.04.2014 11:23, Michel Dänzer wrote:
From: Michel Dänzer
---
This is just an RFC; if other developers approve of this approach, I can
make a more extensive patch removing the use_reusable_pool parameters.
The x11perf numbers below compare ShmGet/PutImage before and after this
change with
On 20.04.2014 03:02, Marek Olšák wrote:
It looks like the check is not needed with SB, because SB performs
register allocation. What happens if you comment out the conditional
which fails?
SB takes the machine code generated by the "classic" compiler as input,
so the check is still needed. Th
According to ISA docs, the range is 1..64, so effectively
bytes_to_fetch-1.
---
src/gallium/drivers/r600/r600_shader.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/gallium/drivers/r600/r600_shader.c
b/src/gallium/drivers/r600/r600_shader.c
index 81ed3ce..0444579 100
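The off-by-one encoding described in the message above can be sketched as follows. The function name is hypothetical and this is not the code from the patch; the only facts taken from the message are that the field represents a byte count of 1..64 and therefore stores bytes_to_fetch-1.

```c
#include <assert.h>

/* Hypothetical helper: encode a fetch size of 1..64 bytes into a field
 * that stores the count minus one, so the stored value is 0..63. */
static unsigned
encode_fetch_count(unsigned bytes_to_fetch)
{
   assert(bytes_to_fetch >= 1 && bytes_to_fetch <= 64);

   return bytes_to_fetch - 1;
}
```

A 16-byte fetch is therefore stored as 15, and the maximum 64-byte fetch as 63.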
On 31.05.2013 14:37, Vadim Girlin wrote:
There are no regressions on evergreen with piglit tests or any
other apps that I tested, with and without llvm backend.
(Issue with Unigine Heaven that I mentioned on #dri-devel
yesterday was in fact caused by my own well-hidden bug, now it's fixed).
Impr
This is my first try to contribute anything useful to Mesa, so please
bear with me. This is not finished, but I'd like feedback to make sure
the code's quality and style is in line with what is expected in Mesa.
___
mesa-dev mailing list
mesa-dev@lists.f
Allows MSAA colorbuffers, which have a CMASK automatically and don't
need any further special handling, to be fast cleared. Instead
of clearing the buffer, set the clear color and the CMASK to the
cleared state.
---
src/gallium/drivers/r600/evergreen_state.c | 8 +++-
src/gallium/drivers/r600
On 08.06.2013 00:40, Marek Olšák wrote:
Also the fast clear
shouldn't be used for array, cube, and 3D textures unless all layers
are cleared together.
OK. I hadn't really thought about these.
One more thing. If you don't use piglit, I recommend using it before
sending patches to the mailing
Allows MSAA colorbuffers, which have a CMASK automatically and don't
need any further special handling, to be fast cleared. Instead
of clearing the buffer, set the clear color and the CMASK to the
cleared state.
Fast clear is used only when all bound colorbuffers fulfill certain
conditions: a CMAS
On 11.06.2013 02:41, Marek Olšák wrote:
>> +
>> + /* cannot pack color, needs support in u_format */
>> + if (desc->pack_rgba_float == NULL) {
>> + return false;
>> + }
>
> Hi Grigori,
>
> Is this for disallowing integer textures? You
Allows MSAA colorbuffers, which have a CMASK automatically and don't
need any further special handling, to be fast cleared. Instead
of clearing the buffer, set the clear color and the CMASK to the
cleared state.
Fast clear is used only when all bound colorbuffers fulfill certain
conditions: a CMAS
On 12.06.2013 00:04, Grigori Goronzy wrote:
Allows MSAA colorbuffers, which have a CMASK automatically and don't
need any further special handling, to be fast cleared. Instead
of clearing the buffer, set the clear color and the CMASK to the
cleared state.
Fast clear is used only when all
This interface is used to expand fast-cleared window system
colorbuffers.
---
src/gallium/include/pipe/p_context.h | 8
src/gallium/state_trackers/dri/common/dri_drawable.c | 4
src/gallium/state_trackers/dri/drm/dri2.c | 8 ++--
3 files changed, 18 ins
---
src/gallium/drivers/r600/evergreen_state.c | 24 +++-
src/gallium/drivers/r600/r600_hw_context.c | 12 +---
src/gallium/drivers/r600/r600_resource.h | 3 +++
src/gallium/drivers/r600/r600_texture.c | 25 -
4 files changed, 55 insertions
Allocate a CMASK on demand and use it to fast clear single-sample
colorbuffers. Both FBOs and window system colorbuffers are fast
cleared. Expand as needed when colorbuffers are mapped or displayed
on screen.
---
src/gallium/drivers/r600/evergreen_state.c | 11
src/gallium/drivers/r600/r600
On 12.07.2013 16:19, Jose Fonseca wrote:
I admit I haven't fully understood what's being proposed yet. But just a few
quick words.
I always wanted to have a "present" method that ensures that the contents of a
resource is made visible to whatever the consumer is (full-screen flip, blit to prim
On 04.02.2014 00:53, Dave Airlie wrote:
From: Dave Airlie
attempt to calculate a better value for array size to avoid breaking apps.
Signed-off-by: Dave Airlie
---
src/gallium/drivers/r600/r600_shader.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/gallium/drivers
On 05.02.2014 18:08, Jose Fonseca wrote:
I honestly hope that GL_AMD_pinned_memory doesn't become popular. It would have
been alright if it wasn't for this bit in
http://www.opengl.org/registry/specs/AMD/pinned_memory.txt which says:
2) Can the application still use the buffer using the C
---
src/gallium/drivers/freedreno/freedreno_screen.c | 5 +
src/gallium/drivers/i915/i915_screen.c | 5 +
src/gallium/drivers/ilo/ilo_screen.c | 3 +++
src/gallium/drivers/llvmpipe/lp_screen.c | 3 +++
src/gallium/drivers/nouveau/nv30/nv30_screen.c | 2 ++
s
On 06.02.2014 02:46, Michel Dänzer wrote:
+ case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS:
+ return 16384;
radeonsi currently can't handle more than 4095 total output components,
as the buffer resource for writing to the GSVS ring only has 14 bits for
the stride in byte
v2: adjust limits for radeonsi and llvmpipe
---
src/gallium/drivers/freedreno/freedreno_screen.c | 5 +
src/gallium/drivers/i915/i915_screen.c | 5 +
src/gallium/drivers/ilo/ilo_screen.c | 3 +++
src/gallium/drivers/llvmpipe/lp_screen.c | 3 +++
src/gallium/dr
v2: adjust limits for radeonsi and llvmpipe
v3: add documentation
Cc: "10.1"
---
src/gallium/docs/source/screen.rst | 6 ++
src/gallium/drivers/freedreno/freedreno_screen.c | 5 +
src/gallium/drivers/i915/i915_screen.c | 5 +
src/gallium/drivers/ilo/ilo_screen
/vl_deint_filter.c
b/src/gallium/auxiliary/vl/vl_deint_filter.c
new file mode 100644
index 000..9b05154
--- /dev/null
+++ b/src/gallium/auxiliary/vl/vl_deint_filter.c
@@ -0,0 +1,491 @@
+/**
+ *
+ * Copyright 2013 Grigori Goronzy
---
src/gallium/state_trackers/vdpau/mixer.c | 69 ++--
src/gallium/state_trackers/vdpau/query.c | 1 +
src/gallium/state_trackers/vdpau/vdpau_private.h | 7 +++
3 files changed, 73 insertions(+), 4 deletions(-)
diff --git a/src/gallium/state_trackers/vdpau/m
On 15.02.2014 13:14, Andy Furniss wrote:
Thanks Grigori for doing this - looks really good on HD stuff I've
tested and of course is easily fast enough, unlike anything on the CPU
at high res.
Any plans for the future?
Well, adding edge-guided spatial interpolation for the temporal-spatial
mo
---
src/gallium/state_trackers/vdpau/mixer.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/gallium/state_trackers/vdpau/mixer.c
b/src/gallium/state_trackers/vdpau/mixer.c
index 996fd8e..e6bfb8c 100644
--- a/src/gallium/state_trackers/vdpau/mixer.c
+++ b/src/galli
The spec incorrectly used void as return type, when it should have
been GLboolean. This has now been fixed. According to Nvidia, their
implementation always used GLboolean.
---
include/GL/glext.h | 2 +-
src/mapi/glapi/gen/NV_vdpau_interop.xml | 1 +
src/mesa/main/vdpau.c
Hi,
On 23.09.2015 10:11, Christian König wrote:
> From: Boyuan Zhang
>
> Signed-off-by: Boyuan Zhang
> Reviewed-by: Christian König
> ---
Thanks, nice to see this finally getting fixed, and it was a pretty
simple thing after all... well, not quite yet apparently. Sometimes
playback works corr
With the previous changes to handling of viewport clipping, it is
almost trivial to add proper support for guard band clipping. Select a
suitable integer clipping value to keep inside the rasterizer's guard
band range of [-32768, 32767] and program the hardware to use guard
band clipping.
Guard b
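One way to pick an integer guard band value as described in the message above is sketched below. This is not the code from the patch: the function name, the single-axis viewport transform (window = scale * ndc + translate), and the clamping choices are all assumptions for illustration. Only the [-32768, 32767] range comes from the message.

```c
/* Hypothetical helper: largest integer factor g such that the clip
 * range [-g, g] in normalized device coordinates, after the viewport
 * transform window = scale * ndc + translate, still lies inside the
 * rasterizer's guard band range of [-32768, 32767]. */
static float
guard_band_factor(float scale, float translate)
{
   const float gb_min = -32768.0f;
   const float gb_max = 32767.0f;
   float s = scale < 0.0f ? -scale : scale;
   float lo, hi, g;

   if (s == 0.0f)
      return 1.0f;               /* degenerate viewport: plain clipping */

   lo = (translate - gb_min) / s; /* room towards the negative side */
   hi = (gb_max - translate) / s; /* room towards the positive side */
   g = lo < hi ? lo : hi;

   if (g < 1.0f)
      return 1.0f;               /* never clip tighter than the viewport */
   if (g > 32768.0f)
      g = 32768.0f;              /* keep the int conversion below in range */

   return (float)(int)g;         /* truncation equals floor for g >= 1 */
}
```

For a 1920-pixel-wide viewport at the origin (scale = 960, translate = 960), the positive side is the tighter one, (32767 - 960) / 960 ≈ 33.1, so the factor is 33.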
From: Marek Olšák
In other words, vport scissors are derived from viewport states.
If the scissor test is enabled, the intersection of both is used.
The guard band will disable clipping, so we have to clip per-pixel.
v2: fix check for r600_draw_rectangle and other overflow conditions.
(Grigori)
With the previous changes to handling of viewport clipping, it is
almost trivial to add proper support for guard band clipping. Select a
suitable integer clipping value to keep inside the rasterizer's guard
band range of [-32768, 32767] and program the hardware to use guard
band clipping.
Guard b
ssor & viewport code is deleted.
Thanks for implementing this properly.
Reviewed-by: Grigori Goronzy
Grigori
e case: Only 1 viewport is active. */
- if (mask & 1 &&
- !si_get_vs_info(sctx)->writes_viewport_index) {
+ if (!si_get_vs_info(sctx)->writes_viewport_index) {
+ if (!(mask & 1))
+ return;
+
Reviewed-by: Grigori Goronzy
On 2016-04-15 18:38, Ilia Mirkin wrote:
+ } else {
+ union pipe_color_union color;
+ switch (util_format_get_blocksizebits(res->format)) {
+ case 128:
+ sf->format = PIPE_FORMAT_R32G32B32A32_UINT;
Just as an FYI... this is sa
Small IBs help to reduce stalls for workloads that require a lot of
synchronization. On the other hand, if there is no notable
synchronization, we can use a large IB size to slightly improve
performance in some cases.
This introduces tuning of the IB size based on feedback on the average
buffer wa
Hi,
apps that cause a lot of synchronization benefit from small IB
sizes. The current IB size is a bit on the large side for this class
of apps. On the other hand, if there isn't much synchronization going
on, increasing the IB size can slightly improve performance, too.
Here's a quick hack that
On 2016-04-15 20:30, Jakob Sinclair wrote:
In other places in radeonsi that require reinterpretation (e.g.
si_blit.c), the surface template is modified instead of changing the
surface after creation. I'm not sure if r600/radeonsi like it if the
format is changed late like here. Seems to be cleane
Interesting, and thanks for poking at this issue. I've been thinking
about tuning IB sizes as well. I'd like for us to get this right, so I
wonder: What's your theory for _why_ your change helps?
See below. I think you discovered it yourself.
I'll be honest with you: Right now, I think your a
Small IBs help to reduce stalls for workloads that require a lot of
synchronization. On the other hand, if there is no notable
synchronization, we can use a large IB size to slightly improve
performance in some cases.
This introduces tuning of the IB size based on feedback on the average
buffer wa
Add missing break, add default case. Additionally initialize variables
to avoid compiler warnings.
---
src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c
b/src/gallium/winsys/amdgpu/drm/amdgp
thout any calls into the kernel, right? The
winsys code makes that conditional and calls into the kernel when no
fence pointer is available.
Grigori
On 19.04.2016 18:13, Grigori Goronzy wrote:
Small IBs help to reduce stalls for workloads that require a lot of
synchronization. On the other han
According to spec, CL_MEM_USE_HOST_PTR should directly use host memory,
if possible. This is just what userptr is for, so use it.
In case the memory cannot be mapped, a fallback similar to
CL_MEM_COPY_HOST_PTR is used.
---
src/gallium/state_trackers/clover/core/memory.cpp | 2 +-
src/gallium/s