https://bugs.freedesktop.org/show_bug.cgi?id=101334
--- Comment #33 from John ---
Created attachment 133252
--> https://bugs.freedesktop.org/attachment.cgi?id=133252&action=edit
another trace
Since I saw Dave's commit about fixing a GPU hang, I thought of trying again
but it's still no good.
Quoting Kenneth Graunke (2017-08-05 02:10:43)
> On Friday, August 4, 2017 12:22:19 PM PDT Chris Wilson wrote:
> > Quoting Kenneth Graunke (2017-08-04 19:47:14)
> > > On Friday, July 21, 2017 8:36:42 AM PDT Chris Wilson wrote:
> > > > Patch reordering from last time so that the cosmetic tweaks are d
Quoting Chris Wilson (2017-08-04 21:01:16)
> If we need to stall to read the bo, ask the GPU to copy it into the CPU
> cache whilst we wait.
This is more food for thought, as I think we need to change the priority
ladder first. Aiui, miptree_map is the last resort so we don't want
needless complex
This series isn't yet fully baked, but I expect you can point out
approaches that need to be reworked already...
The starting point was to avoid the abysmal readback performance on !llc,
but with a simple application of blorp we get a lot of format conversions
for "free". (The only drawback is tha
With WC support, we can also use our manual detiling paths for !llc
architectures as well. This is even more important for those as the
indirection of the GTT is even more significant.
Currently, we can only effectively support WC uploads into X-tiling, as
we have to uploading into Y is slower tha
Ensure that any buffer allocated for a scanout image is kept out of the
CPU/LLC cache so as to avoid any visual glitch.
Cc: Kenneth Graunke
---
src/mesa/drivers/dri/i965/intel_screen.c | 10 ++
1 file changed, 10 insertions(+)
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c
b/src
---
src/mesa/Makefile.sources | 1 -
src/mesa/drivers/common/meta.h | 17 -
src/mesa/drivers/common/meta_tex_subimage.c | 495
3 files changed, 513 deletions(-)
delete mode 100644 src/mesa/drivers/common/meta_tex_subimage.c
diff --gi
---
src/mesa/drivers/dri/i965/intel_tex.c | 63 ---
src/mesa/main/dd.h | 16 -
2 files changed, 79 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/intel_tex.c
b/src/mesa/drivers/dri/i965/intel_tex.c
index 7ce2ceb9a2..b04ccd3d57 10064
---
src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 27 +--
1 file changed, 17 insertions(+), 10 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 5cd8d24f1e..74e120b983 100644
--- a/src/mesa/driv
Uncommonly we may be able to blit into the texture where we cannot
perform the tiled memcpy fast path, for example on older generations and
non-LLC architectures (though those restrictions may be lifted in
future). Using the GPU blit, even with a linear source and forced stall,
is still much faster
Similar to the mechanism used by ReadPixels, use blorp for better format
handling than the existing blitter-only paths.
---
src/mesa/drivers/dri/i965/intel_pixel.c | 5 -
src/mesa/drivers/dri/i965/intel_pixel.h | 9 +-
src/mesa/drivers/dri/i965/intel_pixel_bitmap.c | 3 +
src/m
While it is preferable to use a fast manual detiling method for LLC
(does not require synchronisation with a busy GPU and for accessing main
memory both the CPU and GPU have the same bandwidth), if we don't have
such a path then using the GPU to perform the blit is far preferable to
a coherent mma
Simplify the caller by reporting the incompatible formats rather than
asserting the caller doesn't request sRGB encoding/decoding.
---
src/mesa/drivers/dri/i965/intel_blit.c | 11 +++
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/intel_blit.c
b/sr
Iterate the tiled_memcpy for each face so that we can quickly do
synchronous uploads into cube maps etc.
---
src/mesa/drivers/dri/i965/intel_tex_subimage.c | 67 +++---
1 file changed, 39 insertions(+), 28 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/intel_tex_subimage.
All GEN GPUs can bind to any piece of memory (thanks UMA), and so through
a special ioctl we can map a chunk of page-aligned client memory into
the GPU address space. However, not all GEN are equal. Some have
cache-coherency between the CPU and the GPU, whilst the others are
incoherent and rely on s
Y-tiling makes a mess of our cacheline WCB, forcing evictions and writes
between each pixel of the linear_to_ytiled routines, effectively
reducing the upload to UC performance (i.e. terrible). This patch takes
the simple approach of doing the detiling into a temporary page and then
copying the page
Similar to glReadPixels, using the GPU to blit back into the client's
buffer is preferable to using a coherent mmapping (but not manual
detiling for several reasons).
Signed-off-by: Chris Wilson
---
src/mesa/drivers/dri/i965/intel_tex_image.c | 368 ++--
1 file changed, 3
If the user does use the pack/unpack offsets, simply decode those into
the offset from base and proceed with our fast manual detiling copy.
This is most frequently used for subimages where the stride or width may
not match the image.
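
As a rough illustration of "decoding the offsets into the offset from base", here is a minimal sketch for a 2D transfer. The struct and function names are hypothetical (they are not the i965 code), and the row alignment is simplified to a plain round-up of the row size in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the GL pack/unpack pixel-store state. */
struct pixel_store {
   int alignment;    /* GL_*_ALIGNMENT: 1, 2, 4 or 8 */
   int row_length;   /* GL_*_ROW_LENGTH in pixels, 0 means "use the width" */
   int skip_pixels;  /* GL_*_SKIP_PIXELS */
   int skip_rows;    /* GL_*_SKIP_ROWS */
};

/* Bytes between the start of consecutive rows in client memory. */
static size_t
client_stride(const struct pixel_store *ps, int width, int cpp)
{
   assert(ps->alignment > 0 && cpp > 0);
   size_t pixels = ps->row_length > 0 ? ps->row_length : width;
   size_t bytes = pixels * cpp;
   /* Round the row size up to the requested alignment. */
   return (bytes + ps->alignment - 1) / ps->alignment * ps->alignment;
}

/* Byte offset from the client's base pointer to the first pixel copied. */
static size_t
client_offset(const struct pixel_store *ps, int width, int cpp)
{
   return ps->skip_rows * client_stride(ps, width, cpp) +
          (size_t)ps->skip_pixels * cpp;
}
```

With the offset and stride in hand, the copy can start at base + offset and step by the stride, which is why a subimage whose stride differs from its width needs no separate path.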
---
src/mesa/drivers/dri/i965/intel_pixel_read.c | 6 ++
---
src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 14 +++---
src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 1 +
2 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 74e120b98
If the user supplies a pixel format of GL_RGB + GL_UNSIGNED_SHORT_5_6_5
and specifies a generic unsized GL_RGB internal format, match that to a
texture format of MESA_FORMAT_B5G6R5 if supported by the hardware.
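
To see why the two layouts line up, here is a small sketch of how GL_RGB with GL_UNSIGNED_SHORT_5_6_5 packs a pixel: red in the top five bits, green in the middle six, blue in the low five. The helper name is hypothetical, and the match with MESA_FORMAT_B5G6R5 assumes Mesa's packed-format names list components starting from the least significant bit.

```c
#include <stdint.h>

/* Pack 8-bit channels into one 16-bit 5:6:5 pixel (R high, B low). */
static uint16_t
pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
   return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

If the texture is stored in the same layout, the client's data can be copied as-is rather than converted per pixel.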
Noticed while playing with mesa-demos/teximage:
TexImage(RGB/565 256 x 256): 79.8 im
Map the user format of GL_DEPTH_COMPONENT, GL_UNSIGNED_BYTE to the
internal format of MESA_FORMAT_S_UINT8.
---
src/mesa/main/glformats.c | 4
1 file changed, 4 insertions(+)
diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index 06be3ec48d..8ae833ca65 100644
--- a/src/mesa/
Return MESA_FORMAT_NONE for GL_BITMAPs rather than hit the unreachable
assertion.
---
src/mesa/main/glformats.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index 731934df6d..99b251a13d 100644
--- a/src/mesa/main/glformats.c
+++ b/src
A big limitation of the current direct memcpy routine is that it only
recognises a couple of (admittedly) common colour types, and cannot do
any inline conversion. If we pass the mesa_format down to memcpy and
tell it the direction of the transfer, we can start accepting a few
mixed transfers and b
Map format=GL_DEPTH_COMPONENT, type=GL_UNSIGNED_INT_24_8 to
MESA_FORMAT_Z24_UNORM_x8_UINT.
---
src/mesa/main/glformats.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index 8ae833ca65..731934df6d 100644
--- a/src/mesa/main/glformats.c
+
LLC platforms are magic in that reads from the CPU are always cache
coherent, or rather GPU writes that bypass LLC do still invalidate the
appropriate cache line.
---
src/mesa/drivers/dri/i965/brw_bufmgr.c | 16 +++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/src/me
GL_DEPTH_COMPONENT and GL_STENCIL_INDEX are simple array formats of the
indicated types, but were absent from the get_swizzle_from_format()
table causing them to be neglected and triggering
unreachable("Unsupported format").
Signed-off-by: Chris Wilson
---
src/mesa/main/glformats.c | 31 +
2017-07-24 10:28 GMT+02:00 Wladimir J. van der Laan :
> Signed-off-by: Wladimir J. van der Laan
Reviewed-by: Christian Gmeiner
> ---
> src/gallium/drivers/etnaviv/hw/state_3d.xml.h | 14 +-
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/src/gallium/drivers/etnav
2017-07-24 10:28 GMT+02:00 Wladimir J. van der Laan :
> This patch adds support for large shaders on GC3000. For example the "terrain"
> glmark benchmark with a large fragment shader will work after this.
>
> If the GPU supports ICACHE, shaders larger than the available state area will
> be uploade
2017-07-24 10:28 GMT+02:00 Wladimir J. van der Laan :
> GC3000 has changed from a separate store for VS and PS uniforms
> to a single, unified one. There is backwards compatibility functionality,
> however this does not work correctly together with ICACHE.
>
> This patch adds explicit support, althou
https://bugs.freedesktop.org/show_bug.cgi?id=102052
Bug ID: 102052
Summary: No package 'expat' found
Product: Mesa
Version: git
Hardware: Other
OS: All
Status: NEW
Keywords: bisected, regression
Jan Vesely writes:
> Hi,
>
> thanks for detailed explanation. I indeed missed the writeBuffer part
> in specs.
>
> On Wed, 2017-08-02 at 15:05 -0700, Francisco Jerez wrote:
>> These changes are somewhat redundant and potentially
>> performance-impacting, the reason is that in the OpenCL API,
>> c
Francisco Jerez writes:
> Jan Vesely writes:
>
>> Hi,
>>
>> thanks for detailed explanation. I indeed missed the writeBuffer part
>> in specs.
>>
>> On Wed, 2017-08-02 at 15:05 -0700, Francisco Jerez wrote:
>>> These changes are somewhat redundant and potentially
>>> performance-impacting, the r
Fixes build error on CentOS 6.9.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102052
Fixes: 5c007203b73d ("configure.ac: drop manual detection of expat
header/library")
Signed-off-by: Vinson Lee
---
configure.ac | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git
It just works with the fragment shader resolve, so no need to do
a custom conversion. In fact with SRGB dest, it actually gives
wrong results.
Fixes: 69136f4e633 "radv/meta: add resolve pass using fragment/vertex shaders"
---
src/amd/vulkan/radv_meta.c | 46 ---
The argument here is a bitmask, so the old code selected .xy, which
got silently truncated to .x when constructing the vec4 from components,
instead of using .w.
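
The index-versus-mask mix-up is easy to reproduce in isolation. The sketch below uses a hypothetical helper (not the radv or NIR code) that expands a 4-bit component mask into swizzle letters: a value of 3 selects bits 0 and 1, i.e. .xy, while .w needs 1 << 3. That the old code passed 3 is an assumption inferred from the description.

```c
#include <assert.h>

/* Write the swizzle letters selected by a 4-bit component mask into out
 * (at least 5 chars) and return how many components were selected. */
static int
mask_to_swizzle(unsigned mask, char out[5])
{
   static const char names[4] = { 'x', 'y', 'z', 'w' };
   int n = 0;

   assert(mask < 16);
   for (int i = 0; i < 4; i++) {
      if (mask & (1u << i))
         out[n++] = names[i];
   }
   out[n] = '\0';
   return n;
}
```

Calling mask_to_swizzle(3, buf) fills buf with "xy", whereas mask_to_swizzle(1u << 3, buf) fills it with "w".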
Fixes: 588185eb6b7 "radv/meta: add srgb conversion to end of resolve shader."
---
src/amd/vulkan/radv_meta_resolve_cs.c | 2 +-
1 file
These seem to store very bogus results. Luckily there is some code
that converts srgb->linear already, so just making the descriptor
format UNORM should work.
Fixes: 588185eb6b7 "radv/meta: add srgb conversion to end of resolve shader."
---
src/amd/vulkan/radv_meta_resolve_cs.c | 2 +-
src/amd/v
On Fri, Aug 4, 2017 at 1:11 PM, Jan Vesely wrote:
> On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
>> The device version is the maximum CL version that the device supports.
>>
>> Eventually, this will be based on the features/extensions of the actual
>> device, but for now move it a bit clo
On Fri, Aug 4, 2017 at 1:14 PM, Jan Vesely wrote:
> On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
>> device_version and device_clc_version are not necessarily the same for
>> devices that support CL 1.0, but have a 1.1 compiler and the necessary
>> extensions.
>>
>> CC: Jan Vesely
>
> I th
On Fri, Aug 4, 2017 at 1:22 PM, Jan Vesely wrote:
> On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
>> We'll be using it to select the default language version soon.
>>
>> Signed-off-by: Aaron Watry
>> Cc: Pierre Moreau
>> Cc: Jan Vesely
>>
>> v2: (Pierre) Move changes to create_compiler_
On Fri, Aug 4, 2017 at 1:32 PM, Jan Vesely wrote:
> On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
>> According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by:
>> 1) If you have -cl-std=CL1.1+ use the version specified
>> 2) If not, use the highest 1.x version that the
On Fri, Aug 4, 2017 at 1:43 PM, Jan Vesely wrote:
> On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
>> Signed-off-by: Aaron Watry
>> CC: Jan Vesely
>>
>> v2: base it on the device version
>> ---
>> src/gallium/state_trackers/clover/llvm/invocation.cpp | 3 ++-
>> 1 file changed, 2 inserti
On Sat, 2017-08-05 at 19:46 -0500, Aaron Watry wrote:
> On Fri, Aug 4, 2017 at 1:32 PM, Jan Vesely wrote:
> > On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
> > > According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen
> > > by:
> > > 1) If you have -cl-std=CL1.1+ use the
On Sat, Aug 5, 2017 at 8:56 PM, Jan Vesely wrote:
> On Sat, 2017-08-05 at 19:46 -0500, Aaron Watry wrote:
>> On Fri, Aug 4, 2017 at 1:32 PM, Jan Vesely wrote:
>> > On Sun, 2017-07-30 at 20:26 -0500, Aaron Watry wrote:
>> > > According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen