Re: [Mesa-dev] [PATCH] ac: add radeon_info::is_amdgpu instead of checking drm_major == 3
Reviewed-by: Samuel Pitoiset

On 6/13/19 8:03 PM, Marek Olšák wrote:
It's only a random cleanup.

Marek

On Thu, Jun 13, 2019 at 2:57 AM Samuel Pitoiset <samuel.pitoi...@gmail.com> wrote:
Why do you need that?

On 6/12/19 11:31 PM, Marek Olšák wrote:
> From: Marek Olšák <marek.ol...@amd.com>
>
> and clean up
> ---
> src/amd/common/ac_gpu_info.c | 13 --
> src/amd/common/ac_gpu_info.h | 1 +
> src/amd/vulkan/radv_debug.c | 5 +-
> src/gallium/drivers/r600/r600_buffer_common.c | 6 +--
> src/gallium/drivers/r600/r600_pipe.c | 2 +-
> src/gallium/drivers/r600/r600_pipe_common.c | 46 ++-
> src/gallium/drivers/r600/r600_query.c | 2 +-
> src/gallium/drivers/r600/r600_texture.c | 2 +-
> src/gallium/drivers/r600/radeon_uvd.c | 3 +-
> src/gallium/drivers/r600/radeon_vce.c | 5 +-
> src/gallium/drivers/radeon/radeon_uvd.c | 2 +-
> src/gallium/drivers/radeon/radeon_vce.c | 6 +--
> src/gallium/drivers/radeonsi/si_buffer.c | 2 +-
> src/gallium/drivers/radeonsi/si_debug.c | 2 +-
> src/gallium/drivers/radeonsi/si_get.c | 4 +-
> src/gallium/drivers/radeonsi/si_pipe.c | 4 +-
> src/gallium/drivers/radeonsi/si_query.c | 2 +-
> src/gallium/drivers/radeonsi/si_state.c | 2 +-
> .../winsys/radeon/drm/radeon_drm_winsys.c | 1 +
> 19 files changed, 33 insertions(+), 77 deletions(-)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 95022] error: GLSL 1.50 is not supported.
https://bugs.freedesktop.org/show_bug.cgi?id=95022 --- Comment #6 from Ilse Twigt --- Debugging a program like this sometime takes hours and hours but with the help of your instructions I solved all the hurdles in no time.If you are searching homework help in Australia then click [[https://essayontime.com.au/homework-help-in-australia|This Site essayontime.com.au]] to get help immediately. There are no substitute of your programing level. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] ac: update llvm.amdgcn.icmp intrinsic name for LLVM 9+
LLVM r363339 changed llvm.amdgcn.icmp.i* to llvm.amdgcn.icmp.i64.i*.

Signed-off-by: Samuel Pitoiset
---
 src/amd/common/ac_llvm_build.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c
index 88e89d1dfb4..b93fdde023e 100644
--- a/src/amd/common/ac_llvm_build.c
+++ b/src/amd/common/ac_llvm_build.c
@@ -441,6 +441,7 @@ LLVMValueRef
 ac_build_ballot(struct ac_llvm_context *ctx,
 		LLVMValueRef value)
 {
+	const char *name = HAVE_LLVM >= 0x900 ? "llvm.amdgcn.icmp.i64.i32" : "llvm.amdgcn.icmp.i32";
 	LLVMValueRef args[3] = {
 		value,
 		ctx->i32_0,
@@ -454,8 +455,7 @@ ac_build_ballot(struct ac_llvm_context *ctx,
 
 	args[0] = ac_to_integer(ctx, args[0]);
 
-	return ac_build_intrinsic(ctx,
-				  "llvm.amdgcn.icmp.i32",
+	return ac_build_intrinsic(ctx, name,
 				  ctx->i64, args, 3,
 				  AC_FUNC_ATTR_NOUNWIND |
 				  AC_FUNC_ATTR_READNONE |
@@ -465,6 +465,7 @@ ac_build_ballot(struct ac_llvm_context *ctx,
 LLVMValueRef ac_get_i1_sgpr_mask(struct ac_llvm_context *ctx,
 				 LLVMValueRef value)
 {
+	const char *name = HAVE_LLVM >= 0x900 ? "llvm.amdgcn.icmp.i64.i1" : "llvm.amdgcn.icmp.i1";
 	LLVMValueRef args[3] = {
 		value,
 		ctx->i1false,
@@ -472,7 +473,7 @@ LLVMValueRef ac_get_i1_sgpr_mask(struct ac_llvm_context *ctx,
 	};
 
 	assert(HAVE_LLVM >= 0x0800);
-	return ac_build_intrinsic(ctx, "llvm.amdgcn.icmp.i1", ctx->i64, args, 3,
+	return ac_build_intrinsic(ctx, name, ctx->i64, args, 3,
 				  AC_FUNC_ATTR_NOUNWIND |
 				  AC_FUNC_ATTR_READNONE |
 				  AC_FUNC_ATTR_CONVERGENT);
-- 
2.22.0
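The version-dependent name selection in the patch above can be isolated into a small standalone sketch. This is only an illustration: `sketch_icmp_intrinsic_name` and its parameters are hypothetical and not part of Mesa, and `llvm_version` is assumed to use the same hex encoding as the `HAVE_LLVM` comparisons in the patch (`0x0800`, `0x900`).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of Mesa. llvm_version is assumed to be
 * encoded like HAVE_LLVM in the patch; operand_bits is the width of the
 * compared operands (1 or 32), matching the two call sites in the diff. */
static const char *
sketch_icmp_intrinsic_name(unsigned llvm_version, unsigned operand_bits)
{
	assert(operand_bits == 1 || operand_bits == 32);

	/* From LLVM 9 on, the name also carries the i64 result type. */
	if (llvm_version >= 0x900)
		return operand_bits == 1 ? "llvm.amdgcn.icmp.i64.i1"
					 : "llvm.amdgcn.icmp.i64.i32";

	return operand_bits == 1 ? "llvm.amdgcn.icmp.i1"
				 : "llvm.amdgcn.icmp.i32";
}
```

The patch itself keeps the selection inline at each call site; a helper like this would only be worth it if more callers needed the same name.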
Re: [Mesa-dev] [PATCH] ac: update llvm.amdgcn.icmp intrinsic name for LLVM 9+
r-b

On Fri, Jun 14, 2019 at 11:57 AM Samuel Pitoiset wrote:
>
> LLVM r363339 changed llvm.amdgcn.icmp.i* to llvm.amdgcn.icmp.i64.i*.
>
> Signed-off-by: Samuel Pitoiset
[Mesa-dev] [Bug 110709] g_glxglvnddispatchfuncs.c and glxglvnd.c fail to build with clang 8.0
https://bugs.freedesktop.org/show_bug.cgi?id=110709

Eric Engestrom changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #3 from Eric Engestrom ---
We do read the bug tracker, but sometimes things slip through :)

I've sent an MR with a fix here:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1103

It will be included in the next releases (19.1.1 and 19.0.7) once it's merged.
[Mesa-dev] [Bug 110921] virgl on OpenGL 3.3 host regressed to OpenGL 2.1
https://bugs.freedesktop.org/show_bug.cgi?id=110921

            Bug ID: 110921
           Summary: virgl on OpenGL 3.3 host regressed to OpenGL 2.1
           Product: Mesa
           Version: git
          Hardware: x86 (IA32)
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Other
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: rand...@mail.ru
        QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 144542
  --> https://bugs.freedesktop.org/attachment.cgi?id=144542&action=edit
glxinfo from guest

Hi!

I was testing virgl on nv50 and llvmpipe. It was working last year, showing OpenGL 3.3 in Linux guest. Now it only displays OpenGL 2.1 [but still works]

LIBGL_ALWAYS_SOFTWARE=1 qemu-system-x86_64 -enable-kvm -m 1G -display sdl,gl=on -soundhw es1370 -cdrom /mnt/sdb1/slax-14_06_2019-private0.iso -vga virtio -usbdevice mouse -smp 3 -cpu max

qemu-system-x86_64: -usbdevice mouse: '-usbdevice' is deprecated, please use '-device usb-...' instead
gl_version 33 - core profile enabled
Mesa: User error: GL_INVALID_VALUE in glTexImage2D(internalFormat=GL_ALPHA8)
Mesa: User error: GL_INVALID_VALUE in glTexImage2D(internalFormat=GL_ALPHA16)
GLSL feature level 330

In guest:

Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Red Hat (0x1af4)
    Device: virgl (0x1010)
    Version: 19.2.0
    Accelerated: yes
    Video memory: 0MB
    Unified memory: no
    Preferred profile: compat (0x2)
    Max core profile version: 0.0
    Max compat profile version: 2.1
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.0
OpenGL version string: 2.1 Mesa 19.2.0-devel (git-83829abe03)
[Mesa-dev] [Bug 110921] virgl on OpenGL 3.3 host regressed to OpenGL 2.1
https://bugs.freedesktop.org/show_bug.cgi?id=110921 --- Comment #1 from Andrew Randrianasulu --- Created attachment 144543 --> https://bugs.freedesktop.org/attachment.cgi?id=144543&action=edit glxinfo from host (llvmpipe) -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 110921] virgl on OpenGL 3.3 host regressed to OpenGL 2.1
https://bugs.freedesktop.org/show_bug.cgi?id=110921 --- Comment #2 from Andrew Randrianasulu --- Created attachment 144544 --> https://bugs.freedesktop.org/attachment.cgi?id=144544&action=edit glxinfo from host (nv50) -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 110921] virgl on OpenGL 3.3 host regressed to OpenGL 2.1
https://bugs.freedesktop.org/show_bug.cgi?id=110921 --- Comment #3 from Ilia Mirkin --- FWIW GL_ALPHA8/16 are not valid in a core profile. -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 110922] [regression][bisected] Android build test fails to include libmesa_winsys_virgl_common
https://bugs.freedesktop.org/show_bug.cgi?id=110922

            Bug ID: 110922
           Summary: [regression][bisected] Android build test fails to include libmesa_winsys_virgl_common
           Product: Mesa
           Version: git
          Hardware: Other
                OS: All
            Status: NEW
          Keywords: bisected, regression
          Severity: normal
          Priority: medium
         Component: Other
          Assignee: alexandros.frant...@canonical.com
          Reporter: clayton.a.cr...@intel.com
        QA Contact: mesa-dev@lists.freedesktop.org

Output from Android build:

[985/985] including vendor/intel/utils/Android.mk ...
vendor/intel/external/project-celadon/mesa/src/gallium/winsys/virgl/drm/Android.mk: error: libmesa_winsys_virgl (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64)
You can set ALLOW_MISSING_DEPENDENCIES=true in your environment if this is intentional, but that may defer real problems until later in the build.
vendor/intel/external/project-celadon/mesa/src/gallium/winsys/virgl/drm/Android.mk: error: libmesa_winsys_virgl (STATIC_LIBRARIES android-x86) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86)
You can set ALLOW_MISSING_DEPENDENCIES=true in your environment if this is intentional, but that may defer real problems until later in the build.
vendor/intel/external/project-celadon/mesa/src/gallium/winsys/virgl/vtest/Android.mk: error: libmesa_winsys_virgl_vtest (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64)
You can set ALLOW_MISSING_DEPENDENCIES=true in your environment if this is intentional, but that may defer real problems until later in the build.
vendor/intel/external/project-celadon/mesa/src/gallium/winsys/virgl/vtest/Android.mk: error: libmesa_winsys_virgl_vtest (STATIC_LIBRARIES android-x86) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86)
You can set ALLOW_MISSING_DEPENDENCIES=true in your environment if this is intentional, but that may defer real problems until later in the build.
build/make/core/main.mk:833: error: exiting from previous errors.

This has been bisected to:

commit 801753d4b34f41625487c24a5c6ddaa912ef607a
Author: Alexandros Frantzis
Date:   Tue Jun 11 17:58:08 2019 +0300

    virgl: Use virgl_resource_cache in the vtest winsys

and:

commit 13f70d3668e6392bb08805f8d6f3162905ad35f0
Author: Alexandros Frantzis
Date:   Wed Jun 12 10:30:26 2019 +0300

    virgl: Use virgl_resource_cache in the drm winsys

You can view the full build log here:
https://mesa-ci.01.org/mesa_master/builds/16747/group/63a9f0ea7bb98050796b649e85481845/93085/artifacts/154
[Mesa-dev] [Bug 110709] g_glxglvnddispatchfuncs.c and glxglvnd.c fail to build with clang 8.0
https://bugs.freedesktop.org/show_bug.cgi?id=110709 --- Comment #4 from Sergey Kondakov --- (In reply to Eric Engestrom from comment #3) > We do read the bug tracker, but sometimes things slip through :) > > I've sent an MR with a fix here: > https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1103 > > It will be included in the next releases (19.1.1 and 19.0.7) once it's > merged. Thanks ! When Bugzilla has gone offline (while dropping spam-messages from bots beforehand on some long-ignored entries) and gitlab have disabled creation of Mesa issues I became sure that FDo Bugzilla will not be coming online at all because Mesa devs have decided to go the way of GCC which would not be surprising with things like bugs #41115 and #23705 Googling this up after failed build was a pleasant surprise. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 110261] Segmentation fault when using vulkaninfo on Radeon
https://bugs.freedesktop.org/show_bug.cgi?id=110261

Denis changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #12 from Denis ---
Jason, you were absolutely right.

Test data:
GPU Intel HD620
Manjaro OS
vulkan-tools 1.1.101-1 - issue reproducible
vulkan-tools 1.1.106-1 - issue is not reproducible

Can somebody check this on radeon? Or I will try to do this later
[Mesa-dev] [Bug 110261] Segmentation fault when using vulkaninfo on Radeon
https://bugs.freedesktop.org/show_bug.cgi?id=110261 Danylo changed: What|Removed |Added CC||danylo.pilia...@gmail.com Resolution|--- |NOTOURBUG Status|NEEDINFO|RESOLVED --- Comment #13 from Danylo --- Yes, now on radeon it doesn't crash with vulkan-tools 1.1.106 -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 110921] virgl on OpenGL 3.3 host regressed to OpenGL 2.1
https://bugs.freedesktop.org/show_bug.cgi?id=110921

--- Comment #4 from Ilia Mirkin ---
Looks like ARB_framebuffer_sRGB is not provided for some reason by virgl. There was a recent rework in virgl to try to be more precise, esp in ES, and I suspect something here ended up as a casualty...

   extensions->EXT_framebuffer_sRGB =
         screen->get_param(screen, PIPE_CAP_DEST_SURFACE_SRGB_CONTROL) &&
         extensions->EXT_sRGB;

Where EXT_sRGB is controlled by:

      { { o(EXT_sRGB) },
        { PIPE_FORMAT_A8B8G8R8_SRGB,
          PIPE_FORMAT_B8G8R8A8_SRGB,
          PIPE_FORMAT_R8G8B8A8_SRGB },
       GL_TRUE }, /* at least one format must be supported */

And the cap is controlled by:

   case PIPE_CAP_DEST_SURFACE_SRGB_CONTROL:
      return vscreen->caps.caps.v2.capability_bits & VIRGL_CAP_SRGB_WRITE_CONTROL;

Probably one or both of those is false for some silly reason. Could even be due to an older version of the virgl "server" component which doesn't include that capability bit.
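To make the dependency chain in the comment above easier to follow, here is a minimal standalone model of it. Everything in it is hypothetical: `struct sketch_screen` and the `sketch_*` functions are stand-ins invented for illustration, not Mesa types. It only mirrors the logic quoted above: the framebuffer sRGB extension needs the write-control cap and EXT_sRGB, and EXT_sRGB needs at least one of the three listed sRGB formats.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the screen state queried in the comment. */
struct sketch_screen {
   /* Models PIPE_CAP_DEST_SURFACE_SRGB_CONTROL. */
   bool srgb_write_control_cap;
   /* Models support for the three sRGB formats listed for EXT_sRGB. */
   bool srgb_format_supported[3];
};

/* EXT_sRGB: at least one of the listed formats must be supported. */
static bool
sketch_has_ext_srgb(const struct sketch_screen *s)
{
   for (size_t i = 0; i < 3; i++) {
      if (s->srgb_format_supported[i])
         return true;
   }
   return false;
}

/* EXT_framebuffer_sRGB: requires both the cap and EXT_sRGB. */
static bool
sketch_has_ext_framebuffer_srgb(const struct sketch_screen *s)
{
   return s->srgb_write_control_cap && sketch_has_ext_srgb(s);
}
```

In this model, either a missing capability bit (for example from an older host-side component) or the absence of all three formats is enough to drop the extension, which is the failure mode suggested in the comment.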
[Mesa-dev] [PATCH 1/2] util: Add util_is_power_of_two_minus_one
Checks if a number is one less than a power of two. Equivalently, this checks if a number is all ones in binary. The latter definition is helpful in the context of masks.

The function is trivial; this is *the* canonical check and is arguably no less clean than calling util_is_power_of_two(x + 1) (the latter function implemented similarly). Still, it's worth having a dedicated check for this; semantically, in the context of masks, this check is meaningful standalone, justifying an independent implementation from the existing util_is_power_of_two* utilities.

Signed-off-by: Alyssa Rosenzweig
Cc: Ian Romanick
Cc: Eduardo Lima Mitev
---
 src/util/bitscan.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/util/bitscan.h b/src/util/bitscan.h
index dc89ac93f28..632f7dd2e67 100644
--- a/src/util/bitscan.h
+++ b/src/util/bitscan.h
@@ -158,6 +158,15 @@ util_is_power_of_two_nonzero(unsigned v)
 #endif
 }
 
+/* Determine if an unsigned value is one less than a power-of-two
+ */
+
+static inline bool
+util_is_power_of_two_minus_one(unsigned v)
+{
+   return (v & (v + 1)) == 0;
+}
+
 /* For looping over a bitmask when you want to loop over consecutive bits
  * manually, for example:
  *
-- 
2.20.1
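For readers who want to convince themselves of the bit trick, here is a small standalone sketch. The first function repeats the expression from the patch; `is_all_low_ones_reference` is a hypothetical helper written only for comparison and is not part of Mesa.

```c
#include <assert.h>
#include <stdbool.h>

/* Same expression as in the patch. A value of the form 2^n - 1 is a run of
 * low ones (0b0...0111), so adding one carries through every set bit and the
 * AND with the original value is zero. */
static inline bool
util_is_power_of_two_minus_one(unsigned v)
{
   return (v & (v + 1)) == 0;
}

/* Hypothetical reference check, for comparison only: strip the trailing
 * ones and verify that no higher bit remains set. */
static inline bool
is_all_low_ones_reference(unsigned v)
{
   while (v & 1)
      v >>= 1;
   return v == 0;
}
```

Two edge cases are worth noting. Zero is accepted, since 0 = 2^0 - 1. The all-ones value is accepted as well, because unsigned addition wraps `v + 1` to zero.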
[Mesa-dev] [PATCH] android: winsys/amdgpu, radv: fix generated amdgfxregs.h header dependecies
Fix android building errors in winsys/amdgpu and radv due to 'amdgfxregs.h' not found.

Changelog:
amd/common - generated $(intermediates)/common path is added to exports
winsys/amdgpu - libmesa_amd_common static dependency is added
radv - fix libmesa_amd_common $(intermediates)/common path in includes

Fixes: f480b8a ("amd/common: use generated register header")
Signed-off-by: Mauro Rossi
---
 src/amd/Android.common.mk                | 3 ++-
 src/amd/vulkan/Android.mk                | 2 +-
 src/gallium/winsys/amdgpu/drm/Android.mk | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/amd/Android.common.mk b/src/amd/Android.common.mk
index 54180f16bb..1a3a00b9bc 100644
--- a/src/amd/Android.common.mk
+++ b/src/amd/Android.common.mk
@@ -62,7 +62,8 @@ LOCAL_C_INCLUDES := \
 	$(intermediates)/common
 
 LOCAL_EXPORT_C_INCLUDE_DIRS := \
-	$(LOCAL_PATH)/common
+	$(LOCAL_PATH)/common \
+	$(intermediates)/common
 
 LOCAL_SHARED_LIBRARIES := \
 	libdrm_amdgpu
diff --git a/src/amd/vulkan/Android.mk b/src/amd/vulkan/Android.mk
index ab39ba3b72..0725feacb5 100644
--- a/src/amd/vulkan/Android.mk
+++ b/src/amd/vulkan/Android.mk
@@ -68,7 +68,7 @@ $(call mesa-build-with-llvm)
 
 LOCAL_C_INCLUDES := \
 	$(RADV_COMMON_INCLUDES) \
-	$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_amd_common,,) \
+	$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_amd_common,,)/common \
 	$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_nir,,)/nir \
 	$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_radv_common,,) \
 	$(call generated-sources-dir-for,STATIC_LIBRARIES,libmesa_vulkan_util,,)/util \
diff --git a/src/gallium/winsys/amdgpu/drm/Android.mk b/src/gallium/winsys/amdgpu/drm/Android.mk
index 6e84a0c8de..0b8edf972d 100644
--- a/src/gallium/winsys/amdgpu/drm/Android.mk
+++ b/src/gallium/winsys/amdgpu/drm/Android.mk
@@ -32,7 +32,7 @@ LOCAL_SRC_FILES := $(C_SOURCES)
 
 LOCAL_CFLAGS := $(AMDGPU_CFLAGS)
 
-LOCAL_STATIC_LIBRARIES := libmesa_amdgpu_addrlib
+LOCAL_STATIC_LIBRARIES := libmesa_amdgpu_addrlib libmesa_amd_common
 LOCAL_SHARED_LIBRARIES := libdrm_amdgpu
 
 LOCAL_MODULE := libmesa_winsys_amdgpu
-- 
2.20.1
[Mesa-dev] [PATCH 01/11] panfrost: Integrate kernel names for tiler FBD
These names are from the replay workaround in kbase; they begin to shine some light on the meaning of these fields. In particular, we now understand why the "tiler_meta" field has the effect it does on performance in certain scenes (controlling tile granularity). Signed-off-by: Alyssa Rosenzweig --- .../drivers/panfrost/include/panfrost-job.h | 34 +- src/gallium/drivers/panfrost/pan_context.c| 44 --- .../drivers/panfrost/pandecode/decode.c | 36 ++- 3 files changed, 66 insertions(+), 48 deletions(-) diff --git a/src/gallium/drivers/panfrost/include/panfrost-job.h b/src/gallium/drivers/panfrost/include/panfrost-job.h index fd23499a00c..401fef8fcec 100644 --- a/src/gallium/drivers/panfrost/include/panfrost-job.h +++ b/src/gallium/drivers/panfrost/include/panfrost-job.h @@ -2,6 +2,7 @@ * © Copyright 2017-2018 Alyssa Rosenzweig * © Copyright 2017-2018 Connor Abbott * © Copyright 2017-2018 Lyude Paul + * © Copyright2019 Collabora * * Permission is hereby granted, free of charge, to any person obtaining a * copy of this software and associated documentation files (the "Software"), @@ -1362,16 +1363,16 @@ struct mali_single_framebuffer { u32 zero6[7]; /* Very weird format, see generation code in trans_builder.c */ -u32 resolution_check; - +u32 tiler_resolution_check; u32 tiler_flags; -u64 unknown_address_1; /* Pointing towards... a zero buffer? */ -u64 unknown_address_2; +/* Guesses? */ +mali_ptr tiler_scratch_start; /* Pointing towards... a zero buffer? */ +mali_ptr tiler_scratch_middle; /* See mali_kbase_replay.c */ -u64 tiler_heap_free; -u64 tiler_heap_end; +mali_ptr tiler_heap_free; +mali_ptr tiler_heap_end; /* More below this, maybe */ } __attribute__((packed)); @@ -1519,18 +1520,29 @@ struct bifrost_framebuffer { u32 clear_stencil : 8; u32 unk3 : 24; // = 0x100 float clear_depth; -mali_ptr tiler_meta; -/* 0x40 */ + + +/* Tiler section begins here */ +u32 tiler_unknown; + +/* Name known from the replay workaround in the kernel. 
What exactly is + * flagged here is less known. We do know that (tiler_flags & 0x1ff) + * specifies a mask of hierarchy weights, which explains some of the + * performance mysteries around setting it. We also know (1 << 16) + * should be set, but there's no explanation in the kernel why. */ +u32 tiler_flags; /* Note: these are guesses! */ mali_ptr tiler_scratch_start; mali_ptr tiler_scratch_middle; -/* These are not, since we see symmetry with replay jobs which name these explicitly */ -mali_ptr tiler_heap_start; +/* These are not, since we see symmetry with replay + * jobs which name these explicitly */ + +mali_ptr tiler_heap_start; /* tiler heap_free_address */ mali_ptr tiler_heap_end; -u64 zero9, zero10, zero11, zero12; +u32 tiler_weights[8]; /* optional: struct bifrost_fb_extra extra */ /* struct bifrost_render_target rts[] */ diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index d178a0f1db2..299f5f4c07b 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -107,7 +107,7 @@ panfrost_set_framebuffer_resolution(struct mali_single_framebuffer *fb, int w, i * The formula itself was discovered mostly by manual bruteforce and * aggressive algebraic simplification.
*/ -fb->resolution_check = ((w + h) / 3) << 4; +fb->tiler_resolution_check = ((w + h) / 3) << 4; } struct mali_single_framebuffer @@ -118,8 +118,8 @@ panfrost_emit_sfbd(struct panfrost_context *ctx) .format = 0x3000, .clear_flags = 0x1000, .unknown_address_0 = ctx->scratchpad.gpu, -.unknown_address_1 = ctx->misc_0.gpu, -.unknown_address_2 = ctx->misc_0.gpu + 40960, +.tiler_scratch_start = ctx->misc_0.gpu, +.tiler_scratch_middle = ctx->misc_0.gpu + 40960, .tiler_flags = 0xf0, .tiler_heap_free = ctx->tiler_heap.gpu, .tiler_heap_end = ctx->tiler_heap.gpu + ctx->tiler_heap.size, @@ -134,28 +134,22 @@ struct bifrost_framebuffer panfrost_emit_mfbd(struct panfrost_context *ctx) { struct bifrost_framebuffer framebuffer = { -/* It is not yet clear what tiler_meta means or how it's - * calculated, but we can tell the lower 32-bits are a - * (monotonically increasing?) function of tile count and - * geometry complexity; I suspect it defines a memory size of - * some kind? for the tiler. It's really
[Mesa-dev] [PATCH 05/11] panfrost: Add pan_tiler.h header
Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/pan_tiler.h | 44 1 file changed, 44 insertions(+) create mode 100644 src/gallium/drivers/panfrost/pan_tiler.h diff --git a/src/gallium/drivers/panfrost/pan_tiler.h b/src/gallium/drivers/panfrost/pan_tiler.h new file mode 100644 index 000..ad3dab68f7c --- /dev/null +++ b/src/gallium/drivers/panfrost/pan_tiler.h @@ -0,0 +1,44 @@ +/* + * Copyright (C) 2019 Collabora + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * Authors: + * Alyssa Rosenzweig + * + */ + +#ifndef __PAN_TILER_H__ +#define __PAN_TILER_H__ + +unsigned +panfrost_tiler_header_size(unsigned width, unsigned height, uint8_t mask); + +unsigned +panfrost_tiler_body_size(unsigned width, unsigned height, uint8_t mask); + +unsigned +panfrost_choose_hierarchy_mask( +unsigned width, unsigned height, +unsigned vertex_count); + +#endif + + -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 04/11] panfrost: Document tile size heuristic
I'm not sure how the blob does it, but this seems to be a dead simple test and roughly corresponds to what I've noticed from the blob, so maybe it's good enough.

Signed-off-by: Alyssa Rosenzweig
---
 src/gallium/drivers/panfrost/pan_tiler.c | 65
 1 file changed, 65 insertions(+)

diff --git a/src/gallium/drivers/panfrost/pan_tiler.c b/src/gallium/drivers/panfrost/pan_tiler.c
index f5103fa57ad..7e465ed542c 100644
--- a/src/gallium/drivers/panfrost/pan_tiler.c
+++ b/src/gallium/drivers/panfrost/pan_tiler.c
@@ -83,4 +83,69 @@
  * pushed to kernel space and we can mostly ignore it here, just remembering to
  * set the GROWABLE flag so the kernel actually uses this path rather than
  * allocating a gigantic amount up front and burning a hole in RAM.
+ *
+ * As far as determining which hierarchy levels to use, the simple answer is
+ * that right now, we don't. In the tiler configuration fields (consistent from
+ * the earliest Midgard's SFBD through the latest Bifrost traces we have),
+ * there is a hierarchy_mask field, controlling which levels (tile sizes) are
+ * enabled. Ideally, the hierarchical tiling dream -- mapping big polygons to
+ * big tiles and small polygons to small tiles -- would be realized here as
+ * well. As long as there are polygons at all needing tiling, we always have to
+ * have big tiles available, in case there are big polygons. But we don't
+ * necessarily need small tiles available. Ideally, when there are small
+ * polygons, small tiles are enabled (to avoid waste from putting small
+ * triangles in the big tiles); when there are not, small tiles are disabled to
+ * avoid enabling more levels than necessary, which potentially costs in memory
+ * bandwidth / power / tiler performance.
+ *
+ * Of course, the driver has to figure this out statically. When tile
+ * hierarchies are actually established, this occurs by the tiler in
+ * fixed-function hardware, after the vertex shaders have run and there is
+ * sufficient information to figure out the size of triangles. The driver has
+ * no such luxury, again barring insane hacks like additionally running the
+ * vertex shaders in software or in hardware via transform feedback. Thus, for
+ * the driver, we need a heuristic approach.
+ *
+ * There are lots of heuristics to guess triangle size statically you could
+ * imagine, but one approach shines as particularly simple-stupid: assume all
+ * on-screen triangles are equal size and spread equidistantly throughout the
+ * screen. Let's be clear, this is NOT A VALID ASSUMPTION. But if we roll with
+ * it, then we see:
+ *
+ *      Triangle Area   = (Screen Area / # of triangles)
+ *                      = (Width * Height) / (# of triangles)
+ *
+ * Or if you prefer, we can also make a third CRAZY assumption that we only draw
+ * right triangles with edges parallel/perpendicular to the sides of the screen
+ * with no overdraw, forming a triangle grid across the screen:
+ *
+ * |--w--|
+ *  _____   |
+ * | /| /|  |
+ * |/_|/_|  h
+ * | /| /|  |
+ * |/_|/_|  |
+ *
+ * Then you can use some middle school geometry and algebra to work out the
+ * triangle dimensions. I started working on this, but realised I didn't need
+ * to in order to make my point, but couldn't bear to erase that ASCII art. Anyway.
+ *
+ * POINT IS, by considering the ratio of screen area and triangle count, we can
+ * estimate the triangle size. For a small size, use small bins; for a large
+ * size, use large bins. Intuitively, this metric makes sense: when there are
+ * few triangles on a large screen, you're probably compositing a UI and
+ * therefore the triangles are large; when there are a lot of triangles on a
+ * small screen, you're probably rendering a 3D mesh and therefore the
+ * triangles are tiny. (Or better said -- there will be tiny triangles, even if
+ * there are also large triangles. There have to be unless you expect crazy
+ * overdraw. Generally, it's better to allow more small bin sizes than
+ * necessary than not allow enough.)
+ *
+ * From this heuristic (or whatever), we determine the minimum allowable tile
+ * size, and we use that to decide the hierarchy masking, selecting from the
+ * minimum "ideal" tile size to the maximum tile size (2048x2048).
+ *
+ * Once we have that mask and the framebuffer dimensions, we can compute the
+ * size of the statically-sized polygon list structures, allocate them, and go!
+ *
  */
-- 
2.20.1
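The heuristic described in the comment above can be sketched in a few lines of C. This is a rough illustration under stated assumptions, not the implementation from the series: the `sketch_*` names are hypothetical, the vertex count is assumed to describe independent triangles (three vertices each), bit i of the mask is assumed to enable the (16 << i)-sized tile level, and the rule "pick the smallest tile whose area covers the estimated triangle area" is one guess at how to turn the area estimate into a minimum level.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MIN_TILE_SIZE 16u /* level 0 is 16x16 */
#define SKETCH_NUM_LEVELS 8u     /* assumed: 16 << 7 == 2048 is the last level */

/* Hypothetical helper: smallest hierarchy level whose tile area is at least
 * the estimated per-triangle area (screen area / triangle count). */
static unsigned
sketch_min_level(unsigned width, unsigned height, unsigned vertex_count)
{
   /* Assume independent triangles; clamp to avoid dividing by zero. */
   uint64_t triangles = vertex_count / 3;
   if (triangles == 0)
      triangles = 1;

   uint64_t triangle_area = ((uint64_t) width * height) / triangles;

   for (unsigned level = 0; level < SKETCH_NUM_LEVELS - 1; level++) {
      uint64_t tile = (uint64_t) SKETCH_MIN_TILE_SIZE << level;
      if (tile * tile >= triangle_area)
         return level;
   }

   /* Nothing smaller fits, fall back to the largest level. */
   return SKETCH_NUM_LEVELS - 1;
}

/* Hypothetical mask builder: enable the minimum level and every larger one,
 * so big tiles are always available for big polygons. */
static unsigned
sketch_choose_hierarchy_mask(unsigned width, unsigned height,
                             unsigned vertex_count)
{
   unsigned min_level = sketch_min_level(width, height, vertex_count);
   unsigned all_levels = (1u << SKETCH_NUM_LEVELS) - 1;

   return all_levels & ~((1u << min_level) - 1);
}
```

Under these assumptions, a 1920x1080 framebuffer with two triangles keeps only the two largest levels, while a dense mesh on a small framebuffer enables every level. A real driver would likely want to bias toward enabling more small levels, as the comment recommends.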
[Mesa-dev] [PATCH 02/11] panfrost: Add notes about the tiler allocations
This explains how the polygon list is allocated, updating the headers appropiately to sync the terminology. Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/pan_tiler.c | 86 1 file changed, 86 insertions(+) create mode 100644 src/gallium/drivers/panfrost/pan_tiler.c diff --git a/src/gallium/drivers/panfrost/pan_tiler.c b/src/gallium/drivers/panfrost/pan_tiler.c new file mode 100644 index 000..f5103fa57ad --- /dev/null +++ b/src/gallium/drivers/panfrost/pan_tiler.c @@ -0,0 +1,86 @@ +/* + * Copyright (C) 2019 Collabora + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Authors: + * Alyssa Rosenzweig + */ + +/* Mali GPUs are tiled-mode renderers, rather than immediate-mode. + * Conceptually, the screen is divided into 16x16 tiles. Vertex shaders run. + * Then, a fixed-function hardware block (the tiler) consumes the gl_Position + * results. 
For each triangle specified, it marks each containing tile as + * containing that triangle. This set of "triangles per tile" forms the "polygon + * list". Finally, the rasterization unit consumes the polygon list to invoke + * the fragment shader. + * + * In practice, it's a bit more complicated than this. 16x16 is the logical + * tile size, but Midgard features "hierarchical tiling", where power-of-two + * multiples of the base tile size can be used: hierarchy level 0 (16x16), + * level 1 (32x32), level 2 (64x64), per public information about Midgard's + * tiling. In fact, tiling goes up to 2048x2048 (!), although in practice + * 128x128 is the largest usually used (though higher modes are enabled). The + * idea behind hierarchical tiling is to use low tiling levels for small + * triangles and high levels for large triangles, to minimize memory bandwidth + * and repeated fragment shader invocations (the former issue inherent to + * immediate-mode rendering and the latter common in traditional tilers). + * + * The tiler itself works by reading varyings in and writing a polygon list + * out. Unfortunately (for us), both of these buffers are managed in main + * memory; although they ideally will be cached, it is the drivers' + * responsibility to allocate these buffers. Varying buffer allocation is + * handled elsewhere, as it is not tiler specific; the real issue is allocating + * the polygon list. + * + * This is hard, because from the driver's perspective, we have no information + * about what geometry will actually look like on screen; that information is + * only gained from running the vertex shader. (Theoretically, we could run the + * vertex shaders in software as a prepass, or in hardware with transform + * feedback as a prepass, but either idea is ludicrous on so many levels). + * + * Instead, Mali uses a bit of a hybrid approach, splitting the polygon list + * into three distinct pieces.
First, the driver statically determines which + * tile hierarchy levels to use (more on that later). At this point, we know the + * framebuffer dimensions and all the possible tilings of the framebuffer, so + * we know exactly how many tiles exist across all hierarchy levels. The first + * piece of the polygon list is the header, which is exactly 8 bytes per tile, + * plus padding and a small 64-byte prologue. (If that doesn't remind you of + * AFBC, it should. See pan_afbc.c for some fun parallels). The next part is + * the polygon list body, which seems to contain 512 bytes per tile, again + * across every level of the hierarchy. These two parts form the polygon list + * buffer. This buffer has a statically determinable size, approximately equal + * to the # of tiles across all hierarchy levels * (8 bytes + 512 bytes), plus + * alignment / minimum restrictions / etc. + * + * The third piece is the easy one (for us): the tiler heap. In essence, the + * tiler heap is a gigantic slab that's as big as could possibly be necessary + * in the worst case imagin
[Mesa-dev] [PATCH 03/11] panfrost: Rename tiler fields per tiler research
Following the research into Midgard's hierarchical tiling infrastructure, we now understand (in broad strokes) the purpose of each tiler field in the MFBD. Additionally, we understand more of the tiling fields in the SFBD and in Bifrost's structures, although this knowledge is still incomplete. Update the names, decoder, and comments to reflect this new understanding. Signed-off-by: Alyssa Rosenzweig --- .../drivers/panfrost/include/panfrost-job.h | 40 -- src/gallium/drivers/panfrost/pan_context.c| 52 ++- .../drivers/panfrost/pandecode/decode.c | 22 3 files changed, 54 insertions(+), 60 deletions(-) diff --git a/src/gallium/drivers/panfrost/include/panfrost-job.h b/src/gallium/drivers/panfrost/include/panfrost-job.h index 401fef8fcec..c7cb2d7b5f4 100644 --- a/src/gallium/drivers/panfrost/include/panfrost-job.h +++ b/src/gallium/drivers/panfrost/include/panfrost-job.h @@ -963,7 +963,8 @@ struct bifrost_tiler_heap_meta { struct bifrost_tiler_meta { u64 zero0; -u32 unk; // = 0xf0 +u16 hierarchy_mask; +u16 flags; u16 width; u16 height; u64 zero1; @@ -1362,13 +1363,18 @@ struct mali_single_framebuffer { u32 zero6[7]; -/* Very weird format, see generation code in trans_builder.c */ +/* Logically, by symmetry to the MFBD, this ought to be the size of the + * polygon list. But this doesn't quite compute up. More investigation + * is needed. */ + u32 tiler_resolution_check; -u32 tiler_flags; -/* Guesses? */ -mali_ptr tiler_scratch_start; /* Pointing towards... a zero buffer? */ -mali_ptr tiler_scratch_middle; +u16 tiler_hierarchy_mask; +u16 tiler_flags; + +/* See pan_tiler.c */ +mali_ptr tiler_polygon_list; +mali_ptr tiler_polygon_list_body; /* See mali_kbase_replay.c */ mali_ptr tiler_heap_free; @@ -1523,21 +1529,23 @@ struct bifrost_framebuffer { /* Tiler section begins here */ -u32 tiler_unknown; +u32 tiler_polygon_list_size; /* Name known from the replay workaround in the kernel. What exactly is - * flagged here is less known.
We do that (tiler_flags & 0x1ff) + * flagged here is less known. We do that (tiler_hierarchy_mask & 0x1ff) * specifies a mask of hierarchy weights, which explains some of the - * performance mysteries around setting it. We also known (1 << 16) - * should be set, but there's no explanation in the kernel why. */ -u32 tiler_flags; + * performance mysteries around setting it. We also see the bottom bit + * of tiler_flags set in the kernel, but no comment why. */ + +u16 tiler_hierarchy_mask; +u16 tiler_flags; -/* Note: these are guesses! */ -mali_ptr tiler_scratch_start; -mali_ptr tiler_scratch_middle; +/* See mali_tiler.c for an explanation */ +mali_ptr tiler_polygon_list; +mali_ptr tiler_polygon_list_body; -/* These are not, since we see symmetry with replay - * jobs which name these explicitly */ +/* Names based on we see symmetry with replay jobs which name these + * explicitly */ mali_ptr tiler_heap_start; /* tiler heap_free_address */ mali_ptr tiler_heap_end; diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index 299f5f4c07b..0363591f79f 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -118,9 +118,10 @@ panfrost_emit_sfbd(struct panfrost_context *ctx) .format = 0x3000, .clear_flags = 0x1000, .unknown_address_0 = ctx->scratchpad.gpu, -.tiler_scratch_start = ctx->misc_0.gpu, -.tiler_scratch_middle = ctx->misc_0.gpu + 40960, -.tiler_flags = 0xf0, +.tiler_polygon_list = ctx->misc_0.gpu, +.tiler_polygon_list_body = ctx->misc_0.gpu + 40960, +.tiler_hierarchy_mask = 0xF0, +.tiler_flags = 0x0, .tiler_heap_free = ctx->tiler_heap.gpu, .tiler_heap_end = ctx->tiler_heap.gpu + ctx->tiler_heap.size, }; @@ -134,22 +135,22 @@ struct bifrost_framebuffer panfrost_emit_mfbd(struct panfrost_context *ctx) { struct bifrost_framebuffer framebuffer = { -/* It is not yet clear what this means or how it's - * calculated, but we can tell it is a (monotonically - * increasing?) 
function of tile count and geometry complexity; - * I suspect it defines a memory size of some kind? for the - * tiler. It's really unclear at the moment... but to add to - * the confusion, the hardware is happy enough to accept a zer
[Mesa-dev] [PATCH 00/11] panfrost: Hierarchical tiling work
Midgard and Bifrost GPUs feature "hierarchical tiling", publicly documented to varying degrees. Essentially, we're a regular tiler GPU, but the tile size can vary (and that variance is in driver control). This series adds some explanation of how hierarchical tiling works from the drivers' perspective, in order to demystify the tiler data structures we touch. With that explanation, we are then able to use these bits to compute the tiler data structure sizes ourselves, correctly supplying the corresponding sizes in the framebuffer descriptor. Once job management is tamed, we'll use these sizes to statically allocate the polygon list on a per-framebuffer basis. Alyssa Rosenzweig (11): panfrost: Integrate kernel names for tiler FBD panfrost: Add notes about the tiler allocations panfrost: Rename tiler fields per tiler research panfrost: Document tile size heuristic panfrost: Add pan_tiler.h header panfrost: Calculate polygon list header size panfrost: Use polygon list header size computation panfrost: Compute and use polygon list body size panfrost: Sanity check tiler polygon list size panfrost: Rename misc_0 -> tiler_polygon_list panfrost: Stub out hierarchy mask selection .../drivers/panfrost/include/panfrost-job.h | 52 +++- src/gallium/drivers/panfrost/meson.build | 3 +- src/gallium/drivers/panfrost/pan_context.c| 101 +++--- src/gallium/drivers/panfrost/pan_context.h| 3 +- src/gallium/drivers/panfrost/pan_drm.c| 2 +- src/gallium/drivers/panfrost/pan_tiler.c | 291 ++ src/gallium/drivers/panfrost/pan_tiler.h | 44 +++ .../drivers/panfrost/pandecode/decode.c | 50 ++- 8 files changed, 457 insertions(+), 89 deletions(-) create mode 100644 src/gallium/drivers/panfrost/pan_tiler.c create mode 100644 src/gallium/drivers/panfrost/pan_tiler.h -- 2.20.1
[Mesa-dev] [PATCH 09/11] panfrost: Sanity check tiler polygon list size
Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/pan_context.c | 5 + 1 file changed, 5 insertions(+) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index a30b9e29701..d1e5b4ce647 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -178,6 +178,11 @@ panfrost_emit_mfbd(struct panfrost_context *ctx) unsigned body_size = panfrost_tiler_body_size( width, height, framebuffer.tiler_hierarchy_mask); +/* Sanity check */ + +unsigned total_size = header_size + body_size; +assert(ctx->misc_0.size >= total_size); + framebuffer.tiler_polygon_list_body = framebuffer.tiler_polygon_list + header_size; -- 2.20.1
[Mesa-dev] [PATCH 07/11] panfrost: Use polygon list header size computation
Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/pan_context.c | 21 - 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index 0363591f79f..ecb68c990a0 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -46,6 +46,7 @@ #include "pan_blending.h" #include "pan_blend_shaders.h" #include "pan_util.h" +#include "pan_tiler.h" static int performance_counter_number = 0; extern const char *pan_counters_base; @@ -134,6 +135,9 @@ panfrost_emit_sfbd(struct panfrost_context *ctx) struct bifrost_framebuffer panfrost_emit_mfbd(struct panfrost_context *ctx) { +unsigned width = ctx->pipe_framebuffer.width; +unsigned height = ctx->pipe_framebuffer.height; + struct bifrost_framebuffer framebuffer = { /* The lower 0x1ff controls the hierarchy mask. Set more bits * on for more tile granularity (which can be a performance win @@ -149,13 +153,12 @@ panfrost_emit_mfbd(struct panfrost_context *ctx) /* See pan_tiler.c */ .tiler_polygon_list = ctx->misc_0.gpu, -.tiler_polygon_list_body = ctx->misc_0.gpu + 0xf, .tiler_polygon_list_size = 0x0, -.width1 = MALI_POSITIVE(ctx->pipe_framebuffer.width), -.height1 = MALI_POSITIVE(ctx->pipe_framebuffer.height), -.width2 = MALI_POSITIVE(ctx->pipe_framebuffer.width), -.height2 = MALI_POSITIVE(ctx->pipe_framebuffer.height), +.width1 = MALI_POSITIVE(width), +.height1 = MALI_POSITIVE(height), +.width2 = MALI_POSITIVE(width), +.height2 = MALI_POSITIVE(height), .unk1 = 0x1080, @@ -168,6 +171,14 @@ panfrost_emit_mfbd(struct panfrost_context *ctx) .scratchpad = ctx->scratchpad.gpu, }; +/* Compute the polygon header size and use that to offset the body */ + +unsigned header_size = panfrost_tiler_header_size( +width, height, framebuffer.tiler_hierarchy_mask); + +framebuffer.tiler_polygon_list_body = +framebuffer.tiler_polygon_list + header_size; + return framebuffer; } -- 2.20.1
[Mesa-dev] [PATCH 11/11] panfrost: Stub out hierarchy mask selection
Quite a bit of refactoring in the main driver will be necessary to make use of this effectively, so the implementation is incomplete. Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/pan_tiler.c | 21 + 1 file changed, 21 insertions(+) diff --git a/src/gallium/drivers/panfrost/pan_tiler.c b/src/gallium/drivers/panfrost/pan_tiler.c index 37768e9579e..9299eedb4c1 100644 --- a/src/gallium/drivers/panfrost/pan_tiler.c +++ b/src/gallium/drivers/panfrost/pan_tiler.c @@ -268,3 +268,24 @@ panfrost_tiler_body_size(unsigned width, unsigned height, uint8_t mask) return ALIGN_POT(header_size * 512 / 8, 512); } + +/* In the future, a heuristic to choose a tiler hierarchy mask would go here. + * At the moment, we just default to 0xFF, which enables all possible hierarchy + * levels. Overall this yields good performance but presumably incurs a cost in + * memory bandwidth / power consumption / etc, at least on smaller scenes that + * don't really need all the smaller levels enabled */ + +unsigned +panfrost_choose_hierarchy_mask( +unsigned width, unsigned height, +unsigned vertex_count) +{ +/* If there is no geometry, we don't bother enabling anything */ + +if (!vertex_count) +return 0x00; + +/* Otherwise, default everything on. TODO: Proper tests */ + +return 0xFF; +} -- 2.20.1
[Mesa-dev] [PATCH 06/11] panfrost: Calculate polygon list header size
As per the notes at the beginning of pan_tiler.c, we implement a routine to calculate the size of the polygon list header given the framebuffer dimensions and the provided hierarchy mask. Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/meson.build | 3 +- src/gallium/drivers/panfrost/pan_tiler.c | 105 +++ 2 files changed, 107 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/panfrost/meson.build b/src/gallium/drivers/panfrost/meson.build index 006449fc48f..39f1fd8e6b0 100644 --- a/src/gallium/drivers/panfrost/meson.build +++ b/src/gallium/drivers/panfrost/meson.build @@ -59,7 +59,8 @@ files_panfrost = files( 'pan_pretty_print.c', 'pan_fragment.c', 'pan_sfbd.c', - 'pan_mfbd.c' + 'pan_mfbd.c', + 'pan_tiler.c', ) inc_panfrost = [ diff --git a/src/gallium/drivers/panfrost/pan_tiler.c b/src/gallium/drivers/panfrost/pan_tiler.c index 7e465ed542c..9e7a3a5e1b3 100644 --- a/src/gallium/drivers/panfrost/pan_tiler.c +++ b/src/gallium/drivers/panfrost/pan_tiler.c @@ -24,6 +24,10 @@ * Alyssa Rosenzweig */ +#include "util/u_math.h" +#include "util/macros.h" +#include "pan_tiler.h" + /* Mali GPUs are tiled-mode renderers, rather than immediate-mode. * Conceptually, the screen is divided into 16x16 tiles. Vertex shaders run. * Then, a fixed-function hardware block (the tiler) consumes the gl_Position @@ -149,3 +153,104 @@ * size of the statically-sized polygon list structures, allocate them, and go! 
* */ + +/* Hierarchical tiling spans from 16x16 to 2048x2048 tiles */ + +#define MIN_TILE_SIZE 16 +#define MAX_TILE_SIZE 2048 + +/* Constants as shifts for easier power-of-two iteration */ + +#define MIN_TILE_SHIFT util_logbase2(MIN_TILE_SIZE) +#define MAX_TILE_SHIFT util_logbase2(MAX_TILE_SIZE) + +/* The hierarchy has a 64-byte prologue */ +#define PROLOGUE_SIZE 0x40 + +/* For each tile (across all hierarchy levels), there is 8 bytes of header */ +#define HEADER_BYTES_PER_TILE 0x8 + +/* Absent any geometry, the minimum size of the header */ +#define MINIMUM_HEADER_SIZE 0x200 + +/* If the width-x-height framebuffer is divided into tile_size-x-tile_size + * tiles, how many tiles are there? Rounding up in each direction. For the + * special case of tile_size=16, this aligns with the usual Midgard count. + * tile_size must be a power-of-two. Not really repeat code from AFBC/checksum, + * because those care about the stride (not just the overall count) and only at + * a a fixed-tile size (not any of a number of power-of-twos) */ + +static unsigned +pan_tile_count(unsigned width, unsigned height, unsigned tile_size) +{ +unsigned aligned_width = ALIGN_POT(width, tile_size); +unsigned aligned_height = ALIGN_POT(height, tile_size); + +unsigned tile_count_x = aligned_width / tile_size; +unsigned tile_count_y = aligned_height / tile_size; + +return tile_count_x * tile_count_y; +} + +/* For `masked_count` of the smallest tile sizes masked out, computes how the + * size of the polygon list header. We iterate the tile sizes (16x16 through + * 2048x2048, if nothing is masked; (16*2^masked_count)x(16*2^masked_count) + * through 2048x2048 more generally. For each tile size, we figure out how many + * tiles there are at this hierarchy level and therefore many bytes this level + * is, leaving us with a byte count for each level. We then just sum up the + * byte counts across the levels to find a byte count for all levels. 
*/ + +static unsigned +panfrost_raw_header_size(unsigned width, unsigned height, unsigned masked_count) +{ +unsigned size = PROLOGUE_SIZE; + +/* Normally we start at 16x16 tiles (MIN_TILE_SHIFT), but we add more + * if anything is masked off */ + +unsigned start_level = MIN_TILE_SHIFT + masked_count; + +/* Iterate hierarchy levels / tile sizes */ + +for (unsigned i = start_level; i < MAX_TILE_SHIFT; ++i) { +/* Shift from a level to a tile size */ +unsigned tile_size = (1 << i); + +unsigned tile_count = pan_tile_count(width, height, tile_size); +unsigned header_bytes = HEADER_BYTES_PER_TILE * tile_count; + +size += header_bytes; +} + +/* This size will be used as an offset, so ensure it's aligned */ +return ALIGN_POT(size, 512); +} + +/* Given a hierarchy mask and a framebuffer size, compute the header size */ + +unsigned +panfrost_tiler_header_size(unsigned width, unsigned height, uint8_t mask) +{ +/* If no hierarchy levels are enabled, that means there is no geometry + * for the tiler to process, so use a minimum size. Used for clears */ + +if (mask == 0x00) +return MINIMUM_HEADER_SIZE; + +/* Some levels are enabled. Ensure that only smaller levels are + * disabled and there are no gaps. Theoretically the hardware is more + * flexible, but there's no known reason to use other conf
[Mesa-dev] [PATCH 08/11] panfrost: Compute and use polygon list body size
This is a bit of a hack, but it gets the point across. Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/pan_context.c | 7 ++- src/gallium/drivers/panfrost/pan_tiler.c | 14 ++ 2 files changed, 20 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index ecb68c990a0..a30b9e29701 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -153,7 +153,6 @@ panfrost_emit_mfbd(struct panfrost_context *ctx) /* See pan_tiler.c */ .tiler_polygon_list = ctx->misc_0.gpu, -.tiler_polygon_list_size = 0x0, .width1 = MALI_POSITIVE(width), .height1 = MALI_POSITIVE(height), @@ -176,9 +175,15 @@ panfrost_emit_mfbd(struct panfrost_context *ctx) unsigned header_size = panfrost_tiler_header_size( width, height, framebuffer.tiler_hierarchy_mask); +unsigned body_size = panfrost_tiler_body_size( +width, height, framebuffer.tiler_hierarchy_mask); + framebuffer.tiler_polygon_list_body = framebuffer.tiler_polygon_list + header_size; +framebuffer.tiler_polygon_list_size = +header_size + body_size; + return framebuffer; } diff --git a/src/gallium/drivers/panfrost/pan_tiler.c b/src/gallium/drivers/panfrost/pan_tiler.c index 9e7a3a5e1b3..37768e9579e 100644 --- a/src/gallium/drivers/panfrost/pan_tiler.c +++ b/src/gallium/drivers/panfrost/pan_tiler.c @@ -254,3 +254,17 @@ panfrost_tiler_header_size(unsigned width, unsigned height, uint8_t mask) return panfrost_raw_header_size(width, height, masked_count); } + +/* The body seems to be about 512 bytes per tile. Noting that the header is + * about 8 bytes per tile, we can be a little sloppy and estimate the body size + * to be equal to the header size * (512/8). Given the header size is a + * considerable overestimate, this is fine. Eventually, we should maybe figure + * out how to actually implement this. 
*/ + +unsigned +panfrost_tiler_body_size(unsigned width, unsigned height, uint8_t mask) +{ +unsigned header_size = panfrost_tiler_header_size(width, height, mask); +return ALIGN_POT(header_size * 512 / 8, 512); +} + -- 2.20.1
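To get a feel for the numbers, the header and body estimates from patches 06 and 08 can be reproduced in a standalone sketch. The names below (sketch_align_pot, sketch_tile_count, sketch_header_size, sketch_body_size) are hypothetical, the functions take the count of masked-off small levels directly instead of a hierarchy mask, and the loop bound mirrors the posted diff (it stops before MAX_TILE_SHIFT), so this is an illustration under those assumptions rather than the driver's code.

```c
#include <assert.h>

/* Constants copied from the posted patch; the SKETCH_ prefix marks them as
 * part of this illustration, not Mesa definitions */
#define SKETCH_MIN_TILE_SHIFT 4   /* log2(16) */
#define SKETCH_MAX_TILE_SHIFT 11  /* log2(2048) */
#define SKETCH_PROLOGUE_SIZE 0x40
#define SKETCH_HEADER_BYTES_PER_TILE 0x8

/* Round x up to a multiple of y, where y must be a power of two */
static unsigned
sketch_align_pot(unsigned x, unsigned y)
{
        assert(y != 0 && (y & (y - 1)) == 0);
        return (x + (y - 1)) & ~(y - 1);
}

/* Number of tile_size x tile_size tiles covering the framebuffer,
 * rounding up in each direction */
static unsigned
sketch_tile_count(unsigned width, unsigned height, unsigned tile_size)
{
        unsigned tiles_x = sketch_align_pot(width, tile_size) / tile_size;
        unsigned tiles_y = sketch_align_pot(height, tile_size) / tile_size;

        return tiles_x * tiles_y;
}

/* Prologue plus 8 bytes per tile, summed over the iterated levels and
 * aligned to 512 bytes. masked_count is the number of smallest levels
 * that are disabled (an assumption standing in for the hierarchy mask) */
static unsigned
sketch_header_size(unsigned width, unsigned height, unsigned masked_count)
{
        unsigned size = SKETCH_PROLOGUE_SIZE;
        unsigned start = SKETCH_MIN_TILE_SHIFT + masked_count;

        /* Same loop bound as the diff: the last level visited is 1024x1024 */
        for (unsigned i = start; i < SKETCH_MAX_TILE_SHIFT; ++i) {
                unsigned tiles = sketch_tile_count(width, height, 1u << i);
                size += SKETCH_HEADER_BYTES_PER_TILE * tiles;
        }

        return sketch_align_pot(size, 512);
}

/* Body estimate: header size scaled by 512/8, as in patch 08 */
static unsigned
sketch_body_size(unsigned width, unsigned height, unsigned masked_count)
{
        unsigned header = sketch_header_size(width, height, masked_count);

        return sketch_align_pot(header * 512 / 8, 512);
}
```

Under these assumptions, a 1920x1080 framebuffer with nothing masked covers 10901 tiles across the seven iterated levels, which gives an 87552-byte header and a 5603328-byte body estimate.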
[Mesa-dev] [PATCH 10/11] panfrost: Rename misc_0 -> tiler_polygon_list
Just for readability. Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/pan_context.c | 14 +++--- src/gallium/drivers/panfrost/pan_context.h | 3 +-- src/gallium/drivers/panfrost/pan_drm.c | 2 +- 3 files changed, 9 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index d1e5b4ce647..b4108162ca3 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -119,8 +119,8 @@ panfrost_emit_sfbd(struct panfrost_context *ctx) .format = 0x3000, .clear_flags = 0x1000, .unknown_address_0 = ctx->scratchpad.gpu, -.tiler_polygon_list = ctx->misc_0.gpu, -.tiler_polygon_list_body = ctx->misc_0.gpu + 40960, +.tiler_polygon_list = ctx->tiler_polygon_list.gpu, +.tiler_polygon_list_body = ctx->tiler_polygon_list.gpu + 40960, .tiler_hierarchy_mask = 0xF0, .tiler_flags = 0x0, .tiler_heap_free = ctx->tiler_heap.gpu, @@ -152,7 +152,7 @@ panfrost_emit_mfbd(struct panfrost_context *ctx) .tiler_heap_end = ctx->tiler_heap.gpu + ctx->tiler_heap.size, /* See pan_tiler.c */ -.tiler_polygon_list = ctx->misc_0.gpu, +.tiler_polygon_list = ctx->tiler_polygon_list.gpu, .width1 = MALI_POSITIVE(width), .height1 = MALI_POSITIVE(height), @@ -181,7 +181,7 @@ panfrost_emit_mfbd(struct panfrost_context *ctx) /* Sanity check */ unsigned total_size = header_size + body_size; -assert(ctx->misc_0.size >= total_size); +assert(ctx->tiler_polygon_list.size >= total_size); framebuffer.tiler_polygon_list_body = framebuffer.tiler_polygon_list + header_size; @@ -646,7 +646,7 @@ panfrost_set_value_job(struct panfrost_context *ctx) }; struct mali_payload_set_value payload = { -.out = ctx->misc_0.gpu, +.out = ctx->tiler_polygon_list.gpu, .unknown = 0x3, }; @@ -2383,7 +2383,7 @@ panfrost_destroy(struct pipe_context *pipe) screen->driver->free_slab(screen, &panfrost->varying_mem); screen->driver->free_slab(screen, &panfrost->shaders); screen->driver->free_slab(screen, 
&panfrost->tiler_heap); -screen->driver->free_slab(screen, &panfrost->misc_0); +screen->driver->free_slab(screen, &panfrost->tiler_polygon_list); } static struct pipe_query * @@ -2538,7 +2538,7 @@ panfrost_setup_hardware(struct panfrost_context *ctx) screen->driver->allocate_slab(screen, &ctx->varying_mem, 16384, false, PAN_ALLOCATE_INVISIBLE | PAN_ALLOCATE_COHERENT_LOCAL, 0, 0); screen->driver->allocate_slab(screen, &ctx->shaders, 4096, true, PAN_ALLOCATE_EXECUTE, 0, 0); screen->driver->allocate_slab(screen, &ctx->tiler_heap, 32768, false, PAN_ALLOCATE_INVISIBLE | PAN_ALLOCATE_GROWABLE, 1, 128); -screen->driver->allocate_slab(screen, &ctx->misc_0, 128*128, false, PAN_ALLOCATE_INVISIBLE | PAN_ALLOCATE_GROWABLE, 1, 128); +screen->driver->allocate_slab(screen, &ctx->tiler_polygon_list, 128*128, false, PAN_ALLOCATE_INVISIBLE | PAN_ALLOCATE_GROWABLE, 1, 128); } diff --git a/src/gallium/drivers/panfrost/pan_context.h b/src/gallium/drivers/panfrost/pan_context.h index 3eac8d6b5a2..58a913d9338 100644 --- a/src/gallium/drivers/panfrost/pan_context.h +++ b/src/gallium/drivers/panfrost/pan_context.h @@ -133,8 +133,7 @@ struct panfrost_context { struct panfrost_memory scratchpad; struct panfrost_memory tiler_heap; struct panfrost_memory varying_mem; -struct panfrost_memory misc_0; -struct panfrost_memory misc_1; +struct panfrost_memory tiler_polygon_list; struct panfrost_memory depth_stencil_buffer; struct panfrost_query *occlusion_query; diff --git a/src/gallium/drivers/panfrost/pan_drm.c b/src/gallium/drivers/panfrost/pan_drm.c index 3ceff41e25c..2b4097c5ce5 100644 --- a/src/gallium/drivers/panfrost/pan_drm.c +++ b/src/gallium/drivers/panfrost/pan_drm.c @@ -236,7 +236,7 @@ panfrost_drm_submit_job(struct panfrost_context *ctx, u64 job_desc, int reqs, st bo_handles[submit.bo_handle_count++] = ctx->scratchpad.gem_handle; bo_handles[submit.bo_handle_count++] = ctx->tiler_heap.gem_handle; bo_handles[submit.bo_handle_count++] = ctx->varying_mem.gem_handle; - 
bo_handles[submit.bo_handle_count++] = ctx->misc_0.gem_handle; + bo_handles[submit.bo_handle_count++] = ctx->tiler_polygon_list.gem_handle; submit.bo_handles = (u64) (uintptr_t) bo_handles; if (drmIoctl(drm->fd, DRM_IOCTL_PANFROST_SUBMIT, &submit)) { -- 2.20.1
Re: [Mesa-dev] [PATCH 1/2] util: Add util_is_power_of_two_minus_one
On 6/14/19 9:42 AM, Alyssa Rosenzweig wrote: > Checks if a number is one less than a power of two. Equivalently, this > checks if a number is all ones in binary. The latter definition is > helpful in the context of masks. > > The function is trivial; this is *the* canonical check and is > arguably no less clean than calling util_is_power_of_two(x + 1) (the Except it would have to be util_is_power_of_two_or_zero because util_is_power_of_two(0x + 1) is false. :) Is there actually a 2/2 for this? We usually wouldn't land something like this without a caller. > latter function implemented similarly). Still, it's worth having a > dedicated check for this; semantically, in the context of masks, this > check is meaningful standalone, justifying an independent implementation > from the existing util_is_power_of_two* utilities. > > Signed-off-by: Alyssa Rosenzweig > Cc: Ian Romanick > Cc: Eduardo Lima Mitev > --- > src/util/bitscan.h | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/src/util/bitscan.h b/src/util/bitscan.h > index dc89ac93f28..632f7dd2e67 100644 > --- a/src/util/bitscan.h > +++ b/src/util/bitscan.h > @@ -158,6 +158,15 @@ util_is_power_of_two_nonzero(unsigned v) > #endif > } > > +/* Determine if an unsigned value is one less than a power-of-two > + */ > + > +static inline bool > +util_is_power_of_two_minus_one(unsigned v) > +{ > + return (v & (v + 1)) == 0; This will return true for v == 0. Is that the desired behavior? I mean, that is 2**0 - 1, but it is not "all ones in binary." I think the result may surprise people wanting to use this to detect a mask. This is also how we ended up with util_is_power_of_two_nonzero and util_is_power_of_two_or_zero. > +} > + > /* For looping over a bitmask when you want to loop over consecutive bits > * manually, for example: > * >
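The v == 0 corner case raised in this review is easy to see in isolation. The following is a minimal sketch, not code from the patch or from Mesa: sketch_is_pot_minus_one repeats the proposed expression, and sketch_is_nonzero_low_mask is a hypothetical stricter variant for callers that want a non-empty run of low bits.

```c
#include <assert.h>
#include <stdbool.h>

/* Same expression as the proposed helper: true when v is 2^n - 1.
 * Note that n = 0 is included, so v == 0 also passes. Unsigned overflow
 * wraps, so an all-ones value gives v + 1 == 0 and passes as well */
static inline bool
sketch_is_pot_minus_one(unsigned v)
{
        return (v & (v + 1)) == 0;
}

/* Hypothetical stricter variant: additionally rejects the empty mask */
static inline bool
sketch_is_nonzero_low_mask(unsigned v)
{
        return v != 0 && sketch_is_pot_minus_one(v);
}
```

With these definitions, 0x0, 0x7 and 0xFF all pass the first check, 0xF0 fails it, and only the second check distinguishes 0x0 from a real mask.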
[Mesa-dev] [Bug 110923] Multiple VkSubpassDependency-entries with the same dstSubpass not handled correctly
https://bugs.freedesktop.org/show_bug.cgi?id=110923 Bug ID: 110923 Summary: Multiple VkSubpassDependency-entries with the same dstSubpass not handled correctly Product: Mesa Version: git Hardware: Other OS: All Status: NEW Severity: normal Priority: medium Component: Drivers/Vulkan/radeon Assignee: mesa-dev@lists.freedesktop.org Reporter: chris.forf...@gmail.com QA Contact: mesa-dev@lists.freedesktop.org Looking at [1], it seems like if VkRenderPassCreateInfo->pDependencies has multiple entries with the same value for dstSubpass then each entry will overwrite the effects of the previous. Take [2] as an example; in this case pass->subpasses[0].start_barrier.src_stage_mask appears to end up as pDependencies[1].srcStageMask instead of the seemingly correct (pDependencies[0].srcStageMask | pDependencies[1].srcStageMask). In other words, as the entries in pDependencies are considered the masks should be OR-ed, not assigned. [1] https://github.com/intel/external-mesa/blob/a749ad9d7d8558c8b085e0484a91d83ca84d9db2/src/amd/vulkan/radv_pass.c#L366 [2] https://github.com/KhronosGroup/Vulkan-Tools/blob/b99797641e8275e31557b3eb0610e9d282f96c35/cube/cube.c#L1896
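The difference between assigning and OR-ing the masks can be shown with a small sketch. The types and names below (sketch_dependency, sketch_barrier, sketch_accumulate) are hypothetical stand-ins, not the radv or Vulkan structures, and the stage values used with them are arbitrary bit flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one dependency entry */
struct sketch_dependency {
        uint32_t dst_subpass;
        uint32_t src_stage_mask;
};

/* Hypothetical, simplified stand-in for a per-subpass start barrier */
struct sketch_barrier {
        uint32_t src_stage_mask;
};

/* Fold every dependency into the barrier of its destination subpass.
 * Using |= keeps the contribution of earlier entries that target the
 * same subpass; a plain assignment would keep only the last one */
static void
sketch_accumulate(struct sketch_barrier *barriers, unsigned barrier_count,
                  const struct sketch_dependency *deps, unsigned dep_count)
{
        for (unsigned i = 0; i < dep_count; ++i) {
                assert(deps[i].dst_subpass < barrier_count);
                barriers[deps[i].dst_subpass].src_stage_mask |=
                        deps[i].src_stage_mask;
        }
}
```

With two entries targeting subpass 0 and masks 0x1 and 0x400, the accumulated barrier mask is 0x401, whereas assignment would leave only 0x400.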
[Mesa-dev] [PATCH 0/6] panfrost: Misc. fixes
This series includes misc. fixes to improve robustness for more complex apps. Alyssa Rosenzweig (6): panfrost: Identify and decode mfbd_flags panfrost: Disable the tiler for clear-only jobs panfrost: Improve viewport (clipping) robustness panfrost: Flush scanout too panfrost: Remove forced flush on clears panfrost: Handle missing texture case .../drivers/panfrost/include/panfrost-job.h | 4 +- src/gallium/drivers/panfrost/pan_context.c| 161 -- src/gallium/drivers/panfrost/pan_context.h| 11 +- src/gallium/drivers/panfrost/pan_drm.c| 2 +- src/gallium/drivers/panfrost/pan_fragment.c | 6 +- src/gallium/drivers/panfrost/pan_mfbd.c | 16 +- src/gallium/drivers/panfrost/pan_sfbd.c | 4 +- .../drivers/panfrost/pandecode/decode.c | 16 +- 8 files changed, 142 insertions(+), 78 deletions(-) -- 2.20.1
[Mesa-dev] [PATCH 3/6] panfrost: Improve viewport (clipping) robustness
On more complex apps (possibly using desktop GL specific extensions?), our viewport code was getting wacky results for unclear reasons. Let's be a little less wacky. Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/pan_context.c | 41 ++ 1 file changed, 35 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index de8da320b02..bce57625163 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -1287,28 +1287,57 @@ panfrost_emit_for_draw(struct panfrost_context *ctx, bool with_vertex_data) }; /* Always scissor to the viewport by default. */ -view.viewport0[0] = (int) (vp->translate[0] - vp->scale[0]); -view.viewport1[0] = MALI_POSITIVE((int) (vp->translate[0] + vp->scale[0])); +int minx = (int) (vp->translate[0] - vp->scale[0]); +int maxx = (int) (vp->translate[0] + vp->scale[0]); int miny = (int) (vp->translate[1] - vp->scale[1]); int maxy = (int) (vp->translate[1] + vp->scale[1]); -if (ss && ctx->rasterizer && ctx->rasterizer->base.scissor) { -view.viewport0[0] = ss->minx; -view.viewport1[0] = MALI_POSITIVE(ss->maxx); +/* Apply the scissor test */ +if (ss && ctx->rasterizer && ctx->rasterizer->base.scissor) { +minx = ss->minx; +maxx = ss->maxx; miny = ss->miny; maxy = ss->maxy; } /* Hardware needs the min/max to be strictly ordered, so flip if we - * need to */ + * need to. 
The viewport transformation in the vertex shader will + * handle the negatives if we don't */ + if (miny > maxy) { int temp = miny; miny = maxy; maxy = temp; } +if (minx > maxx) { +int temp = minx; +minx = maxx; +maxx = temp; +} + +/* Clamp everything positive, just in case */ + +maxx = MAX2(0, maxx); +maxy = MAX2(0, maxy); +minx = MAX2(0, minx); +miny = MAX2(0, miny); + +/* Clamp to the framebuffer size as a last check */ + +minx = MIN2(ctx->pipe_framebuffer.width, minx); +maxx = MIN2(ctx->pipe_framebuffer.width, maxx); + +miny = MIN2(ctx->pipe_framebuffer.height, miny); +maxy = MIN2(ctx->pipe_framebuffer.height, maxy); + +/* Upload */ + +view.viewport0[0] = minx; +view.viewport1[0] = MALI_POSITIVE(maxx); + view.viewport0[1] = miny; view.viewport1[1] = MALI_POSITIVE(maxy); -- 2.20.1
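The ordering and clamping steps in this patch apply the same logic to each axis, which can be sketched as a standalone helper. sketch_clamp_axis and its parameters are hypothetical and only mirror the sequence in the diff: swap if reversed, clamp to non-negative, then clamp to the framebuffer extent. It uses plain int throughout, which is an assumption of the sketch.

```c
#include <assert.h>

/* Order and clamp one axis of a scissor/viewport rectangle in place.
 * extent is the framebuffer size along this axis */
static void
sketch_clamp_axis(int *min, int *max, int extent)
{
        assert(extent >= 0);

        /* The hardware wants min <= max, so flip if needed */
        if (*min > *max) {
                int temp = *min;
                *min = *max;
                *max = temp;
        }

        /* Clamp everything non-negative */
        if (*min < 0)
                *min = 0;
        if (*max < 0)
                *max = 0;

        /* Clamp to the framebuffer size as a last check */
        if (*min > extent)
                *min = extent;
        if (*max > extent)
                *max = extent;
}
```

For example, a reversed range of (50, -10) on a 100-pixel axis becomes (0, 50), and (20, 300) becomes (20, 100).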
[Mesa-dev] [PATCH 2/6] panfrost: Disable the tiler for clear-only jobs
To do so, we route some basic information through to the FBD creation routines (currently just a binary toggle of "has draws?"). Eventually, more refactoring will enable dynamic hierarchy mask selection, but right now we do the most basic. Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/pan_context.c | 60 - src/gallium/drivers/panfrost/pan_context.h | 11 ++-- src/gallium/drivers/panfrost/pan_drm.c | 2 +- src/gallium/drivers/panfrost/pan_fragment.c | 6 +-- src/gallium/drivers/panfrost/pan_mfbd.c | 4 +- src/gallium/drivers/panfrost/pan_sfbd.c | 4 +- 6 files changed, 50 insertions(+), 37 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index fc6b8fbb0a1..de8da320b02 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -112,7 +112,7 @@ panfrost_set_framebuffer_resolution(struct mali_single_framebuffer *fb, int w, i } struct mali_single_framebuffer -panfrost_emit_sfbd(struct panfrost_context *ctx) +panfrost_emit_sfbd(struct panfrost_context *ctx, unsigned vertex_count) { struct mali_single_framebuffer framebuffer = { .unknown2 = 0x1f, @@ -133,27 +133,12 @@ panfrost_emit_sfbd(struct panfrost_context *ctx) } struct bifrost_framebuffer -panfrost_emit_mfbd(struct panfrost_context *ctx) +panfrost_emit_mfbd(struct panfrost_context *ctx, unsigned vertex_count) { unsigned width = ctx->pipe_framebuffer.width; unsigned height = ctx->pipe_framebuffer.height; struct bifrost_framebuffer framebuffer = { -/* The lower 0x1ff controls the hierarchy mask. Set more bits - * on for more tile granularity (which can be a performance win - * on some scenes, at memory bandwidth costs). For now, be lazy - * and enable everything. This might be a terrible idea. 
*/ - -.tiler_hierarchy_mask = 0xff, -.tiler_flags = 0x0, - -/* The hardware deals with suballocation; we don't care */ -.tiler_heap_start = ctx->tiler_heap.gpu, -.tiler_heap_end = ctx->tiler_heap.gpu + ctx->tiler_heap.size, - -/* See pan_tiler.c */ -.tiler_polygon_list = ctx->tiler_polygon_list.gpu, - .width1 = MALI_POSITIVE(width), .height1 = MALI_POSITIVE(height), .width2 = MALI_POSITIVE(width), @@ -170,6 +155,9 @@ panfrost_emit_mfbd(struct panfrost_context *ctx) .scratchpad = ctx->scratchpad.gpu, }; +framebuffer.tiler_hierarchy_mask = +panfrost_choose_hierarchy_mask(width, height, vertex_count); + /* Compute the polygon header size and use that to offset the body */ unsigned header_size = panfrost_tiler_header_size( @@ -181,7 +169,28 @@ panfrost_emit_mfbd(struct panfrost_context *ctx) /* Sanity check */ unsigned total_size = header_size + body_size; -assert(ctx->tiler_polygon_list.size >= total_size); + +if (framebuffer.tiler_hierarchy_mask) { + assert(ctx->tiler_polygon_list.size >= total_size); + +/* Specify allocated tiler structures */ +framebuffer.tiler_polygon_list = ctx->tiler_polygon_list.gpu; + +/* Allow the entire tiler heap */ +framebuffer.tiler_heap_start = ctx->tiler_heap.gpu; +framebuffer.tiler_heap_end = +ctx->tiler_heap.gpu + ctx->tiler_heap.size; +} else { +/* The tiler is disabled, so don't allow the tiler heap */ +framebuffer.tiler_heap_start = ctx->tiler_heap.gpu; +framebuffer.tiler_heap_end = framebuffer.tiler_heap_start; + +/* Use a dummy polygon list */ +framebuffer.tiler_polygon_list = ctx->tiler_dummy.gpu; + +/* Also, set a "tiler disabled?" flag? 
*/ +framebuffer.tiler_hierarchy_mask |= 0x1000; +} framebuffer.tiler_polygon_list_body = framebuffer.tiler_polygon_list + header_size; @@ -189,6 +198,8 @@ panfrost_emit_mfbd(struct panfrost_context *ctx) framebuffer.tiler_polygon_list_size = header_size + body_size; + + return framebuffer; } @@ -307,9 +318,9 @@ panfrost_invalidate_frame(struct panfrost_context *ctx) ctx->cmdstream_i = 0; if (ctx->require_sfbd) -ctx->vt_framebuffer_sfbd = panfrost_emit_sfbd(ctx); +ctx->vt_framebuffer_sfbd = panfrost_emit_sfbd(ctx, ~0); else -ctx->vt_framebuffer_mfbd = panfrost_emit_mfbd(ctx); +ctx->vt_framebuffer_mfbd = panfrost_emit_mfbd(ctx, ~0); /* Reset varyings allocated */ ctx->varying_height = 0; @@ -2145,9 +2156,9
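The enabled/disabled split in panfrost_emit_mfbd() above can be read as a small pure function. The sketch below assumes a trimmed-down stand-in struct (`struct tiler_fields`) and a hypothetical function name; only the 0x1000 bit and the "empty heap plus dummy polygon list" behaviour are taken from the patch, and the patch itself is unsure what that bit means.

```c
#include <assert.h>
#include <stdint.h>

/* Bit the patch ORs in when the tiler is off. The patch comment marks its
 * meaning as a guess ("tiler disabled?"), so treat the name as tentative. */
#define TILER_DISABLED_GUESS 0x1000

/* Hypothetical stand-in for the tiler-related fields of
 * struct bifrost_framebuffer; field widths are assumptions. */
struct tiler_fields {
        uint64_t heap_start;
        uint64_t heap_end;
        uint64_t polygon_list;
        uint16_t hierarchy_mask;
};

static struct tiler_fields
fill_tiler_fields(uint16_t chosen_mask,
                  uint64_t heap_gpu, uint64_t heap_size,
                  uint64_t polygon_list_gpu, uint64_t dummy_gpu)
{
        struct tiler_fields t = { .hierarchy_mask = chosen_mask };

        t.heap_start = heap_gpu;

        if (chosen_mask) {
                /* Tiler enabled: expose the whole heap and the real list */
                t.heap_end = heap_gpu + heap_size;
                t.polygon_list = polygon_list_gpu;
        } else {
                /* Tiler disabled: zero-length heap and a dummy list */
                t.heap_end = t.heap_start;
                t.polygon_list = dummy_gpu;
                t.hierarchy_mask |= TILER_DISABLED_GUESS;
        }

        assert(t.heap_end >= t.heap_start);
        return t;
}
```

A mask of 0 (a clear-only job) therefore yields an empty heap range, which is the point of the patch.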
[Mesa-dev] [PATCH 6/6] panfrost: Handle missing texture case
In some cases, Gallium can give us bad info about the texture count, counting some NULL textures. We pass Gallium's info to the hardware blindly, which can confuse the hardware in edge cases. This patch adjusts accordingly. --- src/gallium/drivers/panfrost/pan_context.c | 47 +- 1 file changed, 29 insertions(+), 18 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index 09bded80296..8dbdc84209d 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -900,23 +900,27 @@ panfrost_upload_sampler_descriptors(struct panfrost_context *ctx) size_t desc_size = sizeof(struct mali_sampler_descriptor); for (int t = 0; t <= PIPE_SHADER_FRAGMENT; ++t) { -if (!ctx->sampler_count[t]) continue; +mali_ptr upload = 0; -size_t transfer_size = desc_size * ctx->sampler_count[t]; +if (ctx->sampler_count[t] && ctx->sampler_view_count[t]) { +size_t transfer_size = desc_size * ctx->sampler_count[t]; -struct panfrost_transfer transfer = -panfrost_allocate_transient(ctx, transfer_size); +struct panfrost_transfer transfer = +panfrost_allocate_transient(ctx, transfer_size); -struct mali_sampler_descriptor *desc = -(struct mali_sampler_descriptor *) transfer.cpu; +struct mali_sampler_descriptor *desc = +(struct mali_sampler_descriptor *) transfer.cpu; -for (int i = 0; i < ctx->sampler_count[t]; ++i) -desc[i] = ctx->samplers[t][i]->hw; +for (int i = 0; i < ctx->sampler_count[t]; ++i) +desc[i] = ctx->samplers[t][i]->hw; + +upload = transfer.gpu; +} if (t == PIPE_SHADER_FRAGMENT) -ctx->payload_tiler.postfix.sampler_descriptor = transfer.gpu; +ctx->payload_tiler.postfix.sampler_descriptor = upload; else if (t == PIPE_SHADER_VERTEX) -ctx->payload_vertex.postfix.sampler_descriptor = transfer.gpu; +ctx->payload_vertex.postfix.sampler_descriptor = upload; else assert(0); } @@ -977,16 +981,17 @@ static void panfrost_upload_texture_descriptors(struct panfrost_context *ctx) { for (int t = 0; t 
<= PIPE_SHADER_FRAGMENT; ++t) { -/* Shortcircuit */ -if (!ctx->sampler_view_count[t]) continue; +mali_ptr trampoline = 0; -uint64_t trampolines[PIPE_MAX_SHADER_SAMPLER_VIEWS]; +if (ctx->sampler_view_count[t]) { +uint64_t trampolines[PIPE_MAX_SHADER_SAMPLER_VIEWS]; -for (int i = 0; i < ctx->sampler_view_count[t]; ++i) -trampolines[i] = -panfrost_upload_tex(ctx, ctx->sampler_views[t][i]); +for (int i = 0; i < ctx->sampler_view_count[t]; ++i) +trampolines[i] = +panfrost_upload_tex(ctx, ctx->sampler_views[t][i]); -mali_ptr trampoline = panfrost_upload_transient(ctx, trampolines, sizeof(uint64_t) * ctx->sampler_view_count[t]); +trampoline = panfrost_upload_transient(ctx, trampolines, sizeof(uint64_t) * ctx->sampler_view_count[t]); +} if (t == PIPE_SHADER_FRAGMENT) ctx->payload_tiler.postfix.texture_trampoline = trampoline; @@ -2128,7 +2133,13 @@ panfrost_set_sampler_views( assert(start_slot == 0); -ctx->sampler_view_count[shader] = num_views; +unsigned new_nr = 0; +for (unsigned i = 0; i < num_views; ++i) { +if (views[i]) +new_nr = i + 1; +} + +ctx->sampler_view_count[shader] = new_nr; memcpy(ctx->sampler_views[shader], views, num_views * sizeof (void *)); ctx->dirty |= PAN_DIRTY_TEXTURES; -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
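The counting loop added to panfrost_set_sampler_views() is easy to check in isolation. A minimal standalone sketch follows, using a hypothetical function name and plain `void *` pointers in place of `struct pipe_sampler_view *`. Note that it trims trailing NULL entries only; a NULL in the middle of the array is still counted.

```c
#include <assert.h>
#include <stddef.h>

/* Returns one past the index of the last non-NULL entry, or 0 if every
 * entry is NULL. Holes before the last bound view are still counted. */
static unsigned
count_bound_views(void *const *views, unsigned num_views)
{
        assert(views != NULL || num_views == 0);

        unsigned new_nr = 0;

        for (unsigned i = 0; i < num_views; ++i) {
                if (views[i])
                        new_nr = i + 1;
        }

        return new_nr;
}
```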
[Mesa-dev] [Bug 110923] Multiple VkSubpassDependency-entries with the same dstSubpass not handled correctly
https://bugs.freedesktop.org/show_bug.cgi?id=110923

Christian Forfang changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |chris.forf...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [PATCH 5/6] panfrost: Remove forced flush on clears
This worked around a bug in old versions of Panfrost. Nowadays, its
presence is, at best, *creating* bugs. Let's whack it.

Signed-off-by: Alyssa Rosenzweig
---
 src/gallium/drivers/panfrost/pan_context.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c
index b6839e71ca1..09bded80296 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -1391,10 +1391,6 @@ panfrost_submit_frame(struct panfrost_context *ctx, bool flush_immediate,
         /* Edge case if screen is cleared and nothing else */
         bool has_draws = ctx->draw_count > 0;
 
-        /* Workaround a bizarre lockup (a hardware errata?) */
-        if (!has_draws)
-                flush_immediate = true;
-
 #ifndef DRY_RUN
 
         bool is_scanout = panfrost_is_scanout(ctx);
-- 
2.20.1
[Mesa-dev] [PATCH 1/6] panfrost: Identify and decode mfbd_flags
Previously known as the unk3 field. Signed-off-by: Alyssa Rosenzweig --- .../drivers/panfrost/include/panfrost-job.h | 4 ++-- src/gallium/drivers/panfrost/pan_mfbd.c | 12 ++-- src/gallium/drivers/panfrost/pandecode/decode.c | 16 ++-- 3 files changed, 22 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/panfrost/include/panfrost-job.h b/src/gallium/drivers/panfrost/include/panfrost-job.h index c7cb2d7b5f4..e320785542b 100644 --- a/src/gallium/drivers/panfrost/include/panfrost-job.h +++ b/src/gallium/drivers/panfrost/include/panfrost-job.h @@ -1493,7 +1493,7 @@ struct bifrost_fb_extra { u64 zero3, zero4; } __attribute__((packed)); -/* flags for unk3 */ +/* Flags for mfbd_flags */ /* Enables writing depth results back to main memory (rather than keeping them * on-chip in the tile buffer and then discarding) */ @@ -1524,7 +1524,7 @@ struct bifrost_framebuffer { u32 zero4 : 5; /* 0x30 */ u32 clear_stencil : 8; -u32 unk3 : 24; // = 0x100 +u32 mfbd_flags : 24; // = 0x100 float clear_depth; diff --git a/src/gallium/drivers/panfrost/pan_mfbd.c b/src/gallium/drivers/panfrost/pan_mfbd.c index 78d676511d6..8f1ae32fa40 100644 --- a/src/gallium/drivers/panfrost/pan_mfbd.c +++ b/src/gallium/drivers/panfrost/pan_mfbd.c @@ -124,7 +124,7 @@ panfrost_mfbd_set_zsbuf( struct panfrost_resource *rsrc = pan_resource(surf->texture); if (rsrc->bo->layout == PAN_AFBC) { -fb->unk3 |= MALI_MFBD_EXTRA; +fb->mfbd_flags |= MALI_MFBD_EXTRA; fbx->flags = MALI_EXTRA_PRESENT | @@ -141,7 +141,7 @@ panfrost_mfbd_set_zsbuf( fbx->ds_afbc.zero1 = 0x10009; fbx->ds_afbc.padding = 0x1000; } else if (rsrc->bo->layout == PAN_LINEAR) { -fb->unk3 |= MALI_MFBD_EXTRA; +fb->mfbd_flags |= MALI_MFBD_EXTRA; fbx->flags |= MALI_EXTRA_PRESENT | MALI_EXTRA_ZS | 0x1; fbx->ds_linear.depth = rsrc->bo->gpu; @@ -171,7 +171,7 @@ panfrost_mfbd_upload( off_t offset = 0; /* There may be extra data stuck in the middle */ -bool has_extra = fb->unk3 & MALI_MFBD_EXTRA; +bool has_extra = fb->mfbd_flags & MALI_MFBD_EXTRA; 
/* Compute total size for transfer */ @@ -213,7 +213,7 @@ panfrost_mfbd_fragment(struct panfrost_context *ctx) /* XXX: MRT case */ fb.rt_count_2 = 1; -fb.unk3 = 0x100; +fb.mfbd_flags = 0x100; /* TODO: MRT clear */ panfrost_mfbd_clear(job, &fb, &fbx, &rts[0]); @@ -263,13 +263,13 @@ panfrost_mfbd_fragment(struct panfrost_context *ctx) } if (job->requirements & PAN_REQ_DEPTH_WRITE) -fb.unk3 |= MALI_MFBD_DEPTH_WRITE; +fb.mfbd_flags |= MALI_MFBD_DEPTH_WRITE; if (ctx->pipe_framebuffer.nr_cbufs == 1) { struct panfrost_resource *rsrc = (struct panfrost_resource *) ctx->pipe_framebuffer.cbufs[0]->texture; if (rsrc->bo->has_checksum) { -fb.unk3 |= MALI_MFBD_EXTRA; +fb.mfbd_flags |= MALI_MFBD_EXTRA; fbx.flags |= MALI_EXTRA_PRESENT; fbx.checksum_stride = rsrc->bo->checksum_stride; fbx.checksum = rsrc->bo->gpu + rsrc->bo->slices[0].stride * rsrc->base.height0; diff --git a/src/gallium/drivers/panfrost/pandecode/decode.c b/src/gallium/drivers/panfrost/pandecode/decode.c index 46cdef313b5..fdb820a37f4 100644 --- a/src/gallium/drivers/panfrost/pandecode/decode.c +++ b/src/gallium/drivers/panfrost/pandecode/decode.c @@ -230,6 +230,15 @@ static const struct pandecode_flag_info shader_unknown1_flag_info [] = { }; #undef FLAG_INFO +#define FLAG_INFO(flag) { MALI_MFBD_##flag, "MALI_MFBD_" #flag } +static const struct pandecode_flag_info mfbd_flag_info [] = { +FLAG_INFO(DEPTH_WRITE), +FLAG_INFO(EXTRA), +{} +}; +#undef FLAG_INFO + + extern char *replace_fragment; extern char *replace_vertex; @@ -659,7 +668,10 @@ pandecode_replay_mfbd_bfr(uint64_t gpu_va, int job_no, bool with_render_targets) pandecode_prop("rt_count_1 = MALI_POSITIVE(%d)", fb->rt_count_1 + 1); pandecode_prop("rt_count_2 = %d", fb->rt_count_2); -pandecode_prop("unk3 = 0x%x", fb->unk3); +pandecode_log(".mfbd_flags = "); +pandecode_log_decoded_flags(mfbd_flag_info, fb->mfbd_flags); +pandecode_log_cont(",\n"); + pandecode_prop("clear_stencil = 0x%x", fb->clear_stencil); pandecode_prop("clear_depth = %f", fb->clear_depth); @@ 
-697,7 +709,7 @@ pandecode_replay_mfbd_bfr(uint64_t gpu_va, int job_no, bool with_render_targets) gpu_va += sizeof(struct bifrost_framebuffer); -if ((fb->unk3 & MALI_MFBD_EXTRA) && with_render_targets) { +if (
[Mesa-dev] [PATCH 4/6] panfrost: Flush scanout too
In a poorly coded app, the framebuffer can be partially drawn, an FBO
switched to, the framebuffer switched back to and drawn into again, and
so on. Reordering would fix this, but for now we just need to be careful
to flush scanout too.

Signed-off-by: Alyssa Rosenzweig
---
 src/gallium/drivers/panfrost/pan_context.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c
index bce57625163..b6839e71ca1 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -2153,12 +2153,15 @@ panfrost_set_framebuffer_state(struct pipe_context *pctx,
 {
         struct panfrost_context *ctx = pan_context(pctx);
 
-        /* Flush when switching away from an FBO, but not if the framebuffer
+        /* Flush when switching framebuffers, but not if the framebuffer
          * state is being restored by u_blitter */
 
-        if (!panfrost_is_scanout(ctx) && !ctx->blitter->running) {
-                panfrost_flush(pctx, NULL, 0);
+        bool is_scanout = panfrost_is_scanout(ctx);
+        bool has_draws = ctx->draw_count > 0;
+
+        if (!ctx->blitter->running && (!is_scanout || has_draws)) {
+                panfrost_flush(pctx, NULL, PIPE_FLUSH_END_OF_FRAME);
         }
 
         ctx->pipe_framebuffer.nr_cbufs = fb->nr_cbufs;
-- 
2.20.1
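As a rough illustration of the changed condition, the old and new flush predicates can be written as pure functions and compared over all inputs. The function names below are hypothetical; the boolean expressions are copied from the removed and added lines of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* New rule from the patch: flush unless u_blitter is restoring state,
 * and for scanout only if something has actually been drawn. */
static bool
should_flush_on_fb_switch(bool blitter_running, bool is_scanout, bool has_draws)
{
        return !blitter_running && (!is_scanout || has_draws);
}

/* Old rule, for comparison: never flush scanout. */
static bool
old_should_flush(bool blitter_running, bool is_scanout)
{
        return !is_scanout && !blitter_running;
}

/* The new rule only adds flushes: wherever the old rule flushed, the
 * new one does too. */
static void
check_new_rule_is_superset(void)
{
        for (int i = 0; i < 8; ++i) {
                bool blit = i & 1, scanout = i & 2, draws = i & 4;

                if (old_should_flush(blit, scanout))
                        assert(should_flush_on_fb_switch(blit, scanout, draws));
        }
}
```

The only newly flushed case is scanout with pending draws, which matches the commit message.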
Re: [Mesa-dev] [PATCH 1/2] util: Add util_is_power_of_two_minus_one
> Except it would have to be util_is_power_of_two_or_zero because
> util_is_power_of_two(0x + 1) is false. :)

Corner cases, corner cases!

> Is there actually a 2/2 for this? We usually wouldn't land something
> like this without a caller.

The 1/2 on there was accidental, sorry. The use is in
https://lists.freedesktop.org/archives/mesa-dev/2019-June/220201.html,
but I figure you didn't need to get neck deep into Mali :)

It's just for an assert(), so I can get rid of it if this little patch
is too much to bikeshed ;) Just thought I'd try.

> This will return true for v == 0. Is that the desired behavior? I
> mean, that is 2**0 - 1, but it is not "all ones in binary." I think the
> result may surprise people wanting to use this to detect a mask. This
> is also how we ended up with util_is_power_of_two_nonzero and
> util_is_power_of_two_or_zero.

Hm. I'm not sure that *is* the desired behaviour. In my case, (v == 0)
is special-cased anyway (see linked patch). Maybe not landing this is
the easier way to avoid the can of worms.
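To make the v == 0 question concrete, here is a minimal sketch of the two possible behaviours. It is not the code from the patch under review (which is not quoted in this thread); the function names are made up for illustration, and the bit trick `(v & (v + 1)) == 0` is one common way to test for "2^n - 1".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when v is 2^n - 1 for some n >= 0, which includes v == 0. */
static bool
is_pot_minus_one_or_zero(uint32_t v)
{
        return (v & (uint32_t)(v + 1)) == 0;
}

/* Stricter variant: v must be a non-empty run of low one bits, which is
 * what a caller checking for a mask probably expects. */
static bool
is_low_bit_mask_nonzero(uint32_t v)
{
        return v != 0 && is_pot_minus_one_or_zero(v);
}

/* Example caller in the spirit of "it's just for an assert()". */
static unsigned
mask_width(uint32_t mask)
{
        assert(is_low_bit_mask_nonzero(mask));

        unsigned n = 0;

        while (mask) {
                n++;
                mask >>= 1;
        }

        return n;
}
```

The unsigned wraparound in `v + 1` means an all-ones 32-bit value is accepted by both variants, while 0 is accepted only by the first.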
Re: [Mesa-dev] [PATCH 1/2] util: Add util_is_power_of_two_minus_one
Honestly, maybe I should just retract the patch. What's the lingo for that? :P
[Mesa-dev] [PATCH] amd: update addrlib
From: Marek Olšák --- Please test and fix RADV if needed. Compressed textures may be broken. This change contains the necessary radeonsi fixes for this addrlib. src/amd/addrlib/inc/addrinterface.h | 12 +- src/amd/addrlib/inc/addrtypes.h | 36 +- src/amd/addrlib/src/addrinterface.cpp | 2 +- src/amd/addrlib/src/amdgpu_asic_addr.h | 2 +- src/amd/addrlib/src/chip/gfx9/gfx9_gb_reg.h | 2 +- src/amd/addrlib/src/chip/r800/si_gb_reg.h | 2 +- src/amd/addrlib/src/core/addrcommon.h | 21 +- src/amd/addrlib/src/core/addrelemlib.cpp| 2 +- src/amd/addrlib/src/core/addrelemlib.h | 2 +- src/amd/addrlib/src/core/addrlib.cpp| 3 +- src/amd/addrlib/src/core/addrlib.h | 3 +- src/amd/addrlib/src/core/addrlib1.cpp | 2 +- src/amd/addrlib/src/core/addrlib1.h | 2 +- src/amd/addrlib/src/core/addrlib2.cpp | 48 +- src/amd/addrlib/src/core/addrlib2.h | 18 +- src/amd/addrlib/src/core/addrobject.cpp | 2 +- src/amd/addrlib/src/core/addrobject.h | 2 +- src/amd/addrlib/src/core/coord.cpp | 2 +- src/amd/addrlib/src/core/coord.h| 2 +- src/amd/addrlib/src/gfx9/gfx9addrlib.cpp| 898 +++- src/amd/addrlib/src/gfx9/gfx9addrlib.h | 30 +- src/amd/addrlib/src/r800/ciaddrlib.cpp | 16 +- src/amd/addrlib/src/r800/ciaddrlib.h| 7 +- src/amd/addrlib/src/r800/egbaddrlib.cpp | 4 +- src/amd/addrlib/src/r800/egbaddrlib.h | 2 +- src/amd/addrlib/src/r800/siaddrlib.cpp | 7 +- src/amd/addrlib/src/r800/siaddrlib.h| 2 +- src/gallium/drivers/radeonsi/si_texture.c | 24 + 28 files changed, 665 insertions(+), 490 deletions(-) diff --git a/src/amd/addrlib/inc/addrinterface.h b/src/amd/addrlib/inc/addrinterface.h index 1a2690970be..8e8f36378b3 100644 --- a/src/amd/addrlib/inc/addrinterface.h +++ b/src/amd/addrlib/inc/addrinterface.h @@ -1,5 +1,5 @@ /* - * Copyright © 2007-2018 Advanced Micro Devices, Inc. + * Copyright © 2007-2019 Advanced Micro Devices, Inc. * All Rights Reserved. 
* * Permission is hereby granted, free of charge, to any person obtaining @@ -307,7 +307,8 @@ typedef union _ADDR_CREATE_FLAGS UINT_32 checkLast2DLevel : 1;///< Check the last 2D mip sub level UINT_32 useHtileSliceAlign : 1;///< Do htile single slice alignment UINT_32 allowLargeThickTile: 1;///< Allow 64*thickness*bytesPerPixel > rowSize -UINT_32 reserved : 25; ///< Reserved bits for future use +UINT_32 forceDccAndTcCompat: 1;///< Force enable DCC and TC compatibility +UINT_32 reserved : 24; ///< Reserved bits for future use }; UINT_32 value; @@ -2879,6 +2880,9 @@ typedef struct _ADDR2_COMPUTE_CMASKINFO_INPUT UINT_32 unalignedWidth; ///< Color surface original width UINT_32 unalignedHeight;///< Color surface original height UINT_32 numSlices; ///< Number of slices of color buffer +UINT_32 numMipLevels; ///< Number of mip levels +UINT_32 firstMipIdInTail; ///< The id of first mip in tail, if no mip is in tail, +/// it should be number of mip levels } ADDR2_COMPUTE_CMASK_INFO_INPUT; /** @@ -2904,7 +2908,9 @@ typedef struct _ADDR2_COMPUTE_CMASK_INFO_OUTPUT UINT_32metaBlkWidth; ///< Meta block width UINT_32metaBlkHeight; ///< Meta block height -UINT_32metaBlkNumPerSlice; ///< Number of metablock within one slice +UINT_32metaBlkNumPerSlice; ///< Number of metablock within one slice + +ADDR2_META_MIP_INFO* pMipInfo; ///< CMASK mip information } ADDR2_COMPUTE_CMASK_INFO_OUTPUT; /** diff --git a/src/amd/addrlib/inc/addrtypes.h b/src/amd/addrlib/inc/addrtypes.h index c9393579b7e..36e342f3176 100644 --- a/src/amd/addrlib/inc/addrtypes.h +++ b/src/amd/addrlib/inc/addrtypes.h @@ -1,5 +1,5 @@ /* - * Copyright © 2007-2018 Advanced Micro Devices, Inc. + * Copyright © 2007-2019 Advanced Micro Devices, Inc. * All Rights Reserved. 
* * Permission is hereby granted, free of charge, to any person obtaining @@ -567,23 +567,23 @@ typedef enum _AddrHtileBlockSize */ typedef enum _AddrPipeCfg { -ADDR_PIPECFG_INVALID = 0, -ADDR_PIPECFG_P2 = 1, /// 2 pipes, -ADDR_PIPECFG_P4_8x16 = 5, /// 4 pipes, -ADDR_PIPECFG_P4_16x16= 6, -ADDR_PIPECFG_P4_16x32= 7, -ADDR_PIPECFG_P4_32x32= 8, -ADDR_PIPECFG_P8_16x16_8x16 = 9, /// 8 pipes -ADDR_PIPECFG_P8_16x32_8x16 = 10, -ADDR_PIPECFG_P8_32x32_8x16 = 11, -ADDR_PIPECFG_P8_16x32_16x16 = 12, -ADDR_PIPECFG_P8_32x32_16x16 = 13, -ADDR_PIPECFG_P8_32x32_16x32 = 14, -ADDR_PIPECFG_P8_32x64_32x32 = 15, -ADDR_PIPECFG_P16_32x32_8x16 = 17, /// 16 pipes -ADDR_PIPECFG_P16
[Mesa-dev] [PATCH] radeonsi: reduce MAX_GEOMETRY_OUTPUT_VERTICES
From: Nicolai Hähnle This fixes piglit spec@glsl-1.50@gs-max-output. --- src/gallium/drivers/radeonsi/si_get.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_get.c b/src/gallium/drivers/radeonsi/si_get.c index c1bddca1a66..9496817ac84 100644 --- a/src/gallium/drivers/radeonsi/si_get.c +++ b/src/gallium/drivers/radeonsi/si_get.c @@ -256,21 +256,23 @@ static int si_get_param(struct pipe_screen *pscreen, enum pipe_cap param) return sscreen->info.chip_class <= GFX8 ? PIPE_QUIRK_TEXTURE_BORDER_COLOR_SWIZZLE_R600 : 0; /* Stream output. */ case PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS: case PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS: return 32*4; /* Geometry shader output. */ case PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES: - return 1024; + /* gfx8 and earlier can do more, but nobody uses it because it +* would be a bad idea for performance. */ + return 256; case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS: return 4095; case PIPE_CAP_MAX_GS_INVOCATIONS: /* The closed driver exposes 127, but 125 is the greatest * number that works. */ return 125; case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE: return 2048; -- 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] ac: update llvm.amdgcn.icmp intrinsic name for LLVM 9+
Reviewed-by: Marek Olšák Marek On Fri, Jun 14, 2019 at 5:57 AM Samuel Pitoiset wrote: > LLVM r363339 changed llvm.amdgcn.icmp.i* to llvm.amdgcn.icmp.i64.i*. > > Signed-off-by: Samuel Pitoiset > --- > src/amd/common/ac_llvm_build.c | 7 --- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/src/amd/common/ac_llvm_build.c > b/src/amd/common/ac_llvm_build.c > index 88e89d1dfb4..b93fdde023e 100644 > --- a/src/amd/common/ac_llvm_build.c > +++ b/src/amd/common/ac_llvm_build.c > @@ -441,6 +441,7 @@ LLVMValueRef > ac_build_ballot(struct ac_llvm_context *ctx, > LLVMValueRef value) > { > + const char *name = HAVE_LLVM >= 0x900 ? "llvm.amdgcn.icmp.i64.i32" > : "llvm.amdgcn.icmp.i32"; > LLVMValueRef args[3] = { > value, > ctx->i32_0, > @@ -454,8 +455,7 @@ ac_build_ballot(struct ac_llvm_context *ctx, > > args[0] = ac_to_integer(ctx, args[0]); > > - return ac_build_intrinsic(ctx, > - "llvm.amdgcn.icmp.i32", > + return ac_build_intrinsic(ctx, name, > ctx->i64, args, 3, > AC_FUNC_ATTR_NOUNWIND | > AC_FUNC_ATTR_READNONE | > @@ -465,6 +465,7 @@ ac_build_ballot(struct ac_llvm_context *ctx, > LLVMValueRef ac_get_i1_sgpr_mask(struct ac_llvm_context *ctx, > LLVMValueRef value) > { > + const char *name = HAVE_LLVM >= 0x900 ? "llvm.amdgcn.icmp.i64.i1" > : "llvm.amdgcn.icmp.i1"; > LLVMValueRef args[3] = { > value, > ctx->i1false, > @@ -472,7 +473,7 @@ LLVMValueRef ac_get_i1_sgpr_mask(struct > ac_llvm_context *ctx, > }; > > assert(HAVE_LLVM >= 0x0800); > - return ac_build_intrinsic(ctx, "llvm.amdgcn.icmp.i1", ctx->i64, > args, 3, > + return ac_build_intrinsic(ctx, name, ctx->i64, args, 3, > AC_FUNC_ATTR_NOUNWIND | > AC_FUNC_ATTR_READNONE | > AC_FUNC_ATTR_CONVERGENT); > -- > 2.22.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 7/8] panfrost/midgard: Adjust swizzles for 2D arrays
Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/midgard/midgard_compile.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/gallium/drivers/panfrost/midgard/midgard_compile.c b/src/gallium/drivers/panfrost/midgard/midgard_compile.c index 7d7bda6ee12..6dbfd4ade6e 100644 --- a/src/gallium/drivers/panfrost/midgard/midgard_compile.c +++ b/src/gallium/drivers/panfrost/midgard/midgard_compile.c @@ -86,6 +86,7 @@ midgard_block_add_successor(midgard_block *block, midgard_block *successor) #define SWIZZLE_XYXX SWIZZLE(COMPONENT_X, COMPONENT_Y, COMPONENT_X, COMPONENT_X) #define SWIZZLE_XYZX SWIZZLE(COMPONENT_X, COMPONENT_Y, COMPONENT_Z, COMPONENT_X) #define SWIZZLE_XYZW SWIZZLE(COMPONENT_X, COMPONENT_Y, COMPONENT_Z, COMPONENT_W) +#define SWIZZLE_XYXZ SWIZZLE(COMPONENT_X, COMPONENT_Y, COMPONENT_X, COMPONENT_Z) #define SWIZZLE_ SWIZZLE(COMPONENT_W, COMPONENT_W, COMPONENT_W, COMPONENT_W) static inline unsigned @@ -1405,6 +1406,15 @@ emit_tex(compiler_context *ctx, nir_tex_instr *instr) ins.alu.mask = expand_writemask(mask_of(nr_comp)); emit_mir_instruction(ctx, ins); +/* To the hardware, z is depth, w is array + * layer. To NIR, z is array layer for a 2D + * array */ + +bool has_array = instr->texture_array_size > 0; +bool is_2d = instr->sampler_dim == GLSL_SAMPLER_DIM_2D; + +if (is_2d && has_array) +position_swizzle = SWIZZLE_XYXZ; } break; -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/8] panfrost: 2D array and 3D textures
Exactly what it says on the tin. Decode them and implement them.

Alyssa Rosenzweig (8):
  panfrost/midgard: Add swizzle_of/mask_of helpers
  panfrost/midgard: Fix 3D texture masks/swizzles
  panfrost: Specify 3D in texture descriptor
  panfrost: Implement 3D texture resource management
  panfrost: Decode array textures
  panfrost: Set array_size to permit array textures
  panfrost/midgard: Adjust swizzles for 2D arrays
  panfrost: Resource management for linear 2D texture arrays

 .../drivers/panfrost/include/panfrost-job.h |  6 +-
 .../panfrost/midgard/midgard_compile.c      | 57 ---
 src/gallium/drivers/panfrost/pan_context.c  | 13 -
 src/gallium/drivers/panfrost/pan_resource.c | 48 ++--
 .../drivers/panfrost/pandecode/decode.c     |  6 +-
 5 files changed, 111 insertions(+), 19 deletions(-)

-- 
2.20.1
[Mesa-dev] [PATCH 6/8] panfrost: Set array_size to permit array textures
Signed-off-by: Alyssa Rosenzweig
---
 src/gallium/drivers/panfrost/pan_context.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c
index 4541b84754c..ec0e4ef7876 100644
--- a/src/gallium/drivers/panfrost/pan_context.c
+++ b/src/gallium/drivers/panfrost/pan_context.c
@@ -2089,10 +2089,21 @@ panfrost_create_sampler_view(
                 }
         }
 
+        /* In the hardware, array_size refers specifically to array textures,
+         * whereas in Gallium, it also covers cubemaps */
+
+        unsigned array_size = texture->array_size;
+
+        if (texture->target == PIPE_TEXTURE_CUBE) {
+                /* TODO: Cubemap arrays */
+                assert(array_size == 6);
+        }
+
         struct mali_texture_descriptor texture_descriptor = {
                 .width = MALI_POSITIVE(texture->width0),
                 .height = MALI_POSITIVE(texture->height0),
                 .depth = MALI_POSITIVE(texture->depth0),
+                .array_size = MALI_POSITIVE(array_size),
 
                 /* TODO: Decode */
                 .format = {
-- 
2.20.1
[Mesa-dev] [PATCH 8/8] panfrost: Resource management for linear 2D texture arrays
Signed-off-by: Alyssa Rosenzweig
---
 src/gallium/drivers/panfrost/pan_resource.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/panfrost/pan_resource.c b/src/gallium/drivers/panfrost/pan_resource.c
index 7c0d54a1f9f..81a74735592 100644
--- a/src/gallium/drivers/panfrost/pan_resource.c
+++ b/src/gallium/drivers/panfrost/pan_resource.c
@@ -283,7 +283,7 @@ panfrost_create_bo(struct panfrost_screen *screen, const struct pipe_resource *t
         /* Tiling textures is almost always faster, unless we only use it once */
 
         bool is_texture = (template->bind & PIPE_BIND_SAMPLER_VIEW);
-        bool is_2d = template->depth0 == 1;
+        bool is_2d = template->depth0 == 1 && template->array_size == 1;
         bool is_streaming = (template->usage != PIPE_USAGE_STREAM);
 
         bool should_tile = is_streaming && is_texture && is_2d;
@@ -326,6 +326,7 @@ panfrost_resource_create(struct pipe_screen *screen,
                 case PIPE_TEXTURE_3D:
                 case PIPE_TEXTURE_CUBE:
                 case PIPE_TEXTURE_RECT:
+                case PIPE_TEXTURE_2D_ARRAY:
                         break;
                 default:
                         DBG("Unknown texture target %d\n", template->target);
-- 
2.20.1
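The tiling decision touched by this patch can be sketched as a standalone predicate. One thing worth noticing when reading the hunk: the variable the driver calls `is_streaming` is true when the usage is *not* PIPE_USAGE_STREAM, so the sketch below uses a differently named parameter to avoid that confusion. The function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Tile only sampled, non-streaming, plain 2D resources. After this patch
 * a 2D array (array_size > 1) stays linear, like a 3D texture does. */
static bool
should_tile(bool is_sampler_view, unsigned depth0, unsigned array_size,
            bool usage_is_stream)
{
        assert(depth0 >= 1 && array_size >= 1);

        bool is_2d = depth0 == 1 && array_size == 1;

        return !usage_is_stream && is_sampler_view && is_2d;
}
```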
[Mesa-dev] [PATCH 5/8] panfrost: Decode array textures
Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/include/panfrost-job.h | 3 +-- src/gallium/drivers/panfrost/pandecode/decode.c | 6 -- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/src/gallium/drivers/panfrost/include/panfrost-job.h b/src/gallium/drivers/panfrost/include/panfrost-job.h index d8a6decdc60..60f9c8d6227 100644 --- a/src/gallium/drivers/panfrost/include/panfrost-job.h +++ b/src/gallium/drivers/panfrost/include/panfrost-job.h @@ -1138,8 +1138,7 @@ struct mali_texture_descriptor { uint16_t width; uint16_t height; uint16_t depth; - -uint16_t unknown1; +uint16_t array_size; struct mali_texture_format format; diff --git a/src/gallium/drivers/panfrost/pandecode/decode.c b/src/gallium/drivers/panfrost/pandecode/decode.c index fdb820a37f4..b7a2e8cdc93 100644 --- a/src/gallium/drivers/panfrost/pandecode/decode.c +++ b/src/gallium/drivers/panfrost/pandecode/decode.c @@ -1507,8 +1507,7 @@ pandecode_replay_vertex_tiler_postfix_pre(const struct mali_vertex_tiler_postfix pandecode_prop("width = MALI_POSITIVE(%" PRId16 ")", t->width + 1); pandecode_prop("height = MALI_POSITIVE(%" PRId16 ")", t->height + 1); pandecode_prop("depth = MALI_POSITIVE(%" PRId16 ")", t->depth + 1); - -pandecode_prop("unknown1 = %" PRId16, t->unknown1); +pandecode_prop("array_size = MALI_POSITIVE(%" PRId16 ")", t->array_size + 1); pandecode_prop("unknown3 = %" PRId16, t->unknown3); pandecode_prop("unknown3A = %" PRId8, t->unknown3A); pandecode_prop("nr_mipmap_levels = %" PRId8, t->nr_mipmap_levels); @@ -1558,6 +1557,9 @@ pandecode_replay_vertex_tiler_postfix_pre(const struct mali_vertex_tiler_postfix if (!f.is_not_cubemap) bitmap_count *= 6; +/* Array of textures */ +bitmap_count *= MALI_NEGATIVE(t->array_size); + /* Stride for each element */ if (manual_stride) bitmap_count *= 2; -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
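A small sketch of how the decoded array_size feeds into the bitmap count may help here. The macro definitions below are assumptions, chosen to be consistent with the decoder printing `t->width + 1` inside MALI_POSITIVE(...); the starting count is passed in as a parameter because the hunk does not show how bitmap_count is initialised. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed convention: hardware fields store (value - 1). */
#define MALI_POSITIVE(dim) ((dim) - 1)
#define MALI_NEGATIVE(dim) ((dim) + 1)

/* Multiply the base count by the cube faces, the array layers and the
 * optional per-element stride slot, in the order used by the decoder. */
static unsigned
count_bitmaps(unsigned base_count, bool is_cubemap,
              uint16_t array_size_field, bool manual_stride)
{
        assert(base_count > 0);

        unsigned bitmap_count = base_count;

        if (is_cubemap)
                bitmap_count *= 6;

        /* Array of textures: the field holds (layers - 1) */
        bitmap_count *= MALI_NEGATIVE(array_size_field);

        /* Stride for each element */
        if (manual_stride)
                bitmap_count *= 2;

        return bitmap_count;
}
```

For a non-array texture the field is 0, so the multiplication by MALI_NEGATIVE(0) == 1 leaves the count unchanged.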
[Mesa-dev] [PATCH 2/8] panfrost/midgard: Fix 3D texture masks/swizzles
Signed-off-by: Alyssa Rosenzweig --- .../drivers/panfrost/midgard/midgard_compile.c| 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/panfrost/midgard/midgard_compile.c b/src/gallium/drivers/panfrost/midgard/midgard_compile.c index d121fdee506..7d7bda6ee12 100644 --- a/src/gallium/drivers/panfrost/midgard/midgard_compile.c +++ b/src/gallium/drivers/panfrost/midgard/midgard_compile.c @@ -1370,9 +1370,12 @@ emit_tex(compiler_context *ctx, nir_tex_instr *instr) int texture_index = instr->texture_index; int sampler_index = texture_index; +unsigned position_swizzle = 0; + for (unsigned i = 0; i < instr->num_srcs; ++i) { int reg = SSA_FIXED_REGISTER(REGISTER_TEXTURE_BASE + in_reg); int index = nir_src_index(ctx, &instr->src[i].src); +int nr_comp = nir_src_num_components(instr->src[i].src); midgard_vector_alu_src alu_src = blank_alu_src; switch (instr->src[i].src_type) { @@ -1394,12 +1397,14 @@ emit_tex(compiler_context *ctx, nir_tex_instr *instr) st.load_store.swizzle = alu_src.swizzle; emit_mir_instruction(ctx, st); +position_swizzle = swizzle_of(2); } else { -alu_src.swizzle = SWIZZLE(COMPONENT_X, COMPONENT_Y, COMPONENT_X, COMPONENT_X); +position_swizzle = alu_src.swizzle = swizzle_of(nr_comp); midgard_instruction ins = v_fmov(index, alu_src, reg); -ins.alu.mask = expand_writemask(0x3); /* xy */ +ins.alu.mask = expand_writemask(mask_of(nr_comp)); emit_mir_instruction(ctx, ins); + } break; @@ -1438,7 +1443,7 @@ emit_tex(compiler_context *ctx, nir_tex_instr *instr) /* TODO: half */ .in_reg_full = 1, -.in_reg_swizzle = SWIZZLE_XYXX, +.in_reg_swizzle = position_swizzle, .out_full = 1, /* Always 1 */ -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
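The mask_of() helper used above is defined in patch 1/8, which is not shown here. A plausible sketch, consistent with this patch replacing the hard-coded 0x3 ("xy") mask for two components, is below; the name `mask_of_sketch` is deliberately different from the real helper, and the body is an assumption.

```c
#include <assert.h>

/* Hypothetical stand-in for mask_of(): a write mask with the low
 * nr_comp component bits set (x, then y, z, w). */
static unsigned
mask_of_sketch(unsigned nr_comp)
{
        assert(nr_comp >= 1 && nr_comp <= 4);

        return (1u << nr_comp) - 1;
}
```

Under that assumption a 3D coordinate (three components) gets mask 0x7, so the z component is no longer dropped by the move.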
[Mesa-dev] [PATCH 4/8] panfrost: Implement 3D texture resource management
Passes dEQP-GLES3.functional.texture.format.unsized.*3d* Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/pan_resource.c | 47 ++--- 1 file changed, 42 insertions(+), 5 deletions(-) diff --git a/src/gallium/drivers/panfrost/pan_resource.c b/src/gallium/drivers/panfrost/pan_resource.c index bcde38ae8b4..7c0d54a1f9f 100644 --- a/src/gallium/drivers/panfrost/pan_resource.c +++ b/src/gallium/drivers/panfrost/pan_resource.c @@ -181,8 +181,11 @@ panfrost_setup_slices(const struct pipe_resource *tmpl, struct panfrost_bo *bo) { unsigned width = tmpl->width0; unsigned height = tmpl->height0; +unsigned depth = tmpl->depth0; unsigned bytes_per_pixel = util_format_get_blocksize(tmpl->format); +assert(depth > 0); + /* Tiled operates blockwise; linear is packed. Also, anything * we render to has to be tile-aligned. Maybe not strictly * necessary, but we're not *that* pressed for memory and it @@ -193,17 +196,25 @@ panfrost_setup_slices(const struct pipe_resource *tmpl, struct panfrost_bo *bo) bool tiled = bo->layout == PAN_TILED; bool should_align = renderable || tiled; +/* We don't know how to specify a 2D stride for 3D textures */ + +bool should_align_stride = +tmpl->target != PIPE_TEXTURE_3D; + unsigned offset = 0; +unsigned size_2d = 0; for (unsigned l = 0; l <= tmpl->last_level; ++l) { struct panfrost_slice *slice = &bo->slices[l]; unsigned effective_width = width; unsigned effective_height = height; +unsigned effective_depth = depth; if (should_align) { effective_width = ALIGN(effective_width, 16); effective_height = ALIGN(effective_height, 16); +effective_depth = ALIGN(effective_height, 16); } slice->offset = offset; @@ -212,19 +223,40 @@ panfrost_setup_slices(const struct pipe_resource *tmpl, struct panfrost_bo *bo) unsigned stride = bytes_per_pixel * effective_width; /* ..but cache-line align it for performance */ -stride = ALIGN(stride, 64); +if (should_align_stride) +stride = ALIGN(stride, 64); + slice->stride = stride; -offset += slice->stride * 
effective_height; +unsigned slice_one_size = slice->stride * effective_height; +unsigned slice_full_size = slice_one_size * effective_depth; + +/* Report 2D size for 3D texturing */ + +if (l == 0) +size_2d = slice_one_size; + +offset += slice_full_size; width = u_minify(width, 1); height = u_minify(height, 1); +depth = u_minify(depth, 1); } assert(tmpl->array_size); -bo->cubemap_stride = ALIGN(offset, 64); -bo->size = ALIGN(bo->cubemap_stride * tmpl->array_size, 4096); +if (tmpl->target != PIPE_TEXTURE_3D) { +/* Arrays and cubemaps have the entire miptree duplicated */ + +bo->cubemap_stride = ALIGN(offset, 64); +bo->size = ALIGN(bo->cubemap_stride * tmpl->array_size, 4096); +} else { +/* 3D strides across the 2D layers */ +assert(tmpl->array_size == 1); + +bo->cubemap_stride = size_2d; +bo->size = ALIGN(offset, 4096); +} } static struct panfrost_bo * @@ -249,7 +281,12 @@ panfrost_create_bo(struct panfrost_screen *screen, const struct pipe_resource *t */ /* Tiling textures is almost always faster, unless we only use it once */ -bool should_tile = (template->usage != PIPE_USAGE_STREAM) && (template->bind & PIPE_BIND_SAMPLER_VIEW); + +bool is_texture = (template->bind & PIPE_BIND_SAMPLER_VIEW); +bool is_2d = template->depth0 == 1; +bool is_streaming = (template->usage != PIPE_USAGE_STREAM); + +bool should_tile = is_streaming && is_texture && is_2d; /* Set the layout appropriately */ bo->layout = should_tile ? PAN_TILED : PAN_LINEAR; -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
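The slice loop in the patch above can be sketched in isolation for the linear 3D case. This is a simplified illustration, not the driver code: it assumes no tile or cache-line alignment (the case the patch describes for 3D textures), and layout_3d, layout_linear_3d, and minify are hypothetical names standing in for the panfrost_bo fields and u_minify.

```c
#include <assert.h>

/* Round x up to a power-of-two alignment a */
#define ALIGN_POT(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Halve a dimension for the next mip level, never going below 1 */
static unsigned
minify(unsigned v)
{
    return v > 1 ? v >> 1 : 1;
}

struct layout_3d {
    unsigned layer_stride; /* bytes between 2D layers of level 0 */
    unsigned size;         /* total allocation, page aligned */
};

/* Walk the mip chain; level_offsets must hold last_level + 1 entries */
static struct layout_3d
layout_linear_3d(unsigned width, unsigned height, unsigned depth,
                 unsigned bytes_per_pixel, unsigned last_level,
                 unsigned *level_offsets)
{
    struct layout_3d out = { 0, 0 };
    unsigned offset = 0;

    assert(depth > 0);

    for (unsigned l = 0; l <= last_level; ++l) {
        unsigned stride = bytes_per_pixel * width;
        unsigned slice_one_size = stride * height;
        unsigned slice_full_size = slice_one_size * depth;

        level_offsets[l] = offset;

        /* Report the 2D size of level 0 for 3D texturing */
        if (l == 0)
            out.layer_stride = slice_one_size;

        offset += slice_full_size;

        width = minify(width);
        height = minify(height);
        depth = minify(depth);
    }

    out.size = ALIGN_POT(offset, 4096);
    return out;
}
```

For a 4x4x4 texture with 4 bytes per pixel and three levels, the levels occupy 256, 32, and 4 bytes, so they start at offsets 0, 256, and 288, and the 292-byte total rounds up to one 4096-byte page.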
[Mesa-dev] [PATCH 3/8] panfrost: Specify 3D in texture descriptor
Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/include/panfrost-job.h | 3 +++ src/gallium/drivers/panfrost/pan_context.c | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/panfrost/include/panfrost-job.h b/src/gallium/drivers/panfrost/include/panfrost-job.h index e320785542b..d8a6decdc60 100644 --- a/src/gallium/drivers/panfrost/include/panfrost-job.h +++ b/src/gallium/drivers/panfrost/include/panfrost-job.h @@ -1119,6 +1119,9 @@ enum mali_wrap_mode { /* Corresponds to the type passed to glTexImage2D and so forth */ +/* For usage1 */ +#define MALI_TEX_3D (0x04) + /* Flags for usage2 */ #define MALI_TEX_MANUAL_STRIDE (0x20) diff --git a/src/gallium/drivers/panfrost/pan_context.c b/src/gallium/drivers/panfrost/pan_context.c index 8dbdc84209d..4541b84754c 100644 --- a/src/gallium/drivers/panfrost/pan_context.c +++ b/src/gallium/drivers/panfrost/pan_context.c @@ -2099,7 +2099,7 @@ panfrost_create_sampler_view( .swizzle = panfrost_translate_swizzle_4(desc->swizzle), .format = format, -.usage1 = 0x0, +.usage1 = (texture->target == PIPE_TEXTURE_3D) ? MALI_TEX_3D : 0, .is_not_cubemap = texture->target != PIPE_TEXTURE_CUBE, .usage2 = usage2_layout -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/8] panfrost/midgard: Add swizzle_of/mask_of helpers
These make manipulating vectors in the Midgard compiler easier. Signed-off-by: Alyssa Rosenzweig --- .../panfrost/midgard/midgard_compile.c| 36 +++ 1 file changed, 30 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/panfrost/midgard/midgard_compile.c b/src/gallium/drivers/panfrost/midgard/midgard_compile.c index 1374c1ee647..d121fdee506 100644 --- a/src/gallium/drivers/panfrost/midgard/midgard_compile.c +++ b/src/gallium/drivers/panfrost/midgard/midgard_compile.c @@ -82,11 +82,35 @@ midgard_block_add_successor(midgard_block *block, midgard_block *successor) * driver seems to do it that way */ #define EMIT(op, ...) emit_mir_instruction(ctx, v_##op(__VA_ARGS__)); -#define SWIZZLE_XYZW SWIZZLE(COMPONENT_X, COMPONENT_Y, COMPONENT_Z, COMPONENT_W) -#define SWIZZLE_XYXX SWIZZLE(COMPONENT_X, COMPONENT_Y, COMPONENT_X, COMPONENT_X) #define SWIZZLE_XXXX SWIZZLE(COMPONENT_X, COMPONENT_X, COMPONENT_X, COMPONENT_X) +#define SWIZZLE_XYXX SWIZZLE(COMPONENT_X, COMPONENT_Y, COMPONENT_X, COMPONENT_X) +#define SWIZZLE_XYZX SWIZZLE(COMPONENT_X, COMPONENT_Y, COMPONENT_Z, COMPONENT_X) +#define SWIZZLE_XYZW SWIZZLE(COMPONENT_X, COMPONENT_Y, COMPONENT_Z, COMPONENT_W) #define SWIZZLE_WWWW SWIZZLE(COMPONENT_W, COMPONENT_W, COMPONENT_W, COMPONENT_W) +static inline unsigned +swizzle_of(unsigned comp) +{ +switch (comp) { +case 1: +return SWIZZLE_XXXX; +case 2: +return SWIZZLE_XYXX; +case 3: +return SWIZZLE_XYZX; +case 4: +return SWIZZLE_XYZW; +default: +unreachable("Invalid component count"); +} +} + +static inline unsigned +mask_of(unsigned nr_comp) +{ +return (1 << nr_comp) - 1; +} + #define M_LOAD_STORE(name, rname, uname) \ static midgard_instruction m_##name(unsigned ssa, unsigned address) { \ midgard_instruction i = { \ @@ -593,7 +617,7 @@ emit_condition_mixed(compiler_context *ctx, nir_alu_src *src, unsigned nr_comp) .outmod = midgard_outmod_int_wrap, .reg_mode = midgard_reg_mode_32, .dest_override = midgard_dest_override_none, -.mask = expand_writemask((1 << nr_comp) - 1), +.mask = expand_writemask(mask_of(nr_comp)), .src1 = vector_alu_srco_unsigned(alu_src), .src2 = vector_alu_srco_unsigned(alu_src) }, @@ -904,7 +928,7 @@ emit_alu(compiler_context *ctx, nir_alu_instr *instr) .outmod = outmod, /* Writemask only valid for non-SSA NIR */ -.mask = expand_writemask((1 << nr_components) - 1), +.mask = expand_writemask(mask_of(nr_components)), .src1 = vector_alu_srco_unsigned(vector_alu_modifiers(nirmods[0], is_int)), .src2 = vector_alu_srco_unsigned(vector_alu_modifiers(nirmods[1], is_int)), @@ -1021,7 +1045,7 @@ emit_varying_read( /* TODO: swizzle, mask */ midgard_instruction ins = m_ld_vary_32(dest, offset); -ins.load_store.mask = (1 << nr_comp) - 1; +ins.load_store.mask = mask_of(nr_comp); ins.load_store.swizzle = SWIZZLE_XYZW >> (2 * component); midgard_varying_parameter p = { @@ -1186,7 +1210,7 @@ emit_intrinsic(compiler_context *ctx, nir_intrinsic_instr *instr) } else if (ctx->stage == MESA_SHADER_VERTEX) { midgard_instruction ins = m_ld_attr_32(reg, offset); ins.load_store.unknown = 0x1E1E; /* XXX: What is this? */ -ins.load_store.mask = (1 << nr_comp) - 1; +ins.load_store.mask = mask_of(nr_comp); emit_mir_instruction(ctx, ins); } else { DBG("Unknown load\n"); -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
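For readers following the helpers in the patch above, here is a standalone sketch of what they compute. It assumes the swizzle packs two bits per lane with lane 0 in the low bits, which the "SWIZZLE_XYZW >> (2 * component)" usage in the patch suggests; the SWIZZLE macro body and the COMPONENT_* values below are that assumption written out, and the loop form of swizzle_of is an equivalent rewrite of the switch rather than the patch's own code.

```c
#include <assert.h>

/* Assumed encoding: two bits per lane, lane 0 in the lowest bits */
enum { COMPONENT_X = 0, COMPONENT_Y = 1, COMPONENT_Z = 2, COMPONENT_W = 3 };

#define SWIZZLE(a, b, c, d) (((d) << 6) | ((c) << 4) | ((b) << 2) | (a))

/* Writemask with the low nr_comp bits set, e.g. 3 components gives xyz */
static unsigned
mask_of(unsigned nr_comp)
{
    assert(nr_comp >= 1 && nr_comp <= 4);
    return (1u << nr_comp) - 1;
}

/* Identity swizzle for the first nr_comp lanes; unused lanes read X */
static unsigned
swizzle_of(unsigned nr_comp)
{
    unsigned swizzle = 0;

    assert(nr_comp >= 1 && nr_comp <= 4);

    for (unsigned lane = 0; lane < 4; ++lane) {
        unsigned comp = lane < nr_comp ? lane : (unsigned) COMPONENT_X;
        swizzle |= comp << (2 * lane);
    }

    return swizzle;
}
```

Padding the unused lanes with X matches the XXXX, XYXX, XYZX, XYZW progression of the switch, so a 3D coordinate gets XYZX where the old 2D-only path hardcoded XYXX.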
[Mesa-dev] [PATCH] panfrost: Don't align to tile either for non-alignable textures
Tag to the 3D texture series. With this it passes 100% of dEQP-GLES3.functional.texture.format.unsized.* I plan to squash this in but don't want to resend the whole series just for this. Signed-off-by: Alyssa Rosenzweig --- src/gallium/drivers/panfrost/pan_resource.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/gallium/drivers/panfrost/pan_resource.c b/src/gallium/drivers/panfrost/pan_resource.c index 81a74735592..e1b8cee74f3 100644 --- a/src/gallium/drivers/panfrost/pan_resource.c +++ b/src/gallium/drivers/panfrost/pan_resource.c @@ -201,6 +201,8 @@ panfrost_setup_slices(const struct pipe_resource *tmpl, struct panfrost_bo *bo) bool should_align_stride = tmpl->target != PIPE_TEXTURE_3D; +should_align &= should_align_stride; + unsigned offset = 0; unsigned size_2d = 0; -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH 3/3] amd: Apply elf relocations and allow code with relocations
On 14.06.2019 08:13, Jan Vesely wrote: On Thu, 2019-06-13 at 21:20 +0200, Dieter Nützel wrote: On 13.06.2019 07:10, Marek Olšák wrote: > FYI, I just pushed the new linker. > > Marek Thank you very much Marek and _Nicolai_ for this GREAT stuff. It brings back some speed after the recent 1/8 drop with glmark2. Maybe my amd-staging-drm-next tree (5.2-rc1) didn't honor the kernel mitigation parameter correctly. @Jan Go ahead with your nice relocation and image work. Send me what you have in the works. The relocation work is no longer needed as the new linker handles things. The corruption is caused either by (still faulty) conversion builtins, or incorrect buffer coherence handling. Both need fixing, but I'm not sure which one is to blame in this case. Latest Mesa git (with Nicolai's new linker) lets all 3 luxmark versions run. Only 'Hotel lobby' (with v3.0 and v3.1) shows some corruption, but it does NOT crash any longer. Numbers for 'Neumann TLM-102 SE' (medium) show ~43000K (!!!). https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.phoronix.com%2Fforums%2Fforum%2Fphoronix%2Flatest-phoronix-articles%2F1106085-linux-kernel-set-to-expose-hidden-nvidia-hda-controllers-helping-laptop-users%3Fp%3D1106199%23post1106199&data=02%7C01%7Cjan.vesely%40cs.rutgers.edu%7Cae4545df023e4910433c08d6f03438a8%7Cb92d2b234d35447093ff69aca6632ffe%7C1%7C0%7C636960504419864592&sdata=xSOotxsWyJDb2J14lNk1NV4bK2nRK3%2FzWoxNyRj6IqU%3D&reserved=0 Blender crashes as expected ;-) /home/dieter> trying to save userpref at /home/dieter/.config/blender/2.79/config/userpref.blend ok Read blend: /data/Blender/barbershop_interior_gpu.blend scripts disabled for "/data/Blender/barbershop_interior_gpu.blend", skipping 'generate_customprops.py' skipping driver 'var', automatic scripts are disabled skipping driver 'var', automatic scripts are disabled skipping driver 'var', automatic scripts are disabled skipping driver 'var', automatic scripts are disabled skipping driver 'var', automatic scripts are disabled skipping driver 'var', automatic scripts are disabled skipping driver 'var', automatic scripts are disabled skipping driver 'var', automatic scripts are disabled skipping driver 'var', automatic scripts are disabled Device init success Compiling OpenCL program split Kernel compilation of split finished in 8.41s. Compiling OpenCL program base Kernel compilation of base finished in 4.55s. Compiling OpenCL program denoising Kernel compilation of denoising finished in 2.08s. blender: ../src/gallium/drivers/radeonsi/si_compute.c:319: si_set_global_binding: Assertion `first + n <= MAX_GLOBAL_BUFFERS' failed. [1] Aborted blender (core dumped) The number of max global buffers was bumped in 06bf56725d to fix a similar crash in luxmark. I guess it needs another bump. Hello Jan, I'm so blind... Bumping it to 48 or 64 (my first try) works; 33 does not ;-) We shouldn't waste too much memory. Now, let's start with the libclc work. Luxmark 'Hotel' is very blocky and Blender 'barbershop_interior_gpu' is mostly black. I have some images. Shouldn't we open a new ticket? Any hints for a good name? Or do we have one already? I can put my pictures there. Simpler scenes work, but come out mostly gray (without colors/texture). Dieter ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] android: virgl: fix libmesa_virgil_common build and dependencies
Fixes the following building errors and resolves Bug 110922 Fixes gallium_dri target missing symbols at linking. external/mesa/src/gallium/winsys/virgl/drm/Android.mk: error: libmesa_winsys_virgl (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64) ... external/mesa/src/gallium/winsys/virgl/vtest/Android.mk: error: libmesa_winsys_virgl_vtest (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64) ... build/core/main.mk:728: error: exiting from previous errors. In file included from external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_socket.c:34: external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10: fatal error: 'virgl_resource_cache.h' file not found ^~~~ 1 error generated. In file included from external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.c:32: external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10: fatal error: 'virgl_resource_cache.h' file not found #include "virgl_resource_cache.h" ^~~~ 1 error generated. 
Fixes: b18f09a ("virgl: Introduce virgl_resource_cache") Signed-off-by: Mauro Rossi --- src/gallium/Android.mk| 2 +- src/gallium/drivers/virgl/Android.mk | 2 +- src/gallium/winsys/virgl/drm/Android.mk | 2 ++ src/gallium/winsys/virgl/vtest/Android.mk | 2 ++ 4 files changed, 6 insertions(+), 2 deletions(-) diff --git a/src/gallium/Android.mk b/src/gallium/Android.mk index 3a3f042c7a..37e923c225 100644 --- a/src/gallium/Android.mk +++ b/src/gallium/Android.mk @@ -43,7 +43,7 @@ SUBDIRS += winsys/radeon/drm drivers/r300 SUBDIRS += winsys/radeon/drm drivers/r600 SUBDIRS += winsys/radeon/drm winsys/amdgpu/drm drivers/radeonsi SUBDIRS += winsys/vc4/drm drivers/vc4 -SUBDIRS += winsys/virgl/drm winsys/virgl/vtest drivers/virgl +SUBDIRS += winsys/virgl/common winsys/virgl/drm winsys/virgl/vtest drivers/virgl SUBDIRS += winsys/svga/drm drivers/svga SUBDIRS += winsys/etnaviv/drm drivers/etnaviv drivers/renderonly SUBDIRS += state_trackers/dri diff --git a/src/gallium/drivers/virgl/Android.mk b/src/gallium/drivers/virgl/Android.mk index 0067dfa702..a6fe53fbe9 100644 --- a/src/gallium/drivers/virgl/Android.mk +++ b/src/gallium/drivers/virgl/Android.mk @@ -35,5 +35,5 @@ include $(BUILD_STATIC_LIBRARY) ifneq ($(HAVE_GALLIUM_VIRGL),) GALLIUM_TARGET_DRIVERS += virtio_gpu -$(eval GALLIUM_LIBS += $(LOCAL_MODULE) libmesa_winsys_virgl libmesa_winsys_virgl_vtest) +$(eval GALLIUM_LIBS += $(LOCAL_MODULE) libmesa_winsys_virgl_common libmesa_winsys_virgl libmesa_winsys_virgl_vtest) endif diff --git a/src/gallium/winsys/virgl/drm/Android.mk b/src/gallium/winsys/virgl/drm/Android.mk index 5e2500774e..398a7645bc 100644 --- a/src/gallium/winsys/virgl/drm/Android.mk +++ b/src/gallium/winsys/virgl/drm/Android.mk @@ -27,6 +27,8 @@ include $(CLEAR_VARS) LOCAL_SRC_FILES := $(C_SOURCES) +LOCAL_C_INCLUDES := $(GALLIUM_TOP)/winsys/virgl/common + LOCAL_MODULE := libmesa_winsys_virgl LOCAL_STATIC_LIBRARIES := libmesa_winsys_virgl_common diff --git a/src/gallium/winsys/virgl/vtest/Android.mk 
b/src/gallium/winsys/virgl/vtest/Android.mk index 5b33f67711..6d35223c8e 100644 --- a/src/gallium/winsys/virgl/vtest/Android.mk +++ b/src/gallium/winsys/virgl/vtest/Android.mk @@ -27,6 +27,8 @@ include $(CLEAR_VARS) LOCAL_SRC_FILES := $(C_SOURCES) +LOCAL_C_INCLUDES := $(GALLIUM_TOP)/winsys/virgl/common + LOCAL_MODULE := libmesa_winsys_virgl_vtest LOCAL_STATIC_LIBRARIES := libmesa_winsys_virgl_common -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] android: virgl: fix libmesa_virgil_common build and dependencies
Hi, there is a typo in the commit title, the library is libmesa_winsys_virgl_common I will correct it in the final commit Mauro On Sat, Jun 15, 2019 at 7:39 AM Mauro Rossi wrote: > Fixes the following building errors and resolves Bug 110922 > Fixes gallium_dri target missing symbols at linking. > > external/mesa/src/gallium/winsys/virgl/drm/Android.mk: > error: libmesa_winsys_virgl (STATIC_LIBRARIES android-x86_64) missing > libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64) > ... > external/mesa/src/gallium/winsys/virgl/vtest/Android.mk: > error: libmesa_winsys_virgl_vtest (STATIC_LIBRARIES android-x86_64) > missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64) > ... > build/core/main.mk:728: error: exiting from previous errors. > > In file included from > external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_socket.c:34: > external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10: > fatal error: 'virgl_resource_cache.h' file not found > ^~~~ > 1 error generated. > > In file included from > external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.c:32: > external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10: > fatal error: 'virgl_resource_cache.h' file not found > #include "virgl_resource_cache.h" > ^~~~ > 1 error generated. 
> > Fixes: b18f09a ("virgl: Introduce virgl_resource_cache") > Signed-off-by: Mauro Rossi > --- > src/gallium/Android.mk| 2 +- > src/gallium/drivers/virgl/Android.mk | 2 +- > src/gallium/winsys/virgl/drm/Android.mk | 2 ++ > src/gallium/winsys/virgl/vtest/Android.mk | 2 ++ > 4 files changed, 6 insertions(+), 2 deletions(-) > > diff --git a/src/gallium/Android.mk b/src/gallium/Android.mk > index 3a3f042c7a..37e923c225 100644 > --- a/src/gallium/Android.mk > +++ b/src/gallium/Android.mk > @@ -43,7 +43,7 @@ SUBDIRS += winsys/radeon/drm drivers/r300 > SUBDIRS += winsys/radeon/drm drivers/r600 > SUBDIRS += winsys/radeon/drm winsys/amdgpu/drm drivers/radeonsi > SUBDIRS += winsys/vc4/drm drivers/vc4 > -SUBDIRS += winsys/virgl/drm winsys/virgl/vtest drivers/virgl > +SUBDIRS += winsys/virgl/common winsys/virgl/drm winsys/virgl/vtest > drivers/virgl > SUBDIRS += winsys/svga/drm drivers/svga > SUBDIRS += winsys/etnaviv/drm drivers/etnaviv drivers/renderonly > SUBDIRS += state_trackers/dri > diff --git a/src/gallium/drivers/virgl/Android.mk > b/src/gallium/drivers/virgl/Android.mk > index 0067dfa702..a6fe53fbe9 100644 > --- a/src/gallium/drivers/virgl/Android.mk > +++ b/src/gallium/drivers/virgl/Android.mk > @@ -35,5 +35,5 @@ include $(BUILD_STATIC_LIBRARY) > > ifneq ($(HAVE_GALLIUM_VIRGL),) > GALLIUM_TARGET_DRIVERS += virtio_gpu > -$(eval GALLIUM_LIBS += $(LOCAL_MODULE) libmesa_winsys_virgl > libmesa_winsys_virgl_vtest) > +$(eval GALLIUM_LIBS += $(LOCAL_MODULE) libmesa_winsys_virgl_common > libmesa_winsys_virgl libmesa_winsys_virgl_vtest) > endif > diff --git a/src/gallium/winsys/virgl/drm/Android.mk > b/src/gallium/winsys/virgl/drm/Android.mk > index 5e2500774e..398a7645bc 100644 > --- a/src/gallium/winsys/virgl/drm/Android.mk > +++ b/src/gallium/winsys/virgl/drm/Android.mk > @@ -27,6 +27,8 @@ include $(CLEAR_VARS) > > LOCAL_SRC_FILES := $(C_SOURCES) > > +LOCAL_C_INCLUDES := $(GALLIUM_TOP)/winsys/virgl/common > + > LOCAL_MODULE := libmesa_winsys_virgl > > 
LOCAL_STATIC_LIBRARIES := libmesa_winsys_virgl_common > diff --git a/src/gallium/winsys/virgl/vtest/Android.mk > b/src/gallium/winsys/virgl/vtest/Android.mk > index 5b33f67711..6d35223c8e 100644 > --- a/src/gallium/winsys/virgl/vtest/Android.mk > +++ b/src/gallium/winsys/virgl/vtest/Android.mk > @@ -27,6 +27,8 @@ include $(CLEAR_VARS) > > LOCAL_SRC_FILES := $(C_SOURCES) > > +LOCAL_C_INCLUDES := $(GALLIUM_TOP)/winsys/virgl/common > + > LOCAL_MODULE := libmesa_winsys_virgl_vtest > > LOCAL_STATIC_LIBRARIES := libmesa_winsys_virgl_common > -- > 2.20.1 > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev