On Sun, Jun 2, 2019 at 12:32 PM Alex Smith <asm...@feralinteractive.com> wrote:
>
> Put the uncached GTT type at a higher index than the visible VRAM type,
> rather than having GTT first.
>
> When we don't have dedicated VRAM, we don't have a non-visible VRAM
> type, and the property flags for GTT and visible VRAM are identical.
> According to the spec, for types with identical flags, we should give
> the one with better performance a lower index.
>
> Previously, apps which follow the spec guidance for choosing a memory
> type would have picked the GTT type in preference to visible VRAM (all
> Feral games will do this), and end up with lower performance.
>
> On a Ryzen 5 2500U laptop (Raven Ridge), this improves average FPS in
> the Rise of the Tomb Raider benchmark by up to ~30%. Tested a couple of
> other (Feral) games and saw similar improvement on those as well.
>
> Signed-off-by: Alex Smith <asm...@feralinteractive.com>
> ---
> I noticed that the memory types advertised on my Raven laptop looked a
> bit odd so played around with it and found this. I'm not sure if it is
> actually expected that the performance difference between visible VRAM
> and GTT is so large, seeing as it's not dedicated VRAM, but the results
> are clear (and consistent, tested multiple times).

AFAIU it is still using different memory paths, with GTT using different
page tables (those from the CPU, I believe, on APUs) and possible CPU
snooping.

The main risk here seems to be applications pushing driver-internal
stuff (descriptor sets etc.) out of "VRAM", possibly hitting perf
elsewhere.

That said,

Reviewed-by: Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl>

> ---
>  src/amd/vulkan/radv_device.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 3cf050ed220..d36ee226ebd 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -171,12 +171,11 @@ radv_physical_device_init_mem_types(struct radv_physical_device *device)
>  			.heapIndex = vram_index,
>  		};
>  	}
> -	if (gart_index >= 0) {
> +	if (gart_index >= 0 && device->rad_info.has_dedicated_vram) {
>  		device->mem_type_indices[type_count] = RADV_MEM_TYPE_GTT_WRITE_COMBINE;
>  		device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
>  			.propertyFlags = VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
> -			VK_MEMORY_PROPERTY_HOST_COHERENT_BIT |
> -			(device->rad_info.has_dedicated_vram ? 0 : VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT),
> +			VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
>  			.heapIndex = gart_index,
>  		};
>  	}
> @@ -189,6 +188,19 @@ radv_physical_device_init_mem_types(struct radv_physical_device *device)
>  			.heapIndex = visible_vram_index,
>  		};
>  	}
> +	if (gart_index >= 0 && !device->rad_info.has_dedicated_vram) {
> +		/* Put GTT after visible VRAM for GPUs without dedicated VRAM
> +		 * as they have identical property flags, and according to the
> +		 * spec, for types with identical flags, the one with greater
> +		 * performance must be given a lower index. */
> +		device->mem_type_indices[type_count] = RADV_MEM_TYPE_GTT_WRITE_COMBINE;
> +		device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
> +			.propertyFlags = VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT |
> +			VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
> +			VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
> +			.heapIndex = gart_index,
> +		};
> +	}
>  	if (gart_index >= 0) {
>  		device->mem_type_indices[type_count] = RADV_MEM_TYPE_GTT_CACHED;
>  		device->memory_properties.memoryTypes[type_count++] = (VkMemoryType) {
> --
> 2.21.0
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
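
For readers following along, here is a rough sketch of why the index order matters to an application. It assumes the app picks a memory type with a first-match loop along the lines the commit message refers to as the spec guidance. Everything in it is illustrative: the struct and enum are stand-ins for VkMemoryType and the VkMemoryPropertyFlagBits values so that it builds without vulkan.h, the heap numbering is hypothetical, and the two tables only model the two types touched by the patch on a GPU without dedicated VRAM. It is not code from RADV or from any game.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins so the sketch builds without vulkan.h. The bit values mirror
 * VkMemoryPropertyFlagBits; the struct mirrors VkMemoryType. */
enum {
	MEM_DEVICE_LOCAL  = 0x1, /* VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT */
	MEM_HOST_VISIBLE  = 0x2, /* VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT */
	MEM_HOST_COHERENT = 0x4, /* VK_MEMORY_PROPERTY_HOST_COHERENT_BIT */
};

/* Hypothetical heap numbering for an APU, for illustration only. */
enum { HEAP_VISIBLE_VRAM = 0, HEAP_GTT = 1 };

struct mem_type {
	uint32_t property_flags;
	uint32_t heap_index;
};

/* Without dedicated VRAM both types advertise the same flags. */
#define SHARED_FLAGS (MEM_DEVICE_LOCAL | MEM_HOST_VISIBLE | MEM_HOST_COHERENT)

/* Order before the patch: write-combined GTT comes first. */
static const struct mem_type types_before[] = {
	{ SHARED_FLAGS, HEAP_GTT },
	{ SHARED_FLAGS, HEAP_VISIBLE_VRAM },
};

/* Order after the patch: visible VRAM comes first. */
static const struct mem_type types_after[] = {
	{ SHARED_FLAGS, HEAP_VISIBLE_VRAM },
	{ SHARED_FLAGS, HEAP_GTT },
};

/* Return the lowest index that is allowed by type_bits (as in
 * VkMemoryRequirements::memoryTypeBits) and has every required flag,
 * or -1 if no type qualifies. */
int find_memory_type(const struct mem_type *types, uint32_t count,
		     uint32_t type_bits, uint32_t required)
{
	assert(count <= 32); /* VK_MAX_MEMORY_TYPES; keeps the shift defined */

	for (uint32_t i = 0; i < count; i++) {
		int allowed = (type_bits & (1u << i)) != 0;
		int has_flags = (types[i].property_flags & required) == required;

		if (allowed && has_flags)
			return (int)i;
	}
	return -1;
}
```

With both types allowed (type_bits 0x3) and HOST_VISIBLE | HOST_COHERENT required, the loop returns index 0 for either table, since both entries match and the first one wins. In types_before that index is the GTT type; in types_after it is visible VRAM. The app's code does not change, only what it ends up allocating from.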