(Warning, this is fairly long, and the situation involved includes a fair number of potentially-moving pieces.)
I've recently built a new computer, installed Debian, configured it to largely match my previous setup, and migrated my data across. Now, I'm trying to take advantage of one of the hardware improvements over the system I migrated away from: a newer, better-performing GPU. In particular, I want to run software which makes use of the Vulkan API for improved graphics performance.

However, I'm running into walls left and right; every time I find a way around one, I hit another. Most of the solutions I find out there, based on the error messages I'm encountering, seem to be aimed at people using older graphics-card models; they should not, and as far as I can tell do not, apply in my case.

As far as I can tell, Vulkan is simply not working at all on my system, even though there doesn't seem to be any established/known reason why it wouldn't / shouldn't. I've found a fair bit of discussion of ways to get it to work for other distributions, but none at all for Debian, except for ones that appear to be outdated or otherwise not applicable to my hardware.

The specific problem appears to be that Vulkan is not seeing my actual GPU at all.

The specific graphics card I have is https://www.amazon.com/gp/product/B07WNSP41M , which is an MSI-branded Radeon RX 5700 XT.

lshw reports it as:

    product: Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT]

'inxi -Fxxxz' reports it as:

    Graphics:
      Device-1: AMD Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT]
        vendor: Micro-Star MSI driver: amdgpu v: kernel bus ID: 0d:00.0
        chip ID: 1002:731f class ID: 0300
      Display: server: X.Org 1.20.11 driver: loaded: amdgpu,ati
        unloaded: fbdev,modesetting,radeon,vesa resolution: 2560x1600~60Hz
        s-dpi: 96
      OpenGL: renderer: AMD Radeon RX 5700 XT (NAVI10 DRM 3.40.0 5.10.0-7-amd64 LLVM 11.0.1)
        v: 4.6 Mesa 20.3.4 direct render: Yes

Notably, both the above and 'lspci -k' report that the amdgpu driver is being used.

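For anyone who wants to cross-check the driver binding without relying on lshw, inxi or lspci, here is a minimal sketch in Python that reads the 'driver' symlink which the kernel exposes under sysfs for each PCI device. The function name bound_driver is made up for illustration, and the "0000:" PCI domain prefix in front of the bus ID reported above (0d:00.0) is an assumption on my part; adjust it if your system differs.

```python
import os


def bound_driver(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI device, or None.

    The kernel exposes <sysfs_root>/<pci_address>/driver as a symlink whose
    final path component is the driver name (e.g. "amdgpu").
    """
    link = os.path.join(sysfs_root, pci_address, "driver")
    try:
        target = os.readlink(link)
    except OSError:
        # No such device, or no driver currently bound to it.
        return None
    return os.path.basename(target)


if __name__ == "__main__":
    # Assumption: PCI domain 0000 plus the bus ID 0d:00.0 from inxi.
    print(bound_driver("0000:0d:00.0"))
```

This only confirms which kernel driver owns the device; it says nothing about which userspace Vulkan driver (if any) gets loaded.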
There are many online discussions from earlier AMD graphics card generations in which the solution is to prevent the use of the legacy radeon driver and force the use of the amdgpu driver instead, by passing appropriate options to the kernel at boot time. I have not yet specifically tried any of these, because the condition which they're expected to produce - the use of the amdgpu driver - seems to already be present.

It's also important to note that I *do* appear to be getting functional 3D acceleration. I haven't done much that would need it yet, but the little I've done with native Linux software and OpenGL etc. has worked seamlessly, 'vblank_mode=0 glxgears' is reporting something on the order of 10x the FPS it did on my previous (ten-or-so years old) system, and the output of 'glxgears -info' includes the model name of the GPU. I don't find it plausible that I'd be seeing results like that if the system just weren't using the GPU's acceleration capabilities at all. It's just Vulkan that seems to be having problems.

As I understand matters, the baseline interface to test whether Vulkan is functioning - and, if so, get information about it - is the 'vulkaninfo' command from the vulkan-tools package. If I run that command unadorned, I get lengthy output, which does not mention the above graphics card at all. Instead, it mentions two different llvmpipe devices. I understand this to be effectively software-based rendering, not actual hardware acceleration as such. This is supported by the fact that the only deviceType entry in the "Device Properties and Extensions" section is set to PHYSICAL_DEVICE_TYPE_CPU.

If I run the included command 'vkcube', I get a window (somewhat larger, I think, than the one produced by 'glxgears') which depicts an animated 3D cube. The cube spins so fast that I can't make out what the textures on it are supposed to be. The system fans ramp up slowly while the process is active.
The message

    WARNING: lavapipe is not a conformant vulkan implementation, testing use only

is printed to the terminal. (I am given to understand that this message is expected when the llvmpipe backend is used, and that it reflects the fact that you really aren't supposed to use it for anything production-level. Also, from what I've read, I think lavapipe is just the current name for the Vulkan variant of llvmpipe.)

If I instead run the included command 'vkcubepp', I get a similar window. The cube is spinning much slower at the start, but speeds up over time. As it speeds up, the system fans ramp up much faster than with plain 'vkcube'. The same message as before is printed to the terminal. I do not know what the 'pp' may stand for, or what the difference is supposed to be.

The program I'm specifically interested in trying to run is Final Fantasy XIV, via Wine, and - to minimize configuration and setup troubles, which have been bad enough even as it stands - via Lutris. If I run that program via Lutris with Vulkan enabled (the default) and with - as far as I recall - no special other configuration outside of what Lutris itself applies, I get a crash, and a message that the llvmpipe device was skipped. Running it with maximum-verbosity debug output reveals that the crash error code apparently means that the 'dxvk' Vulkan backend for Wine did not find any adapters to use.

Online searching finds that one possible way around this, and/or other issues which I encountered along the way, is to specify a different ICD with the VK_ICD_FILENAMES environment variable (set to one or more full paths, apparently colon-separated). Appropriate ICD profile definition files are apparently stored / shipped under /usr/share/vulkan/icd.d/. In that location, I have JSON files defining profiles for 32-bit and 64-bit versions of Intel, Radeon, and lavapipe ICDs, all of which come from the mesa-vulkan-drivers package.

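Since both 32-bit and 64-bit definition files live in that directory, and a 64-bit process cannot load a 32-bit shared library, it may help to see which library each file points at and what ELF class that library has. The following is a rough sketch in Python, not anything shipped with the Vulkan tools. The function names are made up for illustration, and it assumes each JSON file has an "ICD" object with a "library_path" key, which as far as I know is the loader's manifest layout. It only inspects absolute library paths and leaves relative or bare names unresolved.

```python
import glob
import json
import os

# Byte 4 of an ELF header (EI_CLASS): 1 = 32-bit, 2 = 64-bit.
ELF_CLASSES = {1: "ELFCLASS32", 2: "ELFCLASS64"}


def elf_class(path):
    """Return "ELFCLASS32"/"ELFCLASS64" for an ELF file, else None."""
    try:
        with open(path, "rb") as fh:
            header = fh.read(5)
    except OSError:
        return None
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    return ELF_CLASSES.get(header[4])


def icd_library_path(manifest_path):
    """Return the library_path named by an ICD manifest, or None."""
    with open(manifest_path) as fh:
        manifest = json.load(fh)
    # Assumption: the manifest is a JSON object with an "ICD" sub-object.
    return manifest.get("ICD", {}).get("library_path")


def describe_icds(icd_dir="/usr/share/vulkan/icd.d"):
    """List (manifest, library_path, elf_class) for each *.json in icd_dir."""
    rows = []
    for manifest in sorted(glob.glob(os.path.join(icd_dir, "*.json"))):
        lib = icd_library_path(manifest)
        # Only absolute paths are checked; other forms are left as None.
        cls = elf_class(lib) if lib and os.path.isabs(lib) else None
        rows.append((manifest, lib, cls))
    return rows


if __name__ == "__main__":
    for manifest, lib, cls in describe_icds():
        print(f"{manifest}: {lib} ({cls or 'class unknown'})")
```

If this is roughly right, a 64-bit tool such as 'vulkaninfo' would need to be pointed at a definition file whose library shows up as ELFCLASS64, while a 32-bit Wine process would need the ELFCLASS32 one.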
Examining the Radeon ICD definition files shows nothing which would obviously make it look as if this is specifically for e.g. the old, legacy radeon driver; everything I've found seems to say that it should also work with amdgpu.

If I specify one or both of the Radeon ICD definition files using that environment variable, and try the above tests again, the results are different but in no way better.

During launch of Lutris (not even of the Wine session to run the actual program), the message "Vulkan is not available or your system isn't Vulkan capable" is printed to the terminal. Trying to launch the actual Wine session produces even less useful results.

If I run 'vulkaninfo' with that variable set, I get the following output:

    ERROR: [Loader Message] Code 0 : /usr/lib/i386-linux-gnu/libvulkan_radeon.so: wrong ELF class: ELFCLASS32
    ERROR at /build/vulkan-tools-oFB8Ns/vulkan-tools-1.2.162.0+dfsg1/vulkaninfo/vulkaninfo.h:248:vkEnumerateInstanceExtensionProperties failed with ERROR_INITIALIZATION_FAILED

If I run 'vkcube' with that variable set, I get the following output:

    vkcube: /build/vulkan-tools-oFB8Ns/vulkan-tools-1.2.162.0+dfsg1/cube/cube.c:3271: demo_init_vk: Assertion `!err' failed.
    Aborted

If I run 'vkcubepp' with that variable set, I get the following output:

    vkcubepp: /build/vulkan-tools-oFB8Ns/vulkan-tools-1.2.162.0+dfsg1/cube/cube.cpp:1247: void Demo::init_vk(): Assertion `result == vk::Result::eSuccess' failed.
    Aborted

At this point in my testing, I'm fairly sure all of these are reporting the same problem: failure to find a graphics adapter to use. That would seem to suggest that, perhaps, the Radeon ICD definition files involved don't actually work with amdgpu and this GPU model.

I've found discussions which appear to report or imply success with similar graphics card models using the 'amdvlk' Vulkan implementation.
That is available for various other distributions, including Ubuntu (possibly even in the form of .deb files made available, as part of an automated installer, by AMD themselves); however, it does not appear to have been packaged for Debian. A set of RFPs was filed back in 2018, by a DD, but it does not appear that anything ever got done to follow up on it; see bug #916552 and the listed blocking bugs.

There's an unofficial "how to install the official AMD upstream driver" procedure, which may or may not include amdvlk, at https://wiki.debian.org/AMDGPUDriverOnStretchAndBuster2 - but it's specifically focused on how to effectively backport this to earlier Debian releases, not on how to get it to work with current Debian testing. I'm reluctant to go mixing sources in that way, regardless.

If I try to run the Lutris setup with Vulkan disabled, I get what appears to be an entirely different crash; that's another road, and while I may try to walk it at some point, it is not the issue I'm focusing on here. I'm currently focused on trying to figure out why Vulkan doesn't seem to be working in a way that actually uses my GPU.

Does anyone have experience with using Vulkan with AMD graphics on vaguely current Debian, using an even vaguely recent GPU?

(Where "vaguely current Debian" means Debian testing more recently than the release of current stable, and a "vaguely recent GPU" means something new enough that disabling the support for it in the legacy radeon driver isn't necessary, because that driver doesn't support it to begin with.)

Any suggestions for ways to pursue getting this working, that don't involve hybridizing or Frankensteining or otherwise messing up my installed system?

-- 
   The Wanderer

The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.
   -- George Bernard Shaw