On 2026-03-22 07:47, Helmut Grohne wrote:
> I see I didn't look far enough. This is generally unsolvable due to
> policy section 2.2.1. The intention of that section is that installing a
> main package cannot pull a non-main package by default. If ggml were
> somehow managing to express a dependency on cuda, it would immediately
> violate that section.
>
> An alternative might be picking the blas (or vulkan) backend by default
> as it is less hardware-specific than the others and recommending
> "libggml0-backend-blas | libggml0-backend".
Using the Vulkan backend by default might actually be a decent solution:
in the best case, where drivers and other packages are up to date, it
supports CPUs, GPUs and NPUs. The BLAS backend only supports a tiny
subset of operations, and with the model quantizations I've been
testing, I haven't noticed any performance difference compared to pure
CPU.

I just wish we had more user data/feedback on this, especially with more
backends to come.
Just to make sure I'm not missing something: if we pick

  libggml0-backend-vulkan | libggml0-backend

now, then the downsides are:

(1) Users wanting CUDA or HIP support must explicitly install that
    backend
    => I think this is fine

(2) If the default should change away from Vulkan, existing
    installations would need manual intervention
    => Could be addressed with a NEWS entry
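
To make the proposal concrete, here is a rough, abridged sketch of how
the relevant debian/control stanzas might look. It assumes that the
library package is named libggml0 and that every backend package
Provides the virtual package libggml0-backend; the real stanzas may well
differ, and other fields are omitted.

```
# Sketch only: the package name libggml0 is an assumption.
Package: libggml0
# Vulkan is the default; any other installed backend also satisfies this.
Recommends: libggml0-backend-vulkan | libggml0-backend

# Assumed: each backend provides the virtual package libggml0-backend.
Package: libggml0-backend-vulkan
Provides: libggml0-backend
```

With something like this, a default install pulls in the Vulkan backend,
while a system that already has another backend installed is left alone.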
Mathieu has done quite a bit of testing with Vulkan and deemed it
successful, and our CI confirms this with the stacks installed from
Debian.
Mathieu, I think you also tested this with vendor stacks? Either way,
I'm going to check if we can add vendor stacks to our CI somehow.
Best,
Christian
PS: I think (but could be wrong) that finally being able to ship
backports for trixie and noble (and soon, resolute) could accelerate
this feedback.