On 2026-03-22 07:47, Helmut Grohne wrote:
> I see I didn't look far enough. This is generally unsolvable due to
> policy section 2.2.1. The intention of that section is that installing a
> main package cannot pull a non-main package by default. If ggml were
> somehow managing to express a dependency on cuda, it would immediately
> violate that section.
> 
> An alternative might be picking the blas (or vulkan) backend by default
> as it is less hardware-specific than the others and recommending
> "libggml0-backend-blas | libggml0-backend".

Using the Vulkan backend by default might actually be a decent solution:
in the best case, where drivers and other packages are up to date, it
supports CPUs, GPUs and NPUs. The BLAS backend only supports a tiny
subset of operations, and with the model quantizations I've been testing
I haven't noticed any performance difference compared to pure CPU.

I just wish we had more user data/feedback on this, especially with more
backends to come.

Just to make sure I'm not missing anything: if we pick

    libggml0-backend-vulkan | libggml0-backend

now, then the downsides are:

  (1) Users wanting CUDA or HIP support must explicitly install that
      backend
        => I think this is fine

  (2) If the default should change away from Vulkan, existing
      installations would need manual intervention
        => Could be addressed with a NEWS entry
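
For reference, a rough sketch of how this might look in debian/control.
The stanza layout, the "libggml0" package name and the Provides line are
assumptions for illustration, not the actual packaging:

    # Hypothetical sketch: the core library recommends Vulkan first and
    # accepts any other backend via the virtual package.
    Package: libggml0
    Recommends: libggml0-backend-vulkan | libggml0-backend

    # Assumed: each backend provides the virtual package.
    Package: libggml0-backend-vulkan
    Provides: libggml0-backend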

Mathieu has done quite a bit of testing with Vulkan and deemed it
successful, and our CI confirms this with the stacks installed from Debian.

Mathieu, I think you also tested this with vendor stacks? Either way,
I'm going to check if we can add vendor stacks to our CI somehow.

Best,
Christian

PS: I think (but could be wrong) that finally being able to ship
backports for trixie and noble (and soon, resolute) could accelerate
this feedback.
