On 2025-02-05 21:58, Petter Reinholdtsen wrote:
> Where can I find the draft packaging for llama.cpp now?  Is there a
> public git repo somewhere?

Repo is here [1].

I'm currently doing the final build, and will upload to NEW once it's
finished. I'll also upload binaries to apt.rocm.debian.net.

This version is good enough for general release. The packaging still
needs some improvement, but the user-facing side is done -- meaning that
all of the utilities are present, and performance should be the best
achievable.

Remaining TODOs include:
  * Adding tests. I still need to check what can be enabled without
    needing a (non-free) model.
  * Enabling backend dlopen support. Currently, all backends are "full"
    builds that conflict with each other. This requires upstream
    changes, however, because our amd64 baseline is not supported, and
    there are some corner cases that can cause an abort. In any case,
    the current solution only wastes a bit of space.
  * Manpages for the tools (using help2man).
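
For the help2man item, the generation step could look roughly like
this. The binary name, description, and output path here are just
illustrative assumptions, not the final packaging:

```shell
# Hypothetical sketch: generate a manpage for one of the tools.
# Binary name and output path are assumptions, not the real rules.
help2man --no-info \
  --name "LLM inference on the command line" \
  --output debian/llama-cli.1 \
  build/bin/llama-cli
```

help2man derives the page from the tool's --help and --version output,
so this only works well for tools whose help text is reasonably
complete.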

I thought it best to upload now and fix the remaining issues above while
the package sits in NEW.

This was fun to work on. After much initial experimenting, the final
result is pretty minimal for what it does. That's mostly because cmake
makes multiple builds so easy.
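
For the curious, the multiple-build pattern boils down to something
like the following. Flags and directory names are illustrative only;
the actual packaging rules differ:

```shell
# Hypothetical sketch: two backend variants from one source tree,
# each configured into its own build directory.
cmake -S . -B build-cpu -DGGML_NATIVE=OFF -DCMAKE_BUILD_TYPE=Release
cmake -S . -B build-hip -DGGML_HIP=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build-cpu --parallel
cmake --build build-hip --parallel
```

Because each variant gets its own out-of-source build directory, the
configurations never step on each other, which keeps the packaging
rules short.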

Best,
Christian

[1]: https://salsa.debian.org/deeplearning-team/llama.cpp
