Bug#1063673: ITP: llama.cpp -- Inference of Meta's LLaMA model (and others) in pure C/C++

Cordell Bloor Sat, 14 Dec 2024 23:57:29 -0800

Hi Christian and Petter,

On Sat, 9 Mar 2024 10:20:32 +0100 Christian Kastner <c...@debian.org> wrote:
> I've discarded the simple package and now plan another approach: a
> package that ships a helper to rebuild the utility when needed, similar
> to DKMS. Rationale:
> * Continuously developed upstream, no build suited for stable
> * Build optimized for the current host's hardware, which is a key
> feature. Building for our amd64 ISA standard would be absurd.
> I'm open for better ideas, though.


Perhaps we are letting the perfect be the enemy of the good?

There are lots of fast-moving projects that get frozen at some version for stable. While that can be annoying for maintenance, it is also something that provides value. It's hard to build on top of something that keeps changing.

I would also argue that you're taking on too much responsibility trying to enable -march=native optimizations. It's true that you can get significantly more performance using AVX instructions available on most modern computers, but if llama.cpp really wanted to, they could implement dynamic dispatch themselves. The CPU instruction set is also irrelevant for the GPU-accelerated version of the package.
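
To illustrate what I mean by dynamic dispatch, here is a rough sketch. It is not llama.cpp's code: the names dot_fn, dot_generic, dot_avx2 and select_dot are made up for this example, and it assumes GCC or Clang on x86-64 (elsewhere it simply falls back to the baseline). The idea is to build a baseline kernel plus an AVX2-enabled variant into one binary and choose between them at run time.

```c
#include <stddef.h>

/* Hypothetical kernel signature: dot product of two float vectors. */
typedef float (*dot_fn)(const float *a, const float *b, size_t n);

/* Portable baseline, safe on any CPU the package is built for. */
static float dot_generic(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) sum += a[i] * b[i];
    return sum;
}

#if (defined(__GNUC__) || defined(__clang__)) && defined(__x86_64__)
/* Same loop, but the compiler is allowed to use AVX2 instructions
 * for this one function only; the rest of the binary stays baseline. */
__attribute__((target("avx2")))
static float dot_avx2(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) sum += a[i] * b[i];
    return sum;
}
#endif

/* Pick an implementation based on what the running CPU supports. */
static dot_fn select_dot(void) {
#if (defined(__GNUC__) || defined(__clang__)) && defined(__x86_64__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return dot_avx2;
#endif
    return dot_generic;
}
```

A caller would do something like `dot_fn dot = select_dot();` once at startup and then call `dot(a, b, n)`. A real implementation would use hand-written intrinsics in the fast path, but the selection logic is the part that removes the need for a per-host rebuild.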

Why not deliver the basics before we try to do something fancy? In the time that passed between the creation of this issue and now, Fedora created their own llama.cpp package [1]. I think they had the right idea. There's value in providing a working package to users today, even if it's imperfect.


Sincerely,
Cory Bloor

[1]: https://packages.fedoraproject.org/pkgs/llama-cpp/llama-cpp/
