I maintain Fedora's llama-cpp. This package is fast moving, so it's best to pick a reasonably recent build and stick with it for a while. I picked ours to fix some CVEs reported against llama-cpp and to stay in sync with our python-llama-cpp package. I have stripped out almost all of it and export only what python-llama-cpp needs (see the sketch in the P.S. below).

If you want to coordinate with Fedora on versions, let me know.

Tom
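P.S. To sketch what I mean by the split: something along these lines, simplified and hypothetical, not the actual Fedora llama-cpp.spec:

    # Hypothetical subpackage split: ship only the shared library that
    # python-llama-cpp links against, plus a -devel package for building.
    %package libs
    Summary: Shared library for llama-cpp

    %files libs
    %{_libdir}/libllama.so.*

    %package devel
    Summary: Development files for llama-cpp
    Requires: %{name}-libs%{?_isa} = %{version}-%{release}

    %files devel
    %{_includedir}/llama.h
    %{_libdir}/libllama.so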
-----Original Message-----
From: Cordell Bloor <c...@slerp.xyz>
Sent: Saturday, December 14, 2024 11:46 PM
To: 1063...@bugs.debian.org; Christian Kastner <c...@debian.org>; Petter Reinholdtsen <p...@hungry.com>
Cc: Debian ROCm Team <debian...@lists.debian.org>
Subject: Re: Bug#1063673: ITP: llama.cpp -- Inference of Meta's LLaMA model (and others) in pure C/C++

Hi Christian and Petter,

On Sat, 9 Mar 2024 10:20:32 +0100 Christian Kastner <c...@debian.org> wrote:
> I've discarded the simple package and now plan another approach: a
> package that ships a helper to rebuild the utility when needed,
> similar to DKMS. Rationale:
> * Continuously developed upstream, no build suited for stable
> * Build optimized for the current host's hardware, which is a key
>   feature. Building for our amd64 ISA standard would be absurd.
> I'm open for better ideas, though.

Perhaps we are letting the perfect be the enemy of the good? There are lots of fast-moving projects that get frozen at some version for stable. While that can be annoying for maintenance, it is also something that provides value. It's hard to build on top of something that keeps changing.

I would also argue that you're taking on too much responsibility in trying to enable -march=native optimizations. It's true that you can get significantly more performance using the AVX instructions available on most modern computers, but if llama.cpp really wanted that, they could implement dynamic dispatch themselves (a sketch of the technique follows at the end of this message). The CPU instruction set is also irrelevant for the GPU-accelerated version of the package. Why not deliver the basics before we try to do something fancy?

In the time that has passed between the creation of this issue and now, Fedora has created their own llama.cpp package [1]. I think they had the right idea. There's value in providing a working package to users today, even if it's imperfect.

Sincerely,
Cory Bloor

[1]: https://packages.fedoraproject.org/pkgs/llama-cpp/llama-cpp/
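P.S. For concreteness, here is a minimal sketch of the kind of runtime dispatch I mean. This is not llama.cpp's actual code; it only illustrates the technique using the GCC/Clang CPU feature-test builtins, with hypothetical matmul_* functions standing in for real kernels:

    /* Hypothetical example: pick an AVX2 kernel at runtime when the CPU
     * supports it, otherwise fall back to the portable amd64 baseline.
     * Requires GCC or Clang for the __builtin_cpu_* feature tests. */
    #include <stdio.h>

    static void matmul_avx2(void)     { puts("AVX2 kernel"); }
    static void matmul_baseline(void) { puts("baseline x86-64 kernel"); }

    int main(void)
    {
        __builtin_cpu_init();  /* initialize the CPU feature cache */
        if (__builtin_cpu_supports("avx2"))
            matmul_avx2();      /* fast path on most modern CPUs */
        else
            matmul_baseline();  /* works on any amd64 machine */
        return 0;
    }

A distribution could then build one binary for the amd64 baseline and still get the faster paths on hardware that supports them.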