On 2025-02-06 02:42, M. Zhou wrote:
> I second this. llama-server is also the service endpoint for DebGPT.

I'll prioritize fixing this.

> I pushed a fix for ppc64el. The hwcaps works correctly for power9, given
> the baseline is power8.

Ah, good catch. The broken install pattern was due to a last-minute fix
that I only tested on amd64...

I meant to ask anyway: performance-wise, is it comparable to your local
build? I wouldn't know what in the code would alter this, but I built and
tested this on platti.d.o and performance was poor, so another data point
would be useful.

Best,
Christian