On Thu, 2025-02-06 at 09:13 +0100, Christian Kastner wrote:
>
> I meant to ask anyway: performance-wise, is it comparable to your local
> build? I mean, I wouldn't know what in the code would alter this, but I
> built and tested this on platti.d.o and performance was poor, so another
> data point would be useful.

For ppc64el, the llama.cpp-blas backend is way slower than the -cpu
backend. I did not test on amd64. But on ppc64el the package does not
feel any different from a local build. CPU is slow anyway. How does HIP
perform?

phi-4-q4.gguf | power9, cpu (8 threads) | 0.62 tokens/s
phi-4-q4.gguf | amd64, 13900H           | 6.7 tokens/s

A GPU is way faster than this, but the phi-4 model does not fit in my
NVIDIA GPU, so there is no GPU number this time.
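
To put the two CPU figures above side by side, here is a minimal sketch that computes their ratio. The dictionary keys and the speedup() helper are only illustrative names, not anything from llama.cpp or the package; the numbers are the ones quoted in this mail.

```python
# Throughput figures quoted in this mail, in tokens per second.
# The keys are illustrative labels only.
results = {
    "power9, cpu (8 threads)": 0.62,
    "amd64, 13900H": 6.7,
}


def speedup(fast: float, slow: float) -> float:
    """Return how many times faster `fast` is than `slow`."""
    return fast / slow


ratio = speedup(results["amd64, 13900H"], results["power9, cpu (8 threads)"])
print(f"amd64 is roughly {ratio:.1f}x faster than power9 on phi-4-q4.gguf")
```

This compares only two single measurements on different machines, so it is a rough data point rather than a benchmark.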