On Sun, 2025-01-12 at 22:36 +0100, Philipp Kern wrote:
> No-one is stopped from using any of the free offers. I don't think we
> need our own chat bot. Of course that means, in turn, that we give up
> on feeding it domain-specific knowledge and our own prompt. But
> that's... probably fine?
One long-term goal of the Debian Deep Learning Team is to host an LLM
on the team's AMD GPUs and expose it to the members. That said, the
necessary packages to run that kind of service are still missing from
our archive. It is a good way to use the existing GPUs anyway. Even if
we get no commercial sponsorship of API calls, we will eventually
experiment with and evaluate one on the team's infrastructure. We are
still working towards that.

> If those LLMs support that, one could still produce a guide on how to
> feed more interesting data into it - or provide a LoRA. It's not like
> inference requires a GPU.

First, DebGPT is designed to conveniently put any particular piece of
information, whether or not it is Debian-specific, into the context of
the LLM. I have also implemented a map-reduce algorithm to let the LLM
deal with extremely overlength contexts, such as a whole ratt buildlog
directory.

LoRA is only sound when you have a clear definition of the task you
want the LLM to handle. If we do not know what the user wants, then
forget about LoRA and just carefully provide the context to the LLM.
DebGPT is technically on the right track in terms of feasibility and
efficiency.

RAG may help. I have already implemented the vector database and the
retrieval modules in DebGPT, but the frontend part for RAG is still
under development.

> But then again saying things like "oh, look, I could easily answer the
> NM templates with this" is the context you want to put this work in.

My intention is always to explore possible and potential ways to make
LLMs useful to whatever extent we can. To support my idea, I wrote
DebGPT, and I tend to only claim things that are *already implemented*
and *reproducible* in DebGPT. For instance, I have added automatic
answering of the nm-templates to DebGPT, and the following script can
quickly produce all the answers. The answers are pretty good at first
glance. I will postpone the full evaluation until I have written the
code for all nm-templates.
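To illustrate the map-reduce idea for overlength contexts, here is a
minimal sketch. This is not DebGPT's actual code: `ask_llm` is a
hypothetical placeholder for a real LLM call, and the chunk size stands
in for the model's context window.

```python
# Sketch of map-reduce summarization over an overlength context.
# ask_llm() is a hypothetical stand-in for a real LLM backend call.

def ask_llm(prompt: str) -> str:
    # Placeholder: a real implementation would query an LLM here.
    return f"summary({len(prompt)} chars)"

def chunk(text: str, size: int) -> list:
    """Split an overlength text into pieces that fit the context window."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def mapreduce_summarize(text: str, size: int = 1000) -> str:
    # Map: summarize each chunk independently.
    partial = [ask_llm("Summarize:\n" + c) for c in chunk(text, size)]
    # Reduce: merge the partial summaries, recursing while still too long.
    merged = "\n".join(partial)
    if len(merged) > size:
        return mapreduce_summarize(merged, size)
    return ask_llm("Combine these partial summaries:\n" + merged)
```

With a real LLM behind `ask_llm`, the same shape lets a fixed-context
model digest something like an entire ratt buildlog directory.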
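For the retrieval side, a toy sketch of the vector-database idea looks
like the following. Again, this is an illustration rather than DebGPT's
actual modules: the bag-of-words "embedding" is a deliberate toy, where
a real system would use a neural embedding model.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: a bag-of-words count vector. A real retriever
    # would use a neural sentence-embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorDB:
    """Minimal in-memory vector store: index documents, retrieve top-k."""

    def __init__(self):
        self.docs = []

    def add(self, text: str) -> None:
        self.docs.append((embed(text), text))

    def retrieve(self, query: str, k: int = 3) -> list:
        q = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[0]),
                        reverse=True)
        return [text for _, text in ranked[:k]]
```

In a RAG frontend, the top-k retrieved snippets would simply be
prepended to the prompt before the LLM call.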
I simply dislike claiming things that cannot be implemented in DebGPT.
But please do not limit your imagination to my readily available demo
examples or the use cases I have claimed. (You need to use the latest
git version of DebGPT.)

```
# nm_assigned.txt
debgpt -f nm:nm_assigned -a 'pretend to be lu...@debian.org and answer the question. Give concrete examples, and links as evidence supporting them are preferred.' -o nm-assigned-selfintro.txt

# nm_pp1.txt
for Q in PH0 PH1 PH2 PH3 PH4 PH5 PH6 PH7 PHa; do
    debgpt -HQf nm:pp1.${Q} -a 'Be concise and answer in just several sentences.' -o nm-pp1-${Q}-brief.txt
    debgpt -HQf nm:pp1.${Q} -a 'Be precise and answer with details explained.' -o nm-pp1-${Q}-detail.txt
done

# nm_pp1_extras.txt
for Q in PH0 PH8 PH9 PHb; do
    debgpt -HQf nm:pp1e.${Q} -a 'Be concise and answer in just several sentences.' -o nm-pp1e-${Q}-brief.txt
    debgpt -HQf nm:pp1e.${Q} -a 'Be precise and answer with details explained.' -o nm-pp1e-${Q}-detail.txt
done
```

[1] DebGPT: https://salsa.debian.org/deeplearning-team/debgpt