jeremy ardley wrote:

> The modern way would be to use a LLM in API mode and set
> a context to achieve your aims.

Here is the command. It turns out I used the llamafile method,
with llava or mistral as the LLM.

In the command, we see '-c 2048'. This, says the man file, is
to "Set the size of the prompt context." The unit is tokens
rather than bytes (chars), and 2048 of them isn't a lot.

But then note '--prompt-cache-all', so what you do can be saved
and brought back later.
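
For reference, upstream llama.cpp's main pairs that with
'--prompt-cache FNAME' to name the cache file; assuming the
llamafiles accept it as well (an assumption on my part, and
'backlog.cache' is just a made-up name), persisting the state
would look something like:

  ./llm --cli --prompt-cache backlog.cache --prompt-cache-all \
        -c 2048 -p "$(cat "$src")" > "$dst"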

While this I'm sure is enough for a lot of use cases, here we
envision a huge backlog file; obviously it cannot be fed in
chunks of 2048 tokens at a time.
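
As a rough check of how far off we are (assuming the common
~4 characters per token rule of thumb, which is only an
approximation):

  chars=$(wc -c < "$src")
  echo "approx $((chars / 4)) tokens, context is 2048"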

I also provide an interactive (REPL) version of the CLI for
people who would care to experiment. But neither method will
work for this, at least not in its present form.

llm is a link, so you can alternate between llava and mistral
as the LLM :)
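
It can be set up with a symlink, e.g. (file names as in the
version output below):

  ln -sf mistral-7b-instruct-v0.2.Q5_K_M.llamafile llm
  # or
  ln -sf llava-v1.5-7b-q4.llamafile llm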

taskset -c 0-2       \
./llm                \
  --cli              \
  --log-disable      \
  --prompt-cache-all \
  --silent-prompt    \
  -c 2048            \
  -ngl 9999          \
  -p "$(cat $src)" > $dst

./llm \
    --cli           \
    --color         \
    --log-disable   \
    --silent-prompt \
    -cml            \
    -i              \
    -ld log

Versions are:

$ mistral-7b-instruct-v0.2.Q5_K_M.llamafile --version
llamafile v0.8.5 (Apache License 2.0)

llava-v1.5-7b-q4.llamafile

Apache License 2.0 is FOSS, so this is all CLI, all local, and
free. If it can be made to work for this, maybe people would be
happy with it all, though there isn't an old-school,
deterministic algorithm you can fiddle with until it is just
right, so unfortunately you miss out on that.

-- 
underground experts united
https://dataswamp.org/~incal
