On 10/7/24 23:41, Emanuel Berg wrote:
jeremy ardley wrote:

The modern way would be to use a LLM in API mode and set
a context to achieve your aims.

Here is the command. It turns out I used the llamafile method,
with llava or mistral as the LLM.

In the command, we see '-c 2048'. This, says the man page, will
"Set the size of the prompt context." The unit is tokens, not
bytes, and a token is roughly four characters of English text,
so 2048 isn't a lot.
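To get a feel for whether a file fits, here is a rough estimate in shell. The four-characters-per-token ratio is a common rule of thumb for English text, not something the man page states, so treat the number as approximate:

```shell
# Assumption: ~4 characters per token, a rough rule of thumb for
# English text; real tokenizers vary per model.
est_tokens() {
    chars=$(wc -c < "$1")
    echo $(( chars / 4 ))
}

printf 'hello world, this is a test\n' > sample.txt
est_tokens sample.txt    # a file this small fits easily in -c 2048
```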

But then note '--prompt-cache-all', so what you do can be saved
and brought back.

While I'm sure this is enough for a lot of use cases, here we
envision a huge backlog file; obviously it cannot be loaded in
chunks of 2048 tokens at a time.
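One workaround, sketched here as an assumption rather than a tested recipe, is to split the backlog into pieces that each fit the context and run the model over them one at a time. The actual `./llm` call is commented out since it depends on the local setup; the dummy backlog just makes the sketch runnable:

```shell
# Hypothetical sketch: split a large backlog into chunks small enough
# for a 2048-token context, assuming ~4 bytes per token (~8192 bytes).
src=backlog.txt
printf '%8192s' ' '   > "$src"   # stand-in for a real backlog
printf 'tail data\n' >> "$src"

rm -rf chunks
mkdir chunks
split -b 8192 "$src" chunks/part_

for f in chunks/part_*; do
    echo "would process chunk: $f"
    # ./llm --cli -c 2048 --prompt-cache-all -p "$(cat "$f")" >> "$dst"
done
```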

I also provide a second, interactive REPL version of the CLI
command for people who care to experiment. But neither method
will work for this, at least not in its present form.

llm is a symlink, so you can alternate between llava and mistral
as the LLM :)
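For reference, the link can be created and re-pointed like this; the `touch` is only a stand-in so the sketch runs even without the actual llamafile present:

```shell
# Create (or re-point) the `llm` symlink to the active model.
# The touch is a stand-in for the real llamafile binary.
touch mistral-7b-instruct-v0.2.Q5_K_M.llamafile
ln -sf mistral-7b-instruct-v0.2.Q5_K_M.llamafile llm
readlink llm
```

Re-running `ln -sf` with the llava file name switches models without touching any scripts that call `./llm`.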

taskset -c 0-2       \
./llm                \
   --cli              \
   --log-disable      \
   --prompt-cache-all \
   --silent-prompt    \
   -c 2048            \
   -ngl 9999          \
   -p "$(cat $src)" > $dst

./llm                \
   --cli             \
   --color           \
   --log-disable     \
   --silent-prompt   \
   -cml              \
   -i                \
   -ld log

Versions are:

$ mistral-7b-instruct-v0.2.Q5_K_M.llamafile --version
llamafile v0.8.5 (Apache License 2.0)

llava-v1.5-7b-q4.llamafile

Apache License 2.0 is FOSS, so this is all CLI, all local, and
all free. If it can be made to work for this, maybe people would
be happy with it, although there isn't an old-school
deterministic algorithm that you can fiddle with until it is
just right, so you miss out on that, unfortunately.



I asked ChatGPT-4 about this, using your original email as a prompt. It came back with a solution based on how LLMs are trained, but it would have required some development. The high-dimensional vector comparison mechanisms used in LLMs do seem quite well matched to what you want to do.

In your case, without fully knowing precisely what your aim is, one possible approach is to put all the text you want to search into a GPT4All LocalDocs directory, where it will be indexed on the fly. Then create a prompt/context with the search text and instructions to generate a similarity index and report any documents that meet some threshold.

You will have to get the results in some structured format such as JSON and post-process them.
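As a sketch of that post-processing step, assume (hypothetically) the model is instructed to emit one score-and-filename line per candidate; JSON plus jq would work the same way, but a tab-separated format keeps the filtering to plain awk:

```shell
# Hypothetical model output: one "score<TAB>file" line per candidate.
printf '0.91\ta.txt\n0.42\tb.txt\n0.88\tc.txt\n' > results.tsv

# Keep only matches at or above the similarity threshold.
awk -F'\t' '$1 >= 0.8 { print $2 }' results.tsv
```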

You may want to get ChatGPT-4 to help you craft the general prompt.

For reference, I have GPT4All and am planning to use its LocalDocs feature with man page text. That will ensure that the answers it gives on technical questions will at least quote the man pages accurately.
