Dear Jean Louis, thanks so much for this detailed response! This is
incredibly helpful, especially on using embeddings and PostgreSQL for
structured linking. Your LLM function example is also great to see in
action! I’ve never used PostgreSQL or anything similar before, but I’ll
give it a try. It seems like a powerful approach, and I’m excited to
explore it. Thanks again for taking the time to share your experience and
approach! Best, Gideon.

On Thu, 6 Feb 2025 at 10:31, Jean Louis (<bugs@gnu.support>) wrote:

> * Gideon Silberman Moro <gerardomor...@gmail.com> [2025-02-05 09:39]:
> > Hi everyone,
> >
> > I'm looking for a way to automatically link notes in Zetteldeft using AI.
> > Ideally, I'd like an approach that analyzes the content of my notes and
> > suggests or creates links between relevant ones.
> >
> > Has anyone experimented with integrating AI (e.g., LLMs, embeddings, or
> > external tools like OpenAI or local models) to automate or enhance
> > Zetteldeft's linking process? Are there existing Emacs packages or
> > workflows that could help with this (without the need of an API)?
>
> Hi! You can automate linking in Zetteldeft using AI by leveraging
> local models or embeddings. Here's a quick approach:
>
> 1. **Embeddings**: Use a local model (e.g., Sentence Transformers) to
>    generate embeddings for your notes. Compare embeddings to find
>    semantic similarities and suggest links. Tools like `transformers`
>    or `gensim` can help (a rough Python sketch follows after this point).
>
>    Personally I work with a Dynamic Knowledge Repository which in
>    turn encompasses Org documents and all other kinds of documents,
>    so my information is organized in a PostgreSQL database. Using
>    Langchain or other tools for chunking is necessary there, as
>    chunks make the augmentation better and make it easier to find
>    the relevant documents.
>
>    RAG could be used as well; it depends of course on how much data
>    you have. My "Meta" Org holds something like 70,000 documents, all
>    as hyperlinks, but what about hyperlinks within hyperlinks?
>    Automatic hyperlinking is possible with either RAG or embeddings,
>    and providing RAG or embeddings is rather easy when a database is
>    involved, considering that a vector type is already available for
>    PostgreSQL (the pgvector extension).
>
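>    For the embeddings approach in point 1, here is a rough Python
>    sketch. The sentence-transformers package, the all-MiniLM-L6-v2
>    model, the notes directory and the similarity threshold are only
>    placeholders to adjust:
>
> # Suggest Zetteldeft links by embedding similarity.
> from pathlib import Path
> from sentence_transformers import SentenceTransformer, util
>
> NOTES_DIR = Path("~/notes").expanduser()  # your Zetteldeft directory
> model = SentenceTransformer("all-MiniLM-L6-v2")
>
> notes = {p: p.read_text(encoding="utf-8")
>          for p in NOTES_DIR.glob("*.org")}  # adjust the extension
> paths = list(notes)
> embeddings = model.encode([notes[p] for p in paths])
>
> # Cosine similarity between every pair of notes.
> scores = util.cos_sim(embeddings, embeddings)
>
> THRESHOLD = 0.6  # tune for your corpus
> for i, a in enumerate(paths):
>     for j, b in enumerate(paths):
>         if i < j and scores[i][j].item() > THRESHOLD:
>             print(f"Suggest link: {a.name} <-> {b.name} "
>                   f"({scores[i][j].item():.2f})")
>
>    For longer notes you would first split them into chunks, for
>    example with LangChain's RecursiveCharacterTextSplitter, and
>    embed the chunks instead of whole files.
>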
> 2. **LLMs**: Run a local LLM (e.g., IBM Granite or Microsoft Phi as
>    fully free software) to analyze note content and suggest links. You
>    can script this in Emacs Lisp on top of Python; a minimal Python
>    sketch follows after this point.
>
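>    A minimal sketch of that local call in Python. Llamafile and the
>    llama.cpp server expose an OpenAI-compatible endpoint; the URL,
>    model name, prompt and helper name below are placeholders for your
>    setup:
>
> # Ask a local llamafile/llama.cpp server to suggest links for a note,
> # choosing only from a list of candidate links.
> import requests
>
> def suggest_links(note_text, candidate_links):
>     payload = {
>         "model": "LLaMA_CPP",
>         "messages": [
>             {"role": "system",
>              "content": ("You suggest links between notes. "
>                          "Choose only from this list:\n"
>                          + "\n".join(candidate_links))},
>             {"role": "user", "content": note_text},
>         ],
>         "temperature": 0.2,
>     }
>     r = requests.post("http://127.0.0.1:8080/v1/chat/completions",
>                       json=payload,
>                       headers={"Authorization": "Bearer no-key"},
>                       timeout=120)
>     return r.json()["choices"][0]["message"]["content"]
>
>    Emacs could call such a helper with `call-process', or talk to the
>    same endpoint directly from Emacs Lisp, as in the function below.
>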
> 3. **Emacs Packages**: I would not recommend any at this moment, as
>    your request is very specific. I am writing my own LLM functions;
>    here is one of them that works and can be adjusted:
>
> (require 'json)
> (require 'url)
>
> ;; `rcd-llm-response', `rcd-llm-last-json' and
> ;; `rcd-llm-use-users-llm-memory' are defined elsewhere in my setup.
> (defun rcd-llm-llamafile (prompt &optional memory rcd-llm-model)
>   "Send PROMPT to the llamafile server.
>
> Optional MEMORY and RCD-LLM-MODEL may be used."
>   (let* ((rcd-llm-model (cond ((boundp 'rcd-llm-model) rcd-llm-model)
>                               (t "LLaMA_CPP")))
>          ;; When enabled, prepend the user's memory to the prompt.
>          (memory (cond ((and memory rcd-llm-use-users-llm-memory)
>                         (concat "Following is user's memory, until the"
>                                 " END-OF-MEMORY-TAG: \n\n" memory
>                                 "\n\n END-OF-MEMORY-TAG\n\n"))))
>          (prompt (cond (memory (concat memory "\n\n" prompt))
>                        (t prompt)))
>          (temperature 0.8)
>          (max-tokens -1)
>          (top-p 0.95)
>          (stream :json-false)
>          ;; POST an OpenAI-style chat completion request to the local
>          ;; server and keep the response buffer.
>          (buffer (let ((url-request-method "POST")
>                        (url-request-extra-headers
>                         '(("Content-Type" . "application/json")
>                           ("Authorization" . "Bearer no-key")))
>                        (prompt (encode-coding-string prompt 'utf-8))
>                        (url-request-data
>                         (encode-coding-string
>                          (setq rcd-llm-last-json
>                                (json-encode
>                                 `((model . ,rcd-llm-model)
>                                   (messages . [((role . "system")
>                                                 (content . "You are a helpful assistant. Answer short."))
>                                                ((role . "user")
>                                                 (content . ,prompt))])
>                                   (temperature . ,temperature)
>                                   (max_tokens . ,max-tokens)
>                                   (top_p . ,top-p)
>                                   (stream . ,stream))))
>                          'utf-8)))
>                    (url-retrieve-synchronously
>                     ;; "http://127.0.0.1:8080/v1/chat/completions"
>                     "http://192.168.188.140:8080/v1/chat/completions"))))
>     (rcd-llm-response buffer)))
>
> As you can read there, it uses some memory if necessary, and that
> memory can also be the list of links which you would like to insert.
>
> So the solution could be a simple function whose context or system
> message contains the summaries and the list of links; a simple prompt
> (as in the Python sketch above) could then instruct the LLM to
> hyperlink it all.
>
> Additionally, you could use grammar constraints (GBNF grammars) from
> llama.cpp to force the output format.
>
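> A rough sketch of that, assuming the /completion endpoint of the
> llama.cpp or llamafile server and an org-style [[...]] link format;
> adjust the grammar to Zetteldeft's actual link syntax:
>
> # Constrain the reply to link-shaped output with a GBNF grammar.
> import requests
>
> GRAMMAR = r'''root ::= link+
> link ::= "[[" [a-zA-Z0-9:./ -]+ "]]" "\n"
> '''
>
> resp = requests.post(
>     "http://127.0.0.1:8080/completion",
>     json={"prompt": "List the links relevant to this note: ...",
>           "grammar": GRAMMAR,
>           "n_predict": 128},
>     timeout=120)
> print(resp.json()["content"])
>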
> No API needed if you stick to local models!
>
> Your idea is great.
>
> Let me say it this way: the solution to your problem is much closer
> than we think. It is just there; it requires some tuning and it can
> already work.
>
> It requires planning of the knowledge. I don't want all links
> hyperlinked just because they match; 70,000 documents are there, but
> I don't want them all hyperlinked. I want specific links hyperlinked.
>
> Many of them are also ranked, as I have worked with many of them, so
> I would like the linking to respect rank too. You have to plan first
> how to sort the information, which information to include, etc.
>
> Then you feed it to the embeddings, but how? Where are you going to
> store the vectors? Or use RAG?
>
> Using PostgreSQL and its vector type is a good way to go.
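>
> A minimal sketch of that storage side, assuming psycopg2, the
> pgvector extension (CREATE EXTENSION vector;) and placeholder table,
> column and model names:
>
> # Store note embeddings in PostgreSQL and fetch nearest neighbours.
> import psycopg2
> from sentence_transformers import SentenceTransformer
>
> model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim vectors
> conn = psycopg2.connect("dbname=notes")
> cur = conn.cursor()
>
> cur.execute("""CREATE TABLE IF NOT EXISTS note_embeddings (
>                  id serial PRIMARY KEY,
>                  path text UNIQUE,
>                  embedding vector(384))""")
> conn.commit()
>
> def vec(text):
>     """Encode text and format it as a pgvector literal."""
>     return "[" + ",".join(str(x) for x in model.encode(text)) + "]"
>
> def store(path, text):
>     cur.execute("""INSERT INTO note_embeddings (path, embedding)
>                    VALUES (%s, %s::vector)
>                    ON CONFLICT (path) DO UPDATE
>                    SET embedding = EXCLUDED.embedding""",
>                 (path, vec(text)))
>     conn.commit()
>
> def nearest(text, limit=5):
>     # <=> is pgvector's cosine distance operator.
>     cur.execute("""SELECT path FROM note_embeddings
>                    ORDER BY embedding <=> %s::vector LIMIT %s""",
>                 (vec(text), limit))
>     return [row[0] for row in cur.fetchall()]
>
> The nearest note paths returned this way are the link candidates you
> could then feed to the LLM or insert directly.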
>
> --
> Jean Louis
>
