Re: One big file or multiple small ones

Jean Louis Thu, 13 Mar 2025 03:38:28 -0700

* Zororg via "General discussions about Org-mode. <[email protected]> 
[2025-03-12 20:52]:
> I had also been scrambling myself about this as well.
> 
> I started out with denote (many files) since it was nice and I was using many
> packages.
> 
> Later as I went on, started digging into org mode and wondered about
> managing notes in single file.

When using Org mode with files, like most users, I prefer to categorize
them. Naturally, I wouldn't mix my business notes with my
personal/family ones. That would make little sense unless you have
very few or insignificant notes; otherwise mixing could lead to
confusion between your business and family matters.

Family and business are just an example here. I have a great many
categories, so it doesn't really make sense to mix distinct
categories in a single file.

Different users have different needs. If a user says he is going to
mix all categories of notes in a single file, then so be it. Though it
doesn't stand the test of time. Over time the notes accumulate and,
IMHO, the user will sooner or later realize that it is better to
categorize.

Categorizing is, in any case, a basic principle of using Org mode.

> One by alphapapa, who states that rather than a system, notes should
> focus on searching. People all over internet prefer searching and
> filtering by tags. So we should leverage that as a practise.

You need both the system and various searching capabilities. In fact,
when the system of categorizing is well designed, you need almost no
searching. That is the whole point of the system.

If the system is not applied well enough, or the user doesn't know how
to apply it, then searching is the only option left.

All notes that I have are categorized by related people and related
subjects.

Any notes related to Zororg, any files, e-mails, phone calls, meeting
minutes, documents, are linked to Zororg, and because of that
relation, anything I have done with Zororg is easy to pinpoint in the
set or file related to Zororg.

When I think of my personal categories, I may think of "ffmpeg" or
"computer stuff" and I first get into that category then I find the
rest.

But let's say I think of Sacha Chua: the first thing I do is type
"Chua" and find the person, then I click on anything related to Chua.

At that point I can already use C-s to find whatever else is in the
notes related to Sacha Chua. Maybe some Emacs News.

I get any website pages I created related to her work, and whatever
other notes, documents, audio, etc.

Because of categorizing, which is the meaning of the "system", I am
speedy in searching.

Tags are just one way of categorizing. In fact I use tags when I need
an intersection of information, such as price of cocoa, price of
coffee, price of red crude oil; those are all different tags.

Tags are part of the system, not a separate issue.

> Another one when I saw yantar (org maintainer) notes and workflow in
> org meetup. He manages single so well with information.

I am sure people do, and it is possible, as Org mode itself has
sections and sections help in categorizing.

Though, again, it is subjective to the user.

Imagine you manage a football club, and all the members and their
families are coming to the club for a celebration. Are you really
going to keep your private issues together with memberships? I don't
think so. It would also look embarrassing in the club.

When a user has no privacy issues, it is fine to keep everything in
one place; even then the user should categorize it, as that is totally
natural in Org mode.

> One thing many go for small files is because of bi-directional
> links, creating those cool graphs

I have many Org files on my system, as many people do, meaning
hundreds of them. When a file is related to a person, it is opened
automatically and prepared with some default sections, so that I can
enter notes, tasks and transactions.

When I realized that this was not going to pass the test of time, I
switched to database work, so all my notes are in the database,
including Org notes as database entries and Org files as files on the
file system.

That enables me to relate single documents, tasks and notes to
multiple people and companies who are working on them, and that is how
the Dynamic Knowledge Repository as envisioned by Doug Engelbart came
into existence.

About Dynamic Knowledge Repositories (DKR):
https://www.dougengelbart.org/content/view/190/163/

** Statistics

╔════════════════════════╦════════╦══════════════════════════════╦═══════╗
║ Total number of people ║ 243204 ║ Total Hyperdocuments         ║ 66453 ║
╠════════════════════════╬════════╬══════════════════════════════╬═══════╣
║ People in last week    ║ 40     ║ Hyperdocuments in last week  ║ 91    ║
╠════════════════════════╬════════╬══════════════════════════════╬═══════╣
║ People in last month   ║ 219    ║ Hyperdocuments in last month ║ 447   ║
╚════════════════════════╩════════╩══════════════════════════════╩═══════╝

As you see, a large number of people are coming into the business. I
cannot be wasting my time making tags, so I search by people. It is a
different use case, but it is supposed to give some insights.

Instead of writing only in Org mode, I can write in any kind of
markup, and add any kind of files into the system.

A human user needs to make a note for some later action, but writing
tags about that note may be wasted effort. You may end up writing
more tags, and spending more time on them, than on the actual note
itself. That would be a detrimental situation that serves only the
purpose of "mastering tags".

Another issue is the limitation of tags in Org mode: if there are too
many tags, the section really starts looking confusing.

Tags with spaces are not allowed.

That limitation I could not stand, so I have made categories of tags
(tag types), tags, and automatic tags.

If a person is in the category "people who wish to get employed", he
may get the corresponding tag automatically; if a document is an
employment agreement, there will be corresponding tags too. It is for
the system to make tags which are already predictable. The other
tags, not predictable by the system, I can place myself.
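The automatic tags described above could be sketched roughly like
this. It is only an illustration in Python: the rule table, the tag
names and the function name are hypothetical, not taken from the
actual database.

```python
# Hypothetical rules: category name -> tags the system adds by itself.
AUTOMATIC_TAGS = {
    "people who wish to get employed": {"job seeker"},
    "employment agreement": {"employment", "agreement"},
}


def tags_for(categories, manual_tags=()):
    """Combine the predictable, rule-derived tags with hand-placed ones."""
    tags = set(manual_tags)
    for category in categories:
        # Categories without a rule simply contribute no automatic tags.
        tags |= AUTOMATIC_TAGS.get(category, set())
    return sorted(tags)


if __name__ == "__main__":
    print(tags_for(["employment agreement"], manual_tags=["urgent"]))
```

Because the rules live outside of Org syntax, a tag such as
"job seeker" may contain a space here.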

Recently I don't even tag; I just employ a Large Language Model
(LLM) on my computer to generate tags. And it can run without my
supervision. It is for the computer to relieve me of effort whenever
possible, is it not?

Like for the above text:

1. **Org-mode**
2. **Categorization vs Tags**
3. **Single File Management in Org Mode**
4. **Dynamic Knowledge Repositories (DKR)**
5. **Statistics and Usage Metrics**
6. **Tag Limitations in Org mode**
7. **Automated Tagging with LLMs (Large Language Models)**
8. **Bi-directional Links and org-super-links Package**
9. **One Big File Workflow Benefits**
10. **Org-mode Features: Sparse Tree, Exporting Tags, Consult Outline**
11. **Datetree Capture Template Usage**
12. **ID Utilization in Org Mode (Inspired by Denote)**
13. **Comfort and Efficiency in Information Retrieval**

I just guess you can't use that type of tags in Org mode. But it could
be this way:

1. Org-mode-Discussion
2. Note-Management-SingleFile
3. Categorization-Personal-Business
4. User-Needs-Variability
5. System-Architecture-OrgMode
6. Tagging-Capabilities-InOrg
7. File-Organization-Alphabets
8. Search-Focus-Tags
9. Database-Management-Knowledge
10. DynamicKnowledgeRepository-DKR
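As far as I know, the Org manual describes tags as words made of
letters, numbers, "_" and "@", so even the hyphenated forms above
would need one more small conversion before Org accepts them. A
minimal sketch in Python, where both function names are made up for
illustration:

```python
import re


def to_org_tag(label):
    """Turn a free-form label into a string usable as an Org tag."""
    # Replace each run of characters other than letters, digits,
    # "_" and "@" with a single underscore.
    tag = re.sub(r"[^\w@]+", "_", label)
    return tag.strip("_")


def org_tag_string(labels):
    """Build the ":tag1:tag2:" suffix that Org puts on a headline."""
    tags = [tag for tag in (to_org_tag(label) for label in labels) if tag]
    return ":" + ":".join(tags) + ":" if tags else ""


if __name__ == "__main__":
    print(org_tag_string(["Org-mode-Discussion", "Tag Limitations in Org mode"]))
```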

And then searching by tags will also soon be rather old-fashioned.

- Creating tags by user's effort means giving some meanings to the
  system so that the user can search by meanings, semantically;

Creating tags through users' efforts involves assigning labels or
keywords (tags) to items such as documents, images, posts, etc., which
helps in organizing and categorizing content. When these tags are
created with specific meanings intended for search purposes—where the
system understands them semantically—it enhances the ability of a user
to find related information based on those concepts.

However, it's important to note:

- The effectiveness of this approach depends heavily on users
  consistently using tags in a meaningful and standardized way.

- For true semantic searching capabilities beyond simple keyword
  matching (like understanding synonyms or related concepts),
  additional technologies like natural language processing might be
  needed alongside user-generated tagging.

And why should I not use Large Language Model (LLM) capabilities when
they are available? Small models fit on average computers and can run
in memory, like this one, which everyone can run:

https://huggingface.co/HuggingFaceTB/SmolLM2-360M

That it may sound "hard" is just an illusion; it is as hard as
installing any software.

Using Org mode is also hard, and yet many people learn all the fine
details of how to deal with it. And it is not as if it provides
magical uses.

How to improve Org search semantically?

1. Make a simple table in PostgreSQL that will hold the Org document
   or the file path to it;

2. Run a simple shell script that finds all Org files and generates
   tokens for PostgreSQL full text search (PostgreSQL documentation,
   version 15, Chapter 12, Full Text Search):
   https://www.postgresql.org/docs/15/textsearch.html

3. Make an Emacs Lisp script to search by full text and get results; it
   is going to search by tags, dates, and everything else that was
   provided to the tokenizer in PostgreSQL.
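The three steps above could look roughly like the sketch below. It
uses Python in place of the shell and Emacs Lisp scripts, only to
keep everything in one place. The table and column names are
invented, the "%s" placeholders assume a driver such as psycopg, and
the SQL relies on generated columns and websearch_to_tsquery, which
PostgreSQL 15 provides. Nothing here connects to a database; it only
collects the files and holds the statements.

```python
from pathlib import Path

# Step 1: a table holding the path and text of each Org file. The
# tsvector column is generated from the body by PostgreSQL itself.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS org_documents (
    path   text PRIMARY KEY,
    body   text NOT NULL,
    tokens tsvector GENERATED ALWAYS AS (to_tsvector('english', body)) STORED
);
CREATE INDEX IF NOT EXISTS org_documents_tokens_idx
    ON org_documents USING GIN (tokens);
"""

# Step 2: insert or refresh one file; parameters are (path, body).
UPSERT_SQL = """
INSERT INTO org_documents (path, body) VALUES (%s, %s)
ON CONFLICT (path) DO UPDATE SET body = EXCLUDED.body;
"""

# Step 3: ranked full text search; the parameter is the search phrase.
SEARCH_SQL = """
SELECT path, ts_rank(tokens, query) AS rank
FROM org_documents, websearch_to_tsquery('english', %s) AS query
WHERE tokens @@ query
ORDER BY rank DESC
LIMIT 20;
"""


def org_files(root):
    """Return every *.org file below ROOT, sorted by path."""
    return sorted(p for p in Path(root).expanduser().rglob("*.org") if p.is_file())


def rows_to_index(root):
    """Yield (path, body) pairs ready to be passed to UPSERT_SQL."""
    for path in org_files(root):
        yield str(path), path.read_text(encoding="utf-8", errors="replace")
```

Since tags and dates are part of the file text, they end up in the
tokens too, which is what makes step 3 search over them.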

There are desktop searches:

- doodle
- searchmonkey
- tracker
- catfish
- find

and others.

Emacs alone has other ways of searching for files and contents; the
user need not be constrained to the Org package only.

> I had been using a one big file for over several months, and I could
> never think of a way to go back to small ones (denote or just org)

Different users, different use cases.

> In short I'd say, we feel comfortable and better to use wikipedia, wiki,
> search engine cause everything is available at one stop with few
> searches and clicks.
>
> Similar way searching should be focused and practised well to fetch
> information.

Exactly, Org and Emacs don't have enough integration.

Anyway, it is not moving in the optimum direction. We can stay
fiddling with the code, while other people don't code any more; they
just tell what they want, and they get it. The computer does that for
them.

Why fiddle with templates and cryptic stuff?

That is too much work for a human. Unless you have the time.

Buying a slightly better computer is beneficial for each Org user: a
little more memory or speed, and you can already use helpful software
that enhances your experience.

We are talking about search here.

Our operating systems, whichever they are, don't excel at search.
Installing more sophisticated search engines is not easy, and users
struggle with search.

Org search capabilities are basically good enough for everything that
common users need; of course not perfect, as nothing is perfect
anyway.

I just see how the Org mode community tends to stay where it is.

It was presented back then as "Your Life in Plain Text," but does it
really live up to its name? I just don't have that feeling. Sure, it
is plain text in the literal sense, but there are many complications
and numerous requirements for users.

> Hope to know others workflow as well.

My best workflow is to forget Org mode as the only way of doing it,
and go a level up. The Dynamic Knowledge Repository, as in the vision
of Doug Engelbart, is the way to go. Today's Large Language Models
(LLMs) help users tremendously.

/usr/local/bin/llama-server -ngl 999 -v -c 8192 -ub 8192 --embedding \
    --log-timestamps --host 192.168.1.68 --port 9999 \
    -m /mnt/data/LLM/nomic-ai/quantized/nomic-embed-text-v1.5-Q8_0.gguf

That embeddings model is running all the time on my computer. But
why? To be able to search semantically.

To forget the direct fiddling with tags, dates, all dates.
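What "searching semantically" with such an embeddings server means
can be sketched as follows. This is only an illustration in Python:
it assumes the llama-server instance above answers on an
OpenAI-compatible /v1/embeddings endpoint, and the function names are
made up. The ranking itself is plain cosine similarity and needs no
server.

```python
import json
import math
import urllib.request

# Assumption: the server started above exposes this endpoint.
EMBEDDINGS_URL = "http://192.168.1.68:9999/v1/embeddings"


def embed(text, url=EMBEDDINGS_URL):
    """Ask the embeddings server for the vector of one piece of text."""
    payload = json.dumps({"input": text}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        # Assumes an OpenAI-style response body.
        return json.load(response)["data"][0]["embedding"]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_notes(query_vector, note_vectors):
    """Return note names ordered from most to least similar to the query."""
    return sorted(
        note_vectors,
        key=lambda name: cosine(query_vector, note_vectors[name]),
        reverse=True,
    )
```

In use, every note would be passed through embed once and its vector
stored; a question is embedded the same way and rank_notes picks the
closest notes, with no tags involved.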

> One thing I realized is to not worry about the system or files, and
> craft a working way that makes sense and gives bliss (comfort), most
> importantly which retrieves information quick.

That is exactly so!

You should forget everything else and only seek the information you
need, and you are supposed to get it.

I think it should be very easy by this workflow:

- User says, by mouth, by speaking "Make a note" or similar;

- Computer asks, is it note about people, or some personal categories?

- User says "It's about Zororg"

- Computer offers to choose right person, user confirms it

- Computer asks "What should I record?"

- User says "Meeting on Monday 10 o'clock related to Org tagging and searching"

- User can freely speak and say "Save this file" or similar command

When searching:

- "Is there any meeting today?"

- "What awaits me to do tomorrow?"

Computer answers:

- Meeting with Zororg is scheduled on Monday, but today and tomorrow
there are no meetings, relax.

And all that in natural language manner.

The notion of your life in plain text is over.

--
Jean Louis
