Re: Querying cached parse trees without opening files

Kristoffer Balintona Mon, 26 May 2025 14:03:56 -0700

On Mon, May 26, 2025 at 12:02 PM chris <inkbottle...@gmail.com> wrote:


> Org-node seems very interesting! I noticed that your
> [parser.el](https://github.com/meedstrom/org-mem/blob/main/org-mem-parser.el)
> is only about 600
> lines long, whereas Org-mode’s parser seems larger and possibly more
> scattered? Are they roughly equivalent in scope/intent, or is your version
> focused on a different subset of Org features?

Hi,

I am not Martin, but I’ll share a bit about what I’ve gathered about the
project after having used org-node for a few months.

As far as I can tell, the org-mem parser is a parser specially tailored
for a specific end, namely, speed. What sets org-node apart from
org-roam is that it does not need anything on-disk; it maintains hash
tables inside Emacs for all its data. (Additionally, and in line with
org-node’s mission for performance, it does not end up needing to load
org at all, since its parser is an implementation independent of it.) It
can get away with this because the parser is very fast and leverages
el-job’s[1] asynchronous processing of lists.

Of course, the trade-off for parsing speed is completeness: org-mem must
implement its own regexps to find the data it needs. Everything else is
ignored. So if org-mem wants to collect e.g. timestamp data, it must do
so without any help from org (as was recently implemented). Org also
does a lot to process things like org keywords in files. And, of course,
this approach is susceptible to mismatches with what org’s parser
actually recognizes since org-mem’s parser is bespoke.
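To make the trade-off concrete, here is a rough sketch of what a
purpose-built scanner looks like. It is in Python purely for illustration
(org-mem itself is Emacs Lisp), and the two regexps and the function name
are my own inventions, not what org-mem actually uses. It collects only
headings that carry an :ID: property into a hash table and ignores
everything else:

```python
import re

# Hypothetical patterns for illustration; NOT the regexps org-mem uses.
HEADING_RE = re.compile(r"^(\*+)\s+(.*)$")
ID_RE = re.compile(r"^\s*:ID:\s+(\S+)\s*$")

def scan_org_text(text):
    """Return {id: heading title} for every :ID: line found under a heading."""
    index = {}
    current_title = None
    for line in text.splitlines():
        heading = HEADING_RE.match(line)
        if heading:
            current_title = heading.group(2).strip()
            continue
        prop = ID_RE.match(line)
        # IDs that appear before the first heading are skipped in this sketch.
        if prop and current_title is not None:
            index[prop.group(1)] = current_title
    return index

sample = """\
* Project notes
:PROPERTIES:
:ID: abc-123
:END:
Some body text that the scanner never looks at.
** TODO Subtask
#+begin_example
:ID: not-a-real-id
#+end_example
"""

print(scan_org_text(sample))
```

The sample also shows the mismatch problem: this naive scanner records
"not-a-real-id" even though that line sits inside an example block and is
not a property of the heading. Every such edge case has to be found and
handled by hand in a bespoke parser.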

I’m guessing part of Martin’s motivation to ask his original question is
related to how tenable maintaining a parser independent from org is. It
would be much easier to rely on the definitive org parser if possible. And
if I were to speculate further, I think what he has in mind is to store
the parse trees on disk and read from those (potentially caching those
on-disk parse trees if necessary) rather than the user’s files. This way,
lookups stay fast since the user’s org files are already parsed
(which is the expensive part).
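As I understand the idea, it could be sketched roughly as follows. Again
this is Python for illustration only, with made-up names (load_parsed,
cache_path, parse), and it assumes the parse result can be serialized as
JSON and that a changed modification time is a good enough signal that a
file needs re-parsing. It is not how Org or org-mem stores anything today:

```python
import json
import os

def load_parsed(path, cache_path, parse):
    """Return parse(<text of path>), reusing an on-disk cache while the
    source file's modification time is unchanged."""
    mtime = os.path.getmtime(path)  # stats the file without opening it
    try:
        with open(cache_path, encoding="utf-8") as f:
            cached = json.load(f)
        if isinstance(cached, dict) and cached.get("mtime") == mtime:
            return cached["data"]  # cache hit: the source file is never opened
    except (OSError, ValueError, KeyError):
        pass  # missing or corrupt cache: fall through and re-parse
    with open(path, encoding="utf-8") as f:
        data = parse(f.read())  # the expensive step
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump({"mtime": mtime, "data": data}, f)
    return data
```

The parse argument stands in for whatever produces the tree; the point is
only that, on a cache hit, the cost of a query is reading one already
parsed file rather than running the parser again.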

Martin can chime in to correct me if I’m wrong.

-- 
Kind regards,
Kristoffer
