Karthik Chikmagalur <karthikchikmaga...@gmail.com> writes:

>>> 1. When running `save-buffers-kill-emacs', the time is down from several
>>> minutes to about 10 seconds -- still undesirably long but not egregious.
>>> This is the profile org-persist-delay-kill-emacs.eld.
>>
>> Ok. What about write time distribution now?
>
> Sorry, I lost this thread of the discussion.  Did you mean the
> speedscope flamegraph style trace?

I was mostly trying to figure out which exact files are causing the
slowdown - is it index or is it something else (say, org cache)?
But your data on my diff also answers this question - it was writing
index that is slow.

>> May you try the attached untested diff? (instead of previous)
>
> I tried it now.  There is an apples-to-apples comparison of LaTeX
> preview runs with 600 fragments in the two attached profiles:
>
> - org-persist-profile-test2.eld (Run with your new diff applied) and
> - no-cache-profile-test2.eld (Run without org-persist caching)
>
> I ran org-latex-preview on the buffer about ~15 times in each case to
> generate more data.  In both profiles, please ignore the function
> `org-latex-preview-clear-cache' and anything involving the word
> "precompile" or "precompilation".  These functions had to run to clear
> the cache before each run, and I don't think the profiler can be
> suspended once it starts.
>
> Qualitatively, the performance is much better with this org-persist
> strategy compared to what's on main right now, but there is still a
> large performance and responsiveness difference between caching the
> files in /tmp (where no elisp files are written) and caching them with
> org-persist.  This is evident in the profiles, where
> `org-latex-preview--cache-image' takes 25% of the total time when using
> org-persist, and 5% of the total when caching in /tmp.  This ratio is
> worse (45% vs 7%, not accurate) once you remove the time spent in
> precompilation and clearing the cache from the equation.

There is no way we can be on par with writing to a throwaway folder,
simply because org-persist has to be careful handling multiple Emacs
processes writing to the same cache simultaneously and writing the
metadata necessary to GC the old cache data.

I am attaching a set of proper patches:
1. The previous diff transformed onto a patch
2. Two more patches that should slightly improve bottlenecks revealed by
   your profiles

I can also think about two completely different approaches that could
allow faster read/write access to the cache when you can guranantee that
the same cache data is not written to index at the same time:

1. We can allow arbitrary files in the index, removing them only when
   they are older than some global setting. Then, you can generate the
   files by yourself and read them as needed. org-persist will simply
   serve as the way to figure out where to write the files and will take
   care about cleaning up the files after some time.

2. We can allow file containers to store an actual directory. Then, you
   will only need to write/read the directory once per session, obtain
   its real path from org-persist API, and read/write files inside
   manually, as you do without org-persist enabled. This will eliminate
   most of the overheads you experience.

>From 4e2f142bc09b3d9d43fb9115fcddf334e503da52 Mon Sep 17 00:00:00 2001
Message-ID: <4e2f142bc09b3d9d43fb9115fcddf334e503da52.1743249239.git.yanta...@posteo.net>
From: Ihor Radchenko <yanta...@posteo.net>
Date: Sat, 29 Mar 2025 12:15:38 +0100
Subject: [PATCH 1/3] org-persist--get-collection: Avoid slow-ish
 `replace-regexp-in-string'

* lisp/org-persist.el (org-persist--get-collection): It should be
faster to use `substring'.

Link: https://orgmode.org/list/87h63cj1c5....@gmail.com
---
 lisp/org-persist.el | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lisp/org-persist.el b/lisp/org-persist.el
index a639699d93..603d811778 100644
--- a/lisp/org-persist.el
+++ b/lisp/org-persist.el
@@ -640,7 +640,8 @@ (defun org-persist--get-collection (container &optional associated misc)
        (nconc
         (list :container (org-persist--normalize-container container)
               :persist-file
-              (replace-regexp-in-string "^.." "\\&/" (org-id-uuid))
+              (let ((uuid (org-id-uuid)))
+                (concat (substring uuid 0 2) "/" (substring uuid 2)))
               :associated associated)
         misc))))
 
-- 
2.47.1

>From 0564bc493fa4b0182c4474cb1d23eb0892e9ca3e Mon Sep 17 00:00:00 2001
Message-ID: <0564bc493fa4b0182c4474cb1d23eb0892e9ca3e.1743249239.git.yanta...@posteo.net>
In-Reply-To: <4e2f142bc09b3d9d43fb9115fcddf334e503da52.1743249239.git.yanta...@posteo.net>
References: <4e2f142bc09b3d9d43fb9115fcddf334e503da52.1743249239.git.yanta...@posteo.net>
From: Ihor Radchenko <yanta...@posteo.net>
Date: Sat, 29 Mar 2025 12:38:30 +0100
Subject: [PATCH 2/3] org-persist: Optimize performance of
 `org-persist--normalize-associated'

* lisp/org-persist.el (org-persist--get-collection):
(org-persist-register): Only call `org-persist--normalize-associated'
from API functions.  Assume that arguments to internal functions are
normalized.
(org-persist--normalize-associated): Make inlinable.  Handle nil value fast.
---
 lisp/org-persist.el | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lisp/org-persist.el b/lisp/org-persist.el
index 603d811778..d96a303820 100644
--- a/lisp/org-persist.el
+++ b/lisp/org-persist.el
@@ -630,7 +630,6 @@ (defun org-persist--get-collection (container &optional associated misc)
 MISC, if non-nil will be appended to the collection.  It must be a plist."
   (unless (and (listp container) (listp (car container)))
     (setq container (list container)))
-  (setq associated (org-persist--normalize-associated associated))
   (when (and misc (or (not (listp misc)) (= 1 (% (length misc) 2))))
     (error "org-persist: Not a plist: %S" misc))
   (or (org-persist--find-index
@@ -667,9 +666,10 @@ (defun org-persist--normalize-container (container &optional inner)
 (defvar org-persist--associated-buffer-cache (make-hash-table :weakness 'key)
   "Buffer hash cache.")
 
-(defun org-persist--normalize-associated (associated)
+(defsubst org-persist--normalize-associated (associated)
   "Normalize ASSOCIATED representation into (:type value)."
   (pcase associated
+    (`nil nil)
     ((or (pred stringp) `(:file ,_))
      (unless (stringp associated)
        (setq associated (cadr associated)))
@@ -1009,6 +1009,7 @@ (cl-defun org-persist-register (container &optional associated &rest misc
 with `org-persist-write'."
   (unless org-persist--index (org-persist--load-index))
   (setq container (org-persist--normalize-container container))
+  (setq associated (org-persist--normalize-associated associated))
   (when inherit
     (setq inherit (org-persist--normalize-container inherit))
     (let ((inherited-collection (org-persist--get-collection inherit associated))
-- 
2.47.1

>From 30591c95b3a217d83d7bd2739d28b0b9d1582d3d Mon Sep 17 00:00:00 2001
Message-ID: <30591c95b3a217d83d7bd2739d28b0b9d1582d3d.1743249239.git.yanta...@posteo.net>
In-Reply-To: <4e2f142bc09b3d9d43fb9115fcddf334e503da52.1743249239.git.yanta...@posteo.net>
References: <4e2f142bc09b3d9d43fb9115fcddf334e503da52.1743249239.git.yanta...@posteo.net>
From: Ihor Radchenko <yanta...@posteo.net>
Date: Sat, 29 Mar 2025 12:49:38 +0100
Subject: [PATCH 3/3] org-persist: Avoid writing index frequently

* lisp/org-persist.el (org-persist--write-elisp-file):
Do not write index every time.  Remove optional argument.
(org-persist-write:index): Update call.
(org-persist--merge-index): When BASE is nil, just return OTHER.  This
speeds things up when we have no index loaded.
(org-persist-register):
(org-persist-unregister):
(org-persist-read):
(org-persist-load-all):
(org-persist-write-all):
(org-persist-gc): Make sure that index is synchronized with disk and
conflicting (non-equal) entries are dropped.  This is an alternative
approach compared to 7999433067 that avoid frequent idnex
writes (which are slow).  We instead _read_ index solving the problem
of multiple Emacs instances writing to cache in different, faster,
way.

Link: https://orgmode.org/list/87h63cj1c5....@gmail.com
---
 lisp/org-persist.el | 104 +++++++++++++++++++++++---------------------
 1 file changed, 54 insertions(+), 50 deletions(-)

diff --git a/lisp/org-persist.el b/lisp/org-persist.el
index d96a303820..ae75ab7afc 100644
--- a/lisp/org-persist.el
+++ b/lisp/org-persist.el
@@ -454,15 +454,12 @@ (defun org-persist--read-elisp-file (&optional buffer-or-file)
 
 ;; FIXME: `pp' is very slow when writing even moderately large datasets
 ;; We should probably drop it or find some fast formatter.
-(defun org-persist--write-elisp-file
-    (file data &optional no-circular pp inhibit-writing-index)
+(defun org-persist--write-elisp-file (file data &optional no-circular pp)
   "Write to index and then write elisp DATA to FILE.
 When optional argument NO-CIRCULAR is non-nil, do not bind
 `print-circle' to t.
 When optional argument PP is non-nil, pretty-print the data (slow on
-moderately large data).
-INHIBIT-WRITING-INDEX will disable writing index file.
-"
+moderately large data)."
   ;; Fsync slightly reduces the chance of an incomplete filesystem
   ;; write, however on modern hardware its effectiveness is
   ;; questionable and it is insufficient to guarantee complete writes.
@@ -488,28 +485,23 @@ (defun org-persist--write-elisp-file
         print-number-table
         (start-time (float-time))
         (tmp-file (make-temp-file "org-persist-")))
-    ;; Every time we write cache data, make sure that index is up to
-    ;; date. This prevents situation when two Emacs sessions are writing
-    ;; different data under the same cache key, but do not update the
-    ;; index metadata about the cache data written (e.g. hash).
-    (when (or inhibit-writing-index (org-persist--save-index))
-      (unless (file-exists-p (file-name-directory file))
-        (make-directory (file-name-directory file) t))
-      ;; Do not write to FILE directly.  Another Emacs instance may be
-      ;; doing the same at the same time.  Instead, write to new
-      ;; temporary file and then rename it (renaming is atomic
-      ;; operation that does not create data races).
-      ;; See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=75209#35
-      (with-temp-file tmp-file
-        (insert ";;   -*- mode: lisp-data; -*-\n")
-        (if pp
-            (let ((pp-use-max-width nil)) ; Emacs bug#58687
-              (pp data (current-buffer)))
-          (prin1 data (current-buffer))))
-      (rename-file tmp-file file 'overwrite)
-      (org-persist--display-time
-       (- (float-time) start-time)
-       "Writing to %S" file))))
+    (unless (file-exists-p (file-name-directory file))
+      (make-directory (file-name-directory file) t))
+    ;; Do not write to FILE directly.  Another Emacs instance may be
+    ;; doing the same at the same time.  Instead, write to new
+    ;; temporary file and then rename it (renaming is atomic
+    ;; operation that does not create data races).
+    ;; See https://debbugs.gnu.org/cgi/bugreport.cgi?bug=75209#35
+    (with-temp-file tmp-file
+      (insert ";;   -*- mode: lisp-data; -*-\n")
+      (if pp
+          (let ((pp-use-max-width nil)) ; Emacs bug#58687
+            (pp data (current-buffer)))
+        (prin1 data (current-buffer))))
+    (rename-file tmp-file file 'overwrite)
+    (org-persist--display-time
+     (- (float-time) start-time)
+     "Writing to %S" file)))
 
 (defmacro org-persist-gc:generic (container collection)
   "Garbage collect CONTAINER data from COLLECTION."
@@ -944,7 +936,7 @@ (defun org-persist-write:index (container _)
     (let ((index-file
            (org-file-name-concat org-persist-directory org-persist-index-file)))
       (org-persist--merge-index-with-disk)
-      (org-persist--write-elisp-file index-file org-persist--index t nil t)
+      (org-persist--write-elisp-file index-file org-persist--index t nil)
       (setq org-persist--index-age
             (file-attribute-modification-time (file-attributes index-file)))
       index-file)))
@@ -973,26 +965,28 @@ (defun org-persist--merge-index (base other)
   "Attempt to merge new index items in OTHER into BASE.
 Items with different details are considered too difficult, and skipped."
   (if other
-      (let ((new (cl-set-difference other base :test #'equal))
-            (base-files (mapcar (lambda (s) (plist-get s :persist-file)) base))
-            (combined (reverse base)))
-        (dolist (item (nreverse new))
-          (unless (or (memq 'index (mapcar #'car (plist-get item :container)))
-                      (not (file-exists-p
+      (if (not base) other
+        (let ((new (cl-set-difference other base :test #'equal))
+              (base-files (mapcar (lambda (s) (plist-get s :persist-file)) base))
+              (combined (reverse base)))
+          (dolist (item (nreverse new))
+            (unless (or (memq 'index (mapcar #'car (plist-get item :container)))
+                        (not (file-exists-p
                             (org-file-name-concat org-persist-directory
                                                   (plist-get item :persist-file))))
-                      (member (plist-get item :persist-file) base-files))
-            (push item combined)))
-        (nreverse combined))
+                        (member (plist-get item :persist-file) base-files))
+              (push item combined)))
+          (nreverse combined)))
     base))
 
 ;;;; Public API
 
-(cl-defun org-persist-register (container &optional associated &rest misc
-                               &key inherit
-                               &key (expiry org-persist-default-expiry)
-                               &key (write-immediately nil)
-                               &allow-other-keys)
+(cl-defun org-persist-register
+    ( container &optional associated &rest misc
+      &key inherit
+      &key (expiry org-persist-default-expiry)
+      &key (write-immediately nil)
+      &allow-other-keys)
   "Register CONTAINER in ASSOCIATED to be persistent across Emacs sessions.
 Optional key INHERIT makes CONTAINER dependent on another container.
 Such dependency means that data shared between variables will be
@@ -1007,7 +1001,9 @@ (cl-defun org-persist-register (container &optional associated &rest misc
 VALUE pairs.
 When WRITE-IMMEDIATELY is non-nil, the return value will be the same
 with `org-persist-write'."
-  (unless org-persist--index (org-persist--load-index))
+  ;; Sync cache with disk, dropping conflicting items between multiple
+  ;; Emacsen.
+  (org-persist--merge-index-with-disk)
   (setq container (org-persist--normalize-container container))
   (setq associated (org-persist--normalize-associated associated))
   (when inherit
@@ -1035,7 +1031,9 @@ (cl-defun org-persist-unregister (container &optional associated &key remove-rel
 When ASSOCIATED is `all', unregister CONTAINER everywhere.
 When REMOVE-RELATED is non-nil, remove all the containers stored with
 the CONTAINER as well."
-  (unless org-persist--index (org-persist--load-index))
+  ;; Sync cache with disk, dropping conflicting items between multiple
+  ;; Emacsen.
+  (org-persist--merge-index-with-disk)
   (setq container (org-persist--normalize-container container))
   (if (eq associated 'all)
       (mapc (lambda (collection)
@@ -1072,7 +1070,9 @@ (cl-defun org-persist-read (container &optional associated hash-must-match load
     (org-persist-read \"My data\") ; => \"My data\"
     (org-persist-read \"My data\" nil nil nil
                       :read-related t) ; => (\"My data\" \"test\")"
-  (unless org-persist--index (org-persist--load-index))
+  ;; Sync cache with disk, dropping conflicting items between multiple
+  ;; Emacsen.
+  (org-persist--merge-index-with-disk)
   (setq associated (org-persist--normalize-associated associated))
   (setq container (org-persist--normalize-container container))
   (let* ((collection (org-persist--find-index `(:container ,container :associated ,associated)))
@@ -1120,7 +1120,9 @@ (cl-defun org-persist-load (container &optional associated hash-must-match &key
 
 (defun org-persist-load-all (&optional associated)
   "Restore all the persistent data associated with ASSOCIATED."
-  (unless org-persist--index (org-persist--load-index))
+  ;; Sync cache with disk, dropping conflicting items between multiple
+  ;; Emacsen.
+  (org-persist--merge-index-with-disk)
   (setq associated (org-persist--normalize-associated associated))
   (let (all-containers)
     (dolist (collection org-persist--index)
@@ -1175,7 +1177,9 @@ (defun org-persist-write (container &optional associated ignore-return)
 (defun org-persist-write-all (&optional associated)
   "Save all the persistent data.
 When ASSOCIATED is non-nil, only save the matching data."
-  (unless org-persist--index (org-persist--load-index))
+  ;; Sync cache with disk, dropping conflicting items between multiple
+  ;; Emacsen.
+  (org-persist--merge-index-with-disk)
   (setq associated (org-persist--normalize-associated associated))
   (if
       (and (equal 1 (length org-persist--index))
@@ -1287,9 +1291,9 @@ (defun org-persist--gc-orphan-p ()
 (defun org-persist-gc ()
   "Remove expired or unregistered containers and orphaned files.
 Also, remove containers associated with non-existing files."
-  (if org-persist--index
-      (org-persist--merge-index-with-disk)
-    (org-persist--load-index))
+  ;; Sync cache with disk, dropping conflicting items between multiple
+  ;; Emacsen.
+  (org-persist--merge-index-with-disk)
   (let (new-index
         (remote-files-num 0)
         (orphan-files
-- 
2.47.1

-- 
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

Reply via email to