Re: multipage html output

Ihor Radchenko Thu, 04 Jul 2024 09:19:48 -0700

Orm Finnendahl <orm.finnend...@selma.hfmdk-frankfurt.de> writes:

> Sure. I'm not at all familiar with the peculiarities of other output
> backends, but see your point. If you can give any hints or have any
> ideas *how* we could find general rules for separating the subtrees,
> which cover foreseeable use cases, or devise a flexible mechanism for
> doing so, I'd be glad to help setting them up and implementing them. I
> definitely agree, the code should be as general as possible while
> providing complete backward compatibility.


I think that the easiest way would be to add a new option to
`org-export-options-alist' - it is already extensible for individual
backends and allows users to tweak things via in-buffer keywords,
properties, variables, and export options.

The most generic rule would be some kind of function that takes an AST
node as input and returns whether that node should go to a separate
file or not and, if yes, tells (1) which export backend to use to export
that subtree to a file (may as well allow exporting to different
formats, while we are at it); (2) what export parameters are to be
used for that export, (possibly) including the file path.
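
A minimal sketch of what such a rule function could look like. The
function name, the "blog" tag, and the returned plist keys (:backend,
:ext-plist, :file-name) are assumptions made for illustration only; none
of them is an existing Org API. Only `org-element-type' and
`org-element-property' are real Org functions here.

```elisp
(require 'org-element)

;; Hypothetical rule function: the name and the returned plist keys
;; are illustrative and not part of Org.
(defun my/org-page-rule (node)
  "Return nil to keep NODE in the parent file, or a plist describing a page."
  (when (and (eq (org-element-type node) 'headline)
             (member "blog" (org-element-property :tags node)))
    (list :backend 'html                    ; which backend exports the page
          :ext-plist (list :with-toc nil)   ; export parameters for the page
          :file-name (format "%s.html"
                             (org-element-property :raw-value node)))))
```

Mapping it over the parse tree with `org-element-map' would then yield
one plist per subtree that should become a separate file.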

Then, in addition to the most generic (and most flexible) "rule being an
Elisp function", we can allow some simplified semantics to define rules.

The semantics should probably give a couple of toggles to customize:
(1) which subtrees are selected for export; (2) which export backend is
used; (3) how their file names are generated; (4) (optional) how they are
represented when exporting the whole original file, e.g. whether to put
links to exported files in place of their subtrees; (5) (optional) how
the original file is represented in the exported subtrees, e.g. whether
to put a backlink to the parent file.

The subtree selection may boil down to the usual TAGS matcher (or
function), as described in "11.3.3 Matching tags and properties" section
of the manual. This will cover the previously discussed separation based
on headline level, a tag, or a property.
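
As a rough illustration of selecting subtrees with a match string, the
existing `org-map-entries' already accepts the TAGS matcher syntax. The
helper name below is hypothetical and the sketch only lists the matching
headings; it does not export anything.

```elisp
(require 'org)

;; `my/org-matching-headings' is a hypothetical helper, not part of Org.
(defun my/org-matching-headings (match)
  "Return the titles of headings in the current buffer that satisfy MATCH."
  (org-map-entries
   ;; Strip tags, TODO keyword, priority and COMMENT from the title.
   (lambda () (org-get-heading t t t t))
   match))
```

For example, (my/org-matching-headings "LEVEL=2+blog") would collect the
second-level headings that carry the "blog" tag.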

The export backend selection may be realized by allowing multiple rules
with each rule defining selection/backend/file name/....

In terms of the value semantics in Elisp, I am thinking about something
re-using the backend definition format:

(setq org-export-pages
      '((:selector "LEVEL=2+blog+TODO=DONE"
         :backend html
         ;; Completely remove the exported subtree if the original
         ;; document is being exported.
         :page-transcoder nil
         ;; or :page-transcoder #'org-export-page-as-heading-with-link
         ;; or some other kind of template syntax
         :export-file-name "%{TITLE}-%{page-number}")

        (:selector a-function-accepting-ast-node
         :source-backend any
         :backend
         (:parent html ;; `org-export-define-derived-backend'-like semantics
          :options-alist
          ;; Do not export private headings in HTML pages.
          ((:exclude-tags "EXCLUDE_TAGS" nil
                          (cons "private" org-export-exclude-tags) split))))

        (:selector "+export_ascii_page"
         :source-backend html ; only use this rule when exporting to html
         :backend
         (:parent ascii
          :translate-alist
          ((template .
             (lambda (contents info)
               (format "Paged out from %s\n%s"
                       (plist-get
                        ;; INFO channel for parent document
                        (plist-get info :page-source)
                        :title)
                       (org-ascii-template contents info)))))))))

>> 2. Some backends, as you proposed, may target multipage export from the
>>    very beginning. So, we need to provide some way for the backend (in
>>    org-export-define*-backend) to specify that it wants to split the
>>    original parse tree. I imagine some kind of option with default
>>    values configured via backend, but optionally overwritten by user
>>    settings/in-buffer keywords.
>
> I'll look into that and maybe I can come up with something. I was
> hesitant to propose anything as I tried to stay as limited as possible
> and not get too deep into changing things. If you have suggestions,
> please let me know.

One way could be simply adding an option like :selector above to the
backend definition. Then, it will be used as the default selector:

(setq org-export-pages
  '((:selector default :backend html)) ; export pages to html with default selector
)

or even

(setq org-export-pages
  '((:backend html)) ; export pages to html with default selector
)

or just

;; export using the same target backend as selected in the export menu
(setq org-export-pages t)
;; (setq org-export-pages nil) - existing single page export
;; (setq org-export-pages 'only-pages) - only export pages, ignore original file

>> 3. Your suggestion to add a new export option for splitting based on
>>    headline level is one idea.
>> 
>>    Another idea is to split out subtrees with :EXPORT_FILE_NAME:
>>    property.
>
> I'm not sure I fully understand what you mean: Do you mean specifying
> different :EXPORT_FILE_NAME: properties throughout the same document
> and then export accordingly?

Yes. It is re-using the existing idea from subtree export:

13.2 Export Settings

‘EXPORT_FILE_NAME’
     The name of the output file to be generated.  Otherwise, Org
     generates the file name based on the buffer name and the extension
     based on the backend format.

If a subtree has that property set, it is used as the output file name.
Since there is usually no reason to set this property unless you also
want to export the subtree to an individual file, it makes sense to use
this as the selector for what to export as pages.

Example:

#+TITLE: Index document

* Emacs notes
** Emacs blog post #1
:PROPERTIES:
:EXPORT_FILE_NAME: my-first-post
:END:
...
** Fleeting note at [2024-06-20 Thu 22:16]
Some notes, no need to export them.

* Personal notes
** Personal blog post #1
:PROPERTIES:
:EXPORT_FILE_NAME: private/personal-post-trial
:END:
...
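
A small sketch of how such subtrees could be found with the existing
parser API. The helper name is hypothetical; it only reports which
headings would become pages and relies on node properties being exposed
as upper-case keyword properties of the headline element.

```elisp
(require 'org-element)

;; Hypothetical helper, not part of Org: list (TITLE . FILE-NAME) for
;; every heading that sets EXPORT_FILE_NAME in its own property drawer.
(defun my/org-export-file-name-pages ()
  "Return an alist of (TITLE . FILE-NAME) for headings with EXPORT_FILE_NAME."
  (org-element-map (org-element-parse-buffer 'headline) 'headline
    (lambda (hl)
      (let ((file (org-element-property :EXPORT_FILE_NAME hl)))
        (when file
          (cons (org-element-property :raw-value hl) file))))))
```

Run in the example document above, this should pick up the two blog post
headings and skip the fleeting note.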

>> 6. I can see people flipping between exporting the whole document and
>>    multipage document. We probably need some kind of easy switch in M-x
>>    org-export-dispatch to choose how to export.
>
> Sure, that is the disadvantage of my proposal to make everything a
> "multipage" document. Another disadvantage is that when the user
> chooses to open the final document or display it in a buffer the user
> can't choose whether to only open/display one page or every exported
> page. In most circumstances it should be advisable to just
> open/display the first page. We can also just add a switch between
> single-page and multipage, with multipage always just exporting to
> file, but that also has disadvantages.

What to open is a minor detail, really. It can be worked out whenever
we need to. The most sensible default, IMHO, is to open dired in the
directory containing all the exported pages.

> As the code I proposed is encapsulated in the html backend and not
> spreading all over the place, I will now first go ahead to finalize
> the existing code to a fully working setup. ASFAICT adapting that to
> other needs shouldn't require a complete rewrite. And I might be
> around for a while ;-)

I advise against doing this.
While reading your code, I saw that you used some html-specific
functions for modifications in ox.el. If you start by modifying ox.el in
the Org git repo directly, simply doing "make compile" will warn about
instances of using functions not defined in ox.el.
Another advantage of editing ox.el and using the Org repository is that
you can run "make test" any time and see if you managed to break Org :)

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
