Re: proposal: arbitrary metadata in org links

Jean Louis Sun, 21 Jun 2026 23:16:05 -0700

* Julien Dallot <[email protected]> [2026-06-18 01:21]:
> 
> Hello everyone,
> 
> Based on what we discussed so far, here are some high-level guidelines we may 
> want to follows:
> - we want to be able to include arbitrary metadata inside an org
> link (in plain text form)


That is a bad idea, because metadata in the link syntax is metadata in
the wrong place, and every new piece of metadata you add to the link
is a piece of information you can never query, version, or intersect
with anything else.

> - we want to be able to store arbitrary metadata in the org element
> - (once the link was parsed)

Maybe, but it creates a lot of work, because storing arbitrary
metadata in the Org element after parsing just moves the problem from
the link syntax to the property drawer, and you still end up with
unstructured, non-queryable, non-versioned data cluttering your Org
files instead of living in a database where it belongs.

> - whenever possible, we want to follow web standards for links.

Yes, follow web standards. But web standards also say that link types
are expandable. The HTML5 specification defines link types as an
extensible set—you can use any value that is not a registered type,
and the behavior is implementation-defined. The rel attribute can take
multiple values, and custom types are perfectly valid.

<a href="https://example.com" 
   rel="magic video youtube" 
   data-page="12" 
   data-edges="0.240196 0.478535 0.331699 0.494949"
   data-timestamp="45">
   Ali Cook - Snake Goddess
</a>

The metadata is in attributes, not in the URL, and that is valid.
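
To make the separation concrete, here is a minimal sketch in Python
(not Elisp; the class and function names are made up for this
illustration) that reads the anchor above with the standard
html.parser module. The href comes back as the identifier, while rel
and the data-* attributes come back as separate metadata:

```python
from html.parser import HTMLParser

# The anchor from the example above, as a plain string.
SNIPPET = '''<a href="https://example.com"
   rel="magic video youtube"
   data-page="12"
   data-edges="0.240196 0.478535 0.331699 0.494949"
   data-timestamp="45">
   Ali Cook - Snake Goddess
</a>'''


class AnchorMetadata(HTMLParser):
    """Collect href, rel tokens and data-* attributes of the first <a>."""

    def __init__(self):
        super().__init__()
        self.seen = False
        self.href = None
        self.rel = []
        self.data = {}

    def handle_starttag(self, tag, attrs):
        if tag != "a" or self.seen:
            return
        self.seen = True
        for name, value in attrs:
            if name == "href":
                self.href = value            # the identifier
            elif name == "rel":
                self.rel = (value or "").split()
            elif name.startswith("data-"):
                # Strip the "data-" prefix: data-page -> page
                self.data[name[len("data-"):]] = value


def read_anchor(html):
    """Hypothetical helper: return (href, rel tokens, data-* dict)."""
    parser = AnchorMetadata()
    parser.feed(html)
    parser.close()
    return parser.href, parser.rel, parser.data


href, rel, data = read_anchor(SNIPPET)
print(href)
print(rel)
print(data)
```

The URL never has to be taken apart to reach the page number or the
coordinates.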

The web standard approach would be:

[[id:127190][Ali Cook - Snake Goddess]]

And everything else—type, subtype, page number, coordinates,
timestamp, relationships—stored in the database as properties of the
object referenced by id:127190. That is exactly what the web does with
HTML, JSON, and REST APIs. The URL is the identifier. The metadata is
in the response headers or the payload.

Following web standards is good, but do not confuse link syntax with
metadata. The link is the identifier and the metadata belongs
elsewhere; the web has been built on that for decades.

<a href="pdf:document.pdf#page=12" 
   rel="pdf" 
   data-viewer="zathura" 
   data-highlight="0.240196 0.478535 0.331699 0.494949">
   Section on Magic
</a>

A browser will render this as a link. Clicking it will attempt to open
pdf:document.pdf#page=12 using whatever handler is registered for the
pdf: scheme on your system.

The web standard does not limit you to http: and https:. It supports:

- Custom URI schemes (pdf:, mailto:, tel:, spotify:, etc.)
- Custom rel values
- Custom data-* attributes
- Fragment identifiers (#page=12)

> Based on those criteria, here is a possible alternative for the pdf link I 
> presented at the beg of this thread:
> [[file:<path>#page=1#highlight=0.240196,0.478535,0.331699,0.494949]]
> and here is the "web-compatible" equivalent of current org link with 
> ::<search-text> at the end:
> [[file:<path>#:~:text=<search-text>]]

This is valid in the sense that Org will parse the link and pass the
fragment to the file handler. You could write Elisp to parse the
fragment and extract the metadata.
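
For illustration, a rough sketch of such a fragment parser, written
here in Python rather than Elisp. The function name and the path
/tmp/document.pdf are assumptions (the quoted link only says <path>),
and only the page and highlight keys from the quoted example are
interpreted; the ":~:text=" form is not handled:

```python
def parse_link_fragment(link):
    """Split 'file:<path>#key=value#key=value' into (target, metadata)."""
    # Everything before the first '#' is the target; the rest is metadata.
    target, _, fragment = link.partition("#")
    metadata = {}
    for part in fragment.split("#"):
        if not part:
            continue
        key, _, value = part.partition("=")
        metadata[key] = value
    # Interpret the two fields used in the quoted example, if present.
    if "page" in metadata:
        metadata["page"] = int(metadata["page"])
    if "highlight" in metadata:
        metadata["highlight"] = tuple(
            float(x) for x in metadata["highlight"].split(",")
        )
    return target, metadata


# Hypothetical path, standing in for <path> in the quoted link.
link = ("file:/tmp/document.pdf"
        "#page=1#highlight=0.240196,0.478535,0.331699,0.494949")
target, meta = parse_link_fragment(link)
print(target)
print(meta)
```

Note that the metadata only exists after each individual link has been
parsed this way; nothing can ask for it across files without reading
and parsing every link first.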

But it fails, because it does not solve the fundamental problem: the
metadata is still embedded in the link. You cannot query it, version
it, or intersect it, and you cannot attach it to objects independently
of the file path. It conflates location with identity: if the file
moves, every link breaks, and there is no single source of truth.

I have links for Org that can stay in an Org file forever; no matter
where the file moves, the links keep working.
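
A minimal sketch of how that can work, assuming a small SQLite table
that maps ids to locations. The schema, the paths, and the function
names are invented for this example and do not describe any particular
existing setup. The Org link holds only the id, so a move is one
UPDATE and the link text never changes:

```python
import re
import sqlite3

# Matches links of the form [[id:127190][Description]].
ORG_ID_LINK = re.compile(r"\[\[id:(\d+)\]\[([^\]]*)\]\]")


def make_store():
    """Create an in-memory table mapping object ids to locations."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE objects (id INTEGER PRIMARY KEY, location TEXT NOT NULL)"
    )
    return db


def resolve(db, org_link):
    """Return the current location for an id link, or None if unknown."""
    match = ORG_ID_LINK.fullmatch(org_link)
    if match is None:
        raise ValueError(f"not an id link: {org_link!r}")
    row = db.execute(
        "SELECT location FROM objects WHERE id = ?", (int(match.group(1)),)
    ).fetchone()
    return row[0] if row else None


db = make_store()
# Hypothetical locations, for illustration only.
db.execute("INSERT INTO objects VALUES (127190, '/home/user/pdf/document.pdf')")

link = "[[id:127190][Section on Magic]]"
before = resolve(db, link)

# The file moves: one row changes, the link in the Org file does not.
db.execute(
    "UPDATE objects SET location = ? WHERE id = ?",
    ("/archive/2026/document.pdf", 127190),
)
after = resolve(db, link)
print(before)
print(after)
```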

And the proposed syntax is not web compatible. The web does not use fragments
for storing arbitrary structured metadata like coordinates, viewer
preferences, or relationships. It uses data-* attributes, rel
attributes, and separate data structures (JSON, databases, APIs).

Your idea scales poorly. What happens when you have 10,000 PDF
annotations? What happens when you want to change the highlight color
globally? What happens when you want to find all highlights on page 42
across all your PDFs? What happens when you want to add a new field to
every annotation?
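
With the annotations in a database, each of those questions is one
statement. A sketch with SQLite follows; the table layout, the color
column, and the two extra sample rows are assumptions made up for the
example, and only annotation 127190 uses values from this thread:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    """CREATE TABLE annotations (
           id INTEGER PRIMARY KEY,
           document TEXT NOT NULL,
           page INTEGER NOT NULL,
           x1 REAL, y1 REAL, x2 REAL, y2 REAL,
           color TEXT NOT NULL DEFAULT 'yellow')"""
)

# Sample rows; the second and third are invented for illustration.
rows = [
    (127190, "document.pdf", 12, 0.240196, 0.478535, 0.331699, 0.494949),
    (127191, "document.pdf", 42, 0.10, 0.20, 0.30, 0.25),
    (127192, "other.pdf", 42, 0.15, 0.60, 0.80, 0.65),
]
db.executemany(
    "INSERT INTO annotations (id, document, page, x1, y1, x2, y2) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    rows,
)

# Find all highlights on page 42 across all PDFs.
page_42 = db.execute(
    "SELECT id, document FROM annotations WHERE page = 42 ORDER BY id"
).fetchall()

# Change the highlight color globally.
db.execute("UPDATE annotations SET color = 'green'")

# Add a new field to every annotation.
db.execute("ALTER TABLE annotations ADD COLUMN note TEXT DEFAULT ''")

colors = {row[0] for row in db.execute("SELECT color FROM annotations")}
notes = [row[0] for row in db.execute("SELECT note FROM annotations")]
print(page_42)
print(colors)
print(notes)
```

With the same data embedded in link fragments, each of these would
mean finding and rewriting every link in every Org file.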

Better this:

<a href="pdf:document.pdf#page=12" 
   rel="pdf" 
   data-page="12" 
   data-highlight="0.240196 0.478535 0.331699 0.494949">
   Section on Magic
</a>

In Org terms:

[[id:127190][Section on Magic]]

> The idea is that the link content (inside [[]]) is an (almost)
> working web link.

> As is, it actually works: firefox will open the pdf at the right page (if you 
> take the "highlight" part out).
> Note that some post-treatment is necessary to reliably convert any
> org link to a web link, just to replace white spaces with special
> characters "%20" for instance --- but it seems major web browsers
> (firefox and chrome) do this conversion automatically.

It is not "web-compatible" in any meaningful sense.

> Otherwise known as the fragment directive, this sequence of characters tells 
> the browser that what comes next is one or more user-agent instructions, 
> which are stripped from the URL during loading so that author scripts cannot 
> directly interact with them. User-agent instructions are also called 
> directives.
> "
> coming from 
> [[https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Fragment/Text_fragments#:~:text=fragment%20directive][there]]
> 
> 
> Although desirable, this compatibility with web seems fragile for many 
> reasons:
> - adding any keyword that's not recognized by the web browser makes the whole 
> set of keywords useless (as I try rn).
>   So its seems that if we want to add an emacs-specific keyword, then the 
> whole link becomes corrupted --- it still opens, but no further action is 
> performed (like scrolling to the right pdf page)
> - neither firefox nor chrome support region highlighting for now 
> ("#highlight=<lt>,<rt>,<top>,<btm>")
> - (certainly minor) as discussed, it creates links that are harder to parse 
> for a human compared to plists.

The web standard for metadata is not the URL; it is data-* attributes,
JSON-LD, microdata, or an API endpoint. The URL is an identifier and
the metadata is separate.

-- 
Jean Louis
