* Julien Dallot <[email protected]> [2026-06-18 01:21]:
>
> Hello everyone,
>
> Based on what we discussed so far, here are some high-level guidelines we may
> want to follows:
> - we want to be able to include arbitrary metadata inside an org
>   link (in plain text form)

That is a bad idea. Metadata in the link syntax is metadata in the
wrong place, and every piece of metadata you add to the link is
information you cannot query, version, or intersect with anything else.

> - we want to be able to store arbitrary metadata in the org element
> - (once the link was parsed)

Maybe, but it creates a lot of work. Storing arbitrary metadata in the
Org element after parsing just moves the problem from the link syntax
to the property drawer. You still end up with unstructured,
non-queryable, non-versioned data cluttering your Org files instead of
living in a database where it belongs.

> - whenever possible, we want to follow web standards for links.

Yes, follow web standards. But web standards also say that link types
are extensible. The HTML specification treats link types as an
extensible set: you can add values beyond the predefined ones, and a
user agent that does not recognize a value simply ignores it. The rel
attribute can take multiple values, and custom data-* attributes are
perfectly valid.

    <a href="https://example.com"
       rel="magic video youtube"
       data-page="12"
       data-edges="0.240196 0.478535 0.331699 0.494949"
       data-timestamp="45">
      Ali Cook - Snake Goddess
    </a>

The metadata is in attributes, not in the URL, and that is valid.

Applied to Org, the web standard approach would be:

    [[id:127190][Ali Cook - Snake Goddess]]

Everything else (type, subtype, page number, coordinates, timestamp,
relationships) is stored in the database as properties of the object
referenced by id:127190.

That is exactly what the web does with HTML, JSON, and REST APIs. The
URL is the identifier. The metadata is in the response headers or the
payload.

Following web standards is good, but do not confuse link syntax with
metadata. The link is the identifier and the metadata belongs
elsewhere. The web has been built on that for decades.

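To make that separation concrete, here is a minimal sketch in Python
with SQLite. The table layout, the helper names (add_object, org_link,
find_by_property), and the second id 127191 are assumptions made up for
this illustration; they are not part of Org or of any existing tool.

```python
import sqlite3

# Hypothetical schema: one row per object an Org link can point to,
# and one row per metadata property attached to that object.
SCHEMA = """
CREATE TABLE objects (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE properties (
    object_id INTEGER NOT NULL REFERENCES objects(id),
    key       TEXT NOT NULL,
    value     TEXT NOT NULL,
    PRIMARY KEY (object_id, key)
);
"""


def make_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_object(conn, object_id, title, **props):
    """Store an object and its metadata; values are kept as text."""
    conn.execute(
        "INSERT INTO objects (id, title) VALUES (?, ?)", (object_id, title)
    )
    conn.executemany(
        "INSERT INTO properties (object_id, key, value) VALUES (?, ?, ?)",
        [(object_id, key, str(value)) for key, value in props.items()],
    )


def org_link(conn, object_id):
    """Build the Org link: it carries only the identifier and the title."""
    (title,) = conn.execute(
        "SELECT title FROM objects WHERE id = ?", (object_id,)
    ).fetchone()
    return f"[[id:{object_id}][{title}]]"


def find_by_property(conn, key, value):
    """Return the ids of all objects whose property `key` equals `value`."""
    rows = conn.execute(
        "SELECT object_id FROM properties "
        "WHERE key = ? AND value = ? ORDER BY object_id",
        (key, str(value)),
    )
    return [row[0] for row in rows]


conn = make_db()
add_object(conn, 127190, "Ali Cook - Snake Goddess",
           type="video", subtype="youtube", page=12, timestamp=45)
# 127191 is an invented id, used only to have a second object to query.
add_object(conn, 127191, "Section on Magic",
           type="pdf", page=12,
           highlight="0.240196 0.478535 0.331699 0.494949")

print(org_link(conn, 127190))              # [[id:127190][Ali Cook - Snake Goddess]]
print(find_by_property(conn, "page", 12))  # [127190, 127191]
```

In this sketch the Org file only ever holds the id link. Adding a new
field is one more row in the properties table, and a question such as
"everything on page 12" is a single query rather than a search through
link text.
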
    <a href="pdf:document.pdf#page=12"
       rel="pdf"
       data-viewer="zathura"
       data-highlight="0.240196 0.478535 0.331699 0.494949">
      Section on Magic
    </a>

A browser will render this as a link. Clicking it will attempt to open
pdf:document.pdf#page=12 using whatever handler is registered for the
pdf: scheme on your system.

The web standard does not limit you to http: and https:. It supports:

- Custom URI schemes (pdf:, mailto:, tel:, spotify:, etc.)
- Custom rel values
- Custom data-* attributes
- Fragment identifiers (#page=12)

> Based on those criteria, here is a possible alternative for the pdf link I
> presented at the beg of this thread:
> [[file:<path>#page=1#highlight=0.240196,0.478535,0.331699,0.494949]]
> and here is the "web-compatible" equivalent of current org link with
> ::<search-text> at the end:
> [[file:<path>#:~:text=<search-text>]]

This is valid in the sense that Org will parse the link and pass the
fragment to the file handler. You could write Elisp to parse the
fragment and extract the metadata.

But it fails, because it does not solve the fundamental problem: the
metadata is still embedded in the link. You cannot query it, version
it, or intersect it, and you cannot attach it to objects independently
of the file path. It conflates location with identity. If the file
moves, every link breaks, and there is no single source of truth.

I have links in Org that can stay in the Org file forever; no matter
where the target file moves, the link keeps working.

Your proposal is also not web-compatible. The web does not use
fragments for storing arbitrary structured metadata like coordinates,
viewer preferences, or relationships. It uses data-* attributes, rel
attributes, and separate data structures (JSON, databases, APIs).

Your idea scales poorly. What happens when you have 10,000 PDF
annotations? What happens when you want to change the highlight color
globally? What happens when you want to find all highlights on page 42
across all your PDFs?
What happens when you want to add a new field to every annotation?

Better this:

    <a href="pdf:document.pdf#page=12"
       rel="pdf"
       data-page="12"
       data-highlight="0.240196 0.478535 0.331699 0.494949">
      Section on Magic
    </a>

In Org terms:

    [[id:127190][Section on Magic]]

> The idea is that the link content (inside [[]]) is an (almost)
> working web link.
> As is, it actually works: firefox will open the pdf at the right page (if you
> take the "highlight" part out).
> Note that some post-treatment is necessary to reliably convert any
> org link to a web link, just to replace white spaces with special
> characters "%20" for instance --- but it seems major web browsers
> (firefox and chrome) do this conversion automatically.

It is not "web-compatible" in any meaningful sense.

> Otherwise known as the fragment directive, this sequence of characters tells
> the browser that what comes next is one or more user-agent instructions,
> which are stripped from the URL during loading so that author scripts cannot
> directly interact with them. User-agent instructions are also called
> directives.
> "
> coming from
> [[https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Fragment/Text_fragments#:~:text=fragment%20directive][there]]
>
>
> Although desirable, this compatibility with web seems fragile for many
> reasons:
> - adding any keyword that's not recognized by the web browser makes the whole
>   set of keywords useless (as I try rn).
>   So its seems that if we want to add an emacs-specific keyword, then the
>   whole link becomes corrupted --- it still opens, but no further action is
>   performed (like scrolling to the right pdf page)
> - neither firefox nor chrome support region highlighting for now
>   ("#highlight=<lt>,<rt>,<top>,<btm>")
> - (certainly minor) as discussed, it creates links that are harder to parse
>   for a human compared to plists.

The web standard place for metadata is not the URL. It is data-*
attributes, JSON-LD, microdata, or an API endpoint.
The URL is an identifier and the metadata is separate.

-- 
Jean Louis
