Max Nikulin <maniku...@gmail.com> writes:

Hi, thank you for the thorough and well-informed answer.

> On 19/07/2023 19:06, Arthur Miller wrote:
>> I want to auto insert a title from an HTML page as description for an org
>> link in my notes.
>> (defun org-link-from-clipboard ()
> ...
>> (url-retrieve url
>> (lambda (buffer)
>> (goto-char (point-min))
>> (when (re-search-forward "<title>\\(.*\\)</title>" nil t)
>
> What are origins of your links?
> If it is an URL opened in browser then `org-capture' or
> org-protocol:/store-link/ may be used. There are a number of browser
> extensions for that.

I do use org-protocol, and I do have it in my FFX, so I am aware of it. But
sometimes I copy a link from a readme file, a piece of code or elsewhere, and
wish to stash it away in a note without necessarily opening it in a browser.
You know, a "todo" to come back to later :).

> More metadata sometimes desired and just page title is not enough. For
> extracting it within Emacs see e.g. Ihor's
> https://github.com/yantar92/org-capture-ref
> Search for its discussions on this mailing lists.
>
> Some complications:
> - titles may have &...; entities
> - Not all pages have <title>, so heuristics have to be used.

Yep, I am aware; the goal was not to be 100% foolproof. Occasionally I have
seen a couple of characters that Emacs can't disambiguate, but it is not a
problem. And yes, when there is no title it prompts for one; the other
strategy I used was to return just the URL itself or "no description". Perhaps
I should revert to just the URL.

> - Some HTML files contains nothing besides JavaScript to load actual content
> - Some URLs are from minifiers or obfuscated by Outlook "protection",
>   trampolines to prevent leaking of data through the Referer header, etc.
>   Likely redirection target should be saved, not original URL.
> - Some sites like GitHub have API that allows to get metadata in JSON format.
>   It is better than parsing HTML with regexp.

Yes, I am aware and completely agree with you!
Luckily I am getting quite old by now and don't visit too many sites, or sites
of dubious JS character, so for my needs IDC :). Miro's idea has served me well
for several years now; I just improved it a bit the other day to skip prompting
for the URL and to use an asynchronous download, which avoids the second or two
of delay on some links.

> Anyway I suggest to split non-interactive part of the command to allow code
> reuse (for drag and drop, etc.).

Tell me more about this. Can I drag a link from one buffer into a note buffer,
or how would I use it?

Thank you for the answer.
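
For reference, the suggested split could look roughly like the sketch below. It is only an illustration under assumptions, not the actual `org-link-from-clipboard': all `my/' names are hypothetical, the entity table covers just a few common cases, and the charset of the response is ignored. The pure functions take strings and return strings, so another caller (a drag and drop handler, a capture template) could reuse them; only the last command touches the kill ring and the network.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'subr-x)

;; All `my/' names are hypothetical and exist only for this sketch.

(defun my/html-title (html)
  "Return the text of the first <title> element in HTML, or nil."
  (let ((case-fold-search t))
    ;; [^<]* also matches newlines, unlike the .* in the quoted regexp.
    (when (string-match "<title[^>]*>\\([^<]*\\)</title>" html)
      (let ((title (string-trim
                    (replace-regexp-in-string
                     "[ \t\r\n]+" " " (match-string 1 html)))))
        (unless (string-empty-p title)
          title)))))

(defun my/decode-basic-entities (text)
  "Decode a few common HTML entities in TEXT (far from a complete table)."
  ;; "&amp;" goes last so that "&amp;lt;" is not decoded twice.
  (dolist (pair '(("&lt;" . "<") ("&gt;" . ">") ("&quot;" . "\"")
                  ("&#39;" . "'") ("&amp;" . "&"))
                text)
    (setq text (replace-regexp-in-string
                (regexp-quote (car pair)) (cdr pair) text t t))))

(defun my/org-link-for (url html)
  "Return an Org link string for URL, described by the title in HTML.
Fall back to a bare link when HTML is nil or has no usable title."
  (let ((title (and html (my/html-title html))))
    (if title
        (format "[[%s][%s]]" url (my/decode-basic-entities title))
      (format "[[%s]]" url))))

(defun my/org-insert-link-async (url)
  "Fetch URL asynchronously and insert an Org link where point was."
  (interactive (list (string-trim (current-kill 0 t))))
  (let ((pos (point-marker)))
    (url-retrieve
     url
     (lambda (_status)
       ;; The callback runs with the response buffer current.
       (let ((link (my/org-link-for url (buffer-string)))
             (target (marker-buffer pos)))
         (kill-buffer (current-buffer))
         (when (buffer-live-p target)
           (with-current-buffer target
             (save-excursion
               (goto-char pos)
               (insert link))))))
     nil t)))
```

With this shape, `(my/org-link-for "https://example.org" nil)' gives a bare link without any network access, which is the "revert to just the URL" fallback. In real use `org-link-make-string' is probably a better choice than `format', since it takes care of brackets in the description.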