I generated the regexp with (regexp-opt '("ftp" "http" "https" "file")). I admit that the list of protocols is not obvious from the regexp, but it was already done this way before my change, and I tried to keep the change minimal.

More concerning, in my opinion, is how Info-get-token is written. At some point in the code one can read the following statement:

    ;; First look for a match for START that goes across POS.
    (while (and (not (bobp))
                (> (point) (- pos (length start)))
                (not (looking-at start)))
      (forward-char -1))
Here start is a regexp, so (length start) is just the length of the string holding the regexp, not the maximum length that a match of the regexp can span. For instance, assume that aaaaaaaaaa is a new protocol which you want Info-get-token to catch. It is 10 characters long, but the typical regexp to catch it would be « a\{10\} », which is only 7 characters long. The backward scan then gives up before it reaches the start of the match.

It would probably be cleaner to pass the value to be used instead of (length start) as a separate optional argument, defaulting to (length start) if omitted.

Another way would be to have some standard function max-matchable-length that, given a regexp, computes the maximum length of its match (or returns some special value like t if the maximum length is unbounded). Then (length start) could be replaced by (max-matchable-length start). Maybe something like this already exists.

In the same vein, there could be some standard function that, given a regexp re and a position pos-in, returns a position pos-out such that the following expression is true:

    (save-excursion
      (goto-char pos-out)
      (and (<= pos-out pos-in)
           (looking-at re)
           (>= (match-end 0) pos-in)))

I think such a standard function already exists, but I can't remember the name of the package that provides it…

V.
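
P.S. To make the (length start) problem concrete, here is a minimal sketch of the "separate argument" idea. The name my-match-start-across and its MAX-SPAN argument are hypothetical and only for illustration; this is not existing Emacs code, and it is not a proposed patch to Info-get-token. It scans backward from POS-IN, at most MAX-SPAN characters, for a position satisfying the expression above.

```elisp
;; Hypothetical helper (not part of Emacs): find POS-OUT <= POS-IN such that
;; RE matches at POS-OUT and the match extends to POS-IN or beyond.
;; MAX-SPAN is an assumed, caller-supplied bound on how far back to look.
(defun my-match-start-across (re pos-in max-span)
  "Return a position at or before POS-IN where RE matches across POS-IN.
Look back at most MAX-SPAN characters.  Return nil if nothing is found."
  (save-excursion
    (save-match-data
      (let ((pos pos-in)
            (limit (max (point-min) (- pos-in max-span)))
            found)
        ;; Step backward one character at a time until a match spans POS-IN
        ;; or the look-back limit is passed.
        (while (and (not found) (>= pos limit))
          (goto-char pos)
          (if (and (looking-at re) (>= (match-end 0) pos-in))
              (setq found pos)
            (setq pos (1- pos))))
        found))))

;; The regexp string is shorter than the text it matches.
(length "a\\{10\\}")                                  ; => 7

(with-temp-buffer
  (insert "xx aaaaaaaaaa yy")
  (list
   ;; Bound taken from the string length of the regexp: too short.
   (my-match-start-across "a\\{10\\}" 13 (length "a\\{10\\}"))
   ;; Bound equal to the real maximum match length.
   (my-match-start-across "a\\{10\\}" 13 10)))        ; => (nil 4)
```

With the bound derived from the regexp string, the scan stops three characters short of the start of the protocol name; with the true maximum match length it finds position 4.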