On Mon, May 01, 2023 at 09:45:46PM +0200, Francesco Ariis wrote:
> A workaround is to `cp /etc/urlview/system.urlview ~/.urlview` and
> then replace REGEXP with
>     REGEXP (((http|https|ftp|gopher)|mailto):(//)?[^ <>"\t]*|(www|ftp)[0-9]?\.[-a-z0-9.]+)[^ .,;\t\n\r<">\):]?[^, <>"\t]*[^ .,;\t\n\r<">:]
> (i.e. erasing that last “\)”).

This is quite detrimental to the very common case of a URL that's
entirely parenthesised, or one that ends a parenthetical; compare:

$ tail -n3 text
Debian#1035358: https://en.wikipedia.org/wiki/Close_Combat_(series)
vs (https://en.wikipedia.org/wiki/Debian)
vs https://en.wikipedia.org/wiki/(You_Gotta)_Fight_for_Your_Right_(To_Party!)

$ grep -Eio '((http|https|ftp|gopher|gemini|mailto):(//)?[^ <>"]*|(www|ftp)[0-9]?\.[-a-z0-9.]+)[^ .,;<">\):]?[^, <>"]*[^ .,;<">:\)]' text | tail -n3
https://en.wikipedia.org/wiki/Close_Combat_(series
https://en.wikipedia.org/wiki/Debian
https://en.wikipedia.org/wiki/(You_Gotta)_Fight_for_Your_Right_(To_Party!

$ grep -Eio '((http|https|ftp|gopher|gemini|mailto):(//)?[^ <>"]*|(www|ftp)[0-9]?\.[-a-z0-9.]+)[^ .,;<">\):]?[^, <>"]*[^ .,;<">:]' text | tail -n3
https://en.wikipedia.org/wiki/Close_Combat_(series)
https://en.wikipedia.org/wiki/Debian)
https://en.wikipedia.org/wiki/(You_Gotta)_Fight_for_Your_Right_(To_Party!)

so this trivial solution fixes an (IME) rare case of a URL ending with
a ')' by breaking the much more common one.

It is quite likely something /can/ be cooked here, but I haven't managed
to in a good few minutes of fiddling. Attaching my test driver.

Best,
наб

static auto regex =
	// R"DUPA(((http|https|ftp|gopher|gemini|mailto):(//)?[^ <>"]*|(www|ftp)[0-9]?\.[-a-z0-9.]+)[^ .,;<">\):]?(([^, <>"]*)|(\([^, <>"]*\)))*[^ .,;<">:\)])DUPA";
	R"DUPA(((http|https|ftp|gopher|gemini|mailto):(//)?[^ <>"]*|(www|ftp)[0-9]?\.[-a-z0-9.]+)[^ .,;<">\):]?[^, <>"]*[^ .,;<">:\)])DUPA";

#include <cstdio>
#include <cstdlib>
#include <initializer_list>
#include <regex.h>

int main() {
	regex_t rgx;
	if(regcomp(&rgx, regex, REG_EXTENDED | REG_ICASE))
		abort();

	// The same three lines as in the grep transcript.
	for(auto l : {"Debian#1035358: https://en.wikipedia.org/wiki/Close_Combat_(series)",  //
	              " vs (https://en.wikipedia.org/wiki/Debian)",                            //
	              " vs https://en.wikipedia.org/wiki/(You_Gotta)_Fight_for_Your_Right_(To_Party!)"}) {
		regmatch_t matches[20];
		if(regexec(&rgx, l, 20, matches, 0))
			abort();

		puts(l);
		// Index 0 is the whole match; 1..re_nsub are the subexpressions.
		// Subexpressions that did not participate have rm_so == rm_eo == -1 and print as "".
		for(size_t i = 0; i <= rgx.re_nsub; ++i)
			std::printf("%zu: \"%.*s\"\n", i, (int)(matches[i].rm_eo - matches[i].rm_so), l + matches[i].rm_so);
		puts("");
	}
}

