On Wed, Sep 21, 2022 at 11:36:08AM +0200, Arne Wichmann wrote:
> Bail out! ERROR:common/util.c:67:strip_ansi_escapes: assertion failed (err ==
> NULL): Error while compiling regular expression
> ?[\u001b\u009b][[()#;?]*(?:[0-9]{1,4}(?:;[0-9]{0,4})*)?[0-9A-ORZcf-nqry=><]?
> at char 3: unrecognised character following \ (g-regex-error-quark, 103)
Argl. That's almost certainly the upstream bug
https://github.com/luakit/luakit/issues/1005
I've not commented on this because I was not really sure what
encoding the gchar *in would be -- if it's UCS-2, then the
complicated \u escapes make sense. If it's UTF-8, a simple \x1b
would do the job.
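To make the encoding point concrete, here is a minimal sketch, assuming the string really is UTF-8. At the byte level ESC is the single byte 0x1b, while U+009B (the 8-bit CSI) is encoded as the two bytes 0xc2 0x9b. The names strip_csi_utf8 and csi_intro_len are hypothetical; this is not luakit's strip_ansi_escapes(), and it only handles CSI-style sequences rather than everything the regular expression above tries to match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of a CSI introducer at p, or 0 if none.
 * Assumes p points into a NUL-terminated UTF-8 string. */
static size_t csi_intro_len(const unsigned char *p)
{
    if (p[0] == 0x1b && p[1] == '[')
        return 2;                      /* ESC [ (7-bit CSI) */
    if (p[0] == 0xc2 && p[1] == 0x9b)
        return 2;                      /* U+009B encoded as UTF-8 */
    return 0;
}

/* Hypothetical sketch, not luakit's code: copy `in` to `out`, dropping
 * CSI sequences. `out` must hold at least strlen(in) + 1 bytes. */
void strip_csi_utf8(const char *in, char *out)
{
    assert(in != NULL && out != NULL);
    const unsigned char *p = (const unsigned char *)in;

    while (*p) {
        size_t n = csi_intro_len(p);
        if (n == 0) {
            *out++ = (char)*p++;       /* ordinary byte: keep it */
            continue;
        }
        p += n;
        /* Skip parameter (0x30-0x3f) and intermediate (0x20-0x2f) bytes. */
        while (*p >= 0x20 && *p <= 0x3f)
            p++;
        /* Skip the final byte (0x40-0x7e), if present. */
        if (*p >= 0x40 && *p <= 0x7e)
            p++;
    }
    *out = '\0';
}
```

With a sufficiently large buffer, passing "\x1b[31mred\x1b[0m" leaves just "red", and other multi-byte UTF-8 text passes through unchanged.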
Even more importantly, looking at where this code is being used (when
formatting tracebacks, and when writing via va_log when things are
going to a file), I'm now convinced that this is a lot less critical
than I first thought. In particular, javascript console.log is *not*
sanitised at all, let alone by this code; to see that, run
    luakit http://www.tfiu.de/log-escape.html |& cat
(if you still have a running luakit somewhere) -- you'll see that the
colored messages are now b/w, but the escape sequence from javascript
is not filtered (although filtering it would seem like a good idea),
so you end up with a terminal writing in reverse video.
I hence felt it was ok to just experiment, and it seems we're talking
about UTF-8 strings here.
Can you build from https://salsa.debian.org/debian/luakit.git and see
whether (a) the thing builds and (b) luakit's log messages are b/w
when piped through cat as above?
Thanks,
Markus