I can't change the source of the HTML page, because the page is
generated by another system that I don't have access to. I only receive the
pages from there, and our Apache module runs a post-processing step just
before the pages are sent to the user's browser. And there I need a
parser to chang
I have looked into the libxml2 code and found the function
htmlParseScript() in HTMLparser.c:
https://gitlab.gnome.org/GNOME/libxml2/blob/master/HTMLparser.c
The comment there describes the problem with the "<" character within
scripts, but the function also offers the possibility to use recover mode to ignore
the tags
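
To make the difference concrete, here is a minimal sketch of the two
scanning rules as I understand them from htmlParseScript(). It is not
libxml2 code: script_content_end() is a hypothetical helper written only
for illustration, and the real function does more (error reporting,
character checks, SAX callbacks). The assumption is that in recover mode
the script content ends only at "</" followed by the name of the current
element, while otherwise it ends at the first "</" followed by a letter.
Newer libxml2 releases may behave differently.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>  /* strncasecmp (POSIX) */

/*
 * Hypothetical helper, not part of libxml2.
 * Returns the offset in `s` where the raw content of the element `name`
 * (for example "script") is considered to end.
 *   recover != 0: stop only at "</" followed by `name`, case-insensitive.
 *   recover == 0: stop at the first "</" followed by an ASCII letter.
 * If no end is found, the length of `s` is returned.
 */
static size_t script_content_end(const char *s, const char *name, int recover)
{
    size_t name_len = strlen(name);
    size_t i;

    for (i = 0; s[i] != '\0'; i++) {
        /* Only "</" can terminate the content; a lone "<" is plain text. */
        if (s[i] != '<' || s[i + 1] != '/')
            continue;
        if (recover) {
            /* Ignore end tags that do not match the current element. */
            if (strncasecmp(s + i + 2, name, name_len) == 0)
                break;
        } else if (isalpha((unsigned char)s[i + 2])) {
            /* Any end tag ends the script, even inside a JS string. */
            break;
        }
    }
    return i;
}
```

For the input `document.write("<b>x</b>");</script>` the non-recover scan
stops at the `</b>` inside the string literal, whereas the recover scan
continues to `</script>`. With the real library, recover mode is requested
through the HTML_PARSE_RECOVER option flag, for example when calling
htmlReadMemory().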