date:20180820

Re: [xml] Error on parsing HTML with libxml

2018-08-20 Thread André Rothe

I can't change the source of the HTML page, because the page is generated by another system to which I have no access. I only receive the pages from there, and our Apache module performs a post-processing step just before the pages are sent to the user's browser. There I need a parser to chang

Re: [xml] Error on parsing HTML with libxml

2018-08-20 Thread André Rothe

I looked into the libxml code and found the function htmlParseScript() in HTMLparser.c. https://gitlab.gnome.org/GNOME/libxml2/blob/master/HTMLparser.c It describes the problem with the "<" character within scripts. However, it offers the option of using the recover mode to ignore the tags