On Thursday 16 April 2015 10:32:32 you wrote:
> > There you go; you find the updated patch attached. It now requires
> > HTML_PARSE_RECOVER option to be set for recovering from stand-alone
> > less-than characters.
>
> That sounds fine *except* it doesn't raise an error.
> The parser knows it's a broken construct that must be pointed out.

Ok, I see what I can do about that. ;)

> It sounds a bit weird to handle that error case as one of the main content
> cases, I would still be tempted to go into htmlParseStartTag, get the
> error reported, but push corrective data instead in recover mode.

My initial idea for a solution was to enter htmlParseElement() as before and,
if htmlParseElement() encounters an error, to handle the chunk as text instead
(if the recover option is on). That would probably come closest to what most
browsers seem to do. The problem is that it would require changing the
prototype of the public API function from

	void htmlParseElement(htmlParserCtxtPtr)

to

	int htmlParseElement(htmlParserCtxtPtr)

To avoid that API change, one could add another internal (static) version of
htmlParseElement() that provides a return value. However, there is already
htmlParseElementInternal(), so adding yet another one would become nasty IMO.

Best regards,
Christian Schoenebeck
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml
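
For illustration, the "internal version with a return value" idea discussed
above could look roughly like the sketch below. This is a toy model, not
libxml2 code: demoCtxt, demoParseElementStatus(), demoParseElement() and
demoParseContent() are made-up names, the "recover" field merely stands in for
the HTML_PARSE_RECOVER option, and the tag check is deliberately simplistic.
The point is only the shape: the public function keeps its void prototype,
while a static helper reports failure so the caller can fall back to text.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for the parser context; NOT htmlParserCtxt. */
typedef struct {
    const char *cur;   /* current input position */
    int recover;       /* stand-in for the HTML_PARSE_RECOVER option */
    int errors;        /* number of errors reported so far */
    char text[64];     /* character data collected by the content loop */
} demoCtxt;

/* Internal variant: returns 0 if a tag was consumed, -1 on error.
 * Toy rule (an assumption): '<' must be followed by a letter or '/'. */
static int demoParseElementStatus(demoCtxt *ctxt) {
    unsigned char next = (unsigned char) ctxt->cur[1];
    if (ctxt->cur[0] == '<' && (isalpha(next) || next == '/')) {
        const char *end = strchr(ctxt->cur, '>');
        ctxt->cur = end ? end + 1 : ctxt->cur + strlen(ctxt->cur);
        return 0;
    }
    ctxt->errors++;    /* the broken construct is still reported */
    return -1;
}

/* Public entry point: the void prototype stays unchanged. */
void demoParseElement(demoCtxt *ctxt) {
    (void) demoParseElementStatus(ctxt);
}

/* Content loop: if the element parse fails and recovery is enabled,
 * the stand-alone '<' is kept as ordinary text. */
void demoParseContent(demoCtxt *ctxt) {
    size_t n = 0;
    assert(ctxt != NULL && ctxt->cur != NULL);
    while (*ctxt->cur != '\0') {
        if (*ctxt->cur == '<') {
            if (demoParseElementStatus(ctxt) == 0)
                continue;          /* tag consumed, keep going */
            if (!ctxt->recover)
                break;             /* no recovery: stop at the error */
        }
        if (n + 1 < sizeof(ctxt->text))
            ctxt->text[n++] = *ctxt->cur;
        ctxt->cur++;
    }
    ctxt->text[n] = '\0';
}
```

In this toy model, parsing "a < b <i>c</i>" with recover set collects the text
"a < b c" and still counts one error; with recover unset, parsing stops at the
stand-alone '<'.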