Re: [xml] [PATCH] less-than character and HTML parser module

2015-06-29 Thread Daniel Veillard
On Thu, Apr 16, 2015 at 04:32:32PM +0800, Daniel Veillard wrote: > On Tue, Apr 14, 2015 at 05:43:42PM +0200, Christian Schoenebeck wrote: > > On Tuesday 14 April 2015 17:50:51 you wrote: > > > If anything like this does get put in, it should only be if it is a > > > configurable option that is disa

Re: [xml] [PATCH] less-than character and HTML parser module

2015-04-27 Thread Christian Schoenebeck
On Sunday 26 April 2015 03:24:35 Christian Schoenebeck wrote: > The 2nd patch (libxml-invalid-tag-as-text.patch) uses that more general way > to resolve this overall issue. That is, instead of looking at the content > and trying to guess ahead whether a less than character will yield in a > valid t

Re: [xml] [PATCH] less-than character and HTML parser module

2015-04-25 Thread Christian Schoenebeck
On Thursday 16 April 2015 13:59:28 Christian Schoenebeck wrote: > On Thursday 16 April 2015 10:32:32 you wrote: > > > There you go; you find the updated patch attached. It now requires > > > HTML_PARSE_RECOVER option to be set for recovering from stand-alone > > > less-than characters. > > > > Tha

Re: [xml] [PATCH] less-than character and HTML parser module

2015-04-16 Thread Christian Schoenebeck
On Thursday 16 April 2015 10:32:32 you wrote: > > There you go; you find the updated patch attached. It now requires > > HTML_PARSE_RECOVER option to be set for recovering from stand-alone > > less-than characters. > > That sounds fine *except* it doesn't raise an error. > The parser knows it's a

Re: [xml] [PATCH] less-than character and HTML parser module

2015-04-16 Thread Daniel Veillard
On Tue, Apr 14, 2015 at 05:43:42PM +0200, Christian Schoenebeck wrote: > On Tuesday 14 April 2015 17:50:51 you wrote: > > If anything like this does get put in, it should only be if it is a > > configurable option that is disabled by default - an xml parser should > > only accept a strictly-conform

Re: [xml] [PATCH] less-than character and HTML parser module

2015-04-16 Thread Daniel Veillard
On Tue, Apr 14, 2015 at 04:50:51PM +0100, Chris Tapp wrote: > > > On 14 Apr 2015, at 15:24, Christian Schoenebeck > > wrote: > > > > On Tuesday 14 April 2015 09:31:25 Alex Bligh wrote: > >> On 13 Apr 2015, at 22:43, Christian Schoenebeck > > wrote: > >>> I just encountered an issue with stand-

Re: [xml] [PATCH] less-than character and HTML parser module

2015-04-14 Thread Christian Schoenebeck
On Tuesday 14 April 2015 17:50:51 you wrote: > If anything like this does get put in, it should only be if it is a > configurable option that is disabled by default - an xml parser should > only accept a strictly-conforming document by default. Adding support for > ‘broken’ html because other (weak

Re: [xml] [PATCH] less-than character and HTML parser module

2015-04-14 Thread Chris Tapp
> On 14 Apr 2015, at 15:24, Christian Schoenebeck > wrote: > > On Tuesday 14 April 2015 09:31:25 Alex Bligh wrote: >> On 13 Apr 2015, at 22:43, Christian Schoenebeck > wrote: >>> I just encountered an issue with stand-alone less-than characters if the >>> document is parsed by libxml2's HTML p

[xml] [PATCH] less-than character and HTML parser module

2015-04-14 Thread Christian Schoenebeck
On Tuesday 14 April 2015 09:31:25 Alex Bligh wrote: > On 13 Apr 2015, at 22:43, Christian Schoenebeck wrote: > > I just encountered an issue with stand-alone less-than characters if the > > document is parsed by libxml2's HTML parser module. Consider you have a > > text > > > > in your HTML docu
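
For readers following the thread, here is a minimal sketch of the look-ahead idea discussed above: deciding from the next character whether a less-than sign could open markup, and treating it as text otherwise. It is not the code from either posted patch and does not use libxml2; the function names `looks_like_markup` and `escape_standalone_lt` are hypothetical, and the rule used (a letter, `/`, `!` or `?` after `<` counts as markup) is an assumption made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: guess whether the '<' at p could open markup.
 * Assumption for this sketch: a letter, '/', '!' or '?' directly after
 * '<' means "tag, end tag, comment/doctype or processing instruction". */
static int looks_like_markup(const char *p)
{
    unsigned char next = (unsigned char)p[1];
    return isalpha(next) || next == '/' || next == '!' || next == '?';
}

/* Copy `in` to `out`, writing "&lt;" for every '<' that does not look
 * like markup. Returns 0 on success, -1 if `out` is too small. */
int escape_standalone_lt(const char *in, char *out, size_t out_size)
{
    size_t used = 0;

    assert(in != NULL && out != NULL);

    for (const char *p = in; *p != '\0'; ++p) {
        const char *piece = p;   /* default: copy the byte unchanged */
        size_t len = 1;

        if (*p == '<' && !looks_like_markup(p)) {
            piece = "&lt;";      /* stand-alone '<' becomes text */
            len = 4;
        }
        if (used + len >= out_size)
            return -1;           /* no room for piece plus final NUL */
        memcpy(out + used, piece, len);
        used += len;
    }
    if (used >= out_size)
        return -1;               /* covers empty input with out_size 0 */
    out[used] = '\0';
    return 0;
}
```

Calling `escape_standalone_lt("a < b", buf, sizeof buf)` leaves `a &lt; b` in `buf`, while real tags such as `<p>` pass through unchanged. Input like `a<b` is still guessed to be a tag, which shows the weakness of guessing ahead that the thread raises when it moves to the more general second patch.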