Hey,

I'm using libxml2-2.9.8.

When parsing XML with libxml2, I can record positions for the parsed nodes like this:

    ctxt->record_info = 1;
    xmlInitNodeInfoSeq(&ctxt->node_seq);
    xmlParseDocument(ctxt);

However, for HTML the equivalent

    ctxt->record_info = 1;
    xmlInitNodeInfoSeq(&ctxt->node_seq);
    htmlParseDocument(ctxt);

leads to a segfault for some (not necessarily well-formed) HTML files. A minimal example is an HTML file with the content "<label></label>", which crashes with the following backtrace:

#0 0x0000555555695199 in xmlSAX2EndElement (ctx=0x555555975a20, name=0x55555570141e "body") at external/libxml2/libxml2-2.9.8/SAX2.c:1815
#1 0x000055555561412b in htmlAutoCloseOnEnd (ctxt=0x555555975a20) at external/libxml2/libxml2-2.9.8/HTMLparser.c:1384
#2 0x000055555561cae2 in htmlParseContentInternal (ctxt=0x555555975a20) at external/libxml2/libxml2-2.9.8/HTMLparser.c:4674
#3 0x000055555561d0da in htmlParseDocument (ctxt=0x555555975a20) at external/libxml2/libxml2-2.9.8/HTMLparser.c:4817
#4 0x000055555556f81d in ParseHTML (content="<label></label>\n", nodes=0x7fffffffd7a0, error_message=0x7fffffffd8b0) at parser/xml_parser.cpp:431
#5 0x00005555555711e6 in main (argc=2, argv=0x7fffffffdb08) at parser/xml_parser.cpp:596

Does the API for parsing HTML files support recording the positions of nodes? If so, what am I doing wrong, or what can be done to prevent the segfault?

Thank you and best regards,
Ben