I wrote:
> I saw that one.  It would be good to have a replacement for
> xmlParseBalancedChunkMemory, because after looking at the libxml2
> sources I realize that that's classed as a SAX1 function, which means
> it will likely go away at some point (maybe it's already not there in
> some builds).  That's a long-term consideration though.

Actually ... after nosing around in libxml2 some more, I noticed
xmlParseInNodeContext, which is the only other function specified to
parse a Well Balanced Chunk.  It requires a context node, but AFAICS
we can just gin up a dummy root node and use that.  It's existed for
plenty long enough for our purposes, and it's not semi-deprecated,
and it lacks the bug at hand.  So I'm now thinking about the attached.

As far as the errcontext changes go: I think we have to just bite the
bullet and accept them.  It looks like 2.13 has a completely different
mechanism than prior versions for deciding when to issue
XML_ERR_NOT_WELL_BALANCED.  And it's not even clear that it's wrong;
for example, in our first failing case

 DETAIL:  line 1: xmlParseEntityRef: no name
 <invalidentity>&</invalidentity>
                 ^
-line 1: chunk is not well balanced
-<invalidentity>&</invalidentity>
-                                ^

it's kind of hard to argue that the chunk isn't well-balanced.

So we can either suppress errdetails from the expected output, or set
up an additional expected-file.  I'm leaning to the
"\set VERBOSITY terse" solution.

			regards, tom lane

diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index d75f765de0..4a5517fd75 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -1822,6 +1822,8 @@ xml_parse(text *data, XmlOptionType xmloption_arg,
 		}
 		else
 		{
+			xmlNodePtr	root;
+
 			doc = xmlNewDoc(version);
 			if (doc == NULL || xmlerrcxt->err_occurred)
 				xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
@@ -1834,19 +1836,39 @@ xml_parse(text *data, XmlOptionType xmloption_arg,
 							"could not allocate XML document");
 			doc->standalone = standalone;
 
+			root = xmlNewNode(NULL, (const xmlChar *) "content-root");
+			if (root == NULL || xmlerrcxt->err_occurred)
+				xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
+							"could not allocate xml node");
+			/* This attaches root to doc, so we need not free it separately. */
+			xmlDocSetRootElement(doc, root);
+
 			/* allow empty content */
 			if (*(utf8string + count))
 			{
-				res_code = xmlParseBalancedChunkMemory(doc, NULL, NULL, 0,
-													   utf8string + count,
-													   parsed_nodes);
-				if (res_code != 0 || xmlerrcxt->err_occurred)
+				xmlNodePtr	node_list = NULL;
+				xmlParserErrors res;
+
+				res = xmlParseInNodeContext(root,
+											(char *) utf8string + count,
+											strlen((char *) utf8string + count),
+											XML_PARSE_NOENT | XML_PARSE_DTDATTR
+											| (preserve_whitespace ? 0 : XML_PARSE_NOBLANKS),
+											&node_list);
+
+				if (res != XML_ERR_OK || xmlerrcxt->err_occurred)
 				{
+					xmlFreeNodeList(node_list);
 					xml_errsave(escontext, xmlerrcxt,
 								ERRCODE_INVALID_XML_CONTENT,
 								"invalid XML content");
 					goto fail;
 				}
+
+				if (parsed_nodes != NULL)
+					*parsed_nodes = node_list;
+				else
+					xmlFreeNodeList(node_list);
 			}
 		}
 