Hadley, Thank you. I am able to get the xml_ns_strip() function to work with my file directly so I will likely be able to reach my immediate goal.
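For anyone following the thread, here is a minimal sketch of the approach that worked, using the small example from this thread written to a temporary file rather than my real file. The names tmp, doc, and nodes are only for illustration, and I am assuming an xml2 version recent enough to include xml_ns_strip().

```r
library(xml2)

## Same example document as in the thread, written to a temporary file
## so that read_xml() is given a path (an assumption standing in for my
## real file).
with_ns_xml <- c("<?xml version=\"1.0\" ?>",
                 "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
                 "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
                 "</WorkSet>")
tmp <- tempfile(fileext = ".xml")
writeLines(with_ns_xml, tmp)

doc <- read_xml(tmp)
## Remove the default namespace; this modifies doc in place.
xml_ns_strip(doc)
## The un-prefixed XPath from the original post can now be used.
nodes <- xml_find_all(doc, "/WorkSet//Description")
xml_text(nodes)
```

This sidesteps the namespace rather than explaining it, which is why the question below still stands.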
However, I still have had no success in understanding the namespace problem. I am not able to call read_xml() on the object I generated for the reproducible example, which is simply a character vector of length 4 holding the contents of the XML file as produced by readLines(). I then used dput() to define the structure. The resulting structure apparently is not to the liking of read_xml(). I have reproduced the necessary code here for your convenience. The error is below.

##
library(xml2)
library(stringr)
with_ns_xml <- c("<?xml version=\"1.0\" ?>",
                 "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
                 "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
                 "</WorkSet>")
## Without the str_c() collapse it also complains of a vector of length > 1.
read_xml(str_c(with_ns_xml, collapse = TRUE))
## produces the following error message.
Error in doc_parse_raw(x, encoding = encoding, base_url = base_url, as_html = as_html, :
  Start tag expected, '<' not found [4]

I have similar issues with xml2::xml_find_all
xml_find_all(str_c(with_ns_xml, collapse = TRUE), "/WorkSet//Description")
## Produces the following error message.
Error in UseMethod("xml_find_all") :
  no applicable method for 'xml_find_all' applied to an object of class "character"

R. Mark Sharp, Ph.D.
msh...@txbiomed.org

> On Jan 31, 2017, at 4:27 PM, Hadley Wickham <h.wick...@gmail.com> wrote:
>
> See the last example in ?xml2::xml_find_all or use xml2::xml_ns_strip()
>
> Hadley
>
> On Tue, Jan 31, 2017 at 9:43 AM, Mark Sharp <msh...@txbiomed.org> wrote:
>> I am trying to read a series of XML files that use a namespace and I have
>> failed, thus far, to discover the proper syntax. I have a reproducible
>> example below. I have two XML character strings defined: one without a
>> namespace and one with. I show that I can successfully extract the node
>> using the XML string without the namespace and fail when using the XML
>> string with the namespace.
>>
>> Mark
>> PS I am having the same problem with the xml2 package and am hoping
>> understanding one will help with the other.
>>
>> ##
>> library(XML)
>> ## The first XML text (no_ns_xml) does not have a namespace defined
>> no_ns_xml <- c("<?xml version=\"1.0\" ?>", "<WorkSet>",
>>                "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
>>                "</WorkSet>")
>> l_no_ns_xml <- xmlTreeParse(no_ns_xml, asText = TRUE, getDTD = FALSE,
>>                             useInternalNodes = TRUE)
>> ## The node is found
>> getNodeSet(l_no_ns_xml, "/WorkSet//Description")
>>
>> ## The second XML text (with_ns_xml) has a namespace defined
>> with_ns_xml <- c("<?xml version=\"1.0\" ?>",
>>                  "<WorkSet xmlns=\"http://labkey.org/etl/xml\">",
>>                  "<Description>MFIA 9-Plex (CharlesRiver)</Description>",
>>                  "</WorkSet>")
>>
>> l_with_ns_xml <- xmlTreeParse(with_ns_xml, asText = TRUE, getDTD = FALSE,
>>                               useInternalNodes = TRUE)
>> ## The node is not found
>> getNodeSet(l_with_ns_xml, "/WorkSet//Description")
>> ## I attempt to provide the namespace, but fail.
>> ns <- "http://labkey.org/etl/xml"
>> names(ns)[1] <- "xmlns"
>> getNodeSet(l_with_ns_xml, "/WorkSet//Description", namespaces = ns)
>>
>> R. Mark Sharp, Ph.D.
>> Director of Data Science Core
>> Southwest National Primate Research Center
>> Texas Biomedical Research Institute
>> P.O. Box 760549
>> San Antonio, TX 78245-0549
>> Telephone: (210)258-9476
>> e-mail: msh...@txbiomed.org
>>
>> CONFIDENTIALITY NOTICE: This e-mail and any files and/or...{{dropped:10}}
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> http://hadley.nz