Hello all,

Thank you for the extremely helpful information. As a follow-up, some of
the nested elements are of the form below:

<DischargeMedication>
  <Medication MedAdmin="0" MedID="10"/>
  <Medication MedAdmin="0" MedID="11"/>
I've been having trouble extracting this information and was wondering if
anyone had any suggestions.

Thank you,
Andrew

On Thu, Jan 5, 2017 at 7:39 AM, Franzini, Gabriele [Nervianoms]
<gabriele.franz...@nervianoms.com> wrote:

> Hello Andrew,
>
> as you are "clean slate" anyway in handling XML files, you could take a
> look at XSLT processing -- also an off-topic area.
> There are free tools available around, and many examples of "XML to CSV
> XSLT" on StackOverflow.
>
> HTH,
> Gabriele
>
> -----Original Message-----
>
> On January 4, 2017 12:45:08 PM PST, Ben Tupper <btup...@bigelow.org>
> wrote:
> > Hi,
> >
> > You should keep replies on the list - you never know when someone will
> > swoop in with the right answer to make your life easier.
> >
> > Below is a simple example that uses xpath syntax to identify (and in
> > this case retrieve) children that match your xpath expression. xpath
> > expressions are sort of like /a/directory/structure/description so you
> > can visualize elements of XML like nested folders or subdirectories.
> >
> > Hopefully this will get you started. A lot more on xpath here:
> > http://www.w3schools.com/xml/xml_xpath.asp
> > There are other extraction tools in xml2 - just type ?xml2 at the
> > command prompt to see more.
> >
> > Since you have more deeply nested elements you'll need to play with
> > this a bit first.
> >
> > library(xml2)
> > uri = 'http://www.w3schools.com/xml/simple.xml'
> > x = read_xml(uri)
> >
> > name_nodes = xml_find_all(x, "//name")
> > name = xml_text(name_nodes)
> >
> > price_nodes = xml_find_all(x, "//price")
> > price = xml_text(price_nodes)
> >
> > calories_nodes = xml_find_all(x, "//calories")
> > calories = xml_double(calories_nodes)
> >
> > X = data.frame(name, price, calories, stringsAsFactors = FALSE)
> > write.csv(X, file = 'foo.csv')
> >
> > Cheers,
> > Ben
> >
> >> On Jan 4, 2017, at 2:13 PM, Andrew Lachance <alach...@bates.edu>
> >> wrote:
> >>
> >> Hello Ben,
> >>
> >> Thank you for the advice.
> >> I am extremely new to any sort of coding so I have learned a lot
> >> already. Essentially, I was given an XML file and was told to convert
> >> all of it to a csv so that it could be uploaded into a database.
> >> Unfortunately the information I am working with is medical
> >> information and I can't really share it. I initially tried to convert
> >> it using online programs, however that ended up with a large number
> >> of blank spaces that weren't useful for uploading into the database.
> >>
> >> So essentially, my goal is to parse all the data in the XML to a
> >> coherent, succinct CSV that could be uploaded. In the document, there
> >> are 361 patient files with 13 subcategories for each patient, which
> >> further branch off to around 150 categories total. Since I am so new,
> >> I have been having a hard time seeing the bigger picture or knowing
> >> if there are any intermediary steps that will prevent all the blank
> >> spaces that the online conversion programs created.
> >>
> >> I will look through the information on the xml2 package. Any advice
> >> or recommendations would be greatly appreciated as I have felt fairly
> >> stuck. Once again, thank you very much for your help.
> >>
> >> Best,
> >> Andrew
> >>
> >> On Tue, Jan 3, 2017 at 2:29 PM, Ben Tupper <btup...@bigelow.org>
> >> wrote:
> >> Hi,
> >>
> >> It's hard to know what to advise - much depends upon the XML data you
> >> have and what you want to extract from it. Without knowing about
> >> those two things there is little anyone could do to help. Can you
> >> post example data to the internet and provide the link here? Then
> >> state explicitly what you want to have in hand at the end.
> >>
> >> If you are just starting out I suggest that you try the xml2 package
> >> ( https://cran.r-project.org/web/packages/xml2/ ) rather than the XML
> >> package ( https://cran.r-project.org/web/packages/XML/ ).
> >> I have been using xml2 much more since the authors added the ability
> >> to create xml nodes (rather than just extracting data from existing
> >> xml nodes).
> >>
> >> Cheers,
> >> Ben
> >>
> >> P.S. Hello to my niece Olivia S on the Bates EMS team.
> >>
> >> > On Jan 3, 2017, at 11:27 AM, Andrew Lachance <alach...@bates.edu>
> >> > wrote:
> >> >
> >> > <http://stats.stackexchange.com/questions/254328/how-to-convert-a-large-xml-file-to-a-csv-file-using-r?noredirect=1#>
> >> >
> >> > I am completely new to R and have tried to use several functions
> >> > within the xml packages to convert an XML to a csv and have had
> >> > little success. Since I am so new, I am not sure what the necessary
> >> > steps are to complete this conversion without a lot of NA.
> >> >
> >> > --
> >> > Andrew D. Lachance
> >> > Chief of Service, Bates Emergency Medical Service
> >> > Residence Coordinator, Hopkins House
> >> > Bates College Class of 2017
> >> > alach...@bates.edu <wcur...@bates.edu>
> >> > (207) 620-4854
> >> >
> >> > [[alternative HTML version deleted]]
> >> >
> >> > ______________________________________________
> >> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> > https://stat.ethz.ch/mailman/listinfo/r-help
> >> > PLEASE do read the posting guide
> >> > http://www.R-project.org/posting-guide.html
> >> > and provide commented, minimal, self-contained, reproducible code.
> >>
> >> Ben Tupper
> >> Bigelow Laboratory for Ocean Sciences
> >> 60 Bigelow Drive, P.O.
> >> Box 380
> >> East Boothbay, Maine 04544
> >> http://www.bigelow.org
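
For elements like the <Medication> nodes in the follow-up question, the values sit in attributes rather than in element text, so xml_text() would return empty strings. One possible approach with xml2 is to select the nodes with xml_find_all() and read each attribute with xml_attr(), producing one CSV row per medication. The sketch below is only an illustration: the <Patients>/<Patient> wrapper, the PatientID attribute, and the output file name are assumptions made up for the example, since the real file's structure was not posted; only <DischargeMedication>, <Medication>, MedAdmin and MedID come from the thread.

```r
library(xml2)

# Hypothetical sample data. The <Patients>/<Patient> wrapper and the
# PatientID attribute are assumptions; adjust the names to the real file.
txt <- '<Patients>
  <Patient PatientID="1">
    <DischargeMedication>
      <Medication MedAdmin="0" MedID="10"/>
      <Medication MedAdmin="0" MedID="11"/>
    </DischargeMedication>
  </Patient>
  <Patient PatientID="2">
    <DischargeMedication>
      <Medication MedAdmin="1" MedID="10"/>
    </DischargeMedication>
  </Patient>
</Patients>'
x <- read_xml(txt)

# Select every <Medication> that is a child of a <DischargeMedication>
med_nodes <- xml_find_all(x, "//DischargeMedication/Medication")

# For each medication, look up its enclosing (assumed) <Patient> element so
# that every row can be tied back to a patient
patient_nodes <- xml_find_first(med_nodes, "ancestor::Patient")

# The values are attributes, so use xml_attr() instead of xml_text()
meds <- data.frame(
  PatientID = xml_attr(patient_nodes, "PatientID"),
  MedAdmin  = xml_attr(med_nodes, "MedAdmin"),
  MedID     = xml_attr(med_nodes, "MedID"),
  stringsAsFactors = FALSE
)

# Written to a temporary directory here; point this at a real path as needed
out_file <- file.path(tempdir(), "discharge_medication.csv")
write.csv(meds, file = out_file, row.names = FALSE)
```

Keeping repeated elements in their own "long" table (one row per medication, keyed by patient) rather than spreading them across columns of a single wide table is one way to avoid the blank cells mentioned earlier in the thread, because patients with few medications simply contribute fewer rows. Whether that layout suits the target database is something only the database's schema can settle.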