Thank you so much for the reply. I have not investigated the LCNAF data set thoroughly. However, my default/ideal is to read in all variables from a dataset.
So, I was wondering if anyone had an example Python or Perl script for reading an RDF/XML, Turtle, or N-Triples file. A simple/partial example would be fine.

Thanks,

Jean

On Mon, 29 Sep 2014, Kyle Banerjee wrote:

KB> The best way to handle them depends on what you want to do. You need to
KB> actually download the NAF files rather than countries or other small files
KB> as different kinds of data will be organized differently. Just don't try to
KB> read multigigabyte files in a text editor :)
KB>
KB> If you start with one of the giant XML files, the first thing you'll
KB> probably want to do is extract just the elements that are interesting to
KB> you. A short string parsing or SAX routine in your language of choice
KB> should let you get the information in a format you like.
KB>
KB> If you download the linked data files and you're interested in actual
KB> headings (as opposed to traversing relationships), grep and sed in
KB> combination with the join utility are handy for extracting the elements you
KB> want and flattening the relationships into something more convenient to
KB> work with. But there are plenty of other tools that you could also use.
KB>
KB> If you don't already have a convenient environment to work on, I'm a fan
KB> of virtualbox. You can drag and drop things into and out of your regular
KB> desktop or even access it directly. That way you can view/manipulate files
KB> with the linux utilities without having to deal with a bunch of clunky file
KB> transfer operations involving another machine. Very handy for when you have
KB> to deal with multigigabyte files.
KB>
KB> kyle
KB>
KB> On Mon, Sep 29, 2014 at 11:19 AM, Jean Roth <jr...@nber.org> wrote:
KB>
KB> > Thank you! It looks like the files are available as RDF/XML, Turtle, or
KB> > N-triples files.
KB> >
KB> > Any examples or suggestions for reading any of these formats?
KB> >
KB> > The MARC Countries file is small, 31-79 kb. I assume a script that
KB> > would read a small file like that would at least be a start for the LCNAF
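
As a rough illustration of the kind of script asked about in this thread, here is a minimal Python sketch that reads N-Triples one line at a time, so it does not need to hold a multigigabyte file in memory. It is only a sketch under stated assumptions: the regular expression is a simplification of the N-Triples grammar, the names `iter_triples` and `labels_by_subject` are made up for this example, and the sample data uses invented example.org identifiers rather than anything from the LC files. A dedicated RDF library such as rdflib would likely be the more robust route for real work.

```python
import re

# One N-Triples statement per line: subject, predicate, object, then " ."
# NOTE: this pattern is a simplification of the full N-Triples grammar.
TRIPLE_RE = re.compile(
    r'^(<[^>]*>|_:\S+)\s+'                 # subject: IRI or blank node
    r'(<[^>]*>)\s+'                        # predicate: IRI
    r'(<[^>]*>|_:\S+?|"(?:[^"\\]|\\.)*"'   # object: IRI, blank node, or literal
    r'(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?)'    # optional language tag or datatype
    r'\s*\.\s*(?:#.*)?$'                   # closing dot, optional comment
)

# Made-up sample data for illustration; not taken from the LC files.
SAMPLE = '''\
# comment lines and blank lines are skipped
<http://example.org/names/n1> <http://www.w3.org/2004/02/skos/core#prefLabel> "Example, Author"@en .
<http://example.org/names/n1> <http://www.w3.org/2004/02/skos/core#exactMatch> <http://example.org/other/42> .
'''


def iter_triples(lines):
    """Yield (subject, predicate, object) string tuples from N-Triples lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = TRIPLE_RE.match(line)
        if match:  # lines the simplified pattern cannot handle are skipped
            yield match.groups()


def labels_by_subject(lines, predicate):
    """Collect literal values of one predicate, keyed by subject IRI.

    Escape sequences inside literals are left undecoded in this sketch.
    """
    result = {}
    for subj, pred, obj in iter_triples(lines):
        if pred == predicate and obj.startswith('"'):
            text = obj[1:obj.rindex('"')]  # drop quotes and any @lang / ^^type
            result.setdefault(subj, []).append(text)
    return result


if __name__ == "__main__":
    for triple in iter_triples(SAMPLE.splitlines()):
        print(triple)
```

For a real download, the same generator can be fed an open file object instead of the sample string, for example `with open(path, encoding="utf-8") as fh: for s, p, o in iter_triples(fh): ...`, where `path` stands for wherever the N-Triples file was saved.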