Hi, I have a huge XML file: 1.7GB, 53080215 lines. I am trying to extract an attribute (code=) from each record. I have several problems, one of which is that the size of the file makes it painful to test my scripts and parsing methods.
I would like to extract a few hundred records (by any means) so I can experiment. I think XPath is the way to go here. The file (currently) sits on a *nix system, but I was going to do the parsing on a Win32 workstation rather than steal all the memory on a server. Below is a sample of some data. I have XML::XPath installed; there doesn't seem to be a libxml2 build for Win32.

This is my first effort, but I haven't been able to run it fully, as my workstation began to page severely after a while. So I would like some hints before I try again, as each attempt takes ages.

TIA,
Dp.

=======
#!/bin/perl

use strict;
use warnings;
use XML::XPath;
use XML::XPath::XMLParser;

# Note: XML::XPath builds the whole document tree in memory.
my $xmp = XML::XPath->new(filename => 'myfile.xml');

# Select every <record> under the root (no trailing slash in the path).
my $nodeset = $xmp->find('/records/record');

foreach my $node ($nodeset->get_nodelist) {
    # Read the code= attribute of this record.
    my $attrib = $node->getAttribute('code');
    print "$attrib\n";
}
=====
<?xml version = "1.0" encoding= "utf-8"?>
<records>
  <record code="65020/0002">
    <display_number>65020/003</display_number>
    <title>Moulded resistors in synthetic resin</title>
    <created_date>05-Mar-85</created_date>
    <updated_date>15-Nov-07</updated_date>
    <restrictions> </restrictions>
    </image>
...snip
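For cutting out a few hundred records to experiment with, one possible approach is to stream the file line by line instead of parsing it, so memory use stays flat. The following is only a sketch under assumptions not confirmed by the post: that the file follows the <records>/<record> layout of the sample and that each closing </record> tag appears on a line of its own. The names extract_sample, sample.pl and sample.xml are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy the start of a large XML file up to and including the Nth </record>,
# then write a closing </records> so the sample has a closed root element.
# Assumption: each closing </record> tag sits on its own line.
sub extract_sample {
    my ($in, $out, $limit) = @_;
    my $seen = 0;
    while (my $line = <$in>) {
        # Stop at the real end of the document; the root is closed below.
        last if $line =~ m{</records>};
        print {$out} $line;
        if ($line =~ m{</record>}) {
            last if ++$seen >= $limit;
        }
    }
    print {$out} "</records>\n";
    return $seen;
}

# Hypothetical usage: perl sample.pl myfile.xml sample.xml 300
if (@ARGV >= 2) {
    my ($src, $dst, $limit) = @ARGV;
    $limit = 300 unless defined $limit;
    open my $in,  '<', $src or die "Cannot read $src: $!";
    open my $out, '>', $dst or die "Cannot write $dst: $!";
    my $n = extract_sample($in, $out, $limit);
    close $out or die "Cannot close $dst: $!";
    print "Copied $n records to $dst\n";
}
```

Because only one line is held in memory at a time, this should run comfortably on either the server or the workstation, and the resulting sample file should be small enough to load with XML::XPath for testing.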