In fact I had to handle the ODP dump on two occasions: the first time the results went into a MySQL database, the second time into a series of files.
On both occasions I used SAX parsers; DOM would just roll over and die with that much data. I placed code in the end element handler that would either save the data into the database or write it out to a file. In either case the data was only kept in memory for a short period, i.e. from the time the start element was detected, through the character data handling, until the end element was detected. (Obviously I am not talking about the root node here :-))
During the whole process you barely noticed the memory usage; the disk usage still went up, of course. Reading from disk 1 and writing to disk 2 does wonders!
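In case it helps, the pattern looked roughly like the sketch below. The element names ("ExternalPage", "d:Title") and the flush_record() sink are just placeholders to show the idea; the real script did an INSERT or an fwrite() at that point.

<?php
// Rough sketch of the approach, using PHP's SAX (expat) parser.
// Element names and flush_record() are illustrative assumptions,
// not the exact code used on the ODP dump.

$record  = array();   // at most one record lives in memory at a time
$current = '';        // element whose character data we are collecting

function start_element($parser, $name, $attrs)
{
    global $record, $current;
    if ($name == 'EXTERNALPAGE') {   // expat upper-cases names by default
        $record = array('about' => isset($attrs['ABOUT']) ? $attrs['ABOUT'] : '');
    }
    $current = $name;
}

function char_data($parser, $data)
{
    global $record, $current;
    if ($current == 'D:TITLE') {
        // character data can arrive in several chunks, so append
        $record['title'] = (isset($record['title']) ? $record['title'] : '') . $data;
    }
}

function end_element($parser, $name)
{
    global $record, $current;
    if ($name == 'EXTERNALPAGE') {
        flush_record($record);   // INSERT into the db or fwrite() to a file
        $record = array();       // and free the memory straight away
    }
    $current = '';
}

function flush_record($rec)
{
    // placeholder sink: one tab-separated line per record
    echo $rec['about'], "\t", isset($rec['title']) ? $rec['title'] : '', "\n";
}

$parser = xml_parser_create();
xml_set_element_handler($parser, 'start_element', 'end_element');
xml_set_character_data_handler($parser, 'char_data');

// feed the 700MB dump in small chunks so memory usage stays flat
$fp = fopen('content.rdf.u8', 'r');
while (!feof($fp)) {
    xml_parse($parser, fread($fp, 8192), feof($fp));
}
fclose($fp);
xml_parser_free($parser);
?>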
Please let me know if you need any further clarifications.
Pablo Gosse wrote:
Raditha Dissanayake wrote:
[snip]The biggest XML job I have handled with PHP is parsing the ODP RDF dump, which is around 700MB. Obviously arrays are out of the question in such a scenario, even though only one user will be accessing the script at a given moment. The ODP dump has a couple of million records.[/snip]
What was your solution for this, Raditha? How did you handle the parsing of such a large job?
Cheers, Pablo
--
Raditha Dissanayake.
------------------------------------------------------------------------
http://www.radinks.com/sftp/         | http://www.raditha.com/megaupload
Lean and mean Secure FTP applet with | Mega Upload - PHP file uploader
Graphical User Interface. Just 150 KB | with progress bar.