On Thu, Nov 19, 2009 at 9:07 AM, Dan Fish <d...@ninemoons.com> wrote:
> I'm working on a project that requires some rather large extractions from
> an Oracle DB (about 2 million rows) and while monitoring the task manager,
> I get an "Out of memory error" at about 2GB of mem usage.
>
> Client System Specifics:
>
> Win 2K3 Server 32-Bit SP2
> Intel Xeon 2.5GHz 8 CPU w/4GB RAM
> ActiveState Perl 5.10.1
> DBD-Oracle v1.21
> DBI v1.607
> Oracle 8i Database running on a SPARC/Solaris platform (using Oracle
> Client 10.2 on the Windows box)
>
> I've never tried handling any extractions this big before, so have several
> questions:
>
> Is there a 2GB address barrier in Perl?
>
> Being a Windows "Server" version (which I'm not familiar with), are there
> some Windows user/process settings that might be limiting the max usable
> RAM?
>
> Is there a built-in way to transparently "disk buffer" the returned
> records? (I.e. not having to handle the buffering details myself and still
> be able to use $sth->fetchall_arrayref())
>
> Any other elegant methods or suggestions for handling data extractions of
> this size?
>
> Once again, this is new territory for me... any suggestions or examples
> greatly appreciated!
>
> Thanks!
> -Dan
>
> ---
> "There are only 10 kinds of people in the world... Those that understand
> binary and those that don't!"

Hi Dan,

Having messed about with DBD::Oracle from time to time myself, I can assure
you the last thing you want to do is:

    $sth->fetchall_arrayref()

for exactly the reasons you encountered. Unless you are 100% certain the data
you are fetching will always be relatively small, you will have to fetch the
records one at a time, or write a little loop that fetches, say, 100k records,
dumps them to disk, fetches the next 100k, and so on.

Regardless of the upper memory limit in Perl (which I have no idea about;
there are much smarter people on this list who can tell you about that), just
pulling out a whole lot of records and shoving them into memory is a very
risky thing. Your DB is rather smallish with just 2M records, but imagine
trying to do the same on a data warehouse with tables containing, say, 200M
records or more: you will inevitably run out of memory at some point,
regardless of any per-process limit. Even if you don't, the other jobs running
on the machine will not be happy if you start eating up all available memory,
since they will no longer be able to allocate what they need.

There should not be any reason to hold 2M records in memory. If you need to
transform the data, have the DB do that for you by crafting the query
accordingly; if you need a selection or an alignment with other records, again
have the DB do it. It will be much faster and more efficient at that than
anything you could code in Perl.

Fetching 100k records with a simple loop like

    for ( my $i = 0; $i < 100_000; $i++ ) {
        my $fetched_row = $sth->fetchrow_arrayref or last;
        # perform whatever action you need on the data here
        push @$arrayref, [ @$fetched_row ];  # copy: DBI reuses the same arrayref per row
    }

should do the trick.

Regards,
Rob
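
P.S. If you would rather stick with fetchall_arrayref(), DBI also accepts a
$max_rows argument, so you can pull the result set in batches and write each
batch to disk as you go. Here is a rough, untested sketch of that approach;
the DSN, credentials, table and column names (dbi:Oracle:MYDB, big_table,
col1/col2) and the output file are all made up, so adjust them for your
environment:

    use strict;
    use warnings;
    use DBI;

    # Connection details are placeholders -- substitute your own DSN/user/password.
    my $dbh = DBI->connect( 'dbi:Oracle:MYDB', 'user', 'password',
                            { RaiseError => 1, AutoCommit => 0 } );

    my $sth = $dbh->prepare('SELECT col1, col2 FROM big_table');
    $sth->execute;

    open my $out, '>', 'extract.dat' or die "extract.dat: $!";

    my $batch_size = 100_000;    # rows held in memory at any one time
    while ( my $batch = $sth->fetchall_arrayref( undef, $batch_size ) ) {
        last unless @$batch;                              # no rows left
        print {$out} join( '|', @$_ ), "\n" for @$batch;  # dump this batch to disk
        last if @$batch < $batch_size;                    # short batch => end of result set
    }

    close $out or die "extract.dat: $!";
    $dbh->disconnect;

That way only $batch_size rows are ever in memory at once, no matter how big
the table gets. A real extract would also want proper quoting and NULL
handling (Text::CSV or similar) instead of a bare join, but the batching is
the part that keeps you under the 2GB ceiling.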