Please don't top post.

On 4/23/06, Rob Coops <[EMAIL PROTECTED]> wrote:
> Thanks for the tips, I think a combination of the two will be my best chance
> for this one.
>
> Store the bulk on disk or in a DB and use several smaller hashes to do the
> merging. After which I can retrieve the bulk from disk/db while looping over
> the resulting combined hash. (note to self: must not forget to dereference
> those several smaller hashes as soon as they are no longer useful)
>
>
> On 4/21/06, Dr.Ruud <[EMAIL PROTECTED]> wrote:
> >
> > Rob Coops wrote:
> >
> > > I have two quite large hashes each are several hundreds of MB's in
> > > size, now I want to with some logic merge these into a single hash. I
> > > this works of course but as one might imagine this takes quite a lot
> > > of memory. And can depending on the machine simply run out of memory.
> >
> > What is the structure of these hashes?
> >
> > Does each hash contain a lot of redundancy? If so, try to 'normalize'
> > them, which basically means splitting them up in even more hashes. :)
> >
> > How many keys? What is the minimal/average/maximal size of the values?
> > How about storing the data in a database?
> >
> > --
> > Affijn, Ruud

Benchmark a couple of different options to see, but my gut reaction is that three databases would be your best bet: two for the original hashes and one for the new hash. That should be fairly efficient both performance- and memory-wise, and tied hashes are drop-in replacements for your current hashes. See the docs for tie() and dbmopen() (and ignore the note about dbmopen() being deprecated; those who know best, e.g. Tom Phoenix and Randal Schwartz, advocate dbmopen() over tie()).

If you're using HoH or HoA structures, though, you'll have to get creative with pack or split; dbm doesn't handle references.

Also, whenever you're working with large hashes (or large arrays or lists), make sure you're using while and not for or foreach to iterate over your data. For hashes that means while with each(): foreach over keys() builds the complete list of keys in memory before the loop starts, whereas each() hands you one key/value pair at a time.

HTH,

-- jay
--------------------------------------------------
This email and attachment(s): [ ] blogable; [ x ] ask first; [ ] private and confidential

daggerquill [at] gmail [dot] com
http://www.tuaw.com  http://www.dpguru.com  http://www.engatiki.org

values of β will give rise to dom!
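
P.S. A minimal sketch of the three-database idea, in case it helps. Everything specific here is an assumption for illustration only: the file names, the sample records, the tab-separated record layout, and the "append the fields" merge rule all stand in for your real logic. It uses tie() with SDBM_File simply because that module ships with Perl; SDBM limits each record to roughly 1 KB, so for real data you would probably want DB_File or another dbm that is installed on your machine.

```perl
use strict;
use warnings;
use Fcntl;                        # O_RDWR, O_CREAT
use SDBM_File;                    # bundled with Perl; small per-record size limit
use File::Temp qw(tempdir);

# Hypothetical scratch directory; the dbm files are removed at exit.
my $dir = tempdir(CLEANUP => 1);

# Three on-disk hashes: two inputs and one output.
tie(my %first,  'SDBM_File', "$dir/first",  O_RDWR|O_CREAT, 0666) or die "first: $!";
tie(my %second, 'SDBM_File', "$dir/second", O_RDWR|O_CREAT, 0666) or die "second: $!";
tie(my %merged, 'SDBM_File', "$dir/merged", O_RDWR|O_CREAT, 0666) or die "merged: $!";

# dbm values are plain strings, so a record that would have been an
# array ref is flattened with join and recovered later with split.
$first{alice}  = join "\t", 'Alice', 30;
$first{bob}    = join "\t", 'Bob', 25;
$second{alice} = 'Amsterdam';
$second{carol} = 'Utrecht';

# each() fetches one pair at a time, so no full key list is built.
while (my ($key, $value) = each %first) {
    $merged{$key} = $value;
}

# Stand-in merge rule: append the second record's fields to the first's.
while (my ($key, $value) = each %second) {
    if (exists $merged{$key}) {
        $merged{$key} = join "\t", split(/\t/, $merged{$key}), split(/\t/, $value);
    }
    else {
        $merged{$key} = $value;
    }
}

# Walk the result the same way; the order of the pairs is not defined.
while (my ($key, $value) = each %merged) {
    printf "%s => %s\n", $key, join(', ', split /\t/, $value);
}
```

When you are done with a tied hash, untie it to flush and close the underlying files. Only one record per hash is in memory at any point in the loops above, which is the whole point of the exercise.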