From: Schoenwaelder Oliver <[EMAIL PROTECTED]>

> I have some problems with a script using hashes. I have used hashes
> for years but never had this kind of problem before... I have an
> ASCII file with 6.5 MB of data. The file is tokenized by the
> Parse::Lex module, and the tokens are then stored in a two-level
> hash: $TokenHash{$TokenType}->{$TokenID} = $TokenValue. The file
> contains 536,332 tokens, which leads to 79 keys for %TokenHash. I
> evaluate the hash with two loops, one for each level. Because I need
> to move back and forth through the _sorted_ hash while inside the
> loops, I can't use the built-in iteration like "foreach $key1 (keys
> %TokenHash) ...". So I decided to use Tie::LLHash, and now I'm
> amazed by the memory consumption: the script uses up to 300 MB to
> process this small file, which produces a 3.5 MB file at the end.
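To restate the structure you describe as a minimal sketch (the token
names, IDs and values are invented for illustration), the problem is
that Perl's built-in hash iteration only moves forward:

  use strict;
  use warnings;

  my %TokenHash;
  $TokenHash{IDENT}{17}  = 'foo';  # $TokenHash{$TokenType}->{$TokenID} = $TokenValue
  $TokenHash{NUMBER}{18} = '42';

  # foreach walks the (sorted) key list strictly forward; once the
  # loop has moved past a key there is no way to step back to it.
  for my $type (sort keys %TokenHash) {
      for my $id (sort { $a <=> $b } keys %{ $TokenHash{$type} }) {
          print "$type\t$id\t$TokenHash{$type}{$id}\n";
      }
  }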
You may want to try Tie::IxHash instead. But I think it would be best
to store the data as $TokenHash{$TokenType,$TokenID} = $TokenValue
and to tie %TokenHash to DB_File in the DB_BTREE format. The BTREE
keeps the keys sorted, so you can move back and forth through them,
and it should reduce the memory consumption of your script
considerably, since the data is stored on disk rather than in memory.
I've put a short sketch at the end of this message. Let me know if
you need more details :-)

Jenda
===== [EMAIL PROTECTED] === http://Jenda.Krynicky.cz =====
When it comes to wine, women and song, wizards are allowed
to get drunk and croon as much as they like.
	-- Terry Pratchett in Sourcery
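A minimal sketch of the DB_File/DB_BTREE approach; the file name, the
zero-padding of the IDs and the sample tokens are my own invention for
illustration:

  #!/usr/bin/perl
  use strict;
  use warnings;
  use DB_File;
  use Fcntl;

  # tie() keeps the data on disk in a sorted BTREE instead of in memory.
  my %TokenHash;
  my $db = tie %TokenHash, 'DB_File', 'tokens.db',
               O_RDWR | O_CREAT, 0666, $DB_BTREE
      or die "Cannot tie tokens.db: $!";

  # Composite key: both hash levels joined by Perl's $; separator,
  # which is exactly what $TokenHash{$TokenType,$TokenID} does
  # implicitly. Zero-padding the numeric ID makes the lexical BTREE
  # order match the numeric order.
  $TokenHash{ join($;, 'IDENT',  sprintf('%08d', 17)) } = 'foo';
  $TokenHash{ join($;, 'NUMBER', sprintf('%08d', 18)) } = '42';

  # seq() is a cursor over the sorted keys: R_FIRST/R_NEXT move
  # forward, R_PREV steps back, and R_CURSOR jumps to the first key
  # greater than or equal to a given one.
  my ($key, $value) = ('', '');
  for (my $status = $db->seq($key, $value, R_FIRST);
       $status == 0;
       $status = $db->seq($key, $value, R_NEXT)) {
      my ($type, $id) = split /$;/, $key, 2;
      print "$type / $id => $value\n";
  }

  undef $db;        # drop the extra reference before untie
  untie %TokenHash;

The BTREE compares keys lexically by default, so all entries for one
$TokenType cluster together; if you need a different order you can
set a compare subroutine through $DB_BTREE->{compare} before the
tie().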