On 1/8/10 Fri  Jan 8, 2010  2:06 PM, "Aimee Cardenas"
<aim...@sfbrgenetics.org> scribbled:

> Hi, All,
> 
> Hope everyone's new year is starting out very well.
> 
> I'm having to adjust a perl program I wrote to manipulate some
> genetics data.  Originally, the program had no memory problems but now
> that I've added a couple more hashes, I'm having memory issues.  It
> now runs out of memory when it's about half way through processing the
> data.  Unfortunately, the data is very interconnected and the
> statistics I need to execute involves data from several of the
> hashes.  I'm thinking of using threads & threads::shared in order to
> be able to process, store and access the data among several of 8
> processors on a Sun SPARC system.  Of course, I'll need to install the
> threads and threads::shared modules and possibly even re-compile perl
> on this machine but before I go and do all this fun stuff, I wanted to
> ask your opinion about whether or not I'm going down the right rabbit
> hole or if I'm just digging myself a shallow grave.  Would this be the
> way you might do it?  I've also heard of Semaphores.  Might this be a
> better way to go about spreading the data in hash form among several
> processors on one machine and still be able to access the data in each
> hash from the main program?

I am not familiar with the details of the SPARC architecture, but in
general multi-threading will not help with memory problems. Normally, all
processors in a multi-processor system access the same physical memory,
although each process gets its own portion of it. In a multi-threaded
program, all threads share the memory of a single process, so spreading the
work across processors buys you more CPU time, not more memory.

How much memory does your system have? Are you running in 32-bit mode or
64-bit mode (the latter can access more memory)?
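
If you are not sure how your perl was built, the core Config module can
tell you. The following is only a sketch: build_summary is a name I made up
for illustration, while ptrsize (pointer size in bytes) and useithreads
(set only when perl was compiled with thread support) are standard Config
entries.

```perl
use strict;
use warnings;
use Config;    # core module: exposes the options perl was built with

# Hypothetical helper (not part of any module): summarize the build.
sub build_summary {
    # ptrsize is in bytes: 4 means a 32-bit build, 8 means a 64-bit build.
    my $bits = $Config{ptrsize} * 8;

    # useithreads is undefined unless perl was compiled with ithreads.
    my $threads = $Config{useithreads} ? 'yes' : 'no';

    return "perl $Config{version} on $Config{archname}: "
         . "${bits}-bit, ithreads: $threads";
}

print build_summary(), "\n";
```

A 32-bit build is limited by its address space no matter how much RAM the
machine has, so this is worth checking before restructuring the program.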

You need to minimize your memory usage by trimming your hashes to the
minimum, making sure you have deleted data from memory after you are through
with it, and using a compact data representation. If you still run out of
memory, you will need to keep some data in a disk file or a disk-based
database, although that will slow down your program noticeably. Arrays have
a little less overhead than hashes, so use them instead when you can.
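
As a rough sketch of the disk-based option, a hash can be tied to an SDBM
file so that its contents live on disk rather than in memory. SDBM_File
ships with perl, but each key plus value is limited to roughly 1 KB, so
treat it as a starting point; DB_File (where Berkeley DB is installed) or a
real database would be the next step. The file location, the %geno hash, and
the sample records below are invented for illustration.

```perl
use strict;
use warnings;
use Fcntl;                     # O_RDWR, O_CREAT
use SDBM_File;                 # core module; small per-record size limit
use File::Temp qw(tempdir);

# Hypothetical location: a scratch directory removed when the program exits.
my $dir = tempdir(CLEANUP => 1);

my %geno;
tie(%geno, 'SDBM_File', "$dir/genotypes", O_RDWR | O_CREAT, 0666)
    or die "Cannot tie SDBM file: $!";

# Values must be flat strings, so join the fields instead of nesting hashes.
$geno{'ind001'} = join ',', 'A/G', 'C/C', 'T/T';
$geno{'ind002'} = join ',', 'A/A', 'C/T', 'T/T';

# Each lookup reads from disk; only the record in use is held in memory.
my @markers = split /,/, $geno{'ind001'};
print "ind001 marker 2: $markers[1]\n";

# Drop a record as soon as it is no longer needed.
delete $geno{'ind002'};
my $remaining = scalar keys %geno;
print "records left: $remaining\n";

untie %geno;                   # flush and close the underlying files
```

The rest of the program keeps using ordinary hash syntax, which is the main
appeal of tie; the cost is a disk access on every read and write.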
