Hi,

We are planning to use Hadoop for some very expensive and long-running
processing tasks.
The processes we plan to run are very heavy in terms of CPU and memory
requirements: one process instance uses almost 100% of a CPU core and
around 300-400 MB of RAM.
The first time a process loads it can take around 1 to 1.5 minutes, but after
that we can feed it data and each item takes only a few seconds to process.
Can I model this on Hadoop?
Can I have my processes pre-loaded on the task-processing machines and have
Hadoop supply the data? This would save the 1 to 1.5 minutes of initial load
time that each task would otherwise pay.
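To make it concrete, here is a sketch of the pattern I have in mind (the class
and method names are made up, and the "engine" below is just a stand-in for my
real process): load the heavy resource once per JVM via a static singleton, so
that if Hadoop reuses the same task JVM, only the first task pays the load cost.

```java
public class PreloadDemo {
    // Stand-in for the expensive process; the real one takes ~1-1.5 min to load.
    static class Engine {
        static int loadCount = 0;           // counts how many times we loaded
        Engine() { loadCount++; /* expensive initialization would happen here */ }
        String process(String data) { return "processed:" + data; }
    }

    private static Engine engine;           // shared by all tasks in this JVM

    // In a real job this would be called from Mapper.setup();
    // the engine is loaded only on the first call per JVM.
    static synchronized Engine getEngine() {
        if (engine == null) engine = new Engine();
        return engine;
    }

    // Simulates several map tasks running inside one reused JVM:
    // the engine loads once, then each "task" only pays per-record cost.
    public static void main(String[] args) {
        for (int task = 0; task < 3; task++) {
            System.out.println(getEngine().process("record" + task));
        }
        System.out.println("loads=" + Engine.loadCount);
    }
}
```

My assumption is that this only helps if Hadoop actually schedules multiple
tasks into the same JVM; otherwise every task JVM still pays the load cost once.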
I want to run a number of these processes in parallel based on each machine's
capacity (e.g., 6 instances on an 8-CPU box), or using the capacity scheduler.
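For the per-machine limits, I was imagining configuration along these lines
(assuming I have read the docs correctly; these property names are from classic
MapReduce, and the values are just examples for my setup):

```xml
<!-- Reuse task JVMs indefinitely within a job, so the heavy
     process loads once per JVM rather than once per task -->
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <value>-1</value>
</property>

<!-- Cap concurrent map tasks per node, e.g. 6 slots on an 8-CPU box -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>6</value>
</property>
```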

Please let me know whether this is possible, and any pointers on how it can be
done.

Thanks,
Amit
