Re: Architecture question.

Bill de hOra Wed, 31 Dec 2008 09:28:09 -0800

Steve Loughran wrote:

aakash shah wrote:
We can assume that this record has only one key->value mapping. Value will be updated every minute. Currently we have 1 million of these (key->value) pairs, but I have to make sure that we can scale up to 10 million of these (key->value) pairs.
Every 10 minutes I will be updating all of these values using their keys. This is the reason I cannot go for a database as a solution.
I wouldn't be so quick to dismiss a database. All your big telcos run their mobile phone systems on databases, where the big issue is having enough memory for the DB to stay in memory; some dedicated databases (e.g. TimesTen) are designed to have bounded latency on lookup so you can predict how long operations will take.
That said, if you are only doing atomic updates of a single record, there's less need for the advanced features. Assuming >1 machine, some kind of distributed hash table may work.
I was thinking about going with memcache pool. In the meantime I heard about hadoop and wanted to get advice from this mailing list regarding memcache pool vs hadoop for this specific problem.
It's not an area Hadoop deals with at all.
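
For readers unfamiliar with the "distributed hash table" / memcache pool idea mentioned above, here is a minimal sketch of how a client might map each key to one node in a pool using a consistent hash ring. It is only an illustration under stated assumptions: the class names (HashRing, ShardedStore), the node names, and the use of in-process dicts in place of real cache servers are all hypothetical and are not taken from memcached or any particular client library.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring: maps a key to one node name (hypothetical helper)."""

    def __init__(self, nodes, replicas=100):
        # Each node is placed on the ring several times ("virtual nodes")
        # so that keys spread more evenly across the pool.
        ring = []
        for node in nodes:
            for i in range(replicas):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(key):
        # First 8 bytes of MD5 as an integer; used for placement, not security.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


class ShardedStore:
    """Stand-in for a pool of cache servers: one dict per node (assumption)."""

    def __init__(self, nodes):
        self._ring = HashRing(nodes)
        self._shards = {node: {} for node in nodes}

    def set(self, key, value):
        # Single-key overwrite on whichever node owns the key.
        self._shards[self._ring.node_for(key)][key] = value

    def get(self, key, default=None):
        return self._shards[self._ring.node_for(key)].get(key, default)
```

Usage would look like `store = ShardedStore(["node-a", "node-b", "node-c"])` followed by `store.set("sensor:42", "17.5")` and `store.get("sensor:42")`. Because every update touches exactly one key on one node, adding machines raises the total write capacity, which is why this shape suits the single-record update pattern described in the thread.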

The record size sounds too small for HDFS, unless the records are in turn grouped to something optimal for the block size. For records that size, I would also consider a) writing them out again instead of doing updates, b) testing for physical (disk) bottlenecks.
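
As a rough illustration of option a) above, the sketch below rewrites the whole record set as one large sequential file on each cycle instead of updating records in place, then swaps it in atomically. The function names (write_snapshot, load_snapshot), the tab-separated line format, and the assumption that keys and values contain no tabs or newlines are all hypothetical choices made for this example, not something specified in the thread.

```python
import os
import tempfile


def write_snapshot(records, path):
    """Write all (key, value) pairs to a fresh file, then atomically replace path."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as out:
            for key, value in records:
                # Assumption: keys and values contain no tab or newline characters.
                out.write(f"{key}\t{value}\n")
            out.flush()
            os.fsync(out.fileno())
        # Readers see either the old snapshot or the new one, never a partial file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_snapshot(path):
    """Read a snapshot back into a dict of key -> value (both as strings)."""
    result = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, value = line.rstrip("\n").split("\t", 1)
            result[key] = value
    return result
```

Sequential writes of one large file tend to be much kinder to disks than millions of small random updates, and a large file is also closer to the shape HDFS is designed for. Timing write_snapshot on representative data is one simple way to approach point b), checking whether the disk is the bottleneck.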


Also, there's memcachedb as an alternative for persistent hashing:

http://memcachedb.org/benchmark.html

Bill
