On Wed, Apr 18, 2007 at 03:22:07PM +0530, Siju George wrote:
> Hi,
> 
> How do you handle it when you have to serve terabytes of data through
> HTTP/HTTPS/FTP etc.?
> Do you put it on different machines and use some kind of
> load balancer/intelligent program that directs to the right machine?
> 
> Do you use some kind of clustering software?
> 
> What hardware do you use to make your system scalable from a few
> terabytes of data to a few hundred of them?
> 
> Does OpenBSD have any clustering software available?
> 
> Is anyone running such setups?
> Please let me know :-)

I don't really know, but how about an HTTP proxy in front (hoststated
comes to mind; pound or squid would also work) with a lot of hosts
behind it, each serving a subset of the total? Yes, that's essentially
what you already suggested.
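
If you go the "each box serves a subset" route, the piece that directs
a request to the right machine can be as simple as a hash of the
request path. A minimal sketch, purely illustrative (host names made
up, no health checks or failover):

    # Map a request path to one of N content servers by hashing it,
    # so any given file only ever needs to live on one backend.
    import hashlib

    BACKENDS = ["data1.example.com", "data2.example.com",
                "data3.example.com"]

    def backend_for(path):
        digest = hashlib.sha1(path.encode("utf-8")).digest()
        return BACKENDS[int.from_bytes(digest[:4], "big") % len(BACKENDS)]

    for p in ("/iso/openbsd-4.1.iso", "/video/talk.mpg", "/docs/faq.html"):
        print(p, "->", backend_for(p))

The proxy (or a redirect in a small front-end web server) would consult
something like that and hand the client to the backend that holds the
file.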

I don't think NFS/AFS is that good an idea; you'd need very beefy file
servers and a fast network. Maybe rsync'ing the content from a central
file server to each web host would work?
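Something along the lines of "rsync -a --delete
master.example.com::webdata/ /var/www/data/" run from cron on each web
host (host and module names made up here) would keep the boxes in sync
with the master without every client request having to cross the
network to a file server.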

However, there are a lot of specialized solutions available (various
SANs come to mind; Google has published several papers on filesystems
and algorithms like MapReduce, although the latter isn't going to help
you with serving HTTP).

All in all, though, I think the most important factors are the rate of
change and the reliability requirements. A big web host might hold an
impressive amount of data, but it doesn't change all that often, and a
site occasionally going offline is usually tolerated (just restore a
recent backup). In such cases, something like the above seems to work.

                Joachim

-- 
TFMotD: moduli (5) - system moduli file
