Siju George wrote:
> Hi,
>
> How do you handle it when you have to serve terabytes of data through
> http/https/ftp etc.? Put it on different machines and use some kind of
> load balancer/intelligent program that directs to the right machine?
>
> Use some kind of clustering software?
>
> What hardware do you use to make your system scalable from a few
> terabytes of data to a few hundred of them?
>
> Does OpenBSD have any clustering software available?
>
> Is anyone running such setups?
> Please let me know :-)
>
> Thank you so much
>
> Kind Regards
>
> Siju
Too open-ended a question... Are you talking about many TB on one site? Lots of sites? Is there some reason it has to be on one server or one site? Is this "huge storage, huge demand"? Huge storage, low demand? Is this storage all needed on day one, or will it grow with time? (Hint: if it grows with time, build for NOW, with the ability to add later; don't buy storage in advance!) Etc.

Let the answers to those questions guide your engineering work; don't rely on knee-jerk reactions. And don't be afraid to change the question to meet the available answers. :) A common error is to take the given proposed solution (posed as a problem, but often someone has digested the REAL problem into what they think is the only possible model, and sent you down a bad alley) as gospel, and never question the basic assumptions.

I've got a web server with over 3.5TB of storage on it that cost about $6000US a year or so ago. It's a huge-storage, low-demand app; it probably gets, on average, a query a day, if that. If the box breaks, time can be spent repairing it, but we don't want to lose the data (it's carefully backed up, but the backup media is so compressed that it takes longer to uncompress the files than it does to scp them back onto the box!). So the thing has redundancy where it counts (disk) and simplicity where it doesn't matter, and it can be upgraded, enhanced, and changed as needed. And we have a small enough amount invested in the thing that we can completely change our mind about the approach to the problem any time in the future and throw it all away with a very clear conscience.

(My current boss-of-the-week thinks he wants to replace this with an unknown proprietary app feeding a $30,000-per-processor database server attached to a $60,000 disk array, so you can see how insignificant the price tag on this system is. You can also see something about my boss. And why I'm looking for a better job.)
Let's say you have one website that you are trying to serve massive amounts of static files from. I presume you aren't just dropping people at the root of a massive directory tree and letting them dig for their desired file... you probably have some kind of app directing them to the file they need. Well, you should have no problem also directing them to the SERVER they need. With a little magic on the front-end machine, you could also implement massive amounts of very cheap redundancy at very low cost.

For example, if you have two machines, A and B, skip RAID; just put both data sets on both machines. If you lose A, serve A's files from B -- it's a little slower, but still working. Repair A, resync (if needed), and you are back up and running at 100%. Now you can use the absolutely cheapest and least redundant machines around to accomplish your task. (In this case, your front-end machines would have to be a little more sophisticated... but they should still have multiple-machine redundancy.)

SANs are the cool way to do this, of course. They're also a very expensive way... and something I'd try to avoid unless it was really needed. Design it simple, design it to be fixable WHEN it breaks, and you will save your hair...

Use all the tricks you can for YOUR solution, including:

* Lots of "small" partitions.
* Mount RO any partitions you can (no need to fsck after an oops).
* Assume you will need more storage later, and figure out how to add it without removing data from your existing storage.
* Assume your existing 500G disk is going to look pathetic in a few years, when 10TB microdrives are in your palmtop computer, and make sure you have a plan to migrate the data off those first disks you installed.
* Guess how much processor you need, and figure out how to deal with it when you are wrong.
* Keep in mind that if you don't expect lots of demand this year, next year's systems will be a lot faster, bigger, and cheaper.
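The two-machine failover scheme above boils down to a small piece of routing logic on the front end. Here is a minimal sketch of that idea -- nothing from the original setup; the host names, the file-to-server map, and the `is_up` health check are all made-up assumptions for illustration:

```python
# Sketch of front-end routing with cheap mirror failover.
# Each path has a preferred server; every file also lives on a mirror,
# so if the preferred box is down we redirect to the mirror instead.

PRIMARY = {
    "big/archive1.tar.gz": "server-a.example.com",
    "big/archive2.tar.gz": "server-b.example.com",
}

# Each server's mirror (both data sets live on both machines).
MIRROR = {
    "server-a.example.com": "server-b.example.com",
    "server-b.example.com": "server-a.example.com",
}

def pick_server(path, is_up):
    """Return the host to redirect to, or None if no copy is reachable.

    `is_up` is a callable (e.g. a cached TCP health check), passed in
    so the routing logic stays testable without touching the network.
    """
    primary = PRIMARY.get(path)
    if primary is None:
        return None          # unknown file
    if is_up(primary):
        return primary
    backup = MIRROR[primary]
    return backup if is_up(backup) else None

# If A is down, A's files get served from B: slower, but still working.
print(pick_server("big/archive1.tar.gz",
                  lambda host: host != "server-a.example.com"))
# -> server-b.example.com
```

The point of keeping `is_up` as a parameter is that the failover policy is trivially testable; the actual health check can be as crude as a periodic TCP connect from the front-end box.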
* Last year's computers loaded with modern disks are still pretty darned fast for many applications.

Nick.