On 2013-02-04 23:33, dormitionsk...@hotmail.com wrote:
> I see some of the posts some of you all make about really complex setups, and I see photos of data centers with racks upon racks upon racks of servers... and I hear some of you all talk about how you set up a thousand servers a year... and I can't imagine why any but the largest of organizations would need that many.
I guess I'll stray away into offtopic land from the original thread subject... sorry. So, to an extent, a number of servers may be needed to:

1) Achieve redundancy - if one of several identical or otherwise equivalent servers goes down, on schedule or by accident, the systems still work. Examples: servers holding the published state of a website, email relays, replicas of a database or an LDAP catalog.

2) Achieve separation of tasks - for predictability of the resources available to important tasks, and sometimes to contain break-ins and DoS attacks. For example, don't run a sensitive private database on a system connected to the internet - have a DMZ with the public servers in it, and only tiny holes in the firewalls that allow just as much access as those systems need in order to work.

3) Live within resource limits. There is only so much one computer can do. On some boxes the limit is memory, with newer programs being ever more RAM-hungry; it rarely pays off for developers and their management to spend months trimming a program's memory footprint when, over that same time, memory modules become twice as cheap. HDDs are limited in the number of seeks they can do per second - another reason why servers proliferate, along with dedicated storage systems such as ZFS boxes with SSD caches that feed many CPU+RAM nodes off a handful of colocated HDDs and caches over a fast network.

4) Handle the sheer number of tasks. Even as you begin to play with zones, you can see how each running environment spawns some 50 or 100 processes. Even when they sleep, they need a bit of processing time from the kernel, which walks the process tree and checks whether any process needs CPU time - and there is more overhead still if a process does need to number-crunch a bit. Switching between processes takes a finite amount of time for a "context switch", replacing CPU cache contents and registers. At some number of active processes you find that context switches are all the CPU does, with little or no time going to the processes themselves (or they receive it too rarely, which on average is the same thing). Different OSes hit this wall at different points: back in the day, spawning a thousand do-nothing processes could bring my Linux box to a crawl, while Solaris was not unhappy with some 5000 - but it too ground to a halt at about 7000 processes per CPU (a rough sketch of that experiment is further below).

The number of tasks processable per second translates, for large systems, into the number of requests processable per second - and into happy clients: staff in a bank or office, buyers in a webshop, and so on. For example, there may be a requirement that responses take no longer than one second, and when more visitors arrive, more computers have to be thrown at handling them to stay within the SLA (service-level agreement, whether a gentleman's agreement or a legally binding document). If one box can serve, say, 200 such requests per second within that limit and peak load is 1000 requests per second, you need at least five of them - plus spares per point 1.

In part, this is why multiprocessing systems (multi-socket, multi-core, multi-threaded) are good for the kind of multitasked load that basically any computer carries now: the system can keep some of the more active processes on the same CPU and so reduce the number of context switches, or the work each switch costs (i.e. not evict data from the CPU cache only to repopulate it a few microseconds later), so a process gets its time slices sooner - or maybe never leaves the CPU at all.
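You can also pin a process by hand. Just as a minimal sketch from me (not something from the thread; CPU id 0 is arbitrary and has to be an online CPU on your box, see psrinfo(1M)), OpenIndiana/Solaris exposes this as processor_bind(2), and pbind(1M) does the same from the shell:

/*
 * Minimal sketch: bind the current process to CPU 0 with
 * processor_bind(2).  CPU id 0 is just an example.
 */
#include <sys/types.h>
#include <sys/processor.h>
#include <sys/procset.h>
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
        processorid_t obind;

        /*
         * P_PID + P_MYID means "this very process"; the previous
         * binding, if any, comes back in obind.
         */
        if (processor_bind(P_PID, P_MYID, 0, &obind) != 0) {
                perror("processor_bind");
                exit(1);
        }
        (void) printf("bound to CPU 0 (previous binding: %d)\n", (int)obind);

        /* ... do the latency-sensitive work here ... */
        return (0);
}

The scheduler normally handles placement well enough on its own, so explicit binding is mainly for latency-sensitive workloads or for benchmarking.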
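And to come back to the do-nothing processes mentioned above - a rough, untested sketch of my own (the default count of 1000 and the sleep interval are arbitrary) could look like this:

/*
 * Rough sketch of the "do-nothing processes" experiment: fork N
 * children that only sleep, then watch what the scheduler does.
 */
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
        int i, n = (argc > 1) ? atoi(argv[1]) : 1000;

        for (i = 0; i < n; i++) {
                pid_t pid = fork();

                if (pid == 0) {
                        /* Child: do nothing, forever. */
                        for (;;)
                                (void) sleep(60);
                } else if (pid < 0) {
                        perror("fork");
                        break;
                }
        }
        (void) printf("spawned %d sleepers; Ctrl-C kills the whole group\n", i);
        (void) pause();  /* keep the parent (and the process group) alive */
        return (0);
}

Run vmstat 1 or mpstat 1 in another terminal and watch the cs (context switch) and system-time columns as you raise the count; Ctrl-C in the parent's terminal tears down the whole process group.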
Sun's T-series processors (Niagaras) were a particularly successful hardware experiment on this theory of keeping tasks close to the CPU: computationally puny, but with several cores and many "strands" per core - up to 8. The strands were basically register caches that let the core compute one task while several more were stashed away (usually waiting for IO to system RAM - which takes ages compared to IO to CPU cache, and next to which HDD and network IO take epochs long enough to spawn and destroy a universe). Switching back to a stuffed-away task was very quick: even the CPU registers stayed in place, the core only had to start executing another set of them - another strand. Since then the Niagaras have grown some specialized cores (i.e. for floating-point math and cryptography) and generally become more performant - but rather unavailable, unless you sell your soul and your grandchildren's university stash to Oracle ;) Still, the earlier generations of this CPU family form the OpenSPARC architecture, which Sun open-sourced so anyone can make compatible multi-core CPUs for research or for their own benefit. But this only boosts multiprocessing horsepower - with slow IO to any kind of storage becoming an even more visible bottleneck to overall system performance. Needing more boxes to improve... :)

5) In larger organizations there may also be the organizational problem of different departments owning different hardware (as in, paid from their budget), which can lead to sprawl, and which may be solvable by virtualized colocation - if their politicians agree. And there may be black boxes which "just work", nobody knows why or what for, but everyone prefers not to touch them ;] Likewise, separate subsystems may sit on separate hardware - i.e. the web farm, the corporate LDAP, the mail systems, the payroll database. Different people are responsible for them, on different schedules of wholesale upgrades/replacements, etc., even if the boxes are under-utilized.

I am sure that if the list members were to share their observations, we could find dozens more reasons to have many servers even in a moderately small organization... I'd say that performance (latency due to IO and context switching, and CPU utilization), redundancy, and reaching the limits of the hardware (no more RAM, so you can't add more tasks, and/or filesystem caching becomes inefficient and IO latency problems hit you regularly) are the main reasons for us here :)

HTH,
//Jim

_______________________________________________
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss