Clusters are a brainchild of the 1960s and 70s: back in the 60s, collections of small computers (so-called compute farms) were used to crack numerical problems faster when money for the expensive big boxes was not available. The first architecture we would call a cluster today was introduced in 1976 with Tandem's NonStop system, a single system image (SSI) cluster with high availability and scalability out of the box.
The early 80s saw a lot of development effort going into availability clusters as a response to Tandem's huge success and high prices. These clusters were primarily Tandem clones that tried to reduce cost by using standard, off-the-shelf components (Intel or Motorola microprocessors) and some version of Unix as the starting point for the operating system. Auragen, Sequoia, Synapse and Tolerant all followed this approach with varying success.
The introduction of Digital's VAXCluster in 1984 was a turning point in cluster development: while the start-ups of the early 80s all used custom-developed hardware and software, the VAXCluster was the first project to build a cluster on regular server hardware (VAX) and software (VMS), beefed up with a shared disk attachment (Star Coupler) and technologies originating from distributed systems. Constant refinement over the years (e.g. cluster file system, cluster virtual IP and cluster management) resulted in one of the most successful cluster products.
The VAXCluster concept became the blueprint for many other cluster products: Pyramid Reliant and IBM HACMP were among the first (around 1990) to cluster regular server products. Many others followed, including Microsoft, which introduced the Microsoft Cluster Server (MSCS) for Windows NT in 1997.
In high performance computing it became obvious towards the end of the 80s that further supercomputer development would become increasingly difficult due to cost and physical limitations. The rediscovery of the power of parallelism led to several different parallel architectures:
- Large SMPs using several dozen to hundreds of CPUs on a uniform memory architecture (UMA) as a high-performance extension of the well-established SMP architecture. SMPs have the same programming model as uniprocessors, but the time wasted in spin-locks and waiting for memory grows with every CPU added.
- ccNUMA machines that combine SMP groups of typically four to eight CPUs with local memory and a memory interconnect that allows each CPU to access (parts of) the memory of other CPU groups. ccNUMA computers have more memory bandwidth and fewer spin-lock problems, but application performance depends on locality of reference (local vs. remote memory access). Both SMP and ccNUMA machines constitute a single fault zone.
- MPPs with thousands of CPUs, using either private memory with purely message-based communication or a NUMA memory scheme. MPPs have neither the memory nor the spin-lock problem but require a different programming paradigm: the application must be split into many parts that execute in parallel and communicate via messages (a minimal sketch follows this list).
- Compute clusters built from COTS PCs or workstations. The dramatic increase in CPU power, the advent of high-speed networking and new parallel execution environments paved the way for high-performance compute clusters that challenged traditional supercomputers at a fraction of the cost. Like MPPs, compute clusters require a parallel programming paradigm. Since 1994, Beowulf clusters have used common PC hardware running Linux to provide supercomputing for the masses. Today Beowulf-type clusters are ubiquitous and occupy top spots in the TOP500 supercomputer list.
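To make the message-passing paradigm concrete, here is a minimal sketch in C using MPI, the de facto message-passing standard that emerged in 1994 alongside Beowulf. MPI is a representative choice, not something the text above prescribes, and the series being summed and the constant N are arbitrary illustrative assumptions: each process computes a partial sum of the series 1/i² (which converges to π²/6) over its own slice of the index range, and a single reduction combines the results on process 0.

```c
/* partial_sum.c -- illustrative message-passing sketch: each process
 * sums its own slice of the series sum(1/i^2) and the partial results
 * are combined with one collective message. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const long N = 10000000L;   /* series length; arbitrary choice for illustration */
    int rank, nprocs;
    double local = 0.0, total = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* id of this process       */
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);  /* total number of processes */

    /* Each process works on an interleaved slice of the index range. */
    for (long i = rank + 1; i <= N; i += nprocs)
        local += 1.0 / ((double)i * (double)i);

    /* Combine the partial sums on rank 0 via message passing. */
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over %d processes: %.12f (pi^2/6 = 1.644934...)\n",
               nprocs, total);

    MPI_Finalize();
    return 0;
}
```

Built with mpicc and launched with, say, mpirun -np 4 ./partial_sum, the same binary runs unchanged on one node or across a whole cluster; this split-compute-combine pattern is the essence of the paradigm that MPPs and compute clusters share.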