Title: Clustering Solutions and Zero Downtime Hosting Pitfalls
Author: Godfrey Heron
Email: info@irieisle-online.com
Word Count: 1452
Copyright: © 2005 by Godfrey Heron
Article URL: www.irieisle-online.com/zero-downtime-hosting.htm
Publishing Guidelines: You may publish this article in your newsletter, on your web site, or in your print publication provided you include the resource box at the end. Notification would be appreciated but is not required.
-------------------------
Clustering Solutions and Zero Downtime Hosting Pitfalls
There are a number of benchmarks that we may use to evaluate hosting companies. One of these is reliability.
Like most things in this life, reliability in web hosting is typically a function of how much we are willing to spend for it. In essence, a “cost-effectiveness” equation needs to be determined and solved.
Reliability can be measured in terms of percentage availability. Industry personnel will talk of reliability in terms of system availability with three nines (99.9%), four nines (99.99%) or five nines (99.999%).
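To make these percentages concrete, the short Python sketch below converts an availability figure into the downtime it allows per year. The function name and the use of a 365-day year are assumptions made for illustration only.

```python
# Allowed downtime per year for a given availability percentage.
# Assumes a 365-day (non-leap) year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes of downtime per year implied by an availability figure."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]:
    minutes = downtime_minutes_per_year(pct)
    print(f"{label} ({pct}%): about {minutes:.1f} minutes of downtime per year")
```

On these assumptions, three nines allows roughly 8.8 hours of downtime a year, while five nines allows only a little over five minutes.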
Traditionally, web-hosting availability exceeding three nines was the purview of extremely large companies with multiple layers of redundancy built into their network and software systems. However, technology has now brought high-availability theory and cost-effective reality into alignment.
High availability can be achieved by removing, as far as possible, any “single point of failure” or, where this is not altogether possible, by minimizing the time spent in a “failure” situation.
One of the ways in which small businesses and ISPs can reasonably avoid single points of failure is by employing server farm clustering and load-balancing solutions.
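A rough way to see why redundancy helps is to estimate the availability of a group of servers that can stand in for one another. The sketch below assumes that servers fail independently of each other and that whatever distributes the traffic never fails itself; both are simplifications, and the 99% figure is an illustrative number, not a measurement.

```python
def combined_availability(single: float, servers: int) -> float:
    """Availability of `servers` redundant machines, each with availability `single`.

    Assumes independent failures: the group is down only when every
    server happens to be down at the same time.
    """
    if servers < 1:
        raise ValueError("at least one server is required")
    return 1 - (1 - single) ** servers


# Illustrative figure only: each server is assumed to be up 99% of the time.
for count in (1, 2, 3):
    print(f"{count} server(s): {combined_availability(0.99, count):.6f}")
```

Under these assumptions, two servers that are each available 99% of the time yield about 99.99% between them. Real failures are often correlated, so figures like this are best read as an upper bound.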
Webopedia defines server farm clustering as follows:
“A server farm is a group of networked servers that are housed in one location. A server farm streamlines internal processes by distributing the workload between the individual components of the farm and expedites computing processes by harnessing the power of multiple servers.
The farms rely on load-balancing software that accomplishes such tasks as tracking demand for processing power from different machines, prioritizing the tasks and scheduling and rescheduling them depending on priority and demand that users put on the network. When one server in the farm fails, another can step in as a backup.”
It is important to note that web servers which are load-balanced in this manner typically display one external IP address to the public Internet, while using internal network IPs to communicate between the clustered servers and the load balancer. Now this is indeed fantastic! Not only do you receive peak-demand scalability for your web site with web server clusters, but you also have the built-in “high uptime availability” component which is so important.
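The arrangement described here, one public address in front of several internal servers, can be sketched in a few lines of Python. This is only a toy round-robin model, not the workings of any particular load-balancing product; the class name, method names and all IP addresses are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer: one external IP, several internal servers."""

    def __init__(self, external_ip, internal_ips):
        self.external_ip = external_ip            # the only address the public sees
        self.internal_ips = list(internal_ips)    # private addresses of the cluster
        self.healthy = set(self.internal_ips)     # servers currently accepting work
        self._ring = cycle(self.internal_ips)     # endless round-robin iterator

    def mark_down(self, ip):
        self.healthy.discard(ip)

    def mark_up(self, ip):
        if ip in self.internal_ips:
            self.healthy.add(ip)

    def pick_server(self):
        """Return the next healthy internal server, skipping failed ones."""
        for _ in range(len(self.internal_ips)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers left in the farm")


# Hypothetical addresses, used purely for illustration.
balancer = RoundRobinBalancer("203.0.113.10", ["10.0.0.11", "10.0.0.12", "10.0.0.13"])
print([balancer.pick_server() for _ in range(3)])
balancer.mark_down("10.0.0.12")  # simulate a failed server
print([balancer.pick_server() for _ in range(3)])
```

After the simulated failure, requests simply rotate among the remaining servers, which is the "another can step in as a backup" behaviour from the definition above.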
However, this is only half of the picture. There are very important cautionary notes to keep in mind.
Where web hosting is concerned, availability depends on two things:
1. Hardware reliability (RAID drives, server clustering, etc.) within the Data Center;
2. High-bandwidth Internet connectivity to the Data Center / Network Operations Center (NOC).
Now, with all your well-thought-out server clustering solutions, what would be the result if (as recently occurred at a very high-profile web company) a fire in the vicinity of the network caused the entire Data Center to shut down power for hours? Or if a bandwidth provider to the NOC had router problems? All your websites would be showing the dreaded “Page Cannot be Displayed” page.
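The point can be put in numbers. Because a site is reachable only when both the hardware and the connectivity are working, the two availabilities multiply. The sketch below assumes the two fail independently, and the figures used are illustrative, not measurements from any real facility.

```python
def site_availability(cluster: float, connectivity: float) -> float:
    """Availability of a site that needs BOTH its server cluster and its
    Internet connectivity to be up (assumes independent failures)."""
    return cluster * connectivity


# Illustrative figures: a four-nines cluster behind a three-nines connection.
cluster = 0.9999
connectivity = 0.999
print(f"overall availability: {site_availability(cluster, connectivity):.6f}")
```

However good the cluster is, the overall figure can never exceed that of the weaker component, which is why a single Data Center or a single bandwidth provider remains a single point of failure.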
The ideal solution, therefore, would be to employ clustering solutions with servers in entirely different Data Centers with different bandwidth providers. Redundant Data Centers eliminate the NOC itself as a single point of failure. The scenario becomes interesting at this point, because the difficulty of addressing the potential problems now increases considerably.
We now have to deal with DNS caching, the concept of failover, and how static and dynamic web applications respond to failure events.
The terms failover and load balancing are frequently used interchangeably; however, they are in fact quite different.
· Load balancing refers to sharing the workload across the capacity of several servers, so that no one server is overloaded and swamped with requests.
· Failover, however, is the process that manually or automatically switches from a failed server or bandwidth provider to a standby server or network if the primary system fails or is temporarily shut down for servicing.
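The failover half of this distinction can be sketched as follows. Unlike load balancing, where every healthy server shares the work, failover sends all traffic to the primary and touches the standby only when the primary is unavailable. The function name, the host names and the health-check callback are all hypothetical, chosen only to illustrate the idea.

```python
from typing import Callable, Sequence


def choose_target(targets: Sequence[str], is_healthy: Callable[[str], bool]) -> str:
    """Return the first healthy target in priority order.

    `targets` lists the primary first, followed by its standbys.
    """
    for target in targets:
        if is_healthy(target):
            return target
    raise RuntimeError("primary and all standbys are down")


# Hypothetical Data Center host names and a simulated health table.
status = {"dc1.example.com": True, "dc2.example.com": True}
targets = ["dc1.example.com", "dc2.example.com"]

print(choose_target(targets, lambda host: status[host]))  # primary is used
status["dc1.example.com"] = False                         # primary fails
print(choose_target(targets, lambda host: status[host]))  # standby takes over
```

In practice the health check would be a real probe of the server or network rather than a lookup in a table, and the switch may be triggered manually as well as automatically.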