As mentioned already, one way to increase both scalability and availability is by using Network Load Balancing, which allows you to cluster up to 32 servers to respond to IP requests. This comes in handy when you need to distribute Web client requests among a cluster of Internet Information Services (IIS) servers, for instance. The benefit of this is that the client trying to use the service only has to remember one IP address, and NLB hides all servers in the background. If the server handling the client's request fails, another server takes its place, and the client should not notice the switch, as illustrated in Figure 3-2.
Of course, you can use Web servers other than IIS; NLB works with all Web servers on a Windows system. You can also use NLB with Terminal Services servers for client access and with many other applications, such as streaming media and virtual private networks (VPNs).
All servers in an NLB cluster host a driver called wlbs.sys. You might recognize the first part of the name from earlier in the chapter. WLBS is, of course, Windows Load Balancing Service, and it is a surviving Windows NT 4.0 driver. Or, at least, the name survived. The driver determines which server handles the incoming request by using a statistical mapping algorithm. The wlbs.sys driver runs between TCP/IP and the network driver, as you can see in Figure 3-3.
The NLB driver executes in parallel on each member of the cluster. This distributed software architecture eliminates the single point of failure: Network Load Balancing keeps functioning even if one or more servers fail.
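To make the idea of a distributed mapping decision concrete, here is a minimal Python sketch. It is not the algorithm that wlbs.sys actually uses, which this chapter does not describe in detail; the hash function and the names `owner_index` and `should_accept` are assumptions chosen for illustration. The point is only that every host can run the same calculation on the same incoming packet and reach the same answer without asking a central dispatcher.

```python
import hashlib
from typing import List


def owner_index(client_ip: str, client_port: int, host_count: int) -> int:
    """Map a client endpoint to a slot number (hypothetical hash, not the real driver's)."""
    key = f"{client_ip}:{client_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % host_count


def should_accept(host: str, client_ip: str, client_port: int,
                  live_hosts: List[str]) -> bool:
    """Each host runs this locally; exactly one live host answers True."""
    ordered = sorted(live_hosts)  # every host must see the same ordering
    return ordered[owner_index(client_ip, client_port, len(ordered))] == host


if __name__ == "__main__":
    hosts = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
    for h in hosts:
        print(h, should_accept(h, "192.0.2.7", 40000, hosts))
```

Because all hosts share the same view of who is alive, removing a failed host from `live_hosts` is enough for the survivors to agree on a new owner for each client.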
The cluster has one primary IP address on which all of the members listen. The cluster members need to be located on the same IP subnet so that they can detect network traffic destined for this IP address. You configure the primary IP address just like you configure any other IP address. But if several servers have the same IP address configured, would this not generate an error? Normally it would, but NLB will take care of this for you, so you will never get an error if you set it up properly.
You can also configure something called a multihomed NLB cluster. This is simply a cluster with more than one primary IP address configured.
To handle communication with a specific server in the cluster, every server has a dedicated IP address. So if you need to connect to a certain server for administrative purposes, you use the dedicated IP address instead of the primary IP address. Otherwise, you would not know which server actually responded to you. This is also the reason why only one dedicated IP address can be configured for each cluster server. But keep in mind that all servers can have several primary IP addresses, as shown in Figure 3-4; otherwise, you could not configure a multihomed NLB cluster.
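The addressing rules above can be summarized as a small validation sketch in Python. This is not an NLB tool or API; `HostConfig` and `validate_cluster` are hypothetical names, and the checks simply restate the rules from the text: at most 32 members, one dedicated address per server, the same primary address or addresses on every server, and all members on one IP subnet. Checking every address against a single subnet is a simplifying assumption.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass
class HostConfig:
    name: str
    dedicated_ip: str        # exactly one per server, used for administration
    primary_ips: List[str]   # cluster address(es) shared by all members


def validate_cluster(hosts: List[HostConfig], subnet: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the plan looks consistent."""
    net = ipaddress.ip_network(subnet)
    problems: List[str] = []
    if not 1 <= len(hosts) <= 32:
        problems.append("a cluster must have between 1 and 32 servers")
    dedicated = [h.dedicated_ip for h in hosts]
    if len(set(dedicated)) != len(dedicated):
        problems.append("each server needs its own dedicated IP address")
    shared = set(hosts[0].primary_ips) if hosts else set()
    for h in hosts:
        if not h.primary_ips:
            problems.append(f"{h.name}: no primary IP address configured")
        if set(h.primary_ips) != shared:
            problems.append(f"{h.name}: primary IP addresses differ from the other members")
        for ip in [h.dedicated_ip, *h.primary_ips]:
            if ipaddress.ip_address(ip) not in net:
                problems.append(f"{h.name}: {ip} is outside {subnet}")
    return problems
```

A multihomed cluster in this model is simply one where every `primary_ips` list holds the same two or more addresses.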
All servers in a cluster send a heartbeat message to all other servers in the cluster once every second. If any server notices that another server has not sent its heartbeat message during a five-second period, it begins a process called convergence. This process determines which servers are still part of the cluster and, among other things, how the incoming load is divided among them.
If you add a new server to the cluster, convergence takes place on the first server that notices the newcomer. This happens because the new server sends out a message that says "Hello. Here I am." and starts the convergence process.
Convergence takes about ten seconds to complete. If an incoming request was bound for a server that for some reason failed, the client might experience a ten-second delay before convergence finishes and another server handles the request.
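The heartbeat bookkeeping described above can be sketched in a few lines of Python. This is a simplified model, not the NLB wire protocol: the class name `HeartbeatMonitor` and its methods are hypothetical, timestamps are passed in explicitly so the example is deterministic, and convergence is reduced to agreeing on the membership list. The one-second and five-second values come from the text.

```python
from typing import Dict, List

HEARTBEAT_INTERVAL = 1.0  # seconds between heartbeats, per the text
MISSED_WINDOW = 5.0       # seconds of silence that trigger convergence, per the text


class HeartbeatMonitor:
    """One server's view of its peers (hypothetical model for illustration)."""

    def __init__(self, peers: List[str], now: float) -> None:
        self.last_seen: Dict[str, float] = {p: now for p in peers}

    def record(self, peer: str, now: float) -> bool:
        """Note a heartbeat; return True if the peer is a newcomer."""
        is_new = peer not in self.last_seen
        self.last_seen[peer] = now
        return is_new

    def silent_peers(self, now: float) -> List[str]:
        """Peers that have been quiet for the whole missed-heartbeat window."""
        return sorted(p for p, t in self.last_seen.items()
                      if now - t >= MISSED_WINDOW)

    def converge(self, now: float) -> List[str]:
        """Drop silent peers and return the surviving membership."""
        for peer in self.silent_peers(now):
            del self.last_seen[peer]
        return sorted(self.last_seen)
```

In this model a server would call `converge` either when `silent_peers` is non-empty or when `record` returns True for a newcomer, which mirrors the two triggers described in the text.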
One of the advantages of NLB is that, as your applications need more server power, you do not have to shut down the cluster to increase it. You simply add a new industry-standard PC to the cluster and voilà, power just increased.
But there is a limit to how many servers can be part of a cluster. A cluster can include up to 32 servers, but this really is not a problem. If your cluster should need more power than that, you can always use a cluster of clusters. That is, behind each of the 32 members in the cluster you hide another cluster. And behind every member of these clusters you hide … we think you get the picture. What this all adds up to is that NLB is a great way to increase scalability.
We know companies that do not need a large Web farm most of the year. But when Christmas approaches, for example, the pressure on their Web site increases and something has to be done. These companies simply lease some servers from a leasing company, configure them for NLB, replicate all Web content, and then insert the servers into the cluster. Now they are ready to handle all their customers. When Christmas is over, the traffic decreases and the servers are returned to the leasing company.
As we have mentioned, NLB also helps increase availability. Because the cluster contains many identical servers, it does not necessarily matter if one or more of them fail. As long as at least one of them is up and running, client requests will continue to be processed. NLB will reroute the requests to a functioning server within ten seconds of the failure of a server. But of course, if only one server remains in the cluster and the load is big, the response times will increase. And if the load increases even further, the last server might give up from sheer overload. On the other hand, the likelihood of all but one server crashing at the same time is small, so you really should not lose any sleep over this issue.
One question that arises if you have a large cluster is how you actually manage the servers. How can the administrator have time to do his or her job when numerous servers need to be handled? If the administrator tried to administer all these servers locally, there would not be any time left in the day. But have no fear; several tools are available to make life easier. One such tool, which comes out of the box, is called wlbs.exe (recognize this, anyone?). This tool allows an administrator to remotely manage servers in an NLB cluster. Another tool, actually a server in itself, is Application Center, which can be purchased from retailers everywhere. This product gives you great opportunities for cluster control as well as a lot of added value. Application Center is discussed in more detail in the "Application Center Overview" section later in this chapter.
Do you have to use a Microsoft solution to get these benefits of load balancing? The answer is, of course not! There are other ways to handle load balancing, but these might have some drawbacks that NLB does not have.
Round-Robin DNS (RRDNS), for instance, distributes the IP requests among the servers in a cluster. But the difference is that if one server fails, requests are still forwarded to that server until an administrator removes it from the address list. NLB handles this by itself by rerouting the requests.
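The difference can be illustrated with a small Python simulation. This is a toy model and not a DNS or NLB implementation: the function names `round_robin` and `failed_requests` are hypothetical, and the failure-aware case assumes that convergence has already finished, so the failed server is no longer in the rotation.

```python
from itertools import cycle
from typing import Iterable, List, Set


def round_robin(addresses: List[str], requests: int) -> List[str]:
    """Hand out addresses in strict rotation, as a plain address list would."""
    rotation = cycle(addresses)
    return [next(rotation) for _ in range(requests)]


def failed_requests(assignments: Iterable[str], down: Set[str]) -> int:
    """Count the requests that were sent to a server that is down."""
    return sum(1 for address in assignments if address in down)


if __name__ == "__main__":
    servers = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
    down = {"10.0.0.12"}

    # RRDNS: the dead server stays in the list until an administrator removes it.
    print(failed_requests(round_robin(servers, 9), down))   # prints 3

    # Failure-aware rotation: the dead server has been dropped automatically.
    live = [s for s in servers if s not in down]
    print(failed_requests(round_robin(live, 9), down))      # prints 0
```

With three servers and one failure, a third of the requests keep going to the dead machine under plain rotation, which is the drawback the text describes.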
Third-party hardware products are also available, but these are often expensive and can potentially be a single point of failure. Since NLB is itself a distributed application, no single point of failure exists, unless, of course, the unlikely event occurs in which all but one server crash simultaneously.
NLB requires no special hardware. This way you can save money by running it on the servers you have already decided to be part of your cluster. You do not even have to use the same operating system on your machines. If you already have an old Windows NT 4.0 cluster running WLBS, you can integrate these machines into a Windows 2000 NLB cluster easily and at the speed you choose for yourself.
But upgrading is the way to go. Why? Well, Microsoft has made some big enhancements to NLB. NLB is not a virtual NIC anymore. Instead, it is an optional LAN service that is automatically installed with the operating system. When you want to use it, you just set the correct properties for it, as shown in Figure 3-5, and you are ready to rock.
Changes to these properties can even be made without causing a reboot of the server. The service will be back after a short delay of 15 to 20 seconds.
You can also configure NLB when you perform an unattended installation. This helps in deployment of new servers.