Windows Server Architecture

Any performance monitoring, tuning, or optimization you do on a Windows Server system involves the kernel. In fact, everything you do involves the kernel. The kernel is the most central part of the operating system, and without it Windows is of little use. Figure 4-1 shows an overview of the Windows architecture, which we will look at more closely later in this chapter. It is very similar to the Windows 2000 architecture.

Figure 4-1: The Windows Server 2003 architecture

If you want to optimize and secure your system, you need to spend a little time on this piece of operating system machinery. On disk, the kernel and its supporting components live mainly in the system32 directory under the Windows directory. They are made up of many files, including .dll, .exe, and .com files. The executive, shown in Figure 4-2, serves to protect the kernel. This is the part of the operating system that applications access. This way nothing ever accesses the kernel itself, and it becomes harder to sabotage the system.

Figure 4-2: The executive protects the kernel

The executive can be considered the interface to the kernel, in the same way that business logic, in a multitier application, always should be exposed by an interface. It adds a layer of protection around what is important to your system, while still allowing others to use it.

The kernel and the executive are closely linked together, so when discussing the executive in this book, we almost always talk about it in terms of an interface to the kernel. We will take a closer look at it soon, starting with the executive service. But first we need to make a short detour.

Threads

Before we go any further, we need to digress a moment to cover threads, because they are a central part of any operating system architecture discussion.

A thread is a single path of execution within a process. Threads share all resources the process has (such as memory). All the applications running on your servers have one process each. Within this process all threads belonging to the application are contained. There can be one or many threads within a process—thus the terms single- or multithreaded applications. If you open Windows Task Manager and click the Processes tab, you can see the processes running on your computer.

The process is the container of everything that has to do with the application. Here you find the virtual memory space (which you will learn more about later), system resources, application settings, and data of the application.

In order to make Windows change the way it processes your applications, you can change a process's priority. You can do this by right-clicking a process in Task Manager and selecting the appropriate priority from Set priority. You can choose from these six priorities:

Realtime
High
AboveNormal
Normal
BelowNormal
Low

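The higher you set the priority, the larger the share of processor time the application's threads receive, so its performance should improve, at the expense of processes running at lower priorities.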

Threads are assigned slices of processor time based on their priority. This priority is in turn based on the priority of the process containing the thread, so from Task Manager the only way you can affect it is by changing the process priority. When the operating system executes threads, the one with the highest priority is executed first. The order of execution is handled by the thread scheduler. If you have many threads with the same priority, the scheduler loops through them and gives each a fixed slice of execution time.

When the scheduler has executed all threads with the highest priority, it moves to the next lower level and executes the threads waiting there. It does not move to the lower level as long as there are higher-level threads left to execute.

If a higher-level prioritized thread starts running again, the scheduler leaves the lower-level thread to execute the higher prioritized thread. This is referred to as the lower-level thread being preempted.
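The scheduling rules just described (highest priority first, fixed time slices among equals, and preemption when a higher-priority thread becomes ready) can be sketched as a small simulation. This is a toy model in Python, not the Windows scheduler: the function name simulate, the tick-based clock, the quantum of two ticks, and the priority numbers are all assumptions made for illustration.

```python
from collections import deque

def simulate(threads, quantum=2):
    """Toy strict-priority scheduler with round-robin time slices.

    threads: list of (name, priority, work_ticks, arrival_tick) tuples.
    Returns the name of the thread that ran on each tick.
    """
    arrivals = sorted(threads, key=lambda t: t[3])
    queues = {}        # priority -> deque of ready thread names
    remaining = {}     # thread name -> ticks of work left
    timeline = []
    current, used = None, 0
    tick = 0
    while arrivals or any(queues.values()):
        # Admit threads that become ready on this tick.
        while arrivals and arrivals[0][3] <= tick:
            name, priority, work, _ = arrivals.pop(0)
            queues.setdefault(priority, deque()).append(name)
            remaining[name] = work
        ready = [p for p, q in queues.items() if q]
        if not ready:
            timeline.append(None)  # nothing to run: idle tick
            tick += 1
            continue
        top = queues[max(ready)]   # only the highest non-empty level runs
        if top[0] != current:      # a new slice starts (possibly a preemption)
            current, used = top[0], 0
        timeline.append(current)
        remaining[current] -= 1
        used += 1
        if remaining[current] == 0:
            top.popleft()          # thread finished
            current = None
        elif used == quantum:
            top.rotate(-1)         # slice used up: go to the back of the line
            current = None
        tick += 1
    return timeline

# A and B share a priority level; C has a higher priority and arrives at tick 1.
print(simulate([("A", 8, 4, 0), ("B", 8, 3, 0), ("C", 13, 2, 1)]))
# ['A', 'C', 'C', 'A', 'A', 'B', 'B', 'A', 'B']
```

In the output, A is preempted after one tick when C becomes ready, and A and B take turns only once C has finished.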

Note

The operating system sometimes adjusts the priority of threads dynamically. This happens, for example, when an application with a graphical user interface shifts between foreground and background. The foreground application is given a higher priority than the background application.

Executive Service

Most of the Windows executive can be found in a file called ntoskrnl.exe (see Figure 4-3). This file contains not only the executive but also parts of the kernel. If you use a tool like the Dependency Walker, usually shipped with Visual Studio, you can see that this file exports more than 1200 functions.

Figure 4-3: The ntoskrnl.exe file of Windows Server 2003 Web Edition

Note

The ntoskrnl.exe file exists in various forms for an operating system version. If you run your OS on a single CPU machine, one version of the file is installed. If you run the OS on a multi-CPU machine, another version is installed. Obviously this means that the file (or rather a version of the file) is important from a performance perspective as well.

Software-Based and Hardware-Based Security

There are two kinds of security in your servers (and this applies to workstations as well, but we will mainly talk about servers here). The first is software based and controlled by the operating system. The other is hardware based and controlled by the processor. Let us consider the Intel processor architecture for a while. The very simplified view of this architecture shown in Figure 4-4 includes four rings of protection for software running on the server.

Figure 4-4: The four rings of security surrounding the core of the system

These layers can be seen as hardware-based security. The closer you get to the core, the more secure the environment is. You can think of these rings as the walls surrounding the Helm's Deep fortress in The Lord of the Rings, which took a lot of energy and time to break through. The inner ring, ring 0, is where the operating system resides. Not much can touch this ring. Your applications reside in ring 3, on the other hand, and do not have much protection at all.

User Mode vs. Kernel Mode

Now it is time to introduce the terms kernel mode and user mode and see where they fit into all this. Ring 0 is what is commonly referred to as kernel mode, and as you can see, this is a fully protected environment. This mode allows access to system memory and all processor instructions. Operating system code runs in kernel mode, as do most device drivers.

Ring 3 is referred to as user mode, and is hardly protected at all. This mode is a nonprivileged processor mode, with a limited set of available interfaces, and limited access to system data.

In Figure 4-5, you see a simplified view of what kinds of applications and services execute in these two modes. The main point of these two modes is to protect the operating system from applications. User applications should not be allowed to access or modify operating system data directly. When an application needs the services of the OS, it makes a system service call. The operating system then validates the request and switches the processor from user mode to kernel mode, thereby allowing the request to be processed. After the system service call is completed, the processor mode is switched back to user mode before returning control to the user application.

Figure 4-5: User mode, kernel mode, and the applications within
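The round trip of a system service call can be pictured with a small model. The sketch below is purely illustrative: the class ToyKernel, its service names, and its log format are hypothetical and do not correspond to any Windows interface. It only shows the order of events described above: validate the request, switch to kernel mode, do the work, and switch back to user mode before returning.

```python
class ToyKernel:
    """Toy model of the user-mode/kernel-mode round trip (not a Windows API)."""

    def __init__(self):
        self.mode = "user"
        self.log = []
        # Hypothetical system services, keyed by name.
        self._services = {"add": lambda a, b: a + b}

    def system_service_call(self, name, *args):
        # Validate the request before any privileged work is done.
        if name not in self._services:
            self.log.append(f"rejected: {name}")
            raise PermissionError(f"unknown system service: {name}")
        self.mode = "kernel"              # switch to kernel mode
        self.log.append(f"kernel: {name}")
        try:
            return self._services[name](*args)
        finally:
            self.mode = "user"            # always switch back before returning
            self.log.append("user")

kernel = ToyKernel()
print(kernel.system_service_call("add", 2, 3))  # 5
print(kernel.mode)                              # user
```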

All user-mode threads execute in a private, fully protected address space when they are not executing in kernel mode. Because of this privacy, one process cannot view or modify another process's memory without special permissions. This prevents user-mode processes from causing failures to other applications or to the operating system itself. Errors from failing applications are also prevented from interfering with other applications due to this arrangement.

In kernel mode, on the other hand, all operating system components and device drivers share a single virtual address space, and thereby have access to all system data. This is why it is so important that all kernel-mode code be well designed and well-tested, because poor-quality code could violate system security or perhaps corrupt system data (not to mention what trouble a malicious hacker could cause in this mode).

The key system components of Windows Server 2003 are basically the same as for Windows 2000. In Figure 4-6, you can see these components and which of them operate in kernel mode and which in user mode.

Figure 4-6: The system components of the Windows architecture

Kernel-Mode Components

These are the kernel-mode components:

The executive: This contains the base operating system services, which we will study more closely later in this chapter.
Kernel: Here you find low-level functions.
Device drivers: These are file system and network drivers and hardware device drivers that translate input/output (I/O) function calls into hardware device I/O requests.
Hardware Abstraction Layer (HAL): This layer isolates the kernel, the executive, and the device drivers from differences between various hardware platforms.
Windowing and graphics system: This is where the graphical user interface (GUI) is implemented.

User-Mode Components

The user-mode components are as follows:

System processes: These contain processes that are not Windows services, like the logon process, meaning they are not started by the service control manager. (These are sometimes called system support services.)
Service processes: Unlike system processes, these are started by the service control manager.
User applications: Win32 applications and the like are found here and are supported natively by the OS. Other types of user applications can be supported by installing the appropriate environment subsystem.
Environment subsystems: These expose native operating system services to user applications.
Subsystem DLLs: These translate function calls (APIs) into internal system service calls.
Core Win32 subsystem DLLs: Here you find kernel32.dll, advapi32.dll, user32.dll, and many more.

What Does the Executive Do?

The executive exposes five kinds of functions, or tasks. Perhaps we should call them "collections of functions," since this better describes what they are.

The first one is the application programming interface (API), which operates in user mode. Here you find various functions a developer can access to get different tasks done. An example of this is the ability to read or write data to the disk. The interface to these functions is a DLL called ntdll.dll (see Figure 4-7).

Figure 4-7: The property page of ntdll.dll

More than 200 functions are available from this file, but instead of accessing ntdll.dll itself, developers use the Win32 API. The Win32 API in turn makes most of these accessible, without the developer having to make the low-level calls necessary to access them.

The second collection, called the internal system, also operates in user mode, and it exposes functions primarily intended for use by the operating system's applications. Another application could call these functions, but that is not very common. In this collection, you find functions that help in

Performing local procedure calls (LPCs)
Creating paging file(s) on disk
Returning low-level information, like an object security identifier (SID)

The next collection, called driver, operates in kernel mode. Developers can use low-level calls to access the functions for this collection, but Microsoft has a driver development kit (DDK) available that you can use instead to access these functions. The functions in this collection operate in kernel mode since they expose methods to access the operating system directly. Most hardware vendors use these to develop drivers for their hardware.

Now we have reached the fourth collection, internal system component, which also operates in kernel mode. Here is where you find functions that let various operating system managers communicate with other subsystems. For example, the I/O Manager might need to send data to the graphics subsystem in order to display something on screen.

The final set of functions operates in either user or kernel mode, and is called internal components. These functions are designed to let COM components have special access to the operating system. Depending on what the component is designed to do, it operates in either mode.

The Executive Provides More Than Just Exported Functions

The executive also provides various managers. One of them, the I/O Manager, was mentioned earlier. Before we describe this and the others, you need to have an understanding of the two categories they can fall into. The first, referred to as internal here, provides internal support for other managers. These are only accessible to components inside the kernel. The other, called external, responds to needs outside the executive.

We will start with the aforementioned I/O Manager. This manager handles all input and output to the operating system. This I/O can, for instance, be between the OS and the hardware devices. The I/O Manager in reality consists of three components:

File systems: This component takes an I/O request and transforms it to calls that a physical device can understand.
Cache manager: The cache manager provides caching services to the file systems. That is, it places recently accessed data from the hard drives into memory, so it can be accessed quicker. This way performance is improved. An example of this is when a document is accessed continuously by users. The cache manager quickly discovers this, and places the document in RAM instead so that it will be accessed quicker for future calls. The cache manager provides caching services not only to file system calls, but also to networking components. Not everything is cached, however. You cannot force the cache manager to cache things it does not want to cache, either. A file that is continuously accessed is cached, as you saw earlier, but a file that is read sequentially is not. So depending on the type of activity on your server, you always have more or less caching activity going on.
Device drivers: Because some parts of the OS lack the ability to talk directly to the hardware, you need a translator. This translator is the I/O Manager. There are two kinds of drivers when it comes to the I/O Manager: high-level drivers and low-level drivers. The high-level drivers are those that need a translator, because they do not know how to communicate with the hardware directly. Examples of such drivers are file system drivers like NTFS or FAT. They depend on the I/O Manager to translate their requests to the device so they can be processed. The low-level drivers, on the other hand, do not need a translator, because they can communicate with the physical device on their own. The driver for a SCSI Host Bus Adapter (HBA) is an example of a low-level driver. Other low-level drivers are those that do not support the Windows Driver Model (WDM). These are drivers that control a physical device directly.

The next manager, also internal, is named the LPC Facility (short for Local Procedure Call Facility). This was developed to make it easier to pass calls between applications and the Windows subsystems. To better understand how it works, let us consider a remote procedure call (RPC). This is when you make a remote connection to a server from another computer and ask it to do you a favor by executing a function. When you do this, you have created a shared resource. When you have both the client and the server on the same machine, an LPC is established instead of an RPC. When your applications request the services of the I/O Manager, the stub in the application process packages the parameters for the call. After the actual call has been packaged, it is sent to the executive via the LPC.
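To make the idea of a stub that packages parameters more concrete, here is a minimal sketch. It is an analogy only: the real LPC Facility lives in the executive and is not driven from Python. The names client_stub, server_loop, and demo, the JSON message layout, and the use of in-process queues as "ports" are all assumptions made for illustration.

```python
import json
import queue
import threading

def client_stub(port, reply_port, function, *params):
    """Package the call into a message, send it, and wait for the reply."""
    message = json.dumps({"function": function, "params": params})
    port.put(message)
    reply = json.loads(reply_port.get(timeout=5))
    return reply["result"]

def server_loop(port, reply_port, functions):
    """Unpack each message, run the requested function, and send back the result."""
    while True:
        raw = port.get()
        if raw is None:           # shutdown signal
            break
        call = json.loads(raw)
        result = functions[call["function"]](*call["params"])
        reply_port.put(json.dumps({"result": result}))

def demo():
    port, reply_port = queue.Queue(), queue.Queue()
    functions = {"add": lambda a, b: a + b}
    server = threading.Thread(target=server_loop, args=(port, reply_port, functions))
    server.start()
    try:
        return client_stub(port, reply_port, "add", 2, 3)
    finally:
        port.put(None)
        server.join()

print(demo())  # 5
```

The caller never touches the server's function directly. It only sees the stub, which is what keeps the two sides decoupled.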

The Object Manager is responsible for allowing the operating system to create, manage, and delete objects. These objects can be threads, processes, synchronization objects, and other abstract objects that represent operating system resources.

The next manager we would like to discuss is the Virtual Memory Manager, or VMM. Virtual memory lets applications behave as if the system had more RAM than it actually has. This is done by swapping data between the physical RAM and a paging file on the hard disk, also called a swap file. The VMM handles the swapping by determining which data to move in or out of RAM. The data that has been in RAM the longest is the first to be moved to the paging file when the system is running out of physical memory.
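The oldest-first rule described above corresponds to a first-in, first-out replacement policy. The following sketch simulates that policy for a handful of page references. It is a simplified illustration and not the VMM's actual algorithm. The function name fifo_paging and the page numbers are hypothetical.

```python
from collections import deque

def fifo_paging(references, frames):
    """Simulate oldest-first paging.

    references: sequence of page numbers that are accessed in order.
    frames: number of pages that fit in physical RAM.
    Returns (pages left in RAM, pages moved out to the paging file in order).
    """
    ram = deque()      # oldest page on the left
    paged_out = []
    for page in references:
        if page in ram:
            continue                           # already in RAM, nothing to do
        if len(ram) == frames:
            paged_out.append(ram.popleft())    # the oldest page leaves first
        ram.append(page)
    return list(ram), paged_out

print(fifo_paging([1, 2, 3, 1, 4, 5], frames=3))
# ([3, 4, 5], [1, 2])
```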

The VMM keeps track of all physical memory pages in your system. It stores information about these pages, such as their status, in the page frame database. The status of a page could be one of the following:

Free: This page is free but has not been zeroed yet. A page with this status is read-only.
Valid: Such a page is in use by a process.
Modified: A page that has been altered but not yet written to disk.
Standby: If a page has this status, it has been removed from the process's working set.
Zeroed: This page is available to the system.
Bad: If you have a bad page, you also have a hardware error. No process can use this page.

Every time a page is moved in or out of memory, the VMM uses the page frame database, and if necessary updates the status field.

The page frame database associates pages with each other based on their status. So instead of searching the entire database for all free pages, it only has to find the first, and all others will follow behind. So we can actually say the page frame database consists of six different lists, based on the status.

VMM must make sure a minimum number of pages are available to the system at all times. It does so by processing the modified and the standby lists to move pages from these to the free list (changing their status to Free). Before moving modified pages, VMM writes their changes to disk. Before the system can use a page (for other than read-only activity) it must change the status to Zeroed, and it is VMM that handles this, too.
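The bookkeeping described above can be sketched as one list per status, with pages moving between the lists. This is a toy model only: the class PageFrameDatabase, its method names, and the order in which the standby and modified lists are drained are assumptions, not the VMM's real data structures.

```python
from collections import deque

STATES = ("free", "valid", "modified", "standby", "zeroed", "bad")

class PageFrameDatabase:
    """Toy page frame database: one list of frame numbers per status."""

    def __init__(self, frame_count):
        self.lists = {state: deque() for state in STATES}
        self.lists["zeroed"].extend(range(frame_count))
        self.disk_writes = []   # frames whose changes were written to disk

    def move(self, frame, src, dst):
        self.lists[src].remove(frame)
        self.lists[dst].append(frame)

    def available(self):
        return len(self.lists["free"]) + len(self.lists["zeroed"])

    def replenish_free(self, minimum):
        """Drain the standby list, then the modified list, until enough pages are available."""
        while self.available() < minimum and self.lists["standby"]:
            self.move(self.lists["standby"][0], "standby", "free")
        while self.available() < minimum and self.lists["modified"]:
            frame = self.lists["modified"][0]
            self.disk_writes.append(frame)      # write changes to disk first
            self.move(frame, "modified", "free")

    def zero_page(self):
        """Zero the first free page so it can be handed out for general use."""
        frame = self.lists["free"][0]
        self.move(frame, "free", "zeroed")
        return frame

def demo():
    db = PageFrameDatabase(4)
    for frame in range(4):                      # every page is put to use
        db.move(frame, "zeroed", "valid")
    db.move(0, "valid", "standby")              # trimmed from a working set
    db.move(1, "valid", "modified")             # altered, not yet on disk
    db.replenish_free(minimum=2)
    zeroed = db.zero_page()
    return db, zeroed

db, zeroed = demo()
print(list(db.lists["free"]), list(db.lists["zeroed"]), db.disk_writes)
# [1] [0] [1]
```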

Each process created in RAM is assigned a unique virtual address space. The VMM maps this virtual address to physical pages in memory. It thereby eliminates the chance of one process or its threads accessing memory allocated for another process.
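A short sketch shows why separate mappings isolate processes from each other. It assumes a 4 KB page size and models each process's mapping as a plain dictionary from virtual page number to physical frame number. The names translate, process_a, and process_b are hypothetical, and real page tables are considerably more involved.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

def translate(page_table, virtual_address):
    """Map a virtual address to a physical address using one process's page table."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        raise LookupError(f"virtual page {page} is not mapped for this process")
    return page_table[page] * PAGE_SIZE + offset

# Each process has its own table: virtual page number -> physical frame number.
process_a = {0: 7, 1: 3}
process_b = {0: 2}

print(translate(process_a, 4))   # 28676 (frame 7, offset 4)
print(translate(process_b, 4))   # 8196  (frame 2, offset 4)
```

The same virtual address lands in different physical frames for the two processes, and an address that one process never mapped cannot be translated at all.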

When a process requires a page in memory, and the system cannot find it at the requested location, we say a page fault has occurred. Page faults can be one of two kinds:

Hard page faults
Soft page faults

Hard page faults mean the requested data has to be fetched from disk. When this happens, the processor is interrupted and you lose performance.

Soft page faults occur when the requested data is found elsewhere in memory. The CPU can handle many soft page faults, as opposed to hard page faults, without losing performance.
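The difference between the two kinds of page faults can be expressed as a small classification routine. This sketch makes simplifying assumptions: "elsewhere in memory" is modeled as a single standby set, and the function name touch and the page numbers are hypothetical.

```python
def touch(page, working_set, standby, paging_file):
    """Classify one memory access as a hit, a soft fault, or a hard fault."""
    if page in working_set:
        return "hit"
    if page in standby:
        # The data is still in RAM, just not in this process's working set.
        standby.discard(page)
        working_set.add(page)
        return "soft fault"
    if page in paging_file:
        # The data must be fetched from disk, which is the expensive case.
        working_set.add(page)
        return "hard fault"
    raise LookupError(f"page {page} is not mapped")

working_set, standby, paging_file = {1}, {2}, {3}
print([touch(p, working_set, standby, paging_file) for p in (1, 2, 3, 2)])
# ['hit', 'soft fault', 'hard fault', 'hit']
```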

The Process and Thread Manager is external, and provides functions to create and terminate processes and threads. It is the kernel that manages processes and threads, but the executive provides an interface to these functions via the Process and Thread Manager.

Security policies on the local computer are enforced by the Security Reference Monitor, which is an external manager. Policies are enforced at both the kernel-mode level and the user-mode level. One of this manager's functions is to prevent users from accessing processes running in kernel mode. Another is to restrict them from accessing data or objects they are not allowed to access. The Security Reference Monitor checks with the Local Security Authority, or LSA (which we will discuss in more detail later in the chapter), to find out if a user has the right permissions to the object he or she is trying to access.

The next manager is the Run-Time Library. This manager is internal, and provides the arithmetic, string, structure processing functions, and data conversions of the operating system.

The last manager, also internal, is called Support Routines. These routines allow the executive to allocate paged and nonpaged system memory. They also allow the executive to interlock memory access. The Support Routines are also used to let the OS create two synchronization types: resource and fast mutex.

How to Work with the Executive Objects

Windows environments, like the Win32 and OS/2 environments, act as hosts for other applications. These application environments, or subsystems as they are also called, allow Windows to emulate other operating systems. The executive provides generic services that all environment subsystems can call to perform basic operating system functions. The subsystems build on the services of the executive to provide environments that meet the specific needs of their client applications. Each user-mode application is bound to exactly one subsystem.

Windows 2000 provided support for Win32, OS/2, and POSIX subsystems. In Windows Server 2003 there has been a slight change. Now only Win32 and POSIX subsystem support remains. To check which subsystems are installed and the operating system files they use, open the registry editor and look at the following registry key (see Figure 4-8):

    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\
    SubSystems

Figure 4-8: Here are the subsystems on our system.
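If you prefer to read this key from a script rather than the registry editor, a minimal sketch using Python's standard winreg module could look like the following. The function name list_subsystems is hypothetical, and the function simply returns None when it is not running on Windows. Which values you see depends on your installation.

```python
import sys

SUBSYSTEMS_KEY = r"SYSTEM\CurrentControlSet\Control\Session Manager\SubSystems"

def list_subsystems():
    """Return {value name: data} for the SubSystems key, or None when not on Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # standard library module, available on Windows only

    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, SUBSYSTEMS_KEY) as key:
        value_count = winreg.QueryInfoKey(key)[1]   # number of values under the key
        for index in range(value_count):
            name, data, _value_type = winreg.EnumValue(key, index)
            values[name] = data
    return values

subsystems = list_subsystems()
if subsystems is None:
    print("Not running on Windows; nothing to read.")
else:
    for name, data in subsystems.items():
        print(name, "=", data)
```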

Windows itself has been designed to natively run Win32 applications, of course, which gives it several benefits over other applications. To be honest, most applications running on a Windows system are Win32 applications these days, and the support for other subsystems is not that vital.

You cannot run your Windows system without the Win32 subsystem, however. The code to handle window creation and display I/O is implemented in the Win32 subsystem. So when an application needs to perform display I/O, other subsystems call on the Win32 subsystem to handle these tasks.

What the subsystems do is act as translators between the application environments and the executive. This is necessary because the executive cannot provide the services needed for each environment. The applications call the functions in the operating environment, which in turn calls the executive objects. This way the applications can receive a result they expect and can understand.

Table 4-1 shows the available object types and a short description of each.

Table 4-1: Objects Made Available by the Executive
Object	Description
Access token	An access token is used to determine the rights an object has to access resources.
Event	An event is used to notify objects of a system occurrence, or to synchronize two objects' activities.
File	A file is a universal object that allows you to work with data on the hard drive.
Key	If you open the registry editor, you can see these keys for yourself. They hold data that define the properties for a resource or object. A key can contain zero or more values (properties).
Mutex	The mutex is a synchronization object used to serialize access to some resource.
Object directory	The object directory is a container, and as such is used to hold other objects. Windows Explorer shows a hierarchical directory created with the help of the object directory.
Port	This object is used to pass messages between processes.
Process	A process can be described as an application. It contains a set of threads (described in the Thread entry later in this table). A process contains at least one thread, but can contain many as well. The process also contains the virtual address space and control information.
Profile	This measures the execution time of a process within a given address space.
Queue	A queue in real life is a waiting line, and so is this queue. It is used to notify threads about completed I/O operations.
Section	Memory that is shared by more than one thread or process is called a section. In Win32 this is referred to as a file-mapping object, and is used for file caching, among other things.
Semaphore	Every time an object is requested by another, a counter is incremented. The counter is a part of the semaphore. When the object is no longer needed, the counter is decremented. When a set access count level is reached, the semaphore restricts access to the object.
Symbolic link	A symbolic link is an object pointer, which is a shortcut, or indirect link, to another object.
Thread	A thread is an element of execution within a process.
Timer	This is basically an alarm clock for threads.
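Several of the objects in Table 4-1 have close counterparts in most programming environments. As an illustration of the semaphore idea, the sketch below uses Python's threading.Semaphore to cap how many threads may use a resource at once. This is an analogy, not the executive's semaphore object. Python's counter counts down from the limit rather than up toward it, and the function name run_workers and the limit of two are assumptions.

```python
import threading
import time

def run_workers(worker_count=5, limit=2):
    """Run several threads through a semaphore and report the peak concurrency."""
    gate = threading.Semaphore(limit)   # at most `limit` holders at a time
    lock = threading.Lock()             # protects the two counters below
    active = 0
    peak = 0

    def worker():
        nonlocal active, peak
        with gate:                      # blocks while `limit` threads are inside
            with lock:
                active += 1
                peak = max(peak, active)
            time.sleep(0.01)            # pretend to use the shared resource
            with lock:
                active -= 1

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return peak

print(run_workers() <= 2)  # True: never more than two threads inside at once
```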

Synchronization

Because Windows is a multitasking, multiprocessing operating system, more than one application can try to access some part of the executive's memory at the same time. The Memory Manager uses the page frame database every time an application accesses memory, wants more memory, or releases memory. Because only one such database exists, two applications could possibly try to access it at the same time if they perform memory-related tasks. This could, of course, cause problems.

Note

The page frame database keeps track of whether a page frame is in use by a certain application and whether the page frame is currently in memory.

This is why the operating system needs to synchronize access to resources like the page frame database. Synchronizing access includes maintaining a queue for accessing the resources needed, which means a performance penalty. So if you want to optimize the amount of system resources your applications use, you need to make sure they spend as little time as possible accessing a synchronized resource. Table 4-2 lists the various synchronization methods the executive can use.

Table 4-2: Synchronization Methods the Executive Exposes
Method	Description
Spinlock	The spinlock locks access to a resource so the accessing process has sole access to it. It is used by operating system components and device drivers to protect data. Since the spinlock locks the processor, it should be used for something short and very specific to the current process.
Dispatcher objects	These synchronization objects are provided by the kernel to the executive. They can be used in either kernel or user mode. The Win32 API exposes these as different functions, like WaitForSingleObject() and WaitForMultipleObjects(). This way an application developer can use them to make sure resources are available to an application, and that no one else uses them.
Executive resources	These are synchronization methods that only kernel-mode applications, such as drivers, can use.
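The wait-until-signaled pattern behind functions such as WaitForSingleObject() can be illustrated with Python's threading.Event. This is an analogy in a different API, not a call into Win32: the function name wait_for_signal and the one-second timeout are assumptions. One thread prepares some data and signals the event, while the other waits for the signal with a timeout.

```python
import threading

def wait_for_signal(timeout_seconds=1.0):
    """Wait for another thread to signal that its data is ready."""
    ready = threading.Event()
    results = []

    def producer():
        results.append("data prepared")
        ready.set()                     # signal anyone waiting on the event

    thread = threading.Thread(target=producer)
    thread.start()
    # wait() returns True if the event was signaled and False if the timeout expired.
    signaled = ready.wait(timeout=timeout_seconds)
    thread.join()
    return signaled, results

print(wait_for_signal())                    # (True, ['data prepared'])
print(threading.Event().wait(timeout=0.01)) # False: nobody ever signals this event
```

Waiting with a timeout keeps a thread from blocking forever when the resource never becomes available.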

Note

Do not confuse application spinlocks with kernel spinlocks. The kernel spinlocks are used internally by the Windows executive and I/O drivers to avoid many processes accessing the same resource at the same time.

Hardware Abstraction Layer (HAL)

From Windows NT onward, Windows has been designed to function on a multitude of hardware and CPU versions. The operating system should not need to know anything about the platform it is working with, at least not within reason. To provide this functionality, Microsoft constructed the HAL. The OS uses the HAL to access devices that the machine provides, and this includes the processor. If you examine your machine, you will find a file named hal.dll, as shown in Figure 4-9, in the system32 directory in the Windows directory.

Figure 4-9: The hal.dll property page

Every hardware platform has a different HAL, which makes it possible for Windows to operate on various platforms with minimal need for rewrites of code. In short, Windows talks to the device drivers, which in turn talk to the HAL. The HAL in its turn talks to the hardware.

The HAL always provides the same interface, so the operating system does not really care on which platform it resides, as long as the HAL is correct for the platform. Device drivers are, of course, still needed, as they are a part of the communication chain just mentioned.
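The idea of one fixed interface with a platform-specific implementation behind it can be sketched in a few lines. Everything here is hypothetical: the classes Hal, NanosecondClockHal, MicrosecondClockHal, and TimerDriver are invented for illustration and bear no relation to the functions hal.dll actually exports.

```python
from abc import ABC, abstractmethod

class Hal(ABC):
    """The interface that never changes, whatever the hardware underneath."""

    @abstractmethod
    def read_timer_ms(self) -> int:
        """Return the hardware timer value in milliseconds."""

class NanosecondClockHal(Hal):
    """Hypothetical platform whose timer counts nanoseconds."""

    def __init__(self):
        self._raw_ticks = 5_000_000

    def read_timer_ms(self) -> int:
        return self._raw_ticks // 1_000_000

class MicrosecondClockHal(Hal):
    """Hypothetical platform whose timer counts microseconds."""

    def __init__(self):
        self._raw_ticks = 5_000

    def read_timer_ms(self) -> int:
        return self._raw_ticks // 1_000

class TimerDriver:
    """A driver written once against the Hal interface."""

    def __init__(self, hal: Hal):
        self.hal = hal

    def uptime_ms(self) -> int:
        return self.hal.read_timer_ms()

for hal in (NanosecondClockHal(), MicrosecondClockHal()):
    print(TimerDriver(hal).uptime_ms())  # prints 5 for both platforms
```

The driver code is identical on both platforms. Only the HAL implementation differs, which is the point of the layering.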

Note

Even though this might sound good, it still does not make it effortless to move software from one platform to another. It reduces the amount of code you need to write when you must move an application between platforms, however. With .NET applications this move will be easier, since you only need to know that the new platform uses the same .NET Framework version as you used on your earlier platform.

Windows Subsystems

We mentioned earlier in the chapter that Windows includes environment subsystems. These operate in user mode. In Figure 4-10, you see an overview of the components included in the Windows architecture.

Figure 4-10: The components of the Windows Architecture

Since we have already covered the most important aspects of the subsystems, it is sufficient to say that performance is negatively affected when running applications like OS/2 or POSIX applications on Windows systems. This is due to all the overhead added by emulating another operating system. You can see the same thing happening when you run Win32 applications on a 64-bit Windows system.

The Integral subsystem, which we will take a look at here, is not a subsystem in which applications run, nor is it an emulator of another operating system. What it does is handle a multitude of other tasks. Among these you find functions that control the overall security of the local computer. Other functions that you find here are the server service, workstation service, and the security subsystem. Anyone who has looked through the Services MMC snap-in recognizes the server and workstation services. Now you know where these two functions come from, so let us take a closer look at them:

Server service: This is a component that makes it possible for Windows to provide network resources, which could be something as simple as sharing a directory on the network. This service is important to Windows, since many other services, like Message Queuing, depend on it.
Workstation service: The workstation service lets your computer access other computers' shared resources on the network. That is, it lets you access a resource that a server service provides on another machine. It also provides an API so you can programmatically access the network redirector.

In the security subsystem you find a number of subcomponents. These enable the security subsystem to control access to Windows and its resources. Table 4-3 shows the functions of the security subsystem and the components behind them.

Table 4-3: Subcomponents of the Security Subsystem
Subcomponent	Function
Logon process	Initial logon authentication. It also accepts local as well as remote logon requests from users.
Security Reference Monitor	Keeps an eye on rights and permissions that are associated with user accounts.
Local Security Authority	Monitors which system resources need to be audited.

Now we have finished covering the basic architecture of Windows. Keep this architecture in the back of your mind when you design your applications. You do not need to know it by heart, but there will be many times when this knowledge benefits you.

Next, we will explore scalability, availability, and reliability.