Inside Windows Storage: Server Storage Technologies for Windows 2000, Windows Server 2003 and Beyond


6.6 SAN File Systems

Storage area networks give administrators a pool of storage resources that coexists with a group of servers, from which individual storage resources can be assigned to particular servers. SANs still require that, at any given moment, only one particular server access a particular storage resource; what they facilitate is the easy reassignment of a storage resource from one server to another. To understand this better, consider Figure 6.14.

Figure 6.14. SAN Usage Scenario with a Local File System

Figure 6.14 shows a typical three-tiered SAN deployment. At the top are clients accessing servers over a LAN. The servers are connected to a Fibre Channel switch, as are several storage disks. These disks can be considered a pool of storage consisting of Disks D1 through D4. In Figure 6.14, Server 1 and Disks D1 and D3 are shaded to indicate that Server 1 is exclusively accessing Disks D1 and D3; Server 2 is exclusively accessing Disks D2 and D4.

The SAN simply facilitates relatively easy movement of a disk from one server to another; it does not facilitate true simultaneous sharing of the storage devices. A SAN merely makes some storage resources appear to be direct-attached storage as far as upper layers of software, such as file systems and above, are concerned. This is true whether the SAN is Fibre Channel based or IP storage based.[8]

[8] IP storage is discussed in detail in Chapter 8.

To allow a storage resource such as a volume to be truly simultaneously shared and accessed by different servers, one needs an enhanced file system, often referred to as a SAN file system. SAN file systems allow multiple servers to access the same storage device simultaneously while still providing for some files, or parts of files, to be accessed exclusively by a particular server process for some duration of time. Astute readers might argue that even network-attached storage allows files to be simultaneously shared, and they would be correct. The difference is that network-attached storage has a single server (the NAS server) acting as a gatekeeper, and all file operations (e.g., open, close, read, write, lock) are issued to that server.

The NAS server can easily become a bottleneck. Network file systems such as CIFS and NFS (described in Chapter 3) provide sharing at the file level for clients accessing servers over a network protocol such as TCP/IP. SAN file systems provide sharing of storage devices at the block level for clients accessing the storage device via a block-mode protocol such as SCSI. With a SAN file system, each server runs what it believes is a file system on a local disk. In reality, multiple servers are operating under this illusion, and the SAN file system running on each of them correctly maintains file system state on the volume that they are all simultaneously operating on.
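The distinction between the two granularities can be made concrete with a short sketch. The sketch below is illustrative only; the FileRequest and BlockRequest types are hypothetical and do not correspond to any actual CIFS, NFS, or SCSI message format.

```python
# Hypothetical request shapes, for illustration only.
from dataclasses import dataclass

@dataclass
class FileRequest:
    """File-level request, as a NAS client would send: the NAS server
    resolves the path and translates offsets into disk blocks."""
    path: str        # e.g., a UNC path like \\nas1\share\report.doc
    operation: str   # "open", "read", "write", "close", or "lock"
    offset: int = 0
    length: int = 0

@dataclass
class BlockRequest:
    """Block-level request, as a SAN file system ultimately issues: the
    server itself has already mapped file offsets to disk blocks."""
    lun: int         # which logical unit (disk) on the SAN
    lba: int         # starting logical block address
    block_count: int
    operation: str   # "read" or "write"
```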

A diagram might help explain this. Figure 6.15 shows two scenarios. The left-hand side of the figure shows a network-attached storage disk being accessed by multiple servers via a network file system, and the right-hand side shows multiple servers accessing a single disk via a SAN file system. In the first case, each server uses its network file system (such as SMB or NFS) to send requests to the server on the NAS device. The NAS device thus constitutes a potential single point of failure, as well as a potential bottleneck. When a SAN file system is deployed, there is no such potential bottleneck or failure point. The storage disk can be accessed in a load-sharing fashion via both Servers 1 and 2, and if one of the servers fails, the disk data can still be accessed via the other server. Of course, the price is the added complexity and cost of the SAN file system.

Figure 6.15. SAN and NAS File System Usage Scenario

6.6.1 Advantages of SAN File Systems

The advantages of SAN file systems include the following:

- No single server acts as a gatekeeper for file I/O, so the bottleneck and single point of failure represented by a NAS server are eliminated.
- The shared storage can be accessed in a load-sharing fashion through multiple servers simultaneously.
- Each server performs block-level I/O directly to the storage device, rather than funneling data through an intermediate file server.

6.6.2 Technical Challenges of SAN File Systems

One of the engineering feats in implementing SAN file systems is striking the right balance between concurrent access and serialization. Concurrent access to files and disks is required for a highly scalable system that allows multiple processes to access the same set of files simultaneously. Synchronization is required to ensure that the integrity of user data and file system metadata is maintained, even while multiple processes or users are simultaneously accessing files.

Note that this challenge of concurrent access and serialization exists even in non-SAN file systems such as NTFS. The difference is that the mechanisms needed to ensure proper serialization there are much simpler and are provided by the operating system; for example, the synchronization primitives that Windows provides, such as spinlocks and semaphores, are perfectly adequate for non-SAN file systems such as NTFS.
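What changes with a SAN file system is that lock state must be visible to every participating server, not just to the threads on one machine. The following is a minimal sketch, under the assumption of a single in-process coordinator, of a clusterwide lock table; the class and method names are invented for illustration, and a real implementation would expose this state over the network.

```python
import threading

class ClusterLockManager:
    """Stands in for a clusterwide lock service. In a real SAN file
    system this table would be reachable over the network by every
    server, not confined to one process as it is here."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the table itself
        self._owners = {}               # resource name -> owning server

    def acquire(self, resource, server):
        with self._guard:
            owner = self._owners.get(resource, server)
            if owner != server:
                return False            # held by another server; caller must wait or retry
            self._owners[resource] = server
            return True

    def release(self, resource, server):
        with self._guard:
            if self._owners.get(resource) == server:
                del self._owners[resource]

# Two servers contending for the same file's metadata:
locks = ClusterLockManager()
assert locks.acquire("volume1/file.dat", "server1")
assert not locks.acquire("volume1/file.dat", "server2")  # serialized
locks.release("volume1/file.dat", "server1")
assert locks.acquire("volume1/file.dat", "server2")
```

A local spinlock or semaphore cannot provide this guarantee, because its state lives in the memory of a single machine.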

A complete description of the technology behind creating SAN file systems is beyond the scope of this book. Suffice it to say that the issues involved include the following:

- Arbitrating access to files and to file system metadata across multiple servers (distributed locking)
- Keeping each server's cached data and metadata coherent with changes made by the other servers
- Serializing updates to on-disk metadata so that the volume's structures remain consistent
- Recovering when one of the participating servers fails while holding locks or with updates in flight

At an extremely high level, SAN file systems may be designed in two ways:

- Symmetric, in which all servers are peers and each runs the full file system code, including the code that manipulates file system metadata
- Asymmetric, in which a dedicated metadata server handles all file system metadata operations, while the other servers perform data I/O directly to the storage devices

The asymmetric approach to a SAN file system is illustrated in Figure 6.16:

Step 1. A client connects to a server and requests some data from a file using a protocol such as CIFS (explained in Chapter 3).

Step 2. The server contacts a metadata server and obtains information about the storage device on which the file resides, including particulars of the disk blocks on which the file data is stored.

Step 3. At this stage the server can accomplish all I/O directly, using the data it received from the metadata server.

Figure 6.16. SAN File System with Metadata Server
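A minimal sketch of Steps 2 and 3 follows. The MetadataServer class, the extent map format, and the resolve method are assumptions made for illustration; no actual product exposes exactly this interface.

```python
from dataclasses import dataclass

@dataclass
class Extent:
    lun: int          # which SAN disk holds this run of the file
    start_lba: int    # first logical block address of the run
    block_count: int  # number of 512-byte blocks in the run

class MetadataServer:
    """Step 2: answers the question 'where on disk does this file live?'"""
    def __init__(self):
        # Hypothetical layout for a single file.
        self._layout = {"/share/video.mpg": [Extent(lun=3, start_lba=8192, block_count=2048)]}

    def resolve(self, path):
        return self._layout[path]

def read_file(path, mds, read_blocks):
    # Step 1 has already happened: a client asked this server for the file.
    extents = mds.resolve(path)        # Step 2: consult the metadata server
    data = bytearray()
    for e in extents:                  # Step 3: direct block I/O to the SAN disk
        data += read_blocks(e.lun, e.start_lba, e.block_count)
    return bytes(data)

# Stand-in for a raw SCSI read; a real server would issue block I/O here.
fake_disk_read = lambda lun, lba, count: b"\x00" * (count * 512)
assert len(read_file("/share/video.mpg", MetadataServer(), fake_disk_read)) == 2048 * 512
```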

6.6.3 Commercially Available SAN File Systems

Some vendors have implemented SAN file systems for the Windows NT platform using the asymmetric approach. Examples include EMC, with its Celerra HighRoad product line; Tivoli, with its SANergy product; and ADIC, with its StorNext product (formerly known as CentraVision). All of these products implement the metadata server on a Windows server and support access to the metadata server by secondary Windows servers. Some of these products support a standby metadata server; some do not. In addition, some of these products support servers running other operating systems (such as NetWare, UNIX, or Solaris) accessing the metadata server, and some do not.

It is interesting to explore how such functionality is implemented and how execution proceeds with respect to the Windows NT I/O stack.

Figure 6.17 shows the Windows NT network I/O stack, as well as the local storage (Storport and SCSI) I/O stack. The SAN file system filter driver (shaded in the figure) layers itself over the network file system in general and the CIFS redirector in particular. The filter driver intercepts file open, close, create, and delete requests and lets them flow along the regular network file system stack. The interception is simply to register a completion routine. For all files successfully opened, the filter driver then optionally obtains information about the exact disk track, sector, and blocks where the file data resides.

Figure 6.17. Windows NT SAN File System I/O Stack

This is done for all large files. Some implementations choose not to do it for small files, the underlying thought being that the overhead of obtaining disk track or sector information for a small file is comparable to that of simply reading or writing those few sectors. Thereafter, all file operations, such as reads and writes, that do not involve manipulation of file system metadata are handled directly by block-level I/O between the server and the storage disk.
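The routing decision just described can be summarized in a few lines. This is a sketch only; the 64 KB cutoff and the function names are assumptions for illustration, and real products choose their own thresholds.

```python
SMALL_FILE_CUTOFF = 64 * 1024  # assumed threshold; real products differ

def route_read(file_size, have_extent_map, read_via_cifs, read_via_blocks):
    """Send small-file I/O down the ordinary network file system path;
    satisfy large-file I/O by direct block I/O using the extent map
    obtained from the metadata server."""
    if file_size < SMALL_FILE_CUTOFF or not have_extent_map:
        # Extent lookup would cost about as much as the I/O itself.
        return read_via_cifs()
    # Bypass the network file server entirely.
    return read_via_blocks()
```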

The drawback of having a centralized metadata server is that this server can become a bottleneck, as well as a single point of failure. Some vendors therefore provide the capability for a standby metadata server to take over if the primary metadata server fails. On the other hand, the metadata server is the only server that caches metadata, so clusterwide I/O to read and write metadata is avoided.
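One simple way a standby could detect a failed primary is a heartbeat timeout, sketched below under an assumed five-second timeout. The names and the mechanism are illustrative; shipping products use more robust cluster membership protocols.

```python
import time

HEARTBEAT_TIMEOUT = 5.0  # seconds of silence before takeover (assumed value)

class StandbyMetadataServer:
    def __init__(self):
        self._last_heartbeat = time.monotonic()
        self.active = False  # True once this standby has taken over

    def on_heartbeat(self):
        # Called whenever the primary's periodic heartbeat arrives.
        self._last_heartbeat = time.monotonic()

    def check(self):
        # Promote this standby if the primary has gone silent too long.
        if not self.active and time.monotonic() - self._last_heartbeat > HEARTBEAT_TIMEOUT:
            self.active = True  # begin serving metadata requests
        return self.active
```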

