8.1. Hierarchical Filesystem Management

The operations defined for local filesystems are divided into two parts. Common to all local filesystems are hierarchical naming, locking, quotas, attribute management, and protection. These features, which are independent of how data are stored, are provided by the UFS code described in the first seven sections of this chapter. The other part of the local filesystem, the filestore, is concerned with the organization and management of the data on the storage media. Storage is managed by the datastore filesystem operations described in the final two sections of this chapter.

The vnode operations defined for doing hierarchical filesystem operations are shown in Table 8.1. The most complex of these operations is that for doing a lookup. The filesystem-independent part of the lookup is described in Section 6.5. The algorithm used to look up a pathname component in a directory is described in Section 8.3.

Table 8.1. Hierarchical filesystem operations.

| Operation done | Operator names |
|---|---|
| pathname searching | lookup |
| name creation | create, mknod, link, symlink, mkdir |
| name change/deletion | rename, remove, rmdir |
| attribute manipulation | access, getattr, setattr |
| object interpretation | open, readdir, readlink, mmap, close |
| process control | advlock, ioctl, poll |
| object management | lock, unlock, inactive, reclaim |

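The grouping in Table 8.1 can be written down as data, which makes it easy to ask which class an operator belongs to. The C sketch below is illustrative only: the names `hier_op_class`, `hier_ops`, `hier_op_class_of`, and `hier_op_count` are hypothetical and are not the kernel's vnode-operations vector. Only the operator names and their groupings are taken from the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Operation classes, one per row of Table 8.1 (enum names are hypothetical). */
enum hier_op_class {
    OP_PATHNAME_SEARCHING,
    OP_NAME_CREATION,
    OP_NAME_CHANGE_DELETION,
    OP_ATTRIBUTE_MANIPULATION,
    OP_OBJECT_INTERPRETATION,
    OP_PROCESS_CONTROL,
    OP_OBJECT_MANAGEMENT,
    OP_UNKNOWN
};

struct hier_op {
    const char *name;          /* operator name as listed in the table */
    enum hier_op_class cls;    /* row of the table it appears in */
};

static const struct hier_op hier_ops[] = {
    { "lookup",   OP_PATHNAME_SEARCHING },
    { "create",   OP_NAME_CREATION },
    { "mknod",    OP_NAME_CREATION },
    { "link",     OP_NAME_CREATION },
    { "symlink",  OP_NAME_CREATION },
    { "mkdir",    OP_NAME_CREATION },
    { "rename",   OP_NAME_CHANGE_DELETION },
    { "remove",   OP_NAME_CHANGE_DELETION },
    { "rmdir",    OP_NAME_CHANGE_DELETION },
    { "access",   OP_ATTRIBUTE_MANIPULATION },
    { "getattr",  OP_ATTRIBUTE_MANIPULATION },
    { "setattr",  OP_ATTRIBUTE_MANIPULATION },
    { "open",     OP_OBJECT_INTERPRETATION },
    { "readdir",  OP_OBJECT_INTERPRETATION },
    { "readlink", OP_OBJECT_INTERPRETATION },
    { "mmap",     OP_OBJECT_INTERPRETATION },
    { "close",    OP_OBJECT_INTERPRETATION },
    { "advlock",  OP_PROCESS_CONTROL },
    { "ioctl",    OP_PROCESS_CONTROL },
    { "poll",     OP_PROCESS_CONTROL },
    { "lock",     OP_OBJECT_MANAGEMENT },
    { "unlock",   OP_OBJECT_MANAGEMENT },
    { "inactive", OP_OBJECT_MANAGEMENT },
    { "reclaim",  OP_OBJECT_MANAGEMENT },
};

#define HIER_OPS_COUNT (sizeof(hier_ops) / sizeof(hier_ops[0]))

/* Return the class of the named operator, or OP_UNKNOWN if it is not listed. */
enum hier_op_class hier_op_class_of(const char *name)
{
    for (size_t i = 0; i < HIER_OPS_COUNT; i++) {
        if (strcmp(hier_ops[i].name, name) == 0)
            return hier_ops[i].cls;
    }
    return OP_UNKNOWN;
}

/* Count how many operators the table lists for a given class. */
size_t hier_op_count(enum hier_op_class cls)
{
    size_t n = 0;
    for (size_t i = 0; i < HIER_OPS_COUNT; i++) {
        if (hier_ops[i].cls == cls)
            n++;
    }
    return n;
}
```

For example, `hier_op_class_of("mkdir")` yields `OP_NAME_CREATION`, and `hier_op_count(OP_OBJECT_INTERPRETATION)` yields 5, matching the row of the table.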
There are five operators for creating names. The operator used depends on the type of object being created. The create operator creates regular files and also is used by the networking code to create AF_LOCAL domain sockets. The link operator creates additional names for existing objects. The symlink operator creates a symbolic link (see Section 8.3 for a discussion of symbolic links). The mknod operator creates character special devices (for compatibility with other UNIX systems that still use them); it is also used to create fifos. The mkdir operator creates directories.

There are three operators for changing or deleting existing names. The rename operator deletes a name for an object in one location and creates a new name for the object in another location. The implementation of this operator is complex when the kernel is dealing with the movement of a directory from one part of the filesystem tree to another. The remove operator removes a name. If the removed name is the final reference to the object, the space associated with the underlying object is reclaimed. The remove operator operates on all object types except directories; they are removed using the rmdir operator.

Three operators are supplied for object attributes. The kernel retrieves attributes from an object using the getattr operator and stores them using the setattr operator. Access checks for a given user are provided by the access operator.

Five operators are provided for interpreting objects. The open and close operators have only peripheral use for regular files, but when they are used on special devices, they notify the appropriate device driver of device activation or shutdown. The readdir operator converts the filesystem-specific format of a directory to the standard list of directory entries expected by an application.
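The conversion done by readdir can be sketched with a toy directory format. Everything about the format below is an assumption made for illustration: a packed record of a 4-byte inode number, a 2-byte record length, a 1-byte name length, and the name bytes, stored in host byte order, with inode 0 marking an unused record. The names `toy_put_record`, `toy_readdir`, and `std_dirent` are hypothetical and are not the UFS implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NAME_MAX 255
#define TOY_HDR_SIZE 7  /* 4-byte inode + 2-byte record length + 1-byte name length */

/* The "standard" entry handed to the application (hypothetical layout). */
struct std_dirent {
    uint32_t ino;
    char     name[TOY_NAME_MAX + 1];
};

/* Append one toy-format record at offset off; return the offset after it. */
size_t toy_put_record(uint8_t *buf, size_t off, uint32_t ino, const char *name)
{
    assert(strlen(name) <= TOY_NAME_MAX);
    uint8_t  namelen = (uint8_t)strlen(name);
    uint16_t reclen  = (uint16_t)(TOY_HDR_SIZE + namelen);

    memcpy(buf + off, &ino, 4);
    memcpy(buf + off + 4, &reclen, 2);
    buf[off + 6] = namelen;
    memcpy(buf + off + TOY_HDR_SIZE, name, namelen);
    return off + reclen;
}

/*
 * Walk the toy-format directory in buf[0..len) and fill out[] with
 * standard entries, skipping unused records (inode 0).
 * Returns the number of entries written, or -1 if a record is malformed.
 */
int toy_readdir(const uint8_t *buf, size_t len,
                struct std_dirent *out, size_t max_out)
{
    size_t off = 0;
    int n = 0;

    while (off + TOY_HDR_SIZE <= len && (size_t)n < max_out) {
        uint32_t ino;
        uint16_t reclen;
        uint8_t  namelen;

        /* memcpy avoids unaligned loads from the packed buffer. */
        memcpy(&ino, buf + off, 4);
        memcpy(&reclen, buf + off + 4, 2);
        namelen = buf[off + 6];

        /* A record must hold its header and name and fit in the buffer. */
        if (reclen < TOY_HDR_SIZE + namelen || off + reclen > len)
            return -1;

        if (ino != 0) {
            out[n].ino = ino;
            memcpy(out[n].name, buf + off + TOY_HDR_SIZE, namelen);
            out[n].name[namelen] = '\0';
            n++;
        }
        off += reclen;
    }
    return n;
}
```

The caller sees only `struct std_dirent` values; the record lengths and the unused-record convention stay inside the conversion routine, which is the separation of concerns the text describes.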
Note that the interpretation of the contents of a directory is provided by the hierarchical filesystem-management layer; the filestore code considers a directory as just another object holding data. The readlink operator returns the contents of a symbolic link. As with directories, the filestore code considers a symbolic link as just another object holding data. The mmap operator prepares an object to be mapped into the address space of a process.

Three operators are provided to allow process control over objects. The poll operator allows a process to find out whether an object is ready to be read or written. The ioctl operator passes control requests to a special device. The advlock operator allows a process to acquire or release an advisory lock on an object. None of these operators modifies the object in the filestore. They are simply using the object for naming or directing the desired operation.

There are four operators for management of the objects. The inactive and reclaim operators were described in Section 6.6. The lock and unlock operators allow the callers of the vnode interface to provide hints to the code that implements operations on the underlying objects. Stateless filesystems such as NFS ignore these hints. Stateful filesystems, however, can use hints to avoid doing extra work. For example, an open system call requesting that a new file be created requires two steps. First, a lookup call is done to see if the file already exists. Before the lookup is started, a lock request is made on the directory being searched. While scanning through the directory checking for the name, the lookup code also identifies a location within the directory that contains enough space to hold the new name. If the lookup returns successfully (meaning that the name does not already exist), the open code verifies that the user has permission to create the file.
If the caller is not eligible to create the new file, then they are expected to call unlock to release the lock that they acquired during the lookup. Otherwise, the create operation is called. If the filesystem is stateful and has been able to lock the directory, then it can simply create the name in the previously identified space, because it knows that no other processes will have had access to the directory. Once the name is created, an unlock request is made on the directory. If the filesystem is stateless, then it cannot lock the directory, so the create operator must rescan the directory to find space and to verify that the name has not been created since the lookup.
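The lock, lookup, create, unlock sequence described above can be sketched as a small user-space model. This is a minimal sketch under stated assumptions, not kernel code: the directory is a fixed array of name slots, an empty string marks free space, and the names `toy_dir`, `toy_lock`, `toy_unlock`, `toy_lookup_for_create`, `toy_create`, and `toy_open_create` are all hypothetical. The `scans` counter exists only to make the saved work visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DIR_SLOTS 8
#define NAME_LEN  32

struct toy_dir {
    char names[DIR_SLOTS][NAME_LEN]; /* empty string marks a free slot */
    bool stateful;   /* can this filesystem honor lock hints? */
    bool locked;     /* set by toy_lock on a stateful filesystem */
    int  hint_slot;  /* free slot remembered by lookup, or -1 */
    int  scans;      /* number of full directory scans performed */
};

void toy_dir_init(struct toy_dir *d, bool stateful)
{
    memset(d, 0, sizeof(*d));
    d->stateful = stateful;
    d->hint_slot = -1;
}

/* Scan the directory: return the slot holding name, or -1 if absent.
 * The first free slot seen (or -1) is stored in *free_slot. */
static int toy_scan(struct toy_dir *d, const char *name, int *free_slot)
{
    int found = -1;

    *free_slot = -1;
    d->scans++;
    for (int i = 0; i < DIR_SLOTS; i++) {
        if (d->names[i][0] == '\0') {
            if (*free_slot < 0)
                *free_slot = i;
        } else if (strcmp(d->names[i], name) == 0) {
            found = i;
        }
    }
    return found;
}

/* A stateless filesystem ignores the lock hint. */
void toy_lock(struct toy_dir *d)   { if (d->stateful) d->locked = true; }
void toy_unlock(struct toy_dir *d) { d->locked = false; d->hint_slot = -1; }

/* Lookup on behalf of a create: true means the name does not exist yet.
 * The free slot is remembered only while the directory is locked. */
bool toy_lookup_for_create(struct toy_dir *d, const char *name)
{
    int free_slot;
    bool absent = toy_scan(d, name, &free_slot) < 0;

    d->hint_slot = (absent && d->locked) ? free_slot : -1;
    return absent;
}

/* Create the name: returns 0 on success, -1 if it exists or no space is left. */
int toy_create(struct toy_dir *d, const char *name)
{
    int slot = d->hint_slot;

    if (!(d->locked && slot >= 0)) {
        /* No trustworthy hint: rescan for space and recheck the name. */
        if (toy_scan(d, name, &slot) >= 0)
            return -1;
    }
    if (slot < 0)
        return -1;
    strncpy(d->names[slot], name, NAME_LEN - 1);
    d->names[slot][NAME_LEN - 1] = '\0';
    return 0;
}

/* The open-with-create sequence: lock, lookup, permission check, create, unlock. */
int toy_open_create(struct toy_dir *d, const char *name, bool may_create)
{
    int rc;

    toy_lock(d);
    if (!toy_lookup_for_create(d, name) || !may_create) {
        toy_unlock(d);   /* caller releases the lock taken for the lookup */
        return -1;
    }
    rc = toy_create(d, name);
    toy_unlock(d);
    return rc;
}
```

In this model a stateful directory creates a new name after one scan, because the slot found during the lookup is still valid under the lock, while a stateless directory needs a second scan inside `toy_create`.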