Solaris Internals: Solaris 10 and OpenSolaris Kernel Architecture (2nd Edition)
9.5. Segment Drivers
Another example of the object-oriented approach to memory management is the memory "segment" object. Memory segments manage the mapping of a linear range of virtual memory into an address space. The mapping is between the address space and some type of device. The objective of the memory segment is to allow both memory and devices to be mapped into an address space. Traditionally, this required hard-coding memory and device information into the address space handlers for each device. The object architecture allows different behaviors for different segments.

For example, one segment might be a mapping of a file into an address space (with mmap()), and another segment might be the mapping of a hardware device into the process's address space (a graphics framebuffer). In this case, the segment driver provides a similar view of linear address space, even though the file mapping operation with mmap() uses pages of memory to cache the file data, whereas the framebuffer device maps the hardware device into the address space.

The flexibility of the segment object allows us to use virtually any abstraction to represent a linear address space that is visible to a process, regardless of the real facilities behind the scenes.

struct seg {
        caddr_t         s_base;   /* base virtual address */
        size_t          s_size;   /* size in bytes */
        uint_t          s_szc;    /* max page size code */
        uint_t          s_flags;  /* flags for segment, see below */
        struct as       *s_as;    /* containing address space */
        avl_node_t      s_tree;   /* AVL tree links to segs in this as */
        struct seg_ops  *s_ops;   /* ops vector: see below */
        void            *s_data;  /* private data for instance */
};

See vm/seg.h
To implement an address space, a segment driver implementation is required to provide at least the following: functions to create a mapping for a linear address range, page fault handling routines to deal with machine exceptions within that linear address range, and a function to destroy the mapping. These functions are packaged together into a segment driver, which is an instantiation of the segment object interface. Figure 9.9 illustrates the relationship between an address space and a segment and shows a segment mapping the heap space of a process.

Figure 9.9. Segment Interface
A segment driver implements a subset of the methods described in Table 9.5, as well as a constructor function to create the first instance of the object. In a vnode segment, the functions in the segment operations structure, s_ops, point to functions within the vnode segment driver and are prefixed with segvn. A segment object is created when another subsystem calls as_map() to create a mapping at a specific address. The segment's create routine is passed as an argument to as_map(), a segment object is created, and a segment object pointer is returned. Once the segment is created, other parts of the virtual memory system can use the segment method operations to call into the segment for different address space operations without knowing what the underlying segment driver is.

For example, when a file is mapped into an address space with mmap(), the address space map routine as_map() is called with segvn_create() (the vnode segment driver constructor) as an argument, which in turn calls into the seg_vn segment driver to create the mapping. The segment object is created and inserted into the segment list for the address space (struct as), and from that point on, the address space can perform operations on the mapping without knowing what the underlying segment is.

The address space routines operate on the segment by calling the segment operation macros. For example, if the address space layer wants to call the fault handler for a segment, it calls SEGOP_FAULT(), which invokes the segment-specific page fault method, as shown below.

#define SEGOP_FAULT(h, s, a, l, t, rw) \
        (*(s)->s_ops->fault)((h), (s), (a), (l), (t), (rw))

See vm/seg.h

The Solaris kernel is implemented with a range of segment drivers for various functions. The different types of drivers are shown in Table 9.4.
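The dispatch through the ops vector described above can be modeled in user space. The following is a minimal sketch, not kernel code: every type and function prefixed with mock_ or MOCK_ is a hypothetical stand-in invented for illustration, only the fault method is modeled, and the segment lookup is a linear scan, whereas struct seg carries AVL tree links.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mock_seg;

/* Ops vector: one function pointer per segment method (only fault here). */
struct mock_seg_ops {
    int (*fault)(struct mock_seg *seg, uintptr_t addr);
};

/* Simplified stand-in for struct seg. */
struct mock_seg {
    uintptr_t s_base;                   /* base virtual address */
    size_t s_size;                      /* size in bytes */
    const struct mock_seg_ops *s_ops;   /* ops vector */
    void *s_data;                       /* driver-private data (unused here) */
};

/* Same shape as SEGOP_FAULT(), with fewer arguments. */
#define MOCK_SEGOP_FAULT(s, a) (*(s)->s_ops->fault)((s), (a))

enum { MOCK_FILE_FAULT = 1, MOCK_DEV_FAULT = 2, MOCK_NO_SEG = -1 };

/* Hypothetical "file" driver: a real one would read a page from the file. */
int mock_vn_fault(struct mock_seg *seg, uintptr_t addr)
{
    assert(addr >= seg->s_base && addr - seg->s_base < seg->s_size);
    return MOCK_FILE_FAULT;
}

/* Hypothetical "device" driver: a real one would map device memory. */
int mock_dev_fault(struct mock_seg *seg, uintptr_t addr)
{
    assert(addr >= seg->s_base && addr - seg->s_base < seg->s_size);
    return MOCK_DEV_FAULT;
}

const struct mock_seg_ops mock_vn_ops = { mock_vn_fault };
const struct mock_seg_ops mock_dev_ops = { mock_dev_fault };

/*
 * Address-space layer: find the segment containing addr and call its
 * fault method without knowing which driver is behind it.
 */
int mock_as_fault(struct mock_seg *segs, size_t nsegs, uintptr_t addr)
{
    for (size_t i = 0; i < nsegs; i++) {
        struct mock_seg *s = &segs[i];
        if (addr >= s->s_base && addr - s->s_base < s->s_size)
            return MOCK_SEGOP_FAULT(s, addr);
    }
    return MOCK_NO_SEG;
}
```

The caller of mock_as_fault() never names a driver; the choice was made once, when the segment's s_ops pointer was set at creation time.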
Most of the process address space mapping, including executable text, data, heap, stack, and memory-mapped files, is performed with the vnode segment driver, seg_vn. Other types of mappings that don't have vnodes associated with them require different segment drivers. The other segment drivers are typically associated with kernel memory mappings or hardware devices, such as graphics adapters.
Table 9.5 describes segment driver methods implemented in Solaris 10.
9.5.1. The vnode Segment: seg_vn
The most widely used segment driver is the vnode segment driver, seg_vn. The seg_vn driver maps files (or vnodes) into a process address space, using physical memory as a cache. The seg_vn segment driver also creates anonymous memory within the process address space for the heap and stack and provides support for System V (non-ISM) shared memory. (See Section 4.4.) The seg_vn segment driver manages the following mappings into process address space:
9.5.1.1. Memory Mapped Files
We can map a file into a process's address space with the mmap system call. (See mmap(2).) When we map a file into our address space, we call into the address space routines to create a new segment, a vnode segment. A vnode segment handles memory address translation and page faults for the memory range requested in the mmap system call, and the new segment is added to the list of segments in the process's address space. When the segment is created, the seg_vn driver initializes the segment structure with the address and length of the mapping, then creates a seg_vn-specific data structure within the segment structure's s_data field. The seg_vn-specific data structure holds all of the information the seg_vn driver needs to handle the address mappings for the segment.

The seg_vn-specific data structure (struct segvn_data) contains pointers to the vnode that is mapped and to any anonymous memory that has been allocated for this segment. The file system does most of the work of mapped files once the mapping is created. As a result, the seg_vn driver is fairly simple; most of the seg_vn work is done during creation and deletion of the mapping. The more complex part of the seg_vn driver implementation is its handling of anonymous memory pages within the segment, which we discuss in the sections that follow.

When we create a file mapping, we put the vnode and offset of the file being mapped into the segvn_data structure members, vp and offset. The seg_vn data structure is shown below; Figure 9.10 illustrates the seg_vn segment driver vnode relationship.

Figure 9.10. The seg_vn Segment Driver Vnode Relationship
typedef struct segvn_data {
        krwlock_t lock;         /* protect segvn_data and vpage array */
        kmutex_t segp_slock;    /* serialize insertions into seg_pcache */
        uchar_t pageprot;       /* true if per page protections present */
        uchar_t prot;           /* current segment prot if pageprot == 0 */
        uchar_t maxprot;        /* maximum segment protections */
        uchar_t type;           /* type of sharing done */
        u_offset_t offset;      /* starting offset of vnode for mapping */
        struct vnode *vp;       /* vnode that segment mapping is to */
        ulong_t anon_index;     /* starting index into anon_map anon array */
        struct anon_map *amp;   /* pointer to anon share structure, if needed */
        struct vpage *vpage;    /* per-page information, if needed */
        struct cred *cred;      /* mapping credentials */
        size_t swresv;          /* swap space reserved for this segment */
        uchar_t advice;         /* madvise flags for segment */
        uchar_t pageadvice;     /* true if per page advice set */
        ushort_t flags;         /* flags - from sys/mman.h */
        ssize_t softlockcnt;    /* # of pages SOFTLOCKED in seg */
        lgrp_mem_policy_info_t policy_info; /* memory allocation policy */
} segvn_data_t;

See vm/seg_vn.h
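To make the role of the vp and offset members concrete, the sketch below computes which part of the file backs a given address in a mapped-file segment: the mapping's starting offset plus the distance of the address from the segment base, rounded down to a page boundary. This is an illustration under stated assumptions, not the seg_vn source: the mock_ types are simplified stand-ins, mock_fault_file_offset() is a hypothetical helper, and the 8-Kbyte page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_PAGESIZE 8192u     /* assumed base page size */

/* Stand-in for the two segvn_data members used by a file mapping. */
struct mock_segvn_data {
    uint64_t offset;            /* starting offset of the mapping in the file */
    const void *vp;             /* stand-in for the vnode pointer (unused here) */
};

/* Stand-in for struct seg. */
struct mock_seg {
    uintptr_t s_base;           /* base virtual address */
    size_t s_size;              /* size in bytes */
    struct mock_segvn_data *s_data;
};

/*
 * Return the page-aligned file offset that backs addr, or -1 if addr
 * lies outside the segment.
 */
int64_t mock_fault_file_offset(const struct mock_seg *seg, uintptr_t addr)
{
    assert(seg->s_data != NULL);
    if (addr < seg->s_base || addr - seg->s_base >= seg->s_size)
        return -1;

    uint64_t off = seg->s_data->offset + (addr - seg->s_base);
    return (int64_t)(off & ~(uint64_t)(MOCK_PAGESIZE - 1));
}
```

For a segment based at 0x20000 that maps a file starting at offset 0x4000, an access at 0x22100 resolves to the file page at offset 0x6000.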
Creating a mapping for a file is done with the mmap() system call, which calls the map method for the file system that contains the file. For example, calling mmap() for a file on a UFS file system will call ufs_map(), which in turn calls into the seg_vn driver to create a mapped file segment in the address space with the segvn_create() function.

At this point we create an actual virtual memory mapping by talking to the hardware through the hardware address translation functions, using the hat_map() function. The hat_map() function is the central function for creating address space mappings. It calls into the hardware-specific memory implementation for the platform to program the hardware MMU, so that memory address references within the supplied address range will trigger the page fault handler in the segment driver until a valid physical memory page has been placed at the accessed location. Once the hardware MMU mapping is established, the seg_vn driver can begin handling page faults within that segment.

Having established a valid hardware mapping for our file, we can look at how our mapped file is effectively read into the address space. The hardware MMU can generate traps for memory accesses to the memory within that segment. These traps will be routed to our seg_vn driver through the as_fault() routine. (See Section 9.4.4.) The first time we access a memory location within our segment, the segvn_fault() page fault handling routine is called. This fault handler recognizes our segment as a mapped file (by looking in the segvn_data structure) and simply calls into the vnode's file system (in this case, with ufs_getpage()) to read in a page-sized chunk from the file system. Subsequent accesses to that memory, which is now backed by physical memory, are normal memory accesses. It's not until a page is stolen from behind the segment (the page scanner can do this) that a page fault will occur again.
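From the application's side, all of this is hidden behind mmap() and ordinary loads. The user-level sketch below uses only standard POSIX calls; the function names sum_mapped_file() and make_temp_file() are hypothetical helpers written for this illustration. It maps a file read-only and sums its bytes. The mmap() call only establishes the mapping, and file data is brought in when the loop first touches each page, which is the point at which the fault path described above would typically run.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Sum the bytes of a file by reading it through a private, read-only
 * mapping. Returns -1 on error or if the file is empty.
 */
long sum_mapped_file(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    /* Establish the mapping; no file data has been accessed yet. */
    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);      /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;

    long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += p[i];    /* the first touch of a page may take a page fault */

    munmap(p, len);
    return sum;
}

/*
 * Demo helper: create a temporary file holding len bytes of text.
 * path_template must end in "XXXXXX". Returns 0 on success, -1 on error.
 */
int make_temp_file(char *path_template, const char *text, size_t len)
{
    int fd = mkstemp(path_template);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, text, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}
```

No read() call appears anywhere in sum_mapped_file(): the only file I/O is whatever the kernel performs on the process's behalf while resolving faults.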
Writing to a mapped file is done by updating the contents of memory within the mapped segment. The file is not updated instantly, since there is no software- or hardware-initiated event to trigger any such write. Updates occur when the file system flush daemon finds that the page of memory has been modified and then pushes the page to the file system with the file system's putpage routine, in this case, ufs_putpage().

9.5.2. Copy-on-Write
The copy-on-write process occurs when a process writes to a page that is mapped with MAP_PRIVATE. This process prevents other mappings to the page from seeing changes that are made. seg_vn implements copy-on-write by setting the hardware MMU permissions of a segment to read-only and setting the segment permissions to read-write. When a process attempts to write to a mapping that is configured this way, the MMU generates an exception and causes a page fault on the page in question. The page fault handler in seg_vn looks at the protection mode for the segment; if it is mapped private and read-write, then the handler initiates a copy-on-write.

The copy-on-write unmaps the shared vnode page where the fault occurred, creates a page of anonymous memory at that address, and then copies the contents of the old page to the new anonymous page. All of this happens in the context of the page fault, so the process never knows what's happening underneath it.

The copy-on-write operation behaves slightly differently under different memory conditions. When memory is low, rather than creating a new physical memory page, the copy-on-write steals the page from the offset of the file underneath and renames it to be the new anonymous page. This only occurs when free memory is lower than the system parameter minfree.

9.5.3. Page Protection and Advice
The seg_vn segment supports memory protection modes on either the whole segment or individual pages within a segment. Whole segment protection is implemented by the segvn_data structure member, prot; its enablement depends on the boolean switch, pageprot, in the segvn_data structure. If pageprot is equal to zero, then the entire segment's protection mode is set by prot; otherwise, page-level protection is enabled.

Page-level protection is implemented by an array of page descriptors pointed to by the vpage member of segvn_data; the vpage structure is shown below. If page-level protection is enabled, then vpage points to an array of vpage structures. Every possible page in the segment's address range has one array entry, which means that the number of vpage members is the segment virtual address space size divided by the fundamental page size for the segment (8 Kbytes on UltraSPARC).

struct vpage {
        uchar_t nvp_prot;       /* see <sys/mman.h> prot flags */
        uchar_t nvp_advice;     /* pplock & <sys/mman.h> madvise flags */
};

See vm/vpage.h

The vpage entry for each page uses the standard memory protection bits (see mmap(2)). The per-page vpage structures are also used to implement memory advice for memory-mapped files in the seg_vn segment.
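The choice between segment-wide and per-page protection can be sketched as a small lookup. The code below is a user-space model under stated assumptions, not the kernel implementation: the mock_ structures keep only the fields discussed above, mock_seg_pages() and mock_page_prot() are hypothetical helpers, the protection values are plain small integers rather than the <sys/mman.h> constants, and the 8-Kbyte page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_PAGESIZE 8192u     /* assumed base page size (UltraSPARC) */

/* Stand-in for struct vpage. */
struct mock_vpage {
    unsigned char nvp_prot;     /* per-page protection bits */
    unsigned char nvp_advice;   /* per-page advice */
};

/* Stand-in for the protection-related members of segvn_data. */
struct mock_segvn_data {
    unsigned char pageprot;     /* nonzero if per-page protections present */
    unsigned char prot;         /* segment protection if pageprot == 0 */
    struct mock_vpage *vpage;   /* one entry per page, if needed */
};

/* Stand-in for struct seg. */
struct mock_seg {
    uintptr_t s_base;
    size_t s_size;
    struct mock_segvn_data *s_data;
};

/* Number of vpage entries: segment size divided by the page size. */
size_t mock_seg_pages(const struct mock_seg *seg)
{
    return seg->s_size / MOCK_PAGESIZE;
}

/* Protection in effect for the page containing addr. */
unsigned char mock_page_prot(const struct mock_seg *seg, uintptr_t addr)
{
    const struct mock_segvn_data *svd = seg->s_data;

    assert(addr >= seg->s_base && addr - seg->s_base < seg->s_size);
    if (svd->pageprot == 0)
        return svd->prot;       /* one protection mode for the whole segment */

    /* Per-page protection: index the vpage array by page number. */
    return svd->vpage[(addr - seg->s_base) / MOCK_PAGESIZE].nvp_prot;
}
```

A segment that never has its protections changed on a sub-range needs no vpage array at all in this model; the array costs two bytes per page only once per-page protection or advice is in use.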