To understand the mechanics of region sharing, let's revisit virtual addressing. The virtual address is really of concern only from the process's point of view. In order for a process thread to run, the kernel must first set up the processor space registers to point to the process's four quadrants. Once the environment has been initialized for a thread, future context switches for the thread will simply save and restore these registers as part of the thread's process control block (pcb) in its uarea.

The representation of the VAS as a large checkerboard helps us visualize and discuss memory management concepts. As the process maps its logical pregions to the VAS, the pregions must be ordered so that no two overlap within the process's logical view. The kernel maintains a simple allocation map to guarantee that each process gets unique space numbers for its private quadrants. System-shared quadrants have resource maps to make sure that each shared object mapped to them occupies a unique address range. Resource maps exist for each of the quadrants in the kernel space, and special maps are created for the first shared quadrants 2 and 3 (if memory windows are enabled, additional maps will be created). All other quadrant mappings are controlled by the address assignments made in the process pregion chains.

If different processes need to share a region, the second one simply duplicates the first one's pregion, inserts it in its pregion list, and increments the r_refcnt in the region structure.

It's easy to get lost in the details of the region structure, but we need to remember what its purpose is: to map a contiguous block of virtual pages to specific physical front-store or back-store locations. To handle this task, the region structure is merely a header that leads us to pairs of virtual frame descriptors and disk block descriptors, one for each page frame the region manages.
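The sharing step described above (duplicate the pregion, link it into the second process's list, increment r_refcnt) can be sketched in a few lines of C. This is a minimal user-space illustration, not the kernel's code: only r_refcnt comes from the text, while share_region(), r_pages, p_next, p_reg, and p_vaddr are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel structures; only r_refcnt is named
 * in the text, every other field name here is an assumption. */
struct region {
    int           r_refcnt;  /* number of pregions attached to this region */
    unsigned long r_pages;   /* hypothetical: size of the region in pages */
};

struct pregion {
    struct pregion *p_next;   /* hypothetical: next pregion in the chain */
    struct region  *p_reg;    /* hypothetical: region this pregion maps */
    unsigned long   p_vaddr;  /* hypothetical: virtual offset of the mapping */
};

/* Let a second process share the region behind `src` by duplicating the
 * pregion, linking the copy into the second process's chain, and counting
 * the new reference. Returns the new pregion, or NULL if allocation fails. */
static struct pregion *share_region(struct pregion **chain,
                                    const struct pregion *src)
{
    struct pregion *dup;

    assert(chain != NULL && src != NULL && src->p_reg != NULL);

    dup = malloc(sizeof *dup);
    if (dup == NULL)
        return NULL;

    *dup = *src;              /* duplicate the first process's pregion */
    dup->p_next = *chain;     /* insert at the head of the second chain */
    *chain = dup;
    dup->p_reg->r_refcnt++;   /* one more pregion now references the region */
    return dup;
}
```

The sketch inserts at the head of the list for brevity. A real kernel would presumably keep the chain ordered by address and hold the appropriate locks while updating the reference count.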
Virtual Page Locator, VFD, and DBD

There are many different answers to the question, Where is my page? Page faults occur when a virtual page's location cannot be determined by the hardware or the first-level fault handler. The kernel must then find the region responsible for the management of the missing page. The region structure is simply a collection of per-page objects that define the current state and location of each page frame in the region set. Figure 6-9 demonstrates these per-page structures.

Figure 6-9. VFD|DBD
Each and every page in a region has a vfd|dbd structure pair; these define the current status of the page. Let's look at an annotated listing of these structures (Listings 6.7 and 6.8).

Listing 6.7. q4> fields struct vfd

The first bit is the valid bit; it indicates that the data in the vfd is valid

    0 0 0 1 u_int pg_v

The next bit sets copy-on-write access mode for a page

    0 1 0 1 u_int pg_cw

When a page is locked for an I/O operation, this bit is set

    0 2 0 1 u_int pg_lock

If an mlock() call has been made for a region, this bit is set

    0 3 0 1 u_int pg_mlock

If lazy swap has been activated and a swap page has been reserved, this bit is set

    0 4 0 1 u_int pg_swresv

On narrow systems the physical page frame number may only use 21 bits (due to the memory size limitations of 32-bit hardware). For a wide kernel, this will be a single 27-bit field

    0 5 0 6 u_int pg_fill
    1 3 2 5 u_int pg_pfn

Listing 6.8. q4> fields struct dbd

The first four bits of this structure denote the page type:

    0000 DBD_NONE    There is no copy of this page on the disk
    0001 DBD_FSTORE  A copy of the page is on the front store
    0010 DBD_BSTORE  A copy of the page is on the back store
    0011 DBD_DZERO   This will be a "demand zero" page and will be filled
                     with zeros when it is first requested
    0100 DBD_DFILL   The page is of type "demand fill." When first allocated
                     the page will not be initially filled with zeros (this
                     is only used during the initialization of pages
                     allocated for use by a UAREA)
    0101 DBD_HOLE    This page will be used to map a sparse memory-mapped
                     file. A read request will return a zero, but a write
                     will cause a fresh page to be allocated and zero-filled

    0 0 0 3 u_int dbd_type

The remaining 28 bits contain pointer(s) to the page's image on either a front store or back store.

    0 3 3 5 u_int dbd_data

The primary job of the region is to organize the vfd|dbd data in an easy-to-search manner. It would be very easy to place the vfd|dbd's in an array and use the relative page number as an index into the array.
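To make the array idea concrete, here is a minimal C sketch of the vfd|dbd pair and a flat, index-by-page-number lookup. The field names and the narrow-kernel vfd widths come from the listings above, and the dbd widths follow the prose (4 type bits plus 28 data bits). Everything else is an assumption made for illustration: struct vfddbd, struct flat_region, flat_lookup(), and locate_page() are hypothetical names, and C leaves bit-field ordering to the compiler, so the in-memory layout need not match the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Narrow-kernel vfd as annotated in Listing 6.7 (5 flag bits, 6 fill, 21 pfn). */
struct vfd {
    unsigned int pg_v      : 1;   /* vfd contents are valid */
    unsigned int pg_cw     : 1;   /* copy-on-write */
    unsigned int pg_lock   : 1;   /* locked for I/O */
    unsigned int pg_mlock  : 1;   /* locked by mlock() */
    unsigned int pg_swresv : 1;   /* lazy-swap page reserved */
    unsigned int pg_fill   : 6;   /* filler on narrow kernels */
    unsigned int pg_pfn    : 21;  /* physical page frame number */
};

/* Page types from Listing 6.8. */
enum dbd_kind {
    DBD_NONE = 0, DBD_FSTORE = 1, DBD_BSTORE = 2,
    DBD_DZERO = 3, DBD_DFILL = 4, DBD_HOLE = 5
};

/* Widths follow the prose: a 4-bit type and 28 bits of locator data. */
struct dbd {
    unsigned int dbd_type : 4;
    unsigned int dbd_data : 28;
};

/* Hypothetical pairing and a naive array-backed region. */
struct vfddbd { struct vfd v; struct dbd d; };

struct flat_region {
    size_t         npages;   /* number of pages the region manages */
    struct vfddbd *pages;    /* one vfd|dbd pair per page */
};

/* Index the array by relative page number; NULL if out of range. */
static struct vfddbd *flat_lookup(const struct flat_region *rp, size_t pgindx)
{
    assert(rp != NULL);
    if (pgindx >= rp->npages)
        return NULL;
    return &rp->pages[pgindx];
}

/* Answer "where is my page?": returns 1 and stores the frame number when
 * the vfd is valid (assumed here to mean the page is resident); otherwise
 * returns 0 and stores the dbd type that says where to fetch it from. */
static int locate_page(const struct vfddbd *p, unsigned int *out)
{
    assert(p != NULL && out != NULL);
    if (p->v.pg_v) {
        *out = p->v.pg_pfn;
        return 1;
    }
    *out = p->d.dbd_type;
    return 0;
}
```

With a flat array, finding the pair for a page costs one bounds check and one index operation.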
This would be a suitable solution if all regions were of a similar size, but this is not the case. Previously, we learned that there are several types of process pregions, and in a case of guilt by association, there are several varieties of regions. The size of these regions varies greatly, from a mere 4 to 8 pages for a uarea to literally millions of pages in a large wide process data region. Another challenge is that some regions may need to grow as the process's threads execute: think of the malloc() call or a memory-mapped file being written to. Now we see the challenge: design a data structure that may be easily allocated and deallocated from kernel memory space, that handles very small and very large data sets, that may be expanded on the fly, and that provides a low-overhead search algorithm. Easy, right? The HP-UX kernel employs a design called the b-tree. Let's take a look.