Mac OS X Internals: A Systems Approach
5.3. High-Level Processor Initialization
Figure 5-8 shows an overview of the control flow of the ppc_init() function, including other notable functions it calls. Note that ppc_init() also marks the transition from assembly-language code to C code.
Figure 5-8. High-level processor initialization
ppc_init() first sets up various fields in the per-processor data area of the boot processor. One of the fields is pp_cbfr [osfmk/console/ppc/serial_console.c], a pointer to a per-processor console buffer used by the kernel to handle multiprocessor console output. Let us look at the key operations performed by each function in the sequence depicted in Figure 5-8.
5.3.1. Before Virtual Memory
thread_bootstrap() [osfmk/kern/thread.c] populates a static thread structure (thread_template) used as a template for fast initialization of newly created threads. It then uses this template to initialize init_thread, another static thread structure. thread_bootstrap() finishes by setting init_thread as the current thread, which in turn loads the SPRG1 register[6] with init_thread. Upon return from thread_bootstrap(), ppc_init() initializes certain aspects of the current thread's machine-dependent state.
[6] SPRG1 holds the active thread.
cpu_bootstrap() [osfmk/ppc/cpu.c] initializes certain locking data structures.
cpu_init() [osfmk/ppc/cpu.c] restores the Timebase Register from values saved in the per_proc_info structure. It also sets the values of some informational fields in the per_proc_info structure.

    // osfmk/ppc/cpu.c
    void
    cpu_init(void)
    {
        // Restore the Timebase
        ...
        proc_info->cpu_type = CPU_TYPE_POWERPC;
        proc_info->cpu_subtype = (cpu_subtype_t)proc_info->pf.rptdProc;
        proc_info->cpu_threadtype = CPU_THREADTYPE_NONE;
        proc_info->running = TRUE;
    }
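Returning to thread_bootstrap(), the template approach can be sketched in isolation. The structure and function names below (demo_thread, demo_thread_bootstrap(), demo_thread_create()) and the default values are hypothetical stand-ins, not the kernel's definitions. The sketch only illustrates why a statically populated template helps: a new thread can be initialized with one structure copy rather than field-by-field assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the kernel's thread structure. */
struct demo_thread {
    int   priority;
    int   state;
    void *stack;
};

static struct demo_thread  demo_template;     /* plays the role of thread_template */
static struct demo_thread  demo_init_thread;  /* plays the role of init_thread */
static struct demo_thread *demo_current;      /* stands in for the value kept in SPRG1 */

/* Fill in the template once, then stamp out the first thread from it. */
void demo_thread_bootstrap(void)
{
    demo_template.priority = 31;    /* illustrative default, not the kernel's value */
    demo_template.state    = 0;
    demo_template.stack    = NULL;

    demo_init_thread = demo_template;   /* one structure copy sets every field */
    demo_current     = &demo_init_thread;
}

/* Later thread creation reuses the template the same way. */
void demo_thread_create(struct demo_thread *t, void *stack)
{
    assert(t != NULL);
    *t = demo_template;
    t->stack = stack;
}
```

Calling demo_thread_bootstrap() once and then demo_thread_create() for each new thread gives every thread the template's defaults, with only the per-thread fields assigned afterward.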
processor_bootstrap() [osfmk/kern/processor.c] is a Mach function that sets the value of the global variable master_processor from the value of the global variable master_cpu, which is set to 0 before this function is called. It calls the cpu_to_processor() [osfmk/ppc/cpu.c] function to convert a cpu (an integer) to a processor (a processor_t).

    // osfmk/ppc/cpu.c
    processor_t
    cpu_to_processor(int cpu)
    {
        return ((processor_t)PerProcTable[cpu].ppe_vaddr->processor);
    }

As we saw in Figure 5-3, the ppe_vaddr field points to a per_proc_info structure. Its processor field, shown as a character array in Figure 5-3, houses a processor_t data type, which is Mach's abstraction for a processor.[7] Its contents include several data structures related to scheduling. processor_bootstrap() calls processor_init() [osfmk/kern/processor.c], which initializes a processor_t's scheduling-related fields, and sets up a timer for quantum expiration.
[7] We will look at details of Mach's processor abstraction in Chapter 7.
ppc_init() then sets the static_memory_end global variable to the highest address used in the kernel's data area, rounded off to the nearest page. Recall from Chapter 4 that the topOfKernelData field of the boot_args structure contains this value.
ppc_init() calls PE_init_platform() [pexpert/ppc/pe_init.c] to initialize some aspects of the Platform Expert. The call is made with the first argument (vm_initialized) set to FALSE, indicating that the virtual memory (VM) subsystem is not yet initialized. PE_init_platform() copies the boot arguments pointer, the pointer to the device tree, and the display properties to a global structure variable called PE_state, which is of type PE_state_t.

    // pexpert/pexpert/pexpert.h
    typedef struct PE_state {
        boolean_t initialized;
        PE_Video  video;
        void      *deviceTreeHead;
        void      *bootArgs;
    #if __i386__
        void      *fakePPCBootArgs;
    #endif
    } PE_state_t;

    extern PE_state_t PE_state;

    // pexpert/ppc/pe_init.c
    PE_state_t PE_state;

PE_init_platform() then calls DTInit() [pexpert/gen/device_tree.c] to initialize the Open Firmware device tree routines. DTInit() simply initializes a pointer to the device tree's root node. Finally, PE_init_platform() calls pe_identify_machine() [pexpert/ppc/pe_identify_machine.c], which populates a clock_frequency_info_t variable (gPEClockFrequencyInfo) with various frequencies such as that of the Timebase, the processor, and the bus.

    // pexpert/pexpert/pexpert.h
    struct clock_frequency_info_t {
        unsigned long bus_clock_rate_hz;
        unsigned long cpu_clock_rate_hz;
        unsigned long dec_clock_rate_hz;
        ...
        unsigned long long cpu_frequency_hz;
        unsigned long long cpu_frequency_min_hz;
        unsigned long long cpu_frequency_max_hz;
    };
    typedef struct clock_frequency_info_t clock_frequency_info_t;

    extern clock_frequency_info_t gPEClockFrequencyInfo;

ppc_init() parses several boot arguments at this point, such as novmx, fn, pmsx, lcks, diag, ctrc, tb, maxmem, wcte, mcklog, and ht_shift. We came across all these in Chapter 4. However, not all arguments are processed immediately; in the case of some arguments, ppc_init() sets the values of only certain kernel variables for later referral.
5.3.2. Low-Level Virtual Memory Initialization
ppc_init() calls ppc_vm_init() [osfmk/ppc/ppc_vm_init.c] to initialize hardware-dependent aspects of the virtual memory subsystem. The key actions performed by ppc_vm_init() are shown in Figure 5-8.
5.3.2.1. Sizing Memory
ppc_vm_init() first invalidates the in-memory shadow BATs by loading them with zeros. It then retrieves information about physical memory banks from the boot arguments. This information is used to calculate the total amount of memory on the machine. For each available bank that is usable, ppc_vm_init() initializes a memory region structure (mem_region_t).

    // osfmk/ppc/mappings.h
    typedef struct mem_region {
        phys_entry *mrPhysTab; // Base of region table
        ppnum_t    mrStart;    // Start of region
        ppnum_t    mrEnd;      // Last page in region
        ppnum_t    mrAStart;   // Next page in region to allocate
        ppnum_t    mrAEnd;     // Last page in region to allocate
    } mem_region_t;
    ...
    #define PMAP_MEM_REGION_MAX 11
    extern mem_region_t pmap_mem_regions[PMAP_MEM_REGION_MAX + 1];
    extern int pmap_mem_regions_count;
    ...
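The bank walk described above can be sketched as follows. This is a simplified illustration, not the kernel's code: the demo_bank structure, the reduced demo_region structure, and the convention that a bank of size 0 is unusable are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12   /* 4KB pages */
#define DEMO_REGION_MAX 11   /* mirrors PMAP_MEM_REGION_MAX */

/* Hypothetical description of one physical memory bank from the boot arguments. */
struct demo_bank {
    uint64_t base;   /* physical base address in bytes */
    uint64_t size;   /* size in bytes; 0 means the bank is unused (assumption) */
};

/* Reduced form of mem_region_t: page numbers only. */
struct demo_region {
    uint32_t start;  /* first page in region */
    uint32_t end;    /* last page in region */
};

static struct demo_region demo_regions[DEMO_REGION_MAX + 1];
static int demo_regions_count;

/* Walk the banks, record each usable one as a region, and return total bytes. */
uint64_t demo_size_memory(const struct demo_bank *banks, int nbanks)
{
    uint64_t total = 0;

    assert(banks != NULL && nbanks >= 0);
    demo_regions_count = 0;
    for (int i = 0; i < nbanks; i++) {
        if (banks[i].size == 0)
            continue;                       /* skip empty banks */
        if (demo_regions_count >= DEMO_REGION_MAX)
            break;                          /* region table is full */
        struct demo_region *r = &demo_regions[demo_regions_count++];
        r->start = (uint32_t)(banks[i].base >> DEMO_PAGE_SHIFT);
        r->end   = (uint32_t)((banks[i].base + banks[i].size - 1) >> DEMO_PAGE_SHIFT);
        total   += banks[i].size;
    }
    return total;
}
```

Because each region carries its own start and end page numbers, banks separated by gaps in the physical address space are recorded without any special handling.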
Note that it is possible for physical memory to be noncontiguous. The kernel maps the potentially noncontiguous physical space into contiguous physical-to-virtual mapping tables. ppc_vm_init() creates an entry in the pmap_mem_regions array for each DRAM bank it uses, while incrementing pmap_mem_regions_count. The kernel calculates several maximum values for memory size. For example, on machines with more than 2GB of physical memory, one of the maximum memory values is pinned at 2GB for compatibility. Certain data structures must also reside within the first 2GB of physical memory. The following are specific examples of memory limits established by ppc_vm_init().
ppc_vm_init() sets the first_avail variable, which represents the first available virtual address, to static_memory_end (note that virtual memory is not operational yet). Next, it computes kmapsize, the size of kernel text and data, by retrieving segment addresses from the kernel's Mach-O headers. It then calls pmap_bootstrap() [osfmk/ppc/pmap.c] with three arguments: max_mem, first_avail, and kmapsize. Next, pmap_bootstrap() prepares the system for running with virtual memory.
5.3.2.2. Pmap Initialization
The physical map (pmap) layer[8] is the machine-dependent portion of Mach's virtual memory subsystem. pmap_bootstrap() first initializes the kernel's physical map (kernel_pmap). It then finds space for the page table entry group (PTEG) hash table and the PTEG Control Area (PCA). The in-memory hash table has the following characteristics.
[8] We will discuss the pmap layer in Chapter 8.
The PCA's structure is declared in osfmk/ppc/mappings.h.

    // osfmk/ppc/mappings.h
    typedef struct PCA {
        union flgs {
            unsigned int PCAallo;          // Allocation controls
            struct PCAalflgs {
                unsigned char PCAfree;     // Indicates the slot is free
                unsigned char PCAsteal;    // Steal scan start position
                unsigned char PCAauto;     // Indicates that the PTE was autogenned
                unsigned char PCAmisc;     // Miscellaneous flags
    #define PCAlock  1                     // This locks up the associated PTEG
    #define PCAlockb 31
            } PCAalflgs;
        } flgs;
    } PCA_t;

The program in Figure 5-9 performs the same calculations as the kernel to calculate the page hash table size on a machine. You can use it to determine the amount of memory used by the page table given the amount of physical memory on the machine and the size of a PTEG. Note the use of the cntlzw PowerPC instruction to count the number of leading zeros.
Figure 5-9. Calculating the PowerPC PTEG hash table size used by the kernel
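As a rough illustration of how a leading-zero count can drive such a calculation, consider the portable sketch below. It is not the Figure 5-9 program and not the kernel's routine. It assumes that the memory size is first rounded up to a power of 2 and that the table holds one PTEG for every two 4KB physical pages. It also assumes PTEG sizes of 64 bytes (32-bit PowerPC) and 128 bytes (64-bit PowerPC). A software loop stands in for cntlzw so that the code runs on any machine.

```c
#include <assert.h>
#include <stdint.h>

/* Portable 64-bit leading-zero count (cntlzw is the 32-bit PowerPC instruction). */
static unsigned demo_clz64(uint64_t x)
{
    unsigned n = 0;

    if (x == 0)
        return 64;
    while ((x & 0x8000000000000000ULL) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Round a memory size up to the next power of 2 using the leading-zero count. */
uint64_t demo_round_up_pow2(uint64_t msize)
{
    assert(msize > 0 && msize <= 0x8000000000000000ULL);
    return 0x8000000000000000ULL >> demo_clz64((msize << 1) - 1);
}

/*
 * Assumption: one PTEG per two 4KB pages, so the number of PTEGs is the
 * rounded memory size divided by 8KB (a right shift by 13).
 */
uint64_t demo_hash_table_size(uint64_t msize, unsigned pteg_size)
{
    return (demo_round_up_pow2(msize) >> 13) * pteg_size;
}
```

Under these assumptions, 1GB of memory with 64-byte PTEGs gives an 8MB table, and 1.5GB rounds up to 2GB and gives 16MB.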
pmap_bootstrap() calls hw_hash_init() [osfmk/ppc/hw_vm.s] to initialize the hash table and the PCA. It then calls hw_setup_trans() [osfmk/ppc/hw_vm.s], which we came across earlier in this chapter. Recall that hw_setup_trans() only configures the hardware registers required for address translation; it does not actually start address translation.
pmap_bootstrap() calculates the amount of memory that needs to be designated as "allocated" (i.e., it cannot be marked free). This includes memory for the initial context save areas, trace tables, physical entries (phys_entry_t), the kernel text, the logical pages (struct vm_page) needed to map physical memory, and the address-mapping structures (struct vm_map_entry). It then allocates the initial context save areas by calling savearea_init() [osfmk/ppc/savearea.c]. This allows the processor to take an interrupt.
pmap_bootstrap() initializes the mapping tables by calling mapping_init() [osfmk/ppc/mappings.c]. It then calls pmap_map() [osfmk/ppc/pmap.c] to map memory for page tables in the kernel's map. The page tables are mapped V=R, that is, with the virtual address being equal to the real address.
On 64-bit machines, pmap_bootstrap() calls pmap_map_physical() [osfmk/ppc/pmap.c] to block-map physical memory regions, in units of up to 256MB, into the kernel's address map. Physical memory is mapped at virtual addresses starting from PHYS_MEM_WINDOW_VADDR, which is defined to be 0x100000000ULL (4GB) in osfmk/ppc/pmap.h. Moreover, in this physical memory window, an I/O hole of size IO_MEM_WINDOW_SIZE (defined to be 2GB in osfmk/ppc/pmap.h) is mapped at an offset IO_MEM_WINDOW_VADDR (defined to be 2GB in osfmk/ppc/pmap.h). The pmap_map_iohole() [osfmk/ppc/pmap.c] function is called on a 64-bit machine to map the I/O hole.
Finally, pmap_bootstrap() sets the next available page pointer (first_avail) and the first free virtual address pointer (first_free_virt). The rest of the memory is marked free and is added to the free regions, from where it can be allocated by pmap_steal_memory() [osfmk/vm/vm_resident.c].
ppc_vm_init() now calls pmap_map() to map (again, V=R) exception vectors in the kernel's address map, starting from the address exception_entry through the address exception_end; both addresses are defined in osfmk/ppc/lowmem_vectors.s. Other pmap_map() calls that are made include those for the kernel's text (__TEXT) and data (__DATA) segments. The __KLD and __LINKEDIT segments are mapped (wired) through pmap_enter() [osfmk/ppc/pmap.c], page by page. These segments are unloaded by the I/O Kit in their entirety, to reclaim that memory, after booting completes.
ppc_vm_init() next calls MapUserMemoryWindowInit() [osfmk/ppc/pmap.c] to initialize a mechanism the kernel uses for mapping portions of user-space memory into the kernel.
The copyin() and copyout() functions, both of which are implemented in osfmk/ppc/movc.s, primarily use this facility by calling MapUserMemoryWindow() [osfmk/ppc/pmap.c], which maps a user address range into a predefined kernel range. The range is 512MB in size and starts at USER_MEM_WINDOW_VADDR, which is defined to be 0xE0000000ULL (3.5GB) in osfmk/ppc/pmap.h.
5.3.2.3. Starting Address-Translation
Now that the memory management hardware has been configured and virtual memory subsystem data structures have been allocated and initialized, ppc_vm_init() calls hw_start_trans() [osfmk/ppc/hw_vm.s] to start address translation. Note that this is the first time in the boot process that address translation is enabled.
5.3.3. After Virtual Memory
ppc_init() makes a call to PE_init_platform(), but with the vm_initialized Boolean argument set to TRUE (unlike the earlier call made by ppc_init()). As a result, PE_init_platform() calls pe_init_debug() [pexpert/gen/pe_gen.c], which copies the debug flags, if any, from the boot arguments to the kernel variable DEBUGFlag.
printf_init() [osfmk/kern/printf.c] initializes locks used by the printf() and sprintf() kernel functions. It also calls bsd_log_init() [bsd/kern/subr_log.c] to initialize a message buffer for kernel logging. The buffer structure is declared in bsd/sys/msgbuf.h.

    // bsd/sys/msgbuf.h
    #define MSG_BSIZE (4096 - 3 * sizeof(long))
    struct msgbuf {
    #define MSG_MAGIC 0x063061
        long msg_magic;
        long msg_bufx;             // write pointer
        long msg_bufr;             // read pointer
        char msg_bufc[MSG_BSIZE];  // buffer
    };
    #ifdef KERNEL
    extern struct msgbuf *msgbufp;
    ...
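The msg_bufx and msg_bufr fields suggest a circular buffer. The following minimal sketch shows how a write index can wrap around such a buffer. It is an illustration under assumptions rather than the kernel's logging code: the demo_ names are hypothetical, the buffer is shrunk to 16 bytes so that wraparound is easy to observe, and locking is omitted.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MSG_MAGIC 0x063061
#define DEMO_MSG_BSIZE 16            /* tiny buffer so wraparound is easy to observe */

struct demo_msgbuf {
    long msg_magic;
    long msg_bufx;                   /* write index */
    long msg_bufr;                   /* read index */
    char msg_bufc[DEMO_MSG_BSIZE];   /* buffer */
};

/* Zero the buffer and stamp it with the magic value. */
void demo_log_init(struct demo_msgbuf *mb)
{
    assert(mb != NULL);
    memset(mb, 0, sizeof(*mb));
    mb->msg_magic = DEMO_MSG_MAGIC;
}

/* Append one character, wrapping the write index at the end of the buffer. */
void demo_log_putc(struct demo_msgbuf *mb, char c)
{
    assert(mb != NULL && mb->msg_magic == DEMO_MSG_MAGIC);
    mb->msg_bufc[mb->msg_bufx++] = c;
    if (mb->msg_bufx >= DEMO_MSG_BSIZE)
        mb->msg_bufx = 0;            /* oldest data is overwritten from here on */
}
```

Writing 18 characters into the 16-byte buffer leaves the write index at 2, with the two newest characters having overwritten the two oldest.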
Since logs may be written at interrupt level, it is possible for a log manipulation to affect another processor at interrupt level. Therefore, printf_init() also initializes a log spinlock to serialize access to log buffers.
panic_init() [osfmk/kern/debug.c] initializes a lock used to serialize modifications by multiple processors to the global panic string. printf() and panic() are required if a debugger needs to run.
5.3.3.1. Console Initialization
PE_init_kprintf() [pexpert/ppc/pe_kprintf.c] determines which console character output method to use. It checks the /options node in the device tree for the presence of input-device and output-device properties. If either property's value is a string of the format scca:x, where x is a number with six or fewer digits, PE_init_kprintf() attempts to use a serial port, with x being the baud rate. However, if the serialbaud boot argument is present, its value is used as the baud rate instead. PE_init_kprintf() then attempts to find an onboard serial port. Figure 5-10 shows an excerpt from kprintf() initialization.
Figure 5-10. Initialization of the kprintf() function
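The property check described above can be sketched in isolation. The code below is not the excerpt from Figure 5-10; demo_parse_scca() and demo_choose_baud() are hypothetical helpers, and treating a serialbaud value of 0 as "not present" is an assumption made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Return 1 and store the baud rate if prop has the form "scca:x", where x
 * consists of one to six decimal digits. Otherwise return 0 and leave
 * *baud untouched.
 */
int demo_parse_scca(const char *prop, long *baud)
{
    long value = 0;
    size_t ndigits;

    assert(prop != NULL && baud != NULL);
    if (strncmp(prop, "scca:", 5) != 0)
        return 0;
    prop += 5;
    ndigits = strlen(prop);
    if (ndigits == 0 || ndigits > 6)
        return 0;
    for (size_t i = 0; i < ndigits; i++) {
        if (!isdigit((unsigned char)prop[i]))
            return 0;
        value = value * 10 + (prop[i] - '0');
    }
    *baud = value;
    return 1;
}

/* A serialbaud boot argument, when present (nonzero here), overrides the property. */
long demo_choose_baud(long from_property, long serialbaud_arg)
{
    return serialbaud_arg != 0 ? serialbaud_arg : from_property;
}
```

For example, the string scca:57600 yields a baud rate of 57600, whereas scca:1234567 is rejected because it has seven digits.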
PE_find_scc() [pexpert/ppc/pe_identify_machine.c] looks for a serial port[10] in the device tree. If one is found, PE_find_scc() returns the physical I/O address of the port, which is then passed to io_map_spec() [osfmk/ppc/io_map.c] to be mapped into the kernel's virtual address space. Since virtual memory is enabled at this point, io_map_spec() calls io_map() [osfmk/ppc/io_map.c] to allocate pageable kernel memory in which the desired mapping is created. initialize_serial() [osfmk/ppc/serial.c] configures the serial hardware by performing I/O to the appropriate registers. Finally, PE_init_kprintf() sets the PE_kputc function pointer to serial_putc() [osfmk/ppc/ke_printf.c], which in turn calls scc_putc() [osfmk/ppc/serial_io.c] to output a character to a serial line.
[10] A legacy serial port is named escc-legacy, whereas a new-style serial port is named escc in the device tree.
If no serial ports could be found, PE_init_kprintf() sets PE_kputc to cnputc() [osfmk/console/ppc/serial_console.c], which calls the putc member of the appropriate entry[11] of the cons_ops structure to perform console output.
[11] Depending on whether the serial console or the graphics console is the default, the appropriate entry is set to SCC_CONS_OPS or VC_CONS_OPS, respectively, at compile time.

    // osfmk/console/ppc/serial_console.c
    #define OPS(putc, getc, nosplputc, nosplgetc) putc, getc

    const struct console_ops {
        int (* putc)(int, int, int);
        int (* getc)(int, int, boolean_t, boolean_t);
    } cons_ops[] = {
    #define SCC_CONS_OPS 0
        { OPS(scc_putc, scc_getc, no_spl_scputc, no_spl_scgetc) },
    #define VC_CONS_OPS 1
        { OPS(vcputc, vcgetc, no_spl_vcputc, no_spl_vcgetc) },
    };
    #define NCONSOPS (sizeof cons_ops / sizeof cons_ops[0])
osfmk/console/ppc/serial_console.c contains a console operations table with entries for both a serial console and a video console.
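Dispatching console output through such a table can be sketched as follows. This is a simplified, user-space illustration and not the kernel's implementation: every demo_ name is hypothetical, the putc signature is reduced to a single argument, and the two back ends merely record characters in arrays so that the effect of switching consoles can be observed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SCC_CONS_OPS 0
#define DEMO_VC_CONS_OPS  1

static char   demo_serial_out[32];   /* captures "serial" output */
static char   demo_video_out[32];    /* captures "video" output */
static size_t demo_serial_len;
static size_t demo_video_len;

static int demo_scc_putc(int c)
{
    if (demo_serial_len < sizeof demo_serial_out)
        demo_serial_out[demo_serial_len++] = (char)c;
    return c;
}

static int demo_vc_putc(int c)
{
    if (demo_video_len < sizeof demo_video_out)
        demo_video_out[demo_video_len++] = (char)c;
    return c;
}

/* One entry per console type, indexed by the constants above. */
static const struct demo_console_ops {
    int (*putc)(int);
} demo_cons_ops[] = {
    { demo_scc_putc },   /* DEMO_SCC_CONS_OPS */
    { demo_vc_putc  },   /* DEMO_VC_CONS_OPS */
};

#define DEMO_NCONSOPS (sizeof demo_cons_ops / sizeof demo_cons_ops[0])

static size_t demo_cons_index = DEMO_VC_CONS_OPS;   /* default console (assumption) */

/* Make the serial entry the default for subsequent output. */
void demo_switch_to_serial_console(void)
{
    demo_cons_index = DEMO_SCC_CONS_OPS;
}

/* Route a character to whichever console is currently selected. */
int demo_cnputc(int c)
{
    assert(demo_cons_index < DEMO_NCONSOPS);
    return demo_cons_ops[demo_cons_index].putc(c);
}
```

The caller of demo_cnputc() never needs to know which console is active; changing the index redirects all later output.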
vcputc() [osfmk/console/video_console.c] outputs to the graphical console by drawing characters directly to the framebuffer.
ppc_vm_init() now checks whether a serial console was requested at boot time, and if so, it calls switch_to_serial_console() [osfmk/console/ppc/serial_console.c] to set the SCC_CONS_OPS entry of console_ops as the default for console output.
ppc_vm_init() calls PE_create_console() [pexpert/ppc/pe_init.c] to create either the graphical or the textual console, depending on the type of video set in the PE_state.video.v_display field, which was initialized earlier by PE_init_platform().

    // pexpert/ppc/pe_init.c
    void
    PE_init_platform(boolean_t vm_initialized, void *_args)
    {
        ...
        boot_args *args = (boot_args *)_args;

        if (PE_state.initialized == FALSE) {
            PE_state.initialized = TRUE;
            ...
            PE_state.video.v_display = args->Video.v_display;
            ...
        }
        ...
    }
    ...
    void
    PE_create_console(void)
    {
        if (PE_state.video.v_display)
            PE_initialize_console(&PE_state.video, kPEGraphicsMode);
        else
            PE_initialize_console(&PE_state.video, kPETextMode);
    }
PE_initialize_console() [pexpert/ppc/pe_init.c] supports disabling the screen (switching to the serial console), enabling the screen (switching to the "last" console), or simply initializing the screen. All three operations involve calling initialize_screen() [osfmk/console/video_console.c], which is responsible for retrieving the graphical framebuffer address. osfmk/console/video_console.c also implements functions used while displaying boot progress during a graphical boot.
ppc_vm_init() finally calls PE_init_printf() [pexpert/gen/pe_gen.c].
After ppc_vm_init() returns, ppc_init() processes the wcte and mcksoft boot arguments (see Table 4-12) on 64-bit hardware.
5.3.3.2. Preparing for the Bootstrapping of Kernel Subsystems
Finally, ppc_init() calls machine_startup() [osfmk/ppc/model_dep.c], which never returns.
machine_startup() processes several boot arguments. In particular, it checks whether the kernel must halt in the debugger. It initializes locks used by the debugger (debugger_lock) and the backtrace print mechanism (pbtlock). debugger_lock is used to ensure that there is only one processor in the debugger at a time. pbtlock is used by print_backtrace() [osfmk/ppc/model_dep.c] to ensure that only one backtrace can occur at a time. If the built-in kernel debugger, KDB, has been compiled into the kernel, machine_startup() calls ddb_init() [osfmk/ddb/db_sym.c] to initialize KDB. Moreover, if the kernel has been instructed to halt in KDB, machine_startup() calls Debugger() [osfmk/ppc/model_dep.c] to enter the debugger.

    // osfmk/ppc/model_dep.c
    #define TRAP_DEBUGGER __asm__ volatile("tw 4,r3,r3");
    ...
    void
    machine_startup(boot_args *args)
    {
        ...
    #if MACH_KDB
        ...
        ddb_init();

        if (boot_arg & DDB_KDB)
            current_debugger = KDB_CUR_DB;

        if (halt_in_debugger && (current_debugger == KDB_CUR_DB)) {
            Debugger("inline call to debugger(machine_startup)");
            ...
        }
        ...
    }
    ...
    void
    Debugger(const char *message)
    {
        ...
        if ((current_debugger != NO_CUR_DB)) { // debugger configured
            printf("Debugger(%s)\n", message);
            TRAP_DEBUGGER; // enter the debugger
            splx(spl);
            return;
        }
        ...
    }
machine_startup() calls machine_conf() [osfmk/ppc/model_dep.c], which manipulates Mach's machine_info structure [osfmk/mach/machine.h]. The host_info() Mach call[12] retrieves information from this structure. Note that the memory_size field is pinned to 2GB on machines with more than 2GB of physical memory.
[12] We will see an example of using this call in Chapter 6.

    // osfmk/mach/machine.h
    struct machine_info {
        integer_t major_version;    // kernel major version ID
        integer_t minor_version;    // kernel minor version ID
        integer_t max_cpus;         // maximum number of CPUs possible
        integer_t avail_cpus;       // number of CPUs now available
        uint32_t  memory_size;      // memory size in bytes, capped at 2GB
        uint64_t  max_mem;          // actual physical memory size
        integer_t physical_cpu;     // number of physical CPUs now available
        integer_t physical_cpu_max; // maximum number of physical CPUs possible
        integer_t logical_cpu;      // number of logical CPUs now available
        integer_t logical_cpu_max;  // maximum number of logical CPUs possible
    };
    typedef struct machine_info *machine_info_t;
    typedef struct machine_info machine_info_data_t;

    extern struct machine_info machine_info;
    ...
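The relationship between the memory_size and max_mem fields can be illustrated with a short sketch. The demo_machine_info structure and demo_machine_conf() function are hypothetical reductions written for this example; they show only the 2GB cap described above, not what machine_conf() actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_2GB 0x80000000ULL

/* Reduced stand-in for the two memory fields of struct machine_info. */
struct demo_machine_info {
    uint32_t memory_size;   /* memory size in bytes, capped at 2GB */
    uint64_t max_mem;       /* actual physical memory size */
};

/* Record the true size in max_mem and a capped copy in memory_size. */
void demo_machine_conf(struct demo_machine_info *mi, uint64_t physical_bytes)
{
    assert(mi != NULL);
    mi->max_mem = physical_bytes;
    mi->memory_size = (physical_bytes > DEMO_2GB) ? (uint32_t)DEMO_2GB
                                                  : (uint32_t)physical_bytes;
}
```

With 4GB installed, max_mem holds 0x100000000 while memory_size reports 0x80000000, so a caller that needs the true amount must read the 64-bit field.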
On older kernels, machine_startup() also initializes thermal monitoring for the processor by calling ml_thrm_init() [osfmk/ppc/machine_routines_asm.s]. Newer kernels handle thermal initialization entirely in the I/O Kit; ml_thrm_init() performs no work on these kernels.
Finally, machine_startup() calls kernel_bootstrap() [osfmk/kern/startup.c], which never returns.