Mac OS X Internals: A Systems Approach

2017-07-07 02:10:07

8.7. Using the Mach VM Interfaces

Let us now look at several examples of using Mach VM interface routines.

The examples shown in this section use the new Mach VM API that we discussed in Section 8.6. The new API's implementation is transitional at the time of this writing. If you face problems while experimenting with it, you can resort to the vm_* routines. Moreover, the examples in this section can be compiled as either 64-bit or 32-bit programs. They are shown here as compiled for 64-bit.

8.7.1. Controlling Memory Inheritance

In this example, we will allocate two pages of memory using mach_vm_allocate(). We will call mach_vm_inherit() to set the inheritance attribute of one page to VM_INHERIT_SHARE and the other's to VM_INHERIT_COPY. We will then write some "marker" data to the two pages and call fork(). The parent will wait for the child to exit. The child will write its own marker data to the pages, which will cause the contents of the shared page to change in place, whereas the other page will be physically copied on being written. We will use the VM_REGION_TOP_INFO flavor of mach_vm_region() to inspect the VM objects corresponding to the two pages. Figure 814 shows the program.

Note in Figure 814 that in this example, the program is being compiled as a 64-bit PowerPC executable, as specified by the ppc64 architecture value. Since the Mach VM user interfaces are architecture-independent, the program will compile and run on all architectures supported by Mac OS X.

Figure 814. Controlling memory inheritance

// vm_inherit.c #include <stdio.h> #include <sys/wait.h> #include <stdlib.h> #include <unistd.h> #include <mach/mach.h> #include <mach/mach_vm.h> #define OUT_ON_MACH_ERROR(msg, retval) \ if (kr != KERN_SUCCESS) { mach_error(msg ":" , kr); goto out; } #define FIRST_UINT32(addr) (*((uint32_t *)addr)) static mach_vm_address_t page_shared; // fully shared static mach_vm_address_t page_cow; // shared copy-on-write kern_return_t get_object_id(mach_vm_address_t offset, int *obj_id, int *ref_count) { kern_return_t kr; mach_port_t unused; mach_vm_size_t size = (mach_vm_size_t)vm_page_size; mach_vm_address_t address = offset; vm_region_top_info_data_t info; mach_msg_type_number_t count = VM_REGION_TOP_INFO_COUNT; kr = mach_vm_region(mach_task_self(), &address, &size, VM_REGION_TOP_INFO, (vm_region_info_t)&info, &count, &unused); if (kr == KERN_SUCCESS) { *obj_id = info.obj_id; *ref_count = info.ref_count; } return kr; } void peek_at_some_memory(const char *who, const char *msg) { int obj_id, ref_count; kern_return_t kr; kr = get_object_id(page_shared, &obj_id, &ref_count); printf("%-12s%-8s%-10x%-12x%-10d%s\n", who, "SHARED", FIRST_UINT32(page_shared), obj_id, ref_count, msg); kr = get_object_id(page_cow, &obj_id, &ref_count); printf("%-12s%-8s%-10x%-12x%-10d%s\n", who, "COW", FIRST_UINT32(page_cow), obj_id, ref_count, msg); } void child_process(void) { peek_at_some_memory("child", "before touching any memory"); FIRST_UINT32(page_shared) = (unsigned int)0xFEEDF00D; FIRST_UINT32(page_cow) = (unsigned int)0xBADDF00D; peek_at_some_memory("child", "after writing to memory"); exit(0); } int main(void) { kern_return_t kr; int status; mach_port_t mytask = mach_task_self(); mach_vm_size_t size = (mach_vm_size_t)vm_page_size; kr = mach_vm_allocate(mytask, &page_shared, size, VM_FLAGS_ANYWHERE); OUT_ON_MACH_ERROR("vm_allocate", kr); kr = mach_vm_allocate(mytask, &page_cow, size, VM_FLAGS_ANYWHERE); OUT_ON_MACH_ERROR("vm_allocate", kr); kr = mach_vm_inherit(mytask, page_shared, size, VM_INHERIT_SHARE); OUT_ON_MACH_ERROR("vm_inherit(VM_INHERIT_SHARE)", kr); kr = mach_vm_inherit(mytask, page_cow, size, VM_INHERIT_COPY); OUT_ON_MACH_ERROR("vm_inherit(VM_INHERIT_COPY)", kr); FIRST_UINT32(page_shared) = (unsigned int)0xAAAAAAAA; FIRST_UINT32(page_cow) = (unsigned int)0xBBBBBBBB; printf("%-12s%-8s%-10s%-12s%-10s%s\n", "Process", "Page", "Contents", "VM Object", "Refcount", "Event"); peek_at_some_memory("parent", "before forking"); if (fork() == 0) child_process(); // this will also exit the child wait(&status); peek_at_some_memory("parent", "after child is done"); out: mach_vm_deallocate(mytask, page_shared, size); mach_vm_deallocate(mytask, page_cow, size); exit(0); } $ gcc -arch ppc64 -Wall -o vm_inherit vm_inherit.c $ ./vm_inherit Process Page Contents VM Object Refcount Event parent SHARED aaaaaaaa 4fa4000 1 before forking parent COW bbbbbbbb 5a93088 1 before forking child SHARED aaaaaaaa 4fa4000 2 before touching any memory child COW bbbbbbbb 5a93088 2 before touching any memory child SHARED feedf00d 4fa4000 2 after writing to memory child COW baddf00d 4ade198 1 after writing to memory parent SHARED feedf00d 4fa4000 1 after child is done parent COW bbbbbbbb 5a93088 1 after child is done

Note in the output shown in Figure 814 that the VM object corresponding to the copy-on-written page is different from the one before the child writes to the page.

8.7.2. Debugging the Mach VM Subsystem

The Mac OS X kernel provides powerful user-space interfaces for debugging the Mach VM and IPC subsystems. These interfaces provide access to a variety of kernel data structures that are normally not exposed to user space. However, the kernel must be recompiled in the DEBUG configuration to enable these interfaces. For example, the MACH_VM_DEBUG and MACH_IPC_DEBUG kernel-build-time configuration options enable the debugging routines for VM and IPC, respectively.

Let us consider an examplethat of mach_vm_region_info(). This routine retrieves detailed information about a memory region: Given a memory address, it retrieves the contents of the corresponding VM map entry structure, along with the associated VM objects. We say "objects" because if there is a shadow chain, mach_vm_region_info() follows it.

kern_return_t mach_vm_region_info(vm_map_t map, vm_offset_t address, vm_info_region_t *regionp, vm_info_object_array_t *objectsp, mach_msg_type_number_t *objects_countp);

The vm_info_region_t structure [osfmk/mach_debug/vm_info.h] contains selected information from the VM map entry structure corresponding to address in the address space specified by map. On return, objectsp will point to an array containing objects_countp enTRies, each of which is a vm_info_object_t structure [osfmk/mach_debug/vm_info.h] containing information from a VM object.

Other routines in the VM debugging interface include the following:

mach_vm_region_info_64() provides a 64-bit version of mach_vm_region_info()

vm_mapped_pages_info() retrieves a list containing addresses of virtual pages mapped in a given task

host_virtual_physical_table_info() retrieves information about the host's virtual-to-physical table

8.7.3. Protecting Memory

The program in Figure 815 is a trivial example of using mach_vm_protect() to change the protection attribute of a given memory region. The program allocates a page of memory using mach_vm_allocate() and writes a string at an offset of 2048 bytes in the page. It then calls mach_vm_protect() to deny all access to the memory starting at the page's starting address, but it specifies a region length of only 4 bytes. We know that Mach will round the region size up to a page size, which means the program will not be able to access the string it wrote.

Figure 815. Protecting memory

// vm_protect.c #include <stdio.h> #include <stdlib.h> #include <mach/mach.h> #include <mach/mach_vm.h> #define OUT_ON_MACH_ERROR(msg, retval) \ if (kr != KERN_SUCCESS) { mach_error(msg ":" , kr); goto out; } int main(int argc, char **argv) { char *ptr; kern_return_t kr; mach_vm_address_t a_page = (mach_vm_address_t)0; mach_vm_size_t a_size = (mach_vm_size_t)vm_page_size; kr = mach_vm_allocate(mach_task_self(), &a_page, a_size, VM_FLAGS_ANYWHERE); OUT_ON_MACH_ERROR("vm_allocate", kr); ptr = (char *)a_page + 2048; snprintf(ptr, (size_t)16, "Hello, Mach!"); if (argc == 2) { // deny read access to a_page kr = mach_vm_protect( mach_task_self(), // target address space (mach_vm_address_t)a_page,// starting address of region (mach_vm_size_t)4, // length of region in bytes FALSE, // set maximum? VM_PROT_NONE); // deny all access OUT_ON_MACH_ERROR("vm_protect", kr); } printf("%s\n", ptr); out: if (a_page) mach_vm_deallocate(mach_task_self(), a_page, a_size); exit(kr); } $ gcc -arch ppc64 -Wall -o vm_protect vm_protect.c $ ./vm_protect Hello, Mach! $ ./vm_protect VM_PROT_NONE zsh: bus error ./vm_prot_none VM_PROT_NONE

8.7.4. Accessing Another Task's Memory

In this example, we will use mach_vm_read() and mach_vm_write() to manipulate the memory of one task from another. The target task will allocate a page of memory and fill it with the character A. It will then display its process ID and the newly allocated page's address and will go into a busy loop, exiting the loop when the first byte of the page changes to something other than A. The other programthe masterwill read the target's memory, modify the first character to B, and write it back into the target's address space, which will cause the target to end its busy loop and exit.

Figure 816 shows the source for the target and master programs. Note that beginning with the x86-based Macintosh systems, the task_for_pid() call requires superuser privileges.

Figure 816. Accessing another task's memory

// vm_rw_target.c #include <stdio.h> #include <unistd.h> #include <stdlib.h> #include <mach/mach.h> #include <mach/mach_vm.h> #define SOME_CHAR 'A' int main() { kern_return_t kr; mach_vm_address_t address; mach_vm_size_t size = (mach_vm_size_t)vm_page_size; // get a page of memory kr = mach_vm_allocate(mach_task_self(), &address, size, VM_FLAGS_ANYWHERE); if (kr != KERN_SUCCESS) { mach_error("vm_allocate:", kr); exit(1); } // color it with something memset((char *)address, SOME_CHAR, vm_page_size); // display the address so the master can read/write to it printf("pid=%d, address=%p\n", getpid(), (void *)address); // wait until master writes to us while (*(char *)address == SOME_CHAR) ; mach_vm_deallocate(mach_task_self(), address, size); exit(0); } // vm_rw_master.c #include <stdio.h> #include <stdlib.h> #include <mach/mach.h> #include <mach/mach_vm.h> #define PROGNAME "vm_rw_master" #define EXIT_ON_MACH_ERROR(msg, retval) \ if (kr != KERN_SUCCESS) { mach_error(msg ":" , kr); exit((retval)); } int main(int argc, char **argv) { kern_return_t kr; pid_t pid; mach_port_t target_task; mach_vm_address_t address; mach_vm_size_t size = (mach_vm_size_t)vm_page_size; vm_offset_t local_address; mach_msg_type_number_t local_size = vm_page_size; if (argc != 3) { fprintf(stderr, "usage: %s <pid> <address in hex>\n", PROGNAME); exit(1); } pid = atoi(argv[1]); address = strtoul(argv[2], NULL, 16); kr = task_for_pid(mach_task_self(), pid, &target_task); EXIT_ON_MACH_ERROR("task_for_pid", kr); printf("reading address %p in target task\n", (void *)address); kr = mach_vm_read(target_task, address, size, &local_address, &local_size); EXIT_ON_MACH_ERROR("vm_read", kr); // display some of the memory we read from the target task printf("read %u bytes from address %p in target task, first byte=%c\n", local_size, (void *)address, *(char *)local_address); // change some of the memory *(char *)local_address = 'B'; // write it back to the target task kr = mach_vm_write(target_task, address, local_address, local_size); EXIT_ON_MACH_ERROR("vm_write", kr); exit(0); } $ gcc -arch ppc64 -Wall -o vm_rw_target vm_rw_target.c $ gcc -arch ppc64 -Wall -o vm_rw_master vm_rw_master.c $ ./vm_rw_target pid=3592, address=0x5000 # another shell # will need superuser privileges on newer versions of Mac OS X $ ./vm_rw_master 3592 0x5000 reading address 0x5000 in target task read 4096 bytes from address 0x5000 in target task, first byte=A $ $

8.7.5. Naming and Sharing Memory

We came across the mach_make_memory_entry_64() routine while discussing mach_vm_map() in Section 8.6.1. In this example, we will write a program that uses this routine to create a named entry corresponding to a given mapped portion of its address space. Thereafter, the program will become a Mach server, waiting for a client to send it a Mach IPC message, which it will respond to by sending the named entry handle in a reply message. The client can then use the handle in a mach_vm_map() call to map the associated memory into its address space. When creating the named entry, the server will specify a permission value consisting of both VM_PROT_READ and VM_PROT_WRITE, allowing a client full read/write shared access.

The IPC concepts used in this example are discussed in Chapter 9. The example appears here because it is more of a VM example than an IPC example.

Figure 817 shows a common header file that both the client and server program sources will use.

Figure 817. Common header file for the shared memory client-server example

// shm_ipc_common.h #ifndef _SHM_IPC_COMMON_H_ #define _SHM_IPC_COMMON_H_ #include <mach/mach.h> #include <mach/mach_vm.h> #include <servers/bootstrap.h> #define SERVICE_NAME "com.osxbook.SHMServer" #define SHM_MSG_ID 400 #define EXIT_ON_MACH_ERROR(msg, retval, success_retval) \ if (kr != success_retval) { mach_error(msg ":" , kr); exit((retval)); } // send-side version of the request message (as seen by the client) typedef struct { mach_msg_header_t header; } msg_format_request_t; // receive-side version of the request message (as seen by the server) typedef struct { mach_msg_header_t header; mach_msg_trailer_t trailer; } msg_format_request_r_t; // send-side version of the response message (as seen by the server) typedef struct { mach_msg_header_t header; mach_msg_body_t body; // start of kernel processed data mach_msg_port_descriptor_t data; // end of kernel processed data } msg_format_response_t; // receive-side version of the response message (as seen by the client) typedef struct { mach_msg_header_t header; mach_msg_body_t body; // start of kernel processed data mach_msg_port_descriptor_t data; // end of kernel processed data mach_msg_trailer_t trailer; } msg_format_response_r_t; #endif // _SHM_IPC_COMMON_H_

Figure 818 shows the source for the client. In the mach_vm_map() call, the client requests the kernel to map the memory object represented by the received named entry handle at any available location in its address space. Note that the client also writes a string (the program's first argument) to the string.

Figure 818. Source for the shared memory client

// shm_ipc_client.c #include <stdio.h> #include <stdlib.h> #include "shm_ipc_common.h" int main(int argc, char **argv) { kern_return_t kr; msg_format_request_t send_msg; msg_format_response_r_t recv_msg; mach_msg_header_t *send_hdr, *recv_hdr; mach_port_t client_port, server_port, object_handle; // find the server kr = bootstrap_look_up(bootstrap_port, SERVICE_NAME, &server_port); EXIT_ON_MACH_ERROR("bootstrap_look_up", kr, BOOTSTRAP_SUCCESS); // allocate a port for receiving the server's reply kr = mach_port_allocate(mach_task_self(), // our task is acquiring MACH_PORT_RIGHT_RECEIVE, // a new receive right &client_port); // with this name EXIT_ON_MACH_ERROR("mach_port_allocate", kr, KERN_SUCCESS); // prepare and send a request message to the server send_hdr = &(send_msg.header); send_hdr->msgh_bits = MACH_MSGH_BITS(MACH_MSG_TYPE_COPY_SEND, \ MACH_MSG_TYPE_MAKE_SEND); send_hdr->msgh_size = sizeof(send_msg); send_hdr->msgh_remote_port = server_port; send_hdr->msgh_local_port = client_port; send_hdr->msgh_reserved = 0; send_hdr->msgh_id = SHM_MSG_ID; kr = mach_msg(send_hdr, // message buffer MACH_SEND_MSG, // option indicating send send_hdr->msgh_size, // size of header + body 0, // receive limit MACH_PORT_NULL, // receive name MACH_MSG_TIMEOUT_NONE, // no timeout, wait forever MACH_PORT_NULL); // no notification port EXIT_ON_MACH_ERROR("mach_msg(send)", kr, MACH_MSG_SUCCESS); do { recv_hdr = &(recv_msg.header); recv_hdr->msgh_remote_port = server_port; recv_hdr->msgh_local_port = client_port; recv_hdr->msgh_size = sizeof(recv_msg); recv_msg.data.name = 0; kr = mach_msg(recv_hdr, // message buffer MACH_RCV_MSG, // option indicating receive 0, // send size recv_hdr->msgh_size, // size of header + body client_port, // receive name MACH_MSG_TIMEOUT_NONE, // no timeout, wait forever MACH_PORT_NULL); // no notification port EXIT_ON_MACH_ERROR("mach_msg(rcv)", kr, MACH_MSG_SUCCESS); printf("recv_msg.data.name = %#08x\n", recv_msg.data.name); object_handle = recv_msg.data.name; { // map the specified memory object to a region of our address space mach_vm_size_t size = vm_page_size; mach_vm_address_t address = 0; kr = mach_vm_map( mach_task_self(), // target address space (us) (mach_vm_address_t *)&address, // map it and tell us where (mach_vm_size_t)size, // number of bytes to allocate (mach_vm_offset_t)0, // address mask for alignment TRUE, // map it anywhere (mem_entry_name_port_t)object_handle, // the memory object (memory_object_offset_t)0, // offset within memory object FALSE, // don't copy -- directly map VM_PROT_READ|VM_PROT_WRITE, // current protection VM_PROT_READ|VM_PROT_WRITE, // maximum protection VM_INHERIT_NONE); // inheritance properties if (kr != KERN_SUCCESS) mach_error("vm_map", kr); else { // display the current contents of the memory printf("%s\n", (char *)address); if (argc == 2) { // write specified string to the memory printf("writing \"%s\" to shared memory\n", argv[1]); strncpy((char *)address, argv[1], (size_t)size); ((char *)address)[size - 1] = '\0'; } mach_vm_deallocate(mach_task_self(), address, size); } } } while (recv_hdr->msgh_id != SHM_MSG_ID); exit(0); }

Figure 819 shows the source for the server. Since the named entry is represented by a Mach port, the server must send it specially: wrapped in a port descriptor, rather than passive, inline data. We will discuss such special IPC transfers in Section 9.5.5.

Figure 819. Source for the shared memory server

// shm_ipc_server.c #include <stdio.h> #include <stdlib.h> #include "shm_ipc_common.h" int main(void) { char *ptr; kern_return_t kr; mach_vm_address_t address = 0; memory_object_size_t size = (memory_object_size_t)vm_page_size; mach_port_t object_handle = MACH_PORT_NULL; msg_format_request_r_t recv_msg; msg_format_response_t send_msg; mach_msg_header_t *recv_hdr, *send_hdr; mach_port_t server_port; kr = mach_vm_allocate(mach_task_self(), &address, size, VM_FLAGS_ANYWHERE); EXIT_ON_MACH_ERROR("vm_allocate", kr, KERN_SUCCESS); printf("memory allocated at %p\n", (void *)address); // Create a named entry corresponding to the given mapped portion of our // address space. We can then share this named entry with other tasks. kr = mach_make_memory_entry_64( (vm_map_t)mach_task_self(), // target address map &size, // so many bytes (memory_object_offset_t)address, // at this address (vm_prot_t)(VM_PROT_READ|VM_PROT_WRITE), // with these permissions (mem_entry_name_port_t *)&object_handle, // outcoming object handle (mem_entry_name_port_t)NULL); // parent handle // ideally we should vm_deallocate() before we exit EXIT_ON_MACH_ERROR("mach_make_memory_entry", kr, KERN_SUCCESS); // put some data into the shared memory ptr = (char *)address; strcpy(ptr, "Hello, Mach!"); // become a Mach server kr = bootstrap_create_service(bootstrap_port, SERVICE_NAME, &server_port); EXIT_ON_MACH_ERROR("bootstrap_create_service", kr, BOOTSTRAP_SUCCESS); kr = bootstrap_check_in(bootstrap_port, SERVICE_NAME, &server_port); EXIT_ON_MACH_ERROR("bootstrap_check_in", kr, BOOTSTRAP_SUCCESS); for (;;) { // server loop // receive a message recv_hdr = &(recv_msg.header); recv_hdr->msgh_local_port = server_port; recv_hdr->msgh_size = sizeof(recv_msg); kr = mach_msg(recv_hdr, // message buffer MACH_RCV_MSG, // option indicating service 0, // send size recv_hdr->msgh_size, // size of header + body server_port, // receive name MACH_MSG_TIMEOUT_NONE, // no timeout, wait forever MACH_PORT_NULL); // no notification port EXIT_ON_MACH_ERROR("mach_msg(recv)", kr, KERN_SUCCESS); // send named entry object handle as the reply send_hdr = &(send_msg.header); send_hdr->msgh_bits = MACH_MSGH_BITS_LOCAL(recv_hdr->msgh_bits); send_hdr->msgh_bits |= MACH_MSGH_BITS_COMPLEX; send_hdr->msgh_size = sizeof(send_msg); send_hdr->msgh_local_port = MACH_PORT_NULL; send_hdr->msgh_remote_port = recv_hdr->msgh_remote_port; send_hdr->msgh_id = recv_hdr->msgh_id; send_msg.body.msgh_descriptor_count = 1; send_msg.data.name = object_handle; send_msg.data.disposition = MACH_MSG_TYPE_COPY_SEND; send_msg.data.type = MACH_MSG_PORT_DESCRIPTOR; kr = mach_msg(send_hdr, // message buffer MACH_SEND_MSG, // option indicating send send_hdr->msgh_size, // size of header + body 0, // receive limit MACH_PORT_NULL, // receive name MACH_MSG_TIMEOUT_NONE, // no timeout, wait forever MACH_PORT_NULL); // no notification port EXIT_ON_MACH_ERROR("mach_msg(send)", kr, KERN_SUCCESS); } mach_port_deallocate(mach_task_self(), object_handle); mach_vm_deallocate(mach_task_self(), address, size); return kr; }

Let us now test the shared memory client and server programs.

$ gcc -arch ppc64 -Wall -o shm_ipc_client shm_ipc_client.c $ gcc -arch ppc64 -Wall -o shm_ipc_server shm_ipc_server.c $ ./shm_ipc_server memory allocated at 0x5000 # another shell $ ./shm_ipc_client recv_msg.data.name = 0x001003 Hello, Mach! $ ./shm_ipc_client abcdefgh recv_msg.data.name = 0x001003 Hello, Mach! writing "abcdefgh" to shared memory $ ./shm_ipc_client recv_msg.data.name = 0x001003 abcdefgh $ ^C $