Solaris Internals: Solaris 10 and OpenSolaris Kernel Architecture (2nd Edition)

2.5. Kernel Process Table

Every process occupies a slot in the kernel process table, which maintains a process structure (commonly abbreviated as proc structure) for the process. The process structure is relatively large, and contains all the information the kernel needs to manage the process and schedule the LWPs and kthreads for execution. As processes are created, kernel memory space for the process table is allocated dynamically by the kmem cache allocation and management routines.
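
For example, when a process is created, the fork path obtains a new proc structure with a simple cache allocation. The sketch below is loosely based on the allocation step in getproc() in usr/src/uts/common/os/fork.c and omits nearly all of the real work done there; process_cache is the kmem cache described in the next paragraph.

/*
 * Sketch of the fork-time allocation step (heavily abridged from
 * getproc() in usr/src/uts/common/os/fork.c).
 */
#include <sys/types.h>
#include <sys/kmem.h>
#include <sys/proc.h>
#include <sys/systm.h>

extern struct kmem_cache *process_cache;        /* created at boot */

static proc_t *
alloc_proc_sketch(void)
{
        proc_t *cp;

        /* KM_SLEEP: block, rather than fail, if memory is short */
        cp = kmem_cache_alloc(process_cache, KM_SLEEP);
        bzero(cp, sizeof (proc_t));
        return (cp);
}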

The kernel process objects are allocated from object-specific kernel memory (kmem) caches. A process_cache, thread_cache, and lwp_cache are created and initialized at boot time, and kernel memory for processes, threads, and LWPs is managed through each object's respective kmem cache. Statistics on these caches can be observed with the mdb(1) kmem_cache and kmastat dcmds, as well as the kstat(1) command.

# mdb -k
Loading modules: [ unix krtld genunix specfs dtrace ufs sd ip sctp usba
fctl nca nfs random sppp lofs crypto ptm ipc logindmux ]
> ::kmastat
cache                        buf    buf    buf    memory     alloc alloc
name                        size in use  total    in use   succeed  fail
------------------------- ------ ------ ------ --------- --------- -----
kmem_magazine_1               16   7982   8064    131072      7982     0
kmem_magazine_3               32   6790   6804    221184      8809     0
. . .
thread_cache                 848    180    198    180224     12923     0
lwp_cache                   1408    180    192    294912       919     0
. . .
process_cache               3120     50     63    200704      1509     0
. . .

The most commonly requested information for memory statistics is memory used or consumed, which can be determined for each cache from the memory in use column in the example above (the value is in bytes).

The kstats for each cache are observed with the kstat(1) command:

# kstat -n process_cache
module: unix                            instance: 0
name:   process_cache                   class:    kmem_cache
        align                           8
        alloc                           1515
        alloc_fail                      0
        buf_avail                       22
        buf_constructed                 14
        buf_inuse                       50
        buf_max                         72
        buf_size                        3120
        buf_total                       72
        chunk_size                      3120
        crtime                          246.452541137
        depot_alloc                     46
        depot_contention                0
        depot_free                      53
        empty_magazines                 3
        free                            1472
        full_magazines                  0
        hash_lookup_depth               0
        hash_rescale                    0
        hash_size                       64
        magazine_size                   3
        slab_alloc                      64
        slab_create                     8
        slab_destroy                    0
        slab_free                       0
        slab_size                       28672
        snaptime                        284376.59969931
        vmem_source                     23

The kstats maintained reflect the objects managed by the kmem allocator. See Section 11.2 for a description of the buf, depot, magazine, and slab objects that constitute a kmem cache. The same set of statistics is maintained for the thread_cache and lwp_cache. Actually, statistics are maintained for all kernel object kmem caches (try kstat -c kmem_cache on your Solaris 10 systems).
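
The same counters can also be read programmatically through the libkstat(3LIB) interfaces used by kstat(1) itself. A minimal sketch, assuming the module/instance/name triple shown in the output above (error handling abbreviated; compile with -lkstat):

#include <stdio.h>
#include <kstat.h>

int
main(void)
{
        kstat_ctl_t *kc;
        kstat_t *ksp;
        kstat_named_t *kn;

        if ((kc = kstat_open()) == NULL)
                return (1);

        /* module "unix", instance 0, kstat name "process_cache" */
        if ((ksp = kstat_lookup(kc, "unix", 0, "process_cache")) == NULL)
                return (1);
        if (kstat_read(kc, ksp, NULL) == -1)
                return (1);

        if ((kn = kstat_data_lookup(ksp, "buf_inuse")) != NULL)
                printf("process_cache buf_inuse: %llu\n",
                    (u_longlong_t)kn->value.ui64);

        (void) kstat_close(kc);
        return (0);
}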

The fast, scalable kmem cache mechanism is a perfect fit for the kernel process objects. It quickly allocates and frees kernel memory as processes and threads are created and destroyed on a running system, and it reuses cached object structures so that a new process, thread, or LWP can be instantiated without the cost of fully reinitializing the underlying structures each time.
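
The object-reuse behavior comes from the constructor/destructor model of kmem_cache_create(9F): expensive one-time setup (locks, condition variables, and so on) is done when objects are constructed, not on every allocation. The sketch below uses a hypothetical foo_cache to illustrate the pattern; it is not the actual process_cache setup code.

#include <sys/kmem.h>
#include <sys/ksynch.h>

/* Hypothetical object type managed by a kmem cache */
typedef struct foo {
        kmutex_t foo_lock;
        int      foo_state;
} foo_t;

/*
 * Constructor/destructor run when objects are constructed or torn
 * down, not on every alloc/free -- this is what makes reuse cheap.
 */
static int
foo_construct(void *buf, void *cdrarg, int kmflags)
{
        foo_t *fp = buf;

        mutex_init(&fp->foo_lock, NULL, MUTEX_DEFAULT, NULL);
        return (0);
}

static void
foo_destruct(void *buf, void *cdrarg)
{
        foo_t *fp = buf;

        mutex_destroy(&fp->foo_lock);
}

static struct kmem_cache *foo_cache;

void
foo_init(void)
{
        foo_cache = kmem_cache_create("foo_cache", sizeof (foo_t), 0,
            foo_construct, foo_destruct, NULL, NULL, NULL, 0);
}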

2.5.1. Process Limits

At system boot time, the kernel initializes the process_cache to begin the allocation of kernel memory for storing the process table. Initially, space is allocated for one proc structure. The table itself is implemented as a doubly linked list, such that each proc structure contains pointers to the next and previous processes on the list.
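
Kernel code walks this list through the practive pointer, which anchors the active process list, taking pidlock to keep the linkage stable while traversing. A minimal sketch (practive and pidlock are the names used in the OpenSolaris sources):

#include <sys/proc.h>
#include <sys/mutex.h>

extern proc_t *practive;        /* head of the active process list */
extern kmutex_t pidlock;        /* protects the list linkage */

/* Sketch: count active processes by walking the linked list. */
static int
count_procs_sketch(void)
{
        proc_t *p;
        int n = 0;

        mutex_enter(&pidlock);
        for (p = practive; p != NULL; p = p->p_next)
                n++;
        mutex_exit(&pidlock);
        return (n);
}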

The maximum size of the process table is based on the amount of physical memory (RAM) in the system and is established at boot time. The system first sets an internal variable called maxusers (which has absolutely nothing to do with the maximum number of users the system will support), using the following code.

#define MIN_DEFAULT_MAXUSERS    8u
#define MAX_DEFAULT_MAXUSERS    2048u
#define MAX_MAXUSERS            4096u

if (maxusers == 0) {
        pgcnt_t physmegs = physmem >> (20 - PAGESHIFT);
        pgcnt_t virtmegs = vmem_size(heap_arena, VMEM_FREE) >> 20;

        maxusers = MIN(MAX(MIN(physmegs, virtmegs),
            MIN_DEFAULT_MAXUSERS), MAX_DEFAULT_MAXUSERS);
}

if (maxusers > MAX_MAXUSERS) {
        maxusers = MAX_MAXUSERS;
        cmn_err(CE_NOTE, "maxusers limited to %d", MAX_MAXUSERS);
}
                                        See usr/src/uts/common/conf/param.c

The net effect of the code above is that maxusers is set according to memory size, with a ceiling value of MAX_MAXUSERS (4096). maxusers is subsequently used to set the kernel variables max_nprocs and maxuprc.

/*
 * This allows platform-dependent code to constrain the maximum
 * number of processes allowed in case there are, e.g., VM limitations
 * with how many contexts are available.
 */
if (max_nprocs == 0)
        max_nprocs = (10 + 16 * maxusers);
if (platform_max_nprocs > 0 && max_nprocs > platform_max_nprocs)
        max_nprocs = platform_max_nprocs;
if (max_nprocs > maxpid)
        max_nprocs = maxpid;

if (maxuprc == 0)
        maxuprc = (max_nprocs - reserved_procs);
                                        See usr/src/uts/common/conf/param.c

The max_nprocs value is the maximum number of processes systemwide, and maxuprc determines the maximum number of processes a non-root user can have occupying a process table slot at any time. The system stores these values in the var structure, a data structure that holds generic system configuration information. There are three related values:

  • v_proc. Set equal to max_nprocs.

  • v_maxupttl. The maximum number of process slots that can be used by all non-root users on the system. It is set to max_nprocs minus some number of reserved process slots (currently reserved_procs is 5).

  • v_maxup. The maximum number of process slots a non-root user can occupy. It is set to the maxuprc value. Note that v_maxup (an individual non-root user) and v_maxupttl (total of all non-root users on the system) end up being set to the same value, which is max_nprocs minus 5. (A program can query the per-user limit with sysconf(3C), as sketched below.)
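
A user program has no direct view of the var structure, but sysconf(3C) with _SC_CHILD_MAX reports the per-user process limit, which on Solaris corresponds to v_maxup. A minimal sketch:

#include <stdio.h>
#include <unistd.h>

int
main(void)
{
        /*
         * CHILD_MAX in POSIX terms; on Solaris this reflects the
         * per-user process limit (v_maxup).
         */
        long child_max = sysconf(_SC_CHILD_MAX);

        printf("max processes per user: %ld\n", child_max);
        return (0);
}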

You can use mdb(1) to examine the values of maxusers, max_nprocs, and maxuprc on a running system. On the system shown below, maxusers hit its 2048 ceiling, so max_nprocs was computed as 10 + 16 * 2048 = 32,778 and then clamped to maxpid (30,000); maxuprc is 30,000 - 5 = 29,995.

# mdb -k
Loading modules: [ unix krtld genunix specfs dtrace ufs sd ip sctp usba
fctl nca nfs random sppp lofs crypto ptm ipc logindmux ]
> max_nprocs/D
max_nprocs:
max_nprocs:     30000
> maxuprc/D
maxuprc:
maxuprc:        29995
> maxusers/D
maxusers:
maxusers:       2048
>

You can also use mdb(1) to examine the system var structure.

> v::print "struct var"
{
    v_buf = 0x64
    v_call = 0
    v_proc = 0x7530
    v_maxupttl = 0x752b
    v_nglobpris = 0xaa
    v_maxsyspri = 0x63
    v_clist = 0
    v_maxup = 0x752b
    v_hbuf = 0x1000
    v_hmask = 0xfff
    v_pbuf = 0
    v_sptmap = 0
    v_maxpmem = 0
    v_autoup = 0x1e
    v_bufhwm = 0x14350
}
> 0x7530=d
                30000
>

Note that the values are displayed in base 16 (hex). You can convert to decimal right in mdb(1), as shown at the bottom of the example.

Finally, sar(1M) with the -v flag gives you the maximum process table size and the current number of processes on the system.

$ sar -v 1 1

SunOS pae1 5.10 Generic sun4u    02/24/2006

20:09:52  proc-sz    ov  inod-sz       ov  file-sz  ov  lock-sz
20:09:53  118/30000   0  21719/129797   0  556/556   0  0/0

Under the proc-sz column, the 118/30000 values represent the current number of processes (118) and the maximum number of processes (30,000).

The kernel does impose a maximum value in case max_nprocs is set in /etc/system to something beyond what is reasonable, even for a large system. The maximum is 30,000, which is determined by the MAXPID macro in the param.h header file (available in /usr/include/sys).

In the kernel fork code, the current number of processes is checked against the v_proc parameter. If the limit is reached, the system produces an "out of processes" message on the console and increments the proc table overflow counter maintained in the cpu_sysinfo structure. This value is reflected in the ov column to the right of proc-sz in the sar(1M) output. For non-root users, a check is made against the v_maxup parameter, and an "out of per-user processes for uid (UID)" message is logged. In both cases, the calling program gets a -1 return value from fork(2), signifying an error.
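
From the calling program's perspective, both failures look the same: fork(2) returns -1 with errno set to EAGAIN. A minimal user-level sketch of checking for the failure:

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/wait.h>

int
main(void)
{
        pid_t pid = fork();

        if (pid == -1) {
                /*
                 * EAGAIN here can mean the v_proc or v_maxup limit
                 * was reached (among other causes, such as memory
                 * or resource-control limits).
                 */
                (void) fprintf(stderr, "fork failed: %s\n",
                    strerror(errno));
                return (1);
        }
        if (pid == 0)
                _exit(0);               /* child */

        (void) waitpid(pid, NULL, 0);   /* parent */
        return (0);
}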

The kernel maintains the /var/adm/utmp and /var/adm/wtmp files for the storage of user information used by the who(1), write(1), and login(1) commands (the accounting software and commands use utmp and wtmp as well). The PID data is maintained in a signed short data type, which has a maximum value of 32,767.

2.5.2. Thread Limits

Now that we've examined the limits the kernel imposes on the number of processes systemwide, let's look at the limits on the maximum number of LWP/kthread pairs that can exist in the system at any one time.

Each LWP has a kernel stack, allocated out of the segkp kernel address space segment. The size of the kernel segkp segment and the space allocated for LWP kernel stacks can vary according to the hardware platform. The stack itself is a default size of 24 Kbytes, and the default segkp size on both UltraSPARC and x64 platforms is 2 Gbytes. Thus, there is space for roughly (2GB ÷ 24K) 88,000 LWP stacks. This is a theoretical limit; other constraining factors, such as available physical memory, may well come into play before we reach 88,000 LWPs. Also, the segkp segment is used for other pageable components of the LWP, not just the stack. Even though segkp is a pageable kernel segment, the performance of a system actively paging LWP stacks in and out would likely be unacceptable.

You can determine the size of your system's segkp segment by using kstat(1).

sol10$ kstat -n segkp
module: vmem                            instance: 34
name:   segkp                           class:    vmem
        alloc                           586432
        contains                        0
        contains_search                 0
        crtime                          144.618836467
        fail                            0
        free                            586231
        lookup                          170
        mem_import                      0
        mem_inuse                       26345472
        mem_total                       2147483648
. . .

The mem_total field indicates 2 Gbytes for segkp on this system (26 Mbytes are actually being used, per the mem_inuse field).

The maximum number of user threads is constrained by the process's address space size for 32-bit binaries. Each user thread has a user stack, and the default stack size is 1 Mbyte for a 32-bit process. Since a 32-bit process has a maximum address space of 4 Gbytes (this varies slightly for different platforms), the maximum number of threads would equate to roughly (4GB ÷ 1MB) or 4,000 threads. In practice, the number is less since a process's address space is consumed by other segments (text, heap, etc.). For 64-bit processes, the default thread stack size is 2 Mbytes. The address space of a 64-bit process is large enough that limits imposed by available address space for thread stacks are virtually nonexistent. A 64-bit process tends to be constrained by other resource issues (available physical memory, LWP limits, etc.).
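
In practice, a 32-bit multithreaded application that needs more threads can simply request smaller stacks. A minimal sketch using POSIX threads; the 64-Kbyte figure is purely illustrative, and the stack must still be large enough for what the thread actually does (compile with -lpthread):

#include <stdio.h>
#include <pthread.h>

static void *
worker(void *arg)
{
        return (arg);
}

int
main(void)
{
        pthread_attr_t attr;
        pthread_t tid;

        (void) pthread_attr_init(&attr);
        /* 64 Kbytes instead of the 1-Mbyte 32-bit default */
        (void) pthread_attr_setstacksize(&attr, 64 * 1024);

        if (pthread_create(&tid, &attr, worker, NULL) != 0) {
                perror("pthread_create");
                return (1);
        }
        (void) pthread_join(tid, NULL);
        (void) pthread_attr_destroy(&attr);
        return (0);
}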
