Mac OS X Internals: A Systems Approach

4.5. Forth

Open Firmware is based on the Forth programming language. All programming examples that follow are also written in Forth. Therefore, let us take a whirlwind tour of the language before we continue our discussion of Open Firmware.

Forth is an interactive, extensible, high-level programming language developed by Charles "Chuck" Moore in the early 1970s while he was working at the National Radio Astronomy Observatory (NRAO) in Arizona. Moore was using a third-generation minicomputer, the IBM 1130. The language he created was meant for the next (fourth) generation of computers, and Moore would have called it Fourth, except that the 1130 permitted only five-character identifiers. Hence, the "u" was dropped, and Fourth became Forth. Moore's most important goals in developing the language were extensibility and simplicity. Forth provides a rich vocabulary of built-in commands, or words. The language can be extended by defining one's own words,[12] either by using existing words as building blocks or by defining words directly in Forth assembly. Forth combines the properties of a high-level language, an assembly language, an operating environment, an interactive command interpreter, and a set of development tools.

[12] A key point to understand is that the language itself can be extended: Forth allows you to define words that can be used in subsequent word definitions as keywords, if you will. Similarly, you can define new words that are used while compiling Forth words.

4.5.1. Cells

A cell is the primary unit of information in a Forth system. As a data type, a cell consists of a certain number of bits depending on the underlying instruction set architecture. A byte is defined to contain one address unit, whereas a cell usually contains multiple address units. A typical cell size is 32 bits.
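As a small sketch of how these sizes can be inspected at the ok prompt, the following uses the /c and /n words described later in this section. The values in the comments assume a 32-bit implementation and are not captured output.

```forth
/c .          \ address units in a byte: 1
/n .          \ address units in a cell: 4 when cells are 32 bits wide
3 /n * .      \ address units occupied by three cells: 12 (shown as c if the base is hex)
```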

4.5.2. Stacks

Forth is a stack-based language that uses reverse Polish notation (RPN), also known as postfix notation. You can interact with Forth at Open Firmware's ok prompt:

ok
0 > 3 ok
1 > 5 ok
2 > + ok
1 > . 8 ok
0 >

The number before the > indicates the number of items on the stack; initially there are zero items. Typing a number pushes it on the stack. Thus, after typing 3 and 5, the stack has two items. Next, we type +, an operator that consumes two numbers and yields one: the sum. The top two items on the stack are replaced by their sum, leaving a single item on the stack. Typing . displays the topmost item and pops it from the stack.
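As a further illustration of postfix evaluation, the following sketch computes (2 + 3) * 4 and then 10 - 2 * 3. The decimal word is used first so that the displayed results do not depend on the current number base. The parenthesized stack pictures are annotations, not interpreter output.

```forth
decimal        \ make the number base explicit for this sketch
2 3 +          ( 5 )
4 *            ( 20 )
.              \ displays 20 and pops it

10 2 3 *       ( 10 6 )
-              ( 4 )
.              \ displays 4 and pops it
```

Note that no parentheses or precedence rules are needed: the order in which operands and operators are typed fully determines the order of evaluation.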

Sometimes, you might find it useful to have Open Firmware display the entire contents of the stack instead of only the number of items. The showstack command achieves this by including the stack's contents in the prompt:

0 > showstack ok
-> <- Empty 1 ok
-> 1 <- Top 2 ok
-> 1 2 <- Top 3 ok
-> 1 2 3 <- Top + ok
-> 1 5 <- Top

The noshowstack command turns off this behavior. You can use the .s command to display the entire contents of the stack without any side effects:

0 > 1 2 3 4 ok
4 > .s -> 1 2 3 4 <- Top ok
4 >

The stack, or more precisely the data or parameter stack, is simply a region of last-in first-out (LIFO) memory whose primary use is to pass parameters to commands. The size of a single element (cell) on the stack is determined by the word size of the underlying processor.

Note that a processor word (e.g., 32 bits of addressable information) is different from a Forth word, which is Forth parlance for command. We shall use the term word to mean a Forth word, unless stated otherwise.

Forth also has a return stack that the system uses to pass control between words and to implement programming constructs such as loops. Although the programmer can access the return stack and can use it to store data temporarily, such use may have caveats, and the Forth standard discourages it.

4.5.3. Words

Forth words are essentially commands, typically analogous to procedures in many high-level languages. Forth provides a variety of standard, built-in words. New ones can be easily defined. For example, a word to compute the square of a number can be defined as follows:

: mysquare dup * ;

mysquare expects to be called with at least one item (a number) on the stack. It will "consume" the number, which would be the top item on the stack, without referring to any other items that may be present on the stack. dup, a built-in word, duplicates the top item. The multiplication operator (*) multiplies the top two items, replacing them with their product.

Forth is a rather terse language, and it is beneficial to comment your code as much as possible. Consider mysquare again, commented this time:

\ mysquare - compute the square of a number
: mysquare ( x -- square )
    dup ( x x )
    * ( square )
;

Figure 4-1 shows the structure of a typical word definition.

Figure 4-1. Defining a Forth word: syntactic requirements and conventions

The \ character is conventionally used for descriptive comments, whereas comments that show the stack state are usually placed between ( and ). It is often sufficient to describe a Forth word by simply specifying its stack notation that shows "before" and "after" stack contents.
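To illustrate how a user-defined word becomes a building block for further definitions, here is a minimal sketch that reuses mysquare and follows the commenting conventions just described. The name sum-of-squares is hypothetical, and the value in the last comment is the expected result rather than captured output.

```forth
\ mysquare - compute the square of a number
: mysquare ( x -- square )
    dup *
;

\ sum-of-squares - compute a*a + b*b (hypothetical word)
: sum-of-squares ( a b -- sum )
    mysquare        ( a b*b )
    swap            ( b*b a )
    mysquare        ( b*b a*a )
    +               ( sum )
;

decimal 3 4 sum-of-squares .    \ displays 25
```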

In the context of Forth programming in the Open Firmware environment, we will use the term word interchangeably with the terms function and method, except in places where such use is likely to cause confusion.

4.5.4. Dictionary

The region of memory where Forth stores its word definitions is called the dictionary. It is prepopulated with a built-in set of Forth words: the base set, as defined by the ANSI X3.215-1994 standard. When a new word is defined, Forth compiles it (that is, translates it into an internal format) and stores it in the dictionary, which stores new words in a last-come first-used fashion. Let us consider an example.

0 > : times2 ( x -- ) 2 * . ; ok \ Double the input number, display it, and pop it
0 > 2 times2 4 ok
0 > : times2 ( x -- ) 3 * . ; ok \ Define the same word differently
0 > 2 times2 6 ok \ New definition is used
0 > forget times2 ok \ Forget the last definition
0 > 2 times2 4 ok \ Original definition is used
0 > forget times2 ok \ Forget that definition too
0 > 2 times2 \ Try using the word
times2, unknown word
ok
0 > forget times2
times2, unknown word

forget removes the topmost instance of a word, if any, from the dictionary. You can view the definition of an existing word by using the see word:

0 > : times2 ( x -- ) 2 * . ; ok
0 > see times2
: times2 2 * . ; ok

4.5.4.1. A Sampling of Built-in Words

Open Firmware's Forth environment contains built-in words belonging to various categories. Let us look at some of these categories. Note that words are "described" through their stack notations.

Stacks

This category includes words for duplication, removal, and rearrangement of stack elements.

dup ( x -- x x )
?dup ( x -- x x ) if x is not 0, ( x -- x ) if x is 0
clear ( x1 x2 ... xn -- )
depth ( x1 x2 ... xn -- x1 x2 ... xn n )
drop ( x -- )
rot ( x1 x2 x3 -- x2 x3 x1 )
-rot ( x1 x2 x3 -- x3 x1 x2 )
swap ( x1 x2 -- x2 x1 )
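The effect of these words is easiest to follow on a concrete stack. The following sketch applies several of them in sequence; each parenthesized comment is an annotation of the expected stack contents after the line has run (top of stack on the right), not interpreter output.

```forth
1 2 3      ( 1 2 3 )
rot        ( 2 3 1 )
-rot       ( 1 2 3 )
swap       ( 1 3 2 )
drop       ( 1 3 )
dup        ( 1 3 3 )
depth      ( 1 3 3 3 )   \ three items were present, so 3 is pushed
clear      ( )
0 ?dup     ( 0 )         \ zero is not duplicated
5 ?dup     ( 0 5 5 )     \ a nonzero value is duplicated
clear      ( )
```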

The return stack is shown with an R: prefix in the stack notation. There exist words to move and copy items between the data and return stacks:

\ move from data stack to return stack
>r ( x -- ) ( R: -- x )
\ move from return stack to data stack
r> ( -- x ) ( R: x -- )
\ copy from return stack to data stack
r@ ( -- x ) ( R: x -- x )
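A common use of these words is to set an item aside temporarily while working on the items beneath it. The sketch below does this inside a definition; the name square-under is hypothetical. Because the system also uses the return stack to pass control between words, every >r should be matched by an r> before the word exits.

```forth
\ square-under (hypothetical word): square the second item, leave the top item alone
: square-under ( x1 x2 -- x1*x1 x2 )
    >r         ( x1 )        ( R: -- x2 )   \ set x2 aside
    dup *      ( x1*x1 )
    r>         ( x1*x1 x2 )  ( R: x2 -- )   \ bring x2 back
;

decimal 3 7 square-under   ( 9 7 )
```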

Memory

This category includes words for memory access, allocation, and deallocation.

\ fetch the number of address units in a byte
/c ( -- n )
\ fetch the number of address units in a cell
/n ( -- n )
\ fetch the item stored at address addr
addr @ ( addr -- x )
\ store item x at address addr
x addr ! ( x addr -- )
\ add v to the value stored at address addr
v addr +! ( v addr -- )
\ fetch the byte stored at address addr
addr c@ ( addr -- b )
\ store byte b at address addr
b addr c! ( b addr -- )
\ display len bytes of memory starting at address addr
addr len dump ( addr len -- )
\ set len bytes beginning at address addr to value b
addr len b fill ( addr len b -- )
\ set len bytes beginning at address addr to 0
addr len erase ( addr len -- )
\ allocate len bytes of general-purpose memory
len alloc-mem ( len -- addr )
\ free len bytes of memory starting at address addr
addr len free-mem ( addr len -- )
\ allocate len bytes of general-purpose memory, where
\ mybuffer names the address of the allocated region
len buffer: mybuffer ( len -- )

Creating and accessing named data are very common operations in a programming endeavor. The following are some examples of doing so.

0 > 1 constant myone ok \ Create a constant with value 1
0 > myone . 1 ok \ Verify its value
0 > 2 value mytwo ok \ Set value of mytwo to 2
0 > mytwo . 2 ok \ Verify value of mytwo
0 > 3 to mytwo ok \ Set value of mytwo to 3
0 > mytwo . 3 ok \ Verify value of mytwo
0 > 2 to myone \ Try to modify value of a constant
invalid use of TO
0 > variable mythree ok \ Create a variable called mythree
0 > mythree . ff9d0800 ok \ Address of mythree
0 > 3 mythree ! ok \ Store 3 in mythree
0 > mythree @ ok \ Fetch the contents of mythree
1 > . 3 ok
0 > 4 buffer: mybuffer ok \ get a 4-byte buffer
0 > mybuffer . ffbd2c00 ok \ allocation address
0 > mybuffer 4 dump \ dump memory contents
ffbd2c00: ff ff fb b0 |....| ok
0 > mybuffer 4 erase ok \ erase memory contents
0 > mybuffer 4 dump \ dump memory contents
ffbd2c00: 00 00 00 00 |....| ok
0 > mybuffer 4 1 fill ok \ fill memory with 1's
0 > mybuffer 4 dump \ dump memory contents
ffbd2c00: 01 01 01 01 |....| ok
0 > 4 mybuffer 2 + c! ok \ store 4 at third byte
0 > mybuffer 4 dump \ dump memory contents
ffbd2c00: 01 01 04 01 |....| ok
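Two of the memory words listed earlier, +! and the alloc-mem/free-mem pair, do not appear in the session above. The following is a minimal sketch of how they might be used; the variable name counter is hypothetical, and the stack pictures and displayed value are annotations rather than captured output.

```forth
variable counter        \ hypothetical variable name
0 counter !             \ initialize the variable to 0
5 counter +!            \ add 5 to the stored value
2 counter +!            \ add 2 more
decimal counter @ .     \ displays 7

hex
10 alloc-mem            ( addr )   \ request 0x10 (16) bytes
dup 10 erase            ( addr )   \ zero the region
dup 10 dump             ( addr )   \ inspect its contents
10 free-mem             ( )        \ release the same 16 bytes
```

Unlike buffer:, alloc-mem returns an unnamed address, so the address is kept on the stack with dup until it is handed to free-mem along with the original length.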

Operators

This category includes words for single-precision integer arithmetic operations, double-number arithmetic operations, bitwise logical operations, and comparison.

1+ ( n -- n+1 )
2+ ( n -- n+2 )
1- ( n -- n-1 )
2- ( n -- n-2 )
2* ( n -- 2*n )
2/ ( n -- n/2 )
abs ( n -- |n| )
max ( n1 n2 -- greater of n1 and n2 )
min ( n1 n2 -- smaller of n1 and n2 )
negate ( n -- -n )
and ( n1 n2 -- n1&n2 )
or ( n1 n2 -- n1|n2 )
decimal ( -- change base to 10 )
hex ( -- change base to 16 )
octal ( -- change base to 8 )

One double number uses two items on the stack, with the most significant part being the topmost item.

The variable called base stores the current number base. Besides using the built-in words for changing the base to a commonly used value, you can set the base to an arbitrary number by "manually" storing the desired value in the base variable.

0 > base @ ok
1 > . 10 ok
0 > 123456 ok
1 > 2 base ! ok
1 > . 11110001001000000 ok
0 > 11111111 ok
1 > hex ok
1 > . ff ok
0 >
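Because base is an ordinary variable, a word can change it temporarily and then put it back. The sketch below, with the hypothetical name .bin, saves the current base on the return stack, displays a number in binary, and restores the saved base afterward.

```forth
\ .bin (hypothetical word): display n in binary, then restore the caller's base
: .bin ( n -- )
    base @ >r      \ save the current base on the return stack
    2 base !       \ switch to binary
    .              \ display and pop n
    r> base !      \ restore the saved base
;

decimal 10 .bin    \ displays 1010
```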

Console I/O

This category includes words for console input and output, reading of characters and edited input lines from the console input device, formatting, and string manipulation.

key ( -- c ) waits for a character to be typed
ascii x ( x -- c ) ascii code for x
c emit ( c -- ) prints character with ascii code c
cr ( -- ) carriage return
space ( -- ) single space
u.r ( u width -- ) prints u right-justified within width
." text" ( -- ) prints the string
.( text) ( -- ) prints the string

A literal string is specified with a leading space after the opening quote, for example: " hello".
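A few of the console words can be combined as in the following sketch. The word name greet is hypothetical, and the comments describe the expected display rather than captured output.

```forth
: greet ( -- )
    ." Hello from Forth" cr
;

greet               \ displays: Hello from Forth
ascii A emit cr     \ displays: A
decimal
42 6 u.r cr         \ displays 42 right-justified in a 6-character field
```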

Control Flow

This category includes words for conditional and iterative loops, the if-then-else clause, and the case statement. Many of these words refer to a Boolean flag that can be either true (-1, that is, all bits set) or false (0); when a flag is tested, any nonzero value is treated as true. Such a flag is often a result of a comparison operator:

0 > 1 2 < ok \ is 1 < 2 ?
1 > . ffffffff ok \ true
0 > 2 1 < ok \ is 2 < 1 ?
1 > . 0 ok \ false

Following are some common control-flow constructs used in Forth programs.

\ Unconditional infinite loop
begin
    \ do some processing
again

\ Conditional "while" loop
begin
    <C> \ some condition
while
    ... \ do some processing
repeat

\ Conditional branch
<C> \ some condition
if
    ... \ condition <C> is true
else
    ... \ condition <C> is false
then

\ Iterative loop with a unitary increment
<limit> <start> \ maximum and initial values of loop counter
do
    ... \ do some processing
    ... \ the variable i contains current value of the counter
loop

\ Iterative loop with a specified increment
<limit> <start> \ maximum and initial values of loop counter
do
    ... \ do some processing
    ... \ the variable i contains current value of the counter
    <delta> \ value to be added to loop counter
+loop
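The constructs above can be made concrete with a few small definitions. This is a sketch only: all word names are hypothetical, and the comments on the final lines describe the expected display rather than captured output. Note that a do loop ends before the counter reaches the limit.

```forth
\ Conditional "while" loop: display n, n-1, ..., 1
: countdown ( n -- )
    begin
        dup 0 >        \ condition: is n still positive?
    while
        dup . 1 -      \ display n, then decrement it
    repeat
    drop               \ discard the final value
;

\ Conditional branch
: sign-name ( n -- )
    0 < if
        ." negative"
    else
        ." non-negative"
    then cr
;

\ Iterative loop with a unitary increment
: show-range ( limit start -- )
    do i . loop
;

\ Iterative loop with an increment of 2
: show-evens ( limit -- )
    0 do i . 2 +loop
;

decimal
3 countdown cr        \ displays 3 2 1
-5 sign-name          \ displays negative
5 0 show-range cr     \ displays 0 1 2 3 4
10 show-evens cr      \ displays 0 2 4 6 8
```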

Other commonly used Forth words include the following:

  • Words for converting data types and address types

  • Words for error handling, including an exception mechanism that supports catch and throw

  • Words for creating and executing machine-level code definitions
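As an illustration of the catch and throw mechanism mentioned in the list above, here is a minimal sketch. Both word names are hypothetical, and the error code -1 is chosen arbitrarily for the example. catch executes the word whose execution token it is given and pushes 0 if that word returns normally, or the thrown code otherwise; after a throw, the stack depth is restored to what it was before catch ran, but the contents of those stack items are not guaranteed.

```forth
: checked ( n -- n )
    dup 0= if -1 throw then    \ reject zero by throwing error code -1
;

: try-checked ( n -- )
    ['] checked catch          ( n 0 | x error-code )
    if
        drop ." rejected" cr   \ discard the leftover stack item
    else
        ." accepted: " . cr
    then
;

decimal
7 try-checked    \ displays accepted: 7
0 try-checked    \ displays rejected
```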

This BootROM Is Brought to You By . . .

The built-in word kudos shows a list of credits containing names of those who contributed to the hardware initialization, Open Firmware, and diagnostics aspects of the Boot ROM.

4.5.4.2. Searching the Dictionary

Open Firmware's Forth dictionary may contain thousands of words. The sifting word allows you to search for words containing a specified string:

0 > sifting get-time
get-time
in /pci@f2000000/mac-io@17/via-pmu@16000/rtc
get-time ok

A search could also yield multiple matches:

0 > sifting buffer
frame-buffer-addr buffer: alloc-buffer:s
in /packages/deblocker
empty-buffers
in /pci@f0000000/ATY,JasperParent@10/ATY,Jasper_A@0
frame-buffer-adr
in /pci@f0000000/ATY,JasperParent@10/ATY,Jasper_B@1
frame-buffer-adr ok

An unsuccessful search fails silently:

0 > sifting nonsense ok
0 >

4.5.5. Debugging

Open Firmware includes a source-level debugger for single-stepping and tracing Forth programs. Some of the relevant words include the following:

debug ( command -- ) mark command for debugging
resume ( -- ) exit from the debugger's subinterpreter and go back into the debugger
stepping ( -- ) set single-stepping mode for debugging
tracing ( -- ) set trace mode for debugging

Let us trace the execution of the following simple Forth program.

: factorial ( n -- n! )
    dup 0 >
    if
        dup 1 - recurse *
    else
        drop 1
    then
;

0 > showstack ok
-> <- Empty debug factorial ok
-> <- Empty tracing ok
-> <- Empty 3 factorial
debug: factorial type ? for help
at ffa22bd0 -- -> 3 <- Top --> dup
at ffa22bd4 -- -> 3 3 <- Top --> 0
at ffa22bd8 -- -> 3 3 0 <- Top --> >
at ffa22bdc -- -> 3 ffffffff <- Top --> if
at ffa22be4 -- -> 3 <- Top --> dup
at ffa22be8 -- -> 3 3 <- Top --> 1
at ffa22bec -- -> 3 3 1 <- Top --> -
at ffa22bf0 -- -> 3 2 <- Top --> factorial
at ffa22bf4 -- -> 3 2 <- Top --> *
at ffa22bf8 -- -> 6 <- Top --> branch+
at ffa22c04 -- -> 6 <- Top --> exit ok
-> 6 <- Top

Alternatively, single-stepping through the program will prompt the user for a keystroke at every Forth word; that is, the single-stepping is at Forth word level. Valid keystrokes that the user may type to control the execution of the program include the following.

  • <space> executes the current word and goes to the next word.

  • c continues the program without prompting any further; the program is traced, however.

  • f suspends debugging and starts a secondary Forth shell, which can be exited through the resume command, after which debugging continues from the point it was suspended.

  • q aborts execution of the current word and all its callers; control goes back to the Open Firmware prompt.
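A single-stepping session might be started as in the following sketch, which assumes that the factorial word has already been defined at the prompt. The debugger's interactive output is omitted here because it depends on the keystrokes typed and on the Open Firmware version.

```forth
debug factorial     \ mark factorial for debugging
stepping            \ select single-stepping mode
3 factorial         \ at each prompt: <space> steps, c continues, f opens a shell, q aborts
```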

Depending on the Open Firmware version and the underlying processor architecture, contents of processor registers can be viewed, and in some cases modified, through implementation-specific words.
